Introduction to Running CFX on U2 - Introduction to the U2 Cluster Login and File Transfer
Introduction to Running CFX on U2 - Outline
Introduction to the U2 Cluster
• Getting Help
• Hardware Resources
• Software Resources
• Computing Environment
• Data Storage
Login and File Transfer
• UBVPN
• Login and Logout
• More about X-11 Display
• File Transfer
Unix Commands
• Short list of Basic Unix Commands
• Reference Card
Paths and Using Modules
Starting the CFX Solver
• Launching CFX
• Monitoring
Running CFX on the Cluster
• PBS Batch Scheduler
• Interactive Jobs
• Batch Jobs
Information and Getting Help
CCR uses an email problem ticket system. Send questions and descriptions of problems to ccr-help@ccr.buffalo.edu
The technical staff receives the email and responds to the user.
• Usually within one business day.
This system allows staff to monitor problems and contribute their expertise.
CCR website: http://www.ccr.buffalo.edu
Cluster Computing
The u2 cluster is the major computational platform of the Center for Computational Research.
The login (front-end) and cluster machines run the Linux operating system.
A CCR account is required.
The login machine, u2.ccr.buffalo.edu, is accessible from the UB domain.
Compute nodes are not accessible from outside the cluster.
The cluster uses a traditional UNIX-style command line interface; a few basic commands are necessary.
Cluster Computing
The u2 cluster consists of 1024 dual-processor DELL compute nodes, 1024 dual quad-core DELL nodes, and 1024 dual quad-core IBM nodes.
The compute nodes have Intel Xeon processors.
Most of the cluster machines are 3.2 GHz with 2 GB of memory. There are 64 compute nodes with 4 GB of memory and 96 with 8 GB.
All nodes are connected to a gigabit Ethernet network. 256 nodes are connected with InfiniBand and 512 nodes are connected with Myrinet, both high-speed fibre networks.
Data Storage
Home directory: /user/UBITusername/u2
The default user quota for a home directory is 2GB.
• Users requiring more space should contact the CCR staff.
Data in home directories is backed up.
• CCR retains data backups for one month.
Projects directories: /projects[1-3]/research-group-name
The default group quota for a projects directory is 100GB.
Data in projects directories is NOT backed up by default.
Scratch spaces are available for TEMPORARY use by jobs running on the cluster.
/panasas/scratch provides > 100TB of space.
• Accessible from the front-end and all compute nodes.
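As a quick illustration, the commands below stage an input file from a home directory into the panasas scratch space and clean up afterwards; the directory and file names are only examples, not required locations.
# create a personal working area in the shared scratch space (name is illustrative)
mkdir -p /panasas/scratch/UBITusername/bluntbody
# copy the input file from the home directory
cp ~/BluntBody.def /panasas/scratch/UBITusername/bluntbody/
# ... run the job from the scratch directory ...
# copy results back and remove the temporary area when finished
cp /panasas/scratch/UBITusername/bluntbody/*.res ~/
rm -R /panasas/scratch/UBITusername/bluntbody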
Accessing the U2 Cluster
The u2 cluster front-end is accessible from the UB domain (.buffalo.edu).
Use the VPN for access from outside the University.
The UBIT website provides a VPN client for Linux, Mac, and Windows machines.
• http://ubit.buffalo.edu/software
The VPN client connects the machine to the UB domain, from which u2 can be accessed.
Telnet access is not permitted.
Login and X-Display
LINUX/UNIX/Mac workstation:
• ssh u2.ccr.buffalo.edu
• ssh UBITusername@u2.ccr.buffalo.edu
The -X or -Y flags enable X-Display forwarding from u2 to the workstation.
• ssh -X u2.ccr.buffalo.edu
Windows workstation:
• Download and install the X-Win32 client from ubit.buffalo.edu/software/win/XWin32
• Use the configuration tool to set up ssh to u2.
• Set the command to xterm -ls
Logout: type logout or exit in the login window.
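A quick way to confirm that X forwarding works after login, assuming a standard X client such as xclock is installed on u2:
ssh -Y UBITusername@u2.ccr.buffalo.edu   # log in with trusted X forwarding
xclock &                                 # a clock window should appear on the local display
echo $DISPLAY                            # should show a forwarded display, e.g. localhost:10.0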
File Transfer
FileZilla is available for Windows, Linux and Mac machines. Check the UBIT software pages.
• This is a drag-and-drop graphical interface.
• Please use port 22 for secure file transfer.
Command line file transfer for Unix:
• sftp u2.ccr.buffalo.edu
• put, get, mput and mget are used to upload and download data files.
• The wildcard "*" can be used with mput and mget.
• scp filename u2.ccr.buffalo.edu:filename
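A short sketch of a command-line transfer session; the file names are only examples.
# upload a definition file to the cluster and download results with sftp
sftp UBITusername@u2.ccr.buffalo.edu
sftp> put BluntBody.def          # upload a single file
sftp> mget *.res                 # download all result files (wildcard with mget)
sftp> quit
# the same upload with scp, naming the remote file explicitly
scp BluntBody.def UBITusername@u2.ccr.buffalo.edu:BluntBody.def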
Basic Unix Commands
Using the U2 cluster requires knowledge of some basic UNIX commands.
The CCR Reference Card provides a list of the basic commands.
• The Reference Card is a PDF linked from www.ccr.buffalo.edu/display/WEB/Unix+Commands
These will get you started; you can learn more commands as you go.
List files:
• ls
• ls -la (long listing that shows all files)
Basic Unix Commands
View files:
• cat filename (displays file to screen)
• more filename (displays file with page breaks)
Change directory:
• cd directory-pathname
• cd (go to home directory)
• cd .. (go back one level)
Show directory pathname:
• pwd (shows current directory pathname)
Copy files and directories:
• cp old-file new-file
• cp -R old-directory new-directory
Basic Unix Commands
Move files and directories:
• mv old-file new-file
• mv old-directory new-directory
• NOTE: a move is a copy and remove.
Create a directory:
• mkdir new-directory
Remove files and directories:
• rm filename
• rm -R directory (removes directory and contents)
• rmdir directory (directory must be empty)
• Note: be careful when using the wildcard "*"
More about a command: man command
Basic Unix Commands
View file and directory permissions using the ls command:
• ls -l
Permissions have the following format:
• -rwxrwxrwx ... filename
– the three rwx groups apply to user, group and other
Change permissions of files and directories using the chmod command:
• chmod takes a target (u, g, o for user, group, other), + or - to add or remove a permission, and the permission itself (r, w, x for read, write, execute).
• chmod g+r filename
– adds read permission for the group
• chmod -R o-rwx directory-name
– removes read, write and execute permissions for other from the directory and its contents
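A small worked example; the file name, group and listing shown are illustrative only.
ls -l run.sh
#   -rwxr----- 1 UBITusername mygroup 512 Jan 1 12:00 run.sh   (user: rwx, group: r, other: none)
chmod g+x run.sh          # give the group execute permission as well
ls -l run.sh
#   -rwxr-x--- 1 UBITusername mygroup 512 Jan 1 12:00 run.sh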
Basic Unix Commands
There are a number of editors available: emacs, vi, nano, pico
• Emacs will default to a GUI if logged in with X-DISPLAY enabled.
Files edited on Windows PCs may have embedded characters that can create runtime problems.
Check the type of a file:
• file filename
Convert a DOS file to Unix; this removes the Windows/DOS characters:
• dos2unix -n old-file new-file
Modules
Modules are available to set variables and paths for application software, communication protocols, compilers and numerical libraries.
module avail (lists all available modules)
module load module-name (loads a module)
• Updates the PATH variable with the path of the application.
module unload module-name (unloads a module)
• Removes the path of the application from the PATH variable.
module list (lists loaded modules)
module show module-name
• Shows what the module sets.
Modules can be loaded in the user's .bashrc file.
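A short example session using the CFX module named later in this document (cfx/ub-121); the grep filter is just a convenience for finding CFX modules in the listing.
# see which CFX modules are installed (module avail writes its listing to stderr)
module avail 2>&1 | grep -i cfx
# load the CFX module used in these exercises and confirm it
module load cfx/ub-121
module list
# inspect exactly what the module changes (PATH and other variables)
module show cfx/ub-121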
Setup a test case
Create a subdirectory:
• mkdir bluntbody
Change directory to bluntbody:
• cd bluntbody
Copy the BluntBody.def file to the bluntbody directory:
• cp /util/cfx-ub/CFX110/ansys_inc/v110/CFX/examples/BluntBody.def BluntBody.def
• ls -l
Start an interactive job
qsub -I -X -lnodes=1:GM:ppn=2 -lwalltime=01:00:00 -q debug
• -I requests an "interactive" job.
• -X enables the X display.
• -lnodes=1:GM:ppn=2 asks for 1 node with 2 processors per node (2 cores total) on the Myrinet-connected (GM) 2-core nodes.
• -lwalltime=01:00:00 requests 1 hour.
• -q debug requests the debug queue, used for testing purposes. The maximum wall time for this queue is 1 hour.
The above request is for a quick test job.
Queue details can be found here: http://www.ccr.buffalo.edu/display/WEB/U2+queues
Load CFX module
Load the CFX module:
• module load cfx/ub-121
Launch CFX:
• cfx5
The CFX solver GUI will display on the workstation.
Launch detached from the command line:
• cfx5 &
Click on CFX-Solver Manager 12.1
CFX initialization (MAE 505 only) • The first time you start CFX on the cluster: • Go to tools-> ANSYS Client Licensing Utility • Click “Set License Preferences”, select 12.2 • Select “Use Academic License” and apply
Running CFX: parallel
Click on File and select Define Run.
Select the BluntBody.def file.
Type of Run: Full
Run Mode: HP MPI Local Parallel
Start "top" to monitor the memory and CPU.
In another window on u2, start "jobvis" to monitor the performance on each core.
Click Start Run in the CFX Define Run window.
After the solver is done, click No for post-processing.
HP MPI Local Parallel is used when running on one multiprocessor machine. To use one core, you could have chosen "Serial".
Running CFX: parallel
Here, 349498 is the Job ID, given at the start of the interactive job.
Notice that 2 cores are being used, as expected. The other slots are for communication over the network.
Running CFX: distributed parallel
Start from a fresh login to the cluster, request an interactive job on 2 nodes with 2 cores each, load the CFX module, and launch CFX:
• qsub -I -X -lnodes=2:GM:ppn=2 -lwalltime=01:00:00 -q debug
• module load cfx/ub-121
• cfx5 &
The interactive job will log in on the first compute node in the node list; this is referred to as the "head node".
Open another window and log into the cluster. Type:
• qstat -an -u username
You can see information on your job, including the job id. Type:
• jobvis (put job id here)
Click on CFX-Solver Manager 12.1.
Select the .def file.
Type of Run: Full
Run mode: HP MPI Distributed Parallel
HP MPI Distributed Parallel is used with more than one compute node.
Running CFX: distributed parallel
Example: Launch CFX and add a partition.
Running CFX: distributed parallel
Example: Add a second compute node and add a partition for that node.
Running CFX: distributed parallel
Start the run and monitor it with jobvis. Notice that it now runs on 4 cores across the two nodes.
This job uses the Myrinet network for the MPI communication; Ethernet is used for the filesystem I/O and the scheduler.
Running on the U2 Cluster
The compute machines are assigned to user jobs by the PBS (Portable Batch System) scheduler.
The qsub command submits jobs to the scheduler.
Interactive jobs depend on the connection from the workstation to u2. If the workstation is shut down or disconnected from the network, then the job will terminate.
PBS Execution Model
PBS executes a login as the user on the master host, and then proceeds according to one of two modes, depending on how the user requested that the job be run.
Script - the user executes the command: qsub [options] job-script
• where job-script is a standard UNIX shell script containing some PBS directives along with the commands that the user wishes to run (examples later).
Interactive - the user executes the command: qsub [options] -I
• the job is run "interactively," in the sense that standard output and standard error are connected to the terminal session of the initiating qsub command. Note that the job is still scheduled and run like any other batch job (so you can end up waiting a while for your prompt to come back "inside" your batch job).
Execution Model Schematic
(Flowchart: qsub myscript sends the job to pbs_server; the scheduler decides whether to run it. When it runs, $PBS_NODEFILE lists the assigned nodes (node1 ... nodeN), and on the first node a prologue, the $USER login, myscript, and an epilogue are executed.)
PBS Queues
The PBS queues defined for the U2 cluster are CCR and debug.
The CCR queue is the default.
The debug queue can be requested by the user. It is used to test applications.
qstat -q
• Shows the queues defined for the scheduler and their availability.
qmgr
• Shows details of the queues and the scheduler.
PBS Queues
Do you even need to specify a queue? You probably don't need (and may not even be able) to specify a specific queue destination: most of our PBS servers use a routing queue.
The exception is the debug queue on u2, which requires a direct submission.
The debug queue has a certain number of compute nodes, usually 32, set aside for its use during peak times. The queue is always available, but its nodes are dedicated Monday through Friday, from 9:00am to 5:00pm.
Use -q debug to specify the debug queue on the u2 cluster.
Batch Scripts - Resources
The "-l" options are used to request resources for a job. They are used in batch scripts and interactive jobs.
-l walltime=01:00:00
• Wall-clock limit of the batch job. Requests a 1-hour wall-clock time limit.
• If the job does not complete before this time limit, it will be terminated by the scheduler and all of its tasks will be removed from the nodes.
-l nodes=8:ppn=2
• Number of cluster nodes, with optional processors per node. Requests 8 nodes with 2 processors per node.
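In a batch script the same resource requests are written as #PBS directives at the top of the script. A minimal sketch, where the queue, resource values and job name are only examples:
#!/bin/bash
#PBS -q debug                 # queue (omit to use the default routing queue)
#PBS -l walltime=01:00:00     # 1-hour wall-clock limit
#PBS -l nodes=8:ppn=2         # 8 nodes, 2 processors per node
#PBS -N testjob               # job name (illustrative)
# commands to run go here, for example:
hostname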
Environment Variables
$PBS_O_WORKDIR - the directory from which the job was submitted.
By default, a PBS job starts from the user's $HOME directory. You can change this default in your .cshrc or .bashrc file.
Add the following to your .cshrc file:
  if ( $?PBS_ENVIRONMENT ) then
    cd $PBS_O_WORKDIR
  endif
or this to your .bashrc file:
  if [ -n "$PBS_ENVIRONMENT" ]; then
    cd $PBS_O_WORKDIR
  fi
In practice, many users simply change directory to $PBS_O_WORKDIR in their scripts.
Environment Variables
$PBSTMPDIR - reserved scratch space, local to each host (this is a CCR definition, not part of the PBS package).
• This scratch directory is created in /scratch and is unique to the job.
• $PBSTMPDIR is created on every compute node running a particular job.
$PBS_NODEFILE - the name of the file containing the list of nodes assigned to the current batch job.
• Used to allocate parallel tasks in a cluster environment.
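A short sketch of how these variables are typically used inside a job script; the file names are illustrative.
# count the processors assigned to the job from the node file
NPROCS=$(wc -l < $PBS_NODEFILE)
echo "Running on $NPROCS processors:"
cat $PBS_NODEFILE
# use the per-job local scratch directory for temporary files
cp $PBS_O_WORKDIR/BluntBody.def $PBSTMPDIR
cd $PBSTMPDIR
# ... run the solver here ...
# copy results back before the job ends (temporary space should not be relied on after the job)
cp *.res $PBS_O_WORKDIR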
Sample Script - parallel 1x2
Example of a PBS script: /util/pbs-scripts/pbsCFX-1x2
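The referenced script is not reproduced here; the sketch below shows roughly what a 1-node x 2-core local-parallel CFX submission might look like. The cfx5solve flags (-def, -par-local, -partition, -start-method), the module name and the assumption that the module command is available in batch shells are taken from this document and common CFX usage; check /util/pbs-scripts/pbsCFX-1x2 and cfx5solve -help for the authoritative version.
#!/bin/bash
#PBS -q debug
#PBS -l nodes=1:GM:ppn=2
#PBS -l walltime=01:00:00
#PBS -N cfx-1x2

# start in the directory the job was submitted from
cd $PBS_O_WORKDIR

# set up the CFX environment (assumes the module command works in batch shells)
module load cfx/ub-121

# run the solver on 2 local partitions (HP MPI Local Parallel)
cfx5solve -def BluntBody.def -par-local -partition 2 -start-method "HP MPI Local Parallel"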
Sample Script - distributed parallel 2x2
Example of a PBS script: /util/pbs-scripts/pbsCFX-2x2
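Again, only a sketch of what a 2-node x 2-core distributed run might look like; the way the host list is built from $PBS_NODEFILE and the cfx5solve flags (-par-dist, -start-method) are assumptions, so compare with the actual /util/pbs-scripts/pbsCFX-2x2 script.
#!/bin/bash
#PBS -q debug
#PBS -l nodes=2:GM:ppn=2
#PBS -l walltime=01:00:00
#PBS -N cfx-2x2

cd $PBS_O_WORKDIR
module load cfx/ub-121

# build a comma-separated host list from the nodes PBS assigned, e.g. node1,node1,node2,node2
HOSTLIST=$(paste -sd, $PBS_NODEFILE)

# run the solver distributed across the assigned nodes (HP MPI Distributed Parallel)
cfx5solve -def BluntBody.def -par-dist "$HOSTLIST" -start-method "HP MPI Distributed Parallel"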
Submitting a Batch Job
Navigate to a directory where the PBS script and your .def file reside; this will be $PBS_O_WORKDIR.
• qsub pbsCFX-2x2
• qstat -an -u username
• jobvis "jobid"
When finished, the output files will be in $PBS_O_WORKDIR.