Introduction to Running CFX on U2 - Introduction to the U2 Cluster Login and File Transfer
Introduction to Running CFX on U2 - Outline
Introduction to the U2 Cluster
• Getting Help
• Hardware Resources
• Software Resources
• Computing Environment
• Data Storage
Login and File Transfer
• UBVPN
• Login and Logout
• More about X-11 Display
• File Transfer
Unix Commands
• Short list of Basic Unix Commands
• Reference Card
Paths and Using Modules
Starting the CFX Solver
• Launching CFX
• Monitoring
Running CFX on the Cluster
• PBS Batch Scheduler
• Interactive Jobs
• Batch Jobs
Information and Getting Help
CCR uses an email problem ticket system. Send questions and descriptions of problems to ccr-help@ccr.buffalo.edu
The technical staff receives the email and responds to the user.
• Usually within one business day.
This system allows staff to monitor problems and contribute their expertise.
CCR website: http://www.ccr.buffalo.edu
Cluster Computing
The u2 cluster is the major computational platform of the Center for Computational Research.
The login (front-end) and cluster machines run the Linux operating system.
A CCR account is required.
The login machine, u2.ccr.buffalo.edu, is accessible from the UB domain.
Compute nodes are not accessible from outside the cluster.
The cluster uses a traditional UNIX-style command line interface; a few basic commands are necessary.
Cluster Computing
The u2 cluster consists of 1024 dual-processor DELL compute nodes, 1024 dual quad-core DELL nodes, and 1024 dual quad-core IBM nodes.
The compute nodes have Intel Xeon processors.
Most of the cluster machines are 3.2 GHz with 2 GB of memory. There are 64 compute nodes with 4 GB of memory and 96 with 8 GB.
All nodes are connected to a gigabit Ethernet network. 256 nodes are connected with InfiniBand and 512 nodes are connected with Myrinet, both high-speed fibre networks.
Data Storage
Home directory: /user/UBITusername/u2
The default user quota for a home directory is 2GB.
• Users requiring more space should contact the CCR staff.
Data in home directories is backed up.
• CCR retains data backups for one month.
Projects directories: /projects[1-3]/research-group-name
The default group quota for a projects directory is 100GB.
Data in projects directories is NOT backed up by default.
Scratch spaces are available for TEMPORARY use by jobs running on the cluster.
/panasas/scratch provides > 100TB of space.
• Accessible from the front-end and all compute nodes.
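As a quick illustration, the commands below stage an input file from a home directory into the panasas scratch space and clean up afterwards; the directory and file names are only examples, not required locations.
# create a personal working area in the shared scratch space (name is illustrative)
mkdir -p /panasas/scratch/UBITusername/bluntbody
# copy the input file from the home directory
cp ~/BluntBody.def /panasas/scratch/UBITusername/bluntbody/
# ... run the job from the scratch directory ...
# copy results back and remove the temporary area when finished
cp /panasas/scratch/UBITusername/bluntbody/*.res ~/
rm -R /panasas/scratch/UBITusername/bluntbody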
Accessing the U2 Cluster
The u2 cluster front-end is accessible from the UB domain (.buffalo.edu).
Use the VPN for access from outside the University.
The UBIT website provides a VPN client for Linux, Mac, and Windows machines.
• http://ubit.buffalo.edu/software
The VPN client connects the machine to the UB domain, from which u2 can be accessed.
Telnet access is not permitted.
Login and X-Display
LINUX/UNIX/Mac workstation:
• ssh u2.ccr.buffalo.edu
• ssh UBITusername@u2.ccr.buffalo.edu
The -X or -Y flags enable X-Display forwarding from u2 to the workstation.
• ssh -X u2.ccr.buffalo.edu
Windows workstation:
• Download and install the X-Win32 client from ubit.buffalo.edu/software/win/XWin32
• Use the configuration tool to set up ssh to u2.
• Set the command to xterm -ls
Logout: type logout or exit in the login window.
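A quick way to confirm that X forwarding works after login, assuming a standard X client such as xclock is installed on u2:
ssh -Y UBITusername@u2.ccr.buffalo.edu   # log in with trusted X forwarding
xclock &                                 # a clock window should appear on the local display
echo $DISPLAY                            # should show a forwarded display, e.g. localhost:10.0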
File Transfer
FileZilla is available for Windows, Linux and Mac machines. Check the UBIT software pages.
• This is a drag-and-drop graphical interface.
• Please use port 22 for secure file transfer.
Command line file transfer for Unix:
• sftp u2.ccr.buffalo.edu
• put, get, mput and mget are used to upload and download data files.
• The wildcard "*" can be used with mput and mget.
• scp filename u2.ccr.buffalo.edu:filename
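A short sketch of a command-line transfer session; the file names are only examples.
# upload a definition file to the cluster and download results with sftp
sftp UBITusername@u2.ccr.buffalo.edu
sftp> put BluntBody.def          # upload a single file
sftp> mget *.res                 # download all result files (wildcard with mget)
sftp> quit
# the same upload with scp, naming the remote file explicitly
scp BluntBody.def UBITusername@u2.ccr.buffalo.edu:BluntBody.def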
Basic Unix Commands
Using the U2 cluster requires knowledge of some basic UNIX commands.
The CCR Reference Card provides a list of the basic commands.
• The Reference Card is a PDF linked from www.ccr.buffalo.edu/display/WEB/Unix+Commands
These will get you started; you can learn more commands as you go.
List files:
• ls
• ls -la (long listing that shows all files)
Basic Unix Commands
View files:
• cat filename (displays file to screen)
• more filename (displays file with page breaks)
Change directory:
• cd directory-pathname
• cd (go to home directory)
• cd .. (go back one level)
Show directory pathname:
• pwd (shows current directory pathname)
Copy files and directories:
• cp old-file new-file
• cp -R old-directory new-directory
Basic Unix Commands
Move files and directories:
• mv old-file new-file
• mv old-directory new-directory
• NOTE: a move is a copy and remove.
Create a directory:
• mkdir new-directory
Remove files and directories:
• rm filename
• rm -R directory (removes directory and contents)
• rmdir directory (directory must be empty)
• Note: be careful when using the wildcard "*"
More about a command: man command
Basic Unix Commands
View file and directory permissions using the ls command:
• ls -l
Permissions have the following format:
• -rwxrwxrwx ... filename
– the three rwx groups apply to user, group and other
Change permissions of files and directories using the chmod command:
• chmod takes a target (u, g, o for user, group, other), + or - to add or remove a permission, and the permission itself (r, w, x for read, write, execute).
• chmod g+r filename
– adds read permission for the group
• chmod -R o-rwx directory-name
– removes read, write and execute permissions for other from the directory and its contents
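A small worked example; the file name, group and listing shown are illustrative only.
ls -l run.sh
#   -rwxr----- 1 UBITusername mygroup 512 Jan 1 12:00 run.sh   (user: rwx, group: r, other: none)
chmod g+x run.sh          # give the group execute permission as well
ls -l run.sh
#   -rwxr-x--- 1 UBITusername mygroup 512 Jan 1 12:00 run.sh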
Basic Unix Commands
There are a number of editors available: emacs, vi, nano, pico
• Emacs will default to a GUI if logged in with X-DISPLAY enabled.
Files edited on Windows PCs may have embedded characters that can create runtime problems.
Check the type of a file:
• file filename
Convert a DOS file to Unix; this removes the Windows/DOS characters:
• dos2unix -n old-file new-file
Modules
Modules are available to set variables and paths for application software, communication protocols, compilers and numerical libraries.
module avail (lists all available modules)
module load module-name (loads a module)
• Updates the PATH variable with the path of the application.
module unload module-name (unloads a module)
• Removes the path of the application from the PATH variable.
module list (lists loaded modules)
module show module-name
• Shows what the module sets.
Modules can be loaded in the user's .bashrc file.
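A short example session using the CFX module named later in this document (cfx/ub-121); the grep filter is just a convenience for finding CFX modules in the listing.
# see which CFX modules are installed (module avail writes its listing to stderr)
module avail 2>&1 | grep -i cfx
# load the CFX module used in these exercises and confirm it
module load cfx/ub-121
module list
# inspect exactly what the module changes (PATH and other variables)
module show cfx/ub-121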
Setup a test case
Create a subdirectory:
• mkdir bluntbody
Change directory to bluntbody:
• cd bluntbody
Copy the BluntBody.def file to the bluntbody directory:
• cp /util/cfx-ub/CFX110/ansys_inc/v110/CFX/examples/BluntBody.def BluntBody.def
• ls -l
Start an interactive job
qsub -I -X -lnodes=1:GM:ppn=2 -lwalltime=01:00:00 -q debug
• -I requests an "interactive" job.
• -X enables the X display.
• -lnodes=1:GM:ppn=2 asks for 1 node with 2 processors per node (2 cores total) on the Myrinet-connected (GM) 2-core nodes.
• -lwalltime=01:00:00 requests 1 hour.
• -q debug requests the debug queue, used for testing purposes. The maximum wall time for this queue is 1 hour.
The above request is for a quick test job.
Queue details can be found here: http://www.ccr.buffalo.edu/display/WEB/U2+queues
Load CFX module
Load the CFX module:
• module load cfx/ub-121
Launch CFX:
• cfx5
The CFX solver GUI will display on the workstation.
Launch detached from the command line:
• cfx5 &
Click on CFX-Solver Manager 12.1
CFX initialization (MAE 505 only) • The first time you start CFX on the cluster: • Go to tools-> ANSYS Client Licensing Utility • Click “Set License Preferences”, select 12.2 • Select “Use Academic License” and apply
Running CFX: parallel
Click on File and select Define Run.
Select the BluntBody.def file.
Type of Run: Full
Run Mode: HP MPI Local Parallel
Start "top" to monitor the memory and CPU.
In another window on u2, start "jobvis" to monitor the performance on each core.
Click Start Run in the CFX Define Run window.
After the solver is done, click No for post-processing.
HP MPI Local Parallel is used when running on one multiprocessor machine. To use one core, you could have chosen "Serial".
Running CFX: parallel
Here, 349498 is the Job ID, given at the start of the interactive job.
Notice that 2 cores are being used, as expected. The other slots are for communication over the network.
Running CFX: distributed parallel
Start from a fresh login to the cluster, request an interactive job on 2 nodes with 2 cores each, load the CFX module, and launch CFX:
• qsub -I -X -lnodes=2:GM:ppn=2 -lwalltime=01:00:00 -q debug
• module load cfx/ub-121
• cfx5 &
The interactive job will log in on the first compute node in the node list; this is referred to as the "head node".
Open another window and log into the cluster. Type:
• qstat -an -u username
You can see information on your job, including the job id. Type:
• jobvis (put job id here)
Click on CFX-Solver Manager 12.1.
Select the .def file.
Type of Run: Full
Run mode: HP MPI Distributed Parallel
HP MPI Distributed Parallel is used with more than one compute node.
Running CFX: distributed parallel
Example: Launch CFX and add a partition.
Running CFX: distributed parallel
Example: Add a second compute node and add a partition for that node.
Running CFX: distributed parallel
Start the run and monitor it with jobvis. Notice that it now runs on 4 cores across the two nodes.
This job uses the Myrinet network for the MPI communication; Ethernet is used for the filesystem I/O and the scheduler.
Running on the U2 Cluster
The compute machines are assigned to user jobs by the PBS (Portable Batch System) scheduler.
The qsub command submits jobs to the scheduler.
Interactive jobs depend on the connection from the workstation to u2. If the workstation is shut down or disconnected from the network, then the job will terminate.
PBS Execution Model
PBS executes a login as the user on the master host, and then proceeds according to one of two modes, depending on how the user requested that the job be run.
Script - the user executes the command: qsub [options] job-script
• where job-script is a standard UNIX shell script containing some PBS directives along with the commands that the user wishes to run (examples later).
Interactive - the user executes the command: qsub [options] -I
• the job is run "interactively," in the sense that standard output and standard error are connected to the terminal session of the initiating qsub command. Note that the job is still scheduled and run like any other batch job (so you can end up waiting a while for your prompt to come back "inside" your batch job).
Execution Model Schematic
(Flowchart: qsub myscript sends the job to pbs_server; the scheduler decides whether to run it. When it runs, $PBS_NODEFILE lists the assigned nodes (node1 ... nodeN), and on the first node a prologue, the $USER login, myscript, and an epilogue are executed.)
PBS Queues
The PBS queues defined for the U2 cluster are CCR and debug.
The CCR queue is the default.
The debug queue can be requested by the user. It is used to test applications.
qstat -q
• Shows the queues defined for the scheduler and their availability.
qmgr
• Shows details of the queues and the scheduler.
PBS Queues
Do you even need to specify a queue? You probably don't need (and may not even be able) to specify a specific queue destination: most of our PBS servers use a routing queue.
The exception is the debug queue on u2, which requires a direct submission.
The debug queue has a certain number of compute nodes, usually 32, set aside for its use during peak times. The queue is always available, but its nodes are dedicated Monday through Friday, from 9:00am to 5:00pm.
Use -q debug to specify the debug queue on the u2 cluster.
Batch Scripts - Resources
The "-l" options are used to request resources for a job. They are used in batch scripts and interactive jobs.
-l walltime=01:00:00
• Wall-clock limit of the batch job. Requests a 1-hour wall-clock time limit.
• If the job does not complete before this time limit, it will be terminated by the scheduler and all of its tasks will be removed from the nodes.
-l nodes=8:ppn=2
• Number of cluster nodes, with optional processors per node. Requests 8 nodes with 2 processors per node.
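In a batch script the same resource requests are written as #PBS directives at the top of the script. A minimal sketch, where the queue, resource values and job name are only examples:
#!/bin/bash
#PBS -q debug                 # queue (omit to use the default routing queue)
#PBS -l walltime=01:00:00     # 1-hour wall-clock limit
#PBS -l nodes=8:ppn=2         # 8 nodes, 2 processors per node
#PBS -N testjob               # job name (illustrative)
# commands to run go here, for example:
hostname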
Environment Variables
$PBS_O_WORKDIR - the directory from which the job was submitted.
By default, a PBS job starts from the user's $HOME directory. You can change this default in your .cshrc or .bashrc file.
Add the following to your .cshrc file:
  if ( $?PBS_ENVIRONMENT ) then
    cd $PBS_O_WORKDIR
  endif
or this to your .bashrc file:
  if [ -n "$PBS_ENVIRONMENT" ]; then
    cd $PBS_O_WORKDIR
  fi
In practice, many users simply change directory to $PBS_O_WORKDIR in their scripts.
Environment Variables
$PBSTMPDIR - reserved scratch space, local to each host (this is a CCR definition, not part of the PBS package).
• This scratch directory is created in /scratch and is unique to the job.
• $PBSTMPDIR is created on every compute node running a particular job.
$PBS_NODEFILE - the name of the file containing the list of nodes assigned to the current batch job.
• Used to allocate parallel tasks in a cluster environment.
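A short sketch of how these variables are typically used inside a job script; the file names are illustrative.
# count the processors assigned to the job from the node file
NPROCS=$(wc -l < $PBS_NODEFILE)
echo "Running on $NPROCS processors:"
cat $PBS_NODEFILE
# use the per-job local scratch directory for temporary files
cp $PBS_O_WORKDIR/BluntBody.def $PBSTMPDIR
cd $PBSTMPDIR
# ... run the solver here ...
# copy results back before the job ends (temporary space should not be relied on after the job)
cp *.res $PBS_O_WORKDIR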
Sample Script - parallel 1x2
Example of a PBS script: /util/pbs-scripts/pbsCFX-1x2
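The referenced script is not reproduced here; the sketch below shows roughly what a 1-node x 2-core local-parallel CFX submission might look like. The cfx5solve flags (-def, -par-local, -partition, -start-method), the module name and the assumption that the module command is available in batch shells are taken from this document and common CFX usage; check /util/pbs-scripts/pbsCFX-1x2 and cfx5solve -help for the authoritative version.
#!/bin/bash
#PBS -q debug
#PBS -l nodes=1:GM:ppn=2
#PBS -l walltime=01:00:00
#PBS -N cfx-1x2

# start in the directory the job was submitted from
cd $PBS_O_WORKDIR

# set up the CFX environment (assumes the module command works in batch shells)
module load cfx/ub-121

# run the solver on 2 local partitions (HP MPI Local Parallel)
cfx5solve -def BluntBody.def -par-local -partition 2 -start-method "HP MPI Local Parallel"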
Sample Script - distributed parallel 2x2
Example of a PBS script: /util/pbs-scripts/pbsCFX-2x2
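Again, only a sketch of what a 2-node x 2-core distributed run might look like; the way the host list is built from $PBS_NODEFILE and the cfx5solve flags (-par-dist, -start-method) are assumptions, so compare with the actual /util/pbs-scripts/pbsCFX-2x2 script.
#!/bin/bash
#PBS -q debug
#PBS -l nodes=2:GM:ppn=2
#PBS -l walltime=01:00:00
#PBS -N cfx-2x2

cd $PBS_O_WORKDIR
module load cfx/ub-121

# build a comma-separated host list from the nodes PBS assigned, e.g. node1,node1,node2,node2
HOSTLIST=$(paste -sd, $PBS_NODEFILE)

# run the solver distributed across the assigned nodes (HP MPI Distributed Parallel)
cfx5solve -def BluntBody.def -par-dist "$HOSTLIST" -start-method "HP MPI Distributed Parallel"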
Submitting a Batch Job
Navigate to a directory where the PBS script and your .def file reside; this will be $PBS_O_WORKDIR.
• qsub pbsCFX-2x2
• qstat -an -u username
• jobvis "jobid"
When finished, the output files will be in $PBS_O_WORKDIR.