Star-CD on Mace01
Two builds of Star-CD are installed: one in /software/starcd_402_001/, to which the sym-link /software/starcd_mpich points, and one in /software/starcd_402_001_lam/, to which the sym-link /software/starcd_lam points. (Should a newer version of Star-CD be installed, the sym-links will be updated to point to it.)
All Star-CD jobs run on Mace01 must be submitted to the batch/queue system, SGE.
Serial jobs should be submitted to the serial.q queue.
For parallel jobs, Star-CD uses MPI and ships with its own built-in MPI implementations. Parallel Star-CD jobs should be submitted to one of the parallel queues (parallel-R2.q or parallel-R5.q, described below) and must use the custom-built starcd.pe SGE parallel environment; see below for examples.
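The queue and parallel-environment choice can also be made on the qsub command line rather than inside the job script. The following is only a sketch: the function name submit_starcd_parallel is hypothetical, and the queue names are the ones given later on this page (parallel-R2.q and parallel-R5.q). It refuses any other queue and then calls qsub with the standard -q and -pe options.

```shell
#!/bin/bash
# Sketch only: submit_starcd_parallel is a hypothetical helper, not part of SGE or Star-CD.
# Usage: submit_starcd_parallel <queue> <slots> <jobscript>
submit_starcd_parallel() {
    local queue="$1" slots="$2" script="$3"

    # Only the two queues named on this page for Star-CD are accepted.
    case "$queue" in
        parallel-R2.q|parallel-R5.q) ;;
        *) echo "error: queue must be parallel-R2.q or parallel-R5.q" >&2; return 1 ;;
    esac

    # The slot count must be a positive integer.
    case "$slots" in
        ''|*[!0-9]*) echo "error: slots must be a positive integer" >&2; return 1 ;;
    esac
    if [ "$slots" -lt 1 ]; then
        echo "error: slots must be a positive integer" >&2
        return 1
    fi

    # -q selects the queue; -pe selects the parallel environment and slot count.
    qsub -q "$queue" -pe starcd.pe "$slots" "$script"
}
```

For example, "submit_starcd_parallel parallel-R5.q 4 myjob.sh" would request four slots on parallel-R5.q, where myjob.sh is a placeholder for your own job script.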
Star-CD uses SSH to start MPI-related processes. Therefore you will need to ensure you have promptless, passwordless SSH access across the Mace01 cluster to run parallel Star-CD jobs. This is done using an SSH key and an appropriate known_hosts file.
New users will have the required SSH configuration set up for them automatically. Established users may not have the required configuration. If in doubt, do not hesitate to ask the system-administrator for help.
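For established users who want to check or create the key themselves, the following is a minimal sketch and not the administrator's actual procedure. It assumes that home directories are shared across all Mace01 nodes (so one authorized_keys file serves every node) and that an RSA key named id_rsa is acceptable; the function name setup_cluster_ssh_key is hypothetical.

```shell
#!/bin/bash
# Sketch only: assumes a home directory shared by all cluster nodes.
# Usage: setup_cluster_ssh_key [ssh_dir]   (ssh_dir defaults to ~/.ssh)
setup_cluster_ssh_key() {
    local ssh_dir="${1:-$HOME/.ssh}"
    local key="$ssh_dir/id_rsa"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"

    # Create a key with an empty passphrase only if none exists yet.
    if [ ! -f "$key" ]; then
        ssh-keygen -q -t rsa -N "" -f "$key"
    fi

    # Authorise the public key for logins to this account, without duplicating it.
    touch "$ssh_dir/authorized_keys"
    if ! grep -qxF "$(cat "$key.pub")" "$ssh_dir/authorized_keys"; then
        cat "$key.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}
```

After running setup_cluster_ssh_key, a command such as "ssh -o BatchMode=yes NODE true" (with NODE replaced by a real compute-node name) should return without any prompt. If it fails, the key or the known_hosts entry for that node is probably missing.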
MPICH vs LAM
Star-CD comes with a choice of MPI implementations to use. On Mace01, problems have been experienced with MPICH, which was initially chosen, when running jobs with more than two processes (i.e., using more than one compute node). Initial tests indicate that using LAM may lead to fewer problems — hopefully none.
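Switching between the two MPI builds amounts to sourcing a different setstar file. The sketch below builds the path from the sym-links described at the top of this page, so that a job script keeps working after an upgrade. It assumes that etc/setstar is reachable through the sym-links in the same way as through the versioned directories; the function name starcd_setstar_path is hypothetical.

```shell
#!/bin/bash
# Sketch only: starcd_setstar_path is a hypothetical helper.
# Prints the setstar path for the chosen MPI build ("mpich" or "lam").
starcd_setstar_path() {
    case "$1" in
        mpich|lam)
            echo "/software/starcd_$1/etc/setstar"
            ;;
        *)
            echo "usage: starcd_setstar_path mpich|lam" >&2
            return 1
            ;;
    esac
}
```

In a job script this could be used as: . "$(starcd_setstar_path lam)"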
Submitting a Parallel Star-CD Job to the Batch System, SGE
Star-CD jobs should be submitted to either the parallel-R2.q queue or the parallel-R5.q queue. (The remaining parallel queue, parallel-R4.q, is reserved for software which can take advantage of the dedicated MPI network on Rack 4.)
This is an experimental qsub script which depends on an experimental SGE parallel environment. It has been tested only by the system-administrator (2009/Feb/20).
- Comment out either MPICH or LAM.
- Likewise, choose one of the two methods of launching starcd; these should be equivalent.
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -q parallel-R2.q
# ...or parallel-R5.q...
#$ -pe starcd.pe 4
# ...this is a custom-built SGE parallel environment...

# ...licence server setting; the value shown here is a placeholder...
export LM_LICENSE_FILE=firstname.lastname@example.org
export STARINI=Default

MACHINEFILE="machinefile.$JOB_ID"

# ...choose EITHER...
#
# . /software/starcd_402_001/etc/setstar
#
# ...OR...
#
. /software/starcd_402_001_lam/etc/setstar

#
# ...use EITHER starcd.pe-generated machinefile (from SGE's PE_HOSTFILE) :
#
star -dp -nodefile=$MACHINEFILE
#
# ...OR...use Star-CD auto-detection of SGE env :
#
#star -dp $PNP_JOBNODES

# ...exit_on_error is not defined in this script; it must be provided elsewhere...
exit_on_error $?
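
The start-up script behind starcd.pe is not shown on this page. As a rough illustration of how a machinefile might be derived from SGE's PE_HOSTFILE, here is a sketch under two assumptions: each PE_HOSTFILE line begins with a host name followed by the number of slots granted on it (the standard SGE layout), and the node file lists one host name per process. The function name pe_hostfile_to_machinefile is hypothetical, and the real starcd.pe may do this differently.

```shell
#!/bin/bash
# Sketch only: not the actual starcd.pe start script.
# Usage: pe_hostfile_to_machinefile <pe_hostfile>
# Prints each host name once per slot granted on that host.
pe_hostfile_to_machinefile() {
    local pe_hostfile="$1"
    # Field 1 is the host name, field 2 the slot count.
    awk '{ for (i = 0; i < $2; i++) print $1 }' "$pe_hostfile"
}
```

Inside a running job this could be called as: pe_hostfile_to_machinefile "$PE_HOSTFILE" > "machinefile.$JOB_ID"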