Computational Science Community Wiki

Mace01

Mace01 is an HPC cluster owned by the School of MACE; the system runs Scientific Linux v4 (a clone of Red Hat Enterprise Linux). The primary role of this system is to run batch jobs in the traditional way; a limited number of jobs can also be run interactively, i.e., using an application's GUI. Computational jobs can also be run on the system via Condor (which is suitable for large numbers of small, short jobs).

Further Help

If further help with Mace01 is required, users should email rcs@manchester.ac.uk, stating on the Subject: line that the issue relates to Mace01, giving their username and, of course, describing the problem itself.

What is the system suitable for?

The system is suitable for: serial jobs (including Matlab and Mathematica jobs); small MPI-based parallel jobs of up to perhaps 8 or 16 processes, with scalability depending on the application being run; parameter sweeps (using Condor); and long-running jobs (e.g., up to a month, or longer by arrangement).

What is the system not suitable for?

The system is not suitable for large-scale parallel computations.

Available Software

Currently-available applications include:

Compilers:

Libraries:

Hardware

The system comprises:

Getting Access to Mace01

University of Manchester staff and postgraduate students may apply for access to Mace01 by emailing rcs@manchester.ac.uk.

Logging into Mace01; File Transfer; X-Windows

There are four things to consider when using Mace01:

Each is outlined below. For those new to SSH/SCP/SFTP and X-Windows (aka X11), more details may be found in Introductory Notes for New Users of RCS HPC Systems.

Network/VPN/Firewall Issues

Not all computers can connect to Mace01:

Login Using SSH

Once network issues have been sorted out, as described above, you should be able to log in using SSH; remember to start the VPN if required.

Linux users will be able to log in using OpenSSH, which comes with all popular distros, by typing

    ssh -l <username> mace01.mace.manchester.ac.uk
        # ...replace <username> with your username, e.g., mpciish2...

at the command-line.
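To avoid retyping the full hostname, OpenSSH users can define a host alias in their client configuration file. The following is a minimal sketch, not a required step: the alias name mace01 and the helper name add_mace01_alias are illustrative choices, and the second argument exists only so that the function can be tried on a scratch file first.

```shell
# Append a host alias for Mace01 to an OpenSSH client config file.
# The alias name "mace01" is an arbitrary choice made for this example.
add_mace01_alias() {
    local user="$1"                          # your Mace01 username
    local config="${2:-$HOME/.ssh/config}"   # OpenSSH per-user config file
    mkdir -p "$(dirname "$config")"
    cat >> "$config" <<EOF
Host mace01
    HostName mace01.mace.manchester.ac.uk
    User $user
EOF
    chmod 600 "$config"
}

# Usage: add_mace01_alias <username>
# Afterwards "ssh mace01" should be equivalent to the full command above.
```

The same alias is then also understood by scp and sftp.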

MS Windows users should download and install PuTTY, an SSH client.

File Transfer

It is likely that you will wish to upload files to Mace01, or download them from Mace01 to your desktop/laptop. Linux users can do this by using either scp or sftp, from the OpenSSH utilities suite (which comes with all popular distros), for example:

    scp myfile.txt <username>@mace01.mace.manchester.ac.uk:
        # ...to copy a file to Mace01 --- don't forget the ":" at the end...

    scp <username>@mace01.mace.manchester.ac.uk:results.out results.copy
        # ...to copy a file from Mace01...
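Whole directories can be copied in the same way using scp's -r (recursive) option. The sketch below wraps the two directions in small helper functions; the function names and the DRY_RUN switch are inventions of this example, included so that the commands can be inspected before anything is transferred.

```shell
MACE01_HOST=mace01.mace.manchester.ac.uk

# Run the given command, or just print it when DRY_RUN=1.
run_or_print() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy a local directory, and everything in it, to your Mace01 home directory.
upload_dir() {      # usage: upload_dir <username> <local-dir>
    run_or_print scp -r "$2" "$1@$MACE01_HOST:"
}

# Copy a directory from your Mace01 home directory into the current directory.
download_dir() {    # usage: download_dir <username> <remote-dir>
    run_or_print scp -r "$1@$MACE01_HOST:$2" .
}
```

For example, running DRY_RUN=1 upload_dir <username> results prints the scp command that would be used, without contacting Mace01.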

MS Windows users should download and install WinSCP, a GUI-based file-transfer client.

Using X-Windows: GUI-Based Applications

Using SSH on its own will enable you to log in to Mace01 and use the command-line. If you want to use GUI-based applications, such as gedit, a Notepad-like editor, or you want to use Matlab interactively (i.e., use the GUI), then you need to run an X11 server on your local desktop/laptop machine and enable X11-tunnelling in your SSH connection.

All popular Linux distros run an X11-based desktop (e.g., GNOME, KDE) so the only remaining step is to enable X11-tunnelling when logging in:

    ssh -X -l <username> mace01.mace.manchester.ac.uk
        # ...that's an UPPERcase X...
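A quick way to confirm that tunnelling is working is to check, on Mace01, that the DISPLAY environment variable has been set; the SSH server sets it when X11 forwarding has been established. The helper below is a small sketch and the name check_x11 is not something provided by the system.

```shell
# Run this on Mace01 after logging in with "ssh -X".
# DISPLAY is set by the SSH server when X11 forwarding is active.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding looks active (DISPLAY=$DISPLAY)"
    else
        echo "DISPLAY is not set; log out and reconnect using ssh -X" >&2
        return 1
    fi
}
```

If the check succeeds, starting a GUI application such as gedit from the same session should open its window on your local desktop.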

MS Windows users will need to download and install an X11 server. The two obvious options are eXceed, for which the University has a site licence, and Xming, which is free to download and install. When connecting:

  1. Start the VPN, if necessary.
  2. Start eXceed or Xming.
  3. Start PuTTY, being careful to enable X11 tunnelling: click on "SSH" on the left-hand side, then ensure the X11 tunnelling box is checked, before starting a connection to Mace01.

Where Everything Is: Filesystems and Backups

Backups

As of December 2009 there are no backups of the system. It is hoped that backups of home-directories will be provided by IT Services in the near future. In the meantime, home-directories are mirrored to another system approximately once a week.

Applications, Compilers and Libraries

Home Directories

Users' home-directories are in /home-fs1-b1, /home-fs1-g1, /home-fs2-b1 and /home-fs2-g1. (Several further disks exist on the fileservers, but these will not be used to provide more home-directory space as the available capacity in IT Services' backup system will be limited even when the new hardware is in production.)

Scratch Space

A total of about 2.5 TB of scratch space is available within /scratch-11 ... /scratch-62. This is available to all users.
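Because the scratch areas are shared by all users, it can help to keep your files in a directory of your own. The following is a minimal sketch, assuming that users are allowed to create directories directly under a scratch area; the per-user layout and the function name scratch_dir are illustrative, and /scratch-11 is used as the default only because it is one of the areas named above.

```shell
# Create (if needed) and print a personal working directory under a scratch area.
# Assumption: users may create directories directly under the scratch area.
scratch_dir() {
    local base="${1:-/scratch-11}"          # scratch area to use
    local dir="$base/${USER:-$(id -un)}"    # one directory per user
    mkdir -p "$dir" && echo "$dir"
}

# Usage: cd "$(scratch_dir)"   or   cd "$(scratch_dir /scratch-62)"
```

Running df -h on the chosen scratch area shows how much free space it currently has.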

Running Computational Jobs on Mace01

All CPU-intensive jobs (e.g., those lasting more than a few minutes) should be run on the compute nodes, not the login node. This is done by submitting jobs to the batch/queue system, SGE. Any jobs found running on the login node (lasting more than a few minutes) will be killed without warning.
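As a generic illustration of what a batch job looks like, the sketch below writes a very small SGE job script and lists the standard SGE commands for submitting and monitoring it. The file name myjob.sh and the job name myjob are placeholders, and no queue is requested because queue names are site-specific.

```shell
# Write a minimal SGE job script; "myjob" is a placeholder name.
# Lines beginning "#$" are read by qsub as options.
cat > myjob.sh <<'EOF'
#!/bin/bash
# Use bash to run the job.
#$ -S /bin/bash
# Run the job from the directory it was submitted from.
#$ -cwd
# Job name shown by qstat.
#$ -N myjob

# JOB_ID is set by SGE; the default allows a test run outside the batch system.
echo "Job ${JOB_ID:-none} running on $(uname -n)"
EOF

# On Mace01, submit and monitor the job with the standard SGE commands:
#   qsub myjob.sh      # submit; qsub reports the job ID
#   qstat              # list your queued and running jobs
#   qdel <job-id>      # remove a job that is no longer wanted
```

By default SGE writes the job's output to files named after the job and its ID (myjob.o<job-id> and myjob.e<job-id>) in the working directory.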

Submitting jobs to the batch/queue system on Mace01 is straightforward. Those not familiar with batch systems, and in particular with SGE, should read the material available via at least the first of these links:

Information specific to Mace01 is given below.

Setting Up Your Environment

To make use of SGE on Mace01 you will need to have a few environment variables set, such as your PATH; unless you have changed your environment these will be set for you. If you find that you are unable to access SGE commands for whatever reason, entering

    source /usr/local/sge60u8/default/common/settings.sh

should fix the problem.
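If this happens regularly, the check can be automated. The function below is a sketch only: it sources the settings file named above when qsub cannot be found on the PATH; the function name ensure_sge and the SGE_SETTINGS override are conveniences invented for this example.

```shell
# Location of the SGE settings file (the path given above), overridable for testing.
SGE_SETTINGS="${SGE_SETTINGS:-/usr/local/sge60u8/default/common/settings.sh}"

# Load the SGE settings only if qsub is not already on the PATH.
ensure_sge() {
    if command -v qsub >/dev/null 2>&1; then
        return 0                     # SGE commands are already available
    fi
    if [ -r "$SGE_SETTINGS" ]; then
        . "$SGE_SETTINGS"            # same effect as the "source" command above
    else
        echo "cannot read $SGE_SETTINGS" >&2
        return 1
    fi
}
```

One option is to place the function in a shell start-up file such as ~/.bashrc and call ensure_sge from there.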

Available Queues and Parallel Environments

Several queues are available on Mace01:

Experimental Interactive Queues for GUI-Based Work

Occasionally, particularly for users new to the system, GUI-based interactive work may be desirable (for example with Fluent or Matlab). This can be done within the batch system — see examples below.

Example Batch Job Details

High Level Programming Languages:

Computational Fluid Dynamics:

Example Interactive Job Details

Using the Fortran Compilers on Mace01

The MACE Condor Pool

Mace01 uses SGE as its primary job control system; users wishing to submit Fluent and MPICH jobs, for example, should do so in the traditional way, using this system. However, Condor is also available: this is used to dynamically backfill space between SGE jobs and also to scavenge CPU cycles from desktop machines in the George Begg teaching cluster, which are also members of the pool. Users who wish to submit a large number of relatively small, short jobs (e.g., from a parameter sweep) should consider using Condor rather than SGE.
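To give an idea of what a parameter sweep looks like under Condor, the sketch below writes a submit description file that queues ten jobs, each receiving its own index (0 to 9) as an argument. The script name sweep.sh and the output file names are hypothetical, and any settings the MACE pool may require (for example, for transferring files to desktop machines) are not shown.

```shell
# Write a Condor submit description file for a 10-job parameter sweep.
# "sweep.sh" is a hypothetical user script taking the job index as its argument.
cat > sweep.sub <<'EOF'
universe   = vanilla
executable = sweep.sh
arguments  = $(Process)
output     = sweep.$(Process).out
error      = sweep.$(Process).err
log        = sweep.log
queue 10
EOF

# Submit and monitor with the standard Condor commands:
#   condor_submit sweep.sub
#   condor_q
```

Condor replaces $(Process) with the index of each job, so a single short file describes the whole sweep.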

OpenMPI

Open MPI, an implementation of version 2 of the Message Passing Interface (MPI-2), is installed on the system. Details, including information about compilation and the submission of jobs to the batch system (SGE), can be found on the dedicated page.
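For orientation only, the sketch below generates a generic SGE job script for an Open MPI program. It is not taken from the dedicated page: the parallel environment name is site-specific and must be supplied by the user, the helper name make_mpi_job is invented here, and the script assumes that Open MPI on Mace01 picks up the hosts allocated by SGE.

```shell
# Print an SGE job script for an Open MPI program.
#   $1 = parallel environment name (site-specific; see the dedicated page)
#   $2 = number of slots (MPI processes) to request
#   $3 = path to the MPI executable
make_mpi_job() {
    local pe="$1" slots="$2" prog="$3"
    cat <<EOF
#!/bin/bash
#\$ -S /bin/bash
#\$ -cwd
#\$ -pe $pe $slots
# NSLOTS is set by SGE to the number of slots granted to the job.
mpirun -np \$NSLOTS $prog
EOF
}

# Usage (on Mace01):
#   make_mpi_job <pe-name> 8 ./my_mpi_prog > mpi_job.sh
#   qsub mpi_job.sh
```

Here ./my_mpi_prog stands for a program compiled with Open MPI's wrapper compilers, such as mpicc or mpif90.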

N.B. OpenMPI should not be confused with OpenMP which is a shared-memory API, not an implementation of MPI.

RCS administer other computational facilities and can help users gain access to more. Please visit the main page for details.