HPC
Overview
Oracle Cloud supports High Performance Computing and makes it easy to set up your own HPC cluster in the cloud. This tutorial is a basic introduction to get you started.
For a tutorial about SLURM on OCI with more background information, see: SLURM on OCI tutorial
You can find an alternative setup (tailored to deep learning and GPUs) here: GPU cluster
Before you get started
Consider whether you actually need High Performance Computing (HPC) for your work. An HPC cluster consists of multiple machines and uses a head node (here the bastion host) from which jobs are submitted to the cluster using a job scheduler (for example SLURM). If you have many jobs that need to run independently, then the setup described here will work well. A “real” HPC setup does more on top of this: there is a high-performance network between machines, which makes it possible to run jobs that span multiple machines (e.g. with MPI). This would be needed if you have a problem so large that a single machine wouldn’t be big enough. In this example we build a cluster without this advanced networking. Most people will not need an HPC cluster for their work and should use a single virtual machine instead, because it requires considerably less setup work and is easier to maintain.
Configure HPC cluster
Download the Terraform configuration from here as a zip file: https://github.com/oracle-quickstart/oci-hpc/releases/tag/v2.9.2
Make sure you have selected the geographic region where you would like to create the resources (it should be close to you for the best latency).
Then go to Stacks under Resource Manager:
In the List Scope drop-down menu, select your project compartment. Click Create Stack and upload the zip file as a Terraform configuration source.
Give your cluster a name, but leave the default options for the rest:
Check again that the cluster is being created in your compartment and then hit Next.
In Cluster configuration you need to add your public SSH key for the opc admin account. Make sure to set up your SSH keys first and create a public key.
In Headnode options you need to select an Availability Domain. It doesn’t matter which one you select; the options depend on the geographic region where you launch your HPC. You can keep the default headnode size, or you can select a different flavour:
In Compute node options you need to disable Use cluster network (this is for MPI and not required for most people; it requires special network hardware that’s not available in every region, so if you need MPI please get in touch and we can help you set this up). Select a compute node size that fits your problem. Drop the initial compute node count to 1, because we will scale the cluster using autoscaling.
In Autoscaling you should enable scheduler based autoscaling, monitor the autoscaling, and disable RDMA latency check if you are not using MPI.
For API authentication and Monitoring leave the defaults:
For Additional file system accept the defaults:
For Advanced bastion options, Advanced storage options and Network options you can accept the defaults:
For Software enable Install Spack package manager in addition to the defaults:
Then hit Next and on the next page scroll to the end and tick Run apply:
Then hit Create.
This will then create a custom HPC for your project (it will take a couple of minutes to complete).
Once everything is done, you will find the bastion IP (the “head node” or “login node”) under Outputs:
You can now ssh into the HPC as follows:
ssh opc@ipbastion
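If you connect frequently, a host alias in your local ~/.ssh/config can save some typing. This is just an optional convenience sketch; the alias name, IP address, and key path below are placeholders that you need to replace with your own values:

```
# ~/.ssh/config on your local machine (alias, IP, and key path are placeholders)
Host hpc-bastion
    HostName 203.0.113.10
    User opc
    IdentityFile ~/.ssh/id_rsa
```

With this in place, ssh hpc-bastion is enough to log in.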
Be aware that this “opc” account is the admin account with sudo access on the cluster and should not be used to perform analyses. It is better to create a user account to do the work in:
Once logged in with the opc account, you can create normal cluster users using the cluster command:
cluster user add test
These users can then log in using a password only and do not require an SSH key.
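For example, such a user can then log in to the bastion directly with their password; the user name below is the one created above, and the IP address is a placeholder for your bastion IP:

```console
$ ssh test@203.0.113.10
```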
There is shared file storage (whose size can also be configured in the stack settings) in /nfs/cluster.
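To confirm that the shared storage is mounted and to see how much space is available, you can run for example:

```console
$ df -h /nfs/cluster
```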
More information can be found here: https://github.com/oracle-quickstart/oci-hpc
Configuring node memory
When you first submit jobs using sbatch, if you followed the above setup you may find you receive the following error:
error: Memory specification can not be satisfied
This happens because the RealMemory for each node (i.e. the amount of memory each compute node may use) has not yet been specified and defaults to a very low value. To rectify this, first work out how much memory to allocate to each node by running scontrol show nodes and looking at FreeMem.
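For example, to filter the output down to the node names and memory fields (RealMemory and FreeMem are reported in megabytes):

```console
$ scontrol show nodes | grep -E "NodeName=|RealMemory="
```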
To change the RealMemory, you must edit the slurm configuration file (which can be found in /etc/slurm/slurm.conf). Inside it you will find several lines beginning NodeName=; these specify the settings for each node. To fix the error, add RealMemory=AMOUNT to each of these lines, where AMOUNT is the amount of memory (in megabytes) you wish to allow the node to use.
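As a sketch, a node definition might then look like the following; the node name, CPU count, and memory value are illustrative only and will differ on your cluster:

```
NodeName=compute-permanent-node-941 CPUs=36 RealMemory=180000 State=UNKNOWN
```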
Once you have done this, you must reconfigure slurm by running the following command:
sudo scontrol reconfigure
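To check that memory requests are now accepted, you can submit a trivial test job that asks for some memory; the 4G below is arbitrary and should be below the RealMemory you configured:

```console
$ sbatch --mem=4G --wrap="hostname"
$ squeue    # the job should now be accepted and scheduled instead of being rejected
```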
Configuring X11 forwarding
If you want to use graphical applications you need to install:
sudo yum install mesa-dri-drivers xorg-x11-server-Xorg xorg-x11-xauth xorg-x11-apps mesa-libGL xorg-x11-drv-nouveau.x86_64 -y
sudo vi /etc/ssh/sshd_config
and change the following settings to:
X11Forwarding yes
X11UseLocalhost no
then
sudo systemctl restart sshd
# or
sudo service sshd restart
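After sshd has been restarted, reconnect from your local machine with X11 forwarding enabled and check that a display is set:

```console
$ ssh -X opc@ipbastion
$ echo $DISPLAY    # should print a non-empty value when forwarding is working
```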
For full functionality, you may also need to add PrologFlags=X11 to your /etc/slurm/slurm.conf, along with enabling the following additional parameters in your /etc/ssh/sshd_config:
AllowAgentForwarding yes
AllowTcpForwarding yes
X11Forwarding yes
X11DisplayOffset 10
X11UseLocalhost no
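After editing sshd_config, restart sshd again (as above) and, if you like, verify the effective settings:

```console
$ sudo sshd -T | grep -iE "x11|forwarding"
```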
On your main node, restart slurm:
```console
sudo service slurmctld restart
```
And on your worker nodes:
sudo service slurmd restart
After you’ve updated slurm, you can confirm the Prolog setting has taken effect:
sudo scontrol reconfigure; sudo scontrol show config | grep PrologFlags
Also check that X11 works:
srun --x11 xeyes
Troubleshooting: Editing a deployed stack fails
This can have many reasons, but the first one to check is:
Error: 409-Conflict, The Instance Configuration ocid1.instanceconfiguration.oc1.phx.aaaaaaaabycbnzxq4uskt4f7mklp4g4fcqk4m42aabj2r2fkchjygppdudua is associated to one or more Instance Pools.
This means that the Instance Pool is blocking the Terraform script. To get it working again you need to destroy the stack first and then rebuild it.
Another possibility is that the shape you selected is not supported:
Error: 400-InvalidParameter, Shape VM.Standard1.4 is incompatible with image ocid1.image.oc1..aaaaaaaamy4z6turov5otuvb3wlej2ipv3534agxcd7loajk2f54bfmlyhnq
Suggestion: Please update the parameter(s) in the Terraform config as per error message Shape VM.Standard1.4 is incompatible with image ocid1.image.oc1..aaaaaaaamy4z6turov5otuvb3wlej2ipv3534agxcd7loajk2f54bfmlyhnq
Here, I selected a shape that is too “small” and it fails. It needs to be at least VM.Standard2.4.
Installing Custom Software
If you don’t want to use spack (or cannot), then a good strategy is to install under /nfs/cluster and add any relevant “bin” directories to your PATH.
As an example, we will install Go:
$ cd /nfs/cluster
$ wget https://go.dev/dl/go1.19.linux-amd64.tar.gz
$ sudo tar -C /nfs/cluster -xzf go1.19.linux-amd64.tar.gz
$ rm go1.19.linux-amd64.tar.gz
And then add the go bin to your bash profile (vim ~/.bash_profile
) as follows:
export PATH=/nfs/cluster/go/bin:$PATH
and when you open a new shell or source ~/.bash_profile, you should be able to see go on your path:
$ which go
/nfs/cluster/go/bin/go
$ go version
go version go1.19 linux/amd64
Further, since it’s located in the /nfs/cluster directory, it will be available on other nodes! Here is how to see the other nodes you have:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
compute* up infinite 1 idle compute-permanent-node-941
You can then shell into one and find the go binary there:
$ ssh compute-permanent-node-941
Last login: Sun Aug 14 02:30:01 2022 from relative-flamingo-bastion.public.cluster.oraclevcn.com
$ which go
/nfs/cluster/go/bin/go
Install Singularity
First, the system dependencies. Follow the example above in Installing Custom Software to install Go. Next, install the Singularity dependencies. These will need to be installed on each node.
sudo yum groupinstall -y 'Development Tools'
sudo yum install libseccomp-devel squashfs-tools cryptsetup -y
sudo yum install glib2-devel -y
Ensure Go is on your path (as shown above). Then install Singularity. We will install from source.
Important: ensure you don’t have anything (e.g., pkg-config) loaded from spack, as this can interfere with installing Singularity using system libs. Also note that installing with system libs is a workaround for the spack singularity package not working perfectly (due to setuid). This means you’ll need to do these steps on your head (login) node and on each worker node.
You can do the same with an official release. Note that you don’t need to compile this on the NFS share: you can compile it anywhere and make install into /nfs/cluster.
$ git clone https://github.com/sylabs/singularity
$ cd singularity
$ git submodule update --init
# You can also set prefix to be its own directory, e.g., /nfs/cluster/singularity-<version>
$ ./mconfig --prefix=/nfs/cluster
$ cd ./builddir
$ make
$ make install
Once you install, make sure you add the newly created bin to your path (wherever that happens to be). E.g., that might look like:
export PATH=/nfs/cluster/go/bin:/nfs/cluster/bin:$PATH
And then when you source your ~/.bash_profile you can test:
$ which singularity
/nfs/cluster/bin/singularity
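As a quick sanity check that the installation works, you can pull a small container and run a command in it; the alpine image here is just an example:

```console
$ singularity pull docker://alpine
$ singularity exec alpine_latest.sif cat /etc/os-release
```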
Advanced: Use MPI networking
You first need to request access to these resources with this form.
Then follow the above instructions, but leave Use cluster network activated and the RDMA options enabled.
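Once the cluster network is available, a minimal way to sanity-check multi-node scheduling from the bastion is a two-node test job. Treat this as a sketch: partition names and the way MPI ranks are launched (srun vs. mpirun) depend on how your MPI stack is configured, and autoscaling may take a few minutes to bring up the second node.

```console
$ srun -N 2 --ntasks-per-node=1 hostname    # should print the hostnames of two different compute nodes
```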