User documentation for HPC resources at University of Manitoba
Since you have found this website, you may be interested in the Grex documentation. Grex is the University of Manitoba's High-Performance Computing system.
For experienced Grex users
In the ninth year of its lifetime, Grex received a major update. The update strongly affects how users interact with the system: the OS was updated to CentOS 7, the resource management software changed from Torque/Moab to SLURM, and the communication libraries switched from MLNX OFED to RDMA-Core and UCX.
Thus, if you are experienced with the previous “version” of Grex, you may benefit from reading this document: Description of Grex changes.
For new Grex users
If you are a new Grex user, proceed to the quick start guide.
A Very Quick Start guide
- Create an account on CCDB. You will need an institutional email address. If you are a sponsored user, ask your PI for their CCRI code.
- After the CCDB account is approved, log in to CCDB and apply for a Westgrid Consortium account. Follow the directions on portal.westgrid.ca to create a Grex account.
- Wait for half a day. Install an SSH client and an SFTP client for your operating system.
- Connect to grex.westgrid.ca with SSH using your username/password from step 2.
- Make a sample job script and call it sleep.job . The job script is a text file with a special syntax that SLURM recognizes. You can write it directly at the Grex SSH prompt with the editor nano or any other editor (vim, emacs, pico, …); you can also create the script file on your own machine and upload it to Grex using your SFTP client.
#!/bin/bash
#SBATCH --ntasks=1 --cpus-per-task=1
#SBATCH --time=00:01 --mem-per-cpu=100mb
echo "Hello world! will sleep for 10 seconds"
time sleep 10
echo "all done"
- Submit the script to the compute partition using the sbatch command:
sbatch --partition=compute sleep.job
- Wait until the job finishes; you can monitor the state of the queue with the ‘sq’ command. When the job finishes, the output file slurm-NNNN.out should be in the job directory.
- Download the output slurm-NNNN.out from grex.westgrid.ca to your local machine using your SFTP client.
- Congratulations, you have just run your first HPC-style batch job. This is the general workflow, more or less; you would just replace the sleep command with something useful, like your-code.x your-input.dat .
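As an alternative to an editor or an SFTP upload, the job script from the steps above can also be written straight from the shell. The following is only a sketch: the here-document approach and the `job_file` variable are assumptions made for illustration, not part of the guide. The script body, the file name sleep.job, and the sbatch command line are taken from the steps above. The script submits the job only when sbatch is found, and otherwise just checks the job script for syntax errors.

```shell
#!/bin/bash
# Sketch: create the job script with a here-document, then submit it
# only if sbatch is available (that is, when run on a SLURM system).

job_file="sleep.job"   # illustrative variable; same file name as in the steps above

# The quoted 'EOF' delimiter stops the shell from expanding anything
# inside the job script body while it is being written to the file.
cat > "$job_file" <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1 --cpus-per-task=1
#SBATCH --time=00:01 --mem-per-cpu=100mb
echo "Hello world! will sleep for 10 seconds"
time sleep 10
echo "all done"
EOF

if command -v sbatch >/dev/null 2>&1; then
    # On the cluster: submit to the compute partition, as in the steps above.
    sbatch --partition=compute "$job_file"
else
    # Anywhere else: parse the script without running it.
    bash -n "$job_file" && echo "sbatch not found; $job_file passed the syntax check"
fi
```

To adapt this for real work, replace the `time sleep 10` line inside the here-document with your own command, such as your-code.x your-input.dat .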