User Guide
Learn how to use Skyway - RCC Cloud Solution Offering
Requirements for using Skyway
- Have an active RCC user account
- Experience using the Midway cluster
- Experience using the SLURM resource scheduler
Login to Skyway
To log in to Skyway, first connect to Midway2, then ssh from a Midway2 login node to Skyway:
ssh [CNetID]@midway2.rcc.uchicago.edu
ssh skyway.rcc.uchicago.edu
File Systems
1. /home/[CNetID]
This is the temporary home directory (no backup) for users on Skyway. Note that this is NOT the home file system on Midway, so you will not see any contents from your Midway home directory here. Please do NOT store any sizable or important data in this location.
Note: because /home on Skyway is temporary, you may want to point your $HOME environment variable at /cloud/aws/[CNetID] so that configuration files and locally installed packages live in cloud scratch space instead.
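A minimal sketch of doing this in a shell session, assuming /cloud/aws/[CNetID] already exists (use /cloud/gcp/${USER} instead if you work on GCP):
export HOME=/cloud/aws/${USER}   # point $HOME at cloud scratch for this session (illustrative)
cd ${HOME}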
2. /project and /project2
These are the RCC high-performance capacity storage file systems from Midway, mounted on Skyway with the same quotas and usage as on Midway. Just as when running jobs on Midway, /project and /project2 should be treated as the location where users store the data they intend to keep. Because these file systems are mounted on both systems, they also serve as a way to move data between Skyway and Midway.
Run cd /project/<labshare> or cd /project2/<labshare>, where <labshare> is the name of the lab account, to access the files owned by your lab or group. This works even if the lab share directory does not appear in a file listing, e.g., ls /project.
3. /cloud/[cloud]/[CNetID]
Options for [cloud]: aws or gcp
This is the cloud scratch folder (no backup), intended for read/write by cloud compute jobs. For example, with Amazon Web Services (AWS), the remote AWS S3 bucket storage is mounted on Skyway at this path. Before submitting jobs to the cloud compute resources, users must first stage the data, scripts, and executables their cloud job will use into the /cloud/aws/[CNetID] folder. After the cloud compute job finishes, users should copy the data they wish to keep from /cloud/aws/[CNetID] back to their project folder. Similarly, users running on Google Cloud Platform (GCP) should use the scratch folder /cloud/gcp/[CNetID].
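For example, a typical staging workflow could look like the following (directory and file names are illustrative):
cp -r /project/<labshare>/my_inputs /cloud/aws/${USER}/     # stage inputs to cloud scratch
# ... submit and run the cloud compute job ...
cp -r /cloud/aws/${USER}/my_results /project/<labshare>/    # copy results back to project storage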
Software Modules
Skyway uses the same module system as the Midway cluster to manage software packages, but the available modules are not the same. To check the available software modules on Skyway, issue the command module avail. For more information on using the module commands, see the module user manual. If a software package your workflow requires is missing, please write to help@rcc.uchicago.edu to request that it be added to Skyway.
Current list of software modules installed on Skyway includes the following:
- anaconda2 — Python2 Anaconda distribution
- anaconda3 — Python3 Anaconda distribution
- parallelstudio — Intel compilers and the MKL library
- R — statistical analysis
- cuda — CUDA compiler tools
- gromacs — molecular dynamics software
- plumed — metadynamics package
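For example, to list the available modules and load one of them (module names taken from the list above):
module avail            # list all software modules available on Skyway
module load anaconda3   # load the Python3 Anaconda distribution
module list             # show the modules currently loaded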
How to prepare executable binaries
It is not recommended to compile or install software packages directly on Skyway. Users should compile and install their own code on the Midway2 cluster. Midway2 and Skyway share the same system architecture, so code compiled on Midway2 will likely run on Skyway without recompilation.
Note that the /project and /project2 folders are only visible from the Skyway login node. They are not visible from the cloud compute nodes, which is why users must copy the executables and any other data required by their job into the scratch space (/cloud/aws) for it to be accessible from the cloud compute nodes.
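For example, to stage a binary built on Midway2 so the cloud compute nodes can see it (paths are illustrative):
cp /project/<labshare>/bin/my_app /cloud/aws/${USER}/   # make the executable visible to cloud compute nodes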
Submit and Manage Jobs via SLURM
Skyway uses SLURM to submit and manage jobs, just as on the Midway cluster. Some commonly used commands are:
- sinfo – Show compute node status
- sbatch – Submit computing jobs
- scancel – Cancel submitted jobs
- sacct – Check logs of recent jobs
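For example (the job script name and job ID here are illustrative):
sbatch myjob.sbatch            # submit a job script
sinfo --partition=rcc-aws      # show the status of the rcc-aws compute nodes
sacct --user=${USER}           # list your recent jobs
scancel 1234567                # cancel a job by its job ID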
When submitting jobs, include the following two options in the job script:
- --partition=rcc-aws
- --account=rcc-aws
Specify the cloud compute resource:
To submit jobs to the cloud, you must specify the type of virtual machine (VM) with the option --constraint=[VM Type]. The VM types currently supported through Skyway are listed in the table below. You can also get an up-to-date listing of the machine types by running the command sinfo-node-types on a Skyway login node.
VM Type | Description | AWS EC2 Instance Type
---|---|---
t1 | 1 core, 1 GB memory (for testing and building software) | t2.micro
c1 | 1 core, 4 GB memory (for serial jobs) | c5.large
c8 | 8 cores, 32 GB memory (for medium-sized multicore jobs) | c5.4xlarge
c36 | 36 cores, 144 GB memory (for large multicore jobs) | c5.18xlarge
m24 | 24 cores, 384 GB memory (for large memory jobs) | r5.12xlarge
g1 | 1x V100 GPU | p3.2xlarge
g4 | 4x V100 GPUs | p3.8xlarge
g8 | 8x V100 GPUs | p3.16xlarge
For more information about these instance types, please visit the AWS EC2 website. Please note that Skyway currently uses the C5 compute-optimized instances, and the core counts listed above are physical cores, i.e., half the vCPU counts (hyper-threaded cores) shown on the AWS website. An example job script:
#!/bin/sh
#SBATCH --job-name=TEST
#SBATCH --partition=rcc-aws
#SBATCH --account=rcc-aws
#SBATCH --exclusive              # Request exclusive use of the VM
#SBATCH --ntasks=1
#SBATCH --constraint=t1          # Request a t1 VM (a t2.micro instance; see the table above)
cd $SLURM_SUBMIT_DIR             # Run from the directory the job was submitted from
hostname                         # Print the cloud node's hostname
lscpu                            # Show CPU information
lscpu --extended
free -h                          # Show memory information
Interactive Jobs
To start an interactive session on a cloud compute node, use sinteractive with the same partition and constraint options, for example:
sinteractive --partition=rcc-aws --constraint=t1 --ntasks=1
sinteractive --partition=rcc-aws --constraint=g1 --ntasks=1 --gres=gpu:1
User Packages for R and Python
Python and R each manage their own package libraries. Because the system-wide installation locations for these libraries are read-only, regular users normally install local packages into their home directories (i.e., /home/[username]) by default. This is not recommended on Skyway, where all user content is expected to be stored in the cloud scratch space at /cloud/aws/[username]. A few extra steps are therefore needed to change the default package paths for Python and R.
Setting the user local package path for R
Point R_LIBS_USER at a directory in your cloud scratch space (creating it if it does not exist), then load R and verify the library path with .libPaths():
export R_LIBS_USER=/cloud/aws/${USER}/pkgs-R
if [ ! -d "${R_LIBS_USER}" ]; then mkdir "${R_LIBS_USER}"; fi
[yuxing@rcc-aws-t2-micro-001 ~]$ module load R
[yuxing@rcc-aws-t2-micro-001 ~]$ R
...
> .libPaths()
[1] "/cloud/aws/yuxing/pkgs-R" "/software/r-3.5/lib64/R/library"
Setting the user local package path for IPython
The pip tool is used to install and manage Python packages:
pip install --install-option="--prefix=${PKG_PYTHON}" package_name
Here ${PKG_PYTHON} should point to the installation path for your packages, and you also need to add it to PYTHONPATH before running IPython so that the installed modules can be imported. Example:
export PKG_PYTHON=/cloud/aws/${USER}/pkgs-python
if [ ! -d "${PKG_PYTHON}" ]; then mkdir "${PKG_PYTHON}"; fi
export PYTHONPATH=${PKG_PYTHON}:${PYTHONPATH}   # note: pip may place packages in a site-packages subdirectory under ${PKG_PYTHON}; add that directory to PYTHONPATH if imports fail
Setting the user local package path for Conda
The best way to manage local Python packages with Conda (Anaconda or Miniconda) is to use a virtual environment located in your cloud scratch space:
conda create --prefix=/cloud/aws/${USER}/conda
Solving environment: done
## Package Plan ##
environment location: /cloud/aws/yuxing/conda
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use:
# > source activate /cloud/aws/yuxing/conda
#
# To deactivate an active environment, use:
# > source deactivate
#
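The environment can then be activated and packages installed into it, so that they live in cloud scratch space (the package name below is illustrative; newer Conda versions use conda activate in place of source activate):
source activate /cloud/aws/${USER}/conda
conda install numpy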