Running mcmicro on the O2 compute cluster
O2 is a high-performance cluster at Harvard Medical School (HMS). Non-HMS users can safely ignore this page.
Familiarity with basic O2 commands will make running MCMICRO much easier on O2. If this is your first time interacting with the platform, please check out HMS’s O2 documentation.
Setting up for MCMICRO on O2
Follow installation instructions to install Nextflow. Docker is not needed, because O2 uses Singularity to execute module containers, and Singularity is already available on O2.
Please ensure that Java is available by running
module load java
.If your account is not in the
HITS_lsp-analysis
group (typegroups
to check), then please run the following command to prepare your environment:nextflow run labsyspharm/mcmicro/setup/O2ext.nf
When working with exemplars, please download your own copy to
/n/scratch/users/.../$USER/
(where$USER
is your eCommons ID and...
is its first letter).
O2 execution
To run the pipeline on O2, the following additional steps are required:
- If your account is in the
HITS_lsp-analysis
group, please add the flag-profile O2
. Use-profile O2,WSI
and-profile O2,TMA
for very large whole-slide images (WSIs) and tissue microarrays (TMAs), respectively. The profiles differ in the amount of resources requested for each module. - If your account is not in the
HITS_lsp-analysis
group, please use-profile O2ext
. Similarly, use-profile O2ext,WSI
and-profile O2ext,TMA
for WSIs and TMAs. - To avoid running over on your disk quota, it is also recommended to use
/n/scratch
for holding thework/
directory. Furthermore,/n/scratch
is faster than/home
or/n/groups
, so jobs will complete faster.
Compose an sbatch
script that encapsulates resource requests, module loading and the nextflow
command into a single entity. Create a submit_mcmicro.sh
file based on the following template:
#!/bin/sh
#SBATCH -p short
#SBATCH -J mcmicro
#SBATCH -o mcmicro-%J.log
#SBATCH -t 0-12:00
#SBATCH --mem=1G
#SBATCH --mail-type=END # Type of email notification- BEGIN,END,FAIL,ALL
#SBATCH --mail-user=user@university.edu # Email to which notifications will be sent
module purge
module load java
/home/$USER/bin/nextflow run labsyspharm/mcmicro --in /n/scratch/users/${USER:0:1}/$USER/exemplar-001 -profile O2 -w /n/scratch/users/${USER:0:1}/$USER/work
replacing relevant fields (e.g., user@university.edu
) with your own values.
The pipeline run can then be kicked off with sbatch submit_mcmicro.sh
.
Requesting O2 resources
The default profiles WSI
and TMA
establish reasonable defaults for O2 resource requests. These will work for most scenarios, but individual projects may also have custom time and memory requirements. To overwrite the defaults, compose a new config file (e.g., myproject.config
) specifying the desired custom requirements using process selectors. For example, to request 128GB of memory and 96 hours in the medium
queue for ASHLAR, one would specify
process{
withName:ashlar {
queue = 'medium'
time = '96h'
memory = '128G'
}
}
Use existing profiles as examples. Once myproject.config
is composed, it can be provided to a nextflow run
command using the -c
flag:
nextflow run labsyspharm/mcmicro --in /path/to/exemplar-001 -profile O2 -c myproject.config
Note that -profile
is still needed because it defines additional configurations, such as where to find the container modules. myproject.config
simply overrides a portion of the fields in the overall profile.