
The MCMICRO pipeline



Core modules:

Last updated on 2023-03-16.

All modules in MCMICRO are available as standalone executable Docker containers. When running modules within MCMICRO, the inputs and outputs will be handled by the pipeline and do not need to be specified explicitly.

BaSiC

Illumination correction

Description

The module implements the BaSiC method for correcting uneven illumination, developed externally by Peng et al. (2017). The module doesn’t have any additional parameters.

Usage

By default, MCMICRO skips this step as it requires manual inspection of the outputs to ensure that illumination correction does not introduce artifacts for downstream processing. Add start-at: illumination to workflow parameters to request that MCMICRO runs the module.

  • Example params.yml:
workflow:
  start-at: illumination

Input

Unstitched images in any BioFormats-compatible format. Nextflow will take these from the raw/ subdirectory within the project.

Output

Dark-field and flat-field profiles for each unstitched image. Nextflow will write these to the illumination/ subdirectory within the project.
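
How the two profiles are typically used can be sketched as follows. The sketch assumes the standard BaSiC image model, in which a measured tile equals the true signal multiplied by the flat-field, plus the dark-field. The function name and the toy 2x2 tile are hypothetical. Within MCMICRO the profiles are passed on to ASHLAR as optional inputs, so no user code is needed.

```python
def apply_illumination_profiles(tile, flat_field, dark_field):
    """Correct one tile as (measured - dark_field) / flat_field, pixel by pixel.

    All three arguments are equally sized 2D lists of numbers.
    """
    return [
        [(pixel - dark) / flat
         for pixel, flat, dark in zip(tile_row, flat_row, dark_row)]
        for tile_row, flat_row, dark_row in zip(tile, flat_field, dark_field)
    ]


# Toy example: the right-hand column is lit only half as brightly.
tile = [[110.0, 60.0], [210.0, 110.0]]
flat_field = [[1.0, 0.5], [1.0, 0.5]]
dark_field = [[10.0, 10.0], [10.0, 10.0]]

corrected = apply_illumination_profiles(tile, flat_field, dark_field)
print(corrected)
```

After correction both columns report the same signal in each row, which is the behavior to look for when inspecting the profiles manually.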



ASHLAR

Stitching and registration

Description

The module performs simultaneous stitching of tiles and registration across channels. Check the ASHLAR website for the most up-to-date documentation.

Usage

MCMICRO runs ASHLAR by default. Add ashlar: to module options to control its behavior.

  • Example params.yml:
options:
  ashlar: --flip-y -c 5

Input

  • Unstitched images in any BioFormats-compatible format. Nextflow will take these from the raw/ subdirectory within the project.
  • [Optional] Dark-field and flat-field profiles from illumination correction. Nextflow will take these from the illumination/ subdirectory within the project.

Output

A pyramidal, tiled .ome.tif. Nextflow will write the output file to registration/ within the project directory.

Optional parameters for ASHLAR

| Name; Shorthand | Description | Default |
| --- | --- | --- |
| --align-channel CHANNEL; -c CHANNEL | Align images using channel number CHANNEL | Numbering starts at 0 |
| --flip-x | Flip tile positions left-to-right to account for unusual microscope configurations | |
| --flip-y | Flip tile positions top-to-bottom to account for unusual microscope configurations | |
| --flip-mosaic-x | Flip mosaic image horizontally | |
| --flip-mosaic-y | Flip mosaic image vertically | |
| --output-channels CHANNEL [CHANNEL...] | Output only channels listed in CHANNELS | Numbering starts at 0 |
| --maximum-shift SHIFT; -m SHIFT | Maximum allowed per-tile corrective shift in microns | |
| --filter-sigma SIGMA | Width in pixels of Gaussian filter to apply to images before alignment | 0 (which disables filtering) |
| --filename-format FORMAT; -f FORMAT | Use FORMAT to generate output filenames, with {cycle} and {channel} as required placeholders for the cycle and channel numbers | cycle_{cycle}_channel_{channel}.tif |
| --pyramid | Write output as a single pyramidal TIFF | |
| --tile-size PIXELS | Set tile width and height to PIXELS (pyramid output only) | 1024 |
| --plates | Enable mode for multi-well plates (for high-throughput screening assays) | |
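
The {cycle} and {channel} placeholders of --filename-format behave like fields in a Python format string. The sketch below is a hypothetical helper, not ASHLAR code. It expands a format once per cycle and channel, and it assumes that cycles are numbered from 0 in the same way as channels.

```python
import string

DEFAULT_FORMAT = "cycle_{cycle}_channel_{channel}.tif"


def output_filenames(filename_format, n_cycles, n_channels):
    """Expand a format string once per (cycle, channel) pair, numbering from 0."""
    # Collect the field names used in the format, e.g. {"cycle", "channel"}.
    fields = {
        field
        for _, field, _, _ in string.Formatter().parse(filename_format)
        if field
    }
    missing = {"cycle", "channel"} - fields
    if missing:
        raise ValueError(f"format is missing placeholders: {sorted(missing)}")
    return [
        filename_format.format(cycle=cycle, channel=channel)
        for cycle in range(n_cycles)
        for channel in range(n_channels)
    ]


names = output_filenames(DEFAULT_FORMAT, n_cycles=2, n_channels=2)
print(names)
```

Checking for both placeholders up front avoids silently writing every cycle or channel to the same file name.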

Troubleshooting

Visit the ASHLAR website for troubleshooting tips.



Coreograph

TMA core detection and dearraying

Description

The module uses the popular UNet deep learning architecture to identify cores within a tissue microarray (TMA). After identifying the cores, it extracts each one into a separate image to enable parallel downstream processing of all cores.

Usage

By default, MCMICRO assumes that the input is a whole-slide image. Add tma: true to workflow parameters to indicate that the input is a TMA instead. Add coreograph: to module options to control the module behavior.

  • Example params.yml:
workflow:
  tma: true
options:
  coreograph: --channel 3

Input

A fluorescence image of a tissue microarray where at least one channel is of DNA (such as Hoechst or DAPI). Nextflow will take this from the registration/ subfolder within the project.

Output*

  1. Individual cores as .tif stacks with user-selectable channel ranges
  2. Binary tissue masks (saved in the ‘mask’ subfolder)
  3. A TMA map showing the labels and outlines of each core for quality control purposes
  4. A text file listing the centroids of each core in the format: Y, X

* Nextflow will write images and masks to the dearray/ subfolder and the TMA map to the qc/coreo/ subfolder within the project.


Optional arguments to Coreograph

| Parameter | Default | Description |
| --- | --- | --- |
| --downsampleFactor | 5 (to match the training data) | How many times to downsample the raw image file |
| --channel | | Which channel is fed into UNet to generate probability maps (usually DAPI) |
| --buffer | 2 | The extra space around a core before cropping it. A value of 2 means there is twice the width of the core added as buffer around it. |
| --outputChan | -1 | A range of channels to be exported. The default, -1, exports all channels (takes a while). Select a single channel or a continuous range. --outputChan 0 10 will export channel 0 up to and including channel 10 |
| --tissue | | Coreograph will assume that its input is a whole-slide image and will work to isolate individual tissue chunks into separate files |
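
The selection rules for --outputChan can be summarized in a few lines. The function below is a hypothetical illustration of those rules rather than Coreograph's own code, and it assumes 0-based channel indices, as in the --outputChan 0 10 example.

```python
def resolve_output_channels(output_chan, n_channels):
    """Return the channel indices selected by an --outputChan style value.

    output_chan is a list of ints: [-1] for all channels, [c] for a single
    channel, or [first, last] for an inclusive continuous range.
    """
    if output_chan == [-1]:
        return list(range(n_channels))
    if len(output_chan) == 1:
        first = last = output_chan[0]
    elif len(output_chan) == 2:
        first, last = output_chan
    else:
        raise ValueError("expected -1, one channel, or a first/last pair")
    if not 0 <= first <= last < n_channels:
        raise ValueError(
            f"channel range {first}-{last} is outside 0-{n_channels - 1}"
        )
    # The upper bound is inclusive, hence last + 1.
    return list(range(first, last + 1))


selected = resolve_output_channels([0, 10], n_channels=12)
print(selected)
```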

Troubleshooting

A troubleshooting guide can be found within Coreograph parameter tuning.



UnMICST

Image segmentation - probability map generation

Description

UnMICST uses a convolutional neural network to annotate each pixel with the probability that it belongs to a given subcellular component (nucleus, cytoplasm, cell boundary). Check the UnMICST website for the most up-to-date documentation.

Usage

MCMICRO applies UnMicst to all input images by default. Add unmicst: to module options to control its behavior.

  • Example params.yml:
options:
  unmicst: --scalingFactor 0.5

Input

An .ome.tif, preferably flat-field corrected. The model is trained on images acquired at a pixel size of 0.65 microns/px. If your settings differ, you can upsample or downsample to some extent. Nextflow will take input files from the registration/ subdirectory for whole-slide images and from the dearray/ subdirectory for tissue microarrays.
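
A rough way to pick a resize factor when your pixel size differs from the training data is sketched below. It assumes that the factor passed to --scalingFactor multiplies the image dimensions, so values below 1 downsample. The function name is hypothetical and the 0.325 microns/px input is only an illustration.

```python
MODEL_PIXEL_SIZE_UM = 0.65  # pixel size of the training images, in microns/px


def scaling_factor(image_pixel_size_um, model_pixel_size_um=MODEL_PIXEL_SIZE_UM):
    """Resize factor that brings an image to the model's pixel size."""
    if image_pixel_size_um <= 0 or model_pixel_size_um <= 0:
        raise ValueError("pixel sizes must be positive")
    # Smaller pixels than the model expects -> factor below 1 (downsample).
    return image_pixel_size_um / model_pixel_size_um


factor = scaling_factor(0.325)
print(round(factor, 3))
```

An image acquired at half the model's pixel size gives a factor of 0.5, and an image already at 0.65 microns/px gives 1, which leaves it unchanged.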

Output *

  1. A .tif stack where the probability maps for each class are concatenated along the Z-axis in the order: nuclei foreground, nuclei contours, and background.
  2. A QC image, with the suffix Preview, in which the DNA image is concatenated with the nuclei contour probability map.

* Nextflow will write probability maps to the probability-maps/unmicst/ subfolder and the previews to the qc/unmicst/ subfolder within the project.

Optional arguments to UnMicst

| Parameter | Default | Description |
| --- | --- | --- |
| --tool <version> | unmicst-solo | UnMicst version: unmicst-legacy is the old single channel model. unmicst-solo uses DAPI. unmicst-duo uses DAPI and lamin. |
| --model | human nuclei from DAPI | The name of the UNet model. By default, this is the human nuclei model that identifies nuclei centers, nuclei contours, and background from a DAPI channel. Other models include mouse nuclei from DAPI, and cytoplasm from stains resembling WGA |
| --channel <number> | 1 | The channel used to infer and generate probability maps from. If using UnMicst2, then specify 2 channels. If only 1 channel is specified, it will simply be duplicated. NOTE: If not using the default value, the 1st channel must be specified to S3segmenter as --probMapChan in --s3seg-opts |
| --classOrder | None | If your training data isn’t in the order 1. background, 2. contours, 3. foreground, you can specify the order here. For example, if you had trained the class order backwards, specify --classOrder 3 2 1. If you only have background and contours, use --classOrder 1 2 1. |
| --mean <value> | Extracted from the model | Override the trained model’s mean intensity. Useful if your images are significantly dimmer or brighter. |
| --std <value> | Extracted from the model | Override the trained model’s standard deviation intensity. Useful if your images are significantly dimmer or brighter. |
| --scalingFactor <value> | 1 | An upsample or downsample factor used to resize the image. Useful when the pixel sizes of your image differ from the model (i.e. 0.65 microns/pixel for human nuclei model) |
| --stackOutput | Specified | If selected, UnMicst will write all probability maps as a single multipage tiff file. Otherwise, UnMicst will write each class as a separate file. |
| --GPU <index> | Automatic | Explicitly specify which GPU (1-based indexing) you want to use. Useful for running on local workstations with multiple GPUs. |

Troubleshooting

A troubleshooting guide can be found within UnMICST parameter tuning. Additional information is also available on the UnMICST website.



S3segmenter

Image segmentation - cell mask generation

Description

The module applies standard watershed segmentation to probability maps to produce the final cell/nucleus/cytoplasm/etc. masks.

Usage

By default, MCMICRO applies S3segmenter to the output of all modules that produce probability maps. Add s3seg: to module options to control its behavior.

  • Example params.yml:
options:
  s3seg: --logSigma 2 10

Inputs

  1. A fully stitched and registered .ome.tif, preferably flat-field corrected. Nextflow will take these from the registration/ and dearray/ subdirectories, as appropriate.
  2. A 3-class probability map, as derived by modules such as UnMICST or Ilastik.

S3segmenter assumes that you have:

  1. Acquired images of your sample with optimal acquisition settings.
  2. Stitched the tiles and registered the channels (if working with a large piece of tissue) and saved the result as a BioFormats-compatible TIFF file.
  3. Processed your image in some way so as to increase contrast between individual nuclei using classical or machine learning methods such as Ilastik (a random forest model) or UnMICST (a deep learning semantic segmentation model based on the UNet architecture). MCMICRO supports both.

Output

1) 32-bit label masks for each compartment of the cell:

  • nuclei.ome.tif (nuclei)
  • cytoplasm.ome.tif (cytoplasm)
  • cell.ome.tif (whole cell)
  • If only nuclei segmentation was carried out, cell.ome.tif is identical to nuclei.ome.tif

Nextflow saves these files to the segmentation/ subfolder within your project.

2) Two-channel quality control files with outlines overlaid on a grayscale image of the channel used for segmentation:

  • nucleiOutlines.tif (nuclei)
  • cytoplasmOutlines.tif (cytoplasm)
  • cellOutlines.tif (whole cell)
  • If only nuclei segmentation was carried out, cellOutlines.tif is identical to nucleiOutlines.tif

Nextflow saves these files to the qc/s3seg/ subfolder within your project.

NOTE: There are at least 2 ways to segment cytoplasm: i) using a watershed approach or ii) taking an annulus/ring around nuclei. Files generated using the annulus/ring method will have ‘Ring’ in the filename whereas files generated using watershed segmentation will not. It is important that these two groups of files are NOT combined and analyzed simultaneously as cell IDs will be different between them.
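
One way to keep the two groups apart before analysis is to sort files by the ‘Ring’ marker in their names. The sketch below is a hypothetical helper, and the file names in the example are made up for illustration. Only the rule that ring-method files contain ‘Ring’ comes from the note above.

```python
def split_by_cytoplasm_method(filenames):
    """Group file names by whether they come from the ring method."""
    groups = {"ring": [], "watershed": []}
    for name in filenames:
        # Files from the annulus/ring method carry 'Ring' in their name.
        key = "ring" if "Ring" in name else "watershed"
        groups[key].append(name)
    return groups


# Hypothetical file names, used only to demonstrate the grouping.
files = [
    "cell.ome.tif",
    "cellRing.ome.tif",
    "cytoplasm.ome.tif",
    "cytoplasmRing.ome.tif",
]
groups = split_by_cytoplasm_method(files)
print(groups)
```

Each group can then be analyzed on its own, so cell IDs from one method are never matched against cell IDs from the other.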

Optional arguments to S3Segmenter

| Parameter | Default | Description |
| --- | --- | --- |
| --probMapChan <index> | 1 | Which channel is used for nuclei segmentation. Coincides with the channel used in upstream semantic segmentation modules. Must specify when different from default. |
| --crop <selection> | noCrop | Type of cropping: interactiveCrop - a window will appear for user input to crop a smaller region of the image; plate - this is for small fields of view such as from a multiwell plate; noCrop, the default, is to use the entire image |

Nuclei parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| --nucleiFilter <selection> | IntPM | Method to filter false positive nuclei: IntPM - filter based on probability intensity; Int - filter based on raw image intensity |
| --logSigma <value> <value> | 3 60 | A range of nuclei diameters to search for. |

Cytoplasm parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| --segmentCytoplasm <selection> | ignoreCytoplasm | Select whether to segmentCytoplasm or ignoreCytoplasm |
| --CytoMaskChan <index> | 2 | One or more channels to use for segmenting cytoplasm, specified as 1-based indices (e.g., 2 is the 2nd channel). |
| --cytoMethod <selection> | distanceTransform | The method to segment cytoplasm: distanceTransform - take the distance transform outwards from each nucleus and mask with the tissue mask; ring - take an annulus of a certain pixel size around the nucleus (see cytoDilation); hybrid - uses a combination of greyscale intensity and distance transform to more accurately approximate the extent of the cytoplasm. Similar to CellProfiler’s implementation. |
| --cytoDilation <value> | 5 | The number of pixels to expand from the nucleus to get the cytoplasm ring. |
| --TissueMaskChan <index> | Union of probMapChan and CytoMaskChan | One or more channels to use for identifying the general tissue area for masking purposes. |
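
The idea behind the ring method and --cytoDilation can be illustrated on a tiny label mask. This is a minimal sketch and not S3segmenter's implementation. It assumes a 4-connected dilation applied one pixel at a time, and it resolves contested pixels between neighboring nuclei by taking the first labeled neighbor found.

```python
def cytoplasm_ring(nuclei, dilation):
    """Grow each labeled nucleus by `dilation` pixels and keep only the new pixels.

    nuclei is a 2D list of integer labels in which 0 is background.
    """
    height, width = len(nuclei), len(nuclei[0])
    expanded = [row[:] for row in nuclei]
    for _ in range(dilation):
        grown = [row[:] for row in expanded]
        for y in range(height):
            for x in range(width):
                if expanded[y][x] != 0:
                    continue
                # Take the label of the first labeled 4-connected neighbor, if any.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < height and 0 <= nx < width and expanded[ny][nx] != 0:
                        grown[y][x] = expanded[ny][nx]
                        break
        expanded = grown
    # The ring is the grown region minus the original nucleus.
    return [
        [0 if nuclei[y][x] != 0 else expanded[y][x] for x in range(width)]
        for y in range(height)
    ]


nuclei = [[0] * 5 for _ in range(5)]
nuclei[2][2] = 1  # a single-pixel nucleus with label 1
ring = cytoplasm_ring(nuclei, dilation=1)
for row in ring:
    print(row)
```

Because the ring inherits the label of the nucleus it grew from, ring pixels and nucleus pixels of the same cell share one ID.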

Troubleshooting

A troubleshooting guide can be found within S3segmenter parameter tuning.



MCQuant

Single-cell data quantification

Description

The module uses one or more segmentation masks against the original image to quantify the expression of every channel on a per-cell basis. Check the MCQuant README for the most up-to-date documentation.

Usage

By default, MCMICRO runs MCQuant on all cell segmentation masks that match the cell*.tif filename pattern. Add mcquant: to module options to specify a different mask or to provide additional module-specific arguments to MCMICRO.

  • Example params.yml:
options:
  mcquant: --masks cytoMask.tif nucleiMask.tif

Inputs

  1. A fully stitched and registered image in .ome.tif format. Nextflow will use images in the registration/ and dearray/ subfolders as appropriate.
  2. One or more segmentation masks in .tif format. Nextflow will use files in the segmentation/ subfolder within the project.
  3. A .csv file containing a marker_name column specifying names of individual channels. Nextflow will look for this file in the project directory.
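
A minimal marker file can be written and checked with Python's csv module. The sketch below assumes one row per channel, listed in image order, and uses made-up marker names. Only the marker_name column is taken from the description above, and both helper functions are hypothetical.

```python
import csv
import io


def write_markers_csv(handle, marker_names):
    """Write a marker file with a single marker_name column."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["marker_name"])
    for name in marker_names:
        writer.writerow([name])


def read_marker_names(handle):
    """Read channel names back, failing early if the column is missing."""
    reader = csv.DictReader(handle)
    if "marker_name" not in (reader.fieldnames or []):
        raise ValueError("missing required marker_name column")
    return [row["marker_name"] for row in reader]


# An in-memory buffer stands in for the real file in the project directory.
buffer = io.StringIO()
write_markers_csv(buffer, ["DNA_1", "CD3", "CD8"])  # hypothetical marker names
buffer.seek(0)
marker_names = read_marker_names(buffer)
print(marker_names)
```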

Output

A cell-by-feature table mapping Cell IDs to marker expression and morphological features (including x,y coordinates).
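
The core of per-cell quantification is an average of pixel values grouped by mask label. The sketch below shows that idea for one channel on plain nested lists. It is a simplified illustration with hypothetical names, not MCQuant's implementation, and it assumes that label 0 marks background.

```python
from collections import defaultdict


def mean_intensity_per_cell(mask, channel):
    """Average the channel values under each non-zero label of the mask."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for mask_row, channel_row in zip(mask, channel):
        for cell_id, value in zip(mask_row, channel_row):
            if cell_id == 0:  # 0 marks background in a label mask
                continue
            totals[cell_id] += value
            counts[cell_id] += 1
    return {cell_id: totals[cell_id] / counts[cell_id] for cell_id in sorted(counts)}


mask = [
    [1, 1, 0],
    [2, 2, 2],
]
channel = [
    [10.0, 20.0, 99.0],
    [3.0, 6.0, 9.0],
]
table = mean_intensity_per_cell(mask, channel)
print(table)
```

Repeating this for every channel, and adding morphological features per label, gives one row per cell ID as in the table described above.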

Optional parameters for MCQuant

| Parameter | Description |
| --- | --- |
| --mask_props | Space separated list of additional metrics to be calculated for every mask. This is intended for metrics that depend only on the cell mask. If the metric depends on signal intensity, use --intensity_props instead. See list at https://scikit-image.org/docs/dev/api/skimage.measure.html#regionprops. |
| --intensity_props | Space separated list of additional metrics to be calculated for every marker separately. By default only mean intensity is calculated. If the metric doesn’t depend on signal intensity, use --mask_props instead. See list at https://scikit-image.org/docs/dev/api/skimage.measure.html#regionprops. Additionally available is gini_index, which calculates a single number between 0 and 1, representing how unequally the signal is distributed in each region. See https://en.wikipedia.org/wiki/Gini_coefficient. For example, to calculate the median intensity, specify --intensity_props median_intensity. |
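
To make the meaning of gini_index concrete, the sketch below computes a Gini coefficient with the textbook rank-based formula. It is an illustration with a hypothetical function name and may differ in detail from the calculation MCQuant performs.

```python
def gini_index(values):
    """Gini coefficient of non-negative values: 0 means perfectly even."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    # Weight each value by its 1-based rank in ascending order.
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


even = gini_index([5, 5, 5, 5])      # signal spread evenly over the region
peaked = gini_index([0, 0, 0, 8])    # all signal in a single pixel
print(even, peaked)
```

An evenly stained region scores 0, while a region whose signal sits in a single pixel approaches 1 as the region grows.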



CyLinter

Quality control

Description

CyLinter is a human-in-the-loop quality control pipeline. It accepts as input the set of files generated by MCMICRO, including segmentation masks and single-cell feature tables, and returns a set of de-noised feature tables for use in downstream analyses.

Usage

Because it requires human interactivity, CyLinter is not executed by MCMICRO directly. Instead, users are encouraged to follow the steps outlined on the CyLinter website after applying MCMICRO to their data.

Figure: Overview of CyLinter quality control software. Screenshots depicting different phases of the CyLinter workflow.



SCIMAP

Spatial analysis

Description

SCIMAP is a suite of tools that enables spatial single-cell analyses. Check the SCIMAP website for the most up-to-date documentation.

Usage

MCMICRO allows users to automatically apply SCIMAP’s clustering algorithms to the cell-by-feature table produced by MCQuant. The clustering results can be subsequently used for manual assignment of cell states. Since MCMICRO stops at MCQuant by default, users will need to explicitly request that the pipeline continues to the clustering step. MCMICRO’s usage of SCIMAP doesn’t have any parameters, and users are encouraged to check the SCIMAP website for more sophisticated human-in-the-loop analyses.

Add downstream: scimap and stop-at: downstream to workflow parameters to enable SCIMAP. Add scimap: to module options to control its behavior.

  • Example params.yml:
workflow:
  stop-at: downstream
  downstream: scimap
options:
  scimap: --csv

Input

A cell-by-feature table in .csv format, as produced by MCQuant. Nextflow will look for these tables in the quantification/ subdirectory within the project.

Output

  1. A table of cluster assignments for each cell by the different clustering algorithms implemented within SCIMAP. These tables will be generated in .csv and .h5ad formats.
  2. A set of UMAP plots for the different clustering algorithms, with individual plots written to the plots/ subdirectory in .pdf format.

Nextflow will write all outputs to the cell-states/scimap/ subdirectory within the project.



Minerva

Interactive viewing and sharing

Description

Minerva allows for fast, interactive viewing of multiplexed images. It also enables highlighting and effective sharing of important regions of interest among collaborators.

Usage

At the moment, MCMICRO does not automatically generate Minerva stories of the input images, and users need to manually provide MCMICRO outputs to Minerva in a separate workflow. To learn more about how to use Minerva, visit the Minerva wiki for the most up-to-date information about the Minerva suite.



Other modules

| Name | Purpose | References |
| --- | --- | --- |
| Ilastik | Probability map generator | Code - DOI |
| Cypository | Probability map generator (cytoplasm only) | Code |
| Mesmer | Instance segmentation | Code - DOI |
| naivestates | Cell type calling with Naive Bayes | Code |
| FastPG | Clustering (Louvain community detection) | Code - DOI |
| scanpy | Clustering (Leiden community detection) | Code |
| FlowSOM | Clustering (Self-organizing maps) | Code |

Suggest a module

Module suggestions can be made by posting to https://forum.image.sc/ and tagging your post with the mcmicro tag.
