Harvard LSP Tissue Imaging Program
Updated July, 2021
The purpose of this briefing is to introduce some key concepts relevant to highly multiplexed tissue imaging and to provide an overview of the essential functionality in MCMICRO. This discussion is not geared to any specific image acquisition technology, but we cover in greatest depth those methods that use fluorescently labelled antibodies, fluorescent dyes and epifluorescence imaging. These methods include cyclic immunofluorescence (CyCIF)1, Multiplexed Immunofluorescence (MxIF)2, CO-Detection by indEXing (CODEX)3, and Signal Amplification By Exchange Reaction (immuno-SABER)4. Mass spectrometry-based methods, such as Imaging Mass Cytometry (IMC)5 and Multiplexed Ion Beam Imaging (MIBI)6, are also antibody-based but use metal tags, rather than fluorophores, as labels. The common feature of all of these methods, as well as conventional transmission light microscopy used for immunohistochemistry, is that they generate data that can be represented as a series of intensity values on a two-dimensional raster. MIBI and IMC do this in one round of detection, whereas virtually all fluorescence-based methods are cyclic and generate high-dimensional images via repeated collection of lower-dimensional images. MCMICRO can process 2D data from all of the methods described above, and extension to 3D is an area of active development.
No systematic comparison of tissue imaging methods has yet been published, but several large consortia, including the Human Tumor Atlas Network (HTAN), are working to perform such comparisons. From a first-principles perspective, it is possible to identify four relevant performance metrics: (i) the multiplicity or plex of the assay, (ii) spatial resolution, (iii) spatial scale and statistical power, and (iv) sensitivity or signal-to-noise ratio (SNR). In practice, these parameters are not independent of each other: objective lenses with higher resolving power (higher numerical aperture) gather light more efficiently and are therefore more sensitive, but they have smaller fields of view.
Most discussion of tissue imaging focuses on multiplicity (the number of channels), with a maximum of 60 to 100 channels being typical; however, the great majority of published high-plex tissue imaging methods involve 20-40 marker proteins. Increases in multiplicity are important, but most extant methods are limited by the specificity of antibody-antigen detection.
Increasing spatial resolution has been the focus of most microscopy advances in the past two decades (e.g. super-resolution imaging by structured illumination7 or stochastic reconstruction8), but resolution has been little discussed in tissue imaging studies. Higher resolution improves SNR, makes segmentation more robust and is, of course, essential for discerning small structures. Most slide-scanning microscopes use objectives in the range of 0.4 to 1.0 NA, giving them nominal lateral resolutions of ~600 to 250 nm at an illumination wavelength of 550 nm (see MicroscopyU for details). Rapid improvements in the resolution of tissue images are possible simply by adopting state-of-the-art imaging methods.
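The relationship between numerical aperture and nominal lateral resolution can be sketched with two textbook diffraction criteria. This is an illustration only: the function names are hypothetical, and the exact figure depends on which criterion and wavelength are assumed. At 550 nm the Abbe formula gives roughly 690 nm at 0.4 NA and 275 nm at 1.0 NA, the same range as the approximate values quoted above.

```python
def abbe_limit_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Abbe lateral resolution limit: d = wavelength / (2 * NA)."""
    return wavelength_nm / (2.0 * numerical_aperture)


def rayleigh_limit_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Rayleigh criterion: d = 0.61 * wavelength / NA."""
    return 0.61 * wavelength_nm / numerical_aperture


if __name__ == "__main__":
    # Objectives spanning the NA range mentioned in the text, at 550 nm.
    for na in (0.4, 0.75, 1.0):
        print(
            f"NA {na:.2f}: Abbe {abbe_limit_nm(550, na):.0f} nm, "
            f"Rayleigh {rayleigh_limit_nm(550, na):.0f} nm"
        )
```

Both criteria scale inversely with NA, which is why moving from a 0.4 NA to a 1.0 NA objective improves nominal resolution by a factor of 2.5.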
Spatial scale has also been neglected as a critical issue in tissue imaging: robust conclusions can be drawn from images only with sufficient spatial sampling. Because images exhibit spatial correlation on length scales up to 500 microns, specimens of at least one square centimeter are essential for many purposes9; it is increasingly clear that tissue microarrays (TMAs) are generally inadequate, even though their use is widespread because they increase the sample number. In current practice, TMAs are not used for diagnosis, and the FDA requires that digital histology be based on whole-slide imaging (WSI)10. MCMICRO was developed with the demands of high-plex, whole-slide imaging in mind.
The sensitivity of an imaging method is typically dependent on a wide range of factors, including the selectivity of the reagents, the quality of the instrumentation, resolution, etc., and must be evaluated with respect to specific objective criteria (typically yielding a receiver operating characteristic, or ROC, curve). As mentioned above, the field awaits the data needed for rigorous comparison of tissue imaging platforms (and even of the same imaging platforms across multiple laboratories).
All tissue imaging methods generate data comprising a series of intensity values on a raster. Multi-spectral data simply adds a dimension to the raster. Because resolution and field of view exhibit a reciprocal relationship, in both optical physics and the mapping of an image field onto the fixed raster of an electronic camera, whole-slide images are almost always acquired by dividing a large specimen into tiles (usually on the order of 100 to 1,000 tiles, each a multi-dimensional TIFF), which are recorded sequentially by moving the microscope stage in X and Y (such a microscope is often called a “slide scanner”). These fields are then combined at sub-pixel accuracy into a mosaic image in a process known as stitching. Mosaics can be as large as 50,000 x 50,000 pixels x 100 channels, which corresponds to ~500 GByte of data (assuming 16-bit pixels), hence the need for specialized software. When high-plex images are assembled from multiple rounds of lower-plex imaging, it is also necessary to register channels to each other across imaging cycles. Multiple tools for image registration and stitching exist, but these perform poorly with very large tissue images. We have therefore developed the ASHLAR11 software package to combine registration and stitching for whole-slide imaging, guided by permutation tests.
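Two points from this paragraph can be made concrete in a short sketch. The first function reproduces the mosaic size arithmetic, assuming 16-bit (2-byte) pixels. The second estimates the translation between two overlapping images by phase correlation, a generic registration technique used here purely for illustration; this briefing does not describe ASHLAR's internals, and the sketch is not a description of them. The function names are hypothetical, and only whole-pixel shifts are recovered (sub-pixel refinement is omitted).

```python
import numpy as np


def mosaic_size_bytes(height, width, channels, bytes_per_pixel=2):
    """Uncompressed mosaic size; 2 bytes per pixel assumes 16-bit data."""
    return height * width * channels * bytes_per_pixel


def phase_correlation_shift(reference, moving):
    """Estimate the integer (dy, dx) shift such that `moving` is approximately
    `reference` translated by (dy, dx), assuming circular (wrap-around) shifts."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    # Normalized cross-power spectrum: keeps only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    shift = [p - n if p > n // 2 else p for p, n in zip(peak, correlation.shape)]
    return tuple(int(s) for s in shift)


if __name__ == "__main__":
    print(mosaic_size_bytes(50_000, 50_000, 100) / 1e9, "GB")
    rng = np.random.default_rng(0)
    tile = rng.random((64, 64))
    shifted = np.roll(tile, shift=(5, -3), axis=(0, 1))
    print(phase_correlation_shift(tile, shifted))
```

In a real stitching problem the tiles overlap only partially and carry noise, so the correlation peak is less sharp than in this synthetic example and its reliability has to be assessed.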
Microscope illumination is rarely stable over the time periods required to collect a large number of image tiles across multiple experiments (which can sometimes span several days), and the illumination of individual tiles is also not perfectly uniform. We correct for these issues (a process known as flat fielding) using the BaSiC12 package. We are currently working to link ASHLAR and BaSiC more tightly to improve illumination evenness.
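As a minimal sketch of what flat fielding does, assume the commonly used image formation model in which each measured tile equals the true signal multiplied by a flat-field (shading) profile plus an additive dark-field offset. The code below inverts that model given profiles that have already been estimated by some other means (for example by a tool such as BaSiC); it does not call BaSiC, and the function and argument names are hypothetical.

```python
import numpy as np


def flat_field_correct(tile, flatfield, darkfield=None):
    """Invert the assumed model: measured = true * flatfield + darkfield.

    The flat-field profile is rescaled to a mean of 1 so that the overall
    brightness of the tile is preserved.
    """
    tile = np.asarray(tile, dtype=np.float64)
    flatfield = np.asarray(flatfield, dtype=np.float64)
    if darkfield is None:
        darkfield = np.zeros_like(tile)
    gain = flatfield / flatfield.mean()
    return (tile - darkfield) / gain


if __name__ == "__main__":
    true_signal = np.full((4, 4), 100.0)
    flat = np.linspace(0.5, 1.5, 16).reshape(4, 4)  # uneven illumination
    dark = np.full((4, 4), 10.0)                    # camera offset
    measured = true_signal * flat + dark
    corrected = flat_field_correct(measured, flat, dark)
    print(np.allclose(corrected, true_signal))
```

The same profiles are applied to every tile in a channel, which is what removes the tile-edge shading pattern that would otherwise be visible in the stitched mosaic.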
While high-plex 3D tissue imaging remains rare, most published 3D studies use stacks of images spaced along the Z axis (which in most cases is parallel to the optical axis of the objective lens); in live-cell studies, time is also captured by a series of images to create a movie. MCMICRO can manage 3D image stacks, although specialized viewers are required to look at the data. In preclinical settings, more effective ways to sample 3D data have been in development for many decades. The most common of these, optical deconvolution microscopy13, confocal microscopy and light sheet fluorescence microscopy (LSFM)14, acquire data directly in 3D without the need for physical sectioning; LSFM is particularly valuable for tissue imaging because samples can be up to several hundred microns thick. It is surprising that these essential breakthroughs in optical microscopy are not already widespread in high-plex tissue imaging, and we expect their rapid introduction over the next few years. We are therefore working to add true 3D capability to MCMICRO.
Microscopy using Hematoxylin and Eosin (H&E), Romanowsky–Giemsa stains, and other colorimetric dyes, complemented by immunohistochemistry15, has long played the primary role in the study of tissue architecture in humans and other organisms16,17. In a clinical setting, histopathology remains the primary means by which diseases such as cancer are staged and managed18. High-plex tissue imaging aims to address the concern that H&E images, and classical histology in general, provide insufficient molecular information to precisely identify cell subtypes, study mechanisms of development, and characterize disease genes. The information acquired from over a century of anatomic histology is nonetheless essential for understanding tissue biology in normal and diseased settings. We always acquire H&E images in parallel with high-plex images, typically using serial sections. We and others are also working on ways to combine the two imaging modalities on the same section to facilitate single-cell analysis.
Image processing is necessary to extract quantitative data from images. Although machine learning directly on images shows promise, most high-plex tissue imaging studies seek single-cell data, and this requires image segmentation. At the current state of the art, segmentation is one of the most challenging steps in single-cell analysis of tissue images, and MCMICRO therefore provides a wide variety of methods and models. Over time, we expect these to improve and to become consolidated via hackathons and similar activities that identify optimal methods and best practices.
Segmentation is a computer vision technique that assigns class labels to an image in a pixel-wise manner to optimally subdivide it; in most cases, this is followed by marker quantification to extract marker intensities on a per-cell or per-organelle basis. Extensive work has gone into segmentation methods for metazoan cells grown in culture, but segmentation of tissue images is substantially more complex: cell sizes and shapes are more diverse in tissues, and cells are often closely packed. Deep learning methods have become standard in image segmentation, object detection, and synthetic image generation19, based on architectures such as ResNet, UNet and Mask R-CNN20,21. UNet, in particular, has become popular due to its ease of deployment on Graphics Processing Units (GPUs) and its superior performance. MCMICRO provides access to all of these architectures as standard features. It is always necessary to examine an overlay of primary image data and segmentation mask to make sure that images are not over- or under-segmented.
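The marker quantification step that follows segmentation can be sketched as below. The sketch assumes the segmentation mask is an integer label image in which 0 marks background and each cell carries a unique positive ID, and that the multi-channel image is arranged as (channel, Y, X); both conventions and the function name are assumptions made for illustration, and mean intensity is only one of several possible summary statistics.

```python
import numpy as np


def mean_intensity_per_cell(label_mask, image):
    """Mean intensity of every channel inside every labelled cell.

    label_mask: (Y, X) integer array, 0 = background, 1..N = cell IDs.
    image:      (C, Y, X) array of marker intensities.
    Returns (cell_ids, means) where means has shape (n_cells, C).
    """
    labels = np.asarray(label_mask).astype(np.int64).ravel()
    n_labels = int(labels.max()) + 1
    # Pixel count per label; index 0 is background.
    areas = np.bincount(labels, minlength=n_labels)
    cell_ids = np.flatnonzero(areas[1:]) + 1
    means = np.empty((cell_ids.size, image.shape[0]))
    for c, channel in enumerate(image):
        # Sum of intensities per label, then divide by the pixel count.
        sums = np.bincount(labels, weights=channel.ravel(), minlength=n_labels)
        means[:, c] = sums[cell_ids] / areas[cell_ids]
    return cell_ids, means


if __name__ == "__main__":
    mask = np.array([[0, 1, 1],
                     [2, 2, 0]])
    img = np.array([[[9, 2, 4], [6, 8, 9]],
                    [[0, 10, 10], [1, 3, 0]]], dtype=float)
    ids, means = mean_intensity_per_cell(mask, img)
    print(ids, means)
```

Because each cell's value is an average over its mask, errors in the mask (merged or split cells) propagate directly into the quantified intensities, which is why the overlay check described above matters.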
One limitation of machine learning for tissue imaging is a lack of sufficient freely available data with ground truth labelling. The EMIT dataset is intended to address this requirement, but experience with natural scene images20 has shown that the acquisition of sufficient data with accurate labels remains time-consuming and rate-limiting22. We therefore expect the EMIT dataset to grow steadily; users of MCMICRO should stay abreast of updates in segmentation methods and models.
In practice, all tissue images contain technical artifacts (pre-analytical variables), including sectioning artifacts (areas where the knife compresses or tears the specimen), embedded foreign objects (dust, hair) and regions of fat or necrotic tissue that cannot easily be analyzed. To these, cyclic image acquisition adds cell loss with increasing cycle number, and fluorescence imaging (but not mass spectrometry) must overcome problems with autofluorescence. Humans are remarkably good at looking past these artifacts to identify biologically meaningful patterns in the data. However, artifacts substantially complicate single-cell data analysis using computational methods: foreign objects are often the brightest things in an image, and stand out when high-dimensional data are clustered. We and others are working on human-in-the-loop and automated methods to identify and suppress these artifacts, but until then, MCMICRO users must iteratively examine the underlying image data, segmentation mask, and quantified features (per-cell marker intensities) to minimize the impact of noise.
The conversion of images into single-cell data generates a Cell Feature Table, which is analogous to a count table in RNA sequencing, and typically records the positions of individual cells along with derived features such as marker intensity, morphology, and quality control attributes. The Cell Feature Table is used for all subsequent analysis, for example by dimensionality reduction tools such as t-SNE and UMAP, and for cell type calling. MCMICRO also includes a variety of specialized tools for analyzing spatial data using methods derived from physics, geographic information systems and ecology, but Cell Feature Tables can also be visualized using many tools developed for visualization of single-cell sequencing data, cellxgene23 for example. It is important to note, however, that whereas the frequency of a single mRNA corresponds to a single feature in scRNA-Seq, a single marker in an image can be processed to generate a large number of distinct features beyond intensity (e.g. shape, granularity, nuclear or membrane localization etc.).
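To illustrate the shape of such a table, the sketch below builds one row per cell containing its position, a simple morphological feature (area in pixels) and one mean intensity per marker. The column names (`CellID`, `X_centroid`, `Y_centroid`, `Area`) and the function name are illustrative assumptions, not a statement of the exact schema MCMICRO writes; the mask convention (0 = background) is likewise assumed.

```python
import numpy as np


def cell_feature_table(label_mask, image, marker_names):
    """Return one dict per cell with ID, centroid, area and mean marker values.

    label_mask:   (Y, X) integer array, 0 = background.
    image:        (C, Y, X) array, one plane per marker.
    marker_names: list of C column names.
    """
    labels = np.asarray(label_mask).astype(np.int64)
    yy, xx = np.indices(labels.shape)
    rows = []
    for cell_id in np.unique(labels):
        if cell_id == 0:  # skip background
            continue
        inside = labels == cell_id
        row = {
            "CellID": int(cell_id),
            "X_centroid": float(xx[inside].mean()),
            "Y_centroid": float(yy[inside].mean()),
            "Area": int(inside.sum()),
        }
        for name, channel in zip(marker_names, image):
            row[name] = float(channel[inside].mean())
        rows.append(row)
    return rows


if __name__ == "__main__":
    mask = np.array([[1, 1, 0],
                     [0, 2, 2]])
    img = np.array([[[2, 4, 0], [0, 6, 8]]], dtype=float)
    for row in cell_feature_table(mask, img, ["DNA"]):
        print(row)
```

A list of dictionaries like this can be written out with the standard `csv.DictWriter` or loaded into a data frame; additional per-marker features (texture, localization) would simply add columns.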
TIFF (Tagged Image File Format) is ideal for storing microscopy data at native resolution because it can combine multiple images in a single file (with each image occupying a separate layer in the file). Thus, a 3D multi-wavelength movie potentially containing hundreds of image planes can be stored in one TIFF file. TIFF files also contain metadata in the header that describes the organization and key properties of the images. For biomedical data, the Open Microscopy Environment (OME) TIFF format has become the most widely used standard for XML-based metadata and raster images. Because different vendors also have their own internal data standards, Bio-Formats software was developed by the OME community to convert proprietary formats into a standardized, open format, most recently OME-TIFF 6.0. This is a pyramid-encoded TIFF in which multiple resolutions of the same image are stored in a single file to enable rapid pan and zoom, particularly in web-based viewers that work in the manner of Google Maps. Many microscope vendors support Bio-Formats, and this is therefore the standard supported by MCMICRO and other image processing software developed by the Laboratory of Systems Pharmacology (LSP).
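The idea behind pyramid encoding can be sketched in a few lines: starting from the full-resolution plane, each level halves the width and height until the image is small enough to display at once. This is a conceptual illustration only. It does not read or write OME-TIFF files, and the 2x2 block averaging, the stopping rule and the function name are all assumptions made for the example.

```python
import numpy as np


def build_pyramid(image, min_size=256):
    """Return [full_res, half_res, ...] for a single (Y, X) plane.

    Each level is produced by averaging 2x2 blocks of the previous one;
    levels are added while the shorter side stays at or above `min_size`.
    """
    levels = [np.asarray(image, dtype=np.float64)]
    while min(levels[-1].shape) // 2 >= min_size:
        prev = levels[-1]
        # Trim odd edges so the plane divides evenly into 2x2 blocks.
        h, w = (prev.shape[0] // 2) * 2, (prev.shape[1] // 2) * 2
        blocks = prev[:h, :w].reshape(h // 2, 2, w // 2, 2)
        levels.append(blocks.mean(axis=(1, 3)))
    return levels


if __name__ == "__main__":
    plane = np.arange(64, dtype=float).reshape(8, 8)
    for level in build_pyramid(plane, min_size=2):
        print(level.shape)
```

A viewer showing a zoomed-out overview reads only the smallest level, and fetches tiles from the larger levels as the user zooms in, so the full-resolution data never has to be loaded at once.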
Metadata standards for high-plex image data are rapidly developing: a wide variety of laboratories have come together to create the Minimum Information about Tissue Imaging Standard (MITI), and we will update this briefing to include that information as soon as possible.
MCMICRO is designed to solve the problem of processing high volumes of tissue image data to yield reliable image mosaics and single-cell data. It does not solve all of the problems associated with the analysis and publication of images, however, and we strongly recommend that laboratories also adopt the database and visualization tools provided by the OME community. The OME community is welcoming and has many online resources that discuss the topics described above; OME also sponsors multiple workshops and conferences of interest to new and experienced microscopists. In our laboratories, we use MCMICRO, OME/OMERO and MINERVA in parallel24.
- Lin, J.-R. et al. Highly multiplexed immunofluorescence imaging of human tissues and tumors using t-CyCIF and conventional optical microscopes. eLife 7, (2018).
- Gerdes, M. J. et al. Highly multiplexed single-cell analysis of formalin-fixed, paraffin-embedded cancer tissue. PNAS 110, 11982–11987 (2013).
- Goltsev, Y. et al. Deep Profiling of Mouse Splenic Architecture with CODEX Multiplexed Imaging. Cell 174, 968-981.e15 (2018).
- Saka, S. K. et al. Immuno-SABER enables highly multiplexed and amplified protein imaging in tissues. Nat Biotechnol 37, 1080–1090 (2019).
- Giesen, C. et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nature Methods 11, 417–422 (2014).
- Angelo, M. et al. Multiplexed ion beam imaging (MIBI) of human breast tumors. Nat Med 20, 436–442 (2014).
- Wu, Y. & Shroff, H. Faster, sharper, and deeper: structured illumination microscopy for biological imaging. Nat Methods 15, 1011–1019 (2018).
- Rust, M. J., Bates, M. & Zhuang, X. Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM). Nat. Methods 3, 793–795 (2006).
- Lin, J.-R. et al. Multiplexed 3D atlas of state transitions and immune interactions in colorectal cancer. bioRxiv 2021.03.31.437984 (2021) doi:10.1101/2021.03.31.437984.
- Center for Devices and Radiological Health. Technical Performance Assessment of Digital Pathology Whole Slide Imaging Devices. U.S. Food and Drug Administration http://www.fda.gov/regulatory-information/search-fda-guidance-documents/technical-performance-assessment-digital-pathology-whole-slide-imaging-devices (2019).
- Muhlich, J., Chen, Y.-A., Russell, D. & Sorger, P. K. Stitching and registering highly multiplexed whole slide images of tissues and tumors using ASHLAR software. (2021) doi:10.1101/2021.04.20.440625.
- Peng, T. et al. A BaSiC tool for background and shading correction of optical microscopy images. Nat Commun 8, (2017).
- Sibarita, J.-B. Deconvolution microscopy. Adv Biochem Eng Biotechnol 95, 201–243 (2005).
- Power, R. M. & Huisken, J. A guide to light-sheet fluorescence microscopy for multiscale imaging. Nature Methods 14, 360–373 (2017).
- Coons, A. H., Creech, H. J., Jones, R. N. & Berliner, E. The Demonstration of Pneumococcal Antigen in Tissues by the Use of Fluorescent Antibody. The Journal of Immunology 45, 159–170 (1942).
- Albertson, D. G. Gene amplification in cancer. Trends in Genetics 22, 447–455 (2006).
- Shlien, A. & Malkin, D. Copy number variations and cancer. Genome Medicine 1, 62 (2009).
- Amin, M. B. et al. The Eighth Edition AJCC Cancer Staging Manual: Continuing to build a bridge from a population-based to a more ‘personalized’ approach to cancer staging. CA Cancer J Clin 67, 93–99 (2017).
- LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
- Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv:1505.04597 [cs] (2015).
- He, K., Gkioxari, G., Dollár, P. & Girshick, R. Mask R-CNN. arXiv:1703.06870 [cs] (2018).
- Gurari, D. et al. How to Collect Segmentations for Biomedical Images? A Benchmark Evaluating the Performance of Experts, Crowdsourced Non-experts, and Algorithms. 2015 IEEE Winter Conference on Applications of Computer Vision (2015) doi:10.1109/WACV.2015.160.
- Megill, C. et al. cellxgene: a performant, scalable exploration platform for high dimensional sparse matrices. bioRxiv 2021.04.05.438318 (2021) doi:10.1101/2021.04.05.438318.
- Rashid, R. et al. Interpretative guides for interacting with tissue atlas and digital pathology data using the Minerva browser. Nat Biomed Eng. 2020.03.27.001834 (2020) doi:10.1101/2020.03.27.001834.