3 Multi-channel image processing

This book focuses on common analysis steps for spatially-resolved single-cell data after image segmentation and feature extraction. The sections of this chapter describe the processing of multiplexed imaging data, including file type conversion, image segmentation, feature extraction and data export. For more detailed information on the individual image processing approaches, please visit their repositories:

steinbock: The steinbock toolkit offers tools for multi-channel image processing from the command line or from Python code (Windhager, Bodenmiller, and Eling 2021). Supported tasks include IMC data pre-processing, multi-channel image segmentation, object quantification and data export to a variety of file formats. It offers functionality similar to that of the IMC Segmentation Pipeline (see below) and additionally supports deep learning-enabled image segmentation. The toolkit is available as a platform-independent Docker container, ensuring reproducibility and user-friendly installation. Read more in the Docs.

IMC Segmentation Pipeline: The IMC Segmentation Pipeline offers a more manual way of segmenting multi-channel images using a pixel classification-based approach. We continue to maintain the pipeline but recommend the use of the steinbock toolkit for multi-channel image processing. Raw IMC data pre-processing is performed using the readimc Python package to convert raw MCD files into OME-TIFF and TIFF files. After image cropping, an Ilastik pixel classifier is trained prior to image segmentation using CellProfiler. Features (e.g., mean pixel intensity) of segmented objects (e.g., cells) are quantified and exported. Read more in the Docs.

3.1 Image pre-processing (IMC specific)

Image pre-processing is technology dependent. While most multiplexed imaging technologies generate TIFF or OME-TIFF files that can be directly segmented using the steinbock toolkit, IMC produces data in the proprietary MCD format.

To facilitate IMC data pre-processing, the readimc open-source Python package allows extracting the multi-modal (IMC acquisitions, panoramas), multi-region, multi-channel information contained in raw IMC images. Both the IMC Segmentation Pipeline and the steinbock toolkit use the readimc package for IMC data pre-processing. Starting from IMC raw data and a “panel” file, individual acquisitions are extracted as TIFF files and, if using the IMC Segmentation Pipeline, also as OME-TIFF files. The panel contains information on the antibodies used in the experiment, and the user can specify which channels to keep for downstream analysis. When using the IMC Segmentation Pipeline, random tiles are cropped from the images to simplify pixel labelling.
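To illustrate how a panel file can drive channel selection, the following minimal sketch parses a small panel and returns the channels flagged for downstream analysis. The column names (`channel`, `name`, `keep`), the example rows and the helper `channels_to_keep` are assumptions made for illustration. The exact panel layout depends on the tool, so please consult the respective documentation.

```python
import csv
import io


def channels_to_keep(panel_csv_text):
    """Return (channel, name) pairs for panel rows whose 'keep' flag is 1.

    Assumes a CSV panel with the hypothetical columns
    'channel', 'name' and 'keep'.
    """
    reader = csv.DictReader(io.StringIO(panel_csv_text))
    return [
        (row["channel"], row["name"])
        for row in reader
        if row["keep"].strip() == "1"
    ]


# Hypothetical panel contents, for illustration only
example_panel = """channel,name,keep
Ir191,DNA1,1
Ar80,ArgonDimer,0
Yb173,Ecad,1
"""

selected = channels_to_keep(example_panel)
```

In this sketch, only rows with `keep` set to 1 are carried forward. All other channels would be dropped before segmentation and feature extraction.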

3.2 Image segmentation

The IMC Segmentation Pipeline supports pixel classification-based image segmentation while steinbock supports pixel classification-based and deep learning-based segmentation.

Pixel classification-based image segmentation is performed by training a random forest classifier using Ilastik on the randomly extracted image crops and selected image channels. Pixels are classified as nuclear, cytoplasmic, or background. Employing a customizable CellProfiler pipeline, the probabilities are then thresholded for segmenting nuclei, and nuclei are expanded into cytoplasmic regions to obtain cell masks.
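The two operations named above, thresholding a probability map and expanding nuclei into the surrounding area, can be sketched in simplified form using NumPy alone. This is not the CellProfiler implementation: the cutoff of 0.5, the one-pixel 4-neighbourhood growth and the function names are assumptions, and the step that turns the thresholded map into individually labelled nuclei is omitted.

```python
import numpy as np


def threshold_probabilities(prob, cutoff=0.5):
    """Return a boolean map of pixels whose probability reaches the cutoff."""
    return prob >= cutoff


def expand_labels_once(labels):
    """Grow each labelled object by one pixel (4-neighbourhood) into background.

    'labels' is a 2D integer array in which 0 marks background and positive
    values mark object IDs. Where two objects compete for the same background
    pixel, the first neighbour direction that is checked wins.
    """
    expanded = labels.copy()
    padded = np.pad(labels, 1, mode="constant", constant_values=0)
    height, width = labels.shape
    # Offsets into the padded array select the pixel above, below,
    # to the left and to the right of each position, respectively.
    for dy, dx in ((0, 1), (2, 1), (1, 0), (1, 2)):
        neighbour = padded[dy:dy + height, dx:dx + width]
        fill = (expanded == 0) & (neighbour != 0)
        expanded[fill] = neighbour[fill]
    return expanded


# A single-pixel "nucleus" in the centre of a 3 x 3 image
nuclei = np.array([[0, 0, 0],
                   [0, 1, 0],
                   [0, 0, 0]])
cells = expand_labels_once(nuclei)
```

Applying `expand_labels_once` repeatedly grows the objects further. In practice, the expansion distance is a parameter that is tuned to the tissue and image resolution.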

Deep learning-based image segmentation is performed as presented by Greenwald et al. (2021). Briefly, steinbock first aggregates user-defined image channels to generate two-channel images representing nuclear and cytoplasmic signals. Next, the DeepCell Python package is used to run Mesmer, a deep learning-enabled segmentation algorithm pre-trained on TissueNet, to automatically obtain cell masks without any further user input.
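The channel aggregation step can be sketched as follows. The sketch assumes images stored as (channels, height, width) arrays and uses the mean as aggregation function. Both are assumptions made for illustration, as are the function and parameter names. Any channel-wise normalization and the Mesmer call itself are not shown.

```python
import numpy as np


def aggregate_channels(img, nuclear_idx, cytoplasmic_idx):
    """Average selected channels into a two-channel image.

    'img' has shape (channels, height, width). The returned array has shape
    (2, height, width), with the nuclear signal first and the cytoplasmic
    signal second.
    """
    nuclear = img[nuclear_idx].mean(axis=0)
    cytoplasmic = img[cytoplasmic_idx].mean(axis=0)
    return np.stack([nuclear, cytoplasmic])


# Toy image with four channels of 2 x 2 pixels
toy_img = np.arange(16, dtype=float).reshape(4, 2, 2)
two_channel = aggregate_channels(toy_img, nuclear_idx=[0, 1], cytoplasmic_idx=[2, 3])
```

The resulting two-channel image is the kind of input that a nuclear/cytoplasmic segmentation model expects. Which channels enter each group is a user decision, typically recorded in the panel.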

Segmentation masks are single-channel images that match the input images in size, with non-zero grayscale values indicating the IDs of segmented objects (e.g., cells). These masks are written out as TIFF files after segmentation.

3.3 Feature extraction

Using the segmentation masks together with their corresponding multi-channel images, the IMC Segmentation Pipeline as well as the steinbock toolkit extract object-specific features. These include the mean pixel intensity per object and channel, morphological features (e.g., object area) and the objects’ locations. Object-specific features are written out as CSV files where rows represent individual objects and columns represent features.
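As a minimal sketch of this quantification step, the code below derives the area, the mean intensity per channel and the centroid of each object from a mask and its multi-channel image. The function name, the dictionary layout and the (channels, height, width) image layout are assumptions for illustration and do not reflect the internals of either tool.

```python
import numpy as np


def measure_objects(img, mask):
    """Compute simple per-object features.

    'img' has shape (channels, height, width). 'mask' has shape
    (height, width), with 0 for background and positive object IDs.
    Returns a dict mapping each object ID to its features.
    """
    features = {}
    for object_id in np.unique(mask):
        if object_id == 0:
            continue  # skip background
        inside = mask == object_id
        ys, xs = np.nonzero(inside)
        features[int(object_id)] = {
            "area": int(inside.sum()),
            # mean over all pixels of the object, separately per channel
            "mean_intensity": img[:, inside].mean(axis=1).tolist(),
            "centroid": (float(ys.mean()), float(xs.mean())),
        }
    return features


# Toy example: two channels, two objects of two pixels each
toy_img = np.array([[[1, 3, 0], [0, 5, 7]],
                    [[2, 2, 9], [9, 4, 8]]], dtype=float)
toy_mask = np.array([[1, 1, 0],
                     [0, 2, 2]])
toy_features = measure_objects(toy_img, toy_mask)
```

Writing one row per object ID and one column per feature then yields a table of the shape described above.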

Furthermore, the IMC Segmentation Pipeline and the steinbock toolkit compute spatial object graphs, in which nodes correspond to objects, and nodes in spatial proximity are connected by an edge. These graphs serve as a proxy for interactions between neighboring cells. They are stored as edge lists, with one CSV file per image.
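One simple way to build such a graph is to connect all objects whose centroids lie within a fixed distance of each other. The sketch below follows this idea and serializes the result as an edge list. Centroid distance thresholding is only one possible construction, and the function names, the distance cutoff and the column names `Object` and `Neighbor` are assumptions made for illustration.

```python
import csv
import io
import math


def build_edge_list(centroids, max_distance):
    """Connect every pair of objects whose centroids are close together.

    'centroids' maps object IDs to (x, y) coordinates. Each undirected
    neighbourhood is reported in both directions, (a, b) and (b, a).
    """
    edges = []
    ids = sorted(centroids)
    for a in ids:
        for b in ids:
            if a != b and math.dist(centroids[a], centroids[b]) <= max_distance:
                edges.append((a, b))
    return edges


def edges_to_csv(edges):
    """Serialize an edge list as CSV text with one edge per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Object", "Neighbor"])  # hypothetical column names
    writer.writerows(edges)
    return buffer.getvalue()


# Toy example: objects 1 and 2 are 5 units apart, object 3 is far away
toy_centroids = {1: (0, 0), 2: (3, 4), 3: (10, 10)}
toy_edges = build_edge_list(toy_centroids, max_distance=5)
toy_csv = edges_to_csv(toy_edges)
```

The pairwise comparison scales quadratically with the number of objects, which is acceptable for a sketch. For full images, a spatial index or a dedicated library would be the more practical choice.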

Both approaches also write out image-specific metadata (e.g., width and height) as a CSV file.

3.4 Data export

To further facilitate compatibility with downstream analysis, steinbock exports data to a variety of file formats such as OME-TIFF for images, FCS for single-cell data, the anndata format (Virshup et al. 2021) for data analysis in Python, and various graph file formats for network analysis using software such as Cytoscape (Shannon et al. 2003). For export to OME-TIFF, steinbock uses xtiff, a Python package developed for writing multi-channel TIFF stacks.

3.5 Data import into R

In Section 5, we will highlight the use of the imcRtools and cytomapper R/Bioconductor packages to read spatially-resolved single-cell data and images generated by the IMC Segmentation Pipeline and the steinbock toolkit into the statistical programming language R. All further downstream analyses are performed in R and detailed in the following sections.

References

Greenwald, Noah F., Geneva Miller, Erick Moen, Alex Kong, Adam Kagel, Thomas Dougherty, Christine Camacho Fullaway, et al. 2021. “Whole-Cell Segmentation of Tissue Images with Human-Level Performance Using Large-Scale Data Annotation and Deep Learning.” Nature Biotechnology 40: 555–65.

Shannon, Paul, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, and Trey Ideker. 2003. “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks.” Genome Research 13: 2498–2504.

Virshup, Isaac, Sergei Rybakov, Fabian J. Theis, Philipp Angerer, and F. Alexander Wolf. 2021. “Anndata: Annotated Data.” bioRxiv.

Windhager, Jonas, Bernd Bodenmiller, and Nils Eling. 2021. “An End-to-End Workflow for Multiplexed Image Processing and Analysis.” bioRxiv.