Object segmentation
In this step, objects such as cells will be segmented. This will result in grayscale object masks of the same x and y dimensions as the original images, containing unique pixel values for each object (object IDs, see File types).
Various segmentation approaches are supported, each of which is described in the following.
Pixel classification-based image segmentation vs end-to-end approaches
Pixel classification-based image segmentation with CellProfiler segments objects from probability images, whereas end-to-end workflows such as DeepCell/Mesmer and Cellpose operate directly on images without a preceding pixel classification step.
CellProfiler
CellProfiler is open-source software for measuring and analyzing cell images. Here, CellProfiler is used for object detection and region growth-based object segmentation.
Pixel classification-based image segmentation
By design, the segmentation approach described in this section will be used as part of a pixel classification-based image segmentation workflow. As such, this approach requires probability images generated by a preceding pixel classification step as input; the assigned class probabilities will be used to identify and segment objects.
Pipeline preparation
In a first step, a CellProfiler pipeline is prepared for processing the images:
steinbock segment cellprofiler prepare
By default, this will create a CellProfiler pipeline cell_segmentation.cppipe for segmenting cells in probability images generated during the Ilastik pixel classification step.
CellProfiler plugins
The generated CellProfiler pipeline makes use of custom plugins for multi-channel images, which are pre-installed in the steinbock Docker container. The pipeline can be inspected using CellProfiler as described below.
Modifying the pipeline
To interactively inspect, modify and run the pipeline, import it in CellProfiler (see Apps):
steinbock apps cellprofiler
Data/working directory
Within the container, your data/working directory containing the CellProfiler pipeline is accessible under /data.
More detailed instructions on how to create CellProfiler pipelines can be found here.
Segmentation parameters
Segmentation using CellProfiler is highly customizable and sensitive to parameter choices. The default parameter values may not be suitable in all cases and parameter values require careful tuning for each dataset.
In particular, the generated pipeline is configured to down-size the probability images by a factor of two, to account for the default scaling applied in the Ilastik pixel classification step. If a different classification strategy or scale factor has been used to generate the probability images, the down-scale factor must be adjusted accordingly.
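To make the relation between the scale factor and the image dimensions concrete, the following is a minimal sketch of down-sizing a probability image by an integer factor using block averaging. It is an illustration only: the function name downscale_by_factor is hypothetical, the (height, width, classes) array layout is an assumption, and the resizing performed inside the generated CellProfiler pipeline may use a different interpolation method.

```python
import numpy as np


def downscale_by_factor(prob: np.ndarray, factor: int = 2) -> np.ndarray:
    """Down-size a (height, width, classes) array by block averaging.

    Hypothetical helper for illustration; not part of steinbock or CellProfiler.
    """
    if prob.ndim != 3:
        raise ValueError("expected an array of shape (height, width, classes)")
    h, w = prob.shape[:2]
    # Crop so that both spatial dimensions are divisible by the factor
    h_crop, w_crop = h - h % factor, w - w % factor
    cropped = prob[:h_crop, :w_crop]
    # Split each spatial axis into (blocks, factor) and average within each block
    blocks = cropped.reshape(
        h_crop // factor, factor, w_crop // factor, factor, prob.shape[2]
    )
    return blocks.mean(axis=(1, 3))
```

With the default factor of two, a 200 x 200 pixel probability image becomes 100 x 100 pixels, which matches the original image size if the probability image was up-scaled by a factor of two beforehand.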
CellProfiler output
By default, the pipeline is configured to generate object masks as grayscale 16-bit unsigned integer TIFF images with the same name and x and y dimensions as the input images (see File types). Custom segmentation pipelines should adhere to this convention to ensure compatibility with downstream measurement tasks.
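When writing a custom segmentation pipeline, the convention above can be verified with a few lines of NumPy. The sketch below assumes that the image has already been loaded as a (channels, height, width) array and the mask as a (height, width) array; the function name check_mask is hypothetical and not part of steinbock.

```python
import numpy as np


def check_mask(mask: np.ndarray, img: np.ndarray) -> None:
    """Raise ValueError if the mask violates the object mask convention.

    Assumes img has shape (channels, height, width); hypothetical helper.
    """
    if mask.dtype != np.uint16:
        raise ValueError(f"expected a uint16 mask, got {mask.dtype}")
    if mask.ndim != 2:
        raise ValueError(f"expected a 2D mask, got {mask.ndim} dimensions")
    # The mask must match the x and y dimensions of the image
    if mask.shape != img.shape[-2:]:
        raise ValueError(
            f"mask shape {mask.shape} does not match image shape {img.shape[-2:]}"
        )
```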
Batch processing
After the pipeline has been configured, it can be applied to a batch of probability images:
steinbock segment cellprofiler run
This will create grayscale object masks of the same x and y dimensions as the original images, containing unique pixel values for each object (object IDs, see File types). The default destination directory for these masks is masks.
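Because each object is encoded by a unique pixel value, the objects in a mask can be enumerated directly from the pixel values. The following sketch assumes that the mask has been loaded as a 2D NumPy array and that the value 0 denotes background; the function name object_areas is hypothetical.

```python
import numpy as np


def object_areas(mask: np.ndarray) -> dict:
    """Map each object ID in a mask to its area in pixels.

    Assumes that 0 denotes background; hypothetical helper for illustration.
    """
    ids, counts = np.unique(mask, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts) if i != 0}
```

For a mask containing two objects, the returned dictionary has two entries, one per object ID, and its length gives the number of segmented objects.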
DeepCell
DeepCell is a deep learning library for single-cell analysis of biological images. Here, pre-trained DeepCell models are used for cell/nuclei segmentation from raw image data.
End-to-end cell segmentation
This approach operates directly on image intensities and does not require a preceding pixel classification step.
To segment cells using Mesmer:
steinbock segment deepcell --minmax
To segment nuclei using Mesmer:
steinbock segment deepcell --minmax --type nuclear
This will create grayscale cell/nuclear masks of the same x and y dimensions as the original images, containing unique pixel values for each cell/nucleus (object IDs, see File types). The default destination directory for these masks is masks.
Pre-trained models
DeepCell uses pre-trained neural networks for object segmentation. To specify a pre-trained model, use the --model option. If not specified, the default pre-trained model for the selected application (e.g. Mesmer) is downloaded.
DeepCell image data
Depending on the application, DeepCell requires images of specific dimensions. For example, in the case of cell segmentation using Mesmer, DeepCell expects two-channel images as input, where the first channel must be a nuclear channel (e.g. DAPI) and the second channel must be a membrane or cytoplasmic channel (e.g. E-Cadherin). This also applies to nuclear-only segmentation tasks.
If a deepcell column is present in the steinbock panel file, channels are sorted and grouped according to values in that column to generate the required input for DeepCell: for each image, each group of channels is aggregated by computing the mean along the channel axis (use the --aggr option to specify a different aggregation strategy). The resulting images consist of one channel per group; channels without a group label are ignored.
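The grouping and aggregation described above can be sketched as follows. This is a simplified illustration, not steinbock's implementation: the function name aggregate_channels is hypothetical, the image is assumed to be a (channels, height, width) array, and the group labels are assumed to be given as a list with one entry per channel, where None marks a channel without a group label.

```python
import numpy as np


def aggregate_channels(img: np.ndarray, groups: list, aggr=np.mean) -> np.ndarray:
    """Aggregate the channels of a (channels, height, width) image by group.

    groups holds one label per channel; None marks an unlabeled channel.
    Hypothetical helper for illustration.
    """
    if len(groups) != img.shape[0]:
        raise ValueError("expected exactly one group label per channel")
    # Sorted group labels determine the channel order of the output
    labels = sorted({g for g in groups if g is not None})
    aggregated = []
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        # Collapse all channels of the group along the channel axis
        aggregated.append(aggr(img[idx], axis=0))
    return np.stack(aggregated)
```

For example, with group labels [2, 1, 2, None] for a four-channel image, the output has two channels: the first is the second input channel (group 1), the second is the mean of the first and third input channels (group 2), and the fourth input channel is ignored. Passing a different function such as np.max as aggr corresponds to choosing a different aggregation strategy.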
If no deepcell column is present, images are expected to be in the correct format already.
Unless specified otherwise using the --pixelsize parameter, a value of 1 micrometer per pixel is assumed. This resolution parameter can also be used to fine-tune the generated cell/nuclear masks with regard to over/under-segmentation.
Channel-wise image normalization
If enabled, features (i.e., channels) are scaled for each image and each channel independently.
Specify --minmax to enable min-max normalization and --zscore to enable z-score normalization.
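As an illustration of what the two normalization strategies compute, the following sketch normalizes each channel of a single (channels, height, width) image independently. The function names are hypothetical, and the handling of constant channels (division by one instead of zero) is an assumption made here to keep the example well defined; steinbock may treat such channels differently.

```python
import numpy as np


def minmax_normalize(img: np.ndarray) -> np.ndarray:
    """Scale each channel of a (channels, height, width) image to [0, 1]."""
    img = img.astype(np.float32)
    lo = img.min(axis=(1, 2), keepdims=True)
    hi = img.max(axis=(1, 2), keepdims=True)
    value_range = hi - lo
    # Assumption: constant channels are mapped to zero instead of dividing by zero
    value_range[value_range == 0] = 1
    return (img - lo) / value_range


def zscore_normalize(img: np.ndarray) -> np.ndarray:
    """Center each channel to zero mean and scale it to unit standard deviation."""
    img = img.astype(np.float32)
    mean = img.mean(axis=(1, 2), keepdims=True)
    std = img.std(axis=(1, 2), keepdims=True)
    # Assumption: constant channels are mapped to zero instead of dividing by zero
    std[std == 0] = 1
    return (img - mean) / std
```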
Preprocessing/postprocessing parameters
Application-dependent preprocessing/postprocessing parameters can be specified in YAML files using the --preprocess/--postprocess options. For the Mesmer application, these can be used, for instance, to control thresholding, histogram normalization and watershed segmentation. Please refer to the DeepCell online documentation for available parameters. For example, one could specify --preprocess preprocessing.yml, where preprocessing.yml is a file in the steinbock data/working directory containing:
threshold: true
percentile: 99.9
normalize: true
kernel_size: 128
Cellpose
Experimental feature
This is an experimental feature and is only available in the -cellpose flavors of the steinbock Docker container.
Segmentation using cellpose likely requires fine-tuning of parameters, e.g. using steinbock command-line interface options.
Cellpose is a generalist algorithm for cellular segmentation.
End-to-end cell segmentation
This approach operates directly on image intensities and does not require a preceding pixel classification step.
To segment cells using the default cyto2 model:
steinbock segment cellpose --minmax
To segment nuclei using the nuclei model:
steinbock segment cellpose --minmax --model nuclei
Cellpose image data
Cellpose expects two-channel images as input, where the first channel is a nuclear channel (e.g. DAPI) and the second channel is a cytoplasmic channel (e.g. E-Cadherin). Only the cytoplasmic channel ("channel to segment") is required; the nuclear channel is optional. Note that, compared to the original cellpose implementation, the channel order is reversed for compatibility with DeepCell/Mesmer.
If a cellpose column is present in the steinbock panel file, channels are sorted and grouped according to values in that column to generate the required input for Cellpose: for each image, each group of channels is aggregated by computing the mean along the channel axis (use the --aggr option to specify a different aggregation strategy). The resulting images consist of one channel per group; channels without a group label are ignored.
If no cellpose column is present, images are expected to be in the correct format already.
GPU support
Currently, steinbock does not support GPU-accelerated cellpose segmentation.
If GPU support is required, consider running cellpose on your host system independently.