Pixel classification

In this step, for each image, the probabilities of pixels belonging to a given class (e.g. Nucleus, Cytoplasm, Background) will be determined. This will result in probability images, with one color per class encoding the probability of pixels belonging to that class (see File types).

Pixel classification-based image segmentation

Probability images generated by pixel classification can be used to segment images, see Object segmentation.

Various classification approaches are supported, each of which is described in the following.

Ilastik

Ilastik is an application for interactive learning and segmentation. Here, Ilastik's semantic pixel classification workflow is used to perform pixel classification using random forests.

Data preparation

In a first step, input data are prepared for processing with Ilastik:

steinbock classify ilastik prepare --cropsize 50 --seed 123

With default destination file/directory paths shown in brackets, this will:

aggregate, scale and convert images to steinbock Ilastik format (ilastik_img)
extract and save one random crop of 50x50 pixels per image for training (ilastik_crops)
create a default steinbock Ilastik pixel classification project file (pixel_classifier.ilp)

By specifying the --seed parameter, this command reproducibly extracts crops from the same pseudo-random locations when executed repeatedly.
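The effect of a fixed seed can be illustrated with a minimal sketch. This is not steinbock's implementation: the function names, the use of Python's random module, and the handling of images smaller than the crop size are all assumptions made for illustration.

```python
import random


def random_crop_origin(height, width, cropsize, rng):
    """Pick the top-left corner of a cropsize x cropsize window inside an image."""
    if height < cropsize or width < cropsize:
        # Assumption of this sketch: images that are too small yield no crop.
        return None
    y = rng.randint(0, height - cropsize)  # randint bounds are inclusive
    x = rng.randint(0, width - cropsize)
    return y, x


def crop_origins(image_shapes, cropsize, seed):
    """Return one crop origin per image, reproducibly for a given seed."""
    rng = random.Random(seed)  # a single generator, seeded once
    return [random_crop_origin(h, w, cropsize, rng) for h, w in image_shapes]


# Hypothetical image sizes (height, width) in pixels
shapes = [(600, 600), (512, 768), (40, 40)]
first = crop_origins(shapes, cropsize=50, seed=123)
second = crop_origins(shapes, cropsize=50, seed=123)
print(first == second)  # same seed, same image order: same crop locations
```

Reproducibility in this sketch depends on the images being visited in the same order on every run, since each crop consumes numbers from the same generator.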

Ilastik image data

All generated image data are saved in steinbock Ilastik HDF5 format (undocumented).

If an ilastik column is present in the steinbock panel file, channels are sorted and grouped according to values in that column: For each image, each group of channels is aggregated by computing the mean along the channel axis (use the --aggr option to specify a different aggregation strategy). The generated Ilastik images consist of one channel per group; channels without a group label are ignored. In addition, the mean of all included channels is prepended to the generated Ilastik images as an additional channel, unless --no-mean is specified.
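The grouping and aggregation described above can be sketched in plain Python. The function and variable names are hypothetical, and the sketch assumes that the prepended mean is computed over the generated group channels, which is one reading of "all included channels"; it is not taken from the steinbock source.

```python
from statistics import mean


def aggregate_channels(image, groups, prepend_mean=True):
    """Aggregate channels by group label.

    image:  list of channels, each channel a list of rows (C, Y, X order)
    groups: one group label per channel, or None for channels to ignore
    """
    # Collect the channel indices belonging to each group label.
    members = {}
    for index, label in enumerate(groups):
        if label is not None:
            members.setdefault(label, []).append(index)

    height, width = len(image[0]), len(image[0][0])

    def pixelwise_mean(channels):
        return [
            [mean(channel[y][x] for channel in channels) for x in range(width)]
            for y in range(height)
        ]

    # One output channel per group, ordered by group label.
    output = [
        pixelwise_mean([image[i] for i in members[label]])
        for label in sorted(members)
    ]
    if prepend_mean and output:
        # Assumption: the extra channel is the mean of the group channels.
        output.insert(0, pixelwise_mean(output))
    return output


# Four channels of a 1x2 image; the last channel has no group label.
image = [[[1.0, 2.0]], [[3.0, 4.0]], [[10.0, 20.0]], [[100.0, 100.0]]]
groups = [1, 1, 2, None]
result = aggregate_channels(image, groups)
```

Here, the two channels of group 1 are averaged into one channel, group 2 is passed through unchanged, and the unlabeled channel does not contribute to any output channel.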

Furthermore, all generated Ilastik images are scaled two-fold in x and y, unless specified otherwise using the --scale command-line option. This helps to identify object borders more accurately in segmentation workflows for images of relatively low resolution (e.g. Imaging Mass Cytometry). In applications with higher resolution (e.g. sequential immunofluorescence), it is recommended not to scale the image data, i.e., to specify --scale 1.
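As an illustration of what two-fold scaling in x and y means for the pixel grid, the following sketch enlarges a single channel by pixel repetition. Nearest-neighbour repetition is an assumption chosen for simplicity; the interpolation that steinbock applies is not specified in this section, and the function name is hypothetical.

```python
def upscale_nearest(channel, factor=2):
    """Enlarge a 2D channel (list of rows) by repeating each pixel factor times."""
    scaled = []
    for row in channel:
        # Repeat every value along x ...
        wide_row = [value for value in row for _ in range(factor)]
        # ... and repeat the widened row along y.
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


channel = [[1, 2],
           [3, 4]]
scaled = upscale_nearest(channel)  # 2x2 pixels become 4x4 pixels
unscaled = upscale_nearest(channel, factor=1)  # analogous to --scale 1
```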

Training the classifier

To interactively train a new classifier, open the pixel classification project in Ilastik (see Apps):

steinbock apps ilastik

Data/working directory

Within the container, your data/working directory containing the Ilastik project file is accessible under /data.

More detailed instructions on how to use Ilastik for training a pixel classifier can be found in the Ilastik documentation.

Class labels for segmentation

By default, the Ilastik pixel classification project is configured for training three classes (Nucleus, Cytoplasm, Background) for cell segmentation. Other segmentation workflows may require different numbers of classes and class labels (e.g. two classes for Tumor/Stroma segmentation). While the number and order of classes are arbitrary and can be changed by the user, they need to be compatible with downstream segmentation steps.

Feature selection

The choice of features in Ilastik's feature selection step depends on the input data. For relatively small IMC datasets, selecting all default features with a scale (sigma) of 1 pixel or larger is recommended.

Existing training data

Experimental feature

Reusing existing training data is an experimental feature. Use at your own risk. Always make backups of your data.

Instead of training a new classifier, one can use an existing classifier by

replacing the generated Ilastik pixel classification project file with a pre-trained project, and
replacing the image crops (see Data preparation) with the crops originally used for training.

Then, to make the external Ilastik project file and crops compatible with steinbock, run:

steinbock classify ilastik fix

This will attempt to patch the Ilastik pixel classification project and the image crops in place. A backup (.bak file/directory extension) is created first, unless --no-backup is specified.

Patching existing training data

This command will convert image crops to 32-bit floating point images with CYX dimension order and save them in steinbock Ilastik HDF5 format (undocumented). It will then adjust the metadata in the Ilastik project file accordingly.
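To clarify what "32-bit floating point images with CYX dimension order" refers to, the sketch below reorders a small crop from a hypothetical YXC layout (channel last) to CYX (channel first) and rounds the values to 32-bit floating point precision. The input layout, the function names and the use of nested lists are assumptions for illustration only; the sketch does not read or write the HDF5 files and does not touch the project metadata.

```python
import struct


def to_float32(value):
    """Round a number to the nearest 32-bit floating point value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def yxc_to_cyx(image):
    """Reorder a nested-list image from (Y, X, C) to (C, Y, X)."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[to_float32(image[y][x][c]) for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A crop with 1 row, 2 columns and 2 channels, stored channel-last
crop = [[[1, 2], [3, 4]]]
converted = yxc_to_cyx(crop)  # 2 channels, each with 1 row and 2 columns
```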

Batch processing

After training the pixel classifier on the image crops (or providing and patching a pre-trained one), it can be applied to a batch of full-size images created in the Data preparation step as follows:

steinbock classify ilastik run

By default, this will create probability images in ilastik_probabilities, with one color per class encoding the probability of pixels belonging to that class (see File types).

Probability images

The generated probability images have the same size as the Ilastik input images, i.e., they are scaled by a user-specified factor that defaults to 2 (see above). If applicable, make sure to adapt downstream segmentation workflows accordingly to create object masks matching the original (i.e., unscaled) images.
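One way to picture the size mismatch: an object mask derived from a two-fold scaled probability image has twice the height and width of the original image. The sketch below brings such a mask back to the original grid by keeping every second pixel, which preserves integer object labels. This is only an illustration under that assumption; the function name is hypothetical, and downstream segmentation tools may handle resizing differently.

```python
def downscale_mask(mask, factor=2):
    """Subsample a 2D label mask (list of rows) by keeping every factor-th pixel."""
    return [row[::factor] for row in mask[::factor]]


# A 4x4 mask with two objects (labels 1 and 2), as obtained from a 2x scaled image
scaled_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
]
original_size_mask = downscale_mask(scaled_mask)  # 2x2, matching the unscaled image
```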

If the default three-class structure is used, the probability images are RGB images with the following color code:

Red: Nucleus
Green: Cytoplasm
Blue: Background
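The color code can be read back per pixel by taking the channel with the highest value. The sketch below assumes the default three-class order given above and pixels represented as (red, green, blue) tuples; the names are hypothetical and the value range of the probability images is not assumed.

```python
# Default class order: red, green, blue
CLASS_NAMES = ("Nucleus", "Cytoplasm", "Background")


def most_likely_class(pixel):
    """Return the class whose channel has the highest value in an RGB pixel."""
    # On ties, max() keeps the first channel with the highest value.
    index = max(range(len(CLASS_NAMES)), key=lambda i: pixel[i])
    return CLASS_NAMES[index]


labels = [most_likely_class(p) for p in [(200, 30, 25), (10, 180, 65), (5, 20, 230)]]
```

If the classes were reordered or renamed during training, CLASS_NAMES would need to be adapted to match.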