Generates a SingleCellExperiment from .txt files

Helper function to process raw .txt files acquired by the Hyperion imaging system into a SingleCellExperiment object. This function is mainly used to read-in data generated from a "spillover slide". Here, each .txt file contains the measurements of multiple pixels for a single stain across all open channels.

readSCEfromTXT(
  x,
  pattern = ".txt$",
  metadata_cols = c("Start_push", "End_push", "Pushes_duration", "X", "Y", "Z"),
  verbose = TRUE,
  read_metal_from_filename = TRUE
)

Arguments

x

input can be of different types:

A path: Full path to where the single stain .txt files are located.
A list object: A named list object where each entry is a data.frame or coercible to one. The names of each entry indicate the spotted metals (see details).

pattern

pattern to select which files should be read in (default ".txt$"). Only used when x is a path.

metadata_cols

character vector indicating which column entries of the .txt files should be saved in the colData(sce) slot.

verbose

logical indicating if additional information regarding the spotted and acquired masses should be shown.

read_metal_from_filename

should the sample metal and mass be extracted from the file/object names?

Value

returns a SCE object where pixels are stored as columns and acquired channels are stored as rows.

Reading in .txt files for spillover correction

As described in the original publication, single metal spots are acquired using the Hyperion imaging system. Each acquisition corresponds to one spot. All acquisitions are stored in a single .mcd file and individual acquisitions are stored in single .txt files.

This function aggregates these measurements into a single SingleCellExperiment object. For this, two inputs are possible:

x is a path: By default all .txt files are read in from the specified path. Here, the path should indicate the location of the spillover slide measurement. The file names of the .txt file must contain the spotted metal isotope name in the format (mt)(mass) (e.g. Sm152 for Samarium isotope with the atomic mass 152). Internally, the last occurrence of such a pattern is read in as the metal isotope name and stored in the colData(sce)$sample_id slot.
x is a named list: If there are issues with reading in the metal isotope names from the .txt file names, the user can provide a list for which each entry contains the contents of a single .txt file. The names of the list must indicate the spotted metal in the format (mt)(mass). These names will be stored in the colData(sce)$sample_id slot.

When read_metal_from_filename = FALSE, the function will not attempt to read in the spotted metal isotopes from the file or list names. Therefore, only the sample_id will be set based on the file/list names.

References

Chevrier, S. et al. 2017. “Compensation of Signal Spillover in Suspension and Imaging Mass Cytometry.” Cell Systems 6: 612–20.

Author

Nils Eling (nils.eling@dqbm.uzh.ch)

Examples

# Read files from path
path <- system.file("extdata/spillover", package = "imcRtools")

sce <- readSCEfromTXT(path) 
#> Spotted channels:  Dy161, Dy162, Dy163, Dy164
#> Acquired channels:  Dy161, Dy162, Dy163, Dy164
#> Channels spotted but not acquired:  
#> Channels acquired but not spotted:  
sce
#> class: SingleCellExperiment 
#> dim: 4 400 
#> metadata(0):
#> assays(1): counts
#> rownames(4): 161Dy(Dy161Di) 162Dy(Dy162Di) 163Dy(Dy163Di)
#>   164Dy(Dy164Di)
#> rowData names(2): channel_name marker_name
#> colnames(400): Dy161.1 Dy161.2 ... Dy164.99 Dy164.100
#> colData names(9): Start_push End_push ... sample_metal sample_mass
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

# Read files as list
cur_file_names <- list.files(path, pattern = ".txt", full.names = TRUE)
cur_files <- lapply(cur_file_names, read.delim)
names(cur_files) <- sub(".txt", "", basename(cur_file_names))

sce <- readSCEfromTXT(cur_files) 
#> Spotted channels:  Dy161, Dy162, Dy163, Dy164
#> Acquired channels:  Dy161, Dy162, Dy163, Dy164
#> Channels spotted but not acquired:  
#> Channels acquired but not spotted:  
sce
#> class: SingleCellExperiment 
#> dim: 4 400 
#> metadata(0):
#> assays(1): counts
#> rownames(4): X161Dy.Dy161Di. X162Dy.Dy162Di. X163Dy.Dy163Di.
#>   X164Dy.Dy164Di.
#> rowData names(2): channel_name marker_name
#> colnames(400): Dy161.1 Dy161.2 ... Dy164.99 Dy164.100
#> colData names(9): Start_push End_push ... sample_metal sample_mass
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):