Helper function to process raw .txt files acquired by the
Hyperion imaging system into a SingleCellExperiment
object. This function is mainly used to read-in data generated from a
"spillover slide". Here, each .txt file contains the measurements of
multiple pixels for a single stain across all open channels.
readSCEfromTXT(
x,
pattern = ".txt$",
metadata_cols = c("Start_push", "End_push", "Pushes_duration", "X", "Y", "Z"),
verbose = TRUE,
read_metal_from_filename = TRUE
)
input can be of different types:
Full path to where the single stain .txt files are located.
A named list object where each entry is a
data.frame
or coercible to one. The names of each entry indicate the
spotted metals (see details).
pattern to select which files should be read in (default
".txt$"
). Only used when x
is a path.
character vector indicating which column entries of the
.txt files should be saved in the colData(sce)
slot.
logical indicating if additional information regarding the spotted and acquired masses should be shown.
should the sample metal and mass be extracted from the file/object names?
returns a SCE object where pixels are stored as columns and acquired channels are stored as rows.
As described in the original publication, single metal spots are acquired using the Hyperion imaging system. Each acquisition corresponds to one spot. All acquisitions are stored in a single .mcd file and individual acquisitions are stored in single .txt files.
This function aggregates these measurements into a single
SingleCellExperiment
object. For this, two inputs are possible:
x
is a path:
By default all .txt files are read in from the specified path. Here, the path
should indicate the location of the spillover slide measurement. The file
names of the .txt file must contain the spotted metal isotope name in the
format (mt)(mass)
(e.g. Sm152
for Samarium isotope with the
atomic mass 152). Internally, the last occurrence of such a pattern is read
in as the metal isotope name and stored in the colData(sce)$sample_id
slot.
x
is a named list:
If there are issues with reading in the metal isotope names from the .txt
file names, the user can provide a list for which each entry contains the
contents of a single .txt file. The names of the list must indicate the
spotted metal in the format (mt)(mass)
. These names will be stored in
the colData(sce)$sample_id
slot.
When read_metal_from_filename = FALSE
, the function will not attempt
to read in the spotted metal isotopes from the file or list names. Therefore,
only the sample_id
will be set based on the file/list names.
# Read files from path
path <- system.file("extdata/spillover", package = "imcRtools")
sce <- readSCEfromTXT(path)
#> Spotted channels: Dy161, Dy162, Dy163, Dy164
#> Acquired channels: Dy161, Dy162, Dy163, Dy164
#> Channels spotted but not acquired:
#> Channels acquired but not spotted:
sce
#> class: SingleCellExperiment
#> dim: 4 400
#> metadata(0):
#> assays(1): counts
#> rownames(4): 161Dy(Dy161Di) 162Dy(Dy162Di) 163Dy(Dy163Di)
#> 164Dy(Dy164Di)
#> rowData names(2): channel_name marker_name
#> colnames(400): Dy161.1 Dy161.2 ... Dy164.99 Dy164.100
#> colData names(9): Start_push End_push ... sample_metal sample_mass
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
# Read files as list
cur_file_names <- list.files(path, pattern = ".txt", full.names = TRUE)
cur_files <- lapply(cur_file_names, read.delim)
names(cur_files) <- sub(".txt", "", basename(cur_file_names))
sce <- readSCEfromTXT(cur_files)
#> Spotted channels: Dy161, Dy162, Dy163, Dy164
#> Acquired channels: Dy161, Dy162, Dy163, Dy164
#> Channels spotted but not acquired:
#> Channels acquired but not spotted:
sce
#> class: SingleCellExperiment
#> dim: 4 400
#> metadata(0):
#> assays(1): counts
#> rownames(4): X161Dy.Dy161Di. X162Dy.Dy162Di. X163Dy.Dy163Di.
#> X164Dy.Dy164Di.
#> rowData names(2): channel_name marker_name
#> colnames(400): Dy161.1 Dy161.2 ... Dy164.99 Dy164.100
#> colData names(9): Start_push End_push ... sample_metal sample_mass
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):