Tests if cell types interact more or less frequently than random

Cell-cell interactions are summarized in different ways and the resulting count is compared to a distribution of counts arising from random permutations.

testInteractions(
  object,
  group_by,
  label,
  colPairName,
  method = c("classic", "histocat", "patch", "interaction"),
  patch_size = NULL,
  iter = 1000,
  p_threshold = 0.01,
  return_samples = FALSE,
  tolerance = sqrt(.Machine$double.eps),
  BPPARAM = SerialParam()
)

Arguments

object: a SingleCellExperiment or SpatialExperiment object.
group_by: a single character indicating the colData(object) entry by which interactions are grouped. This is usually the image ID or patient ID.
label: single character specifying the colData(object) entry which stores the cell labels. These can be cell-type labels or other metadata entries.
colPairName: single character indicating the colPair(object) entry containing cell-cell interactions in the form of an edge list.
method: which cell-cell interaction counting method to use (see details)
patch_size: if method = "patch", a single numeric specifying the minimum number of neighbors of the same type to be considered a patch (see details)
iter: single numeric specifying the number of permutations to perform
p_threshold: single numeric indicating the empirical p-value threshold at which interactions are considered to be significantly enriched or depleted per group.
return_samples: single logical indicating if the permuted interaction counts of all iterations should be returned.
tolerance: single numeric larger than 0. This parameter defines the difference between the permuted count and the actual counts at which both are regarded as equal. Default taken from all.equal.
BPPARAM: parameters for parallelized processing.
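
The role of tolerance can be illustrated with plain floating-point arithmetic. The snippet below is only a sketch of the idea (two counts that are mathematically equal may differ in their last bits); how the function applies the tolerance internally is not shown on this page, and the variable names are made up for illustration.

```r
# Default tolerance, as given in the usage section
tolerance <- sqrt(.Machine$double.eps)

observed <- 0.3
permuted <- 0.1 + 0.2   # mathematically equal to 0.3, but not bit-identical

# Exact comparison fails because of floating-point rounding
exact_equal <- permuted == observed

# Comparison within the tolerance treats both counts as equal
within_tol <- abs(permuted - observed) < tolerance
```

Here exact_equal is FALSE while within_tol is TRUE, which is why ties between permuted and observed counts are judged with a small tolerance rather than with ==.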

Value

a DataFrame containing one row per group_by entry and unique label entry combination (from_label, to_label). The object contains the following entries:

ct: stores the interaction count as described in the details
p_gt: stores the fraction of permutations with a count greater than or equal to ct
p_lt: stores the fraction of permutations with a count less than or equal to ct
interaction: logical indicating whether there is a tendency for a positive interaction (attraction) between from_label and to_label, i.e. whether p_lt is greater than p_gt.
p: the smaller value of p_gt and p_lt.
sig: is p smaller than p_threshold?
sigval: Combination of interaction and sig.
- -1: interaction == FALSE and sig == TRUE
- 0: sig == FALSE
- 1: interaction == TRUE and sig == TRUE

NA is returned if a certain label is not present in this grouping level.
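
The relationship between these columns can be sketched in base R. The p_gt and p_lt values below are copied from the first rows of the classic example output further down, plus one NA row; the code follows the column descriptions above and is not taken from the package source.

```r
# Empirical p-values for four label pairs (last pair: label absent in the group)
p_gt <- c(0.000999001, 0.020979021, 1.000000000, NA)
p_lt <- c(1.000000000, 0.993006993, 0.000999001, NA)
p_threshold <- 0.01

interaction <- p_lt > p_gt      # TRUE suggests attraction, FALSE avoidance
p <- pmin(p_gt, p_lt)           # the smaller of the two p-values
sig <- p < p_threshold          # significant at the chosen threshold?

# 1: significant attraction, -1: significant avoidance, 0: not significant
sigval <- ifelse(sig, ifelse(interaction, 1, -1), 0)
```

For these inputs sigval is 1, 0, -1 and NA, matching rows 1 to 3 and the NA row of the example output.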

Counting and summarizing cell-cell interactions

In principle, the countInteractions function counts the number of edges (interactions) between each set of unique entries in colData(object)[[label]]. Simplified, it counts for each cell of type A the number of neighbors of type B. This count is averaged within each unique entry of colData(object)[[group_by]] in four different ways:

1. method = "classic": The count is divided by the total number of cells of type A. The final count can be interpreted as "How many neighbors of type B does a cell of type A have on average?"

2. method = "histocat": The count is divided by the number of cells of type A that have at least one neighbor of type B. The final count can be interpreted as "How many neighbors of type B does a cell of type A have on average, given it has at least one neighbor of type B?"

3. method = "patch": For each cell, the count is binarized to 0 (fewer than patch_size neighbors of type B) or 1 (patch_size or more neighbors of type B). The binarized counts are averaged across all cells of type A. The final count can be interpreted as "What fraction of cells of type A have at least a given number of neighbors of type B?"

4. method = "interaction": The count is divided by the total number of interactions from cell type A. The final count can be interpreted as the fraction of interactions of cell type A that occur with cell type B.
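
The four summaries can be compared on a toy graph. The following base R sketch assumes a directed edge list with columns from and to (cell indices) and a single group; summarize_pair is a hypothetical helper written for this illustration, not a function of the package, and the package's own implementation may differ in detail.

```r
# Toy data: cells 1-3 are of type A, cells 4-5 of type B
labels <- c("A", "A", "A", "B", "B")
edges <- data.frame(from = c(1, 1, 1, 2, 3, 4, 5),
                    to   = c(2, 4, 5, 3, 2, 1, 4))

# Hypothetical helper: summarize interactions from `from_label` to `to_label`
summarize_pair <- function(labels, edges, from_label, to_label,
                           patch_size = 2) {
  is_from <- labels == from_label
  # For every cell, count its neighbors of type `to_label`
  hits <- edges$from[labels[edges$to] == to_label]
  n_to <- tabulate(hits, nbins = length(labels))
  n_from_cells <- n_to[is_from]
  # Total number of edges that start at a cell of type `from_label`
  total_edges_from <- sum(labels[edges$from] == from_label)
  c(classic     = sum(n_from_cells) / sum(is_from),
    histocat    = sum(n_from_cells) / sum(n_from_cells > 0),
    patch       = mean(n_from_cells >= patch_size),
    interaction = sum(n_from_cells) / total_edges_from)
}

res <- summarize_pair(labels, edges, from_label = "A", to_label = "B")
```

In this toy graph only cell 1 has neighbors of type B (two of them), and five edges start at cells of type A, so the four summaries are 2/3 (classic), 2 (histocat), 1/3 (patch with patch_size = 2) and 2/5 (interaction).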

Testing for significance

Within each unique entry of colData(object)[[group_by]], the entries of colData(object)[[label]] are randomized iter times. For each iteration, the interactions are counted as described above. The result is a distribution of the interaction count under spatial randomness. The observed interaction count is compared against this null distribution to derive empirical p-values:

p_gt: fraction of permutations with a count greater than or equal to the observed count

p_lt: fraction of permutations with a count less than or equal to the observed count

Based on these empirical p-values, the interaction score (attraction or avoidance), the overall p-value and the significance by comparison to p_threshold (sig and sigval) are derived.
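
The label permutation scheme can be sketched in base R for a single group and a single label pair. classic_count is a hypothetical helper implementing the "classic" summary; the graph is a toy ring in which each cell points to the next one. Adding 1 to numerator and denominator is an assumption, chosen because it is consistent with the smallest p-value in the example output below (1/1001 with iter = 1000), and the way tolerance enters the comparison is likewise assumed.

```r
set.seed(1)

# Hypothetical helper: "classic" count for one label pair
classic_count <- function(labels, edges, from_label, to_label) {
  sum(labels[edges$from] == from_label & labels[edges$to] == to_label) /
    sum(labels == from_label)
}

# Toy ring graph: 10 cells of type A followed by 10 cells of type B
labels <- rep(c("A", "B"), each = 10)
edges <- data.frame(from = 1:20, to = c(2:20, 1))

iter <- 200
tolerance <- sqrt(.Machine$double.eps)

# Observed count: 9 of the 10 A cells point to another A cell
observed <- classic_count(labels, edges, "A", "A")

# Null distribution: shuffle the labels, keep the graph fixed
permuted <- replicate(iter,
                      classic_count(sample(labels), edges, "A", "A"))

# Empirical p-values; counts within `tolerance` of the observed one are ties
p_gt <- (sum(permuted >= observed - tolerance) + 1) / (iter + 1)
p_lt <- (sum(permuted <= observed + tolerance) + 1) / (iter + 1)
```

Because the A cells are maximally clustered in this toy graph, no permutation can exceed the observed count, so p_lt equals 1 and p_gt is small, which corresponds to attraction between cells of type A.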

References

Schulz, D. et al., Simultaneous Multiplexed Imaging of mRNA and Proteins with Subcellular Resolution in Breast Cancer Tissue Samples by Mass Cytometry, Cell Systems 2018 6(1):25-36.e5

Schapiro, D. et al., histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data, Nature Methods 2017 14:873-876

Author

Vito Zanotelli

Jana Fischer

adapted by Nils Eling (nils.eling@dqbm.uzh.ch)

adapted by Marlene Lutz (marlene.lutz@uzh.ch)

Examples

library(cytomapper)
library(BiocParallel)
data(pancreasSCE)

pancreasSCE <- buildSpatialGraph(pancreasSCE, img_id = "ImageNb", 
                                 type = "knn", k = 3)
#> The returned object is ordered by the 'ImageNb' entry.
                               
# Classic style calculation - setting the seed inside SerialParam for reproducibility
(out <- testInteractions(pancreasSCE, 
                         group_by = "ImageNb",
                         label = "CellType", 
                         method = "classic",
                         colPairName = "knn_interaction_graph",
                         iter = 1000,
                         BPPARAM = SerialParam(RNGseed = 123)))
#> DataFrame with 27 rows and 10 columns
#>        group_by  from_label    to_label        ct        p_gt        p_lt
#>     <character> <character> <character> <numeric>   <numeric>   <numeric>
#> 1             1  celltype_A  celltype_A  1.823529 0.000999001 1.000000000
#> 2             1  celltype_A  celltype_B  0.470588 0.020979021 0.993006993
#> 3             1  celltype_A  celltype_C  0.705882 1.000000000 0.000999001
#> 4             1  celltype_B  celltype_A  0.875000 0.050949051 0.988011988
#> 5             1  celltype_B  celltype_B  0.250000 0.511488511 0.792207792
#> ...         ...         ...         ...       ...         ...         ...
#> 23            3  celltype_B  celltype_B  2.363636 0.000999001 1.000000000
#> 24            3  celltype_B  celltype_C  0.636364 1.000000000 0.000999001
#> 25            3  celltype_C  celltype_A        NA          NA          NA
#> 26            3  celltype_C  celltype_B  0.704918 1.000000000 0.000999001
#> 27            3  celltype_C  celltype_C  2.295082 0.000999001 1.000000000
#>     interaction           p       sig    sigval
#>       <logical>   <numeric> <logical> <numeric>
#> 1          TRUE 0.000999001      TRUE         1
#> 2          TRUE 0.020979021     FALSE         0
#> 3         FALSE 0.000999001      TRUE        -1
#> 4          TRUE 0.050949051     FALSE         0
#> 5          TRUE 0.511488511     FALSE         0
#> ...         ...         ...       ...       ...
#> 23         TRUE 0.000999001      TRUE         1
#> 24        FALSE 0.000999001      TRUE        -1
#> 25           NA          NA        NA        NA
#> 26        FALSE 0.000999001      TRUE        -1
#> 27         TRUE 0.000999001      TRUE         1
                                
# Histocat style calculation
(out <- testInteractions(pancreasSCE, 
                         group_by = "ImageNb",
                         label = "CellType", 
                         method = "histocat",
                         colPairName = "knn_interaction_graph",
                         iter = 1000,
                         BPPARAM = SerialParam(RNGseed = 123)))
#> DataFrame with 27 rows and 10 columns
#>        group_by  from_label    to_label        ct        p_gt        p_lt
#>     <character> <character> <character> <numeric>   <numeric>   <numeric>
#> 1             1  celltype_A  celltype_A   1.93750    0.003996 0.997002997
#> 2             1  celltype_A  celltype_B   1.14286    0.226773 0.782217782
#> 3             1  celltype_A  celltype_C   1.33333    1.000000 0.000999001
#> 4             1  celltype_B  celltype_A   1.16667    0.407592 0.609390609
#> 5             1  celltype_B  celltype_B   1.00000    0.648352 0.929070929
#> ...         ...         ...         ...       ...         ...         ...
#> 23            3  celltype_B  celltype_B   2.36364 0.000999001 1.000000000
#> 24            3  celltype_B  celltype_C   1.31250 1.000000000 0.000999001
#> 25            3  celltype_C  celltype_A        NA          NA          NA
#> 26            3  celltype_C  celltype_B   1.65385 0.868131868 0.136863137
#> 27            3  celltype_C  celltype_C   2.45614 0.000999001 1.000000000
#>     interaction           p       sig    sigval
#>       <logical>   <numeric> <logical> <numeric>
#> 1          TRUE 0.003996004      TRUE         1
#> 2          TRUE 0.226773227     FALSE         0
#> 3         FALSE 0.000999001      TRUE        -1
#> 4          TRUE 0.407592408     FALSE         0
#> 5          TRUE 0.648351648     FALSE         0
#> ...         ...         ...       ...       ...
#> 23         TRUE 0.000999001      TRUE         1
#> 24        FALSE 0.000999001      TRUE        -1
#> 25           NA          NA        NA        NA
#> 26        FALSE 0.136863137     FALSE         0
#> 27         TRUE 0.000999001      TRUE         1
                                
# Patch style calculation
(out <- testInteractions(pancreasSCE, 
                         group_by = "ImageNb",
                         label = "CellType", 
                         method = "patch",
                         patch_size = 3,
                         colPairName = "knn_interaction_graph",
                         iter = 1000,
                         BPPARAM = SerialParam(RNGseed = 123)))
#> DataFrame with 27 rows and 10 columns
#>        group_by  from_label    to_label        ct        p_gt        p_lt
#>     <character> <character> <character> <numeric>   <numeric>   <numeric>
#> 1             1  celltype_A  celltype_A  0.235294 0.000999001 1.000000000
#> 2             1  celltype_A  celltype_B  0.000000 1.000000000 0.992007992
#> 3             1  celltype_A  celltype_C  0.000000 1.000000000 0.000999001
#> 4             1  celltype_B  celltype_A  0.000000 1.000000000 0.974025974
#> 5             1  celltype_B  celltype_B  0.000000 1.000000000 0.998001998
#> ...         ...         ...         ...       ...         ...         ...
#> 23            3  celltype_B  celltype_B 0.5151515 0.000999001   1.0000000
#> 24            3  celltype_B  celltype_C 0.0000000 1.000000000   0.0019980
#> 25            3  celltype_C  celltype_A        NA          NA          NA
#> 26            3  celltype_C  celltype_B 0.0655738 0.983016983   0.0559441
#> 27            3  celltype_C  celltype_C 0.5737705 0.000999001   1.0000000
#>     interaction           p       sig    sigval
#>       <logical>   <numeric> <logical> <numeric>
#> 1          TRUE 0.000999001      TRUE         1
#> 2         FALSE 0.992007992     FALSE         0
#> 3         FALSE 0.000999001      TRUE        -1
#> 4         FALSE 0.974025974     FALSE         0
#> 5         FALSE 0.998001998     FALSE         0
#> ...         ...         ...       ...       ...
#> 23         TRUE 0.000999001      TRUE         1
#> 24        FALSE 0.001998002      TRUE        -1
#> 25           NA          NA        NA        NA
#> 26        FALSE 0.055944056     FALSE         0
#> 27         TRUE 0.000999001      TRUE         1
                         
# Interaction style calculation
(out <- testInteractions(pancreasSCE, 
                         group_by = "ImageNb",
                         label = "CellType", 
                         method = "interaction",
                         colPairName = "knn_interaction_graph"))
#> DataFrame with 27 rows and 10 columns
#>        group_by  from_label    to_label        ct        p_gt        p_lt
#>     <character> <character> <character> <numeric>   <numeric>   <numeric>
#> 1             1  celltype_A  celltype_A 0.6078431 0.000999001 1.000000000
#> 2             1  celltype_A  celltype_B 0.1568627 0.023976024 0.992007992
#> 3             1  celltype_A  celltype_C 0.2352941 1.000000000 0.000999001
#> 4             1  celltype_B  celltype_A 0.2916667 0.053946054 0.979020979
#> 5             1  celltype_B  celltype_B 0.0833333 0.504495504 0.774225774
#> ...         ...         ...         ...       ...         ...         ...
#> 23            3  celltype_B  celltype_B  0.787879 0.000999001 1.000000000
#> 24            3  celltype_B  celltype_C  0.212121 1.000000000 0.000999001
#> 25            3  celltype_C  celltype_A        NA          NA          NA
#> 26            3  celltype_C  celltype_B  0.234973 1.000000000 0.000999001
#> 27            3  celltype_C  celltype_C  0.765027 0.000999001 1.000000000
#>     interaction           p       sig    sigval
#>       <logical>   <numeric> <logical> <numeric>
#> 1          TRUE 0.000999001      TRUE         1
#> 2          TRUE 0.023976024     FALSE         0
#> 3         FALSE 0.000999001      TRUE        -1
#> 4          TRUE 0.053946054     FALSE         0
#> 5          TRUE 0.504495504     FALSE         0
#> ...         ...         ...       ...       ...
#> 23         TRUE 0.000999001      TRUE         1
#> 24        FALSE 0.000999001      TRUE        -1
#> 25           NA          NA        NA        NA
#> 26        FALSE 0.000999001      TRUE        -1
#> 27         TRUE 0.000999001      TRUE         1