Abstract
Multiplexed imaging technologies enable the study of biological tissues at single-cell resolution while preserving spatial information. Currently, high-dimension imaging data analysis is technology-specific and requires multiple tools, restricting analytical scalability and result reproducibility. Here we present SIMPLI (Single-cell Identification from MultiPLexed Images), a flexible and technology-agnostic software that unifies all steps of multiplexed imaging data analysis. After raw image processing, SIMPLI performs a spatially resolved, single-cell analysis of the tissue slide as well as cell-independent quantifications of marker expression to investigate features undetectable at the cell level. SIMPLI is highly customisable and can run on desktop computers as well as high-performance computing environments, enabling workflow parallelisation for large datasets. SIMPLI produces multiple tabular and graphical outputs at each step of the analysis. Its containerised implementation and minimum configuration requirements make SIMPLI a portable and reproducible solution for multiplexed imaging data analysis. Software is available at “SIMPLI [https://github.com/ciccalab/SIMPLI]”.
Subject terms: Image processing, Software, Imaging
Current high-dimension imaging data analysis methods are technology-specific and require multiple tools, restricting analytical scalability and result reproducibility. Here the authors present SIMPLI, a software that overcomes these limitations for single-cell and pixel analysis of multiplexed images at spatial resolution.
Introduction
A detailed investigation of tissue composition and function in health and disease requires spatially resolved, single-cell approaches that precisely quantify cell types and states as well as their interactions in situ. Recent technological advances have enabled to stain histological sections with multiple tagged antibodies that are subsequently detected using fluorescence microscopy or mass spectrometry1. High-dimensional imaging approaches such as imaging mass cytometry (IMC)2, multiplexed ion beam imaging (MIBI)3, co-detection by indexing (CODEX)4, multiplexed immunofluorescence (mIF, including cycIF)5 and multiplexed immunohistochemistry (mIHC)6,7 enable quantification and localisation of cells in sections from formalin-fixed paraffin-embedded (FFPE) tissues, including clinical diagnostic samples. This is of particular value for mapping the tissue-level characteristics of disease conditions and predicting the outcome of therapies that depend on the tissue environment, such as cancer immunotherapy. For example, a recent IMC phenotypic screen of breast cancer subtypes revealed the association between the heterogeneity of somatic mutations and that of the tumour microenvironment8. Similarly, a CODEX-based profile of FFPE tissue microarrays from high-risk colorectal cancer patients correlated PD1+CD4+ T cells with patient survival9.
The analysis of multiplexed images requires the conversion of pixel intensity data into single-cell data, which can then be characterised phenotypically, quantified comparatively and localised spatially in the tissue. Currently available tools are technology specific and cover only some steps of the whole analytical workflow (Table 1). For example, several computational approaches have been developed to process raw images and extract single-cell data either interactively (Ilastik10, CellProfiler411, CODEX Toolkit4) or via command line (imcyto12, ImcSegmentationPipeline13). Distinct sets of tools can then perform cell phenotyping (CellProfiler Analyst14, Cytomapper15, Immunocluster16) or analyse cell–cell spatial interactions (CytoMap17, ImaCytE18, SPIAT19, neighbouRhood20). Similarly, a few tools enable direct pixel-based analysis through pixel classification10 or quantification of pixel positive areas11. Despite such a variety of tools, none of them can perform all of the required analytical steps in a common pipeline. Two exceptions are histoCAT++21 and QuPath22, which however have been developed specifically for interactive use and are not well suited for the analysis of large datasets. Moreover, all of these tools rely on ad hoc configuration files and input formats, making the analysis challenging for users with limited computational skills and restricting the scalability, portability and reproducibility in different computing environments.
Table 1.
Computational tool | Image processing | Cell segmentation | Cell phenotyping preselected | Cell phenotyping unsupervised | Spatial analysis homotypic | Spatial analysis heterotypic | Pixel analysis | Parallelisation | Imaging technologies |
---|---|---|---|---|---|---|---|---|---|
SIMPLI | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | 1–6 |
CODEX Toolkit4 | Yes | Yes | Yes | Yes | Yes | Yes | No | No | 3 |
CellProfiler411 | Yes | Yes | Yes | No | No | Partial | Yes | Yes | 1–6 |
HistoCAT++21 | Yes | Yes | Yes | Yes | No | Yes | No | No | 1,2,4,5 |
QuPath22 | Yes | Yes | Yes | No | No | No | Yes | Yes | 1–6 |
Cytomapper15 | Partial | Yes | Yes | No | No | No | Yes | Yes | 1–5 |
Ilastik10 | No | Yes | Yes | No | Partial | No | Partial | Yes | 1–6 |
ImcSegmentationPipeline13 | Yes | Yes | No | No | No | Partial | No | Yes | 1 |
Imcyto12 | Yes | Yes | No | No | No | No | No | Yes | 1 |
SPIAT19 | No | No | Yes | No | Yes | Yes | No | No | 2,5,6 |
Giotto47 | No | No | No | Yes | Yes | Yes | No | No | 1–6 |
ImaCytE18 | No | No | No | Yes | No | Yes | No | No | 1 |
CellProfiler Analyst14 | No | No | Yes | No | No | No | No | No | 1–6 |
Immunocluster16 | No | No | No | Yes | No | No | No | No | 1 |
NeighbouRhood20 | No | No | No | No | No | Yes | No | No | 1–5 |
CytoMAP17 | No | No | No | No | No | Yes | No | No | 2 |
For each tool, reported are the steps of the analytical workflow that it can perform, whether it can be parallelised and the multiplexed imaging platform it can be applied to (1: IMC; 2: mIF; 3: CODEX; 4: MIBI; 5: mIHC; 6: spatial transcriptomic visualisation). A method was considered compatible with a given imaging technology if this was reported in the original publication or other studies.
Here we introduce SIMPLI (Single-cell Identification from MultiPLexed Images), a tool that combines processing of raw images, extraction of single-cell data, and spatially resolved quantification of cell types or functional states into a single pipeline (Table 1). This is achieved through the integration of well-established tools and newly developed scripts into the same workflow, enabling ad hoc configurations of the analysis while ensuring interoperability between its different parts. SIMPLI can be run on desktop computers as well as on high-performance-computing environments, where it can be easily applied to large datasets due to automatic workflow parallelisation. To demonstrate the flexibility of SIMPLI to work with different technologies and experimental conditions, we analyse the phenotypes and spatial distribution of cells in different tissues (human colon, appendix, colorectal cancer) using multiplexed images obtained with distinct technologies (IMC, mIF, CODEX).
Results
Overview of the SIMPLI analytical workflow
SIMPLI performs the analysis of multiplexed imaging data in three steps (Methods, Fig. 1) integrating well-established and newly developed standalone processes (Supplementary Fig. 1). Each process can be run independently or even skipped with the possibility of using alternative input data at each point of the workflow.
The first step of SIMPLI consists of processing raw data from single or multi-channel images or text files from a variety of high-dimensional imaging technologies (Fig. 1a and Supplementary Fig. 1a). After data extraction, pixel values for each marker can be optionally normalised by rescaling them in each sample. This allows the user to apply the same thresholds for background noise reduction across samples. Alternatively, the normalisation step can be skipped and sample-specific thresholds can be applied directly to individual, non-normalised images to minimise the effect of non-uniform staining. This is recommended for example if markers have low signal-to-noise ratios and the resulting thresholds may be too restrictive or if platform-specific normalisation is required. In the last step of data processing, masks for specific tissue compartments or markers are derived using a fully customisable pipeline based on CellProfiler411, where the user can apply filters, thresholds and morphological operations to each image. The resulting processed images can then be analysed at the cell (Fig. 1b) and pixel (Fig. 1c) levels.
The cell-based analysis aims to investigate the qualitative and quantitative cell composition of the tissue and is composed of (1) single-cell data extraction, (2) cell phenotyping and (3) spatial analysis of cell–cell distances (Fig. 1b).
To extract cell data, SIMPLI implements single-cell segmentation using either a deterministic11 or a deep learning23 approach (Supplementary Fig. 1b). The former enables deterministic filtering based on cells size and shape, as well as marker intensities. The latter applies pre-trained models (either provided by SIMPLI or supplied by the user) to identify cells with high accuracy. After cell segmentation, SIMPLI produces the masks of the individual cells and calculates the expression values for each marker in each cell. Cells belonging to tissue compartments or positive for certain markers can then be identified based on their overlap with the previously derived tissue or marker masks.
To define the cell phenotypes, SIMPLI uses two alternative approaches (Supplementary Fig. 1b). The first applies unsupervised clustering to all cells or preselected subsets of cells (for example those mapping to specific tissue compartments or positive for certain markers) using marker expression levels. This leads to the unbiased classification of cells into clusters with similar expression profiles indicating similar phenotypes. The second approach identifies cells with designated phenotypes by applying combinations of user-defined thresholds to the expression values of the markers of interest. These thresholds can be identified through an expert-guided examination of the original images or using the visualisation plots produced by SIMPLI. The two approaches can be used independently or as cross-validation of the cell phenotypes.
To identify cell aggregations within the same (homotypic) or across different (heterotypic) cell types, SIMPLI implements a spatial analysis of the distance between cells within the imaged tissue (Supplementary Fig. 1b). In the case of homotypic aggregations, SIMPLI identifies groups of cells of the same type within a user-defined distance and visually localises them as clusters in the tissue image. In the case of heterotypic aggregations, SIMPLI computes the distance distribution between distinct cell types and compares them across cell types and experimental conditions. Observed distance distributions can also be compared to expected distributions obtained by randomly reshuffling the cell identities in each sample.
The pixel-based approach implemented in SIMPLI enables quantification of areas positive for a specific marker or combination of markers, independently of their association with cells (Fig. 1c and Supplementary Fig. 1c). The obtained marker-positive areas are then normalised over the area of the whole image, or those of specific tissue compartments or positive for certain markers using the predefined masks, to allow comparisons across samples. The pixel-based analysis is useful for the investigation of tissue features that are not detectable at the cell level. For instance, extracellular or secreted proteins cannot be quantified with approaches dependent on cell segmentation. In addition, being completely cell agnostic, the pixel-based analysis can provide independent validation of cell-based observations.
SIMPLI generates tables, plots and images as outputs of each process, thus enabling the visualisation of results at every step of the analysis.
IMC quantification of secreted and cell-associated IgA in human colon
To test its performance and versatility, we applied SIMPLI to four case studies of multiplexed images obtained with different technologies and with diverse origin, size and resolution of the tissue sections (Table 2).
Table 2.
Case study | Imaging technology | Analysed samples (n) | Channels (n) | ROI (mm2) | Resolution (µm/pixel) | HPC platform | CPU time (h) | Elapsed real time (h) | RAM (GB) | Processes |
---|---|---|---|---|---|---|---|---|---|---|
1 (Fig. 2) | IMC | 6 | 28 | 1.00 | 1.00 | SGE | 00:20:41 | 00:06:10 | 4.1 |
Raw data processing Cell masking Single-cell quantification Pixel intensity comparison |
2 (Fig. 3) | IMC | 1 | 28 | 1.00 | 1.00 | SLURM | 00:06:25 | 00:05:30 | 4.2 |
Raw data processing Cell masking Unsupervised clustering Expression thresholding Homotypic cell distances |
3 (Fig. 4) | mIF | 1 | 7 | 5.45 | 0.50 | SLURM | 00:11:45 | 00:08:23 | 16.7 |
Thresholding and masking Expression thresholding Heterotypic cell distances |
4 (Fig. 5) | CODEX | 35 | 58 | 1.13 | 0.38 | SGE | 02:32:35 | 00:26:01 | 22.5 |
Expression thresholding Heterotypic cell distances |
For each case study, listed are the imaging technologies used to generate the tissue images, the number of samples and markers used, the size of the analysed region of interest (ROI), the resolution of the obtained images, the high-performance (HPC) platform and the computational resources employed to perform the analysis. These include the central processing unit (CPU) time and the elapsed real time, as well as the maximum random access memory (RAM) memory used. Finally, the specific analytical processes performed in each case study are also listed (single-cell segmentation was performed in all of them).
SGE Sun Grid Engine, SLURM Simple Linux Utility for Resource Management.
As a first case study, we used SIMPLI to compare the levels of secreted and cell-associated immunoglobulin A (IgA), the major immunoglobulin isotype in intestinal mucosa24, from IMC-derived multiplexed images of normal human colon. We stained six colon sections (CLN1-CLN6, Supplementary Data 1) with 26 antibodies marking T cells, macrophages, dendritic cells and B cells as well as stromal components (Supplementary Data 2) and ablated one region of interest (ROI) per sample.
Using SIMPLI, we extracted and normalised the 28 single-channel images (26 antibodies and two DNA intercalators) for each of the six ROIs and combined them into a single image per ROI (Fig. 2a). This normalisation enabled the selection of a single threshold for each marker to be used across all samples, thus reducing the complexity of the analysis configuration. By applying these thresholds to the E-cadherin and vimentin expression, we obtained the masks for the epithelium and the lamina propria, respectively (Fig. 2b). We used these masks to assign cells to the two compartments and normalise marker values or positive areas in the downstream analyses.
We then used the pixel-based approach to quantify both the IgA expressed by the plasma cells resident in the diffuse lymphoid tissue of the lamina propria as well as the secreted IgA undergoing transcytosis to traverse the epithelial compartment (Fig. 2b). As expected, most secreted IgA was localised in the epithelial crypts with only minimal presence of IgA+ area in the surface epithelium (Supplementary Fig. 2a). Quantification of the normalised IgA+ areas in the two compartments (Supplementary Fig. 2b) confirmed higher IgA+ levels in the lamina propria than in the epithelium (Fig. 2c). To assess the impact of image normalisation performed in the data processing step, we repeated the same analysis starting from the raw images and applying sample-specific thresholds to remove the background noise. The resulting IgA levels correlated linearly with those obtained from normalised images (Supplementary Fig. 2c), showing that data normalisation does not impact the results.
Next, we quantified the IgA+ plasma cells in the lamina propria using the cell-based approach. First, we performed single-cell segmentation with the deterministic approach and retained only cells overlapping for at least 30% or their area with the lamina propria mask (Fig. 2d and Supplementary Fig. 2d). We verified that varying the threshold of the overall had a minimal impact on the proportion of cells assigned to the lamina propria (Supplementary Fig. 2e). We then identified IgA+ plasma cells, T cells, macrophages, and dendritic cells resident in the lamina propria according to the highest overlap between the cell area and the mask of each immune cell population (Fig. 2e). Again, we verified that the relative proportion of these cell populations changed only minimally varying the threshold of the overlap with the lamina propria mask (Supplementary Fig. 2f). Finally, we quantified the four immune cell populations across the six samples and observed that IgA+ plasma cells constitute approximately 25% of all immune cells (Fig. 2f). This is consistent with previous quantifications of the fraction of plasma cells over the total mononucleated cells in the lamina propria of healthy individuals25.
The relative proportion of IgA+ plasma cells positively correlated with the normalised IgA+ area in the lamina propria, demonstrating that the quantification from the single-cell analysis is supported by the cell agnostic measurements at the pixel level (Fig. 2g).
Localisation of T follicular helper cells in IMC images of a germinal centre
As a second case study, we used SIMPLI to spatially localise the immune cell populations within a FFPE section of the healthy human appendix (APP1, Supplementary Data 1). After staining the tissue section with 28 markers (26 antibodies and two DNA intercalators, Supplementary Data 2), we performed IMC and used SIMPLI to extract and normalise the single-channel images from the raw IMC data. The resulting combined image revealed a germinal centre in the B cell area and follicle-associated epithelium forming the boundary with the appendiceal lumen (Fig. 3a).
We performed single-cell segmentation with both approaches implemented in SIMPLI and observed high overlap in the identified cells (Supplementary Fig. 3a), indicating a good concordance between the two methods. We then classified 7573 cells obtained with the deterministic segmentation approach in immune and epithelial cells based on the highest overlap with the corresponding masks obtained in the data processing step (Fig. 3b, c). We obtained similar proportions of cells starting from the raw data and applying the z-score normalisation and k-means clustering as implemented in Histocat26 (Supplementary Fig. 3b), again demonstrating that the normalisation implemented in SIMPLI does not impact the downstream analysis.
Next, we used both methods implemented in SIMPLI to further phenotype the T cells identified within the ROI. First, we applied unsupervised clustering using seven markers of T cell function (Supplementary Data 2). After inspection of the resulting clusters at different resolution levels, we selected 0.25 resolution that returned five distinct cell clusters (Fig. 3d). Based on the marker expression profiles, we assigned cluster 1 to CD4+ T cells, cluster 2 to CD8+CD45RO+ T cells, cluster 3 to CD4+CD45RA+ T cells, cluster 4 to CD4+CD45RO+ T cells and cluster 5 to PD1+CD4+ T cells (Fig. 3e). The latter likely represented a set of PD1+ T follicular helper cells known to be located in the germinal centre27. Interestingly, at higher resolution levels, cluster 5 was further divided into two smaller clusters showing PD1 high and low expression (Supplementary Fig. 3c). Similarly, clusters 1 and 2 were further divided into smaller subpopulation based on CD4 and CD45RO expression levels, respectively (Supplementary Fig. 3c). Therefore, although higher resolution levels increase the granularity of cell phenotyping, the unsupervised clustering approach implemented in SIMPLI is robust in identifying similar phenotyping clusters independently of the chosen resolution.
We re-identified the PD1+ T follicular helper cells with the second phenotyping approach based on expression thresholding of CD4 and PD1. Starting from all T cells, we first extracted CD4+ T cells (≥0.1 CD4 expression, Fig. 3f) and, within those, we further identified PD1+ cells (≥0.15 PD1 expression, Fig. 3g). Both thresholds were chosen after manual inspection of the histological images. The expression profile of the resulting PD1+CD4+ T cells (Fig. 3h) closely recapitulated that of cluster 5 (Fig. 3e). We repeated the same analysis for clusters 1–4 confirming the high overlap between cells in unsupervised clusters and those re-identified using marker expression thresholds (Supplementary Fig. 3d). Moreover, these cells showed similar expression profiles (Supplementary Fig. 3e) and spatial localisation (Supplementary Fig. 3f), indicating that cell phenotypes identified with unsupervised clustering can be confirmed through user-guided thresholding of marker expression.
Finally, we investigated the spatial localisation of PD1+ T follicular helper cells within the ROI by analysing their homotypic aggregations. This allowed us to localise a single high-density cluster containing 84% of PD1+CD4+ T cells within the germinal centre (Fig. 3i). This distribution of PD1+CD4+ T cells was in accordance with the localisation of T helper cells in the follicles of secondary lymphoid organs27 and was confirmed by the histological inspection of the tissue image (Fig. 3j).
mIF analysis of spatially resolved cell–cell interactions in rectal cancer
As a third case study, we applied SIMPLI to the spatial analysis of mIF-derived images of a rectal cancer sample (CRC1, Supplementary Data 1) stained with anti CD8, PD1, Ki67, PDL1, CD68, GzB and 4’,6-diamidino-2-phenylindole (DAPI) antibodies (Supplementary Data 2). We focused on a 5-mm2 ROI, rich in T cells and located at the invasive margin of the tumour (Fig. 4a). This allowed us to characterise the cell–cell interactions between PDL1+ cells and PD1+CD8+ T cells at the tumour boundary in a larger ROI, supporting the scalability of SIMPLI to the analysis of large regions (Table 2).
After image normalisation and single-cell segmentation, we identified PDL1+ and PD1+CD8+ cells by applying expert-defined thresholds to PDL1 (≥0.01), CD8 (≥0.01), and PD1 (≥0.005) expression levels, respectively. We extracted 2026 PDL1+ cells (Fig. 4b) and 3177 CD8+ cells, 94 of which also expressed PD1 (Fig. 4c). The two sets of PDL1+ and PD1+CD8+ cells constituted 3.7% and 0.2% of all cells in the analysed region, respectively (Fig. 4d). We confirmed similar proportions of PDL1+ and PD1+CD8+ cells by performing signal unmixing, cell segmentation and cell phenotyping with the Inform tissue analysis software28 (Akoya Biosciences, Fig. 4e).
We characterised the spatial relationship between these cells, focusing on the ones in close proximity to each other. Using the Euclidean distances between their centroids, we identified 35 PDL1+ cells and 21 PD1+CD8+ T cells at a distance lower than 12 μm apart, which corresponded to twice the maximum cell radius length. We considered these cells proximal enough to be engaging in PD1-PDL1 mediated interactions. By comparing PD1+CD8+ T cells proximal to PDL1+ cells and PD1+CD8+ T cells distal to PDL1+ cells, we found no difference in the expression of cytotoxicity (GzB) or proliferation (ki67) markers (Fig. 4f). This is in line with the broad range of cytotoxic activity in this T cell subset observed in colorectal cancer29. On the contrary, PDL1+ cells proximal to PD1+CD8+ T cells expressed higher levels of CD68 than PDL1+ cells distal to PD1+CD8+ T cells (Fig. 4g), suggesting spatial proximity between PDL1+ macrophages and PD1+CD8+ T cells. To validate this observation, we identified 1392 macrophages by applying an expert-defined threshold to CD68 expression value (≥0.01, Fig. 4h). We then classified these macrophages as PDL1- and PDL1+ cells, respectively, using 0.1 PDL1 expression threshold. Comparing the distance of the resulting two populations from the nearest PD1+CD8+ T cells, we confirmed that PDL1+CD68+ macrophages were significantly closer to PD1+CD8+ T cells than PDL1-CD68+ macrophages (Fig. 4i). By inspecting the imaged tissue at ×40 magnification, we confirmed the localisation of PDL1+CD68+ macrophages in close proximity to PD1+CD8+ cells, as well as the presence of both PD1+CD8+GzB- T cells and PD1+CD8+GzB+ T cells proximal to PDL1+ cells (Fig. 4j).
Comparison of cell distances in CODEX images of colorectal cancer subtypes
As a fourth case study, we used SIMPLI to compare the distances between immune cells and tumour or endothelial cells in Crohn’s-like reaction (CLR) and diffuse inflammatory infiltration (DII) colorectal cancer subtypes9. The high-dimensional imaging data were derived from 35 colorectal cancer samples (17 CLRs and 18 DIIs, Supplementary Data 1) and were obtained using CODEX with a 56 marker panel9 (Supplementary Data 2). Such a large number of antibodies enabled the identification and spatial localisation of T cells, B cells, plasma cells, macrophages, NK cells, granulocytes, dendritic cells, tumour cells, neuroendocrine cells, smooth muscle, nerves, lymphatic and blood vessels (Fig. 5a).
We processed the raw data from the original study, including normalisation. We then performed single-cell segmentation and quantified the main cell types identified in the original study9 by applying expert-defined thresholds to the expression of markers representative of each population (CDX2, MUC1 or cytokeratin for tumour cells; CD34 or CD31 for endothelial cells; vimentin for stromal cells; CD11c for dendritic cells; CD38 for B cells; CD3 and CD4 for CD4+ T cells; CD3, CD4 and FOXP3 for Tregs; CD3 and CD8 for CD8+ T cells, CD68 for macrophages). The obtained relative proportions of immune cells across all samples were highly concordant with those reported in the original study9 (Fig. 5b).
We then measured the distances of the main immune cell types from tumour cells and blood vessels by performing a heterotypic spatial analysis as implemented in SIMPLI. First, we calculated the distances of each macrophage, CD8+ T cell, CD4+ T cell, Treg and B cell to the nearest tumour cell or endothelial cell using the coordinates of the cell centroids. From these, we derived the corresponding distance distributions from the nearest tumour cell or endothelial cell in each sample. Finally, we compared the resulting distributions between 17 CLR and 18 DII colorectal cancer subtypes. After correcting for multiple testing, we considered biologically relevant only differences between the median distances of the two sample subtypes bigger than 8 µm, corresponding to the diameter of B and T lymphocytes30. With this approach, we found that Tregs were significantly closer to tumour cells in DII (median distance = 22.4 µm) compared to CLR (35.6 µm, Fig. 5c). On the contrary, B cells were more proximal to blood vessels in CLR (33.5 µm) than in DII (43.3 µm, Fig. 5d). We further supported these results with a permutation test, where we re-labelled randomly the identities to all cells in each sample for 10,000 times to derive an expected distribution of differences in distances between CLR and DII cells. The comparisons of observed values to the expected distributions, confirmed that Tregs were significantly closer to tumour cells in DII (Fig.5e) while B cells were significantly closer to blood vessels in CLR (Fig. 5f). Since the spatial randomness used as a baseline for the permutation test is an approximation of the highly organised structure of biological tissues, we sought further support this result through independent inspection of the spatial distributions of B cells in CLRs (Fig. 5g) and DII (Fig. 5h) in the histological images.
This result, not reported in the original study, showcases the discovery potential of the quantitative analysis of spatial relationships between cell populations implemented in SIMPLI. In addition, the SIMPLI graphical representations of the tissue composition as an overlay of cell boundaries colour-coded by cell populations greatly facilitate the visual inspection of their spatial interactions in their original tissue context.
Discussion
SIMPLI is an open-source, customisable and technology-independent tool for the analysis of multiplexed imaging data. It enables the processing of raw images, the extraction of cell data and the spatially resolved quantification of cell types or functional states as well as a cell-independent analysis of tissues at the pixel level, all within a single platform (Table 1). Moreover, it gives high flexibility to the user who can decide whether to skip processes implemented in SIMPLI and replace them with external tools to then re-start the pipeline at any point.
In comparison to currently available software, SIMPLI increases the portability, scalability and reproducibility of the analysis (Table 2). Moreover, it can easily accommodate specific analytical requirements across a wide range of tissues and imaging technologies at different levels of resolution and multiplexing through user-friendly configuration files. SIMPLI interoperates with multiple software and programming languages by leveraging workflow management and containerisation. This makes the inclusion of additional algorithms, features and imaging data formats easy to implement. For example, as possible future developments, SIMPLI may include alternative methods of cell segmentation, pixel and cell classification or a Graphical User Interface for interactive data visualisation. For this reason, we will maintain SIMPLI and its documentation up-to-date and will further expand it to leverage new tools as they become adopted by the community. Similarly, feedback from users will be collected through the dedicated GitHub repository.
Multiplexed imaging methods have proven to be a powerful approach for the study of tissues through the in-depth characterisation of cell phenotypes and interactions. SIMPLI, which was recently able to reveal differences in the composition of the microenvironment between colorectal cancers responsive and resistant to anti-PD1 immunotherapy31, represents an effort to make these analyses more accessible to a wider community. This will enable the exploitation of highly multiplexed imaging technologies for multiple applications, ranging from basic life science and pharmaceutical research to precision medical use in the clinics.
Methods
All patients enrolled in this study provided written informed consent in accordance with approved institutional guidelines (University College London Hospital, REC Reference: 20/YH/0088; Istituto Clinico Humanitas, REC Reference: ICH-25-09).
SIMPLI description and implementation
SIMPLI’s workflow is divided into three steps (raw image processing; cell-based analysis; pixel-based analysis), which are constituted of multiple standalone processes (Fig. 1 and Supplementary Fig. 1). Processes can be executed sequentially or independently from the command line or through a configuration file that can be edited with any text editor. This allows the user to skip some of them and use alternative input data for downstream analyses. In addition, parameters and options can be specified through the same configuration files without the need to set up tool-specific input files in any specific directory structure.
Raw data from IMC or MIBI experiments (.mcd or.txt files) are converted into single or multi-channel.tiff images with imctools32. Data from other multiplexed imaging platforms may be supplied directly as raw single or multi-channel tiff images (Supplementary Fig. 1a). Raw images can be thresholded individually to minimise the effect of non-uniform staining and then used directly for the cell- and pixel-based analyses. Alternatively, they can be first normalised across samples by rescaling pixel values of each channel up to the 99th percentile of the distribution using the EBImage33 package and custom R scripts. Normalised images can then be processed with CellProfiler411 to generate thresholded images and masks of tissue compartments or markers to be used in the following steps. In this step, the user can apply a range of filters, thresholds and morphological operations to each image, according to the experimental plan.
Pixel-based and cell-based analyses can be run as single workflows or in parallel within the same run. Both of them provide multiple outputs of the various processes, including tabular text files, visualisation plots and comparisons across samples (Supplementary Fig. 1).
The cell-based analysis is composed of cell data extraction, cell phenotyping and spatial analysis (Supplementary Fig. 1b). The extraction of cell data starts with single-cell segmentation using CellProfiler411 or StarDist23 with scikit-image34 used for feature extraction. In the latter case, default models or user-provided trained models can be used. Cell segmentation returns (1) single-cell data consisting of the marker expression values and the coordinates of each cell in the ROI and (2) the ROI segmentation mask marking all the pixels belonging to each cell with its unique identifier. Cells mapping to tissue compartments or positive for certain markers can then be identified based on their overlap with the tissue compartments or marker masks derived in the previous step. These cells are visualised in the ROI as outlines, while their proportions are quantified in barplots and boxplots.
All cells or only those in specific tissue compartments or positive for certain markers can be further phenotyped using two approaches. The first consists of unsupervised clustering based on the marker expression values using Seurat35. Cells are represented as nodes in a k-nearest neighbour graph based on their Euclidean distances in a principal component analysis space. This graph is then partitioned into clusters using the Louvain algorithm36 at user-defined levels of resolution leading to the unsupervised identification of cell phenotypes. Clusters of cell phenotypes are plotted as scatterplots in Uniform Manifold Approximation and Projection (UMAP)37 space. The second phenotyping approach is based on user-defined thresholds of marker expression values that can be combined using logical operators for the identification of designated cell phenotypes. The distributions of cells are represented as density plots based on the marker expression levels. In both phenotyping approaches, the expression profiles of the cell types are plotted as heatmaps, their proportions quantified in barplots and boxplots and their locations in the ROI visualised as cell outlines.
Once cell populations and phenotypes have been identified, the spatial analysis investigates the distance between cells of the same (homotypic aggregations) or different (heterotypic aggregations) types. The homotypic and heterotypic spatial analyses can be run in parallel or singularly on one or more sets of cells. In the homotypic analysis, clusters of cells of the same type within a user-defined distance are identified with DBSCAN38 as implemented in the fpc39 R package. These homotypic cell aggregations are visualised as position maps, reporting cell location and high-density clusters in the ROI. In the heterotypic analysis, the cell distances, defined as the Euclidean distances between cell centroids, are computed using custom R scripts and visualised as density plots. The resulting distribution of cell distances can be compared between group of samples using a two-sided Wilcoxon test with Benjamini–Hochberg FDR correction. Observed distances can also be compared to the distribution of expected distances obtained by reshuffling cell identities in each sample randomly for a user-defined number of times (default value = 10,000 reshufflings, Supplementary Fig. 1b). The statistical significance of this comparison is evaluated with a two-tailed permutation test adjusted for multiple hypothesis testing with the Benjamini–Hochberg correction.
The pixel-based analysis quantifies areas positive for a user-defined combination of markers using the EBImage33 package with custom R scripts (Supplementary Fig. 1c). These measurements are performed starting from the thresholded images produced in the raw image processing step (Supplementary Fig. 1a). The marker-positive areas obtained in this way are then normalised over the area of the whole image or specific tissue or marker compartments. The resulting normalised positive areas can then be quantified in barplots and boxplots.
SIMPLI is implemented as a Nextflow40 pipeline employing Singularity containers41 hosted on Singularity Hub42 to manage all the libraries and software tools. This allows SIMPLI to automatically manage all dependencies, irrespective of the running platform. Nextflow also manages automatic parallelisation of all processes while still allowing the selection of parts of the analysis to execute.
Sample description
Six FFPE blocks of normal (non-cancerous) colon mucosa (CLN1-CLN6), one of normal appendix (APP1) and one of rectal cancer (CRC1) were obtained from eight individuals who underwent surgery for the removal of colorectal cancers (Supplementary Data 1). All blocks were reviewed by an expert pathologist (M.R.-J.).
Staining and IMC ablation of human colon mucosa and appendix
Four-µm-thick sections were cut from each block of samples CLN1-CLN6 and APP1 with a microtome and used for staining with a panel of 26 antibodies targeting the main immune, stromal and epithelial cell populations of the gastrointestinal tract (Supplementary Data 2). The optimal dilution of each antibody in the panel was identified by staining and ablating FFPE appendix sections. The resulting images were reviewed by a mucosal immunologist (J.S.) and the dilution giving the best signal to background ratio was selected for each antibody (Supplementary Data 2). To perform the staining for IMC, slides were dewaxed after a 1-h incubation at 60 °C, rehydrated and heat-induced antigen retrieval was performed with a pressure cooker in Antigen Retrieval Reagent-Basic (R&D Systems). Slides were incubated in a 10% BSA (Sigma), 0.1% Tween (Sigma), and 2% Kiovig (Shire Pharmaceuticals) Superblock Blocking Buffer (Thermo Fisher) blocking solution at room temperature for 2 h. Each antibody was added to a primary antibody mix at the selected concentration in blocking solution and incubated overnight at 4 °C. After two washes in PBS and PBS-0.1% Tween, the slides were treated with the DNA intercalator Cell-ID™ Intercalator-Ir (Fluidigm) (containing the two iridium isotopes 191Ir and 193Ir) 1.25 mM in a PBS solution. After a 30-min incubation, the slides were washed once in PBS and once in MilliQ water and air-dried. The stained slides were then loaded in the Hyperion Imaging System (Fluidigm) imaging module to obtain light-contrast high-resolution images of approximately 4 mm2. These images were used to select the ROI in each slide. For CLN1-CLN6, 1 mm2 ROIs were selected to contain the full thickness of the colon mucosa, with epithelial crypts in a longitudinal orientation. For APP1, a 1-mm2 ROI containing a lymphoid follicle in its whole depth alongside a portion of lamina propria and of the epithelium was selected. ROIs were ablated at a 1 µm/pixel resolution and 200 Hz frequency.
IMC data analysis of human colon mucosa
Twenty-eight images from 26 antibodies (Supplementary Data 2) and two DNA intercalators were obtained from the raw.txt files of the ablated regions in CLN1-CLN6 using the data extraction process. Pixel intensities for each channel were normalised to the 99th percentile in all samples and Otsu thresholding was performed on the normalised images with a custom CellProfiler4 pipeline, which was employed also to generate the masks for the lamina propria (using the Vimentin channel including all <75-pixel large negative areas) and the epithelium (starting from the Pan-keratin and E-cadherin channels, dilatating the images with a three-pixel disk and the filling up of all <75-pixel large negative areas). These masks were then added into a sum image, which underwent dilatation with a three-pixel disk and filling up of all <25-pixel large negative areas. Positive features outside of the lamina and epithelium were removed with an opening operation using a 150-pixel radius and the lamina propria mask was subtracted from the sum image to generate the final mask for the epithelial compartment. These masks and the thresholded images were used as input for the pixel-based and cell-based analysis processes. The IgA masks employed for the pixel analysis were generated using a three-class global Otsu thresholding with two background classes after applying a Gaussian filter with a 1.5-pixel large radius to remove high-intensity artefacts of that size, which we noticed after manual inspection of the images.
To evaluate the effect of normalisation on the downstream analysis, sample-specific thresholds were manually selected for IgA, E-Cadherin, Pan-Keratin and Vimentin and applied to the raw images. The resulting thresholded images were used to generate lamina propria and epithelial masks for each sample individually.
Pixel-level analysis was performed on the IgA masks derived from either the normalised or the raw images and IgA+ areas in the tissue, lamina propria and epithelium were measured and normalised over the areas of the three compartments.
Cell-level analysis started with CellProfiler4 segmentation first on DNA1 with global Otsu thresholding to identify the cell nuclei. Then, cells were identified by radially expanding each nucleus for up to 10 pixels over a membrane mask derived from the IgA, CD3, CD68, CD11c and E-cadherin channels. After inspection by an expert histologist (J.S.), only cells overlapping with the lamina propria mask by at least 30% were retained.
Cell identities were assigned according to the highest overlap of the cell area with marker-specific thresholds defined by an expert histologist (J.S.): ≥15% of the IgA mask for IgA cells; ≥15% of the CD3 mask for T cells; ≥25% of the CD68 mask for macrophages; ≥15% of CD11c mask for dendritic cells.
IMC data analysis of human appendix
Images from the same 26 antibodies and two DNA intercalators used in the colon mucosa (Supplementary Data 2) were obtained from the raw.txt files of the ablated region in APP1, normalised to the 99th percentile and thresholded with CellProfiler4 as described above. For the cell-based analysis, nuclei were identified using the DNA1 channel and cells were isolated through watershed segmentation with the nuclei as seeds on a membrane mask summing up CD45, Pan-keratin and E-cadherin thresholded images.
Cells were assigned to the epithelium or to immune cell populations if they overlapped for ≥10% with the following masks: CD3 mask for T cells; CD20 and CD27 masks for B cells; CD68 mask for macrophages; CD11c mask for dendritic cells; E-cadherin+ and Pan-keratin+ masks for epithelial cells.
T cells were further phenotyped using unsupervised clustering at resolutions between 0.1 and 1.0, with 0.05 intervals and based on the cell marker intensity for CD3, CD45RA, CD45RO, CD4, CD8, Ki67 and PD1. The resulting clusters were manually inspected and the clustering with the highest number of biologically meaningful clusters (resolution = 0.25) was chosen. Clusters were re-identified using mean intensity thresholds defined by an expert histologist (J.S.) for the following markers: CD3 >0.06 for cluster 1; CD8a >0.125 for cluster 2; CD45RA >0.125 for cluster 3; CD4 >0.125 and CD45RO >0.15 for cluster 4; and CD4 >0.1 and PD1 >0.15 for cluster 5.
Homotypic aggregations of PD1+CD4+ T cells (cluster 5, resolution = 0.25) were computed using a minimum of five points per cluster and a reachability parameter corresponding to a density of at least 5 cells/mm2.
CD3 staining and mIF of human rectal cancer
Two 4-µm-thick serial sections were cut from CRC1 FFPE block using a microtome. The first slide was dewaxed and rehydrated before carrying out HIER with Antigen Retrieval Reagent-Basic (R&D Systems). The tissue was then blocked and incubated with the anti-CD3 antibody (Dako, Supplementary Data 2) followed by horseradish peroxidase (HRP) conjugated anti-rabbit antibody (Dako) and stained with 3,3’ diaminobenzidine substrate (Abcam) and haematoxylin. Areas with CD3+ infiltration in the proximity of the tumour invasive margin were identified by a clinical pathologist (M.R.-J.)
The second slide was stained with a panel of six antibodies (CD8, PD1, Ki67, PDL1, CD68, GzB, Supplementary Data 2), Opal fluorophores and DAPI on a Ventana Discovery Ultra automated staining platform (Roche). Expected expression and cellular localisation of each marker as well as fluorophore brightness were used to minimise fluorescence spillage upon antibody-Opal pairing. Following a 1-h incubation at 60 °C, the slide was subjected to an automated staining protocol on an autostainer. The protocol involved deparaffinisation (EZ-Prep solution, Roche), HIER (DISC. CC1 solution, Roche) and seven sequential rounds of 1-h incubation with the primary antibody, 12 min incubation with the HRP-conjugated secondary antibody (DISC. Omnimap anti-Ms HRP RUO or DISC. Omnimap anti-Rb HRP RUO, Roche) and 16-min incubation with the Opal reactive fluorophore (Akoya Biosciences). For the last round of staining, the slide was incubated with Opal TSA-DIG reagent (Akoya Biosciences) for 12 min followed by Opal 780 reactive fluorophore for 1 h (Akoya Biosciences). A denaturation step (100 °C for 8 min) was introduced between each staining round in order to remove the primary and secondary antibodies from the previous cycle without disrupting the fluorescent signal. The slide was counterstained with DAPI (Akoya Biosciences) and coverslipped using ProLong Gold antifade mounting media (Thermo Fisher Scientific). The Vectra Polaris automated quantitative pathology imaging system (Akoya Biosciences) was used to scan the labelled slide. Six fields of view, within the area selected by the pathologist, were scanned at ×20 and ×40 magnification using appropriate exposure times and loaded into inForm28 for spectral unmixing and autofluorescence isolation using the spectral libraries.
mIF data analysis
After spectral unmixing and merging of six ×20 fields of view for a total of >5 mm2 ROI (Table 2), one single-tiff image was extracted for each marker and its intensity was rescaled from 0 to 1 with custom R scripts. The resulting single-tiff images were pre-processed to remove the background noise with Otsu thresholding in CellProfiler4 and used for cell segmentation by applying a global threshold to the DAPI channel and selecting all objects with a diameter between four and 60 pixels. PD1+CD8+ cells, CD68+ cells and PDL1+ cells were then identified using mean intensity thresholds of 0.01 for CD8, 0.005 for PD1, 0.01 for CD68 and 0.01 for PDL1. All thresholds were inspected by an expert histologist (J.S.). To crosscheck these results, images were analysed with the Inform28 package. After spectral unmixing, images were segmented with the Adaptive Cell Segmentation option applied to the DAPI channel for nuclei identification (“relative intensity” = 0.1, “splitting sensitivity” = 0.1, “minimum size” = 5). Then PD1+CD8+ cells and PDL1+ cells were identified.
The distributions of minimum distances between PDL1+ cells and PD1+CD8+ cells were calculated from the coordinates of the centroids of each cell in the image. All PDL1+ cells and PD1+CD8+ cells at a distance from each other lower than double the maximum cell radius (24 pixels = 12 µm) were considered as proximal. All other cells were classified as distal.
CODEX data analysis
A published dataset of colorectal CODEX images9 was downloaded from The Cancer Imaging Archive (10.7937/tcia.2020.fqn0-0326). It consisted of processed CODEX data from 35 colorectal cancer samples divided in two groups (CLR and DII) according to the peritumoral inflammatory levels and the presence of tertiary lymphoid structures9. For each sample, four.tiff images were available representing four 0.6-mm spots from two 70-core tissue microarrays. These images were hyperstacks of 58 channels including 56 antibodies (Supplementary Data 2) and two DNA markers with a resolution of 377 nm/pixel. After the manual review of all 140 spots, one representative image per sample was selected, having the best focus and containing both tumour and peritumoural immune infiltrates.
The single-channel tiff files for each selected image were extracted and the pixel intensities were rescaled from 0 to 1 with a custom R script. Using SIMPLI, single-cell segmentation was performed in each of the 35 images by applying a global threshold to the HOECHST channel to identify the nuclei and retain all objects with a diameter between 5 and 40 pixels. Each nucleus was then expanded by 5 pixels in all directions to define the cell area.
Resulting single cells were assigned to ten phenotypes according to the mean cell expression of CDX2 >0.15 or MUC1 >0.15 or cytokeratin >0.15 for tumour cells; CD34 >0.15 or CD31 >0.15 for endothelial cells; vimentin >0.1 for other stromal cells; CD11c >0.3 for dendritic cells; CD38 >0.26 for B cells; CD4 >0.13 and CD3 >0.1 for CD4+ T cells; CD4 >0.12 and FOXP3 >0.5 and CD3 >0.1 for Tregs; CD8 >0.16 and CD3 >0.1 for CD8+ T cells, and CD68 >0.11 for macrophages. The heterotypic spatial analysis was performed by calculating the minimum distances of macrophages, CD8+ T cells, CD4+ T cells, Treg cells, and B cells to tumour cells and endothelial cells using the coordinates of the cell centroids. Only comparisons where the difference of the median cell–cell distances between the two histological subtypes was greater than 8 µm, corresponding to the diameter of B and T cells30, were retained, no samples or cells were excluded from the analysis. As further support, a permutation test for each of the retained comparisons was run by re-assigning cell identities randomly in each sample 10,000 times. The resulting expected random distributions were compared to the observed values using a two-tailed permutation test and corrected for multiple testing.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Supplementary information
Acknowledgements
We thank Sharavan Vishaan Venkateswaran for testing SIMPLI. This work was supported by Cancer Research UK (C43634/A25487, F.D.C.), the Cancer Research UK King’s Health Partners Centre at King’s College London (C604/A25135, F.D.C.), the Cancer Research UK City of London Centre (C7893/A26233, F.D.C.), innovation programme under the Marie Skłodowska-Curie (grant agreement No CONTRA-766030, F.D.C.) and the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001002, F.D.C.), the UK Medical Research Council (FC001002, F.D.C.), and the Wellcome Trust (FC001002, F.D.C.) and Crohn’s and Colitis UK (M2019/3, J.S.). For the purpose of Open Access, the authors have applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
Source data
Author contributions
F.D.C. conceived and directed the study with support of J.S. M.B. developed the software with the help of D.T. and M.R.K. L.M. and A.A.-S performed the experiments. M.B, L.M, A.A.-S., M.J.P., J.S. and F.D.C. analysed the data. M.R.-J, G.B. and L.L. identified the samples and provided clinical assessments. M.R.-J. performed pathological assessments. M.B. and F.D.C. wrote the manuscript with contribution from A.A.-S., L.M. and M.J.P. All authors approved the manuscript.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Data availability
The imaging mass cytometry data of human colon mucosa generated in this study have been deposited in the Zenodo database under accession code “5545882”43. The imaging mass cytometry data of the human appendix generated in this study have been deposited in the Zenodo database under accession code “5545760”44. The multiplex immunofluorescence data of human colorectal cancer generated in this study have been deposited in the Zenodo database under accession code “5545864”45. All other relevant data supporting the key findings of this study are available within the article and its Supplementary Information files or from the corresponding author upon reasonable request. Source Data are provided with this paper.
Code availability
SIMPLI’s code, documentation and an example dataset are available at “SIMPLI [https://github.com/ciccalab/SIMPLI]”46. The software code is protected by copyright. No permission is required from the rights-holder for non-commercial research uses. Commercial use will require a license from the rights-holder. For further information contact translation@crick.ac.uk who will reply within 5 business days.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-022-28470-x.
References
- 1.Parra ER, Francisco-Cruz A, Wistuba II State-of-the-art of profiling immune contexture in the era of multiplexed staining and digital analysis to study paraffin tumor tissues. Cancers. 2019;11:247. doi: 10.3390/cancers11020247. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Giesen C, et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Methods. 2014;11:417–422. doi: 10.1038/nmeth.2869. [DOI] [PubMed] [Google Scholar]
- 3.Angelo M, et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 2014;20:436–442. doi: 10.1038/nm.3488. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Goltsev Y, et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell. 2018;174:968–981.e915. doi: 10.1016/j.cell.2018.07.010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Lin JR, Fallahi‐Sichani M, Chen JY, Sorger PK. Cyclic immunofluorescence (CycIF), a highly multiplexed method for single‐cell imaging. Curr. Protoc. Chem. Biol. 2016;8:251–264. doi: 10.1002/cpch.14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Bauman, T. M. et al Quantitation of protein expression and co-localization using multiplexed immuno-histochemical staining and multispectral imaging. J. Vis. Exp. e53837 (2016). [DOI] [PMC free article] [PubMed]
- 7.Morrison LE, et al. Brightfield multiplex immunohistochemistry with multispectral imaging. Lab. Investig. 2020;100:1124–1136. doi: 10.1038/s41374-020-0429-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Jackson HW, et al. The single-cell pathology landscape of breast cancer. Nature. 2020;578:615–620. doi: 10.1038/s41586-019-1876-x. [DOI] [PubMed] [Google Scholar]
- 9.Schürch CM, et al. Coordinated cellular neighborhoods orchestrate antitumoral immunity at the colorectal cancer invasive front. Cell. 2020;182:1341–1359.e1319. doi: 10.1016/j.cell.2020.07.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Berg S, et al. ilastik: interactive machine learning for (bio)image analysis. Nat. Methods. 2019;16:1226–1232. doi: 10.1038/s41592-019-0582-9. [DOI] [PubMed] [Google Scholar]
- 11.McQuin C, et al. CellProfiler 3.0: next-generation image processing for biology. PLoS Biol. 2018;16:e2005970. doi: 10.1371/journal.pbio.2005970. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.van Maldegem F, et al. Characterisation of tumour microenvironment remodelling following oncogene inhibition in preclinical studies with imaging mass cytometry. Nat. Commun. 2021;12:5906. doi: 10.1038/s41467-021-26214-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Zanotelli, V. R. T. & Bodenmiller, B. ImcSegmentationPipeline: A pixelclassification based multiplexed image segmentation pipeline. 10.5281/zenodo.3841961 (2017).
- 14.Jones TR, et al. CellProfiler Analyst: data exploration and analysis software for complex image-based screens. BMC Bioinforma. 2008;9:482. doi: 10.1186/1471-2105-9-482. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Eling N, Damond N, Hoch T, Bodenmiller B. cytomapper: an R/Bioconductor package for visualization of highly multiplexed imaging data. Bioinformatics. 2020;36:5706–5708. doi: 10.1093/bioinformatics/btaa1061. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Opzoomer JW, et al. ImmunoCluster provides a computational framework for the nonspecialist to profile high-dimensional cytometry data. eLife. 2021;10:e62915. doi: 10.7554/eLife.62915. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Stoltzfus CR, et al. CytoMAP: a spatial analysis toolbox reveals features of myeloid cell organization in lymphoid tissues. Cell Rep. 2020;31:107523. doi: 10.1016/j.celrep.2020.107523. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Somarakis A, Unen VV, Koning F, Lelieveldt B, Höllt T. ImaCytE: visual exploration of cellular micro-environments for imaging mass cytometry data. IEEE Trans. Vis. Computer Graph. 2021;27:98–110. doi: 10.1109/TVCG.2019.2931299. [DOI] [PubMed] [Google Scholar]
- 19.Yang, T. et al. SPIAT: an R package for the spatial image analysis of cells in tissues. Preprint at bioRxiv 2020.2005.2028.122614 (2020).
- 20.NeighbouRhood. https://github.com/BodenmillerGroup/neighbouRhood (2019).
- 21.Catena R, Montuenga LM, Bodenmiller B. Ruthenium counterstaining for imaging mass cytometry. J. Pathol. 2018;244:479–484. doi: 10.1002/path.5049. [DOI] [PubMed] [Google Scholar]
- 22.Bankhead P, et al. QuPath: open source software for digital pathology image analysis. Sci. Rep. 2017;7:16878. doi: 10.1038/s41598-017-17204-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Schmidt, U., Weigert, M., Broaddus, C., Myers, G. Cell Detection with Star-Convex Polygons. (Springer International Publishing, 2018).
- 24.Pabst O, Slack E. IgA and the intestinal microbiota: the importance of being specific. Mucosal Immunol. 2020;13:12–21. doi: 10.1038/s41385-019-0227-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Dorn I, Schlenke P, Mascher B, Stange EF, Seyfarth M. Lamina propria plasma cells in inflammatory bowel disease: intracellular detection of immunoglobulins using flow cytometry. Immunobiology. 2002;206:546–557. doi: 10.1078/0171-2985-00203. [DOI] [PubMed] [Google Scholar]
- 26.Schapiro D, et al. histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat. Methods. 2017;14:873–876. doi: 10.1038/nmeth.4391. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Song W, Craft J. T follicular helper cell heterogeneity: time, space, and function. Immunological Rev. 2019;288:85–96. doi: 10.1111/imr.12740. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Kramer AS, et al. InForm software: a semi-automated research tool to identify presumptive human hepatic progenitor cells, and other histological features of pathological significance. Sci. Rep. 2018;8:3418. doi: 10.1038/s41598-018-21757-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Zhang L, et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature. 2018;564:268–272. doi: 10.1038/s41586-018-0694-x. [DOI] [PubMed] [Google Scholar]
- 30.Strokotov D, et al. Is there a difference between T- and B-lymphocyte morphology? J. Biomed. Opt. 2009;14:064036. doi: 10.1117/1.3275471. [DOI] [PubMed] [Google Scholar]
- 31.Bortolomeazzi M, et al. Immunogenomics of colorectal cancer response to checkpoint blockade: analysis of the KEYNOTE 177 trial and validation cohorts. Gastroenterology. 2021;161:1179–1193. doi: 10.1053/j.gastro.2021.06.064. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.imctools. https://github.com/BodenmillerGroup/imctools (2017).
- 33.Pau G, Fuchs F, Sklyar O, Boutros M, Huber W. EBImage—an R package for image processing with applications to cellular phenotypes. Bioinformatics. 2010;26:979–981. doi: 10.1093/bioinformatics/btq046. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Van der Walt S, et al. scikit-image: image processing in Python. PeerJ. 2014;2:e453. doi: 10.7717/peerj.453. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 2018;36:411–420. doi: 10.1038/nbt.4096. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J. Stat. Mech.: Theory Exp. 2008;2008:P10008. doi: 10.1088/1742-5468/2008/10/P10008. [DOI] [Google Scholar]
- 37.Leland M, John H, Nathaniel S, Lukas G. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 2018;3:861. doi: 10.21105/joss.00861. [DOI] [Google Scholar]
- 38.Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD-96 Proceedings. (1996).
- 39.Henning, C. fpc. https://cran.r-project.org/package=fpc (2020).
- 40.Di Tommaso P, et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 2017;35:316–319. doi: 10.1038/nbt.3820. [DOI] [PubMed] [Google Scholar]
- 41.Kurtzer GM, Sochat V, Bauer MW. Singularity: scientific containers for mobility of compute. PLoS One. 2017;12:e0177459. doi: 10.1371/journal.pone.0177459. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Sochat VV, Prybol CJ, Kurtzer GM. Enhancing reproducibility in scientific computing: Metrics and registry for Singularity containers. PLoS One. 2017;12:e0188511. doi: 10.1371/journal.pone.0188511. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Bortolomeazzi, M. et al. Imaging Mass Cytometry of human normal colon mucosa (CLN1-6) from: A SIMPLI (Single-cell Identification from MultiPLexed Images) approach for spatially resolved tissue phenotyping at single-cell resolution. 10.5281/zenodo.5545882 (2021). [DOI] [PMC free article] [PubMed]
- 44.Bortolomeazzi, M. et al. Imaging Mass Cytometry Images (APP1) from: A SIMPLI (Single-cell Identification from MultiPLexed Images) approach for spatially resolved tissue phenotyping at single-cell resolution. 10.5281/zenodo.5545760 (2021). [DOI] [PMC free article] [PubMed]
- 45.Bortolomeazzi, M. et al. Vectra Polatis image of human colorectal cancer (CRC1) from: A SIMPLI (Single-cell Identification from MultiPLexed Images) approach for spatially resolved tissue phenotyping at single-cell resolution. 10.5281/zenodo.5545864 (2021). [DOI] [PMC free article] [PubMed]
- 46.Bortolomeazzi, M. et al. A SIMPLI (Single-cell Identification from MultiPLexed Images) approach for spatially-resolved tissue phenotyping at single-cell resolution. ciccalab/SIMPLI10.5281/zenodo.5807230 (2021). [DOI] [PMC free article] [PubMed]
- 47.Dries R, et al. Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biol. 2021;22:78. doi: 10.1186/s13059-021-02286-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The imaging mass cytometry data of human colon mucosa generated in this study have been deposited in the Zenodo database under accession code “5545882”43. The imaging mass cytometry data of the human appendix generated in this study have been deposited in the Zenodo database under accession code “5545760”44. The multiplex immunofluorescence data of human colorectal cancer generated in this study have been deposited in the Zenodo database under accession code “5545864”45. All other relevant data supporting the key findings of this study are available within the article and its Supplementary Information files or from the corresponding author upon reasonable request. Source Data are provided with this paper.
SIMPLI’s code, documentation and an example dataset are available at “SIMPLI [https://github.com/ciccalab/SIMPLI]”46. The software code is protected by copyright. No permission is required from the rights-holder for non-commercial research uses. Commercial use will require a license from the rights-holder. For further information contact translation@crick.ac.uk who will reply within 5 business days.