Research advances depend on standards
Advances in research methods that create large datasets have motivated innovations in experimental measurement standards to promote comparison between datasets and among lab groups. Such increases in data sharing can accelerate progress in complex experimental systems by providing access to more data than is available from any one lab alone. This is particularly valuable in revealing variations in human translational studies where differences among study subjects may be subtle but are nonetheless critical for understanding divergent health outcomes. The focus on complex interactions in precision medicine, such as immune-mediated pathogenesis and in-depth profiling studies, has led investigators to produce many such datasets and created the need for meaningful data sharing. The large datasets generated present opportunities for tremendous advances through additional data mining and re-use of data by new users. Consideration of this need has launched an international call to action for common standards in experimental data collection, health data standards in clinical settings, and efforts in data annotation and reporting nomenclature (1).
The use of repositories for data sharing has been broadly adopted over the last two decades for ‘-omics’ research, e.g., DNA sequencing (dbGaP), transcriptional microarray data (GEO, ArrayExpress), microbiome (hmpdacc), and metabolomics (Metabolomics Data Repository). More recently this has included immunologic data in NIH-supported ImmPORT (www.immuneprofiling.org/hipc/page/show), and the Human Immunology Project Consortium (HIPC) where one of the goals is the integration of data across centers (2). Effective experimental standards and normalization are critical to the success of this effort and include elements such as sample handling, reagents, instrument setup and data analysis for flow cytometry (2) or commonly agreed-upon and tested controls and protocols to verify technical performance and data quality for transcriptional studies (3). Further, the NIH initiative for rigor and reproducibility, and the required authentication of reagents as part of every grant proposal, place these concerns foremost for investigators before the first experiment begins (4).
Technological advances in single cell phenotyping: CyTOF
Flow cytometry has been a powerful tool for analysis of the immune system on a cellular level. Mass cytometry or CyTOF (Cytometry by Time-Of-Flight) is a new technology for multiparameter (>40) single cell analysis which substitutes heavy metal isotope conjugates for fluorophores as reporters and uses mass spectrometry as a readout (5). Thus with CyTOF, many more markers can be combined, providing unprecedented multidimensional immune cell profiling. An important advantage of the CyTOF technology is that the metal conjugates have little or no background or overlap between channels, in contrast to the overlapping emission spectra of the fluorescent reporters in flow cytometry, thus removing compensation issues that can be challenging and may reduce sensitivity. Cell preparation protocols for CyTOF are technically similar to those for flow cytometry, but experiments combining 40 antibody specificities can be routine (5).
CyTOF provides in-depth measurement of protein abundances that define cell activity and function in a lineage-specific manner and has already been applied to characterizing blood cell ontogeny and variation, signatures that correlate with parameters of surgical recovery, autoimmune or infectious disease, responding phenotypes in cancer, the functional significance of the extreme diversity of the Natural Killer cell repertoire in viral infection, and predictive markers of preterm births (5–8). CyTOF even holds the promise of examination of solid tissue, which opens new avenues of investigation in tumor biology and tissue pathology (5). Another valuable feature of CyTOF is its excellent sensitivity for studies from a small number of cells, as sample quantities are frequently limiting, particularly in human studies. In our facility, we have detected the main immune subsets using as few as 10^4 peripheral blood mononuclear cells (and, for broad classifications, 10^3) and have successfully phenotyped immune cell types from skin biopsies, providing excellent support for broad use of CyTOF in a variety of translational studies (9). CyTOF allows us to examine functional responses of multiple cell types and subsets simultaneously and provides high quality dynamic characterization of signaling to define mechanisms of cellular regulation.
Essential standards to reduce experimental variation
Biological variation is a significant challenge to understanding underlying mechanisms in human studies and in model systems. To identify meaningful differences that contribute to diverse clinical responses, experimental variation must be reduced to an absolute minimum. Thus thorough and detailed consideration is required for all elements of study design. These include, for example, sample collection methods, choice of control groups, circadian rhythms, reagent supply, bridging standards, assay variation, and normalizing control samples. This is particularly critical for studies conducted across multiple laboratories or for which samples will accrue over long time periods.
Use of standards is an important issue overall, and particularly for studies using CyTOF, as the increased number of antibodies used to label a sample magnifies potential variation across dimensions (10). Several procedures are in use to address variation in CyTOF studies and relevant considerations for experimental design and pipeline are available (8). One example is barcoding of samples, which has the potential to minimize sample variability as well as to reduce experimental expenses (5,11). Of note, barcoding procedures involve detergent permeabilization and fixation. Thus for labeling of cellular epitopes which require fresh cells, barcodes would generally be added after surface labeling of individual samples, partially reducing their effective advantages. Commercial sources of lyophilized human blood cells (Beckman Coulter, Brea, CA and BioLegend, San Diego, CA) can be used in some cases to identify cell subsets, although as they are fixed they do not provide a staining control for fresh cells. Another effective approach to address variation in CyTOF studies is inclusion of a 4-bead calibrator (EQ™ Four Element Calibration Beads) with each sample to track instrument variation in signal strength and performance (12). These are polystyrene beads embedded with metal isotopes of cerium (140/142Ce), europium (151/153Eu), holmium (165Ho), and lutetium (175/176Lu) (Fluidigm/DVS Science, Sunnyvale, CA). Signal from the beads is assessed using a normalization algorithm and provides a valuable measure to address instrument variation (13) independent of the cells under investigation. However, polystyrene beads are subject to less variation in the ionization process than complex biological material (e.g., cell suspensions, dispersed tissue samples), and thus do not fully reflect biological variations.
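As a rough illustration of how bead signal can be used to correct instrument drift, the following Python sketch rescales cell intensities in one channel by the ratio of a baseline bead median to the bead median observed in each acquisition-time window. It is a simplified stand-in and not the published normalization algorithm (12); the function names, the fixed number of time windows, and the single-channel treatment are all assumptions made for illustration.

```python
import numpy as np

def bead_scale_factors(bead_intensity, bead_time, baseline, n_windows=4):
    """Return window edges and per-window factors (baseline / bead median).

    Hypothetical helper: splits the acquisition into equal time windows
    and compares the median bead signal in each window to a baseline.
    """
    bead_intensity = np.asarray(bead_intensity, dtype=float)
    bead_time = np.asarray(bead_time, dtype=float)
    edges = np.linspace(bead_time.min(), bead_time.max(), n_windows + 1)
    factors = np.ones(n_windows)
    for i in range(n_windows):
        lo, hi = edges[i], edges[i + 1]
        if i == n_windows - 1:
            in_win = (bead_time >= lo) & (bead_time <= hi)  # last window is closed
        else:
            in_win = (bead_time >= lo) & (bead_time < hi)
        if in_win.any():  # windows without beads keep a factor of 1
            factors[i] = baseline / np.median(bead_intensity[in_win])
    return edges, factors

def normalize_cells(cell_intensity, cell_time, edges, factors):
    """Multiply each cell event by the factor of the window it falls in."""
    cell_intensity = np.asarray(cell_intensity, dtype=float)
    idx = np.searchsorted(edges, cell_time, side="right") - 1
    idx = np.clip(idx, 0, len(factors) - 1)
    return cell_intensity * factors[idx]
```

In this sketch a bead signal that drops to half of the baseline late in the run yields a factor of 2 for that window, so cell events acquired in the same window are scaled up accordingly. Because beads ionize more consistently than cells, such a correction addresses instrument drift only, which is the limitation noted above.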
In the present issue of Cytometry (page XXX), Kleinsteuber et al highlight the need for a reference control in CyTOF studies comparing samples over time in a longitudinal study. They present a method for spiking experimental samples with cells from a batch drawn from a reference donor. By collecting a large volume of blood from a single healthy donor, they created up to 50 aliquots of a reference sample. These cells were labeled for CD45 using a distinct metal before being added to the experimental sample (4 × 10^5 reference cells per 2 × 10^6 patient cells) and labeled with the full 25-antibody panel of the experiment (Figure 2a of the paper in this issue at page XXX). A minimum of 100,000 events/sample were collected to ensure detection of the reference cells. Signal from the reference cells is employed during analysis for quality control of antibody staining and for variations in gating between samples. As demonstrated here, frozen cells from a buffy coat were thawed and labeled with each study sample to provide a stable comparison. These are particularly useful for lineage markers and cell subsets of similar composition, and were used in the study to show both reproducibility of markers and variation between donors (Figure 3 of the paper in this issue at page XXX). The authors employ the reference cells to distinguish activation and inhibitory markers on CD8+ T cells from HIV-infected patients. Two markers were excluded from the analysis: the proliferation marker Ki-67, for absence of staining in the reference cells, and the early activation marker CD69, for variable elevated measurements that clouded identification of negative populations in the gating analysis. Gating was conducted manually and through use of ‘tethered gates’, which use pre-set percentiles to automatically adjust gates. The reproducibility further allowed identification of staining “errors” where 10-fold lower concentrations of two antibodies were employed as a quality control strategy.
For the three experiments with two HIV+ patients, they detected highly reproducible frequencies for the 18 antibodies in the CD8+ T cell panel. Further, CyTOF studies highlighted differences between the two patients tested and the healthy donor, and detected higher production of TNFα in cells from bronchoalveolar lavage (BAL) than from peripheral blood cells. Detecting such distinctions will be central to many translational investigations seeking differences between cell phenotypes and function among stratified patient populations and/or healthy subjects.
Critical elements are noted by Kleinsteuber et al (page XXX) for consideration prior to adopting this reference sample strategy to compensate for variation within experiments. A reference sample can be perpetuated beyond the supply of a single buffy coat (~50 reference tubes) by pooling multiple samples for a larger supply and/or by use of bridging reference samples when the supply is exhausted. Reference samples of frozen cells may show distinct functional profiles compared to fresh cells, thus the reference is most effectively applied to identifying cell populations rather than functions. The authors expect the protocol will be easily adaptable for different cell types, cell lines, or animal cells as appropriate for the project. Even while being most relevant within a study, this approach provides an excellent reference point. All the data are publicly available as noted in the manuscript and are a resource for other users.
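The spike-in ratio and the tethered-gate idea described above for the reference sample strategy can be made concrete with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, equal recovery of reference and patient cells is assumed, and placing a positivity cutoff at a pre-set percentile of the reference cells is only one plausible reading of how a gate could be tethered to the reference population.

```python
import numpy as np

def expected_reference_events(n_ref, n_patient, n_acquired):
    """Fraction and expected count of reference cells among acquired events.

    Assumes reference and patient cells are recovered equally well.
    """
    frac = n_ref / (n_ref + n_patient)
    return frac, frac * n_acquired

def tethered_threshold(marker, is_reference, percentile=99.0):
    """Cutoff placed at a pre-set percentile of the reference cells' signal."""
    marker = np.asarray(marker, dtype=float)
    is_reference = np.asarray(is_reference, dtype=bool)
    return np.percentile(marker[is_reference], percentile)

def fraction_positive(marker, is_reference, percentile=99.0):
    """Fraction of non-reference (patient) events above the tethered cutoff."""
    marker = np.asarray(marker, dtype=float)
    is_reference = np.asarray(is_reference, dtype=bool)
    cut = tethered_threshold(marker, is_reference, percentile)
    return float(np.mean(marker[~is_reference] > cut))

# Spike-in described in the text: 4 x 10^5 reference cells per 2 x 10^6 patient cells
frac, n_ref_events = expected_reference_events(4e5, 2e6, 100_000)
```

With these numbers the reference cells make up one sixth of the mixed sample, so under the equal-recovery assumption roughly 16,700 of 100,000 collected events would be reference cells. Because the cutoff is recomputed from the reference cells in each tube, it moves with staining intensity from sample to sample rather than staying fixed.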
Essential standards in use to reduce computational variation
Flow cytometry established standards for reporting experimental conditions and sharing results (14) that have been preserved for CyTOF as well. As in flow cytometry, gating of cell subsets follows exclusion of debris (Iridium− DNA−), cell doublets (Iridium^hi DNA^hi) and dead cells (cisplatin+). A normalization algorithm using EQ beads is implemented to address instrument variation. Also as in flow cytometry, the subjectivity of manual gating introduces variability into the data and significantly impacts the reproducibility, robustness and comparability of results, particularly in multi-project or multi-center studies. For the multidimensional data produced by CyTOF, conventional manual gating is inadequate and more advanced analysis platforms with suitable software infrastructure are required. Automated gating algorithms for structuring the analytic approach hold the promise of reducing variation between gating by individual investigators, as even highly trained investigators show considerable variation (15). Methods such as semi-automated tethered gating as used by Kleinsteuber et al (page XXX) distinguish cells between groups or after stimulation.
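The exclusion steps listed at the start of this section (debris, doublets, dead cells) amount to combining a few boolean filters before any subset gating. The Python sketch below shows that logic under stated assumptions: the function name and all threshold values are hypothetical, a single DNA-intercalator channel is used, and in practice the cutoffs would be chosen per experiment.

```python
import numpy as np

def clean_event_mask(dna, cisplatin, dna_low, dna_high, cisplatin_max):
    """Boolean mask of events kept for downstream gating.

    dna           : DNA-intercalator (iridium) signal per event
    cisplatin     : viability-stain signal per event
    dna_low       : below this, an event is treated as debris
    dna_high      : above this, an event is treated as a doublet
    cisplatin_max : above this, an event is treated as a dead cell
    """
    dna = np.asarray(dna, dtype=float)
    cisplatin = np.asarray(cisplatin, dtype=float)
    not_debris = dna >= dna_low
    not_doublet = dna <= dna_high
    live = cisplatin <= cisplatin_max
    return not_debris & not_doublet & live

# Toy events: debris, singlet live cell, doublet, singlet dead cell
dna = np.array([5.0, 200.0, 900.0, 220.0])
cis = np.array([1.0, 2.0, 3.0, 150.0])
keep = clean_event_mask(dna, cis, dna_low=50.0, dna_high=500.0, cisplatin_max=20.0)
```

Recording the thresholds alongside the data is one simple way to make this step reproducible across investigators, since it removes one of the subjective choices in manual gating.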
Using CyTOF as a standardized high-throughput technology requires specialized algorithms. Many novel platforms are rapidly appearing such as SPADE, viSNE, Citrus, PhenoGraph, Scaffold, and DREMI/DREVI. These programs have been designed to visualize high-dimensional single-cell data, to identify cell populations that would be missed by conventional gating, to visually and quantitatively gauge the phenotypic diversity between cell types and donors, and to quantify signaling interactions and networks (5). Multidimensional data generated by CyTOF can be assessed using the Cytobank platform or through computational methods available in the R or MATLAB programming environments. For computational standardization, the R code may be available for re-analysis of data and establishing central gating to reduce variation (e.g., http://www.bioconductor.org/packages/release/bioc/html/flowClust.html), and it will be valuable to combine parallel use of manual gating and unsupervised clustering (10).
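One simple way to use manual gating and unsupervised clustering in parallel is to cross-tabulate the two label sets for the same events and check how cleanly each cluster maps onto a gated population. The Python sketch below assumes that per-event gate labels and cluster labels are already available from whichever tools were used; the function names and the purity measure are illustrative choices, not part of any of the platforms named above.

```python
import numpy as np

def cluster_gate_table(gate_labels, cluster_labels):
    """Count events for every (cluster, manual gate) combination."""
    gates, g_idx = np.unique(np.asarray(gate_labels), return_inverse=True)
    clusters, c_idx = np.unique(np.asarray(cluster_labels), return_inverse=True)
    table = np.zeros((len(clusters), len(gates)), dtype=int)
    np.add.at(table, (c_idx, g_idx), 1)  # accumulate counts, including repeats
    return clusters, gates, table

def cluster_purity(table):
    """Share of each cluster taken up by its most frequent manual gate."""
    return table.max(axis=1) / table.sum(axis=1)

# Toy example: six events with manual gate names and cluster ids
gate_labels = ["T", "T", "B", "B", "NK", "T"]
cluster_labels = [0, 0, 1, 1, 1, 2]
clusters, gates, table = cluster_gate_table(gate_labels, cluster_labels)
purity = cluster_purity(table)
```

Clusters with low purity flag places where the two approaches disagree, for example a cluster that mixes two gated populations, and these are the cases worth inspecting by eye.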
Blending High Dimensional Specialties
Progress in understanding pathologic processes depends on the multifactorial elements of translational studies, including clinical, demographic, and medication data as well as sample collection and processing, which must be integrated with computational data processing, instrument variation, and normalization protocols. Advancing our knowledge will require all of this, as well as new algorithms, reassessment of data trends in validation cohorts, iterative communication, and high standards across the board. The quantitative analysis from CyTOF datasets lends itself to presenting data based on the computational output. It is worth the effort to derive the context needed to be biologically informative, followed by effective interpretation, as we add observations that lead to improved understanding.
Acknowledgments
This work was supported in part by a grant from the NIH (AI 089992). The author is grateful to members of the Human Immunology Project Consortium (HIPC) for valuable discussions.
Footnotes
The author has no conflict of interest to declare.
LITERATURE CITED
- 1. Brusic V, Gottardo R, Kleinstein SH, Davis MM, HIPC steering committee. Computational resources for high-dimensional immune analysis from the Human Immunology Project Consortium. Nat Biotechnol. 2014;32:146–8. doi: 10.1038/nbt.2777.
- 2. Maecker HT, McCoy JP, Nussenblatt R. Standardizing immunophenotyping for the Human Immunology Project. Nat Rev Immunol. 2012;12:191–200. doi: 10.1038/nri3158.
- 3. Baker SC, Bauer SR, Beyer RP, Brenton JD, Bromley B, Burrill J, Causton H, Conley MP, Elespuru R, Fero M, et al. The External RNA Controls Consortium: a progress report. Nat Methods. 2005;2:731–4. doi: 10.1038/nmeth1005-731.
- 4. Collins FS, Tabak LA. Policy: NIH plans to enhance reproducibility. Nature. 2014;505:612–3. doi: 10.1038/505612a.
- 5. Spitzer MH, Nolan GP. Mass Cytometry: Single Cells, Many Features. Cell. 2016;165:780–91. doi: 10.1016/j.cell.2016.04.019.
- 6. Nair N, Mei HE, Chen SY, Hale M, Nolan GP, Maecker HT, Genovese M, Fathman CG, Whiting CC. Mass cytometry as a platform for the discovery of cellular biomarkers to guide effective rheumatic disease therapy. Arthritis Res Ther. 2015;17:127. doi: 10.1186/s13075-015-0644-z.
- 7. O’Gorman WE, Hsieh EW, Savig ES, Gherardini PF, Hernandez JD, Hansmann L, Balboni IM, Utz PJ, Bendall SC, Fantl WJ, et al. Single-cell systems-level analysis of human Toll-like receptor activation defines a chemokine signature in patients with systemic lupus erythematosus. J Allergy Clin Immunol. 2015;136:1326–1336. doi: 10.1016/j.jaci.2015.04.008.
- 8. Gaudilliere B, Ganio EA, Tingle M, Lancero HL, Fragiadakis GK, Baca QJ, Aghaeepour N, Wong RJ, Quaintance C, El-Sayed YY, et al. Implementing Mass Cytometry at the Bedside to Study the Immunological Basis of Human Diseases: Distinctive Immune Features in Patients with a History of Term or Preterm Birth. Cytometry A. 2015;87:817–29. doi: 10.1002/cyto.a.22720.
- 9. Yao Y, Liu R, Shin MS, Trentalange M, Allore H, Nassar A, Kang I, Pober J, Montgomery RR. CyTOF supports efficient detection of immune cell subsets from small samples. J Immunol Methods. 2014;415:1–5. doi: 10.1016/j.jim.2014.10.010.
- 10. Cosma A. A time to amaze, a time to settle down, and a time to discover. Cytometry A. 2015;87:795–6. doi: 10.1002/cyto.a.22724.
- 11. Lai L, Ong R, Li J, Albani S. A CD45-based barcoding approach to multiplex mass-cytometry (CyTOF). Cytometry A. 2015;87:369–74. doi: 10.1002/cyto.a.22640.
- 12. Finck R, Simonds EF, Jager A, Krishnaswamy S, Sachs K, Fantl W, Pe’er D, Nolan GP, Bendall SC. Normalization of mass cytometry data with bead standards. Cytometry A. 2013;83:483–94. doi: 10.1002/cyto.a.22271.
- 13. Tricot S, Meyrand M, Sammicheli C, Elhmouzi-Younes J, Corneau A, Bertholet S, Malissen M, Le Grand R, Nuti S, Luche H, et al. Evaluating the efficiency of isotope transmission for improved panel design and a comparison of the detection sensitivities of mass cytometer instruments. Cytometry A. 2015;87:357–68. doi: 10.1002/cyto.a.22648.
- 14. Lee JA, Spidlen J, Boyce K, Cai J, Crosbie N, Dalphin M, Furlong J, Gasparetto M, Goldberg M, Goralczyk EM, et al. MIFlowCyt: the minimum information about a Flow Cytometry Experiment. Cytometry A. 2008;73:926–30. doi: 10.1002/cyto.a.20623.
- 15. Finak G, Langweiler M, Jaimes M, Malek M, Taghiyar J, Korin Y, Raddassi K, Devine L, Obermoser G, Pekalski ML, et al. Standardizing Flow Cytometry Immunophenotyping Analysis from the Human ImmunoPhenotyping Consortium. Sci Rep. 2016;6:20686. doi: 10.1038/srep20686.
