Abstract
The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of computational methods for identifying cell populations in multidimensional flow cytometry data. Here we report the results of FlowCAP-IV, in which algorithms from seven different research groups predicted the time to progression to AIDS among a cohort of 384 HIV+ subjects, using antigen-stimulated peripheral blood mononuclear cell (PBMC) samples analyzed with a 14-color staining panel. Two approaches (FloReMi.1 and flowDensity-flowType-RchyOptimyx) provided statistically significant predictive value in the blinded test set. Manual validation of submitted results indicated that unbiased analysis of single-cell phenotypes could reveal unexpected cell types that correlated with outcomes of interest in high-dimensional flow cytometry datasets.
Keywords: flow cytometry, bioinformatics, clustering, classification, data analysis, HIV, clinical outcome, supervised analysis
Introduction
A wide range of computational tools is required for systematic analysis of high-dimensional and high-throughput single cell data in different clinical and experimental settings. Despite recent developments in flow cytometry bioinformatics [1], guidance for end users regarding the appropriate use of these methods has been scarce. The FlowCAP Consortium has developed a series of benchmarks for objective evaluation of computational tools in different settings. FlowCAP-I focused on the evaluation of cell population identification algorithms compared to manual analysis by human experts for a selection of different cell populations across several datasets [2]. While this approach was helpful to demonstrate that algorithms can identify known populations, it did not directly address the effectiveness of algorithms that identified cell types not previously delineated by the human analysts. To address this limitation the next FlowCAP challenges focused on the identification of cellular correlates of an independent biological variable (e.g. a clinical outcome). While these challenges demonstrated that computational analysis could outperform manual analysis for identification of cell types correlated with external factors, they did not establish a benchmark for evaluating the relative performance of data analysis pipelines because in one challenge many algorithms were able to achieve perfect classification accuracy, whereas in another challenge none of the algorithms (or human expert) could successfully identify a cell type that correlated with the clinical outcome. Therefore, a more challenging benchmark for evaluation of these algorithms was required to help evaluate their relative performance. 
In this article we report the results of the latest FlowCAP challenge that addresses this requirement through the use of a complex dataset to evaluate data analysis strategies for identification of correlates of clinical outcome – in this case the time to progression to AIDS in HIV positive participants.
Materials and Methods
We used data from a long-standing natural history study of HIV [3], from which 384 individuals were selected, with known HIV seroconversion dates (so that an estimated date of infection could be calculated) and long-term clinical follow-up [4]. For each individual, the time to AIDS diagnosis (“survival time”) was known, and patients were censored upon initiation of treatment or when lost to follow-up. We chose the earliest available cell specimen (<18 months post-infection (p.i.); mean = 7 months p.i.); our previous analysis showed that the timing of the cell specimen had no relationship to survival time. Cells were stained with two polychromatic flow cytometry panels: the first assessed the resting phenotypes of T-cells directly ex vivo [4] (but was not used for this study), while the second characterized cytokine and other functional responses to HIV, after stimulation with pools of overlapping peptides from HIV-Gag proteins (Chattopadhyay et al., under review). Specifically, the latter panel examined the cytokines IFNg, IL2, and TNF, along with a costimulatory molecule (CD154) upregulated by CD4+ T-cells upon activation and a marker of T-cell degranulation (CD107a). Cell surface markers that define the maturity of the HIV-specific T-cell were examined in this panel as well (CD45RO, CCR7, CD27, and CD57) so that the functional responses could be ascribed to cell types well-described in the literature (e.g., central memory T-cells, senescent T-cells, etc.). 
Manual analysis of the dataset was performed previously, and showed that no specific characteristics of the anti-HIV T-cell response in the selected samples could predict survival time. Specifically, the following characteristics were tested: total magnitude of the HIV Gag or Env specific T-cell responses, magnitude of individual functional responses, polyfunctionality of the CD4 or CD8 response, and differentiation state of the response; these are described in greater detail in a manuscript under review (Chattopadhyay et al.). It is important to note, however, that the manual data analysis examined only cell characteristics that were known to the biologists a priori. Manual data analysis of all possible combinations of markers in the dataset was not feasible. All selected samples were divided randomly and uniformly into a training set and a test set of equal size. No correlation between set membership and the clinical outcome was observed (p-value > 0.22). The FlowCAP participants were blinded to the clinical outcomes in the test set, which was used for independently verifying the submitted results.
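The random, equal-sized partition described above can be illustrated with a minimal sketch. The function name, the fixed seed, and the use of integer subject identifiers are assumptions made for illustration; the procedure actually used by the organizers is not specified beyond being random and uniform.

```python
import random


def split_train_test(subject_ids, seed=0):
    """Shuffle subject IDs and split them into two halves of equal size."""
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility only
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


# 384 placeholder subject identifiers, matching the cohort size in the text.
train_ids, test_ids = split_train_test(range(384))
print(len(train_ids), len(test_ids))
```

Because each subject lands in exactly one half, clinical outcomes for the test half can be withheld without any overlap with the training half.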
To ensure the feasibility of the study, members of the FlowCAP organizing committee not participating in the submission of algorithm results performed an independent analysis [5]. The raw Gag-stimulated samples were pre-processed using an optimized logicle transformation [6] and 1000 cells per patient were selected randomly. K-means clustering was used to identify 50 cell types simultaneously in all patients using the pooled cells. For each cluster, the frequency of the cells (i.e., the number of cells in the cluster divided by the total number of cells) was used for the correlative analysis (Cox-proportional hazards model) on the training set. The median expression of each marker for each cell type as well as log-rank p-values for each cluster were calculated (Supplemental Figure 1). The most significant clusters included CD3+CD4+ and CD3+CD8+ populations that also express a range of other markers (e.g., CD27, CCR7, and IL2) and lacked expression of others (e.g., CD57 and TNF). These clusters remained correlated with the clinical outcome in the test set, suggesting that successful identification of correlates of disease progression in this dataset was possible (Supplemental Figure 2).
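The feature-extraction step of this feasibility analysis (subsample cells per patient, pool them, cluster, and compute per-patient cluster frequencies) can be sketched as follows. This is a simplified illustration in pure Python, not the code used by the organizing committee: the logicle transformation and the Cox model are omitted, the toy data are synthetic, and all function and variable names are hypothetical.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster label for every point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_frequencies(cells_by_patient, k, cells_per_patient, seed=0):
    """Subsample each patient, cluster the pooled cells, and return, for each
    patient, the fraction of that patient's sampled cells in every cluster."""
    rng = random.Random(seed)
    pooled, owner = [], []
    for patient, cells in cells_by_patient.items():
        sample = rng.sample(cells, min(cells_per_patient, len(cells)))
        pooled.extend(sample)
        owner.extend([patient] * len(sample))
    labels = kmeans(pooled, k, seed=seed)
    totals = {p: owner.count(p) for p in cells_by_patient}
    freqs = {p: [0.0] * k for p in cells_by_patient}
    for patient, lab in zip(owner, labels):
        freqs[patient][lab] += 1.0 / totals[patient]
    return freqs


# Synthetic two-marker data for two hypothetical patients.
gen = random.Random(1)
toy = {
    "patient_A": [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)],
    "patient_B": [(gen.gauss(4, 1), gen.gauss(4, 1)) for _ in range(200)],
}
freqs = cluster_frequencies(toy, k=2, cells_per_patient=100)
print(freqs)
```

In the analysis described above, the corresponding settings were 1000 cells per patient and 50 clusters, and each column of frequencies was then tested against survival time.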
Data from unstimulated and Gag-stimulated samples were distributed to all participants. The dataset was randomly partitioned into a training set and a test set of equal sizes. The clinical information (survival time and censorship status) of the patients in the test set was not provided to the participants. Each participant provided a vector of final predictions (extracted from one or a combination of several cell types) that was most correlated with the clinical outcome (as measured by a log-rank test on a Cox proportional hazards regression). Participants were also required to submit the complete source code and the software packages needed for independent reproduction of the results. The results were evaluated by an independent log-rank test on the blinded test set.
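To make the evaluation statistic concrete, the following is a minimal sketch of a two-group log-rank test with right-censoring. It is not the evaluation code used in the challenge, which applied the log-rank test to a Cox proportional hazards regression on a vector of predictions; here we assume, for illustration only, that predictions have already been dichotomized into a low-risk group (0) and a high-risk group (1). The toy survival times are invented.

```python
import math


def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times:  follow-up time for each subject
    events: 1 if the event was observed, 0 if the subject was censored
    groups: 0 or 1 for each subject
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t.
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(
            1 for ti, e, g in zip(times, events, groups) if ti == t and e and g == 1
        )
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented example: group 1 progresses early, group 0 late, no censoring.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1] * 10
groups = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
chi2, p_value = logrank_test(times, events, groups)
print(chi2, p_value)
```

Censored subjects contribute to the at-risk counts until their censoring time but never to the event counts, which is how the test accommodates patients who started treatment or were lost to follow-up.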
Results
Twenty-four groups requested to participate in FlowCAP-IV and received the data. We received 9 submissions from 7 groups (Table 1) using combinations of different algorithms [7]-[17]. Log-rank test p-values for the submitted results indicated that while several algorithms identified cell types strongly correlated with the clinical outcome in the training dataset, only the flowDensity-flowType-RchyOptimyx and FloReMi.1 approaches maintained a significant correlation (p < 0.05) in the blinded test set. flowDensity is a supervised sequential bivariate clustering algorithm that mimics manual gating methods. flowType-RchyOptimyx partitions cells into categories (e.g., positive or negative) using 1-D density estimates to calculate cut-offs for each marker and enumerates all cell types in a sample, then uses dynamic programming to efficiently construct k-shortest paths to important cell populations. FloReMi.1 combines flowType for cell population identification with a random forest approach using cell population-based features to build a survival regression model. Of the three versions of FloReMi submitted, one (FloReMi.1) remained significantly correlated with the clinical outcome after Bonferroni adjustment for multiple hypothesis testing, which accounts for the number of submissions (i.e., as the number of submitted results increases, it becomes more likely that some will show significant correlations in the test dataset due to chance alone).
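The Bonferroni adjustment mentioned above amounts to dividing the significance level by the number of tests. A minimal sketch follows; the method names and p-values are placeholders invented for illustration and are not the results of the challenge. Only the number of tests (9 submissions) is taken from the text.

```python
def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Flag each p-value as significant if it is at or below alpha / n_tests."""
    threshold = alpha / n_tests
    return {name: p <= threshold for name, p in p_values.items()}


# Placeholder p-values, NOT the values reported for any submission.
hypothetical = {"method_a": 0.001, "method_b": 0.03, "method_c": 0.2}
flags = bonferroni_significant(hypothetical, n_tests=9)
print(flags)
```

With 9 submissions the per-test threshold is 0.05 / 9, roughly 0.0056, so a result can be significant at p < 0.05 and still fail the adjusted criterion.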
Table 1.
Summary of all submitted results.
Name | Approach | Availability | Reference |
---|---|---|---|
BorFlowFP | Exhaustive all-relevant feature selection with the Boruta method over the flow cytometry fingerprints data followed by modeling with Survival Random Forest algorithm. | Boruta and FlowFP are free, open-source R packages available from CRAN and BioConductor, respectively. | [15] |
FloReMi.1 | Partitioning of cells (e.g., into positive or negative) to identify cell populations, followed by feature extraction of both cell population size and MFI values. Features with minimal redundancy were selected as input for a random survival forest for survival time prediction. | R code to reproduce results is available at www.github.com/SofieVG/FloReMi and in supplementary data from FlowRepository.org | [12] |
FloReMi.2 | As FloReMi.1, but using a Cox proportional hazards model for survival time prediction | R code to reproduce results is available at www.github.com/SofieVG/FloReMi and in supplementary data from FlowRepository.org | [12] |
FloReMi.3 | As FloReMi.1, but using an additive hazards model for survival time prediction | R code to reproduce results is available at www.github.com/SofieVG/FloReMi and in supplementary data from FlowRepository.org | [12] |
flowDensity/flowType/RchyOptimyx | Density-based gating and partitioning of cells (e.g., into positive or negative) followed by dynamic programming to identify k-shortest paths to important cell populations | Free, open-source R packages available from BioConductor | [10,7,8,9] |
GANN | Identifying profiles of individual bin channels of fluorescence for the different cell markers which can be used for distinguishing different groups/categories | C++ written program, available by contacting the author and in supplementary data from FlowRepository.org | [11,13,14] |
RTOMT | Partitioning of cells (e.g., into positive or negative). Combining clinical diagnosis and survival time into a single target vector. Regression tree. | Free, open-source Matlab code available in supplementary data from FlowRepository.org | NA |
SPADE.SNR | SPADE was used to derive cell clusters in high-dimensional space. Important clusters were selected by a signal-to-noise ratio based on cell frequency of subjects who showed disease progression. | Freely available implementation in Matlab | [16] |
EMMIXflow | Clustering using expectation maximization fitting of skew-t mixture models was used for feature extraction. A random survival forest was used for building the predictive model. | R source code available in supplementary data from FlowRepository.org | [17] |
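The partitioning step shared by several entries in Table 1 (each marker called positive or negative, then all marker combinations enumerated) can be illustrated with a minimal sketch. This is not the flowType implementation: the cut-offs are supplied by hand rather than derived from 1-D density estimates, the subsequent RchyOptimyx path construction is omitted, and the marker intensities are invented.

```python
from itertools import product


def enumerate_phenotypes(cells, thresholds):
    """Return the fraction of cells matching every marker combination.

    cells:      list of dicts mapping marker name to intensity
    thresholds: dict mapping marker name to a positive/negative cut-off
    Each marker is required to be positive ('+'), negative ('-'),
    or is ignored ('*'), giving 3 ** n_markers phenotypes in total.
    """
    markers = sorted(thresholds)
    # Binarize every cell once: True where intensity exceeds the cut-off.
    calls = [tuple(cell[m] > thresholds[m] for m in markers) for cell in cells]
    result = {}
    for states in product("+-*", repeat=len(markers)):
        label = "".join(m + s for m, s in zip(markers, states) if s != "*")
        label = label or "all cells"
        count = sum(
            all(s == "*" or (s == "+") == pos for s, pos in zip(states, call))
            for call in calls
        )
        result[label] = count / len(cells)
    return result


# Invented intensities for two markers on four cells.
cells = [
    {"CD27": 0.9, "CD57": 0.1},
    {"CD27": 0.2, "CD57": 0.8},
    {"CD27": 0.1, "CD57": 0.2},
    {"CD27": 0.7, "CD57": 0.9},
]
fractions = enumerate_phenotypes(cells, {"CD27": 0.5, "CD57": 0.5})
print(len(fractions), fractions["CD27-"], fractions["CD27-CD57-"])
```

The number of phenotypes grows as 3 to the power of the number of markers, which is why exhaustive enumeration is practical for an algorithm but not for manual gating.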
The algorithms used different cell types for the final predictions. For example, FloReMi.1 identified a CD3+ CD4− CD45RO− CD27− CCR7− population (likely effector CD8+ cells [18]), with mixed expression of CD57+ and CD57− cells. Higher frequencies of these cells early in HIV disease were correlated with worse prognosis, consistent with previous reports in chronic infection [19]. The flowType pipeline identified a CD27− TNF− CD154− population with predictive value, but this could not be interpreted further because there was either mixed or negative expression of all the other markers studied. This is not necessarily a limitation of the algorithm; it suggests that more markers were needed in the original experiment to positively identify a disease correlate. Still, the result is consistent with that obtained by FloReMi, in that CD27− cells were also identified.
Importantly, several algorithms identified populations within the CD14/VIVID+ channel, originally designed to exclude dead cells and monocytes. Cluster #3 of our independent analysis (Supplemental Figure 1) further describes this population, which includes CD3+ CD4− CD14/VIVID+ CD57− cells. This cluster was correlated with outcome in both the training and test sets. To validate this population, we constructed a gating strategy for this cell type (Supplemental Figure 3) and applied it to the unstimulated set of samples. The correlation with the clinical outcome was preserved (Supplemental Figure 4), arguing against the possibility of experimental artifact, since the identified population appeared significant in all samples (stimulated and unstimulated). Stated differently, since the population lacked cytokine-expressing cells, we expected to observe it (and find the same clinical correlations) in the unstimulated samples. Finally, we confirmed these results in a different dataset from the same cohort [10], using manual gates to identify the CD3+ CD4− CD14/VIVID+ CD57− cells. The boundaries of the gates were manually adjusted using FlowJo (Treestar Inc, Ashland, OR) to account for technical variations across the two datasets. The frequency of the manually identified cell type remained correlated with the clinical outcome (Supplemental Figure 5), confirming robustness of the results in this dataset.
We performed additional experiments to better define these cells. Based on their phenotype, we surmised that the correlation arose from dead cells or live monocytes. Notably, although the cells appeared CD3+, this was likely a consequence of well-known experimental artifacts: non-specific antibody binding by dead cells [20] and/or monocyte binding of our mouse anti-human IgG1 CD3 antibody via FcγRIIA (CD32A, [21]). To disentangle the staining patterns of the viability marker between dead cells and monocytes, we stained cells with VIVID and then measured CD14 expression using a different fluorochrome. Live, CD14+ monocytes bound intermediate levels of VIVID compared to live lymphocytes and dead monocytes (Supplemental Figure 6), likely as a result of the increased size and protein content of live monocytes compared to lymphocytes [20]. This suggested that the correlation found in the dataset represented live monocytes, not dead cells (further confirmed by a separate experiment, where the correlate population expressed other markers of monocytes, CD16 and CD11c). This was confirmed by testing viability data from the original dataset, obtained using microscopic examination of ethidium bromide/acridine orange stained cells (EB/AO) at the time of thaw. There was no relationship between the frequency of dead cells, or the number of cells recovered from the vial as measured by a hemocytometer, and survival times (Supplemental Figure 7). This strongly suggests that monocytes accounted for the correlation between CD14+/VIVID+ cells in early infection and survival time, consistent with previous data [22]–[25]. This analysis demonstrates how careful analysis of biology, experimental artifacts, and computational results can provide an unbiased analysis of all available cells, revealing unexpected cell types that correlate with clinical outcomes.
Discussion
As high throughput and high dimensional flow and mass cytometry datasets become increasingly common, there is a critical need for automated data analysis approaches. As demonstrated by our HIV natural history studies, correlates of survival often lie within complex populations of cells not typically queried by directed, manual data analysis. Comprehensive analysis of a dataset, in which cells expressing all possible combinations of markers are examined, is unfeasible with classical, manual data analysis alone. Fortunately, there are a wide variety of approaches to automated data analysis. However, there have been few carefully controlled comparative studies that have assessed their performance. FlowCAP provides a forum for such comparison.
FlowCAP-IV was designed to address the limitations in some of the past challenges by using a dataset containing complex correlates of an externally defined variable (time to AIDS progression/diagnosis). Participants were provided a training set, and their submitted results (survival status) were evaluated on a blinded and independent test set. There were a number of reasons this dataset was selected for a FlowCAP challenge. It contained well-defined clinical data, providing a standardized and meaningful outcome variable. The data were also robustly collected using stringent quality control approaches, with previous studies performed in the same cohort of individuals. A variety of parameters were examined in this study, including multiple markers of T-cell maturity and function, providing a complex test. Our previous analysis also suggested that correlates of survival were contained within complex cell populations, and thus would be challenging to identify. Finally, the size of the dataset was consistent with the types of studies increasingly performed today, for which better data analysis solutions are needed.
A key product of our work with automated data analysis is the recognition that correlates of biologically important outcomes can lie in cell types that are not well described in the literature. In fact, the reliance on classically defined cell types for manual data analysis is problematic in high-dimensional experiments, as the definition of a particular cell type might differ by investigator or paper. For example, a central memory T-cell in one study may be CD45RA− CCR7+, while in another study the cell is defined as CD45RA− CD27+. When all of these markers are measured together in a sample, it becomes difficult to decide which combination of markers should define the cell type of interest and only a few cell populations are assessed due to the burden of manual analysis. In this regard, automated unbiased approaches that consider expression of all markers are vastly preferable since no potentially important cell populations are missed; this is a feature of all the automated approaches tested in this challenge. Notably, we show here that pre-gating data (to remove dead cells or monocytes, for example) can cause similar problems, as the excluded cell types may include correlates. However, this consideration is balanced by the problem of interpreting biology from the potentially artifactual staining patterns on the excluded cells.
The methods included in this study can be broken into several components: 1) preprocessing (e.g., transformation [26] and normalization [27]); 2) feature extraction (e.g., clustering [28], probability binning [29], combinatorial gating [7], [10], and cluster matching [30], [31]); 3) supervised analysis (e.g., single-variate [32]–[34] and multi-variate models [35]); 4) characterization [8], [36] and visualization [16], [37], [38]. Not only do these components vary in their individual performance across different biological applications, but their interactions with each other in a large data analysis pipeline also further complicate the choice of appropriate methods [39]. The establishment of objective benchmarks like the one reported here enables further analysis in which different components are examined subject to a wide range of technical and biological variations to identify the combination with optimal performance. It is also possible that the optimal combination may differ by data type or study design. FlowCAP provides a platform for future evaluation of these possibilities.
Supplementary Material
Figure 1.
The log-rank test p-values of the submissions on (A) the training-set and (B) the test-set.
Acknowledgments
This work was supported by the International Society for Advancement of Cytometry’s Scholarship (NA and PKC), an Ann Schreiber Mentored Investigator Award from the Ovarian Cancer Research Fund OCRF 292495 (NA), a Canadian Institute of Health Research Postdoctoral Fellowship 321510 (NA), a Canadian Institute of Health Research Scholarship for Strategic Training in Bioinformatics (NA), University of British Columbia’s 4YF Scholarship (NA), Rachford and Carlota A. Harris Professorship (GPN), and the following grants: NIH 152175.5041015.0412, 1R33CA183692-01 U19 AI057229, U54CA149145, N01-HV-00242, 1U19AI100627, 5R01AI07372405, 1R33CA18365401, 1R33CA183692, R01CA184968, 1 R33 CA183654, A1073724, CA184968, and R33 CA183692 (GPN), DOD W81XWH-12-1-0591 (GPN), William Lawrence & Blanche Hughes Foundation (GPN), DOD (Department of Defense) UCD - W81XWH-13-BCRP-BREAKTHROUGH (GPN), Ovarian Cancer Teal Innovator Award (GPN), Entertainment Industry Foundation (NWCRA) (GPN), Lymphoma Research Foundation (GPN), Bill and Melinda Gates Fdn. OPP 1017093 GF12421137101 (GPN), Alliance for Lupus Research (GPN), Northrop-Grumman Corp 7500108142, NIH/NIBIB EB008400 (RRB, MM, NA, GF, RG, RS), NSERC (RB, MM), NIH/NIAID R24AI054953 (TM), Agency for Innovation by Science and Technology Ph.D. grant (SVG), Research Foundation – Flanders Postdoctoral Fellowship (CVNCI R01CA163481 (PQ), Polish National Science Centre grant 2011/01/N/ST6/07035 (MBK). The authors wish to acknowledge Canada’s Michael Smith Genome Sciences Centre, Vancouver, Canada for use of High Performance Computing resources.
Footnotes
Availability
Raw data, clinical meta-data, and the submissions by the participants, including software, are publicly available at http://flowrepository.org [40] under repository ID FR-FCM-ZZ99 (Note: this code will be activated only upon publication. Reviewer access is available through: http://flowrepository.org/id/RvFrXVcrR4m0jEcsoPpnEdziUfaVUtGYdXZw4bEqvlnGfRX9aSYNSltB4bdTG4DH).
References
- 1. O’Neill K, Aghaeepour N, Spidlen J, Brinkman R. Flow cytometry bioinformatics. PLoS Comput Biol. 2013;9(12):e1003365. doi: 10.1371/journal.pcbi.1003365.
- 2. Aghaeepour N, Finak G, FlowCAP Consortium, DREAM Consortium, Hoos H, Mosmann TR, Brinkman R, Gottardo R, Scheuermann RH. Critical assessment of automated flow cytometry data analysis techniques. Nat Methods. 2013 Mar;10(3):228–238. doi: 10.1038/nmeth.2365.
- 3. Weintrob AC, Fieberg AM, Agan BK, Ganesan A, Crum-Cianflone NF, Marconi VC, Roediger M, Fraser SL, Wegner SA, Wortmann GW. Increasing Age at HIV Seroconversion From 18 to 40 Years Is Associated With Favorable Virologic and Immunologic Responses to HAART. JAIDS Journal of Acquired Immune Deficiency Syndromes. 2008 Sep;49(1):40–47. doi: 10.1097/QAI.0b013e31817bec05.
- 4. Ganesan A, Chattopadhyay PK, Brodie TM, Qin J, Gu W, Mascola JR, Michael NL, Follmann DA, Roederer M, Infectious Disease Clinical Research Program HIV Working Group. Immunologic and virologic events in early HIV infection predict subsequent rate of progression. J Infect Dis. 2010 Jan;201(2):272–284. doi: 10.1086/649430.
- 5. MacQueen J. Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. 1967;1(14):281–297.
- 6. Moore WA, Parks DR. Update for the logicle data scale including operational code implementations. Cytometry A. 2012 Apr;81(4):273–277. doi: 10.1002/cyto.a.22030.
- 7. O’Neill K, Jalali A, Aghaeepour N, Hoos H, Brinkman RR. Enhanced flowType/RchyOptimyx: a Bioconductor pipeline for discovery in high-dimensional cytometry data. Bioinformatics. 2014 Jan;30(9):1329–1330. doi: 10.1093/bioinformatics/btt770.
- 8. Aghaeepour N, Jalali A, O’Neill K, Chattopadhyay PK, Roederer M, Hoos HH, Brinkman RR. RchyOptimyx: cellular hierarchy optimization for flow cytometry. Cytometry A. 2012 Dec;81(12):1022–1030. doi: 10.1002/cyto.a.22209.
- 9. Malek M, Taghiyar MJ, Chong L, Finak G, Gottardo R, Brinkman RR. flowDensity: reproducing manual gating of flow cytometry data by automated density-based cell population identification. Bioinformatics. 2014 Oct. doi: 10.1093/bioinformatics/btu677.
- 10. Aghaeepour N, Chattopadhyay PK, Ganesan A, O’Neill K, Zare H, Jalali A, Hoos HH, Roederer M, Brinkman RR. Early immunologic correlates of HIV protection can be identified from computational analysis of complex multivariate T-cell flow cytometry assays. Bioinformatics. 2012 Apr;28(7):1009–1016. doi: 10.1093/bioinformatics/bts082.
- 11. Tong DL, Ball GR, Pockley AG. gEM/GANN: A multivariate computational strategy for auto-characterizing relationships between cellular and clinical phenotypes and predicting disease progression time using high-dimensional flow cytometry data. Cytometry Part A. 2015. doi: 10.1002/cyto.a.22622.
- 12. Van Gassen S, Callebaut B, Van Helden MJ. FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry Part …. 2015. doi: 10.1002/cyto.a.22625.
- 13. Tong DL, Mintram R. Genetic Algorithm-Neural Network (GANN): a study of neural network activation functions and depth of genetic algorithm search applied to feature selection. International Journal of Machine Learning and Cybernetics. 2010;1(1):75–87.
- 14. Tong DL, Schierz AC. Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data. Artif Intell Med. 2011 Sep;53(1):47–56. doi: 10.1016/j.artmed.2011.06.008.
- 15. Rogers WT, Holyst HA. FlowFP: A Bioconductor Package for Fingerprinting Flow Cytometric Data. Adv Bioinformatics. 2009;2009(7):193947–11. doi: 10.1155/2009/193947.
- 16. Qiu P, Simonds EF, Bendall SC, Gibbs KD Jr, Bruggner RV, Linderman MD, Sachs K, Nolan GP, Plevritis SK. Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nature Biotechnology. 2011 Oct;29(10):886–891. doi: 10.1038/nbt.1991.
- 17. McLachlan G, Krishnan T. The EM algorithm and extensions. Vol. 382. John Wiley & Sons; 2007.
- 18. Chattopadhyay PK, Roederer M. Good cell, bad cell: Flow cytometry reveals T-cell subsets important in HIV disease. Cytometry A. 2010 Jul;77(7):614–622. doi: 10.1002/cyto.a.20905.
- 19. Papagno L, Spina CA, Marchant A, Salio M, Rufer N, Little S, Dong T, Chesney G, Waters A, Easterbrook P, Dunbar PR, Shepherd D, Cerundolo V, Emery V, Griffiths P, Conlon C, McMichael AJ, Richman DD, Rowland-Jones SL, Appay V. Immune Activation and CD8+ T-Cell Differentiation towards Senescence in HIV-1 Infection. PLoS Biol. 2004 Feb;2(2):e20. doi: 10.1371/journal.pbio.0020020.
- 20. Perfetto SP, Chattopadhyay PK, Lamoreaux L, Nguyen R, Ambrozak D, Koup RA, Roederer M. Amine reactive dyes: An effective tool to discriminate live and dead cells in polychromatic flow cytometry. Journal of Immunological Methods. 2006 Jun;313(1):199–208. doi: 10.1016/j.jim.2006.04.007.
- 21. Bruhns P. Properties of mouse and human IgG receptors and their contribution to disease models. Blood. 2012 Jun;119(24):5640–5649. doi: 10.1182/blood-2012-01-380121.
- 22. Müller-Trutwin M, Hosmalin A. Role for plasmacytoid dendritic cells in anti-HIV innate immunity. Immunol Cell Biol. 2005 Oct;83(5):578–583. doi: 10.1111/j.1440-1711.2005.01394.x.
- 23. Sabado RL, O’Brien M, Subedi A, Qin L, Hu N, Taylor E, Dibben O, Stacey A, Fellay J, Shianna KV, Siegal F, Shodell M, Shah K, Larsson M, Lifson J, Nadas A, Marmor M, Hutt R, Margolis D, Garmon D, Markowitz M, Valentine F, Borrow P, Bhardwaj N. Evidence of dysregulation of dendritic cells in primary HIV infection. Blood. 2010 Nov;116(19):3839–3852. doi: 10.1182/blood-2010-03-273763.
- 24. Sandler NG, Wand H, Roque A, Law M, Nason MC, Nixon DE, Pedersen C, Ruxrungtham K, Lewin SR, Emery S, Neaton JD, Brenchley JM, Deeks SG, Sereti I, Douek DC, INSIGHT SMART Study Group. Plasma levels of soluble CD14 independently predict mortality in HIV infection. J Infect Dis. 2011 Mar;203(6):780–790. doi: 10.1093/infdis/jiq118.
- 25. Hasegawa A, Liu H, Ling B, Borda JT, Alvarez X, Sugimoto C, Vinet-Oliphant H, Kim W-K, Williams KC, Ribeiro RM, Lackner AA, Veazey RS, Kuroda MJ. The level of monocyte turnover predicts disease progression in the macaque model of AIDS. Blood. 2009 Oct;114(14):2917–2925. doi: 10.1182/blood-2009-02-204263.
- 26. Qian Y, Liu Y, Campbell J, Thomson E, Kong YM, Scheuermann RH. FCSTrans: An open source software system for FCS file conversion and data transformation. Cytometry A. 2012 May;81(5):353–356. doi: 10.1002/cyto.a.22037.
- 27. Hahne F, Khodabakhshi AH, Bashashati A, Wong C-J, Gascoyne RD, Weng AP, Seyfert-Margolis V, Bourcier K, Asare A, Lumley T, Gentleman R, Brinkman RR. Per-channel basis normalization methods for flow cytometry data. Cytometry A. 2010 Feb;77(2):121–131. doi: 10.1002/cyto.a.20823.
- 28. Ge Y, Sealfon SC. flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding. Bioinformatics. 2012 Aug;28(15):2052–2058. doi: 10.1093/bioinformatics/bts300.
- 29. Roederer M, Treister A, Moore W, Herzenberg LA. Probability binning comparison: A metric for quantitating univariate distribution differences. Cytometry. 2001 Sep;45(1):37–46. doi: 10.1002/1097-0320(20010901)45:1<37::aid-cyto1142>3.0.co;2-e.
- 30. Azad A, Khan A, Rajwa B, Pyne S, Pothen A. Classifying Immunophenotypes With Templates From Flow Cytometry. Presented at the International Conference, New York; New York, USA. 2007; pp. 256–265.
- 31. A Non-parametric Bayesian Model with Random Effects: Joint Cell Clustering and Cluster Matching for Anomalous Sample Phenotype Identification. 2014 Feb;:1–26. doi: 10.1186/1471-2105-15-314.
- 32. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA. 2001 Apr;98(9):5116–5121. doi: 10.1073/pnas.091062498.
- 33. Bruggner RV, Bodenmiller B, Dill DL, Tibshirani RJ, Nolan GP. Automated identification of stratifying signatures in cellular subpopulations. PNAS. 2014 Jul;111(26):E2770–7. doi: 10.1073/pnas.1408792111.
- 34. Gaudillière B, Fragiadakis GK, Bruggner RV, Nicolau M, Finck R, Tingle M, Silva J, Ganio EA, Yeh CG, Maloney WJ, Huddleston JI, Goodman SB, Davis MM, Bendall SC, Fantl WJ, Angst MS, Nolan GP. Clinical recovery from surgery correlates with single-cell immune signatures. Sci Transl Med. 2014 Sep;6(255):255ra131. doi: 10.1126/scitranslmed.3009701.
- 35. Zare H, Haffari G, Gupta A, Brinkman RR. Scoring relevancy of features based on combinatorial analysis of Lasso with application to lymphoma diagnosis. BMC Genomics. 2013;14(1):S14. doi: 10.1186/1471-2164-14-S1-S14.
- 36. Anchang B, Do MT, Zhao X, Plevritis SK. CCAST: A Model-Based Gating Strategy to Isolate Homogeneous Subpopulations in a Heterogeneous Population of Single Cells. PLoS Comput Biol. 2014 Jul;10(7):e1003664. doi: 10.1371/journal.pcbi.1003664.
- 37. Amir E-AD, Davis KL, Tadmor MD, Simonds EF, Levine JH, Bendall SC, Shenfeld DK, Krishnaswamy S, Nolan GP, Pe’er D. viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia. Nature Biotechnology. 2013 Jun;31(6):545–552. doi: 10.1038/nbt.2594.
- 38. Shekhar K, Brodin P, Davis MM, Chakraborty AK. Automatic Classification of Cellular Expression by Nonlinear Stochastic Embedding (ACCENSE). PNAS. 2014 Jan;111(1):202–207. doi: 10.1073/pnas.1321405111.
- 39. Finak G, Frelinger J, Jiang W, Newell EW, Ramey J, Davis MM, Kalams SA, De Rosa SC, Gottardo R. OpenCyto: an open source infrastructure for scalable, robust, reproducible, and automated, end-to-end flow cytometry data analysis. PLoS Comput Biol. 2014 Aug;10(8):e1003806. doi: 10.1371/journal.pcbi.1003806.
- 40. Spidlen J, Breuer K, Rosenberg C, Kotecha N, Brinkman RR. FlowRepository: A resource of annotated flow cytometry datasets associated with peer-reviewed publications. Cytometry A. 2012 Sep;81(9):727–731. doi: 10.1002/cyto.a.22106.