SUMMARY
Cell-to-cell expression variation (CEV) is a prevalent feature of even well-defined cell populations, but its functions, particularly at the organismal level, are not well understood. Using single-cell data obtained via high-dimensional flow cytometry of T cells as a model, we introduce an analysis framework for quantifying CEV in primary cell populations and studying its functional associations in human cohorts. Analyses of 840 CEV phenotypes spanning multiple baseline measurements of 14 proteins in 28 T cell subpopulations suggest that the quantitative extent of CEV can exhibit substantial subject-to-subject differences and yet remain stable within healthy individuals over months. We linked CEV to age and disease-associated genetic polymorphisms, thus implicating CEV as a biomarker of aging and disease susceptibility and suggesting that it might play an important role in health and disease. Our dataset, interactive figures, and software for computing CEV with flow cytometry data provide a resource for exploring CEV functions.
In Brief
Cell-to-cell expression variation (CEV) is a prevalent feature of cell populations, but the functional relevance of this variation is not well understood. Lu et al. develop an analysis framework for quantifying CEV in human immune cell populations. They show that CEV can exhibit substantial subject-to-subject differences but is largely stable within individuals, and they identify CEV correlates of aging and disease-associated genetic polymorphisms.
INTRODUCTION
Cell-to-cell phenotypic variation, or cellular heterogeneity, is a pervasive phenomenon (Altschuler and Wu, 2010). Cell-to-cell expression variation (CEV) in transcript or protein levels, in particular, has been consistently observed in both clonal and non-clonal cell populations from unicellular organisms to human cell lines (Eldar and Elowitz, 2010). CEV in a cell population could originate from fluctuations in gene expression as a result of the stochastic nature of biochemical and cellular processes (e.g., mRNA and protein production and degradation), or it could reflect sources of “extrinsic” variations, such as differences in (1) the microenvironment surrounding individual cells (e.g., different sites within tumors) (Marusyk et al., 2012), (2) developmental lineage and stages (Geissmann et al., 2010), and (3) cycling status among cells (Fleming et al., 1993). It is becoming increasingly clear that the extent of CEV, even within well-defined cell types or relatively homogeneous cell populations, could play functional roles (Eldar and Elowitz, 2010), such as conferring distinct cellular differentiation or activation potential to a subset of cells in a population (Chang et al., 2008; Feinerman et al., 2008) or enabling “bet-hedging” against potential environmental stresses and fluctuations (Arkin et al., 1998; Kussell and Leibler, 2005; Thattai and van Oudenaarden, 2004). The bet-hedging strategy could be particularly relevant for the immune system given that a timely response to an unanticipated “danger” signal (e.g., a fast-evolving pathogen) is desirable.
Human peripheral-blood immune cells offer an attractive avenue for studying CEV in primary cell populations because they are accessible, include many well-defined, functionally diverse immune cell populations, and are amenable to high-throughput single-cell analysis using flow cytometry and emerging technologies such as mass cytometry and single-cell RNA sequencing (RNA-seq) (Bendall et al., 2011; Stegle et al., 2015). Although immune cell phenotyping, e.g., measuring the relative frequency of immune cell subsets via flow cytometry followed by subpopulation gating, is often used for studying immune status in health and disease, the properties and the biological and functional relevance of the CEV present in peripheral immune cell subsets have not been systematically studied.
Cellular heterogeneity in human peripheral immune cells is often attributed to the presence of functionally distinct cell subsets (e.g., naive, central, and effector memory CD4+ T cells), each of which is thought to express a unique combination of marker genes or proteins that enable the identification (or gating) of the subset. However, in addition to this discrete-subset notion of CEV (herein referred to as “discrete CEV”), even within a well-defined, seemingly homogeneous population of cells, the expression level of a gene can vary substantially from one cell to another in a continuous fashion (“continuous CEV”). Recent single-cell analysis of human hematopoietic cells has suggested that such continuous expression heterogeneity might be prevalent and play functional roles (Antebi et al., 2013; Newell et al., 2012; Rebhahn et al., 2014; Satija and Shalek, 2014). For example, variation in the level of a signaling-pathway component can confer distinct levels of responsiveness to input signals in individual cells (Bendall et al., 2011; Feinerman et al., 2008). These observations indicate that the degree of CEV in a cell population can potentially influence both the total signaling output and the extent of functional heterogeneity, such as cytokine secretion behavior, within that cell population.
Questions about CEV functions in humans naturally extend to the population level. For example, does the extent of such variation differ among individuals and change over time in one person? Is CEV under the influence of genetics? A large number of genetic variants (called expression quantitative trait loci [eQTLs]) are known to affect the average expression level of a gene in a cell population or tissue, including major immune cell types (Lee et al., 2014; Raj et al., 2014; Ye et al., 2014), but it is unclear whether CEV, i.e., the variability around mean expression across cells, can also be under genetic regulation by QTLs (or cevQTLs) in a cell-type- or cell-subset-dependent manner in humans. Addressing these questions requires measuring CEV in human cohorts. In turn, by utilizing the genetic and phenotypic diversity of the human population (Tsang, 2015), we can begin to use correlation analysis across subjects to functionally link CEV to parameters such as age, genetic polymorphisms, and disease status and susceptibility.
Here, we introduce an analysis framework for quantifying CEV in peripheral immune cell populations within individual human subjects and for assessing functional correlates of CEV in a cohort. We applied this framework to flow cytometry data on single T lymphocytes in a cohort of 61 healthy subjects (Tsang et al., 2014). Although single-cell data have been generated routinely by flow cytometry in immunological investigations, they have been used mostly for cellular phenotyping (e.g., assessing the relative frequency or the average protein expression of immune cell subsets), but not for quantitative assessment of CEV. Our analyses examined a large number of CEV parameters derived from diverse T cell subsets, and by taking advantage of multiple baseline measurements within individuals, we found that a large fraction of the CEV we assessed, including both discrete and continuous CEV, was stable within subjects over a timescale of months. By correlating the degree of this CEV to age, we found a number of significant associations that were independently replicated in another healthy cohort. Furthermore, we uncovered cevQTLs involving genetic polymorphisms previously linked to diseases such as asthma and rheumatoid arthritis, suggesting that CEV can be influenced by genetics in humans and that differences in the extent of CEV among individuals could potentially play an important role in shaping disease risk and development.
The framework we introduce and the associated software pipeline (available at https://heterogeneity.niaid.nih.gov) are applicable to the study of CEV—for example, utilizing the large amount of flow cytometry datasets already generated in immunology—in other immune cell types and can potentially be extended to other high-throughput single-cell data such as those obtained by RNA-seq and mass cytometry. To enable exploration of our dataset and analysis results by the community, we have created web-based interactive figures (iFigures) (Obermoser et al., 2013), accessible at the above URL.
RESULTS
Quantifying Cell-to-Cell Protein Expression Heterogeneity in Human Immune Cell Subsets
Our analysis framework (Figure 1) was developed with the baseline samples obtained in an influenza vaccination study involving healthy, unrelated human subjects (n = 61, aged 21–62 years, male/female = 25:36) (Table S1). These peripheral-blood mononuclear cell (PBMC) samples were collected 7 days before (day −7) and immediately prior to (day 0) vaccination (baselines 1 and 2, respectively), as well as on day 70 after vaccination (baseline 3); day 70 was used as an additional baseline because our original study showed that vaccination-induced changes had largely reverted by that time point (Tsang et al., 2014). For each sample, single-cell protein-expression profiles were acquired with multiple 15-color flow cytometry antibody panels (Table S2) (Biancotto et al., 2011). Here, we focused our analyses on a panel covering 28 manually gated T cell subpopulations (all were CD45+CD3+), including CD4+ and CD8+ T cells and subsets within these major populations, e.g., naive (CD45RA+), memory (CD45RA−), regulatory (CD25+), and activated (e.g., HLA-DR+) fractions (see Table S3 and Experimental Procedures). Analysis of multiple replicates of PBMC samples via our pipeline of generating flow cytometry data and manual gating indicated high technical reproducibility (see Figure S1B in Tsang et al., 2014). Although the level of protein expression is often used for gating cell subsets, here we were interested in studying the degree of cell-to-cell variation in gene expression—including both continuous (Figure 1A) and discrete (Figure 1B) CEV—within well-defined cell subsets and their potential functions and genetic underpinnings. Continuous CEV reflects cell-to-cell expression differences along a non-disjoint continuum, whereas discrete CEV arises from two or more discernably distinct clusters of cells expressing markedly different protein amounts, akin to how immune cell subsets are classically defined.
Figure 1. Quantification of Cell-to-Cell Variation in Protein or Gene Expression within a Cell Population.
For a Figure360 author presentation of Figure 1, see http://dx.doi.org/10.1016/j.immuni.2016.10.025#mmc6.
(A) The level of protein X varies from cell to cell in the depicted cell population, and its distribution among cells is shown; the degree of this variability (illustrated by the orange bar) can be quantified by metrics such as standard deviation (SD) and median absolute deviation (MAD).
(B) Data are similar to those in (A), but an example of “discrete” heterogeneity is shown instead of “continuous” heterogeneity; here, the distribution of expression levels is bimodal with two clusters of cells.
(C) A heterogeneity parameter (HP) is defined by the combination of a variance metric, a protein (or gene), and a cell population; the metric is applied to quantify the CEV of that protein in that population. Some HP examples from this study are listed.
(D) Our analysis framework and workflow. PBMC samples were collected at multiple time points from healthy human subjects as part of an influenza vaccination study. Individual PBMC samples were assayed with a 15-color flow cytometry panel designed for T cell phenotyping; the relative protein expression levels (based on fluorescence intensity) at the single-cell level were then extracted. To assess CEV within individual samples (each sample corresponds to one subject at one time point), we generated an “HP profile” by (1) manually gating cell subsets, resulting in 28 T cell subsets; (2) assembling a single-cell expression profile consisting of all measured proteins (except for the viability marker) for each cell subset; (3) computing HPs (see Supplemental Notations and Supplemental Definitions in the Supplemental Experimental Procedures) for each protein, cell-subset, and variance-metric combination; (4) assembling HPs across all proteins and cell subsets in a sample into a cell subset by protein matrix for downstream analyses; (5) assessing each HP for its degree of variability across baseline time points within individual subjects (“intra-subject”) and across subjects (“inter-subject”) after generating HP profiles for all samples; and (6) identifying and analyzing HPs deemed stable over the three baseline time points (7 days before, immediately before, and 70 days after vaccination; see Experimental Procedures) within individual subjects for potential association with other phenotypic parameters and genotypes in the cohort.
For each of the 28 gated T cell subsets in a sample, the CEV of individual proteins was quantified via both the standard deviation (SD) and the median absolute deviation (MAD) among all cells within the subset (see Supplemental Experimental Procedures on mathematical definitions and notations). Because outlier cells, such as those with very low or no apparent expression, can influence the values of SD, MAD was included as an additional variance measure to increase robustness against such outliers (see Supplemental Experimental Procedures). Because the SD and MAD were computed for each of the 14 measured proteins across single cells within each cell subset (the marker measuring cell viability was excluded), a large number of such metrics (herein referred to as heterogeneity parameters [HPs]) reflecting the CEV of different proteins in distinct cell subpopulations were derived from our dataset (Figure 1C). Furthermore, to capture the notion of “total” CEV summing across all proteins within a cell subset, we computed scale-independent versions of SD and MAD for each marker and summed them to obtain, respectively, the total SD and MAD. All together, we used our framework to define and analyze 840 HPs in this study (Figure 1D; Table S4A). Note that whereas CEV refers to the general concept of cell-to-cell expression variation within a cell population, an HP corresponds to a specific instantiation reflecting the CEV of a given protein (e.g., CD38) in a particular population of cells (e.g., naive CD4+ T cells) as captured by a quantitative measure of variance (SD, MAD, or total) (Figure 1C).
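The per-subset, per-protein computation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the nested-dictionary input layout, and the use of the unscaled MAD are assumptions made for illustration, and the scale-independent "total" SD and MAD (defined in the Supplemental Experimental Procedures) are not reproduced here.

```python
import statistics


def mad(values):
    """Unscaled median absolute deviation: median of |x - median(x)|."""
    values = list(values)
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def heterogeneity_parameters(cells_by_subset):
    """Compute SD- and MAD-based HPs.

    cells_by_subset (hypothetical layout):
        {subset name: {protein name: [per-cell expression values]}}
    Returns {(subset, protein, metric): value}.
    """
    hps = {}
    for subset, proteins in cells_by_subset.items():
        for protein, values in proteins.items():
            hps[(subset, protein, "SD")] = statistics.stdev(values)
            hps[(subset, protein, "MAD")] = mad(values)
    return hps


# Toy data: one outlier cell inflates the SD but leaves the MAD unchanged.
toy = {"naive CD4+": {"CD38": [1.0, 2.0, 3.0, 4.0, 100.0]}}
for key, value in heterogeneity_parameters(toy).items():
    print(key, round(value, 3))
```

In this toy example the single outlier cell drives the SD above 40 while the MAD stays at 1, which is the robustness property that motivates including MAD. As a consistency check, 28 subsets × 14 proteins × 2 metrics gives 784 HPs, and adding a total SD and a total MAD for each of the 28 subsets gives 840; this breakdown is inferred from the text and is not stated explicitly.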
To begin to assess where an HP lies along the continuous or discrete spectrum (Figures 1A and 1B), for each protein and cell subpopulation within a sample we inferred whether the single-cell expression distribution of the protein within the subpopulation had more than one mode (i.e., multi-modal) by using the dip unimodality test (Hartigan and Hartigan, 1985) (Table S4A and see Supplemental Experimental Procedures). Although the continuous or discrete status of HPs assessed in this manner can differ from subject to subject, a substantial number (184) of HPs had single-cell distributions supported by the dip test as continuous in all subjects across all baseline time points (Table S4A). Note that because of the inherent challenge of ascertaining the unimodality of single-cell distributions, the information provided by the dip test was used only to aid result interpretation, e.g., to help gauge where CEV showing significant association with a human phenotype lies along the discrete or continuous spectrum (see below).
To enable convenient navigation, visualization, and exploration of our data and analysis results, we have created web-based iFigures (https://heterogeneity.niaid.nih.gov). On our website, a user can, via point-and-click, select a specific subject, time point, and HP to inspect the underlying single-cell expression distributions of a protein in a particular cell subset. Single-cell distributions from subjects of interest can also be compared according to results from correlation analysis (see below). In addition to providing increased transparency, this resource can help motivate and enable further analyses of the data. A step-by-step overview of our data-processing and -transformation procedure is also provided on our website.
Cell-to-Cell Expression Heterogeneity within Individual Human Subjects
We first sought to examine the HP profile of T cell subsets within individual subjects (Figure 2; iFigure 2). Because the value of an HP can depend on the average expression level of the protein (e.g., highly expressed proteins tend to have higher absolute SDs and MADs; Newman et al., 2006), for this particular analysis only, we derived mean-independent versions of the HPs after adjusting for average expression so that HPs could be meaningfully compared across different cell subsets (see Supplemental Experimental Procedures).
Figure 2. The HP Profile of a Representative 23-Year-Old Female Subject.
(A) HPs capturing the CEV of 14 proteins in 28 T cell subsets. Because HPs can be correlated with the mean expression level of the protein in the cell subset, for visualization purposes, here a “mean-independent” version of the HPs is shown (see Supplemental Experimental Procedures). The rows are arranged according to the gating hierarchy. Heatmaps for other subjects can be viewed in our web-based iFigures (iFigure 3).
(B–D) HP examples illustrating the nature of CEV across cell subsets. (B) The expression variation of CD38 in CD8+CD25+ cells can be partially attributed to the presence of discrete clusters of cells, as evidenced by the multi-modal distribution of CD38 expression among the cells. (C and D) The CEV of the same protein in CD4+ T cells (C) and CCR7− cells of the total memory CD4+ T cells (D) were lower than the example in (B), and they exhibited continuous differences in CD38 expression level among cells without apparent discrete clusters.
As expected, the CEV of immune cell lineage markers, such as CD45 (marking leukocytes) and CD3 (marking T cells), was generally low and largely coherent across subjects and cell subsets (Figure 2A; Figures S1A and S1B). By contrast, the CEV of proteins whose expression more directly reflects cellular state, such as immune cell activation (e.g., HLA-DR), was higher than that of lineage markers within individual cell subsets and exhibited more variability among subjects (Figure S1). For example, the CEV of HLA-DR (major histocompatibility complex class II molecules) was consistently higher than that of CD45 across most cell subsets (Figure 2A; Figures S1A and S1C). This is not surprising given that positive CD45 expression was used for gating all T cell subsets. Biologically, HLA-DR expression in individual cells, unlike CD45 expression, is also likely to be substantially more dependent on external signals; cell-to-cell differences in the microenvironment or asynchrony in response to environmental signals could therefore lead to higher CEV. Furthermore, the CEV of non-lineage markers tended to vary substantially from one cell subset to another, e.g., HLA-DR had higher CEV within several CD8+ subsets than within the corresponding subpopulations gated by the same markers in the CD4+ lineage in a large fraction of the subjects (Figure 2A; Figure S1C). Thus, independently of average expression, the CEV of proteins marking functional or cell states can be substantially different across subsets of T cells within a person, potentially reflecting subset-to-subset differences in the need for CEV regulation across subpopulations of immune cells.
Examination of the distribution of protein expression across single cells revealed both continuous and discrete CEV (Figures 1A and 1B). For example, the CD25+ subpopulation of CD8+ T cells in the subject shown in Figure 2 was composed of at least two separable groups of cells characterized by low (or no) and high CD38 expression (Figure 2B). In contrast, some proteins showed apparently continuous CEV in this subject, e.g., CD38 expression spread across a continuum without discernible discrete clusters in either the CD4+ T cell or CCR7− effector memory CD4+ subpopulations (Figure 2C or 2D, respectively). These examples also highlight how CEV of the same protein can be quantitatively different across different cell subsets (Figures 2B–2D): the CEV of CD38 expression was highest in CD8+CD25+ cells among these three examples because this subpopulation contained at least two cell clusters marked by well-separated levels of CD38 expression—these clusters could potentially be novel subsets with distinct functional properties. There were also quantitative differences in the degree of CEV between the two continuous cases where CD38 heterogeneity was apparently higher in total CD4+ cells than in the CCR7− effector memory subset (Figures 2C and 2D). Thus, the extent of protein expression heterogeneity can be different from one cell subset to another as a result of a combination of discrete and continuous variation in protein expression among single cells.
A Large Number of HPs Exhibit Substantial Inter-subject Variation and Are Temporally Stable over a Timescale of Months
An important issue in studying CEV is to discern whether the observed cell-to-cell differences reflect the underlying biology of the system or arise from noise associated with sample collection and analyses. To address this question, we took advantage of the multiple baseline collections (days −7, 0, and 70) for each subject to assess inter-subject variability and temporal stability of HPs; HPs that exhibit relatively high inter-subject variability and are temporally stable within individual subjects over this timescale are more likely to reflect biologically relevant signals (after potential batch effects are accounted for; see Experimental Procedures). The specific criteria we used to assess temporal stability were as follows: (1) the HP should be significantly correlated between the pre-vaccination time points, i.e., individuals with higher levels of an HP on day 0 also tended to exhibit higher levels on day −7 (we required a pairwise Pearson correlation of at least 0.6 and a p value cutoff of 0.001 after accounting for experimental batch effects); and (2) the temporal (or intra-subject) variability of an HP over the three time points should be smaller (less than 50% of observed variability) than its subject-to-subject differences (Figure 3A; see also Experimental Procedures). More than 70% of the HPs we assessed passed these criteria, and most of them exhibited substantial inter-subject differences (Figure 3A; Table S4B). For example, the continuous CEV of CD38 expression in CD4+CD27+ cells was temporally stable and had substantial inter-subject variability (Figure 3, inset; Figures 3B and 3C)—see also Figure S2 for an example of a temporally less stable HP involving CD38 expression in a different cell subset. 
Thus, given that a substantial fraction of HPs remained relatively stable for a period of more than 2 months within individuals according to data obtained from three independent blood draws, technical noise (such as that arising from variation in blood draws and processing) is unlikely to be the sole or major contributor to the observed CEV.
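The two stability criteria can be sketched as follows. This is an illustrative simplification under stated assumptions: the between-subject share of variance is computed here as a simple one-way sum-of-squares ratio, which may differ from the decomposition used in the Experimental Procedures, and the p value cutoff and batch-effect adjustment are omitted. All function names and the input layout are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def between_subject_fraction(hp_by_subject):
    """Share of the total sum of squares explained by subject means.

    hp_by_subject: {subject: [HP on day -7, day 0, day 70]}
    """
    all_values = [v for vals in hp_by_subject.values() for v in vals]
    grand_mean = statistics.fmean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(vals) * (statistics.fmean(vals) - grand_mean) ** 2
        for vals in hp_by_subject.values()
    )
    return ss_between / ss_total


def is_stable(hp_by_subject, min_r=0.6, min_fraction=0.5):
    """Apply criterion (1) without the p value cutoff, and criterion (2)."""
    day_minus7 = [vals[0] for vals in hp_by_subject.values()]
    day_0 = [vals[1] for vals in hp_by_subject.values()]
    return (
        pearson(day_minus7, day_0) >= min_r
        and between_subject_fraction(hp_by_subject) > min_fraction
    )


stable_hp = {
    "s1": [1.0, 1.1, 0.9],
    "s2": [2.0, 2.1, 1.9],
    "s3": [3.0, 2.9, 3.1],
    "s4": [4.0, 4.2, 3.9],
}
print(is_stable(stable_hp))
```

The toy HP above varies far more between subjects than within them, so it passes both checks; an HP whose subject means are all equal would have a between-subject fraction of zero and fail.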
Figure 3. 70% of HPs Assessed Are Deemed Temporally Stable across Three Baseline Time Points Spanning More Than 2 Months.
(A) The histogram of the “within-subject stability scores,” i.e., percentage of variance of an HP explained by inter-subject variations, is shown for all of the HPs we assessed. Scatterplots of an example of a temporally stable HP (the SD of CD38 expression in CD4+CD27+ cells) show that the HP was highly correlated across time points. Multiple criteria were used for determining whether an HP was “stable”; the minimum cutoff required (>50%) for the percentage of HP variance explained by subject-to-subject differences is indicated (see main text and Experimental Procedures).
(B) The same HP (y axis) in each of the 61 subjects (x axis) is shown for all three baseline time points. The subjects are sorted in ascending order (from left to right) according to the average value of the HP across the three time points. The HPs were very similar across time points within individuals but varied substantially more across subjects.
(C) Single-cell distributions of CD38 expression in the same cell subset from two representative subjects. The length of the bar beneath the density indicates the SD. The distributions remained largely unchanged across the baseline time points within each subject, whereas the spread of the distribution was consistently lower in subject 257 than in subject 235.
Association between CEV and Age
Temporally stable HPs could reflect subject-specific immune states and therefore could potentially be useful for immune monitoring. Thus, we next searched for functional associations of temporally stable HPs by utilizing the extensive inter-subject differences we observed (Figure 3A) to look for possible correlations between HPs and phenotypic variables. Uncovering such correlations would also further support the notion that CEV is biologically relevant.
Aging is known to affect diverse immune functions, such as the attenuated vaccination responses in older individuals (Duraisingham et al., 2013), and it is also one of the few known correlates of CEV phenotypes in mammals: the degree of mRNA expression variation in mouse cardiac myocytes has been shown to be correlated with aging (Bahar et al., 2006). However, whether CEV within immune cell subpopulations is associated with age remains unexplored. Although a given HP, as shown above, can be temporally stable over a timescale of months, aging-associated changes typically occur over a longer timescale of years, and thus, some of our temporally stable HPs can be correlates of aging.
By using data from all three baseline time points and accounting for contributions from other factors (i.e., covariates), including the relative frequency of the cell subset, the average expression level of the protein within the subset, and experimental batch (see Experimental Procedures), we used mixed-effect modeling to assess the degree to which the inter-subject variation of an HP can be explained by age. Our analysis identified 122 significant associations (false-discovery rate [FDR] < 0.1) (Figure S3A), corresponding to 91 “collapsed” hits after the SD and MAD versions of HPs from the same cell-subset and protein-marker combination were merged (Table S5). We further evaluated the significant associations by using an independent flow cytometry dataset acquired from the same panel of markers from 34 distinct healthy subjects enrolled as control individuals in a chronic lymphocytic leukemia study (Biancotto et al., 2012; Table S1). We applied the same gating and analysis procedure to compute the HPs. After we removed cell subsets with too few cells (see Experimental Procedures), 92 of the 122 age associations were testable in the smaller validation cohort, and 23 (25%) of those were validated at a cutoff of FDR < 0.1. This validation rate is significantly higher than that expected by chance (p = 0.003, Fisher’s exact test).
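To illustrate how an FDR cutoff of 0.1 selects associations from a list of p values, the following sketch applies the Benjamini-Hochberg adjustment. The choice of Benjamini-Hochberg is an assumption, since the text above states only the FDR threshold, and the p values in the example are invented for illustration rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that each adjusted
    # value is no larger than the one at the next-higher rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p values for four hypothetical HP-age associations.
p_values = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.1 for q in q_values]
print(significant)

# Validation rate reported in the text: 23 of 92 testable associations.
print(23 / 92)
```

With these invented inputs, three of the four associations pass the cutoff. The final line simply reproduces the reported validation rate of 25%.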
The validated age-associated HPs included both discrete and continuous cases of CEV. Among the apparently discrete examples, the one involving CCR7 expression heterogeneity in CD4+CD25+ T regulatory (Treg) cells is of particular interest given the crucial role of CCR7 in regulating lymphocyte migration (Sallusto et al., 1999). Our data indicate that older subjects tended to have higher levels of CEV in CCR7 expression among CD25+ Treg cells than did younger subjects; the young had predominantly CCR7+ cells, whereas older individuals had both CCR7− and CCR7+ subsets (Figures 4A and 4B; iFigure 4). We next examined this difference together with the expression level of FOXP3, which is a classical marker for delineating so-called “naturally arising” (FOXP3+) Treg cells from “resting memory” (FOXP3−) cells within the larger CD4+CD25+ subset (Sakaguchi, 2005; Triplett et al., 2012). Our data confirmed the existence of these two subsets (Figures 4C–4E) and revealed that aging is associated with an increase in the frequency of CCR7−FOXP3+ cells. In this context, it is interesting to note that aging was previously found to be positively associated with the loss of CD8+ FOXP3+CCR7+ cells in humans (Suzuki et al., 2012). Given that CCR7 can regulate lymphocyte trafficking (Förster et al., 1999), our observation suggests that the migration profile of FOXP3+ Treg cells tends to become more heterogeneous as humans age, whereby an increasing fraction lose (or attenuate) CCR7 expression, thereby potentially changing their relative distribution in tissues and lymphoid organs.
Figure 4. An Age-Associated, “Discrete” HP Involving CCR7 Expression Variation in Treg Cells.
(A) A scatterplot showing the SD of CD197 (CCR7) expression in CD4+CD25+ T cells (Treg cells) (y axis) against age (x axis). To help visualize the association detected by our model, we show a version of the HP (the “partial residual”) after contributions from covariates were accounted for (the values shown correspond to the partial residuals from the fitted mixed-effect model averaged over three baseline time points; see Supplemental Experimental Procedures). See also iFigure 4, where a “raw” value version (instead of the partial residual) of this scatterplot can be viewed.
(B) Distribution of CCR7 expression across cells in the CD4+CD25+ subset is shown for three example subjects spanning the age range. The variation in CEV across subjects can be partially attributed to differences in the frequency of discrete cell clusters: older subjects tended to have a higher fraction of cells with lower CCR7 expression.
(C–E) Assessing CCR7 CEV together with FOXP3. Dot plots comparing FOXP3 and CCR7 expression in single cells are shown for subjects 204 (C), 251 (D), and 212 (E). Contour lines (yellow) indicate cell density in the dot plots.
CD38 is a cell-surface molecule that can be induced upon T cell activation (Mehta et al., 1996). The CEV of CD38 in several related, overlapping CD4+ T cell subsets exhibited continuous patterns of heterogeneity (unimodal in 100% of subjects according to the dip test; Tables S4A and S5B), and the magnitude of the CEV was negatively associated with age (Table S5). These subpopulations expressed one or both of CD45RA and CD27, both of which (or in combination) were thought to identify largely naive, antigen-inexperienced cells; on average, the CD27+ (ID33), CD45RA+ (ID34), and CD27+CD45RA+ (ID121) subsets composed 86%, 74%, and 66% of CD4+ T cells, respectively, within individuals in our cohort. In older subjects, the CEV of CD38 was lower than that in younger subjects in these cell subsets (illustrative examples involving the top hit, CD4+CD27+ cells, are shown in Figure 5A and Figure S3B, as well as in iFigure 4; see also Figure S4). Although the CEV of a protein can depend on the protein’s mean expression, here the average expression of CD38 among cells in this subset was already accounted for as a covariate in our mixed-effect model; average expression was indeed associated with neither age nor the CEV of CD38 in our cohort (Figure 5B; Figure S3C). It is tempting to speculate that a lower level of CD38 CEV could reflect a narrower phenotypic diversity among naive CD4+ T cells in older subjects. For example, in addition to marking activation, data from in vitro activation experiments (Scalzo-Inguanti and Plebanski, 2011) linked the level of CD38 expression on naive human CD4+ cells before activation to proliferation capacity and a cytokine-production profile after stimulation. Our observation thus provides an intriguing example of how continuous CEV in a primary cell population could be functionally connected to an organismal phenotype.
Figure 5. An Age-Associated, “Continuous” HP Involving CD38 Expression Variation in Naive CD4+ T Cells.
(A) As in Figure 4A, a scatterplot shows the partial residual of the SD of CD38 expression in CD4+CD27+ naive cells (y axis) against age (x axis). Distributions of CD38 expression from example subjects spanning the age range are shown as in Figure 4B. See also iFigure 4, where a “raw” value version (instead of the partial residual) of this scatterplot can be viewed.
(B) Partial residual of the HP (y axis) is plotted against the mean expression of CD38 (x axis). Subjects are colored according to age (see also Figure S3C).
The Genetic Underpinnings of CEV: Linking CEV to Genetic Variants Associated with Disease
We next sought to assess whether genetics can explain some of the inter-subject variation observed in temporally stable HPs by searching for cevQTLs. Given the relatively small size of our cohort, we limited our analysis to only 796 SNPs pre-selected on the basis of known association with human immune traits and susceptibility to infectious or autoimmune diseases according to genome-wide association studies (GWASs) (Table S6) (Welter et al., 2014).
Our analysis used data from two of the baseline time points (days −7 and 0) to discover associations and then assessed whether those could be validated with data from day 70. In particular, instead of using all three time points to assess temporal stability, we first identified a list of temporally stable HPs by applying the same procedure as above but using only data from days −7 and 0 because we wanted to reserve data from day 70 as independent measurements for validation. Among these temporally stable HPs, we found 13 significant HP-SNP associations (with FDR < 0.1 for both days 0 and −7) by using a model that explicitly accounts for age, gender, cohort population structure, mean protein expression within the cell subset, and the frequency of the subset from which the HP was derived (see Experimental Procedures), corresponding to ten unique cevQTLs when hits involving SD and MAD of the same cell-subset and protein-marker combination were merged. All ten cevQTLs were confirmed by independent measurements from day 70 samples (FDR <0.1; Tables S7A and S7B). For cevQTLs with fewer than 10% of subjects homozygous for the minor allele, we further ascertained the robustness of our findings by removing those homozygous subjects and reassessing association by using the same statistical model. Our analysis confirmed that all such associations remained statistically significant (Table S7D).
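The two-step bookkeeping described above (requiring significance on both discovery days, then merging SD and MAD hits for the same cell subset, protein, and SNP) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the dictionary layout, and all keys and FDR values are hypothetical placeholders.

```python
def discovery_hits(fdr_day_minus7, fdr_day0, cutoff=0.1):
    """Return HP-SNP pairs whose FDR is below the cutoff on BOTH discovery days."""
    shared = fdr_day_minus7.keys() & fdr_day0.keys()
    return {k for k in shared
            if fdr_day_minus7[k] < cutoff and fdr_day0[k] < cutoff}


def unique_cevqtls(hits):
    """Collapse SD and MAD hits for the same (subset, protein, SNP) into one cevQTL."""
    return {(subset, protein, snp) for subset, protein, _statistic, snp in hits}


# Toy, made-up FDR values; keys are (subset, protein, statistic, SNP).
fdr_m7 = {
    ("subsetA", "proteinX", "SD", "snp1"): 0.02,
    ("subsetA", "proteinX", "MAD", "snp1"): 0.04,
    ("subsetB", "proteinY", "SD", "snp2"): 0.30,  # fails on day -7
}
fdr_0 = {
    ("subsetA", "proteinX", "SD", "snp1"): 0.05,
    ("subsetA", "proteinX", "MAD", "snp1"): 0.08,
    ("subsetB", "proteinY", "SD", "snp2"): 0.01,
}

hits = discovery_hits(fdr_m7, fdr_0)   # two HP-SNP associations survive
cevqtls = unique_cevqtls(hits)         # they merge into a single cevQTL
```

In this toy input, the SD and MAD hits for the same subset, protein, and SNP count as two HP-SNP associations but as one cevQTL, which mirrors how 13 associations correspond to ten unique cevQTLs in the text.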
The ten cevQTLs we identified included SNPs known to be associated with diseases such as multiple sclerosis, asthma, rheumatoid arthritis, and Crohn’s disease. For example, we uncovered a positive association between the CEV of HLA-DR in CD8+ T cells and the number of the minor, protective allele (G) of rs1588265. This SNP lies in an intron of the gene PDE4D on chromosome 5 and has been linked to asthma (Himes et al., 2009) (Figures 6A and 6B; Figure S5A; iFigure 5). Notably, it resides on a different chromosome than the HLA locus and thus potentially acts in trans to shape the expression variability of HLA-DR, akin to how trans-eQTLs can affect the average expression of genes residing in distant chromosomal locations from the QTL (Westra et al., 2013). Our findings suggest that increased HLA-DR CEV is associated with increased protection against disease. Although the mean level of HLA-DR expression in CD8+ T cells was also highly variable across subjects (e.g., see Figure 6B), it was not significantly associated with rs1588265 (Figure 6C; Figures S5B and S5C). Given that HLA-DR is often considered a marker of T cell activation, our finding suggests that the quantitative extent of cell-to-cell variation in the activation status of CD8+ T cells could be linked to asthma susceptibility. Further work will need to assess whether this association suggests a causal, reactive, or independent mechanism (Schadt et al., 2005), i.e., whether lowering HLA-DR CEV can cause asthma, whether asthma drives decreased HLA-DR CEV, or whether the genetic polymorphism is associated with both asthma and HLA-DR CEV via independent pathways; a combination of these scenarios is also possible depending on the underlying molecular and cellular networks.
Figure 6. Two Examples of HPs Associated with SNPs Linked to Disease Susceptibility.
(A) The SD of HLA-DR expression in CD8+ T cells (y axis) from day 0 is plotted against the number of the minor, disease-protective allele of SNP rs1588265 (x axis). The partial residual of HP is shown as in Figure 4A to account for covariates (see Supplemental Experimental Procedures). See also iFigure 5 for plots showing data from other time points; a “raw” value version (instead of the partial residual) of this plot can also be viewed.
(B) Distribution of HLA-DR expression in CD8+ T cells is shown for example subjects with two, one, or zero copies (top to bottom) of the minor allele.
(C) Scatterplot showing HP (the partial residual) against mean expression; subjects are colored by genotype (see also Figure S5B).
(D) Data are similar to those in (A), but the SD of CD38 expression in CD4+CD45RA+ T cells (ID34) (the partial residual) is shown against the number of the minor, protective allele at SNP rs1588265.
(E) Data are similar to those in (B): the distribution of CD38 expression among cells in cell subset ID34 is shown for subjects with 2, 1, or 0 copies (top to bottom) of the minor allele.
(F) Data are similar to those in (C), but CD38 in CD4+CD45RA+ T cells (ID34) is shown (see also Figure S6B).
Another interesting example involved a positive association between the asthma-associated SNP discussed above (rs1588265) and the continuous CEV of CD38 expression in CD4+CD45RA+ naive T cells (unimodal in 100% of subjects according to the dip test; Figures 6D and 6E; Figure S6A; iFigure 5; Tables S4A and S7B). This association is, again, statistically significant independently of mean expression (Figure 6F; Figures S6B and S6C) and age, even though age itself was correlated with this and related HPs, as discussed above (see CD38 CEV of CD4+CD27+ cells in Figure 5 and that of a related population [CD4+CD45RA+ cells] in Figure S4; see also iFigure 4). We also evaluated, but did not detect, a significant association between this SNP and the frequency of CD38+ cells within several functionally important CD4+ T cell subsets (including Treg cells and total, effector, and central memory cells), suggesting that our observation cannot be simply attributed to differences in the frequency of discrete cell clusters defined by the extent of CD38 expression within well-known T cell subsets. Together, our findings put forth the notion that not only can genetic variants affect the mean expression level of a gene, but they can also shape the degree of discrete and continuous expression variability from one cell to another in the cell population, which in turn could be a marker for, or might even play a mechanistic role in, modulating disease susceptibility and development.
DISCUSSION
Here, we have introduced a conceptual and analytic framework for systematically exploring and quantitatively assessing CEV in peripheral immune cells in human populations. In addition to providing web-based iFigures to enhance transparency and to enable further exploration of our data and analysis results, we have made our dataset available for further analysis and computational method development (https://heterogeneity.niaid.nih.gov). Furthermore, the software pipeline we developed for computing HPs and its associated documentation can be downloaded and applied to other flow cytometry datasets, including those from previous studies available in ImmPort (Bhattacharya et al., 2014) and FlowRepository (Spidlen et al., 2012), to provide potential insights into CEV functions.
Our analyses of CEV phenotypes involving 14 proteins in 28 peripheral T cell subpopulations in a healthy cohort suggest that such phenotypes can exhibit substantial subject-to-subject differences and yet remain relatively stable within individuals over a timescale of 2 months. Thus, they can be considered a new class of quantitative traits and biomarkers potentially reflective of an individual’s biological states and therefore potentially useful for human immune monitoring. Supporting this notion and the idea that CEV is functionally relevant, we uncovered age and genetic associations of temporally stable CEVs, thus implicating them as potential biomarkers of disease susceptibility and aging. However, we note that our statistical power for detecting temporally stable HPs and age or genetic associations was still limited by the relatively small cohort size, short longitudinal span of baseline time points (~2 months), and non-uniform age distributions (more young than older subjects). In addition, despite careful compensation, the CEV of a protein computed from high-dimensional flow cytometry data could partially reflect crosstalk from another marker; thus, additional checks, such as assessing whether the mean expression of another protein can explain away an association, can be helpful (e.g., we performed such checks for the associations highlighted above, and all remain significant [data not shown]). Despite these limitations, many HPs exhibit convincing signs of being temporally stable over months, and we were able to confirm genetically associated HPs by using a set of independent samples collected more than 2 months apart from the discovery samples, as well as replicate a significant fraction of the age associations by using an independent cohort.
In addition to identifying discrete CEV classically used for defining immune cell subsets, we identified temporally stable, continuous CEV involving, for instance, CD38 expression in CD27+ and CD45RA+ T helper cells. These observations are consistent with recent findings that heterogeneity within immune cell populations is often manifested on a continuous scale (Satija and Shalek, 2014): human B cells tend to mature along a continuum of phenotypic states (Bendall et al., 2011), T cell activation response varies as a result of continuous differences in the expression level of signaling components across cells (Feinerman et al., 2008), and the combinatorial patterns of cytokine expression in key human CD8+ T cell subsets “blur” the discrete boundaries previously thought to exist among these subpopulations into a continuum of intermediate states (Newell et al., 2012). Such continuous cell-to-cell variation in protein expression has been consistently observed but is often ignored in the analysis of flow cytometry data; thus, its biological significance has been underexplored. Our finding that continuous CEV can be associated with aging and genetics implies that it could have biological functions and be used to mark personal immune states. In addition, some of the temporally stable discrete CEV we uncovered, particularly that associated with age or genetics, could point to novel subpopulations within the T cell subsets we analyzed. Although dividing CEV into discrete and continuous groups is conceptually intuitive and helps interpretation, these two types are more reflective of the two ends of a spectrum along which cell populations lie. For example, in principle even cell populations exhibiting seemingly smooth, continuous distributions in the expression of a particular set of proteins can potentially comprise a large number of small, functionally distinct cell subsets (Chang et al., 2008).
Some of the associations between CEV and age could reflect hallmarks of immunological aging, such as chronic inflammation in tissues (Chung et al., 2009) or the cumulative exposure to infection. For instance, given that CCR7− Treg cells can accumulate in inflamed sites (Menning et al., 2007) and tend not to enter secondary lymph nodes, where they inhibit memory or naive T cell responses (Campbell and Koch, 2011; Schneider et al., 2007), our observation of increased CCR7 CEV (due to higher fractions of CCR7− cells) in older subjects suggests that as humans age, Treg cells could spend more time in peripheral sites to potentially regulate local inflammation. Consistent with this idea, the skin of older subjects, for example, is known to have more FOXP3+ Treg cells than that of younger adults, and this increase is associated with the prevalent adoption of an M2, anti-inflammatory phenotype by skin-resident macrophages (Agius et al., 2009). Interestingly, CCR7 expression heterogeneity is also known to be different between naive and effector- or memory-like Treg cells, the latter of which tend to have, at least in mice, elevated frequencies of CCR7− cells (Menning et al., 2007). Thus, given that older individuals presumably have cumulatively encountered more infections and therefore generated more memory populations, their Treg cells could become more heterogeneous and possess more CCR7− cells.
It is intriguing that the continuous CEV of CD38 in several largely overlapping CD4+ T cell subsets (marked by CD45RA and/or CD27, both of which are thought to identify largely naive, antigen-inexperienced cells) is a correlate of both age and, in an independent manner, an asthma-associated genetic variant (rs1588265) identified from GWASs. Both younger subjects and those carrying protective alleles of the genetic variant tend to have increased variability in CD38 expression among naive CD4+ T cells. Although this observation could reflect enrichment of CD38+ cells in other overlapping cell subsets, e.g., Treg cells and memory cells, we did not detect significant associations between the frequency of CD38+ fractions within major CD4+ T cell subsets and age or the genetic variant (rs1588265). CD38 is classically thought of as a marker for antigenic activation, yet there is also evidence that CD4+CD38+ T helper cells in humans are enriched with naive cells (marked by CD45RA expression), and upon activation in vitro, these cells tend to be hypo-proliferative and secrete IL-13, whereas their CD38− counterparts, again after in vitro stimulation, are more proliferative and secrete Th1-associated cytokines, such as IFN-γ and TNF-α (Scalzo-Inguanti and Plebanski, 2011). Thus, individuals with elevated CD38 CEV in naive CD4+ T cells could have a broader spectrum of response upon antigen encounter, perhaps spreading along the continuous spectrum from Th1- to Th2-like responses (Antebi et al., 2013) rather than the discrete Th1-Th2 divisions classically used to define responses from CD4+ T cells. How a continuous, more heterogeneous response can potentially be associated with beneficial phenotypes—in this case, younger and presumably healthier immune states as well as decreased asthma susceptibility—remains to be explored. 
Perhaps akin to the “bet-hedging” strategy employed by bacterial populations (Acar et al., 2008; Arkin et al., 1998; Kussell and Leibler, 2005; Thattai and van Oudenaarden, 2004), having more diverse responses to fluctuating environmental signals, e.g., uncertain timing of encounters with pathogens and allergens, would help ensure that there are appropriate responses from at least some cells. It is also possible that what matters most are the integrated responses across cells, e.g., the balance between inflammatory and tolerogenic responses, which can be shaped by the extent of pre-existing cell-to-cell heterogeneity. Our observations raise the possibility that continuous cellular heterogeneity phenotypes could be important markers and determinants of health and disease.
Genetic variants are known to regulate the frequencies of peripheral immune cell subsets in humans (Orrù et al., 2013; Roederer et al., 2015), as well as the average transcript and protein levels in a cell population (Cookson et al., 2009; Gilad et al., 2008; Rockman and Kruglyak, 2006), including major immune cell types such as T and dendritic cells (Lee et al., 2014; Raj et al., 2014; Ye et al., 2014). In the single-celled yeast Saccharomyces cerevisiae, a few genetic variants have been reported to be associated with the extent of CEV of a fluorescent protein reporter (Ansel et al., 2008); such CEV is often attributed primarily to stochastic gene expression (Raj and van Oudenaarden, 2008), particularly for an isogenic population of single-celled organisms growing in well-mixed culture conditions. However, whether (and if so, how) this finding extends to endogenous genes and to complex multi-cellular organisms with diverse cell types and subpopulations is unclear. Here, we uncovered evidence that independently of mean expression and cell-subset frequency, the continuous expression variation among cells in a T cell subpopulation in humans can be under genetic influence. Thus, evolutionary forces can shape, potentially via distinct genetic variants, the average size of a cell subpopulation, the mean expression level of a gene in that subpopulation, and its expression variation among cells. Note that the cevQTLs we identified here involve CEV in a population of cells (i.e., each person has a distinct CEV value for each combination of cell subset and protein), and thus they are distinct, both in concept and in biological basis, from the QTLs found earlier to be associated with the inter-subject variance of average gene expression in lymphoblastoid cell lines derived from HapMap individuals (Hulse and Cai, 2013). Nonetheless, it will be interesting to explore potential connections between these two distinct notions of variability in humans.
Our observation that CEV can be associated with immune-disease-linked variants that affect neither a gene’s average expression level nor the frequency of the T lymphocyte subpopulation from which the CEV derived suggests that alterations in CEV alone can be functionally consequential. Genetics can shape both discrete and continuous CEV via a variety of potential molecular mechanisms, such as by tuning rates of transcription, mRNA degradation, or translation without changing mean expression or by controlling the strength of regulatory feedback loops to achieve the optimal distribution of cell subsets with distinct phenotypes (Raj and van Oudenaarden, 2008). Thus, the genetic diversity of the human population can be funneled through myriad molecular mechanisms to shape CEV phenotypes in different cell types and organs to ultimately affect human health and disease.
Our findings also suggest that CEV phenotypes can be effective biomarkers for precision medicine (Collins and Varmus, 2015). In particular, given the prevalent involvement of the immune system in diverse pathologies (Germain and Schwartzberg, 2011), temporally stable HPs of immune cell populations, such as those we identified here in peripheral T cells, are potentially a good source of biomarkers. In this context, it is interesting to note that the phenotypic effect of genetic mutations can differ depending on the expression level of certain genes (Vu et al., 2015). Although this phenomenon was reported on the basis of comparing the effect of mutations across two strains of a species, it is likely that the effect of a mutation can differ across single cells depending on the expression level of certain transcripts in the cell. Thus, in principle, the extent of CEV in a cell population can affect the overall expressivity of a mutation. For example, lowering the CEV of a transcript could buffer the effect of a mutation across all cells, whereas increased CEV could cause some cells to cross the expression threshold in some genes and start expressing the phenotype of the mutation. Therefore, achieving accurate prediction of disease progression and treatment outcomes might require modeling of the interactions between CEVs and genetics, in addition to other interactive effects (e.g., gene-environment interactions). As single-cell profiling technologies continue to advance at a rapid pace, the study of CEV phenotypes in larger human cohorts should help further reveal their biologic functions and potential utility for enabling precision medicine.
EXPERIMENTAL PROCEDURES
Further details can be found in the Supplemental Experimental Procedures.
Human Protocol
As described in Tsang et al. (2014), healthy volunteers over the age of 18 years were enrolled in NIH protocol 09-H-0239 (Clinicaltrials.gov: NCT01191853), which was approved and monitored by NIH institutional review boards in accordance with the Declaration of Helsinki. Further details can be found in the Extended Experimental Procedures of Tsang et al. (2014).
Flow Cytometry Data Processing
PBMC sample collection and processing are described in Tsang et al. (2014). The present study focused on data generated from the “T1” flow cytometry panel for assessing T cells. We compensated and transformed flow data according to standard practice and gated 28 T cell subsets by using a predefined hierarchy.
Computing HPs
For each cell subset and each protein, we calculated two types of HPs, the SD and the MAD, of the single-cell expression distribution. To capture the total heterogeneity of expression in a cell subset, we summed up the SD (or MAD) parameters of all proteins in the same subset (after proper scaling so that each protein contributed comparably; see Supplemental Experimental Procedures).
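As a rough illustration of the two HP statistics, the sketch below computes the SD and the MAD for one protein in one gated subset. It is not the authors' software. It assumes the sample SD (n − 1 denominator) and an unscaled median absolute deviation, neither of which is specified here; the function name and the toy values are hypothetical, and the cross-protein scaling used for total heterogeneity is left to the Supplemental Experimental Procedures.

```python
import statistics


def sd_and_mad(expression_values):
    """Return (SD, MAD) for one protein's single-cell values in one cell subset.

    Assumptions: sample SD (n - 1 denominator) and an unscaled MAD,
    i.e., the median of absolute deviations from the median.
    """
    sd = statistics.stdev(expression_values)
    center = statistics.median(expression_values)
    mad = statistics.median([abs(x - center) for x in expression_values])
    return sd, mad


# Toy transformed intensities for five cells (made-up numbers).
sd, mad = sd_and_mad([1.0, 2.0, 3.0, 4.0, 10.0])
```

With the single high-expressing cell in this toy input, the SD is pulled upward while the MAD stays small, which is one reason for reporting both statistics.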
Assessing Temporal Stability
For each HP, we partitioned its total variability (i.e., sum of squares) for all subjects over all time points into three components: experimental batches, subjects, and remaining variations. An HP was deemed stable if (1) the batch and subject components were <50% and ≥50%, respectively, (2) the Pearson correlation between days −7 and 0 was greater than 0.6, and (3) to guard against batch effects in inter-day correlations, the batch-corrected correlation between days 0 and −7 was significant (see Supplemental Experimental Procedures).
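The decision rule above can be written down compactly. The sketch below takes the variance fractions and the outcome of the batch-corrected test as inputs, because the partitioning and batch-correction procedures themselves are described only in the Supplemental Experimental Procedures; the function and argument names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_temporally_stable(batch_fraction, subject_fraction,
                         r_day_minus7_vs_day0, corrected_corr_significant):
    """Apply the three stability criteria to one HP.

    batch_fraction / subject_fraction: shares of the total sum of squares
    attributed to experimental batch and to subject (values in [0, 1]).
    corrected_corr_significant: outcome of the batch-corrected correlation
    test, computed elsewhere.
    """
    return (batch_fraction < 0.5
            and subject_fraction >= 0.5
            and r_day_minus7_vs_day0 > 0.6
            and corrected_corr_significant)
```

For example, an HP with 10% batch variance, 70% subject variance, an inter-day correlation of 0.8, and a significant batch-corrected correlation would be called stable, whereas the same HP with a correlation of exactly 0.6 would not.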
Identifying Age Associations
We tested for associations between HPs and age by using mixed-effect models. For each HP involving cell subset C and protein P, we fitted a model with age, gender, the relative size of C, and the mean expression of P in C as fixed effects and subject and batch as random effects. We used data from all time points and excluded samples with <70% viable cells or <50 cells in subset C to minimize the influence from samples with too few cells. We used the likelihood-ratio test (LRT) for p value calculation to compare the model with a simpler one without the age term. Because we tested multiple HPs, we estimated the FDR at each p value cutoff threshold by using a permutation-based procedure (see Supplemental Experimental Procedures) and reported age-associated HPs with FDR < 0.1. For validation, we used a separate cohort of 34 unrelated, healthy subjects. PBMCs were collected and processed according to the same procedure. To avoid subsets with very few cells, we tested an HP only if it was defined on a subset where at least 20 subjects in the validation cohort had at least 50 cells. Because of the smaller size of the validation dataset (only one time point and fewer subjects), we divided the subjects into two age groups (< or ≥ 40 years) and used the t test to perform validation. HPs below the FDR cutoff of 0.1 were deemed validated.
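For readers unfamiliar with the LRT step, the sketch below shows how a p value is obtained once the full model and the model without the age term have been fitted. It assumes the standard asymptotic chi-square reference distribution with one degree of freedom (one dropped term), which is not stated explicitly here; model fitting is not shown, and the function name and log-likelihood values are hypothetical.

```python
import math


def lrt_pvalue_one_term(loglik_full, loglik_reduced):
    """P value for dropping a single term, from the two maximized log-likelihoods.

    Assumes the statistic 2 * (ll_full - ll_reduced) follows a chi-square
    distribution with 1 degree of freedom, whose upper tail equals
    erfc(sqrt(statistic / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return math.erfc(math.sqrt(statistic / 2.0))


# Made-up log-likelihoods for models with and without the age term.
p_age = lrt_pvalue_one_term(-100.0, -103.0)
```

Because the models are nested, the full model's log-likelihood cannot be lower than the reduced model's in exact arithmetic; the `max(..., 0.0)` guard only protects against small numerical differences.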
Identifying HP-SNP Associations
We selected immune-related SNPs from the NHGRI GWAS Catalog (circa 2011) (Welter et al., 2014). We tested association between an HP and a SNP by using a linear model and included the same covariates as the age association analysis above, but we also included the top three principal components capturing the demographic structure (computed with EIGENSTRAT from the 698,893 genotyped SNPs passing our quality filters: call rate > 95% and MAF ≥ 0.01) (Patterson et al., 2006). We filtered samples according to the same criteria as in the age analysis and used the Matrix eQTL package to fit the model to data from each time point (Shabalin, 2012). Statistical significance was assessed with an LRT comparing the above model with a simpler one without the SNP term. We controlled for the FDR by using a permutation-based procedure (see Supplemental Experimental Procedures). We reported HPs on the basis of an FDR cutoff of 0.1 on both days −7 and 0. For validation, we applied the same procedure to the data from day 70 and reported results on the basis of an FDR cutoff of 0.1.
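The SNP quality filters quoted above (call rate > 95% and MAF ≥ 0.01) can be illustrated with a short sketch. This is not the authors' code: it assumes genotypes are coded as allele counts (0, 1, or 2) with `None` for a missing call and that the filters are applied per SNP across subjects; the function name is hypothetical.

```python
def passes_snp_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Check one SNP against call-rate and minor-allele-frequency thresholds.

    genotypes: per-subject allele counts (0, 1, or 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    allele_freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    maf = min(allele_freq, 1.0 - allele_freq)      # minor allele is the rarer one
    return call_rate > min_call_rate and maf >= min_maf
```

The same 0/1/2 coding corresponds to the "number of the minor allele" used as the SNP term in the association results, so a SNP that passes this filter can enter the linear model as a single numeric predictor.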
Supplementary Material
Highlights.
Developed framework for assessing cell-to-cell expression variation (CEV)
CEV is often variable among but temporally stable within individuals over months
Identified peripheral T cell CEV as a correlate of aging
CEV can potentially be regulated by disease-associated genetic polymorphisms
ACKNOWLEDGMENTS
We thank Adrian Wiestner for providing healthy samples for age-association validation; Candace Liu for help with figures and software testing; Pamela Schwartzberg, Nicolas van Panhuys, and Ronald Germain for comments on the manuscript; and members of the J.S.T. lab and the Trans-NIH Center for Human Immunology (CHI) for discussions. This work was funded by the intramural program of the NIAID and of NIH institutes supporting the CHI.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Supplemental Experimental Procedures, six figures, and seven tables and can be found with this article online at http://dx.doi.org/10.1016/j.immuni.2016.10.025.
REFERENCES
- Acar M, Mettetal JT, and van Oudenaarden A (2008). Stochastic switching as a survival strategy in fluctuating environments. Nat. Genet. 40, 471–475.
- Agius E, Lacy KE, Vukmanovic-Stejic M, Jagger AL, Papageorgiou A-P, Hall S, Reed JR, Curnow SJ, Fuentes-Duculan J, Buckley CD, et al. (2009). Decreased TNF-α synthesis by macrophages restricts cutaneous immunosurveillance by memory CD4+ T cells during aging. J. Exp. Med. 206, 1929–1940.
- Altschuler SJ, and Wu LF (2010). Cellular heterogeneity: do differences make a difference? Cell 141, 559–563.
- Ansel J, Bottin H, Rodriguez-Beltran C, Damon C, Nagarajan M, Fehrmann S, François J, and Yvert G (2008). Cell-to-cell stochastic variation in gene expression is a complex genetic trait. PLoS Genet. 4, e1000049.
- Antebi YE, Reich-Zeliger S, Hart Y, Mayo A, Eizenberg I, Rimer J, Putheti P, Pe’er D, and Friedman N (2013). Mapping differentiation under mixed culture conditions reveals a tunable continuum of T cell fates. PLoS Biol. 11, e1001616.
- Arkin A, Ross J, and McAdams HH (1998). Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics 149, 1633–1648.
- Bahar R, Hartmann CH, Rodriguez KA, Denny AD, Busuttil RA, Dollè MET, Calder RB, Chisholm GB, Pollock BH, Klein CA, and Vijg J (2006). Increased cell-to-cell variation in gene expression in ageing mouse heart. Nature 441, 1011–1014.
- Bendall SC, Simonds EF, Qiu P, Amir AD, Krutzik PO, Finck R, Bruggner RV, Melamed R, Trejo A, Ornatsky OI, et al. (2011). Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 332, 687–696.
- Bhattacharya S, Andorf S, Gomes L, Dunn P, Schaefer H, Pontius J, Berger P, Desborough V, Smith T, Campbell J, et al. (2014). ImmPort: disseminating data to the public for the future of immunology. Immunol. Res. 58, 234–239.
- Biancotto A, Fuchs JC, Williams A, Dagur PK, and McCoy JP Jr. (2011). High dimensional flow cytometry for comprehensive leukocyte immunophenotyping (CLIP) in translational research. J. Immunol. Methods 363, 245–261.
- Biancotto A, Dagur PK, Fuchs JC, Wiestner A, Bagwell CB, and McCoy JP Jr. (2012). Phenotypic complexity of T regulatory subsets in patients with B-chronic lymphocytic leukemia. Mod. Pathol. 25, 246–259.
- Campbell DJ, and Koch MA (2011). Phenotypical and functional specialization of FOXP3+ regulatory T cells. Nat. Rev. Immunol. 11, 119–130.
- Chang HH, Hemberg M, Barahona M, Ingber DE, and Huang S (2008). Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature 453, 544–547.
- Chung HY, Cesari M, Anton S, Marzetti E, Giovannini S, Seo AY, Carter C, Yu BP, and Leeuwenburgh C (2009). Molecular inflammation: underpinnings of aging and age-related diseases. Ageing Res. Rev. 8, 18–30.
- Collins FS, and Varmus H (2015). A new initiative on precision medicine. N. Engl. J. Med. 372, 793–795.
- Cookson W, Liang L, Abecasis G, Moffatt M, and Lathrop M (2009). Mapping complex disease traits with global gene expression. Nat. Rev. Genet. 10, 184–194.
- Duraisingham SS, Rouphael N, Cavanagh MM, Nakaya HI, Goronzy JJ, and Pulendran B (2013). Systems Biology of Vaccination in the Elderly. In Systems Biology, MG Katze, ed. (Berlin, Heidelberg: Springer), pp. 117–142.
- Eldar A, and Elowitz MB (2010). Functional roles for noise in genetic circuits. Nature 467, 167–173.
- Feinerman O, Veiga J, Dorfman JR, Germain RN, and Altan-Bonnet G (2008). Variability and robustness in T cell activation from regulated heterogeneity in protein levels. Science 321, 1081–1084.
- Fleming WH, Alpern EJ, Uchida N, Ikuta K, Spangrude GJ, and Weissman IL (1993). Functional heterogeneity is associated with the cell cycle status of murine hematopoietic stem cells. J. Cell Biol. 122, 897–902.
- Förster R, Schubel A, Breitfeld D, Kremmer E, Renner-Müller I, Wolf E, and Lipp M (1999). CCR7 coordinates the primary immune response by establishing functional microenvironments in secondary lymphoid organs. Cell 99, 23–33. [DOI] [PubMed] [Google Scholar]
- Geissmann F, Manz MG, Jung S, Sieweke MH, Merad M, and Ley K (2010). Development of monocytes, macrophages, and dendritic cells. Science 327, 656–661.
- Germain RN, and Schwartzberg PL (2011). The human condition: an immunological perspective. Nat. Immunol. 12, 369–372.
- Gilad Y, Rifkin SA, and Pritchard JK (2008). Revealing the architecture of gene regulation: the promise of eQTL studies. Trends Genet. 24, 408–415.
- Hartigan JA, and Hartigan PM (1985). The Dip Test of Unimodality. Ann. Stat. 13, 70–84.
- Himes BE, Hunninghake GM, Baurley JW, Rafaels NM, Sleiman P, Strachan DP, Wilk JB, Willis-Owen SAG, Klanderman B, Lasky-Su J, et al. (2009). Genome-wide association analysis identifies PDE4D as an asthma-susceptibility gene. Am. J. Hum. Genet. 84, 581–593.
- Hulse AM, and Cai JJ (2013). Genetic variants contribute to gene expression variability in humans. Genetics 193, 95–108.
- Kussell E, and Leibler S (2005). Phenotypic diversity, population growth, and information in fluctuating environments. Science 309, 2075–2078.
- Lee MN, Ye C, Villani A-C, Raj T, Li W, Eisenhaure TM, Imboywa SH, Chipendo PI, Ran FA, Slowikowski K, et al. (2014). Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science 343, 1246980.
- Marusyk A, Almendro V, and Polyak K (2012). Intra-tumour heterogeneity: a looking glass for cancer? Nat. Rev. Cancer 12, 323–334.
- Mehta K, Shahid U, and Malavasi F (1996). Human CD38, a cell-surface protein with multiple functions. FASEB J. 10, 1408–1417.
- Menning A, Höpken UE, Siegmund K, Lipp M, Hamann A, and Huehn J (2007). Distinctive role of CCR7 in migration and functional activity of naive- and effector/memory-like Treg subsets. Eur. J. Immunol. 37, 1575–1583.
- Newell EW, Sigal N, Bendall SC, Nolan GP, and Davis MM (2012). Cytometry by time-of-flight shows combinatorial cytokine expression and virus-specific cell niches within a continuum of CD8+ T cell phenotypes. Immunity 36, 142–152.
- Newman JRS, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, DeRisi JL, and Weissman JS (2006). Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature 441, 840–846.
- Obermoser G, Presnell S, Domico K, Xu H, Wang Y, Anguiano E, Thompson-Snipes L, Ranganathan R, Zeitner B, Bjork A, et al. (2013). Systems scale interactive exploration reveals quantitative and qualitative differences in response to influenza and pneumococcal vaccines. Immunity 38, 831–844.
- Orrù V, Steri M, Sole G, Sidore C, Virdis F, Dei M, Lai S, Zoledziewska M, Busonero F, Mulas A, et al. (2013). Genetic variants regulating immune cell levels in health and disease. Cell 155, 242–256.
- Patterson N, Price AL, and Reich D (2006). Population structure and eigenanalysis. PLoS Genet. 2, e190.
- Raj A, and van Oudenaarden A (2008). Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216–226.
- Raj T, Rothamel K, Mostafavi S, Ye C, Lee MN, Replogle JM, Feng T, Lee M, Asinovski N, Frohlich I, et al. (2014). Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science 344, 519–523.
- Rebhahn JA, Deng N, Sharma G, Livingstone AM, Huang S, and Mosmann TR (2014). An animated landscape representation of CD4+ T-cell differentiation, variability, and plasticity: insights into the behavior of populations versus cells. Eur. J. Immunol. 44, 2216–2229.
- Rockman MV, and Kruglyak L (2006). Genetics of global gene expression. Nat. Rev. Genet. 7, 862–872.
- Roederer M, Quaye L, Mangino M, Beddall MH, Mahnke Y, Chattopadhyay P, Tosi I, Napolitano L, Terranova Barberio M, Menni C, et al. (2015). The genetic architecture of the human immune system: a bioresource for autoimmunity and disease pathogenesis. Cell 161, 387–403.
- Sakaguchi S (2005). Naturally arising Foxp3-expressing CD25+CD4+ regulatory T cells in immunological tolerance to self and non-self. Nat. Immunol. 6, 345–352.
- Sallusto F, Lenig D, Förster R, Lipp M, and Lanzavecchia A (1999). Two subsets of memory T lymphocytes with distinct homing potentials and effector functions. Nature 401, 708–712.
- Satija R, and Shalek AK (2014). Heterogeneity in immune responses: from populations to single cells. Trends Immunol. 35, 219–229.
- Scalzo-Inguanti K, and Plebanski M (2011). CD38 identifies a hypo-proliferative IL-13-secreting CD4+ T-cell subset that does not fit into existing naive and memory phenotype paradigms. Eur. J. Immunol. 41, 1298–1308.
- Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, Monks S, Reitman M, Zhang C, et al. (2005). An integrative genomics approach to infer causal associations between gene expression and disease. Nat. Genet. 37, 710–717.
- Schneider MA, Meingassner JG, Lipp M, Moore HD, and Rot A (2007). CCR7 is required for the in vivo function of CD4+ CD25+ regulatory T cells. J. Exp. Med. 204, 735–745.
- Shabalin AA (2012). Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358.
- Spidlen J, Breuer K, Rosenberg C, Kotecha N, and Brinkman RR (2012). FlowRepository: a resource of annotated flow cytometry datasets associated with peer-reviewed publications. Cytometry A 81, 727–731.
- Stegle O, Teichmann SA, and Marioni JC (2015). Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet. 16, 133–145.
- Suzuki M, Jagger AL, Konya C, Shimojima Y, Pryshchep S, Goronzy JJ, and Weyand CM (2012). CD8+CD45RA+CCR7+FOXP3+ T cells with immunosuppressive properties: a novel subset of inducible human regulatory T cells. J. Immunol. 189, 2118–2130.
- Thattai M, and van Oudenaarden A (2004). Stochastic gene expression in fluctuating environments. Genetics 167, 523–530.
- Triplett TA, Curti BD, Bonafede PR, Miller WL, Walker EB, and Weinberg AD (2012). Defining a functionally distinct subset of human memory CD4+ T cells that are CD25POS and FOXP3NEG. Eur. J. Immunol. 42, 1893–1905.
- Tsang JS (2015). Utilizing population variation, vaccination, and systems biology to study human immunology. Trends Immunol. 36, 479–493.
- Tsang JS, Schwartzberg PL, Kotliarov Y, Biancotto A, Xie Z, Germain RN, Wang E, Olnes MJ, Narayanan M, Golding H, et al.; Baylor HIPC Center; CHI Consortium (2014). Global analyses of human immune variation reveal baseline predictors of postvaccination responses. Cell 157, 499–513.
- Vu V, Verster AJ, Schertzberg M, Chuluunbaatar T, Spensley M, Pajkic D, Hart GT, Moffat J, and Fraser AG (2015). Natural Variation in Gene Expression Modulates the Severity of Mutant Phenotypes. Cell 162, 391–402.
- Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, Klemm A, Flicek P, Manolio T, Hindorff L, and Parkinson H (2014). The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 42, D1001–D1006.
- Westra H-J, Peters MJ, Esko T, Yaghootkar H, Schurmann C, Kettunen J, Christiansen MW, Fairfax BP, Schramm K, Powell JE, et al. (2013). Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243.
- Ye CJ, Feng T, Kwon H-K, Raj T, Wilson MT, Asinovski N, McCabe C, Lee MH, Frohlich I, Paik HI, et al. (2014). Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665.