Abstract
Phenotypic heterogeneity has been widely observed in cellular populations. However, the extent to which heterogeneity contains biologically or clinically important information is not well understood. Here, we investigated whether patterns of basal signaling heterogeneity, in untreated cancer cell populations, could distinguish cellular populations with different drug sensitivities. We modeled cellular heterogeneity as a mixture of stereotyped signaling states, identified based on colocalization patterns of activated signaling molecules from microscopy images. We found that patterns of heterogeneity could be used to separate the most sensitive and resistant populations to paclitaxel within a set of H460 lung cancer clones and within the NCI-60 panel of cancer cell lines, but not for a set of less heterogeneous, immortalized noncancer human bronchial epithelial cell (HBEC) clones. Our results suggest that patterns of signaling heterogeneity, characterized as ensembles of a small number of distinct phenotypic states, can reveal functional differences among cellular populations.
Keywords: cancer, heterogeneity, multivariate analysis, signaling, systems biology
Introduction
Phenotypic heterogeneity is a commonly observed phenomenon in biology (Elsasser, 1984; Rubin, 1990). The physiological importance of phenotypic heterogeneity within cellular populations has been poorly understood. However, a growing body of evidence suggests that heterogeneity, even within clonal populations, may have functional consequences, such as effects on survival odds or on homeostatic responses to fluctuating environments, pathogen invasion, or drug treatment (Luria and Delbruck, 1943; Balaban et al, 2004; Anderson et al, 2006; Suel et al, 2007; Chang et al, 2008; Cohen et al, 2008; Feinerman et al, 2008; Gascoigne and Taylor, 2008; Wilson et al, 2008). Many studies have focused on identifying a molecular basis for the origins of observed heterogeneity (Snijder et al, 2009; Spencer et al, 2009). However, regardless of its origins, there are many intriguing questions regarding whether heterogeneity contains biological information. Is heterogeneity a reproducible property of cellular populations? At what resolution should heterogeneity be examined? Do different patterns of heterogeneity reflect functional differences among cellular populations? And does heterogeneity observed with different readouts contain similar information?
We chose cancer as a biological context to investigate whether information is contained in cellular heterogeneity. Classically, cancer cells have been shown to exhibit a high degree of heterogeneity in phenotypes, such as signaling and drug response (Heppner, 1984; Rubin, 1990; Anderson et al, 2006; Ichim and Wells, 2006; Campbell and Polyak, 2007; Cohen et al, 2008; Gascoigne and Taylor, 2008). In practice, this phenotypic heterogeneity is often ignored as ‘noise’ or viewed as an impediment to understanding the response of cancer cells to drugs. Determining the response of cancer cell populations to drug perturbations is an important challenge in basic and clinical research. Promising results based on population-averaged methods have come from large-scale profiling of genomes (van ‘t Veer et al, 2002), mRNAs, and miRNAs across different cancer populations (van ‘t Veer et al, 2002; Lu et al, 2005; Nagrath et al, 2007; Schlabach et al, 2008). When specific drug-response pathways are known, directed studies of mutational heterogeneity among cancer populations can also be effective in searching for signatures of resistance (Choi et al, 2007; Engelman et al, 2007; Liegl et al, 2008). These approaches require pooling analytes from many cancer cells, which obscures information that might be encoded as cellular heterogeneity within a cancer population. Recent studies have begun quantifying cellular variability within single cancer populations after perturbation with drug treatment (Cohen et al, 2008; Gascoigne and Taylor, 2008; Slack et al, 2008; Brock et al, 2009; Spencer et al, 2009). However, it is unknown what information can be revealed through characterization of heterogeneity before treatment, and further, whether such measures can be reliably related to the drug sensitivities of cancer populations.
Understanding the relevance—if any—of cellular diversity to cancer requires quantitative approaches for relating patterns of heterogeneity to functional outcomes, such as drug sensitivity. In practice, close examination of any cellular population will reveal heterogeneity, and it is a challenge to identify which components of phenotypic variability contain functionally important information. Developments in high-content imaging and flow cytometry have enabled the comparison of heterogeneity across multiple populations and conditions (Singh et al, 2003; Kiel et al, 2005; Wang et al, 2007; Kotecha et al, 2008; Slack et al, 2008). Image-based methods can capture phenotypic heterogeneity arising from the spatial distribution of signaling molecules within individual cells and, ultimately, be extended to account for other, higher-order determinants of in vivo heterogeneity, including spatial organization and microenvironment within healthy and diseased tissues.
Earlier, we developed a quantitative, image-based approach to characterize heterogeneity observed within and among cellular populations, based on patterns of signaling marker colocalization (Slack et al, 2008). The heterogeneous responses of drug-treated cancer populations were characterized as mixtures of phenotypically distinct subpopulations, each modeled around a ‘stereotyped’ cellular phenotype. Patterns of heterogeneous responses were shown to be reproducible, and models of heterogeneity—based on a limited, but nontrivial number of subpopulations—were shown to be sufficient to distinguish different classes of drugs based on their mechanism of action. Here, in complement to our previous study, we investigated the extent to which patterns of basal signaling heterogeneity, present within cancer populations before treatment, revealed information about population-level response to drug perturbation. In this case, we used prediction of population drug sensitivity as an objective measure of the degree to which our decomposition of heterogeneity contained biological information.
Results
Experimental approach for capturing heterogeneity of basal signaling states
Determining which aspects of heterogeneity contain information requires a collection of populations with diverse outcomes for a specific functional readout. We initiated our studies by generating a collection of 49 low-passage clonal populations from the highly metastatic non-small cell lung cancer cell line H460 (Supplementary Figure 1A) (Ichim and Wells, 2006). Consistent with earlier studies of clonal populations, variability among the H460 clones was observed for functional readouts such as growth rate, total cell count, local cell density, and cell morphology (Supplementary Figure 2) (Heppner, 1984; Carney et al, 1985). This collection of cancer populations, with similar genetics and cell type, therefore, provided an ideal test bed for our investigations.
Which cellular readouts should be selected to capture heterogeneity? One approach is to select specific biomarkers that target conjectured or known links between cellular mechanism and functional outcome (Snijder et al, 2009). However, the focus of our study was to identify signatures of heterogeneity that may be informative in the context of diverse cancer types. Therefore, we took an alternative approach and selected combinations of general signaling readouts to capture the heterogeneity of cellular populations in ‘basal’ (untreated) conditions. Four multiplexed immunofluorescent marker sets (MS) were chosen and studied independently (Supplementary Table 1; MS1: DNA/pSTAT3/pPTEN; MS2: DNA/pERK/pP38; MS3: DNA/E-cadherin/pGSK3-β/β-catenin; and MS4: DNA/pAkt/H3K9-Ac). These biomarkers, selected to monitor the activity levels of key signal transduction components associated with diverse areas of cancer biology (Bremnes et al, 2002; Pandolfi, 2004; Zhou et al, 2004; Haura et al, 2005; Normanno et al, 2006; Stewart et al, 2006; Barre et al, 2007; Rocques et al, 2007) enabled us to obtain a snapshot of the ensemble of cellular signaling states present within our clonal cancer populations.
Identification of common cellular signaling stereotypes
A wide range of signaling phenotypes was observed within and across untreated clonal populations based on immunofluorescent microscopy images of MS1. Although some clones appeared by eye to be phenotypically similar to the parent, other clones appeared quite different (Figure 1B; Supplementary Figure 1B). In addition, within each clone we observed cells with diverse signaling patterns as defined by marker intensity and colocalization (Supplementary Figures 1C and D). However, closer inspection of all 50 cancer populations suggested that most cell phenotypes fell into a relatively small number of signaling ‘stereotypes’, and that each stereotype was present, in varying proportions, within all clones (Supplementary Figures 1B–D and 3). These observations suggested that each clonal population could be characterized as a mixture of a small number of common signaling stereotypes.
Figure 1.
Non-small cell lung cancer H460 clones exhibit a high degree of phenotypic heterogeneity. (A) (Top) Cellular heterogeneity can be characterized as a mixture of phenotypically distinct subpopulations using a Gaussian mixture model (GMM). Shown is the result of computing a ‘reference’ GMM of five subpopulations. Points in GMM scatter plots correspond to individual cells, visualized through feature representation and PCA reduction to two dimensions. Colored ellipses represent covariance 1 s.d. from the mean for each Gaussian cluster (see Supplementary information); cells in this (and all subsequent) scatter plot are colored by the subpopulation of maximum probability. (Bottom) Images of four representative cells from each computed subpopulation are shown. (B) Clones display phenotypically diverse signaling states as measured by activation and colocalization patterns of pSTAT3 and pPTEN immunostaining. Although some clones are phenotypically similar to the parent (e.g. clone 65), others are dramatically dissimilar to the parent (e.g. clone 100). (C) Heterogeneity observed in each clonal population is summarized with a subpopulation profile: a vector estimating the proportion of cells in each subpopulation. (D) Cell populations with similar overall distributions of marker intensities may have dramatically different proportions of subpopulations. Stacked histograms of subpopulation intensities are shown for the parent and two clones. Colors correspond to subpopulations identified in (A). Black outlines correspond to overall histogram; vertical lines indicate population medians. Pseudocolors for images in (A) are specified above the scatter plots. Pseudocolors for images in (B) are: DNA-blue, pSTAT3-green, pPTEN-red. Scale bars: 20 μm. MS1 refers to marker set 1.
To capture common signaling stereotypes among the clones, we applied a previously developed approach for approximating cellular distributions as mixtures of subpopulations, which is unbiased by prior knowledge of cell- or marker-specific phenotypes (Supplementary information) (Slack et al, 2008). In summary, we analyzed each MS independently as follows. We applied automated cell segmentation to our image data (Loo et al, 2007), extracted cellular features from ratios of marker intensities at every pixel within a cellular region, and identified a small number (∼10 or fewer) of ‘maximally informative’ signaling features by principal component analysis (PCA) (Supplementary information) (Turk and Pentland, 1991; Slack et al, 2008). These PCA-based features were used in all subsequent analysis (though only the first two dimensions, PC1 and PC2, are used for visualization).
Approximately 4000 cells were analyzed per MS and per clone (∼200 000 cells per MS). For each MS, a ‘reference’ set of cells was generated by randomly subsampling ∼10% of cells from all the 50 H460 populations. Finally, each reference set was represented as a mixture of subpopulations, modeled as Gaussian distributions with means centered on distinct, ‘stereotyped’ signaling states (Figure 1A, top panel; Supplementary information). (Other choices of distributions for approximating local, high-density regions of cellular feature space, such as skew t-distributions (Pyne et al, 2009), could be made in future studies. Such choices may provide better approximations when the distributions are not normally distributed or may better model specific biological phenotypes.) We then used our mixture model to assign to each cell a probability of belonging to each subpopulation. These probabilities were used for all subsequent analysis, though for visualization purposes cells were assigned to the subpopulation of highest probability (Figure 1A, scatter plot).
The heterogeneity of each cell population was estimated by using our computed reference subpopulation model (Slack et al, 2008). In brief, the (posterior) probability of each cell belonging to the identified subpopulations was computed using Bayes’ rule and represented as a probability vector whose entries summed to one. An expected overall proportion of each subpopulation was computed by averaging these probability vectors over the cell population to obtain a subpopulation profile. Replicates were averaged to obtain a single final profile of subpopulation fractions per condition. In essence, these profiles $(p_1, \ldots, p_k)$ yielded a decomposition of each population, $D$, as a weighted mixture, $D \approx \sum_{s=1}^{k} p_s D_s$, of the $k$ reference subpopulation distributions, $D_s$. These profiles provided interpretable summarizations of heterogeneity present within the clones, and captured differences in subpopulation fractions, such as those due to enrichment of cells into different phenotypic states and/or general population shifts.
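The profile computation described above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes a single feature axis and one-dimensional Gaussian components, whereas the study used multivariate Gaussians over PCA-based features. All function names, mixture parameters, and cell values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a one-dimensional normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior(x, weights, means, sds):
    """Bayes' rule: P(subpopulation s | cell x) for every component s."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


def subpopulation_profile(cells, weights, means, sds):
    """Average the per-cell posterior vectors over one cell population."""
    sums = [0.0] * len(weights)
    for x in cells:
        for s, p in enumerate(posterior(x, weights, means, sds)):
            sums[s] += p
    return [v / len(cells) for v in sums]


# Hypothetical reference model with three subpopulations (assumed values).
weights = [0.5, 0.3, 0.2]
means = [-2.0, 0.0, 3.0]
sds = [1.0, 1.0, 1.0]

# Hypothetical feature values for six cells of one clone.
clone_cells = [-2.1, -1.8, 0.2, 2.9, 3.3, -0.1]
profile = subpopulation_profile(clone_cells, weights, means, sds)
```

Because each posterior vector sums to one, the averaged profile also sums to one and can be read as the expected fraction of cells in each subpopulation.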
To evaluate the optimal number of subpopulations, we applied two standard model fit criteria: the Bayesian information criterion and the gap statistic. These standard performance metrics evaluate models by rewarding fit to data, but penalize overfitting due to increased model complexity. Our results suggested that cellular heterogeneity among all 50 H460 populations in our four MS could be reasonably modeled by a low number (3–7) of signaling stereotypes (Supplementary Figure 4). For convenience, in subsequent analysis we chose to use reference models of five subpopulations for all MS; this choice is in line with the estimates of model fit, and allowed us to test whether a small number of subpopulations could capture information contained in cellular heterogeneity.
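As a rough illustration of how such a criterion trades fit against complexity, the sketch below computes the Bayesian information criterion in its standard form (number of parameters times the log of the sample size, minus twice the log-likelihood) for Gaussian mixtures of different sizes. The log-likelihood values, cell count, and dimensionality are invented for illustration and are not results from the study; the gap statistic is not shown.

```python
import math


def gmm_parameter_count(k, d):
    """Free parameters of a k-component, d-dimensional full-covariance GMM:
    (k - 1) mixing weights, k*d means, and k*d*(d+1)/2 covariance entries."""
    return (k - 1) + k * d + k * d * (d + 1) // 2


def bic(log_likelihood, n_params, n_cells):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return n_params * math.log(n_cells) - 2.0 * log_likelihood


# Hypothetical log-likelihoods of fitted models (assumed values, not study data).
candidate_log_likelihoods = {3: -41250.0, 5: -40100.0, 7: -40060.0}
n_cells, n_features = 20000, 10

scores = {
    k: bic(ll, gmm_parameter_count(k, n_features), n_cells)
    for k, ll in candidate_log_likelihoods.items()
}
best_k = min(scores, key=scores.get)
```

In this made-up example the seven-component model fits slightly better than the five-component one, but not by enough to offset its additional parameters, so the criterion selects five.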
Examination of representative cells from the five identified subpopulations (chosen near the mean of each Gaussian distribution) revealed consistent and significant differences in the activation levels of key signaling proteins (Figure 1A, bottom panel). (We noted that the subpopulations were not particularly enriched for specific cell-cycle states; Supplementary Figure 5.) Importantly, identification of these subpopulations revealed dramatic differences in heterogeneity among clones that were not easily distinguished on the basis of population-level statistics of average cellular marker expression alone. For example, clone #65, clone #100, and the parent have essentially indistinguishable means and relatively similar distributions of intensities for pSTAT3 and pPTEN in MS1 (Figure 1B–D). However, the mixture of subpopulations for clones #100, 65 and the parent were distinct (though #65 appears closer to the parent mixture than #100) (Figure 1C and D). These small collections of subpopulation phenotypes provided an intermediate (less complex than single cell but more informative than population average) resolution for examining and comparing heterogeneity observed among our H460 clones.
Comparison of heterogeneity across clonal cancer populations
We next compared heterogeneity observed across our entire collection of H460 clones. We began by studying cellular heterogeneity observed with MS1, and then made use of the other marker sets (MS2–4) to test the dependence of our findings on our initial choices of readouts. Differences in heterogeneity among the clones could be seen as differences in fractions of cells in each of the five subpopulations (Figure 2, thumbnail images and scatter plots). To assess the variation of signaling heterogeneity among the clones, we transformed the subpopulation profiles of the clones to reflect their log-fold enrichment of subpopulations compared with the parent, and grouped the profiles by hierarchical clustering based on their Euclidean distances (Supplementary information). Interestingly, clustering of the enrichment profiles revealed a relatively small number of distinct patterns (or ‘signatures’) of signaling heterogeneity (MS1 is shown in Figure 2; MS1–4 are shown in Supplementary Figure 6). In addition, subpopulation profiles from replicates of the same clone were much more similar to each other on average than replicates of clones selected from different clusters, indicating that our proposed measures of heterogeneity were experimentally reproducible (Supplementary Figure 8). Thus, cell-to-cell variation was captured by a few signaling stereotypes common to all the clonal populations and, further, only a few distinct patterns of heterogeneity were observed within our collection of clonal populations. Our decomposition of observed cell signaling heterogeneity provided an approach to visualize the diversity of heterogeneity among our clones, succinctly encapsulate the apparent complexity of cancer phenotypes, and compare clones at a resolution greater than provided by population means.
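The enrichment transform and distance used for clustering can be sketched as below. This is an illustrative approximation under stated assumptions: the base-2 logarithm and the small pseudocount (added to avoid taking the log of zero) are choices made here, and the profile values and clone names are hypothetical.

```python
import math


def enrichment_profile(clone, parent, pseudocount=1e-3):
    """Log2 ratio of clone to parent subpopulation fractions.
    The pseudocount is an assumption of this sketch, not taken from the study."""
    return [
        math.log2((c + pseudocount) / (p + pseudocount))
        for c, p in zip(clone, parent)
    ]


def euclidean(a, b):
    """Euclidean distance between two enrichment profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical five-subpopulation profiles (each sums to one).
parent = [0.20, 0.20, 0.20, 0.20, 0.20]
clones = {
    "clone_a": [0.40, 0.10, 0.20, 0.20, 0.10],
    "clone_b": [0.38, 0.12, 0.20, 0.19, 0.11],
    "clone_c": [0.05, 0.35, 0.20, 0.10, 0.30],
}

enriched = {name: enrichment_profile(p, parent) for name, p in clones.items()}
d_ab = euclidean(enriched["clone_a"], enriched["clone_b"])
d_ac = euclidean(enriched["clone_a"], enriched["clone_c"])
```

A pairwise distance matrix built this way is the usual input to a hierarchical clustering routine; here `clone_a` and `clone_b` would merge first because their enrichment patterns are nearly identical.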
Figure 2.
Distinct patterns of signaling heterogeneity can be compared across H460 clones. Shown are results computed using marker set 1 (DNA/pSTAT3/pPTEN); pseudocolors for the thumbnail images are as in Figure 1B. At the top are the representative GMM scatter plots from eight clones and the parent (P) culture. Below are thumbnail images of each clone. Yellow/blue heat map shows enrichment/de-enrichment of subpopulations (rows) for all clones (columns). Profiles are computed as the log ratio of clone subpopulation proportions relative to the parent. Clone clustering is determined by hierarchical clustering (dendrogram at bottom). The dendrogram is plotted to produce decreasing average sensitivity to paclitaxel (colored squares above the heat map; Supplementary information). Paclitaxel sensitivity is scored relative to the parent and displayed in red (resistant) and green (sensitive) color scale (gray: paclitaxel-sensitivity scores of clones 33 and 35 are unreliable due to an image-focus problem.) (Supplementary information). Similar results using marker sets 2–4 are shown in Supplementary Figure 6.
Classification of drug sensitivity from patterns of signaling heterogeneity
Do patterns of subpopulation mixtures reflect functional differences among the clones? It is known that not all cancer subpopulations respond equally to drugs (Tang et al, 2007; Cohen et al, 2008; Gascoigne and Taylor, 2008). Hence, we wondered whether clones with similar patterns of pre-existing heterogeneity would have similar drug sensitivities. The H460 cancer populations were given identical 48 h treatments with the chemotherapeutic drugs paclitaxel (10 nM) and doxorubicin (1 μM). Cells were then fixed and stained with standard markers for apoptosis, and an index of drug sensitivity for each clone relative to the parent was computed based on the log ratios of remaining nonapoptotic cell counts; negative (or positive) values indicated greater drug resistance (or sensitivity) than the parent (Supplementary information). We observed that clones with similar patterns of heterogeneity tended to have similar drug sensitivities (Supplementary Figure 3). As most clones had similar sensitivities to paclitaxel and doxorubicin, we carried our analysis forward using only paclitaxel. Hierarchical clustering (MS1 is shown in Figure 2, MS1–4 are shown in Supplementary Figures 6 and 7) and multidimensional scaling (Figure 3A) (Borg and Groenen, 1997) of subpopulation profiles revealed striking separation of paclitaxel-sensitive from paclitaxel-nonsensitive clones. (As expected, cells stained without primary antibodies, but with secondary antibodies plus Hoechst alone, yielded no separation; Supplementary Figure 9.) Thus, heterogeneity of cellular signaling states observed in our untreated H460 clones contained information that captured sensitivity to drug treatment.
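One way to realize a sensitivity index with the sign convention stated above is sketched below. The exact formula is given only in the Supplementary information, so this is an assumption: the index is taken here as the base-2 log of the parent's remaining nonapoptotic count over the clone's, with no normalization for seeding density or growth rate. The function name and counts are hypothetical.

```python
import math


def relative_sensitivity(clone_surviving, parent_surviving):
    """log2(parent / clone) of remaining nonapoptotic cell counts.
    Positive: fewer clone cells survive than parent cells (more sensitive).
    Negative: more clone cells survive than parent cells (more resistant).
    This formula is an assumed reading of the text, not the study's definition."""
    if clone_surviving <= 0 or parent_surviving <= 0:
        raise ValueError("cell counts must be positive")
    return math.log2(parent_surviving / clone_surviving)


# Hypothetical counts after treatment.
sensitive_index = relative_sensitivity(clone_surviving=500, parent_surviving=1000)
resistant_index = relative_sensitivity(clone_surviving=2000, parent_surviving=1000)
```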
Figure 3.
Clones with similar patterns of subpopulation profiles have similar drug sensitivities. Clone IDs and relative sensitivities to paclitaxel are as in Figure 2. (A) Paclitaxel-sensitive clones can be separated from nonsensitive clones based on patterns of subpopulation profiles. Multidimensional scaling (MDS) is used to visualize the subpopulation vectors of the H460 populations with respect to the Kullback–Leibler divergence measure (Supplementary information). Solid black squares indicate replicates of parental clone from the seven imaging plates; filled circles indicate clones; gray open circles (clones 33 and 35) indicate unreliable sensitivity scores. (B) The closest three neighbors of a clone tend to have similar drug sensitivities across all markers. Clones are sorted from least to greatest sensitivity to paclitaxel. Heat map indicates the number of nearest neighbors to each clone that are among the top 10-most sensitive (top panel) or resistant (bottom panel) to paclitaxel. (C) Clones of similar paclitaxel sensitivity tend to be phenotypically similar across all four marker sets. Thumbnails of clones (columns) labelled in (A) are shown for all four marker sets (rows). Columns are grouped from left to right by decreasing sensitivity to paclitaxel. Scale bar: 20 μm.
To what extent does the separation of drug sensitivities based on patterns of pre-existing heterogeneity depend on MS choice? We observed that the nearest neighbors of a clone in one MS were often close neighbors in the other MS (Supplementary Figure 10); there were ∼20 clones whose three-nearest neighbors in MS1 remained close in MS2 (>6-fold more than expected by chance). Further, the sets of nearest neighbors of a clone across marker sets tended to have similar average drug sensitivities, independent of our choice of MS (Figure 3B). Conversely, clones of similar drug sensitivities tended to have similar phenotypes across all marker sets (Figure 3C; Supplementary Figures 3 and 10). The consistency of information across signaling markers and clones suggested the possibility that similar patterns of cellular heterogeneity were reflective of ‘deeper’ similarities of underlying regulatory networks.
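The neighbor comparison across marker sets can be illustrated with the sketch below. It assumes a symmetrized Kullback–Leibler divergence as the dissimilarity between profiles (the text names the Kullback–Leibler measure but the symmetrization and the small `eps` guard are choices made here). The profiles and clone names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete profiles; eps guards against zero entries."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); an assumption of this sketch."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def nearest_neighbors(profiles, name, n=3):
    """Names of the n clones whose profiles are closest to the named clone."""
    ranked = sorted(
        (symmetric_kl(profiles[name], prof), other)
        for other, prof in profiles.items()
        if other != name
    )
    return {other for _, other in ranked[:n]}


def shared_neighbors(profiles_a, profiles_b, name, n=3):
    """Neighbors of a clone that are retained between two marker sets."""
    return nearest_neighbors(profiles_a, name, n) & nearest_neighbors(profiles_b, name, n)


# Hypothetical three-subpopulation profiles for five clones in two marker sets.
ms1 = {"c1": [0.60, 0.30, 0.10], "c2": [0.55, 0.35, 0.10], "c3": [0.58, 0.30, 0.12],
       "c4": [0.10, 0.20, 0.70], "c5": [0.15, 0.15, 0.70]}
ms2 = {"c1": [0.70, 0.20, 0.10], "c2": [0.65, 0.25, 0.10], "c3": [0.20, 0.20, 0.60],
       "c4": [0.10, 0.30, 0.60], "c5": [0.15, 0.25, 0.60]}

retained = shared_neighbors(ms1, ms2, "c1", n=2)
```

Counting how many clones retain their neighbors, and comparing that count with what random neighbor assignments would give, is one way to express the kind of fold-over-chance figure quoted in the text.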
How separable are the collections of ‘sensitive’ and ‘resistant’ subpopulation profiles? We computed the accuracy of separating these two classes of profiles using a linear support vector machine (Supplementary information). Our complete set of H460 clones had separation accuracies between ∼70 and 76% for our MS (Figure 4A-I, MS1–4). However, separation accuracies between sets of clones with ‘extreme’ sensitivities were much higher (∼80–100% for the 10 or 20 most sensitive and resistant clones) (Figure 4A-I, MS1–4; Supplementary Figure 11A). A repeat experiment gave similar results. However, as may be expected from other studies of clones (Chang et al, 2008), we observed that separation accuracy of our low-passage H460 clones decreased over the period of a month (Supplementary Table 5). Finally, to assess the predictive value of our model of H460 heterogeneity, we recomputed separation accuracies using a leave-one-out strategy (Supplementary information). Prediction accuracies for the complete and ‘extreme’ sets of clones were similar to, though slightly lower than, the full separation accuracies (66–73% and 80–90%, respectively) (Supplementary Figure 12) across MS1–4. Thus, clones with extreme opposite sensitivities had distinct and separable patterns of heterogeneity: distinct patterns of heterogeneity reflected functional divergence.
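The leave-one-out strategy can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the linear support vector machine used in the study; only the cross-validation loop is meant to mirror the text. The profiles and labels are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train_profiles, train_labels, profile):
    """Nearest-centroid prediction (a stand-in for a linear SVM)."""
    centroids = {
        label: centroid([p for p, y in zip(train_profiles, train_labels) if y == label])
        for label in sorted(set(train_labels))
    }
    return min(centroids, key=lambda label: squared_distance(centroids[label], profile))


def leave_one_out_accuracy(profiles, labels):
    """Hold out each clone once, train on the rest, and score the held-out prediction."""
    hits = 0
    for i in range(len(profiles)):
        train_profiles = profiles[:i] + profiles[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += predict(train_profiles, train_labels, profiles[i]) == labels[i]
    return hits / len(profiles)


# Hypothetical subpopulation profiles for six clones with 'extreme' sensitivities.
profiles = [[0.50, 0.30, 0.20], [0.55, 0.25, 0.20], [0.45, 0.35, 0.20],
            [0.10, 0.30, 0.60], [0.15, 0.25, 0.60], [0.20, 0.20, 0.60]]
labels = ["sensitive"] * 3 + ["resistant"] * 3

accuracy = leave_one_out_accuracy(profiles, labels)
```

Because the held-out clone never contributes to the centroids it is scored against, this accuracy estimates prediction on unseen clones rather than separability of the full set, which is why it is expected to be somewhat lower.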
Figure 4.
Models of H460 lung cancer heterogeneity can be used to classify sensitivity to paclitaxel for other cancer populations. (A) Accuracies of separating paclitaxel-resistant and -sensitive collections of cell populations based on their subpopulation profiles by a linear SVM (random separation: 50%; perfect separation: 100%). Columns correspond to marker sets; rows correspond to different pairs of sensitive and resistant groups of cell populations. ‘All:’ all populations grouped into either resistant or sensitive classes; ‘Extreme 2N:’ populations only included when in the N-most sensitive or resistant populations. All subpopulations are computed based on H460 reference model. ¶Accuracy not statistically significant (P>0.05). †Accuracy not 1 s.d. above the average accuracy over all possible permutations of resistant/sensitive assignments (Supplementary Figure 11). *The least resistant cell line was not used for SVM analysis to create a balanced (four resistant, four sensitive) data set (Supplementary information). (B) Noncancerous HBEC clones display less diversity than the panel of H460 clones. (Top panel) HBEC clones show reduced ranges of drug sensitivities (bottom panel) and dissimilarity among phenotypic profiles compared with H460 clones. Reference model for bottom panel is built by sampling both HBEC and H460 clones; the number of subpopulations is varied from 3 to 14. Error bars are 90% confidence intervals based on bootstrapping (Supplementary information). (C) Drug sensitivity among diverse cancer populations can be separated by subpopulation profiles. The H460 reference model was used to compute subpopulation profiles for nine adherent cell lines with the most extreme GI50 values for paclitaxel within the NCI-60 panel.
Classification of drug sensitivity in diverse cell populations
We wondered whether the phenotypic diversification and separation of drug sensitivity by cellular heterogeneity would also hold for a collection of noncancer clone populations. We applied our H460 subpopulation model to a panel of 75 noncancerous, immortalized human bronchial epithelial cell (HBEC) clones (Vaughan et al, 2006) stained with MS1 and MS4. The HBEC clones showed reduced ranges of overall heterogeneity and drug sensitivities compared with the H460 cancer clones (Figure 4B), as monitored by our assay, and showed no significant separation, even when tested on the subset of clones with extreme paclitaxel sensitivities (Figure 4A-II; Supplementary Figure 11B). In addition, separation was poor even after building an HBEC reference model of heterogeneity (data not shown). These results were consistent with the expectation that cancer is associated with increased phenotypic heterogeneity compared with normal cells (Heppner, 1984; Rubin, 1990; Campbell and Polyak, 2007; Gascoigne and Taylor, 2008). Thus, in contrast with the H460 cancer clones, among these noncancer HBEC clones, heterogeneity provided no additional information for separating functional differences, presumably due to greater similarity among founder cells and/or more tightly regulated ranges of signaling states.
We next tested whether models of cellular heterogeneity developed on the H460 clones could reveal information about the drug sensitivity of cellular populations of diverse cancer types. We applied our H460 model of heterogeneity to nine cell lines selected from the NCI-60 panel (Shankavaram et al, 2009) with extreme GI50 values for paclitaxel (NCI-9—five sensitive and four resistant) (Supplementary information). These selected cell lines were derived from breast, colon, lung, ovarian, and renal cancers (Supplementary Table 2). Remarkably, subpopulation profiles for these populations were well separated by paclitaxel sensitivities using MS4, and to a lesser degree MS1 (Figures 4A-II and C; Supplementary Figure 11B). Here, similar separation accuracies could also be obtained using a reference model of heterogeneity built entirely from subsampled cells within the NCI-9 cell lines (Supplementary Figure 13). As with the clones, repeat experiments gave similar separation accuracies. However, in this case, separation accuracies remained similar (and high) even after 2 months of additional time in culture (Supplementary Table 5). As might be expected, the observed relationship between heterogeneity and drug sensitivity was more stable for these well-established cell lines than for the low-passage clones. These results suggested that diverse cancer types may share an overlapping repertoire of signaling states (Jones et al, 2008; Parsons et al, 2008; Valle et al, 2008), whose heterogeneous ensembles have similar relationships to function.
To what extent did the identification of information contained in cellular heterogeneity depend on the choices made in our study? Clearly, not every marker set, feature, or model parameter will be equally informative. For example, paclitaxel sensitivity among the H460 and NCI panels could be predicted neither by a panel of markers that included its drug target, microtubules (MS5: DNA/actin/β-tubulin), nor by a panel of ‘neutral’ markers (MS6: DNA/GAPDH/Pericentrin) (Figure 4A-I, MS5–6; Supplementary Figures 11A, 12, and 14). Alternatively, for the sole purpose of developing functional predictions, it may be possible to identify specific markers and features whose population-averaged measurements can provide accurate classification. For example, the average intensity of β-catenin in MS3 provided exceptional classification accuracy among all markers (accuracy=78.72%, P<0.05 for the complete set of H460 clones). Population-averaged measurements also lend themselves to multiplexed assays, such as those performed with array-based technology; features based on population-averaged measurements can be easily combined from parallel assays, thereby allowing greater numbers of markers to be explored than can be studied at present on individual cells. However, information may be lost; classification of paclitaxel sensitivity based on population-averaged expression of any three randomly chosen readouts from MS1–4 performed on average 5% worse (or 10% if β-catenin was dropped) than our heterogeneity profiles based on three readouts. Furthermore, ensemble-averaged measurements may be predictive of function (e.g. drug response), yet poorly represent individual cellular behaviors and lead to inaccurate models of cell function (Ferrell and Machleder, 1998). Finally, a critical parameter for decomposing heterogeneity is the coarseness of the approximation (Yin et al, 2008; Pyne et al, 2009). In cross-validation studies, we found that the range of subpopulation numbers suggested by model fit criteria (i.e. 3–7 subpopulations) coincided well with the range that provided highest separation accuracies of the H460 clones by drug sensitivities (Supplementary Figure 15). In the future, refinement of model parameters may be improved by incorporating additional biological knowledge to determine when subpopulations should be merged or further split.
Discussion
Cellular heterogeneity has been classically described within cellular populations, both in the settings of cell culture and in vivo. Heterogeneity, as an absolute property of a cellular population and collection of molecular readouts, can be difficult to interpret. However, relative differences in heterogeneity, such as may be due to epigenetics, genetics, or environmental conditions, may be more interpretable, in particular when tested for correlation with functional differences.
In the context of differences due to pharmacological perturbations, heterogeneity may be observed before or after treatment. In earlier work (Slack et al, 2008), the ability to distinguish mechanistic classes of perturbations based on heterogeneous cancer cell responses was studied. In contrast, here we investigated whether patterns of basal signaling heterogeneity contained information predictive of subsequent population response to perturbation. We used drug sensitivity classification to provide an objective test of whether our decomposition of heterogeneity contained biologically relevant information. (It was not the goal of this study to develop or optimize predictors for drug sensitivities that outperform other methods.) We modeled the (quasi-equilibrium) distributions of cell signaling phenotypes present within populations from snapshots of large numbers of cells (Chang et al, 2008; Huang et al, 2009), and found that measures of these distributions served as informative, predictive readouts of population-level responses to perturbation. Our approach allowed us to decompose heterogeneous cellular distributions into a small number of more phenotypically homogenous states (Figure 1), compare and group populations based on their patterns of heterogeneity (Figure 2), identify a consistent relationship between heterogeneity and function across multiple sets of general signaling markers (Figure 3) and, finally, test whether a common model of basal signaling heterogeneity could be used to predict drug sensitivities across different cancer populations (Figure 4). In general, characterization of the ensemble of subpopulation mixture may be required to distinguish functional differences among populations. However, in certain cases, (de-) enrichment for specific subpopulations may be sufficient to account for overall functional differences. 
For example, in MS1, enrichment for subpopulation pairs (S1, S4) or (S2, S3) separated paclitaxel-sensitive from paclitaxel-nonsensitive clones (Supplementary Figure 6). Future studies are required to investigate the deeper molecular states of specific subpopulations (Loo et al, 2009a, 2009b) and their relationship to drug response. We note that in this study, cellular phenotypes were captured on the basis of the spatial colocalization patterns of signaling activity readouts from fixed cells. The physical sorting and subsequent investigation of our identified subpopulations therefore remain challenging.
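To make the notion of a subpopulation profile concrete, the following minimal Python sketch assigns each cell to the nearest of a set of reference states and reports the fraction of cells in each state. It is an illustration only: the hard nearest-centroid assignment is a simplified stand-in for the mixture model used in this study, and the function names, toy feature vectors, and centroids are hypothetical.

```python
import math

def subpopulation_profile(cells, centroids):
    """Return the fraction of cells assigned to each reference state.

    Assumption: each cell is assigned to its nearest centroid (hard
    assignment), a simplification of a probabilistic mixture model.
    """
    counts = [0] * len(centroids)
    for cell in cells:
        dists = [math.dist(cell, c) for c in centroids]
        counts[dists.index(min(dists))] += 1
    return [n / len(cells) for n in counts]

def pair_enrichment(profile, pair):
    """Summed fraction of cells falling in a pair of states (0-based indices)."""
    return sum(profile[i] for i in pair)

def profile_distance(p, q):
    """Total variation distance between two subpopulation profiles."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical two-feature data with four reference states S1..S4.
centroids = [(0, 0), (10, 0), (0, 10), (10, 10)]
clone_a = [(0.5, 0.2), (9.0, 0.1), (9.5, 9.7), (0.1, 0.3)]
clone_b = [(9.8, 0.3), (0.2, 9.9), (10.1, 0.4), (0.3, 10.2)]

profile_a = subpopulation_profile(clone_a, centroids)
profile_b = subpopulation_profile(clone_b, centroids)
print(profile_a, profile_b, profile_distance(profile_a, profile_b))
```

In this toy setting, pair_enrichment(profile, (0, 3)) plays the role of enrichment for the pair (S1, S4), and profile_distance gives one simple way to compare two populations by their patterns of heterogeneity; other divergence measures could be substituted.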
Important questions remain, such as the origins and evolution of phenotypic diversification, why our decomposition of heterogeneity predicts drug responsiveness in our defined culture conditions, and why classification is possible on the basis of a limited number of biomarkers that were not chosen based on prior knowledge of the biology of drug responsiveness, but rather on a general survey of pathways implicated in cancer. The observed heterogeneity among the H460 clones could be due to several factors, including differences in epigenetic states and genetic diversity that may have been present within the parent population or evolved within the clones during their short time of establishment. Regardless, we found that a simple description of the observed heterogeneity contained functional information. One explanation for our success using a limited number of biomarkers may be that our subpopulations reveal ‘deeper’ underlying states that broadly reflect signaling in multiple pathways, and thus may be distinguishable by a small number of ‘general’ signaling markers. Another possibility is that our approach has connected the characteristic behaviors of regulatory networks in two operating regimes: namely, networks operating within each cancer clone shape the stochastic distributions of cell signaling states in unchallenged conditions (Huang et al, 2009) as well as determine an overall population response to an acute challenge (i.e. drug treatment). It is also interesting to speculate whether patterns of heterogeneity observed in primary cancer samples can be interpreted to reveal clinically important information. Importantly, the answer to this question is independent of whether profiles of clinical and cell line samples directly share common signatures.
Nevertheless, the potential to study the physiological states of cell populations at a resolution greater than population averages, yet more summarized than individual cells, is highly compelling, and our approach may help to interpret heterogeneity observed in healthy and diseased tissues.
Materials and methods
All clones were seeded in triplicate on the same day onto seven 96-well plates (each plate contained seven clones and the parent), grown under identical conditions for 16 h, fixed, stained with the four MS, and imaged at ×20 magnification. Image intensities were scaled for each plate to normalize parental replicates among all plates. Subpopulation profiling was performed as described earlier (Slack et al, 2008) (Supplementary information).
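One way such a per-plate scaling could work is sketched below in Python. The text above states only that intensities were scaled per plate so that parental replicates agree across plates; the choice of a multiplicative factor that matches each plate's parental median to a common target, along with all function names and numbers, is an assumption made for illustration.

```python
from statistics import median

def plate_scale_factors(parent_by_plate):
    """Compute one multiplicative scale factor per plate.

    Assumption: each plate's parental-well median intensity is matched
    to the median of those per-plate medians.
    """
    plate_medians = {p: median(v) for p, v in parent_by_plate.items()}
    target = median(plate_medians.values())
    return {p: target / m for p, m in plate_medians.items()}

def normalize_plate(intensities, scale):
    """Apply a plate's scale factor to intensities measured on that plate."""
    return [x * scale for x in intensities]

# Hypothetical parental-well intensities on three plates.
parent_by_plate = {
    "plate1": [100, 110, 90],
    "plate2": [200, 220, 180],
    "plate3": [50, 55, 45],
}
factors = plate_scale_factors(parent_by_plate)
clone_on_plate2 = normalize_plate([200, 400], factors["plate2"])
print(factors, clone_on_plate2)
```

After scaling, the parental wells on every plate share the same median, so intensities from clones imaged on different plates can be compared on a common scale.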
Supplementary Material
Supplementary information, Supplementary Tables S1–S5, Supplementary Figures S1–S15
Acknowledgments
We thank Nam Bui, Luc Grard, Lit-Hsin Loo, Kathy Lyons, Benjamin Pavie, Michael Slack, James Sullivan, and Denise Tria for technical help, John Minna, Tim Mitchison, Rama Ranganathan, Jerry Shay, Gürol Süel, and Michael White for helpful discussions and readings of the paper, and BD Transduction Laboratories for reagents. This research was supported by the National Institutes of Health grants NIH GM007062 (RJS), R01 GM081549 (LFW), and R01 GM085442 (SJA), the Endowed Scholars program at UT Southwestern Medical Center (LFW and SJA), a UT Southwestern SPORE project grant (LFW), the Welch Foundation I-1619 (SJA) and I-1644 (LFW), the Rita Allen Foundation (SJA), and the Mary Kay Ash Charitable Foundation (SJA).
Footnotes
The authors declare that they have no conflict of interest.
References
- Anderson AR, Weaver AM, Cummings PT, Quaranta V (2006) Tumor morphology and phenotypic evolution driven by selective pressure from the microenvironment. Cell 127: 905–915
- Balaban NQ, Merrin J, Chait R, Kowalik L, Leibler S (2004) Bacterial persistence as a phenotypic switch. Science 305: 1622–1625
- Barre B, Vigneron A, Perkins N, Roninson IB, Gamelin E, Coqueret O (2007) The STAT3 oncogene as a predictive marker of drug resistance. Trends Mol Med 13: 4–11
- Boland MV, Murphy RF (2001) A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells. Bioinformatics 17: 1213–1223
- Borg I, Groenen P (1997) Modern Multidimensional Scaling: Theory and Applications. New York: Springer-Verlag
- Bremnes RM, Veve R, Hirsch FR, Franklin WA (2002) The E-cadherin cell-cell adhesion complex and lung cancer invasion, metastasis, and prognosis. Lung Cancer 36: 115–124
- Brock A, Chang H, Huang S (2009) Non-genetic heterogeneity—a mutation-independent driving force for the somatic evolution of tumours. Nat Rev Genet 10: 336–342
- Campbell LL, Polyak K (2007) Breast tumor heterogeneity: cancer stem cells or clonal evolution? Cell Cycle 6: 2332–2338
- Carney DN, Gazdar AF, Bepler G, Guccion JG, Marangos PJ, Moody TW, Zweig MH, Minna JD (1985) Establishment and identification of small cell lung cancer cell lines having classic and variant features. Cancer Res 45: 2913–2923
- Chang HH, Hemberg M, Barahona M, Ingber DE, Huang S (2008) Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature 453: 544–547
- Choi S, Henderson MJ, Kwan E, Beesley AH, Sutton R, Bahar AY, Giles J, Venn NC, Pozza LD, Baker DL, Marshall GM, Kees UR, Haber M, Norris MD (2007) Relapse in children with acute lymphoblastic leukemia involving selection of a preexisting drug-resistant subclone. Blood 110: 632–639
- Cohen AA, Geva-Zatorsky N, Eden E, Frenkel-Morgenstern M, Issaeva I, Sigal A, Milo R, Cohen-Saidon C, Liron Y, Kam Z, Cohen L, Danon T, Perzov N, Alon U (2008) Dynamic proteomics of individual cancer cells in response to a drug. Science 322: 1511–1516
- Elsasser WM (1984) Outline of a theory of cellular heterogeneity. Proc Natl Acad Sci USA 81: 5126–5129
- Engelman JA, Zejnullahu K, Mitsudomi T, Song Y, Hyland C, Park JO, Lindeman N, Gale CM, Zhao X, Christensen J, Kosaka T, Holmes AJ, Rogers AM, Cappuzzo F, Mok T, Lee C, Johnson BE, Cantley LC, Janne PA (2007) MET amplification leads to gefitinib resistance in lung cancer by activating ERBB3 signaling. Science 316: 1039–1043
- Feinerman O, Veiga J, Dorfman JR, Germain RN, Altan-Bonnet G (2008) Variability and robustness in T cell activation from regulated heterogeneity in protein levels. Science 321: 1081–1084
- Ferrell JE Jr, Machleder EM (1998) The biochemical basis of an all-or-none cell fate switch in Xenopus oocytes. Science 280: 895–898
- Gascoigne KE, Taylor SS (2008) Cancer cells display profound intra- and interline variation following prolonged exposure to antimitotic drugs. Cancer Cell 14: 111–122
- Haura EB, Zheng Z, Song L, Cantor A, Bepler G (2005) Activated epidermal growth factor receptor-Stat-3 signaling promotes tumor survival in vivo in non-small cell lung cancer. Clin Cancer Res 11: 8288–8294
- Heppner GH (1984) Tumor heterogeneity. Cancer Res 44: 2259–2265
- Huang S, Ernberg I, Kauffman S (2009) Cancer attractors: a systems view of tumors from a gene network dynamics and developmental perspective. Semin Cell Dev Biol 20: 869–876
- Ichim CV, Wells RA (2006) First among equals: the cancer cell hierarchy. Leuk Lymphoma 47: 2017–2027
- Jones S, Zhang X, Parsons DW, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Kamiyama H, Jimeno A, Hong SM, Fu B, Lin MT, Calhoun ES, Kamiyama M, Walter K, Nikolskaya T, Nikolsky Y, Hartigan J, Smith DR et al. (2008) Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science 321: 1801–1806
- Kiel MJ, Yilmaz OH, Iwashita T, Terhorst C, Morrison SJ (2005) SLAM family receptors distinguish hematopoietic stem and progenitor cells and reveal endothelial niches for stem cells. Cell 121: 1109–1121
- Kotecha N, Flores NJ, Irish JM, Simonds EF, Sakai DS, Archambeault S, Diaz-Flores E, Coram M, Shannon KM, Nolan GP, Loh ML (2008) Single-cell profiling identifies aberrant STAT5 activation in myeloid malignancies with specific clinical and biologic correlates. Cancer Cell 14: 335–343
- Kozaki K, Miyaishi O, Tsukamoto T, Tatematsu Y, Hida T, Takahashi T (2000) Establishment and characterization of a human lung cancer cell line NCI-H460-LNM35 with consistent lymphogenous metastasis via both subcutaneous and orthotopic propagation. Cancer Res 60: 2535–2540
- Liegl B, Kepten I, Le C, Zhu M, Demetri GD, Heinrich MC, Fletcher CD, Corless CL, Fletcher JA (2008) Heterogeneity of kinase inhibitor resistance mechanisms in GIST. J Pathol 216: 64–74
- Loo LH, Lin HJ, Singh DK, Lyons KM, Altschuler SJ, Wu LF (2009a) Heterogeneity in the physiological states and pharmacological responses of differentiating 3T3-L1 preadipocytes. J Cell Biol 187: 375–384
- Loo LH, Lin HJ, Steininger RJ III, Wang Y, Wu LF, Altschuler SJ (2009b) An approach for extensibly profiling the molecular states of cellular subpopulations. Nat Methods 6: 759–765
- Loo LH, Wu LF, Altschuler SJ (2007) Image-based multivariate profiling of drug responses from single cells. Nat Methods 4: 445–453
- Lu J, Getz G, Miska EA, Alvarez-Saavedra E, Lamb J, Peck D, Sweet-Cordero A, Ebert BL, Mak RH, Ferrando AA, Downing JR, Jacks T, Horvitz HR, Golub TR (2005) MicroRNA expression profiles classify human cancers. Nature 435: 834–838
- Luria SE, Delbruck M (1943) Mutations of bacteria from virus sensitivity to virus resistance. Genetics 28: 491–511
- Nagrath S, Sequist LV, Maheswaran S, Bell DW, Irimia D, Ulkus L, Smith MR, Kwak EL, Digumarthy S, Muzikansky A, Ryan P, Balis UJ, Tompkins RG, Haber DA, Toner M (2007) Isolation of rare circulating tumour cells in cancer patients by microchip technology. Nature 450: 1235–1239
- Normanno N, De Luca A, Maiello MR, Campiglio M, Napolitano M, Mancino M, Carotenuto A, Viglietto G, Menard S (2006) The MEK/MAPK pathway is involved in the resistance of breast cancer cells to the EGFR tyrosine kinase inhibitor gefitinib. J Cell Physiol 207: 420–427
- Pandolfi PP (2004) Breast cancer—loss of PTEN predicts resistance to treatment. N Engl J Med 351: 2337–2338
- Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA Jr, Hartigan J et al. (2008) An integrated genomic analysis of human glioblastoma multiforme. Science 321: 1807–1812
- Perlman ZE, Slack MD, Feng Y, Mitchison TJ, Wu LF, Altschuler SJ (2004) Multidimensional drug profiling by automated microscopy. Science 306: 1194–1198
- Pyne S, Hu X, Wang K, Rossin E, Lin TI, Maier LM, Baecher-Allan C, McLachlan GJ, Tamayo P, Hafler DA, De Jager PL, Mesirov JP (2009) Automated high-dimensional flow cytometric data analysis. Proc Natl Acad Sci USA 106: 8519–8524
- Rocques N, Abou Zeid N, Sii-Felice K, Lecoin L, Felder-Schmittbuhl MP, Eychene A, Pouponnot C (2007) GSK-3-mediated phosphorylation enhances Maf-transforming activity. Mol Cell 28: 584–597
- Rubin H (1990) The significance of biological heterogeneity. Cancer Metastasis Rev 9: 1–20
- Schlabach MR, Luo J, Solimini NL, Hu G, Xu Q, Li MZ, Zhao Z, Smogorzewska A, Sowa ME, Ang XL, Westbrook TF, Liang AC, Chang K, Hackett JA, Harper JW, Hannon GJ, Elledge SJ (2008) Cancer proliferation gene discovery through functional genomics. Science 319: 620–624
- Shankavaram UT, Varma S, Kane D, Sunshine M, Chary KK, Reinhold WC, Pommier Y, Weinstein JN (2009) CellMiner: a relational database and query tool for the NCI-60 cancer cell lines. BMC Genomics 10: 277
- Singh SK, Clarke ID, Terasaki M, Bonn VE, Hawkins C, Squire J, Dirks PB (2003) Identification of a cancer stem cell in human brain tumors. Cancer Res 63: 5821–5828
- Slack MD, Martinez ED, Wu LF, Altschuler SJ (2008) Characterizing heterogeneous cellular responses to perturbations. Proc Natl Acad Sci USA 105: 19306–19311
- Snijder B, Sacher R, Ramo P, Damm EM, Liberali P, Pelkmans L (2009) Population context determines cell-to-cell variability in endocytosis and virus infection. Nature 461: 520–523
- Spencer SL, Gaudet S, Albeck JG, Burke JM, Sorger PK (2009) Non-genetic origins of cell-to-cell variability in TRAIL-induced apoptosis. Nature 459: 428–432
- Stewart MH, Bosse M, Chadwick K, Menendez P, Bendall SC, Bhatia M (2006) Clonal isolation of hESCs reveals heterogeneity within the pluripotent stem cell compartment. Nat Methods 3: 807–815
- Suel GM, Kulkarni RP, Dworkin J, Garcia-Ojalvo J, Elowitz MB (2007) Tunability and noise dependence in differentiation dynamics. Science 315: 1716–1719
- Tang C, Ang BT, Pervaiz S (2007) Cancer stem cell: target for anti-cancer therapy. FASEB J 21: 3777–3785
- Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3: 71–86
- Valle L, Serena-Acedo T, Liyanarachchi S, Hampel H, Comeras I, Li Z, Zeng Q, Zhang HT, Pennison MJ, Sadim M, Pasche B, Tanner SM, de la Chapelle A (2008) Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer. Science 321: 1361–1365
- van 't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, Mao M, Peterse HL, van der Kooy K, Marton MJ, Witteveen AT, Schreiber GJ, Kerkhoven RM, Roberts C, Linsley PS, Bernards R, Friend SH (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415: 530–536
- Vaughan MB, Ramirez RD, Wright WE, Minna JD, Shay JW (2006) A three-dimensional model of differentiation of immortalized human bronchial epithelial cells. Differentiation 74: 141–148
- Wang M, Zhou X, King RW, Wong ST (2007) Context based mixture model for cell phase identification in automated fluorescence microscopy. BMC Bioinformatics 8: 32
- Wilson A, Laurenti E, Oser G, van der Wath RC, Blanco-Bose W, Jaworski M, Offner S, Dunant CF, Eshkind L, Bockamp E, Lio P, Macdonald HR, Trumpp A (2008) Hematopoietic stem cells reversibly switch from dormancy to self-renewal during homeostasis and repair. Cell 135: 1118–1129
- Yin Z, Zhou X, Bakal C, Li F, Sun Y, Perrimon N, Wong ST (2008) Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens. BMC Bioinformatics 9: 264
- Zhou BP, Deng J, Xia W, Xu J, Li YM, Gunduz M, Hung MC (2004) Dual regulation of Snail by GSK-3beta-mediated phosphorylation in control of epithelial-mesenchymal transition. Nat Cell Biol 6: 931–940