Abstract
Normal breast luminal epithelial progenitors have been implicated as cell of origin in basal-like breast cancer, but their anatomical localization remains understudied. Here, we combine collection under the microscope of organoids from reduction mammoplasties and single-cell mRNA sequencing (scRNA-seq) of FACS-sorted luminal epithelial cells with multicolor imaging to profile ducts and terminal duct lobular units (TDLUs) and compare them with breast cancer subtypes. Unsupervised clustering reveals eleven distinct clusters and a differentiation trajectory starting with keratin 15+ (K15+) progenitors enriched in ducts. Spatial mapping of luminal progenitors is confirmed at the protein level by staining with critical duct markers. Comparison of the gene expression profiles of normal luminal cells with those of breast cancer subtypes suggests a strong correlation between normal breast ductal progenitors and basal-like breast cancer. We propose that K15+ basal-like breast cancers originate in ductal progenitors, which emphasizes the importance of not only lineages but also cellular position within the ductal-lobular tree.
Subject terms: Mammary stem cells, Gene expression, Tumour heterogeneity, Differentiation
Introduction
Breast cancer is not a single disease. Rather, it comprises several different tumor subtypes, each with its own phenotype and clinical outcome1 (for review see ref. 2). One of the most difficult-to-treat subtypes is the basal-like. Basal-like breast cancer is thought to originate from progenitor cells within the normal breast, typically among premenopausal women. We and others have previously narrowed down a luminal progenitor, which is double positive for K14 and K19, as a likely candidate cell of origin of basal-like breast cancer3–6. In basal-like breast cancer, apparent equivalents to double-positive cells are believed to contribute to aggressive behavior by taking on a leader role in invasion7. Indeed, knockdown of K14 in these cancer cells is sufficient to block what is referred to as collective invasion7. In primary tumors, the basal-like cells, reminiscent of normal double-positive cells, are considered progenitors. In a tumor setting, these cells exhibit the potential of acquiring a hybrid epithelial-mesenchymal transition (EMT) state with a permanent aggressive potential8. Moreover, the EMT state seems to govern the level of progenitor activity as well as malignant behavior8–11. It is therefore important to understand in more detail the relationship between double-positive cells in the normal breast and the cells of the basal-like subtype of breast cancer.
Previous studies have emphasized the importance of spatial mapping based on combining micro-collection of organoids directly from reduction mammoplasties with quantitative fluorescence-activated cell sorting (FACS) and multicolor imaging to investigate human breast progenitors3,12. These and other studies have provided compelling evidence for the site-specific presence of distinct progenitors in ducts and TDLUs3,12–14. This observation is potentially much more far-reaching if viewed in context with data of distinct disease-free survival rates exclusively determined by cell of origin in ducts and lobules as determined by mammography and histology15,16. That duct-derived tumors may exhibit the worst prognosis15,16 emphasizes the importance of establishing further evidence for a relationship between normal double-positive progenitors in ducts and basal-like breast cancer.
Studies by others have recently employed scRNA-seq to describe the diversity of the epithelial cells in the human breast17–20. While these confirm the existence of three distinct epithelial populations irrespective of donor age and measures taken to enrich for epithelial cells prior to analysis, it remains unanswered whether the luminal epithelial compartment can be resolved further and whether transcriptional profiles relate to anatomical position. To address this, we here used micro-collected primary normal breast organoids to isolate trophoblast antigen 2+ (TROP2+)/CD271− luminal epithelial cells from ducts and TDLUs and performed scRNA-seq. We discovered that the most immature luminal human breast epithelial cells reside in ducts and exhibit a unique expression profile that includes high levels of KRT15. This signature was found to correlate strongly with basal-like breast cancer. Together, our data provide a new level of resolution of phenotypes in the human normal breast for precision of cancer cell of origin studies.
Results
Combined micro-collection of organoids, FACS, and scRNA-seq lead to spatial mapping of differentially expressed genes of luminal epithelial cells in the human breast gland
The presence of a luminal stem cell zone in ducts based on functional assays has been well described3,5,21. However, relatively little is known about the molecular constitution enabling the cells to function as progenitors or about the relationship between the stem cell hierarchy and tissue architecture. To address these questions, we micro-collected ducts and TDLUs from reduction mammoplasties, used lineage-specific cell-surface markers to enrich for luminal epithelial cells by a FACS protocol including TROP2 instead of EpCAM in combination with CD271 previously shown to optimize the separation of luminal and myoepithelial cells12,22,23, and subjected the resulting populations to scRNA-seq (Fig. 1a–c). To minimize variation due to age, parity, and ductal-lobular ratio12,24,25, we used biopsies from three same-aged young adults (age 18), collected in the range of 30 to 50 TDLUs and ducts, respectively, from each, and sorted a total number of 36,000 ductal- and 36,000 TDLU- derived cells for scRNA-seq. For an integrative analysis of the luminal lineage, we performed clustering of a total of 20,286 cells and on average 1300 genes per cell using Seurat (version 3.0)26, which identified 12 clusters with distinct gene expression profiles (Fig. 1c). Analysis of cluster entropy indicated unskewed contribution from the three biopsies to each cluster (Supplementary Fig. 1a). The cluster designated 0 included a minor proportion of cells (146 cells, 0.7% of the total population) reflecting the presence of immune cells, and thus, was excluded from further analysis. The cells were separated in two major groups of ten clusters labeled 1.1–1.4 and 2.1–2.6, respectively. The remaining cluster located in between was labeled cluster 3. 
Since the clustering relied on data from ducts and lobules that were collected separately and then compiled, the contribution of each to the collective image was readily resolved. This revealed a higher contribution of duct-derived cells to clusters 1.1–1.4, reaching a significant level in cluster 1.2, and of TDLU-derived cells to clusters 2.1–2.6 (Fig. 1d and Supplementary Fig. 1b).
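The cluster entropy check mentioned above can be illustrated with a minimal sketch. It assumes a normalized Shannon entropy over the donor labels of the cells in one cluster; the adapted Conos function actually used is not reproduced here, and the function name, the normalization and the donor labels are hypothetical.

```python
import math
from collections import Counter

def normalized_entropy(donor_labels, n_donors):
    """Shannon entropy of donor composition in one cluster, scaled to [0, 1].

    A value near 1 means the donors contribute evenly to the cluster;
    a value near 0 means the cluster is dominated by a single donor.
    The scaling by log(n_donors) is an assumption of this sketch.
    """
    counts = Counter(donor_labels)
    total = len(donor_labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_donors)

# Toy clusters with made-up donor labels (not data from this study).
balanced = ["donor1", "donor2", "donor3"] * 10
skewed = ["donor1"] * 28 + ["donor2", "donor3"]

balanced_score = normalized_entropy(balanced, n_donors=3)
skewed_score = normalized_entropy(skewed, n_donors=3)
print(round(balanced_score, 3), round(skewed_score, 3))
```

Such a score does not correct for unequal numbers of cells per donor, so it would have to be read against the overall donor proportions.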
The ductal-enriched group comprises immature luminal progenitors including K14+/K19+ double-positive cells and a human-specific population of K15+ cells
To infer the roles of the clusters, differentially expressed genes (DEGs) were examined and summarized in Fig. 2a and Supplementary Data 1. From these data, it became obvious that clusters 1.1–1.4 in general may represent immature progenitors by expression of, e.g., ALDH1A3 (aldehyde dehydrogenase 1 family member A3) and KRT153,27–29, whereas clusters 2.1–2.6 and 3 represent more mature luminal epithelial cells by expression of BCL2, FOXA1 (forkhead box A1) and several endocrine receptors (Fig. 2a)28,30,31. This hierarchical division was echoed in a screen of lineage-specific cell-surface markers within the list of the in silico human surfaceome32. Thus, we found expression of established progenitor markers TNFRSF11A (RANK), CD55, PROM1 (prominin 1) and KIT (c-Kit)33–39 in clusters 1.1–1.4 and differentiated luminal epithelial markers ALCAM (CD166), AREG (amphiregulin), and TNFSF11 (RANKL) in clusters 2.1–2.6 (Fig. 2a)28,40–42. The full list of DEGs encoding cell-surface proteins is available in Supplementary Data 2. Intriguingly, the significantly duct-enriched cluster 1.2 accumulates MCAM (CD146), KRT14 and KRT15 expressing cells on a KRT19-positive background (Fig. 2b, c)—a combination of phenotypes, which have been amply validated in functional progenitor assays and which have been localized primarily to ducts at the protein level3,21,28. Cluster 3 was characterized by a significant expression of prolactin induced protein (PIP) and mucin like 1 (MUCL1) compared to other clusters. Also, immunoglobulin superfamily 1 (IGSF1) encoding a cell-surface molecule is exclusively upregulated in cluster 3 (Fig. 2a). To uncover any species related controversy concerning the generation of epithelial lineages43, we compared our data with existing scRNA-seq data based on the mouse mammary gland44. 
Of note, in accordance with others45, we found that expression of KRT15 in the luminal compartment is specific for the human breast, and moreover that the mapping of the human stem cell hierarchy differs from that of mice (Supplementary Fig. 2a).
To further substantiate the analysis of maturation among clusters based on DEGs, we applied the lineage inference algorithm Slingshot46 in the search for a potential hierarchy in an unbiased and unsupervised manner. In short, this method identifies a trajectory based on a minimum spanning tree algorithm towards the most differentiated state. Cells that are placed closer to the beginning of the trajectory belong to an early time point in the lineage. Using our single-cell transcriptome data of luminal epithelial cells, Slingshot built several trajectories all starting in cluster 1.1 and ending in either clusters 2.1–2.3 or cluster 1.4, as summarized in Fig. 2d. Accordingly, estimation of pseudotime places the least differentiated cells in clusters 1.1–1.2 and the most differentiated cells of the luminal lineage in clusters 2.1–2.3 (Fig. 2d). These data were confirmed by geneset enrichment analysis showing that whereas clusters 1.1 and 1.2 are particularly high in genes involved in epithelium development (adj p < 0.001, Fig. 2e), suggesting a role upstream of epithelial differentiation within a hierarchy, clusters 2.1–2.6 are enriched for genes involved in anatomical structure morphogenesis (adj p < 0.0001, Fig. 2e). Additional gene sets in support of a hierarchical organization with respect to response to extracellular signaling, MAPK signaling, mammary gland development, nuclear receptor and ERBB signaling are highlighted in Supplementary Fig. 2b (adj p < 0.05). However, cluster 1.4 is somewhat of a conundrum by being the end of a separate trajectory never leaving the progenitor compartment (Fig. 2d). Therefore, in order to characterize this subcluster relative to the rest of cluster 1, we sought for a marker suitable for prospective isolation of subcluster 1.4 progenitors in a FACS-based protocol. We identified podocalyxin-like (PODXL) as an ideal candidate. 
PODXL is a gene encoding a member of the CD34 sialomucin protein family, which is expressed in hematopoietic stem cells47, and whose expression has been associated with basal-like breast cancer48. At the protein level, it is expressed at the apical surface of a subset of c-Kit+ luminal progenitors35 (Fig. 2f and Supplementary Fig. 2c), and PODXL+ cells from reduction mammoplasties can be readily identified in a FACS protocol with c-Kit included (Supplementary Fig. 2d). By this protocol we defined mature luminal cells as PODXL−/c-Kit−, c-Kit+ progenitors as PODXL−/c-Kit+, and PODXL+ progenitors as PODXL+/c-Kit+/−, respectively (Supplementary Fig. 2d). These three cell types were gated for and plated at clonal density in a colony formation assay (Supplementary Fig. 2e). Indeed, PODXL+ cells turned out to exhibit the highest colony-forming capability (Fig. 2f), confirming also at the functional level that cluster 1.4 represents a progenitor population.
Micro-collection- and cluster-based spatial mapping is confirmed by multicolor imaging in situ
In order to validate the scRNA-seq cluster profiling and the apparent enrichment of duct-derived progenitors in cluster 1.2 and late progenitors in cluster 1.4 at the protein level with particular emphasis on surface markers, we searched our antibody repository and identified CD55 or annexin A1 and SLC34A2, respectively, as promising candidates. In line with our scRNA-seq data, we have previously found ductal-enriched expression of K15 and heterogeneous distribution of c-Kit expression by immunostaining3,33. Here, in an effort to classify cluster 1.2 we found that K15, due to its higher expression level, was superior to K14, which we have otherwise used as a ductal progenitor marker3,21. Thus, here we compared cluster 1.2-associated markers CD55 or annexin A1 with K15 as well as a cluster 1.4-associated marker SLC34A2 with c-Kit in ten different human breast biopsies using multicolor imaging. As inferred from the scRNA-seq (Fig. 3a), the majority of biopsies showed strong co-staining of CD55 with K15 (7 out of 10 biopsies, Fig. 3b) and to some extent co-staining of annexin A1 with K15 (5 out of 10 biopsies) in ducts compared to TDLUs (Fig. 3b), while SLC34A2 essentially co-stained with c-Kit (10 out of 10 biopsies) in both ducts and TDLUs (Supplementary Fig. 3a). To assess whether CD55 added to the established c-Kit protocol28 for enrichment of progenitors, smears were recovered from different gate combinations of FACS and stained for K15 (Supplementary Fig. 3b). A significantly higher frequency of strongly K15+ (K15high) cells was seen from the combined CD55/c-Kit gate compared to c-Kit alone (Supplementary Fig. 3c). We have previously found evidence of progenitor heterogeneity by comparing c-Kit+ and CD146+ cells functionally21. 
Our present data add to these differentiation programs, since CD55 co-stained with CD146 in ductal luminal cells, while CD146 rarely overlapped with PODXL, which is a cluster 1.4-associated cell-surface marker as shown above (Supplementary Fig. 3d, e), all in line with the scRNA-seq data. The lack of overlap between the preferentially ductal CD55 and PODXL was not confined to ducts. Rather, the strongest PODXL staining was seen in CD55neg TDLUs including the lobules proper (Supplementary Fig. 3f). Since we have previously shown that primarily the lobules and TDLUs are rich in hormone receptor-positive late progenitors and differentiated cells28,49, we here co-stained with PODXL and Ks20.8, the surrogate marker of estrogen/progesterone receptor-positive cells28. The hormone receptor-positive cells were negative for PODXL (Supplementary Fig. 4a). Collectively, these data support the existence of a ductal luminal progenitor CD55+/K15high cell, which is phenotypically and functionally distinct from those of TDLUs as summarized in Supplementary Fig. 4b.
Comparison of duct and TDLU expression profiles with those of breast cancer subtypes
Finally, we investigated whether the identified single-cell transcriptome signatures overlapped with those of breast cancer. Gene expression and molecular characteristics of breast cancer have allowed the classification of breast cancer into several subtypes1,50. It has previously been shown that gene expression signatures of luminal progenitors are significantly correlated with basal-like breast cancer35. By adapting the method by Lim et al.35, we calculated the expression signature scores of ductal and TDLU luminal DEGs (Supplementary Data 3) and compared them to those of breast cancer subtypes51. As shown in Fig. 4a, the transcriptome signature of luminal cells in ducts exhibited an expression profile that was much more similar to those of the basal-like subtype of breast cancer than those of the luminal subtypes. The TDLU-derived luminal signature, on the other hand, showed a stronger compatibility with the luminal breast cancer subtypes (Fig. 4a). In addition, we asked which clusters aligned with the basal-like breast cancer subtypes using the established Prediction Analysis of Microarray 50 (PAM50) subtyping52, as well as a new subtyping classifier based on scRNAseq of breast cancer cells using the “SCSubtype” gene signatures53. Accordingly, PAM50 subtyping showed that clusters 1.1–1.4, in contrast to clusters 2 and 3, are more closely related to basal-like breast cancer (Supplementary Table 1). Likewise, upon comparison with SCSubtype gene signatures, clusters 1.2, 1.3, and 1.4 show a positive correlation with basal-like breast cancer (Supplementary Fig. 5). Recent molecular profiling has stratified basal-like breast cancer within triple-negative breast cancer (TNBC)54–56. We therefore further compared TNBC subtype gene signatures with clusters 1.1 to 1.4. 
Our analysis according to TNBC subtyping of DEGs among cluster 1 reveals that clusters 1.1–1.4 are related to basal-like 1/mesenchymal, basal-like 2/mesenchymal, basal-like 2 and immunomodulatory subtypes, respectively (Supplementary Fig. 6 and Supplementary Data 4). Intriguingly, assessment of the expression level of KRT15 in a dataset of 2,164 breast cancer biopsies subdivided according to the most widely used classification showed a significantly higher expression level in basal-like breast cancer compared to any of the other subtypes (Fig. 4b)57. While others have reported that K15 protein is expressed among TNBC, HER2 and Luminal A carcinomas58, we here sought to corroborate that K15 is a marker of basal-like breast cancer also at the protein level. In a series of TNBC biopsies, 22 of 36 (61%) stained positive for K14 and 9 of 36 (25%) stained positive for K15. All nine K15+ biopsies contained K14+/K15+ cells in addition to K14+ and K15+ cells (Fig. 4c). Taken together, the data suggest that progenitors with a ductal profile represent the most immature cell type within the luminal lineage of the human breast and a likely source of basal-like breast cancer.
Discussion
The present work provides evidence that the two major segments of the human breast ductal tree, i.e., the ducts and the TDLUs, are specifically enriched for cells reminiscent of the major breast cancer subtypes. This agrees with clinical data that duct- and TDLU-derived breast carcinomas exhibit unique histological and radiological appearances as well as clinical outcomes15. Our study opens the way for precise cell-of-origin comparisons with breast cancer subtypes. For example, we find that within the luminal epithelial lineage the most immature progenitors are characterized by the expression of basal-like breast cancer-associated K15 and a localization preferentially to ducts. Furthermore, we provide a proof of principle that luminal progenitors close to the apex of the hierarchy can be isolated by new combinations of surface markers, revealing progenitors that lend themselves to mechanistic studies of breast cancer subtype-specific transformation and evolution. In the present study, scRNA-seq is based on biopsies from three age-matched young women. While this may serve as a starting point, importantly, we and others have shown that aberrant basal-like luminal cells accumulate with age25,59,60, thus implying that further studies of the resemblance between the transcriptomic profiles of normal luminal breast cells and breast cancer should take the age of the donors into account.
It is becoming increasingly clear, not least by scRNA-seq, that the luminal epithelial compartment consists of a multitude of progenitors and differentiated cells exhibiting molecular profiles overlapping with breast cancer subtypes18,25,61–64. Nevertheless, as far as cell of origin of breast cancer is concerned, the current view is centered around an estrogen receptor-negative, c-Kit-positive progenitor, which constitutes a relatively large proportion of cells widely distributed along the entire ductal-lobular tree35,65,66. This concurs with the widely held notion in the mouse mammary gland field, based on early transplantation experiments, that any part of the gland can give rise to the entire complement of the ductal-lobular tree if transplanted to a cleared fat pad (for review see ref. 67). Under this view, breast cancer subtypes would depend on the order and magnitude of genomic aberrations rather than on distinct cells of origin, the explanation furthered here (for review see ref. 67). Indeed, early studies based on X-chromosome inactivation patterns showed that an entire duct and ductal-lobular unit had the same genomic constitution68. Therefore, the differences recorded in the present and previous studies are most likely governed by microenvironmental cues eventually leading to spatially determined, more permanent states of immaturity or differentiation12–14,69.
The present work highlights the value of ductal expression of K15 as a marker of an immature progenitor cell zone. K15 expression has been widely used as a biomarker for epithelial stem cells70–72. The antibody clone used here against K15 (LHK15) has been shown to be specific for K1573. However, the protein expression pattern in ducts of the human breast appears to be broader than what one would expect from cells near the apex of a differentiation hierarchy43. In general, human breast stem cells are believed to reside in the basal layer12,74. K15 is, however, expressed by luminal cells and not basal cells, and as such it may have another function more relevant for progenitor or even differentiated cells. Analogously, in the esophagus, K15+ progenitors are found in the non-stem cell suprabasal layer (for review see also ref. 75). The fact that K15 stains stem cells in some tissues and their progeny in others has been explained by different mechanisms of K15 regulation: a differentiation-specific mechanism involving the PKC/AP1 pathway and a basal-specific mechanism mediated by FOXM176. Therefore, K15 expression in the normal breast most likely covers more progenitors than those suspected of being cells of origin to breast cancer.
We also discovered additional markers of human breast progenitors. One of these, CD55, is enriched in cluster 1.2, spatially mapped primarily to ducts, and to a great extent co-expressed with K15. CD55 is a glycosylphosphatidylinositol-anchored protein, which regulates the complement activation pathway, and it is also referred to as decay-accelerating factor 1 (Daf1). In human breast cancer cell lines, it renders cells resistant to apoptosis and thus facilitates tumorigenesis77. However, its function in normal breast is not known. A recent scRNA-seq study in the mouse mammary gland suggests CD55 as an early progenitor subset marker in the basal compartment and a marker of luminal transit cells during development78. In the adult mouse gland, CD55+ cells were found exclusively in the luminal epithelial compartment78. In addition, based on colony-forming assays, the CD55+ cells exhibited an approximately threefold higher colony-forming capacity as an indication of their progenitor status78. Our colony formation data including c-Kit+ and PODXL+ luminal cells representing cells in group 1 of clusters are in good agreement with a progenitor status of this entire group of cells. We have previously shown a progenitor potential of CD146+ cells21. Whereas CD55 adds to the value of c-Kit for identifying progenitors in this compartment, the PODXL+ cells seem to mark a separate progenitor compartment. PODXL expression is of particular interest since others have reported that high PODXL expression is associated with higher risk categories, and that breast carcinomas with high PODXL expression are more likely to exhibit characteristics of basal-like cancers48. 
In support of such an association between PODXL expression and prognosis, silencing of PODXL expression in a basal-like human breast cancer cell line reduces primary tumor formation and metastasis79, and analysis of the EMT program in an immortalized transformed human breast cell line reveals PODXL as a key promoter of extravasation80. The finding of PODXL primarily outside the CD55+ compartment combined with functional progenitor activity implies the existence of a residential progenitor zone within the lobules. This may be at variance with our previous findings of limited colony-forming activity in cells from TDLUs3. It is possible, though, that the differences can be explained either by the use of different culture media in the two studies or by the contribution of both ductal and TDLU-derived PODXL+ cells to the CFU activity in the present study. Importantly, however, we here find good agreement between staining, FACS and mRNA as far as CD146, CD55 and PODXL are concerned. Nevertheless, this does not exclude that broader populations of cells may be captured, in particular at the protein level, depending, e.g., on the threshold setting of detection81.
Collectively, we here provide evidence for a hitherto underappreciated spatial distribution of luminal progenitors and unravel a progenitor cell population in the ducts of the human breast with resemblance to basal-like breast cancer. These findings emphasize the relevance of cells of origin in breast cancer in general and pave the way for further investigation of the development and progression of basal-like breast cancer in particular.
Methods
Human tissue
The use of human tissue has been approved by the Scientific Ethical Committee of Region Hovedstaden and the Danish Data Protection Agency with reference to H-2-2011-052 and 2011-41-6722, respectively, and patients agreed to donate tissue by written consent. Normal breast tissue was acquired from 29 female donors undergoing reduction mammoplasty for cosmetic purposes. Donors remain anonymous except for their ages at the time of surgery. 36 breast carcinoma specimens were donated by women undergoing mastectomy for primary breast cancer. Tissue was cut into pieces for cryo-sectioning or cut finely prior to dissociation using 900 U/ml collagenase solution (Worthington Biochemical) in DMEM/F12 (Gibco) supplemented with 2 mM glutamine and 50 μg/ml gentamycin (Biological Industries) to release epithelial organoids, which after collagenase digestion comprise epithelium and adjacent stromal cells82. The organoids were then stored in liquid nitrogen in 90% fetal bovine serum (F7524, Sigma-Aldrich) and 10% dimethyl sulfoxide (D2650, Sigma-Aldrich), which we find to be the optimal condition for freezing, thawing and survival83. Some of the biopsies used in this study have been included in previous studies21,23,28,84.
Fluorescence-activated cell sorting (FACS)
Primary breast organoids were used for micro-collection (collection under the microscope) of ducts and TDLUs under the Leica DMIL microscope3,12,23. Organoids were dissociated using 0.25% trypsin in 1 mM EDTA (Sigma, E5134), followed by resuspension in HEPES buffer (Sigma, H3375) and filtering with a 100 μm filter12,22. Details regarding antibodies, including the dilutions and catalog numbers used for all experiments, are summarized in Supplementary Table 2. Samples were then incubated at 4 °C with antibodies against an epithelial marker, TROP2 (brilliant violet (BV) 421 or BV510 conjugated) and a myoepithelial marker, CD271 (PE or APC conjugated) when sorting cells for scRNA-seq, or with additional antibodies against c-Kit (BV421 or PE conjugated) and PODXL when cells were used for colony formation assays. Secondary antibody BV421-anti mouse IgM was added for staining the non-conjugated PODXL primary antibody. The secondary antibody was also added to the control. For the comparison of PODXL and CD146, the following antibodies were used: a luminal marker, EpCAM (CD326, BV786 conjugated), CD271 (PE conjugated), CD146 (AF647 conjugated) and PODXL as described above. To sort luminal cells according to their expression of CD55 and c-Kit, TROP2 (BV510 conjugated) and CD271 (APC conjugated) were used together with c-Kit (PE conjugated) and unconjugated CD55 in combination with BV421-anti mouse IgM. After incubation, cells were washed twice with HEPES buffer and filtered through a 20 μm filter (BD, 340624) following addition of 1 μg/ml Fixable Viability Stain 780 (BD Horizon, 565388). Cell sorting was performed using the FACSAria™ Fusion Cytometer (BD) or the BD FACSAria™ III Cytometer with a 100 μm nozzle and prior multicolor compensations.
Single-cell RNA sequencing
Organoids representing ducts and TDLUs (30–50 of each from each biopsy) from normal breast tissue of three age matched (18 years old) individuals were micro-collected and TROP2+/CD271− luminal cells of each sample were sorted using FACS as described above. The myoepithelial cells were used in a separate study23.
Chromium Single-Cell 3’ Reagent Kits were employed for single-cell transcriptome sequencing. Version 2 (first two donors, PN-120237, PN-120236, PN-120262) or version 3 (last donor, PN-1000075, PN-1000073) of the kits was used for RNA isolation, cDNA amplification and library preparation. Thereafter, the Illumina® NextSeq500/550 High Output Kit v2 for 150 cycles (20024907) was used for sequencing according to the manufacturer’s instructions. The resulting files were demultiplexed and aligned to the human reference genome GRCh38-1.2.0.pre-mRNA. The data were then filtered, and barcodes as well as unique molecular identifiers were counted (10x Genomics Cell Ranger software). The R package “Seurat” version 3.0 was used for quality control, pre-processing (filtering, normalization, integration) and data analysis (clustering, data visualization, detection of differentially expressed genes)26. During filtering, cells with a feature count outside the range of 200 to 2000 or with a mitochondrial count higher than 10% were excluded. In order to analyze samples of three different donors together, datasets were integrated according to the “Integration and Label Transfer” tutorial of Seurat26. To confirm successful integration of the data, the contribution of biopsies to each cluster and the cluster entropies were calculated with an adapted function of the R package “Conos” (version 1.4.5)85. After clustering, the dimensionality reduction technique UMAP was applied to the whole dataset for visualization. DEGs for clusters were defined using a log2 fold-change (FC) cutoff of 0.1, and a threshold of 1% for the relative number of cells expressing the gene in the given group. Finally, only statistically significant genes (adjusted p-value < 0.05, Wilcoxon rank sum test together with Bonferroni correction for multiple testing) were used for further analysis. The in silico human surfaceome was utilized to search for genes encoding cell-surface proteins32.
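The cell-level filtering step described above can be sketched as follows. This is an illustration only, written in Python rather than in the R/Seurat workflow that was actually used; the record type, field names and toy cells are hypothetical, and treating the range boundaries (200, 2000 and 10%) as inclusive is an assumption.

```python
from dataclasses import dataclass

@dataclass
class CellQC:
    barcode: str
    n_features: int      # number of genes detected in the cell
    percent_mito: float  # percentage of counts from mitochondrial genes

def passes_qc(cell, min_features=200, max_features=2000, max_percent_mito=10.0):
    """Keep cells inside the feature range and not above the mitochondrial cutoff.

    Boundary values are kept (inclusive); this choice is an assumption.
    """
    in_feature_range = min_features <= cell.n_features <= max_features
    return in_feature_range and cell.percent_mito <= max_percent_mito

def filter_cells(cells):
    """Return only the cells that pass both quality-control criteria."""
    return [cell for cell in cells if passes_qc(cell)]

# Toy cells with made-up values (not data from this study).
cells = [
    CellQC("cell_1", 1300, 4.2),   # passes both criteria
    CellQC("cell_2", 150, 3.0),    # too few detected genes
    CellQC("cell_3", 2500, 2.0),   # too many detected genes
    CellQC("cell_4", 900, 12.5),   # mitochondrial fraction too high
]
kept = filter_cells(cells)
print([cell.barcode for cell in kept])
```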
Comparison of DEGs to molecular breast cancer subtype classifiers
DEGs between ductal and TDLU luminal cells (Supplementary Data 3) were compared to the gene expression profile of breast cancer subtypes using a method that calculates “expression signature scores” based on log2FCs of DEGs in a geneset and gene expression values of the same genes in a reference dataset35. Gene expression data of breast cancer were extracted from the Gene Expression Omnibus (GEO) with accession number GSE3165 and the platform GPL887 described in more detail by Herschkowitz et al.51. The expression values were background-corrected and normalized by loess normalization using the R packages “affy” (version 1.68.0) and “limma” (version 3.46.0)86,87. An expression signature score was calculated for each combination of DEGs of ductal to TDLU and samples of breast cancer subtypes (Basal: n = 28, Normal: n = 6, Claudin low: n = 6, Her2: n = 14, LumB: n = 17, LumA: n = 22). The values used were the log2FC of marker genes in our single-cell sequencing data, including markers with a negative fold-change, and the expression value of the same genes in the breast cancer samples. A high log2FC as well as a high expression of a gene in the breast cancer subtype resulted in a larger expression signature score. Thus, high scores indicate a high similarity of the single-cell cluster with the cancer sample. Kruskal–Wallis one-way analysis of variance with Dunn’s multiple comparisons test was used for statistical analysis. In order to compare to molecular subtypes based on the PAM50 gene signature52, we employed the R package “genefu” (version 2.26.0)88 and estimated the probability of each cluster belonging to each subtype. For comparison with a single-cell method of breast cancer subtype classification, “SCSubtype”53, expression signature scores were computed based on similarity between cluster-specific DEGs (Supplementary Data 1) and SCSubtype gene lists according to Lim et al.35. 
To further examine the basal-like breast cancer signatures within clusters 1.1 to 1.4, Lehmann´s TNBC gene signatures55 were compared to DEGs among clusters 1.1, 1.2, 1.3, 1.4 (Supplementary Data 4) using Lim et al.’s method above. Since only the gene lists and not the actual expression values were available for SCSubtype and TNBC gene signatures, we set a value of 1 as upregulated genes, while a value of –1 was used for downregulated genes for the calculations of expression signature scores.
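The scoring idea described in this section can be sketched as follows. The exact formula of Lim et al. is not reproduced in the text, so the snippet assumes one simple form that matches the description: a mean, over shared genes, of the product of log2FC and reference expression value. The function names and all numbers are hypothetical, and the second helper shows the +1/–1 encoding used when only gene lists are available.

```python
def signature_score(log2fc, reference):
    """Mean of log2FC x reference value over genes present in both inputs.

    log2fc: dict mapping gene -> log2FC from the single-cell comparison
    reference: dict mapping gene -> expression value in one reference sample
               (or +1/-1 when only up/down gene lists exist)
    This weighted mean is an assumed form, not the published implementation.
    """
    shared = [gene for gene in log2fc if gene in reference]
    if not shared:
        raise ValueError("no overlapping genes between DEGs and reference")
    return sum(log2fc[g] * reference[g] for g in shared) / len(shared)


def direction_values(up_genes, down_genes):
    """Encode a signature without expression values as +1 (up) and -1 (down)."""
    values = {gene: 1.0 for gene in up_genes}
    values.update({gene: -1.0 for gene in down_genes})
    return values


# Invented toy values: G1 is up and G2 is down in the single-cell comparison.
cluster_log2fc = {"G1": 2.0, "G2": -1.0, "G3": 0.5}

# A reference sample that follows the same pattern gives a positive score.
similar_sample = {"G1": 3.0, "G2": -2.0, "G4": 1.0}
# A reference sample with the opposite pattern gives a negative score.
opposite_sample = {"G1": -1.0, "G2": 2.0}

score_similar = signature_score(cluster_log2fc, similar_sample)
score_opposite = signature_score(cluster_log2fc, opposite_sample)

# Gene-list-only signature: G1 listed as upregulated, G2 as downregulated.
list_only = direction_values(up_genes=["G1"], down_genes=["G2"])
score_list = signature_score(cluster_log2fc, list_only)
print(score_similar, score_opposite, score_list)
```

Because negative log2FCs multiply with low (or –1) reference values, downregulated markers also raise the score when the reference sample agrees with the cluster, which is why markers with a negative fold-change can be included.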
Pathway analysis
The functional enrichment analysis was performed using g:Profiler (version e102_eg49_p15_7a9b4d6, https://biit.cs.ut.ee/gprofiler) with a significance threshold of 0.05 after correcting for multiple testing89. Among the DEGs for cell clusters (Supplementary Data 1), genes with a log2FC > 0.5 and a more than 1.5-fold higher percentage of expressing cells in a cluster than in the remaining cells were submitted to the “g:GOSt functional profiling” function, sorted in descending order of log2FC.
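The gene selection described above can be sketched as a small filter. This is an illustrative Python snippet with invented values, not the code used to prepare the g:Profiler input; the field names and the handling of genes that no other cell expresses are assumptions.

```python
def select_for_enrichment(degs, min_log2fc=0.5, min_pct_ratio=1.5):
    """Filter DEGs and return gene names ordered by descending log2FC.

    degs: list of dicts with hypothetical keys
          name, log2fc, pct_in (cluster), pct_out (remaining cells).
    """
    kept = []
    for gene in degs:
        if gene["log2fc"] <= min_log2fc:
            continue
        # Assumption: a gene absent from all other cells counts as passing.
        if gene["pct_out"] == 0:
            ratio = float("inf")
        else:
            ratio = gene["pct_in"] / gene["pct_out"]
        if ratio > min_pct_ratio:
            kept.append(gene)
    kept.sort(key=lambda gene: gene["log2fc"], reverse=True)
    return [gene["name"] for gene in kept]


# Invented toy values (not data from the study).
toy_degs = [
    {"name": "GENE_A", "log2fc": 0.7, "pct_in": 0.6, "pct_out": 0.2},  # kept
    {"name": "GENE_B", "log2fc": 1.2, "pct_in": 0.5, "pct_out": 0.4},  # ratio too low
    {"name": "GENE_C", "log2fc": 0.9, "pct_in": 0.3, "pct_out": 0.0},  # kept
    {"name": "GENE_D", "log2fc": 0.4, "pct_in": 0.8, "pct_out": 0.1},  # log2FC too low
]
ordered_genes = select_for_enrichment(toy_degs)
print(ordered_genes)
```

An ordered list of this kind is what a ranked query expects; whether the ordered-query option of g:GOSt was enabled is not stated in the text.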
Trajectory inference
Using the lineage inference tool “Slingshot” (version 1.8.0)46, we calculated a differentiation trajectory and pseudotime of luminal cells. Cells from cluster 0 (immune cells) were excluded from the analysis. The values from the principal component analysis (PCA) and the Seurat-annotated clusters were used as input. Neither starting nor ending clusters were pre-defined, and trajectories were identified in an unsupervised manner.
KRT15 gene expression in breast cancer subtypes and in mice
KRT15 gene expression in breast cancer subtypes was retrieved from the gene expression database of normal and tumor tissues (GENT2)57. Gene expression data of Krt15 in the mouse mammary gland was obtained from Tabula Muris, a compendium of scRNA-seq data derived from mouse tissue44.
Colony formation assay
Using FACS as described above, PODXL+, KIT+ and KIT−/PODXL− cells were sorted into BioCoat™ Collagen I 96 Clear Well Plates (Corning, 354407) with 5 cells per well (1 cell/6.4 mm2). Cells were incubated for 3 weeks in TGFβR2i medium (DMEM (Gibco, 21068028) and Ham’s F12 Nutrient Mix (Gibco, 21765029) 3:1 with 2 mM glutamine (Sigma, G7513), 5% FCS (Sigma-Aldrich, F7524), 0.5 μg/ml hydrocortisone (Sigma-Aldrich, H-0888), 5 μg/ml insulin (Sigma-Aldrich, I6634), 10 ng/ml cholera toxin (Sigma-Aldrich, C-8052), 10 ng/ml EGF (Peprotech, AF-100-15), 180 μM adenine (Sigma-Aldrich, A3159), 10 μM Y-27632 (AbMole BioScience, M1817), 5 nM amphiregulin (Peprotech, 100-55B), 25 μM RepSox (Sigma-Aldrich, R0158), and 10 μM SB431542 (Axon Medchem, 1661))21,28. Cells were then fixed with methanol (VWR Chemicals, 20847.307) for 5 min at –20 °C and stained with hematoxylin (Sigma-Aldrich, MHS16). For representative images, cells were stained with 0.4% crystal violet (Sigma Life Sciences, C6158) in 1:1 PBS and 96% ethanol. The number of wells with colonies was counted under a microscope (Leica DM2000), using a cutoff of 15 cells to define a colony. The assay was performed with four different biopsies.
Immunohistochemistry and immunocytochemistry
Snap-frozen normal breast biopsies from reduction mammoplasties and archival TNBCs, characterized as estrogen receptor-negative, progesterone receptor-negative and human epidermal growth factor receptor 2-low/negative, and as K5-positive and/or K17-positive, were cut into 6 μm thick sections using a CryoStar NX50 cryostat (Thermo Scientific). Cryostat sections and cell smears obtained after FACS were fixed either with methanol (VWR Chemicals, 20847.307) for 5 min at –20 °C, or with 3.7% formaldehyde (Merck, 104002) for 10 min at room temperature followed by permeabilization with 0.1% Triton X-100 (Sigma-Aldrich, X-100)3,28. Sections were incubated with primary antibodies for 2 h (Supplementary Table 2) and secondary antibodies for 30 min, with PBS washes in between. Finally, ProLong™ Gold antifade reagent with DAPI (Molecular Probes, P36934) was applied. Images were acquired using confocal microscopes (Leica DM5500B equipped with a DFC550 camera, or a Zeiss LSM710 confocal system). Staining results were strictly dependent on the fixation protocol and the antibody dilutions specified in Supplementary Table 2. For the quantification of K15high cells in smears, ImageJ (version 1.53k) was used with a lower threshold of 20 to detect cells with intense K15 staining. The archival breast carcinomas that had been characterized as TNBCs84 were stained for K5 and K17 before co-staining for K14 and K15.
Statistical analysis
All statistical analyses were performed using R (version 3.6.2) in R Studio (version 1.2.5001) or GraphPad Prism (version 9.0.0). Data were tested for normal distribution using Shapiro–Wilk and Kolmogorov–Smirnov tests. Tests to determine significant differences between datasets were chosen separately for each experiment and are specified in the figure legends. Significance is indicated as follows: *p < 0.05, **p < 0.01, ***p < 0.005, ****p < 0.001.
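The significance notation above maps p-values to asterisks as in the following minimal sketch. The function name and the "ns" label for non-significant values are assumptions added for illustration; only the four cutoffs come from the text.

```python
def significance_stars(p_value):
    """Translate a p-value into the asterisk notation used in the figures."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    # Check the strictest cutoff first so the highest star count wins.
    for cutoff, stars in ((0.001, "****"), (0.005, "***"), (0.01, "**"), (0.05, "*")):
        if p_value < cutoff:
            return stars
    return "ns"  # assumed label for p >= 0.05


labels = [significance_stars(p) for p in (0.2, 0.03, 0.007, 0.002, 0.0005)]
print(labels)
```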
Acknowledgements
We thank Tove Marianne Lund, Lena Kristensen, and Anita Sharma Friismose for excellent technical assistance. Capio CFR Hospitaler (Benedikte Thuesen and Trine Foged Henriksen) and the donors are acknowledged for providing breast biopsy material. The Core Facility for Integrated Microscopy (University of Copenhagen) is acknowledged for confocal microscope accessibility. We thank Gelo Dela Cruz and the DanStem Flow Cytometry Platform for access to FCS Express to analyze flow cytometry data. Furthermore, we thank Helen Neil and the DanStem Genomics Platform for technical expertise, support, and the use of instruments. This work was supported by Novo Nordisk Fonden (NNF17CC0027852) and Danish Research Council grant 10-092798 (to DanStem), Familien Erichsens Mindefond and Vera og Carl Johan Michaelsens Legat (to J.K.), Toyota-Fonden Denmark and Anita og Tage Therkelsens Fond (to R.V.), Dagmar Marshalls Fond (to L.R.-J.), Novo Nordisk Fonden (NNF18CC0033666; to N.G.), and the Kirsten and Freddy Johansens Fond and Agnes and Poul Friis Fond (to O.W.P.).
Author contributions
K.T.K., N.G., S.D., and U.P. performed experiments. O.W.P., R.V., and J.K. designed the study. K.T.K., N.G., S.D., K.K., L.R.-J., O.W.P., R.V., and J.K. analyzed the data. K.T.K., N.G., L.R.-J., O.W.P., R.V., and J.K. wrote the manuscript. K.T.K. and N.G. contributed equally to this paper. All authors read and approved the manuscript.
Data availability
Raw scRNA-seq data are available in the European Genome-Phenome Archive (EGA) under Study ID EGAS00001005963, and all DEGs are provided as Supplementary Data 1 to 4. The source data underlying Figs. 1d, 2b, f, 3b, and 4a, b and Supplementary Figs. 1b, 3a, c, f, 5 and 6 are provided as a Source Data file.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Katharina Theresa Kohler, Nadine Goldhammer.
Supplementary information
The online version contains supplementary material available at 10.1038/s41523-022-00444-8.
References
- 1. Sørlie T, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl Acad. Sci. USA. 2001;98:10869–10874. doi: 10.1073/pnas.191367098.
- 2. Polyak K. Breast cancer: origins and evolution. J. Clin. Invest. 2007;117:3155–3163. doi: 10.1172/JCI33295.
- 3. Villadsen R, et al. Evidence for a stem cell hierarchy in the adult human breast. J. Cell Biol. 2007;177:87–101. doi: 10.1083/jcb.200611114.
- 4. Keller PJ, et al. Defining the cellular precursors to human breast cancer. Proc. Natl Acad. Sci. USA. 2012;109:2772–2777. doi: 10.1073/pnas.1017626108.
- 5. Engelsen AST, et al. AXL is a driver of stemness in normal mammary gland and breast cancer. iScience. 2020;23:101649. doi: 10.1016/j.isci.2020.101649.
- 6. Britschgi A, et al. The Hippo kinases LATS1 and 2 control human breast cell fate via crosstalk with ERalpha. Nature. 2017;541:541–545. doi: 10.1038/nature20829.
- 7. Cheung KJ, Gabrielson E, Werb Z, Ewald AJ. Collective invasion in breast cancer requires a conserved basal epithelial program. Cell. 2013;155:1639–1651. doi: 10.1016/j.cell.2013.11.029.
- 8. Kroger C, et al. Acquisition of a hybrid E/M state is essential for tumorigenicity of basal breast cancer cells. Proc. Natl Acad. Sci. USA. 2019;116:7353–7362. doi: 10.1073/pnas.1812876116.
- 9. Petersen OW, et al. Differential tumorigenicity of two autologous human breast carcinoma cell lines, HMT-3909S1 and HMT-3909S8, established in serum-free medium. Cancer Res. 1990;50:1257–1270.
- 10. Petersen OW, et al. The plasticity of human breast carcinoma cells is more than epithelial to mesenchymal conversion. Breast Cancer Res. 2001;3:213–217. doi: 10.1186/bcr298.
- 11. Petersen OW, et al. Epithelial to mesenchymal transition in human breast cancer can provide a nonmalignant stroma. Am. J. Pathol. 2003;162:391–402. doi: 10.1016/S0002-9440(10)63834-5.
- 12. Fridriksdottir AJ, et al. Proof of region-specific multipotent progenitors in human breast epithelia. Proc. Natl Acad. Sci. USA. 2017;114:E10102–E10111. doi: 10.1073/pnas.1714063114.
- 13. Arendt LM, et al. Anatomical localization of progenitor cells in human breast tissue reveals enrichment of uncommitted cells within immature lobules. Breast Cancer Res. 2014;16:453. doi: 10.1186/s13058-014-0453-3.
- 14. Honeth G, et al. Models of breast morphogenesis based on localization of stem cells in the developing mammary lobule. Stem Cell Rep. 2015;4:699–711. doi: 10.1016/j.stemcr.2015.02.013.
- 15. Tabar L, et al. A proposal to unify the classification of breast and prostate cancers based on the anatomic site of cancer origin and on long-term patient outcome. Breast Cancer (Auckl.) 2014;8:15–38. doi: 10.4137/BCBCR.S13833.
- 16. Tabar L, et al. A new approach to breast cancer terminology based on the anatomic site of tumour origin: The importance of radiologic imaging biomarkers. Eur. J. Radiol. 2022;149:110189. doi: 10.1016/j.ejrad.2022.110189.
- 17. Nguyen QH, et al. Profiling human breast epithelial cells using single cell RNA sequencing identifies cell diversity. Nat. Commun. 2018;9:2028. doi: 10.1038/s41467-018-04334-1.
- 18. Bhat-Nakshatri P, et al. A single-cell atlas of the healthy breast tissues reveals clinically relevant clusters of breast epithelial cells. Cell Rep. Med. 2021;2:100219. doi: 10.1016/j.xcrm.2021.100219.
- 19. Pal B, et al. A single-cell RNA expression atlas of normal, preneoplastic and tumorigenic states in the human breast. EMBO J. 2021;40:e107333. doi: 10.15252/embj.2020107333.
- 20. Peng S, Hebert LL, Eschbacher JM, Kim S. Single-cell RNA sequencing of a postmenopausal normal breast tissue identifies multiple cell types that contribute to breast cancer. Cancers (Basel). 2020;12. doi: 10.3390/cancers12123639.
- 21. Isberg OG, et al. A CD146 FACS protocol enriches for luminal keratin 14/19 double positive human breast progenitors. Sci. Rep. 2019;9:14843. doi: 10.1038/s41598-019-50903-9.
- 22. Goldhammer N, Kim J, Timmermans-Wielenga V, Petersen OW. Characterization of organoid cultured human breast cancer. Breast Cancer Res. 2019;21:141. doi: 10.1186/s13058-019-1233-x.
- 23. Goldhammer N, Kim J, Villadsen R, Ronnov-Jessen L, Petersen OW. Myoepithelial progenitors as founder cells of hyperplastic human breast lesions upon PIK3CA transformation. Commun. Biol. 2022;5:219. doi: 10.1038/s42003-022-03161-x.
- 24. Russo J, Rivera R, Russo IH. Influence of age and parity on the development of the human breast. Breast Cancer Res. Treat. 1992;23:211–218. doi: 10.1007/BF01833517.
- 25. Pelissier Vatter FA, et al. High-dimensional phenotyping identifies age-emergent cells in human mammary epithelia. Cell Rep. 2018;23:1205–1219. doi: 10.1016/j.celrep.2018.03.114.
- 26. Stuart T, et al. Comprehensive integration of single-cell data. Cell. 2019;177:1888–1902. doi: 10.1016/j.cell.2019.05.031.
- 27. Ginestier C, et al. ALDH1 is a marker of normal and malignant human mammary stem cells and a predictor of poor clinical outcome. Cell Stem Cell. 2007;1:555–567. doi: 10.1016/j.stem.2007.08.014.
- 28. Fridriksdottir AJ, et al. Propagation of oestrogen receptor-positive and oestrogen-responsive normal human breast cells in culture. Nat. Commun. 2015;6:8786. doi: 10.1038/ncomms9786.
- 29. Celis JE, et al. Identification of a subset of breast carcinomas characterized by expression of cytokeratin 15: relationship between CK15+ progenitor/amplified cells and pre-malignant lesions and invasive disease. Mol. Oncol. 2007;1:321–349. doi: 10.1016/j.molonc.2007.09.004.
- 30. Bernardo GM, et al. FOXA1 is an essential determinant of ER alpha expression and mammary ductal morphogenesis. Development. 2010;137:2045–2054. doi: 10.1242/dev.043299.
- 31. Lu QL, Abel P, Foster CS, Lalani EN. bcl-2: role in epithelial differentiation and oncogenesis. Hum. Pathol. 1996;27:102–110. doi: 10.1016/s0046-8177(96)90362-7.
- 32. Bausch-Fluck D, et al. The in silico human surfaceome. Proc. Natl Acad. Sci. USA. 2018;115:E10988–E10997. doi: 10.1073/pnas.1808790115.
- 33. Kim J, Villadsen R. Expression of luminal progenitor marker CD117 in the human breast gland. J. Histochem. Cytochem. 2018;66:879–888. doi: 10.1369/0022155418788845.
- 34. Anderson LH, Boulanger CA, Smith GH, Carmeliet P, Watson CJ. Stem cell marker prominin-1 regulates branching morphogenesis, but not regenerative capacity, in the mammary gland. Dev. Dyn. 2011;240:674–681. doi: 10.1002/dvdy.22539.
- 35. Lim E, et al. Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers. Nat. Med. 2009;15:907–913. doi: 10.1038/nm.2000.
- 36. Joshi PA, et al. RANK signaling amplifies WNT-responsive mammary progenitors through R-SPONDIN1. Stem Cell Rep. 2015;5:31–44. doi: 10.1016/j.stemcr.2015.05.012.
- 37. Sigl V, et al. RANKL/RANK control Brca1 mutation. Cell Res. 2016;26:761–774. doi: 10.1038/cr.2016.69.
- 38. Regan JL, et al. c-Kit is required for growth and survival of the cells of origin of Brca1-mutation-associated breast cancer. Oncogene. 2012;31:869–883. doi: 10.1038/onc.2011.289.
- 39. Nolan E, et al. RANK ligand as a potential target for breast cancer prevention in BRCA1-mutation carriers. Nat. Med. 2016;22:933–939. doi: 10.1038/nm.4118.
- 40. Sternlicht MD, et al. Mammary ductal morphogenesis requires paracrine activation of stromal EGFR via ADAM17-dependent shedding of epithelial amphiregulin. Development. 2005;132:3923–3933. doi: 10.1242/dev.01966.
- 41. Burkhardt M, et al. Cytoplasmic overexpression of ALCAM is prognostic of disease progression in breast cancer. J. Clin. Pathol. 2006;59:403–409. doi: 10.1136/jcp.2005.028209.
- 42. Tanos T, et al. Progesterone/RANKL is a major regulatory axis in the human breast. Sci. Transl. Med. 2013;5:182ra155. doi: 10.1126/scitranslmed.3005654.
- 43. Dontu G, Ince TA. Of mice and women: a comparative tissue biology perspective of breast stem cells and differentiation. J. Mammary Gland Biol. Neoplasia. 2015;20:51–62. doi: 10.1007/s10911-015-9341-4.
- 44. Tabula Muris Consortium, et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature. 2018;562:367–372. doi: 10.1038/s41586-018-0590-4.
- 45. Saeki K, et al. Mammary cell gene expression atlas links epithelial cell remodeling events to breast carcinogenesis. Commun. Biol. 2021;4:660. doi: 10.1038/s42003-021-02201-2.
- 46. Street K, et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics. 2018;19:477. doi: 10.1186/s12864-018-4772-0.
- 47. Nielsen JS, McNagny KM. Novel functions of the CD34 family. J. Cell Sci. 2008;121:3683–3692. doi: 10.1242/jcs.037507.
- 48. Forse CL, et al. Elevated expression of podocalyxin is associated with lymphatic invasion, basal-like phenotype, and clinical outcome in axillary lymph node-negative breast cancer. Breast Cancer Res. Treat. 2013;137:709–719. doi: 10.1007/s10549-012-2392-y.
- 49. Petersen OW, Høyer PE, van Deurs B. Frequency and distribution of estrogen receptor-positive cells in normal, nonlactating human breast tissue. Cancer Res. 1987;47:5748–5751.
- 50. Perou CM, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747–752. doi: 10.1038/35021093.
- 51. Herschkowitz JI, et al. Identification of conserved gene expression features between murine mammary carcinoma models and human breast tumors. Genome Biol. 2007;8:R76. doi: 10.1186/gb-2007-8-5-r76.
- 52. Parker JS, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 2009;27:1160–1167. doi: 10.1200/JCO.2008.18.1370.
- 53. Wu SZ, et al. A single-cell and spatially resolved atlas of human breast cancers. Nat. Genet. 2021;53:1334–1347. doi: 10.1038/s41588-021-00911-1.
- 54. van de Rijn M, et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am. J. Pathol. 2002;161:1991–1996. doi: 10.1016/S0002-9440(10)64476-8.
- 55. Lehmann BD, et al. Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies. J. Clin. Invest. 2011;121:2750–2767. doi: 10.1172/JCI45014.
- 56. Tang P, Tse GM. Immunohistochemical surrogates for molecular classification of breast carcinoma: a 2015 update. Arch. Pathol. Lab. Med. 2016;140:806–814. doi: 10.5858/arpa.2015-0133-RA.
- 57. Park SJ, Yoon BH, Kim SK, Kim SY. GENT2: an updated gene expression database for normal and tumor tissues. BMC Med. Genomics. 2019;12:101. doi: 10.1186/s12920-019-0514-7.
- 58. Moreira JM, et al. Tissue proteomics of the human mammary gland: towards an abridged definition of the molecular phenotypes underlying epithelial normalcy. Mol. Oncol. 2010;4:539–561. doi: 10.1016/j.molonc.2010.09.005.
- 59. Garbe JC, et al. Accumulation of multipotent progenitors with a basal differentiation bias during aging of human mammary epithelia. Cancer Res. 2012;72:3687–3701. doi: 10.1158/0008-5472.CAN-12-0157.
- 60. Shalabi SF, et al. Evidence for accelerated aging in mammary epithelia of women carrying germline BRCA1 or BRCA2 mutations. Nat. Aging. 2021;1:838–849. doi: 10.1038/s43587-021-00104-9.
- 61. Chen W, et al. Single-cell landscape in mammary epithelium reveals bipotent-like cells associated with breast cancer risk and outcome. Commun. Biol. 2019;2:306. doi: 10.1038/s42003-019-0554-8.
- 62. Rosenbluth JM, et al. Organoid cultures from normal and cancer-prone human breast tissues preserve complex epithelial lineages. Nat. Commun. 2020;11:1711. doi: 10.1038/s41467-020-15548-7.
- 63. Santagata S, et al. Taxonomy of breast cancer based on normal cell phenotype predicts outcome. J. Clin. Invest. 2014;124:859–870. doi: 10.1172/JCI70941.
- 64. Hu T, Zhao G, Liu Y, Long M. A machine learning approach to differentiate two specific breast cancer subtypes using androgen receptor pathway genes. Technol. Cancer Res. Treat. 2021;20:15330338211027900. doi: 10.1177/15330338211027900.
- 65. Molyneux G, et al. BRCA1 basal-like breast cancers originate from luminal epithelial progenitors and not from basal stem cells. Cell Stem Cell. 2010;7:403–417. doi: 10.1016/j.stem.2010.07.010.
- 66. Proia TA, et al. Genetic predisposition directs breast cancer phenotype by dictating progenitor cell fate. Cell Stem Cell. 2011;8:149–163. doi: 10.1016/j.stem.2010.12.007.
- 67. Visvader JE, Smith GH. Murine mammary epithelial stem cells: discovery, function, and current status. Cold Spring Harb. Perspect. Biol. 2011;3. doi: 10.1101/cshperspect.a004879.
- 68. Deng G, Lu Y, Zlotnikov G, Thor AD, Smith HS. Loss of heterozygosity in normal tissue adjacent to breast carcinomas. Science. 1996;274:2057–2059. doi: 10.1126/science.274.5295.2057.
- 69. Morsing M, et al. Fibroblasts direct differentiation of human breast epithelial progenitors. Breast Cancer Res. 2020;22:102. doi: 10.1186/s13058-020-01344-0.
- 70. Yoshida S, et al. Cytokeratin 15 can be used to identify the limbal phenotype in normal and diseased ocular surfaces. Invest. Ophthalmol. Vis. Sci. 2006;47:4780–4786. doi: 10.1167/iovs.06-0574.
- 71. Liu Y, Lyle S, Yang Z, Cotsarelis G. Keratin 15 promoter targets putative epithelial stem cells in the hair follicle bulge. J. Invest. Dermatol. 2003;121:963–968. doi: 10.1046/j.1523-1747.2003.12600.x.
- 72. Giroux V, et al. Long-lived keratin 15+ esophageal progenitor cells contribute to homeostasis and regeneration. J. Clin. Invest. 2017;127:2378–2391. doi: 10.1172/JCI88941.
- 73. Aldehlawi H, et al. The monoclonal antibody EPR1614Y against the stem cell biomarker keratin K15 lacks specificity and reacts with other keratins. Sci. Rep. 2019;9:1943. doi: 10.1038/s41598-018-38163-5.
- 74. Eirew P, et al. A method for quantifying normal human mammary epithelial stem cells with in vivo regenerative ability. Nat. Med. 2008;14:1384–1389. doi: 10.1038/nm.1791.
- 75. Bose A, Teh MT, Mackenzie IC, Waseem A. Keratin k15 as a biomarker of epidermal stem cells. Int. J. Mol. Sci. 2013;14:19385–19398. doi: 10.3390/ijms141019385.
- 76. Bose A, et al. Two mechanisms regulate keratin K15 expression in keratinocytes: role of PKC/AP-1 and FOXM1 mediated signalling. PLoS ONE. 2012;7:e38599. doi: 10.1371/journal.pone.0038599.
- 77. Ikeda J, et al. Prognostic significance of CD55 expression in breast cancer. Clin. Cancer Res. 2008;14:4780–4786. doi: 10.1158/1078-0432.CCR-07-1844.
- 78. Pal B, et al. Construction of developmental lineage relationships in the mouse mammary gland by single-cell RNA profiling. Nat. Commun. 2017;8:1627. doi: 10.1038/s41467-017-01560-x.
- 79. Snyder KA, et al. Podocalyxin enhances breast tumor growth and metastasis and is a target for monoclonal antibody therapy. Breast Cancer Res. 2015;17:46. doi: 10.1186/s13058-015-0562-7.
- 80. Frose J, et al. Epithelial-mesenchymal transition induces podocalyxin to promote extravasation via Ezrin Signaling. Cell Rep. 2018;24:962–972. doi: 10.1016/j.celrep.2018.06.092.
- 81. Virtanen S, Schulte R, Stingl J, Caldas C, Shehata M. High-throughput surface marker screen on primary human breast tissues reveals further cellular heterogeneity. Breast Cancer Res. 2021;23:66. doi: 10.1186/s13058-021-01444-5.
- 82. Stampfer M, Hallowes RC, Hackett AJ. Growth of normal human mammary cells in culture. In Vitro. 1980;16:415–425. doi: 10.1007/BF02618365.
- 83. Rønnov-Jessen L, Petersen OW. Induction of alpha-smooth muscle actin by transforming growth factor-beta 1 in quiescent human breast gland fibroblasts. Implications for myofibroblast generation in breast neoplasia. Lab. Invest. 1993;68:696–707.
- 84. Bechmann MB, Brydholm AV, Codony VL, Kim J, Villadsen R. Heterogeneity of CEACAM5 in breast cancer. Oncotarget. 2020;11:3886–3899. doi: 10.18632/oncotarget.27778.
- 85. Barkas N, et al. Joint analysis of heterogeneous single-cell RNA-seq dataset collections. Nat. Methods. 2019;16:695–698. doi: 10.1038/s41592-019-0466-z.
- 86. Gautier L, Cope L, Bolstad BM, Irizarry RA. affy-analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20:307–315. doi: 10.1093/bioinformatics/btg405.
- 87. Ritchie ME, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43:e47. doi: 10.1093/nar/gkv007.
- 88. Gendoo DM, et al. Genefu: an R/Bioconductor package for computation of gene expression-based signatures in breast cancer. Bioinformatics. 2016;32:1097–1099. doi: 10.1093/bioinformatics/btv693.
- 89. Raudvere U, et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 2019;47:W191–W198. doi: 10.1093/nar/gkz369.