Supplemental Information

Table of contents:

Supplemental Figures and Legends
  Figure S1: Frequent observation of CD44+CD24neg/low surface expression in ER negative breast cancer lines and tumor cultures
  Figure S2: ESA expression and Aldefluor activity are observed in subsets of CD44+CD24low+ cells
  Figure S3: CD44+CD24low+ give rise to both CD44+CD24low+ and CD44+CD24neg progeny while CD44+CD24neg yield only CD44+CD24neg
  Figure S4: Expression of lung, brain and bone metastasis gene expression signatures is enriched in CD44+CD24low+ subpopulations from MDA-MB-231 and DT-22
  Figure S5: Expression of embryonic stem cell transcription factors and targets is enriched in CD44+CD24low+ subpopulations
  Figure S6: GSI sensitivity is restricted to the CD44+CD24low+
Legends for Table S1 (Dataset)
  Table S1: Individual gene scores from GSA in sorted CD44+CD24low+ compared to CD44+CD24neg subpopulations from MDA-MB-231 and DT-22 cells
Supplemental Methods
  Chromatin immunoprecipitation assay
  Real time PCR primer sequences
  Microarray data acquisition and analysis
  Microarray data processing and normalization
  Defining a NOTCH target gene set
  Gene set enrichment analysis

Figure S1: Frequent observation of CD44+CD24neg/lo surface expression in ER negative breast cancer lines and tumor cultures. A. ER negative breast cancer cell lines. B. TNBC-derived dissociated tumor (DT) cultures. C. ER positive lines. D. Equal numbers of mixed MDA-MB-231 and MCF7 cells.

Figure S2. ESA expression and Aldefluor activity are observed in subsets of CD44+CD24low+ cells. A. CD44 and CD24 in DT-25 at passage 3 (P3) compared to DT-25 at passage 9 (P9). B. Serial mammospheres formed from sorted CD44+CD24low+ and CD44+CD24neg cells from DT-25. Mean +/- SEM, *p=0.0054. C. Aldefluor activity (top) and flow cytometry for surface ESA (bottom) were assayed as described in MDA-MB-231 and DT-22. Respective unstained controls are shown. D. Cells were gated for ESA+ (left) or ALDH1+ (right) and CD44/CD24 assayed.

Figure S3. CD44+CD24low+ give rise to both CD44+CD24low+ and CD44+CD24neg progeny while CD44+CD24neg yield only CD44+CD24neg. CD44+CD24low+ or CD44+CD24neg were sorted from DT-22 and DT-25 and 100,000 cells cultured. A, D. Population growth from sorted CD44+CD24low+ and CD44+CD24neg. B, E.
The proportion of CD44+CD24low+ or CD44+CD24neg cells arising from CD44+CD24low+ cells is shown over time. C, F. Growth curves of progeny of CD44+CD24low+ over 14 days. As for MDA-MB-231 (Figure 3A), CD44+CD24neg generated only CD44+CD24neg cells over 14 days (not shown here).

[Figure S3, panels A-F: DT-22 and DT-25. Axes: Days vs. Cell # x 10^4 (CD24neg and CD24low+; CD24neg progeny and CD24low+ progeny); Days vs. % Cells (CD24neg and CD24low+ progeny of CD24low+).]

[Figure S4, panels A-C: MDA-MB-231. Panel titles: Lung Up, Brain Up, Bone Up.]

Figure S4. Expression of lung, brain and bone metastasis gene expression signatures is enriched in CD44+CD24low+ subpopulations from MDA-MB-231 and DT-22. A, B, C. Expression of lung (A), brain (B) and bone (C) metastasis gene expression signatures is enriched in CD44+CD24low+ subpopulations in MDA-MB-231. Gene set analysis (GSA) compared enrichment of indicated signatures. Shown are ordered gene scores for each gene in the line plot and the average fold change in the heatmap (orange indicates high expression in CD44+CD24low+ and blue is low). Average fold gene expression changes are indicated by bar graphs. Upper left panel shows enrichment score and the p-value. D, E. Expression of lung (D) and brain (E) metastasis gene expression signatures is enriched in CD44+CD24low+ subpopulations in DT-22; analysis as above.
A list of the genes used for each signature and their individual gene scores is shown in Supporting Information Table 1.

[Figure S4, panels D and E: DT-22. Panel titles: Lung Up, Brain Up. Heatmap gene labels, in extraction order: PTGS2, ID2, FSCN1, CXCL1, KRT81, MAN1A1, TNC, KYNU, LTBP1, EREG, VCAM1, ANGPTL4, MMP1, PTGS2, HBEGF, CSF3, B4GALT6, SEPP1, LTBP1, PELI1, PLOD2, COL13A1, ANGPTL4, LAMA4, MMP1.]

[Figure S5, panels A-C: MDA-MB-231. Panel titles: ES Up, NOS TFs Up, NOS Targets Up.]

Figure S5. Expression of embryonic stem cell transcription factors and targets is enriched in CD44+CD24low+ subpopulations. A-F. GSA comparing enrichment of indicated embryonic stem cell transcription factors (A, B and D, E) and genes upregulated by NOS (Nanog, Oct4 and Sox2) (C, F). The “NOS TF up” is a gene set of transcriptional regulators identified in human ES that overexpress Nanog, Oct4, or Sox2. The “NOS Targets up” gene set contains the activated genes from the ChIP-array for Nanog, Oct4, and Sox2. A list of the genes used for each signature and their individual gene scores is shown in Supplementary Table 1. For MDA-MB-231 (A-C) and DT-22 (D-F), the ordered scores for each gene in the line plot and the average fold change in the heatmap are shown (orange indicates high expression in CD44+CD24low+ and blue, low). Enrichment scores and p-values are at upper left. Average fold gene expression changes are shown in bar graphs, and a list of genes and their individual gene scores is in Supporting Information Table 1.
[Figure S6, panels A-J. Labels recovered from the figure: immunoblots for N1-ICD, Sox2 and β-actin at C, 5 and 10 µM DAPT (MDA-MB-231, DT-22); Spheres / 10^4 cells for 1°, 2° and 3° mammospheres, Ctrl vs. DAPT, CD24low+ and CD24neg (MDA-MB-231, DT-22, DT-25); BrdU vs. PI cell cycle profiles (2N, 4N), Ctrl vs. RO (10 µM), CD24low+ and CD24neg (MDA-MB-231); Days vs. Cell # x 10^4 for CD24low+, CD24neg, CD24low+/RO and CD24neg/RO (MDA-MB-231, DT-22); Invasion.]

Figure S6. GSI sensitivity is restricted to the CD44+CD24low+. A-E. CD44+CD24low+ (CD24low+) and CD44+CD24neg (CD24neg) cells sorted from MDA-MB-231 (A, B), DT-22 (C, D) and DT-25 (E) were treated with or without DAPT. Effect of DAPT on cleaved Notch 1 (N1-ICD) and Sox2 after 24 hrs with or without 5 or 10 µM DAPT in CD44+CD24low+ cells (A & C). Serial mammospheres of indicated cells from sorted populations +/- 5 µM DAPT (Mean ± SEM, Student’s t-test). The p-values in panel B: *p=0.00001; **p=0.00002; ***p=0.0001; panel D: *p=0.0045; **p=0.006; ***p=0.0054; panel E: *p=0.0027; **p=0.0005; ***p=0.007. Serial mammospheres from DAPT-treated CD24low+ cells were significantly reduced compared to respective untreated controls. (*) denotes means statistically different by Student’s t test from control primary spheres. Treated and untreated tertiary mammospheres from CD24neg cells were reduced significantly compared to primary spheres, but both were unaffected by DAPT (B, D & E). F. Mean soft agar colonies arising from sorted populations of MDA-MB-231 +/- 5 µM DAPT (Mean +/- SEM, t test, *p=0.04). G. Matrigel invasion of sorted populations from DT-22 +/- 10 µM RO4929097 (RO) generated by xCELLigence real time cell analysis, graphed as Mean ± SEM. CD24low+ cells show significantly greater invasion compared to CD24neg cells (**p=0.0055 at T=12 hrs, Student’s t test).
RO significantly attenuated invasion (*p=0.004 at T=12 hrs, Student’s t test) of CD24low+ cells. CD24neg cells were unaffected. H. Cell cycle profiles of sorted populations from MDA-MB-231 after 48 hrs +/- 10 µM RO4929097. I, J. GSI does not affect MDA-MB-231 or DT-22 cell proliferation. Proliferation curves of 100,000 cells from sorted CD44+CD24low+ or CD44+CD24neg populations from MDA-MB-231 (I) and DT-22 (J) cultured over 12 days +/- 10 µM RO4929097.

Legend for Table S1 (Dataset)

Table S1: Individual gene scores from GSA results showing enrichment of various gene signatures in sorted CD44+CD24low+ compared to CD44+CD24neg subpopulations from MDA-MB-231 and DT-22 cells. Higher gene scores indicate increased expression while lower gene scores indicate decreased expression.

Supplemental Methods

Chromatin Immunoprecipitation Assay: Cells were treated with 2% formaldehyde for 10 min at 22°C for ChIP assay with anti-cleaved Notch1 antibody (Cell Signaling) or control IgG. Immunoprecipitated DNA corresponding to –770 to –616 from the transcriptional start site of the human Sox2 promoter was amplified by PCR using the primer pair:
SOX2-F: GCCAAAGAGCTGAGTTGGAC
SOX2-R: CCCAAACCTCTGTCCTCAAA
Two regions of the mouse Sox2 promoter were amplified by PCR using the following primer pairs:
mSOX2(1)-F: CTGTGGTTGCTCTTTGTAGCA
mSOX2(1)-R: TGTAGGGGCACCTTCATTTT
mSOX2(2)-F: CCTAGGAAAAGGCTGGGAAZ
mSOX2(2)-R: CACTCACCCCCTCTTCTCAC
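Primer strings are easy to corrupt when they are copied between documents, so a quick check of length, GC content and non-ACGT characters can be useful before ordering or aligning them. The following is a minimal Python sketch and not part of the original methods: the function name `check_primer` and the report layout are illustrative assumptions, and the sequences are copied from the ChIP primer list above.

```python
# Illustrative sanity check for primer strings; names here are hypothetical.
VALID_BASES = set("ACGT")


def check_primer(name, seq):
    """Return length, GC fraction and any non-ACGT characters for one primer."""
    seq = seq.strip().upper()
    invalid = sorted(set(seq) - VALID_BASES)
    gc = sum(base in "GC" for base in seq) / len(seq)
    return {"name": name, "length": len(seq), "gc": round(gc, 2), "invalid": invalid}


# Sequences as listed for the ChIP assay above.
CHIP_PRIMERS = {
    "SOX2-F": "GCCAAAGAGCTGAGTTGGAC",
    "SOX2-R": "CCCAAACCTCTGTCCTCAAA",
    "mSOX2(1)-F": "CTGTGGTTGCTCTTTGTAGCA",
    "mSOX2(1)-R": "TGTAGGGGCACCTTCATTTT",
    "mSOX2(2)-F": "CCTAGGAAAAGGCTGGGAAZ",
    "mSOX2(2)-R": "CACTCACCCCCTCTTCTCAC",
}

REPORT = {name: check_primer(name, seq) for name, seq in CHIP_PRIMERS.items()}

for row in REPORT.values():
    flag = "  <-- check against the source" if row["invalid"] else ""
    print(f'{row["name"]}: {row["length"]} nt, GC {row["gc"]}{flag}')
```

Any entry with a non-empty `invalid` list contains a character that is not a standard nucleotide code and should be verified against the original publication before use.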
Real Time PCR Primer Sequences: The primers used to amplify NOTCH1, NOTCH2, NOTCH3, NOTCH4, NANOG, SOX2, JAGGED1, HEY1, GAPDH and HPRT (F = forward; R = reverse) were as follows:
hNotch1-F: GAAGAACGGGGCTAACAAAGAT
hNotch1-R: GTCCATATGATCCGTGATGTCC
hNotch2-F: AGCTACTGTGAGGAGCAACTCG
hNotch2-R: GATTCTGGCACTCATCCACTTC
hNotch3-F: TCAAAAATGGAGCCAATAAGGA
hNotch3-R: AAAGTGGTCCAACAGCAGCTT
hNotch4-F: GATAAAGATGCCCAGGACAACA
hNotch4-R: GTCAGCAGATCCCAGTGGTTAC
hNanog-F: GATCGGGCCCGCCACCATGAGTGTGGATCCAGCTTG
hNanog-R: GATCGAGCTCCATCTTCACACGTCTTCAGGTTG
hSox2-F: CCTCCGGGACATGATCAG
hSox2-R: TTCTCCCCCCTCCAGTTC
hJAGGED1-F: ATCCTCGAGAGCACCAGCGCGAACAGCAG
hJAGGED1-R: ATCGAATTCCCCGCGGTCTGCTATACGAT
hHEY1-F: ATCACCCACACATCGCACACCC
hHEY1-R: ACTAGGGGGCGCTCGCAAGG
mSox2-F: TCAAGGCAGAGAAGAGAGTGTTTGC
mSox2-R: GAAGCGGAGCTCGAGACGGG
hGAPDH-F: ACCCAGAAGACTGTGGATGG
hGAPDH-R: TCTAGACGGCAGGTCAGGTC
mHPRT-F: CACAGGACTAGAACACCTGC
mHPRT-R: GCTGGTGAAAAGGACCTC

Microarray Data Acquisition and Analysis: RNA was isolated with the miRNeasy kit (Qiagen), quantified on a Nanodrop 8000 Spectrophotometer (Thermo Scientific, Wilmington), and quality-verified with the RNA 6000 Nano kit (Agilent, Santa Clara, CA) on a Bioanalyzer 2100. Expression analysis used the Illumina platform (see Supplemental Methods). Biotinylated cRNA was prepared from 400 ng total RNA using the Illumina TotalPrep RNA Amplification Kit (Ambion, Inc., Austin, TX) per the manufacturer's instructions. Samples were added to the BeadChip after randomization using a randomized block design to reduce batch effects. Hybridization to the Sentrix Human-HT12 Expression BeadChip (Illumina, Inc., San Diego, CA), washing and scanning were per the Illumina BeadStation 500 manual (revision C). Microarray data analysis used Illumina GenomeStudio software.

Microarray data processing and normalization: Microarray data processing and analysis used the R language and environment for statistical computing, version 2.13, and Bioconductor, version 2.8.
Bead-summary expression data for the Illumina HumanHT-12 v4 BeadChip were normalized to correct for differences in expression within and between chips using the variance stabilization and normalization (vsn) method as implemented in the beadarray R package version 2.2.0. Data for MDA-MB-231 and DT-22 cell lines were normalized separately. The illuminaHumanv4.db package version 1.10.0 was used to obtain probe mappings to official gene symbols. If multiple probes corresponded to the same gene, the probe with the highest variance was used. In this way, the expression matrix was reduced so that each expression value corresponds to a single gene annotated by the official gene symbol.

Defining a NOTCH target gene set: A NOTCH targets gene set was defined by GSI washout of a metastatic MDA-MB-231 variant treated with 10 μM DAPT for 48 hrs, washed three times, and cultured four more hrs in 10 mg/ml of cycloheximide. TRIzol-extracted RNA was reverse transcribed and used with custom TaqMan RT-PCR array cards for the candidate NOTCH target genes. Genes showing an upregulation of at least 1.5 fold in response to GSI washout were considered target genes.

Gene Set Enrichment Analysis: MDA-MB-231 lines with discrete metastatic tissue tropisms have been used to define gene expression signatures for lung, bone, or brain metastasis (Bos et al, 2009; Kang et al, 2003; Minn et al, 2005). All three metastasis signatures were compared with the CD24 low and negative expression profiles. For the lung and brain metastasis gene signatures, the 18-gene and 17-gene versions used in clinical outcome analysis (Bos et al, 2009; Minn et al, 2005) were used, respectively. Probe identifiers were mapped to the official gene symbol. Only genes in the metastasis signatures that showed upregulation compared to parental MDA-MB-231 were included in the GSA of CD44+CD24low+ and CD44+CD24neg populations. For the embryonic stem cell signatures (Li YQ, 2010), the published gene lists were used and mapped to official gene symbols.
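The probe-to-gene reduction described under "Microarray data processing and normalization" (one probe kept per gene symbol, chosen by highest variance) can be illustrated with a short Python sketch. This is an illustration only: the original analysis was done in R, and the function name `collapse_probes`, the probe identifiers and the expression values below are all invented for the example.

```python
from statistics import variance


def collapse_probes(expr, probe_to_gene):
    """Keep, for each gene symbol, the probe with the highest variance.

    expr: dict mapping probe ID -> list of normalized expression values
          (one value per sample).
    probe_to_gene: dict mapping probe ID -> official gene symbol.
    """
    best = {}  # gene symbol -> (variance, probe ID)
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # assumption of this sketch: unannotated probes are dropped
        var = variance(values)
        # On ties the first probe encountered is kept (another assumption).
        if gene not in best or var > best[gene][0]:
            best[gene] = (var, probe)
    return {gene: expr[probe] for gene, (_, probe) in best.items()}


# Toy data: probe IDs and values are hypothetical.
EXPR = {
    "probe_1": [7.0, 7.1, 6.9, 7.0],
    "probe_2": [5.0, 8.0, 5.5, 8.5],
    "probe_3": [9.0, 9.2, 9.1, 9.3],
}
PROBE_TO_GENE = {"probe_1": "SOX2", "probe_2": "SOX2", "probe_3": "HEY1"}

GENE_EXPR = collapse_probes(EXPR, PROBE_TO_GENE)
print(GENE_EXPR)
```

In the toy data two probes map to the same symbol, and only the more variable one survives, so the resulting matrix has exactly one row per gene symbol.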
We examined the signatures characteristic of human embryonic stem (hES) cells (Liu et al, 2007), of genes whose promoters are bound and activated by Nanog, Oct4 and Sox2 (NOS targets) in hES, and of the subset of NOS targets encoding transcriptional factors (NOS TFs) (Liu et al 2006; Li YQ, 2010). The gene signatures from ES, NOS-targets and NOS-TFs were compared with the genes expressed in CD44+CD24low+ and CD44+CD24neg populations. The NOTCH targets gene set was identified by comparing genes expressed before and after GSI washout of a metastatic MDA-MB-231 variant. NOTCH target genes upregulated after GSI withdrawal were compared with those expressed in CD44+CD24low+ and CD44+CD24neg populations. Finally, genes differentially expressed in other stem cell enriched populations from cell lines and/or primary tumors (Dalerba et al, 2007; Frank et al, 2010; O’Brien et al, 2009) were separated into up- and down-regulated gene sets based on the published fold-change differences and also examined for enrichment in our CD44+CD24low+ and CD44+CD24neg populations.

These collections of gene signatures were used in Gene Set Analysis (GSA) as implemented in the GSA R package version 1.03. For GSA, a two-class paired comparison between CD44+CD24low+ and CD44+CD24neg cells was used, with the maxmean method and re-standardization based on all genes in the microarray data set. Gene sets showing positive or negative enrichment were deemed significant if the false discovery rate and nominal p-value were less than 0.05 using 1000 permutations. GSA was performed separately for the MDA-MB-231 subpopulations and the DT-22 subpopulations.
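To make the maxmean idea concrete, the sketch below computes a maxmean statistic for one gene set from paired per-gene scores and estimates a nominal p-value by flipping the sign of whole pairs. It is a simplified Python illustration under stated assumptions, not the GSA R package: re-standardization and false discovery rate estimation are omitted, the small constant `s0` is an arbitrary guard against division by zero, and all function names, gene names and values are hypothetical.

```python
import math
import random


def paired_t(diffs, s0=1e-3):
    """Paired t-like score for one gene from per-pair differences.

    s0 is a small constant added to the denominator (value arbitrary here).
    """
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / (math.sqrt(var / n) + s0)


def maxmean(scores):
    """Mean positive part vs. mean negative part; return the larger, signed."""
    m = len(scores)
    pos = sum(max(s, 0.0) for s in scores) / m
    neg = sum(max(-s, 0.0) for s in scores) / m
    return pos if pos >= neg else -neg


def gene_set_pvalue(diffs_by_gene, gene_set, n_perm=1000, seed=0):
    """Observed maxmean and a two-sided sign-flip permutation p-value."""
    genes = [g for g in gene_set if g in diffs_by_gene]
    if not genes:
        raise ValueError("no genes from the set are present in the data")
    rng = random.Random(seed)
    n_pairs = len(diffs_by_gene[genes[0]])
    observed = maxmean([paired_t(diffs_by_gene[g]) for g in genes])
    hits = 0
    for _ in range(n_perm):
        # Flip the sign of each pair's difference for all genes at once.
        signs = [rng.choice((-1, 1)) for _ in range(n_pairs)]
        permuted = maxmean([
            paired_t([s * d for s, d in zip(signs, diffs_by_gene[g])])
            for g in genes
        ])
        if abs(permuted) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy paired differences (CD24low+ minus CD24neg); values are invented.
DIFFS = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1],
    "GENE_B": [0.8, 1.1, 1.0, 0.7],
    "GENE_C": [-0.2, 0.1, 0.0, -0.1],
}
OBSERVED, PVALUE = gene_set_pvalue(DIFFS, ["GENE_A", "GENE_B"], n_perm=200)
print(OBSERVED, PVALUE)
```

With only a few pairs the number of distinct sign patterns is small, so the attainable p-values are coarse; the sketch is meant to show the shape of the calculation rather than to reproduce the reported enrichment scores.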