Author manuscript; available in PMC: 2011 Jun 27.
Published in final edited form as: Breast Cancer Res Treat. 2010 Sep 15;125(3):879–883. doi: 10.1007/s10549-010-1159-6

Gene expression array testing of FFPE archival breast tumor samples: an optimized protocol for WG-DASL® sample preparation

CC Ton 1,2, N Vartanian 1, X Chai 2, MG Lin 1, X Yuan 1, KE Malone 2, CI Li 2, A Dawson 3, C Sather 3, J Delrow 3, L Hsu 2, PL Porter 1,2,4,*
PMCID: PMC3124315  NIHMSID: NIHMS264058  PMID: 20842525

Abstract

Archived formalin-fixed, paraffin-embedded (FFPE) tissues constitute a vast, well-annotated, but unexploited resource for the molecular study of cancer progression, largely because degradation, chemical modification, and cross-linking render FFPE RNA a suboptimal substrate for conventional analytical methods. We report here a modified protocol for RNA extraction from FFPE tissues that achieved a 100% success rate in the expression profiling of a set of 60 breast cancer samples on the WG-DASL platform. The resulting data were of sufficient quality that, in hierarchical clustering, (a) 12/12 (100%) replicates correctly identified their respective counterparts, with a high self-correlation (r = 0.979), and (b) the overall sample set grouped with high specificity into ER+ (38/40; 95%) and ER- (18/20; 90%) subtypes. These results indicate that a large fraction of decade-old FFPE samples, of diverse institutional origins and processing histories, can yield RNA suitable for gene expression profiling experiments.

Keywords: FFPE RNA extraction, WG-DASL, gene expression profiling

Introduction

Formalin-fixed, paraffin-embedded (FFPE) tissues constitute a vast, well-annotated, but untapped source of archived clinical material for the molecular study of cancer [6], largely because chemically modified and degraded nucleic acids in such tissues are poor substrates for conventional downstream analytical methods, particularly array methods dependent on 3′-oligo-dT-primed reverse transcription (RT-PCR) [16]. Conventional microarray gene expression profiling platforms adversely affected by FFPE RNA include the Agilent 22K GeneChip, with a reported success rate of only about 24% on breast cancer samples [17], although more recent platforms, e.g. the Affymetrix Plus 2.0 and Exon 1.0, have performed better [15].

Of great promise in mitigating such difficulties is the DASL (cDNA-mediated annealing, selection, extension and ligation) assay (Illumina, CA), a technology based on highly multiplexed RT-PCR applied in a bead-microarray format [4]. Illumina’s implementation of this underlying technology has rendered the DASL assay relatively unbiased and robust when applied to degraded FFPE RNA samples from breast [1], [11], [16], [19] and other tissues [3], [5], [12], [14]. A more recent version of the method, WG-DASL (whole-genome DASL), capable of addressing 24,000 RefSeq sequences, encompassing the majority of the human transcriptome, has been applied successfully to a small number of FFPE ovarian cancer samples [2].

Microarray-based gene expression studies using fresh frozen tumor samples have shown that breast cancer can be classified into ‘intrinsic’ molecular subtypes on the basis of distinct transcription profiles, thereby providing objective, quantitative criteria for resolving disease heterogeneity and for patient stratification [9], [13], [18], [21]. Methods that provide reliable and accurate gene expression profiling of FFPE samples would add greatly to our understanding of the significance of breast cancer subtypes. We show here that improved RNA extraction protocols, in conjunction with the WG-DASL method, can achieve an optimal success rate in gene expression profiling of FFPE samples.

Materials and Methods

In this study, we compared two RNA extraction methods, focusing on yield, purity, and RT-PCR performance: (a) a modified protocol based on the Qiagen RNeasy FFPE kit (Valencia, CA, USA) (Table 1) and (b) the Roche Hi Pure FFPE RNA kit (Roche, Mannheim, Germany), which is the method recommended by Illumina (San Diego, CA, USA) for DASL applications. Modifications to the manufacturer’s baseline (Qiagen) protocol included (a) a double xylene deparaffinization step, (b) a double ethanol wash, and (c) extension of the proteinase K digestion from 15 min to 2 h, with continuous agitation at 55 °C and 500 rpm. The modified Qiagen method was used on 60 AJCC stage I or II invasive ductal breast tumors, 1 cm or larger, diagnosed between 1993 and 1999 (11–17 years old) as part of a population-based cohort study [20], comprising 40 estrogen receptor-positive (ER+; by immunohistochemistry) and 20 ER- tumors. A similar set of 36 FFPE ER+ tumors was extracted with the Roche Hi Pure FFPE RNA kit. H&E sections were marked to exclude non-tumor tissue prior to sectioning with either method.

Table 1.

Modified Qiagen RNeasy FFPE RNA extraction protocol for WG-DASL assay.

Modified FFPE RNA Extraction Protocol
  • Cut ≥3 × 6 μm sections per block and place in a sterile, RNase-free 1.5 mL microcentrifuge tube.

  • Deparaffinize with two sequential 100% xylene washes, with vigorous intermittent vortexing and incubation at room temperature for 5 min. Centrifuge to recover pellets between washes.

  • Perform two washes with absolute ethanol, with vigorous intermittent vortexing and incubation at room temperature for 5 min. Centrifuge to recover pellets between washes.

  • Air dry pellets thoroughly at 50 °C for 5 min in an Eppendorf Thermomixer.

  • Digest with proteinase K at 55 °C for 2 h with constant agitation at 500 rpm in the Thermomixer.

  • Heat pulse at 80 °C for 15 min, with agitation, to reverse cross-linking.

  • Continue purifying RNA according to original Qiagen RNeasy FFPE kit protocol.

  • Perform Nanodrop UV spectrophotometry/quantitation and RPL13A TaqMan real-time PCR.

  • Apply samples in the WG-DASL procedure at ≥200 ng input RNA, A260/A280 ≥1.8, and RPL13A Ct ≤29.
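
The acceptance criteria in the final step of Table 1 can be written as a simple check. The Python sketch below is illustrative only and is not part of the published protocol; the `RnaSample` record, its field names, and the example values are hypothetical, and only the three thresholds come from Table 1.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    """Hypothetical record of the three QC readouts named in Table 1."""
    sample_id: str
    total_ng: float     # total RNA available, in ng
    a260_a280: float    # Nanodrop purity ratio
    rpl13a_ct: float    # RPL13A real-time PCR cycle threshold


def qc_failures(sample, min_ng=200.0, min_ratio=1.8, max_ct=29.0):
    """Return the list of Table 1 criteria that the sample does not meet."""
    failures = []
    if sample.total_ng < min_ng:
        failures.append("input RNA below 200 ng")
    if sample.a260_a280 < min_ratio:
        failures.append("A260/A280 below 1.8")
    if sample.rpl13a_ct > max_ct:
        failures.append("RPL13A Ct above 29")
    return failures


def passes_table1_criteria(sample):
    """True when all three Table 1 thresholds are met."""
    return not qc_failures(sample)


# Made-up values for illustration, not data from the study.
good = RnaSample("example-A", total_ng=8040.0, a260_a280=1.92, rpl13a_ct=27.5)
late = RnaSample("example-B", total_ng=1200.0, a260_a280=1.85, rpl13a_ct=33.0)
```

Returning the failed criteria, rather than a single pass/fail flag, makes it easy to record why a sample would have been excluded while still carrying it forward to the array.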

TaqMan real-time PCR analysis of RPL13A is currently the quality control standard for predicting RNA performance on the DASL assay [9], with RT-PCR cycle threshold (Ct) values of ≤29 generally taken to signify RNA of sufficient quality to give reproducible results on DASL [1], [7]. Here, RNA integrity was assessed by RT-PCR for RPL13A using the QuantiTect Primer Assay QT00089915 (Qiagen). All 60 tumor samples, regardless of Ct value, were processed according to the vendor’s instructions and hybridized to Illumina whole-genome arrays (Human-Ref-8 BeadChips, v.3) at the recommended input concentration of 200 ng/μL. For the modified Qiagen method, 12 replicates were applied to the arrays with doubled RNA input, at 400 ng/μL. Of the 36 samples processed by the Roche method, 20 (56%) passed RT-PCR pre-qualification (Ct ≤29), and 10 of these 20 were tested on the WG-DASL array at 200 ng/μL.

The raw array data were exported via GenomeStudio v1.0 software to the Bioconductor lumi package [8] in the R v2.9.2 statistical environment for quantile normalization and log2 transformation. Signal intensity filtering was applied to exclude probes with signals below the 75th percentile of the negative controls. The data were then subjected to “shorth” variance filtering (Bioconductor package “genefilter”, Huber and Bras), such that only probes with standard deviation exceeding the mean intensity standard deviation of arrays within the “shorth” interval were retained for analysis [10]. Intrinsic probes were selected as those whose expression varied most among tumors of different subjects relative to technical and stochastic variation: 12 pairs of original vs. replicate data were analyzed by the ANOVA F-test to compare between-tumor with within-tumor variation, and probes with values in >75% of the 24 samples and a Bonferroni-adjusted p-value <0.05 were selected.
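
To make the intrinsic-probe criterion concrete, the sketch below computes a one-way ANOVA F statistic for a single probe, treating each tumor (original plus replicate) as one group. It is a minimal illustration in Python and not the authors' R code: the function names are hypothetical, missing values are not handled, and converting F to a p-value requires an F-distribution routine that is omitted here. For 12 complete pairs the degrees of freedom would be 11 (between tumors) and 12 (within tumors).

```python
def anova_f_statistic(groups):
    """One-way ANOVA F statistic.

    groups: list of lists of log2 intensities for one probe, one inner list
    per tumor (e.g. [original, replicate]).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and some within-group replication")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-tumor sum of squares: spread of tumor means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-tumor sum of squares: disagreement between original and replicate.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def bonferroni_threshold(alpha, n_tests):
    """Per-probe significance threshold equivalent to a Bonferroni-adjusted alpha."""
    return alpha / n_tests
```

A probe with a large F has expression differences between tumors that are large relative to the disagreement between technical replicates, which is the property the intrinsic set is meant to capture.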

Results and Discussion

We found that the modified Qiagen method led to substantially better results in terms of median yield (by eight fold; Fig. 1A) and more consistent purity. This enhanced performance compared to the Roche method may be attributable to a combination of several factors: (a) the additional deparaffinization and alcohol wash steps; (b) the lengthened proteinase K incubation time, using a stabilized and highly active proteinase K formulation (provided by Qiagen); and (c) inclusion of a high-temperature heat pulse step for the reversal of RNA cross-linking (Qiagen). There was a wide spread in sample quality, as measured by RPL13A RT-PCR (Fig. 1B), with a median Ct of 29.43 ± 4.71. Approximately half of all tumor samples (Qiagen: 28/60, 47%; Roche: 20/36, 56%) had Ct values ≤29 which, based on published expectations, would have predicted a ~50% success rate on DASL. Surprisingly, we found that 100% of samples processed by the Qiagen method, regardless of Ct, yielded usable data on the WG-DASL platform, whereas only 80% (8/10) of pre-qualifying (Ct ≤29) Roche samples tested gave valid DASL data, suggesting that the “Ct ≤29 rule” may be conservative for the modified Qiagen method.

Figure 1. Quality control results of RNA prepared by the modified Qiagen FFPE RNeasy and Roche Hi Pure FFPE methods.


(A) The modified Qiagen protocol (n=60) gave a median RNA yield of 8.04 ± 6.11 μg (± MAD, median absolute deviation), with a range of 1.19–24.33 μg (excluding the single outlier at 53 μg; Fig. 1A), and a mean A260/A280 ratio of 1.92 ± 0.06, near the ideal of 2.0. In comparison, the Roche method achieved a median yield of 1.17 ± 0.73 μg on comparable FFPE breast cancer samples (n=36), with a mean A260/A280 of 1.89 ± 0.42. This amounted to an almost eight-fold under-performance in recovery, and also significantly higher variability in purity. Total RNA yield varied considerably between samples with the Qiagen method, but 75% of samples yielded over 5 μg of RNA, and none less than 1.2 μg, in great surplus over the recommended input amount of 200 ng. In contrast, 17% (6/36) of Roche samples failed to yield at least this amount. (B) RPL13A RT-PCR cycle threshold (Ct) values also ranged widely, with a median Ct of 29.40 ± 4.77 for Qiagen samples and 28.30 ± 2.44 for Roche. There was no statistically significant difference between the median Ct values (p>0.05), nor between the proportions of samples with Ct ≤29 for Qiagen (47%) and Roche (56%). (C) Histograms of pairwise Pearson product-moment correlation r between 60 unrelated tumors and between 12 pairs of technical replicates, equally divided between ER+ and ER- tumors, with mean r = 0.979. There was no statistically significant difference in reproducibility (self-correlation among duplicates) between the two subtypes of tumors at level 0.05.
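
The “median ± MAD” summaries in the caption above can be reproduced with a few lines of Python. This is a sketch under one stated assumption: it computes the raw, unscaled MAD, whereas R's `mad()` function multiplies by 1.4826 by default, and the text does not say which convention was used. The function name is hypothetical.

```python
from statistics import median


def median_and_mad(values):
    """Return (median, raw median absolute deviation) of a sequence of numbers."""
    values = list(values)
    center = median(values)
    # MAD: median of the absolute deviations from the median (no scaling constant).
    mad = median([abs(v - center) for v in values])
    return center, mad
```

Because both quantities are medians, a single extreme value such as the 53 μg outlier mentioned in panel A has little effect on either.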

The WG-DASL assay detected a mean of 15,886 probes (12,830 genes) after shorth variance filtering, a number similar to the 15,852 probes detectable (p<0.01) in frozen lung tumor using the v3 BeadChip under comparable conditions [2]. The degree of self-correlation between duplicates, indicated by the average Pearson product-moment correlation coefficient r at the probe level, was 0.979 (Fig. 1C), which compares favorably with reported ranges for breast tumors of 0.73–0.97 [1] and >0.98 [2].
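
The self-correlation quoted above is the Pearson product-moment coefficient computed across probes for an original array and its replicate. A minimal Python sketch follows; the function name and the toy vectors in the usage note are hypothetical, and the inputs are assumed to be equal-length vectors of normalized log2 intensities over the same probes.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered cross-product and sums of squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)
```

For example, `pearson_r([7.1, 9.4, 12.0], [7.3, 9.1, 12.2])` compares three probes on two arrays; averaging this value over the 12 replicate pairs would give a mean self-correlation of the kind reported in Fig. 1C.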

Based on the methodology of Perou et al. [18], we derived an ‘intrinsic set’ of 729 probes (704 genes) for the present dataset. Employing this intrinsic probe set in hierarchical clustering (Fig. 2), the array data successfully (a) grouped each of the 12 duplicate pairs as immediate neighbors within the dendrogram, thus correctly identifying their respective partners in 100% (12/12) of cases; and (b) classified ER+ (38/40; 95%) and ER- (18/20; 90%) tumors into distinct clusters with an overall ≥90% specificity. Interestingly, in the two instances of apparent misclassification (ER+ samples 06 and 32 within the ER- clade), a subsequent re-examination of the ER levels of the two tumors by immunohistochemistry revealed that they had the two lowest ER scores (Allred 4 of 8) of the entire ER+ cohort (mean ER score=7; max=8). This suggests that the WG-DASL analysis was able to identify additional “intrinsic” molecular relatedness between the lowest-expressing ER+ samples and the ER- group, transcending the ER+/− dichotomy captured by standard biomarkers.

Figure 2. Hierarchical clustering of 60 breast cancer samples and 12 replicates.


Euclidean distance (vertical scale) and Ward agglomeration were used. Two principal clades were generated, resolving the sample set into ER+ (38/40, 95%) and ER- (18/20, 90%) branches with high specificity. As discussed in the text, the discordant samples 32 and 06 (*) have the lowest ER scores by immunohistochemistry.
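
For readers unfamiliar with Ward agglomeration, the following Python sketch clusters a small set of points and stops when a requested number of clusters remains. It is an illustration rather than the software used for Fig. 2: it assumes the standard Lance-Williams update applied to squared Euclidean distances, which may differ in detail from the R routine the authors ran, and the function name and toy coordinates are hypothetical.

```python
def ward_clusters(points, n_clusters=2):
    """Agglomerative clustering with Ward's criterion.

    points: list of equal-length numeric tuples (one per sample).
    Returns a sorted list of clusters, each a sorted list of sample indices.
    """
    n = len(points)
    clusters = {i: [i] for i in range(n)}
    # Pairwise squared Euclidean distances, keyed by (smaller id, larger id).
    dist = {}
    for i in range(n):
        for j in range(i + 1, n):
            dist[(i, j)] = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    next_id = n
    while len(clusters) > n_clusters:
        i, j = min(dist, key=dist.get)          # closest pair under Ward's criterion
        d_ij = dist.pop((i, j))
        n_i, n_j = len(clusters[i]), len(clusters[j])
        merged = clusters.pop(i) + clusters.pop(j)
        for k in clusters:
            n_k = len(clusters[k])
            d_ki = dist.pop((min(k, i), max(k, i)))
            d_kj = dist.pop((min(k, j), max(k, j)))
            # Lance-Williams update for Ward's method.
            dist[(k, next_id)] = (
                (n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij
            ) / (n_i + n_j + n_k)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Toy two-dimensional example with two well-separated groups.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_groups = ward_clusters(toy, n_clusters=2)
```

Cutting the tree at two clusters mirrors the two principal clades described in the caption; comparing cluster membership against ER status would then give agreement counts of the kind reported (38/40 and 18/20).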

Conclusions

These results show that a very high proportion of FFPE samples, one to two decades old, obtained from a wide cross-section of institutions (fourteen, including university/research medical centers, community hospitals, and private and military clinics) and with highly varied processing histories, can be extracted by a modified Qiagen method to yield microgram quantities of RNA suitable for gene expression profiling with the WG-DASL assay.

Acknowledgments

Funding: NIH/NCI R01CA098858 (KEM, CIL, PLP, LH); Safeway Foundation (PLP, KEM); NIH/NCI P50 CA148143 (PLP)

The authors wish to thank Ryan Basom for assistance with use of the Bioconductor lumi package and with data preprocessing.

Footnotes

Conflicts of Interests

None declared

References

  • 1. Abramovitz M, Ordanic-Kodani M, Wang Y, Li Z, Catzavelos C, Bouzyk M, Sledge GW, Jr, Moreno CS, Leyland-Jones B. Optimization of RNA extraction from FFPE tissues for expression profiling in the DASL assay. BioTechniques. 2008;44:417–423. doi: 10.2144/000112703.
  • 2. April C, Klotzle B, Royce T, Wickham-Garcia E, Boyaniwsky T, Izzo J, Cox D, Jones W, Rubio R, Holton K, Matulonis U, Quackenbush J, Fan JB. Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples. PLoS One. 2009;4:e8162. doi: 10.1371/journal.pone.0008162.
  • 3. Bibikova M, Chudin E, Arsanjani A, Zhou L, Garcia EW, Modder J, Kostelec M, Barker D, Downs T, Fan JB, Wang-Rodriguez J. Expression signatures that correlated with Gleason score and relapse in prostate cancer. Genomics. 2007;89:666–672. doi: 10.1016/j.ygeno.2007.02.005.
  • 4. Bibikova M, Talantov D, Chudin E, Yeakley JM, Chen J, Doucet D, Wickham E, Atkins D, Barker D, Chee M, Wang Y, Fan JB. Quantitative gene expression profiling in formalin-fixed, paraffin-embedded tissues using universal bead arrays. Am J Pathol. 2004;165:1799–1807. doi: 10.1016/S0002-9440(10)63435-9.
  • 5. Bibikova M, Yeakley JM, Chudin E, Chen J, Wickham E, Wang-Rodriguez J, Fan JB. Gene expression profiles in formalin-fixed, paraffin-embedded tissues obtained with a novel assay for microarray analysis. Clin Chem. 2004;50:2384–2386. doi: 10.1373/clinchem.2004.037432.
  • 6. Bouchie A. Coming soon: a global grid for cancer research. Nat Biotechnol. 2004;22:1071–1073. doi: 10.1038/nbt0904-1071.
  • 7. Burr TDR, Green A, Ellis I, Murray C. Evaluating gene expression in FFPE tissues using DASL: validation by quantitative real-time PCR and immunohistochemistry. National Cancer Research Institute conference poster; 2008. p. B31.
  • 8. Du P, Kibbe WA, Lin SM. lumi: a pipeline for processing Illumina microarray. Bioinformatics. 2008;24:1547–1548. doi: 10.1093/bioinformatics/btn224.
  • 9. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DS, Nobel AB, van't Veer LJ, Perou CM. Concordance among gene-expression-based predictors for breast cancer. N Engl J Med. 2006;355:560–569. doi: 10.1056/NEJMoa052933.
  • 10. Grübel R. The length of the Shorth. Annals of Statistics. 1988;16:9.
  • 11. Haller AC, Kanakapalli D, Walter R, Alhasan S, Eliason JF, Everson RB. Transcriptional profiling of degraded RNA in cryopreserved and fixed tissue samples obtained at autopsy. BMC Clin Pathol. 2006;6:9. doi: 10.1186/1472-6890-6-9.
  • 12. Hoshida Y, Villanueva A, Kobayashi M, Peix J, Chiang DY, Camargo A, Gupta S, Moore J, Wrobel MJ, Lerner J, Reich M, Chan JA, Glickman JN, Ikeda K, Hashimoto M, Watanabe G, Daidone MG, Roayaie S, Schwartz M, Thung S, Salvesen HB, Gabriel S, Mazzaferro V, Bruix J, Friedman SL, Kumada H, Llovet JM, Golub TR. Gene expression in fixed tissues and outcome in hepatocellular carcinoma. N Engl J Med. 2008;359:1995–2004. doi: 10.1056/NEJMoa0804525.
  • 13. Hu Z, Fan C, Oh D, Marrow J, He X, Qaqish B, Livasy C, Carey L, Reynolds E, Dressler L, Nobel A, Parker J, Ewend M, Sawyer L, Wu J, Liu Y, Nanda R, Tretiakova M, Orrico A, Dreher D, Palazzo J, Perreard L, Nelson E, Mone M, Hansen H, Mullins M, Quackenbush J, Ellis M, Olopade O, Bernard P, Perou C. The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics. 2006;7:96. doi: 10.1186/1471-2164-7-96.
  • 14. Li HR, Wang-Rodriguez J, Nair TM, Yeakley JM, Kwon YS, Bibikova M, Zheng C, Zhou L, Zhang K, Downs T, Fu XD, Fan JB. Two-dimensional transcriptome profiling: identification of messenger RNA isoform signatures in prostate cancer from archived paraffin-embedded cancer specimens. Cancer Res. 2006;66:4079–4088. doi: 10.1158/0008-5472.CAN-05-4264.
  • 15. Linton K, Hey Y, Dibben S, Miller C, Freemont A, Radford J, Pepper S. Methods comparison for high-resolution transcriptional analysis of archival material on Affymetrix Plus 2.0 and Exon 1.0 microarrays. BioTechniques. 2009;47:587–596. doi: 10.2144/000113169.
  • 16. Paik S. Methods for gene expression profiling in clinical trials of adjuvant breast cancer therapy. Clin Cancer Res. 2006;12:1019s–1023s. doi: 10.1158/1078-0432.CCR-05-2296.
  • 17. Penland SK, Keku TO, Torrice C, He X, Krishnamurthy J, Hoadley KA, Woosley JT, Thomas NE, Perou CM, Sandler RS, Sharpless NE. RNA expression analysis of formalin-fixed paraffin-embedded tumors. Lab Invest. 2007;87:383–391. doi: 10.1038/labinvest.3700529.
  • 18. Perou C, Sorlie T, Eisen M, van de Rijn M, Jeffrey S, Rees C, Pollack J, Ross D, Johnsen H, Akslen L, Fluge O, Pergamenschikov A, Williams C, Zhu S, Lonning P, Borresen-Dale A, Brown P, Botstein D. Molecular portraits of human breast tumors. Nature. 2000;406:747–752. doi: 10.1038/35021093.
  • 19. Ravo M, Mutarelli M, Ferraro L, Grober OM, Paris O, Tarallo R, Vigilante A, Cimino D, De Bortoli M, Nola E, Cicatiello L, Weisz A. Quantitative expression profiling of highly degraded RNA from formalin-fixed, paraffin-embedded breast tumor biopsies by oligonucleotide microarrays. Lab Invest. 2008;88:430–440. doi: 10.1038/labinvest.2008.11.
  • 20. Reding KW, Doody DR, McTiernan A, Hsu L, Davis S, Daling JR, Porter PL, Malone KE. Age-related variation in the relationship between menopausal hormone therapy and the risk of dying from breast cancer. Breast Cancer Res Treat. doi: 10.1007/s10549-010-1174-7. submitted.
  • 21. Sorlie T, Perou C, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen M, van de Rijn M, Jeffrey S, Thorsen T, Quist H, Matese J, Brown P, Botstein D, Eystein Lonning P, Borresen-Dale A. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98:10869–10874. doi: 10.1073/pnas.191367098.
