Abstract
Background
Several different cDNA labeling methods have been developed for microarray-based gene expression analysis. We have examined the accuracy and reproducibility of five such commercially available methods in detecting predetermined ratio values from target spike mRNAs (A. thaliana) in a background of total RNA. The five labeling methods were: direct labeling (CyScribe), indirect labeling (FairPlay™ – aminoallyl), two protocols with dendrimer technology (3DNA® Array 50™ and 3DNA® submicro™), and hapten-antibody enzymatic labeling (Micromax™ TSA™). Ten spike controls were mixed to give expected Cy5/Cy3 ratios in the range 0.125 to 6.0. The amounts of total RNA used in the labeling reactions ranged from 5 to 50 μg.
Results
The CyScribe and 3DNA Array 50 labeling methods performed best with respect to relative deviation from the expected values (16% and 17% respectively). These two methods also displayed the best overall accuracy and reproducibility. The FairPlay method had the lowest total experimental variation (22%), but the estimated values were consistently higher than the expected values (relative deviation 36%). TSA had both the largest experimental variation and the largest deviation from the expected values (45% and 48% respectively).
Conclusion
We demonstrate the usefulness of spike controls in validation and comparison of cDNA labeling methods for microarray experiments.
Background
High-throughput global gene expression analysis with cDNA- and oligonucleotide-based microarrays has become a common research tool [1,2]. Unfortunately, the method still suffers from inadequate precision due to the many sources of variation during the experimental process [3-5]. Some important parameters for ensuring a reliable cDNA microarray experiment are: 1) the quality of the glass slide, 2) the quality and quantity of the probes (e.g. PCR products) printed on the glass slide, 3) the quality and quantity of the RNA samples, 4) the cDNA labeling method, 5) the hybridization protocol, and 6) the scanning procedure. Many efforts have been made to optimize and standardize each of these steps [6-16], but only a limited number of data sets describe all the methods and strategies in use, especially regarding the labeling of cDNA target samples. Recently, the reproducibility, sensitivity and accuracy of a selection of labeling methods in cDNA microarray hybridization have been compared [13-16]. However, none of these studies used external mRNA standards (spikes) with a predetermined ratio distribution to evaluate the accuracy and reproducibility of the different methods.
In this study, we added various amounts of 10 different spike mRNAs (Arabidopsis thaliana) to two samples of total RNA. The ratio data generated from these spikes were used to evaluate and compare five commercially available cDNA labeling methods.
Results and discussions
We have used an approach based on a series of external standards (spikes) to evaluate the reproducibility and accuracy of five commercially available cDNA labeling methods: direct labeling (CyScribe), indirect labeling (FairPlay), two protocols with dendrimer technology, 3DNA Array 50 (3DNA50) and 3DNA submicro (3DNA), and hapten-antibody enzymatic labeling (TSA). Predefined amounts of 10 exogenous A. thaliana mRNAs were added to two rat BT4C total-RNA samples (from two different treatments of cells), resulting in a known ratio distribution for the spikes (range: 0.125 – 6.0; see Methods).
The observed ratios of the 10 spikes (calculated as MMR = median of medians of ratios) (Table 1) showed that spikes with ratios below 1.0 were best reproduced with the TSA method, whereas FairPlay showed the largest deviations from the expected values for these spikes. For Spike 1, only CyScribe showed an observed value close to the expected 1.0. The other four methods produced higher values than 1.0. The TSA method showed the largest deviations from expected values for spikes with expected ratios in the range 2.0 – 6.0 (Table 1). The between-array variation with TSA was also highest for these large ratio-spikes.
Table 1.
Spikes | Expected ratio | Observed ratio for the five methods1 | | | | | Range3 of MMR values |
 | | Method 1: CyScribe [50 μg]2 | Method 2: FairPlay [20 μg] | Method 3: TSA [5 μg] | Method 4: 3DNA [5 μg] | Method 5: 3DNA50 [20 μg] | |
Spike 7 | 0.125 | 0.16 (0.05)4 | 0.17 (0.05) | 0.10 (0.04) | 0.10 (0.06) | 0.17 (0.04) | 0.10 – 0.17 |
Spike 10 | 0.25 | 0.28 (0.09) | 0.44 (0.11) | 0.25 (0.11) | 0.26 (0.17) | 0.37 (0.10) | 0.25 – 0.44 |
Spike 9 | 0.33 | 0.25 (0.08) | 0.49 (0.18) | 0.27 (0.11) | 0.22 (0.13) | 0.27 (0.10) | 0.22 – 0.49 |
Spike 8 | 0.50 | 0.60 (0.21) | 0.67 (0.11) | 0.46 (0.14) | 0.45 (0.17) | 0.46 (0.11) | 0.45 – 0.67 |
Spike 1 | 1.00 | 1.22 (0.41) | 1.72 (0.28) | 1.69 (0.69) | 1.64 (0.83) | 2.42 (0.65) | 1.22 – 2.42 |
Spike 2 | 2.00 | 1.85 (0.70) | 2.66 (0.64) | 3.12 (1.20) | 2.40 (0.51) | 2.41 (0.53) | 1.85 – 3.12 |
Spike 3 | 3.00 | 1.56 (0.60) | 3.58 (1.36) | 4.20 (1.90) | 3.85 (0.70) | 3.44 (0.79) | 1.56 – 4.20 |
Spike 4 | 4.00 | 3.92 (1.34) | 4.85 (0.95) | 7.04 (2.77) | 5.15 (1.26) | 4.24 (1.06) | 3.92 – 7.04 |
Spike 5 | 5.00 | 5.63 (1.80) | 7.22 (1.39) | 13.99 (6.56) | 7.66 (1.25) | 5.78 (1.45) | 5.63 – 13.99 |
Spike 6 | 6.00 | 5.94 (2.04) | 7.97 (1.90) | 18.63 (10.64) | 5.38 (1.13) | 6.28 (1.16) | 5.38 – 18.63 |
1 The observed ratio values for the first four protocols are medians of medians of ratios (MMR) from four replicate hybridizations. For the 3DNA50 method, the data are based on three replicate hybridizations.
2 Amount of total RNA (μg) used in the labeling reactions.
3 The range is given over the MMR values for the five different methods.
4 The between-array standard deviations, given in parentheses, represent the standard deviation of the four (three for 3DNA50) per-array median ratios from the replicate hybridizations for each method.
Overall, the relative deviations from the expected ratios (Table 2) were lowest for CyScribe, 3DNA50 and 3DNA (16%, 17% and 24% respectively), whereas TSA and FairPlay showed the largest relative deviations (48% and 36% respectively).
Table 2.
cDNA labeling method | Relative deviation from expected values | Total CV2 | Ratio of variability between and within arrays3 | RAR4 |
CyScribe | 0.16 | 0.38 | 2.09 | 0.17 |
FairPlay | 0.36 | 0.22 | 1.49 | 0.20 |
TSA | 0.48 | 0.45 | 1.78 | 0.68 |
3DNA | 0.24 | 0.45 | 0.88 | 0.28 |
3DNA50 | 0.17 | 0.26 | 1.43 | 0.10 |
1 The table shows only the overall values, calculated as the median of the ten individual values belonging to the ten spikes.
2 CV is the total coefficient of variation of the ratio data.
3 The values given are the square root of the ratio between the treatment sum of squares (between arrays) and the error sum of squares (within array) from a one-way analysis of variance (see Methods).
4 RAR is the relative accuracy and reproducibility (see Methods).
We calculated the median total coefficient of variation of ratios (CV) over the 10 spikes for each method (Table 2). The FairPlay method showed the lowest total experimental CV (22%), followed by 3DNA50 and CyScribe (26% and 38% respectively). The TSA and 3DNA methods showed the largest total experimental variation (45% each). The total variability was decomposed into variability between arrays and variability within arrays using a one-way analysis of variance (see Methods). The between-array variability was roughly 1.4 to 2.1 times the within-array variability for four of the five methods; the exception was the 3DNA method, for which the ratio was 0.88 (Table 2).
Accuracy and reproducibility were evaluated jointly using the relative accuracy and reproducibility parameter (RAR; see Methods) (Table 2). 3DNA50 and CyScribe showed the lowest RAR (0.10 and 0.17 respectively), whereas the methods using low amounts of starting RNA (3DNA and especially TSA) showed high RAR values (0.28 and 0.68 respectively).
Shrinkage of relative expression ratios in microarrays, especially for the 3DNA method, has previously been reported by several investigators [15,16]. We did not observe shrinkage of the spike ratios for the 3DNA method, although we detected saturation at the high end of observed ratios for this method. Experiments giving a more accurate evaluation of shrinkage should presumably include spikes with even larger expected ratios than presented here.
The relatively high CV values seen with the TSA method could be a result of high and non-uniform background fluorescence that was seen for all four replicate hybridizations produced with this method (data not shown). High background levels with the TSA method were also reported by Richter and co-workers [15]. The 3DNA method produced arrays with the lowest signal intensities, which may in turn explain the large experimental variation that was also observed with this method.
The RAR values calculated in this study could indicate a positive correlation between the RNA quantity in the labeling reactions and the accuracy and reproducibility of the labeling method. In the study conducted by Richter et al. [15], both direct and indirect methods were shown to be more reliable than the 3DNA and TSA methods when compared to results from Northern blots. Manduchi and colleagues [16] reported similar observations regarding the overall performance of the direct, indirect, and 3DNA methods.
Conclusions
In conclusion, the 3DNA50 and CyScribe methods showed the best overall performance. The FairPlay method had the lowest experimental variation, but showed consistently higher values than the expected values. TSA had both the largest experimental variation and the largest deviation from the expected values.
When the amount of starting RNA was not a limitation, the three labeling methods 3DNA50, CyScribe and FairPlay had comparable performance. With small quantities of total RNA as template, the 3DNA method was the better of the two methods analyzed (i.e. 3DNA and TSA). However, because the 3DNA method showed considerable experimental variation, we suggest that researchers also consider labeling methods other than the two presented here when the amount of input RNA is small. The use of amino C6dT-modified random hexamers to prime cDNA synthesis in conjunction with aminoallyl dUTP [17], or RNA amplification methods [18,19], could be alternatives. The use of resonance light scattering (RLS) particles in signal detection is another promising technology, which also allows small amounts of starting RNA [20].
Methods
Rat cDNA microarrays
The rat cDNA microarrays used in this study were printed by and purchased from The Norwegian Microarray Consortium (NMC: http://www.mikromatrise.no/). In addition to the ~13800 sequence-verified rat cDNA probes from Research Genetics (Huntsville, AL, USA; http://www.resgen.com/), printed in duplicate on amino silane coated slides (CMT GAPS II, Corning Life Sciences, Corning, NY), ten different cDNAs from Arabidopsis thaliana (SpotReport™, Stratagene, La Jolla, CA, USA) were each printed 32 times on the slides.
Cell culturing and RNA isolation
Total RNA was extracted from BT4C rat glioma cells treated with drug (1 mM LiCl) or saline (1 mM NaCl) using the GenElute™ Mammalian Total RNA Miniprep Kit (Sigma-Aldrich, St. Louis, MO, USA). One large batch of total RNA from each treatment was quality-controlled using UV-spectrophotometry and the BioAnalyzer 2100 (Agilent, Palo Alto, CA, USA) and then used for all microarray experiments.
cDNA labeling
Prior to reverse transcription, exactly defined amounts of exogenous A. thaliana mRNAs (SpotReport™, Stratagene) were added to the control RNA and to the test RNA, giving expected ratios in the range 0.125 to 6.0 (Table 1) in a microarray experiment.
The following cDNA labeling kits were used (μg total RNA is given in Table 1): 1) CyScribe First Strand cDNA Labeling Kit (Amersham Biosciences, Little Chalfont, Buckinghamshire, England), 2) FairPlay™ Microarray Labeling Kit (Stratagene), 3) MICROMAX™ TSA™ Labeling and Detection Kit (Perkin Elmer Life Sciences, Boston, MA, USA), 4) 3DNA® Submicro™ Expression Array Detection Kit (Genisphere Inc, Hatfield, PA, USA) and 5) 3DNA® Array 50™ Expression Array Detection Kit (Genisphere Inc).
The amount of starting RNA used for each method was chosen based on the recommendations in the manufacturers' protocols and is shown in Table 1. All cDNA labeling reactions were performed as recommended by the manufacturers, with the following modification:
The labeled cDNA samples were purified and concentrated using Microcon® columns (YM-30; Millipore, Bedford, MA, USA) in all protocols except for the 3DNA Submicro Expression Array Detection Kit.
Hybridization
Identical prehybridizations were performed for all 20 microarray experiments. The arrays were incubated for 45 min in a 50 ml plastic tube containing 35 ml of prehybridization buffer (5x SSC, 0.1% SDS, 1% BSA) at 65°C, followed by washing in ddH2O (five times each in two separate tubes; RT), and in isopropanol (five times; RT), and then dried by centrifugation at 1000 rpm for 2 min in a microplate centrifuge.
For the CyScribe, FairPlay and TSA Micromax methods, the hybridization mixture was slightly modified compared to the manufacturers' recommended procedures: 7.9 μl 20x SSC, 1.4 μl 10% SDS, 2.5 μl 50x Denhardt, 1 μl yeast tRNA (4 μg/μl), 1 μl poly d(A) (6 μg/μl) and 2 μl 1% BSA were added to the labeled and purified cDNA. The final volume was adjusted to 45 μl by adding 10 mM Tris (pH 8.0). This hybridization mixture was denatured for 4 min at 96°C, centrifuged for 5 min in a microcentrifuge and then immediately applied to the prehybridized microarray slide. The slide was incubated overnight (~16 hours) at 65°C inside a hybridization chamber (ArrayIt™; TeleChem International Inc., Sunnyvale, CA, USA) in a waterbath. For both 3DNA methods, the 2-step hybridization protocol provided by the manufacturer was used.
The post-hybridization treatment and washing were performed as recommended by the manufacturers for both 3DNA methods and the TSA method. The following washing was performed for the CyScribe and FairPlay methods: the slides were washed in 2x SSC, 0.1% SDS (~65°C) to remove the cover-slip, followed by three successive washing steps with agitation: 1x SSC (~65°C; 5 min), 0.2x SSC (RT; 5 min) and 0.05x SSC (RT; 1 min). Finally, the slides were spun dry by centrifugation at 1000 rpm for 2 min in a microplate centrifuge.
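The final concentrations in the 45 μl hybridization mixture follow from the listed stock volumes by the usual dilution rule (C1 × V1 = C2 × V2). The short Python sketch below is only a worked illustration of that arithmetic; the dictionary layout and function name are assumptions introduced here, not part of the protocol.

```python
# Worked dilution arithmetic for the hybridization mixture described in the text.
# The data layout and names are illustrative assumptions.
FINAL_VOLUME_UL = 45.0

components = {
    # name: (volume added in μl, stock concentration, unit of concentration)
    "SSC": (7.9, 20.0, "x"),
    "SDS": (1.4, 10.0, "%"),
    "Denhardt": (2.5, 50.0, "x"),
    "yeast tRNA": (1.0, 4.0, "μg/μl"),
    "poly d(A)": (1.0, 6.0, "μg/μl"),
    "BSA": (2.0, 1.0, "%"),
}


def final_concentration(volume_ul, stock_conc, final_volume_ul=FINAL_VOLUME_UL):
    """Dilution rule C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock_conc * volume_ul / final_volume_ul


final = {name: final_concentration(v, c) for name, (v, c, _) in components.items()}
additive_volume = sum(v for v, _, _ in components.values())

for name, (_, _, unit) in components.items():
    print(f"{name}: {final[name]:.3f} {unit}")
print(f"Additives: {additive_volume:.1f} μl; "
      f"labeled cDNA plus Tris: {FINAL_VOLUME_UL - additive_volume:.1f} μl")
```

Under these numbers the mixture is approximately 3.5x SSC, 0.31% SDS and 2.8x Denhardt, with 15.8 μl of additives and the remaining 29.2 μl made up of labeled cDNA and 10 mM Tris.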
A total of four arrays were hybridized for each labeling method. The same quality-controlled array batch was used for all experiments and the same person did all of the hybridizations.
Scanning and data analysis
All arrays were scanned with the GenePix® 4000B scanner (Axon Instruments Inc., Union City, CA, USA), followed by image analysis with the GenePix® Pro 3.0 image analysis software (Axon Instruments). The median intensities of each spot and of its local background were then extracted and imported into R, the language and environment for statistical computing and graphics (http://www.r-project.org/). Filtering was performed in three steps. First, spots automatically flagged by the GenePix software were excluded. Second, spots with background-subtracted intensities of less than 200 in both channels were removed. Finally, spots with a signal-to-local-background ratio (S/B) of less than 1.5 in either of the two channels were excluded.
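The three filtering steps can be sketched as a boolean mask over spots. The authors worked in R; the Python/NumPy version below is an illustrative reconstruction only. The function name, argument names and the representation of GenePix flags as a ready-made boolean array are assumptions, and local background values are assumed to be strictly positive.

```python
import numpy as np


def spot_filter(flagged, fg_cy5, bg_cy5, fg_cy3, bg_cy3,
                min_intensity=200.0, min_sb=1.5):
    """Return a boolean mask that is True for spots passing all three filters.

    flagged        : True where the image-analysis software flagged the spot
    fg_* / bg_*    : median spot and local background intensities per channel
    """
    flagged = np.asarray(flagged, dtype=bool)
    fg_cy5 = np.asarray(fg_cy5, dtype=float)
    bg_cy5 = np.asarray(bg_cy5, dtype=float)
    fg_cy3 = np.asarray(fg_cy3, dtype=float)
    bg_cy3 = np.asarray(bg_cy3, dtype=float)

    # Step 2: removed only if the background-subtracted signal is low in BOTH channels.
    weak = ((fg_cy5 - bg_cy5) < min_intensity) & ((fg_cy3 - bg_cy3) < min_intensity)
    # Step 3: removed if the signal-to-background ratio is low in EITHER channel.
    low_sb = ((fg_cy5 / bg_cy5) < min_sb) | ((fg_cy3 / bg_cy3) < min_sb)
    # Step 1 (flagged spots) is combined with steps 2 and 3.
    return ~flagged & ~weak & ~low_sb


# Four made-up spots: flagged, weak in both channels, low S/B in Cy3, and good.
keep = spot_filter(
    flagged=[True, False, False, False],
    fg_cy5=[1000, 300, 1000, 1000], bg_cy5=[100, 150, 100, 100],
    fg_cy3=[1000, 320, 140, 600], bg_cy3=[100, 160, 100, 100],
)
print(keep)
```

Note the asymmetry between the two intensity criteria: the 200-unit threshold removes a spot only when both channels are weak, whereas the S/B threshold removes it when either channel fails.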
We normalized the log2-ratio data (spike values not included) by using print-tip group loess normalization (degree 2 and span 0.4) as described by Yang et al. [21]. The spike ratios were then adjusted by normalization factors obtained from the loess curves. This normalization procedure was equally applied to all arrays.
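The normalization step can be illustrated as follows: within each print-tip group, a local quadratic curve is fitted to the log2 ratios (M) of the non-spike spots as a function of average log intensity (A), and the fitted value is subtracted from every spot in that group, spikes included. The NumPy sketch below is a simplified stand-in, not the R code used in the study: it omits the robustness iterations of standard loess, and all function and argument names are assumptions.

```python
import numpy as np


def loess_fit(x, y, x_eval, span=0.4, degree=2):
    """Local polynomial regression with tricube weights (no robustness iterations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Number of nearest neighbours used in each local fit.
    k = min(len(x), max(degree + 2, int(np.ceil(span * len(x)))))
    fitted = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        nearest = np.argsort(dist)[:k]
        bandwidth = dist[nearest].max() or 1.0
        weights = (1.0 - (dist[nearest] / bandwidth) ** 3) ** 3
        # np.polyfit applies w to the residuals before squaring, hence the sqrt.
        coeffs = np.polyfit(x[nearest] - x0, y[nearest], degree, w=np.sqrt(weights))
        fitted[i] = coeffs[-1]  # intercept of the centred fit = value at x0
    return fitted


def normalize_print_tip(a, m, tip, is_spike, span=0.4, degree=2):
    """Subtract a per-print-tip loess curve, estimated from non-spike spots only."""
    a = np.asarray(a, dtype=float)
    m = np.asarray(m, dtype=float)
    tip = np.asarray(tip)
    is_spike = np.asarray(is_spike, dtype=bool)
    m_norm = m.copy()
    for group in np.unique(tip):
        in_group = tip == group
        genes = in_group & ~is_spike
        curve = loess_fit(a[genes], m[genes], a[in_group], span, degree)
        m_norm[in_group] = m[in_group] - curve  # spikes are adjusted by the same curve
    return m_norm
```

Here `a` would be 0.5 × log2(Cy5 × Cy3) and `m` would be log2(Cy5/Cy3) per spot; the defaults `span=0.4` and `degree=2` mirror the settings stated in the text.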
Statistics
Median of medians of ratios (MMR). For each spike we calculated the median of ratios within each array and then calculated the median of these median ratios from the replicate arrays, obtaining one total measure of expression ratio for each spike.
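As a minimal illustration of the MMR calculation, the Python sketch below takes one list of spike ratios per replicate array, computes the per-array medians, and then takes the median of those. The function name and the example numbers are made up for illustration; the standard deviation of the per-array medians corresponds to the between-array standard deviation described in the footnote to Table 1, here computed with an n - 1 denominator as an assumption.

```python
import numpy as np


def median_of_medians(ratios_by_array):
    """Return (MMR, per-array medians, SD of per-array medians) for one spike."""
    per_array = np.array([np.median(r) for r in ratios_by_array], dtype=float)
    mmr = float(np.median(per_array))
    between_array_sd = float(np.std(per_array, ddof=1))  # ddof=1 is an assumption
    return mmr, per_array, between_array_sd


# Hypothetical ratios for one spike on four replicate arrays (3 spots shown per array;
# arrays may have different numbers of spots left after filtering).
example = [
    [0.9, 1.0, 1.3],
    [1.1, 1.2, 1.6],
    [0.8, 1.0, 1.1],
    [1.2, 1.4, 1.5],
]
mmr, per_array, sd = median_of_medians(example)
print(per_array, mmr, sd)
```

Taking medians at both levels limits the influence of single outlying spots and of a single outlying array.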
Relative deviation from the expected values (RD) represents the absolute difference between MMR and the corresponding expected ratio expressed as a percentage of the expected ratio.
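As a worked check, the relative deviation can be recomputed from the rounded MMR values printed in Table 1 and summarized as the median over the ten spikes, as described in the footnote to Table 2. The Python sketch below does this; the variable and function names are illustrative, and because the inputs are rounded to two decimals, small differences from the published overall values are to be expected.

```python
from statistics import median

# Expected ratios and observed MMR values, in the row order of Table 1.
expected = [0.125, 0.25, 0.33, 0.50, 1.00, 2.00, 3.00, 4.00, 5.00, 6.00]
observed_mmr = {
    "CyScribe": [0.16, 0.28, 0.25, 0.60, 1.22, 1.85, 1.56, 3.92, 5.63, 5.94],
    "FairPlay": [0.17, 0.44, 0.49, 0.67, 1.72, 2.66, 3.58, 4.85, 7.22, 7.97],
    "TSA": [0.10, 0.25, 0.27, 0.46, 1.69, 3.12, 4.20, 7.04, 13.99, 18.63],
    "3DNA": [0.10, 0.26, 0.22, 0.45, 1.64, 2.40, 3.85, 5.15, 7.66, 5.38],
    "3DNA50": [0.17, 0.37, 0.27, 0.46, 2.42, 2.41, 3.44, 4.24, 5.78, 6.28],
}


def relative_deviation(mmr, expected_ratio):
    """Absolute difference from the expected ratio, as a fraction of the expected ratio."""
    return abs(mmr - expected_ratio) / expected_ratio


overall_rd = {
    method: median(relative_deviation(o, e) for o, e in zip(values, expected))
    for method, values in observed_mmr.items()
}

for method, rd in overall_rd.items():
    print(f"{method}: {rd:.2f}")
```

With these rounded inputs the medians come out close to the overall values in Table 2; the FairPlay value differs in the second decimal, presumably because of rounding in the tabulated MMR values.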
Total coefficient of variation of ratios (CV) for each spike was calculated as the ratio of the standard deviation (over 128 ratios; 32 per array times four replicate arrays) to the median, rather than to the mean.
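A minimal Python sketch of this median-based CV is shown below. It pools all ratios for a spike across arrays before computing the standard deviation; the function name is illustrative, and the use of the sample standard deviation (n - 1 denominator, as R's `sd` does) is an assumption, since the text does not state the denominator.

```python
import numpy as np


def total_cv(ratios_by_array):
    """Standard deviation of all pooled ratios divided by their median."""
    pooled = np.concatenate([np.asarray(r, dtype=float).ravel() for r in ratios_by_array])
    return float(np.std(pooled, ddof=1) / np.median(pooled))  # ddof=1 is an assumption


# Tiny made-up example: two arrays with two ratios each.
cv = total_cv([[1.0, 2.0], [3.0, 4.0]])
print(round(cv, 4))
```

Dividing by the median rather than the mean makes the denominator less sensitive to a few extreme ratios, although the standard deviation in the numerator remains sensitive to them.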
Relative accuracy and reproducibility (RAR) was calculated as the sum of the squared RD and the squared total CV for each spike, representing a combined measure of accuracy and reproducibility for that spike ratio.
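The RAR definition amounts to a single line of arithmetic per spike. The Python sketch below applies it and then summarizes over spikes with the median, as described in the footnote to Table 2. The function names are illustrative, and the numerical inputs are used only as an example.

```python
from statistics import median


def rar(rd, cv):
    """Relative accuracy and reproducibility for one spike: RD squared plus CV squared."""
    return rd ** 2 + cv ** 2


def overall_rar(rd_per_spike, cv_per_spike):
    """Median of the per-spike RAR values, one pair (RD, CV) per spike."""
    return median(rar(rd, cv) for rd, cv in zip(rd_per_spike, cv_per_spike))


# Illustration with RD = 0.16 and CV = 0.38, both expressed as fractions.
print(round(rar(0.16, 0.38), 2))
```

Because the overall RAR is a median of per-spike values, it cannot in general be reproduced from the overall RD and overall CV of a method, which are themselves medians taken separately.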
A one-way analysis of variance (ANOVA) was fitted to the log-ratio data for each spike in each method (32 observations for each of four arrays), with array as a "treatment effect". The total variability for each spike was decomposed into variability between arrays (treatment sum of squares), and variability within array (error sum of squares). All statistical analyses were done using the R language.
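The decomposition can be written out directly from the sums of squares. The Python/NumPy sketch below is an illustrative reconstruction rather than the R code used in the study: it takes the log-ratios of one spike grouped by array and returns the square root of the ratio between the between-array and within-array sums of squares, the quantity reported in Table 2. The function name is an assumption.

```python
import numpy as np


def between_within_ratio(log_ratios_by_array):
    """One-way ANOVA sums of squares with array as the grouping factor.

    Returns (sqrt(SS_between / SS_within), SS_between, SS_within).
    """
    groups = [np.asarray(g, dtype=float) for g in log_ratios_by_array]
    grand_mean = np.concatenate(groups).mean()
    # Treatment sum of squares: spread of the array means around the grand mean.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Error sum of squares: spread of spots around their own array mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return float(np.sqrt(ss_between / ss_within)), float(ss_between), float(ss_within)


# Made-up example with two arrays and two log-ratios each.
ratio, ss_b, ss_w = between_within_ratio([[1.0, 3.0], [5.0, 7.0]])
print(ratio, ss_b, ss_w)
```

The two sums of squares add up to the total sum of squares around the grand mean, which is what makes the between/within comparison a decomposition of the total variability.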
Authors' contributions
AB did the processing and analysis of microarray data and drafted the manuscript. HE and VS guided data processing and participated in editing of the manuscript. RL coordinated the study, performed the practical laboratory work and edited the manuscript.
Acknowledgments
We acknowledge the Norwegian Microarray Consortium for providing the microarrays. We thank Dr. Einar Martens Foundation for financial support. We thank Mette Langaas for providing and helping us with the R-code for normalization procedure and the ANOVA-analysis.
Contributor Information
Azadeh Badiee, Email: azadeh.badiee@helse-bergen.no.
Hans Geir Eiken, Email: hans.geir.eiken@helse-bergen.no.
Vidar M Steen, Email: vidar.martin.steen@helse-bergen.no.
Roger Løvlie, Email: roger.lovlie@helse-bergen.no.
References
1. Schulze A, Downward J. Navigating gene expression using microarrays--a technology review. Nat Cell Biol. 2001;3:E190–5. doi: 10.1038/35087138.
2. Deyholos MK, Galbraith DW. High-density microarrays for gene expression analysis. Cytometry. 2001;43:229–238. doi: 10.1002/1097-0320(20010401)43:4<229::AID-CYTO1055>3.3.CO;2-U.
3. Lee ML, Kuo FC, Whitmore GA, Sklar J. Importance of replication in microarray gene expression studies: statistical methods and evidence from repetitive cDNA hybridizations. Proc Natl Acad Sci U S A. 2000;97:9834–9839. doi: 10.1073/pnas.97.18.9834.
4. Schuchhardt J, Beule D, Malik A, Wolski E, Eickhoff H, Lehrach H, Herzel H. Normalization strategies for cDNA microarrays. Nucleic Acids Res. 2000;28:E47. doi: 10.1093/nar/28.10.e47.
5. Zhang W, Laborde PM, Coombes KR, Berry DA, Hamilton SR. Cancer genomics: promises and complexities. Clin Cancer Res. 2001;7:2159–2167.
6. Wang X, Ghosh S, Guo SW. Quantitative quality control in microarray image processing and data acquisition. Nucleic Acids Res. 2001;29:E75. doi: 10.1093/nar/29.15.e75.
7. Tseng GC, Oh MK, Rohlin L, Liao JC, Wong WH. Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Res. 2001;29:2549–2557. doi: 10.1093/nar/29.12.2549.
8. Yue H, Eastman PS, Wang BB, Minor J, Doctolero MH, Nuttall RL, Stack R, Becker JW, Montgomery JR, Vainer M, Johnston R. An evaluation of the performance of cDNA microarrays for detecting changes in global mRNA expression. Nucleic Acids Res. 2001;29:E41. doi: 10.1093/nar/29.8.e41.
9. Wildsmith SE, Archer GE, Winkley AJ, Lane PW, Bugelski PJ. Maximization of signal derived from cDNA microarrays. Biotechniques. 2001;30:202–208. doi: 10.2144/01301dd04.
10. Yang YH, Buckley MJ, Speed TP. Analysis of cDNA microarray images. Brief Bioinform. 2001;2:341–349. doi: 10.1093/bib/2.4.341.
11. Ramdas L, Coombes K, Baggerly K, Abruzzo L, Highsmith WE, Krogmann T, Hamilton S, Zhang W. Sources of nonlinearity in cDNA microarray expression measurements. Genome Biol. 2001;2:research0047.1–research0047.7. doi: 10.1186/gb-2001-2-11-research0047.
12. Dudley AM, Aach J, Steffen MA, Church GM. Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proc Natl Acad Sci U S A. 2002;99:7554–7559. doi: 10.1073/pnas.112683499.
13. Stears RL, Getts RC, Gullans SR. A novel, sensitive detection system for high-density microarrays using dendrimer technology. Physiol Genomics. 2000;3:93–99. doi: 10.1152/physiolgenomics.2000.3.2.93.
14. Yu J, Othman MI, Farjo R, Zareparsi S, MacNee SP, Yoshida S, Swaroop A. Evaluation and optimization of procedures for target labeling and hybridization of cDNA microarrays. Mol Vis. 2002;8:130–137.
15. Richter A, Schwager C, Hentze S, Ansorge W, Hentze MW, Muckenthaler M. Comparison of fluorescent tag DNA labeling methods used for expression analysis by DNA microarrays. Biotechniques. 2002;33:620–630. doi: 10.2144/02333rr05.
16. Manduchi E, Scearce LM, Brestelli JE, Grant GR, Kaestner KH, Stoeckert CJ. Comparison of different labeling methods for two-channel high-density microarray experiments. Physiol Genomics. 2002;10:169–179. doi: 10.1152/physiolgenomics.00120.2001.
17. Xiang CC, Kozhich OA, Chen M, Inman JM, Phan QN, Chen Y, Brownstein MJ. Amine-modified random primers to label probes for DNA microarrays. Nat Biotechnol. 2002;20:738–742. doi: 10.1038/nb0702-738.
18. Zhao H, Hastie T, Whitfield ML, Borresen-Dale AL, Jeffrey SS. Optimization and evaluation of T7 based RNA linear amplification protocols for cDNA microarray analysis. BMC Genomics. 2002;3:31. doi: 10.1186/1471-2164-3-31.
19. Iscove NN, Barbara M, Gu M, Gibson M, Modi C, Winegarden N. Representation is faithfully preserved in global cDNA amplified exponentially from sub-picogram quantities of mRNA. Nat Biotechnol. 2002;20:940–943. doi: 10.1038/nbt729.
20. Yguerabide J, Yguerabide EE. Light-scattering submicroscopic particles as highly fluorescent analogs and their use as tracer labels in clinical and biological applications. Anal Biochem. 1998;262:137–156. doi: 10.1006/abio.1998.2759.
21. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002;30:e15. doi: 10.1093/nar/30.4.e15.