British Journal of Haematology. 2008 Sep;142(5):802–807. doi: 10.1111/j.1365-2141.2008.07261.x

An international standardization programme towards the application of gene expression profiling in routine leukaemia diagnostics: the Microarray Innovations in LEukemia study prephase

Alexander Kohlmann 1, Thomas J Kipps 2, Laura Z Rassenti 2, James R Downing 3, Sheila A Shurtleff 3, Ken I Mills 4, Amanda F Gilkes 4, Wolf-Karsten Hofmann 5, Giuseppe Basso 6, Marta Campo Dell’Orto 6, Robin Foà 7, Sabina Chiaretti 7, John De Vos 8, Sonja Rauhut 9, Peter R Papenhausen 10, Jesus M Hernández 11, Eva Lumbreras 11, Allen E Yeoh 12, Evelyn S Koay 12, Rachel Li 1, Wei-min Liu 1, Paul M Williams 1, Lothar Wieczorek 1, Torsten Haferlach 9
PMCID: PMC2654477  PMID: 18573112

Abstract

Gene expression profiling has the potential to enhance current methods for the diagnosis of haematological malignancies. Here, we present data on 204 analyses from an international standardization programme that was conducted in 11 laboratories as a prephase to the Microarray Innovations in LEukemia (MILE) study. Each laboratory prepared two cell line samples, together with three replicate leukaemia patient lysates in two distinct stages: (i) a 5-d course of protocol training, and (ii) independent proficiency testing. Unsupervised, supervised, and r2 correlation analyses demonstrated that microarray analysis can be performed with remarkably high intra-laboratory reproducibility and with comparable quality and reliability.

Keywords: microarray, gene expression profiling, leukaemia, standardization, diagnostics


Several microarray studies have already demonstrated the identification of differentially expressed genes associated with distinct clinical and therapeutically relevant classes of leukaemias (Golub et al, 1999; Armstrong et al, 2002; Schoch et al, 2002; Yeoh et al, 2002). Given that microarray assays analyse the expression of multiple genes in parallel, they appear to be a robust test method for diagnostic use (Kohlmann et al, 2003, 2005; Haferlach et al, 2005). However, to date, studies aimed at subclassifying leukaemias through gene expression profiling have mainly been monocentric, have included only a limited number of patients, or have relied largely on RNA specimens that were analysed retrospectively from archived samples.

Here we report data from an international study group formed around the European Leukemia Network (ELN, http://www.leukemia-net.org) in 11 laboratories: seven from the ELN, three from the United States, and one in Singapore. The so-called Microarray Innovations in LEukemia (MILE) study programme will prospectively assess the clinical accuracy of gene expression profiles of 16 acute and chronic leukaemia subclasses, of myelodysplastic syndromes (MDS), and a “none of the target classes” control group, as compared to current routine diagnostic workup in over 3000 patients. As a first step representing a major effort to standardize the microarray analysis workflow in the participating centres, a prephase of the MILE study was performed. This report presents the results of the prephase, i.e., a standardization programme of the microarray procedure in the participating laboratories in order to ensure a robust gene expression profiling test performance before patient samples were analysed.

Materials and methods

There were two stages in the MILE prephase study: protocol training and proficiency testing. As part of the initial protocol training each participating laboratory was provided with identical equipment, including reagent kits, enzymes, spectrophotometer, and heat block instruments, and eight microarray experiments were performed at each centre with an on-site trainer in the respective laboratory being trained. The eight samples analysed during the training course were represented by MCF-7 (breast adenocarcinoma) and HepG2 (liver carcinoma) cell line total RNA (Ambion, Austin, TX, USA) with 1·0 μg and 5·0 μg input of total RNA, respectively, and four leukaemia patient sample lysates prepared from mononuclear cells obtained after Ficoll density purification. Patient lysates comprised cells of one chronic myeloid leukaemia (CML), one chronic lymphocytic leukaemia (CLL), and two replicate lysates of an AML patient sample (containing a translocation t(8;21), French-American-British (FAB) type M2). The total RNA from the patient lysates was extracted at each centre as part of the training programme, making these samples a test of the entire microarray process workflow post sample acquisition (RNeasy kit, Qiagen, Hilden, Germany). Subsequently, after the training phase and for operator proficiency testing, each laboratory independently performed four microarray experiments each for MCF-7 and HepG2 cell lines with inputs of 1·5 μg, 3·0 μg, 5·0 μg, and 8·0 μg total RNA. In total, 204 microarray profiles were included in the analysis (for details see Appendix SI and SII). The three anonymous replicate patient lysates were provided by the Laboratory for Leukaemia Diagnostics in Munich, Germany. All patients gave their informed consent for participation after having been advised of the purpose and investigational nature of the study. 
The study design adhered to the tenets of the Declaration of Helsinki and was approved by the ethics committees of the participating institutions before its initiation. Details on the microarray analysis workflow, image analysis, quality reports, as well as statistical methods are given in Appendix SI.

Results

Intra-laboratory reproducibility of gene expression analyses

As shown in an unsupervised Principal Component Analysis (PCA), the individual gene expression profiles grouped closely together with their corresponding biological sample types based on the underlying similarity, but not according to the centre where the microarray experiments were performed (Fig 1). The arrows in Fig 1 indicate that the four leukaemia sample preparations from Centre 9 (N17-20), as well as one HepG2 preparation from Centre 3 (N18), were outliers in the PCA. Large differences in gene expression profiles were also observed with respect to the manufacturing batches for MCF-7 total RNA, but overall, a high level of reproducibility between laboratories was seen when a standardized protocol for microarray analysis was followed by trained operators. According to the unsupervised PCA plots, replicated gene expression profiles of the HepG2 cell line were more biologically homogeneous and less influenced by manufacturing batch numbers than the MCF-7 cell line replicates. Therefore, replicated profiles of the HepG2 cell line were chosen to further investigate the intra- and inter-laboratory correlations. All centres generated highly reproducible gene expression profiles for this cell line, as shown in the box plot analysis of r2 values from all pairwise comparisons within each centre for the sample type HepG2 (Fig 2A), where mean r2 values ranged from 0·973 to 0·988. The slightly higher variability at Centre 11 might be explained by a higher number of operators and replicate analyses than in other centres. Figure 2B shows the intra-site repeatability of microarray data based on quantitative signal values and qualitative detection calls. The number of generally detected genes for each sample type at each centre ranged from 24 627 to 27 075 for HepG2 and from 25 841 to 28 953 for MCF-7.
The coefficient of variation (CV) of the quantitative signal values between the intra-site replicates was calculated using the generally detected subset of genes for each sample type HepG2 and MCF-7 at each laboratory. The distribution of the replicate CV measures across the set of detected genes is displayed in a series of box plots. The different laboratories demonstrated similar replicate CV median values of 1·962–3·234% for HepG2 and 1·869–2·864% for MCF-7.
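The replicate CV calculation described above can be illustrated with a minimal Python sketch. It assumes the definition of "generally detected" given in the Fig 2B legend (called present in at least one third of the replicates, e.g. two of six) and a per-gene CV computed as the sample standard deviation divided by the mean. The function names, the use of the sample (n − 1) estimator, and the toy values are assumptions for illustration only; they are not taken from the study's own analysis software.

```python
import numpy as np


def generally_detected(present_calls, min_fraction=1 / 3):
    """Mask of genes called present in at least min_fraction of the replicates.

    present_calls: boolean array of shape (n_genes, n_replicates).
    """
    calls = np.asarray(present_calls, dtype=bool)
    n_replicates = calls.shape[1]
    # Smallest whole number of replicates reaching the fraction (2 of 6 for one third).
    min_present = int(np.ceil(min_fraction * n_replicates - 1e-9))
    return calls.sum(axis=1) >= min_present


def replicate_cv_percent(signals):
    """Per-gene coefficient of variation (%) across replicates.

    signals: float array of shape (n_genes, n_replicates).
    ddof=1 (sample SD) is an assumption; the estimator is not stated in the text.
    """
    x = np.asarray(signals, dtype=float)
    return 100.0 * x.std(axis=1, ddof=1) / x.mean(axis=1)


# Toy data, made up for illustration: 3 genes x 6 replicates from one centre.
calls = np.array([
    [1, 1, 0, 0, 0, 0],  # present in 2 of 6 replicates: kept
    [1, 0, 0, 0, 0, 0],  # present in 1 of 6 replicates: dropped
    [1, 1, 1, 1, 1, 1],  # present in all replicates: kept
], dtype=bool)
signals = np.array([
    [98.0, 102.0, 100.0, 101.0, 99.0, 100.0],
    [5.0, 40.0, 12.0, 3.0, 25.0, 9.0],
    [2000.0, 2100.0, 1950.0, 2050.0, 1980.0, 2020.0],
])

mask = generally_detected(calls)
cv = replicate_cv_percent(signals[mask])
print("genes kept:", int(mask.sum()), "median CV (%):", float(np.median(cv)))
```

Summarising the per-gene CV values by their median, as in the last line, mirrors how the text reports one median CV per laboratory and sample type.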

Fig 1.


Unsupervised principal component analysis (PCA). A total of 204 experiments are included in the three-dimensional PCA and each sphere represents the gene expression profile for a cell line or leukaemia sample. The signal used is DQN1. The first three principal components (PC) account for 41·0% of variation of the data (PC1 = 18·1%, PC2 = 14·9%, PC3 = 8·0%). The analysis is based on all probe sets represented on the HG-U133 Plus 2.0 microarray without any filtering process (n = 54 613). Outliers are marked with arrows. (A) The same sample types are represented by the same colour spheres. Distinct manufacturing batch numbers of the cell lines are given in Appendix SI. (B) Samples processed within the same centre are represented by the same colour spheres.
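As a rough illustration of how the percentages of variation per principal component in this legend can be obtained, the following Python sketch computes a PCA through a singular value decomposition of the centred sample-by-probe-set matrix. It assumes plain mean-centring without scaling; the function name and the random toy matrix are hypothetical, and the software and signal preprocessing actually used in the study are described in Appendix SI rather than reproduced here.

```python
import numpy as np


def pca_explained_variance(profiles, n_components=3):
    """Return sample scores and the fraction of variance per leading component.

    profiles: array of shape (n_samples, n_probe_sets).
    """
    x = np.asarray(profiles, dtype=float)
    centred = x - x.mean(axis=0)            # centre each probe set across samples
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    variances = s ** 2                      # proportional to variance along each component
    explained_ratio = variances / variances.sum()
    scores = u * s                          # coordinates of each sample on the components
    return scores[:, :n_components], explained_ratio[:n_components]


# Toy matrix, made up for illustration: 12 samples x 50 probe sets.
rng = np.random.default_rng(0)
toy_profiles = rng.normal(size=(12, 50))
scores, ratio = pca_explained_variance(toy_profiles)
print("variance explained by PC1-PC3 (%):", np.round(100 * ratio, 1))
```

The three returned ratios correspond to the quantities reported in the legend as PC1, PC2, and PC3, and their sum gives the cumulative share of variation (41·0% in Fig 1).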

Fig 2.


Analysis of intra- and inter-laboratory reproducibility. (A) Box-and-whisker plots display, for each laboratory, the intra-laboratory squared correlation coefficients (r2) of all probe sets represented on the HG-U133 Plus 2.0 microarray for the HepG2 cell line sample. The signal used is DS. Each laboratory analysed six HepG2 samples using various amounts of starting total RNA: 1·0 μg, 1·5 μg, 3·0 μg, 5·0 μg (duplicate), or 8·0 μg, respectively. Thus, all possible different pairwise comparisons were performed (Count). Mean r2 values (black arrow) and standard deviation (SD) values are given for each of the series of comparisons for each laboratory. Outliers are represented as red boxes. Note: more comparisons were performed in Centres 9 and 11 because multiple operators contributed microarray data (Appendix SII). (B) Repeatability of expression signal within laboratories. The CV of the expression signal values between centre replicates of the same sample type was calculated for all generally detected genes (left y-axis). The distributions of replicate CVs are presented in a series of eleven box-and-whisker plots: one for each of the two sample types HepG2 (left) or MCF-7 (right) at the eleven distinct laboratories. The median (line), interquartile range as well as the 10th and 90th percentile values are indicated in each plot. Only genes that were generally detected were included in the box plots and CV calculations. The number of generally detected genes was defined as being called present in at least one third of the samples, e.g., at least two out of the six replicates per centre. This number varied by sample and laboratory and is noted as the line plot with the y-axis on the right. (C) Box-and-whisker plots display the inter-laboratory squared correlation coefficients (r2) of all probe sets represented on the HG-U133 Plus 2.0 microarray for the HepG2 cell line sample. The signal used is DS. 
Each centre analysed six HepG2 samples using various amounts of starting total RNA: 1·0 μg, 1·5 μg, 3·0 μg, 5·0 μg (duplicate), or 8·0 μg, respectively. Here, microarray data from Centre 3 is compared with all other laboratories. Each inter-laboratory analysis with different pairwise comparisons is represented by a single box plot (Count). Mean r2 values (black arrow) and standard deviation (SD) values are given for each series of comparisons. Outliers are represented as red boxes. Note: more comparisons were performed in Centres 9 and 11 because multiple operators contributed microarray data (Appendix SII). (D) Scatter plot analysis of inter-laboratory reproducibility. The graph shows 10 distinct scatter plot analyses, each displaying a comparison between Centre 3 and the other laboratories for the 5·0 μg HepG2 sample run at the stage of proficiency testing. The r2 value calculation is based on DS intensity signals from all probe sets on the HG-U133 Plus 2.0 microarray.

Inter-laboratory reproducibility of gene expression analyses

As an example of inter-laboratory reproducibility of gene expression analyses, correlations between Centre 3 and the ten other laboratories are given (Fig 2C and D). The degree of correlation was only slightly different to the intra-laboratory reproducibility (Fig 2C). The minimum and maximum mean r2 values were 0·959 and 0·985, respectively. This again demonstrated a high inter-laboratory correlation of HepG2 gene expression profiles and confirmed the outstanding performance of microarray analysis in the 11 centres. This high inter-laboratory consistency can also be shown in pairwise scatter plot analyses. The 5·0 μg HepG2 replicate analysis between Centre 3 and other laboratories is shown as an example (Fig 2D). A very tight distribution of gene expression data can be observed along the diagonal line for every paired HepG2 sample. Additional analyses of inter-site correlations for HepG2 subsets across all laboratories, along with hierarchical cluster and principal component analyses, are given in Appendix SI. Furthermore, the online section also contains an analysis of the relative contribution of different sources of both technical and biological variability in gene expression measurements.
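The pairwise comparisons behind the intra- and inter-laboratory r2 values can be sketched in Python as follows. This is a minimal illustration that assumes r2 is the squared Pearson correlation between two signal vectors covering the same probe sets; the function names and the simulated profiles are hypothetical, and any transformation applied to the signals before correlation is not specified in the text.

```python
from itertools import combinations, product

import numpy as np


def r_squared(a, b):
    """Squared Pearson correlation between two expression profiles."""
    r = np.corrcoef(a, b)[0, 1]
    return r * r


def intra_lab_r2(profiles):
    """r2 for every distinct pair of replicate profiles from one centre."""
    return [r_squared(a, b) for a, b in combinations(profiles, 2)]


def inter_lab_r2(profiles_a, profiles_b):
    """r2 for every profile of one centre paired with every profile of another."""
    return [r_squared(a, b) for a, b in product(profiles_a, profiles_b)]


# Simulated data, made up for illustration: two centres, six replicates each,
# 200 probe sets sharing one underlying profile plus multiplicative noise.
rng = np.random.default_rng(1)
base = rng.lognormal(mean=5.0, sigma=1.0, size=200)
centre_a = [base * (1 + rng.normal(scale=0.05, size=200)) for _ in range(6)]
centre_b = [base * (1 + rng.normal(scale=0.05, size=200)) for _ in range(6)]

within = intra_lab_r2(centre_a)
between = inter_lab_r2(centre_a, centre_b)
print(len(within), "intra-laboratory pairs, mean r2 =", round(float(np.mean(within)), 3))
print(len(between), "inter-laboratory pairs, mean r2 =", round(float(np.mean(between)), 3))
```

With six replicates per centre, this enumeration yields 15 distinct within-centre pairs and, if every cross pairing is counted, 36 between-centre pairs. Whether the counts shown in Fig 2 were formed in exactly this way is not stated in the text.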

Discussion

Taken together, this study demonstrated that standardizing experimental protocols for microarray analysis and providing thorough operator training resulted in excellent comparability of data sets generated both within a participating laboratory and across 11 different laboratories on three continents. This extends the observations of a recent cross-platform comparison study from the Toxicogenomics Research Consortium (Bammler et al, 2005). In particular, as also noted by Bammler et al (2005), the standardization of RNA labelling protocols using common procedures was recognized as an important contributor to signal intensity correlations across different laboratories. Our results are also consistent with the intra-platform precision demonstrated at three different centres in the recent MicroArray Quality Control (MAQC) project data (Shi et al, 2006).

In conclusion, this standardization effort represented the prerequisite foundation of the first phase of the MILE study, wherein 1889 patients have, thus far, been analysed by whole genome expression microarrays (Haferlach et al, 2006). The protocol devised for sample preparation takes only one working day from cDNA synthesis to cocktail hybridization and is easily applicable in a daily routine setting. The standardization of gene expression profiling testing in this way has the potential to offer identical objective diagnostic results in any trained laboratory throughout the world. Thus, microarrays are moving substantially closer to routine application of gene expression profiling for the diagnosis of leukaemias in clinical practice.

Authors’ contributions

AK, LW, TH: design of the study and drafting the article; RL, WML, PMW: statistical analysis and interpretation of data; TJK, LZR, JRD, SAS, KIM, AFG, WKH, GB, MCDO, RF, SC, JDV, SR, PRP, JMH, EL, AEY, ESK: data acquisition, interpretation of data, and article revision. All authors approved the final version submitted for publication.

Acknowledgments

We would like to acknowledge the technical assistance of Traci Lyn Toy, W. Kent Williams, Letha Phillips, Verena Serbent, Simona Tavolaro, Monica Messina, Julie Tsai, Matt Eaton, Véronique Pantesco, William Overman, Ted Farr, Cecilia S. N. Kwok, Pei Tee Hwan, and Dr. Lu Yi. We further thank Dr. Geertruy te Kronnie, Prof. Marie Christine Béné, Prof. Claude Preudhomme, and Prof. Elizabeth Macintyre for support throughout the conduct of the prephase of the MILE study.

Funding

This study is part of the MILE Study (Microarray Innovations In LEukemia) programme, an ongoing collaborative effort headed by the European Leukaemia Network (ELN) and sponsored by Roche Molecular Systems, Inc., addressing gene expression signatures in acute and chronic leukaemias. This work is further partly supported by AIRC (Associazione Italiana per la Ricerca sul Cancro), Milan, Ministero dell’Università e della Ricerca, Fondo per gli Investimenti della Ricerca di Base (FIRB) and COFIN, Rome, Italy.

Conflict of interest

AK, RL, WML, PMW, and LW are employed by Roche Molecular Systems, Inc. and are involved in the AmpliChip Leukaemia Test research programme, a gene expression microarray for the subclassification of leukaemia. TH is a consultant for F. Hoffmann-La Roche Ltd, Basel, Switzerland. The other authors report no potential conflicts of interest.

Supplementary material

The following supplementary material is available for this article online:

Appendix SI. Details on microarray analysis, manufacturing lot numbers of cell lines, and additional information on interlaboratory reproducibility.

bjh0142-0802-SD1.xls (82KB, xls)

Appendix SII. Information on microarray quality parameters.

bjh0142-0802-SD2.xls (126.5KB, xls)

Appendix SIII. r2 correlation data for MCF-7 cell line data.

bjh0142-0802-SD3.xls (126.5KB, xls)

Appendix SIV. r2 correlation data for HepG2 cell line data.

bjh0142-0802-SD4.xls (28.5KB, xls)

Appendix SV. r2 correlation data for leukaemia samples data.

bjh0142-0802-SD5.doc (469KB, doc)

The material is available as part of the online article from: http://www.blackwell-synergy.com/doi/abs/10.1111/j.1365-2141.2008.07261.x

(This link will take you to the article abstract).

Please note: Blackwell Publishing are not responsible for the content or functionality of any supplementary materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

References

  1. Armstrong SA, Staunton JE, Silverman LB, Pieters R, Den Boer ML, Minden MD, Sallan SE, Lander ES, Golub TR, Korsmeyer SJ. MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics. 2002;30:41–47. doi: 10.1038/ng765.
  2. Bammler T, Beyer RP, Bhattacharya S, Boorman GA, Boyles A, Bradford BU, Bumgarner RE, Bushel PR, Chaturvedi K, Choi D, Cunningham ML, Deng S, Dressman HK, Fannin RD, Farin FM, Freedman JH, Fry RC, Harper A, Humble MC, Hurban P, Kavanagh TJ, Kaufmann WK, Kerr KF, Jing L, Lapidus JA, Lasarev MR, Li J, Li YJ, Lobenhofer EK, Lu X, Malek RL, Milton S, Nagalla SR, O’Malley JP, Palmer VS, Pattee P, Paules RS, Perou CM, Phillips K, Qin LX, Qiu Y, Quigley SD, Rodland M, Rusyn I, Samson LD, Schwartz DA, Shi Y, Shin JL, Sieber SO, Slifer S, Speer MC, Spencer PS, Sproles DI, Swenberg JA, Suk WA, Sullivan RC, Tian R, Tennant RW, Todd SA, Tucker CJ, Van Houten B, Weis BK, Xuan S, Zarbl H. Standardizing global gene expression analysis between laboratories and across platforms. Nature Methods. 2005;2:351–356. doi: 10.1038/nmeth754.
  3. Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science. 1999;286:531–537. doi: 10.1126/science.286.5439.531.
  4. Haferlach T, Kohlmann A, Schnittger S, Dugas M, Hiddemann W, Kern W, Schoch C. Global approach to the diagnosis of leukemia using gene expression profiling. Blood. 2005;106:1189–1198. doi: 10.1182/blood-2004-12-4938.
  5. Haferlach T, Kohlmann A, Basso G, Bene MC, Downing J, Shurtleff S, Hernandez JM, Hofmann WK, Kipps TJ, Kronnie TT, Liu WM, Li R, Macintyre E, Preudhomme C, Chiaretti S, Rassenti L, de Vos J, Yeoh A, Brown C, Williams M, Mills K, Wieczorek L, Foa R. An international multi-center study to define the application of microarrays in the diagnosis and subclassification of leukemia (MILE study): interim analysis based on 1,889 patients achieves 95.4% prediction accuracy. Blood. 2006;108:34A–35A.
  6. Kohlmann A, Schoch C, Schnittger S, Dugas M, Hiddemann W, Kern W, Haferlach T. Molecular characterization of acute leukemias by use of microarray technology. Genes, Chromosomes & Cancer. 2003;37:396–405. doi: 10.1002/gcc.10225.
  7. Kohlmann A, Schoch C, Dugas M, Rauhut S, Weninger F, Schnittger S, Kern W, Haferlach T. Pattern robustness of diagnostic gene expression signatures in leukemia. Genes, Chromosomes & Cancer. 2005;42:299–307. doi: 10.1002/gcc.20126.
  8. Schoch C, Kohlmann A, Schnittger S, Brors B, Dugas M, Mergenthaler S, Kern W, Hiddemann W, Eils R, Haferlach T. Acute myeloid leukemias with reciprocal rearrangements can be distinguished by specific gene expression profiles. Proceedings of the National Academy of Sciences of the United States of America. 2002;99:10008–10013. doi: 10.1073/pnas.142103599.
  9. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Scherf U, Thierry-Mieg J, Wang C, Wilson M, Wolber PK, Zhang L, Amur S, Bao W, Barbacioru CC, Lucas AB, Bertholet V, Boysen C, Bromley B, Brown D, Brunner A, Canales R, Cao XM, Cebula TA, Chen JJ, Cheng J, Chu TM, Chudin E, Corson J, Corton JC, Croner LJ, Davies C, Davison TS, Delenstarr G, Deng X, Dorris D, Eklund AC, Fan XH, Fang H, Fulmer-Smentek S, Fuscoe JC, Gallagher K, Ge W, Guo L, Guo X, Hager J, Haje PK, Han J, Han T, Harbottle HC, Harris SC, Hatchwell E, Hauser CA, Hester S, Hong H, Hurban P, Jackson SA, Ji H, Knight CR, Kuo WP, Leclerc JE, Levy S, Li QZ, Liu C, Liu Y, Lombardi MJ, Ma Y, Magnuson SR, Maqsodi B, McDaniel T, Mei N, Myklebost O, Ning B, Novoradovskaya N, Orr MS, Osborn TW, Papallo A, Patterson TA, Perkins RG, Peters EH, Peterson R, Philips KL, Pine PS, Pusztai L, Qian F, Ren H, Rosen M, Rosenzweig BA, Samaha RR, Schena M, Schroth GP, Shchegrova S, Smith DD, Staedtler F, Su Z, Sun H, Szallasi Z, Tezak Z, Thierry-Mieg D, Thompson KL, Tikhonova I, Turpaz Y, Vallanat B, Van C, Walker SJ, Wang SJ, Wang Y, Wolfinger R, Wong A, Wu J, Xiao C, Xie Q, Xu J, Yang W, Zhang L, Zhong S, Zong Y, Slikker W Jr. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nature Biotechnology. 2006;24:1151–1161. doi: 10.1038/nbt1239.
  10. Yeoh EJ, Ross ME, Shurtleff SA, Williams WK, Patel D, Mahfouz R, Behm FG, Raimondi SC, Relling MV, Patel A, Cheng C, Campana D, Wilkins D, Zhou X, Li J, Liu H, Pui CH, Evans WE, Naeve C, Wong L, Downing JR. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell. 2002;1:133–143. doi: 10.1016/s1535-6108(02)00032-6.


Articles from British Journal of Haematology are provided here courtesy of Wiley
