Supplementary material for Virtaneva et al. (2001) Proc. Natl. Acad. Sci. USA 98 (3), 1124-1129. (10.1073/pnas.031566698)
Materials and Methods
RNA Extraction and cRNA Target Preparation for Oligonucleotide Arrays.
RNA from CD34+ cells was extracted by using a total RNA extraction protocol (Qiagen). AML bone marrow (BM) samples in cell-freezing medium were thawed in 10 vol of RNA Stat-60 (Tel-Test, Houston, TX); 0.2 vol of chloroform was added to the lysed cells, and the aqueous layer containing the RNA was purified with RNeasy columns (Qiagen). The integrity of the individual RNA samples was verified by denaturing agarose gel electrophoresis. A total of 4-8 µg of total cellular RNA from the CD34+ cells and 8 µg of total cellular RNA from the AML samples was used for cDNA synthesis according to the Affymetrix (Santa Clara, CA) protocol. Briefly, a mixture of in vitro transcribed cRNAs of cloned bacterial genes for lysA, pheB, thrB, and jojF (American Type Culture Collection) was added as external controls to monitor the efficiency of cRNA synthesis. First-strand cDNA synthesis was performed at 42°C for 1 hr with the Superscript II system (GIBCO/BRL) at a final concentration of 1× first-strand synthesis buffer, 10 mM DTT, 500 µM dNTPs, 100 pmol of T7-(T)24 primer, and 200 units of reverse transcriptase. Second-strand cDNA synthesis was performed at 16°C for 2 hr at a final concentration of 1× second-strand buffer, 250 µM dNTP, 1.2 mM DTT, 65 units/ml DNA ligase, 250 units/ml DNA polymerase I, and 13 units/ml RNase H. Second-strand synthesis reaction mixtures were extracted with 25:24:1 phenol/chloroform/isoamyl alcohol (saturated with 10 mM Tris·HCl, pH 8.0/1 mM EDTA) by using Phase Lock Gels (5 Prime → 3 Prime), precipitated in 0.75 vol of 5 M NH4OAc, 2.5 vol of 100% ethanol, and 2 µl of glycogen, and washed. In vitro transcription labeling with biotinylated UTP and CTP was performed according to the manufacturer's recommendations (Enzo Diagnostics) for 10 hr at 37°C. Amplified cRNA was purified on an affinity column (RNeasy, Qiagen), and the quality of the amplification was verified by denaturing agarose gel electrophoresis.
cRNAs were fragmented for 35 min at 94°C in 50 mM Tris-acetate, pH 8.1/100 mM KOAc/10 mM Mg(OAc)2. The hybridization cocktail consisted of 0.125 mg/ml fragmented cRNA, 50 pM control oligonucleotide B2, 0.1 mg/ml herring sperm DNA, 0.5 mg/ml acetylated BSA, 100 mM Mes, 20 mM EDTA, 0.01% Tween 20 (total Na+ = 1 M), and bacterial sense cRNA controls for bioB, bioC, bioD, and cre at 1.5, 5.0, 25, and 100 pM, respectively.
HuGeneFL Array Hybridization and Scanning.
The target hybridization mixture was heated at 99°C for 5 min, followed by a 5-min incubation at the hybridization temperature of 45°C. Prewetted array cartridges were incubated with the hybridization mix for 16 hr in a mixing rotisserie at 60 rpm. Array cartridges were washed first with 6× SSPE/0.01% Tween-20/0.005% anti-foam and then with 100 mM NaMes/0.01% Tween-20 according to Affymetrix washing protocols. Hybridized arrays were subsequently stained with 100 mM Mes/0.05% Tween-20 (total Na+ = 1 M), containing 2 mg/ml acetylated BSA, 0.1 mg/ml normal goat IgG (Sigma), and 3 µg/ml biotinylated goat anti-streptavidin antibody (Vector Laboratories), following the Affymetrix antibody staining protocol. After the washes, the probe arrays were scanned with the GeneChip system confocal scanner. Fluorescence intensities of scanned probe arrays were analyzed with the GeneChip 3.1 software (Affymetrix). Expression and log10 expression levels for each gene were derived from intensity values collected from 20 oligonucleotide probe pairs. The software provides scaled intensities such that the average intensity for each chip is approximately equivalent. In addition, the software makes an absolute call (present/absent) for each gene. Expression and log expression were further normalized across samples by using a modification of the linear scaling method (1). Analyses were performed primarily with log expression values, which have reduced skew and desirable variability properties. All primary expression data used for statistical analysis are available at http://cancergenetics.med.ohio-state.edu/microarrayunit.
Statistical Analysis.
Analyses were performed with S-Plus 3.4 (Mathsoft, Seattle). Four pairwise group comparisons were performed: the combined AML groups vs. CD34+, AML+8 vs. CD34+, AML-CN vs. CD34+, and AML+8 vs. AML-CN. Expression levels of genes across groups were compared by using two-sample t tests (described as z tests for simplicity), with appropriate significance testing as described below. For each comparison, lists of the most up-regulated and down-regulated genes were prepared. To reduce false positives in sample group comparisons, we used only genes that were called present in a majority of samples in a given comparison, referring to such genes as "expressed." For two-way cluster analysis, a variation filter was applied to all genes to select those genes with a range of log expression values in excess of 2. Filtered genes were median-centered and median-polished as recommended in the software manual. Correlation-based, centered, average linkage clustering was applied to the genes and samples in a two-way cluster analysis by using Cluster and TreeView (2). Class prediction approaches were applied to the group comparisons and included linear discriminant analysis (3) and techniques based on weighted contributions of genes showing the greatest expression differences between the two groups (1). A simpler z-coordinate procedure was also performed, in which expression values were converted into unit normal deviates for the genes showing the greatest differences across groups. For each sample in a pairwise group comparison, these deviates were averaged over the genes in which group 1 exceeded group 2 to form a prediction coordinate (coordinate 1). Similarly, coordinate 2 was computed as the average over those genes where group 2 exceeded group 1.
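As an illustration only, the z-coordinate procedure described above might be sketched in Python as follows. The original analysis was carried out in S-Plus, so none of this is the authors' code. The function names, the n_top parameter, the unequal-variance form of the t statistic, and the choice to standardize each gene across all samples in the comparison are assumptions made for this sketch.

```python
import numpy as np

def two_sample_t(x, in_group1):
    """Per-gene two-sample t statistic for a genes x samples matrix.

    Uses the unequal-variance form; the text does not say which
    form of the t test was used, so this is an assumption.
    """
    g1, g2 = x[:, in_group1], x[:, ~in_group1]
    se = np.sqrt(g1.var(axis=1, ddof=1) / g1.shape[1]
                 + g2.var(axis=1, ddof=1) / g2.shape[1])
    return (g1.mean(axis=1) - g2.mean(axis=1)) / se

def z_coordinates(x, in_group1, n_top=50):
    """Return (coordinate 1, coordinate 2), one value per sample.

    x         : log expression, genes in rows, samples in columns
    in_group1 : boolean mask over samples, True for group 1
    n_top     : hypothetical number of top-ranked genes to keep
    """
    x = np.asarray(x, dtype=float)
    in_group1 = np.asarray(in_group1, dtype=bool)

    # Rank genes by the size of the group difference and keep the top ones.
    t = two_sample_t(x, in_group1)
    top = np.argsort(-np.abs(t))[:n_top]

    # Convert each selected gene to unit normal deviates across samples.
    sel = x[top]
    z = (sel - sel.mean(axis=1, keepdims=True)) / sel.std(axis=1, ddof=1, keepdims=True)

    # Coordinate 1 averages genes where group 1 exceeds group 2;
    # coordinate 2 averages genes where group 2 exceeds group 1.
    # Both sets are assumed to be non-empty.
    up_in_group1 = t[top] > 0
    coord1 = z[up_in_group1].mean(axis=0)
    coord2 = z[~up_in_group1].mean(axis=0)
    return coord1, coord2
```

Under these assumptions, a sample from group 1 tends to have a high coordinate 1 and a low coordinate 2, and the reverse holds for a sample from group 2, so the two coordinates can be plotted against each other or compared with the group means.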
Cross-validation was performed by holding out one sample at a time and recreating the entire prediction rule, providing nearly unbiased estimates of prediction error rates (4). Class prediction was performed by selecting the group with the mean closest to the predictor variables of the withheld sample. For the analysis of expression by chromosome, expression levels of each sample were normalized to have a common mean and variance equal to the global mean and variance of the AML samples. Chromosome-wide expression levels were compared in the two AML groups by first calculating for each gene the ratio of its expression level to the mean expression for that gene in the AML-CN group. For each sample, these ratios were then averaged over the genes on that chromosome expressed in a majority of AML samples. An additional calibration step was then employed in which the weighted average of the ratios across the chromosomes was set to 1. This last step altered the expression ratios by 4% and did not affect the conclusion of a specific chromosome 8 expression increase in AML+8. Ninety-five percent confidence intervals were calculated based on a two-sample t comparison of chromosome-specific ratios in the two groups. This analysis was supplemented with an analysis in which log expression for each gene was rescaled to have common mean and variance, and these values were directly averaged over the genes within the chromosome for each sample. The statistic from the resulting t test was then recomputed for each of 1,000 permutation samples in which group membership (AML-CN vs. AML+8) was randomized. For the comparison of AML+8 to AML-CN for genes within a chromosome, expression levels for these genes were similarly calculated as the ratio of mean expression in AML+8 to mean expression in AML-CN.
TaqMan Real-Time PCR Assay.
RNA from the same original extraction was used for both the DNA microarray expression profiling and the real-time PCR assay. Contaminating DNA was removed from the RNA samples by DNase I treatment (DNA-free; Ambion, Austin, TX). The quality of the DNA-free RNA was confirmed by standard PCR to verify the absence of a genomic fragment for GPI. For cDNA synthesis, the Superscript Choice system (Life Technologies) was used according to the manufacturer's recommendation to reverse transcribe 1.5 µg of total cellular RNA with both random hexamer and oligo(dT)12-18 primers at 42°C for 1 hr. cDNA samples were treated with SuperscriptII RNase H- for 1 hr, pooled, and diluted with H2O. TaqMan 5' nuclease real-time PCR assays were carried out for each sample in 25-µl reaction mixtures of 1× universal master mix, 1× human β-actin (VIC-labeled) as an endogenous control, 200 nM target forward and reverse primers, and 100 nM of the target TaqMan oligonucleotide at 50°C for 2 min, 95°C for 10 min, and 40 cycles of 95°C for 15 sec and 60°C for 1 min. All experimental TaqMan oligonucleotide probes were labeled with 6-carboxyfluorescein (6-FAM) at the 5' end and the quencher 6-carboxytetramethylrhodamine (TAMRA) at the 3' end. Sequences for PCR primers and TaqMan oligonucleotide probes are available upon request. The comparative CT method was used to determine the ratio of target and endogenous control according to specifications (Applied Biosystems), to assess the relative levels of expression and expression differences between AML+8 and AML-CN. First, the parameter threshold cycle (CT) was determined for the target and internal control; then the cycle number difference ΔCT was calculated. To determine the cDNA copy number difference between the two AML sample groups, the difference between the AML-CN ΔCT value and the AML+8 ΔCT value was determined and noted as ΔΔCT.
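The comparative CT arithmetic can be illustrated with a minimal Python sketch. It assumes that ΔCT is the target CT minus the endogenous control CT, that AML-CN serves as the reference group so that ΔΔCT = ΔCT(AML+8) - ΔCT(AML-CN), and that template doubles exactly with each cycle, giving a fold difference of 2^(-ΔΔCT). The sign convention, the function names, and the CT values in the usage note are assumptions of this sketch, not data or code from the study.

```python
def delta_ct(ct_target, ct_control):
    """Cycle difference between the target and the endogenous control."""
    return ct_target - ct_control

def fold_difference(ct_target_test, ct_control_test,
                    ct_target_ref, ct_control_ref):
    """Expression of the test group relative to the reference group.

    Assumes perfect doubling per PCR cycle, so one cycle of
    difference corresponds to a twofold difference in cDNA copies.
    """
    dd_ct = (delta_ct(ct_target_test, ct_control_test)
             - delta_ct(ct_target_ref, ct_control_ref))
    return 2.0 ** (-dd_ct)
```

For example, with made-up CT values of 24 (target) and 18 (control) in the test group and 25 (target) and 18 (control) in the reference group, ΔΔCT is -1, which corresponds to a twofold higher level in the test group.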
The difference in cDNA copy number between AML-CN and AML+8 cDNAs was calculated by the expression 2^(-ΔΔCT), which relates the cycle difference ΔΔCT to the doubling of cDNA copies in each PCR cycle.
References
1. Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., et al. (1999) Science 286, 531-537.
2. Eisen, M. B., Spellman, P. T., Brown, P. O. & Botstein, D. (1998) Proc. Natl. Acad. Sci. USA 95, 14863-14868.
3. Johnson, R. A. & Wichern, D. W. (1992) Applied Multivariate Statistical Analysis (Prentice Hall, Englewood Cliffs, NJ).
4. Cox, D. R. & Hinkley, D. V. (1974) Theoretical Statistics (Chapman and Hall, New York).
5. Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewan, S., Dmitrovsky, E., Lander, E. S. & Golub, T. R. (1999) Proc. Natl. Acad. Sci. USA 96, 2907-2912.
6. Alon, U., Barkai, N., Notterman, D. A., Gish, K., Ybarra, S., Mack, D. & Levine, A. J. (1999) Proc. Natl. Acad. Sci. USA 96, 6745-6750.