Hakonarson et al. 10.1073/pnas.0409904102.
Fig. 3. PBMC-derived gene expression profile predicting GC-R from GC-S asthma patients at baseline. (A) Differential expression of the 11 genes that most accurately separated GC responders from nonresponders in the training set at baseline. Genes were ranked by a metric similar to signal-to-noise, and the top-ranked genes were considered the most differentially expressed. For each gene shown, red indicates a high level of expression relative to the mean; blue indicates a low level of expression relative to the mean. (B) Values for the two groups are shown as mean ± SD.
Supporting Text
Classification of Drug Response Phenotypes
A predictor was generated for the training set by using a weighted voting algorithm similar to that used by Golub et al. (1), selecting the genes deemed most relevant in distinguishing GC responders from nonresponders. The weighted voting algorithm forms a weighted linear combination of the "informative" genes identified in the training set to provide a classification scheme for new samples in the test set. The algorithm proceeds as follows. For each gene, the mean (m) and standard deviation (e) of each of the two classes (resistant, R, and sensitive, S) in the training set are first calculated. Then the Euclidean distance between the two classes is calculated for each gene X such that $ED_X = \sqrt{(m_R - m_S)^2}$. The variance of each class is equal to the square of the standard deviation within that class, $V_R = e_R^2$ and $V_S = e_S^2$. Next, the metric used to choose the most informative genes and to calculate the weights is computed for each gene: $M_X = (ED_X)^2/(V_S + V_R)$. To predict the class of a test sample Y, each gene X casts a vote for each class: $W_{XR} = M_X\sqrt{(Y - m_R)^2}$ and $W_{XS} = M_X\sqrt{(Y - m_S)^2}$. The final class of test sample Y is given by the lesser of $\sum_X W_{XR}$ and $\sum_X W_{XS}$. Each gene thus casts a vote based on its metric and on which class mean its signal is closest to. The class with the smallest total vote is the predicted class, i.e., the class the test sample is closest to, using the Euclidean distance as the measure. Thus, this method determines whether an accurate prediction of a drug response can be achieved for a test set (with noninformative prior) by expression values of a limited set of genes (n = 1-50) from the training set, as described (2-4).
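The voting procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout, the gene labels, and the toy expression values are all hypothetical, the sample (n - 1) variance is assumed because the text does not say which estimator was used, and ties are broken arbitrarily in favor of the sensitive class.

```python
from statistics import mean, variance


def gene_metric(resistant, sensitive):
    """Return (m_R, m_S, M_X) for one gene from its training values."""
    m_r, m_s = mean(resistant), mean(sensitive)
    ed = abs(m_r - m_s)  # sqrt((m_R - m_S)^2) for a single gene
    # Assumption: sample variance; the text only says "standard deviation squared".
    v_r, v_s = variance(resistant), variance(sensitive)
    return m_r, m_s, ed ** 2 / (v_s + v_r)


def train(train_r, train_s, n_genes):
    """Keep the n_genes genes with the largest metric M_X.

    train_r / train_s map a gene label to its list of expression values
    in the resistant / sensitive training samples (hypothetical layout).
    """
    stats = {g: gene_metric(train_r[g], train_s[g]) for g in train_r}
    top = sorted(stats, key=lambda g: stats[g][2], reverse=True)[:n_genes]
    return {g: stats[g] for g in top}


def predict(model, sample):
    """Sum the weighted distances to each class mean; the smaller total wins."""
    vote_r = sum(m * abs(sample[g] - m_r) for g, (m_r, _, m) in model.items())
    vote_s = sum(m * abs(sample[g] - m_s) for g, (_, m_s, m) in model.items())
    # Assumption: a tie is assigned to "sensitive".
    return "resistant" if vote_r < vote_s else "sensitive"


# Made-up toy data: three genes, two training samples per class.
toy_r = {"g1": [10.0, 12.0], "g2": [5.0, 5.5], "g3": [1.0, 3.0]}
toy_s = {"g1": [2.0, 4.0], "g2": [5.2, 5.4], "g3": [2.0, 2.5]}

model = train(toy_r, toy_s, n_genes=2)
print(predict(model, {"g1": 9.0, "g2": 5.3, "g3": 2.0}))
print(predict(model, {"g1": 3.5, "g2": 5.0, "g3": 2.2}))
```

In this toy set the gene g1 dominates the vote because its class means are far apart relative to the within-class variances, which is the behavior the metric is designed to reward.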
Reagents
RPMI medium 1640 was obtained from GIBCO/BRL. The cytokines IL-1β and TNF-α and the ELISA kits for IL-13 were obtained from R&D Systems. Dexamethasone was purchased from Sigma. Ficoll-Paque PLUS was obtained from Amersham Pharmacia Biosciences. TRIzol reagent was purchased from Invitrogen Life Technologies (Carlsbad, CA). The RNeasy Mini kit was purchased from Qiagen (Valencia, CA), and DNase I was from Ambion (Austin, TX). The GeneChip microarrays and instrument system were provided by Affymetrix (Santa Clara, CA). All reagents used for hybridization target preparation were purchased from the suppliers recommended by Affymetrix in the GeneChip Expression Analysis technical manual (www.affymetrix.com). A TaqMan Reverse Transcription Reagents kit, TaqMan Universal PCR Master Mix, and TaqMan Assays were purchased from Applied Biosystems.
1. Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., et al. (1999) Science 286, 531-537.
2. Bolstad, B. M., Irizarry, R. A., Astrand, M. & Speed, T. P. (2003) Bioinformatics 19, 185-193.
3. Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U. & Speed, T. P. (2003) Biostatistics 4, 249-264.
4. Ambroise, C. & McLachlan, G. J. (2002) Proc. Natl. Acad. Sci. USA 99, 6562-6566.