Fig. 3. ARPA for the MTA sample.
a Tree derived by CART with SUD status as the categorical target variable (disjunctive affection status, i.e., substance use of alcohol, nicotine, cannabis, or other drugs). Because the MTA is a longitudinal study, we used SUD status at the 96- and 120-month follow-ups and applied a lag analysis of SUD emergence. The derived tree included a demographic variable (site of ascertainment) and genetic variables (markers rs2172802, rs61747658, rs12509110, and rs6856328). The combination of variants rs61747658 and rs2172802 produced an important discriminant split between the SUD-affected and unaffected classes. b Variable importance scores derived by Random Forest and TreeNet analyses were consistent with the variables included in the CART-derived tree. c, d TreeNet analysis to maximize ROC area and minimize classification error using 200 trees. The AUCs were 0.808 and 0.643 for the learning and testing samples, respectively, and the proportions of misclassified SUD cases in the cross-validation experiment were 0.314 and 0.358 for the learning and testing data, respectively. Conventions as in Fig. 1.
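
For readers unfamiliar with the two summary statistics reported in panels c and d, the following minimal sketch shows how an area under the ROC curve and a proportion of misclassified cases can be computed from predicted scores. It is an illustration only: the labels and scores are toy values, not MTA data, and the function names `roc_auc` and `case_misclassification` as well as the 0.5 decision threshold are assumptions made for this example rather than part of the TreeNet analysis.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def case_misclassification(labels, scores, threshold=0.5):
    """Proportion of true cases whose score falls below the threshold."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    missed = sum(1 for s in case_scores if s < threshold)
    return missed / len(case_scores)


# Toy data (hypothetical): 1 = affected, 0 = unaffected.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.5]

auc = roc_auc(labels, scores)
miss = case_misclassification(labels, scores)
print(f"AUC = {auc:.3f}, misclassified cases = {miss:.3f}")
```

Computing both quantities separately on the learning and testing samples gives the kind of paired values reported above; a gap between the two AUCs indicates how much of the apparent discrimination fails to generalize.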