Author manuscript; available in PMC: 2012 Mar 30.
Published in final edited form as: Nat Biotechnol. 2010 Jul 30;28(8):827–838. doi: 10.1038/nbt.1665

Table 2.

Modeling factor options frequently adopted by MAQC-II data analysis teams

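Counts refer to the original analysis (training => validation).

| Modeling factor | Option | Number of teams | Number of endpoints | Number of models |
|---|---|---|---|---|
| Summary and normalization | Loess | 12 | 3 | 2,563 |
| | RMA | 3 | 7 | 46 |
| | MAS5 | 11 | 7 | 4,947 |
| Batch-effect removal | None | 10 | 11 | 2,281 |
| | Mean shift | 3 | 11 | 7,279 |
| Feature selection | SAM | 4 | 11 | 3,771 |
| | FC+P | 8 | 11 | 4,711 |
| | T-Test | 5 | 11 | 400 |
| | RFE | 2 | 11 | 647 |
| Number of features | 0~9 | 10 | 11 | 393 |
| | 10~99 | 13 | 11 | 4,445 |
| | 100~999 | 10 | 11 | 4,298 |
| | ≥1,000 | 3 | 11 | 474 |
| Classification algorithm | DA | 4 | 11 | 103 |
| | Tree | 5 | 11 | 358 |
| | NB | 4 | 11 | 924 |
| | KNN | 8 | 11 | 6,904 |
| | SVM | 9 | 11 | 986 |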

Analytic options used by two or more of the 14 teams that submitted models for all endpoints in both the original and swap experiments. RMA, robust multichip analysis; SAM, significance analysis of microarrays; FC, fold change; RFE, recursive feature elimination; DA, discriminant analysis; Tree, decision tree; NB, naive Bayes; KNN, K-nearest neighbors; SVM, support vector machine.
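
For readers who want to work with these counts programmatically, the following is a minimal sketch, not part of the original analysis. It transcribes the rows of Table 2 and reports, for each modeling factor, the listed option associated with the most models. The names `ROWS` and `top_option_by_models` are hypothetical, chosen only for illustration. Because the table lists only options used by two or more teams, the result describes the listed options, not every model submitted.

```python
# Rows transcribed from Table 2: (factor, option, teams, endpoints, models).
# ROWS and top_option_by_models are illustrative names, not from the paper.
ROWS = [
    ("Summary and normalization", "Loess", 12, 3, 2563),
    ("Summary and normalization", "RMA", 3, 7, 46),
    ("Summary and normalization", "MAS5", 11, 7, 4947),
    ("Batch-effect removal", "None", 10, 11, 2281),
    ("Batch-effect removal", "Mean shift", 3, 11, 7279),
    ("Feature selection", "SAM", 4, 11, 3771),
    ("Feature selection", "FC+P", 8, 11, 4711),
    ("Feature selection", "T-Test", 5, 11, 400),
    ("Feature selection", "RFE", 2, 11, 647),
    ("Number of features", "0~9", 10, 11, 393),
    ("Number of features", "10~99", 13, 11, 4445),
    ("Number of features", "100~999", 10, 11, 4298),
    ("Number of features", ">=1,000", 3, 11, 474),
    ("Classification algorithm", "DA", 4, 11, 103),
    ("Classification algorithm", "Tree", 5, 11, 358),
    ("Classification algorithm", "NB", 4, 11, 924),
    ("Classification algorithm", "KNN", 8, 11, 6904),
    ("Classification algorithm", "SVM", 9, 11, 986),
]


def top_option_by_models(rows):
    """Return {factor: (option, models)} for the listed option with the most models."""
    best = {}
    for factor, option, _teams, _endpoints, models in rows:
        # Keep the option with the largest model count seen so far for this factor.
        if factor not in best or models > best[factor][1]:
            best[factor] = (option, models)
    return best


if __name__ == "__main__":
    for factor, (option, models) in top_option_by_models(ROWS).items():
        print(f"{factor}: {option} ({models:,} models)")
```

Model counts are compared only within a factor, since a single model contributes one option to each factor and summing across factors would count it more than once.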