Author manuscript; available in PMC: 2014 Jun 30.
Published in final edited form as: J Biomed Inform. 2013 Dec 25;48:160–170. doi: 10.1016/j.jbi.2013.12.012

Table 3.

Average runtime (in seconds) of each predictive modeling pipeline task type on the three EHR data sets.

Task Type                             Runtime (seconds)
                                          Small   Medium    Large
Cohort Construction                        6.14     6.53     9.32
Feature Construction (Diagnosis)          15.77   275.05   266.64
Feature Construction (Lab)                27.94   551.49   104.29
Feature Construction (Medication)         26.43   203.75   131.47
Feature Construction (Procedure)          13.93       --   199.25
Feature Construction (Symptoms)              --    88.33       --
Feature Construction (Merge)               1.32    12.16    16.28
Cross-validation                           7.00    42.37   248.41
Feature Selection (Fisher Score)           5.47    65.87    72.64
Feature Selection (Information Gain)       8.50   103.95   370.83
Classification (K-Nearest Neighbor)        0.16    64.02  1651.12
Classification (Naïve Bayes)               0.20     7.74    34.25
Classification (Logistic Regression)       0.18    24.56   342.65
Classification (Random Forest)             1.21   190.66  1649.13