2021 Oct 6;7(41):eabf0354. doi: 10.1126/sciadv.abf0354

Table 1. Patient counts in de-identified data and the fraction of datasets excluded by our exclusion criteria. Dataset sizes are after the exclusion criteria are applied.

Distinct patients: Truven 115,805,687; UCM 69,484

| | Truven, male | Truven, female | UCM, male | UCM, female |
|---|---|---|---|---|
| ASD diagnosis count* | 12,146 | 3,018 | 307 | 70 |
| Control count* | 2,301,952 | 2,186,468 | 20,249 | 17,386 |
| AUC at 125 weeks | 82.3% | 82.5% | 83.1% | 81.37% |
| AUC at 150 weeks | 84.79% | 85.26% | 82.15% | 83.39% |
| Excluded fraction of the datasets | | | | |
| Positive category | 0.0002 | 0.0 | 0.0160 | 0.0 |
| Control category | 0.0045 | 0.0045 | 0.0413 | 0.0476 |
| Average number of diagnostic codes in excluded patients (corresponding number in included patients) | | | | |
| Positive category | 4.33 (35.93) | 0.0 (36.07) | 2.6 (9.75) | 0.0 (10.18) |
| Control category | 1.57 (17.06) | 1.48 (15.96) | 2.32 (6.8) | 2.07 (6.79) |

*Cohort sizes are smaller than the total number of distinct patients because of our exclusion criteria, under which a patient is retained only if (i) at least one code within our complete set of tracked diagnostic codes is present in the patient record, and (ii) the time lag between the first and last available records for the patient is at least 15 weeks.
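
The two criteria in the table footnote can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (`PatientRecord`, with events stored as week/code pairs), the function names, and the placeholder codes are hypothetical. Two further assumptions are made here: the time lag is measured over all records of a patient rather than only tracked codes, and the excluded fraction is the number of excluded patients divided by all patients in the category.

```python
from dataclasses import dataclass

MIN_SPAN_WEEKS = 15  # threshold stated in the table footnote


@dataclass
class PatientRecord:
    # Hypothetical layout: each event is (week of record, diagnostic code).
    patient_id: str
    events: list[tuple[float, str]]


def is_included(record, tracked_codes, min_span_weeks=MIN_SPAN_WEEKS):
    """Return True if the record satisfies both footnote criteria."""
    if not record.events:
        return False
    weeks = [week for week, _ in record.events]
    # (i) at least one tracked diagnostic code appears in the record
    has_tracked = any(code in tracked_codes for _, code in record.events)
    # (ii) first-to-last record span is at least min_span_weeks
    # (assumption: measured over all records, not only tracked codes)
    span_ok = max(weeks) - min(weeks) >= min_span_weeks
    return has_tracked and span_ok


def excluded_fraction(records, tracked_codes):
    """Fraction of records failing the criteria (assumed definition)."""
    records = list(records)
    if not records:
        return 0.0
    excluded = sum(not is_included(r, tracked_codes) for r in records)
    return excluded / len(records)


# Placeholder codes for illustration only; not real diagnostic codes.
tracked = {"code_a", "code_b"}
records = [
    PatientRecord("p1", [(0, "code_a"), (20, "code_b")]),  # included
    PatientRecord("p2", [(0, "code_a"), (10, "code_b")]),  # span too short
    PatientRecord("p3", [(0, "code_x"), (30, "code_y")]),  # no tracked code
    PatientRecord("p4", [(5, "code_b"), (20, "code_x")]),  # span exactly 15
]
print(excluded_fraction(records, tracked))
```

In this toy example two of the four records fail a criterion. Treating a span of exactly 15 weeks as sufficient follows the footnote's "at least 15 weeks" wording.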