Skip to main content
. 2018 Dec 27;26(3):219–227. doi: 10.1093/jamia/ocy164

Figure 1.

Figure 1.

Truncated histograms of the sorted EHR+USC defined sampling strata for the entire Vanderbilt-Adult population and for samples of size 4500 using random (RS) and maximum entropy sampling (MES). The ideal sampling frequency is 19 per stratum with the remaining 130 individuals being randomly selected from available strata. The MES sample is enriched compared to the random sample, especially with individuals from strata with sparse counts (strata 1-112); all individuals belonging to strata 1-132 were included in the final sample. The y-axis is truncated at 100 to highlight the distribution of rare sampling strata (eg, stratum counts ranged from 1 to over 70K with 58% belonging to the top 5 sampling strata).