2023 Oct 16;43(7-8):930–948. doi: 10.1177/0272989X231196916

Development and Validation of the US Diabetes, Obesity, Cardiovascular Disease Microsimulation (DOC-M) Model: Health Disparity and Economic Impact Model

David D Kim 1, Lu Wang 2, Brianna N Lauren 3, Junxiu Liu 4, Matti Marklund 5,6, Yujin Lee 7, Renata Micha 8, Dariush Mozaffarian 9,*, John B Wong 10,*
PMCID: PMC10625721  NIHMSID: NIHMS1924377  PMID: 37842820

Abstract

Background

Few simulation models have incorporated the interplay of diabetes, obesity, and cardiovascular disease (CVD); their upstream lifestyle and biological risk factors; and their downstream effects on health disparities and economic consequences.

Methods

We developed and validated a US Diabetes, Obesity, Cardiovascular Disease Microsimulation (DOC-M) model that incorporates demographic, clinical, and lifestyle risk factors to jointly predict overall and racial-ethnic groups-specific obesity, diabetes, CVD, and cause-specific mortality for the US adult population aged 40 to 79 y at baseline. An individualized health care cost prediction model was further developed and integrated. This model incorporates nationally representative data on baseline demographics, lifestyle, health, and cause-specific mortality; dynamic changes in modifiable risk factors over time; and parameter uncertainty using probabilistic distributions. Validation analyses included assessment of 1) population-level risk calibration and 2) individual-level risk discrimination. To illustrate the application of the DOC-M model, we evaluated the long-term cost-effectiveness of a national produce prescription program.

Results

Comparing the 15-y model-predicted population risk of primary outcomes among the 2001–2002 National Health and Nutrition Examination Survey (NHANES) cohort with the observed prevalence from age-matched cross-sectional 2003–2016 NHANES cohorts, calibration performance was strong based on observed-to-expected ratio and calibration plot analysis. In most cases, Brier scores fell below 0.0004, indicating a low overall prediction error. Using the Multi-Ethnic Study of Atherosclerosis cohorts, the c-statistics for assessing individual-level risk discrimination were 0.85 to 0.88 for diabetes, 0.93 to 0.95 for obesity, 0.74 to 0.76 for CVD history, and 0.78 to 0.81 for all-cause mortality, both overall and in three racial-ethnic groups. Open-source code for the model was posted at https://github.com/food-price/DOC-M-Model-Development-and-Validation.

Conclusions

The validated DOC-M model can be used to examine health, equity, and the economic impact of health policies and interventions on behavioral and clinical risk factors for obesity, diabetes, and CVD.

Highlights

  • We developed a novel microsimulation model for obesity, diabetes, and CVD, which intersect and – critically for prevention and treatment interventions – share common lifestyle, biologic, and demographic risk factors.

  • Validation analyses, including assessment of (1) population-level risk calibration and (2) individual-level risk discrimination, showed strong performance across the overall population and three major racial-ethnic groups for 6 outcomes (obesity, diabetes, CVD, all-cause mortality, and CVD- and DM-cause mortality).

  • This paper provides a thorough explanation and documentation of the development and validation process of a novel microsimulation model, along with the open-source code (https://github.com/food-price/DOCM_validation) for public use, to serve as a guide for future simulation model assessments, validation, and implementation.

Keywords: microsimulation model, validation, calibration, obesity, diabetes, cardiovascular disease

Introduction

Decision-analytic simulation modeling has helped advance understanding of the contributions and comparative impacts of risk factors, prevention strategies, and treatment interventions for guiding clinical and health policy decisions, for example, in public health crises such as the COVID-19 pandemic. 1–3 By integrating multiple data sources, extrapolating population outcomes, and assessing critical assumptions, simulation modeling can inform health policy decision making. However, most disease simulation models focus on a single disease (e.g., diabetes, 4–6 hypertension, 7 coronary heart disease [CHD] and stroke, 8–10 and cancers 11–14). Yet chronic conditions often share risk factors, with prognostic implications for individuals that affect overall population health. This is particularly salient for diet-related diseases, including obesity, diabetes, and cardiovascular disease (CVD), which share demographic, lifestyle, and clinical risk factors and cluster in high-risk individuals.

In addition, despite advancements in managing and treating obesity, diabetes, and CVD, 15–20 individuals from underrepresented racial-ethnic populations and lower socioeconomic backgrounds remain disproportionately affected by these conditions, creating systematic, preventable differences in disease burden, known as health disparities. 21 The persistent health disparities require additional data collection and analyses to understand how racial and ethnic demographic vulnerability is distributed in the population, 22 and the National Institutes of Health has expressed a special interest in simulation modeling and systems science to address health disparities. 23 However, many existing simulation models do not link underlying individual-level behavior with clinical risk factors and health consequences 24,25 and, in particular, have limited capability to examine distributional impact across racial-ethnic and socioeconomic subpopulations.

To address these gaps, we developed and validated a novel microsimulation model that jointly incorporates demographic, clinical, and lifestyle risk factors (e.g., diet) to project obesity, diabetes, CVD, and cause-specific mortality as interrelated health outcomes for the US population, overall and by racial-ethnic groups. We further developed and integrated an individualized health care cost prediction model. With the growing use and importance of simulation models to examine population health and disparities, this article aims to provide detailed documentation of how this model was developed and validated. We illustrate the application of our novel microsimulation model to evaluate the long-term cost-effectiveness of a national produce prescription program. We additionally provide the open-source code for public use to serve as a guide for future simulation model assessments, validation, and implementation.

Methods

Model Overview

The Diabetes, Obesity, Cardiovascular Disease Microsimulation (DOC-M) model is a probabilistic, dynamic, individual-level, health-state transition model, programmed in R-4.1.0, 26 that projects obesity, diabetes, CVD, and their associated complications for population health and health policy decision making. The DOC-M simulates individuals and their individual characteristics to project their health trajectories. For example, each simulated person in the DOC-M model could develop a health event and move to a new health state (or not) each year: no CVD or diabetes, diabetes without CVD, CVD without diabetes, both CVD and diabetes, and death, plus 4 CVD-related events (first or recurrent stroke or CHD, with an option for revascularization for each; Figure 1). The model also captures the incidence and prevalence of overweight (body mass index [BMI] ≥ 25 and <30 kg/m2) and obesity (BMI ≥30 kg/m2), based on each individual’s dynamic BMI influenced by their lifestyle behaviors and underlying secular age-sex-race/ethnicity-specific national trends. In sum, the model tracks a person’s annual likelihood of experiencing health events (e.g., developing diabetes and CVD) and death based on individual factors. The source code is freely available at https://github.com/food-price/DOC-M-Model-Development-and-Validation. Figure 2 describes major model components (i.e., modules) in the DOC-M model, explained in detail in subsequent sections along with key input parameters listed in Table 1.

Figure 1.

Conceptual diagram of the model structure.

The figure highlights key transitions from 1 health state/event to another state/event using solid arrows, while dotted-line arrows represent cause- or event-specific mortality. Gray rectangles represent 5 different health states in which individuals can stay throughout modeled periods, while yellow circles show cardiovascular disease events that individuals can experience in any given year. Obesity was separately tracked based on an individual’s body mass index.

BMI, body mass index; CHD, coronary heart disease; CVD, cardiovascular disease; RVSC, revascularization, including coronary artery bypass surgery and percutaneous coronary intervention.

Figure 2.

Major components of the Diabetes, Obesity, CVD Microsimulation (DOC-M) model.

ACC/AHA, American College of Cardiology/American Heart Association; ASCVD, atherosclerotic cardiovascular disease; BMI, body mass index; BP, blood pressure; CDC, Centers for Disease Control and Prevention; CHD, coronary heart disease; CVD, cardiovascular disease; DM, diabetes mellitus; HBP, high blood pressure; HDL-C, high-density lipoprotein cholesterol; HRQOL, health-related quality of life; HTN, hypertension; MEPS, Medical Expenditure Panel Survey; TC, total cholesterol.

Table 1.

Key Model Parameters and Data Sources

| Parameter | Value/Mean (SE) | Probabilistic Distribution a | Primary Source |
|---|---|---|---|
| Disease-specific incidence and mortality | | | |
| Developing type 2 diabetes | Framingham Offspring Study 8-y diabetes risk model (Online Supplement Section C, Table C1) | Deterministic (based on regression coefficients) | Wilson et al. 27 |
| Initial ASCVD | ACC/AHA 10-y ASCVD risk model (Online Supplement Section D, Table D1) | Deterministic (based on regression coefficients) | Goff et al. 28 |
| % CHD v. stroke | Sex-race-specific values (47.4%–64.0% v. 36.0%–52.6%) (Online Supplement Section E, Table E1) | N/A | Benjamin et al. 29 |
| Recurrent ASCVD | Framingham Heart Study 2-y risk model for recurrent CHD (Online Supplement Section F, Table F1) | Deterministic (based on regression coefficients) | D’Agostino et al. 30 |
| % CHD v. stroke | Sex-race-specific values (58.0%–73.2% v. 26.8%–42.0%) (Online Supplement Section E, Table E1) | N/A | Benjamin et al. 29 |
| Disease-specific mortality | CDC Wonder cause-specific mortality by age, sex, race/ethnicity groups (Online Supplement Section G, Table G1) | Beta | Centers for Disease Control and Prevention 31 |
| Probabilities among individuals undergoing RVSC | | | |
| % Receiving RVSC | 67.3% (Online Supplement Section E, Table E2) | Beta | Benjamin et al. 29 |
| % CABG v. PCI among RVSC | 28.9% v. 71.1% (Online Supplement Section E, Table E2) | Beta | Benjamin et al. 29 |
| Death from CABG v. PCI | 1.8% v. 2.1% (Online Supplement Section E, Table E2) | Beta | Benjamin et al. 29 |
| Health-related quality of life (HRQOL) | | | |
| Individual HRQOL | HRQOL prediction model (Online Supplement Section H, Table H1) | Normal (based on regression coefficients and SE) | Lubetkin et al. 32 |
| HRQOL decrements with CHD | −0.055 (0.011) | Beta | Davies et al. 33 |
| HRQOL decrements with stroke | −0.3 (0.06) | Beta | Davies et al. 33 |
| Health care costs (2017 US dollars) | | | |
| Annual health care cost | Health care cost prediction model (Table 3 and Online Supplement Section I) | Normal (based on regression coefficients and SE) | Our analysis |
| Cost of CHD (one time) | 10,034 (2,006) (Online Supplement Section J) | Gamma | US Centers for Medicare & Medicaid Services 34 |
| Cost of stroke (one time) | 15,994 (3,199) (Online Supplement Section J) | Gamma | US Centers for Medicare & Medicaid Services 34 |
| CABG | 44,538 (8,908) (Online Supplement Section J) | Gamma | US Centers for Medicare & Medicaid Services 34 |
| PCI | 18,477 (3,695) (Online Supplement Section J) | Gamma | US Centers for Medicare & Medicaid Services 34 |

ACC/AHA, American College of Cardiology/American Heart Association; ASCVD, atherosclerotic cardiovascular disease; CDC, Centers for Disease Control and Prevention; CHD, coronary heart disease; HRQOL, health-related quality of life; RVSC, revascularization, including coronary artery bypass surgery (CABG) and percutaneous coronary intervention (PCI); N/A, not applicable; SE, standard error.

a Where uncertainty around input parameters (e.g., cost of CABG/PCI) is not available, we assume a standard error of 20% of the mean estimate to generate parameters for the probabilistic distributions.
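As an illustration of the footnote above, a mean and standard error can be turned into gamma or beta parameters by the method of moments. This is a minimal sketch: the source does not state which parameterization was used, so the method of moments is an assumption, and the function names `gamma_params` and `beta_params` are hypothetical.

```python
def gamma_params(mean, se):
    """Return (shape, scale) of a gamma distribution matching a mean and SE.

    Method of moments (an assumption, not confirmed by the source):
    mean = shape * scale and variance = shape * scale**2.
    """
    shape = (mean / se) ** 2
    scale = se ** 2 / mean
    return shape, scale


def beta_params(mean, se):
    """Return (alpha, beta) of a beta distribution matching a mean and SE."""
    common = mean * (1 - mean) / se ** 2 - 1
    if common <= 0:
        raise ValueError("SE is too large for a beta distribution with this mean")
    return mean * common, (1 - mean) * common


# CABG cost from Table 1: mean 44,538 with SE 8,908 (about 20% of the mean).
cabg_shape, cabg_scale = gamma_params(44538, 8908)

# Share receiving revascularization: 67.3%, here with an assumed SE of 20% of the mean.
rvsc_alpha, rvsc_beta = beta_params(0.673, 0.2 * 0.673)
```

With a standard error fixed at 20% of the mean, the gamma shape parameter is always about 25, so only the scale differs between cost inputs.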

Primary Data

The model is populated with individual-level data from the National Health and Nutrition Examination Survey (NHANES), which is nationally representative of the US noninstitutionalized civilian population. By appropriately weighting each simulated individual, the model provides US population estimates. At baseline, the model considers only adults aged 40 to 79 y, based on the availability of validated diabetes and CVD risk prediction models in this age range, who then age as the model propagates each year. For the purpose of model validation, the DOC-M model used a closed cohort design, although open cohorts (i.e., new 40-y-olds entering each year) are also possible. The NHANES data sets, available in their current form since 1999, include detailed information on self-reported sociodemographics (e.g., age, sex, race and ethnicity, income, education), self-reported baseline health conditions, lifestyle risk factors (e.g., diet, physical activity, adiposity, and smoking), and clinical risk factors from physical examination, laboratory measures, and health and medication questionnaires. 35 Because the NHANES contains a single variable that combines the race and ethnicity constructs, we use the term race/ethnicity for this study. Based on survey participants’ responses on race/ethnicity categories, we classified individuals into 4 categories: non-Hispanic White, non-Hispanic Black, Hispanic (including Mexican American and other Hispanic), and others (including non-Hispanic Asian and multiracial individuals). Missing data (generally ∼10% for the relevant variables) were handled using multiple imputation, generating 10 sets of imputed values under the assumption that data were missing at random.

Obesity and Cardiometabolic Risk Factors

For each individual in the NHANES data, BMI was calculated from measured height and weight. Systolic/diastolic blood pressure was measured by averaging multiple readings from a single visit (up to 4 readings available). 36 Fasting glucose and total blood cholesterol data were measured at the NHANES mobile exam center. Diabetes was defined from self-report or any 1 of 4 clinical criteria (i.e., fasting plasma glucose ≥ 126 mg/dL, 2-h postprandial plasma glucose ≥ 200 mg/dL, hemoglobin A1c ≥ 6.5%, or use of diabetic medications). 37 A prior history of CVD was based on self-reported history of stroke, heart attack, and CHD. Angina was defined based on the Rose questionnaire criteria or the use of anti-angina medications. 38 Online Supplement Section A provides additional details in defining diabetes and CVD history from NHANES data.

In addition, the DOC-M estimated and incorporated temporal population trends (instead of static risk factor distributions) at the individual level for BMI and 5 other major cardiometabolic risk factors, including systolic and diastolic blood pressure (SBP, DBP), fasting glucose, total cholesterol, high-density lipoprotein cholesterol (HDL-C), and triglycerides. Using NHANES data from 1999 to 2016, we estimated the average annual percentage change in these risk factors to project future trends in 16 population subgroups, jointly stratified by age (40–59, 60–79 y), sex (male and female), and race/ethnicity (non-Hispanic White, non-Hispanic Black, Hispanic, and others) 14,39,40 (Online Supplement Section B). These risk factors were selected based on their use in validated risk prediction models for diabetes and CVD (Figure 2).
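To make the trend projection concrete, the sketch below applies a constant average annual percentage change to a baseline risk factor value. The compounding form is an assumption (the exact specification is in Online Supplement Section B, which is not reproduced here), and the function name and the example rate are hypothetical.

```python
def project_risk_factor(baseline, annual_pct_change, years):
    """Project a risk factor forward under a constant annual percentage change.

    Assumes compounding: value_t = baseline * (1 + APC/100) ** t.
    """
    return baseline * (1 + annual_pct_change / 100) ** years


# Hypothetical example: a BMI of 28.7 kg/m2 rising by 0.5% per year for 10 years.
bmi_after_10_years = project_risk_factor(28.7, 0.5, 10)
```

In the model, a separate rate would apply to each of the 16 age-sex-race/ethnicity subgroups.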

Disease Risk Modules

The DOC-M model applied 3 separate US-based validated risk prediction models for the incidence of diabetes, initial CVD events, and recurrent CVD events. The risk of incident diabetes was estimated using the Framingham Offspring Study 8-y risk prediction model based on the following predictors: fasting glucose, BMI, HDL-C (<40 mg/dL in men, <50 mg/dL in women), triglycerides, blood pressure (≥ 130/85 mm Hg) or treatment for hypertension, and parental history of diabetes 27 (Online Supplement Section C). During the initial calibration process (see the “Model Validation and Calibration” section), we recognized that the study population of the Framingham Offspring Study diabetes model was 99% non-Hispanic White. 27 Thus, we calibrated the risk of diabetes for non-Hispanic Black and Hispanic adults because of their known higher risks for developing diabetes. 41 Based on a US-based prospective cohort study, the model inflates diabetes incidence 2.4-fold in non-Hispanic Black and Hispanic women and 1.5-fold in non-Hispanic Black and Hispanic men. 42 In a sensitivity analysis, we also examined diabetes prediction using the Atherosclerosis Risk in Communities (ARIC) study (the model with clinical variables plus fasting glucose and lipids). 43

The risk of a first atherosclerotic cardiovascular disease (ASCVD) event was estimated using the American College of Cardiology/American Heart Association (ACC/AHA) 10-y ASCVD risk equation, which provides sex- and racial-ethnic–specific risk calculations 28 based on 8 risk predictors: age, sex, total cholesterol, HDL-C, SBP, hypertension treatment, smoker status, and diabetes. We then stratified the risk of ASCVD into stroke and CHD risk using the estimated sex-race-specific proportions of incident cases of each 29 (Online Supplement Sections D and E). Finally, for recurrent ASCVD, we applied the Framingham Heart Study coronary risk model, which provides sex-specific 2-y risk models for recurrent CHD events for persons with CVD history based on age, SBP, smoking status, fasting cholesterol level, and diabetes 30 (Online Supplement Section F).

For each individual in the microsimulation model, their unique distribution of characteristics and risk factors was used to predict their annualized future risks of diabetes, initial ASCVD, recurrent ASCVD (among those with prior nonfatal ASCVD), and death, using the above risk models. Individual characteristics and corresponding disease risk were updated over time to reflect aging and changes in risk factors based on temporal population trends, with reestimation of individual annualized risk of initial ASCVD, diabetes, and recurrent ASCVD every 10, 8, and 2 y, respectively (or when an intervention occurred, for models evaluating an intervention). Each individual’s multiyear risk was converted into an annual event-specific probability. 44
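The conversion from a multiyear risk to an annual probability can be sketched as follows. This assumes a constant hazard over the prediction horizon, which is a common convention; the source cites its own reference for the conversion and does not spell out the formula, so treat this as an assumption. The function name is hypothetical.

```python
def annual_probability(multiyear_risk, horizon_years):
    """Convert a cumulative risk over `horizon_years` into an annual probability.

    Assumes a constant hazard, so that
    1 - (1 - p_annual) ** horizon_years == multiyear_risk.
    """
    if not 0 <= multiyear_risk <= 1:
        raise ValueError("multiyear_risk must be between 0 and 1")
    if horizon_years <= 0:
        raise ValueError("horizon_years must be positive")
    return 1 - (1 - multiyear_risk) ** (1 / horizon_years)


# Horizons used by the three risk models: 10 y (initial ASCVD),
# 8 y (diabetes), and 2 y (recurrent ASCVD). The risks below are illustrative.
p_ascvd = annual_probability(0.20, 10)
p_diabetes = annual_probability(0.10, 8)
p_recurrent = annual_probability(0.19, 2)
```

Compounding the annual probability over the full horizon recovers the original multiyear risk, which is a useful sanity check.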

Mortality Modules

We extracted the average US national cause-specific mortality between 2012 and 2016 from CDC Wonder, 31 which collects a single underlying cause of death and demographic data for US residents based on death certificates, and stratified the mortality data jointly by 5-y age groups, sex, and 4 race-ethnicity groups. Cause-specific mortality was defined using ICD-10-CM diagnosis codes for CVD-specific mortality (I20–I25: ischemic heart disease [IHD], I60–I69: cerebrovascular disease) and diabetes-specific mortality (E10–E14: diabetes mellitus). Other-cause mortality represents death from causes other than CVD and diabetes. In each year, other-cause mortality rates from CDC Wonder, also held constant at their 2012 to 2016 averages, are applied to individuals who are not predicted to have diabetes or CVD in that year. We adjusted the overall cause-specific mortality for a given age, sex, and race/ethnicity to cause-specific mortality conditional on disease status, with annual rates per 100,000 converted to annual probabilities (Online Supplement Section G, Table G1).

Our model assumes that all cause-specific mortalities (IHD, stroke, and diabetes) were applied to only those with conditions (CVD history, first or recurrent CVD, and diabetes), instead of the general population. Because there are no available public data on the age-sex-race–specific CVD/diabetes mellitus death rate conditional on CVD history or diabetes, we applied Bayes rule to adjust the overall cause-specific mortality to cause-specific mortality conditional on disease status for each population subgroup i, jointly defined by age, sex, and race/ethnicity.

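$$
\begin{aligned}
P(\text{IHD\_Death} \mid \text{CVDHistory})_i &= \frac{P(\text{IHD\_Death})_i \times P(\text{CVDHistory} \mid \text{IHD\_Death})_i}{P(\text{CVDHistory})_i} \\
P(\text{Stroke\_Death} \mid \text{CVDHistory})_i &= \frac{P(\text{Stroke\_Death})_i \times P(\text{CVDHistory} \mid \text{Stroke\_Death})_i}{P(\text{CVDHistory})_i} \\
P(\text{DM\_Death} \mid \text{DM})_i &= \frac{P(\text{DM\_Death})_i \times P(\text{DM} \mid \text{DM\_Death})_i}{P(\text{DM})_i}
\end{aligned}
$$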

where we assume that $P(\text{CVDHistory} \mid \text{IHD\_Death})_i = P(\text{CVDHistory} \mid \text{Stroke\_Death})_i = P(\text{DM} \mid \text{DM\_Death})_i = 1$ (i.e., an individual who dies of a specific cause is certain to have had the corresponding condition). $P(\text{CVDHistory})_i$ and $P(\text{DM})_i$ represent the prevalence of the conditions in the specific subgroup i, which was directly estimated from the NHANES.
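Under that assumption, the adjustment reduces to dividing the subgroup's overall cause-specific death probability by the subgroup's disease prevalence. The sketch below shows this with illustrative numbers. The rate-to-probability step uses the exponential formula, which is an assumption because the source does not state which conversion was used; the function names and input values are hypothetical.

```python
import math


def rate_to_probability(rate_per_100k):
    """Convert an annual rate per 100,000 into an annual probability.

    Uses p = 1 - exp(-rate), an assumed (commonly used) conversion.
    """
    return 1 - math.exp(-rate_per_100k / 100_000)


def conditional_death_probability(overall_prob, prevalence,
                                  p_condition_given_death=1.0):
    """Bayes rule: P(death | condition) = P(death) * P(condition | death) / P(condition)."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    # Cap at 1 in case a small subgroup prevalence pushes the ratio above 1.
    return min(1.0, overall_prob * p_condition_given_death / prevalence)


# Illustrative subgroup: IHD death rate of 200 per 100,000 and
# a CVD-history prevalence of 10.6%.
p_ihd_death = rate_to_probability(200)
p_ihd_death_given_cvd = conditional_death_probability(p_ihd_death, 0.106)
```

Because prevalence is below 1, the conditional probability is always at least as large as the overall probability, which concentrates cause-specific deaths among those with the condition.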

Finally, we also examined historical trends in CVD mortality from 1999 to 2016. We observed decreasing CVD mortality across all 4 of these subgroups (non-Hispanic Black females, non-Hispanic Black males, non-Hispanic White females, and non-Hispanic White males) over time, particularly for the older subpopulation, aged 75 to 79 y (Online Supplement Section G, Figures G1 and G2). A prior study found that half of the decline in CVD-related mortality between 1980 and 2000 may be attributable to reductions in major risk factors with the other half due to treatment advances. 45 However, our analyses found that the downward slopes have plateaued since 2010, and our model assumed that mortality remains constant. The assumption indicates that, as our model incorporated trends in risk factors, future mortality trends depend only on changes in risk factors, not technological advances.

Health-Related Quality of Life and Costs Modules

Health-related quality of life (HRQOL) was measured based on established patient-based estimates of how changes in health status alter the quality of life, on a scale from 0.0 (death) to 1.0 (perfect health). 46 The DOC-M model incorporated a previously developed HRQOL prediction model for the US nationally representative sample based on demographic, socioeconomic, and chronic disease factors 32 (Online Supplement Section H). Our model also incorporated event-specific short-term decrements in HRQOL for individuals experiencing an acute CHD event (−0.055 or loss of 20 healthy days) or stroke (−0.3 or loss of 110 healthy days) in a given year. 33

Health care costs were derived from the US nationally representative 2014 to 2016 Medical Expenditure Panel Survey (MEPS) data (n = 73,174, representing 244.4 million US adults aged 18 y and older after accounting for MEPS sampling weights), a major data source on health care utilization and associated costs among the noninstitutionalized US population. 47 We modeled annual health care costs for each individual in the DOC-M based on age, sex, race/ethnicity, BMI (continuous), and diet-related clinical conditions, including diabetes, hypertension, and history of CVD. After multiple goodness-of-fit tests, 48–50 we applied a 2-part model with a logit model in the first stage to estimate the probability of incurring any health care expenditures, followed by a generalized linear model with a log link and a gamma distribution to model health care expenditures among those having any expenditure. 51 MEPS survey design and weights were applied to derive nationally representative cost estimates (Online Supplement Section I).

The resulting cost model estimated average marginal effects to predict the incremental change in health care expenditures for each predictor. 52 Using the marginal effects estimates and changes in each individual’s demographic and health profile over time, the DOC-M model predicted an individual’s annual health care expenditures. In addition, for event and procedure-specific costs, we estimated weighted-average total payments based on the number of total discharges across relevant diagnosis-related groups from the Medicare Inpatient Prospective Payment data 34 (Online Supplement Section J). All costs were expressed in 2017 U.S. dollars and the Personal Health Care index was used to adjust total medical expenditures for inflation. These annual health care costs were estimated in STATA 16 53 and integrated into the main DOC-M model.
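The 2-part structure described above can be illustrated by computing an expected annual cost as the probability of any expenditure times the mean expenditure among spenders. This is a sketch only: the coefficients below are invented for illustration (the fitted values are in Online Supplement Section I), the DOC-M itself works from average marginal effects estimated in STATA, and the function name is hypothetical.

```python
import math


def expected_annual_cost(covariates, logit_coefs, glm_coefs):
    """Expected cost from a 2-part model.

    Part 1 (logit): probability of incurring any expenditure.
    Part 2 (GLM, log link): mean expenditure among those with any expenditure.
    All arguments are dicts keyed by covariate name; `covariates` must
    include an "intercept" entry equal to 1.
    """
    lin_any = sum(logit_coefs[k] * covariates[k] for k in logit_coefs)
    lin_cost = sum(glm_coefs[k] * covariates[k] for k in glm_coefs)
    p_any = 1 / (1 + math.exp(-lin_any))
    mean_if_any = math.exp(lin_cost)
    return p_any * mean_if_any


# Hypothetical coefficients, for illustration only.
logit_coefs = {"intercept": 0.5, "age": 0.02, "diabetes": 0.8}
glm_coefs = {"intercept": 7.0, "age": 0.01, "diabetes": 0.6}

person = {"intercept": 1, "age": 60, "diabetes": 1}
cost_with_diabetes = expected_annual_cost(person, logit_coefs, glm_coefs)
cost_without = expected_annual_cost({**person, "diabetes": 0}, logit_coefs, glm_coefs)
```

The difference between the two predictions plays the role of an individual-level incremental cost for diabetes under these made-up coefficients.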

Handling Multiple Levels of Uncertainty

The DOC-M model deals with multiple levels of uncertainty, including stochastic, parameter, and sampling uncertainty. To minimize the impact of stochastic uncertainty, the DOC-M model can replicate each individual multiple times (e.g., 1,000 times) and average the results across the replications. Parameter uncertainty for input parameters is jointly incorporated by randomly drawing 1,000 sets of input parameters based on their probabilistic distribution, using 1,000 Monte Carlo simulations for each replicated individual. For example, with 1,000 individual replicates, the model could generate 1 million runs per person (=1,000 individual replicates × 1,000 simulations). Sampling uncertainty is incorporated by applying the sampling weights to estimate population outcomes and standard errors. Finally, the model combines within-simulation variance (i.e., sampling uncertainty) and between-simulation variance (i.e., parameter uncertainty) using an adaptation of Rubin’s rule to reflect integrated uncertainty in modeled outcomes. 54,55 Table 2 summarizes our approach with technical details in Online Supplement Section K.

Table 2.

Types of Uncertainty in DOC-M and Our Approach to Handling the Uncertainty

| Types of Uncertainty | Concepts/Subtypes | Methods for Handling Uncertainty in the DOC-M Model |
|---|---|---|
| Stochastic uncertainty | Random variabilities in outcomes between identical individuals | Replicating each individual multiple times (e.g., 1,000 times), running each replicate through the simulated health status transitions, and averaging the results across the replicates to generate more consistent outcomes |
| Parameter uncertainty | The uncertainty in the estimation of the parameter of interest (e.g., policy effect, the coefficient in the prediction models) | Conducting probabilistic sensitivity analysis for multiple inputs by randomly drawing 1,000 sets of input parameter values based on the prespecified probability distributions |
| Sampling uncertainty | The uncertainty in using a population subset (e.g., NHANES participants) to estimate population characteristics | Estimating population outcomes and standard error by incorporating NHANES sampling weight. The DOC-M model adapted Rubin’s rule to combine the sampling uncertainty (i.e., within-simulation variance) and the parameter uncertainty (between-simulation variance) in estimating the uncertainty of the modeled outcomes |
| Imputation uncertainty | Uncertainty around the true value for missing data | Averaging the imputed values across 10 imputed data sets (assuming missing at random) |
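The combination of within-simulation and between-simulation variance can be sketched with the standard form of Rubin's rule. The DOC-M uses an adaptation documented in Online Supplement Section K, so the formula below is the textbook version and not necessarily the authors' exact variant; the function name and inputs are hypothetical.

```python
from statistics import mean, variance


def combine_rubin(estimates, within_variances):
    """Pool per-simulation estimates with the standard Rubin's rule.

    estimates: one point estimate per parameter draw (simulation).
    within_variances: the sampling variance of each estimate.
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least 2 estimates with matching variances")
    pooled = mean(estimates)
    within = mean(within_variances)   # sampling uncertainty
    between = variance(estimates)     # parameter uncertainty (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Illustrative values: three simulations of a 15-y diabetes prevalence.
pooled, total_var = combine_rubin([0.20, 0.22, 0.21], [0.0004, 0.0005, 0.0004])
standard_error = total_var ** 0.5
```

With 1,000 parameter draws, the (1 + 1/m) correction is close to 1, so the total variance is approximately the sum of the two components.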

Model Validation and Calibration

Based on a methodology report on modeling and simulation in the context of the health technology assessment, 56 we defined model calibration as “the process of determining the parameter values so that model outputs match observed empirical data” and model validation as “the process of comparing model outputs with expert judgment, observed data, or other models, without further modification of model parameters.” Through multiple iterations of model development and pressure testing, we verified the correctness of the mathematical structure and the implementation of the computational model (i.e., internal consistency). We conducted validation analyses in 2 ways, assessing 1) population-level risk calibration (i.e., comparing projected v. observed prevalence of risk factors and outcomes) and 2) individual-level risk discrimination (i.e., comparing projected v. observed occurrence of individual events) for 4 primary outcomes: obesity, diabetes, ASCVD, and all-cause mortality. Validation was done for the overall US adult population and 3 racial-ethnic groups (non-Hispanic White, non-Hispanic Black, and Hispanic adults). Asians and other racial-ethnic minorities were excluded due to small sample sizes in NHANES data.

Population-risk calibration

After initial adjustment for higher risks of developing diabetes among non-Hispanic Black and Hispanic individuals (see the “Disease Risk Modules” section), we assessed population-level risk calibration by comparing the 15-y model-predicted population risk of primary outcomes among the 2001–2002 NHANES cohort aged 40 to 79 y (n = 2,944, representing 110.3 million US adults aged 40 to 79 y after accounting for NHANES sampling weights) with the observed prevalence of these outcomes in the age-matched population at each NHANES cycle from 2003 to 2016. The observed-to-expected ratio assessed whether predictions were systematically too low or too high (i.e., specifically whether the 95% confidence interval of observed outcomes included the mean predicted outcomes at year 15), and calibration plot analysis assessed whether the slope and intercept of the calibration plot differed significantly from 1 and 0, respectively. 57–59 A slope significantly smaller than 1 reflects overprediction, handled with corresponding calibration (shrinkage) of regression coefficients or parameters of interest. In contrast, a slope larger than 1 reflects underprediction, calibrated with an augmentation. 57 Figure 3 provides a schematic diagram of our calibration process. In our methodologic development, for example, the ACC/AHA ASCVD model was found to overpredict CVD mortality among non-Hispanic Black adults (slope = 0.5502, intercept = 0.0024), which after calibration nearly perfectly predicted the observed outcome (Online Supplement Section L). Finally, we used the Brier score (the mean of squared differences between predictions and their corresponding observed values) to summarize the overall prediction error, along with root mean square error. 60
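The calibration metrics named above can be sketched on a series of predicted and observed prevalences. This is an illustration under stated assumptions: the inputs are invented values (one per NHANES cycle), the calibration line is fit by ordinary least squares without survey weights or significance tests, and the function names are hypothetical.

```python
def brier_score(predicted, observed):
    """Mean of squared differences between predictions and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def observed_to_expected(predicted, observed):
    """Ratio of total observed to total predicted; 1 means no systematic bias."""
    return sum(observed) / sum(predicted)


def calibration_line(predicted, observed):
    """Least-squares slope and intercept of observed regressed on predicted."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    slope = sxy / sxx
    return slope, mean_o - slope * mean_p


# Invented example in which the model overpredicts: observed is half of predicted.
predicted = [0.10, 0.20, 0.30]
observed = [0.05, 0.10, 0.15]

slope, intercept = calibration_line(predicted, observed)  # slope below 1
oe_ratio = observed_to_expected(predicted, observed)      # ratio below 1
rmse = brier_score(predicted, observed) ** 0.5
```

A slope below 1 in this sketch corresponds to the overprediction case in the text, where coefficients would be shrunk and the comparison repeated.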

Figure 3.

A schematic diagram of our calibration process.

Individual-risk discrimination

For additional validation, we assessed individual risk discrimination (i.e., how well the model differentiated individuals who developed diabetes, obesity, or CVD or who died from those who did not) in the Multi-Ethnic Study of Atherosclerosis (MESA) prospective cohort (data access through the NHLBI Biologic Specimen and Data Repository Information Coordinating Center). We selected MESA because it is a prospective cohort study designed to understand CVD and associated risk factors in a racially and ethnically diverse, population-based sample of 6,814 asymptomatic men and women aged 45 to 84 y at baseline. Approximately 38% of the recruited participants self-identified as non-Hispanic White adults, 28% as African American, 22% as Hispanic, and 12% as Asian (note: this is the original classification according to the MESA cohort data). 61 Using MESA baseline data from 2000 to 2002, we first estimated individual predicted risk based on the proportion of simulations (out of 1,000) in which a given individual was predicted to experience obesity and diabetes at year 9 (as the data were not available beyond year 9) and CVD and all-cause mortality at year 14. Then, based on observed, individual-level events documented through follow-up, we assessed risk discrimination with receiver-operating characteristics (ROC) curves 8 and the c-statistic (i.e., area under the ROC curve). 62 We considered models reasonable when the c-statistic exceeded 0.7 and strong when it exceeded 0.8. 63 The ROC analysis was based on maximum-likelihood ROC models with a binomial distribution of the latent variable and conducted in STATA 16. 53
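The discrimination step can be illustrated with an empirical c-statistic. Note that this swaps in the nonparametric (pairwise concordance) estimator rather than the maximum-likelihood ROC model the authors fitted in STATA, so values could differ slightly from theirs. The simulation counts and function names below are hypothetical.

```python
def predicted_risk(event_count, n_simulations=1000):
    """Individual predicted risk: share of simulations in which the event occurred."""
    return event_count / n_simulations


def c_statistic(risks, events):
    """Empirical area under the ROC curve.

    Probability that a randomly chosen individual with the event has a higher
    predicted risk than one without it; ties count as one half.
    """
    cases = [r for r, e in zip(risks, events) if e]
    controls = [r for r, e in zip(risks, events) if not e]
    if not cases or not controls:
        raise ValueError("need at least one individual with and one without the event")
    concordant = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Invented example: simulation counts (out of 1,000) and observed outcomes.
risks = [predicted_risk(c) for c in (820, 640, 300, 450, 120)]
events = [1, 1, 1, 0, 0]
auc = c_statistic(risks, events)
```

Under the thresholds stated in the text, a value above 0.7 would be read as reasonable and above 0.8 as strong.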

Case Study: Cost-Effectiveness of a National Produce Prescription Program

To illustrate the application of the DOC-M model, we evaluated the long-term cost-effectiveness of a national produce prescription program. While the full study is published elsewhere, 64 we briefly explain our case study. Produce prescription is a nutrition intervention strategy that prescribes free or discounted fresh produce to patients with diet-related chronic diseases identified in the health system. Although produce prescriptions have been suggested to be effective in improving short-term food security and health outcomes, the potential impact of implementing produce prescription programs for patients with diabetes on long-term health, costs, and cost-effectiveness in the United States has not been established. Using a nationally representative US adult population aged 40 to 79 y with diabetes and food insecurity from the NHANES 2013 to 2018, our DOC-M model incorporated evidence on produce prescription programs’ effectiveness in improving diet, BMI, and HbA1c 65 and the association of diet, BMI, and HbA1c with diet-related disease risks 66 to estimate the lifetime population impact of implementing produce prescription programs on CVD outcomes, quality-adjusted life-years (QALYs), and health care costs and productivity benefits.

Results

Population Characteristics

Table 3 shows the baseline characteristics of the 2 validation cohorts in 2001 to 2002 NHANES and MESA. The average age in NHANES was 54.4 y (interquartile range: 46–62 y), with 51% females and 77% Non-Hispanic White adults. The MESA cohort was older (mean age: 62.1 y) and more diverse (39% Non-Hispanic White adults), with a greater proportion of current smokers (49%) than the NHANES cohort. Other disease profiles were similar between the 2 populations (hypertension: 40% v. 41%, diabetes: 13.2% v. 13.8%, obesity: 33.8% v. 32.2%), except that the MESA cohort, by design, excluded prevalent CVD (10.6% of the NHANES sample).

Table 3.

Characteristics of the 2001–2002 NHANES Cohort Aged 40–79 y at Baseline and the MESA Cohort at Baseline

Column groups: NHANES = Risk Calibration Sample, NHANES 2001–2002 (N = 29,440, a Survey Weight Adjusted); MESA = Risk Discrimination Sample, MESA Exam 1 (Baseline; N = 6,814)
Characteristic | NHANES x̄ | NHANES s | NHANES IQR | MESA x̄ | MESA s | MESA IQR
Age, y | 54.4 | 10.7 | 46–62 | 62.1 | 10.2 | 53–70
Total cholesterol, mg/dL | 211 | 45.2 | 183–234 | 194.1 | 35.7 | 170–215
HDL-cholesterol, mg/dL | 52.5 | 15.8 | 41–62 | 50.9 | 14.8 | 40–59
Systolic blood pressure, mm Hg | 128 | 19.9 | 114–137 | 126.6 | 21.5 | 111–140
Diastolic blood pressure, mm Hg | 74.4 | 12.4 | 68.7–81.3 | 71.9 | 10.3 | 65–78.5
Body mass index, kg/m2 | 28.7 | 6.3 | 24.5–31.6 | 28.3 | 5.4 | 24.5–31.2
Fasting glucose, mg/dL | 108 | 35.0 | 92.3–109 | 97.4 | 30.3 | 83–99
Triglyceride, mg/dL | 166 | 212 | 85–184 | 131.6 | 88.8 | 78–161
Characteristic | NHANES % | NHANES 95% CI b | MESA % | MESA 95% CI
Female | 51.0 | 49.3–52.7 | 52.8 | N/A
Race
 Non-Hispanic White | 76.9 | 71.3–81.6 | 38.5 | 37.3–39.6
 Non-Hispanic Black | 9.9 | 6.6–14.6 | 27.8 | 26.7–28.8
 Hispanic | 9.5 | 5.3–16.3 | 22.0 | 21.0–22.9
 Asian/other c | 3.8 | 2.7–5.3 | 11.8 | 11.0–12.6
Parental history of diabetes d | 50.7 | 47.5–53.9 | 41.6 | 40.4–42.8
Blood pressure ≥130/85 mm Hg | 40.0 | 36.3–43.8 | 41.3 | 40.1–42.5
Treatment for hypertension | 25.1 | 22.3–28.2 | 44.9 | 43.7–46.1
Current smoker | 21.6 | 19.3–24.0 | 49.4 | 48.3–50.6
Diabetes | 13.2 | 11.7–14.6 | 13.8 | 13.0–14.7
Obesity | 33.8 | 31.1–36.6 | 32.2 | 31.1–33.3
CVD history | 10.6 | 8.8–12.4 | 0 | N/A

CI, confidence interval; CVD, cardiovascular disease; HDL, high-density lipoprotein; IQR, interquartile range; s, standard deviation; MESA, Multi-Ethnic Study of Atherosclerosis; NHANES, National Health and Nutrition Examination Survey.

a

Based on 10 imputations of the NHANES data set (N = 2,944 for 2001–2002). There is no statistical difference across all variables between a single imputed data set and 10 imputed data sets.

b

The 95% CIs were estimated by accounting for the survey weights and sampling design.

c

In MESA, “other race” includes only adults of Chinese descent.

d

For the MESA data, a family history of diabetes (fhxdb2) was used.

Model Performance and Validation: Population-Risk Calibration

Based on 1,000 Monte Carlo simulations, our 15-y mean prediction from the NHANES 2001 to 2002 cohort fell within the 95% confidence interval (CI) of the actual observed national prevalence in future years for all primary (obesity, diabetes, ASCVD, and all-cause mortality) and secondary outcomes (CVD- and diabetes-cause mortality), both overall and among the 3 racial-ethnic groups (Table 4, Online Supplement Section M). Of 24 specific calibration targets (6 primary and secondary outcomes for 4 groups [overall and 3 racial-ethnic groups]), the only exceptions were the prevalence of CVD among Hispanic adults and CVD-cause mortality among non-Hispanic Black adults, for which the 15-y mean predictions met the criteria for all prior years except at 2015 to 2016 (year 15). Online Supplement Section M shows the calibration plot analyses for all 4 primary and 2 secondary outcomes, both overall and within each racial-ethnic group. The slopes of the calibration plots were not statistically different from 1 in all predictions. The intercepts were statistically significantly different from 0 for only 3 predictions (of 24), but even in these cases, the intercept estimates were very close to zero: 0.016 (95% CI: 0.006, 0.026) for all-cause mortality among Hispanic adults, 0.008 (0.005, 0.011) for CVD-cause mortality among Non-Hispanic White adults, and 0.004 (0.001, 0.006) for diabetes-cause mortality among the overall population. In most cases, Brier scores (i.e., mean squared error) fell below 0.0004, with the largest Brier score being 0.0012 for obesity among Hispanic adults.
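The calibration metrics named in this paragraph can be illustrated with a short sketch. It assumes that observed prevalence is regressed on mean predicted prevalence across follow-up years by ordinary least squares, and that the Brier score is the mean squared difference between the two series (the text equates it with mean squared error). The function name and the toy series are hypothetical and are not the study data.

```python
import math
from typing import Sequence


def calibration_summary(predicted: Sequence[float],
                        observed: Sequence[float]) -> dict[str, float]:
    """Least-squares calibration line (observed ~ predicted) plus summary errors."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(predicted)
    mean_p = math.fsum(predicted) / n
    mean_o = math.fsum(observed) / n
    sxx = math.fsum((p - mean_p) ** 2 for p in predicted)
    if sxx == 0:
        raise ValueError("predicted values must vary to estimate a slope")
    sxy = math.fsum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    slope = sxy / sxx                      # ideal calibration: slope = 1
    intercept = mean_o - slope * mean_p    # ideal calibration: intercept = 0
    ss_res = math.fsum((o - (intercept + slope * p)) ** 2
                       for p, o in zip(predicted, observed))
    ss_tot = math.fsum((o - mean_o) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else math.nan
    # Brier score taken here as the mean squared prediction error.
    brier = math.fsum((o - p) ** 2 for p, o in zip(predicted, observed)) / n
    # Observed-to-expected ratio at the final time point.
    oe_final = observed[-1] / predicted[-1] if predicted[-1] != 0 else math.nan
    return {"slope": slope, "intercept": intercept, "r2": r2,
            "brier": brier, "oe_ratio_final": oe_final}


# Hypothetical prevalence series over follow-up (not taken from the study).
toy_predicted = [0.15, 0.18, 0.21, 0.24, 0.26]
toy_observed = [0.16, 0.18, 0.22, 0.25, 0.28]
print(calibration_summary(toy_predicted, toy_observed))
```

A slope near 1, an intercept near 0, and a small Brier score together indicate that predictions track the observed prevalence over time. Testing whether the slope and intercept differ significantly from 1 and 0 additionally requires their standard errors, which this sketch omits.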

Table 4.

Risk Calibration Performance across Primary and Secondary Outcomes by Racial-Ethnic Groups

Outcome | Racial-Ethnic Group | Observed (O) v. Predicted (P): Does 95% CI of Observed Outcomes Include the Mean Predicted Outcomes at Year 15? | Calibration Plot: Does Slope = 1 and Intercept = 0? (Testing for Statistically Significant Difference) | R2 | RMSE | Brier Score
Primary outcomes
Diabetes | All | Yes; O: 0.281 [0.243, 0.318]; P: 0.26 [0.25, 0.28] | Yes; Slope: 1.069 [0.840, 1.298]; Intercept: −0.010 [−0.058, 0.038] | 0.956 | 0.011 | 0.0001
Diabetes | Non-Hispanic White | Yes; O: 0.252 [0.211, 0.294]; P: 0.23 [0.22, 0.25] | Yes; Slope: 1.041 [0.752, 1.330]; Intercept: −0.002 [−0.054, 0.051] | 0.928 | 0.025 | 0.0002
Diabetes | Non-Hispanic Black | Yes; O: 0.350 [0.307, 0.394]; P: 0.35 [0.32, 0.38] | Yes; Slope: 1.051 [0.594, 1.501]; Intercept: −0.004 [−0.137, 0.130] | 0.841 | 0.025 | 0.0004
Diabetes | Hispanic | Yes; O: 0.393 [0.324, 0.463]; P: 0.40 [0.35, 0.45] | Yes; Slope: 1.009 [0.751, 1.267]; Intercept: −0.011 [−0.094, 0.072] | 0.939 | 0.019 | 0.0004
CVD history | All | Yes; O: 0.179 [0.160, 0.198]; P: 0.171 [0.152, 0.189] | Yes; Slope: 0.876 [0.455, 1.296]; Intercept: 0.024 [−0.034, 0.083] | 0.812 | 0.010 | 0.0001
CVD history | Non-Hispanic White | Yes; O: 0.173 [0.148, 0.197]; P: 0.167 [0.148, 0.187] | Yes; Slope: 0.852 [0.288, 1.417]; Intercept: 0.029 [−0.048, 0.107] | 0.695 | 0.013 | 0.0002
CVD history | Non-Hispanic Black | Yes; O: 0.197 [0.162, 0.232]; P: 0.22 [0.188, 0.25] | Yes; Slope: 0.618 [0.169, 1.066]; Intercept: 0.049 [−0.035, 0.1329] | 0.655 | 0.012 | 0.0001
CVD history | Hispanic | No, but 95% CI of predicted outcomes include the mean observed outcome; O: 0.162 [0.137, 0.188]; P: 0.133 [0.103, 0.164] | Yes; Slope: 1.226 [0.624, 1.828]; Intercept: −0.005 [−0.065, 0.056] | 0.806 | 0.016 | 0.0002
Obesity | All | Yes; O: 0.415 [0.379, 0.451]; P: 0.41 [0.38, 0.44] | Yes; Slope: 0.918 [0.537, 1.300]; Intercept: 0.028 [−0.116, 0.172] | 0.853 | 0.010 | 0.0001
Obesity | Non-Hispanic White | Yes; O: 0.407 [0.363, 0.451]; P: 0.406 [0.376, 0.435] | Yes; Slope: 0.978 [0.397, 1.560]; Intercept: 0.000 [−0.218, 0.219] | 0.739 | 0.013 | 0.0002
Obesity | Non-Hispanic Black | Yes; O: 0.468 [0.425, 0.511]; P: 0.50 [0.46, 0.53] | Yes; Slope: 0.789 [0.115, 1.463]; Intercept: 0.109 [−0.203, 0.420] | 0.578 | 0.023 | 0.0005
Obesity | Hispanic | Yes; O: 0.493 [0.434, 0.552]; P: 0.463 [0.402, 0.524] | Yes; Slope: 0.920 [0.353, 1.486]; Intercept: 0.060 [−0.162, 0.276] | 0.725 | 0.034 | 0.0012
All-cause mortality | All | Yes; O: 0.191 [0.166, 0.216]; P: 0.191 [0.180, 0.20] | Yes; Slope: 1.001 [0.975, 1.044]; Intercept: 0.002 [−0.001, 0.006] | 0.997 | 0.004 | 0.0000
All-cause mortality | Non-Hispanic White | Yes; O: 0.189 [0.159, 0.219]; P: 0.188 [0.174, 0.20] | Yes; Slope: 1.016 [0.987, 1.045]; Intercept: 0.001 [−0.002, 0.004] | 0.998 | 0.003 | 0.0000
All-cause mortality | Non-Hispanic Black | Yes; O: 0.245 [0.218, 0.273]; P: 0.25 [0.22, 0.27] | Yes for slope, No for intercept; Slope: 0.985 [0.910, 1.061]; Intercept: 0.016 [0.006, 0.026] | 0.984 | 0.010 | 0.0001
All-cause mortality | Hispanic | Yes; O: 0.185 [0.136, 0.234]; P: 0.180 [0.132, 0.23] | Yes; Slope: 1.059 [0.989, 1.130]; Intercept: −0.001 [−0.007, 0.006] | 0.988 | 0.007 | 0.0001
Secondary outcomes
CVD-cause mortality | All | Yes; O: 0.029 [0.024, 0.035]; P: 0.035 [0.030, 0.039] | No; Slope: 0.862 [0.778, 0.947]; Intercept: 0.003 [0.001, 0.005] | 0.974 | 0.002 | 0.0000
CVD-cause mortality | Non-Hispanic White | Yes; O: 0.036 [0.024, 0.048]; P: 0.035 [0.030, 0.040] | Yes for slope, No for intercept; Slope: 0.927 [0.769, 1.994]; Intercept: 0.008 [0.005, 0.011] | 0.925 | 0.003 | 0.0000
CVD-cause mortality | Non-Hispanic Black | No, but yes for all prior years; O: 0.029 [0.024, 0.035]; P: 0.037 [0.030, 0.044] | No; Slope: 0.804 [0.741, 0.866]; Intercept: 0.002 [0.001, 0.004] | 0.983 | 0.001 | 0.0000
CVD-cause mortality | Hispanic | Yes; O: 0.029 [0.007, 0.050]; P: 0.035 [0.019, 0.052] | Yes; Slope: 0.919 [0.697, 1.142]; Intercept: 0.003 [−0.001, 0.007] | 0.860 | 0.004 | 0.0000
DM-cause mortality | All | Yes; O: 0.005 [0.003, 0.007]; P: 0.006 [0.004, 0.007] | Yes; Slope: 0.911 [0.802, 1.021]; Intercept: 0.000 [−0.000, 0.001] | 0.961 | 0.000 | 0.0000
DM-cause mortality | Non-Hispanic White | Yes; O: 0.003 [0.000, 0.005]; P: 0.003 [0.002, 0.004] | Yes; Slope: 0.964 [0.830, 1.097]; Intercept: 0.000 [−0.000, 0.000] | 0.949 | 0.000 | 0.0000
DM-cause mortality | Non-Hispanic Black | Yes; O: 0.014 [0.005, 0.023]; P: 0.014 [0.010, 0.018] | Yes; Slope: 1.065 [0.918, 1.212]; Intercept: −0.001 [−0.002, 0.000] | 0.949 | 0.001 | 0.0000
DM-cause mortality | Hispanic | Yes; O: 0.013 [0.006, 0.020]; P: 0.017 [0.012, 0.022] | Yes for slope, No for intercept; Slope: 0.775 [0.466, 1.084]; Intercept: 0.004 [0.001, 0.006] | 0.693 | 0.003 | 0.0000

CI, confidence interval; CVD, cardiovascular disease; DM, diabetes mellitus; RMSE, root mean square error.

In sensitivity analyses using the ARIC diabetes risk equation, performance was less robust than when using the Framingham Offspring Study risk model, with adjustment for increased lifetime diabetes risk among non-Hispanic Black and Hispanic adults (Online Supplement Section N).

Model Performance and Validation: Individual-Risk Discrimination

In the prospective MESA cohort, the DOC-M model accurately predicted the occurrence of primary outcomes in individual participants, with similar performance across racial-ethnic groups (Figure 4, Online Supplement Section O). The c-statistics were 0.85 to 0.88 for diabetes, 0.93 to 0.95 for obesity, 0.74 to 0.76 for CVD history, and 0.78 to 0.81 for all-cause mortality.

Figure 4.

Receiver-operating characteristic analysis of the Multi-Ethnic Study of Atherosclerosis cohort for risk discrimination.

Model Estimates for Individual Health Care Spending Associated with Cardiometabolic Diseases

Among US adults aged 40 to 79 y at baseline (2001–2002), the DOC-M model estimated the mean individual annual health care cost to be $5,730 (95% CI: $5,530–5,930). For the 83% of individuals without prevalent diabetes or CVD at baseline, the baseline annual health care cost was $4,090 (95% CI: $3,910–4,270). For those with diabetes and no CVD (6.9%), baseline annual costs were about 2.5-fold higher, at $10,460 (95% CI: $9,620–11,300), whereas for those with CVD and no diabetes (6.9%), baseline annual costs were more than 3-fold higher, at $14,050 (95% CI: $12,990–15,120). For adults with both CVD and diabetes at baseline (3.0%), annual health care costs were more than 5-fold higher than for adults with neither at baseline, at $21,340 (95% CI: $19,030–23,650).

Table 5 shows the marginal differences in individual health care expenditure based on changes in demographics, risk factors, and disease conditions. Each characteristic predicted significant differences in cost, compared with the base cost for a 40-y-old, non-Hispanic White man without diabetes or CVD at baseline. For example, estimated baseline annual health expenditures for a 40-y-old White man with a BMI of 25 kg/m2 and no chronic cardiometabolic diseases were $2,832 (95% CI: $2,618–3,045), compared with $7,764 (95% CI: $6,984–8,545) for a 40-y-old White man with a BMI of 31 kg/m2, diabetes, and hypertension and $36,500 (95% CI: $31,525–41,473) for a 65-y-old White man with a BMI of 31 kg/m2, diabetes, hypertension, CHD, and prior stroke.

Table 5.

Marginal Effects of Individual Characteristics on Estimated Annual Health Care Expenditures among US Adults Aged 40–79 y, Based on 73,174 Individuals in MEPS 2014–2016 (2017 USD)

Characteristic | Mean Annual Health Care Expenditures, $ | 95% CI
Baseline health care expenditure a | 2,895 | 2,687, 3,102
Age after 40 y, each year (i.e., age 40 = 0) | +95.1 | 84.2, 106
Changes in BMI from BMI 28, each kg/m2 (e.g., BMI 25 = −3; BMI 30 = 2) | +41.0 | 15.1, 66.8
Female v. male sex | +1,984 | 1,645, 2,323
Race/ethnicity
 Non-Hispanic White | Reference |
 Non-Hispanic Black | −1,629 | −2,113, −1,144
 Hispanic | −2,312 | −2,705, −1,920
 Non-Hispanic other | −1,581 | −2,163, −998
Diabetes | +3,842 | 3,167, 4,516
High blood pressure | +2,101 | 1,634, 2,568
Coronary heart disease | +4,711 | 3,834, 5,587
Stroke | +4,850 | 3,673, 6,028
a

Baseline health care expenditures were estimated for individuals who were age 40 y, male, non-Hispanic White, BMI 28, and had no clinical conditions. The marginal effects represent additional changes in health care expenditures from the baseline expenditure by the change in 1 unit of the predictor. For example, with all other predictors unchanged from baseline characteristics (i.e., BMI 28, male, non-Hispanic White, and no clinical conditions), a 1-unit change in age (from age 40 to age 41 y) would increase annual health care expenditures by $95 on average.
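To make the footnote's reading of the marginal effects concrete, the sketch below applies the Table 5 point estimates as simple additive increments to the baseline expenditure. The additive combination is an assumption made for illustration only; the constants come from Table 5, whereas the function and dictionary names are hypothetical.

```python
# Point estimates from Table 5 (2017 USD); confidence intervals are ignored here.
BASE_COST = 2895.0             # age 40, male, non-Hispanic White, BMI 28, no conditions
PER_YEAR_OVER_40 = 95.1
PER_BMI_UNIT_FROM_28 = 41.0
FEMALE = 1984.0
RACE_EFFECT = {
    "Non-Hispanic White": 0.0,   # reference group
    "Non-Hispanic Black": -1629.0,
    "Hispanic": -2312.0,
    "Non-Hispanic other": -1581.0,
}
CONDITION_EFFECT = {
    "diabetes": 3842.0,
    "high_blood_pressure": 2101.0,
    "coronary_heart_disease": 4711.0,
    "stroke": 4850.0,
}


def additive_annual_cost(age: float, bmi: float, female: bool = False,
                         race: str = "Non-Hispanic White",
                         conditions: tuple[str, ...] = ()) -> float:
    """Sum the baseline cost and each marginal effect (illustrative assumption)."""
    cost = BASE_COST
    cost += PER_YEAR_OVER_40 * (age - 40)
    cost += PER_BMI_UNIT_FROM_28 * (bmi - 28)
    if female:
        cost += FEMALE
    cost += RACE_EFFECT[race]
    for condition in conditions:
        cost += CONDITION_EFFECT[condition]
    return cost


# Footnote example: moving from age 40 to 41 adds about $95 to the baseline.
print(additive_annual_cost(age=41, bmi=28))
# A 40-y-old man with BMI 25 under the additive reading.
print(additive_annual_cost(age=40, bmi=25))
```

Under this additive reading, a 40-y-old non-Hispanic White man with a BMI of 25 kg/m2 comes to $2,772, whereas the text reports $2,832 for the same profile. The simple sum should therefore be treated only as a rough approximation of the cost prediction model, not as a reproduction of it.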

Lifetime Cost-Effectiveness of a National Produce Prescription Program

Over the lifetimes of the current US adult population with diabetes and food insecurity (about 6.5 million), the DOC-M model estimated that implementing produce prescriptions would prevent 292,000 (95% uncertainty interval: 143,000–440,000) CVD events, generate 260,000 (110,200–411,000) QALYs, incur $44.3 billion in implementation costs, and save $39.6 billion ($20.5–58.6 billion) in health care costs and $4.8 billion ($1.8–7.7 billion) in productivity costs. The program was highly cost-effective from a health care perspective (incremental cost-effectiveness ratio: $18,100/QALY) and both health-improving and cost-saving from a societal perspective (net savings of $0.1 billion; Table 6). Importantly, the programs would provide greater health benefits and result in more favorable cost-effectiveness ratios among non-Hispanic Black and Hispanic patients than among non-Hispanic White patients, primarily because of higher baseline cardiometabolic and mortality risk among non-Hispanic Black and Hispanic patients (data are not shown, and the complete analyses are available elsewhere 64 ).
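The cost-effectiveness figures in this paragraph follow from simple arithmetic on the reported point estimates, as the sketch below shows. The input values are taken from the text; the function and variable names are hypothetical, and uncertainty intervals are ignored.

```python
from typing import Optional


def icer(incremental_cost: float, incremental_qalys: float) -> Optional[float]:
    """Incremental cost per QALY gained.

    Returns None when the intervention is dominant (it improves health while
    lowering net cost), in which case a ratio is conventionally not reported.
    """
    if incremental_qalys <= 0:
        raise ValueError("this sketch only handles interventions that gain QALYs")
    if incremental_cost <= 0:
        return None
    return incremental_cost / incremental_qalys


# Point estimates reported in the text (discounted, lifetime horizon).
PROGRAM_COST = 44.3e9           # implementation costs, $
HEALTH_CARE_SAVINGS = 39.6e9    # health care costs averted, $
PRODUCTIVITY_SAVINGS = 4.8e9    # productivity costs averted, $
QALYS_GAINED = 260_000

health_care_net = PROGRAM_COST - HEALTH_CARE_SAVINGS
societal_net = health_care_net - PRODUCTIVITY_SAVINGS

health_care_icer = icer(health_care_net, QALYS_GAINED)
societal_icer = icer(societal_net, QALYS_GAINED)

print(f"Health care perspective: ${health_care_icer:,.0f}/QALY")
print(f"Societal net cost: ${societal_net / 1e9:.1f} billion; dominant: {societal_icer is None}")
```

The health care perspective yields roughly $18,100 per QALY after rounding, and the societal net cost is negative (about $0.1 billion in savings), which is why that perspective is labeled dominant.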

Table 6.

Lifetime Population Health Impact, Economic Impact, and Cost-Effectiveness of Produce Prescription among US Adults with Diabetes and Food Insecurity (Population Size: 5.74 million)

Scenario | Total CVD Events | First CVD Cases | Recurrent CVD Events | QALYs a | Health Care Cost, a $ Billions | Productivity Loss, a $ Billions | Produce Prescription Program Costs (Food and Administrative Costs), a $ Billions | ICER, Health Care Perspective, $/QALY | ICER, Societal Perspective, $/QALY
No intervention
 Estimates | 5,210,000 | 2,510,000 | 2,700,000 | 79,200,000 | 1,460 | 230 | N/A | |
 95% UI | (5,110,000 to 5,310,000) | (2,480,000 to 2,540,000) | (2,620,000 to 2,780,000) | (77,600,000 to 81,000,000) | (1,340 to 1,580) | (154 to 326) | | |
Produce prescription programs
 Estimates | 4,920,000 | 2,430,000 | 2,480,000 | 79,400,000 | 1,420 | 225 | 44.3 | |
 95% UI | (4,740,000 to 5,085,000) | (2,380,000 to 2,480,000) | (2,350,000 to 2,610,000) | (77,900,000 to 81,200,000) | (1,300 to 1,530) | (151 to 319) | (32.1 to 56.4) | |
Incremental change with produce prescription programs
 Estimates | −292,000 | −77,500 | −214,000 | 260,000 | −39.6 | −4.8 | 44.3 | 18,100 | Dominant (health improving and cost-saving)
 95% UI | (−440,000 to −143,000) | (−120,000 to −35,500) | (−323,000 to −105,000) | (110,000 to 411,000) | (−58.6 to −20.5) | (−7.7 to −1.8) | (32.1 to 56.4) | |

CVD, cardiovascular disease; ICER, incremental cost-effectiveness ratio; N/A, not applicable; QALY, quality-adjusted life-year; UI, uncertainty interval.

a

We discounted QALYs and all costs at 3% annual rate.
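As a brief illustration of the discounting noted in this footnote, the sketch below computes the present value of an annual stream at a 3% rate. It assumes annual time steps with the first year undiscounted, which is a common convention but is not stated in the text; the function name and toy stream are hypothetical.

```python
from typing import Sequence


def present_value(annual_values: Sequence[float], rate: float = 0.03) -> float:
    """Discount a yearly stream of costs or QALYs back to year 0.

    annual_values[t] is the amount accrued in year t; year 0 is not discounted
    (an assumption of this sketch).
    """
    return sum(value / (1.0 + rate) ** year
               for year, value in enumerate(annual_values))


# Hypothetical stream: $1,000 per year for 5 years.
print(round(present_value([1000.0] * 5), 2))
```

The same function applies to QALYs, since the footnote states that QALYs and all costs were discounted at the same rate.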

Discussion

By combining clinical and epidemiological studies of risk factors, disease burden, health disparities, and health care utilization, simulation modeling objectively synthesizes these disparate evidence sources and their uncertainty to inform population health policy decisions. Documenting a model’s development and testing its validity is an important process that could improve public and scientific trust. For transparency in reporting and documentation, we describe the development and validation process of the novel DOC-M model and provide the publicly available source code (https://github.com/food-price/DOCM_validation).

The clinical importance of the DOC-M model includes joint predictions of the onset of and complications from diet-related diseases (i.e., obesity, diabetes, CVD) and cause-specific mortality. This is an improvement over some simulation models that focus on a single disease (e.g., obesity, 67,68 diabetes, 4–6 hypertension, 7 CHD and stroke, 8–10,69,70 or an organ-specific cancer 11–14 ). Chronic conditions often share common risk factors with health implications for affected individuals. This is particularly true for diet-related diseases, for which demographic, lifestyle, and clinical risk factors have strong positive correlations and cluster in high-risk individuals. By accounting for these correlations, the DOC-M model captures risk heterogeneity across individuals to better assess the population effects of clinical, behavioral, and policy interventions across risk factors and disease pathways relevant to diet-related diseases.

Methodologically, our validation analyses applied both population-risk calibration (using serial cross-sectional national data) and individual-risk discrimination (using a diverse community-based prospective cohort) for the overall US population and 3 racial-ethnic groups. Population-risk calibration is necessary for a model’s face validity when seeking to inform health policy decisions, whereas individual-risk discrimination is especially important for clinicians and patients because individual health outcomes matter. However, calibration and discrimination have rarely been assessed together. 71 Based on standard metrics, our validation analyses demonstrated strong performance for both population-risk calibration and individual-risk discrimination for all primary and secondary outcomes for the adult US population overall and by racial-ethnic group, although certain outcomes among non-Hispanic Black and Hispanic adults had greater uncertainty. As acknowledged by the developers of the pooled cohort ASCVD risk equation used in our model, 37 the sample population contained a relatively low number of non-Hispanic Black adults, which contributed to greater uncertainty in their disease predictions. Still, the pooled cohort ASCVD risk equation has reasonable calibration in many populations and remains the most commonly used CVD risk prediction model. 72 In addition, our ROC analyses using the prospective MESA cohort showed reasonable to strong risk discrimination (c-statistics of 0.74 to 0.95). Future updates and calibration of DOC-M could strengthen racial-ethnic-specific predictions as new evidence and data emerge.

The DOC-M model’s other innovative feature is the ability to forecast health trajectories across racial-ethnic and socioeconomic status groups in the nationally representative NHANES data set. Unlike other popular approaches that use “simulated” individuals created from population-level summary statistics in microsimulation models, 8,10,25,73 our approach preserves correlations between individual-level characteristics (e.g., insurance status, household size, geographic region, education level, household income, food security status). Other features include incorporating age-related and secular trends in underlying risk factors and developing probabilistic models to reflect both sampling and parameter uncertainty. In comparison, the Real-World Progression in Diabetes (RAPIDS) model also preserves correlations across individual-level characteristics and predicts the individual trajectory of biomarkers. 74 However, it focuses only on individuals with diabetes, is based on US Veterans Affairs data, and does not currently account for parameter uncertainty. Along with the independently developed and integrated individualized health care cost prediction model, the DOC-M model allows us to investigate the long-term distributional health and economic impacts of interventions to address upstream shared risk drivers, such as social determinants of health, for the US adult population. For example, our case study evaluating the long-term cost-effectiveness of a national produce prescription program illustrates the application of the DOC-M model for US adults with diabetes and food insecurity. Our publication details our approach with additional subgroup analyses by age, race/ethnicity, educational attainment, and insurance status. 64

Finally, this article includes the publicly available source code for the DOC-M model (https://github.com/food-price/DOCM_validation). A recognized gap for many cutting-edge research studies is the failure to reach end users in a clear and timely fashion. 75 This problem is particularly notable for simulation modeling studies as they require integrating multiple data sources and extrapolating with certain assumptions, making results harder to interpret, less accessible to nontechnical audiences, and prone to undetectable errors. 76,77 Despite some concerns about intellectual property rights and potential misuse of the open-source code, 78 we believe the open-source DOC-M model, a product of federally funded research grants, can help enhance model transparency, usability, and adaptability to foster collaboration across international research communities.

We view our racial-ethnic group-level validation approach as our best attempt to make credible group-level projections with existing data. Our model incorporated the established evidence on distributional differences in baseline risk factors, disease incidence, and mortality across racial-ethnic groups to facilitate the accurate prediction of long-term health and economic outcomes among population subgroups, including racial-ethnic ones, and to guide policy decisions that address health inequities. Importantly, racial-ethnic subgroups are social constructs and not biological risk factors, so their associations with risk contain inherent heterogeneity and confounding due to structural racism and may change over time.

Our study has limitations. NHANES is representative of the US civilian noninstitutionalized resident population, whereas the CDC data source for mortality is based on deaths for all US residents. Differences between these populations might contribute to some discrepancy between observed and predicted outcomes during the initial (precalibration) assessment. Similarly, MEPS, used to generate our individual health care cost prediction model, excludes persons in nursing homes and assisted living facilities, who represent a particularly high-expenditure group. As medical conditions and utilization data in MEPS may also be subject to underreporting due to response and recall biases, 79 our model could underestimate health care expenditures by 10% to 20%, 80–83 and its estimates should be considered conservative.

For predicting trends for each obesity and cardiometabolic risk factor, we estimated temporal trends (i.e., average annual percentage change) in 16 population subgroups (based on age and race/ethnicity) from the historical cross-sectional NHANES data in the absence of nationally representative longitudinal follow-up data. Thus, we could not incorporate cardiometabolic risk factor trends at the individual level. Nonetheless, our validation analyses showed strong model performance with population-risk calibration and individual-risk discrimination. Should such individual-level data become available, future work could more precisely incorporate the longitudinal correlation among cardiometabolic risk factors (e.g., the approach used in the RAPIDS model 74 ). Also, we did not incorporate a trend in mortality because the downward slopes in CVD-related mortality have plateaued since 2010. As our model incorporated trends in risk factors, we implicitly assumed that no substantial mortality reduction would occur due to technological advances in our future prediction of CVD or other-cause mortality. Because model validation should be considered only partially complete, it is essential to continuously update and refine the model to reflect new data on cause-specific mortality, underlying trends in risk factors, and observed outcomes.
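As a rough illustration of how a subgroup-level average annual percentage change (AAPC) could be carried forward in time, the sketch below compounds a baseline risk factor value year by year. Compounding is an assumption of this sketch, since the text does not describe how the trends were applied within the model; the function name and example values are hypothetical.

```python
def project_with_aapc(baseline_value: float, aapc_percent: float,
                      years: int) -> list[float]:
    """Project a risk factor level forward, compounding the AAPC each year.

    Returns years + 1 values, starting with the baseline (year 0).
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    growth = 1.0 + aapc_percent / 100.0
    return [baseline_value * growth ** year for year in range(years + 1)]


# Hypothetical subgroup: mean BMI of 28.7 kg/m2 rising 0.5% per year for 15 years.
trajectory = project_with_aapc(28.7, 0.5, 15)
print([round(value, 2) for value in trajectory])
```

Because one AAPC is shared by everyone in a subgroup, a projection of this kind shifts the whole subgroup together and cannot capture individual-level longitudinal correlation, which is the limitation acknowledged above.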

Conclusion

This study presents the development and validation of a novel US microsimulation model that incorporates demographic, lifestyle, and clinical risk factors with temporal trends to predict onset and complications from obesity, diabetes, CVD, and all-cause mortality as primary outcomes and CVD- and diabetes-cause specific mortality as secondary outcomes, for both the US adult population and within racial-ethnic groups. The DOC-M model can be used to examine health, equity, and the economic impact of health policies and interventions on clinical and behavioral risk factors for obesity, diabetes, and CVD. We also report our model’s development and validation process and provide publicly available source code for others to use, adapt, and enhance, with the hope of fostering future simulation model development and validation.

Supplemental Material

Supplemental material, sj-docx-1-mdm-10.1177_0272989X231196916 for Development and Validation of the US Diabetes, Obesity, Cardiovascular Disease Microsimulation (DOC-M) Model: Health Disparity and Economic Impact Model by David D. Kim, Lu Wang, Brianna N. Lauren, Junxiu Liu, Matti Marklund, Yujin Lee, Renata Micha, Dariush Mozaffarian and John B. Wong in Medical Decision Making

Footnotes

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Financial support for this study was provided entirely by NIH/NHLBI grants R01HL130735 and R01HL115189. The funding agreement ensured the authors’ independence in designing the study, interpreting the data, writing, and publishing the report.

Data Availability Statement: The input data and source code for the US Diabetes, Obesity, Cardiovascular Disease Microsimulation (DOC-M) model are available at https://github.com/food-price/DOC-M-Model-Development-and-Validation.

Contributor Information

David D. Kim, Section of Hospital Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA.

Lu Wang, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA.

Brianna N. Lauren, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA.

Junxiu Liu, Department of Population Health Science and Policy, the Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Matti Marklund, The George Institute for Global Health, University of New South Wales, Sydney, Australia; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA.

Yujin Lee, Department of Food and Nutrition, Myongji University, Yongin, South Korea.

Renata Micha, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA.

Dariush Mozaffarian, Friedman School of Nutrition Science and Policy, Tufts University, Boston, MA, USA.

John B. Wong, Division of Clinical Decision Making, Tufts Medical Center, Boston, MA, USA.

References

  • 1. Biggerstaff M, Slayton RB, Johansson MA, Butler JC. Improving pandemic response: employing mathematical modeling to confront coronavirus disease 2019. Clin Infect Dis. 2022;74(5):913–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Bertozzi AL, Franco E, Mohler G, Short MB, Sledge D. The challenges of modeling and forecasting the spread of COVID-19. Proc Natl Acad Sci U S A. 2020;117(29):16732–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Owens DK, Whitlock EP, Henderson J, et al. Use of decision models in the development of evidence-based clinical preventive services recommendations: methods of the U.S. Preventive Services Task Force. Ann Intern Med. 2016;165(7):501–8. [DOI] [PubMed] [Google Scholar]
  • 4. Huang ES, Basu A, O’Grady M, Capretta JC. Projecting the future diabetes population size and related costs for the U.S. Diabetes Care. 2009;32(12):2225–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Eddy DM, Schlessinger L. Validation of the archimedes diabetes model. Diabetes Care. 2003;26(11):3102–10. [DOI] [PubMed] [Google Scholar]
  • 6. Russell LB, Valiyeva E, Roman SH, Pogach LM, Suh DC, Safford MM. Hospitalizations, nursing home admissions, and deaths attributable to diabetes. Diabetes Care. 2005;28(7):1611–7. [DOI] [PubMed] [Google Scholar]
  • 7. Russell LB, Valiyeva E, Carson JL. Effects of prehypertension on admissions and deaths: a simulation. Arch Intern Med. 2004;164(19):2119–24. [DOI] [PubMed] [Google Scholar]
  • 8. Pandya A, Sy S, Cho S, Alam S, Weinstein MC, Gaziano TA. Validation of a cardiovascular disease policy microsimulation model using both survival and receiver operating characteristic curves. Med Decis Making. 2017;37(7):802–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Heller DJ, Coxson PG, Penko J, et al. Evaluating the iImpact and cost-effectiveness of statin use guidelines for primary prevention of coronary heart disease and stroke. Circulation. 2017;136(12):1087–98. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Weinstein MC, Coxson PG, Williams LW, Pass TM, Stason WB, Goldman L. Forecasting coronary heart disease incidence, mortality, and cost: the Coronary Heart Disease Policy Model. Am J Public Health. 1987;77(11):1417–26. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Burger EA, de Kok I, Groene E, et al. Estimating the natural history of cervical carcinogenesis using simulation models: a CISNET comparative analysis. J Natl Cancer Inst. 2020;112(9):955–63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Criss SD, Cao P, Bastani M, et al. Cost-effectiveness analysis of lung cancer screening in the United States: a comparative modeling study. Ann Intern Med. 2019;171(11):796–804. [DOI] [PubMed] [Google Scholar]
  • 13. Alagoz O, Berry DA, de Koning HJ, et al. Introduction to the Cancer Intervention and Surveillance Modeling Network (CISNET) breast cancer models. Med Decis Making. 2018;38(1 suppl):3S–8S.
  • 14. Kim DD, Wilde PE, Michaud DS, et al. Cost effectiveness of nutrition policies on processed meat: implications for cancer burden in the U.S. Am J Prev Med. 2019;57(5):e143–52.
  • 15. American Diabetes Association. 1. Improving care and promoting health in populations: standards of medical care in diabetes-2020. Diabetes Care. 2020;43(suppl 1):S7–13.
  • 16. Powers WJ, Rabinstein AA, Ackerson T, et al. 2018 guidelines for the early management of patients with acute ischemic stroke: a guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2018;49(3):e46–110.
  • 17. LeBlanc ES, Patnode CD, Webber EM, Redmond N, Rushkin M, O’Connor EA. Behavioral and pharmacotherapy weight loss interventions to prevent obesity-related morbidity and mortality in adults: updated evidence report and systematic review for the US Preventive Services Task Force. JAMA. 2018;320(11):1172–91.
  • 18. Arnett DK, Goodman RA, Halperin JL, Anderson JL, Parekh AK, Zoghbi WA. AHA/ACC/HHS strategies to enhance application of clinical practice guidelines in patients with cardiovascular disease and comorbid conditions: from the American Heart Association, American College of Cardiology, and US Department of Health and Human Services. Circulation. 2014;130(18):1662–7.
  • 19. Eckel RH, Jakicic JM, Ard JD, et al. 2013 AHA/ACC guideline on lifestyle management to reduce cardiovascular risk: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation. 2014;129(25 suppl 2):S76–99.
  • 20. Jacobs AK, Anderson JL, Halperin JL. The evolution and future of ACC/AHA clinical practice guidelines: a 30-year journey: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. J Am Coll Cardiol. 2014;64(13):1373–84.
  • 21. Centers for Disease Control and Prevention. Health disparities. 2017. Available from: https://www.cdc.gov/aging/disparities/index.htm [Accessed 20 October, 2021].
  • 22. Chowkwanyun M, Reed AL Jr. Racial health disparities and Covid-19—caution and context. N Engl J Med. 2020;383(3):201–3.
  • 23. National Institutes of Health. Notice of Special Interest (NOSI): simulation modeling and systems science to address health disparities. 2020. Available from: https://grants.nih.gov/grants/guide/notice-files/NOT-MD-20-025.html
  • 24. Penalvo JL, Cudhea F, Micha R, et al. The potential impact of food taxes and subsidies on cardiovascular disease and diabetes burden and disparities in the United States. BMC Med. 2017;15(1):208.
  • 25. Pearson-Stuttard J, Bandosz P, Rehm CD, et al. Reducing US cardiovascular disease burden and disparities through national and targeted dietary policies: a modelling study. PLoS Med. 2017;14(6):e1002311.
  • 26. R Core Team. R: A Language and Environment for Statistical Computing. Vienna (Austria): R Foundation for Statistical Computing; 2021.
  • 27. Wilson PW, Meigs JB, Sullivan L, Fox CS, Nathan DM, D’Agostino RB Sr. Prediction of incident diabetes mellitus in middle-aged adults: the Framingham Offspring Study. Arch Intern Med. 2007;167(10):1068–74.
  • 28. Goff DC Jr, Lloyd-Jones DM, Bennett G, et al. 2013 ACC/AHA guideline on the assessment of cardiovascular risk: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation. 2014;129(25 suppl 2):S49–73.
  • 29. Benjamin EJ, Virani SS, Callaway CW, et al. Heart disease and stroke statistics-2018 update: a report from the American Heart Association. Circulation. 2018;137(12):e67–492.
  • 30. D’Agostino RB, Russell MW, Huse DM, et al. Primary and subsequent coronary risk appraisal: new results from the Framingham study. Am Heart J. 2000;139(2 Pt 1):272–81.
  • 31. Centers for Disease Control and Prevention. Underlying cause of death 1999-2016. 2018. Available from: https://wonder.cdc.gov/wonder/help/ucd.html# [Accessed 21 September, 2018].
  • 32. Lubetkin EI, Jia H, Franks P, Gold MR. Relationship among sociodemographic factors, clinical conditions, and health-related quality of life: examining the EQ-5D in the U.S. general population. Qual Life Res. 2005;14(10):2187–96.
  • 33. Davies EW, Matza LS, Worth G, et al. Health state utilities associated with major clinical events in the context of secondary hyperparathyroidism and chronic kidney disease requiring dialysis. Health Qual Life Outcomes. 2015;13:90.
  • 34. US Centers for Medicare & Medicaid Services. Medicare provider utilization and payment data: inpatient. 2019. Available from: https://data.cms.gov/provider-summary-by-type-of-service/medicare-inpatient-hospitals/medicare-inpatient-hospitals-by-provider
  • 35. National Center for Health Statistics. National Health and Nutrition Examination Survey: Overview. Hyattsville (MD): National Center for Health Statistics; 2018.
  • 36. Yoon SS, Gu Q, Nwankwo T, Wright JD, Hong Y, Burt V. Trends in blood pressure among adults with hypertension: United States, 2003 to 2012. Hypertension. 2015;65(1):54–61.
  • 37. American Diabetes Association. 2. Classification and diagnosis of diabetes: standards of medical care in diabetes—2019. Diabetes Care. 2019;42(suppl 1):S13–28.
  • 38. Rose GA, Blackburn H. Cardiovascular survey methods. Monogr Ser World Health Organ. 1968;56:1–188.
  • 39. Mariotto AB, Yabroff KR, Shao Y, Feuer EJ, Brown ML. Projections of the cost of cancer care in the United States: 2010-2020. J Natl Cancer Inst. 2011;103(2):117–28.
  • 40. Clegg LX, Hankey BF, Tiwari R, Feuer EJ, Edwards BK. Estimating average annual per cent change in trend analysis. Stat Med. 2009;28(29):3670–82.
  • 41. Narayan KMV, Boyle JP, Thompson TJ, Sorensen SW, Williamson DF. Lifetime risk for diabetes mellitus in the United States. JAMA. 2003;290(14):1884–90.
  • 42. Brancati FL, Kao WHL, Folsom AR, Watson RL, Szklo M. Incident type 2 diabetes mellitus in African American and white adults: the Atherosclerosis Risk in Communities Study. JAMA. 2000;283(17):2253–9.
  • 43. Schmidt MI, Duncan BB, Bang H, et al. Identifying individuals at high risk for diabetes: the Atherosclerosis Risk in Communities study. Diabetes Care. 2005;28(8):2013–8.
  • 44. Fleurence RL, Hollenbeak CS. Rates and probabilities in economic modelling: transformation, translation and appropriate application. Pharmacoeconomics. 2007;25(1):3–6.
  • 45. Ford ES, Ajani UA, Croft JB, et al. Explaining the decrease in U.S. deaths from coronary disease, 1980-2000. N Engl J Med. 2007;356(23):2388–98.
  • 46. Feeny D, Krahn M, Prosser LA, Salomon JA. Valuing health outcomes. In: Neumann PJ, Sanders GD, Russell LB, Siegel JE, Ganiats TG, eds. Cost–Effectiveness in Health and Medicine. 2nd ed. New York: Oxford University Press; 2016. p. 167–99.
  • 47. Agency for Healthcare Research and Quality. The Medical Expenditure Panel Survey (MEPS). Rockville (MD): Agency for Healthcare Research and Quality; 2021.
  • 48. Box GE, Cox DR. An analysis of transformations. J R Stat Soc Series B Stat Methodol. 1964;26(2):211–43.
  • 49. Manning WG, Mullahy J. Estimating log models: to transform or not to transform? J Health Econ. 2001;20(4):461–94.
  • 50. Park RE. Estimation with heteroscedastic error terms. Econometrica. 1966;34(4):888.
  • 51. Deb P, Norton EC, Manning WG. Health Econometrics Using Stata. College Station (TX): Stata Press; 2017.
  • 52. Norton EC, Dowd BE, Maciejewski ML. Marginal effects-quantifying the effect of changes in risk factors in logistic regression models. JAMA. 2019;321(13):1304–5.
  • 53. StataCorp. Stata Statistical Software: Release 16. College Station (TX): StataCorp; 2019.
  • 54. Dakin HA, Leal J, Briggs A, Clarke P, Holman RR, Gray A. Accurately reflecting uncertainty when using patient-level simulation models to extrapolate clinical trial data. Med Decis Making. 2020;40(4):460–73.
  • 55. Rubin DB. Multiple Imputation for Nonresponse in Surveys. Hoboken (NJ): Wiley-Interscience; 2004.
  • 56. Dahabreh IJ, Chan JA, Earley A, et al. Modeling and Simulation in the Context of Health Technology Assessment: Review of Existing Guidance, Future Research Needs, and Validity Assessment. Rockville (MD): Agency for Healthcare Research and Quality; 2017.
  • 57. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128–38.
  • 58. Van Calster B, Nieboer D, Vergouwe Y, De Cock B, Pencina MJ, Steyerberg EW. A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol. 2016;74:167–76.
  • 59. van Geloven N, Giardiello D, Bonneville EF, et al. Validation of prediction models in the presence of competing risks: a guide through modern methods. BMJ. 2022;377:e069249.
  • 60. Brier GW. Verification of forecasts expressed in terms of probability. Mon Weather Rev. 1950;78(1):1–3.
  • 61. Bild DE, Bluemke DA, Burke GL, et al. Multi-Ethnic Study of Atherosclerosis: objectives and design. Am J Epidemiol. 2002;156(9):871–81.
  • 62. Harrell FE Jr. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. Cham (Switzerland): Springer; 2015.
  • 63. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York: John Wiley & Sons; 2000.
  • 64. Wang L, Lauren BN, Hager K, et al. Health and economic impacts of implementing produce prescription programs for diabetes in the United States: a microsimulation study. J Am Heart Assoc. 2023;12(15):e029215.
  • 65. Bhat S, Coyle DH, Trieu K, et al. Healthy food prescription programs and their impact on dietary behavior and cardiometabolic risk factors: a systematic review and meta-analysis. Adv Nutr. 2021;12(5):1944–56.
  • 66. Miller V, Micha R, Choi E, Karageorgou D, Webb P, Mozaffarian D. Evaluation of the quality of evidence of the association of foods and nutrients with cardiovascular disease and diabetes: a systematic review. JAMA Netw Open. 2022;5(2):e2146705.
  • 67. Ward ZJ, Long MW, Resch SC, Giles CM, Cradock AL, Gortmaker SL. Simulation of growth trajectories of childhood obesity into adulthood. N Engl J Med. 2017;377(22):2145–53.
  • 68. Ward ZJ, Bleich SN, Cradock AL, et al. Projected U.S. state-level prevalence of adult obesity and severe obesity. N Engl J Med. 2019;381(25):2440–50.
  • 69. Pearson-Stuttard J, Guzman-Castillo M, Penalvo JL, et al. Modeling future cardiovascular disease mortality in the United States: national trends and racial and ethnic disparities. Circulation. 2016;133(10):967–78.
  • 70. Pandya A, Sy S, Cho S, Weinstein MC, Gaziano TA. Cost-effectiveness of 10-year risk thresholds for initiation of statin therapy for primary prevention of cardiovascular disease. JAMA. 2015;314(2):142–50.
  • 71. Wessler BS, Paulus J, Lundquist CM, et al. Tufts PACE clinical predictive model registry: update 1990 through 2015. Diagn Progn Res. 2017;1:20.
  • 72. Arnett DK, Khera A, Blumenthal RS. 2019 ACC/AHA guideline on the primary prevention of cardiovascular disease: part 1, lifestyle and behavioral factors. JAMA Cardiol. 2019;4(10):1043–4.
  • 73. Lewsey JD, Lawson KD, Ford I, et al. A cardiovascular disease policy model that predicts life expectancy taking into account socioeconomic deprivation. Heart. 2015;101(3):201–8.
  • 74. Basu A, Sohn MW, Bartle B, Chan KCG, Cooper JM, Huang E. Development and validation of the Real-World Progression in Diabetes (RAPIDS) model. Med Decis Making. 2019;39(2):137–51.
  • 75. Grimshaw JM, Eccles MP, Lavis JN, Hill SJ, Squires JE. Knowledge translation of research findings. Implement Sci. 2012;7:50.
  • 76. Dunlop WCN, Mason N, Kenworthy J, Akehurst RL. Benefits, challenges and potential strategies of open source health economic models. Pharmacoeconomics. 2017;35(1):125–8.
  • 77. Cohen JT, Neumann PJ, Wong JB. A call for open-source cost-effectiveness analysis. Ann Intern Med. 2018;168(7):529.
  • 78. Padula WV, McQueen RB, Pronovost PJ. Finding resolution for the responsible transparency of economic models in health and medicine. Med Care. 2017;55(11):915–7.
  • 79. Zuvekas SH, Olin GL. Validating household reports of health care use in the Medical Expenditure Panel Survey. Health Serv Res. 2009;44(5 Pt 1):1679–700.
  • 80. Aizcorbe A, Liebman E, Pack S, Cutler DM, Chernew ME, Rosen AB. Measuring health care costs of individuals with employer-sponsored health insurance in the U.S.: a comparison of survey and claims data. Stat J IAOS. 2012;28(1-2):43–51.
  • 81. Bernard D, Cowan C, Selden T, Cai L, Catlin A, Heffler S. Reconciling medical expenditure estimates from the MEPS and NHEA, 2007. Medicare Medicaid Res Rev. 2012;2(4).
  • 82. Olin G, Zuvekas S, Kumar V, Ward P, Williams K, Wobus D. Medicare-MEPS Validation Study: A Comparison of Hospital and Physician Expenditures. Working Paper No. 08003. Rockville (MD): Agency for Healthcare Research and Quality; 2008.
  • 83. Zuvekas SH, Olin GL. Accuracy of Medicare expenditures in the Medical Expenditure Panel Survey. Inquiry. 2009;46(1):92–108.

Associated Data

Supplementary Materials

Supplemental material, sj-docx-1-mdm-10.1177_0272989X231196916 for Development and Validation of the US Diabetes, Obesity, Cardiovascular Disease Microsimulation (DOC-M) Model: Health Disparity and Economic Impact Model by David D. Kim, Lu Wang, Brianna N. Lauren, Junxiu Liu, Matti Marklund, Yujin Lee, Renata Micha, Dariush Mozaffarian and John B. Wong in Medical Decision Making


Articles from Medical Decision Making are provided here courtesy of SAGE Publications