American Journal of Epidemiology. 2021 Aug 18;191(1):188–197. doi: 10.1093/aje/kwab219

G-Computation and Agent-Based Modeling for Social Epidemiology: Can Population Interventions Prevent Posttraumatic Stress Disorder?

Stephen J Mooney, Aaron B Shev, Katherine M Keyes, Melissa Tracy, Magdalena Cerdá
PMCID: PMC8897987  PMID: 34409437

Abstract

Agent-based modeling and g-computation can both be used to estimate impacts of intervening on complex systems. We explored each modeling approach within an applied example: interventions to reduce posttraumatic stress disorder (PTSD). We used data from a cohort of 2,282 adults representative of the adult population of the New York City metropolitan area from 2002–2006, of whom 16.3% developed PTSD over their lifetimes. We built 4 models: g-computation, an agent-based model (ABM) with no between-agent interactions, an ABM with violent-interaction dynamics, and an ABM with neighborhood dynamics. Three interventions were tested: 1) reducing violent victimization by 37.2% (real-world reduction); 2) reducing violent victimization by 100%; and 3) supplementing the income of 20% of lower-income participants. The g-computation model estimated population-level PTSD risk reductions of 0.12% (95% confidence interval (CI): −0.16, 0.29), 0.28% (95% CI: −0.30, 0.70), and 1.55% (95% CI: 0.40, 2.12), respectively. The ABM with no interactions replicated the findings from g-computation. Introduction of interaction dynamics modestly decreased estimated intervention effects (income-supplement risk reduction dropped to 1.47%), whereas introduction of neighborhood dynamics modestly increased effectiveness (income-supplement risk reduction increased to 1.58%). Compared with g-computation, agent-based modeling permitted deeper exploration of complex systems dynamics at the cost of further assumptions.

Keywords: agent-based modeling, g-computation, mathematical models, posttraumatic stress disorder, social epidemiology, violence

Abbreviations

ABM, agent-based model; CI, confidence interval; PTSD, posttraumatic stress disorder; WTC, World Trade Center.

Estimating the impacts of interventions on social conditions such as poverty and violent victimization poses modeling challenges, because these conditions intersect through complex social forces (1, 2). Mechanistically, social forces include “feedback loops” (e.g., when violent victimization induced by poverty affects future employment, which in turn affects future poverty), social interaction (e.g., violent victimization requires a potential perpetrator to interact with a potential victim) (3), and dynamic spatial processes (e.g., social and structural forces in neighborhoods that create economic opportunities and give rise to patterns of migration and social interaction), all of which are difficult to incorporate in regression models.

Simulation-based methods such as agent-based models (ABMs) incorporate these dynamics within analyses (4–6) but require numerous assumptions for inferences (5, 7). Recently, Hernán (8) proposed that feedback loops as conceptualized in simulation models are equivalent to time-varying confounding as conceptualized in causal inference. It follows that g-computation (9–11), a model-based standardization method that accounts for time-varying confounding affected by prior exposure, can be used to investigate the impacts of social interventions, allowing intervention effects to incorporate causal feedback loops without diverging from established principles of statistical inference (8). Such g-computation models are mathematically equivalent to a microsimulation (12), a form of ABM that does not include spatial movement or between-agent interactions (13).

However, feedback loops are not the only complex system process for which researchers turn to simulation methods. Spatial movements, interactions between agents, and neighborhood dynamics are often components of social epidemiologic theories (14) that may be needed to accurately estimate intervention effects, yet typically cannot be incorporated into causal models rooted in independence assumptions. By contrast, ABMs can include these mechanisms explicitly. For example, agent-based social epidemiology models frequently include a simulated space for agents to move through, and trigger events such as violent victimization only when 2 agents come into proximity in that simulated space (15, 16). However, any mechanisms incorporated into a model must be explicitly specified, and that specification typically cannot be verified empirically (17). Accordingly, although an ABM can be specified to be mathematically equivalent to g-computation, in practice, ABMs frequently require stronger modeling assumptions than g-computation (12). Table 1 summarizes the sources that g-computation and ABMs can draw on, contrasted with sources of data a projection based on a conventional regression could draw on. (Note that as of 2021, we are not aware of any g-computation model that incorporates group-level variables aggregated from individual-level variables during the simulation; nonetheless, it could be done.)

Table 1.

Sources of Data Leveraged to Model Impacts of Population Interventions

Modeling Strategy
| Data Sources for Models | Conventional Regression | G-Computation | Agent-Based Modeling |
| --- | --- | --- | --- |
| Individual and group-level observations from primary data set | X | X | X |
| Potentially counterfactual time-varying variables simulated from models fit to primary data set | | X | X |
| Potentially counterfactual group-level variables aggregated from individual-level simulated variables | | X | X |
| Potentially counterfactual time-varying variables simulated from models fit to other data sets | | | X |
| Potentially counterfactual time-varying variables simulated from hypothesized mechanisms for interactions between units | | | X |

In sum, agent-based modeling can be used to estimate potential intervention effects for outcomes that are dynamic, social, and spatial; when constrained to exclude some of these dynamics, agent-based modeling reduces to g-computation. To explore the potential g-computation may hold as a complement to agent-based modeling in research on complex social systems, we compared results from a demonstration g-computation analysis with those from a comparable ABM exploring population prevention strategies for posttraumatic stress disorder (PTSD). We then expanded that ABM to incorporate violence-perpetration dynamics and neighborhood effects.

PTSD is a common and debilitating condition that affects approximately 3.5% of US adults in a given year (18). Population prevention strategies for PTSD have the potential to focus on distal or proximal risk factors (19, 20). Distal risk factors such as poverty are strongly correlated with PTSD incidence, both because poverty increases exposure to traumatic events and because wealth can buffer individuals from the consequences of experiencing a traumatic event (21), and so intervening to prevent poverty may prevent PTSD. Alternately, preventing exposure to traumatic events themselves—for example, by directly preventing violent victimization—may also prevent PTSD cases. Policy makers considering violence-reduction or poverty-prevention programs can benefit from quantitative estimates of the impacts of such programs on outcomes such as PTSD. To account for complexities within social systems when estimating such impacts, researchers use modeling strategies such as agent-based modeling or g-computation.

Our work builds on prior work comparing effect transportability using g-computation and ABM (13) by focusing on the impact of modeling choices. We first modeled prevention of interpersonal violence, defined as the intentional use of physical force to harm another person, representing the direct cause of some PTSD incidence. We next modeled a reduction of poverty levels among study participants, representing a modifiable social condition that might influence PTSD risk indirectly (22). We explored these interventions in a g-computation model as well as a series of ABMs with and without accounting for agent interactions and neighborhood effects. Our work thus serves not only as an exploration of 2 potential PTSD-prevention interventions, but also as a case study of the modeling and analytical issues that arise when using methods such as g-computation and agent-based modeling to answer a social epidemiologic question using survey data with complex analytical techniques.

METHODS

Subjects and setting

We used data from a prospective, population-based cohort study of the adult population of the New York metropolitan area, including New York City itself and 14 surrounding counties. The survey was conducted to assess the mental health of the New York metropolitan area population 6 months following the World Trade Center (WTC) disaster on September 11, 2001 (23). Study recruitment, which began with contact through an area-probability random-digit dial procedure wherein 1 adult (age 18 or older) member of each successfully contacted household was randomly selected for the study, has been described in detail elsewhere (24). The baseline cooperation rate was 56%.

Respondents were surveyed at baseline (about 6 months after the WTC disaster), and 6, 18, and 33 months after baseline, for 4 total waves. Subjects who were never reinterviewed after baseline were excluded from this analysis (n = 470; 17%), leaving 2,282 participants who completed at least 1 follow-up survey. Phone interviews were conducted with computer assistance by trained interviewers in one of 4 languages (English, Spanish, Mandarin, and Cantonese) using translated and back-translated questionnaires. Informed consent was obtained verbally at each interview. The New York Academy of Medicine institutional review board approved the data collection protocol.

Measures

At baseline, study participants self-reported age, race and ethnicity (White, Asian, Black, Hispanic, or other), sex, marital status (married or unmarried couple; divorced, separated, or widowed; or never married), educational attainment (graduate degree; college degree; some college; high school graduate or General Educational Development equivalent; or less than high school), household income ($: <50,000; 50,000–99,999; or ≥100,000), frequency of drinking alcohol (number of past 30 days with any alcohol consumption and average number of drinks per day on which drinking occurred), both lifetime and past-year history of traumatic events including violent victimization, and history of stressful life events (death in the family, change in marital status, problems at work, etc.) in the 12 months prior to the WTC disaster. We considered the following traumatic events to constitute violent victimization: being the target of an attack or the threat of an attack with a weapon; being the target of an attack with the intent to seriously injure or kill without a weapon; forced sexual contact; being in any other situation inciting fear of being seriously injured or killed; and witnessing someone else being seriously injured or killed. At each follow-up interview, participants reported their exposure to traumatic and stressful life events since the previous interview as well as any changes in household income, marital status, and alcohol use.

At each wave including baseline, interviewers assessed PTSD symptoms using a module adapted from the National Women’s Study (25). Consistent with prior work and Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, criteria, we considered the presence of at least 1 reexperiencing symptom, at least 3 avoidance symptoms, and at least 2 arousal symptoms to constitute PTSD (23). Because the WTC disaster occurred prior to baseline, we focused on PTSD unrelated to the WTC disaster.
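The symptom-count rule described above can be written as a small predicate. This is a minimal sketch: the function name and the integer symptom-count arguments are illustrative assumptions, not the survey module or the study's code.

```python
def meets_ptsd_criteria(reexperiencing: int, avoidance: int, arousal: int) -> bool:
    """Apply the thresholds stated in the text: at least 1 reexperiencing symptom,
    at least 3 avoidance symptoms, and at least 2 arousal symptoms.
    Argument names are hypothetical counts of endorsed symptoms."""
    return reexperiencing >= 1 and avoidance >= 3 and arousal >= 2


# Example: one symptom cluster below threshold means the criteria are not met.
example_case = meets_ptsd_criteria(reexperiencing=2, avoidance=3, arousal=2)
example_noncase = meets_ptsd_criteria(reexperiencing=2, avoidance=2, arousal=4)
```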

Missing data

As is typical in longitudinal surveys, we were missing data both due to survey nonresponse and due to item nonresponse among those who completed surveys. Survey nonresponse ranged from approximately 15% to approximately 30% at each wave. We accounted for survey nonresponse using inverse probability of observation weights among subjects who did complete the survey for that wave, using age, sex, race/ethnicity, baseline marital status, education, baseline income, baseline traumatic event exposure, baseline stressor exposure, baseline drinking status, and lifetime history of PTSD and violent victimization to construct the weights.
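As a rough illustration of inverse probability of observation weighting, the sketch below estimates each respondent's probability of being observed from the response rate within a covariate stratum and weights observed respondents by its inverse. The stratified estimator, the record layout, and the names `observation_weights` and `observed` are assumptions made for illustration; the text does not state the model form used to construct the study's weights, which drew on many more covariates.

```python
from collections import defaultdict


def observation_weights(records, strata_keys):
    """Return one weight per record: 1 / P(observed | stratum) for observed
    records and 0.0 for unobserved ones. `records` is a list of dicts that
    each carry a boolean 'observed' flag (hypothetical layout)."""
    totals = defaultdict(int)
    responders = defaultdict(int)
    for record in records:
        stratum = tuple(record[key] for key in strata_keys)
        totals[stratum] += 1
        if record["observed"]:
            responders[stratum] += 1

    weights = []
    for record in records:
        stratum = tuple(record[key] for key in strata_keys)
        if record["observed"]:
            # An observed record guarantees at least one responder in its stratum.
            weights.append(totals[stratum] / responders[stratum])
        else:
            weights.append(0.0)
    return weights


# Toy data (not study data): half of the women and all of the men responded.
toy_wave = [
    {"sex": "F", "observed": True},
    {"sex": "F", "observed": False},
    {"sex": "M", "observed": True},
    {"sex": "M", "observed": True},
]
toy_weights = observation_weights(toy_wave, ["sex"])
```

In this toy example the observed woman stands in for two people, so the weights sum to the full sample size.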

Within survey responses, item nonresponse was low. All covariates for which data were missing were imputed using k-nearest neighbors (KNN) imputation (26) implemented in R, version 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria). The KNN algorithm used all other available covariates for imputation. We chose to bootstrap before imputing in order to focus on accurate point estimates; however, some empirical evidence suggests that deferring resampling until after imputation results in very similar estimates (27) at substantially lower computational costs.
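The ordering described above (resample first, then impute within each resample) can be sketched as follows. This is a generic k-nearest neighbors imputer written for illustration, assuming numeric covariates with `None` marking missing values; it is not the R implementation the study used, and the function names are hypothetical.

```python
import random


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column among the k nearest donor
    rows. Distance is the mean squared difference over columns observed in both
    rows; columns are not rescaled, which a real analysis would likely need."""
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return sum(shared) / len(shared) if shared else float("inf")

    completed = []
    for row in rows:
        filled = list(row)
        for column, value in enumerate(row):
            if value is None:
                donors = [other for other in rows
                          if other is not row and other[column] is not None]
                donors.sort(key=lambda other: distance(row, other))
                nearest = donors[:k]
                if nearest:  # stays None if no donor has this column observed
                    filled[column] = sum(d[column] for d in nearest) / len(nearest)
        completed.append(filled)
    return completed


def bootstrap_then_impute(rows, rng, k=3):
    """Draw one bootstrap resample with replacement, then impute within it."""
    resample = [rng.choice(rows) for _ in rows]
    return knn_impute(resample, k=k)


# Toy data (not study data): the third row is missing its second covariate.
toy_rows = [[1.0, 10.0], [2.0, 20.0], [1.1, None], [9.0, 90.0]]
imputed_k1 = knn_impute(toy_rows, k=1)
imputed_k2 = knn_impute(toy_rows, k=2)
one_replicate = bootstrap_then_impute(toy_rows, random.Random(0), k=1)
```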

Statistical analysis

We used 2 approaches to estimate population-level determinants of PTSD. The first approach used parametric g-computation to estimate the average causal effect of our hypothetical interventions in our observed population. Parametric g-computation involves fitting a series of parametric models that predict not only the primary outcome of interest but also any time-varying covariates affected by prior exposures. Using these models, the observed baseline data, and any hypothetical interventions to reset parameters, the g-computation algorithm sequentially simulates observations at each time point for each time-varying variable. The difference between estimates of the outcome of interest under a hypothetical intervention and the outcome of interest under the “natural course” intervention—wherein all variables supplied to the algorithm take on their observed values—constitutes the estimated causal effect of the hypothetical intervention.
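The logic of parametric g-computation can be illustrated with a deliberately small simulation. The sketch below assumes one time-varying exposure (violent victimization), one time-fixed covariate, and logistic models whose coefficients are invented for illustration; none of the numbers come from the study, and the natural course is approximated here by drawing exposure from its model rather than by using observed values.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical coefficients standing in for parametric models fit to cohort data.
VICTIM_COEF = {"intercept": -3.0, "low_income": 0.5, "prior_victim": 1.0}
PTSD_COEF = {"intercept": -4.0, "low_income": 0.4, "victim": 1.2}


def simulate_risk(baseline, n_waves, seed, intervene=None):
    """Simulate each person forward and return the share who ever develop PTSD.
    `intervene`, if given, maps the simulated exposure to its intervened value."""
    rng = random.Random(seed)
    cases = 0
    for person in baseline:
        prior_victim = person["prior_victim"]
        ever_ptsd = False
        for _ in range(n_waves):
            # Step 1: simulate the time-varying exposure from past covariates.
            p_victim = expit(VICTIM_COEF["intercept"]
                             + VICTIM_COEF["low_income"] * person["low_income"]
                             + VICTIM_COEF["prior_victim"] * prior_victim)
            victim = 1 if rng.random() < p_victim else 0
            if intervene is not None:
                victim = intervene(victim)
            # Step 2: simulate the outcome given the (possibly intervened) exposure.
            p_ptsd = expit(PTSD_COEF["intercept"]
                           + PTSD_COEF["low_income"] * person["low_income"]
                           + PTSD_COEF["victim"] * victim)
            if rng.random() < p_ptsd:
                ever_ptsd = True
            prior_victim = max(prior_victim, victim)
        cases += int(ever_ptsd)
    return cases / len(baseline)


# Toy baseline population (not study data).
_population_rng = random.Random(0)
baseline = [{"low_income": int(_population_rng.random() < 0.5),
             "prior_victim": int(_population_rng.random() < 0.1)}
            for _ in range(5000)]

natural = simulate_risk(baseline, n_waves=3, seed=1)
no_violence = simulate_risk(baseline, n_waves=3, seed=1, intervene=lambda v: 0)
risk_difference = natural - no_violence
```

Reusing the same seed for both scenarios is a variance-reduction choice made for this sketch (common random numbers); the text does not say whether the study did the same.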

Our second approach was to use the same parametric models to build an ABM that could simulate the same counterfactuals. ABM building consists of specifying a population of agents (in this case, people), a set of attributes of those agents (in this case, characteristics reflecting subject covariates), and a set of rules through which agents interact, including updating those attributes. When the set of rules that update the attributes are parametric models fitted to a particular data set, the ABM is mathematically equivalent to g-computation (8). After developing an ABM that simulated each agent’s updates using the parameters derived from the parametric models included within our g-computation approach, we extended that ABM to explore whether incorporating agent interactions and neighborhood effects provided additional insight into mechanisms shaping the population distribution of violence and PTSD.

Model specification

Our analysis followed the conventional g-computation algorithm (28), such that we modeled the distribution of each time-varying variable conditional on each other time-varying variable. Our update order was violent victimization, marital status, drinking status, traumatic events, stressful events, and income—that is, for each wave, we first simulated violent victimization status from time-fixed covariates and past-wave time-varying covariates. We then used the simulated value for violent victimization status when simulating marital status, and so on. The prevalence of PTSD at the end of the final simulation wave thus represented the outcome expected under a hypothetical intervention (as specified below). A diagram of this causal model is displayed in Web Figure 1 (available at https://doi.org/10.1093/aje/kwab219), and an overall walk-through of the g-computation algorithm in conjunction with imputation is documented in Web Appendix 1. We used different parametric models for each wave to account for differing elapsed calendar time between waves and different wording for traumatic life-event and life-stressor questions at baseline, and used nonparametric bootstrapping to compute confidence intervals (since influence function–based confidence intervals are not valid for g-computation (29)). As part of our model calibration, we compared the difference between the observed rate of PTSD in the original data and the (estimated) rate of PTSD under the natural-course intervention. Following prior work, we considered a 95% confidence interval for the difference including zero to constitute evidence that model calibration was adequate (10).
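The within-wave update order can be sketched as a loop in which each variable's model sees the values already simulated earlier in the same wave. For brevity every variable is treated as binary and the models are hypothetical stand-ins that also record what they were shown; in the study, marital status and income are categorical and the models were fit to the cohort data.

```python
import random

# Update order described in the text; PTSD status is updated after these.
UPDATE_ORDER = [
    "violent_victimization",
    "marital_status",
    "drinking_status",
    "traumatic_events",
    "stressful_events",
    "income",
]


def update_wave(fixed, previous, models, rng):
    """Simulate one wave. Each model receives the time-fixed covariates, the
    previous wave's values, and the current-wave values simulated so far."""
    current = {}
    for name in UPDATE_ORDER:
        probability = models[name](fixed, previous, current)
        current[name] = 1 if rng.random() < probability else 0
    return current


# Hypothetical stand-in models that record which current-wave variables were visible.
visible = {}


def make_recording_model(name):
    def model(fixed, previous, current):
        visible[name] = list(current)
        return 0.1 + 0.05 * sum(current.values())
    return model


models = {name: make_recording_model(name) for name in UPDATE_ORDER}
previous_wave = {name: 0 for name in UPDATE_ORDER}
simulated_wave = update_wave({"age": 40}, previous_wave, models, random.Random(0))
```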

The baseline ABM used a similar approach—we simulated 806,203 agents representing 10% of the adult population of New York City (we chose 10% as a balance between the computational cost of simulating the full population and the between-simulation variability of a smaller proportion) and then updated each agent’s violent victimization status, marital status, drinking status, traumatic events history, stressful events history, income, and PTSD status, in that order. We added 2 extensions to the ABM to explicitly model mechanisms of violence and space: First, we modified the victimization code such that agents were embedded in a 400 × 625 grid roughly resembling New York City (so each grid cell represents a square of ground roughly 50 m × 50 m) and allowed victimization only when a potential perpetrator was in proximity (within a 20-cell radius) of a potential victim, directly modeling social interactions that could affect violent victimization risk. To enable this mechanism, which has been used in prior ABMs simulating violence in the NYC population (15, 30, 31), each agent had a probability of becoming a potential perpetrator determined as a function of sex, age, income, education, and history of violence exposure as drawn from the National Epidemiologic Survey on Alcohol and Related Conditions (32). Second, again drawing on prior New York City ABMs (15, 30, 31), we defined 59 neighborhood areas representing New York City community districts and assigned each agent to a neighborhood. We then assigned several characteristics to each community district, taken from 2000 US Census data: 1) proportion unemployed, 2) proportion foreign-born, 3) proportion in managerial or professional occupations, and 4) proportion female-headed households. We refitted models for updating time-varying covariates and PTSD to include these neighborhood factors. In contrast to prior models, for simplicity, we did not allow agents to move between neighborhoods. 
Violence between agents was allowed to arise from both individual and neighborhood characteristics, with neighborhood predictors selected on the basis of previous studies; neighborhood associations with violence were weighted to contribute 10% of the agents’ probabilities of victimization and perpetration. No quantitative estimates were available to parameterize this partitioning of risk, so we picked 10% to be consistent with previous ABM studies of neighborhoods and violence (30).
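The two extensions can be illustrated with a pair of helper functions. This sketch assumes that the 10% neighborhood contribution is a linear blend of an individual-level and a neighborhood-level probability, and that proximity is measured as Euclidean distance in grid cells; both are one possible reading of the description, not the specification of the published model, and all names are hypothetical.

```python
import math

NEIGHBORHOOD_WEIGHT = 0.10   # share of risk attributed to neighborhood factors
PROXIMITY_RADIUS = 20        # grid cells


def blended_probability(p_individual, p_neighborhood, weight=NEIGHBORHOOD_WEIGHT):
    """Combine individual and neighborhood probabilities as a weighted average
    (assumed functional form)."""
    return (1.0 - weight) * p_individual + weight * p_neighborhood


def within_radius(cell_a, cell_b, radius=PROXIMITY_RADIUS):
    """True when two (x, y) grid cells are within `radius` cells of each other."""
    return math.hypot(cell_a[0] - cell_b[0], cell_a[1] - cell_b[1]) <= radius


def can_be_victimized(victim_cell, perpetrator_cells):
    """Victimization is possible only if some potential perpetrator is nearby."""
    return any(within_radius(victim_cell, cell) for cell in perpetrator_cells)


# Toy example: an agent at (100, 100) with one nearby potential perpetrator.
example_probability = blended_probability(p_individual=0.2, p_neighborhood=0.6)
example_exposed = can_be_victimized((100, 100), [(300, 300), (112, 116)])
```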

Incorporating this spatial mechanism into the ABM allowed us to explore the implications of neighborhood crime clustering in the model. For example, because history of violence exposure affects each agent’s risk for victimization, the cumulative impact of a violence-prevention intervention could result in a greater PTSD-reduction impact in locations where more potential victims were present at baseline. However, such mechanistic explorations, which are typically a component of ABMs, come at the cost of the strong assumption that we specified the neighborhood contributions to risk correctly.

Validation, calibration, and the natural course

Uncertainty surrounding mechanistic relationships leads to misspecification, and such uncertainty is often amplified in social epidemiology models, either because the social process is not fully understood or because important variables are mismeasured or missing. ABM building therefore typically includes validation and calibration steps (in which selected model parameters are adjusted to ensure that simulated macro-scale outcomes show face validity) as a part of the modeling process (33). G-computation typically includes a validation step wherein an analyst tests that the natural-course outcome (i.e., the potential outcomes predicted by the model when the pattern of exposure matches what was observed in reality) is comparable to the actually observed outcome (10), but analysts do not typically adjust parameters in response to this step. To avoid developing a calibration step for g-computation while ensuring direct comparison between our g-computation and our ABM, we did not include a parameter adjustment step in any models.

However, for both modeling approaches, we performed simulations in which each simulated participant’s income group remained what it was in the observed data and each participant was exposed to violence in a given wave if, in the observed data, the participant was exposed to violence in that wave. These simulations represent the “natural course,” which provides a baseline against which interventions can be compared. We compared natural-course estimates against observed PTSD prevalence as a loose test for misspecification (10).
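The natural-course check can be sketched as a percentile bootstrap of the difference between observed and model-predicted prevalence, flagging possible misspecification when the interval excludes zero. This is a simplified illustration: it holds each person's predicted probability fixed across resamples, whereas a full analysis would presumably refit the models within each bootstrap sample, and the function names and toy data are assumptions.

```python
import random


def percentile(sorted_values, q):
    """Value at quantile q of an already sorted list (nearest index)."""
    index = int(round(q * (len(sorted_values) - 1)))
    return sorted_values[max(0, min(len(sorted_values) - 1, index))]


def calibration_interval(observed, predicted, n_boot, rng):
    """95% percentile bootstrap interval for observed minus predicted prevalence.
    `observed` holds 0/1 outcomes; `predicted` holds natural-course probabilities."""
    n = len(observed)
    differences = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        observed_rate = sum(observed[i] for i in sample) / n
        predicted_rate = sum(predicted[i] for i in sample) / n
        differences.append(observed_rate - predicted_rate)
    differences.sort()
    return percentile(differences, 0.025), percentile(differences, 0.975)


def calibration_adequate(interval):
    """Adequate, by the rule in the text, when the interval includes zero."""
    lower, upper = interval
    return lower <= 0.0 <= upper


# Toy data (not study data): outcomes drawn from the predicted probabilities.
_toy_rng = random.Random(0)
toy_predicted = [0.15] * 2000
toy_observed = [int(_toy_rng.random() < p) for p in toy_predicted]
toy_interval = calibration_interval(toy_observed, toy_predicted, 500, random.Random(1))
```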

Hypothetical interventions

Both ABM and g-computation approaches require hypothetical interventions to be specified explicitly (34). Our interventions affected violence or income directly and had no effects that were not mediated by the variable intervened on (35), an unrealistic but simplifying assumption. All interventions were imposed at baseline and remained in place through 4 waves of simulation, representing the years from 2002 through 2006.

We assessed 2 hypothetical violence-reduction interventions: one that reduced violence exposure subsequent to baseline by 37.2% and one that removed all violence in that period. This initial reduction percentage matches the reduction in violent crime reported in New York City between 2001 and 2016 (36) and can be conceptualized as what might have happened had the observed reduction between the WTC disaster and the present occurred all at once in 2001. We chose the 100% reduction to represent an estimate of the PTSD that would be eliminated in a hypothetical world without violence (i.e., the numerator of the population attributable fraction) (37).
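One way to impose a proportional violence reduction in a simulation is to prevent each simulated victimization independently with probability equal to the reduction. The sketch below takes that approach as an assumption; the text does not say whether the study thinned simulated events in this way or rescaled victimization probabilities instead, and the function name is hypothetical.

```python
import random


def reduce_violence(victim_flags, reduction, rng):
    """Return a copy of 0/1 victimization flags in which each event is
    independently prevented with probability `reduction`."""
    return [1 if flag and rng.random() >= reduction else 0 for flag in victim_flags]


# Toy data (not study data): 1 marks a simulated victimization in a wave.
toy_flags = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
partial = reduce_violence(toy_flags, 0.372, random.Random(0))
complete = reduce_violence(toy_flags, 1.0, random.Random(0))
unchanged = reduce_violence(toy_flags, 0.0, random.Random(0))
```

Because `random()` returns values in the half-open interval from 0 up to but not including 1, a reduction of 1.0 removes every event and a reduction of 0.0 removes none.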

In contrast to the more proximal violence intervention, we also assessed a hypothetical intervention on income. We assumed that a random 20% of study participants whose income was in the lowest group (less than $50,000) at baseline moved to the second group ($50,000–$99,999) throughout follow-up. We selected 20% to represent redistributing the roughly $2.1 billion/year increase in the New York City Police Budget between 2001 and 2016 (38, 39) as $20,000/year income supplements, which would support approximately 100,000 households.
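The household figure follows from dividing the budget increase by the supplement size, and the intervention itself amounts to moving a random fifth of the lowest income group up one group. The sketch below works through both, assuming income groups coded 0, 1, and 2 (lowest to highest); the coding and function name are illustrative.

```python
import random

BUDGET_INCREASE_PER_YEAR = 2_100_000_000   # roughly $2.1 billion/year, from the text
SUPPLEMENT_PER_HOUSEHOLD = 20_000          # $20,000/year, from the text

# 2.1 billion / 20,000 = 105,000, i.e., approximately 100,000 households.
households_supported = BUDGET_INCREASE_PER_YEAR // SUPPLEMENT_PER_HOUSEHOLD


def supplement_income(income_groups, fraction, rng):
    """Move a random `fraction` of people in the lowest group (0) to group 1."""
    lowest = [i for i, group in enumerate(income_groups) if group == 0]
    chosen = set(rng.sample(lowest, round(fraction * len(lowest))))
    return [1 if i in chosen else group for i, group in enumerate(income_groups)]


# Toy data (not study data): 10 people in the lowest group, 5 in each other group.
toy_income = [0] * 10 + [1] * 5 + [2] * 5
after_supplement = supplement_income(toy_income, 0.20, random.Random(0))
```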

Software

The g-computation analyses were conducted in R for Windows, version 3.5.1 (The R Project for Statistical Computing, Vienna, Austria). Practical hurdles in g-computation implementation are documented in Web Appendix 2. The ABM was developed using Recursive Porous Agent Simulation Toolkit for Java (RepastJ, version 3.0; University of Chicago, Chicago, Illinois) and Java Standard Edition (JavaSE, version 1.8; Oracle, Redwood Shores, California). Code for these analyses is available at GitHub (40).

RESULTS

Table 2 displays selected characteristics of the study sample by PTSD history prior to imputation. Of the 2,282 respondents in the analytical sample, 82% (n = 1,865) reported no history of PTSD at baseline. Respondents with a history of PTSD had lower income and were more likely to report a history of violent victimization, female sex, and Hispanic ethnicity. At baseline, 9.8% of respondents reported ever having been the victim of a violent event. During the course of follow-up, approximately 5% of subjects reported violent victimization for the first time. Onset of violent victimization declined over the course of the study even as time between waves increased (Table 3). New onset of PTSD declined as well; a total of 16% of the cohort (n = 263) reported new onset of PTSD during at least 1 wave of follow-up among those who had reported no history of PTSD at baseline.

Table 2.

Selected Characteristics at Baseline for the Participants in the World Trade Center Study (n = 2,282), New York, New York, 2002–2006

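| Characteristic | Total (n = 2,282), No. | Total, % | No History of PTSD (n = 1,865), No. | No History of PTSD, % | History of PTSD (n = 417), No. | History of PTSD, % |
| --- | --- | --- | --- | --- | --- | --- |
| Sex | | | | | | |
| Male | 1,034 | 45.3 | 874 | 46.9 | 160 | 38.4 |
| Female | 1,248 | 54.7 | 991 | 53.1 | 257 | 61.6 |
| Race | | | | | | |
| White | 1,363 | 59.7 | 1,133 | 61.7 | 230 | 55.7 |
| Black | 317 | 13.9 | 257 | 14.0 | 60 | 14.5 |
| Hispanic | 364 | 16.0 | 276 | 15.0 | 88 | 21.3 |
| Other | 204 | 9.1 | 169 | 9.2 | 35 | 8.5 |
| Age, years^a | 44.7 (16.2) | | 45.3 (16.6) | | 42.4 (13.8) | |
| Income, $ | | | | | | |
| <50,000 | 924 | 48.5 | 726 | 47.1 | 198 | 54.4 |
| 50,000–99,999 | 569 | 29.9 | 462 | 30.0 | 107 | 29.4 |
| ≥100,000 | 412 | 21.6 | 353 | 22.9 | 59 | 16.2 |
| Alcohol consumption | | | | | | |
| Abstinent | 1,139 | 51.3 | 938 | 51.7 | 201 | 49.6 |
| Light drinker | 834 | 37.6 | 689 | 38.0 | 145 | 35.8 |
| Heavy drinker | 247 | 11.1 | 188 | 10.4 | 59 | 14.6 |
| Any lifetime violent victimization | 118 | 10.5 | 81 | 9.7 | 37 | 12.8 |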

Abbreviation: PTSD, posttraumatic stress disorder.

a Values are expressed as mean (standard deviation).

Table 3.

Victimization and New-Onset of Posttraumatic Stress Disorder Unrelated to the World Trade Center Disaster Since Baseline at Each Wave for Participants With No Prior History of Posttraumatic Stress Disorder in the World Trade Center Study Using Imputed Data

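| Outcome | Wave 1 (n = 2,282), No. | Wave 1, % | Wave 2 (n = 1,939), No. | Wave 2, % | Wave 3 (n = 1,832), No. | Wave 3, % | Wave 4 (n = 1,610), No. | Wave 4, % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Violent victimization^a | 224 | 9.8 | 70 | 3.6 | 53 | 2.9 | 45 | 2.8 |
| New onset of PTSD | 57 | 2.5 | 88 | 4.5 | 120 | 6.6 | 120 | 7.5 |
| New onset of PTSD (cumulative) | 57 | 2.5 | 119 | 6.1 | 217 | 11.8 | 263 | 16.3 |

Abbreviation: PTSD, posttraumatic stress disorder.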

a Wave 1 figures for violent victimization represent reported lifetime history of victimization. Based on data from New York, New York, 2002.

Under the “natural course” intervention, the g-computation analysis estimated a prevalence of new onset of non-WTC PTSD slightly lower than the observed prevalence (15.1%, 95% confidence interval (CI): 14.4, 18.5, compared with 16.3% in observed data). In this analysis, all 3 hypothetical interventions reduced PTSD incidence (Table 4). The violence-reduction interventions were somewhat less effective than the income intervention. Specifically, in g-computation models, the 37.2% reduction in violence reduced the estimated PTSD incidence rate by 0.12 percentage points (95% CI: −0.16, 0.29), preventing approximately 600 new PTSD cases in the simulated population of 500,000 adults. The 100% reduction in violence reduced estimated PTSD incidence by 0.28 percentage points (95% CI: −0.30, 0.70), or approximately 1,400 new PTSD cases. By contrast, increasing income for 20% of those in the lowest income bracket prevented non-WTC PTSD in 1.6% of the simulated cohort (95% CI: 0.40, 2.12), reducing estimated incident PTSD cases by approximately 6,000. Coefficients for all the models are shown in Web Tables 1–7.

Table 4.

Risk of New Onset Posttraumatic Stress Disorder Under 3 Hypothetical Scenarios as Estimated Using G-Computation and Agent-Based Modeling^a

| Model and Simulation Scenario | Absolute Risk, % | 95% CI | RD, % | 95% CI |
| --- | --- | --- | --- | --- |
| G-computation | | | | |
| Natural course | 15.1 | 14.4, 18.5 | | |
| Reduce violence 37.2% | 15.0 | 14.4, 18.4 | −0.12 | −0.29, 0.16 |
| Reduce violence 100% | 14.9 | 14.2, 18.3 | −0.28 | −0.70, 0.30 |
| Increase income for the lowest quintile | 13.6 | 13.1, 17.3 | −1.55 | −2.12, −0.40 |
| Agent-based model | | | | |
| Natural course | 14.9 | 14.4, 15.5 | | |
| Reduce violence 37.2% | 14.8 | 14.3, 15.5 | −0.07 | −0.10, 0.05 |
| Reduce violence 100% | 14.7 | 14.1, 15.3 | −0.26 | −0.34, −0.15 |
| Increase income for the lowest quintile | 13.4 | 12.9, 14.1 | −1.53 | −1.65, −1.41 |
| Agent-based model with perpetration | | | | |
| Natural course | 14.9 | 14.3, 15.5 | | |
| Reduce violence 37.2% | 14.9 | 14.2, 15.4 | 0.00 | −0.09, 0.10 |
| Reduce violence 100% | 14.6 | 14.1, 15.3 | −0.25 | −0.32, −0.09 |
| Increase income for the lowest quintile | 13.4 | 12.9, 14.1 | −1.47 | −1.54, −1.31 |
| Agent-based model with perpetration and neighborhood effects | | | | |
| Natural course | 15.0 | 14.5, 15.8 | | |
| Reduce violence 37.2% | 15.0 | 14.5, 15.8 | 0.00 | −0.11, 0.09 |
| Reduce violence 100% | 14.7 | 14.1, 15.3 | −0.36 | −0.47, −0.26 |
| Increase income for the lowest quintile | 13.5 | 12.8, 14.0 | −1.58 | −1.75, −1.51 |

Abbreviations: CI, confidence interval; RD, risk difference.

a Simulated population of 500,000 adults representative of the New York City metropolitan area, 2002–2006.

As expected, results from the basic ABM replicated the g-computation algorithm results nearly exactly (Figure 1), with the partial reduction in violence decreasing estimated PTSD prevalence by 0.07 percentage points (95% CI: −0.05, 0.10) and the complete removal of violence decreasing estimated prevalence by 0.26 percentage points (95% CI: 0.15, 0.34). Adding a violence-perpetration mechanism to the ABM only modestly affected the estimated average impact of the interventions (Table 4), whereas adding the neighborhood mechanism made the impact of complete violence reduction somewhat more substantial (risk reduction with the neighborhood mechanism was 0.36 percentage points (95% CI: 0.26, 0.47) compared with 0.25 percentage points (95% CI: 0.09, 0.32) without it). This is likely because the neighborhood mechanism enabled the model to account for clustering of perpetrators and victims, effectively targeting the violence-reduction intervention to the population most likely to develop PTSD and allowing the impact of violence reduction to be enhanced over the course of the simulation. The impact of the violence-perpetration mechanism on the magnitude of the income intervention effect was negligible, but the violence-perpetration mechanism and neighborhood effects together resulted in a slightly greater effect size (1.58-percentage-point change, compared with 1.53), again potentially due to the intervention disproportionately protecting agents clustered near potential violence perpetrators.

Figure 1.

Prevalence of posttraumatic stress disorder (PTSD) under specified interventions, stratified by simulation type. A) The natural course (no intervention); B) 37% violence reduction; C) 100% violence reduction; D) increasing income for those in the lowest income quintile. ABM, agent-based model.

DISCUSSION

We used g-computation and agent-based modeling with population-based survey data to explore the reductions in the risk of PTSD that might arise as a result of several potential interventions: 2 to reduce violence and one to increase income. Our results indicate that all interventions could modestly decrease incident PTSD, although the impact of the intervention increasing income was larger. Results from the g-computation model and the ABM without dynamic mechanisms were comparable. Adding a violence-perpetration mechanism and neighborhood effects to the ABM led to modest increases in the effectiveness of the simulated full violence-reduction and income-supplementation interventions.

The magnitude of the violence intervention’s effect was modest in comparison with that of the income increase. Prior work suggests that interventions that affect multiple pathways toward an outcome, such as those that are earlier in the life course or broader in scope, may be more impactful than targeted interventions (31), in part because causes such as socioeconomic status affect multiple risk factors that may interact to affect health (21). Alternatively, it may be that most of the PTSD in this population occurred due to nonviolent traumatic events that are related to low incomes. While we cannot verify this speculation because the survey items did not distinguish violence-related from violence-unrelated PTSD, prior research has shown that violence-related events are common and associated with a higher increased risk of PTSD than nonviolent events (41–43). This is consistent with our finding that intervening on income, a cause that has both violent and nonviolent pathways and mechanisms to influence PTSD through the life course, achieved a greater PTSD reduction benefit than intervening on violence, a more proximal cause of PTSD in our model.

Adding a violence-perpetration mechanism alone did not affect intervention effectiveness, but adding both a violence-perpetration and a neighborhood-effect mechanism to the ABM modestly increased the effectiveness of both violence-reduction and income-supplementation interventions. These changes in estimated effect are consistent with the notion that accounting for spillover effects may be important for social exposures such as violent victimization (44). In particular, the perpetration mechanism alone may not have affected PTSD incidence because prevented violence was not clustered in areas where agents were already at elevated risk, whereas including perpetration and neighborhood together did cluster reductions. While we do not have a gold standard to validate against and cannot conclude that the ABM results match what would be found in the real world, the potential for the ABM to both replicate the g-computation finding and shed light on mechanisms through which interventions operate is intriguing.

More broadly, our investigation highlights that there can be substantial overlap between g-computation and ABM computational approaches. Both use statistical models to simulate hypothetical interventions. Some features typically considered part of the complex systems domain, including changes to group-level variables emerging from individual-level changes, can be modeled statistically and thus could be incorporated into g-computation analyses. Formally exploring such models within a g-computation framework, to capture emergent phenomena while retaining statistical robustness, would be a promising way to address social epidemiology questions.
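As a purely illustrative sketch of this overlap (not the models fitted in this study), the Python code below applies a nonparametric g-computation and an ABM with no between-agent interactions to the same synthetic cohort. Everything in it is an assumption made for the example: the 3 binary variables, the probabilities used to generate the data, the stratified outcome model, and all function names. Because both approaches draw on the same outcome model and the agents do not interact, the 2 estimates should differ only by Monte Carlo error.

```python
import random

def simulate_cohort(n, rng):
    """Synthetic cohort (all probabilities are made up for illustration).
    Each person is (low_income, violence, ptsd)."""
    cohort = []
    for _ in range(n):
        low_income = rng.random() < 0.20
        violence = rng.random() < (0.30 if low_income else 0.10)
        p_ptsd = 0.05 + 0.10 * low_income + 0.15 * violence
        cohort.append((low_income, violence, rng.random() < p_ptsd))
    return cohort

def fit_outcome_model(cohort):
    """Nonparametric outcome model: observed P(PTSD | income, violence)."""
    counts = {}
    for low_income, violence, ptsd in cohort:
        n, cases = counts.get((low_income, violence), (0, 0))
        counts[(low_income, violence)] = (n + 1, cases + ptsd)
    return {k: cases / n for k, (n, cases) in counts.items()}

def g_computation(cohort, model, intervene):
    """Average predicted risk after setting each person's exposure."""
    risks = [model[(l, intervene(l, a))] for l, a, _ in cohort]
    return sum(risks) / len(risks)

def abm_no_interactions(cohort, model, intervene, rng, n_runs=100):
    """Agents copy cohort members; each agent's outcome is one Bernoulli
    draw from the same outcome model, with no between-agent dynamics."""
    cases = 0
    for _ in range(n_runs):
        for l, a, _ in cohort:
            cases += rng.random() < model[(l, intervene(l, a))]
    return cases / (n_runs * len(cohort))

def natural_course(low_income, violence):
    return violence          # leave exposure as observed

def no_violence(low_income, violence):
    return False             # hypothetical 100% violence reduction

rng = random.Random(0)
cohort = simulate_cohort(5000, rng)
model = fit_outcome_model(cohort)

g_natural = g_computation(cohort, model, natural_course)
g_intervened = g_computation(cohort, model, no_violence)
abm_intervened = abm_no_interactions(cohort, model, no_violence, rng)

print(f"g-computation, natural course: {g_natural:.4f}")
print(f"g-computation, no violence:    {g_intervened:.4f}")
print(f"ABM (no interactions):         {abm_intervened:.4f}")
```

In this toy setting the outcome model is saturated, so the natural-course g-computation estimate reproduces the observed prevalence exactly; a parametric model, such as one would need with many covariates, carries no such guarantee. Adding interaction or neighborhood rules to `abm_no_interactions` is the point at which the ABM would depart from g-computation and take on further assumptions.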

Our results should also be considered in light of several limitations. First, as discussed throughout, both the ABM and g-computation approaches are simplified models of complex social processes, and some amount of model misspecification is nearly certain in this context. This limitation is intrinsic to quantitative exploration of complex processes and not limited to our methodological approach, although both g-computation and ABMs may be more vulnerable to misspecification error than doubly-robust statistical models (45). Second, although not all study participants were directly exposed to the World Trade Center (WTC) disaster, and our outcome of interest was PTSD that was not related to the WTC, the cohort was recruited in the greater New York City area in the early 2000s and may not represent the experience of people in other times and places. Third, the methodological concerns addressed by the approaches explored here (time-varying confounding and spatial clustering of risk) likely do not represent the greatest quantitative risks to validity for this study. For example, if people affected by WTC-related PTSD were less likely to agree to participate in the study, there would be residual collider bias. Fourth, our conclusions are limited by potential violations of the consistency assumption; for example, intervening to increase income to a moderate level may have different effects than having had a sustained moderate income. This violation of the counterfactual consistency assumption (34) would apply to both the g-computation and ABM results. The target trial framework, which focuses researcher attention on counterfactual consistency, could improve ABMs (46). Fifth, the use of multiple imputation for missing item responses while resampling to implement g-computation remains an area of active research. If these methods together fail to appropriately capture statistical uncertainty, our confidence intervals may be optimistic. Sixth, the 1.2% gap between our simulated PTSD prevalence and the observed PTSD prevalence, although present in both the g-computation model and the ABM, suggests that there are real-world aspects to the development of PTSD that are not incorporated in our models. Finally, our choice to forgo model calibration implicitly assumes that differences between prevalence observed under the natural-course simulation and the truly observed prevalence of PTSD are negligible.
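To make the resampling concern concrete, the following Python sketch computes a percentile bootstrap confidence interval for a g-computation risk difference on a synthetic, fully observed cohort. It is an assumption-laden illustration rather than the procedure used in this study: the data-generating probabilities, the stratified outcome model, and the function names are invented, and the multiple-imputation step whose combination with resampling is at issue is deliberately left out.

```python
import random

def simulate_cohort(n, rng):
    """Synthetic, complete-case cohort of (low_income, violence, ptsd).
    All probabilities are made up for illustration."""
    cohort = []
    for _ in range(n):
        low_income = rng.random() < 0.20
        violence = rng.random() < (0.30 if low_income else 0.10)
        p_ptsd = 0.05 + 0.10 * low_income + 0.15 * violence
        cohort.append((low_income, violence, rng.random() < p_ptsd))
    return cohort

def risk_difference(cohort):
    """G-computation risk difference: natural course minus 'no violence'."""
    counts = {}
    for l, a, y in cohort:
        n, cases = counts.get((l, a), (0, 0))
        counts[(l, a)] = (n + 1, cases + y)
    risk = {k: cases / n for k, (n, cases) in counts.items()}
    natural = sum(risk[(l, a)] for l, a, _ in cohort) / len(cohort)
    intervened = sum(risk[(l, False)] for l, _, _ in cohort) / len(cohort)
    return natural - intervened

def percentile_bootstrap(cohort, rng, n_boot=200, alpha=0.05):
    """Refit the whole g-computation on each resample, then take
    approximate empirical percentiles of the resulting estimates."""
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(cohort, k=len(cohort))  # sample with replacement
        estimates.append(risk_difference(resample))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

rng = random.Random(1)
cohort = simulate_cohort(2000, rng)
point = risk_difference(cohort)
lower, upper = percentile_bootstrap(cohort, rng)
print(f"risk difference: {point:.4f} (95% CI: {lower:.4f}, {upper:.4f})")
```

The interval here reflects only sampling variability in a complete data set. If missing values were imputed once and then treated as observed inside this loop, the imputation uncertainty would be ignored, which is one way the resulting intervals could be too narrow.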

In conclusion, when specified equivalently, effects estimated using ABMs and g-computation were nearly identical. Substantively, we found that both violence prevention and income supplementation could prevent PTSD, with income supplementation showing greater effectiveness, albeit under sizeable assumptions. Methodologically, our results empirically demonstrate the mathematical equivalence of g-computation with a special case of agent-based modeling, while also showing how, by incorporating assumptions, ABMs may also shed light on complex effects of specific mechanisms. Ultimately, epidemiologists seeking the greater flexibility of the ABM approach must decide if this flexibility is worth the uncertainty of the more extensive assumptions it carries.

Supplementary Material

Web_Material_kwab219

ACKNOWLEDGMENTS

Affiliations: Department of Epidemiology, School of Public Health, University of Washington, Seattle, Washington, United States (Stephen J. Mooney); Harborview Injury Prevention and Research Center, University of Washington, Seattle, Washington, United States (Stephen J. Mooney); Violence Prevention Research Program, Department of Emergency Medicine, University of California, Davis, California, United States (Aaron B. Shev); Department of Epidemiology, Columbia University, New York, New York, United States (Katherine M. Keyes); Department of Epidemiology and Biostatistics, University at Albany, State University of New York, Albany, New York, United States (Melissa Tracy); and Department of Population Health, New York University, New York, New York, United States (Magdalena Cerdá).

This work was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development (grant T32HD057822), the National Library of Medicine (grant R00LM012868), the National Institute on Alcohol Abuse and Alcoholism (grant 5R21AA021909), and the National Institute on Drug Abuse (grant 5K01DA030449).

Drs. Brandon Marshall and Daniel Westreich offered helpful comments on the overall framing of this project at its outset, and Ava Hamilton assisted with agent-based model development.

Data availability: The data underlying this article were provided by the New York Academy of Medicine and are available upon request to the New York Academy of Medicine.

Conflicts of interest: none declared.

REFERENCES

  • 1. Galea S, Tracy M, Hoggatt KJ, et al. Estimated deaths attributable to social factors in the United States. Am J Public Health. 2011;101(8):1456–1465.
  • 2. Glymour MM, Osypuk TL, Rehkopf DH. Invited commentary: off-roading with social epidemiology—exploration, causation, translation. Am J Epidemiol. 2013;178(6):858–863.
  • 3. el-Sayed AM, Scarborough P, Seemann L, et al. Social network analysis and agent-based modeling in social epidemiology. Epidemiol Perspect Innov. 2012;9(1):1.
  • 4. Auchincloss AH, Diez Roux AV. A new tool for epidemiology: the usefulness of dynamic-agent models in understanding place effects on health. Am J Epidemiol. 2008;168(1):1–8.
  • 5. Diez Roux AV. Invited commentary: the virtual epidemiologist—promise and peril. Am J Epidemiol. 2015;181(2):100–102.
  • 6. Marshall BD, Galea S. Formalizing the role of agent-based modeling in causal inference and epidemiology. Am J Epidemiol. 2015;181(2):92–99.
  • 7. Naimi AI. Commentary: integrating complex systems thinking into epidemiologic research. Epidemiology. 2016;27(6):843–847.
  • 8. Hernán MA. Invited commentary: agent-based models for causal inference—reweighting data and theory in epidemiology. Am J Epidemiol. 2015;181(2):103–105.
  • 9. Robins J. A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Math Model. 1986;7(9):1393–1512.
  • 10. Lajous M, Willett WC, Robins J, et al. Changes in fish consumption in midlife and the risk of coronary heart disease in men and women. Am J Epidemiol. 2013;178(3):382–391.
  • 11. Westreich D, Cole SR, Young JG, et al. The parametric g-formula to estimate the effect of highly active antiretroviral therapy on incident AIDS or death. Stat Med. 2012;31(18):2000–2009.
  • 12. Arnold KF, Harrison WJ, Heppenstall AJ, et al. DAG-informed regression modelling, agent-based modelling and microsimulation modelling: a critical comparison of methods for causal inference. Int J Epidemiol. 2019;48(1):243–253.
  • 13. Murray EJ, Robins JM, Seage GR, et al. A comparison of agent-based models and the parametric g-formula for causal inference. Am J Epidemiol. 2017;186(2):131–142.
  • 14. Krieger N. Theories for social epidemiology in the 21st century: an ecosocial perspective. Int J Epidemiol. 2001;30(4):668–677.
  • 15. Cerdá M, Tracy M, Keyes KM, et al. To treat or to prevent?: reducing the population burden of violence-related post-traumatic stress disorder. Epidemiology. 2015;26(5):681–689.
  • 16. Tracy M, Cerdá M, Keyes KM. Agent-based modeling in public health: current applications and future directions. Annu Rev Public Health. 2018;39:77–94.
  • 17. Murray EJ, Robins JM, Seage GR III, et al. The challenges of parameterizing direct effects in individual-level simulation models. Med Decis Making. 2020;40(1):106–111.
  • 18. Gradus JL. Epidemiology of PTSD. National Center for PTSD (United States Department of Veterans Affairs). https://www.ptsd.va.gov/professional/treat/essentials/epidemiology.asp. Accessed July 29, 2021.
  • 19. Galea S. Invited commentary: continuing to loosen the constraints on epidemiology in an age of change—a comment on McMichael’s “prisoners of the proximate”. Am J Epidemiol. 2017;185(11):1217–1219.
  • 20. McMichael AJ. Prisoners of the proximate: loosening the constraints on epidemiology in an age of change. Am J Epidemiol. 1999;149(10):887–897.
  • 21. Link BG, Phelan J. Social conditions as fundamental causes of disease. J Health Soc Behav. 1995;(extra issue):80–94.
  • 22. Bonanno GA, Galea S, Bucciarelli A, et al. What predicts psychological resilience after disaster? The role of demographics, resources, and life stress. J Consult Clin Psychol. 2007;75(5):671–682.
  • 23. Galea S, Ahern J, Tracy M, et al. Longitudinal determinants of posttraumatic stress in a population-based cohort study. Epidemiology. 2008;19(1):47–54.
  • 24. Nandi A, Galea S, Tracy M, et al. Job loss, unemployment, work stress, job satisfaction, and the persistence of posttraumatic stress disorder one year after the September 11 attacks. J Occup Environ Med. 2004;46(10):1057–1064.
  • 25. Kilpatrick DG, Resnick HS, Saunders BE, et al. The national women’s study PTSD module. Unpublished instrument. Charleston, SC: Crime Victims Research and Treatment Center, Department of Psychiatry and Behavioral Sciences, Medical University of South Carolina; 1989.
  • 26. Batista GE, Monard MC. A study of K-nearest neighbour as an imputation method. Hybrid Intelligence Systems. 2002;87(251–260):48.
  • 27. Schomaker M, Heumann C. Bootstrap inference when using multiple imputation. Stat Med. 2018;37(14):2252–2266.
  • 28. Snowden JM, Rose S, Mortimer KM. Implementation of G-computation on a simulated data set: demonstration of a causal inference technique. Am J Epidemiol. 2011;173(7):731–738.
  • 29. Kreif N, Tran L, Grieve R, et al. Estimating the comparative effectiveness of feeding interventions in the pediatric intensive care unit: a demonstration of longitudinal targeted maximum likelihood estimation. Am J Epidemiol. 2017;186(12):1370–1379.
  • 30. Cerdá M, Tracy M, Keyes K. Reducing urban violence: a contrast of public health and criminal justice approaches. Epidemiology. 2018;29(1):142–150.
  • 31. Cerdá M, Tracy M, Ahern J, et al. Addressing population health and health inequalities: the role of fundamental causes. Am J Public Health. 2014;104(S4):S609–S619.
  • 32. Grant BF, Stinson FS, Dawson DA, et al. Prevalence and co-occurrence of substance use disorders and independent mood and anxiety disorders: results from the national epidemiologic survey on alcohol and related conditions. Arch Gen Psychiatry. 2004;61(8):807–816.
  • 33. Luke DA, Stamatakis KA. Systems science methods in public health: dynamics, networks, and agents. Annu Rev Public Health. 2012;33:357–376.
  • 34. Hernán MA, Taubman SL. Does obesity shorten life? The importance of well-defined interventions to answer causal questions. Int J Obes (Lond). 2008;32(S3):S8–S14.
  • 35. Westreich DJ, Mooney SJ. Number (of whom?) needed to treat (with what?): exposures, population interventions, and the NNT. Epidemiology. 2019;30(suppl 2):S55–S59.
  • 36. NYPD CompStat Unit. CompStat Report. https://www1.nyc.gov/assets/nypd/downloads/pdf/crime_statistics/cs-en-us-city.pdf. Accessed July 29, 2021.
  • 37. Rothman KJ. Causes. Am J Epidemiol. 1976;104(6):587–592.
  • 38. Eng E. New York Police Department Report on the Fiscal 2017 Executive Budget. https://council.nyc.gov/budget/wp-content/uploads/sites/54/2016/06/nypd.pdf. Accessed December 29, 2018.
  • 39. Independent Budget Office. Analysis of the Mayor’s executive budget for 2000. http://www.ibo.nyc.ny.us/iboreports/mayrepfisc2000/mayreportfy2000.html. Accessed December 29, 2018.
  • 40. Shev AB. ABM model for testing population interventions to reduce PTSD in New York City. https://github.com/abshev/ViolencePTSD_ABM. Accessed October 22, 2021.
  • 41. Kessler RC, Sonnega A, Bromet E, et al. Posttraumatic stress disorder in the National Comorbidity Survey. Arch Gen Psychiatry. 1995;52(12):1048–1060.
  • 42. McLaughlin KA, Koenen KC, Hill ED, et al. Trauma exposure and posttraumatic stress disorder in a national sample of adolescents. J Am Acad Child Adolesc Psychiatry. 2013;52(8):815–830.e14.
  • 43. Lowe SR, Joshi S, Galea S, et al. Pathways from assaultive violence to post-traumatic stress, depression, and generalized anxiety symptoms through stressful life events: longitudinal mediation models. Psychol Med. 2017;47(14):2556–2566.
  • 44. Halloran ME, Hudgens MG. Dependent happenings: a recent methodological review. Curr Epidemiol Rep. 2016;3(4):297–305.
  • 45. Kang JD, Schafer JL. Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat Sci. 2007;22(4):523–539.
  • 46. Murray EJ, Marshall BD, Buchanan AL. Emulating target trials to improve causal inference from agent-based models. Am J Epidemiol. 2021;190(8):1652–1658.

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
