Skip to main content
NIHPA Author Manuscripts logoLink to NIHPA Author Manuscripts
. Author manuscript; available in PMC: 2020 Dec 16.
Published in final edited form as: Complexity. 2020 Jun 5;2020:8923197. doi: 10.1155/2020/8923197

Multiobjective Genetic Programming Can Improve the Explanatory Capabilities of Mechanism-Based Models of Social Systems

Tuong M Vu 1, Charlotte Buckley 2, Hao Bai 2, Alexandra Nielsen 3, Charlotte Probst 4, Alan Brennan 1, Paul Shuper 4, Mark Strong 1, Robin C Purshouse 2
PMCID: PMC7743914  NIHMSID: NIHMS1612745  PMID: 33335382

Abstract

The generative approach to social science, in which agent-based simulations (or other complex systems models) are executed to reproduce a known social phenomenon, is an important tool for realist explanation. However, a generative model, when suitably calibrated and validated using empirical data, represents just one viable candidate set of entities and mechanisms. The model only partially addresses the needs of an abductive reasoning process - specifically it does not provide insight into other viable sets of entities or mechanisms, nor suggest which of these are fundamentally constitutive for the phenomenon to exist. In this paper, we propose a new model discovery framework that more fully captures the needs of realist explanation. The framework exploits the implicit ontology of an existing human-built generative model to propose and test a plurality of new candidate model structures. Genetic programming is used to automate this search process. A multi-objective approach is used, which enables multiple perspectives on the value of any particular generative model - such as goodness-of-fit, parsimony, and interpretability - to be represented simultaneously. We demonstrate this new framework using a complex systems modeling case study of change and stasis in societal alcohol use patterns in the US over the period 1980–2010. The framework is successful in identifying three competing explanations of these alcohol use patterns, using novel integrations of social role theory not previously considered by the human modeler. Practitioners in complex systems modeling should use model discovery to improve the explanatory utility of the generative approach to realist social science.

1. Introduction

Agent based simulation (ABS) is a well-established tool for understanding complex systems using the generative social science approach. The goal of generative social science is to explain and understand a social phenomenon as the result of actions of autonomous entities acting according to causal mechanisms, or rules as encoded in an agent-based model [1]. If a modeler encodes a model that produces a known empirical pattern, the so-called “generative test” is met, and the postulated mechanisms form a candidate explanation for the phenomenon. Many social phenomena may be explained by a multiplicity of theories, each of which could pass the generative test when encoded as an ABS, leaving us to wonder: which theory is correct; how can theories be combined; what is missing from our theories? Here, we propose a novel method of discovering new models and extending the explanatory capabilities of theory-driven generative models using multi-objective genetic programming - a process of knowledge discovery. Elements of a generative theory or several generative theories are codified in a common grammar, evolved through genetic programming, evaluated for empirical fit, complexity and interpretability, and interpreted by subject matter experts to bring new insight to the social phenomenon.

In this paper we set out the following aims: (1) to explain the role of complex systems models for realist explanation; (2) to define the structural calibration method: a retroductive model discovery framework; (3) to demonstrate the application of the model discovery framework to a specific mechanism-based social systems model; (4) to discuss the implications of computer aided model discovery in light of the case study results.

2. Methods

2.1. The role of abductive reasoning in mechanism-based explanation

The context for our methods is the generative [1] or mechanism-based [2] approach to the study of complex social systems. In this approach, we assume that any concrete phenomenon (which may be empirically observable to a greater or lesser degree) emerges from the dynamic interplay of real entities and mechanisms that exist independently of our ability to detect them [3]. In this context, the role of complex systems modeling is principally explanatory, in helping to gain insights into theorised entities and mechanisms by representing them in a dynamic simulation model.

Abductive reasoning plays a key role in mechanism-based explanations and can be conceived of in two parts [4]:

  • Redescription - situating the concrete phenomenon as a case which emerges from the hypothesised interacting components (i.e., entities and mechanisms) of one or more theories;

  • Retroduction - identifying which of the components in the redescription are fundamentally constitutive to the emergence of the phenomenon (i.e., entities and mechanisms whose inexistence would preclude the phenomenon).

The development of a complex systems model by a human modeler is principally a redescription activity - the modeler uses existing theory (and potentially develops new theory) to construct a set of equations and rules that, when executed as a simulation, produce emergent outcomes that are in some sense comparable to the phenomenon under investigation. As part of the model building process, the modeler defines - either explicitly or implicitly - an ontology of real entities, the agents of agent-based simulations, and mechanisms, the rules that determine action and interaction.

The generative approach commits the modeler to at least a limited form of retroduction - the simulation, as redescription, is scrutinised for its ability to reproduce the concrete phenomenon, in so far as the latter is observable in empirical data. The simulation parameters often have to be manipulated in order to achieve good similarity; in the computational modeling community, this process is known as calibration [5] and is commonly identified as belonging to best practice programs for analytical sociological research [6]. If a simulation can be calibrated successfully then the redescription it encodes is said to pass the generative sufficiency test – it remains a candidate explanation for the phenomenon [1]. However as a retroductive process, the generative approach, when applied to a single simulation model, has two fundamental shortcomings: (1) it does not allow the entities and mechanisms to be accepted as fundamentally constitutive since to do so would be to commit the fallacy of affirming the consequent; (2) neither does it allow the entities and mechanisms to be rejected as fundamentally constitutive, only that their present configuration or representation in the simulation model is non-constitutive.

Together, these limitations form the basis for many of the concerns about complex systems modeling raised within the sociological community (see, for example, [7]). We argue that the limitations arise from the focus of the modeling on a single redescription – i.e., a single ABS. To improve our explanatory capability, we need to increase the number of redescriptions considered within the overall modeling process – i.e., by building multiple ABS that either interpret a single theory in different ways, or represent multiple different theories, or both. By subjecting a plurality of redescriptions to retroduction, we can seek to identify commonalities in the theory components that survive the generative sufficiency test; we can also seek to increase the robustness of the test outcome to potential issues with the configuration or representation of a theory within the simulation. Within the context of model calibration activities, this concept is operationalized as interrogating model structure (i.e., the selection and configuration of entities and mechanisms) in addition to model parameters – we call this structural calibration. Perhaps surprisingly, given its key role in the abductive reasoning process, the complex systems modeling community has yet to pay significant attention to the issue of structural calibration. Below we review the handful of existing works on this topic.

2.2. Existing works on structural calibration of simulation models

A very small literature exists on the structural calibration (described variously as “theory discovery” [8], “model discovery” [9] and “inverse generative social science” [10]) of mechanism-based models, all of which use evolutionary computing (EC) methods to steer the search for good model structures. In the earliest known study, Smith [11] used a genetic algorithm to identify simplified representations of behavior that could reproduce the observed social assortativity of birds. Later, with a focus on reproducing observed human crowd dynamics, Zhong et al. [12] used gene expression programming to identify the structure of a reward function representing individual decision-making in a “sense-think-act” framework. Gunaratne & Garibay [8] used genetic programming to revise agents’ farm selection rules for the ‘Artificial Anasazi’ model, in order to more accurately reproduce the archeological population demography of Long House Valley, Arizona. Most interestingly, again within the context of the Artificial Anasazi, the same authors then used genetic programming to identify components for Epstein’s Agent_Zero model of human behavior as the basis of farm selection decisions [9]. Finally, in work that forms a prelude to the present study, Vu et al. [10] used multi-objective genetic programming to identify alternative situational mechanisms for a social norms model of alcohol use, aimed at both improved representation of observed drinking patterns in the USA over a 15 year period, and theoretical interpretability (operationalized as the number of terms in the situational mechanism).

There exist a number of issues and limitations with these early techniques. Whilst the studies have demonstrated success at improving the goodness of fit to empirical data, they have all been limited in scale – focusing on specific aspects of larger models. Further, the studies focusing on human behavior have also struggled with the issue of theoretical meaningfulness of the structures that have been identified. Whilst minimizing or constraining the number of terms in the candidate behavioral rules is clearly helpful at improving interpretability, it is often the case that the new structures remain challenging to interpret in terms of the original theory, with this crucial activity deferred to future work.

2.3. Proposed approach to structural calibration

Here, we describe a new framework for structural calibration and position it explicitly as a tool for realist explanation that can be used alongside more traditional approaches within the realist tradition [3]. Our approach is grounded in the recognition that the human modeler uses a creative process of redescription that results in the construction of an ontology for entities and mechanisms that may be implicated in the generation of a complex phenomenon. The starting point for our framework is this ontology. We exploit the ontology to: (a) construct new candidate redescriptions (i.e., simulation model structures) that can be realised via the ontology; (b) test the candidate redescriptions in terms of their explanatory value, where ‘value’ can be a plurality of considerations, such as empirical goodness-of-fit, structural parsimony, interpretability, and theoretical credibility.

The ontology developed by the modeler can be considered as a set of basic building blocks of entities and mechanisms. Whilst we could, as humans, use the building blocks to construct an exhaustive set of possible alternative simulation model structures (by assembling the building blocks in different ways), even a relatively small set of building blocks can result in a very large number of alternatives that cannot be practically explored by hand. An alternative is to use machine learning approaches, where we make intelligent use of computational resources to automatically search through the space of possible model structures. In the vein of the existing literature, we regard the family of evolutionary computing approaches known as genetic programming (GP) as a highly promising workhorse for structural calibration [13]. Multi-objective genetic programming is particularly beneficial because it allows researchers to evaluate candidate model structures according to a set of explicitly stated considerations of explanatory value [14]. In our present framework we concentrate on two aspects of value: (i) the ability of the model structure, with suitably calibrated parameters, to reproduce the phenomenon so far as we understand it from our beliefs and empirical data; (ii) the meaningful interpretability of the model structure in terms of theory.

2.4. Generating candidate model structures: Grammar-based Genetic Programming

Evolutionary computing is a field that applies the principles of natural evolution in computing. In EC, a population of candidates is evolved over many generations based on a fitness function. A typical process starts with a random population of candidates. The candidates with high fitness are then probabilistically chosen to breed and produce the candidates for the next generation. Two common genetic operators for breeding are crossover (combining random parts from two selected candidates) and mutation (altering a random part of a selected candidate). Genetic programming applies this idea of evolution for computer programs [13,15].

The basic genetic operators (i.e., crossover and mutation) are entirely random and can result in the construction of illegal programs (e.g., that breach requirements for legal expressions or type restrictions of the programming language). For this reason alone, it is often appropriate to constrain the structure of programs in advance of the evolutionary process. An approach to enforcing particular structures is using a grammar [16]. GP approaches that use a grammar to express constraints are called Grammar-based Genetic Programming (GGP). For example, the expression f(x,y) = x*x + y, is one of many possible specific structures that could be generated with the following grammar:

E::=var|(EopE)op::=+||*var::=x|y|z

Each line in the grammar is a production rule. The elements on the left-hand-side can be rewritten and are called non-terminal symbols. On the other hand, elements that cannot be rewritten are terminals. The first production rule is an expression (E) which can equal either a variable (var) or a combination of two expressions (E) by an operator (op). The second rule allows three operators: plus, minus, multiply. The last production rule specifies three variables: x, y, z.

Each structure produced by the grammar is represented by a tree. The tree representation allows researchers to measure the structural complexity of models by counting the number of nodes (terminal and non-terminal symbols) in the tree. Even the simple expression f(x,y) = x*x + y, shown in Figure 1, has a node count of 15. Crossover is operationalized by cutting a branch of the tree and replacing it with a branch from another tree. Mutation is operationalized by replacing a node with a randomly generated tree.

Figure 1.

Figure 1.

An example tree of an expression f(x,y) = x*x + y

2. 5. Description of the model discovery process

This section describes the proposed model discovery process, depicted by the flow chart of Figure 2. The process is a variant of a recent approach described by Vu et al. [10]. There are three roles in the model discovery process: modeler, analyst, and domain expert. The modeler designs, implements, and tests agent-based models. The analyst analyses the model structure and abstracts a set of basic building blocks of entities and mechanisms. The domain expert possesses the knowledge and understanding about the social science that underpins the model and can assess the model’s theoretical credibility.

Figure 2.

Figure 2.

Model discovery framework. The dashed rectangle is the process that is not implemented in the present paper.

In Step 1 of the model discovery process, a human modeler develops a mechanism-based model to explain the phenomenon in a complex social system (the baseline model). This redescription process is undertaken by the modeler based on existing knowledge captured in social theories. In Step 2, the model is evaluated for its theoretical credibility by a human expert in the social theories that underpin the model (i.e., a domain expert). Redevelopment of the model may occur following this step (representing a return to Step 1). Once the human expert is satisfied with the baseline model, in Step 3 a human analyst abstracts a set of primitives, i.e., “building blocks”, from the model. This set of primitives are the entities and mechanisms to be exposed and modified in the evolutionary step (Step 6). In addition, a grammar is defined to guide the search. Since the grammar is a set of production rules for combining the primitives, the analyst can enforce certain structures based on the modeler’s knowledge of the system; non-credible operations can be prohibited. In Step 4, the model parameters are calibrated using the baseline model structure against the empirical calibration targets. In Step 5, the model built by human modeler, along with the best calibrated parameters, is cloned to fill the initial population of model structures.

Step 6 is the heart of the evolutionary approach. Parent structures are selected from the population of model structures (Step 6a). After applying the genetic operators (crossover and mutation), new child structures are generated (Step 6b). Ideally, the parameters of the new child structures are re-calibrated to see if the model error can be minimized further (Step 6c). However such a nested approach to calibration is very computationally intensive, and so we necessarily omitted this step due to limits on the available computing resources. Instead we allowed the GP to select constants from not only a general set of constants but also values of calibrated parameters generated at Step 4. Next (in Step 6d), the new structures are evaluated for their fitness (such as model error compared to empirical data). After evaluation, the new population is selected based on the objectives (Step 6e). These evolutionary steps are performed until convergence is achieved or when a maximum number of iterations is reached (Step 6f).

In Step 7, through deliberative discussion with the analyst and the modeler, the domain expert assesses the set of new structures with the highest fitness values in terms of theoretical credibility. If the new structures lack sufficient credibility, the domain expert, the modeler, and the analyst return to Step 3 to discuss changes to the grammar to improve the meaningfulness of the operations that can be selected by the evolutionary algorithm. Further iterations of Steps 3 to 7 are carried out until credible structures are generated or resources are exhausted. In Step 8, credible model structures are interpreted for knowledge discovery purposes, promoting discussion about the underlying social theories used in the model and, potentially, further theory development or empirical data gathering.

3. Application

We applied the new framework to a specific mechanism-based social systems model. Here, we interpreted the causal mechanisms derived from social role theory as drivers of alcohol consumption to build a complex systems model of population-level alcohol use patterns in the US since the 1980s (Figure 3). Social role theory is a collective term used to describe a diverse range of mechanism-based explanations for individual and collective behaviors and practices from the fields of social psychology, sociology and anthropology [17]. Particular conceptualizations of role theory have been used within the alcohol research community to explain observed trends in alcohol use – specifically relating to the interplay between alcohol use and positional roles such as parent, partner and paid employee [18]. Our aim in this application is to test the extent to which credible conceptualizations of role theory can reproduce historical trends in population alcohol use (as measured via survey data).

Figure 3.

Figure 3.

Schematic of roles model

In this application, different roles in the model discovery process were undertaken by different authors. The modeler role was principally undertaken by Bai, Brennan and Purshouse. The analyst role was undertaken by Buckley and Vu. The domain expert was Shuper.

3.1. Social roles as a mechanism-based explanation of alcohol use: the human built model as it relates to role theory

The concept of role strain is central to many of the studies relating positional roles to alcohol use. Biddle [17] defines role strain as the “experience of stress associated with positions or expected role”. Role strain is hypothesized to arise through a number of pathways where alcohol can act as cause, consequence, or both. Alcohol can be used by individuals as a means of coping with role strain arising from role overload (holding a role set that is too complex), role deprivation (lacking roles that provide meaning to life), or role incongruence (holding roles which are non-normative with respect to status or identity) [19,20]. Alcohol use can also induce or exacerbate role strain, where use is incompatible with the demands of performing the role [21]. In the model, role strain is the arithmetic mean of role incongruence and role overload (Equation 1 in Table 1). Role overload (Equation 2) is determined by the roles an agent holds, their levels of involvement in these roles, and four calibrated parameters representing the effect of holding each role on experiencing role overload. Role incongruence (Equation 3) is the arithmetic mean of the difference between each role holding status and the prevalence of that role in society (i.e., the percentage of people holding that role).

Table 1.

Equations in the human-built model. The concepts in the Concept column are from the schematic in Figure 3. These equations contain unobserved parameters (highlighted in bold) which modify the effects of social role mechanisms. These are given values following model calibration (Section 3.3) which searches for the set of parameters which best fit historically observed trends in alcohol use over time. The agents in the model, indexed by i, represent individual drinkers; their behaviour is a decision to consume the jth drink in a drinking occasion. The model also includes two dynamic structural entities: expectancies for holding roles at a given point in the lifecourse and average transition probabilities (TP) between roles. For clarity, all structural entities carry the prefix s. The discrete time unit (representing each day) in the simulation is k.

No. Concept Model equation Description

1 Role strain RoleStraini[k] = (RoleLoadi[k] + RoleIncongruencei[k]) / 2 Role strain is the overall stress an individual experiences as a result of the social roles they hold.

2 Role load RoleLoadi[k] = β1*ParentStatusi[k]*ParentInvolvementi + β2*MaritalStatusi[k]*MaritalInvolvementi + β3*EmploymentStatusi[k]*EmploymentInvolvementi + β4*(1-MaritalStatusi[k])*ParentStatusi[k]*ParentInvolvementi Role load is the stress that results from needing to perform a role. Role status is either 0 (not having a role) or 1 (having a role). Role involvement represents how much a person is involved in a role, if they hold it (between 0 and 1, from no involvement to full involvement). There are four terms: one term for each of the three roles (having a role and more involvement in that role increases the stress) and a term for additional stress when an individual is a single parent (holding the role without the support of another parent).

3 Role incongruence RoleIncongruencei[k] = (ParentStatusi[k]-sParentExpectancysex,age[k] + MaritalStatusi[k]-sMaritalExpectancysex,age[k] + EmploymentStatusi[k]-sEmploymentExpectancysex,age[k]) / 3 Role incongruence is the stress that results from holding a role that deviates from societal expectations for an individual’s identity (encoded as a sex-age category). It is the average of the differences for each role between the current status and the corresponding societal expectancy (prevalence of that role in the society is between 0 and 1).

4 Role transition update for gaining roles Heavy drinkers:
  • TPi[k] = sTPsex,age[k-1]*(1+β12)

Non heavy drinkers:
  • TPi[k] = sTPsex,age[k-1] * (1-AnnualHeavyDrinkingPrevalence[k-1]*(1+β12)) / (1-AnnualHeavyDrinkingPrevalence[k-1])

To account for role selection, individual role transitions over the lifecourse are calculating by modifying the societal transition rates according to whether or not the individual is a heavy drinker. (Equation 4 and 5). Heavy drinking makes it less likely for an individual to gain roles. Heavy drinking makes it more likely for an individual to lose roles. AnnualHeavyDrinkingPrevalence represents the population prevalence of heavy drinking in the model (between 0 and 1)

5 Role transition update for losing roles Heavy drinkers:
  • TPi[k] = sTPsex,age[k-1]*(1+β13)

Non heavy drinkers:
  • TPi[k] = sTPsex,age[k-1] * (1-AnnualHeavyDrinkingPrevalence[k-1]*(1+β13)) / (1-AnnualHeavyDrinkingPrevalence[k-1])


6 Difference in disposition to drink due to gaining roles DispositionDifferencei,j[k] = Dispositioni,j[k]*(1+β10) - Dispositioni,j[k] To account for role socialization, the disposition to drink is gradually reduced the longer an individual holds a role and is gradually increased if an individual loses a role. The full disposition effect to apply is calculated using Equation 6 or 7. The proportion of that effect to apply after a particular number of days of socialization is calculated using the logistic function in Equation 9. This modifier is then applied to scale the full disposition effect using Equation 8 and calculate the overall disposition at time k. Socialisation effects accrue over one year following a role transition.

7 Difference in disposition to drink due to losing roles DispositionDifferencei,j[k] = Dispositioni,j[k]*(1+β11) - Dispositioni,j[k]

8 New disposition to drink (after role socialisation) Dispositioni,j[k]=Dispositioni,j[k-1] + DispositionDifferencei,j[k]*modifieri[k]

9 Modifier for socialisation mechanisms modifieri[k] = e^((DaysofSocialisationi[k]-sSocialisationSpeed)/365) / (1+e^((DaysofSocialisationi[k]-sSocialisationSpeed)/365))

10 Opportunity to drink out logOddsOppOuti[k] = log(β5*(1 - β6*RoleLoadi[k] + β7*EmploymentStatusi[k])) Equation 10 and 11 describe the log odds for the opportunities to drink outside and inside the home, with reference to having no opportunity to drink. β5 is the baseline opportunity. Role load acts to reduce both opportunities. Individuals have more opportunity to drink outside the home if employed, and more opportunity to drink inside the home when holding marital or parenting roles. Equation 12 and 13 operationalize the logit model that derives the probabilities of drinking outside and inside the home on any given day from the log odds of Equation 10 and 11 (for three mutually exclusive scenarios: drinking in, drinking out, and not drinking).

11 Opportunity to drink in logOddsOppIni[k] = log(β5*(1 - β8*RoleLoadi[k] + β9*(MaritalStatusi[k] + ParentStatusi[k])))

13 Probability of having an opportunity to drink out probOppOuti[k] = e^(logOddsOppOuti[k]) / ((e^(logOddsOppIni[k])+e^(logOddsOppOuti[k])+1))

12 Probability of having an opportunity to drink in probOppIni[k] = e^(logOddsOppIni[k]) / ((e^(logOddsOppIni[k])+e^(logOddsOppOuti[k])+1))

14 Probability of drinking first drink (j=0) ProbabilityDrinki,0[k] = Dispositioni[k] * (ProbOppOuti[k]+ProbOppIni[k]) * (1+β14*RoleStraini[k]) The daily drinking probability is modelled as the long-term drinking disposition, mediated by drinking opportunities and role strain. We differentiate between drinking frequency (first drink in an occasion) and quantity (next drink, given that an occasion has begun).

15 Probability of drinking next drink (j>0) ProbabilityDrinki,j[k] = Dispositioni[k] * (ProbOppOuti[k]+ProbOppIni[k]) * (1+β15*RoleStraini[k])

In the mechanism known as role selection, individuals may act (consciously or otherwise) to prevent or reduce role strain by avoiding or escaping from roles that are incompatible with their existing alcohol use [21]. This mechanism is implemented by adjusting the probability of transitioning between roles based on the heavy drinking status of the agent (where heavy drinking is defined as having consumed 5+ standard drinks in the previous month) (Equations 4 and 5). In a contrasting role socialization mechanism, individuals gradually adopt and internalize drinking practices that are compatible with the roles they hold [2123]. A difference in drinking disposition is calculated when individuals gain or lose roles (Equation 6 and 7). The new disposition to drink (Equation 8) is a function of this difference in disposition and a modifier (Equation 9), calculated using the number of days the new role has been held and the speed of socialisation.

Knibbe et al. (1987) suggest that the set of positional roles held by an individual can affect the ability of that person to participate in drinking situations, depending on the extent to which drinking is integrated into the structure of everyday life within society. In this sense, social roles act as mechanisms that regulate the daily opportunities for using alcohol. Individuals in the model have a different opportunity to drink outside and inside the home, which depends on the roles they hold. The opportunity to drink out and in are each calculated as a log odds (Equations 10 and 11) then converted to probabilities (Equations 12 and 13). The log odds for drinking outside the home is based on an agent’s role load, employment status, and two calibrated parameters to describe the unknown effect sizes of these factors on opportunity to drink out. Role load, and a combination of marital and parenthood status, determine the log odds of drinking opportunity inside the home. Again, this equation contains parameters describing the unknown effect of these factors on the opportunity to drink in.

The conditional probability of agent i consuming a jth drink (Equations 14 and 15) is governed by each agent’s long-term disposition to drink, their probability of drinking in and outside the home, and role strain.

3.2. Data

3.2.1. Model initialisation

To initialise the models, data from the National Survey on Drug Use and Health (NSDUH) 1979–2010 [24], Panel Study of Income Dynamics (PSID) 1979–2010 [25] and the US Census 1980–2010 [26] was used. A micro-synthesis [27] was generated for a demographically representative population of 1000 individuals aged 12–80 in the USA, 1980. The model was initialized with these 1000 agents on the first day of 1980. The socio-demographic attributes of agents were initialized from the micro-synthesis: age, sex, ethnicity, employment status, marital status, and parenthood status. Additionally, the micro-synthesis initialized agents with alcohol use attributes: a 12-month drinking status, usual number of drinking days per month, usual quantity of drinks per month, and number of days more than five drinks are consumed per month.

3.2.2. Simulation

During each simulated year, individuals enter the model as new 12-year-old adolescents and new migrants. Individuals also leave the model due to death and outward migration. Total counts of new migrants to enter in each year were estimated using the American Community Survey 1980–2010 [28] and were micro-synthesised to data from the nearest NSDUH year to give a representative migrant population with corresponding baseline drinking behaviour. “Mortality rates for the microsimulation were derived from the Center for Disease Control and Prevention (CDC) all-cause mortality data for the USA between 1979–1998 [29] and 1999–2010 [30].

Transition probabilities for moving between each of the eight unique combinations of social roles variables (marriage, employment and parenthood) are applied annually during the simulation. These probabilities were derived from multi-state Markov models fitted to marriage, parenting and employment trends from a representative USA study, the Panel Study of Income Dynamics 1979–2000 [25].

At initialisation, each agent is allocated a vector which represents their long term disposition to drink. These are initialised from the mean and standard deviation of their drinking frequency and quantity at baseline.

3.2.3. Calibration targets

Calibration targets for alcohol use were derived from empirical data from NSDUH for the years 1979–2010. Four alcohol use targets per year were used for calibration: (1) prevalence – the proportion of individuals reporting consuming an alcoholic beverage during the previous year; (2) frequency – among drinkers, the average number of drinking days per month; (3) quantity – among drinkers, the average grams of alcohol consumed per day; (4) heavy episodic drinking – among drinkers, the average number of occasions where 5+ drinks were consumed, per month. The targets are split by sub-group, with four sub-groups defined by the number of roles held (0–3 roles). We chose to categorized by the number of roles (n=4) instead of the combination of roles (n=8) for two reasons: firstly, this is an indicator commonly used in the social roles literature [19]; secondly, the eight role decomposition is too great for the standard error of the targets to be informative from a calibration perspective. In summary, there are 16 targets (4 alcohol use targets by 4 different number of roles) for each year between 1979–2010.

3.2.4. Implementation

The model was implemented in C++ using the RepastHPC 2.2.0 toolkit [31]. The model is run forward in time for 20 years for calibration (1980–1999) and 10 years for validation (2000–2009). Each model tick represents one day in the simulation. On each day of the simulation, the probability of drinking is calculated for each agent. Once per year in the simulation, transition probabilities for role transitions are applied and role expectancies are updated.

3.3. Parameter calibration

The model contains 31 parameters for calibration, which are highlighted in bold text in Table 1. For this paper, a Latin hypercube space-filling design was employed to sample 5,000 parameter sets from the joint prior distribution using the lhs package in R [32]. The Latin hypercube was optimized by maximizing the minimum distance between samples [33]. These parameter sets are evaluated using an error metric that compares the simulated results against the calibration targets (Equation 16).

error=1NMn=1Nm=1M|ym*[n]ym[n]|(sm[n]2)+(dm)2 (16)

where N is the number of observations, M is the number of outputs, ym*n is the simulated data for output m at time n, ym[n] is the mean of empirical target data for output m at time n, sm[n] is the standard error of the empirical target data for output m at time n, (dm)2 is the variance of the model discrepancy for output m, which is 10% of the possible range for each output. Model discrepancy captures the fact that model is not a perfect representation of reality.

As described in Section 3.2.3, there are 16 targets (prevalence, frequency, quantity, heavy episodic drinking, each split by the number of roles held, 0–3), thus M is 16. N is different for each output because some years are missing in the empirical data. The parameterization that provided the minimum error in Equation 16 was selected as the result of the calibration process. The human-built model along with the best parameterization was selected as the reference model for the structural calibration process.

3.4. Grammar-based Genetic Programming: Design and Implementation

This section describes a GGP system designed to perform the model discovery process. For the grammar that guided the GP process, the popular context-free grammar was used. The full grammar written in Backus-Naur form [34] is available in Figure 4. Considering the representation of the GGP candidates, each candidate is represented by a tree that is generated following the production rules of the grammar. Each GGP candidate (a program <p>) contains 9 expressions for the 9 role-related terms used in the roles model: role selection mechanism, role socialisation mechanism, role load, role incongruence, role strain, log-odds-out modifier, log-odds-in modifier, first-drink disposition, and next-drink disposition. Several groups of variables, along with constants and calibrated parameters, were defined. Each can be formed only by a defined combination of variables, operators, and constants. Some role-related terms use the same expression because they have the same structure and the same constraints, e.g., probFirstDrink and probNextDrink both use expression <e5>. This grammar provides a hierarchical structure that can capture the layers of role-related concepts in the reference model.

Figure 4.

Figure 4.

Context-free grammar. Variables have the same name as in Table 1 but, for improved clarity, without the indexing notation. Shorthand notation is defined for the multiplication of role involvement and role status (e.g., InvolvedxMarital = MaritalInvolvement * MaritalStatus) and for the difference between the role status and role expectancy (e.g., DiffExpectancyMarital = MaritalStatus - sMaritalExpectancy)

For the initialisation of the GGP, we decided to start with an initial population filled with the same structure. The structure used as the starting point is the reference model, i.e., the structure designed by human modelers with the best fitted parameterization. Additionally, a multi-objective GGP was employed to simultaneously minimize both model error and complexity. These two objectives address the two aspects of value we discussed in Section 2.3: (i) the ability of the model to reproduce the phenomenon so far as we understand it from our beliefs and empirical data; (ii) the meaningful interpretability in terms of theory.

The first objective, model error, is captured by comparing the simulated data from the model with the empirical data from the real world. The model error is described in Equation 16 (Section 3.3). The second objective, complexity, is a proxy for interpretability and parsimony. Minimizing the complexity during model discovery also constrains the model discovery process from discovering too complex structures that overfit the empirical data and are not interpretable by domain experts. The complexity is defined by the number of nodes in the GGP candidate, with a special case that node ON is counted as 2 in contrast to node OFF being counted as 1. The drawback when using complexity as a proxy is that it does not guarantee meaningful interpretability, i.e., low complexity can also result in meaningless model structures. Therefore, at the end of every iteration of the model discovery process, we worked with the domain expert to verify the interpretability of all structures in terms of theoretical meaningfulness. We asked the expert to classify model structures using a crisp binary definition of credible or non-credible. Whilst the judgment was holistic, the classification process was a deliberative discussion between the expert, the modeler and the analyst. This discussion was recorded and used subsequently to produce a set of qualitative criteria for judging model credibility.

For the selection process, the popular NSGA-II optimizer [35] was used to develop an even representation of the Pareto front that shows the trade-off between model error and complexity. During the evaluation, the corresponding expressions in the simulation’s source code were edited based on a candidate structure; then the simulation was re-compiled and run in order to collect the simulated results for calculating the model error.

The described GGP system was implemented using the PonyGE2 toolkit [34] and set up with the following parameters:

  • 500 GGP candidates per generation;

  • Initialisation: 500 copies of the reference structure;

  • GP operators: 75% subtree crossover, 25% subtree mutation;

  • Maximum tree depth: 17;

  • 2 objectives (goodness-of-fit and complexity) with NSGA-II replacement and selection operators.

The GGP process was run on an Intel i9 9980XE processor with 36 cores. The source code of the simulation with the best calibrated parameters (RepastHPC) and the GGP system (PonyGE2) is available at bitbucket.org/r01cascade/roles_ggp_complexity and is licensed under the GNU General Public License version 3.

4. Results

4.1. GGP Results and the Pareto Front

In the case study, three iterations of the model discovery process were required to produce any structures that the domain expert deemed as credible. Modifications were made to the grammar between each iteration in an attempt to improve the effectiveness of the discovery process. This was an open-ended trial and error iterative process involving the modeler and the analyst. We stopped the process once credible structures had been discovered. The evolution of the grammar is documented in the Supplementary Material A. The final grammar is shown in Figure 4.

In the final iteration of the process, both model error and model complexity objectives reduced over generations and converged at the 20th generation, after which no change to the Pareto front was observed. Figure 5 shows the final population of 14 non-dominated structures and also includes the reference structure for comparison. These models are indexed by their complexity, e.g., model 24 is the model on the Pareto front with complexity 24. All the structures discovered by the GGP are less complex than the reference structure. Six of them are worse that the reference model in terms of model error, while the remaining eight offer improved fit over the calibration window.

Figure 5.

Figure 5.

GGP structures on the Pareto front versus the reference structure (model error is the error when compared the simulated results against calibration targets; complexity is measured as the number of nodes in the tree of a GGP candidate).

4.2. Theoretical credibility of the discovered models

We worked with a domain specialist to examine the non-dominated model structures generated by the GGP in terms of their theoretical credibility and coherency with respect to social role theory. Table 2 compares the structures of the reference model and selected GGP models. In Table 2, elements not affecting agent drinking (probFirstDrink and probNextDrink variables, Equation 14 and 15) are shaded in gray.

Table 2:

Structures of the discussed models. Elements not affecting agent drinking are shaded in gray.

(a) Structures of the reference model, model 24, model 25, and, model 29
Name The reference model Model 24 Model 25 Model 29

RoleSelectionMech ON OFF ON OFF
RoleSocialisationMech ON OFF OFF OFF
RoleLoad 0.180*InvolvedxMarital + 0.039*InvolvedxParent + 0.128*InvolvedxEmployment + 60.952*(1-MaritalStatus)*InvolvedxParent InvolvedxEmployment InvolvedxEmployment InvolvedxMarital
RoleIncongruence 0.333*DiffExpectancyMarital + 0.333*DiffExpectancyParent + 0.333*DiffExpectancyEmployment DiffExpectancyMarital DiffExpectancyMarital DiffExpectancyMarital
RoleStrain 0.5*RoleLoad + 0.5*RoleIncongruence RoleLoad RoleLoad RoleIncongruence
OutMod OutMod=(1–0.25357*RoleLoad) + 11.18756*EmploymentStatus RoleLoad RoleLoad EmploymentStatus
InMod InMod=(1–23.786*RoleLoad) + 2.010*(MaritalStatus+ParentStatus) RoleLoad RoleLoad RoleLoad
probFirstDrink (ProbOppIn+ProbOppOut) * Disposition * (1+1.104*RoleStrain) Disposition Disposition Disposition * ProbOppIn
probNextDrink (ProbOppIn+ProbOppOut) * Disposition*(1+9.100*RoleStrain) Disposition Disposition Disposition
(b) Structures of model 38, model 59, and, model 79
Name Model 24 Model 25 Model 29

RoleSelectionMech ON ON OFF
RoleSocialisationMech OFF ON OFF
RoleLoad InvolvedxMarital InvolvedxMarital InvolvedxEmployment
RoleIncongruence 1 - DiffExpectancyEmployment 0.333*DiffExpectancyMarital + 0.333*DiffExpectancyParent + 0.333*DiffExpectancyEmployment DiffExpectancyEmployment
RoleStrain RoleIncongruence RoleIncongruence 7*RoleIncongruence + RoleLoad
OutMod RoleLoad MaritalStatus 1 – 10*RoleLoad
InMod EmploymentStatus ParentStatus 1 - ParentStatus*RoleLoad
probFirstDrink (LifetimeDisposition^2) * (4+ProbOppIn) (LifetimeDisposition^2) * (4+ProbOppIn) Disposition * (0.9*Disposition + ProbOppIn) * (1.104*RoleStrain + 0.8)
probNextDrink Disposition Disposition 1.25 * Disposition^2 * (RoleStrain + 1)

Through analysis of the deliberative discussions held between the expert, analyst, and modeler, it was possible to generate three qualitative criteria that enable a consistent assessment of credibility for this example case study. In what follows, we provide a narrative discussion of credibility across the Pareto front in Figure 5. However, a complete documentation of all GGP models and corresponding justifications of credibility is also included in the Supplementary Material B. The three identified criteria for theoretical credibility are:

  1. At least one of the theory constructs must be implicated in the model dynamics. In a mechanism-based model of alcohol use, the mechanisms need to be used to generate drinking behavior. For models based on role theory, this means that the models must use at least one of the core theory constructs of role strain, role load, role incongruence, or opportunity. For example, in the absence of a role socialization process, using solely a variable that describes drinking history (Disposition) does not generate a credible mechanism-based model in terms of role theory.

  2. The theory constructs must be used to represent mechanisms, rather than being proxies for black-box variable-centric explanation. In some of the identified models, we observed that theory constructs could be replaced directly by observable socio-demographic properties of the agents. These cases indicate that the mechanism-based explanation is being avoided in favor of a black-box variable-centric ‘determinants’ approach to understanding drinking behaviors that is more conventional in the literature [36]. For example, in a mechanism-based model, marital status should not directly define opportunity, where opportunity then directly determines disposition to drink.

  3. The model equations that describe the mechanisms must be compatible with the causal logic and evidence base for the theory. In a mechanism-based model, some of the encoded causal relationships between core theory constructs are only meaningful when constrained in terms of direction, sign, and/or magnitude. For example, role load must either not affect or cause a decrease in opportunity to drink – it is inconceivable, in role theory terms, that role load could cause an increase in opportunity (since load implies time use by agents that cannot be combined with drinking).

Focusing now on the Pareto front, the model with lowest complexity on the front (on the far right of Figure 5) is model 24 - this model includes a single term for each production rule, with both role selection and role socialisation processes switched off (Table 2). When parsed, all but two of the production rules are inactive - probFirstDrink and probNextDrink - with both set to the term Disposition. This term represents the initial dispositions to drink endowed to the agents based on observed drinking patterns in NSDUH data, so essentially model 24 encodes the heuristic “past behavior predicts future behavior” with no aspect of role theory present. This heuristic is clearly sufficient to reproduce target data at the start of the simulation but the fit to targets becomes progressively poor over time.

As we begin to traverse the Pareto front in the direction of increasing complexity (from right to left in Figure 5), elements relating to role theory begin to be introduced into the production rules; however these elements do not necessarily survive the parsing process. For example, model 25 switches on role selection, but - as with model 24 - no components relating to roles are present in the final probFirstDrink and probNextDrink production rules. We can conclude that the very small improvement observed for model 25 in comparison to model 24 is due to low level stochasticity in the simulation.

In the next model, model 29, role selection and socialisation processes are switched off but production rule probFirstDrink is now set to Disposition*ProbOppIn - i.e., the probability of engaging in a drinking occasion is scaled by the probability of having the opportunity to drink at home, whilst the number of drinks consumed in such an occasion continues to follow the heuristic “past behavior predicts future behavior”. Following the production rules upward, we identify that - as a result of the log-odds structure that is preserved in all models - ProbOppIn is increased by role load (RoleLoad) and decreased by holding an employment role (EmploymentStatus). Meanwhile role load is defined as level of involvement in a held marital role (InvolvedxMarital). From the perspective of role theory, this model is interpretable but not credible: (i) since only ProbOppIn is included, but the possibility of ProbOppOut=0 across all agents is not credible, then ProbOppIn is interpreted as simply a surrogate for any kind of drinking opportunity; (ii) opportunity is seen to increase as a result of role load - which is not credible, since opportunity should decrease with role load - and seen to decrease as a result of being employed - which is also not credible because, outside of role load considerations (for which employment is not present in the model), being employed should provide increased opportunities for drinking (e.g., due to income or exposure to social drinking situations). Model 29 also offers little improvement in model error compared to the overall “past behavior predicts future behavior”. Overall, it is very clear that this model is found wanting.

Continuing to traverse the Pareto front from right to left, we find that the first model to offer a credible interpretation in terms of role theory is model 38. This model also offers a substantial improvement over the less complex “past behavior predicts future behavior” model in terms of model error. In model 38, the frequency of drinking occasions (probFirstDrink) is increased by the probability of an at-home drinking opportunity (where, as for model 29, this should be interpreted as a surrogate for any kind of drinking opportunity). Some attempt by the GGP at parameter calibration is also seen here - with nonlinear scaling of Disposition. Opportunity is increased by holding an employment role and decreased by role load (the latter defined as level of involvement in a marital role). Despite reservations about the limited extent of the definition of role load, this opportunity mechanism appears plausible. Role holding is also influenced by drinking behavior in this model, since role selection is switched on - feedback is therefore in action and we can claim at this point that the GGP has discovered a true dynamical model that involves roles.

The second model that can be regarded as credible in terms of role theory is model 59 - this model suggests that experiencing role strain increases the likelihood of a drinking occasion (due to the RoleStrain multiplier on Disposition in the probFirstDrink production rule). Role strain is defined purely as role incongruence, where the latter concept is preserved intact from the reference model (i.e., role incongruence is the average deviation of an agent’s role holding from normative roles). Both role selection and socialisation are also active in this model, but no opportunity mechanisms are present.

The third model offering credibility in terms of role theory is model 79. In the model, role strain increases both frequency of drinking occasions and per-occasion quantity (via positive modifiers on probFirstDrink and probNextDrink respectively). Role strain arises as a weighted combination of role load and role incongruence, where load arises from high levels of involvement in an employment role and incongruence arises through non-normative employment status (e.g., being working age unemployed). Opportunity also influences drinking frequency (via the PropOppIn shift in the probFirstDrink production rule) - again, to be credible, PropOppIn must be interpreted as a general opportunity, rather than having any locational context. Opportunity is reduced through the interaction of role load with parenting (via the InMod production rule) and also reduced by role load in isolation (via OutMod). In this model, neither role selection nor socialisation are active.

4.3. Calibrated goodness-of-fit

The reference model has an error of 0.54 against the time series of targets used for calibration (covering the period 1980–2000). The time series plot for the reference model is shown by the pink line in Figure 6. Fit to drinking prevalence is good, except for the one-role sub-group. Fit to frequency, quantity and heavy episodic drinking is generally poor, particularly for the zero-role sub-group. Models lying on the Pareto front represent a range of errors, including both better and worse than the reference model. The credible models are quite considerably better, as seen from the time series plots in Figure 6. The issue with drinking prevalence in the one-role sub-group and issues with frequency and quantity are largely eliminated–with remaining problems largely confined to underestimation of heavy episodic drinking in the three-roles subgroup.

Figure 6.

Figure 6.

Time series of 16 outputs of the reference model, 3 credible GGP models, and the empirical target data (mean target data +/− 95% confidence interval). The horizontal axis is the simulated years and each graph has a dashed vertical line that separates 20 calibration years and 10 validation years. The vertical axis is the value for each target with different units for prevalence (of current drinking in the past 12 months), frequency (average number of drinking days in the past 30 days among current drinkers), quantity (average grams of pure alcohol consumed per day among current drinkers in the past 30 days), heavy episodic drinking (average number of days 5+ drinks consumed in the past 30 days among current drinkers).

It should be noted that the goodness-of-fit of the reference model is weak in comparison to other models. This issue could have been caused by either an inadequate parameter calibration process or fundamental inability of the structures initially designed by the modeler to capture the target dynamics. To seek an improvement in the goodness-of-fit of the reference model, we could have run a more extensive parameter calibration or undertaken handcrafting of the structure. However, we decided the calibrated model was adequate for the GGP process to work with and intentionally did not try to improve the goodness-of-fit further. It is clear here, from a model development lifecycle perspective, that the boundary between the reference model and the GGP can be blurred.

4.4. Validated goodness-of-fit

Target time series data used for validation covers the period 2000–2010.This period covers an increase in drinking prevalence, frequency and heavy episodic consumption that contrasts with the gentle declines seen over the calibration window. All three theoretically credible models exhibit a substantial relative decline in performance over the validation period (see Table 3). The calibration and validation errors are inversely correlated, suggestive of overfitting to noise in the calibration targets. However the decline in even the lowest complexity credible model (model 38) suggests that the retroduction fundamentally lacks generalizability to an adjacent temporal period - i.e., the real entities and mechanisms identified are invalid or incomplete. Looking in more detail at the validation issues, models 38 and 59 generate continuing declines in drinking for two-role and three-role groups that are trending in the opposite direction in the empirical data. The models also generate a collapse in heavy episodic drinking amongst the three-role sub-group that is not supported by the data. The most complex of the credible models - model 79 - does capture the trend reversal (if slightly lagged) for most two-role and three-role outcomes, but underestimates frequency and quantity for the no-roles group.

Table 3.

Goodness-of-fit errors for the reference model and the three credible GP models

Model Calibration error Validation error

The reference model 0.542 0.750
Model 38 0.236 0.340
Model 59 0.214 0.390
Model 79 0.211 0.444

5. Discussion

5.1. Model discovery case study findings

5.1.1. Insights into mechanisms

Our model discovery framework offers three alternative perspectives (corresponding to three credible models 38, 59, and 79) that all offer a substantially better fit to the calibration window data compared to the reference model. In perspective (i), the retroducted mechanisms influencing alcohol use are opportunity and role selection; roles affect drinking frequency only, but not quantity. The relevant roles are employment (which drives opportunity) and being strongly involved in a marital role (which reduces opportunity via the role load it creates). Role strain is not important, nor is role socialisation. Perspective (ii) suggests that role strain drives increased drinking frequency but not quantity. Role strain is due to holding non-normative roles, with all three roles implicated. Both role selection and socialisation are important, but drinking opportunity is not important. Finally in perspective (iii), role strain drives both increased frequency and increased quantity. Opportunity drives frequency only. Role selection and socialisation are not important. Parenting and employment are implicated for role strain, but marriage is not. A universal caveat on these perspectives, which were driven by data over 1980–2000 is that validation issues were identified over the period 2000–2010.

5.1.2. How do these retroducted insights compare to empirical findings?

In the first perspective (i), holding an employment role increases opportunity to drink, whilst high levels of involvement in a marital role reduces opportunity to drink, with no other role-based pathways activated. These retroducted mechanisms are both supported by empirical research. Using data from a large birth cohort in the UK, Staff et al. [37] demonstrated that the employment role in isolation was associated with increased alcohol consumption, and both marital and parental roles were associated with a decrease in consumption. Further, the authors suggest that these effects may be caused by differences in opportunities to drink associated with the marital role - for example by reducing the number of occasions an individual will engage in socialising which could impact alcohol use. This potential mechanism is reflected in perspective (i), whereby an individual has less opportunity to drink, and therefore less frequent drinking occasions if they are married (and highly involved in their marital role). This is also supported by Bachman’s analysis [22], which found that the association between the marital role and reductions in alcohol consumption was strongly moderated by reductions in evenings out, and increased disapproval for use.

In perspective (ii) role incongruence is suggested to be a driver of role strain, which influences the frequency of drinking. This is supported by Biddle [17] who suggests that individuals can experience role strain due to experiencing roles outside the normative timings, for example transitioning into a parent role as an unmarried teenager. Role strain as a driver of alcohol use has also been empirically observed [19], which suggests that individuals may use alcohol to cope with stress. Additionally in perspective (ii), both role socialisation and selection mechanisms are active. The importance of role socialisation mechanisms as a driver of alcohol use is supported by Lee et al. [23] who found that heavy drinking occasions were reduced after individuals had become married. Additionally, Bachman [22] conducted a review of the literature linking marriage and alcohol use, and suggested that the majority of studies find socialisation effects for the marriage role - i.e., gaining the marriage role leads to reductions in alcohol use. The involvement of role selection mechanisms in alcohol use behaviour is also supported by the wider literature on role theory and alcohol use. Specifically, Lee et al. [38] provided evidence for role selection mechanisms, finding that earlier alcohol misuse reduces the likelihood of transitioning into social roles. This could be interpreted in a role selection context - if an individual is a heavy drinker they are less likely to transition into a role which would be incompatible with their drinking.

An increase in alcohol consumption due to role strain is also implicated as a mechanism in perspective (iii). Here, role strain arises due to a combination of role load and role incongruence, which are determined by high levels of involvement in an employment role, or a non-normative employment status, respectively. Role strain as a driver of both alcohol consumption frequency and quantity is supported by Cooper [39], who found in a study of adolescents in the USA that drinking as a means of coping (with role strain or otherwise) was associated with heavier drinking patterns. Additionally in this model, having an opportunity to drink affects frequency of alcohol consumption, and is reduced if the parenting role is held. This is supported by Kuntsche and colleagues [19] who found that in a large study of westernised countries, holding a greater number of roles, including parenthood, was associated with a reduction in alcohol consumption, via a decrease in opportunities to drink.

5.2. Benefits of the framework

Our approach represents a novel and promising technique for knowledge discovery which is able to generate models with theoretically interpretable mechanisms, and can fit historical patterns of data in complex dynamic representations of social systems. The model discovery framework offers a substantial improvement compared to the reference model, in terms of both lowered complexity for interpretability and improved fit to historically observed alcohol consumption trends. However, it is important that in the last step, domain experts are involved to interpret the options generated by the model. Our technique can therefore provide new approaches for developing theories to explain complex social systems.

A further advantage of this approach is that it is both theory and data-driven. We use formalised theories of behaviour, and several large empirical datasets to inform both the initial settings of the model (agent population characteristics) and the phenomenon to be explained - population level alcohol use is derived from a large nationally representative survey.

This method is also very flexible. Firstly, the grammar can be easily modified to redefine search directions, introduce new building blocks, restrict or relax constraints, or introduce different ways to combine the building blocks. This can be done within the grammar without changing the whole model discovery process. Additionally, although we present a case study modeling alcohol consumption, the framework could also be utilised to explain and understand a variety of complex models of social systems. Our method is easily adapted to look at alternative theories of behaviour, and even to search across multiple theories to give novel combinations which provide a better explanation of empirical data trends.

5.3. Limitations

One limitation of this approach is that the GGP works with the primitives and the grammar that the modeler provides it with. It is therefore possible that theoretically meaningful and adequately explanatory models could be missed, because the modeler does not allow for it. In our case study, deliberative discussions during GGP identified that the concept of opportunity encapsulated both time and money resources, which are impacted differently by role theory mechanisms; including these aspects explicitly might arguably have improved the model’s explanatory capability. Model discovery is an iterative, problem-specific process. To design the primitives, modelers have to decide which elements in their models are interesting and relevant to their research questions. The level of abstraction is also important: lower levels of abstraction usually have more elements and possible combinations. As for the grammar design, good practices can be found in the work of Nicolau and Agapitos [40].

Additionally, not all aspects of the model were exposed to the GGP process; for example socialisation and selection mechanisms could be either switched on or off, but the equations could not be modified. If socialisation is switched on, the new disposition to drink is always determined by the same calculation, however exposing this to the GGP could offer alternative candidate mechanisms for the effects of transitioning roles on underlying desire to drink. This would also allow us to investigate in future model iterations whether socialisation effects vary for different roles, as suggested by Bachman et al. [22].

The GGP method can produce complicated models in terms of theories, which require interpreting by a domain expert. In this paper, out of 14 non-dominated structures discovered in the final iteration of the GGP, three were deemed to be theoretically meaningful. For a more efficient search, it would be beneficial if either the grammar allows only meaningful structures or the model discovery process can enforce the theoretical meaningfulness in other ways (such as during the crossover and mutation operators). However, the prior encoding of meaning is very challenging to achieve and there is also a risk of missing novel ideas due to over-constraining the search. Further, we stopped the discovery process once a small number of credible models had been identified. If we had continued to refine the grammar, further credible models may have been identified – including models that offered reduced complexity or increased goodness-of-fit over those existing models. In the context of retroduction, such models may have offered greater insight into the relationship between social roles and alcohol use. At present, it remains unclear what a good yield of interpretable theories would be from a model discovery process.

We identified three qualitative criteria for model credibility. These criteria arose from deliberative discussions in relation to the role theory case study and, as such, their generalizability remains untested. We worked with only a single domain expert, but multiple experts may have offered different criteria or interpreted the same criteria differently. Whilst we enforced a crisp binary categorization of model credibility in the search process, given the qualitative nature of the criteria it may be more appropriate to adopt multinomial and/or fuzzy measures of credibility in future work.

Lastly, our implementation of the proposed discovery process skipped the recalibration stage for parameters of the newly discovered structures (Step 6c) due to computational limitations. We addressed this issue by allowing calibrated constants as primitives, but this approach is not as rich as a full calibration for each new structure. This problem is actually present in many GP works, especially with computationally expensive programs. The potential solution is leveraging surrogate models to approximate the fitness evaluation in GP [41,42].

5.4. Implications for complex systems modeling practice

Retroduction - teasing out the complex interaction of real entities and mechanisms that brings about a concrete phenomenon – is challenging. Complex systems models (CSM) that attempt this are often charged with being arbitrary and/or absolutist in their conceptions of reality. Structural calibration avoids this by looking across a wide multiplicity of models that retain the base elements of mechanism-based theory. Complex systems modelers, who use formal models to help explain concrete phenomena, should use structural calibration as part of their standard modeling practices, in the same way that data-driven modelers, who use formal models to explain variance in patterns of data, consider term selection.

However, automated structural calibration is a major enterprise. It requires complex systems modelers to: (i) think more about ontology – what are the base elements of theory that are candidates for inclusion in any model? and (ii) formally describe entities and mechanisms in a consistent way that allows them to be recombined together in meaningful ways. To help, we have developed an open-source software architecture for CSM developers that forces an ontological focus and provides an underlying formal language that is amenable to automatic structural calibration.

This work and the underlying model discovery process contribute to the needs for standards and modelling practices in the ABS community. Collins et al. [43] pointed out that as ABS matures, with many simulation software tools (like Netlogo [44], Repast Symphony[45]), potential standards and protocols will be needed and proposed. For example, Grimm et al. [46] designed the ODD (Overview, Design concepts, and Details) protocol as a generic and structured template to describe agent-based models for better communication and replicability. Another example is the UML (Unified Modeling Language) to develop and document agent-based models [4749]. There are also many discussions and proposal about different aspects of the ABS development process. Following this line of thought, our model discovery process can be addressed as a protocol concerned with structural uncertainty. It is common to decide on a model structure, then calibrate the parameters within the model to capture parameter uncertainty. Our work takes this further and addresses the between-model uncertainty of structural assumptions. More importantly, we demonstrated the feasibility of automated model structure discovery by embracing both theory and empirical data. Exploring different model structures is not only valuable for theory testing but also can contribute to theory exploration.

6. Conclusion

Here we have presented a novel method which utilizes genetic programming techniques to discover new and adapted behavioral theories from models of complex social systems. Using a case study of a social role theory to inform mechanisms-based model of population level alcohol use, we have demonstrated that our approach can find new, theoretically meaningful and interpretable mechanisms which drive population alcohol use in a complex systems model. It would take a human modeller an infeasible amount of time to manually construct a multiplicity of different variations of the mechanisms. The GP method assists in efficient screening of mechanism variants. This screening process can be important, since different realisations of a mechanism can produce qualitatively different model outputs [50]. The novel models generated by the GP method offer a better fit to alcohol consumption data than a reference model, which was constructed by a human modeler representing one possible interpretation of the mechanisms of role theory. Our approach is flexible and can be easily extended to complex systems models that are seeking to explain other social phenomena. Our method also offers novel directions for future knowledge discovery and social theory development, based on the fusion of data-driven and theory-driven methods.

A key part of realist explanation is comparison and integration across multiple theories [4]. Whilst our existing example is limited to the building blocks defined in social role theory, there is no reason why building blocks relating to other theories cannot be defined. However, integration of these wider building blocks, such that they can be exploited by machine learning, will require a common language for expression of the theories. We see middle-range theory [51] and its realization in the so-called Coleman Boat [52], or other micro-macro schemes, as a potentially useful template for formal descriptions of theory and their translation, via a software architecture, into integrated simulation models. Future work will aim to incorporate additional theories, and to generate novel combinations of multiple theories.

Supplementary Material

Supplementary material A – The evolution of the grammar

Supplementary Material A – The evolution of the grammar contains three iterations of the grammar with change logs.

Supplementary material B – The structures on the Pareto front and corresponding theoretical credibility

Supplementary Material B – The structures on the Pareto front and corresponding theoretical credibility.

Acknowledgments

We would like to thank Jack E. Hutton for preliminary work on structural calibration for the roles models which he conducted in his MEng dissertation, and thank the whole CASCADE team for their input to wider discussions in generating the research reported in this paper. TMV designed the primitives and implemented the model discovery process. AB, CB and RCP designed the roles model, which was implemented by HB. CB implemented the microsimulation. AN and CP contributed to mechanism development. TMV, AB, MS and RCP designed the model discovery process. PS aided the interpretation of the findings. RCP led the research and its conceptual underpinnings. TMV, CB, RCP, MS, AN wrote the paper. RCP would also like to thank the organizers and participants of the Economic and Social Research Council seminar series ‘Complexity and Method in the Social Sciences: An Interdisciplinary Approach’ (2014–2017), which stimulated the thinking developed in this paper. We also thank an anonymous referee for constructive and helpful suggestions to improve the manuscript.

Funding Statement

Research reported in this publication was supported by the National Institute on Alcohol Abuse and Alcoholism of the National Institutes of Health [Award Number R01AA024443]. This research was conducted as part of the Calibrated Agent Simulations for Combined Analysis of Drinking Etiologies (CASCADE) project.

Footnotes

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author(s) declare(s) that there is no conflict of interest regarding the publication of this paper.

References

  • [1].Epstein JM, Agent-based computational models and generative social science, Complexity. 4 (1999) 41–60. [Google Scholar]
  • [2].Hedstrom P, Dissecting the social: On the principles of analytical sociology, 2005. 10.1017/CBO9780511488801. [DOI] [Google Scholar]
  • [3].Lawson T, Economics and reality, Routledge, 1997. [Google Scholar]
  • [4].Danermark B, Ekström M, Jakobsen L, Karlsson J. Ch., Explaining Society: Critical realism in the social sciences, Routledge, London, 2002. [Google Scholar]
  • [5].Boero R, Squazzoni F, Does empirical embeddedness matter? Methodological issues on agent-based models for analytical social science, J. Artif. Soc. Soc. Simul 8 (2005). [Google Scholar]
  • [6].León-Medina FJ, Analytical Sociology and Agent-Based Modeling: Is Generative Sufficiency Sufficient?, Sociol. Theory. 35 (2017) 157–178. 10.1177/0735275117725766. [DOI] [Google Scholar]
  • [7].Byrne D, Callaghan G, Complexity theory and the social sciences: The state of the art, Routledge, 2013. [Google Scholar]
  • [8].Gunaratne C, Garibay I, Alternate social theory discovery using genetic programming: towards better understanding the artificial anasazi, in: GECCO 2017. - Proc. 2017 Genet. Evol. Comput. Conf., ACM Press, Berlin, Germany, 2017: pp. 115–122. 10.1145/3071178.3071332. [DOI] [Google Scholar]
  • [9].Gunaratne C, Garibay I, Evolutionary Model Discovery of Factors for Farm Selection by the Artificial Anasazi, in: Proc. 2017 Int. Conf. Comput. Soc. Sci. Soc. Am., ACM, 2017: p. 27. [Google Scholar]
  • [10].Vu TM, Probst C, Epstein JM, Brennan A, Strong M, Purshouse RC, Toward inverse generative social science using multi-objective genetic programming, in: GECCO 2019 - Proc. 2019 Genet. Evol. Comput. Conf., Association for Computing Machinery, Prague, Czech Republic, 2019: pp. 1356–1363. 10.1145/3321707.3321840. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [11].Smith VA, Evolving an agent-based model to probe behavioral rules in flocks of cowbirds, in: Artif. Life XI Proc. 11th Int. Conf. Simul. Synth. Living Syst ALIFE; 2008, 2008: pp. 561–568. [Google Scholar]
  • [12].Zhong J, Luo L, Cai W, Lees M, Automatic rule identification for agent-based crowd models through gene expression programming, in: 13th Int. Conf. Auton. Agents Multiagent Syst AAMAS; 2014, 2014: pp. 1125–1132. [Google Scholar]
  • [13].Poli R, Langdon WB, McPhee NF, Koza JR, A field guide to genetic programming, Lulu Press; ], [Morrisville, NC, 2008. [Google Scholar]
  • [14].Rodríguez-Vázquez K, Fonseca CM, Fleming PJ, Identifying the structure of nonlinear dynamic systems using multiobjective genetic programming, IEEE Trans. Syst. Man Cybern. Part ASystems Hum. 34 (2004) 531–545. 10.1109/TSMCA.2004.826299. [DOI] [Google Scholar]
  • [15].Koza JR, Genetic Programming: On the programming of computers by means of natural selection, MIT Press, Cambridge, MA, 1992. [Google Scholar]
  • [16].McKay RI, Hoai NX, Whigham PA, Shan Y, O’Neill M, Grammar-based Genetic Programming: a survey, Genet. Program. Evolvable Mach. 11 (2010) 365–396. 10.1007/s10710-010-9109-y. [DOI] [Google Scholar]
  • [17].Biddle BJ, Role theory: Expectations, identities, and behaviors, Academic Press, 1979. [Google Scholar]
  • [18].Knibbe RA, Drop MJ, Muytjens A, Correlates of stages in the progression from everyday drinking to problem drinking, Soc. Sci. Med 24 (1987) 463–473. [DOI] [PubMed] [Google Scholar]
  • [19].Kuntsche S, Knibbe RA, Gmel G, Social roles and alcohol consumption: A study of 10 industrialised countries, Soc. Sci. Med 68 (2009) 1263–1270. 10.1016/j.socscimed.2009.01.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [20].Wilsnack RW, Cheloha R, Women’s Roles and Problem Drinking across the Lifespan, Soc. Probl 34 (1987) 231–248. 10.2307/800764. [DOI] [Google Scholar]
  • [21].Yamaguchi K, Kandel DB, On the resolution of role incompatibility: A life event history analysis of family roles and marijuana use, Am. J. Sociol 90 (1985) 1284–1325. [Google Scholar]
  • [22].Bachman JG, O’Malley PM, Schulenberg JE, Johnston LD, Bryant AL, Merline AC, The decline of substance use in young adulthood: Changes in social activities, roles, and beliefs, Lawrence Erlbaum Associates Publishers, Mahwah, NJ, US, 2002. [Google Scholar]
  • [23].Lee MR, Chassin L, MacKinnon D, The effect of marriage on young adult heavy drinking and its mediators: Results from two methods of adjusting for selection into marriage., Psychol. Addict. Behav 24 (2010) 712–718. 10.1037/a0020983. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [24].Department of Health US and Human Services, Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statistics and Quality., National Survey on Drug Use and Health (NSDUH), (1979). [Google Scholar]
  • [25].University of Michigan. Survey Research Center, Panel Study of Income Dynamics (PSID): Main Interview, 1968–2015, (2018). 10.3886/ICPSR37142.v1. [DOI] [Google Scholar]
  • [26].Manson S, Schroeder J, Van Riper D, Ruggles S, IPUMS National Historical Geographic Information System: Version 14.0 [Database]. Minneapolis, MN: IPUMS, (2019). [Google Scholar]
  • [27].Lovelace R, Dumont M, Spatial microsimulation with R, Chapman and Hall/CRC, 2016. [Google Scholar]
  • [28].Ruggles S, Flood S, Goeken R, Grover J, Meyer E, Pacas J, Matthew Sobek IPUMS USA: Version 9.0 [dataset]. Minneapolis, MN: IPUMS, (2019). 10.18128/D010.V9.0. [DOI] [Google Scholar]
  • [29].Centers for Disease Control and Prevention, National Center for Health Statistics. Compressed Mortality File, 1979–1998 CDC WONDER On-line Database, compiled from Compressed Mortality File CMF 1968–1988, Series 20, No. 2A, 2000 and CMF 1989–1998, Series 20, No. 2E, 2003. Accessed at http://wonder.cdc.gov/cmf-icd9.html on June 20, 2019. 6:04:42 AM, (1979). [Google Scholar]
  • [30].Centers for Disease Control and Prevention, National Center for Health Statistics. Underlying Cause of Death, 1999–2017 on CDC WONDER Online Database, released December, 2018 Data are from the Multiple Cause of Death Files, 1999–2017, as compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program. Accessed at http://wonder.cdc.gov/ucd-icd10.html on June 20, 2019. 6:24:28 AM, (1999). [Google Scholar]
  • [31].Collier N, North M, Parallel agent-based simulation with Repast for High Performance Computing, SIMULATION. 89 (2013) 1215–1235. 10.1177/0037549712462620. [DOI] [Google Scholar]
  • [32].Carnell R, lhs R package: Latin Hypercube Samples, 2019. https://github.com/bertcarnell/lhs. [Google Scholar]
  • [33].Stein M, Large Sample Properties of Simulations Using Latin Hypercube Sampling, Technometrics. 29 (1987) 143–151. 10.1080/00401706.1987.10488205. [DOI] [Google Scholar]
  • [34].Fenton M, McDermott J, Fagan D, Forstenlechner S, O’Neill M, Hemberg E, PonyGE2: Grammatical Evolution in Python, Proc. Genet. Evol. Comput. Conf Companion - GECCO 17; (2017) 1194–1201. 10.1145/3067695.3082469. [DOI] [Google Scholar]
  • [35].Deb K, Pratap A, Agarwal S, Meyarivan T, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput 6 (2002) 182–197. 10.1109/4235.996017. [DOI] [Google Scholar]
  • [36].Gell L, Bühringer G, McLeod J, Forberger S, Holmes J, Lingford-Hughes A, Meier P, What Determines Harm from Addictive Substances and Behaviours?, Oxford University Press, 2016. [Google Scholar]
  • [37].Staff J, Greene KM, Maggs JL, Schoon I, Family transitions and changes in drinking from adolescence through mid-life: Family transitions and drinking in adulthood, Addiction. 109 (2014) 227–236. 10.1111/add.12394. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [38].Lee MR, Chassin L, MacKinnon DP, Role Transitions and Young Adult Maturing Out of Heavy Drinking: Evidence for Larger Effects of Marriage Among More Severe Premarriage Problem Drinkers, Alcohol. Clin. Exp. Res 39 (2015) 1064–1074. 10.1111/acer.12715. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [39].Cooper ML, Motivations for alcohol use among adolescents: Development and validation of a four-factor model., Psychol. Assess 6 (1994) 117–128. 10.1037/1040-3590.6.2.117. [DOI] [Google Scholar]
  • [40].Nicolau M, Agapitos A, Understanding Grammatical Evolution: Grammar Design, in: Ryan C, O’Neill M, Collins J (Eds.), Handb. Gramm. Evol, Springer International Publishing, Cham, 2018: pp. 23–53. 10.1007/978-3-319-78717-6_2. [DOI] [Google Scholar]
  • [41].Hildebrandt T, Branke J, On Using Surrogates with Genetic Programming, Evol. Comput. 23 (2015) 343–367. 10.1162/EVCO_a_00133. [DOI] [PubMed] [Google Scholar]
  • [42].Zaefferer M, Stork J, Flasch O, Bartz-Beielstein T, Linear Combination of Distance Measures for Surrogate Models in Genetic Programming, in: Auger A, Fonseca CM, Lourenço N, Machado P, Paquete L, Whitley D (Eds.), Parallel Probl. Solving Nat. – PPSN XV, Springer International Publishing, Cham, 2018: pp. 220–231. 10.1007/978-3-319-99259-4_18. [DOI] [Google Scholar]
  • [43].Collins A, Petty M, Vernon-Bido D, Sherfey S, A Call to Arms: Standards for Agent-Based Modeling and Simulation, J. Artif. Soc. Soc. Simul 18 (2015) 12. [Google Scholar]
  • [44].Wilnesky Uri, NetLogo, Center of Connected Learning and Computer-Based Modeling, Norwestern University, Evanston, IL, 1999 http://ccl.northwestern.edu/netlogo/. [Google Scholar]
  • [45].North MJ, Collier NT, Ozik J, Tatara ER, Macal CM, Bragen M, Sydelko P, Complex adaptive systems modeling with Repast Simphony, Complex Adapt. Syst. Model 1 (2013) 3 10.1186/2194-3206-1-3. [DOI] [Google Scholar]
  • [46].Grimm V, Berger U, DeAngelis DL, Polhill JG, Giske J, Railsback SF, The ODD protocol: A review and first update, Ecol. Model. 221 (2010) 2760–2768. 10.1016/j.ecolmodel.2010.08.019. [DOI] [Google Scholar]
  • [47].Bauer B, Müller JP, Odell J, Agent uml: a formalism for specifying multiagent software systems, Int. J. Softw. Eng. Knowl. Eng 11 (2001) 207–230. 10.1142/S0218194001000517. [DOI] [Google Scholar]
  • [48].Bersini H, UML for ABM J Artif. Soc. Soc. Simul 15 (2011) 9. [Google Scholar]
  • [49].Vu TM, Wagner C, Siebers P-O, ABOOMS: Overcoming the Hurdles of Continuous-Time Public Goods Games with a Simulation-Based Approach, J. Artif. Soc. Soc. Simul 22 (2019) 7. [Google Scholar]
  • [50].Muelder H, Filatova T, One Theory - Many Formalizations: Testing Different Code Implementations of the Theory of Planned Behaviour in Energy Agent-Based Models, J. Artif. Soc. Soc. Simul 21 (2018) 5. [Google Scholar]
  • [51].Merton RK, On theoretical sociology, Free Press; New York, 1967. [Google Scholar]
  • [52].Coleman JS, Social Theory, Social Research, and a Theory of Action, Am. J. Sociol 91 (1986) 1309–1335. 10.1086/228423. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary material A – The evolution of the grammar

Supplementary Material A – The evolution of the grammar contains three iterations of the grammar with change logs.

Supplementary material B – The structures on the Pareto front and corresponding theoretical credibility

Supplementary Material B – The structures on the Pareto front and corresponding theoretical credibility.

RESOURCES