Chemical risk assessment relies on knowledge of hazard, the dose–response relationship, and exposure to characterize potential risks to public health and the environment. A chemical with minimal toxicity might pose a risk if exposures are extensive, repeated, and/or occurring during critical windows across the human life span. Exposure assessment involves understanding human activity, and this activity is confounded by interindividual variability that is both biological and behavioral. Exposures further vary between the general population and susceptible or occupationally exposed populations. Recent computational exposure efforts have tackled these problems through the creation of new tools and predictive models. These tools include machine learning to draw inferences from existing data and computer-enhanced screening analyses to generate new data. Mathematical models provide frameworks describing chemical exposure processes. These models can be statistically evaluated to establish rigorous confidence in their predictions. The computational exposure tools reviewed here are oriented toward ‘high-throughput’ application, that is, they are suitable for dealing with the thousands of chemicals in commerce with limited sources of chemical exposure information. These new tools and models are moving chemical exposure and risk assessment forward in the 21st century.
1). Introduction
Sources of chemical emissions surround us, including emissions from industry, building materials, furnishings, and consumer products. Given the multiplicity of domestic and occupational human activities, there is a wide diversity of chemical exposures across individuals and the human life span. In ‘Risk Assessment in the Federal Government,’ the U.S. National Research Council delineated three aspects that must be considered when assessing chemical risk: toxicological hazard, biological dose–response, and exposure [102]. Toxicological hazard is the potential of a chemical to cause a specific adverse effect at some dose; dose–response information characterizes the intensity and duration of doses needed to cause an adverse effect in an organism. Both toxicity and dose–response are functions of how chemicals move through and perturb the body [1]. Exposure is a measure of the amount of a chemical that reaches an individual, and dose is the mass of a chemical that enters an organism over time by one or more routes of exposure [177]. Although traditional exposure assessment methods have been successful at addressing individual chemicals and specific scenarios [49], there remains a significant backlog of chemical exposures that have not yet been addressed [13, 38].
New approach methodologies (NAMs) are being developed to address and prioritize data needs for preliminary determination of the risk posed by chemicals to the public health [*75, 149]. We consider NAMs to broadly include new experimental, in silico, and informatic approaches that can rapidly inform chemical risk assessments [44, *75]. Exposure NAMs have been envisioned variously and under many names [6, 8, 31, 39, 104, 108, 142]. Here, we review examples of exposure NAMs largely surrounding research currently being conducted at or in collaboration with the U.S. Environmental Protection Agency (EPA)’s ExpoCast (‘Exposure Forecasting’) project [25, 132] and several collaborators, although additional important examples from other agencies are provided, including Health Canada, the European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC), and the European Network of Reference Laboratories, Research Centres and Related Organisations (NORMAN).
As illustrated in Figure 1, exposure of humans or ecological receptors results from a complex web of pathways. Here, we review advances in high-throughput exposure NAMs. We start with an overview of the broad types of relevant exposure pathways. We then detail specific NAMs that have been developed to better characterize these pathways, noting where improvements are needed. We then describe how these NAMs can be combined with hazard information to set chemical priorities, including the identification of susceptible or highly exposed populations. Together, these new approaches allow greater certainty about our environment and its impact on public health [149].
Figure 1:
Chemical exposure arises from a diversity of pathways that involve human interactions and physical processes.
2). Exposure Pathways
As depicted in Figure 1, chemical exposure impacts both public health and ecological endpoints. Human chemical exposures can be coarsely grouped into “near-field” sources in the home or at work (i.e., occupational sources) and “far-field” sources, wherein individuals are exposed to chemicals that were released or used some distance away from the individual [3, 67]. Evaluation of near-field domestic chemical sources has emerged as a prominent topic in exposure research [63, 159]. Humans are exposed to these sources via consumer and occupational exposure pathways (Figure 1). However, describing far-field chemical releases and the resulting “ambient” exposures to humans who are not consumers or workers (i.e., the general population) remains critical to chemical risk assessment [47]. The methods described here are focused on public health, but many of the same tools can be and have been adapted to address the ecological impact of chemicals as well. We divide human exposure into the following three broad categories.
Far-Field Sources of Chemical Exposure
Understanding chemical risk to public health requires detailed knowledge of background concentrations in ambient environments and the resulting exposure levels. Far-field sources include discharges to air, water, and soil by industrial activities and releases from end-of-life disposal of consumer products. Ambient exposure to chemicals from these sources occurs from inhalation of dust and vapors; dermal contact with water, soils, and dusts; and ingestion of food, water, and dust [111]. The parent compound released into the environment may itself be a toxicant of concern, or it may be a precursor to a toxicant that forms from the parent after release. Substances may be transformed by abiotic processes (for example, photolysis and hydrolysis) and biotic processes (for example, microbial metabolism). The transformation of substances occurs almost everywhere, so that ‘transport’ (movement of a chemical in the environment) and ‘fate’ (transformation of the chemical) are linked, with fate being understood to include kinetics and equilibria. Thermodynamics underlies the chemical transformation of pollutants and their precursors. For example, raising the temperature of a system generally increases the rates of its transformation reactions.
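The temperature dependence of transformation rates noted above is commonly described by the Arrhenius equation, k = A·exp(−Ea/(R·T)). The following minimal sketch illustrates that relationship only; the pre-exponential factor and activation energy are hypothetical values chosen for illustration and are not taken from any study cited here.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def arrhenius_rate(pre_exp_factor, activation_energy_j_mol, temp_kelvin):
    """Rate constant from the Arrhenius equation k = A * exp(-Ea / (R * T))."""
    if temp_kelvin <= 0:
        raise ValueError("temperature must be in kelvin and positive")
    return pre_exp_factor * math.exp(-activation_energy_j_mol / (R * temp_kelvin))


# Hypothetical parameters for an abiotic degradation reaction (assumed values).
A_ASSUMED = 1.0e10      # pre-exponential factor, 1/s
EA_ASSUMED = 80_000.0   # activation energy, J/mol

k_15c = arrhenius_rate(A_ASSUMED, EA_ASSUMED, 288.15)
k_25c = arrhenius_rate(A_ASSUMED, EA_ASSUMED, 298.15)
print(f"k(25 C) / k(15 C) = {k_25c / k_15c:.2f}")
```

With these assumed parameters, a 10 °C increase raises the rate constant severalfold; real compounds and media will differ.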
The parent compound’s physical-chemical properties are the first determinant of how rapidly it will be transformed. These transformation processes parallel absorption, distribution, metabolism and excretion that take place inside an organism, albeit at different rates [90, 113, 144]. Environmental transport and fate processes involve the interrelationships between and within food webs, including humans [53]. The transfer of mass and energy up and down levels of biological organization can be measured and predicted. However, these data are not generally available for most compounds [38]. Currently, the international Human Toxicity Task Force recommends additional research to collect these data and reduce model and data uncertainties [165].
Near-Field Sources of Consumer Exposure
In contrast to “far-field” sources, “near-field” consumer chemical sources include consumer products, household furnishings, and building materials. Exposure to near-field sources may occur directly, as when consumer products are applied to the body, or indirectly via contact with contaminated residential air, dust, and/or surfaces. Volatile organic compounds (VOCs) and semi-volatile organic compounds (SVOCs) such as phthalates, flame retardants (FRs), and polychlorinated biphenyls (PCBs) are all classes of chemicals that have been found in indoor air, dust, and/or surfaces. As with far-field sources, however, the necessary exposure data are lacking [38]. Physical-chemical properties, information on identities and concentrations of chemicals present in consumer products or durable articles, and knowledge of how humans interact with chemicals in their environments are all examples of data required for exposure estimation that are often unknown.
Because adults spend roughly 80% of the day indoors [154] and have direct contact with many consumer products, exposures to near-field sources can contribute substantially to total exposure [157]. Using statistical analyses of biomonitoring data from the National Health and Nutrition Examination Survey (NHANES), Wambaugh et al. [159, 160] have demonstrated that use of a chemical in a near-field context is a predictor of higher levels of chemical intake. Efforts within EPA’s ExpoCast project and elsewhere have focused on compiling qualitative [33] and quantitative [*32, 52, 64, 101] data on the use of chemicals in consumer products. Chemicals may be used in multiple product types in different amounts. Therefore, characterizing the breadth of uses and the associated concentrations is critical to accurately assess potential aggregate exposure. While useful concentration information has been gleaned from Safety Data Sheets (SDS) [*32, 52, 64, 101] and reported ingredient lists [65] for consumer product formulations, these types of data sources do not include all chemicals in products. Undisclosed compounds may legally include chemicals present in small amounts (for example, <0.1 percent), proprietary ingredients, low-level contaminants of ingredients, components of ingredients that are themselves mixtures (for example, fixatives in fragrances) [*115], or degradation products. The composition of durable consumer goods (i.e., articles of commerce) and building materials proves to be particularly challenging to characterize because ingredient lists or SDS are not required. Therefore, analytical measurement of the composition of relevant items and materials is still required to augment reported chemical data [7, 72, 77, *115, 174].
The presence of a chemical in, for example, a consumer product is a prerequisite for chemical exposure. However, chemicals must also be emitted from the matrix for exposure to occur [69, 83, 86, 175]. Emission properties can be understood using mass transfer models that describe the exchange of VOCs and SVOCs between sources, air, house dust, and interior surfaces in residential environments. To use these models, many empirical quantities must be known, including adsorption/desorption rate constants, material/air partition coefficients, solid-phase diffusion coefficients, mass-transfer coefficients, and initial material-phase concentrations. The development of methods to measure emissions and determine these model parameters is essential to estimating indoor exposure. Relevant methodologies rely on a variety of emission chambers [151]. General principles and guidelines for expanding the overall experimental approaches and standardizing indoor exposure product testing protocols have been identified [151] to assess chemical exposures from indoor sources. The collected data and application of measurement-based modeling tools can then be used to refine and evaluate computational exposure models for rapidly predicting human exposures to chemicals [12].
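As a rough illustration of how such mass-transfer parameters enter an emission model, the sketch below solves a steady-state mass balance for the gas-phase concentration in a single well-mixed room with one emitting surface and ventilation as the only loss. This simplified form and all parameter values are assumptions made for illustration; it ignores sorption to dust and interior surfaces and is not the specific model of any study cited here.

```python
def steady_state_air_conc(h_m_per_h, area_m2, y0_ug_m3, vent_m3_per_h):
    """Steady-state room air concentration (ug/m3) in a well-mixed room.

    Mass balance: emission h*A*(y0 - y) equals ventilation loss Q*y,
    which rearranges to y = h*A*y0 / (h*A + Q).

    h_m_per_h     : convective mass-transfer coefficient (m/h)
    area_m2       : emitting surface area (m2)
    y0_ug_m3      : gas-phase concentration adjacent to the source (ug/m3)
    vent_m3_per_h : ventilation rate (m3/h)
    """
    if min(h_m_per_h, area_m2, y0_ug_m3) < 0 or vent_m3_per_h <= 0:
        raise ValueError("inputs must be non-negative and ventilation positive")
    ha = h_m_per_h * area_m2
    return ha * y0_ug_m3 / (ha + vent_m3_per_h)


# Hypothetical inputs chosen only to exercise the function.
y_room = steady_state_air_conc(
    h_m_per_h=1.5, area_m2=20.0, y0_ug_m3=1.0, vent_m3_per_h=30.0
)
print(f"steady-state air concentration: {y_room:.2f} ug/m3")
```

In this form the room concentration can never exceed the source-adjacent concentration y0, which is one quick plausibility check on any parameter set.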
Occupational Exposure
Occupational exposures to chemicals are estimated to cause at least 290,000 deaths globally every year [84]. The primary occupational exposure route is inhalation, followed by dermal exposure [111]. There is both great variability across time in the occupational environment for any one individual and significant differences in exposures between individuals performing different jobs within the same workplace [111]. Although public tools exist for performing occupational exposure assessment (for example, the Chemical Screening Tool for Exposures and Environmental Releases (ChemSTEER) [30]), these tools are not yet capable of rapidly running large numbers of chemicals with minimal information. Similarly, while data are available on occupational exposure (such as from the U.S. Occupational Safety and Health Administration, https://www.osha.gov/opengov/healthsamples.html), the interfaces and data are focused on specific chemicals and scenarios. Although attempts are being made to enhance the speed of occupational exposure assessment [76], the models and informatics needed to rapidly address thousands of chemicals in occupational exposure scenarios are largely missing [147].
3). NAMs for Exposure
Here, we consider NAMs to broadly include advances in both computational and measurement approaches that can rapidly inform chemical risk assessments (European Chemicals Agency (ECHA), 2016) [*75]. Exposure measurements collected in different settings such as homes and workplaces can characterize exposure and allow model evaluation [2, 35, 63, 159]. However, collecting exposure measurements in real-world settings is resource intensive because of the costs associated with chemical analysis, maintaining a cohort of participants, training field personnel, and designing and maintaining sampling equipment. “Computational exposure” science provides tools to supplement more traditional approaches [39]. Some of these approaches are purely computational, such as machine learning, database development, and sophisticated algorithms (for example, frequent itemset mining [55]). However, errors and other limitations of the available data and models can result in incorrect exposure model selection and consequently increase uncertainty in exposure assessment [134]. Other NAMs include more traditional scientific techniques enhanced by modern computational power (for example, suspect-screening mode mass spectrometry).
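To give a flavor of frequent itemset mining, the minimal sketch below counts how often pairs of ingredients co-occur across product ingredient lists and keeps the pairs that meet a minimum support threshold. The product data are invented for illustration, and this brute-force pair count is a stand-in for full algorithms such as Apriori or FP-growth; it is not the method used in the cited work.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(products, min_support):
    """Return {pair: support} for ingredient pairs that appear together in
    at least `min_support` (a fraction between 0 and 1) of the products."""
    counts = Counter()
    for ingredients in products:
        # De-duplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(ingredients)), 2):
            counts[pair] += 1
    n = len(products)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Invented ingredient lists for four hypothetical products.
PRODUCTS = [
    ["water", "glycerin", "limonene"],
    ["water", "glycerin", "ethanol"],
    ["water", "limonene"],
    ["ethanol", "limonene"],
]

pairs = frequent_pairs(PRODUCTS, min_support=0.5)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

Pairs that clear the support threshold are candidates for chemicals to which consumers may be co-exposed through a single product.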
Exposure Measurement
Experimental measurements are the foundation of any science. Measurements of exposure can identify unique scenarios or sources that may not have been previously anticipated. For example, an important residential source of PCBs, namely wood floor finish, was only identified through deeper investigation of uniquely high house dust and indoor air concentrations of PCBs in some residences enrolled in the Cape Cod Household Exposure Study [124]. Furthermore, biomonitoring revealed skin-lightening cream as an important and unrecognized source of inorganic mercury among the most highly exposed [93].
NAMs for exposure measurement include suspect screening and nontargeted mass spectrometry analysis [*143]. These advanced mass spectrometry techniques for chemical screening and identification commonly use high-resolution mass spectrometry, fundamental chemistry concepts, and extensive chemical reference lists to simultaneously identify hundreds or thousands of substances within a sample [*153]. These methods hold promise for filling gaps in reported chemical data. Suspect screening mass spectrometry has allowed researchers to confirm the presence or absence of hundreds to thousands of known chemicals in dust [97, 119], water [98, *106, 130], consumer products [*115], and serum [162]. To facilitate the potential use of measurement NAMs in decision-making, EPA’s Non-Targeted Analysis Collaborative Trial (ENTACT) is working to characterize the performance of different approaches, establish benchmarks, and develop reporting standards [*139, *153]. In Europe, the NORMAN network has already made use of suspect screening analysis to detect emerging drinking water contaminants [130], while in the United States, these methods have also allowed the detection of previously unknown chemicals such as those recently introduced into commerce [88, *106].
Exposure biomonitoring involves measuring parent chemicals and/or their metabolites in accessible biological media (for example, blood, urine) and using these measurements as “exposure biomarkers” [103]. Biomonitoring surveys of human populations have provided an invaluable resource for inferring chemical exposures, although they are not without their limitations [6, 137]. Exposures must be inferred from biomarkers [140, 146] but research frameworks have been proffered to realize the many benefits of biomonitoring data for exposure and risk assessors [78, 142]. First, biomarkers aggregate exposure across all relevant routes (for example, ingestion, inhalation, dermal contact). Thus, a single biomarker measure can reflect the culmination of multiple exposure scenarios and provide a solid foundation for exposure model evaluation [*120, 159, 160]. Biomarker variability over time is often less than actual exposure variability [5, 85], and so a modest number of measurements may accurately serve as surrogate exposure measures in observational studies [96, 141]. Concurrent measurement of multiple exposure biomarkers from the same individual can measure the variability in time of exposures in the surveyed individuals [22] and can characterize correlations within and co-occurrence between exposures [*73]. Finally, exposure biomarkers can be measured concurrently with effect biomarkers using a single biological specimen [8].
Advances in laboratory techniques are also expanding the chemical space coverage of biomonitoring studies, leading to the discovery of contaminants of emerging concern [57]. Sobus et al. [*143] present a framework to integrate results from exposome [171] studies with high-throughput exposure (HTE) estimates, particularly how presence/absence and semi-quantitative information is useful in HTE prediction, where estimates with orders-of-magnitude uncertainty are still more useful than a complete lack of data. The framework describes the interrelationships [119] between screening-level mass spectrometry and efforts such as ExpoCast and ToxCast (the EPA’s Toxicity Forecaster Project [34]). For example, screening data can be used to calibrate models that predict exposure (see Figure 2), and these data are viewed as “one innovative approach for identifying and setting priorities among chemicals for additional exposure assessment, hazard testing, and risk assessment…” [*100]. As these new methods begin to inform public health risk assessment, ongoing efforts to characterize, standardize, and report the confidence of these data remain extremely important [129, *139, *153].
Figure 2.
Systematic empirical evaluation of models (SEEM) uses intake rates inferred from biomonitoring data to evaluate and calibrate exposure predictors across as many chemicals as possible. Exposure predictors include HTE models for predicting intake rates as well as the presence of chemicals on various lists (such as high production or banned). SEEM provides a quantitative estimate of uncertainty. HTE, high-throughput exposure.
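The calibration idea summarized in the Figure 2 caption can be sketched, in a deliberately simplified form, as a regression of biomonitoring-inferred intake rates on model-predicted intake rates in log space, with the residual scatter serving as a crude uncertainty estimate. The example below uses ordinary least squares and invented numbers; it is an assumption-laden stand-in and does not reproduce the statistical framework of the SEEM publications.

```python
import math


def fit_log_calibration(predicted, inferred):
    """Ordinary least squares of log10(inferred) on log10(predicted).

    Returns (intercept, slope, residual_sd), all in log10 units.
    """
    x = [math.log10(v) for v in predicted]
    y = [math.log10(v) for v in inferred]
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need matched values for at least three chemicals")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_sd = math.sqrt(sum(r * r for r in resid) / (n - 2))
    return intercept, slope, residual_sd


# Invented intake rates (mg/kg/day) for four hypothetical chemicals.
MODEL_PREDICTED = [1e-6, 1e-5, 1e-4, 1e-3]
BIOMONITORING_INFERRED = [2e-5, 5e-5, 2e-3, 5e-3]

intercept, slope, residual_sd = fit_log_calibration(
    MODEL_PREDICTED, BIOMONITORING_INFERRED
)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, residual SD={residual_sd:.2f}")
```

The fitted line can then be applied to chemicals that have model predictions but no biomonitoring data, with the residual standard deviation indicating how wide the resulting prediction interval should be.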
The analysis of multiple biomarkers along an adverse outcome pathway enables molecular epidemiology investigations that can yield biomarker-based reference values for target chemicals [184]. Given the advantages of biomarkers, numerous national monitoring programs and large cohort studies have been deployed. Many of these programs routinely measure hundreds of biomarkers and aim to establish population reference ranges, track exposure trends, and examine links between exposure and disease [17, 79]. Guidance exists to enhance the utility of nationally representative study data for risk assessments [138]. However, newly discovered chemicals or chemicals with higher than expected exposures need to become priorities for toxicological screening [*143]. A recent workshop organized by the Helmholtz Centre for Environmental Research investigated ‘integrating the exposome approach with the adverse outcome pathway’ concept to better inform public health decision-making with respect to chemicals in the environment [43].
Toxicokinetics and Exposure Reconstruction
Toxicokinetics describes internal exposure, including the absorption, distribution, metabolism, and excretion of chemicals by the body, and is important for understanding risk to public health [1, 26]. With toxicokinetic information, it is possible to predict internal tissue dosimetry via “forward” modeling, assuming a certain external exposure. Alternatively, internal concentrations (such as from biomonitoring) can be used to predict an external exposure that would result in these internal concentrations (that is, “reverse dosimetry”) [50, 146]. Because most chemicals lack information on toxicokinetics [168, 170], high-throughput toxicokinetics (HTTK) methods are needed. A series of studies have established HTTK methods for using in vitro assays to provide chemical-specific information allowing estimation of toxicokinetics, often within a factor of three [123, 158, 161, 164, 168, 169, 170, 178].
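Under the simplifying assumption of linear toxicokinetics at steady state, forward and reverse dosimetry reduce to multiplying or dividing by a chemical-specific steady-state plasma concentration per unit dose. The sketch below illustrates only that arithmetic; the function names and the numerical value are hypothetical, and it does not reproduce any published HTTK implementation.

```python
def forward_dosimetry(dose_mg_kg_day, css_per_unit_dose):
    """Steady-state plasma concentration (mg/L) for a given external dose,
    assuming concentration scales linearly with dose."""
    return dose_mg_kg_day * css_per_unit_dose


def reverse_dosimetry(plasma_conc_mg_l, css_per_unit_dose):
    """External dose (mg/kg/day) that would produce the observed steady-state
    plasma concentration under the same linear assumption."""
    if css_per_unit_dose <= 0:
        raise ValueError("css_per_unit_dose must be positive")
    return plasma_conc_mg_l / css_per_unit_dose


# Hypothetical chemical: 2 mg/L in plasma per 1 mg/kg/day of exposure.
CSS_PER_UNIT_DOSE = 2.0

plasma_conc = forward_dosimetry(0.01, CSS_PER_UNIT_DOSE)
reconstructed_dose = reverse_dosimetry(plasma_conc, CSS_PER_UNIT_DOSE)
print(plasma_conc, reconstructed_dose)
```

Because the two functions are exact inverses under this assumption, any error in the per-unit-dose value carries directly into the reconstructed exposure.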
Toxicokinetics is heavily influenced by biological variability [66]. Quantifying human biological variability allows the development of HTTK models that can rapidly characterize the range and distribution of external and internal (tissue) exposures across individuals, allowing better identification of potentially sensitive subpopulations [*121, 167]. Data and tools to quantify human variability in exposure are increasingly available and accessible, for example, the European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC) Human Exposure Assessment Tools Database (heatDB) (https://heatdb.cremeglobal.com/). In addition, the open-source R package httk [112] contains a module that uses NHANES data to simulate population physiological variability for use with HTTK modeling [121]. This tool simulates key aspects of biological variability in a correlated fashion, allowing the identification of the 95th percentile most “sensitive” individuals who experience higher plasma concentrations for the same exposure. These individuals require a lesser exposure to produce a given tissue concentration [166].
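The idea of identifying an upper-percentile “sensitive” individual can be illustrated with a small Monte Carlo sketch. Here clearance is drawn independently from a lognormal distribution, and steady-state plasma concentration is the dose rate divided by clearance. This is a simplifying assumption: it does not capture the correlated physiological sampling described above, and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_css(n, dose_mg_kg_day, median_cl_l_kg_day, gsd, seed=0):
    """Sample steady-state plasma concentrations (mg/L) for n individuals.

    Clearance is lognormal with the given median and geometric standard
    deviation (assumed); Css = dose rate / clearance.
    """
    rng = random.Random(seed)
    mu, sigma = math.log(median_cl_l_kg_day), math.log(gsd)
    return [dose_mg_kg_day / rng.lognormvariate(mu, sigma) for _ in range(n)]


# Hypothetical population: median clearance 5 L/kg/day, GSD of 2.
css = simulate_css(n=10_000, dose_mg_kg_day=1.0, median_cl_l_kg_day=5.0, gsd=2.0)
css_median = statistics.median(css)
# quantiles(..., n=20) returns 19 cut points; the last is the 95th percentile.
css_p95 = statistics.quantiles(css, n=20)[-1]
print(f"median Css = {css_median:.3f} mg/L, 95th percentile = {css_p95:.3f} mg/L")
```

The ratio of the 95th percentile to the median indicates how much lower an exposure the more sensitive individuals would need to reach the same plasma concentration.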
However, additional research is needed to reduce the uncertainty in the application of HTTK to exposure reconstruction [*135, 158, 161].
Humans also vary biologically with respect to their microbiome [37], which can significantly affect toxicokinetics [*183]. The microbiome is composed of trillions of microbial cells [131] that collectively expand the metabolic capacity of the cooperative unit. A majority of bacteria reside in the intestinal tract [131], where they can interact with dietary compounds, pharmaceuticals, environmental chemicals, and xenobiotics present in consumer products [*19, 21, 29, 60, 114, 145]. While intestinal microbiota have increasingly been linked to the efficacy and toxicity of pharmacological compounds [173], the interaction between microbiota and environmental pollutants remains poorly understood. This is because it is difficult to separate host metabolism from microbial metabolism. A recent report from the National Academies of Sciences, Engineering, and Medicine highlighted the need for risk assessment approaches to include an evaluation of the interaction between environmental chemicals and the microbiome [99]. If the impacts of host-associated microbes on tissue concentrations of chemicals can be further demonstrated, and the variation between human populations can be broadly characterized (for example, heavy antibiotic users, diets rich in fat and preservatives), then models may eventually be used to better characterize human exposure to chemicals [105].
High-Throughput Exposure (HTE) Models
Although biomonitoring and other exposure measurement studies may cover dozens or even hundreds of analytes, with tens of thousands of chemicals in commerce [155] a large number of known and unknown chemical–media combinations might best be addressed using models. These models can screen potential sources and routes of exposure to identify relevant chemicals beyond the usual suspects (for example, persistent organic pollutants, phthalates, and brominated flame retardants). While exposure modeling is a well-established field, newer ‘high-throughput’ models are a class of exposure NAMs whose development is ongoing. To be considered an HTE model, a model must:
be applicable to and capable of handling many chemicals with minimal descriptive information [3, 63, 122, 159];
cover one or more relevant exposure routes (for example, inhalation, food ingestion, mouthing, and dermal contact) and sources (for example, industrial and residential use), accounting for the influential parameters relevant for the considered pathways [51, 179];
allow for integration with models for other pathways [47, 118, *120];
be scientifically plausible, respecting mass-balance principles and accounting for competing processes (for example, volatilization versus dermal uptake) [90];
allow for the assessment of interindividual and intraindividual variation in exposure and impact of such variation on acute and chronic doses as the required input data becomes available [5, 117, *121];
be amenable to integration within statistical frameworks that quantify uncertainty for propagation into risk evaluations [2, *120, 159, 160, 168]; and
remain parsimonious, that is, no more complicated than necessary to reflect the data [20, 23, 46, 56].
This list was adapted from the study by Huang and Jolliet [59].
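The mass-balance criterion with competing processes can be illustrated with parallel first-order losses acting on a chemical applied to skin. If each process removes mass at a rate proportional to the mass remaining, the fraction ultimately removed by each process is its rate constant divided by the sum of all rate constants. The sketch below assumes first-order kinetics and uses hypothetical rate constants; it is not drawn from any of the cited models.

```python
def competing_fractions(rate_constants):
    """Split an applied mass among parallel first-order loss processes.

    Process i removes mass at rate k_i * M, so the fraction ultimately
    removed by process i is k_i / sum(k). The fractions sum to 1, which
    is the mass-balance check.
    """
    if any(k < 0 for k in rate_constants.values()):
        raise ValueError("rate constants must be non-negative")
    total = sum(rate_constants.values())
    if total <= 0:
        raise ValueError("at least one rate constant must be positive")
    return {name: k / total for name, k in rate_constants.items()}


# Hypothetical rate constants (1/h) for a fairly volatile chemical on skin.
fractions = competing_fractions({"volatilization": 0.9, "dermal_uptake": 0.1})
print(fractions)
```

Ignoring the competing volatilization term here would attribute the entire applied mass to dermal uptake, a tenfold overestimate for these assumed values.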
A series of HTE models have been developed for far-field environmental emissions such as primary and secondary fine particulates [9, 148], and for persistent organic pollutants [3, 94, 128, *165]. HTE models have also been constructed for the indoor environment [63, 81, 133], accounting for consumer product use and chemical release types and routes such as inhalation, dust ingestion, and dermal exposure. To further account for direct contact between household products and users, some product-specific models have been made available for cosmetics [28] or food contact materials [12, 42]. Many HTE models consider the fraction of the chemical inside a product that is taken in by users and the overall population [59, 70].
HTE models can provide rough but quantitative estimates of exposure that can then be combined with quantitative estimates of toxicity [*121, *135, 168]. At the same time, substantial work is needed to consolidate knowledge and improve model accuracy for exposure pathways that are relatively poorly covered by high-throughput models, such as mouthing, nondietary dust ingestion, air-to-skin dermal uptake, and occupational exposure [76].
Exposed Populations in HTE Models
Human behavior is complex, and patterns such as consumer product use are difficult to quantify. However, it is important to continue to improve the description of these behaviors in HTE models to provide the means to model population variability in exposure to chemicals. In general, these models use Monte Carlo (MC) simulation to draw from distributions for relevant model parameters based on substantial amounts of data regarding behavior. For example, the Stochastic Human Exposure and Dose Simulation High-Throughput (SHEDS-HT) model for residential and dietary exposures includes demographic data from the U.S. Census, dietary intake survey data from NHANES, and a large database of daily activity diaries covering more than 54,000 individual day entries [63]. Other examples are the Creme Care model of consumer product exposures and the Creme RIFM model of fragrance exposures, which include a database of detailed consumer survey data collected from more than 36,000 consumers in the European Union and United States, characterizing variability in habits and practices regarding use of cosmetics, personal care products, and air freshener products [127]. Finally, the Probabilistic Aggregate Consumer Exposure Model (PACEM) for exposures to substances in personal care products includes use frequency and amount data from a survey of 512 Dutch adults [36].
An alternative approach to relying on survey data for human behavior is to model patterns of human behavior over time (longitudinal patterns). These models begin by defining the variation in the characteristics of the exposed population. An individual’s fixed characteristics are defined and used to simulate the individual’s behavior over time [117]. The simulation can be performed by combining records of daily activity patterns from multiple individuals [62], linking data from multiple surveys [180], or using agent-based modeling [*15].
For any of these models, key parameters defining consumer product use patterns must be identified, for example, the percent of the population using the product (prevalence), frequency of use of the product, amount of product per use (mass), and duration of use. In addition, for some product categories, differentiation in use patterns by age, gender, ethnicity, or socioeconomic status may be appropriate and desired. In past studies, necessary input parameters to establish consumer product use patterns have been identified for individual product categories using a variety of sources, as no one established and authoritative source exists for data on consumer product use patterns. EPA’s Chemical and Products Database (CPDat) allows investigation of how and where chemicals are used [*32]. A handful of other sources also provide aggregated data [16, 27, 150]. However, there remain many product categories where use pattern data are extremely limited or in some cases nonexistent (for example, many arts and crafts, automotive care, and home maintenance related product categories), and in cases where data exist, they are often difficult to synthesize because findings are dependent on the survey design, population sampled, and method of data collection.
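The use-pattern parameters named above (prevalence, frequency, and mass per use) can be combined in a Monte Carlo simulation along the following lines. This is a minimal sketch under stated assumptions: a single product, a fixed use frequency, lognormal variability in amount per use with an assumed geometric standard deviation of 1.5, and invented parameter values. It is not the algorithm of SHEDS-HT or any other model cited here.

```python
import math
import random
import statistics


def simulate_daily_use(n, prevalence, uses_per_day, grams_per_use,
                       weight_fraction, seed=1):
    """Simulate chemical mass applied per person per day (g/day).

    Each simulated person is a product user with probability `prevalence`.
    For users, the amount per use varies lognormally around
    `grams_per_use` (assumed GSD of 1.5); non-users receive zero.
    """
    rng = random.Random(seed)
    sigma = math.log(1.5)  # assumed geometric standard deviation
    masses = []
    for _ in range(n):
        if rng.random() < prevalence:
            amount = grams_per_use * rng.lognormvariate(0.0, sigma)
            masses.append(uses_per_day * amount * weight_fraction)
        else:
            masses.append(0.0)
    return masses


# Hypothetical product: 40% of people use it twice daily, 1 g per use,
# containing 1% of the chemical by weight.
masses = simulate_daily_use(n=5000, prevalence=0.4, uses_per_day=2,
                            grams_per_use=1.0, weight_fraction=0.01)
user_share = sum(1 for m in masses if m > 0) / len(masses)
mean_mass = statistics.mean(masses)
print(f"users: {user_share:.1%}, population mean: {mean_mass:.4f} g/day")
```

Separating prevalence from amount per use matters because the population mean can be low even when individual users receive comparatively high doses.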
Chemical Descriptors
While the development of HTE models represents one class of NAMs, parallel advances in describing chemicals are a distinct class of NAMs. A key barrier to the use of HTE models is describing chemicals in a high-throughput manner, that is, how do we “parameterize” the models when we only know chemical structure [2]? Three types of input data describing each chemical are typically required for the HTE calculations: (1) physical-chemical properties, (2) how and how much chemical mass is applied, and (3) the characteristics of the exposed population [134].
Chemical Properties
Chemical structures and properties can be difficult to obtain; however, the EPA has compiled structures for more than 400,000 chemicals from lists with various degrees of curation (https://comptox.epa.gov/dashboard) [172]. While physical-chemical property measurements may not be available, properties can be predicted from structure [*92, 95, 125]. For chemicals that have the potential to dissociate (ionize) in ambient environments, most high-throughput exposure model applications to date have made predictions based only on the properties of the neutral form [2, 63, 159, 182]. Therefore, further measurements of chemical properties for these types of chemicals would improve exposure estimates [107, 144].
Chemical Release
Chemical mass released to the environment is important because exposure estimates from a single source are typically a direct linear function of the selected chemical mass applied, used, or released (Q), such that if the selected value for Q overestimates or underestimates the actual chemical use/application quantity by a factor of n, then the exposure estimates will be in error by the same factor of n [2]. However, much of the currently available data that can be used as surrogates for Q are either localized emissions or total (national) production volumes (TPVs). Several countries maintain some form of anthropogenic chemical release reporting [4, 41, 45, 152]. Production volume and industrial use data are often confidential or are only reported within large ranges that span several orders of magnitude for a given chemical. Furthermore, a portion of the produced volume of a chemical might be used as an intermediate that is not released into the environment. Imports into and exports out of the country are also difficult to capture with TPVs. Finally, TPV estimates are typically values from a single year rather than averages over multiple years. Thus, historical values that account for imports and exports and are reported within a narrow range would significantly reduce uncertainty in exposure predictions.
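The linear dependence on Q described above can be made concrete with a few lines of arithmetic. In this sketch the exposure per unit release is a single hypothetical constant standing in for everything else in an exposure model; the point is only that a factor-of-n error in Q passes through unchanged.

```python
def exposure_estimate(q_kg_per_day, exposure_per_unit_release):
    """Exposure that scales linearly with the released mass Q."""
    return exposure_per_unit_release * q_kg_per_day


# Hypothetical values: the constant and the release quantity are assumptions.
K_ASSUMED = 1e-9   # exposure (mg/kg/day) per kg/day released
Q_TRUE = 100.0     # actual release, kg/day
N = 10.0           # factor by which Q is overestimated

error_ratio = (exposure_estimate(N * Q_TRUE, K_ASSUMED)
               / exposure_estimate(Q_TRUE, K_ASSUMED))
print(f"exposure is overestimated by a factor of {error_ratio:g}")
```

When production volumes are reported only within ranges spanning several orders of magnitude, the same span therefore applies to the exposure estimate.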
A range of approaches have been developed to address chemical release data gaps, each with their own challenges and impacts on decision uncertainty. The approaches can be broadly categorized as “top-down” or “bottom-up”, although this terminology can vary in meaning. “Top-down” approaches include data mining [18], data mining with proxies [110], and inverse projection from environmental monitoring data [11]. Data mining can be a useful tool because it relies on existing data to model chemical releases by reconciling data in multiple reporting sources. This approach can be extended to groups of chemicals by using existing release data for a chemical as proxies for structurally similar chemicals. With inverse projection, releases are estimated by back-projecting chemical dispersion patterns from monitoring locations to known locations of emission point sources. “Bottom-up” approaches include process modeling and simulation [14, 30, 110, 136], production and consumption modeling [11, 82, 163, 181], and material (or substance) flow analysis [54, 80]. Process modeling and simulation can yield accurate release estimates provided sufficient detail is known about the activity being modeled. Production and consumption modeling uses chemical production or industrial use quantities and activity-specific emission factors to estimate releases. Substance flow analysis applies material balancing principles to the sequence of activities within a substance’s life cycle to determine the releases from each activity.
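As a rough illustration of the production and consumption approach, releases can be sketched as a quantity multiplied by an activity-specific emission factor. The function name, activity names, use fractions, and emission factors below are all hypothetical and are not drawn from the cited studies.

```python
def estimate_releases(production_kg, activities):
    """Bottom-up release estimate from use quantities and emission factors.

    `activities` maps an activity name to a pair
    (fraction of production used in that activity, emission factor).
    All values used in this example are hypothetical.
    """
    releases = {}
    for name, (use_fraction, emission_factor) in activities.items():
        if not (0.0 <= use_fraction <= 1.0 and 0.0 <= emission_factor <= 1.0):
            raise ValueError(f"fractions for {name!r} must lie in [0, 1]")
        releases[name] = production_kg * use_fraction * emission_factor
    return releases


# Hypothetical chemical: 1,000,000 kg/yr split across three activities.
activities = {
    "industrial_use": (0.3, 0.02),
    "consumer_use": (0.5, 0.4),
    "intermediate": (0.2, 0.0),   # consumed in synthesis, nothing released
}

releases = estimate_releases(1.0e6, activities)
total_release = sum(releases.values())

for name, kg in releases.items():
    print(f"{name}: {kg:.0f} kg/yr")
print(f"total: {total_release:.0f} kg/yr")
```

The intermediate-use entry shows why a production volume alone can overstate releases: mass that is consumed in synthesis contributes to the TPV but not to Q.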
The chemical mass applied, used, or released is directly related to how the chemical is used. Possible exposure scenarios include direct oral intake of a product (for example, toothpaste), contact with and contamination of food and beverages, direct dermal contact (for example, cosmetics applied to the skin), dermal contact with contaminants on indoor surfaces, and inhalation of contaminants in indoor and outdoor air [83, 86]. While the use scenarios, resulting exposure pathways, and relevant properties are available in some cases [*32], most often the necessary data are unknown. For example, the emission rate of a chemical from a product is critical but often unavailable information because it depends not only on the chemical but also on the properties of the matrix (product) itself [83, 86]. In these cases, the use of machine learning methods to predict chemical use from structure and properties can inform the potential exposure pathways in which a given chemical might be involved [*120].
Systematic Model Evaluation
To apply HTE models in a human health risk framework [*121, *135, 168], it is necessary to quantify the uncertainty in the HTE predictions [*120, 161]. One recent approach has been to treat chemicals for which monitoring data are available as representative of chemicals without such data. In this way, the uncertainty of HTE predictions for those chemicals may be estimated [109, 126, 159]. The predictions of models of human variability in exposure can be compared with population exposure biomonitoring data via toxicokinetic modeling, for purposes of evaluation and calibration [36, *120]. For example, the ExpoCast project has made use of the Systematic Empirical Evaluation of Models (SEEM) framework, as illustrated in Figure 2 [*120, 159, 160].
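The idea of using monitored chemicals to calibrate predictions for unmonitored ones can be sketched with an ordinary least squares fit on log-transformed values. This is a simplified stand-in and not the SEEM implementation; the function names and all data values are hypothetical.

```python
import math


def fit_calibration(log_pred, log_obs):
    """Ordinary least squares fit of observed on predicted (log10 units).

    Returns (slope, intercept, residual standard deviation).
    """
    n = len(log_pred)
    if n < 3 or n != len(log_obs):
        raise ValueError("need at least three paired values")
    mean_x = sum(log_pred) / n
    mean_y = sum(log_obs) / n
    sxx = sum((x - mean_x) ** 2 for x in log_pred)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_pred, log_obs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(log_pred, log_obs))
    resid_sd = math.sqrt(sse / (n - 2))
    return slope, intercept, resid_sd


def calibrated_interval(log_pred_new, slope, intercept, resid_sd, z=1.96):
    """Approximate 95% interval for a chemical without monitoring data.

    Ignores uncertainty in the fitted slope and intercept (a simplification).
    """
    center = intercept + slope * log_pred_new
    return center - z * resid_sd, center, center + z * resid_sd


# Hypothetical log10(mg/kg/day) values for chemicals with monitoring data.
model_predictions = [-7.0, -6.0, -5.0, -4.0, -3.0]
inferred_from_biomonitoring = [-6.5, -6.2, -5.1, -4.6, -3.6]

slope, intercept, resid_sd = fit_calibration(model_predictions,
                                             inferred_from_biomonitoring)
low, mid, high = calibrated_interval(-5.5, slope, intercept, resid_sd)
print(f"slope={slope:.2f}, residual SD={resid_sd:.2f} log10 units")
print(f"new chemical: {mid:.2f} [{low:.2f}, {high:.2f}]")
```

The residual scatter among the monitored chemicals sets the width of the interval assigned to the unmonitored chemical, which is the sense in which the monitored set is treated as representative.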
Machine Learning
One constant from exposure measurement, through modeling, to chemical properties is a lack of data. As existing data are compiled into databases and new data sets are generated, machine learning has become an increasingly common approach to fill remaining gaps and is therefore the final class of NAMs described here. At its most basic, machine learning is the development of mathematical models to understand data (VanderPlas, 2016). The available information is used to build (“train”) and evaluate predictive models for extrapolation. A common application of machine learning models in exposure science (as well as drug discovery and toxicology) is quantitative structure-activity and structure-property relationships (QSARs/QSPRs), which use the measured or reported activity or property of known chemicals to predict that same activity or property for a chemical where it is unknown (Leach, 2001). Successful machine learning methods can identify complex, multivariate relationships among large numbers of chemical descriptors, which might describe the presence or absence of thousands of structural features [176, 177]. Machine learning models have been used to predict migration of chemicals from food packaging to food [12], chemical function within consumer products [64, *116], weight fractions of chemicals in consumer products [64], fraction of unbound xenobiotics in human plasma [61], physical-chemical properties used in exposure modeling [92], and chemical exposure pathways [*120]. As new exposure data are acquired, and subsequent data needs are identified, existing machine learning models can be re-trained with new and existing data to improve model accuracy and adjust the scope of the model (for example, sensitive vs. general populations, consumer vs. industrial chemicals, or specific chemical class vs. broad chemical coverage).
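A minimal QSPR-style sketch is given below, assuming each chemical is described by a set of structural-feature identifiers. The similarity-weighted nearest-neighbor predictor is a deliberately simple stand-in for the machine learning methods cited above; the feature sets and property values are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def predict_property(query, training, k=2):
    """Similarity-weighted mean property of the k most similar chemicals.

    `training` is a list of (feature set, measured property) pairs.
    """
    ranked = sorted(training, key=lambda rec: tanimoto(query, rec[0]),
                    reverse=True)[:k]
    weights = [tanimoto(query, feats) for feats, _ in ranked]
    if sum(weights) == 0:
        raise ValueError("query shares no features with the training set")
    weighted = sum(w * value for w, (_, value) in zip(weights, ranked))
    return weighted / sum(weights)


# Hypothetical training set: feature identifiers and a log-scale property.
training = [
    (frozenset({1, 2, 3}), 2.0),
    (frozenset({1, 2, 4}), 2.4),
    (frozenset({7, 8, 9}), -1.0),
]

# Hypothetical chemical with no measured value for the property.
query = frozenset({1, 2, 3, 4})
prediction = predict_property(query, training, k=2)
print(f"Predicted property: {prediction:.2f}")
```

Re-training in this sketch amounts to appending new measured chemicals to `training`, which mirrors how the scope of a model can be adjusted as new data are acquired.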
4). Risk-based Prioritization with Exposure NAMs
All the classes of NAMs described in Table 1 can be used together when trying to identify chemicals for additional testing [40]. Given that there are thousands of chemicals in commerce that have undergone limited human health safety evaluation [156], it is hoped that high-throughput bioactivity screening (for example, ToxCast, Tox21), HTTK, and HTE models can, respectively, provide alternatives to traditional hazard, dose-response, and exposure data in priority setting. Early efforts to use ToxCast high-throughput toxicity testing data in chemical risk prioritization were limited by the inability to link bioactive in vitro concentrations to a meaningful exposure metric. Despite the inherent uncertainties in coverage of biological pathways, toxicokinetic processes, and exposure variability, if the uncertainty in these NAMs can be quantified, then it may be possible to prioritize chemicals with respect to potential risk, as in Figure 3. The value of these NAMs in prioritization efforts has been consistently recognized by experts and stakeholders across sectors [*100].
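One way to picture such a prioritization is to compare, for each chemical, a conservative (lower-bound) dose associated with in vitro bioactivity against a conservative (upper-bound) exposure prediction, and to rank chemicals by the ratio. The sketch below assumes both bounds are already available in the same units; the class, the function names, and all values are hypothetical.

```python
from typing import NamedTuple


class Chemical(NamedTuple):
    name: str
    bioactive_dose_low: float  # mg/kg/day, lower bound of bioactive dose
    exposure_high: float       # mg/kg/day, upper bound of predicted exposure


def margin(chem):
    """Ratio of bioactive dose to exposure; smaller means higher priority."""
    if chem.exposure_high <= 0:
        raise ValueError("exposure upper bound must be positive")
    return chem.bioactive_dose_low / chem.exposure_high


def prioritize(chemicals):
    """Order chemicals from smallest to largest margin."""
    return sorted(chemicals, key=margin)


# Hypothetical chemicals with hypothetical bounds.
chemicals = [
    Chemical("chem_A", 10.0, 1.0e-4),
    Chemical("chem_B", 0.5, 1.0e-2),
    Chemical("chem_C", 100.0, 1.0e-6),
]

ranked = prioritize(chemicals)
for chem in ranked:
    print(f"{chem.name}: margin = {margin(chem):.3g}")
```

Using the uncertainty bounds rather than central estimates means that wider uncertainty in either NAM narrows the margin, so poorly characterized chemicals tend to move up the list for follow-up study.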
Table 1.
New approach methodologies for exposure science.
Exposure NAM class | Description | Traditional approach | Makes use of |
| | | Measurement | Toxicokinetics | Models | Descriptors | Evaluation | Machine learning |
Measurements | New techniques including screening analyses capable of detecting hundreds of chemicals present in a sample | Targeted (chemical-specific) analyses | – | • | • | • | • | |
Toxicokinetics | High-throughput methods using in vitro data to generate chemical-specific models | Analyses based on in vivo animal studies | • | – | • | • | ||
HTE models | Models capable of making predictions for thousands of chemicals | Models requiring detailed, chemical- and scenario- specific information | • | • | – | • | ||
Chemical descriptors | Informatic approaches for organizing chemical information in a machine- readable format | Tools targeted at single- chemical analyses by humans | – | • | ||||
Evaluation | Statistical approaches that use the data from many chemicals to estimate the uncertainty in a prediction for a new chemical | Comparison of model predictions to data on a per- chemical basis | • | • | • | • | – | • |
Machine learning | Computer algorithms to identify patterns | Manual inspection of the data | • | • | • | – | ||
Prioritization | Integration of exposure and other NAMs to identify chemicals for follow-up study | Expert decision-making | • | • | • | • | • | • |
HTE, high-throughput exposure; NAMs, new approach methodologies.
We describe six broad categories of NAMs that are being used or applied to inform exposure along with other chemical risk prioritization NAMs (Table 1).
• indicates that the NAM in the row also makes use of the NAM in the indicated column, while – indicates that the row and column are the same NAM.
Figure 3:
High-throughput methods may trade off precision for speed, but if the uncertainty in the methods can be quantified, then the methods may still be useful for separating chemicals based on likelihood of risk.
For example, Health Canada has moved from focusing on a single chemical at a time to higher throughput methods applicable to most chemicals in commerce [13]. Bonnell et al. [13] write that regulatory prioritizations “…can therefore greatly benefit from inclusion of exposure descriptors and, by extension, exposure modeling to improve targeting of chemicals of highest concern.” HTE models for fate and human exposure [3] were used by Health Canada when the relevant environmental concentration and biomonitoring data were not available to identify the chemicals of ‘highest concern’ [13]. Recently, the EPA has proposed a working approach for selecting chemical candidates for chemical prioritization that includes risk-based metrics developed from both toxicity and exposure NAMs (U.S. Environmental Protection Agency, 2018).
Considerations for Susceptible Populations
In the United States, the recently updated Frank R. Lautenberg Chemical Safety for the 21st Century Act specifically calls for consideration of “potentially exposed or susceptible subpopulations,” defined as a group “who, due to either greater susceptibility or greater exposure, may be at greater risk than the general population of adverse health effects from exposure to a chemical substance or mixture, such as infants, children, pregnant women, workers, or the elderly”. Susceptibility to harmful effects of environmental chemicals is greater at certain periods in the life span and is influenced by genetic factors as well as exposure to other stressors [48, 91]. Periods of the life span known to be more susceptible to perturbation than others include fetal, infant, and child development, spermatogenesis and oocyte maturation before conception, puberty, pregnancy, postpartum, and menopause [10, 48, 58, 68]. In addition to greater susceptibility, greater exposures may be experienced at certain periods of the life span. For example, children and infants have unique exposure pathways that result in higher exposure than the general population for some chemicals (U.S. EPA, 2006). These pathways include soil and dust ingestion [185], breast milk and formula ingestion [186], and enhanced hand-to-mouth and object-to-mouth chemical transfer (U.S. EPA, 2006). Children also consume more food and water and have higher inhalation rates per unit body weight than adults (U.S. EPA, 2002), which also results in higher exposure rates. Therefore, comprehensive understanding of exposure across the life span, including in the testicular, ovarian, and fetal compartments, is necessary to fully characterize risk for susceptible populations.
Various toxicokinetic (TK) models have been built to link exposure during gestation [89] and breast-feeding [24] to tissue concentrations, but these models continue to need revision [74] and are typically tailored to a single chemical (that is, not high throughput). Generic TK models do exist for some routes important to occupational exposure, such as inhalation [71], but chemical-specific parameters for these models cannot yet be rapidly generated. Significant work remains in characterizing susceptible and highly exposed individuals with high-throughput methods [*121, 167].
The only HTE models with quantified uncertainty have large (orders of magnitude) uncertainties when only chemical structure can be provided as input [*120]. Furthermore, while different median intake rates can be estimated for various population demographics [160], these approaches do not yet assess individuals who experience “greater exposure” because of limitations in sample sizes of the biomonitoring data used for evaluation. Finally, exposure to chemicals is often correlated, with various combinations of chemical exposures tending to be significantly more or less likely [*73]. Uncertainty in the HTE model results for various age groups can be reduced by the development of improved pathway-specific HTE models that examine exposure variability across the life cycle (for example, refined consumer models) or for pathways that are specific to a susceptible population (for example, occupational pathways). Improved understanding of highly exposed populations and chemical co-exposures must be addressed via the development of new biomonitoring data (for example, from next-generation mass spectrometry) and its integration into evaluation frameworks. Although such approaches may be more uncertain than traditional monitoring, the collection of information covering a larger sample size or an expanded chemical space can aid in identifying and quantifying exposures for highly exposed groups.
5). Conclusion
The exposure data available for chemicals in commerce are limited, which contributes to uncertainties in risk assessment [13, 38]. It is impractical, if not impossible, to acquire all the needed exposure data using traditional methods, and thus, there is a focus on developing higher throughput NAMs. For example, while HTE models can currently characterize exposure from ambient (far-field) sources and consumer product sources, additional work is needed for occupational and susceptible population scenarios and dietary pathways. Better characterization (including measurements) of the breadth of pathways, scenarios, and human biological variability will allow continued advances in HTE modeling and reduction in uncertainty.
Here, we have reviewed six broad classes of NAMs for exposure, starting with measurement methods. Additional measurement data are needed to improve HTE models. Screening analyses of environmental media are still in their infancy and not without limitations, yet they have already contributed greatly to reducing uncertainty in exposure data in two ways: 1) confirming the presence/absence of known chemicals; and 2) identifying previously unknown chemicals. We then reviewed exposure inference, which requires HTTK to address the large number of chemicals in commerce. We reviewed HTE models, and how they can fill gaps that cannot be addressed even with NAMs for measurement. Just as importantly, we reviewed NAMs for chemical descriptors, including the quantitative chemical concentration and emission data required for using HTE models. Because the measured presence of a chemical in environmental media is only a prerequisite to exposure [69, 83, 87, 175], it remains critically important to characterize the emissivity of chemicals, which depends both on the chemical and the media in which it is found. We then reviewed how systematic statistical evaluation of HTE models using inferred exposures allows for the uncertainty quantification needed for prioritization. Finally, we reviewed how all other exposure NAMs can be informed by machine learning. We ended with examples drawing together the six classes of NAMs to perform exposure- and risk-based prioritization of chemicals.
While great strides have been made to advance high-throughput methods for hazard identification, continued advances in exposure assessment will help to establish the real-world, human health consequences for chemicals in our environment.
The authors thank Drs. Peter Egeghy and Maureen Gwinn for their helpful reviews of the manuscript.
Disclaimer: The United States Environmental Protection Agency through its Office of Research and Development primarily funded the research described here. The views expressed in this publication are those of the authors and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency. Reference to commercial products or services does not constitute endorsement.
