Toxicological Sciences. 2020 May 8;176(1):86–102. doi: 10.1093/toxsci/kfaa062

A Rat Liver Transcriptomic Point of Departure Predicts a Prospective Liver or Non-liver Apical Point of Departure

Kamin J Johnson, Scott S Auerbach, Eduardo Costa
PMCID: PMC7357187  PMID: 32384157

Abstract

Identifying a toxicity point of departure (POD) is a required step in human health risk characterization of crop protection molecules, and this POD has historically been derived from apical endpoints across a battery of animal-based toxicology studies. Using rat transcriptome and apical data for 79 molecules obtained from Open TG-GATES (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System) (632 datasets), the hypothesis was tested that a short-term exposure, transcriptome-based liver biological effect POD (BEPOD) could estimate a longer-term exposure “systemic” apical endpoint POD. Apical endpoints considered were body weight, clinical observation, kidney weight and histopathology and liver weight and histopathology. A BMDExpress algorithm using Gene Ontology Biological Process gene sets was optimized to derive a liver BEPOD most predictive of a systemic apical POD. Liver BEPODs were stable from 3 h to 29 days of exposure; the median fold difference of the 29-day BEPOD to BEPODs from earlier time points was approximately 1 (range: 0.7–1.1). Strong positive correlation (Pearson R = 0.86) and predictive accuracy (root mean square difference = 0.41) were observed between a concurrent (29 days) liver BEPOD and the systemic apical POD. Similar Pearson R and root mean square difference values were observed for comparisons between a 29-day systemic apical POD and liver BEPODs derived from 3 h to 15 days of exposure. These data across 79 molecules suggest that a longer-term exposure study apical POD from liver and non-liver compartments can be estimated using a liver BEPOD derived from an acute or subacute exposure study.

Keywords: transcriptome, benchmark dose, point of departure, apical effect, risk assessment


Current requirements for crop protection molecule (ie, pesticide) registration include performing a large battery of regulatory guideline toxicity studies in mammals. Data from these studies, along with uncertainty factors, are used to derive a reference dose for human health risk assessment. Regulatory guideline toxicity studies examine “apical” effects such as body weight, clinical observations, and organ histopathology across multiple dose levels to identify a toxicity point of departure (POD). A POD can be defined as the highest dose level without an effect at a given response magnitude (Haber et al., 2018; Hardy et al., 2017). In the final step of human health risk assessment, the toxicity POD and human exposure data are combined to characterize the risk in exposed human populations.

The concept of using transcriptome data to derive a toxicity POD for use in risk assessment has been developing for over a decade (Mezencev and Subramaniam, 2019; NRC, 2007; Schmitz-Spanke, 2019). Health Canada recently explored this concept and envisioned a use in chemical screening, assessment of data-poor chemicals, and developing a provisional POD for risk assessment (Cheung et al., 2018). The National Toxicology Program employed the approach in their assessment of the chemicals in the 2014 Elk River chemical spill (NTP, 2016).

Foundational to the use of transcriptome data for deriving a toxicity POD are two concepts. First, current molecular biology methods are able to query the entire transcriptome in an agnostic manner and, second, all apical effects (ie, all toxicity) are dependent on a prior change at the molecular level (Farr and Dunn, 1999; LaRocca et al., 2017; Vinken et al., 2017). The emergent concepts are that a comprehensive analysis of the molecular landscape will capture all potential toxicities and that a POD based on this comprehensive molecular analysis will be protective of all possible apical PODs.

A software tool (BMDExpress) to derive a gene set-level biological effect POD (BEPOD) using transcriptome data and the potential use of this tool in risk characterization were first described in 2007 (Thomas et al., 2007; Yang et al., 2007). Since that time, a relatively small number of chemicals have been examined for the concordance between a BEPOD from subacute or subchronic studies and chronic and cancer apical PODs from rodent cancer bioassays. Data from this limited set of chemicals suggest that a BEPOD from as short as 5 days of exposure may predict a chronic and cancer apical POD within the same organ (Chepelev et al., 2018; Jackson et al., 2014; Moffat et al., 2015; Thomas et al., 2011, 2013).

The Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System (TG-GATES) database created by the Japanese Toxicogenomics Project contains toxicity and transcriptome data generated on 170 molecules (mostly pharmaceuticals) (Igarashi et al., 2015). The typical in vivo study design for each molecule was exposure of young adult male rats to vehicle or 3 treated dose levels and collection of data on clinical observations, body, kidney, and liver weight, liver and kidney histopathology, clinical pathology, clinical chemistry, and the kidney and liver transcriptome. Data were collected at four single dose time points (3, 6, 9, and 24 h) and four repeat dose time points (4, 8, 15, and 29 days). Transcriptome data were generated using whole-genome Affymetrix microarrays (Affymetrix Rat 230 2.0). This extensive database containing dose response information for apical and whole transcriptome endpoints represents a unique resource to develop novel toxicological testing strategies based on transcriptome data.

The goals of this study were to (1) develop a BEPOD derivation algorithm providing maximal concordance between a liver BEPOD and a systemic apical POD derived from liver and non-liver biological compartments and (2) examine the concordance across exposure time between liver BEPOD and systemic apical POD values. Three hypotheses were tested. First, liver BEPOD and systemic apical POD values are similar (within 10X) following subacute (29 days) dosing. Second, a liver BEPOD predicts (within 10X) the systemic apical POD, even if that apical POD is derived from a non-liver endpoint. Lastly, a short-term (ie, acute exposure) liver BEPOD predicts (within 10X) a longer-term (29-day exposure) systemic apical POD. To test these hypotheses across a large set of molecules (79 in total), rat liver BEPOD and systemic apical POD values derived from multiple biological compartments were compared using data obtained from the TG-GATES database.

MATERIALS AND METHODS

Annotating and filtering TG-GATES study data

The 170 molecules in the TG-GATES database with some in vivo data were filtered to a set with sufficient and appropriate data for benchmark dose (BMD) modeling (Igarashi et al., 2015). The first filtering criterion was having a “complete” dataset, defined as having all of the following data collected at the 29-day time point: (1) clinical observations; (2) body weight; (3) liver and kidney weight and histopathology; and (4) transcriptome (including transcriptome data for more than 1 treated dose level). After applying this filter, 130 molecules remained. Two criteria have been identified which make a dataset appropriate for BMD modeling: (1) at least one dose level with no effect or a modest effect size and (2) at least one dose level with a robust response (EPA, 2012; Slob et al., 2005). Therefore, the following filter was applied to 29-day apical data for the remaining 130 molecules: (1) at least 1 apical endpoint with a treatment-related effect at 29 days of exposure and (2) at least 1 dose level at 29 days of exposure with no or a modest (< 20% effect size) effect across all apical endpoints examined. After applying this second filter, 79 molecules remained (Supplementary Table 1). Supplementary Table 2 contains the initial list of 170 molecules, and the reason for culling each molecule that did not pass all filters.

For terminal body weight, relative liver weight, and relative kidney weight endpoints, the following considerations were used to determine the relationship to treatment at 29 days: (1) presence of a dose response at 29 days; (2) consistency of the dose response over exposure time; and (3) whether or not the 29-day treatment group mean values were within the range of mean values for control groups contained within the TG-GATES database. Graphs showing these data for all 130 molecules with a complete dataset are contained in Supplementary File 1. Primary histopathology and clinical observation data are contained in Supplementary File 1, and primary data for body weight, relative liver weight, and relative kidney weight are contained in Supplementary Files 2–4. For liver and kidney histopathology and clinical observation data, the TG-GATES project team made relationship to treatment calls, and these calls were used when available in summary data files contained within the TG-GATES database (https://toxico.nibiohn.go.jp/, last accessed May 17, 2020). For molecules without a TG-GATES project team call for histopathology and clinical observation data, the incidence data were examined, and the relationship to treatment was determined using the same considerations noted above. For all molecules, a heatmap showing the presence or absence of a treatment-related effect across all apical and transcriptional endpoints and exposure time points is shown in Supplementary Table 3.

Apical POD identification

For all 79 filtered molecules, an apical POD was generated for each endpoint with a treatment-related change at 29 days of exposure. Discolored urine, dirty fur/hair/nose, spilled food, and salivation clinical observation calls were excluded from analysis because these endpoints may not represent toxicity. If a BMD value could be generated, the BMD lower 95% confidence limit (BMDL) was used as the POD. If BMD modeling failed, then the no-observed-effect-level (NOEL) value was used as the POD. POD values are provided in Supplementary Table 4 for all apical endpoints with a treatment-related effect at 29 days of exposure across the 79 molecules examined.
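The POD selection rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the endpoint labels, and the numeric values are hypothetical, and the "lowest POD across endpoints" step reflects how the most sensitive apical POD is defined elsewhere in the text.

```python
from typing import Dict, Optional, Tuple


def select_apical_pod(bmdl: Optional[float], noel: Optional[float]) -> Optional[float]:
    """Use the BMDL when BMD modeling succeeded; otherwise fall back to the NOEL."""
    if bmdl is not None:
        return bmdl
    return noel


def most_sensitive_pod(endpoint_pods: Dict[str, Optional[float]]) -> Tuple[str, float]:
    """Return the (endpoint, POD) pair with the lowest POD value.

    Endpoints without a treatment-related effect carry None and are skipped.
    """
    valid = {name: pod for name, pod in endpoint_pods.items() if pod is not None}
    if not valid:
        raise ValueError("no endpoint has a POD")
    endpoint = min(valid, key=valid.get)
    return endpoint, valid[endpoint]


# Hypothetical values in mg/kg/day, for illustration only.
pods = {
    "body weight": select_apical_pod(bmdl=None, noel=30.0),
    "liver histopathology": select_apical_pod(bmdl=12.5, noel=10.0),
    "relative kidney weight": None,
}
print(most_sensitive_pod(pods))
```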

BMD-based apical POD values were generated using BMDS (version 2.6.0.1) developed by the U.S. Environmental Protection Agency (EPA) (Davis et al., 2011; EPA, 2012). Terminal body weight, relative liver weight, and relative kidney weight data were examined according to the EPA recommended BMD analysis workflow for continuous data, whereas liver histopathology and kidney histopathology were examined according to the EPA recommended workflow for dichotomous data. Apical data were analyzed irrespective of the response being considered an adaptive or adverse effect.

For continuous data, the following EPA models were considered: Exponential (M2exp, M3exp, M4exp, and M5exp), Hill, Linear, Polynomial (Ply2 and Ply3), and Power models. The benchmark response (BMR) was defined as a change of 1 SD from the modeled baseline at control. BMD results for the different models were kept for further consideration only if: (1) there was a difference between responses and/or variances among the dose levels (Test 1 p value < .05); (2) the data could be modeled by assuming either a consistent (homogeneous) variance (Test 2 p value > .1) or a variance that changes as a power function of the mean value (Test 3 p value > .1); (3) the model had an adequate global fit (goodness of fit test p value > .1) and an adequate local fit (absolute scaled residuals < 2) for all dose levels; (4) the difference between the estimated BMD and its BMDL was not greater than 20X; and (5) the estimated BMDL value was not more than 10X below the lowest nonzero dose level. Among the BMD results kept for further consideration, the result for the model with the lowest Akaike information criterion (AIC) value was chosen when all BMDL values were within 3X of one another; otherwise, the smallest BMDL value was chosen.
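As an illustration of the screening and selection logic just described, the sketch below applies criteria (3) to (5) and the AIC/BMDL rule to a list of fitted models. It assumes the variance tests in criteria (1) and (2) have already been passed, reads criterion (5) as "BMDL no more than 10-fold below the lowest nonzero dose," and uses a hypothetical `ModelFit` record with made-up numbers; it does not call BMDS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelFit:
    name: str
    bmd: float
    bmdl: float
    aic: float
    fit_p: float             # global goodness-of-fit p value
    max_abs_residual: float  # largest absolute scaled residual over dose levels


def passes_screen(fit: ModelFit, lowest_nonzero_dose: float) -> bool:
    """Criteria (3)-(5): global fit, local fit, BMD/BMDL ratio, BMDL vs lowest dose."""
    return (
        fit.fit_p > 0.1
        and fit.max_abs_residual < 2
        and fit.bmd / fit.bmdl <= 20
        and fit.bmdl >= lowest_nonzero_dose / 10
    )


def choose_model(fits: List[ModelFit], lowest_nonzero_dose: float) -> Optional[ModelFit]:
    """Lowest AIC if all retained BMDLs agree within 3X; otherwise the smallest BMDL."""
    kept = [f for f in fits if passes_screen(f, lowest_nonzero_dose)]
    if not kept:
        return None  # modeling failed; the text falls back to the NOEL in this case
    bmdls = [f.bmdl for f in kept]
    if max(bmdls) / min(bmdls) <= 3:
        return min(kept, key=lambda f: f.aic)
    return min(kept, key=lambda f: f.bmdl)


# Hypothetical fits for one endpoint (lowest nonzero dose = 10 mg/kg/day).
example = [
    ModelFit("Hill", 10, 5, 100, 0.5, 1.0),
    ModelFit("Power", 12, 6, 98, 0.4, 1.2),
    ModelFit("Linear", 15, 8, 105, 0.05, 1.0),  # fails the global fit test
]
print(choose_model(example, 10).name)
```

The same selection rule applies to the dichotomous models, with the screening limited to the goodness-of-fit, residual, and BMD/BMDL criteria listed for those data.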

For dichotomous data, the following EPA models were considered: Gamma, Logistic, LogLogistic, LogProbit, Multistage, Probit, Weibull, and Quantal-Linear models. The BMR was defined as a change of 10% with respect to the mean incidence in the control group. BMD results for the different models were kept for further consideration only if: (1) goodness of fit test p value > .1; (2) absolute scaled residuals < 2 for all doses; (3) BMD/BMDL < 20; and (4) BMDL within 10X of the lowest nonzero dose level. As for continuous data, the result for the model with the lowest AIC value was chosen when all BMDL values were within 3X of one another; otherwise, the smallest BMDL value was chosen.

The parameter choices for dichotomous and continuous models are provided in Supplementary File 5. The model results for all 29-day continuous and dichotomous datasets are shown in Supplementary Tables 5 and 6, respectively. The tables also highlight which model was chosen for every endpoint, according to the aforementioned selection criteria. Supplementary Table 7 provides details on the most sensitive (ie, lowest) apical POD across all endpoints for every molecule.

BEPOD identification

TG-GATES rat liver microarray data and BMDExpress software (version 2.01.0264 BETA) were used to derive a BEPOD (Phillips et al., 2019). This process was agnostic to the therapeutic or toxicity mechanism of the molecule. TG-GATES microarray gene expression primary data (CEL files) were obtained from the Open TG-GATES ftp site (https://dbarchive.biosciencedbc.jp/en/open-tggates/download.html, last accessed May 17, 2020) (Igarashi et al., 2015). Probe set (ie, gene) intensities were normalized using the Robust Multi-array Average algorithm implemented in GeneSpring GX 12.6 (Agilent Technology, Foster City, California) (Irizarry et al., 2003). Normalized intensity values were parsed into individual BMDExpress expression data files for each organ and exposure duration. All BMDExpress bm2 data files are available (Supplementary Files 6–13).

Following the BMDExpress workflow (Phillips et al., 2019), normalized microarray data were filtered against a Williams trend test p value < .05 and an absolute fold change ≥ 1.5. By selecting only probe sets meeting these criteria, this step reduced the chance of modeling data without a treatment-related transcriptional response. For probe sets passing this filter, expression data were fit to Hill, Power, Linear, Polynomial 2, Exponential 2, Exponential 3, Exponential 4, and Exponential 5 models. A best fit model for each probe set was selected using the following settings: (1) maximum iterations of 250; (2) confidence level of 0.95; (3) consistent variance; (4) power restricted to ≥ 1; (5) Hill models with a k parameter < 1/3 of the lowest positive dose were flagged and then the next best model with a fit p value > .05 was used; (6) lowest AIC value to select between the best poly model and all other models; and (7) a goodness of fit p value > .1. If no other best fit model could be identified when the Hill model was flagged, the Hill model was retained. Because the response level associated with an adverse change in gene expression was unknown, the BMR was set to a mean response equal to 1 SD at the modeled control baseline (Davis et al., 2011). All gene-level BMD output data files from BMDExpress are available (Supplementary Files 14–21).
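The probe set prefilter at the start of this workflow can be sketched as below. This is an assumption-laden illustration rather than BMDExpress itself: the Williams trend test p values are taken as precomputed inputs, the fold change is assumed to be supplied as a linear treated/control ratio, and the function names and probe identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def absolute_fold_change(ratio: float) -> float:
    """Convert a treated/control ratio to an absolute fold change (always >= 1)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else 1 / ratio


def prefilter_probe_sets(
    probe_stats: Dict[str, Tuple[float, float]],
    p_cutoff: float = 0.05,
    fc_cutoff: float = 1.5,
) -> List[str]:
    """Keep probe sets with trend test p < p_cutoff and absolute fold change >= fc_cutoff.

    probe_stats maps a probe set ID to (trend_test_p, treated_over_control_ratio).
    """
    return [
        probe_id
        for probe_id, (p_value, ratio) in probe_stats.items()
        if p_value < p_cutoff and absolute_fold_change(ratio) >= fc_cutoff
    ]


# Hypothetical probe sets: only the first and last pass both thresholds.
stats = {
    "probe_a": (0.01, 2.0),
    "probe_b": (0.01, 1.2),
    "probe_c": (0.20, 3.0),
    "probe_d": (0.04, 0.5),
}
print(prefilter_probe_sets(stats))
```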

Inclusion or exclusion of probe set-level BMD values obtained using Hill models with a “k” parameter < 1/3 of the lowest positive dose (ie, flagged Hill models) was considered. When including BMD values derived from a flagged Hill model, a small number of genes with BMDL values several orders of magnitude below the lowest dose level tested was frequently observed (data not shown). Including BMD values derived from a flagged Hill model often resulted in a final BEPOD for a given molecule and time point combination that was orders of magnitude below the lowest dose level tested (data not shown); therefore, such BEPOD values were considered inaccurate due to being extrapolated from the data. In addition, it was reasoned that the BEPOD for a molecule would be similar over time. For example, it was considered biologically implausible for the BEPOD to be similar at 4- and 15-day exposure time points but several orders of magnitude lower than these values at the intervening 8-day time point. Across the dataset examined, including probe set-level BMD values derived from a flagged Hill model resulted in BEPOD values that were inconsistent over time by several orders of magnitude; excluding these values resulted in more consistent BEPOD values over time (data not shown). For these reasons, probe set-level BMD values obtained using flagged Hill models were excluded from the BEPOD algorithm.

A limited BMD minima constraint sensitivity analysis of the BEPOD was performed using 29-day data in which probe sets with a flagged Hill model had the probe set BMD set to multiples of 1, 0.33, or 0.1 of the lowest dose level tested. Comparing the BEPOD values from this method versus the BEPOD values obtained using the process described above demonstrated that the BEPOD values were robust to treatment of probe sets with a flagged Hill model. When present, differences in BEPOD values were driven by the selection of distinct probe sets during the functional classification step. Seventy-seven of 79 BEPOD values were within 1.2-fold when the lowest dose level tested was selected as the BMD (Supplementary Table 8). BEPOD values for clofibrate and benziodarone were within 2.1- and 7.4-fold, respectively. The BEPOD value for iproniazid was linearly affected by the multiple of the lowest dose level.

The BMDExpress functional classification step was used to identify BMD and BMDL values for gene sets using Gene Ontology Biological Process (GO-BP) terms (GO File Creation Date of March 3, 2019). To reduce noise in this step, probe set-level results were excluded when one or more of the following characteristics were observed: (1) promiscuous probe sets (ie, probe sets that mapped to more than 1 gene); (2) modeled BMDs > the highest dose level; or (3) probe sets with a ratio of the upper bound (95th percentile) BMD value (BMDU) to the lower bound (95th percentile) BMD value (BMDL) > 40. The GO-BP term with the smallest median gene BMD value was identified, and the final BEPOD value reported was the median gene BMDL value of that GO-BP term. Liver BEPOD values across all the molecules examined and all exposure time points are provided in Supplementary Table 1. All functional classification BMD output data files from BMDExpress are available (Supplementary Files 22–30). To provide further clarification on which model results were used to derive the final BEPOD for every molecule, Supplementary Table 9 shows the frequency with which every model was chosen across the probe sets that were considered in the functional classification step. Note that if a probe set was not mapped to any GO-BP term or if it was excluded based on the noise reduction criteria, its model results are not shown in Supplementary Table 9.
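The BEPOD derivation described above reduces to two steps: drop noisy probe set results, then pick the GO-BP term with the smallest median gene BMD and report its median gene BMDL. The sketch below illustrates this under simplifying assumptions: each gene is represented by a single (BMD, BMDL) pair, the GO-BP term labels and numbers are invented, and the function names are hypothetical rather than part of BMDExpress.

```python
from statistics import median
from typing import Dict, List, Optional, Tuple


def keep_probe_result(bmd: float, bmdl: float, bmdu: float,
                      highest_dose: float, n_genes_mapped: int) -> bool:
    """Noise-reduction criteria: unique gene mapping, BMD within the tested
    dose range, and a BMDU/BMDL ratio no greater than 40."""
    return n_genes_mapped == 1 and bmd <= highest_dose and bmdu / bmdl <= 40


def derive_bepod(go_terms: Dict[str, List[Tuple[float, float]]]) -> Optional[Tuple[str, float]]:
    """Return (GO-BP term, BEPOD).

    go_terms maps each term to the (BMD, BMDL) pairs of its retained genes.
    The term with the smallest median BMD is selected; the BEPOD is the
    median BMDL of that term.
    """
    populated = {term: genes for term, genes in go_terms.items() if genes}
    if not populated:
        return None
    best_term = min(populated, key=lambda t: median(bmd for bmd, _ in populated[t]))
    bepod = median(bmdl for _, bmdl in populated[best_term])
    return best_term, bepod


# Hypothetical gene-level results (mg/kg/day) after noise reduction.
terms = {
    "GO-BP term A": [(10, 5), (20, 8), (30, 12)],
    "GO-BP term B": [(4, 2), (6, 3)],
    "GO-BP term C": [],
}
print(derive_bepod(terms))
```

In this toy input, term B has the smallest median BMD (5), so the BEPOD is the median of its BMDL values.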

Optimal BEPOD functional classification algorithm identification

A goal was to identify a single BMDExpress data processing algorithm to use across all 79 molecules studied in the TG-GATES database. Data filtering options tested during the functional classification step within BMDExpress included use of the Fisher’s exact two-tailed test (with different thresholds for the test p value), varying the minimum number of genes in a GO-BP term with a BMD value, and/or varying the minimum percentage of genes in a GO-BP term with a BMD value. The derived BEPOD values for every filtering combination were compared against the most sensitive 29-day apical POD value for a subset of 50 molecules out of the original set of 79 molecules. The reason for choosing a subset was two-fold. First, depending on the strictness of the filter, BMDExpress might not return any BMD value for some molecules because all possible gene sets were removed during the functional classification step. This resulted in different sets of molecules with a valid BMD value across all examined combinations of filters (data not shown). Having a common evaluation set is important to have a fair comparison among filtering criteria. For this reason, the most common set of 50 molecules across all combinations was chosen for the evaluation, and any combination of filters that did not return BMD values for all 50 molecules was excluded from the comparison. Second, this kept a subset of 29 molecules aside as a virgin set to test the appropriateness of a filter. Molecules in the training and test sets are shown in Supplementary Table 1.

Concordance metrics

To evaluate the concordance between apical POD and BEPOD values, two evaluation metrics were used: Pearson correlation R and root mean square difference (RMSD). Although the former verifies the linear relationship between two sets of values, the latter is an error metric that compares the distance between two sets of values. Apical POD and BEPOD values with higher concordance were expected to have higher Pearson correlation values and smaller RMSD values. The Pearson R coefficient between two vectors X and Y of equal length is given by the following equation, where cov stands for covariance and σ stands for standard deviation.

$$r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

The RMSD between X and Y is given by the following equation, where N is the length of the vectors.

$$\mathrm{RMSD} = \sqrt{\frac{\sum_{k=1}^{N} (X_k - Y_k)^2}{N}}$$

For both metrics, the values being compared were converted to the log scale with base 10. This transformation was necessary because, given the different orders of magnitude of dose level values across the TG-GATES experiments, it was more important to consider the relative concordance of apical and BEPOD values than the absolute difference between them.
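A minimal sketch of both concordance metrics on the log10 scale is given below, using only the Python standard library. The function names and the example POD values are illustrative assumptions, not data from the study.

```python
import math
from typing import List, Sequence


def log10_values(values: Sequence[float]) -> List[float]:
    """Convert dose values (eg, mg/kg/day) to the log10 scale."""
    return [math.log10(v) for v in values]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of the SDs."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rmsd(x: Sequence[float], y: Sequence[float]) -> float:
    """Root mean square difference between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical PODs: each BEPOD is exactly 10-fold above its apical POD.
apical = log10_values([1, 10, 100])
bepod = log10_values([10, 100, 1000])
print(pearson_r(apical, bepod), rmsd(apical, bepod))
```

Because the values are log10-transformed, the RMSD is expressed in log10 units: an RMSD of 1 corresponds to a typical 10-fold difference, and an RMSD of 0.41 to a typical difference of roughly 2.6-fold.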

Statistical analyses of data examining the effect of dose level selection

Data compared were individual molecule fold difference values of either (1) BEPODs from all time points compared with the 29-day apical POD or (2) BEPODs from exposures up to and including 15 days of exposure compared with the 29-day BEPOD. Datasets were further subdivided prior to statistical analysis into molecules with a 3–4X difference and those with a 10X difference in low dose levels used between single and repeat dose experiments. With the exception of the fold change values of 29-day BEPOD compared with 29-day apical POD, all BEPOD and apical POD data were obtained from independent animals. Statistical analyses were performed using Prism 8 software (GraphPad Software, San Diego, California). Because most datasets were not normally distributed based on having a D’Agostino-Pearson test p value < .05, a nonparametric Kruskal-Wallis ANOVA was performed. If the ANOVA p value was < .05, statistical comparisons of individual groups were performed using Dunn’s post-test. A Dunn’s post-test p value < .05 was considered statistically significant.

RESULTS

Preliminary data analysis of the 79 filtered molecules showed that a 29-day liver BEPOD value could be generated for all molecules, despite 21 of these 79 molecules having no treatment-related liver apical effect at the 29-day time point (Supplementary Table 1). However, all 21 molecules without treatment-related liver apical effects had treatment-related effects on non-liver apical endpoints (relative kidney weight, kidney histopathology, body weight, and/or clinical observations). Therefore, PODs derived from all apical endpoints examined in the TG-GATES study were compared with the liver BEPOD (data in Supplementary Table 1). For each molecule, the apical POD was defined as the 29-day apical endpoint with the lowest POD value.

Optimizing a Single Algorithm for BMDExpress Functional Classification

An analysis was conducted to identify a single BMDExpress functional classification filtering algorithm that resulted in optimal concordance of the 29-day liver BEPOD with the 29-day apical POD. The three functional classification filtering criteria explored were (1) the number of genes in a GO-BP term with a BMD value (ie, minimum number of genes); (2) the percentage of genes in a GO-BP term with a BMD value (ie, minimum percentage of genes); and (3) a Fisher’s exact two-tailed test p value. The effect of varying these three BMDExpress functional classification parameters on the concordance between the liver BEPOD and apical POD values was evaluated across 50 molecules and validated on a virgin set of 29 molecules.

Thousands of different combinations for the three functional classification filters were tested, from very relaxed filtering combinations (eg, minimum number of genes = 1; minimum percentage of genes = 0; and no Fisher’s exact test) to very strict ones (eg, minimum number of genes = 20; minimum percentage of genes = 20; and Fisher’s exact test p value ≤ .1). Note that the latter combination will consider a set of gene sets that is a subset of the gene sets considered in the former, as the filter values work as thresholds for the inclusion of gene sets in the calculation of the final BEPOD.

The first criterion examined was the minimum number of genes with a BMD value per GO-BP term. As shown in Figure 1A, BEPOD and apical POD RMSD values tended to decrease (became more concordant) as the minimum number of genes was decreased. Criteria having a minimum number of genes between 2 and 4 yielded the best concordance in terms of RMSD. When the minimum number of genes was set to 1, RMSD values increased, indicating less concordance between BEPOD and apical POD values. Interestingly, the Pearson R was on average high for the different choices of minimum number of genes; however, the RMSD became larger as the minimum number of genes was increased due to the BEPOD values becoming larger and more distant from the apical POD values. Also note that both correlation and RMSD vary among dots of the same color in the plot (ie, the same choice of threshold). This shows the influence of the other two filters, which are evaluated next. Considering both RMSD and Pearson R values, setting the minimum number of genes to 2 resulted in the best BEPOD to apical POD concordance across the analysis.

Figure 1.


Effect of different BMDExpress functional classification filters on concordance between biological effect point of departure and apical point of departure (POD) values. Each dot summarizes the concordance across all 50 molecules, in terms of Pearson correlation and root mean square difference (RMSD), given a unique choice of values for the following filters: (1) the Fisher’s exact two-tailed test p value; (2) the minimum number of genes in a Gene Ontology Biological Process (GO-BP) term with a benchmark dose (BMD) value; and (3) the minimum percentage of genes in a GO-BP term with a BMD value. A, Effect of varying the minimum number of genes with a POD value per GO-BP term. The dots are colored according to the minimum number of genes because this was the filter with the largest influence on the results. A large number of dots per color are shown because for every color (ie, a specific minimum number of genes filter) there are multiple combinations of the Fisher’s exact two-tailed test p value and minimum percentage of genes filters examined. B, Effect of varying the minimum percentage of genes with a POD value per GO-BP term. The dots are colored according to the minimum percentage of genes. To avoid confounding this filter with the minimum number of genes, results are displayed separately for four different choices of the latter. C, Effect of varying the Fisher’s exact two-tailed test p value. The minimum number of genes was fixed at two for all combinations shown. To avoid confounding the effect of varying the other two filters in the analysis, the results are displayed separately for four different choices of the minimum percentage of genes filter. The dots are colored according to the p value for the Fisher’s exact two-tailed test, ranging from .01 to 1.

The second criterion examined was the minimum percentage of genes in each GO-BP term with a BMD value. Figure 1B shows the effect of varying the minimum percentage of genes filter for criteria where the minimum number of genes was between 2 and 5. For analyses having the minimum number of genes set between 2 and 4, criteria with the minimum percentage of genes set to low values improved both Pearson correlation and RMSD. For analyses having the minimum number of genes set to 5, improvement was mainly seen on RMSD values. Considering both the minimum number of genes and the minimum percentage of genes, setting both of these values at 2 resulted in the best BEPOD to apical POD concordance across the 50 TG-GATES molecules examined.

The final criterion examined was use of the Fisher’s exact two-tailed test. For this comparison, the BEPOD and apical POD concordance was examined after varying the Fisher’s exact two-tailed test p value from .01 to 1, while keeping the minimum number of genes at 2 and varying the minimum percentage of genes from 0 to 3. The minimum number of genes variable was held constant to avoid confounding the effect of varying two filters. The results showed that setting the Fisher’s exact two-tailed test p value threshold at .05 (ie, Fisher’s exact two-tailed test p value ≤ .05) resulted in the best BEPOD to apical POD concordance across the analysis (Figure 1C).

Based on these results, the following functional classification filters were used for all subsequent BEPOD derivation: (1) GO-BP terms having only 1 gene with a BMD value were excluded; (2) GO-BP terms with < 2% of genes having BMD values were excluded; and (3) GO-BP terms with a Fisher’s exact 2-tailed test p value > .05 were excluded. These filtering criteria were tested using the 29 virgin molecules. A Pearson correlation R of 0.83 and a RMSD of 0.40 were observed for these molecules, which were similar to values obtained using the prior 50 molecules (Pearson R = 0.85 and RMSD = 0.41) (Supplementary File 31).
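The three selected filters amount to a simple inclusion test per GO-BP term, sketched below. The Fisher's exact test p value is assumed to be computed upstream (for example, by BMDExpress) and passed in; the function name, argument names, and example counts are hypothetical.

```python
def passes_classification_filters(
    n_genes_with_bmd: int,
    n_genes_in_term: int,
    fisher_p: float,
    min_genes: int = 2,
    min_percent: float = 2.0,
    max_fisher_p: float = 0.05,
) -> bool:
    """Return True if a GO-BP term is retained for BEPOD derivation.

    A term is excluded if it has only 1 gene with a BMD value, if fewer
    than 2% of its genes have BMD values, or if its Fisher's exact
    two-tailed test p value exceeds .05.
    """
    if n_genes_in_term <= 0:
        return False
    percent_with_bmd = 100.0 * n_genes_with_bmd / n_genes_in_term
    return (
        n_genes_with_bmd >= min_genes
        and percent_with_bmd >= min_percent
        and fisher_p <= max_fisher_p
    )


# Hypothetical GO-BP terms: (genes with BMD, genes in term, Fisher p value).
print(passes_classification_filters(3, 100, 0.01))  # retained
print(passes_classification_filters(1, 10, 0.01))   # only 1 gene with a BMD
print(passes_classification_filters(2, 200, 0.01))  # 1% of genes with a BMD
print(passes_classification_filters(5, 50, 0.20))   # Fisher p value too high
```

Keeping the thresholds as keyword arguments makes it straightforward to rerun the same test over a grid of filter combinations, as was done when optimizing the algorithm.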

Liver BEPOD Stability Over Time

The purpose of this analysis was to determine if and when the liver BEPOD stabilized. Because the TG-GATES study generated transcriptome data for all 79 molecules across eight time points ranging from 3 h to 29 days, variation in the liver BEPOD over exposure time was examined. Except for two molecules at a single time point, a liver BEPOD could be calculated for all 632 molecule/time point combinations (Supplementary Table 1). For this analysis, all liver BEPOD values from 3 h to 15 days of treatment were examined relative to the liver BEPOD at 29 days. Initial data analysis suggested dose level selection influenced the concordance of the liver BEPOD (data not shown). Therefore, molecules were separated into two groups for the comparisons: (1) 51 molecules with the same dose level used for all eight time points (Consistent Dose Level Group) and (2) 28 molecules for which the single dose time points used higher dose levels compared with the repeat dose time points (Inconsistent Dose Level Group) (Supplementary Table 10). In the TG-GATES study design, molecules in the Inconsistent Dose Level Group had the same dose levels used for all repeat dose time points (4–29 days).

At all time points, a moderate to strong positive correlation was observed between BEPOD values and 29-day BEPOD values (Figure 2). A correlation was observed regardless of dose level selection. For the Consistent Dose Level Group, Pearson R values ranged from 0.73 to 0.83, and RMSD values ranged from 0.35 to 0.45. For the Inconsistent Dose Level Group, Pearson R values ranged from 0.65 to 0.92, and RMSD values ranged from 0.39 to 0.86. Liver BEPOD values were positively correlated across BEPOD values spanning approximately four orders of magnitude.

Figure 2.

Variation in the liver biological effect point of departure (BEPOD) across exposure time. Shown is a comparison of the liver BEPOD at 29 days with the liver BEPOD from 3 h to 15 days. Each graph contains a comparison of the liver BEPOD at 29 days to the liver BEPOD at an earlier exposure time point. In all graphs, each dot represents a distinct molecule, the blue line represents a 1:1 concordance, and the red lines represent the 10X boundary between values. Panels on the left contain data for molecules having the same (ie, consistent) dose levels selected for all time points. Panels on the right contain data for molecules having different (ie, inconsistent) dose levels used between single dose and repeated dose studies. Abbreviations: mkd, mg/kg (body weight)/day; R, Pearson R value; RMSD, root mean square difference value.

For the Consistent Dose Level Group, the liver BEPOD was stable from 3 h to 29 days. Liver BEPOD values across this group bestrode the 1:1 correlation line at all time points (Figure 2). RMSD and Pearson R values were similar at all time points (Figs. 3A and 3B). The percentage of molecules with a liver BEPOD within 3X or 10X of the 29-day liver BEPOD was similar across all time points (Figs. 3C and 3D). Approximately 75% of molecules in the Consistent Dose Level Group had a liver BEPOD within 3X of the 29-day liver BEPOD at all time points. Nearly all molecules in this group had a liver BEPOD within 10X of the 29-day liver BEPOD at all time points. The median fold difference of the liver BEPOD to the 29-day liver BEPOD was approximately 1 at all time points (range: 0.7–1.1) (Figure 3E).
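The fold-difference metrics described above can be sketched as follows. The fold difference is taken as the BEPOD at a given time point divided by the same molecule's 29-day BEPOD. The sketch assumes that "within 3X" and "within 10X" mean a ratio between 1/3 and 3 or between 1/10 and 10, respectively; the function names and example values are hypothetical, and the 95% confidence interval of the median is not computed.

```python
import statistics


def fold_differences(bepod, reference):
    """Per-molecule ratio of a time-point BEPOD to the reference (29-day) value."""
    return [b / r for b, r in zip(bepod, reference)]


def percent_within(folds, factor):
    """Percentage of molecules whose fold difference lies in [1/factor, factor]."""
    hits = sum(1 for f in folds if 1.0 / factor <= f <= factor)
    return 100.0 * hits / len(folds)


# Hypothetical values for four molecules (mg/kg/day).
timepoint_bepod = [3.0, 10.0, 50.0, 2.0]
day29_bepod = [1.5, 10.0, 4.0, 10.0]

folds = fold_differences(timepoint_bepod, day29_bepod)
median_fold = statistics.median(folds)
pct_within_3x = percent_within(folds, 3)
pct_within_10x = percent_within(folds, 10)
```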

Figure 3.

Metrics quantifying the liver biological effect point of departure (BEPOD) variation across exposure duration. The dataset was divided into 2 groups: molecules with consistent dose levels used and molecules with inconsistent dose levels used. Each panel shows a different metric comparing the 29-day liver BEPOD with the liver BEPOD from 3 h to 15 days: (A) RMSD, root mean square difference; (B) Pearson R; (C) the percentage of molecules with a BEPOD within 3X of the 29-day BEPOD; (D) the percentage of molecules with a BEPOD within 10X of the 29-day BEPOD; and (E) the median fold difference between the BEPOD and the 29-day BEPOD. Data shown in panel E are the median with 95% confidence interval. To generate data shown in panel E, the molecule BEPOD at each exposure time point was divided by that molecule’s 29-day BEPOD. The dashed line in panel E is a value of 1.

For the Inconsistent Dose Level Group, the liver BEPOD was influenced by dose level selection. The higher dose levels used for the acute, single dose experiments (3–24 h) compared with the 29-day time point appeared to shift the liver BEPOD concordance below the 1:1 correlation line (Figure 2). However, concordance appeared better for all of the repeat dose time points (4–15 days), for which the dose levels were the same as those used for the 29-day time point. This reduction in data concordance coincident with differences in low dose level selection was reflected in the metrics used to quantify the variation in BEPOD values over time. RMSD values ranged from 0.7 to 0.86 for the single dose time points but decreased to 0.39 to 0.49 at the repeat dose time points (Figure 3A). For the single dose time points, the percentage of molecules having a liver BEPOD within 3X or 10X of the 29-day liver BEPOD ranged from 29% to 42% and from 69% to 82%, respectively (Figs. 3C and 3D). For the repeat dose time points, the percentage within 3X increased to between 57% and 71%, and all molecules were within 10X. The median fold difference of the liver BEPOD to the 29-day liver BEPOD was approximately 3.5 at the single dose time points (range: 3.1–3.6) (Figure 3E); however, the median fold difference was approximately 1.5 at all repeat dose time points (range: 1.3–1.7) (Figure 3E). For all of these metrics, the data were similar for the Inconsistent Dose Level Group and the Consistent Dose Level Group at the repeat dose time points (Figure 3), when dose level selection was consistent for both groups of molecules.

A Liver BEPOD Predicts a Concurrent Systemic Apical POD

Across the 79 molecules, the 29-day liver BEPOD was concordant with the 29-day apical POD. The apical category with the lowest 29-day POD for these 79 molecules was distributed across all four categories: liver (34 molecules); kidney (22 molecules); clinical observation (19 molecules); terminal body weight (6 molecules) (Figure 4A). Except for 5 of 6 molecules where the apical POD was obtained from the terminal body weight NOEL, the apical POD was obtained from a BMD-derived value (Supplementary Table 1). There was a strong positive correlation (Pearson R = 0.86) between the 29-day liver BEPOD and 29-day apical POD values, and the liver BEPOD predicted the apical POD (RMSD = 0.41) (Figure 4B). For 99% of the molecules (78 of 79), the liver BEPOD and apical POD values were within 10X. BEPOD values bestrode the 1:1 correlation line with approximately an equal number of molecules having higher and lower liver BEPOD values compared with the apical POD values. Concordance between the liver BEPOD and apical POD values was similar throughout the range of POD values, which spanned approximately four orders of magnitude.
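The selection of the lowest apical POD across categories can be sketched as below. This is an illustration under assumptions: the category names follow the four categories named above, a category without a treatment-related effect is represented as `None`, and the function name and example values are hypothetical. Ties are returned together, which mirrors the case of molecules for which two categories share the lowest POD.

```python
def lowest_apical_pod(pods_by_category):
    """Return the lowest apical POD and the category or categories that produced it."""
    available = {c: v for c, v in pods_by_category.items() if v is not None}
    if not available:
        raise ValueError("no apical POD available for this molecule")
    lowest = min(available.values())
    categories = sorted(c for c, v in available.items() if v == lowest)
    return lowest, categories


# Hypothetical molecule: kidney and body weight tie for the lowest POD (mg/kg/day).
example_pods = {
    "liver": 30.0,
    "kidney": 10.0,
    "clinical observation": None,   # no treatment-related effect observed
    "terminal body weight": 10.0,
}
example_lowest, example_categories = lowest_apical_pod(example_pods)
```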

Figure 4.

Concordance of the liver biological effect point of departure (BEPOD) at 29 days with the apical point of departure (POD) in different compartments at 29 days. A, Shown here is the apical compartment with the lowest POD value after 29 days of exposure across all molecules examined. For two molecules, two compartments had the same POD values. B, Comparison of the 29-day liver BEPOD with the lowest 29-day apical POD value across all compartments (n = 79 molecules). C, Comparison of the 29-day liver BEPOD with the 29-day body weight POD for molecules in which a body weight effect represented the lowest apical POD (n = 6 molecules). D, Comparison of the 29-day liver BEPOD with the 29-day clinical observation POD for molecules in which a clinical observation effect represented the lowest apical POD (n = 19 molecules). E, Comparison of the 29-day liver BEPOD with the 29-day kidney POD for molecules in which a kidney effect represented the lowest apical POD (n = 22 molecules). F, Comparison of the 29-day liver BEPOD with the 29-day liver POD for molecules in which a liver effect represented the lowest apical POD (n = 34 molecules). G, Comparison of the 29-day liver BEPOD with the lowest systemic 29-day apical POD for molecules having no treatment-related liver apical effect (n = 21 molecules). H, Comparison of the 29-day liver BEPOD with the lowest systemic 29-day apical POD for molecules administered intravenously (n = 4 molecules). In panels B–H, each dot represents a unique molecule, the blue line represents a 1:1 concordance, and the red lines represent the 10X boundary between values. Abbreviations: mkd, mg/kg (body weight)/day; R, Pearson R value; RMSD, root mean square difference value.

When apical PODs were separated by category and each of the four categories was compared in isolation with the 29-day liver BEPOD, a similar picture emerged (Figs. 4C–F). Correlation and accuracy values of the liver BEPOD and apical POD values were similar, even if the category with the lowest apical POD was not the liver. For these comparisons, Pearson R values ranged between 0.83 and 0.94, and RMSD values ranged between 0.37 and 0.58.

Twenty-one molecules had no treatment-related liver apical effect after 29 days of exposure but did produce a treatment-related apical effect in other categories (body weight, clinical observation, and/or kidney) (Supplementary Table 1). For these 21 molecules, the 29-day liver BEPOD estimated the non-liver apical POD within an order of magnitude (Figure 4G). The Pearson R and RMSD values were 0.80 and 0.52, respectively.

Four molecules were administered via a non-oral, intravenous exposure route (Supplementary Table 1). For these molecules, the 29-day liver BEPOD estimated the apical POD within an order of magnitude (Figure 4H). The Pearson R and RMSD values were 0.86 and 0.34, respectively.

A Short-term Liver BEPOD Predicts a Longer-term Apical POD

Given that the liver BEPOD appeared to be stable from 3 h to 29 days, it was hypothesized that a 29-day apical POD could be predicted by a liver BEPOD derived from an exposure duration as short as 3 h. To test this hypothesis, the apical POD values for all 79 molecules were compared with the liver BEPODs across all eight exposure time points. Once again, the 79 molecules were divided into two groups for this analysis: a Consistent Dose Level Group (ie, dose levels were the same in the single and repeat dose studies) and an Inconsistent Dose Level Group (ie, dose levels were different in the single and repeat dose studies with repeat dose always being lower than the single dose).

For the Consistent Dose Level Group, a positive correlation was observed between the apical POD and the liver BEPOD at all time points (Figure 5). The POD values bestrode the 1:1 correlation line approximately equally for all comparisons. As measured by RMSD, BEPOD prediction accuracy was similar at all exposure time points (Figure 6A); RMSD values ranged from 0.45 to 0.54. Pearson R values ranged from 0.65 to 0.75 (Figure 6B). The percentage of molecules with a liver BEPOD within 3X or 10X of the apical POD was similar across all time points (Figs. 6C and 6D). Approximately 70% of molecules had a BEPOD within 3X of the apical POD at all time points; approximately 95% of molecules had a BEPOD within 10X of the apical POD at all time points. The median fold difference of the BEPOD to the apical POD was approximately 1 at all time points (range: 0.6–1.1) (Figure 6E).

Figure 5.

Concordance of the 29-day systemic apical point of departure (POD) and liver biological effect point of departure (BEPOD) at different exposure time points. Shown is a comparison of the 29-day systemic apical POD with the liver BEPOD from 3 h to 29 days. In all graphs, each dot represents a distinct molecule, the blue line represents a 1:1 concordance, and the red lines represent the 10X boundary between values. Panels on the left contain data for molecules having the same (ie, consistent) dose levels selected for all time points. Panels on the right contain data for molecules having different (ie, inconsistent) dose levels used between single dose and repeated dose studies. Abbreviations: mkd, mg/kg (body weight)/day; R, Pearson R value; RMSD, root mean square difference value.

Figure 6.

Metrics quantifying the 29-day systemic apical point of departure (POD) compared with the liver biological effect points of departure (BEPODs) at different exposure time points. The dataset was divided into 2 groups: molecules with consistent dose levels used and molecules with inconsistent dose levels used. Each panel shows a different metric comparing the 29-day systemic apical POD with the liver BEPOD from 3 h to 29 days: (A) RMSD, root mean square difference; (B) Pearson R; (C) the percentage of molecules with a liver BEPOD within 3X of the 29-day systemic apical POD; (D) the percentage of molecules with a liver BEPOD within 10X of the 29-day systemic apical POD; and (E) the median fold difference between the BEPOD and the 29-day systemic apical POD. Data shown in panel E are the median with 95% confidence interval. To generate data shown in panel E, the molecule liver BEPOD at each exposure time point was divided by that molecule’s 29-day systemic apical POD. The dashed line in panel E is a value of 1.

For the Inconsistent Dose Level Group, a positive correlation was observed between the 29-day apical POD and the liver BEPOD at all time points (Figure 5). Pearson R values ranged from 0.64 to 0.92. POD values bestrode the 1:1 correlation line for comparisons involving the repeated dose (4–29 days) BEPOD values. However, POD values appeared to be skewed below the 1:1 correlation line for the comparisons using the single dose (3–24 h) BEPOD values; in other words, the BEPOD values from single dose exposure time points tended to be higher as a group than the 29-day apical POD. As measured by RMSD, prediction accuracy was good at all exposure time points but improved (ie, RMSD values were lower) for comparisons using repeat dose BEPOD data (Figure 6A). RMSD values ranged from 0.64 to 0.82 for the single dose data and from 0.34 to 0.49 for the repeat dose data. The percentage of molecules with a BEPOD value within 3X or 10X of the apical POD value was higher for comparisons using repeat dose BEPOD data compared with comparisons using single dose BEPOD data (Figs. 6C and 6D). For the single dose time points, the percentage of molecules having a BEPOD within 3X or 10X of the apical POD ranged from 29 to 43 and 75 to 89, respectively. For the repeat dose time points, the percentage within 3X increased to between 68 and 82, and the percentage within 10X increased to between 96 and 100. The median fold difference of the BEPOD to the apical POD was approximately 3.5 at the single dose time points (range: 2.9–3.7) (Figure 6E); however, the median fold difference was approximately 1 at all repeat dose time points (range: 0.8–1.7) (Figure 6E). For all of these metrics, it was noted that the data were similar for the Inconsistent Dose Level Group and the Consistent Dose Level Group at the repeat dose time points (Figure 6), when dose level selection was consistent for both groups of molecules.

Quantitative Effect of Low Dose Level Selection on the Liver BEPOD

Because dose level selection influenced the liver BEPOD value, the magnitude of this effect was quantified. Of the 28 molecules in the Inconsistent Dose Level Group, 15 molecules had a single dose study in which the low dose level was 3–4X higher than the corresponding repeat dose study low dose level (Supplementary Table 10). The remaining 13 molecules had a single dose study in which the low dose level was 10X higher than the corresponding repeat dose study low dose level (Supplementary Table 10).
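The grouping by low dose level can be sketched as a ratio of the single dose study low dose to the repeat dose study low dose. This is a simplified illustration: the function name, the labels, and the tolerance are assumptions, and the sketch considers only the low dose level rather than the full set of dose levels used in each study.

```python
import math


def low_dose_group(single_low, repeat_low, tol=1e-9):
    """Classify a molecule by how much higher its single dose low dose level is."""
    ratio = single_low / repeat_low
    if math.isclose(ratio, 1.0, rel_tol=tol):
        return "same low dose"
    # The tolerance guards against floating point error at the interval edges.
    if 3.0 * (1 - tol) <= ratio <= 4.0 * (1 + tol):
        return "3-4X"
    if math.isclose(ratio, 10.0, rel_tol=tol):
        return "10X"
    return "other"


# Hypothetical low dose levels in mg/kg/day: (single dose study, repeat dose study).
example_groups = [
    low_dose_group(30.0, 10.0),
    low_dose_group(100.0, 10.0),
    low_dose_group(10.0, 10.0),
]
```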

For molecules in the 3–4X group, no statistically identified variance was observed among all group comparisons in the median fold difference between the BEPOD and the apical POD (Table 1). Within the 3–4X group, the median fold difference ranged from 1.8 to 2.9 for the single dose time points and from 0.9 to 1.8 for the repeat dose time points (Figs. 7A and 8A); the means of the median fold differences were 2.1 and 1.3, respectively. For molecules in the 10X group, several statistically identified variances in the median fold difference between the BEPOD and the apical POD were found when comparing data for single dose time points with data for repeat dose time points (Table 1). Within the 10X group, the median fold difference ranged from 3.6 to 5.4 for the single dose time points and from 0.7 to 1.1 for the repeat dose time points (Figs. 7A and 8A); the means of the median fold differences were 4.3 and 0.9, respectively.
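To make the group comparison concrete, the sketch below computes the Kruskal-Wallis H statistic, with the standard tie correction, across groups of fold differences such as one group per exposure time point. It is a standard-library illustration with hypothetical function names and example values, not the authors' analysis: it does not compute the p value, which would be obtained from a chi-square distribution with (number of groups − 1) degrees of freedom, and it does not perform the Dunn's posttest.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Hypothetical fold differences for three time points, three molecules each.
example_h = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
```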

Table 1.

Statistical Analysis of 3 h to 29-day Liver BEPODs Compared With 29-day Apical POD

Molecules With a ×3–4 Difference in Low Dose Level
Kruskal-Wallis test ANOVA p value > .09
Fold difference between liver BEPOD and 29-day apical POD

| | 3 h | 6 h | 9 h | 24 h | 4 days | 8 days | 15 days | 29 days |
|---|---|---|---|---|---|---|---|---|
| Median | 1.9 | 2.0 | 1.8 | 2.9 | 1.5 | 1.8 | 1.1 | 0.9 |
| Range | 0.08–33.3 | 0.06–43.2 | 0.04–15.1 | 0.6–17.5 | 0.4–4.9 | 0.05–7.7 | 0.6–8.8 | 0.2–2.8 |

Molecules With a ×10 Difference in Low Dose Level
Kruskal-Wallis test ANOVA p value < .0001
Fold difference between liver BEPOD and 29-day apical POD

| | 3 h | 6 h | 9 h | 24 h | 4 days | 8 days | 15 days | 29 days |
|---|---|---|---|---|---|---|---|---|
| Median | 5.4 | 4.5 | 3.8 | 3.6 | 0.7 | 1.1 | 1.1 | 0.7 |
| Range | 0.3–78.5 | 0.6–13.1 | 1.0–29.0 | 0.2–14.6 | 0.05–6.8 | 0.3–7.1 | 0.2–6.6 | 0.2–3.6 |

Dunn’s posttest p values of fold differences

| | 3 h | 6 h | 9 h | 24 h | 4 days | 8 days | 15 days | 29 days |
|---|---|---|---|---|---|---|---|---|
| 3 h | | > .99 | > .99 | > .99 | < .01* | .03* | .01* | < .01* |
| 6 h | > .99 | | > .99 | > .99 | .02* | .29 | .15 | .04* |
| 9 h | > .99 | > .99 | | > .99 | < .01* | .08 | .04* | < .01* |
| 24 h | > .99 | > .99 | > .99 | | .23 | > .99 | > .99 | .47 |
| 4 days | < .01* | .02* | < .01* | .23 | | > .99 | > .99 | > .99 |
| 8 days | .03* | .29 | .08 | > .99 | > .99 | | > .99 | > .99 |
| 15 days | .01* | .15 | .04* | > .99 | > .99 | > .99 | | > .99 |
| 29 days | < .01* | .04* | < .01* | .47 | > .99 | > .99 | > .99 | |

Values marked with an asterisk represent comparisons with a Dunn’s p value < .05.

Figure 7.

Effect of dose level selection on point of departure (POD) concordance. A, These graphs show comparisons between the 29-day systemic apical POD and liver biological effect points of departure (BEPODs) from 3 h to 29 days. B, These graphs show comparisons between the 29-day liver BEPOD and liver BEPODs from 3 h to 15 days. In all graphs, each dot represents a distinct molecule, the blue line represents a 1:1 concordance, and the red lines represent the 10X boundary between values. Molecules having a single dose exposure (ie, 3–24 h) low dose level 3–4X higher than the repeated exposure (ie, 4–29 days) low dose level are shown as black dots. Molecules having a single dose exposure low dose level 10X higher than the repeated exposure low dose level are shown as red dots. Abbreviation: mkd, mg/kg (body weight)/day.

Figure 8.

Quantitative effect of low dose level selection on the liver biological effect point of departure (BEPOD). A, Fold difference between the liver BEPOD from 3 h to 29 days and the 29-day systemic apical point of departure (POD). B, Fold difference between the liver BEPOD from 3 h to 15 days and the 29-day liver BEPOD. Gray bars show data for molecules with a 3–4X difference in the low dose level selected between single dose and repeat dose studies. Black bars show data for molecules with a 10X difference in the selected low dose level between single dose and repeat dose studies. Data shown are the median with 95% confidence interval. The dashed line is a value of 1.

Fold differences between the BEPOD at each exposure time point and the 29-day BEPOD were examined. For molecules in the 3–4X group, no statistically identified variance was observed in the fold difference between the BEPOD and the 29-day BEPOD among all group comparisons (Table 2). Within the 3–4X group, the median fold difference ranged from 1.4 to 3.1 for the single dose time points and from 1.5 to 2.6 for the repeat dose time points (Figs. 7B and 8B); the means of the median fold differences were 2.1 and 2.0, respectively. For molecules in the 10X group, several statistically identified variances in the median fold difference between the BEPOD and the 29-day BEPOD were found when comparing data for single dose time points with data for repeat dose time points (Table 2). Within the 10X group, the median fold difference ranged from 2.4 to 9.2 for the single dose time points and from 0.8 to 1.5 for the repeat dose time points (Figs. 7B and 8B); the means of the median fold differences were 4.9 and 1.2, respectively.

Table 2.

Statistical Analysis of 3 h to 15-day Liver BEPODs Compared With 29-day Liver BEPOD

Molecules With a ×3–4 Difference in Low Dose Level
Kruskal-Wallis test ANOVA p value = .52
Fold difference between liver BEPOD and 29-day liver BEPOD

| | 3 h | 6 h | 9 h | 24 h | 4 days | 8 days | 15 days |
|---|---|---|---|---|---|---|---|
| Median | 1.4 | 2.2 | 1.6 | 3.1 | 2.0 | 1.5 | 2.6 |
| Range | 0.07–47.0 | 0.06–178.1 | 0.04–11.0 | 0.9–14.9 | 0.2–6.8 | 0.2–8.0 | 0.3–7.6 |

Molecules With a ×10 Difference in Low Dose Level
Kruskal-Wallis test ANOVA p value < .0001
Fold difference between liver BEPOD and 29-day liver BEPOD

| | 3 h | 6 h | 9 h | 24 h | 4 days | 8 days | 15 days |
|---|---|---|---|---|---|---|---|
| Median | 9.2 | 4.2 | 3.7 | 2.4 | 0.8 | 1.3 | 1.5 |
| Range | 0.1–34.0 | 0.3–23.0 | 1.4–34.5 | 0.1–26.7 | 0.2–3.6 | 0.5–3.5 | 0.2–7.5 |

Dunn’s posttest p values of fold differences

| | 3 h | 6 h | 9 h | 24 h | 4 days | 8 days | 15 days |
|---|---|---|---|---|---|---|---|
| 3 h | | > .99 | > .99 | > .99 | .0004* | .0096* | .02* |
| 6 h | > .99 | | > .99 | > .99 | .007* | .1 | .2 |
| 9 h | > .99 | > .99 | | > .99 | .002* | .03* | .07 |
| 24 h | > .99 | > .99 | > .99 | | .1 | > .99 | > .99 |
| 4 days | .0004* | .007* | .002* | .1 | | > .99 | > .99 |
| 8 days | .0096* | .1 | .03* | > .99 | > .99 | | > .99 |
| 15 days | .02* | .2 | .07 | > .99 | > .99 | > .99 | |

Values marked with an asterisk represent comparisons with a Dunn’s p value < .05.

DISCUSSION

The purpose of this project was to determine the predictivity of an in vivo BEPOD for an in vivo apical POD across various durations of exposure using data from a large set of molecules. To do this, a single BMDExpress BEPOD derivation workflow was required for all analyses, but no scientific consensus exists currently on the optimal algorithm to derive a BEPOD. One strategy is to perform BMD analysis on individual genes and then use gene sets (ie, pathways) to aggregate the individual gene BMD values into a single BEPOD value [4, 7, 14–17]. However, how to best determine which gene set should be used for the BEPOD is still an open question. Farmahin et al. (2017) compared eleven approaches to select a gene set for BEPOD determination, including the functional classification to GO terms implemented in BMDExpress as investigated here. The study concluded that, despite small variations in the results, all tested approaches produced BEPODs within an order of magnitude of apical PODs.

The choice of gene set functional classification filtering parameters for the current approaches might be considered arbitrary because these are based on expert opinion rather than on extensive empirical data. For example, BMDExpress has three filters to select GO-BP terms in the functional classification step of the workflow. The most recent recommendation for this step, based on expert opinion, was to set the minimum number of genes to 3 and the minimum percentage of genes to 5 and to exclude the Fisher’s exact test from the selection. The analyses reported here extend this previous work by considering a much larger set of molecules to obtain a data-driven BEPOD algorithm. The 29-day apical POD value was used as the “gold standard” to benchmark potential BEPOD algorithms, and thousands of different combinations for the three functional classification filters were tested. Under the conditions of this study, the main conclusions were: (1) the minimum number of genes filter had the largest influence on the results; (2) setting the minimum number of genes to 2 yielded the best concordance in terms of both Pearson correlation and RMSD; (3) requiring a small percentage of genes with a BMD value per GO-BP term also yielded better results; and (4) setting the Fisher’s exact two-tailed test threshold to 0.05 also improved the results.

Based on these data, the following functional classification filters were used for BEPOD derivation: (1) GO-BP terms with only 1 gene having a BMD value were excluded; (2) GO-BP terms with < 2% of genes having BMD values were excluded; and (3) GO-BP terms with Fisher’s exact two-tailed test p value > .05 were excluded. The identified choices for the minimum number of genes filter and the minimum percentage of genes filter are slightly more conservative than the most recent National Toxicology Program Expert Panel recommendations for genomic dose response modeling (minimum of 3 genes and 5% of genes) (NTP, 2018). It remains to be determined how well these functional classification filters predict an apical POD using other transcriptome profiling methodology, gene sets, or study designs deploying different numbers of dose groups, group sizes, or dose spacing. Furthermore, one caveat on how the molecules were split in the training and test set for the parameter optimization is that the split was not random. The test set was chosen from the molecules that had BMD values for only a subset of the filtering combinations, to maximize the training set (all training molecules had calculated BMD values for all filtering combinations). However, the Pearson R and RMSD values were almost identical for the two sets, showing that the filtering worked equally well for both sets.

Using the BMDExpress functional classification filter outlined above, the data presented here indicated that BEPODs and apical PODs bestrode the unit line, and mean BEPOD and apical POD values were approximately equal across the entire dataset. The BEPOD was derived from what might be considered “concerted biological change” using gene sets and did not rely upon the POD of any single gene. Given the thousands of hypotheses tested in genome-wide transcriptome experiments, the likelihood is high that one gene will by random chance pass all BMDExpress filters, leading to an aberrantly low BEPOD value. The negative impact of using a single gene-based BEPOD to predict the apical POD is illustrated by the results shown in Figure 1A. Such BEPOD values were typically lower than apical POD values, and this demonstrates that using the most sensitive single gene POD value to predict an apical POD is not optimal. Prior work suggesting that gene expression changes may be more sensitive than apical endpoint changes has largely examined temporal sensitivity and not dose sensitivity (Chen et al., 2012; Heinloth et al., 2004). Like these prior data, our results also show that gene expression changes occur prior to apical endpoint effects in time; however, PODs for concurrent or prospective apical endpoints and BEPODs based on “concerted biological change” at the gene expression level are approximately equal across the dose continuum. Future research to identify a toxicologically-relevant BEPOD should focus on a unifying method to define the threshold of “concerted biological change” at the gene expression level, analogous in concept to a No Observed Transcriptional Effect Level (Lobenhofer et al., 2004). We suggest this BEPOD may be derived from genome-wide transcriptome analysis and hypothesize it is the initial point along the exposure continuum showing a burst of increasing and persistent gene expression change.

Across the large number of molecules examined, the liver BEPOD showed stability over time. Prior work using a small number of molecules has demonstrated that a transcriptomic BEPOD derived from exposure durations spanning 5 to 90 days can predict a chronic study apical POD within an order of magnitude (Thomas et al., 2013). These data imply that a BEPOD may be stable within 5 days of exposure. Data presented here across a large number of molecules suggest that a liver BEPOD may stabilize within hours of exposure. In our analysis, liver BEPODs varied little from an acute exposure to a 29-day repeat dose exposure. Toxicokinetic parameters such as time to reach steady-state exposure and time-dependent changes in metabolite formation may have some influence on the stability of a BEPOD over time. To examine the potential influence of such parameters in greater detail, it will be necessary to critically examine BEPOD stability using molecules with widely varying biopersistence and metabolism profiles.

An interesting result within our analysis was the predictive accuracy of a BEPOD derived from the liver for an apical POD derived from a non-liver compartment. A liver BEPOD predicted the apical POD within an order of magnitude when that apical POD was derived from a body weight, clinical observation, or kidney endpoint. The biological basis for how liver gene expression might be linked to apical effects in non-liver compartments is unknown. However, it is speculated that molecular signals are produced at sites of injury, become systemically available, and alter liver gene expression. Such molecular organ-to-organ crosstalk involving the liver has been observed in normal physiology and following organ injury (Lee et al., 2018; Poole et al., 2017). Alternatively, the liver may experience relatively high exposure due to first pass metabolism following an oral exposure; however, a non-liver apical POD was still predicted by the liver BEPOD for the small number of TG-GATES molecules administered intravenously (Figure 4H). From a toxicity assessment standpoint, these data suggest the liver may be a useful surrogate organ for determining a systemic apical POD. A caveat of this conclusion is that our analysis included only apical data available within TG-GATES and not all potential apical effects. Our data corroborate a previous finding from a benzo[a]pyrene exposure model indicating a rodent liver BEPOD approximated a lung and forestomach apical POD (Moffat et al., 2015). To properly test this hypothesis, additional data using a larger chemical space and examining a variety of toxicity endpoints will be needed.

Prior research examining optimal study design for a BMD derivation concluded that 3 parameters were important: (1) the total number of animals in the study (not group size); (2) including a dose level approximating the POD; and (3) including at least one dose level with a robust response (Slob et al., 2005). Although data were available within TG-GATES for more than the 79 molecules examined here, many molecules were culled from the analysis because the lowest dose level used produced a robust apical response. It was reasoned that such datasets violated the important parameter of having a dose level approximating the POD. Even after this culling, dose level selection influenced the BEPOD. When the low dose level was increased by 10X, the resulting BEPOD showed a mean increase of approximately 4X. These data confirm the need for proper dose placement across a wide dose range to accurately identify a BMD-based POD, as recommended by prior studies (NTP, 2018; Slob et al., 2005).

A scientific consensus is building supporting the hypothesis that a short-term BEPOD can predict a concurrent and future apical POD. The data presented here show that a liver BEPOD derived from an acute exposure can predict with reasonable accuracy an apical POD at 29 days of exposure. As demonstrated for a limited number of molecules and toxicity types, a subacute (eg, 5 days) or subchronic exposure BEPOD in a rodent (rat and mouse) can predict a chronic/cancer adverse apical POD occurring over a rodent lifetime (Chepelev et al., 2018; Dong et al., 2016; Jackson et al., 2014; Moffat et al., 2015; Thomas et al., 2013). If confirmed by additional research, the prediction of an apical POD from a short-term exposure BEPOD may argue for a paradigm shift in how plant protection product and industrial chemical regulatory toxicology studies are conducted.

One might envision an in vivo toxicity testing strategy with the sole aim to derive an organismal BEPOD for use in the human health risk assessment process. Such a testing program may require only a small number of short-term studies appropriately encompassing all life stage-based windows of susceptibility and, if realized, would use many fewer animals and considerably shorten toxicity testing timelines (Chen et al., 2012). Because omic technology can comprehensively query molecular change and all apical effects require a prior change in the molecular landscape (Farr and Dunn, 1999; LaRocca et al., 2017; Vinken et al., 2017), it follows that an organismal BEPOD would theoretically capture all possible toxicities. The currently deployed toxicity testing paradigm uses apical endpoints to identify the most sensitive adverse effect and is a legacy paradigm established decades ago when modern molecular systems biology analytical tools were not available (Fitzhugh, 1959; Fitzhugh and Schouboe, 1959). In a risk-based human health regulatory framework utilizing a BEPOD for risk assessment, the value of conducting time- and animal-intensive studies such as the rodent cancer bioassay or multigeneration reproduction studies for the sole purpose of identifying the hazard may not be warranted from an animal use or human health protection perspective.

Before such a testing paradigm shift using BEPOD derivation as the regulatory endpoint could be considered, it is recognized that extensive additional research will be necessary, including examining the predictivity of a BEPOD for an apical POD across various windows of susceptibility and encompassing a large chemical, mode-of-action, and toxicological space. These data would be needed to ensure a BEPOD-based testing program is at least as protective of human health as the current testing program. Support from various scientific sectors for generating data required to achieve such a paradigm shift can be found in the literature (Buesen et al., 2017; LaRocca et al., 2017; Mezencev and Subramaniam, 2019; Schmitz-Spanke, 2019; Yauk et al., 2020).

Together with published data, the results described here suggest that the ability of a BEPOD to predict an apical POD is agnostic to the chemical space or mode-of-action. Prior BEPOD and apical POD comparisons largely examined industrial molecules, environmental contaminants, or agrochemicals and generally showed that a BEPOD predicted an apical POD within an order of magnitude (Bhat et al., 2013; Chepelev et al., 2015; Farmahin et al., 2019; Jackson et al., 2014; Moffat et al., 2015; Thomas et al., 2013; Zhou et al., 2017). The 79 molecules examined here are mostly drugs and have a wide range of therapeutic modes-of-action (and presumably toxicity modes-of-action) (Lee et al., 2016). Collectively, these data suggest that a transcriptome-based BEPOD may accurately predict an apical POD across multiple toxicity modes-of-action, chemistries with specific or promiscuous biological targets, and a wide range of chemical space.
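As an illustration of what "within an order of magnitude" means for paired PODs, the following minimal Python sketch computes a symmetric fold difference, the fraction of molecules falling within a chosen fold threshold, and a root mean square difference. The function names and the dose values are hypothetical and are not taken from the study, and computing the root mean square difference on log10-transformed doses is an assumption made here for illustration rather than a statement of the authors' exact procedure.

```python
import math


def fold_difference(bepod, apical_pod):
    """Symmetric fold difference (always >= 1) between two positive PODs."""
    if bepod <= 0 or apical_pod <= 0:
        raise ValueError("PODs must be positive doses")
    ratio = bepod / apical_pod
    return max(ratio, 1.0 / ratio)


def fraction_within(bepods, apical_pods, fold):
    """Fraction of paired PODs whose fold difference is <= `fold`."""
    pairs = list(zip(bepods, apical_pods))
    hits = sum(fold_difference(b, a) <= fold for b, a in pairs)
    return hits / len(pairs)


def log10_rmsd(bepods, apical_pods):
    """Root mean square difference of log10-transformed paired PODs.

    Assumption: differences are taken on the log10 dose scale.
    """
    diffs = [math.log10(b) - math.log10(a) for b, a in zip(bepods, apical_pods)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical paired PODs (mg/kg/day); illustrative values only.
example_bepods = [10.0, 30.0, 100.0, 5.0]
example_apical = [20.0, 30.0, 25.0, 40.0]

if __name__ == "__main__":
    print("within 3-fold:", fraction_within(example_bepods, example_apical, 3))
    print("within 10-fold:", fraction_within(example_bepods, example_apical, 10))
    print("log10 RMSD:", round(log10_rmsd(example_bepods, example_apical), 2))
```

The fold difference is made symmetric so that a BEPOD 2-fold below the apical POD and one 2-fold above it are treated as equally concordant.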

Based on the data reported here across 79 molecules and under the conditions of this study, the following conclusions were drawn. A relatively nonrestrictive functional classification filtering method in BMDExpress (at least 2 genes/GO-BP term, at least 2% of genes/GO-BP term, and Fisher’s exact two-tailed test p value not greater than .05) provided the best concordance between a BEPOD and apical POD. Appropriate study design including proper dose placement and a wide dose selection range should be used to most accurately identify a BEPOD. A rat liver BEPOD stabilizes within hours of exposure. A rat liver BEPOD (agnostic to mechanism or mode-of-action) predicts with reasonable accuracy (within 3-fold for a majority of molecules and within an order of magnitude for all molecules) the most sensitive (ie, lowest) POD among liver, kidney, body weight, and clinical observation apical effects. A rat liver BEPOD from an acute or subacute exposure can predict a concurrent or longer-term exposure apical POD.
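The three filtering thresholds named above can be expressed as a simple predicate. The Python sketch below is not BMDExpress code and does not use its API; the `GoTermResult` record, its field names, and the example values are hypothetical. It also assumes that "at least 2% of genes/GO-BP term" refers to the fraction of a term's annotated genes that passed dose-response modeling, which is one plausible reading of the criterion.

```python
from dataclasses import dataclass


@dataclass
class GoTermResult:
    """Hypothetical summary of one GO-BP term after dose-response modeling."""
    term_id: str         # placeholder identifier, not a real GO accession
    genes_with_bmd: int  # genes in the term that yielded a BMD
    genes_in_term: int   # total genes annotated to the term
    fisher_p: float      # Fisher's exact two-tailed p value


def passes_filter(result, min_genes=2, min_fraction=0.02, max_p=0.05):
    """Apply the three thresholds: gene count, gene fraction, and p value."""
    if result.genes_in_term <= 0:
        return False
    fraction = result.genes_with_bmd / result.genes_in_term
    return (
        result.genes_with_bmd >= min_genes
        and fraction >= min_fraction
        and result.fisher_p <= max_p  # "not greater than .05"
    )


# Illustrative records only; the values are invented for the example.
example_terms = [
    GoTermResult("term_a", 3, 100, 0.01),   # meets all three thresholds
    GoTermResult("term_b", 1, 10, 0.001),   # too few genes
    GoTermResult("term_c", 2, 200, 0.01),   # below 2% of genes in the term
    GoTermResult("term_d", 5, 50, 0.20),    # p value too large
]

retained = [t.term_id for t in example_terms if passes_filter(t)]

if __name__ == "__main__":
    print(retained)
```

Exposing the thresholds as keyword arguments makes it straightforward to compare this relatively nonrestrictive setting against stricter alternatives.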

DATA AVAILABILITY

Supplementary data are available at https://doi.org/10.5061/dryad.pvmcvdngd (Johnson et al., 2020).

ACKNOWLEDGMENTS

The authors would like to thank Dr Enrica Bianchi, Dr Selma Davis, Dr Michael DeVito, Dr Navin Elango, Dr Jessica LaRocca, Dr Fred Parham, Dr Reza Rasoulpour, Dr Daniel Svoboda, and Dr Zhongyu Yan for critical discussions throughout this project, statistical support, or bioinformatics support.

FUNDING

This work was supported by Corteva Agriscience and by the National Institute of Environmental Health Sciences (NIEHS) of the National Institutes of Health (NIH) (ES103318-03).

DECLARATION OF CONFLICTING INTERESTS

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

REFERENCES

  1. Bhat V. S., Hester S. D., Nesnow S., Eastmond D. A. (2013). Concordance of transcriptional and apical benchmark dose levels for conazole-induced liver effects in mice. Toxicol. Sci. 136, 205–215.
  2. Buesen R., Chorley B. N., da Silva Lima B., Daston G., Deferme L., Ebbels T., Gant T. W., Goetz A., Greally J., Gribaldo L., et al. (2017). Applying ‘omics technologies in chemicals risk assessment: Report of an ECETOC workshop. Regul. Toxicol. Pharmacol. 91, S3–13.
  3. Chen M., Zhang M., Borlak J., Tong W. (2012). A decade of toxicogenomic research and its contribution to toxicological science. Toxicol. Sci. 130, 217–228.
  4. Chepelev N. L., Gagne R., Maynor T., Kuo B., Hobbs C. A., Recio L., Yauk C. L. (2018). Transcriptional profiling of male CD-1 mouse lungs and Harderian glands supports the involvement of calcium signaling in acrylamide-induced tumors. Regul. Toxicol. Pharmacol. 95, 75–90.
  5. Chepelev N. L., Moffat I. D., Labib S., Bourdon-Lacombe J., Kuo B., Buick J. K., Lemieux F., Malik A. I., Halappanavar S., Williams A., et al. (2015). Integrating toxicogenomics into human health risk assessment: Lessons learned from the benzo[a]pyrene case study. Crit. Rev. Toxicol. 45, 44–52.
  6. Cheung C., Jones-McLean E., Yauk C., Barton-Maclaren T., Boucher S., Bourdon-Lacombe J., Chauhan V., Gagne M., Gillespie Z., Halappanavar S., et al. (2018). Evaluation of the Use of Toxicogenomics in Risk Assessment at Health Canada. An Exploratory Document on Current Health Canada Practices for the Use of Toxicogenomics in Risk Assessment. Health Canada, Ottawa, Canada.
  7. Davis J. A., Gift J. S., Zhao Q. J. (2011). Introduction to benchmark dose methods and U.S. EPA’s benchmark dose software (BMDS) version 2.1.1. Toxicol. Appl. Pharmacol. 254, 181–191.
  8. Dong H., Gill S., Curran I. H., Williams A., Kuo B., Wade M. G., Yauk C. L. (2016). Toxicogenomic assessment of liver responses following subchronic exposure to furan in Fischer F344 rats. Arch. Toxicol. 90, 1351–1367.
  9. EPA. (2012). In Benchmark Dose Technical Guidance, EPA/100/R-12/001, June 2012. (U.S. Environmental Protection Agency, Ed.) Washington, DC.
  10. Farmahin R., Gannon A. M., Gagne R., Rowan-Carroll A., Kuo B., Williams A., Curran I., Yauk C. L. (2019). Hepatic transcriptional dose-response analysis of male and female Fischer rats exposed to hexabromocyclododecane. Food Chem. Toxicol. 133, 110262.
  11. Farmahin R., Williams A., Kuo B., Chepelev N. L., Thomas R. S., Barton-Maclaren T. S., Curran I. H., Nong A., Wade M. G., Yauk C. L. (2017). Recommended approaches in the application of toxicogenomics to derive points of departure for chemical risk assessment. Arch. Toxicol. 91, 2045–2065.
  12. Farr S., Dunn R. T. 2nd (1999). Concise review: Gene expression applied to toxicology. Toxicol. Sci. 50, 1–9.
  13. Fitzhugh O. G. (1959). Chronic Oral Toxicity. Appraisal of the Safety of Chemicals in Food Drugs and Cosmetics, pp. 36–45. The Editorial Committee of the Texas State Department of Health, Austin, TX.
  14. Fitzhugh O. G., Schouboe B. S. (1959). Subacute Toxicity. Appraisal of the Safety of Chemicals in Food Drugs and Cosmetics, pp. 26–35. The Editorial Committee of the Texas State Department of Health, Austin, TX.
  15. Haber L. T., Dourson M. L., Allen B. C., Hertzberg R. C., Parker A., Vincent M. J., Maier A., Boobis A. R. (2018). Benchmark dose (BMD) modeling: Current practice, issues, and challenges. Crit. Rev. Toxicol. 48, 387–415.
  16. Hardy A., Benford D., Halldorsson T., Jeger M. J., Knutsen K. H., More S., Mortensen A., Naegeli H., Noteborn H., Ockleford C., et al. (2017). Update: Use of the benchmark dose approach in risk assessment. EFSA J. 15, 4658–4688.
  17. Heinloth A. N., Irwin R. D., Boorman G. A., Nettesheim P., Fannin R. D., Sieber S. O., Snell M. L., Tucker C. J., Li L., Travlos G. S., et al. (2004). Gene expression profiling of rat livers reveals indicators of potential adverse effects. Toxicol. Sci. 80, 193–202.
  18. Igarashi Y., Nakatsu N., Yamashita T., Ono A., Ohno Y., Urushidani T., Yamada H. (2015). Open TG-GATEs: A large-scale toxicogenomics database. Nucleic Acids Res. 43, D921–927.
  19. Irizarry R. A., Hobbs B., Collin F., Beazer-Barclay Y. D., Antonellis K. J., Scherf U., Speed T. P. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics (Oxford, England) 4, 249–264.
  20. Jackson A. F., Williams A., Recio L., Waters M. D., Lambert I. B., Yauk C. L. (2014). Case study on the utility of hepatic global gene expression profiling in the risk assessment of the carcinogen furan. Toxicol. Appl. Pharmacol. 274, 63–77.
  21. Johnson K. J., Auerbach S. S., Costa E. (2020). A rat liver transcriptomic point of departure predicts a prospective liver or non-liver apical point of departure. doi:10.5061/dryad.pvmcvdngd.
  22. LaRocca J., Johnson K. J., LeBaron M. J., Rasoulpour R. (2017). The interface of epigenetics and toxicology in product safety assessment. Curr. Opin. Toxicol. 6, 87–92.
  23. Lee M., Liu Z., Huang R., Tong W. (2016). Application of dynamic topic models to toxicogenomics data. BMC Bioinformatics 17, 368.
  24. Lee S. A., Cozzi M., Bush E. L., Rabb H. (2018). Distant organ dysfunction in acute kidney injury: A review. Am. J. Kidney Dis. 72, 846–856.
  25. Lobenhofer E. K., Cui X., Bennett L., Cable P. L., Merrick B. A., Churchill G. A., Afshari C. A. (2004). Exploration of low-dose estrogen effects: Identification of no observed transcriptional effect level (NOTEL). Toxicol. Pathol. 32, 482–492.
  26. Mezencev R., Subramaniam R. (2019). The use of evidence from high-throughput screening and transcriptomic data in human health risk assessments. Toxicol. Appl. Pharmacol. 380, 114706.
  27. Moffat I., Chepelev N., Labib S., Bourdon-Lacombe J., Kuo B., Buick J. K., Lemieux F., Williams A., Halappanavar S., Malik A., et al. (2015). Comparison of toxicogenomics and traditional approaches to inform mode of action and points of departure in human health risk assessment of benzo[a]pyrene in drinking water. Crit. Rev. Toxicol. 45, 1–43.
  28. NRC. (2007). Applications of Toxicogenomic Technologies to Predictive Toxicology and Risk Assessment. National Academies of Sciences Press, Washington, DC.
  29. NTP. (2016). West Virginia Chemical Spill: 5-Day Rat Toxicogenomic Studies, July 2016 NTP Update. National Toxicology Program, Research Triangle Park, NC.
  30. NTP. (2018). NTP Research Report on National Toxicology Program Approach to Genomic Dose-response Modeling: Research Report 5. National Toxicology Program, Research Triangle Park, NC.
  31. Phillips J. R., Svoboda D. L., Tandon A., Patel S., Sedykh A., Mav D., Kuo B., Yauk C. L., Yang L., Thomas R. S., et al. (2019). BMDExpress 2: Enhanced transcriptomic dose-response analysis workflow. Bioinformatics (Oxford, England) 35, 1780–1782.
  32. Poole L. G., Dolin C. E., Arteel G. E. (2017). Organ-organ crosstalk and alcoholic liver disease. Biomolecules 7, 62.
  33. Schmitz-Spanke S. (2019). Toxicogenomics—What added value do these approaches provide for carcinogen risk assessment? Environ. Res. 173, 157–164.
  34. Slob W., Moerbeek M., Rauniomaa E., Piersma A. H. (2005). A statistical evaluation of toxicity study designs for the estimation of the benchmark dose in continuous endpoints. Toxicol. Sci. 84, 167–185.
  35. Thomas R. S., Allen B. C., Nong A., Yang L., Bermudez E., Clewell H. J. 3rd, Andersen M. E. (2007). A method to integrate benchmark dose estimates with genomic data to assess the functional effects of chemical exposure. Toxicol. Sci. 98, 240–248.
  36. Thomas R. S., Clewell H. J., Allen B. C., Wesselkamper S. C., Wang N. C. Y., Lambert J. C., Hess-Wilson J. K., Zhao Q. J., Andersen M. E. (2011). Application of transcriptional benchmark dose values in quantitative cancer and noncancer risk assessment. Toxicol. Sci. 120, 194–205.
  37. Thomas R. S., Wesselkamper S. C., Wang N. C. Y., Zhao Q. J., Petersen D. D., Lambert J. C., Cote I., Yang L., Healy E., Black M. B., et al. (2013). Temporal concordance between apical and transcriptional points of departure for chemical risk assessment. Toxicol. Sci. 134, 180–194.
  38. Vinken M., Knapen D., Vergauwen L., Hengstler J. G., Angrish M., Whelan M. (2017). Adverse outcome pathways: A concise introduction for toxicologists. Arch. Toxicol. 91, 3697–3707.
  39. Yang L., Allen B. C., Thomas R. S. (2007). BMDExpress: A software tool for the benchmark dose analyses of genomic data. BMC Genomics 8, 387.
  40. Yauk C. L., Harrill A. H., Ellinger-Ziegelbauer H., van der Laan J. W., Moggs J., Froetschl R., Sistare F., Pettit S. (2020). A cross-sector call to improve carcinogenicity risk assessment through use of genomic methodologies. Regul. Toxicol. Pharmacol. 110, 104526.
  41. Zhou Y. H., Cichocki J. A., Soldatow V. Y., Scholl E. H., Gallins P. J., Jima D., Yoo H. S., Chiu W. A., Wright F. A., Rusyn I. (2017). Editor’s highlight: Comparative dose-response analysis of liver and kidney transcriptomic effects of trichloroethylene and tetrachloroethylene in B6C3F1 mouse. Toxicol. Sci. 160, 95–110.
