Abstract
Evaluating the risk of chemical carcinogenesis has long been a challenge owing to the protracted nature of the pathology and the limited translatability of animal models. Although numerous short-term in vitro and in vivo assays have been developed, they have failed to reliably predict the carcinogenicity of nongenotoxic compounds. Extending previous microarray work (Fielden, M. R., Nie, A., McMillian, M., Elangbam, C. S., Trela, B. A., Yang, Y., Dunn, R. T., II, Dragan, Y., Fransson-Stehen, R., Bogdanffy, M., et al. (2008). Interlaboratory evaluation of genomic signatures for predicting carcinogenicity in the rat. Toxicol. Sci. 103, 28–34), we have developed and extensively evaluated a quantitative PCR-based signature to predict the potential for nongenotoxic compounds to induce liver tumors in the rat as a first step in the safety assessment of potential nongenotoxic carcinogens. The training set was derived from liver RNA from rats treated with 72 compounds and used to develop a 22-gene signature on the TaqMan array platform, providing an economical and standardized assay protocol. Independent testing on over 900 diverse samples (66 compounds) confirmed the interlaboratory precision of the assay and its ability to predict known nongenotoxic hepatocarcinogens (NGHCs). When tested under different experimental designs, strains, time points, dose setting criteria, and other preanalytical processes, the signature sensitivity and specificity were estimated to be 67% (95% confidence interval [CI] = 38–88%) and 59% (95% CI = 44–72%), respectively, with an area under the receiver operating characteristic curve of 0.65 (95% CI = 0.46–0.83). Compounds were best classified using expression data from short-term repeat dose studies; however, the prognostic expression changes appeared to be preserved after longer term treatment. Exploratory evaluations also revealed that different modes of action for nongenotoxic and genotoxic compounds can be discriminated based on the expression of specific genes. These results support a potential early preclinical testing paradigm to catalyze broader understanding of putative NGHCs.
Keywords: nongenotoxic, carcinogenesis, biomarkers, safety evaluation, liver, systems toxicology, toxicogenomics, methods, predictive toxicology, in vitro, alternatives
The rodent cancer bioassay has been used for over 30 years to evaluate the human carcinogenic risk of chemicals. The bioassay requires exposing rats and mice to a test compound for most of their lifetime (~18 to 24 months) up to a maximum tolerated dose based on prior chronic dose-ranging studies. Because of the extensive resources required, only a small fraction of chemicals have undergone carcinogenicity testing relative to the tens of thousands of compounds listed on the U.S. Environmental Protection Agency’s Toxic Substances Control Act Inventory or registered under the Registration, Evaluation, Authorization, and Restriction of Chemicals regulation (Christensen et al., 2011). Additionally, it has been reported that ~31% of marketed drugs have not been tested according to present carcinogenicity testing guidelines (Brambilla and Martelli, 2009). In addition to resource constraints and ethical concerns, the high doses frequently used in the bioassay and the physiological differences between rodents and humans have led to considerable debate over the relevance of the rodent cancer bioassay for assessing human risk (Cohen, 2010; Jacobs, 2005; Maronpot et al., 2004; Melnick et al., 2008; Ward, 2008). As a result, improving upon the current carcinogenicity testing paradigm remains an active area of research.
Because DNA damage is considered a hallmark of carcinogenesis, it is assumed that DNA damaging agents are likely to be carcinogenic. Thus, a number of in vitro and short-term in vivo genotoxicity assays have been developed and validated to detect the ability of chemicals and/or their metabolites to damage or mutate DNA and predict carcinogenic outcome (Kirkland et al., 2005). In contrast to the expectation that genotoxic chemicals are carcinogens, nongenotoxic chemicals cannot be assumed to be noncarcinogenic. Because most compounds that are mutagenic in the Ames test are excluded from drug development, the most frequent adverse outcome observed in rodent cancer bioassays is carcinogenicity initiated by nongenotoxic events. Because of the high maximum tolerated doses used in the rodent cancer bioassay, these carcinogenic events in rodents often occur at exposures above which carcinogenic risk to humans is assumed to be minimal. In addition, many examples exist for which nongenotoxic carcinogenicity in rodents has been conclusively shown not to be relevant for human risk, such as urinary bladder transitional cell carcinoma induced by saccharin or muraglitazar and renal carcinoma induced by D-limonene (Waites et al., 2007; Whysner and Williams, 1996a,b). As a result, there has been a stronger emphasis on understanding the chemical’s mode of action to better evaluate the risk and relevance of the findings to humans (Jacobs, 2005; Jacobs and Jacobson-Kram, 2004). However, the time and resources needed to determine the mechanism of action are considerable, and testing often occurs very late in drug development, if at all. Furthermore, the rodent bioassay typically does not provide the type of mechanistic insight needed to enable this evaluation.
Predicting carcinogenicity induced by nongenotoxic compounds is a challenge due to the many modes of action that have been described to contribute to tumor formation and the multistep process of carcinogenesis (Yamasaki et al., 1996). Nonetheless, numerous assays have been developed in an attempt to predict nongenotoxic carcinogens, including in silico quantitative structure activity relationship models (Contrera et al., 2003; Lee et al., 1995), in vitro mechanistic assays (Yamasaki et al., 1996), cell-based transformation systems (Mauthe et al., 2001; Vanparys et al., 2011), and various (sub)-chronic histological, histochemical, and biochemical indices (Allen et al., 2004; Elcombe et al., 2002; Kitchin et al., 1993; Tatematsu et al., 1987) and combinations thereof (Cohen, 2004; Kitchin et al., 1994). Given the modest predictivity of these approaches or the nature of the assays, there is general agreement that these short-term methods do not reliably predict tumor outcome or provide sufficient information to fully inform a human risk assessment (Jacobs, 2005). As a result, the rodent cancer bioassay remains the gold standard for assessing the human risk of chemical carcinogenesis. Therefore, novel assays or biomarkers that provide an early prediction of a carcinogenic outcome induced by nongenotoxic compounds could enable a more informed compound selection process for early-stage development. This approach could facilitate the proactive initiation of investigative studies to enable an early human risk assessment prior to initiating the rodent bioassay or provide a more efficient hazard identification approach to prioritize chemicals for carcinogenicity testing.
In response to these challenges, genomic or large-scale gene expression profiling has been extensively researched for its ability to predict long-term tumor outcome and/or provide mechanistic data to enable the risk assessment of carcinogens (Waters et al., 2010). The underlying premise of genomic profiling for carcinogenicity prediction is that gene expression changes in the target tissue precede and/or contribute to tumor development and that these changes can be monitored after a short-term in vivo treatment to predict longer term carcinogenic outcomes. To this end, numerous genomic biomarkers or signatures have been described to predict rat hepatocarcinogenicity induced by nongenotoxic compounds (Ellinger-Ziegelbauer et al., 2008; Fielden et al., 2007; Nie et al., 2006; Uehara et al., 2008). The liver is the most common site of tumor formation in the rodent bioassay (Gold et al., 2005), and a number of well-described mechanisms of liver tumor formation are amenable to evaluation based on hepatic gene expression (Waters et al., 2010), making it an ideal model system to evaluate the utility of genomics for carcinogenicity assessment.
Building on the genomic signatures originally described by Fielden et al. (2007) and Nie et al. (2006), we have previously demonstrated the statistical robustness of these proposed signatures for predicting nongenotoxic hepatocarcinogens (NGHCs) (Fielden et al., 2008). However, it was concluded that the published signatures lacked sufficient classification accuracy when used as is, likely because of experimental variables that differed across laboratories, including the microarray platform and study conditions such as time and dose. We reasoned that if the gene expression measurement platform were controlled for and the reproducibility of gene expression measurements were enhanced, a signature could be derived and more thoroughly evaluated for its ability to predict NGHCs and to refine its boundaries of use for optimal classification. Furthermore, to enable broad utilization and evaluation across laboratories, the signature had to be commercially available, established on a reliable and readily available measurement platform, biologically interpretable, and thoroughly evaluated across hundreds of diverse samples. Because the signature is not intended to replace the chronic rodent bioassay but rather to guide internal decision making, allow prioritization of chemicals for formal testing, possibly reduce the reliance on longer term animal studies, and/or enable a more rapid understanding of mode of action, a rigorous validation of the signature as a replacement for chronic rodent studies was not an objective. Instead, the objectives were to develop a signature to enable an early evaluation of NGHCs and to make the signature and underlying data publicly available for broader testing.
MATERIALS AND METHODS
TaqMan array card design.
We chose to rederive the initial microarray-based signature using quantitative real-time PCR (qPCR) to provide a widely accessible, higher throughput gene expression platform to support evaluation. To this end, we chose the TaqMan array platform (384-well microfluidic cards) to develop a custom array (Applied Biosystems, part of Life Technologies, Foster City, CA). To maximize sample throughput, we designed a TaqMan array with 32 primer pairs, which permits the analysis of four samples per card in triplicate wells. The predictor genes considered for evaluation included 37 genes from the Iconix signature (Fielden et al., 2007) and six genes from the signature published by Nie et al. (2006). An additional 10 genes from the genotoxic carcinogen signature published by Bayer (Ellinger-Ziegelbauer et al., 2004) were included because it was considered desirable to distinguish nongenotoxic from genotoxic modes of action. Because it was not practical to evaluate all 53 genes, steps were taken to identify 11 genes from the original Iconix 37-gene signature that could provide similar predictive accuracy (data not shown). This resulted in the final selection of 27 unique genes. Three of these genes were evaluated using multiple primer pair sequences (Trnt1, EST AW143969, and Sel1I). Three normalizer genes were also selected to identify an appropriate transcript to normalize and assess input RNA quality (Table 1). Primers and probes were designed by Applied Biosystems according to published design rules (Applied Biosystems).
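The card layout described above can be checked with a short sketch. The numbers (384 wells, 32 primer pairs, triplicate wells, 37 + 6 + 10 candidate genes) come from the text; the constant and function names are hypothetical and are not part of the original protocol.

```python
# Illustrative check of the TaqMan array card layout described in the text.
# All names are hypothetical; only the numeric values come from the paper.
WELLS_PER_CARD = 384      # 384-well microfluidic card
ASSAYS_PER_CARD = 32      # primer pairs placed on the custom array
REPLICATE_WELLS = 3       # each assay is run in triplicate for every sample

# Candidate predictor genes considered before down-selection.
CANDIDATE_GENES = {"Fielden 2007": 37, "Nie 2006": 6, "Ellinger-Ziegelbauer 2004": 10}


def samples_per_card(wells: int, assays: int, replicates: int) -> int:
    """Return how many samples fit on one card, requiring an exact fit."""
    wells_per_sample = assays * replicates
    if wells % wells_per_sample != 0:
        raise ValueError("layout would leave unused wells on the card")
    return wells // wells_per_sample


def total_candidates(counts: dict) -> int:
    """Sum the candidate genes contributed by each published signature."""
    return sum(counts.values())


print(samples_per_card(WELLS_PER_CARD, ASSAYS_PER_CARD, REPLICATE_WELLS))
print(total_candidates(CANDIDATE_GENES))
```

The exact-fit check makes explicit why 32 assays is a convenient design target: any count that does not divide the card evenly in triplicate would waste wells.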
TABLE 1.
TaqMan Assays Used for qPCR Signature Development
| Assay ID | Accession | Gene symbol | Gene name | Source | In final model |
|---|---|---|---|---|---|
| Rn03399817_g1 | AI232085.1 | Trnt1 | tRNA nucleotidyl transferase, CCA-adding, 1 | Fielden et al., 2007 | No |
| Rn03399820_s1 | AI232085.1 | Trnt1 | tRNA nucleotidyl transferase, CCA-adding, 1 | Fielden et al., 2007 | Yes |
| Rn03399816_s1 | AW143969.1 | EST | EST | Fielden et al., 2007 | No |
| Rn03399821_s1 | AW143969.1 | EST | EST | Fielden et al., 2007 | No |
| Rn03399822_s1 | AW143969.1 | EST | EST | Fielden et al., 2007 | Yes |
| Rn03399819_s1 | AW533663.1 | Prodh | Proline dehydrogenase | Fielden et al., 2007 | Yes |
| Rn03399815_s1 | AW915076.1 | Gpr146 | G protein-coupled receptor 146 | Fielden et al., 2007 | Yes |
| Rn03399814_s1 | BF553500.1 | Cited4 | Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 4 | Fielden et al., 2007 | Yes |
| Rn00680664_g1 | NM_012708.1 | Psmb9 | Proteasome (prosome, macropain) subunit, beta type 9 | Fielden et al., 2007 | Yes |
| Rn01452409_m1 | NM_030844.2 | Ica1 | Islet cell autoantigen 1 | Fielden et al., 2007 | Yes |
| Rn00587206_m1 | NM_053774.2 | Usp2 | Ubiquitin-specific peptidase 2 | Fielden et al., 2007 | Yes |
| Rn01475179_m1 | NM_138882.1 | Pla1a | Phospholipase A1 member A | Fielden et al., 2007 | Yes |
| Rn01424675_m1 | U53184 | Litaf | Lipopolysaccharide-induced TNF factor | Fielden et al., 2007 | Yes |
| Rn01432563_g1 | NM_001007629.1 | Nutf2 | Nuclear transport factor 2 | Nie et al., 2006 | No |
| Rn00689231_m1 | NM_012860.2 | Mat1a | Methionine adenosyltransferase I, alpha | Nie et al., 2006 | Yes |
| Rn02132590_g1 | NM_021766.1 | Pgrmc1 | Progesterone receptor membrane component 1 | Nie et al., 2006 | No |
| Rn00821759_g1 | NM_138826.4 | Mt1a | Metallothionein 1a | Nie et al., 2006 | Yes |
| Rn00756519_m1 | NM_173295.1 | Ugt2b17 | UDP glucuronosyltransferase 2 family, polypeptide B17 | Nie et al., 2006 | Yes |
| Rn01517723_m1 | NM_177933.2 | Sel1I | Sel-1 suppressor of lin-12-like (C. elegans) | Nie et al., 2006 | No |
| Rn00710081_m1 | NM_177933.2 | Sel1I | Sel-1 suppressor of lin-12-like (C. elegans) | Nie et al., 2006 | Yes |
| Rn03399818_s1 | AI639488.1 | Mdm2 | Mdm2 p53-binding protein homolog | Ellinger-Ziegelbauer et al., 2004 | No |
| Rn00563462_m1 | NM_012861.1 | Mgmt | O-6-methylguanine-DNA methyltransferase | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn00566256_m1 | NM_013215.1 | Akr7a3 | Aldo-keto reductase family 7, member A3 | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn00568504_m1 | NM_017259.1 | Btg2 | B-cell translocation gene 2, anti-proliferative | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn01530533_g1 | NM_019905.1 | Anxa2 | Annexin A2 | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn00755484_m1 | NM_022407.3 | Aldh1a1 | Aldehyde dehydrogenase 1 family, member A1 | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn00709612_m1 | NM_032055 | Tap1 | Transporter 1, ATP-binding cassette, sub-family B | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn01427989_s1 | NM_080782.3 | Cdkn1a | Cyclin-dependent kinase inhibitor 1A | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn00592205_m1 | NM_133586.1 | Ces2 | Carboxylesterase 2 (intestine, liver) | Ellinger-Ziegelbauer et al., 2004 | Yes |
| Rn00690933_m1 | NM_017101.1 | Ppia | Peptidylprolyl isomerase A (cyclophilin A) | Housekeeping gene | Yes |
| Hs99999901_s1 | X03205.1 | 18S | 18S ribosomal RNA | Housekeeping gene | No |
| Rn99999916_s1 | X02231.1 | Gapdh | Glyceraldehyde 3-phosphate-dehydrogenase | Housekeeping gene | No |
Note. C. elegans, Caenorhabditis elegans; TNF, tumor necrosis factor; UDP, uridine diphosphate.
TaqMan array card assay.
RNA concentration and quality were determined using a NanoDrop ND-1000 Spectrophotometer (Thermo Scientific, Wilmington, DE). A total of 220 ng of liver RNA from each animal was reverse transcribed using the High Capacity complementary DNA (cDNA) RT Kit according to the manufacturer’s instructions (Applied Biosystems). The cDNA was diluted to 2 ng/μl in water and 105 μl were mixed with an equal volume of 2× TaqMan Universal Master Mix (Applied Biosystems). One hundred microliters were then injected into each of two ports on the TaqMan array and analyzed on the Applied Biosystems 7900HT Real-Time PCR System according to the manufacturer’s instructions.
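The volumes in this protocol can be sanity-checked with a small calculation. This is a sketch only: it assumes the 2 ng/μl figure refers to cDNA expressed as input RNA equivalents, and all names are hypothetical.

```python
# Illustrative arithmetic for the TaqMan array reaction mix.
# Assumption: cDNA concentration is expressed as ng of input RNA per microliter.
CDNA_CONC_NG_PER_UL = 2.0     # diluted cDNA
CDNA_VOLUME_UL = 105.0        # cDNA volume added to the mix
MASTER_MIX_VOLUME_UL = 105.0  # equal volume of 2x master mix
PORT_VOLUME_UL = 100.0        # volume injected per port
PORTS_LOADED = 2              # ports loaded per sample


def mix_summary() -> dict:
    """Derive total volume, final concentrations, and per-port load."""
    total_volume = CDNA_VOLUME_UL + MASTER_MIX_VOLUME_UL
    cdna_mass = CDNA_CONC_NG_PER_UL * CDNA_VOLUME_UL
    final_conc = cdna_mass / total_volume
    loaded_volume = PORT_VOLUME_UL * PORTS_LOADED
    return {
        "total_volume_ul": total_volume,
        "cdna_in_mix_ng": cdna_mass,
        "final_cdna_ng_per_ul": final_conc,
        "master_mix_fold": 2.0 * MASTER_MIX_VOLUME_UL / total_volume,
        "loaded_volume_ul": loaded_volume,
        "excess_volume_ul": total_volume - loaded_volume,
        "cdna_per_port_ng": final_conc * PORT_VOLUME_UL,
    }


for key, value in mix_summary().items():
    print(f"{key}: {value}")
```

Under these assumptions the mix slightly exceeds the volume loaded, which leaves a small pipetting margin.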
Liver RNA samples.
To develop a de novo signature from qPCR data to predict nongenotoxic hepatocarcinogenicity in the rat, we reanalyzed rat liver RNA samples that had previously been used to identify and evaluate the original Iconix microarray signature (Fielden et al., 2007). Briefly, these samples were derived from male Sprague-Dawley (SD) rats that were administered compound (NGHCs or nonhepatocarcinogens [NH]) or vehicle by oral gavage once daily for 1, 3, or 5 days (n = 3 per group). The considerations for compound classification are described below. The doses administered were considered maximally tolerated in a 5-day study and induced decreases in body weight gain or histological changes in target organs but did not induce severe clinical signs that may otherwise confound interpretation of gene expression changes. Rats were necropsied 24 h after the last dose and liver was stored frozen until RNA extraction according to Fielden et al. (2007). RNA samples were stored at −70°C and were checked to ensure sufficient material to permit cDNA synthesis, as some RNA samples had been depleted or were of low quality. Samples were selected to ensure at least two to three rats per treatment and control group. Vehicle control samples were matched based on common vehicle (aqueous or corn oil) and date that the study was run (i.e., year/quarter). In total, there were 415 RNA samples representing 121 treatment groups, which were analyzed on the TaqMan array. The analyzed log10 ratio data for all treatment groups are provided in Supplementary table 1.
For an independent sample set, we obtained over 900 rat liver RNA samples representing 178 treatment groups from a variety of studies performed at collaborators’ facilities, as further described below. Each treatment group had its own vehicle-matched control and consisted of at least three animals per group. All original SDS files and the R script to execute the model are available upon request to the author.
Compound classification.
A chemical was classified as a hepatocarcinogen if it was (1) found to induce liver tumors in a 2-year carcinogenicity study in at least one strain or gender of rat or (2) reasonably expected to induce liver tumors based on a known class effect (e.g., peroxisome proliferator-activated receptor alpha [PPARα] agonists, steroid hormones). Due to the high false-positive rate of some in vitro genotoxicity assays, we decided to classify hepatocarcinogens as nongenotoxic if there was sufficient literature evidence that they induce liver tumors primarily through a nongenotoxic mechanism despite having a positive finding in an in vitro genotoxicity assay (e.g., phenobarbital, clofibrate). Although we cannot discount the involvement of genotoxic mechanisms in tumor formation for these chemicals, we chose to include these chemicals as NGHCs to improve our ability to identify nongenotoxic mechanisms that may lead to tumor formation.
A chemical was classified as negative for hepatocarcinogenicity if it was (1) found not to induce liver tumors in a 2-year rodent bioassay in both male and female rats or (2) not expected to induce liver tumors based on an antiproliferative mode of action. NHs with a positive finding in a genotoxicity assay were not expected to affect the ability of the signature to identify NGHCs and thus were not specifically excluded from the NH class. Because the assay was restricted to hepatic gene expression, tumor formation in other organs was not considered in the classification nor was the presence or absence of carcinogenic activity in the mouse. No differentiation among tumor types was made, and the term hepatocarcinogenicity is used throughout to refer to chemicals that have been identified to induce adenomas and/or carcinomas.
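The classification criteria in the two paragraphs above can be summarized as a small decision function. This is a simplified sketch, not the authors' procedure: the `Evidence` fields and the `classify` function are hypothetical, expert literature review is reduced to boolean flags, and the "GHC" branch is an assumption because the text does not give an explicit rule for that class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    """Hypothetical summary of the literature evidence for one chemical."""
    # True: liver tumors in at least one rat strain or sex in a 2-year study.
    # False: no liver tumors in both male and female rats. None: not tested.
    rat_liver_tumors: Optional[bool]
    class_effect_expected: bool = False   # e.g., PPAR-alpha agonist, steroid hormone
    antiproliferative_moa: bool = False   # not expected to induce liver tumors
    primarily_nongenotoxic: bool = False  # literature supports a nongenotoxic mechanism


def classify(ev: Evidence) -> str:
    """Return 'NGHC', 'GHC', 'NH', or 'unknown' under the simplified rules."""
    is_hepatocarcinogen = ev.rat_liver_tumors is True or ev.class_effect_expected
    if is_hepatocarcinogen:
        # A positive in vitro genotoxicity finding alone does not exclude NGHC;
        # what matters here is the weight of evidence on the tumor mechanism.
        return "NGHC" if ev.primarily_nongenotoxic else "GHC"
    if ev.rat_liver_tumors is False or ev.antiproliferative_moa:
        # Genotoxicity findings are not used to exclude chemicals from NH.
        return "NH"
    return "unknown"


print(classify(Evidence(rat_liver_tumors=True, primarily_nongenotoxic=True)))
print(classify(Evidence(rat_liver_tumors=False)))
```

Tumor findings in other organs or in the mouse are deliberately absent from the inputs, mirroring the restriction to rat liver described above.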
Data on carcinogenicity outcomes were obtained from the Carcinogenic Potency Database (http://potency.berkeley.edu), the National Library of Medicine Chemical Carcinogenesis Research Information System (http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CCRIS), the National Toxicology Program (NTP) Database (http://ntp-apps.niehs.nih.gov/ntp_tox/index.cfm), the Physicians’ Desk Reference (http://www.pdrhealth.com), or peer-reviewed publications (Brambilla and Martelli, 2009; Davies and Monro, 1995; Haseman et al., 1987). Findings reported in the literature were used without reinterpretation or reclassification with respect to their statistical significance or relationship to treatment. Furthermore, no attempt was made to segregate chemicals based on the incidence or severity of tumor formation because the doses used in the current study are likely higher than those used in the rodent bioassay, and biasing the training set toward only potent carcinogens may hinder the sensitivity of the biomarker toward weaker carcinogens that are still of regulatory concern. We recognize that alternative classification of chemicals is possible, given discrepancies in the literature or the rodent bioassay; however, the goal was to derive a signature that provides a sensitive means of identifying NGHCs and modes of action that are expected to contribute to tumor formation rather than to recapitulate a specific rodent bioassay result.
Model development step 1: Process evaluation.
The modeling strategy is outlined in Figure 1. We define a model in this context as a specific combination of parameters that are integrated to produce a single score for a given compound at a given dose and time (see model development in Supplementary materials and methods for more information about the model parameters). The general process is to use one subset of data to find the optimal model and a second set of data to test the single optimal model. The first step was to evaluate the process for selecting the optimal model. This involved estimating the accuracy (in terms of area under the receiver operating characteristic [ROC] curve [AUC] and proportion classified correctly) of the model building process in an evaluation phase using 72 compounds profiled on day 5 from the Iconix data set (Table 2). The AUC is equal to the probability that a classification model will rank a randomly chosen positive sample higher than a randomly chosen negative sample and is commonly used to select optimal models independent of class distribution. An AUC of 1.0 reflects a perfect classifier with 100% classification accuracy. This first step also enabled us to evaluate the model-building process on a set of samples from the same site with animals treated under the same protocol. Compounds not profiled on day 5 were excluded because including them would have resulted in a skewed distribution of early (day 1) and late (days 3 and 5) samples in the training set, and previous experience indicated that a signature developed on day 5 samples provided the most robust classifier (Fielden et al., 2007). The data for the additional time points are nonetheless available as part of Supplementary table 1.
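The probabilistic definition of the AUC given above translates directly into a pairwise comparison of scores. The sketch below is illustrative; the function name is hypothetical, and counting ties as one half is a common convention assumed here rather than something stated in the text.

```python
def auc_by_ranking(pos_scores, neg_scores):
    """Estimate the AUC as the fraction of positive/negative pairs in which
    the positive sample scores higher (ties count as one half)."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one sample in each class")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
print(auc_by_ranking([0.8, 0.4], [0.6, 0.2]))
```

Because only the ordering of scores matters, the estimate does not depend on the proportion of positive and negative samples, which is why the AUC is convenient for comparing models across unbalanced class distributions.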
Genotoxic hepatocarcinogens (GHCs) (aflatoxin B1, diethylnitrosamine, methyleugenol, and safrole) were excluded from model development and validation because they did not fit either class we intended to predict, and an insufficient number of GHC samples were available to adequately evaluate classification accuracy toward this class of compounds. However, they were included as part of an effort to test the ability of the genes to differentiate mode of action.
FIG. 1.

Overview of model building and evaluation. The model development occurred in three steps. (A) Step 1 was used to evaluate the process for selecting a single model for validation on an independent test set based on training and test set definitions in Fielden et al. (2007). The model strategy was successful in selecting a single model with a similar AUC estimate to that previously published (see Supplementary Results). (B) The model from step 1 was promising but underpowered. As a result, all Iconix samples were used for training in step 2 to select a single top model to classify compounds in step 3. The strategy for model building and selection was identical to that implemented in step 1 with the qualification that the performance of the top model is not preferentially driven by correctly classifying training samples defined in the process evaluation (step 1). (C) Step 3 is the validation of the top qPCR based model from step 2 on an independent test set. The independent test set is composed of samples from multiple sites using different protocols.
TABLE 2.
Summary of Male SD Rat Compound Treatments Used for Signature Development as Part of the Evaluation Study and Final Model Development
| Compound | Vehicle | Dose (mg/kg/day) | Time point (days) | Class | Set<sup>a</sup> |
|---|---|---|---|---|---|
| Anastrozole | CMC | 400 | 5 | NGHC | Training |
| Ethisterone | CMC | 1500 | 5 | NGHC | Training |
| Methapyrilene | CMC | 100 | 5 | NGHC | Training |
| Nafenopin | Corn oil | 338 | 5 | NGHC | Training |
| Norethindrone | Corn oil | 375 | 5 | NGHC | Training |
| Pentobarbital | Water | 70 | 5 | NGHC | Training |
| Phenobarbital | Water | 80 | 5 | NGHC | Training |
| Pirinixic acid | CMC | 364 | 5 | NGHC | Training |
| Pravastatin | Corn oil | 1200 | 5 | NGHC | Training |
| 2,3,7,8-Tetrachlorodibenzo-p-dioxin | CMC | 0.02 | 5 | NGHC | Training |
| Acetaminophen | Corn oil | 972 | 5 | NGHC | Test |
| Beta-naphthoflavone | CMC | 1500 | 5 | NGHC | Test |
| Bezafibrate | Corn oil | 617 | 7 | NGHC | Test |
| Bis(2-ethylhexyl) phthalate | Corn oil | 500 | 5 | NGHC | Test |
| Carbamazepine | CMC | 490 | 5 | NGHC | Test |
| Carbimazole | Water | 400 | 5 | NGHC | Test |
| Chloroform | Corn oil | 600 | 5 | NGHC | Test |
| Diethylstilbestrol | Corn oil | 280 | 5 | NGHC | Test |
| Ethylestrenol | CMC | 390 | 5 | NGHC | Test |
| Fluconazole | Corn oil | 394 | 5 | NGHC | Test |
| Oxymetholone | CMC | 1170 | 5 | NGHC | Test |
| Spironolactone | CMC | 300 | 5 | NGHC | Test |
| Testosterone | CMC | 375 | 5 | NGHC | Test |
| Alfacalcidol | CMC | 0.04 | 5 | NH | Training |
| Amlodipine | Corn oil | 19 | 5 | NH | Training |
| Aspirin | Corn oil | 375 | 5 | NH | Training |
| Carvedilol | Corn oil | 2000 | 5 | NH | Training |
| Celecoxib | Corn oil | 400 | 5 | NH | Training |
| Ciprofloxacin | Corn oil | 450 | 5 | NH | Training |
| Citric acid | Water | 3000 | 5 | NH | Training |
| Clarithromycin | Water | 476 | 5 | NH | Training |
| Cortisone | CMC | 206 | 5 | NH | Training |
| Cycloheximide | Water | 0.25 | 5 | NH | Training |
| Dichlorvos | Water | 17 | 5 | NH | Training |
| Diclofenac | Corn oil | 10 | 5 | NH | Training |
| Ergocalciferol | CMC | 15 | 5 | NH | Training |
| Etodolac | CMC | 24 | 5 | NH | Training |
| Fluoxetine | CMC | 52 | 5 | NH | Training |
| Ketorolac | Water | 48 | 5 | NH | Training |
| Megestrol acetate | CMC | 132 | 5 | NH | Training |
| Methyldopa | Water | 325 | 5 | NH | Training |
| Pergolide | CMC | 1.1 | 5 | NH | Training |
| Perhexiline | CMC | 320 | 5 | NH | Training |
| Pioglitazone | Corn oil | 1500 | 5 | NH | Training |
| Praziquantel | CMC | 1200 | 5 | NH | Training |
| Promethazine | Saline | 113 | 5 | NH | Training |
| Propylthiouracil | CMC | 625 | 5 | NH | Training |
| Pyrazinamide | CMC | 1500 | 5 | NH | Training |
| Rabeprazole | Water | 1024 | 5 | NH | Training |
| Rifabutin | CMC | 1500 | 5 | NH | Training |
| Rofecoxib | Corn oil | 1550 | 5 | NH | Training |
| Rosiglitazone | Corn oil | 1800 | 5 | NH | Training |
| Roxithromycin | CMC | 312 | 5 | NH | Training |
| Ticlopidine | CMC | 223 | 5 | NH | Training |
| Tolazamide | CMC | 1500 | 5 | NH | Training |
| Troglitazone | Corn oil | 1200 | 5 | NH | Training |
| Valproic acid | Water | 1500 | 5 | NH | Training |
| 1,1-Dichloroethene | Water | 600 | 5 | NH | Test |
| Amoxapine | CMC | 313 | 5 | NH | Test |
| Cholecalciferol | CMC | 8 | 5 | NH | Test |
| Citalopram | Corn oil | 90 | 5 | NH | Test |
| Clomiphene | CMC | 250 | 5 | NH | Test |
| Clomipramine | Water | 115 | 5 | NH | Test |
| Diazepam | CMC | 710 | 5 | NH | Test |
| Erythromycin | CMC | 1500 | 5 | NH | Test |
| Finasteride | Corn oil | 800 | 5 | NH | Test |
| Geraniol | CMC | 1500 | 5 | NH | Test |
| Pemoline | CMC | 70 | 5 | NH | Test |
| Phenothiazine | Corn oil | 386 | 5 | NH | Test |
| Primidone | CMC | 750 | 5 | NH | Test |
| Propylene glycol | Water | 2000 | 5 | NH | Test |
| Quetiapine | CMC | 500 | 5 | NH | Test |
Note. CMC, carboxymethylcellulose.
<sup>a</sup>Training and test refer to how the treatments were divided for model evaluation only. The final model included all 72 treatment groups. All treatments were by oral administration.
Although the original studies were performed with different microarray platforms, all work in this study was based on the TaqMan array platform. Consequently, the features used in the original papers may have different predictive power in this study because of platform differences, although we would expect comparable performance. Markers from other publications (Nie et al., 2006) were also provided so that the final feature list could be reconstituted. Because a portion of the genes used on the TaqMan array were chosen from the same samples based on results of prior modeling (e.g., Fielden et al., 2007), our estimates of signature performance on the training data were optimistically biased. Therefore, we chose to maintain the distinction of training and test samples as originally defined by Fielden et al. (2007) and to estimate accuracy only on the test data that were not used for feature selection in the original study. The 72 compounds were thus split into a training set of 10 NGHC compounds and 34 NH compounds and a test set of 13 NGHC compounds and 15 NH compounds, as shown in Table 2. For this particular training set, we performed 25 replications of fivefold cross-validation on the 44 training compounds (10 NGHC compounds and 34 NH compounds) to produce 25 × 44 = 1100 scores for each candidate model. For each model, we pooled all 1100 scores, paired with their class membership, to estimate a single value for the area under the ROC curve. Although these AUC estimates are optimistically biased, ranking candidate models by AUC was still a reasonable procedure, albeit with some risk of selecting overfitted models. The candidate with the top AUC estimate was selected to classify the samples in the held-out test data.
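The repeated cross-validation and pooling scheme can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' R implementation: the centroid-difference scorer is a stand-in for the candidate models, the stratified fold assignment is an assumption (the text does not say whether folds were stratified), and all function names are hypothetical.

```python
import random


def make_synthetic(n_pos=10, n_neg=34, n_feat=22, shift=1.0, seed=1):
    """Synthetic stand-in data: 10 'NGHC' (1) and 34 'NH' (0) compounds."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_pos), (0, n_neg)):
        for _ in range(n):
            X.append([rng.gauss(shift * label, 1.0) for _ in range(n_feat)])
            y.append(label)
    return X, y


def stratified_folds(labels, k, rng):
    """Assign indices to k folds, spreading each class evenly (assumption)."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def fit_centroid_scorer(X, y):
    """Toy model: project onto the difference between class mean vectors."""
    pos = [x for x, lab in zip(X, y) if lab == 1]
    neg = [x for x, lab in zip(X, y) if lab == 0]
    w = [sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
         for j in range(len(X[0]))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def repeated_cv_scores(X, y, k=5, reps=25, seed=0):
    """Return pooled (score, label) pairs: one score per compound per replication."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(reps):
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            scorer = fit_centroid_scorer([X[i] for i in train], [y[i] for i in train])
            pooled.extend((scorer(X[i]), y[i]) for i in fold)
    return pooled


def pooled_auc(pooled):
    """Single AUC over all pooled scores (ties count as one half)."""
    pos = [s for s, lab in pooled if lab == 1]
    neg = [s for s, lab in pooled if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


X, y = make_synthetic()
pooled = repeated_cv_scores(X, y)
print(len(pooled), round(pooled_auc(pooled), 3))
```

In a model-selection setting, `repeated_cv_scores` would be run once per candidate model and the candidates ranked by `pooled_auc`; the ranking remains subject to the optimistic bias noted above.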
Model development step 2: Final model development.
The AUC point estimate of the test data from the evaluation study appeared promising and was comparable to the AUC estimate of the original microarray-based classifier (see Supplementary materials and methods and Supplementary results). Therefore, we performed a second model building procedure using all 72 compounds in the data set to identify a final model suitable for further independent testing as described below. For the second and final model building process, we chose the model with the best pooled AUC estimate as the top model, with the requirement that the AUC estimates stratified by the original training and test split be relatively balanced (in other words, we would not select a model with a high AUC estimate driven predominantly by the originally defined training data that were used in the feature selection). See Supplementary materials and methods and Supplementary results for more information.
Model development step 3: Signature evaluation on independent data set.
The model building procedure in step 2 produced a single final model to classify samples in an independent set of samples. In order to estimate signature performance of the final model, yet also determine the boundaries of use of the derived signature, we tested it on a broad array of samples, several of which did not fall within the training set framework as a result of distinct study designs, rat strains, and/or compound classes. Our goal in doing so was not only to judge the sensitivity, specificity, and reproducibility of the signature but also to identify any factors that may result in poor signature performance so that a study design could be recommended to provide optimal classification accuracy. In total, we obtained over 900 liver RNA samples from a number of sources that tested a variety of chemicals under different conditions. A description of the treatment conditions (dose, time, strain, and route of administration) is provided in Table 3 and references therein. Expression data for these treatment groups are also provided in Supplementary table 1. Liver RNA was analyzed using the TaqMan array as described above for the training set samples. In total, there were 169 treatment groups representing 86 unique compounds, including NGHC, NH, and GHC, and several compounds of unknown or inconclusive carcinogenic outcome in the rat (alpha-naphthylisothiocyanate, butylated hydroxytoluene, ridogrel, prucalopride, chlorpromazine, hexachlorocyclohexane, and amitriptyline). For the purpose of the independent signature evaluation, we removed the unknown and GHC compounds from the analysis, and we removed any compounds that were used in the training of the final model. This produced an independent data set totaling 66 unique compounds that were evaluated under varying conditions.
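Sensitivity and specificity with 95% confidence intervals can be computed from predicted and true classes as sketched below. The exact (Clopper-Pearson) interval is used here as an assumption, since this section does not name the interval method; the function names and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def confusion_counts(true_classes, predicted_classes, positive="NGHC"):
    """Count true positives, false negatives, true negatives, false positives."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_classes, predicted_classes):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iters=60):
    """Bisection for f(p) = target on [0, 1], where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Hypothetical example: 10 of 15 positives and 30 of 51 negatives called correctly.
for name, k, n in (("sensitivity", 10, 15), ("specificity", 30, 51)):
    low, high = clopper_pearson(k, n)
    print(f"{name}: {k / n:.2f} (95% CI {low:.2f} to {high:.2f})")
```

With small numbers of positive compounds the interval is wide, which is worth keeping in mind when comparing point estimates across study designs.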
TABLE 3.
Results of Independent Multisite Signature Evaluation
| Johnson & Johnson (male SD rats); Nie et al., 2006 | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted classa | Confidence levelb |
| Butylated hydroxytoluene | 5% MC | 1000 | PO | 1 | NH | 0.79 | NGHC | 100 |
| Cyproterone acetate | 5% MC | 200 | PO | 1 | NGHC | 0.624 | NGHC | 68.9 |
| Ethinyl estradiol (experiment 1) | 5% MC | 500 | PO | 1 | NGHC | 0.851 | NGHC | 100 |
| Ethinyl estradiol (experiment 2) | 5% MC | 500 | PO | 1 | NGHC | 0.864 | NGHC | 100 |
| Isoniazid | 5% MC | 125 | PO | 1 | NGHC | 0.157 | NH | 100 |
| Methapyrilenec | 5% MC | 200 | PO | 1 | NGHC | 0.87 | NGHC | 100 |
| Monocrotaline | 5% MC | 30 | PO | 1 | NGHC | 0.749 | NGHC | 99.7 |
| Piperonyl butoxide | 5% MC | 4000 | PO | 1 | NGHC | 0.548 | NH | 80.2 |
| Progesterone | 5% MC | 100 | PO | 1 | NGHC | 0.359 | NH | 100 |
| Simvastatin | 5% MC | 150 | PO | 1 | NGHC | 0.353 | NH | 100 |
| Tamoxifen | 5% MC | 750 | PO | 1 | NGHC | 0.532 | NH | 87.1 |
| Amiodarone | 5% MC | 600 | PO | 1 | NH | 0.468 | NH | 98.8 |
| Amiodarone (experiment 1) | 5% MC | 1000 | PO | 1 | NH | 0.419 | NH | 99.9 |
| Amiodarone (experiment 2) | 5% MC | 1000 | PO | 1 | NH | 0.508 | NH | 94 |
| Aniline | 5% MC | 200 | PO | 1 | NH | 0.696 | NGHC | 96.1 |
| Aspirinc | 5% MC | 600 | PO | 1 | NH | 0.37 | NH | 100 |
| Atenolol | 5% MC | 1500 | PO | 1 | NH | 0.245 | NH | 100 |
| Beta-hydroxypropyl-cyclodextrin | Water | 2000 | PO | 1 | NH | 0.724 | NGHC | 98.8 |
| Bromocryptine | 5% MC | 200 | PO | 1 | NH | 0.19 | NH | 100 |
| Buspirone | 5% MC | 100 | PO | 1 | NH | 0.661 | NGHC | 87.6 |
| Captopril | 5% MC | 5000 | PO | 1 | NH | 0.401 | NH | 100 |
| Clozapine | 5% MC | 150 | PO | 1 | NH | 0.517 | NH | 91.8 |
| Dantrolene | 5% MC | 500 | PO | 1 | NH | 0.438 | NH | 99.7 |
| Dapsone | 5% MC | 50 | PO | 1 | NH | 0.563 | NH | 71.7 |
| Dexamethasone | 5% MC | 75 | PO | 1 | NH | 0.437 | NH | 99.7 |
| Dieldrin | 5% MC | 30 | PO | 1 | NH | 0.62 | NGHC | 66.3 |
| Dieldrin | 5% MC | 45 | PO | 1 | NH | 0.615 | NGHC | 63 |
| Dipyridamole | 5% MC | 5000 | PO | 1 | NH | 0.659 | NGHC | 86.6 |
| Disulfiram | 5% MC | 2000 | PO | 1 | NH | 0.91 | NGHC | 100 |
| Enalapril | 5% MC | 1800 | PO | 1 | NH | 0.653 | NGHC | 84.2 |
| Erythromycin estolate (experiment 1) | 5% MC | 1500 | PO | 1 | NH | 0.678 | NGHC | 92.5 |
| Erythromycin estolate (experiment 2) | 5% MC | 1500 | PO | 1 | NH | 0.802 | NGHC | 100 |
| Famotidine | 5% MC | 500 | PO | 1 | NH | 0.641 | NGHC | 78.8 |
| Fluoxetinec | 5% MC | 50 | PO | 1 | NH | 0.584 | NH | 58.7 |
| Fluoxetinec | 5% MC | 100 | PO | 1 | NH | 0.499 | NH | 95.6 |
| Flutamide | 5% MC | 500 | PO | 1 | NH | 0.923 | NGHC | 100 |
| Flutamide | 5% MC | 500 | PO | 1 | NH | 0.779 | NGHC | 99.9 |
| Furosemide | 5% MC | 1500 | PO | 1 | NH | 0.284 | NH | 100 |
| Glibenclamide | 5% MC | 3000 | PO | 1 | NH | 0.35 | NH | 100 |
| Glibenclamide | 5% MC | 5010 | PO | 1 | NH | 0.479 | NH | 98 |
| Lansoprazole | 5% MC | 200 | PO | 1 | NH | 0.768 | NGHC | 99.9 |
| Ibuprofen | 5% MC | 500 | PO | 1 | NH | 0.571 | NH | 67.2 |
| Indomethacin | Saline | 30 | IP | 1 | NH | 0.385 | NH | 100 |
| Itraconazole | 5% MC | 200 | PO | 1 | NH | 0.584 | NH | 58.7 |
| Ketoconazole | 5% MC | 150 | PO | 1 | NH | 0.456 | NH | 99.3 |
| Mebendazole | 5% MC | 40 | PO | 1 | NH | 0.472 | NH | 98.5 |
| Metformin | 5% MC | 750 | PO | 1 | NH | 0.443 | NH | 99.7 |
| Methyldopac | 5% MC | 1000 | PO | 1 | NH | 0.293 | NH | 100 |
| Metoprolol | 5% MC | 2000 | PO | 1 | NH | 0.712 | NGHC | 98 |
| Mycophenolic acid | 5% MC | 500 | PO | 1 | NH | 0.429 | NH | 99.8 |
| Naltrexone | 5% MC | 1000 | PO | 1 | NH | 0.792 | NGHC | 100 |
| Niacin | 5% MC | 2505 | PO | 1 | NH | 0.779 | NGHC | 99.9 |
| Niacin | 5% MC | 5010 | PO | 1 | NH | 0.125 | NH | 100 |
| Nifedipine | 5% MC | 750 | PO | 1 | NH | 0.655 | NGHC | 85.2 |
| Nitrofurantoin | 5% MC | 400 | PO | 1 | NH | 0.599 | NGHC | 52.3 |
| Nizatidine | 5% MC | 1000 | PO | 1 | NH | 0.569 | NH | 68.5 |
| Perhexilenec | 5% MC | 2000 | PO | 1 | NH | 0.522 | NH | 90.4 |
| Perhexilenec | 5% MC | 2010 | PO | 1 | NH | 0.381 | NH | 100 |
| Phenylephrine | Saline | 5 | IP | 1 | NH | 0.105 | NH | 100 |
| Quercetin | 5% MC | 1995 | PO | 1 | NH | 0.316 | NH | 100 |
| Quercetin | 5% MC | 4005 | PO | 1 | NH | 0.368 | NH | 100 |
| Raloxifene | 5% MC | 700 | PO | 1 | NH | 0.235 | NH | 100 |
| Ranitidine | 5% MC | 1000 | PO | 1 | NH | 0.302 | NH | 100 |
| Rifampin | 5% MC | 600 | PO | 1 | NH | 0.49 | NH | 96.9 |
| Rosiglitazonec | 5% MC | 30 | PO | 1 | NH | 0.179 | NH | 100 |
| Rosiglitazonec | 5% MC | 100 | PO | 1 | NH | 0.207 | NH | 100 |
| Rotenone | 5% MC | 4 | PO | 1 | NH | 0.702 | NGHC | 96.9 |
| Rotenone | 5% MC | 100 | PO | 1 | NH | 0.533 | NH | 86.8 |
| Sulfamethoxazole | 5% MC | 2000 | PO | 1 | NH | 0.681 | NGHC | 93.2 |
| Tannic acid (experiment 1) | 5% MC | 3000 | PO | 1 | NH | 0.715 | NGHC | 98.2 |
| Tannic acid (experiment 2) | 5% MC | 3000 | PO | 1 | NH | 0.639 | NGHC | 77.5 |
| Tetracycline | 5% MC | 500 | PO | 1 | NH | 0.565 | NH | 71 |
| Troglitazonec | 5% MC | 100 | PO | 1 | NH | 0.486 | NH | 97.3 |
| Troglitazonec | 5% MC | 500 | PO | 1 | NH | 0.606 | NGHC | 56.8 |
| Valproic acidc | 5% MC | 200 | PO | 1 | NH | 0.312 | NH | 100 |
| Valproic acidc | 5% MC | 500 | PO | 1 | NH | 0.237 | NH | 100 |
| Valproic acidc | 5% MC | 600 | PO | 1 | NH | 0.41 | NH | 99.9 |
| Valproic acidc | 5% MC | 1000 | PO | 1 | NH | 0.735 | NGHC | 99.3 |
| Verapamil | 5% MC | 75 | PO | 1 | NH | 0.377 | NH | 100 |
| Vitamin A | 5% MC | 100 | PO | 1 | NH | 0.509 | NH | 93.7 |
| Vitamin A | 5% MC | 200 | PO | 1 | NH | 0.406 | NH | 100 |
| NTP (male F344 rats); Auerbach et al., 2010 | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| 1-Amino-2,4-dibromoanthraquinone | Feed | 5000 ppm | Dietary | 2 | NGHC | 0.909 | NGHC | 100 |
| 1-Amino-2,4-dibromoanthraquinone | Feed | 5000 ppm | Dietary | 14 | NGHC | 0.868 | NGHC | 100 |
| 1-Amino-2,4-dibromoanthraquinone | Feed | 5000 ppm | Dietary | 90 | NGHC | 0.792 | NGHC | 100 |
| Acetaminophenc | Feed | 3000 ppm | Dietary | 2 | NGHC | 0.861 | NGHC | 100 |
| Acetaminophenc | Feed | 3000 ppm | Dietary | 14 | NGHC | 0.797 | NGHC | 100 |
| Acetaminophenc | Feed | 3000 ppm | Dietary | 90 | NGHC | 0.672 | NGHC | 91.1 |
| Methyleugenol | 5% MC | 150 | PO | 2 | GHC | 0.911 | NGHC | 100 |
| Methyleugenol | 5% MC | 150 | PO | 14 | GHC | 0.869 | NGHC | 100 |
| Methyleugenol | 5% MC | 150 | PO | 90 | GHC | 0.654 | NGHC | 84.7 |
| Methyleugenol | Corn oil | 35.6 | PO | 2 | GHC | 0.632 | NGHC | 73.6 |
| Methyleugenol | Corn oil | 35.6 | PO | 14 | GHC | 0.537 | NH | 85.2 |
| Methyleugenol | Corn oil | 35.6 | PO | 90 | GHC | 0.565 | NH | 70.7 |
| Methyleugenol | Corn oil | 356 | PO | 2 | GHC | 0.849 | NGHC | 100 |
| Methyleugenol | Corn oil | 356 | PO | 14 | GHC | 0.757 | NGHC | 99.8 |
| Methyleugenol | Corn oil | 356 | PO | 90 | GHC | 0.851 | NGHC | 100 |
| Safrole | Corn oil | 32.4 | PO | 2 | GHC | 0.717 | NGHC | 98.3 |
| Safrole | Corn oil | 32.4 | PO | 14 | GHC | 0.538 | NH | 84.9 |
| Safrole | Corn oil | 32.4 | PO | 90 | GHC | 0.655 | NGHC | 88.6 |
| Safrole | Corn oil | 324 | PO | 2 | GHC | 0.758 | NGHC | 99.8 |
| Safrole | Corn oil | 324 | PO | 14 | GHC | 0.736 | NGHC | 99.3 |
| Safrole | Corn oil | 324 | PO | 90 | GHC | 0.717 | NGHC | 98.3 |
| Ascorbic acid | Feed | 25,000 ppm | Dietary | 2 | NH | 0.636 | NGHC | 76.1 |
| Ascorbic acid | Feed | 25,000 ppm | Dietary | 14 | NH | 0.786 | NGHC | 100 |
| Ascorbic acid | Feed | 25,000 ppm | Dietary | 90 | NH | 0.468 | NH | 98.8 |
| Eugenol | Corn oil | 32.8 | PO | 2 | NH | 0.544 | NH | 82.2 |
| Eugenol | Corn oil | 32.8 | PO | 14 | NH | 0.49 | NH | 96.9 |
| Eugenol | Corn oil | 32.8 | PO | 90 | NH | 0.431 | NH | 99.8 |
| Eugenol | Corn oil | 328 | PO | 2 | NH | 0.429 | NH | 99.8 |
| Eugenol | Corn oil | 328 | PO | 14 | NH | 0.698 | NGHC | 96.4 |
| Eugenol | Corn oil | 328 | PO | 90 | NH | 0.363 | NH | 100 |
| Isoeugenol | Corn oil | 32.8 | PO | 2 | NH | 0.804 | NGHC | 100 |
| Isoeugenol | Corn oil | 32.8 | PO | 14 | NH | 0.77 | NGHC | 99.9 |
| Isoeugenol | Corn oil | 32.8 | PO | 90 | NH | 0.709 | NGHC | 97.7 |
| Isoeugenol | Corn oil | 328 | PO | 2 | NH | 0.728 | NGHC | 99 |
| Isoeugenol | Corn oil | 328 | PO | 14 | NH | 0.655 | NGHC | 85 |
| Isoeugenol | Corn oil | 328 | PO | 90 | NH | 0.642 | NGHC | 79.3 |
| l-tryptophan | Feed | 25,000 ppm | Dietary | 2 | NH | 0.876 | NGHC | 100 |
| l-tryptophan | Feed | 25,000 ppm | Dietary | 14 | NH | 0.837 | NGHC | 100 |
| l-tryptophan | Feed | 25,000 ppm | Dietary | 90 | NH | 0.647 | NGHC | 81.8 |
| Aflatoxin B1 | Feed | 1 ppm | Dietary | 2 | GHC | 0.665 | NGHC | 88.9 |
| Aflatoxin B1 | Feed | 1 ppm | Dietary | 14 | GHC | 0.812 | NGHC | 100 |
| Aflatoxin B1 | Feed | 1 ppm | Dietary | 90 | GHC | 0.836 | NGHC | 100 |
| Dimethylnitrosamine | Water | 5 ppm | Water | 2 | GHC | 0.653 | NGHC | 84.2 |
| Dimethylnitrosamine | Water | 5 ppm | Water | 14 | GHC | 0.745 | NGHC | 99.6 |
| Dimethylnitrosamine | Water | 5 ppm | Water | 90 | GHC | 0.759 | NGHC | 99.8 |
| Pfizer (male SD rats) | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| Acetaminophenc | 5% MC | 300 | PO | 4 | NGHC | 0.199 | NH | 100 |
| Thioacetamide | 5% MC | 50 | PO | 4 | NGHC | 0.809 | NGHC | 100 |
| Alpha-naphthylisothiocyanate | 5% MC | 30 | PO | 1 | Unknown | 0.42 | NH | 99.9 |
| Alpha-naphthylisothiocyanate | 5% MC | 100 | PO | 1 | Unknown | 0.329 | NH | 100 |
| Roche (male SD rats) | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| Methapyrilenec | Water | 10 | PO | 2 | NGHC | 0.329 | NH | 100 |
| Methapyrilenec | Water | 10 | PO | 6 | NGHC | 0.584 | NH | 58.7 |
| Methapyrilenec | Water | 10 | PO | 10 | NGHC | 0.698 | NGHC | 96.4 |
| Methapyrilenec | Water | 10 | PO | 14 | NGHC | 0.734 | NGHC | 99.3 |
| Methapyrilenec | Water | 50 | PO | 2 | NGHC | 0.861 | NGHC | 100 |
| Methapyrilenec | Water | 50 | PO | 6 | NGHC | 0.884 | NGHC | 100 |
| Methapyrilenec | Water | 50 | PO | 10 | NGHC | 0.879 | NGHC | 100 |
| Methapyrilenec | Water | 50 | PO | 14 | NGHC | 0.713 | NGHC | 98 |
| Sanofi-aventis (male F344 rats); Michel et al., 2005—site 2 | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| Clofibrate | Feed | 5000 ppm | Dietary | 18 | NGHC | 0.821 | NGHC | 100 |
| Clofibrate | Feed | 5000 ppm | Dietary | 264 | NGHC | 0.866 | NGHC | 100 |
| Clofibrate (nontumorous) | Feed | 5000 ppm | Dietary | 607 | NGHC | 0.709 | NGHC | 97.8 |
| Clofibrate (adjacent tumor) | Feed | 5000 ppm | Dietary | 607 | NGHC | 0.945 | NGHC | 100 |
| Schering-Plough Research Institute (male SD rats); Nioi et al., 2008 | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| Acetaminophenc | 4% MC | 950 | PO | 1 | NGHC | 0.782 | NGHC | 99.9 |
| Acetaminophenc | 4% MC | 950 | PO | 5 | NGHC | 0.866 | NGHC | 100 |
| Butylated hydroxytoluene | 4% MC | 450 | PO | 1 | NH | 0.739 | NGHC | 99.4 |
| Butylated hydroxytoluene | 4% MC | 450 | PO | 5 | NH | 0.685 | NGHC | 94.2 |
| Methapyrilenec | 4% MC | 100 | PO | 1 | NGHC | 0.828 | NGHC | 100 |
| Methapyrilenec | 4% MC | 100 | PO | 5 | NGHC | 0.895 | NGHC | 100 |
| Phenobarbitalc | Water | 50 | PO | 1 | NGHC | 0.701 | NGHC | 96.8 |
| Phenobarbitalc | Water | 50 | PO | 5 | NGHC | 0.649 | NGHC | 82.5 |
| Fluoxetinec | 4% MC | 400 | PO | 1 | NH | 0.444 | NH | 99.6 |
| Fluoxetinec | 4% MC | 400 | PO | 5 | NH | 0.484 | NH | 97.6 |
| Ranitidine | 4% MC | 1000 | PO | 1 | NH | 0.54 | NH | 83.9 |
| Ranitidine | 4% MC | 1000 | PO | 5 | NH | 0.541 | NH | 83.4 |
| Iconix (male SD rats); Fielden et al., 2007 | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| Aflatoxin B1 | 0.5% CMC | 0.3 | PO | 5 | GHC | 0.518 | NH | 91.5 |
| Diethylnitrosamine | Saline | 34 | PO | 5 | GHC | 0.754 | NGHC | 99.7 |
| Pregnenolone-16alpha-carbonitrile | 0.5% MC | 100 | PO | 5 | NGHC | 0.899 | NGHC | 100 |
| Carbon tetrachloride | Corn oil | 1175 | PO | 3 | NGHC | 0.62 | NGHC | 66.6 |
| Abbott (male and female SD rats) | ||||||||
|---|---|---|---|---|---|---|---|---|
| Treatment | Vehicle | Dose (mg/kg/day) | Route of administration | Time point (days) | Class | Score | Predicted class | Confidence level |
| N-vinylpyrrolidone-2—male | Saline | 3000 | PO | 5 | NGHC | 0.782 | NGHC | 99.7 |
| N-vinylpyrrolidone-2—female | Saline | 3000 | PO | 5 | NGHC | 0.725 | NGHC | 97.1 |
| Rimonabant—male | 0.2% HPMC | 10 | PO | 5 | NGHC | 0.635 | NGHC | 71.9 |
| Latrepirdine—male | 0.2% HPMC | 10 | IP | 6 | NH | 0.276 | NH | 100 |
Note. CMC, carboxymethylcellulose; HPMC, hydroxypropylmethylcellulose; MC, methylcellulose.
a Signature scores ≥ the classification threshold (0.596) were predicted as NGHCs.
b The confidence level provides an estimate of confidence for the two class predictions (NGHC or NH) and is described in the Supplementary Materials and Methods.
c Indicates compounds that were also used in the original training set.
Determination of classification threshold.
Classifying compounds as NGHC or NH required dichotomizing the classification scores into calls. Because we standardized all potential models to have probabilistic output with values 0–1 inclusive, we modeled the classification scores with beta distributions. After each replication of cross-validation on the training data, we chose to separately fit the classification scores into two separate beta probability densities; one beta distribution was fit for NH classification results and one beta distribution for NGHC classification results. The point that is equally likely to be in either NGHC or NH distribution was defined as the threshold or classification cut point. The 25 replications of cross-validation provided 25 estimates for the threshold for a given model. Because the threshold tended to be away from the 0 or 1 limits, the thresholds were approximately normally distributed, and this allowed for reasonable estimates of the variance associated with the threshold.
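The cut-point definition above can be illustrated with a minimal sketch. It assumes a method-of-moments fit for each beta distribution (the fitting method is not stated here) and uses hypothetical scores rather than data from the study. The function names `fit_beta_moments`, `beta_pdf`, and `crossing_point` are illustrative only.

```python
import math
from statistics import mean, variance

def fit_beta_moments(scores):
    """Method-of-moments estimates (alpha, beta) for scores in (0, 1).
    Assumption: the study may have used a different fitting method."""
    m, v = mean(scores), variance(scores)
    common = m * (1.0 - m) / v - 1.0  # must be positive for a valid fit
    if common <= 0:
        raise ValueError("variance too large for a beta fit")
    return m * common, (1.0 - m) * common

def beta_pdf(x, a, b):
    """Beta density, computed through log-gamma for numerical stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))

def crossing_point(nh_params, nghc_params, lo, hi, tol=1e-9):
    """Bisection for the score at which the two fitted densities are equal."""
    def diff(x):
        return beta_pdf(x, *nghc_params) - beta_pdf(x, *nh_params)
    if diff(lo) * diff(hi) > 0:
        raise ValueError("densities do not cross inside the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical cross-validation scores for one replication (not study data).
nh_scores = [0.21, 0.30, 0.35, 0.42, 0.47, 0.52, 0.38, 0.44]
nghc_scores = [0.62, 0.71, 0.78, 0.83, 0.69, 0.88, 0.74, 0.80]

nh_fit = fit_beta_moments(nh_scores)
nghc_fit = fit_beta_moments(nghc_scores)

# Search between the two class means, where the densities must cross.
threshold = crossing_point(nh_fit, nghc_fit, mean(nh_scores), mean(nghc_scores))
print(f"estimated threshold for this replication: {threshold:.3f}")
```

Repeating this over each cross-validation replication would give one threshold estimate per replication, whose mean and variance could then be summarized as described.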
Interlaboratory precision study.
The interlaboratory precision of the model was evaluated by splitting each of 38 liver RNA samples among four laboratories and determining the reproducibility of the expression values and signature scores when measured in different laboratories. The 38 samples consisted of liver RNA from male F344 rats (n = 6–10 rats per group) treated with 5000 ppm of clofibrate in the diet for 18, 264, or 607 days (Michel et al., 2005). RNA samples from liver tumors, and adjacent normal tissue, were evaluated and compared on day 607. The time-matched control animals received diet only. The precision of the TaqMan array data was evaluated by comparing the variability of signature scores and expression ratios across the four sites.
Biological interpretation of biomarker genes and their regulation.
Two approaches were used to obtain gene function information on the 23 genes composing the final model: (1) a general biomedical literature search (PubMed) carried out on a gene-by-gene basis and (2) the mining of annotated knowledge-based databases found in the Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, Redwood City, CA) and the BIOBASE Knowledge Library (BIOBASE Corporation, Beverly, MA). The literature review focused on identifying functional associations between biomarker genes and the regulation of cell proliferation and carcinogenesis. IPA was used to identify pathways, biological processes, and networks that were statistically enriched in the signature genes. With both tools, the probability that the representation of query-set genes in each functional category, disease, or network process was due to chance alone was expressed as a p value. p Values less than 0.05 were considered significant. Detailed information on the statistical methods underlying the pathway and functional category enrichment and impact scoring can be found at the software provider’s web address (Ingenuity Systems, http://www.ingenuity.com/).
Using gene accession information, the genes composing the final model were uploaded into the BIOBASE analysis tool, the ExPlain data analysis system, which leverages the TRANSFAC and TRANSPATH databases to score for the presence of transcription factor response elements (TFREs) within the 1100-bp proximal promoter region of the member genes. To determine relative enrichment, the TFRE abundance in the query set was compared with that in a reference set of 400 rat housekeeping gene promoters. The likelihood of TFRE overrepresentation in the query set relative to the reference set was expressed as a p value representing the probability that the difference in TFRE representation is due solely to chance. A more detailed description of the BIOBASE ExPlain tool and the statistical methods underpinning the TFRE enrichment analysis can be found at the provider’s web address (http://www.biobase-international.com/).
RESULTS
Classification Accuracy
The results of the initial model building in the evaluation study (model development step 1: process evaluation) produced an AUC of 0.84 on the test data, which was significantly different from the top model trained on the same data with the class labels randomly permuted (p = 0.012), indicating that the model and the underlying model building procedure identified a true signal that can differentiate NGHCs from NHs (see Supplementary Results). Based on these encouraging results, we proceeded to build a final model using all 72 compounds. This modeling resulted in a signature containing 22 genes, all normalized to peptidylprolyl isomerase A, using a random forest classifier with a classification threshold of 0.596. This signature was then evaluated on the independent data set detailed in Table 3. Figure 2 shows the principal component analysis visualization based on the delta-delta Ct values (expression ratios) for each data point (a compound measured at a given site, dose, and time point) in the independent data set. The compound classes tended to separate in the first two principal components, thus indicating that the separation of NGHCs and NHs is partially preserved on independent samples based solely on the expression of the 22 genes in the final signature. To estimate overall sensitivity and specificity of the signature, an evaluation was done at the compound level by merging compounds measured from multiple sites, and at different doses or time points, into a single score based on the median signature score. Merging the replicates produced 66 unique compounds in the independent data set. This approach resulted in a sensitivity and specificity of 67% (95% confidence interval [CI] = 38–88%) and 59% (95% CI = 44–72%), respectively, with an AUC of 0.65 (95% CI = 0.46–0.83) (Fig. 4A and Supplementary figure 2). This conservative estimate provided a 1:1 mapping of compound to prediction in order to estimate the associations with class.
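As a reminder of how a delta-delta Ct value relates to an expression ratio, the following minimal sketch applies the standard 2^(-ddCt) relationship. It assumes roughly 100% amplification efficiency, and the Ct values are hypothetical; the function name `expression_ratio` is illustrative and not taken from the assay protocol.

```python
import math

def expression_ratio(ct_gene_treated, ct_ref_treated, ct_gene_control, ct_ref_control):
    """Return (ddCt, fold change) for one gene.

    dCt  = Ct(gene) - Ct(reference gene), computed per sample group
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt), assuming ~100% amplification efficiency
    """
    d_ct_treated = ct_gene_treated - ct_ref_treated
    d_ct_control = ct_gene_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return dd_ct, 2.0 ** (-dd_ct)

# Hypothetical mean Ct values; the reference gene stands in for
# peptidylprolyl isomerase A, the normalizer named in the text.
dd_ct, fold = expression_ratio(
    ct_gene_treated=24.0, ct_ref_treated=18.0,
    ct_gene_control=26.0, ct_ref_control=18.0,
)
print(f"ddCt = {dd_ct:.1f}, fold change = {fold:.1f}, log10 ratio = {math.log10(fold):.3f}")
```

A lower Ct in treated animals (earlier amplification) gives a negative ddCt and therefore a ratio above 1, that is, upregulation relative to time-matched controls.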
In general, we found the data from multiple sites for the same compound to have correlated scores (see Supplementary figure 3). However, merging multiple doses in this manner may risk conflating very different responses on individual compounds and should not be done in practice, but it nonetheless provides a convenient means to estimate performance. The effect of site, which in this context is a proxy for study protocol, on these classification results is difficult to evaluate because most samples in the independent study came from single-dose (1 day) studies at Johnson & Johnson (J&J) (Fig. 3). When we confined the analysis to sites other than J&J, we estimated an improved AUC of 0.81 (95% CI = 0.5–1.0), whereas the J&J results alone provided an AUC of 0.49 (95% CI = 0.22–0.76) (Fig. 4B).
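The compound-level merging can be sketched as follows. The scores are a small illustrative subset of rows copied from Table 3, so the resulting sensitivity and specificity describe only this subset and not the full 66-compound data set. The helper names `merge_by_median` and `sensitivity_specificity` are hypothetical.

```python
from collections import defaultdict
from statistics import median

THRESHOLD = 0.596  # classification threshold reported for the final model

# (compound, known class, signature score): a subset of Table 3 rows.
rows = [
    ("Ethinyl estradiol", "NGHC", 0.851), ("Ethinyl estradiol", "NGHC", 0.864),
    ("Clofibrate", "NGHC", 0.821), ("Clofibrate", "NGHC", 0.866),
    ("Clofibrate", "NGHC", 0.709), ("Clofibrate", "NGHC", 0.945),
    ("Isoniazid", "NGHC", 0.157),
    ("Amiodarone", "NH", 0.468), ("Amiodarone", "NH", 0.419), ("Amiodarone", "NH", 0.508),
    ("Niacin", "NH", 0.779), ("Niacin", "NH", 0.125),
    ("Rotenone", "NH", 0.702), ("Rotenone", "NH", 0.533),
    ("Eugenol", "NH", 0.544), ("Eugenol", "NH", 0.490), ("Eugenol", "NH", 0.431),
    ("Eugenol", "NH", 0.429), ("Eugenol", "NH", 0.698), ("Eugenol", "NH", 0.363),
]

def merge_by_median(rows):
    """Collapse all sites, doses, and time points of a compound into one median score."""
    scores, labels = defaultdict(list), {}
    for compound, label, score in rows:
        scores[compound].append(score)
        labels[compound] = label
    return {c: (labels[c], median(s)) for c, s in scores.items()}

def sensitivity_specificity(merged, threshold=THRESHOLD):
    """Sensitivity over NGHC compounds and specificity over NH compounds."""
    tp = fn = tn = fp = 0
    for label, score in merged.values():
        positive = score >= threshold
        if label == "NGHC":
            tp, fn = tp + int(positive), fn + int(not positive)
        else:
            fp, tn = fp + int(positive), tn + int(not positive)
    return tp / (tp + fn), tn / (tn + fp)

merged = merge_by_median(rows)
sens, spec = sensitivity_specificity(merged)
print(f"subset sensitivity = {sens:.2f}, subset specificity = {spec:.2f}")
```

In this subset, niacin illustrates the caveat about conflating doses: its two dose groups fall on opposite sides of the threshold, and the median hides that disagreement.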
FIG. 2.

Principal component analysis of the independent signature evaluation data. The 66 test compounds (including replicates) spanned by the 22 predictive genes in the model are projected in the first two principal components. The results are stratified vertically by compound classification accuracy and horizontally by compound class. Rug plots were added, so that compound positioning is more apparent with the color scaled according to the classifier score (darker marks suggest higher scores). In general, we see good separation between NGHC compounds (black points) and NH compounds (gray points), and this suggests the 22 predictive genes tended to separate the classes as expected. The NGHC compounds that are classified incorrectly (black points with gray borders) are generally in close proximity to the NH compounds, whereas the NH compounds that are classified incorrectly (gray points with black borders) have mixed proximity to other NGHC compounds.
FIG. 4.

Final model performance. (A) ROC plot for the evaluation signature set. The results on the independent signature set (multisite test set) are represented by the black line with points. Each point is derived from an identical compound generated at different sites and tested at different doses but summarized by the median signature score. Random chance is the diagonal dashed black line. The observed sensitivity and specificity derived from the independent test set is shown with the gray ‘X’. The black box captures the 95% through the 97.5% CI based on an exact test (Clopper-Pearson). FPR: false-positive rate. TPR: true-positive rate. Sensitivity and specificity curves are also evaluated in Supplementary figure 2. (B) 2 × 2 contingency tables for classification of independent and unique compounds. Sensitivity is the proportion of NGHCs correctly predicted positive. Specificity is the proportion of NHs correctly predicted negative. PPV, positive predictive value, is the proportion of samples with positive test results that are correctly diagnosed. NPV, negative predictive value, is the proportion of samples with negative test results that are correctly diagnosed. AUC with 95% confidence limits in brackets.
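The exact (Clopper-Pearson) interval mentioned in the legend can be computed with a short sketch using only the binomial distribution. The counts used below (10 of 15 NGHCs and 30 of 51 NHs correctly classified) are an assumption: they are consistent with the reported percentages and the total of 66 compounds but are not stated in the text. The function names are illustrative.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))

def _root_increasing(func, target, tol=1e-10):
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion x/n."""
    # Lower limit: p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: p at which P(X <= x) equals alpha/2, i.e. P(X > x) = 1 - alpha/2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: 1.0 - binom_cdf(x, n, p), 1.0 - alpha / 2)
    return lower, upper

# Assumed counts (not stated in the text; see the note above).
sens_lo, sens_hi = clopper_pearson(10, 15)
spec_lo, spec_hi = clopper_pearson(30, 51)
print(f"sensitivity 10/15: {sens_lo:.3f} to {sens_hi:.3f}")
print(f"specificity 30/51: {spec_lo:.3f} to {spec_hi:.3f}")
```

Under the assumed counts, the sensitivity interval comes out close to the reported 38–88%, which illustrates how wide an exact interval is with only about 15 positive compounds.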
FIG. 3.

Stratification of independent test compounds by class, day, and site. Each replicate from a given compound is represented by a single point, and class results are summarized using box plots. The boxes capture the middle 50% of the data. The classifier cutoff is represented using a horizontal line at 0.596. Compounds with a signature score greater than 0.596 are classified as positive (predicted NGHC). The plot shows that the independent data set is composed predominantly of samples from J&J at day 1.
To explore further the boundaries of use for the signature, we evaluated the dose and time dependence of the signature score. The effects of dose and time on signature predictions were best illustrated by the samples from a time-course study in male SD rats treated with methapyrilene at doses of 10 and 50 mg/kg/day. Methapyrilene administered at 10 mg/kg produced a time-dependent increase in the signature score; however, it was correctly predicted positive only at the later time points on days 10 and 14 (Table 3). By contrast, the high dose of methapyrilene at 50 mg/kg/day was correctly predicted positive at all time points. The signature score did not appear to increase over time as it was close to its maximum on day 2 and sustained above 0.7 throughout the course of treatment. Methapyrilene was also correctly predicted positive when tested by J&J at 200 mg/kg for 24 h (Table 3). These results suggest that the signature is sensitive to dose and time and that low-dose exposure and/or early time points may not be optimal to identify expression changes diagnostic of NGHCs. This is consistent with the fact that the training set was established using maximum tolerated doses and repeated daily doses for 5 days.
The ability of the signature to correctly classify samples from long-term treatments was investigated by evaluating the 90-day studies conducted by the National Toxicology Program (NTP). A comparison of time points within these long-term studies indicates that the 90-day samples typically produce classification results similar to those of the earlier time points (cf. days 2 or 14). Although many of the classification results from the 90-day NTP studies were incorrect (false positives), the consistency of the results suggests that the expression changes were conserved over time. Likewise, the long-term clofibrate diet study also indicated that the classification results and expression changes were preserved over the extended course of treatment (Table 3). These limited results suggest that samples from both short-term and long-term repeat dose studies may have applicability to the signature.
Some hepatocarcinogens are thought to cause tumors secondary to hepatotoxicity and regenerative proliferation, raising the concern that the signature may be sensitive to false positives as a result of liver injury. Therefore, it was of interest to determine if there was an association between the signature score and the degree of hepatic damage. Rats treated with methapyrilene at both 10 and 50 mg/kg showed no difference in the degree of hepatotoxicity at the early time points as both groups showed minimal single cell necrosis, yet the signature scores were clearly distinct on days 2 and 6. The increasing severity of hepatic necrosis in the high-dose group at later time points also did not correlate with the signature score. Both low and high doses produced minimal to mild spindle cell proliferation on days 6 through 14, including biliary hyperplasia in the most severe instances in the high dose group (data not shown), yet this was not correlated with the signature score. These results suggest that proliferating cells and hepatotoxicity are unlikely to influence the signature scores and lead to false positives with hepatotoxic treatments. While the lack of a complete histological evaluation of all test samples precludes a more comprehensive analysis of this hypothesis, the negative signature scores for other known hepatotoxic compounds such as alpha-naphthylisothiocyanate, rotenone, valproic acid, or aflatoxin B1 (Table 3) provide further evidence that hepatotoxic drug treatments are unlikely to produce false-positive predictions. This is consistent with results with the original Iconix microarray signature (Fielden et al., 2007), and the fact that hepatotoxic treatments were included in both classes of the training set to limit this possibility.
Signature Precision and Reproducibility
The precision of the TaqMan array was assessed by splitting RNA samples from a chronic clofibrate toxicity study into aliquots for evaluation at four different laboratories to assess site-to-site variation. As expected, the precision of the classifier score when measured across sites was excellent, as all four sites produced very similar expression results and signature scores (Supplementary figures 1A and 1B). In addition, it was of interest to determine the robustness of the predictive expression changes for compounds evaluated at different sites or dates, as a given compound should ideally be classified identically regardless of where or when it was tested, assuming the same dose and experimental protocol were used. Five compounds were tested at the same dose level in separate studies but at the same site (J&J), thus permitting an evaluation of reproducibility within a single laboratory. In all five cases, the biomarker predictions were concordant (amiodarone, erythromycin, ethinyl estradiol, flutamide, and tannic acid; Table 3). These results provide confidence that signature predictions should be similar when assessed under similar study conditions. A number of other compounds were tested at multiple sites, albeit with different study designs and doses, so a direct comparison of reproducibility across sites could not be made. For example, acetaminophen was tested at three different sites at 300 mg/kg/day for 4 days, 950 mg/kg/day for 1 and 5 days, and at a dietary exposure of 3000 ppm for 2, 14, and 90 days. Acetaminophen was correctly predicted positive by the signature at 950 mg/kg/day and 3000 ppm at all time points, whereas the lower dose of 300 mg/kg/day for 4 days was predicted negative (Table 3). Additionally, the nongenotoxic hepatocarcinogen methapyrilene was correctly predicted positive by the signature at four different doses and across three different laboratories. The NH fluoxetine was also correctly predicted negative at three different doses and across two different laboratories (Table 3).
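The within-site concordance check for the five repeated J&J treatments can be reproduced directly from the Table 3 scores. This is a minimal sketch; the helper name `call` is illustrative.

```python
THRESHOLD = 0.596  # scores at or above this value are predicted NGHC

# Pairs of signature scores from separate J&J studies at the same dose (Table 3).
replicates = {
    "Amiodarone (1000 mg/kg)": (0.419, 0.508),
    "Erythromycin estolate (1500 mg/kg)": (0.678, 0.802),
    "Ethinyl estradiol (500 mg/kg)": (0.851, 0.864),
    "Flutamide (500 mg/kg)": (0.923, 0.779),
    "Tannic acid (3000 mg/kg)": (0.715, 0.639),
}

def call(score, threshold=THRESHOLD):
    """Dichotomize a signature score into a class prediction."""
    return "NGHC" if score >= threshold else "NH"

# A pair is concordant when both experiments receive the same class call.
concordant = {name: call(a) == call(b) for name, (a, b) in replicates.items()}

for name, (a, b) in replicates.items():
    print(f"{name}: {call(a)} / {call(b)} -> concordant = {concordant[name]}")
```

Concordance here refers to the class call only; the underlying scores for a pair can still differ noticeably, as with flutamide.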
Evaluating Nongenotoxic Modes of Action
Previous microarray expression data on the Iconix samples (Fielden et al., 2007) demonstrated that hierarchical clustering of NGHCs across 37 signature genes could identify compounds with similar mode of action based on the similarity of their expression profiles. Although hierarchical clustering is an unsupervised clustering technique and therefore not a formal prediction, it can provide a visual but subjective means to evaluate novel compounds for potential modes of action that may contribute to a positive prediction and hepatocarcinogenicity. The 23 NGHCs in the training set were clustered across all 22 genes in the model (Fig. 5). A number of test compounds were included in the clustering to evaluate whether the signature genes could facilitate identification of known compounds with similar modes of action. The genotoxic hepatocarcinogens aflatoxin B1 and N-nitrosodiethylamine dosed orally for 5 days clustered together and were distinct from other treatments. The next most similar expression profiles were a number of hepatotoxicants such as acetaminophen and chloroform, which appeared to be driven by induction of the oxidative stress–responsive gene Akr7a3. The test compounds pregnenolone-16alpha-carbonitrile, phenobarbital, and butylated hydroxytoluene clustered among other PXR and CAR agonists as expected, whereas the P450 inducers and Ah receptor agonists, beta-naphthoflavone and 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), clustered distinctly. Interestingly, fluconazole coclustered with diethylstilbestrol and norethindrone, which suggests fluconazole may have a similar mode of action for inducing hepatocarcinogenicity. It is notable that a number of PPARα agonists coclustered despite the fact that the 22 signature genes are not known for being associated with fatty acid metabolism. This cluster of PPARα agonists was also correlated to the profiles of bis(2-ethylhexyl)phthalate and pravastatin, which are also thought to activate PPARα (Chen et al., 2010). 
These results substantiate the utility of the 22 signature genes to identify putative modes of action for known or suspected hepatocarcinogens.
FIG. 5.

Hierarchical clustering of genotoxic and nongenotoxic hepatocarcinogens. The log10 ratios of the 22 signature genes were calculated by comparing the expression in the treated rats relative to time-matched vehicle control rats. The genes and expression profiles were then hierarchically clustered. Clustering method: complete linkage. Distance measure: correlation. Green = upregulation; red = downregulation; black = no change. Absolute magnitude of expression change is provided in Supplementary table 1. Note. See online version for color version.
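The clustering settings named in the legend (complete linkage with a correlation-based distance) can be illustrated with a minimal pure-Python sketch. The profiles below are hypothetical log10 ratios over five genes, not data from the study, and only the treatment profiles are clustered, whereas the figure clusters both genes and profiles. Function names are illustrative.

```python
import math

def correlation_distance(u, v):
    """1 - Pearson correlation between two expression profiles."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return 1.0 - cov / (su * sv)

def complete_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in profiles]

    def dist(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(correlation_distance(profiles[a], profiles[b]) for a in c1 for b in c2)

    merges = []
    while len(clusters) > 1:
        d, i, j = min(
            (dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical log10 expression ratios (treated vs. time-matched control).
profiles = {
    "treatment_A": [0.8, 0.6, -0.2, 0.1, 0.0],
    "treatment_B": [0.7, 0.5, -0.1, 0.2, 0.1],
    "treatment_C": [-0.3, 0.0, 0.9, 0.7, -0.1],
    "treatment_D": [-0.2, 0.1, 0.8, 0.6, 0.0],
}

merges = complete_linkage(profiles)
for left, right, d in merges:
    print(f"merge {left} + {right} at distance {d:.3f}")
```

Because the distance depends on correlation rather than magnitude, two treatments with the same pattern of up- and downregulation cluster together even if one produces much larger changes, which is why the legend points to a supplementary table for absolute magnitudes.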
Evaluating Genotoxic Modes of Action
A number of genotoxic treatments were included in the test set to evaluate whether the signature detected expression changes that were common to all hepatocarcinogens regardless of mode of action. In male SD rats, oral administration of aflatoxin B1 resulted in a negative prediction, whereas N-nitrosodiethylamine was predicted positive. Dietary exposure of male F344 rats to aflatoxin and N-nitrosodimethylamine resulted in a consistent positive prediction on days 2 through 90. This was unexpected as the model was trained to identify NGHCs. Whether genotoxic hepatocarcinogens cause prognostic expression changes similar to NGHCs is unclear and will require evaluation of a broader set of genotoxic compounds.
A number of genes on the array were chosen based on a previous study (Ellinger-Ziegelbauer et al., 2004) that demonstrated a strong and consistent upregulation of expression in response to genotoxic hepatocarcinogens, which suggested that they could be used to differentiate genotoxic from nongenotoxic modes of action. These genes include a number of p53- and DNA damage–responsive genes such as BTG2, CDKN1A, and MGMT, as well as a number of xenobiotic metabolism genes such as CES2 and ALDH1A1. As shown in Figure 6A, these genes were significantly induced by aflatoxin B1 and diethylnitrosamine after 5 days of repeated daily dosing in male SD rats. By comparison, the NGHCs bezafibrate and TCDD did not consistently induce these genes after 5 days of repeated daily dosing (Fig. 6B), suggesting that these genes could be used to differentiate genotoxic modes of action. However, a number of NGHCs were also found to induce many of these genes. Examples include hepatotoxic treatments such as methapyrilene and chloroform (Fig. 6C). The induction of these genes may be secondary to cytotoxicity and p53 activation rather than evidence of direct DNA damage. The NHs praziquantel and dichlorvos also induced a number of these genes (Fig. 6D). The weight of evidence would suggest these compounds are not genotoxic in vivo despite some conflicting reports (Booth et al., 2007; Montero and Ostrosky, 1997); however, there is no histological evidence of hepatotoxicity in these animals (data not shown). By evaluating the gene expression changes for these DNA damage–responsive genes, it may be possible to differentiate nongenotoxic from genotoxic modes of action. Histological changes in the samples would likely need to be taken into consideration when interpreting the potential of treatments to cause direct DNA damage in vivo based on the expression of these DNA damage–responsive genes.
FIG. 6.

Expression of DNA damage–responsive genes. Genes previously identified as being responsive to DNA-damaging agents (Ellinger-Ziegelbauer et al., 2004; Table 1) were evaluated for their ability to differentiate genotoxic from nongenotoxic modes of action. Male SD rats were treated for 5 days with examples of (A) known genotoxic hepatocarcinogens, (B) known nongenotoxic hepatocarcinogens, (C) known nongenotoxic hepatocarcinogens at hepatotoxic doses, and (D) NHs with known (dichlorvos) or equivocal (praziquantel) genotoxic liabilities. Treatments were as described in Table 3. Fold induction was calculated by comparing the expression in the treated rats relative to vehicle-matched controls as described in the “Materials and Methods” section.
Role of Biomarker Genes in Neoplasia
A detailed gene literature survey using the BioBase Knowledge Library revealed 10 of 23 genes that were correlated or causally associated with neoplasia or cancer (Supplementary table 2), and 8 of the genes have “Cell growth/cycle/signal transduction” as the primary biological process category. A gene-by-gene characterization, though useful, may miss the possible interconnectivity of the signature genes. Therefore, IPA was used to analyze the 23 genes and generate enrichment scores (statistical significance) for a number of biological categories and canonical pathways as well as for deriving potential network relationships. This analysis revealed that of the top 10 significantly ranked biological categories associated with the 23 signature genes, 7 have an association with cell proliferation and cancer or processes that when dysregulated could theoretically lead to neoplasia (data available upon request to the author).
To investigate possible relationships that may underlie the 23 genes in the signature, an examination of potential transcriptional coregulation was conducted. A response element enrichment analysis of the proximal promoter regions of all 23 genes revealed that a number of TFREs were significantly enriched. The four most significantly enriched TFREs relative to the reference set were, in order of significance (all p < 0.05), AP-1, PBX1, NFKB, and AHR. Although AP-1, NFKB, and AHR are associated with a general response to cellular stress and response to xenobiotics, the role of PBX1 (pre-B-cell leukemia homeobox 1) in the liver is unclear. This gene encodes a homeobox family transcription factor initially identified as a proto-oncogene associated with B-cell leukemia. It has been reported to be required for the maintenance of hematopoiesis in the fetal liver and has been implicated in promoting hematopoietic progenitor cell expansion (DiMartino et al., 2001); however, it has not been reported to play a role in hepatocarcinogenesis. The genes harboring a PBX response element include not only the DNA damage–responsive genes Akr7a3, Aldh1a1, Tap1, Cdkn1a, and Ces2 but also genes originally identified in the Iconix (Cited4, Ica2) and J&J (Sel1I) signatures (Supplementary table 2).
DISCUSSION
Our approach to improve human carcinogenicity risk assessment has focused on the development of biomarkers for the early prediction of NGHCs in rats and the simultaneous application of genomics to understand their potential modes of action, in order to enable a proactive human hepatocarcinogenicity risk assessment prior to initiation of the 2-year rodent bioassay. To this end, we have leveraged previously published genomic biomarker discovery efforts (Ellinger-Ziegelbauer et al., 2004; Fielden et al., 2007; Nie et al., 2006) to develop a signature on the TaqMan array card to facilitate prediction of NGHCs using data from short-term repeat dose rat toxicology studies. Together with the diagnostic expression profiles provided by the accompanying data set, the data also facilitate investigations into the potential modes of action.
Numerous efforts have attempted to discover and evaluate novel biomarkers to predict carcinogenicity of nongenotoxic carcinogens (Waters et al., 2010); however, the biomarkers were often derived from relatively small data sets and/or lacked adequate independent testing. As a result, these putative biomarkers may not be widely recognized or applicable outside their laboratory of origin. In response to these limitations, we have focused our efforts on deriving a signature using a large training set of 72 compounds and subsequently evaluating the performance of the signature on over 900 RNA samples representing 169 treatment groups (86 unique compounds, including 4 GHCs) from eight different research sites. This facilitated an estimation of the likely sensitivity and specificity when applied to different treatment protocols and allowed us to understand the strengths and limitations of the signature to help define its boundaries of use.
In general, a predictive model can only be expected to perform well based on the training information. The training set was derived from a homogenous data set that utilized a common rat strain (SD), gender (male), dose-setting criteria (maximum tolerated dose), time point (day 5) and RNA isolation procedure. Although the boundaries of use that are expected to maximize classification accuracy are likely to be defined by the training set, it was important to test these boundaries with an independent and heterogeneous data set as this would reflect real world application. In order to generate composite estimates of classification accuracy, it was convenient to merge compounds into a single score and remove overlapping compounds that were also utilized in the 72 compound training set. This resulted in 169 test treatments, generated with varying study protocols, which reduced to 66 unique predictions that included 15 NGHCs and 51 NHs. Using this method, the sensitivity, or true-positive rate, was 67% and the specificity, or true-negative rate, was 59%. Although the sensitivity may be considered acceptable, we were hampered by the relatively few (15) independent compounds available for testing and so these results should be viewed as preliminary. By contrast, a fair assessment of the true-negative rate was provided by 51 independent NH compounds. Although numerous false positives and negatives were identified, they appeared to be enriched in samples predominantly from the J&J and NTP data sets. This may not be surprising as the protocols used by these two sites differed dramatically from the training set. For example, the sensitivity and specificity of the signature against the J&J compounds alone were 38 and 61%, respectively, and a high number of false positives were observed when testing compounds in the NTP data set. 
The AUC for the J&J compounds is at the level of random chance, with a value of 0.49, but it is based on only eight positive samples, and the prevalence of NGHC compounds in the J&J and non-J&J data sets is quite different. In the end, the overall sensitivity estimate is driven by the non-J&J compounds whereas the specificity estimate is driven by the J&J compounds. The reasons for the incorrect classifications in the J&J and NTP data sets are possibly numerous. For example, the false positives in the J&J data set may be a result of testing samples obtained only 24 h after a single high dose, as detailed in Nie et al. (2006). The acute transcriptional response after the first dose is expected to be highly variable and may result in other compensatory changes that do not uniquely reflect the predictive changes that may persist during the course of repeated daily doses, as represented in the training set. Given that the training set is based on maximum doses that are tolerated for up to 5 days (see Fielden et al., 2007), it is likely that the optimal signature performance for this particular model would be obtained when following a similar dosing paradigm. This is exemplified by the 3/3 correct predictions from samples evaluated at Abbott where compounds were dosed for 5 or 6 days in SD rats. However, it is important to consider that the use of a maximally tolerated dose in the training set may be detrimental when the signature is applied to samples that have not achieved such a dose level.
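The composite sensitivity and specificity reported above can be reproduced as simple proportions, and an exact binomial interval gives a feel for how much uncertainty 15 positive and 51 negative compounds leave. The sketch below rests on two assumptions that are not stated in the text: the counts (10 of 15 NGHCs called positive, 30 of 51 NHs called negative) are back-calculated from the rounded percentages, and the Clopper-Pearson method is used as one common choice of interval, which may differ from the method used for the published confidence intervals.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_decreasing(f, lo=0.0, hi=1.0, iterations=100):
    """Bisection root finder for a function that decreases from positive to negative."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson (exact binomial) interval for x successes in n trials."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p)))
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2)
    return lower, upper

# Assumed counts, back-calculated from the reported percentages (not given in the text).
true_pos, n_nghc = 10, 15   # NGHCs predicted positive / all NGHCs tested
true_neg, n_nh = 30, 51     # NHs predicted negative / all NHs tested

sensitivity = true_pos / n_nghc
specificity = true_neg / n_nh
sens_ci = exact_ci(true_pos, n_nghc)
spec_ci = exact_ci(true_neg, n_nh)

print(f"sensitivity = {sensitivity:.1%}, 95% CI {sens_ci[0]:.1%} to {sens_ci[1]:.1%}")
print(f"specificity = {specificity:.1%}, 95% CI {spec_ci[0]:.1%} to {spec_ci[1]:.1%}")
```

The much wider interval for sensitivity reflects the small number of independent NGHCs, which is the reason the sensitivity estimate is described as preliminary.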
The reason for the high number of false positives in the NTP data set is unclear, but we cannot rule out the possibility that it is due to the use of male F344 rats or the inclusion of primarily nontherapeutic chemicals that may have unique modes of action that are difficult to classify with the current model. This latter hypothesis seems unlikely, however, given the large training set that includes nontherapeutic compounds of diverse modes of action. Differences in RNA isolation procedures may also have affected the results. Additionally, it does not appear that false positives are generated by hepatotoxic treatments, based on the differentiation of signature scores in the methapyrilene treatment groups and the results of other hepatotoxic treatments in the data set. One must also consider the score produced by the final model, as it is likely that performance improves if one considers results that are farther away from the threshold that distinguishes NGHC and NH compounds (see Supplementary figure 7). These data illustrate how samples generated by protocols distinct from the training set can result in poor signature performance and reinforce the concept that classification accuracy is likely to be optimal when samples are generated using protocols most similar to that of the training set. The results from both the evaluation study and the independent data set reinforce the protocol established by the Iconix training set as constituting the optimal boundaries of use. Based on these findings, a recommended protocol would include repeat daily dosing in male SD rats for ~5 days to generate data most comparable to the training set and maximize the potential benefit of this predictive assay. The number of animals per treatment group is recommended to be at least three, although it is recognized that more biological replication for test samples should improve the overall precision of the prediction. 
The use of only two or three animals per group in the training set is unlikely to have negatively affected the performance of the signature because the initial model evaluation exercise resulted in a robust area under the ROC curve of 0.84. However, any increased precision afforded by more replication may improve the confidence in the predictions, particularly those close to the classification threshold.
Comparison of the qPCR–based model to the microarray-based model from previous publications (Fielden et al., 2007; 2008) showed that performance is largely preserved across platforms. The most instructive comparison is the similarity of AUC estimates derived on the test data from the process evaluation step. In that step, the AUCs derived on the Iconix test data from the microarray and qRT-PCR–based classifiers are 0.89 and 0.84, respectively. We compared the microarray and qRT-PCR–based models using 45 compounds in the independent signature evaluation step. In that context, the AUC estimates of the microarray and qRT-PCR–based models were 0.65 and 0.66, respectively. Although the scores are less correlated with a Pearson correlation coefficient of 0.5, the classification calls per model have ~73% overall agreement. This suggests that the two models have similar performance characteristics (see Supplementary Results for more information).
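The two comparison measures used in this paragraph, the Pearson correlation between model scores and the overall agreement between classification calls, capture different things, which is why moderately correlated scores can still yield largely concordant calls. A minimal sketch follows; the scores and the decision threshold of 0 are hypothetical and are not values from either model.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def overall_agreement(scores_a, scores_b, threshold=0.0):
    """Fraction of compounds given the same call (positive if score > threshold)."""
    calls_a = [s > threshold for s in scores_a]
    calls_b = [s > threshold for s in scores_b]
    same = sum(a == b for a, b in zip(calls_a, calls_b))
    return same / len(calls_a)

# Hypothetical scores for six compounds from two classifiers (illustrative only).
microarray_scores = [0.8, -0.3, 0.2, -0.6, 0.1, -0.1]
qpcr_scores = [0.5, -0.2, -0.1, -0.4, 0.3, 0.2]

r = pearson(microarray_scores, qpcr_scores)
agreement = overall_agreement(microarray_scores, qpcr_scores)
print(f"Pearson r = {r:.2f}, overall agreement = {agreement:.0%}")
```

In this toy example the disagreements come only from compounds whose scores sit close to the threshold, consistent with the point that predictions near the classification threshold are the least reliable.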
Evaluating the effect of gender or strain on signature performance was not appropriately permitted by the available test samples; however, it is plausible that pharmacological mechanisms linked to tumor induction in the rat (i.e., PPARα induction) will be adequately maintained across genders and strains of rat. Where these variables impact pharmacokinetics, a quantitative or qualitative difference in the gene expression profile and signature outcome may be anticipated, although we have not formally tested this possibility. In one case, N-vinylpyrrolidone-2 was evaluated at the same dose in both male and female rats and the signature scores were highly similar. It is noteworthy that a number of compounds found to induce liver tumors in female rats only were positively identified by the signature despite using male expression data (e.g., carbon tetrachloride). The inclusion of expression data in the training set from male rats treated with female-specific hepatocarcinogens, such as diethylstilbestrol, 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), or chloroform, may have helped in this regard and increase the sensitivity of the assay to detect potential NGHCs. The effect of gender or strain on signature performance should still be considered unknown and suitably accounted for in the interpretation of any testing.
In addition to evaluating the predictive accuracy of the signature, we demonstrated how the genomic data and similarities in expression profiles could generate hypotheses for potential modes of action for NGHCs. Numerous examples exist that demonstrate the utility of toxicogenomic data to help understand mechanisms of carcinogenesis (Waters et al., 2010); however, it was unclear whether an expression profile of only 22 predictive genes would be sufficient to reveal information indicative of a compound's mode of action. It was surprising then that the expression profiles of well-characterized NGHCs with similar modes of action maintained a high degree of similarity. For example, the two genotoxic carcinogens aflatoxin B1 and diethylnitrosamine were found to cluster most closely with each other even when clustered among NGHCs. This raises the possibility that compounds with genotoxic activity may be identified through clustering of expression profiles, in addition to specifically evaluating the induction of the DNA damage–responsive genes included on the TaqMan array. Although the interpretation of clustering patterns can be subjective, it nonetheless provides valuable clues to guide more definitive investigative work that can help explain the mode of action of a novel compound. More extensive evaluations using whole genome arrays can provide more data; however, this could make interpretation more difficult because a database of known reference expression profiles from which comparisons can be drawn would not be as readily available as it is with the TaqMan array data set described here.
To add weight of evidence for use of the signature as both a predictive and mechanistic tool, it was of interest to understand whether the 22 biomarker genes had a functional role in carcinogenesis, proliferation, and/or related phenotypes. Although the gene members of an algorithm-derived classifier are selected based on performance optimization and assay design, there is an underlying assumption that their classifying power depends on a real, if not always obvious, connection to the underlying biology associated with the predicted phenotype. Failure to identify any known functional connection may cast doubt on the validity of the signature, although it is recognized that our knowledge of carcinogenesis and gene function is incomplete. Nonetheless, the combination of literature mining, pathway analysis, and transcription factor binding site analysis together provided support for a linkage between these genes and cellular processes associated with cell proliferation, growth regulation, injury repair, and cancer, all of which when dysregulated could lead to carcinogenesis. The possibility that compounds that induce liver tumors via a nongenotoxic mechanism may be eliciting a common transcriptional regulatory response, such as the possible activation of the homeobox transcription factor Pbx1, is an intriguing one that warrants further investigation. In any case, an incomplete understanding of the biology underlying the genes in the signature should not prohibit practical application of this tool.
By comparison to other approaches for predicting NGHCs, the predictive accuracy of the signature is greater or comparable to other histological-based endpoints that have been proposed and evaluated for their ability to predict carcinogenic outcome (Allen et al., 2004; Elcombe et al., 2002; Ito et al., 2003). The advantage of the current genomic approach is the ability to facilitate early and more efficient evaluation of molecules because it relies on a short-term repeat dose rat toxicity study rather than histological indices following chronic treatment (Allen et al., 2004; Elcombe et al., 2002) or a laborious initiation, partial hepatectomy, and promotion phase of treatment (Ito et al., 2003). This approach also provides a means to generate mechanistic information that other proposed predictive methods fail to provide (Contrera et al., 2003; Lee et al., 1995; Mauthe et al., 2001).
The challenge with evaluating this or other methods intended to predict carcinogenic outcome is the reliance on the rodent bioassay as the gold standard against which accuracy is defined. Due to the variable nature of the bioassay itself and the influences of dose, route of administration, strain, gender, and/or other experimental variables known to influence the outcome of the bioassay, the determined accuracy of the signature is subject to not only the intrinsic variation in the genomic assay but also the variation in the benchmark against which the signature is measured. As a result, we cannot discount the possibility that false positives reported by the signature are true signals or mechanistic events relevant to proliferative potential, which did not happen to materialize into a phenotypic effect in the rodent bioassay due to differences in the aforementioned variables used between the assays. Likewise, false negatives may arise when low doses or early time points are evaluated that do not produce drug exposures or cumulative effects sufficient to perturb the biomarker genes. Therefore, the sensitivity and specificity of the signature reported here are composite estimates and should be used as a guide rather than an absolute measure of performance. This could be said of other attempts to derive signatures predictive of hepatocarcinogenic activity (Ellinger-Ziegelbauer et al., 2008; Uehara et al., 2008), which notably have not been reported to be 100% accurate either. Perhaps training and test sets composed of samples from longer term treatments would result in gene expression changes that are more prognostic of chronic lifetime changes in carcinogenic outcome, although this would limit the value of obtaining early predictions and mechanistic data as presented here. In practice, each compound should be evaluated individually in light of its dose-response, concurrent pathology, genotoxic potential, and any mechanistic data available.
A comparison of the approach presented here with alternative predictive assays or approaches designed to predict nongenotoxic hepatocarcinogens reveals dramatic differences in the relative sensitivity and specificity for prediction, utility for screening, and the degree of mechanistic data provided. For example, the Ito Medium Term bioassay shows a higher accuracy for prediction (92%) (Ito et al., 2003); however, it does not afford much mechanistic information or provide a means to rapidly screen compounds. Other methods relying on histological endpoints from chronic studies suffer from poor accuracy and low throughput and do not provide mechanistic insight (Allen et al., 2004; Elcombe et al., 2002). More recent efforts utilizing a similar genomic approach have reported favorable prediction accuracy (Ellinger-Ziegelbauer et al., 2008; Uehara et al., 2008); however, the reported performance should be viewed with caution because validation on a wider set of diverse samples has not been reported and the use of smaller training and test sets will increase the likelihood of bias in the performance estimates. Therefore, we believe the currently proposed assay system offers the advantages of reasonable predictive accuracy, moderate throughput, and a means to begin to understand mode of action.
Previous studies have illustrated the application of gene expression dose-response data to establish benchmark dose values for nongenotoxic carcinogens in order to determine a threshold, or point of departure, for risk assessment (Bercu et al., 2010; Thomas et al., 2007, 2010). These approaches utilized the dose-response of genes aggregated in pathways and Gene Ontology processes, which assume that changes in these groups of genes are key events in the mode of action for these carcinogens. In a similar manner, the genes in the current signature could be used to establish benchmark doses from short-term dose-response studies to estimate points of departure for nongenotoxic events driving hepatocarcinogenicity. Further evaluation would be needed to assess this possibility. Therefore, the outcome of this predictive assay should be viewed as solely a hazard identification tool. In this context of use, it is advantageous to consider compounds in the training set that cause liver tumors in any strain, gender, or dose in order to increase the sensitivity of the assay. False positives could be better tolerated in a predictive tool because the outcome would not necessarily limit development of a positive compound. Instead, a positive result would initiate investigations or development strategies to build a weight of evidence for carcinogenic risk and understand the potential modes of action before obtaining results from the 2-year rodent bioassay. Considering the frequency with which liver weight elevation and hepatocellular hypertrophy are observed in preclinical drug discovery, this approach may enable a rapid understanding of the potential mechanism(s) and relevance of the finding for humans. In addition to prospective applications in drug discovery and development, the signature would also be of use retrospectively when tumors or preneoplastic lesions are observed in chronic toxicology studies and a mechanistic understanding is needed to inform the risk to humans. 
Additionally, it would be useful to differentiate and prioritize molecules when structurally related chemicals have been identified as having a hepatocarcinogenic risk.
Hepatic adenomas and carcinomas are the most frequent neoplastic lesion in the 2-year rodent bioassay (Gold et al., 2005); however, a broad range of tumor types is observed. Unfortunately, methods to predict carcinogenicity in tissues outside the liver still remain limited, although genomic approaches have shown promise for the prediction of lung carcinogens (Thomas et al., 2009). Previous results, albeit limited, have suggested that hepatic gene expression data may be predictive of carcinogenic potential in extrahepatic tissues (Nie et al., 2006). Although the biological rationale for how hepatic expression could predict carcinogenic outcome in other tissues is currently unclear, this possibility was intriguing because it would significantly expand the utility of the current genomic signature. As the current data set included a number of nongenotoxic carcinogens that caused tumors in tissues outside the liver, we applied the current signature to these compounds to test this hypothesis. The results, however, indicate that the current model trained to detect hepatocarcinogens is unable to accurately predict carcinogens in other tissues (data not shown). Alternative models trained specifically on nongenotoxic carcinogens from Table 2, regardless of target tissue, also failed to appreciably predict carcinogens from the independent data set (data not shown). It is likely that alternative approaches will be needed to identify extrahepatic carcinogens.
In summary, we have developed and extensively evaluated a hepatic gene expression-based signature for NGHCs on a moderate throughput, cost-effective and well-validated TaqMan array platform using a training set derived from short-term rat toxicology studies and tested on a large heterogeneous test set. These results, in conjunction with previous publications demonstrating the predictive and mechanistic utility of the genes (Fielden et al., 2007, 2008; Nie et al., 2006), add to the weight of evidence demonstrating the practical application of genomic biomarkers for use in the assessment of potential hepatocarcinogens. The classification results on a large heterogeneous data set underscore the importance of protocol on the boundaries of use for the signature and to utilize samples that most closely follow the protocol established by the training set. Dissemination of the underlying expression data and commercial availability of the TaqMan array assay described here should facilitate further evaluation of this research tool.
Supplementary Material
SUPPLEMENTARY DATA
Supplementary data are available online at http://toxsci.oxfordjournals.org/.
ACKNOWLEDGMENTS
The authors would like to acknowledge Iconix Biosciences (now Entelos) for donating the liver RNA samples, Asuragen Services for performing experiments, Applied Biosystems (Life Technologies) for providing custom TaqMan arrays, Deepa Eveleigh, Sandi Calhoun, Michael McMillian, Joanne Tran, Rong Hu, Marnie Higgins-Garn, Rita Ciurlionis, and Olimpia Disorbo for laboratory support; Cassandra Mtine, Lindsay Lehman, Phil Rossi, and Elizabeth Walker for administrative support; and many others within the Predictive Safety Testing Consortium for constructive feedback and encouragement. This article may be the work product of an employee or group of employees of the National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH), however, the statements, opinions, or conclusions contained therein do not necessarily represent the statements, opinions, or conclusions of NIEHS, NIH, or the U.S. government. J.S. is an employee of Life Technologies, a company that sells the TaqMan array. A.A. and A.K. are employees of Asuragen, a company that offers gene expression and TaqMan array services.
FUNDING
Member contributions to the Predictive Safety Testing Consortium of the Critical Path Institute.
REFERENCES
- Allen DG, Pearse G, Haseman JK, and Maronpot RR (2004). Prediction of rodent carcinogenesis: an evaluation of prechronic liver lesions as forecasters of liver tumors in NTP carcinogenicity studies. Toxicol. Pathol 32, 393–401. [DOI] [PubMed] [Google Scholar]
- Auerbach SS, Shah RR, Mav D, Smith CS, Walker NJ, Vallant MK, Boorman GA, and Irwin RD (2010). Predicting the hepatocarcinogenic potential of alkenylbenzene flavoring agents using toxicogenomics and machine learning. Toxicol. Appl. Pharmacol 243, 300–314. [DOI] [PubMed] [Google Scholar]
- Bercu JP, Jolly RA, Flagella KM, Baker TK, Romero P, and Stevens JL (2010). Toxicogenomics and cancer risk assessment: a framework for key event analysis and dose-response assessment for nongenotoxic carcinogens. Regul. Toxicol. Pharmacol 58, 369–381. [DOI] [PubMed] [Google Scholar]
- Booth ED, Jones E, and Elliott B,M (2007). Review of the in vitro and in vivo genotoxicity of dichlorvos. Regul. Toxicol. Pharmacol 49, 316–326. [DOI] [PubMed] [Google Scholar]
- Brambilla G, and Martelli A (2009). Update on genotoxicity and carcinogenicity testing of 472 marketed pharmaceuticals. Mutat. Res 681, 209–229. [DOI] [PubMed] [Google Scholar]
- Chen HH, Chen TW, and Lin H (2010). Pravastatin attenuates carboplatin-induced nephrotoxicity in rodents via peroxisome proliferator-activated receptor alpha-regulated heme oxygenase-1. Mol. Pharmacol 78, 36–45.
- Christensen FM, Eisenreich SJ, Rasmussen K, Sintes JR, Sokull-Kluettgen B, and Van de Plassche EJ (2011). European experience in chemicals management: integrating science into policy. Environ. Sci. Technol 45, 80–89.
- Cohen SM (2004). Human carcinogenic risk evaluation: an alternative approach to the two-year rodent bioassay. Toxicol. Sci 80, 225–259.
- Cohen SM (2010). Evaluation of possible carcinogenic risk to humans based on liver tumors in rodent assays: the two-year bioassay is no longer necessary. Toxicol. Pathol 38, 487–501.
- Contrera JF, Matthews EJ, and Daniel Benz R (2003). Predicting the carcinogenic potential of pharmaceuticals in rodents using molecular structural similarity and E-state indices. Regul. Toxicol. Pharmacol 38, 243–259.
- Davies TS, and Monro A (1995). Marketed human pharmaceuticals reported to be tumorigenic in rodents. J. Amer. Coll. Toxicol 14, 90–107.
- DiMartino JF, Selleri L, Traver D, Firpo MT, Rhee J, Warnke R, O’Gorman S, Weissman IL, and Cleary ML (2001). The Hox cofactor and proto-oncogene Pbx1 is required for maintenance of definitive hematopoiesis in the fetal liver. Blood 98, 618–626.
- Elcombe CR, Odum J, Foster JR, Stone S, Hasmall S, Soames AR, Kimber I, and Ashby J (2002). Prediction of rodent nongenotoxic carcinogenesis: evaluation of biochemical and tissue changes in rodents following exposure to nine nongenotoxic NTP carcinogens. Environ. Health Perspect 110, 363–375.
- Ellinger-Ziegelbauer H, Gmuender H, Bandenburg A, and Ahr HJ (2008). Prediction of a carcinogenic potential of rat hepatocarcinogens using toxicogenomics analysis of short-term in vivo studies. Mutat. Res 637, 23–39.
- Ellinger-Ziegelbauer H, Stuart B, Wahle B, Bomann W, and Ahr HJ (2004). Characteristic expression profiles induced by genotoxic carcinogens in rat liver. Toxicol. Sci 77, 19–34.
- Fielden MR, Brennan R, and Gollub J (2007). A gene expression biomarker provides early prediction and mechanistic assessment of hepatic tumor induction by nongenotoxic chemicals. Toxicol. Sci 99, 90–100.
- Fielden MR, Nie A, McMillian M, Elangbam CS, Trela BA, Yang Y, Dunn RT II, Dragan Y, Fransson-Stehen R, Bogdanffy M, et al. (2008). Interlaboratory evaluation of genomic signatures for predicting carcinogenicity in the rat. Toxicol. Sci 103, 28–34.
- Gold LS, Manley NB, Slone TH, Rohrbach L, and Garfinkel GB (2005). Supplement to the Carcinogenic Potency Database (CPDB): results of animal bioassays published in the general literature through 1997 and by the National Toxicology Program in 1997–1998. Toxicol. Sci 85, 747–808.
- Haseman JK, Huff JE, Zeiger E, and McConnell EE (1987). Comparative results of 327 chemical carcinogenicity studies. Environ. Health Perspect 74, 229–235.
- Ito N, Tamano S, and Shirai T (2003). A medium-term rat liver bioassay for rapid in vivo detection of carcinogenic potential of chemicals. Cancer Sci 94, 3–8.
- Jacobs A (2005). Prediction of 2-year carcinogenicity study results for pharmaceutical products: how are we doing? Toxicol. Sci 88, 18–23.
- Jacobs A, and Jacobson-Kram D (2004). Human carcinogenic risk evaluation, part III: assessing cancer hazard and risk in human drug development. Toxicol. Sci 81, 260–262.
- Kirkland D, Aardema M, Henderson L, and Muller L (2005). Evaluation of the ability of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens and non-carcinogens I. Sensitivity, specificity and relative predictivity. Mutat. Res 584, 1–256.
- Kitchin KT, Brown JL, and Kulkarni AP (1993). Predicting rodent carcinogenicity of halogenated hydrocarbons by in vivo biochemical parameters. Teratog. Carcinog. Mutagen 13, 167–184.
- Kitchin KT, Brown JL, and Kulkarni AP (1994). Complementarity of genotoxic and nongenotoxic predictors of rodent carcinogenicity. Teratog. Carcinog. Mutagen 14, 83–100.
- Lee Y, Buchanan BG, Mattison DM, Klopman G, and Rosenkranz HS (1995). Learning rules to predict rodent carcinogenicity of nongenotoxic chemicals. Mutat. Res 328, 127–149.
- Maronpot RR, Flake G, and Huff J (2004). Relevance of animal carcinogenesis findings to human cancer predictions and prevention. Toxicol. Pathol 32(Suppl. 1), 40–48.
- Mauthe RJ, Gibson DP, Bunch RT, and Custer L (2001). The Syrian hamster embryo (SHE) cell transformation assay: review of the methods and results. Toxicol. Pathol 29(Suppl.), 138–146.
- Melnick RL, Thayer KA, and Bucher JR (2008). Conflicting views on chemical carcinogenesis arising from the design and evaluation of rodent carcinogenicity studies. Environ. Health Perspect 116, 130–135.
- Michel C, Roberts RA, Desdouets C, Isaacs KR, and Boitier E (2005). Characterization of an acute molecular marker of nongenotoxic rodent hepatocarcinogenesis by gene expression profiling in a long term clofibric acid study. Chem. Res. Toxicol 18, 611–618.
- Montero R, and Ostrosky P (1997). Genotoxic activity of praziquantel. Mutat. Res 387, 123–139.
- Nie AY, McMillian M, Parker JB, Leone A, Bryant S, Yieh L, Bittner A, Nelson J, Carmen A, Wan J, et al. (2006). Predictive toxicogenomics approaches reveal underlying molecular mechanisms of nongenotoxic carcinogenicity. Mol. Carcinog 45, 914–933.
- Nioi P, Pardo ID, Sherratt PJ, Fielden MR, Gollub J, Nie A, and Snyder RD (2008). Prediction of non-genotoxic carcinogenesis in rats using changes in gene expression following acute dosing. Chem. Biol. Interact 176, 252–260.
- Tatematsu M, Tsuda H, Shirai T, Masui T, and Ito N (1987). Placental glutathione S-transferase (GST-P) as a new marker for hepatocarcinogenesis: in vivo short-term screening for hepatocarcinogens. Toxicol. Pathol 15, 60–68.
- Thomas RS, Allen BC, Nong A, Yang L, Bermudez E, Clewell HJ III, and Andersen ME (2007). A method to integrate benchmark dose estimates with genomic data to assess the functional effects of chemical exposure. Toxicol. Sci 98, 240–248.
- Thomas RS, Bao W, Chu TM, Bessarabova M, Nikolskaya T, Nikolsky Y, Andersen ME, and Wolfinger RD (2009). Use of short-term transcriptional profiles to assess the long-term cancer-related safety of environmental and industrial chemicals. Toxicol. Sci 112, 311–321.
- Thomas RS, Clewell HJ III, Allen BC, Wesselkamper SC, Wang NC, Lambert JC, Hess-Wilson JK, Zhao QJ, and Andersen ME (2010). Application of transcriptional benchmark dose values in quantitative cancer and noncancer risk assessment. Toxicol. Sci 120, 194–205.
- Uehara T, Hirode M, Ono A, Kiyosawa N, Omura K, Shimizu T, Mizukawa Y, Miyagishima T, Nagao T, and Urushidani T (2008). A toxicogenomics approach for early assessment of potential nongenotoxic hepatocarcinogenicity of chemicals in rats. Toxicology 250, 15–26.
- Vanparys P, Corvi R, Aardema M, Gribaldo L, Hayashi M, Hoffmann S, and Schechtman L (2011). ECVAM prevalidation of three cell transformation assays. ALTEX 28, 56–59.
- Waites CR, Dominick MA, Sanderson TP, and Schilling BE (2007). Nonclinical safety evaluation of muraglitazar, a novel PPARalpha/gamma agonist. Toxicol. Sci 100, 248–258.
- Ward JM (2008). Value of rodent carcinogenesis bioassays. Toxicol. Appl. Pharmacol 226, 212.
- Waters MD, Jackson M, and Lea I (2010). Characterizing and predicting carcinogenicity and mode of action using conventional and toxicogenomics methods. Mutat. Res 705, 184–200.
- Whysner J, and Williams GM (1996a). D-limonene mechanistic data and risk assessment: absolute species-specific cytotoxicity, enhanced cell proliferation, and tumor promotion. Pharmacol. Ther 71, 127–136.
- Whysner J, and Williams GM (1996b). Saccharin mechanistic data and risk assessment: urine composition, enhanced cell proliferation, and tumor promotion. Pharmacol. Ther 71, 225–252.
- Yamasaki H, Ashby J, Bignami M, Jongen W, Linnainmaa K, Newbold RF, Nguyen-Ba G, Parodi S, Rivedal E, Schiffmann D, et al. (1996). Nongenotoxic carcinogens: development of detection methods based on mechanisms: a European project. Mutat. Res 353, 47–63.
