Development and Evaluation of a Genomic Signature for the Prediction and Mechanistic Assessment of Nongenotoxic Hepatocarcinogens in the Rat

Mark R Fielden; Alex Adai; Robert T Dunn, II; Andrew Olaharski; George Searfoss; Joe Sina; Jiri Aubrecht; Eric Boitier; Paul Nioi; Scott Auerbach; David Jacobson-Kram; Nandini Raghavan; Yi Yang; Andrew Kincaid; Jon Sherlock; Shen-Jue Chen; Bruce Car

doi:10.1093/toxsci/kfr202

Author manuscript; available in PMC: 2026 Mar 17.

Published in final edited form as: Toxicol Sci. 2011 Aug 2;124(1):54–74. doi: 10.1093/toxsci/kfr202

On behalf of the Predictive Safety Testing Consortium, Carcinogenicity Working Group

PMCID: PMC12989971 NIHMSID: NIHMS2153729 PMID: 21813463

Abstract

Evaluating the risk of chemical carcinogenesis has long been a challenge owing to the protracted nature of the pathology and the limited translatability of animal models. Although numerous short-term in vitro and in vivo assays have been developed, they have failed to reliably predict the carcinogenicity of nongenotoxic compounds. Building on previous microarray work (Fielden, M. R., Nie, A., McMillian, M., Elangbam, C. S., Trela, B. A., Yang, Y., Dunn, R. T., II, Dragan, Y., Fransson-Stehen, R., Bogdanffy, M., et al. (2008). Interlaboratory evaluation of genomic signatures for predicting carcinogenicity in the rat. Toxicol. Sci. 103, 28–34), we have developed and extensively evaluated a quantitative PCR-based signature to predict the potential for nongenotoxic compounds to induce liver tumors in the rat as a first step in the safety assessment of potential nongenotoxic carcinogens. The training set was derived from liver RNA from rats treated with 72 compounds and used to develop a 22-gene signature on the TaqMan array platform, providing an economical and standardized assay protocol. Independent testing on over 900 diverse samples (66 compounds) confirmed the interlaboratory precision of the assay and its ability to predict known nongenotoxic hepatocarcinogens (NGHCs). When tested under different experimental designs, strains, time points, dose setting criteria, and other preanalytical processes, the signature sensitivity and specificity were estimated to be 67% (95% confidence interval [CI] = 38–88%) and 59% (95% CI = 44–72%), respectively, with an area under the receiver operating characteristic curve of 0.65 (95% CI = 0.46–0.83). Compounds were best classified using expression data from short-term repeat dose studies; however, the prognostic expression changes appeared to be preserved after longer term treatment. Exploratory evaluations also revealed that different modes of action for nongenotoxic and genotoxic compounds can be discriminated based on the expression of specific genes. These results support a potential early preclinical testing paradigm to catalyze broader understanding of putative NGHCs.

Keywords: nongenotoxic, carcinogenesis, biomarkers, safety evaluation, liver, systems toxicology, toxicogenomics, methods, predictive toxicology, in vitro, alternatives

The rodent cancer bioassay has been used for over 30 years to evaluate the human carcinogenic risk of chemicals. The bioassay requires exposing rats and mice to a test compound for most of their lifetime (~18 to 24 months) up to a maximum tolerated dose based on prior chronic dose-ranging studies. Because of the extensive resources required, only a small fraction of chemicals have undergone carcinogenicity testing relative to the tens of thousands of compounds identified on the U.S. Environmental Protection Agency’s Toxic Substances Control Act Inventory or registered by Registration, Evaluation, Authorization, and Restriction of Chemicals (Christensen et al., 2011). Additionally, it has been reported that ~31% of marketed drugs have not been tested according to present carcinogenicity testing guidelines (Brambilla and Martelli, 2009). In addition to resource constraints and ethical concerns, the high doses frequently used in the bioassay and the physiological differences between rodents and humans have led to considerable debate over the relevance of the rodent cancer bioassay for assessing human risk (Cohen, 2010; Jacobs, 2005; Maronpot et al., 2004; Melnick et al., 2008; Ward, 2008). As a result, improving upon the current carcinogenicity testing paradigm remains an active area of research.

Because DNA damage is considered a hallmark of carcinogenesis, it is assumed that DNA damaging agents are likely to be carcinogenic. Thus, a number of in vitro and short-term in vivo genotoxicity assays have been developed and validated to detect the ability of chemicals and/or their metabolites to damage or mutate DNA and predict carcinogenic outcome (Kirkland et al., 2005). In contrast to the expectation that genotoxic chemicals are carcinogens, nongenotoxic chemicals cannot be assumed to be noncarcinogenic. Because most compounds that are mutagenic in the Ames test are excluded from drug development, the most frequent adverse outcome observed in rodent cancer bioassays is carcinogenicity initiated by nongenotoxic events. Due to the high maximum tolerated doses used in the rodent cancer bioassay, these carcinogenic events in rodents often occur at exposures above which carcinogenic risk to humans is assumed to be minimal. In addition, many examples exist for which nongenotoxic carcinogenicity in rodents has been conclusively shown not to be relevant for human risk, such as urinary bladder transitional cell carcinoma induced by saccharin or muraglitazar and renal carcinoma induced by D-limonene (Waites et al., 2007; Whysner and Williams, 1996a,b). As a result, there has been a stronger emphasis on understanding a chemical’s mode of action to better evaluate the risk and relevance of the findings to humans (Jacobs, 2005; Jacobs and Jacobson-Kram, 2004). However, the time and resources needed to determine the mechanism of action are considerable, and testing often occurs very late in drug development, if at all. Furthermore, the rodent bioassay typically does not provide the type of mechanistic insight needed to enable this evaluation.

Predicting carcinogenicity induced by nongenotoxic compounds is a challenge due to the many modes of action that have been described to contribute to tumor formation and the multistep process of carcinogenesis (Yamasaki et al., 1996). Nonetheless, numerous assays have been developed in an attempt to predict nongenotoxic carcinogens, including in silico quantitative structure activity relationship models (Contrera et al., 2003; Lee et al., 1995), in vitro mechanistic assays (Yamasaki et al., 1996), cell-based transformation systems (Mauthe et al., 2001; Vanparys et al., 2011), and various (sub)-chronic histological, histochemical, and biochemical indices (Allen et al., 2004; Elcombe et al., 2002; Kitchin et al., 1993; Tatematsu et al., 1987) and combinations thereof (Cohen, 2004; Kitchin et al., 1994). Given the modest predictivity of these approaches or the nature of the assays, there is general agreement that these short-term methods do not reliably predict tumor outcome or provide sufficient information to fully inform a human risk assessment (Jacobs, 2005). As a result, the rodent cancer bioassay remains the gold standard for assessing the human risk of chemical carcinogenesis. Therefore, novel assays or biomarkers that provide an early prediction of a carcinogenic outcome induced by nongenotoxic compounds could enable a more informed compound selection process for early-stage development. This approach could facilitate the proactive initiation of investigative studies to enable an early human risk assessment prior to initiating the rodent bioassay or provide a more efficient hazard identification approach to prioritize chemicals for carcinogenicity testing.

In response to these challenges, genomic or large-scale gene expression profiling has been extensively researched for its ability to predict long-term tumor outcome and/or provide mechanistic data to enable the risk assessment of carcinogens (Waters et al., 2010). The underlying premise of genomic profiling for carcinogenicity prediction is that gene expression changes in the target tissue precede and/or contribute to tumor development and that these changes can be monitored after a short-term in vivo treatment to predict longer term carcinogenic outcomes. To this end, numerous genomic biomarkers or signatures have been described to predict rat hepatocarcinogenicity induced by non-genotoxic compounds (Ellinger-Ziegelbauer et al., 2008; Fielden et al., 2007; Nie et al., 2006; Uehara et al., 2008). The liver is the most common site of tumor formation in the rodent bioassay (Gold et al., 2005) and a number of well-described mechanisms of liver tumor formation are amenable to evaluation based on hepatic gene expression (Waters et al., 2010), making it an ideal model system to evaluate the utility of genomics for carcinogenicity assessment.

Building on the genomic signatures originally described by Fielden et al. (2007) and Nie et al. (2006), we have previously demonstrated the statistical robustness of these proposed signatures for predicting nongenotoxic hepatocarcinogens (NGHCs) (Fielden et al., 2008). However, it was concluded that the published signatures lacked sufficient classification accuracy when used as is, likely due to experimental variables that differed across laboratories, including the microarray platform and study conditions such as time and dose. We reasoned that if the gene expression measurement platform were controlled for and the reproducibility of gene expression measurements were enhanced, a signature could be derived and more thoroughly evaluated for its ability to predict NGHCs and to refine its boundaries of use for optimal classification. Furthermore, to enable broad utilization and evaluation across laboratories, the signature had to be commercially available, established on a reliable and readily available measurement platform, biologically interpretable, and thoroughly evaluated across hundreds of diverse samples. Because the signature is not intended to replace the chronic rodent bioassay but rather to guide internal decision making, allow prioritization of chemicals for formal testing, possibly reduce the reliance on longer term animal studies, and/or enable a more rapid understanding of mode of action, a rigorous validation of the signature as a replacement of chronic rodent studies was not an objective. Instead, the objectives were to develop a signature to enable an early evaluation of NGHCs and to make the signature and underlying data publicly available for broader testing.

MATERIALS AND METHODS

TaqMan array card design.

We chose to rederive the initial microarray-based signature using quantitative real-time PCR (qPCR) to provide a widely accessible, higher throughput gene expression platform to support evaluation. To this end, we chose the TaqMan array platform (384-well microfluidic cards) to develop a custom array (Applied Biosystems, part of Life Technologies, Foster City, CA). To maximize sample throughput, it was desirable to create a TaqMan array with 32 primer pairs, permitting the analysis of four samples per card in triplicate wells. The predictor genes considered for evaluation included 37 genes from the Iconix signature (Fielden et al., 2007) and six genes from the signature published by Nie et al. (2006). An additional 10 genes from the genotoxic carcinogen signature published by Bayer (Ellinger-Ziegelbauer et al., 2004) were included as it was considered desirable to distinguish nongenotoxic from genotoxic modes of action. Because it was not practical to evaluate all 53 genes, steps were taken to identify 11 genes from the original Iconix 37 gene signature that could provide similar predictive accuracy (data not shown). This resulted in the final selection of 27 unique genes. Three of these genes were evaluated using multiple primer pair sequences (Trnt1, EST AW143969, and Sel1l). Three normalizer genes were also selected to identify an appropriate transcript to normalize and assess input RNA quality (Table 1). Primers and probes were designed by Applied Biosystems according to published design rules (Applied Biosystems).
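
The card capacity described above follows from simple arithmetic, sketched here in Python. The constants come from the text; the function name `samples_per_card` is purely illustrative and not part of any vendor software.

```python
# Capacity check for a 384-well TaqMan array card (values taken from the text).
WELLS_PER_CARD = 384
ASSAYS_PER_CARD = 32        # primer pairs on the custom card
REPLICATES_PER_ASSAY = 3    # triplicate wells


def samples_per_card(wells, assays, replicates):
    """Return how many samples fit on one card given the wells each sample needs."""
    wells_per_sample = assays * replicates
    return wells // wells_per_sample


n_samples = samples_per_card(WELLS_PER_CARD, ASSAYS_PER_CARD, REPLICATES_PER_ASSAY)
print(n_samples)  # 4

# Candidate predictor genes before down-selection: Iconix + Nie et al. + Bayer.
candidate_genes = 37 + 6 + 10
print(candidate_genes)  # 53
```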

TABLE 1.

TaqMan Assays Used for qPCR Signature Development

Assay ID	Accession	Gene symbol	Gene name	Source	In final model
Rn03399817_g1	AI232085.1	Trnt1	tRNA nucleotidyl transferase, CCA-adding, 1	Fielden et al., 2007	No
Rn03399820_s1	AI232085.1	Trnt1	tRNA nucleotidyl transferase, CCA-adding, 1	Fielden et al., 2007	Yes
Rn03399816_s1	AW143969.1	EST	EST	Fielden et al., 2007	No
Rn03399821_s1	AW143969.1	EST	EST	Fielden et al., 2007	No
Rn03399822_s1	AW143969.1	EST	EST	Fielden et al., 2007	Yes
Rn03399819_s1	AW533663.1	Prodh	Proline dehydrogenase	Fielden et al., 2007	Yes
Rn03399815_s1	AW915076.1	Gpr146	G protein-coupled receptor 146	Fielden et al., 2007	Yes
Rn03399814_s1	BF553500.1	Cited4	Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 4	Fielden et al., 2007	Yes
Rn00680664_g1	NM_012708.1	Psmb9	Proteasome (prosome, macropain) subunit, beta type 9	Fielden et al., 2007	Yes
Rn01452409_m1	NM_030844.2	Ica1	Islet cell autoantigen 1	Fielden et al., 2007	Yes
Rn00587206_m1	NM_053774.2	Usp2	Ubiquitin-specific peptidase 2	Fielden et al., 2007	Yes
Rn01475179_m1	NM_138882.1	Pla1a	Phospholipase A1 member A	Fielden et al., 2007	Yes
Rn01424675_m1	U53184	Litaf	Lipopolysaccharide-induced TNF factor	Fielden et al., 2007	Yes
Rn01432563_g1	NM_001007629.1	Nutf2	Nuclear transport factor 2	Nie et al., 2006	No
Rn00689231_m1	NM_012860.2	Mat1a	Methionine adenosyltransferase I, alpha	Nie et al., 2006	Yes
Rn02132590_g1	NM_021766.1	Pgrmc1	Progesterone receptor membrane component 1	Nie et al., 2006	No
Rn00821759_g1	NM_138826.4	Mt1a	Metallothionein 1a	Nie et al., 2006	Yes
Rn00756519_m1	NM_173295.1	Ugt2b17	UDP glucuronosyltransferase 2 family, polypeptide B17	Nie et al., 2006	Yes
Rn01517723_m1	NM_177933.2	Sel1l	Sel-1 suppressor of lin-12-like (C. elegans)	Nie et al., 2006	No
Rn00710081_m1	NM_177933.2	Sel1l	Sel-1 suppressor of lin-12-like (C. elegans)	Nie et al., 2006	Yes
Rn03399818_s1	AI639488.1	Mdm2	Mdm2 p53-binding protein homolog	Ellinger-Ziegelbauer et al., 2004	No
Rn00563462_m1	NM_012861.1	Mgmt	O-6-methylguanine-DNA methyltransferase	Ellinger-Ziegelbauer et al., 2004	Yes
Rn00566256_m1	NM_013215.1	Akr7a3	Aldo-keto reductase family 7, member A3	Ellinger-Ziegelbauer et al., 2004	Yes
Rn00568504_m1	NM_017259.1	Btg2	B-cell translocation gene 2, anti-proliferative	Ellinger-Ziegelbauer et al., 2004	Yes
Rn01530533_g1	NM_019905.1	Anxa2	Annexin A2	Ellinger-Ziegelbauer et al., 2004	Yes
Rn00755484_m1	NM_022407.3	Aldh1a1	Aldehyde dehydrogenase 1 family, member A1	Ellinger-Ziegelbauer et al., 2004	Yes
Rn00709612_m1	NM_032055	Tap1	Transporter 1, ATP-binding cassette, sub-family B	Ellinger-Ziegelbauer et al., 2004	Yes
Rn01427989_s1	NM_080782.3	Cdkn1a	Cyclin-dependent kinase inhibitor 1A	Ellinger-Ziegelbauer et al., 2004	Yes
Rn00592205_m1	NM_133586.1	Ces2	Carboxylesterase 2 (intestine, liver)	Ellinger-Ziegelbauer et al., 2004	Yes
Rn00690933_m1	NM_017101.1	Ppia	Peptidylprolyl isomerase A (cyclophilin A)	Housekeeping gene	Yes
Hs99999901_s1	X03205.1	18S	18S ribosomal RNA	Housekeeping gene	No
Rn99999916_s1	X02231.1	Gapdh	Glyceraldehyde 3-phosphate-dehydrogenase	Housekeeping gene	No

Note. C. elegans, Caenorhabditis elegans; TNF, tumor necrosis factor; UDP, uridine diphosphate.

TaqMan array card assay.

RNA concentration and quality were determined using a NanoDrop ND-1000 Spectrophotometer (Thermo Scientific, Wilmington, DE). A total of 220 ng of liver RNA from each animal was reverse transcribed using the High Capacity complementary DNA (cDNA) RT Kit according to the manufacturer’s instructions (Applied Biosystems). The cDNA was diluted to 2 ng/μl in water and 105 μl were mixed with an equal volume of 2× TaqMan Universal Master Mix (Applied Biosystems). One hundred microliters were then injected into each of two ports on the TaqMan array and analyzed on the Applied Biosystems 7900HT Real-Time PCR System according to the manufacturer’s instructions.
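
As a quick consistency check on the loading volumes above, the following Python sketch works through the dilution arithmetic. It assumes that the stated 2 ng/μl refers to input-RNA equivalents of cDNA; the variable names are illustrative only.

```python
# Loading arithmetic for one sample (volumes and concentrations from the text).
CDNA_CONC_NG_PER_UL = 2.0      # diluted cDNA; assumed to mean RNA-equivalent ng/ul
CDNA_VOLUME_UL = 105.0
MASTER_MIX_VOLUME_UL = 105.0   # equal volume of 2x master mix
PORT_VOLUME_UL = 100.0
PORTS_PER_SAMPLE = 2

# Mixing 1:1 with 2x master mix halves the cDNA concentration.
total_volume_ul = CDNA_VOLUME_UL + MASTER_MIX_VOLUME_UL
final_conc_ng_per_ul = CDNA_CONC_NG_PER_UL * CDNA_VOLUME_UL / total_volume_ul

# Amount actually loaded onto the card and the volume left over.
loaded_ng = final_conc_ng_per_ul * PORT_VOLUME_UL * PORTS_PER_SAMPLE
spare_volume_ul = total_volume_ul - PORT_VOLUME_UL * PORTS_PER_SAMPLE

print(total_volume_ul)        # 210.0
print(final_conc_ng_per_ul)   # 1.0
print(loaded_ng)              # 200.0
print(spare_volume_ul)        # 10.0
```

Under this assumption, each sample contributes about 200 ng of RNA-equivalent cDNA to the card, with 10 μl of reaction mix to spare.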

Liver RNA samples.

To develop a de novo signature from qPCR data to predict nongenotoxic hepatocarcinogenicity in the rat, we reanalyzed rat liver RNA samples that had previously been used to identify and evaluate the original Iconix microarray signature (Fielden et al., 2007). Briefly, these samples were derived from male Sprague-Dawley (SD) rats that were administered compound (NGHCs or nonhepatocarcinogens [NH]) or vehicle by oral gavage once daily for 1, 3, or 5 days (n = 3 per group). The considerations for compound classification are described below. The doses administered were considered maximally tolerated in a 5-day study and induced decreases in body weight gain or histological changes in target organs but did not induce severe clinical signs that may otherwise confound interpretation of gene expression changes. Rats were necropsied 24 h after the last dose and liver was stored frozen until RNA extraction according to Fielden et al. (2007). RNA samples were stored at −70°C and were checked to ensure sufficient material to permit cDNA synthesis, as some RNA samples had been depleted or were of low quality. Samples were selected to ensure at least two to three rats per treatment and control group. Vehicle control samples were matched based on common vehicle (aqueous or corn oil) and date that the study was run (i.e., year/quarter). In total, there were 415 RNA samples representing 121 treatment groups, which were analyzed on the TaqMan array. The analyzed log₁₀ ratio data for all treatment groups are provided in Supplementary table 1.
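
The exact procedure used to turn raw qPCR cycle thresholds into log₁₀ ratios is not given in this section. The Python sketch below is an assumption, not the authors' method: it uses the widely used ΔΔCt approach with 100% amplification efficiency, a single normalizer gene, and a matched vehicle control group. The function name and the Ct values are hypothetical.

```python
import math
from statistics import mean


def log10_ratio(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Log10 expression ratio (treated vs. control) by the delta-delta-Ct method.

    Assumes perfect doubling per cycle, so one Ct unit equals a twofold change.
    """
    delta_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta_ct = delta_treated - delta_control
    # A lower Ct means more transcript, hence the sign flip.
    return -delta_delta_ct * math.log10(2)


# Hypothetical Ct values for one target gene and one normalizer gene.
ratio = log10_ratio(
    ct_target_treated=[23.5, 24.0, 24.5],
    ct_ref_treated=[20.0, 20.0, 20.0],
    ct_target_control=[26.0, 26.0, 26.0],
    ct_ref_control=[20.0, 20.0, 20.0],
)
print(round(ratio, 3))        # 0.602
print(round(10 ** ratio, 3))  # 4.0
```

In this made-up case the target reaches threshold two cycles earlier in treated animals after normalization, which corresponds to a fourfold induction.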

For an independent sample set, we obtained over 900 rat liver RNA samples representing 178 treatment groups from a variety of studies performed at collaborators' facilities as further described below. Each treatment group had its own vehicle-matched control and consisted of at least three animals per group. All original SDS files and the R script to execute the model are available upon request to the author.

Compound classification.

A chemical was classified as a hepatocarcinogen if it was (1) found to induce liver tumors in a 2-year carcinogenicity study in at least one strain or gender of rat or (2) reasonably expected to induce liver tumors based on a known class effect (e.g., peroxisome proliferator-activated receptor alpha [PPARα] agonists, steroid hormones). Due to the high false-positive rate of some in vitro genotoxicity assays, we decided to classify hepatocarcinogens as nongenotoxic if there was sufficient literature evidence that they induce liver tumors primarily through a nongenotoxic mechanism despite having a positive finding in an in vitro genotoxicity assay (e.g., phenobarbital, clofibrate). Although we cannot discount the involvement of genotoxic mechanisms in tumor formation for these chemicals, we chose to include these chemicals as NGHCs to improve our ability to identify nongenotoxic mechanisms that may lead to tumor formation.

A chemical was classified as negative for hepatocarcinogenicity if it was (1) found not to induce liver tumors in a 2-year rodent bioassay in both male and female rats or (2) not expected to induce liver tumors based on an antiproliferative mode of action. NHs with a positive finding in a genotoxicity assay were not expected to affect the ability of the signature to identify NGHCs and thus were not specifically excluded from the NH class. Because the assay was restricted to hepatic gene expression, tumor formation in other organs was not considered in the classification nor was the presence or absence of carcinogenic activity in the mouse. No differentiation among tumor types was made, and the term hepatocarcinogenicity is used throughout to refer to chemicals that have been identified to induce adenomas and/or carcinomas.

Data on carcinogenicity outcomes were obtained from the Carcinogenic Potency Database (http://potency.berkeley.edu), the National Library of Medicine Chemical Carcinogenesis Research Information System (http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CCRIS), the National Toxicology Program (NTP) Database (http://ntp-apps.niehs.nih.gov/ntp_tox/index.cfm), the Physicians’ Desk Reference (http://www.pdrhealth.com), or peer-reviewed publications (Brambilla and Martelli, 2009; Davies and Monro, 1995; Haseman et al., 1987). Findings reported in the literature were used without reinterpretation or reclassification with respect to their statistical significance or relationship to treatment. Furthermore, no attempt was made to segregate chemicals based on the incidence or severity of tumor formation because the doses used in the current study are likely higher than those used in the rodent bioassay, and biasing the training set toward only potent carcinogens may hinder the sensitivity of the biomarker toward weaker carcinogens that are still of regulatory concern. We recognize that alternative classification of chemicals is possible, given discrepancies in the literature or the rodent bioassay; however, the goal was to derive a signature that provides a sensitive means of identifying NGHCs and modes of action that are expected to contribute to tumor formation rather than recapitulate a specific rodent bioassay result.

Model development step 1: Process evaluation.

The modeling strategy is outlined in Figure 1. We define a model in this context as a specific combination of parameters that are integrated to produce a single score for a given compound at a given dose and time (see model development in Supplementary materials and methods for more information about the model parameters). The general process is to use one subset of data to find the optimal model and a second set of data to test the single optimal model. The first step was to evaluate the process for selecting the optimal model. This involved estimating the accuracy (in terms of area under the receiver operating characteristic [ROC] curve [AUC] and proportion classified correctly) of the model building process in an evaluation phase using 72 compounds profiled on day 5 from the Iconix data set (Table 2). The AUC is equal to the probability that a classification model will rank a randomly chosen positive sample higher than a randomly chosen negative sample and is commonly used to select optimal models independent of class distribution. An AUC of 1.0 reflects a perfect classifier with 100% classification accuracy. This first step also enabled us to evaluate the model-building process on a set of samples from the same site with animals treated under the same protocol. Compounds not profiled on day 5 were excluded because it would have resulted in a skewed distribution of early (day 1) and late (days 3 and 5) samples in the training set, and previous experience indicated that a signature developed on day 5 samples provided the most robust classifier (Fielden et al., 2007). The data for the additional time points are nonetheless available as part of Supplementary table 1. 
Genotoxic hepatocarcinogens (GHCs) (aflatoxin B₁, diethylnitrosamine, methyleugenol, and safrole) were excluded from model development and validation because they did not fit either class we intended to predict, and an insufficient number of GHC samples were available to adequately evaluate classification accuracy toward this class of compounds. However, they were included as part of an effort to test the ability of the genes to differentiate mode of action.
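
The probabilistic definition of the AUC given above can be computed directly by comparing every positive sample with every negative sample. The following Python sketch does exactly that; the scores are invented toy values, and counting ties as one half is a common convention assumed here rather than something stated in the text.

```python
def auc_by_pairs(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (an assumed, commonly used convention).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy scores, not data from the study.
nghc_scores = [0.9, 0.7, 0.4]
nh_scores = [0.8, 0.3, 0.2, 0.1]

toy_auc = auc_by_pairs(nghc_scores, nh_scores)
print(round(toy_auc, 3))  # 0.833 (10 of 12 pairs ranked correctly)

# A classifier that ranks every positive above every negative has AUC = 1.0.
perfect_auc = auc_by_pairs([0.9, 0.8], [0.2, 0.1])
print(perfect_auc)  # 1.0
```

Because the calculation depends only on how samples are ranked, the result does not change with the ratio of positive to negative samples, which is why the AUC is described as independent of class distribution.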

FIG. 1. Overview of model building and evaluation. The model development occurred in three steps. (A) Step 1 was used to evaluate the process for selecting a single model for validation on an independent test set based on training and test set definitions in Fielden et al. (2007). The model strategy was successful in selecting a single model with a similar AUC estimate to that previously published (see Supplementary Results). (B) The model from step 1 was promising but underpowered. As a result, all Iconix samples were used for training in step 2 to select a single top model to classify compounds in step 3. The strategy for model building and selection was identical to that implemented in step 1 with the qualification that the performance of the top model is not preferentially driven by correctly classifying training samples defined in the process evaluation (step 1). (C) Step 3 is the validation of the top qPCR based model from step 2 on an independent test set. The independent test set is composed of samples from multiple sites using different protocols.

TABLE 2.

Summary of Male SD Rat Compound Treatments Used for Signature Development as Part of the Evaluation Study and Final Model Development

Compound	Vehicle	Dose (mg/kg/day)	Time point (days)	Class	Set^a
Anastrozole	CMC	400	5	NGHC	Training
Ethisterone	CMC	1500	5	NGHC	Training
Methapyrilene	CMC	100	5	NGHC	Training
Nafenopin	Corn oil	338	5	NGHC	Training
Norethindrone	Corn oil	375	5	NGHC	Training
Pentobarbital	Water	70	5	NGHC	Training
Phenobarbital	Water	80	5	NGHC	Training
Pirinixic acid	CMC	364	5	NGHC	Training
Pravastatin	Corn oil	1200	5	NGHC	Training
2,3,7,8-Tetrachlorodibenzo-p-dioxin	CMC	0.02	5	NGHC	Training
Acetaminophen	Corn oil	972	5	NGHC	Test
Beta-naphthoflavone	CMC	1500	5	NGHC	Test
Bezafibrate	Corn oil	617	7	NGHC	Test
Bis(2-ethylhexyl) phthalate	Corn oil	500	5	NGHC	Test
Carbamazepine	CMC	490	5	NGHC	Test
Carbimazole	Water	400	5	NGHC	Test
Chloroform	Corn oil	600	5	NGHC	Test
Diethylstilbestrol	Corn oil	280	5	NGHC	Test
Ethylestrenol	CMC	390	5	NGHC	Test
Fluconazole	Corn oil	394	5	NGHC	Test
Oxymetholone	CMC	1170	5	NGHC	Test
Spironolactone	CMC	300	5	NGHC	Test
Testosterone	CMC	375	5	NGHC	Test
Alfacalcidol	CMC	0.04	5	NH	Training
Amlodipine	Corn oil	19	5	NH	Training
Aspirin	Corn oil	375	5	NH	Training
Carvedilol	Corn oil	2000	5	NH	Training
Celecoxib	Corn oil	400	5	NH	Training
Ciprofloxacin	Corn oil	450	5	NH	Training
Citric acid	Water	3000	5	NH	Training
Clarithromycin	Water	476	5	NH	Training
Cortisone	CMC	206	5	NH	Training
Cycloheximide	Water	0.25	5	NH	Training
Dichlorvos	Water	17	5	NH	Training
Diclofenac	Corn oil	10	5	NH	Training
Ergocalciferol	CMC	15	5	NH	Training
Etodolac	CMC	24	5	NH	Training
Fluoxetine	CMC	52	5	NH	Training
Ketorolac	Water	48	5	NH	Training
Megestrol acetate	CMC	132	5	NH	Training
Methyldopa	Water	325	5	NH	Training
Pergolide	CMC	1.1	5	NH	Training
Perhexiline	CMC	320	5	NH	Training
Pioglitazone	Corn oil	1500	5	NH	Training
Praziquantel	CMC	1200	5	NH	Training
Promethazine	Saline	113	5	NH	Training
Propylthiouracil	CMC	625	5	NH	Training
Pyrazinamide	CMC	1500	5	NH	Training
Rabeprazole	Water	1024	5	NH	Training
Rifabutin	CMC	1500	5	NH	Training
Rofecoxib	Corn oil	1550	5	NH	Training
Rosiglitazone	Corn oil	1800	5	NH	Training
Roxithromycin	CMC	312	5	NH	Training
Ticlopidine	CMC	223	5	NH	Training
Tolazamide	CMC	1500	5	NH	Training
Troglitazone	Corn oil	1200	5	NH	Training
Valproic acid	Water	1500	5	NH	Training
1,1-Dichloroethene	Water	600	5	NH	Test
Amoxapine	CMC	313	5	NH	Test
Cholecalciferol	CMC	8	5	NH	Test
Citalopram	Corn oil	90	5	NH	Test
Clomiphene	CMC	250	5	NH	Test
Clomipramine	Water	115	5	NH	Test
Diazepam	CMC	710	5	NH	Test
Erythromycin	CMC	1500	5	NH	Test
Finasteride	Corn oil	800	5	NH	Test
Geraniol	CMC	1500	5	NH	Test
Pemoline	CMC	70	5	NH	Test
Phenothiazine	Corn oil	386	5	NH	Test
Primidone	CMC	750	5	NH	Test
Propylene glycol	Water	2000	5	NH	Test
Quetiapine	CMC	500	5	NH	Test

Note. CMC, carboxymethylcellulose.

^a Training and test refer to how the treatments were divided for model evaluation only. The final model included all 72 treatment groups. All treatments were by oral administration.

Although the original studies were performed with different microarray platforms, all work in this study was based on the TaqMan array platform. This implies that the features used in the original papers may have different predictive powers in this study due to differences between platforms, but we would expect comparable performance. Other markers from other papers (Nie et al., 2006) were provided, so the final feature list could be reconstituted. Because a portion of the genes used on the TaqMan array were chosen from the same samples based on results of prior modeling (e.g., Fielden et al., 2007), our estimates of signature performance on the training data would be optimistically biased. Therefore, we chose to maintain the distinction of training and test samples as originally defined by Fielden et al. (2007) and estimate accuracy only on the test data that were not used for feature selection in the original study. The 72 compounds were thus split into a training set of 10 NGHC compounds and 34 NH compounds and a test set of 13 NGHC compounds and 15 NH compounds as shown in Table 2. For this particular training set, we performed 25 replications of fivefold cross-validation on the 44 training compounds (10 NGHC compounds and 34 NH compounds) to produce 25 × 44 = 1100 scores for each candidate model. For each model, we pooled all 1100 scores, each paired with its class membership, to estimate a single value for the area under the ROC curve. Although these AUC estimates will be optimistically biased, ranking the candidate models by AUC was still a reasonable procedure, albeit with the risk of selecting overfitted models. The candidate with the top AUC estimate was selected to classify the samples in the held out test data.
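
The repeated cross-validation and score pooling described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R script: the classifier is a simple stand-in (a projection onto the difference of class means), the folds are stratified by class, and the data are randomly generated. All function names are hypothetical.

```python
import random


def stratified_folds(labels, k, rng):
    """Split indices into k folds while spreading each class evenly across folds."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def fit_mean_difference(xs, ys):
    """Stand-in classifier (an assumption): weights = positive mean - negative mean."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return [
        sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
        for j in range(len(xs[0]))
    ]


def pooled_cv_scores(xs, ys, n_repeats=25, k=5, seed=0):
    """Collect one held-out score per sample per repeat, pooled across all repeats."""
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n_repeats):
        for fold in stratified_folds(ys, k, rng):
            held_out = set(fold)
            train_x = [x for i, x in enumerate(xs) if i not in held_out]
            train_y = [y for i, y in enumerate(ys) if i not in held_out]
            weights = fit_mean_difference(train_x, train_y)
            for i in fold:
                scores.append(sum(w * v for w, v in zip(weights, xs[i])))
                labels.append(ys[i])
    return scores, labels


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data: 10 "NGHC" and 34 "NH" compounds, 5 made-up features each.
data_rng = random.Random(1)
labels = [1] * 10 + [0] * 34
features = [[data_rng.gauss(0.5 * y, 1.0) for _ in range(5)] for y in labels]

pooled_scores, pooled_labels = pooled_cv_scores(features, labels)
pooled_auc = auc(pooled_scores, pooled_labels)
print(len(pooled_scores))  # 1100
print(round(pooled_auc, 3))
```

Each candidate model would be run through the same procedure and the candidates ranked by their pooled AUC. Pooling treats scores from different folds as comparable, which is part of why such estimates are useful for ranking but not as unbiased measures of accuracy.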

Model development step 2: Final model development.

The AUC point estimate of the test data from the evaluation study appeared promising and was comparable to the AUC estimate in the original microarray-based classifier (see Supplementary materials and methods and Supplementary results). Therefore, we performed a second model building procedure using all 72 compounds in the data set to identify a final model suitable for further independent testing as described below. For the second and final model building process, we chose the model with the best pooled AUC estimate as the top model with the caveat that the AUC estimates stratified by the original training and test split are relatively balanced (in other words, we would not select a model with a high AUC estimate driven predominantly by the originally defined training data that were used in the feature selection). See Supplementary materials and methods and Supplementary results for more information.

Model development step 3: Signature evaluation on independent data set.

The model building procedure in step 2 produced a single final model to classify samples in an independent set of samples. In order to estimate signature performance of the final model, yet also determine the boundaries of use of the derived signature, we tested it on a broad array of samples, several of which did not fall within the training set framework as a result of distinct study designs, rat strains, and/or compound classes. Our goal in doing so was not only to judge the sensitivity, specificity, and reproducibility of the signature but also to identify any factors that may result in poor signature performance so that a study design could be recommended to provide optimal classification accuracy. In total, we obtained over 900 liver RNA samples from a number of sources that tested a variety of chemicals under different conditions. A description of the treatment conditions (dose, time, strain, and route of administration) is provided in Table 3 and references therein. Expression data for these treatment groups are also provided in Supplementary table 1. Liver RNA was analyzed using the TaqMan array as described above for the training set samples. In total, there were 169 treatment groups representing 86 unique compounds, including NGHC, NH, and GHC, and several compounds of unknown or inconclusive carcinogenic outcome in the rat (alpha-naphthylisothiocyanate, butylated hydroxytoluene, ridogrel, prucalopride, chlorpromazine, hexachlorocyclohexane, and amitriptyline). For the purpose of the independent signature evaluation, we removed the unknown and GHC compounds from the analysis, and we removed any compounds that were used in the training of the final model. This produced an independent data set totaling 66 unique compounds that were evaluated under varying conditions.
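
Sensitivity and specificity follow directly from comparing the known class with the predicted class. The Python sketch below applies the standard definitions to a handful of rows copied from Table 3. It is meant only to show the calculation, so the resulting values are not the study's performance estimates, and the function name is illustrative.

```python
# (treatment, known class, predicted class) for a few rows taken from Table 3.
rows = [
    ("Cyproterone acetate", "NGHC", "NGHC"),
    ("Isoniazid", "NGHC", "NH"),
    ("Monocrotaline", "NGHC", "NGHC"),
    ("Progesterone", "NGHC", "NH"),
    ("Atenolol", "NH", "NH"),
    ("Aniline", "NH", "NGHC"),
    ("Captopril", "NH", "NH"),
    ("Dantrolene", "NH", "NH"),
]


def sensitivity_specificity(records, positive="NGHC"):
    """Return (sensitivity, specificity) from (name, truth, prediction) records."""
    tp = fn = tn = fp = 0
    for _, truth, predicted in records:
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


sensitivity, specificity = sensitivity_specificity(rows)
print(sensitivity)  # 0.5  (2 of 4 NGHC rows predicted NGHC)
print(specificity)  # 0.75 (3 of 4 NH rows predicted NH)
```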

TABLE 3.

Results of Independent Multisite Signature Evaluation

Johnson & Johnson (male SD rats); Nie et al., 2006
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class^a	Confidence level^b
Butylated hydroxytoluene	5% MC	1000	PO	1	NH	0.79	NGHC	100
Cyproterone acetate	5% MC	200	PO	1	NGHC	0.624	NGHC	68.9
Ethinyl estradiol (experiment 1)	5% MC	500	PO	1	NGHC	0.851	NGHC	100
Ethinyl estradiol (experiment 2)	5% MC	500	PO	1	NGHC	0.864	NGHC	100
Isoniazid	5% MC	125	PO	1	NGHC	0.157	NH	100
Methapyrilene^c	5% MC	200	PO	1	NGHC	0.87	NGHC	100
Monocrotaline	5% MC	30	PO	1	NGHC	0.749	NGHC	99.7
Piperonyl butoxide	5% MC	4000	PO	1	NGHC	0.548	NH	80.2
Progesterone	5% MC	100	PO	1	NGHC	0.359	NH	100
Simvastatin	5% MC	150	PO	1	NGHC	0.353	NH	100
Tamoxifen	5% MC	750	PO	1	NGHC	0.532	NH	87.1
Amiodarone	5% MC	600	PO	1	NH	0.468	NH	98.8
Amiodarone (experiment 1)	5% MC	1000	PO	1	NH	0.419	NH	99.9
Amiodarone (experiment 2)	5% MC	1000	PO	1	NH	0.508	NH	94
Aniline	5% MC	200	PO	1	NH	0.696	NGHC	96.1
Aspirin^c	5% MC	600	PO	1	NH	0.37	NH	100
Atenolol	5% MC	1500	PO	1	NH	0.245	NH	100
Beta-hydroxypropyl-cyclodextrin	Water	2000	PO	1	NH	0.724	NGHC	98.8
Bromocryptine	5% MC	200	PO	1	NH	0.19	NH	100
Buspirone	5% MC	100	PO	1	NH	0.661	NGHC	87.6
Captopril	5% MC	5000	PO	1	NH	0.401	NH	100
Clozapine	5% MC	150	PO	1	NH	0.517	NH	91.8
Dantrolene	5% MC	500	PO	1	NH	0.438	NH	99.7
Dapsone	5% MC	50	PO	1	NH	0.563	NH	71.7
Dexamethasone	5% MC	75	PO	1	NH	0.437	NH	99.7
Dieldrin	5% MC	30	PO	1	NH	0.62	NGHC	66.3
Dieldrin	5% MC	45	PO	1	NH	0.615	NGHC	63
Dipyridamole	5% MC	5000	PO	1	NH	0.659	NGHC	86.6
Disulfiram	5% MC	2000	PO	1	NH	0.91	NGHC	100
Enalapril	5% MC	1800	PO	1	NH	0.653	NGHC	84.2
Erythromycin estolate (experiment 1)	5% MC	1500	PO	1	NH	0.678	NGHC	92.5
Erythromycin estolate (experiment 2)	5% MC	1500	PO	1	NH	0.802	NGHC	100
Famotidine	5% MC	500	PO	1	NH	0.641	NGHC	78.8
Fluoxetine^c	5% MC	50	PO	1	NH	0.584	NH	58.7
Fluoxetine^c	5% MC	100	PO	1	NH	0.499	NH	95.6
Flutamide	5% MC	500	PO	1	NH	0.923	NGHC	100
Flutamide	5% MC	500	PO	1	NH	0.779	NGHC	99.9
Furosemide	5% MC	1500	PO	1	NH	0.284	NH	100
Glibenclamide	5% MC	3000	PO	1	NH	0.35	NH	100
Glibenclamide	5% MC	5010	PO	1	NH	0.479	NH	98
Lansoprazole	5% MC	200	PO	1	NH	0.768	NGHC	99.9
Ibuprofen	5% MC	500	PO	1	NH	0.571	NH	67.2
Indomethacin	Saline	30	IP	1	NH	0.385	NH	100
Itraconazole	5% MC	200	PO	1	NH	0.584	NH	58.7
Ketoconazole	5% MC	150	PO	1	NH	0.456	NH	99.3
Mebendazole	5% MC	40	PO	1	NH	0.472	NH	98.5
Metformin	5% MC	750	PO	1	NH	0.443	NH	99.7
Methyldopa^c	5% MC	1000	PO	1	NH	0.293	NH	100
Metoprolol	5% MC	2000	PO	1	NH	0.712	NGHC	98
Mycophenolic acid	5% MC	500	PO	1	NH	0.429	NH	99.8
Naltrexone	5% MC	1000	PO	1	NH	0.792	NGHC	100
Niacin	5% MC	2505	PO	1	NH	0.779	NGHC	99.9
Niacin	5% MC	5010	PO	1	NH	0.125	NH	100
Nifedipine	5% MC	750	PO	1	NH	0.655	NGHC	85.2
Nitrofurantoin	5% MC	400	PO	1	NH	0.599	NGHC	52.3
Nizatidine	5% MC	1000	PO	1	NH	0.569	NH	68.5
Perhexilene^c	5% MC	2000	PO	1	NH	0.522	NH	90.4
Perhexilene^c	5% MC	2010	PO	1	NH	0.381	NH	100
Phenylephrine	Saline	5	IP	1	NH	0.105	NH	100
Quercetin	5% MC	1995	PO	1	NH	0.316	NH	100
Quercetin	5% MC	4005	PO	1	NH	0.368	NH	100
Raloxifene	5% MC	700	PO	1	NH	0.235	NH	100
Ranitidine	5% MC	1000	PO	1	NH	0.302	NH	100
Rifampin	5% MC	600	PO	1	NH	0.49	NH	96.9
Rosiglitazone^c	5% MC	30	PO	1	NH	0.179	NH	100
Rosiglitazone^c	5% MC	100	PO	1	NH	0.207	NH	100
Rotenone	5% MC	4	PO	1	NH	0.702	NGHC	96.9
Rotenone	5% MC	100	PO	1	NH	0.533	NH	86.8
Sulfamethoxazole	5% MC	2000	PO	1	NH	0.681	NGHC	93.2
Tannic acid (experiment 1)	5% MC	3000	PO	1	NH	0.715	NGHC	98.2
Tannic acid (experiment 2)	5% MC	3000	PO	1	NH	0.639	NGHC	77.5
Tetracycline	5% MC	500	PO	1	NH	0.565	NH	71
Troglitazone^c	5% MC	100	PO	1	NH	0.486	NH	97.3
Troglitazone^c	5% MC	500	PO	1	NH	0.606	NGHC	56.8
Valproic acid^c	5% MC	200	PO	1	NH	0.312	NH	100
Valproic acid^c	5% MC	500	PO	1	NH	0.237	NH	100
Valproic acid^c	5% MC	600	PO	1	NH	0.41	NH	99.9
Valproic acid^c	5% MC	1000	PO	1	NH	0.735	NGHC	99.3
Verapamil	5% MC	75	PO	1	NH	0.377	NH	100
Vitamin A	5% MC	100	PO	1	NH	0.509	NH	93.7
Vitamin A	5% MC	200	PO	1	NH	0.406	NH	100

NTP (male F344 rats); Auerbach et al., 2010
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
1-Amino-2,4-dibromoanthraquinone	Feed	5000 ppm	Dietary	2	NGHC	0.909	NGHC	100
1-Amino-2,4-dibromoanthraquinone	Feed	5000 ppm	Dietary	14	NGHC	0.868	NGHC	100
1-Amino-2,4-dibromoanthraquinone	Feed	5000 ppm	Dietary	90	NGHC	0.792	NGHC	100
Acetaminophen^c	Feed	3000 ppm	Dietary	2	NGHC	0.861	NGHC	100
Acetaminophen^c	Feed	3000 ppm	Dietary	14	NGHC	0.797	NGHC	100
Acetaminophen^c	Feed	3000 ppm	Dietary	90	NGHC	0.672	NGHC	91.1
Methyleugenol	5% MC	150	PO	2	GHC	0.911	NGHC	100
Methyleugenol	5% MC	150	PO	14	GHC	0.869	NGHC	100
Methyleugenol	5% MC	150	PO	90	GHC	0.654	NGHC	84.7
Methyleugenol	Corn oil	35.6	PO	2	GHC	0.632	NGHC	73.6
Methyleugenol	Corn oil	35.6	PO	14	GHC	0.537	NH	85.2
Methyleugenol	Corn oil	35.6	PO	90	GHC	0.565	NH	70.7
Methyleugenol	Corn oil	356	PO	2	GHC	0.849	NGHC	100
Methyleugenol	Corn oil	356	PO	14	GHC	0.757	NGHC	99.8
Methyleugenol	Corn oil	356	PO	90	GHC	0.851	NGHC	100
Safrole	Corn oil	32.4	PO	2	GHC	0.717	NGHC	98.3
Safrole	Corn oil	32.4	PO	14	GHC	0.538	NH	84.9
Safrole	Corn oil	32.4	PO	90	GHC	0.655	NGHC	88.6
Safrole	Corn oil	324	PO	2	GHC	0.758	NGHC	99.8
Safrole	Corn oil	324	PO	14	GHC	0.736	NGHC	99.3
Safrole	Corn oil	324	PO	90	GHC	0.717	NGHC	98.3
Ascorbic acid	Feed	25,000 ppm	Dietary	2	NH	0.636	NGHC	76.1
Ascorbic acid	Feed	25,000 ppm	Dietary	14	NH	0.786	NGHC	100
Ascorbic acid	Feed	25,000 ppm	Dietary	90	NH	0.468	NH	98.8
Eugenol	Corn oil	32.8	PO	2	NH	0.544	NH	82.2
Eugenol	Corn oil	32.8	PO	14	NH	0.49	NH	96.9
Eugenol	Corn oil	32.8	PO	90	NH	0.431	NH	99.8
Eugenol	Corn oil	328	PO	2	NH	0.429	NH	99.8
Eugenol	Corn oil	328	PO	14	NH	0.698	NGHC	96.4
Eugenol	Corn oil	328	PO	90	NH	0.363	NH	100
Isoeugenol	Corn oil	32.8	PO	2	NH	0.804	NGHC	100
Isoeugenol	Corn oil	32.8	PO	14	NH	0.77	NGHC	99.9
Isoeugenol	Corn oil	32.8	PO	90	NH	0.709	NGHC	97.7
Isoeugenol	Corn oil	328	PO	2	NH	0.728	NGHC	99
Isoeugenol	Corn oil	328	PO	14	NH	0.655	NGHC	85
Isoeugenol	Corn oil	328	PO	90	NH	0.642	NGHC	79.3
l-tryptophan	Feed	25,000 ppm	Dietary	2	NH	0.876	NGHC	100
l-tryptophan	Feed	25,000 ppm	Dietary	14	NH	0.837	NGHC	100
l-tryptophan	Feed	25,000 ppm	Dietary	90	NH	0.647	NGHC	81.8
Aflatoxin B₁	Feed	1 ppm	Dietary	2	GHC	0.665	NGHC	88.9
Aflatoxin B₁	Feed	1 ppm	Dietary	14	GHC	0.812	NGHC	100
Aflatoxin B₁	Feed	1 ppm	Dietary	90	GHC	0.836	NGHC	100
Dimethylnitrosamine	Water	5 ppm	Water	2	GHC	0.653	NGHC	84.2
Dimethylnitrosamine	Water	5 ppm	Water	14	GHC	0.745	NGHC	99.6
Dimethylnitrosamine	Water	5 ppm	Water	90	GHC	0.759	NGHC	99.8

Pfizer (male SD rats)
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
Acetaminophen^c	5% MC	300	PO	4	NGHC	0.199	NH	100
Thioacetamide	5% MC	50	PO	4	NGHC	0.809	NGHC	100
Alpha-naphthylisothiocyanate	5% MC	30	PO	1	Unknown	0.42	NH	99.9
Alpha-naphthylisothiocyanate	5% MC	100	PO	1	Unknown	0.329	NH	100

Roche (male SD rats)
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
Methapyrilene^c	Water	10	PO	2	NGHC	0.329	NH	100
Methapyrilene^c	Water	10	PO	6	NGHC	0.584	NH	58.7
Methapyrilene^c	Water	10	PO	10	NGHC	0.698	NGHC	96.4
Methapyrilene^c	Water	10	PO	14	NGHC	0.734	NGHC	99.3
Methapyrilene^c	Water	50	PO	2	NGHC	0.861	NGHC	100
Methapyrilene^c	Water	50	PO	6	NGHC	0.884	NGHC	100
Methapyrilene^c	Water	50	PO	10	NGHC	0.879	NGHC	100
Methapyrilene^c	Water	50	PO	14	NGHC	0.713	NGHC	98

Sanofi-aventis (male F344 rats); Michel et al., 2005—site 2
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
Clofibrate	Feed	5000 ppm	Dietary	18	NGHC	0.821	NGHC	100
Clofibrate	Feed	5000 ppm	Dietary	264	NGHC	0.866	NGHC	100
Clofibrate (nontumorous)	Feed	5000 ppm	Dietary	607	NGHC	0.709	NGHC	97.8
Clofibrate (adjacent tumor)	Feed	5000 ppm	Dietary	607	NGHC	0.945	NGHC	100

Schering-Plough Research Institute (male SD rats); Nioi et al., 2008
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
Acetaminophen^c	4% MC	950	PO	1	NGHC	0.782	NGHC	99.9
Acetaminophen^c	4% MC	950	PO	5	NGHC	0.866	NGHC	100
Butylated hydroxytoluene	4% MC	450	PO	1	NH	0.739	NGHC	99.4
Butylated hydroxytoluene	4% MC	450	PO	5	NH	0.685	NGHC	94.2
Methapyrilene^c	4% MC	100	PO	1	NGHC	0.828	NGHC	100
Methapyrilene^c	4% MC	100	PO	5	NGHC	0.895	NGHC	100
Phenobarbital^c	Water	50	PO	1	NGHC	0.701	NGHC	96.8
Phenobarbital^c	Water	50	PO	5	NGHC	0.649	NGHC	82.5
Fluoxetine^c	4% MC	400	PO	1	NH	0.444	NH	99.6
Fluoxetine^c	4% MC	400	PO	5	NH	0.484	NH	97.6
Ranitidine	4% MC	1000	PO	1	NH	0.54	NH	83.9
Ranitidine	4% MC	1000	PO	5	NH	0.541	NH	83.4

Iconix (male SD rats); Fielden et al., 2007
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
Aflatoxin B₁	0.5% CMC	0.3	PO	5	GHC	0.518	NH	91.5
Diethylnitrosamine	Saline	34	PO	5	GHC	0.754	NGHC	99.7
Pregnenolone-16alpha-carbonitrile	0.5% MC	100	PO	5	NGHC	0.899	NGHC	100
Carbon tetrachloride	Corn oil	1175	PO	3	NGHC	0.62	NGHC	66.6

Abbott (male and female SD rats)
Treatment	Vehicle	Dose (mg/kg/day)	Route of administration	Time point (days)	Class	Score	Predicted class	Confidence level
N-vinylpyrrolidone-2—male	Saline	3000	PO	5	NGHC	0.782	NGHC	99.7
N-vinylpyrrolidone-2—female	Saline	3000	PO	5	NGHC	0.725	NGHC	97.1
Rimonabant—male	0.2% HPMC	10	PO	5	NGHC	0.635	NGHC	71.9
Latrepirdine—male	0.2% HPMC	10	IP	6	NH	0.276	NH	100

Note. CMC, carboxymethylcellulose; HPMC, hydroxypropylmethylcellulose; MC, methylcellulose.

^a Signature scores ≥ the classification threshold (0.596) were predicted as NGHCs.

^b The confidence level provides an estimate of confidence for the two class predictions (NGHC or NH) and is described in the Supplementary Materials and Methods.

^c Indicates compounds that were also used in the original training set.

Determination of classification threshold.

Classifying compounds as NGHC or NH required dichotomizing the classification scores into calls. Because we standardized all potential models to have probabilistic output with values 0–1 inclusive, we modeled the classification scores with beta distributions. After each replication of cross-validation on the training data, we fit the classification scores to two separate beta probability densities: one for the NH classification results and one for the NGHC classification results. The point at which a score is equally likely under the NGHC and NH distributions was defined as the threshold or classification cut point. The 25 replications of cross-validation provided 25 estimates of the threshold for a given model. Because the threshold tended to be away from the 0 or 1 limits, the thresholds were approximately normally distributed, and this allowed for reasonable estimates of the variance associated with the threshold.
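The threshold definition above can be illustrated with a minimal sketch. It assumes a method-of-moments beta fit (the fitting procedure actually used is not stated here) and uses hypothetical cross-validation scores; the function names `fit_beta_moments`, `beta_logpdf`, and `equal_density_threshold` are illustrative only.

```python
import math
import statistics

def fit_beta_moments(scores):
    """Method-of-moments beta fit (an assumption; the fitting method used
    in the study is not specified in this section)."""
    m = statistics.mean(scores)
    v = statistics.variance(scores)
    common = m * (1 - m) / v - 1
    if common <= 0:
        raise ValueError("variance too large for a beta fit")
    return m * common, (1 - m) * common

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)

def equal_density_threshold(nh_scores, nghc_scores, iters=100):
    """Score at which the fitted NH and NGHC densities are equal,
    located by bisection between the two class means."""
    a_nh, b_nh = fit_beta_moments(nh_scores)
    a_c, b_c = fit_beta_moments(nghc_scores)

    def diff(x):
        return beta_logpdf(x, a_c, b_c) - beta_logpdf(x, a_nh, b_nh)

    lo, hi = statistics.mean(nh_scores), statistics.mean(nghc_scores)
    if not (diff(lo) < 0 < diff(hi)):
        raise ValueError("densities do not cross between the class means")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical scores from one cross-validation replication (not study data).
nh_cv_scores = [0.21, 0.30, 0.35, 0.40, 0.44, 0.47, 0.52, 0.58]
nghc_cv_scores = [0.62, 0.70, 0.75, 0.80, 0.84, 0.87, 0.90]

threshold = equal_density_threshold(nh_cv_scores, nghc_cv_scores)
print(f"equal-density cut point: {threshold:.3f}")
```

Repeating this over each cross-validation replication would give a sample of cut points whose mean and standard deviation can then be summarized, in the spirit of the 25 estimates described above.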

Interlaboratory precision study.

The interlaboratory precision of the model was evaluated by splitting each of 38 liver RNA samples among four laboratories and determining the reproducibility of the expression values and signature scores when measured in different laboratories. The 38 samples consisted of liver RNA from male F344 rats (n = 6–10 rats per group) treated with 5000 ppm of clofibrate in the diet for 18, 264, or 607 days (Michel et al., 2005). RNA samples from liver tumors, and adjacent normal tissue, were evaluated and compared on day 607. The time-matched control animals received diet only. The precision of the TaqMan array data was evaluated by comparing the variability of signature scores and expression ratios across the four sites.
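One simple way to summarize such a comparison is sketched below. The sample identifiers and scores are hypothetical, and the choice of standard deviation, range, and call concordance as summary statistics is an assumption, since the specific variability metrics are not named in this section.

```python
import statistics

THRESHOLD = 0.596  # classification threshold of the final model

# Hypothetical signature scores for the same RNA sample measured at four sites.
site_scores = {
    "sample_01": [0.82, 0.84, 0.81, 0.83],
    "sample_02": [0.31, 0.29, 0.33, 0.30],
    "sample_03": [0.61, 0.58, 0.62, 0.60],
}

def summarize_sites(scores, threshold=THRESHOLD):
    """Spread of one sample's scores across sites and whether every
    site would make the same NGHC/NH call."""
    calls = {score >= threshold for score in scores}
    return {
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),
        "range": max(scores) - min(scores),
        "concordant": len(calls) == 1,
    }

precision_summary = {name: summarize_sites(s) for name, s in site_scores.items()}
for name, row in precision_summary.items():
    print(name, f"sd={row['sd']:.3f}", f"concordant={row['concordant']}")
```

In this made-up example the third sample sits close to the threshold, so a small between-site difference changes the call even though the spread of scores is similar to that of the other samples.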

Biological interpretation of biomarker genes and their regulation.

Two approaches were used to obtain gene function information on the 23 genes composing the final model: (1) a general biomedical literature search (PubMed) carried out on a gene-by-gene basis and (2) the mining of annotated knowledge-based databases found in the Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, Redwood City, CA) and the BIOBASE Knowledge Library (BIOBASE Corporation, Beverly, MA). The literature review was focused on identifying functional associations between biomarker genes and regulation of cell proliferation and carcinogenesis. IPA was used to identify pathways, biological processes, and networks that were statistically enriched in the signature genes. Through the use of both tools, the probability that the representation level of genes in the query set in each functional category, disease, or network process is due to chance alone was expressed as a p value. p Values less than 0.05 were considered significant. Detailed information on the statistical methods underlying the pathway and functional category enrichment and impact scoring can be found at the software provider’s web address (Ingenuity Systems, http://www.ingenuity.com/).

Using gene accession information, the genes composing the final model were uploaded into the BIOBASE ExPlain data analysis system, which leverages the TRANSFAC and TRANSPATH databases to score for the presence of transcription factor response elements (TFREs) within the 1100-bp proximal promoter region of the member genes. To determine relative enrichment, the TFRE abundance in the query set was compared with that in a reference set of 400 rat housekeeping gene promoters. The likelihood of TFRE overrepresentation in the query set relative to the reference set is expressed as a p value representing the probability that the difference is due solely to chance. A more detailed description of the BIOBASE ExPlain tool and the statistical methods underpinning the TFRE enrichment analysis can be found at the provider’s web address (http://www.biobase-international.com/).

RESULTS

Classification Accuracy

The results of the initial model building in the evaluation study (model development step 1: process evaluation) produced an AUC of 0.84 on the test data, which was significantly different from the top model trained on the same data with the class labels randomly permuted (p = 0.012), indicating that the model and the underlying model building procedure identified a true signal that can differentiate NGHCs from NHs (see Supplementary Results). Based on these encouraging results, we proceeded to build a final model using all 72 compounds. This modeling resulted in a signature containing 22 genes, all normalized to peptidylprolyl isomerase A, using a random forest classifier with a classification threshold of 0.596. This signature was then evaluated on the independent data set detailed in Table 3. Figure 2 shows the principal component analysis visualization based on the delta-delta C_t values (expression ratios) for each data point (a compound measured at a given site, dose, and time point) in the independent data set. The compound classes tended to separate in the first two principal components, thus indicating that the separation of NGHCs and NHs is partially preserved on independent samples based solely on the expression of the 22 genes in the final signature. To estimate overall sensitivity and specificity of the signature, an evaluation was done at the compound level by merging compounds measured from multiple sites, and at different doses or time points, into a single score based on the median signature score. Merging the replicates produced 66 unique compounds in the independent data set. This approach resulted in a sensitivity and specificity of 67% (95% confidence interval [CI] = 38–88%) and 59% (95% CI = 44–72%), respectively, with an AUC of 0.65 (95% CI = 0.46–0.83) (Fig. 4A and Supplementary figure 2). This conservative estimate provided a 1:1 mapping of compound to prediction in order to estimate the associations with class.
In general, we found the data from multiple sites for the same compound to have correlated scores (see Supplementary figure 3). However, merging multiple doses in this manner may risk conflating very different responses on individual compounds and should not be done in practice, but it nonetheless provides a convenient means to estimate performance. The effect of site, which in this context is a proxy for study protocol, on these classification results is difficult to evaluate because most samples in the independent study came from single dose (1 day) studies at Johnson & Johnson (J&J) (Fig. 3). If we confined our results to sites outside of the J&J samples, we estimate an improved AUC at 0.81 (95% CI = 0.5–1.0), whereas the J&J results provided an AUC of 0.49 (95% CI = 0.22–0.76) (Fig. 4B).
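The compound-level merging described above can be sketched as follows. The rows are a small subset copied from Table 3 (J&J, day 1), so the resulting numbers illustrate the procedure only and are not the reported study estimates. The rank-based AUC shown here is one standard estimator and is an assumption about how the AUC might be computed; the helper names are illustrative.

```python
from collections import defaultdict
from statistics import median

THRESHOLD = 0.596  # classification threshold of the final model

# (compound, known class, signature score): a few rows taken from Table 3.
rows = [
    ("Cyproterone acetate", "NGHC", 0.624),
    ("Ethinyl estradiol", "NGHC", 0.851),
    ("Ethinyl estradiol", "NGHC", 0.864),
    ("Isoniazid", "NGHC", 0.157),
    ("Amiodarone", "NH", 0.468),
    ("Amiodarone", "NH", 0.419),
    ("Amiodarone", "NH", 0.508),
    ("Aniline", "NH", 0.696),
    ("Atenolol", "NH", 0.245),
]

def merge_by_compound(rows):
    """Collapse all treatment groups of a compound into its median score."""
    grouped = defaultdict(list)
    for compound, cls, score in rows:
        grouped[(compound, cls)].append(score)
    return {c: (cls, median(scores)) for (c, cls), scores in grouped.items()}

def sensitivity_specificity(merged, threshold=THRESHOLD):
    """Scores at or above the threshold are called NGHC."""
    pos = [s for cls, s in merged.values() if cls == "NGHC"]
    neg = [s for cls, s in merged.values() if cls == "NH"]
    sens = sum(s >= threshold for s in pos) / len(pos)
    spec = sum(s < threshold for s in neg) / len(neg)
    return sens, spec

def rank_auc(merged):
    """Probability that a random NGHC outscores a random NH (ties count 0.5)."""
    pos = [s for cls, s in merged.values() if cls == "NGHC"]
    neg = [s for cls, s in merged.values() if cls == "NH"]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

merged = merge_by_compound(rows)
sens, spec = sensitivity_specificity(merged)
area = rank_auc(merged)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={area:.2f}")
```

Taking the median keeps one prediction per compound, which is what allows sensitivity and specificity to be counted over unique compounds rather than over treatment groups.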

FIG. 2. — Principal component analysis of the independent signature evaluation data. The 66 test compounds (including replicates) spanned by the 22 predictive genes in the model are projected in the first two principal components. The results are stratified vertically by compound classification accuracy and horizontally by compound class. Rug plots were added, so that compound positioning is more apparent with the color scaled according to the classifier score (darker marks suggest higher scores). In general, we see good separation between NGHC compounds (black points) and NH compounds (gray points), and this suggests the 22 predictive genes tended to separate the classes as expected. The NGHC compounds that are classified incorrectly (black points with gray borders) are generally in close proximity to the NH compounds, whereas the NH compounds that are classified incorrectly (gray points with black borders) have mixed proximity to other NGHC compounds.

FIG. 4. — Final model performance. (A) ROC plot for the evaluation signature set. The results on the independent signature set (multisite test set) are represented by the black line with points. Each point is derived from an identical compound generated at different sites and tested at different doses but summarized by the median signature score. Random chance is the diagonal dashed black line. The observed sensitivity and specificity derived from the independent test set is shown with the gray ‘X’. The black box captures the 95% through the 97.5% CI based on an exact test (Clopper-Pearson). FPR: false-positive rate. TPR: true-positive rate. Sensitivity and specificity curves are also evaluated in Supplementary figure 2. (B) 2 × 2 contingency tables for classification of independent and unique compounds. Sensitivity is the proportion of NGHCs correctly predicted positive. Specificity is the proportion of NHs correctly predicted negative. PPV, positive predictive value, is the proportion of samples with positive test results that are correctly diagnosed. NPV, negative predictive value, is the proportion of samples with negative test results that are correctly diagnosed. AUC with 95% confidence limits in brackets.
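The quantities defined in this caption can be computed as in the sketch below. The 2 × 2 counts are hypothetical: they were chosen only to be consistent with the reported point estimates and the 66 compounds, and the actual counts are those in Fig. 4B. The exact (Clopper-Pearson) limits are found here by bisection on binomial tail probabilities, which is one of several equivalent ways to obtain them.

```python
from math import comb

def binom_tail_ge(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))

def _solve_increasing(f, target, iters=100):
    """Bisection for f(p) = target, with f increasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence limits for a binomial proportion x/n."""
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: binom_tail_ge(x, n, p), alpha / 2)
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: binom_tail_ge(x + 1, n, p), 1 - alpha / 2)
    return lower, upper

def contingency_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, and NPV from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts (see lead-in); not taken from Fig. 4B.
tp, fn, tn, fp = 10, 5, 30, 21
metrics = contingency_metrics(tp, fn, tn, fp)
sens_ci = clopper_pearson(tp, tp + fn)
spec_ci = clopper_pearson(tn, tn + fp)
print(metrics)
print("sensitivity 95% CI:", sens_ci, "specificity 95% CI:", spec_ci)
```

With a small number of NGHCs the exact interval for sensitivity is wide, which is why the point estimates in the text should be read together with their confidence limits.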

FIG. 3. — Stratification of independent test compounds by class, day, and site. Each replicate from a given compound is represented by a single point, and class results are summarized using box plots. The boxes capture the middle 50% of the data. The classifier cutoff is represented using a horizontal line at 0.596. Compounds with a signature score greater than 0.596 are classified as positive (predicted NGHC). The plot shows that the independent data set is composed predominantly of samples from J&J at day 1.

To explore further the boundaries of use for the signature, we evaluated the dose and time dependence of the signature score. The effects of dose and time on signature predictions were best illustrated by the samples from a time-course study in male SD rats treated with methapyrilene at doses of 10 and 50 mg/kg/day. Methapyrilene administered at 10 mg/kg produced a time-dependent increase in the signature score; however, it was correctly predicted positive only at the later time points on days 10 and 14 (Table 3). By contrast, the high dose of methapyrilene at 50 mg/kg/day was correctly predicted positive at all time points. The signature score did not appear to increase over time as it was close to its maximum on day 2 and sustained above 0.7 throughout the course of treatment. Methapyrilene was also correctly predicted positive when tested by J&J at 200 mg/kg for 24 h (Table 3). These results suggest that the signature is sensitive to dose and time and that low-dose exposure and/or early time points may not be optimal to identify expression changes diagnostic of NGHCs. This is consistent with the fact that the training set was established using maximum tolerated doses and repeated daily doses for 5 days.

The ability of the signature to correctly classify samples from long-term treatments was investigated by evaluating the 90-day studies conducted by the National Toxicology Program (NTP). A comparison of time points within these long-term studies indicates that the 90-day samples typically produce classification results similar to those at the earlier time points (cf. days 2 or 14). Although many of the classification results from the 90-day NTP studies were incorrect (false positives), the consistency of the results suggests that the expression changes were conserved over time. Likewise, the long-term clofibrate diet study also indicated the classification results and expression changes were preserved over the extended course of treatment (Table 3). These limited results suggest that samples from both short-term and long-term repeat dose studies may have applicability to the signature.

Some hepatocarcinogens are thought to cause tumors secondary to hepatotoxicity and regenerative proliferation, raising the concern that the signature may be sensitive to false positives as a result of liver injury. Therefore, it was of interest to determine if there was an association between the signature score and the degree of hepatic damage. Rats treated with methapyrilene at both 10 and 50 mg/kg showed no difference in the degree of hepatotoxicity at the early time points as both groups showed minimal single cell necrosis, yet the signature scores were clearly distinct on days 2 and 6. The increasing severity of hepatic necrosis in the high-dose group at later time points also did not correlate with the signature score. Both low and high doses produced minimal to mild spindle cell proliferation on days 6 through 14, including biliary hyperplasia in the most severe instances in the high dose group (data not shown), yet this was not correlated with the signature score. These results suggest that proliferating cells and hepatotoxicity are unlikely to influence the signature scores and lead to false positives with hepatotoxic treatments. While the lack of a complete histological evaluation of all test samples precludes a more comprehensive analysis of this hypothesis, the negative signature scores for other known hepatotoxic compounds such as alpha-naphthylisothiocyanate, rotenone, valproic acid, or aflatoxin B₁ (Table 3) provide further evidence that hepatotoxic drug treatments are unlikely to produce false-positive predictions. This is consistent with results with the original Iconix microarray signature (Fielden et al., 2007), and the fact that hepatotoxic treatments were included in both classes of the training set to limit this possibility.

Signature Precision and Reproducibility

The precision of the TaqMan array was assessed by splitting RNA samples from a chronic clofibrate toxicity study into aliquots for evaluation at four different laboratories to assess site-to-site variation. As expected, the precision of the classifier score when measured across sites was excellent, as all four sites produced very similar expression results and signature scores (Supplementary figures 1A and 1B). In addition, it was of interest to determine the robustness of the predictive expression changes for compounds evaluated at different sites or dates; a given compound should be classified identically, regardless of where or when it was tested, provided the same dose and experimental protocol are used. Five compounds were tested at the same dose level in separate studies but at the same site (J&J), thus permitting an evaluation of reproducibility within a single laboratory. In all five cases, the biomarker predictions were concordant (amiodarone, erythromycin, ethinyl estradiol, flutamide, and tannic acid; Table 3). These results provide confidence that signature predictions should be similar when assessed under similar study conditions. A number of other compounds were tested at multiple sites, albeit with different study designs and doses, so reproducibility across sites could not be compared directly. For example, acetaminophen was tested at three different sites at 300 mg/kg for 4 days, 950 mg/kg/day for 1 and 5 days, and at a dietary exposure of 3000 ppm for 2, 14, and 90 days.
Acetaminophen was correctly predicted positive by the signature at 950 mg/kg/day and 3000 ppm at all time points, whereas the lower dose of 300 mg/kg/day for 4 days was predicted negative (Table 3). Additionally, the nongenotoxic hepatocarcinogen methapyrilene was correctly predicted positive by the signature at four different doses and across three different laboratories. The NH fluoxetine was also correctly predicted negative at three different doses and across two different laboratories (Table 3).

Evaluating Nongenotoxic Modes of Action

Previous microarray expression data on the Iconix samples (Fielden et al., 2007) demonstrated that hierarchical clustering of NGHCs across 37 signature genes could identify compounds with similar mode of action based on the similarity of their expression profiles. Although hierarchical clustering is an unsupervised clustering technique and therefore not a formal prediction, it can provide a visual but subjective means to evaluate novel compounds for potential modes of action that may contribute to a positive prediction and hepatocarcinogenicity. The 23 NGHCs in the training set were clustered across all 22 genes in the model (Fig. 5). A number of test compounds were included in the clustering to evaluate whether the signature genes could facilitate identification of known compounds with similar modes of action. The genotoxic hepatocarcinogens aflatoxin B₁ and N-nitrosodiethylamine dosed orally for 5 days clustered together and were distinct from other treatments. The next most similar expression profiles were a number of hepatotoxicants such as acetaminophen and chloroform, which appeared to be driven by induction of the oxidative stress–responsive gene Akr7a3. The test compounds pregnenolone-16alpha-carbonitrile, phenobarbital, and butylated hydroxytoluene clustered among other PXR and CAR agonists as expected, whereas the P450 inducers and Ah receptor agonists, beta-naphthoflavone and 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), clustered distinctly. Interestingly, fluconazole coclustered with diethylstilbestrol and norethindrone, which suggests fluconazole may have a similar mode of action for inducing hepatocarcinogenicity. It is notable that a number of PPARα agonists coclustered despite the fact that the 22 signature genes are not known for being associated with fatty acid metabolism. This cluster of PPARα agonists was also correlated to the profiles of bis(2-ethylhexyl)phthalate and pravastatin, which are also thought to activate PPARα (Chen et al., 2010). 
These results substantiate the utility of the 22 signature genes to identify putative modes of action for known or suspected hepatocarcinogens.

FIG. 5. — Hierarchical clustering of genotoxic and nongenotoxic hepatocarcinogens. The log₁₀ ratios of the 22 signature genes were calculated by comparing the expression in the treated rats relative to time-matched vehicle control rats. The genes and expression profiles were then hierarchically clustered. Clustering method: complete linkage. Distance measure: correlation. Green = upregulation; red = downregulation; black = no change. Absolute magnitude of expression change is provided in Supplementary table 1. Note. See online version for color version.

Evaluating Genotoxic Modes of Action

A number of genotoxic treatments were included in the test set to evaluate whether the signature detected expression changes that were common to all hepatocarcinogens regardless of mode of action. In male SD rats, oral administration of aflatoxin B₁ resulted in a negative prediction, whereas N-nitrosodiethylamine was predicted positive. Dietary exposure of male F344 rats to aflatoxin and N-nitrosodimethylamine resulted in a consistent positive prediction on days 2 through 90. This was unexpected as the model was trained to identify NGHCs. Whether genotoxic hepatocarcinogens cause prognostic expression changes similar to NGHCs is unclear and will require evaluation of a broader set of genotoxic compounds.

A number of genes on the array were chosen based on a previous study (Ellinger-Ziegelbauer et al., 2004) that demonstrated a strong and consistent upregulation of expression in response to genotoxic hepatocarcinogens, which suggested that they could be used to differentiate genotoxic from nongenotoxic modes of action. These genes include a number of p53- and DNA damage–responsive genes such as BTG2, CDKN1A, and MGMT, as well as a number of xenobiotic metabolism genes such as CES2 and ALDH1A1. As shown in Figure 6A, these genes were significantly induced by aflatoxin B₁ and diethylnitrosamine after 5 days of repeated daily dosing in male SD rats. By comparison, the NGHCs bezafibrate and TCDD do not consistently induce these genes after 5 days of repeated daily dosing (Fig. 6B), thus suggesting these genes could be used to differentiate genotoxic modes of action. However, a number of NGHCs were also found to induce many of these genes. Examples include hepatotoxic treatments such as methapyrilene and chloroform (Fig. 6C). The induction of these genes may be secondary to cytotoxicity and p53 activation rather than evidence of direct DNA damage. The NHs praziquantel and dichlorvos also induce a number of these genes (Fig. 6D). The weight of evidence would suggest these compounds are not genotoxic in vivo despite some conflicting reports (Booth et al., 2007; Montero and Ostrosky, 1997); however, there is no histological evidence of hepatotoxicity in these animals (data not shown). By evaluating the gene expression changes for these DNA damage–responsive genes, it may be possible to differentiate nongenotoxic from genotoxic modes of action. Histological changes in the samples would likely need to be taken into consideration when interpreting the potential of treatments to cause direct DNA damage in vivo based on the expression of these DNA damage–responsive genes.

FIG. 6. — Expression of DNA damage–responsive genes. Genes previously identified as being responsive to DNA damagers (Ellinger-Ziegelbauer *et al*., 2004; Table 1) were evaluated for their ability to differentiate genotoxic from nongenotoxic modes of action. Male SD rats were treated for 5 days with examples of (A) known genotoxic hepatocarcinogens, (B) known nongenotoxic hepatocarcinogens, (C) known nongenotoxic hepatocarcinogens at hepatotoxic doses, and (D) NHs with known (dichlorvos) or equivocal (praziquantel) genotoxic liabilities. Treatments were as described in Table 3. Fold induction was calculated by comparing the expression in the treated rats relative to vehicle-matched controls as described in the “Materials and Methods” section.

Role of Biomarker Genes in Neoplasia

A detailed gene literature survey using the BioBase Knowledge Library revealed 10 of 23 genes that were correlated or causally associated with neoplasia or cancer (Supplementary table 2), and 8 of the genes have “Cell growth/cycle/signal transduction” as the primary biological process category. A gene-by-gene characterization, though useful, may miss the possible interconnectivity of the signature genes. Therefore, IPA was used to analyze the 23 genes and generate enrichment scores (statistical significance) for a number of biological categories and canonical pathways as well as for deriving potential network relationships. This analysis revealed that of the top 10 significantly ranked biological categories associated with the 23 signature genes, 7 have an association with cell proliferation and cancer or processes that when dysregulated could theoretically lead to neoplasia (data available upon request to the author).

In order to investigate possible relationships that may underlie the 23 genes in the signature, an examination of potential transcriptional coregulation was conducted. A response element enrichment analysis of the proximal promoter regions of all 23 genes revealed that a number of TFREs were significantly enriched. The four most significantly enriched TFREs relative to the reference set were, in order of significance (all p < 0.05), AP-1, PBX-1, NFKB, and AHR. Although AP-1, NFKB, and AHR are associated with a general response to cellular stress and response to xenobiotics, the role of PBX1 (pre-B-cell leukemia homeobox 1) in the liver is unclear. This gene encodes a homeobox family transcription factor initially identified as a proto-oncogene associated with B-cell leukemia and has been reported to be required for the maintenance of hematopoiesis in the fetal liver and implicated in promoting hematopoietic progenitor cell expansion (DiMartino et al., 2001); however, it has not been reported to play a role in hepatocarcinogenesis. The genes harboring a PBX response element include not only the DNA damage–responsive genes Akr7a3, Aldh1a1, Tap1, Cdkn1a, and Ces2 but also genes originally identified in the Iconix (Cited4, Ica2) and J&J (Sel1l) signatures (Supplementary table 2).
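The statistical test behind the response element enrichment analysis is not specified in this section. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for this kind of over-representation analysis, but an assumption here), the enrichment of a single response element among the signature promoters relative to a reference promoter set could be computed as follows. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: promoters in the reference set
    K: reference promoters that carry the response element
    n: promoters in the signature gene set
    k: signature promoters that carry the response element
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of observing k or more element-bearing promoters
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts for illustration only (not from the study):
# 8 of 23 signature promoters carry the element, versus 1,000 of
# 10,000 promoters in an assumed reference set.
p_value = hypergeom_enrichment_p(k=8, n=23, K=1000, N=10000)
print(f"enrichment p = {p_value:.4g}")
```

When many response elements are screened at once, as in the analysis described above, a multiple-testing correction would normally be applied to the resulting p-values; whether one was used here is not stated.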

DISCUSSION

Our approach to improve human carcinogenicity risk assessment has focused on the development of biomarkers for the early prediction of NGHCs in rats and the simultaneous application of genomics to understand their potential modes of action, in order to enable a proactive human hepatocarcinogenicity risk assessment prior to initiation of the 2-year rodent bioassay. To this end, we have leveraged previously published genomic biomarker discovery efforts (Ellinger-Ziegelbauer et al., 2004; Fielden et al., 2007; Nie et al., 2006) to develop a signature on the TaqMan array card to facilitate prediction of NGHCs using data from short-term repeat dose rat toxicology studies. Together with the diagnostic expression profiles provided by the accompanying data set, the data also facilitate investigations into the potential modes of action.

Numerous efforts have attempted to discover and evaluate novel biomarkers to predict carcinogenicity of nongenotoxic carcinogens (Waters et al., 2010); however, the biomarkers were often derived from relatively small data sets and/or lacked adequate independent testing. As a result, these putative biomarkers may not be widely recognized or applicable outside their laboratory of origin. In response to these limitations, we have focused our efforts on deriving a signature using a large training set of 72 compounds and subsequently evaluating the performance of the signature on over 900 RNA samples representing 169 treatment groups (86 unique compounds, including 4 GHCs) from eight different research sites. This facilitated an estimation of the likely sensitivity and specificity when applied to different treatment protocols and allowed us to understand the strengths and limitations of the signature to help define its boundaries of use.

In general, a predictive model can only be expected to perform well on samples that resemble its training data. The training set was derived from a homogeneous data set that utilized a common rat strain (SD), gender (male), dose-setting criteria (maximum tolerated dose), time point (day 5), and RNA isolation procedure. Although the boundaries of use that are expected to maximize classification accuracy are likely to be defined by the training set, it was important to test these boundaries with an independent and heterogeneous data set as this would reflect real world application. In order to generate composite estimates of classification accuracy, it was convenient to merge compounds into a single score and remove overlapping compounds that were also utilized in the 72-compound training set. This resulted in 169 test treatments, generated with varying study protocols, which reduced to 66 unique predictions that included 15 NGHCs and 51 NHs. Using this method, the sensitivity, or true-positive rate, was 67% and the specificity, or true-negative rate, was 59%. Although the sensitivity may be considered acceptable, we were hampered by the relatively few (15) independent compounds available for testing and so these results should be viewed as preliminary. By contrast, a fair assessment of the true-negative rate was provided by 51 independent NH compounds. Although numerous false positives and negatives were identified, they appeared to be enriched in samples predominantly from the J&J and NTP data sets. This may not be surprising as the protocols used by these two sites differed dramatically from the training set. For example, the sensitivity and specificity of the signature against the J&J compounds alone were 38 and 61%, respectively, and a high number of false positives were observed when testing compounds in the NTP data set.
The AUC for the J&J compounds is right at random chance with a value of 0.49, but it is based on only eight positive samples, so the prevalence of NGHC compounds in the J&J and non-J&J data sets is quite different. In the end, the overall sensitivity estimate is driven by the non-J&J compounds whereas the specificity estimate is driven by the J&J compounds. The reasons for the incorrect classifications in the J&J and NTP data sets are possibly numerous. For example, the false positives in the J&J data set may be a result of testing samples obtained only 24 h after a single high dose, as detailed in Nie et al. (2006). The acute transcriptional response after the first dose is expected to be highly variable and may result in other compensatory changes that do not uniquely reflect the predictive changes that may persist during the course of repeated daily doses, as represented in the training set. Given that the training set is based on maximum doses that are tolerated for up to 5 days (see Fielden et al., 2007), it is likely that the optimal signature performance for this particular model would be obtained when following a similar dosing paradigm. This is exemplified by the 3/3 correct predictions from samples evaluated at Abbott where compounds were dosed for 5 or 6 days in SD rats. However, it is important to consider that the use of a maximally tolerated dose in the training set may be a detriment when the signature is applied to samples that have not achieved such a dose level.
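To make the composite estimates concrete, the sketch below computes sensitivity and specificity with exact (Clopper-Pearson) binomial confidence intervals. The counts are assumptions inferred from the reported figures: 67% of 15 NGHCs is taken to mean 10 correct calls, and 59% of 51 NHs is taken to mean 30 correct calls. The use of the Clopper-Pearson method is also an assumption; the interval method is not stated in this section. The output can be compared against the 95% confidence intervals given in the abstract.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for the proportion x/n."""

    def solve(f, target):
        # f(p) decreases as p increases; bisect for the p where f(p) == target
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= x) equals alpha / 2
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# Assumed counts, inferred from the reported percentages (not stated directly)
true_pos, n_nghc = 10, 15  # NGHCs correctly called positive
true_neg, n_nh = 30, 51    # NHs correctly called negative

sensitivity = true_pos / n_nghc
specificity = true_neg / n_nh
sens_ci = clopper_pearson(true_pos, n_nghc)
spec_ci = clopper_pearson(true_neg, n_nh)

print(f"sensitivity = {sensitivity:.0%}, 95% CI = {sens_ci[0]:.0%} to {sens_ci[1]:.0%}")
print(f"specificity = {specificity:.0%}, 95% CI = {spec_ci[0]:.0%} to {spec_ci[1]:.0%}")
```

The width of the sensitivity interval illustrates why an estimate based on only 15 positive compounds should be viewed as preliminary.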

The reason for the high number of false positives in the NTP data set is unclear, but we cannot rule out the possibility that it is due to the use of male F344 rats or the inclusion of primarily nontherapeutic chemicals that may have unique modes of action that are difficult to classify with the current model. This latter hypothesis seems unlikely, however, given the large training set that includes nontherapeutic compounds of diverse modes of action. Differences in RNA isolation procedures may also have affected the results. Additionally, it does not appear that false positives are generated by hepatotoxic treatments based on the differentiation of signature scores in the methapyrilene treatment groups and the results of other hepatotoxic treatments in the data set. One must also consider the score produced by the final model as it is likely that performance improves if one considers results that are farther away from the threshold that distinguishes NGHC and NH compounds (see Supplementary figure 7). These data show how samples generated by protocols distinct from the training set can result in poor signature performance and reinforce the concept that classification accuracy is likely to be optimal when samples are generated using protocols most similar to that of the training set. The results from both the evaluation study and the independent data set reinforce the protocol established by the Iconix training set as constituting the optimal boundaries of use. Based on these findings, a recommended protocol would include repeat daily dosing in male SD rats for ~5 days to generate data most comparable to the training set and maximize the potential benefit of this predictive assay. The number of animals per treatment group is recommended to be at least three, although it is recognized that more biological replication for test samples should improve the overall precision of the prediction.
The use of only two or three animals per group in the training set is unlikely to have negatively affected the performance of the signature because the initial model evaluation exercise resulted in a robust area under the ROC curve of 0.84. However, any increased precision afforded by more replication may improve the confidence in the predictions, particularly those close to the classification threshold.
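For readers less familiar with the area under the ROC curve, the sketch below shows one standard way to compute it: as the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. This is an illustration of the metric only, not the authors' analysis code, and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels use 1 for the positive class (e.g., NGHC) and 0 for the
    negative class (e.g., NH); tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical signature scores for illustration only (not from the study)
scores = [1.2, 0.8, 0.3, -0.1, 0.5, -0.4, -0.9, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why a value near 0.49 is described above as being at random chance.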

Comparison of the qRT-PCR–based model to the microarray-based model from previous publications (Fielden et al., 2007, 2008) showed that performance is largely preserved across platforms. The most instructive comparison is the similarity of AUC estimates derived on the test data from the process evaluation step. In that step, the AUCs derived on the Iconix test data from the microarray and qRT-PCR–based classifiers are 0.89 and 0.84, respectively. We compared the microarray and qRT-PCR–based models using 45 compounds in the independent signature evaluation step. In that context, the AUC estimates of the microarray and qRT-PCR–based models were 0.65 and 0.66, respectively. Although the scores are only moderately correlated, with a Pearson correlation coefficient of 0.5, the classification calls per model have ~73% overall agreement. This suggests that the two models have similar performance characteristics (see Supplementary Results for more information).
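The two cross-platform comparisons mentioned above, score correlation and agreement of classification calls, measure different things, and a moderate correlation can coexist with fairly high call agreement. A minimal sketch of both calculations follows. The scores are hypothetical, and the classification threshold of 0 is an assumption made for illustration; the actual threshold is not given in this section.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def call_agreement(x, y, threshold=0.0):
    """Fraction of compounds given the same call by both models.

    A score above the (assumed) threshold is treated as a positive call.
    """
    same = sum((a > threshold) == (b > threshold) for a, b in zip(x, y))
    return same / len(x)


# Hypothetical per-compound scores for illustration only (not from the study)
microarray_scores = [0.9, 0.4, -0.2, -0.7, 0.1, -0.3]
qpcr_scores = [0.6, -0.1, -0.4, -0.5, 0.3, 0.2]

r = pearson_r(microarray_scores, qpcr_scores)
agreement = call_agreement(microarray_scores, qpcr_scores)
print(f"Pearson r = {r:.2f}, call agreement = {agreement:.0%}")
```

Because calls depend only on which side of the threshold a score falls, disagreements are expected mainly for compounds scoring close to the threshold on either platform.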

The available test samples did not permit an appropriate evaluation of the effect of gender or strain on signature performance; however, it is plausible that pharmacological mechanisms linked to tumor induction in the rat (e.g., PPARα induction) will be adequately maintained across genders and strains of rat. Where these variables impact pharmacokinetics, a quantitative or qualitative difference in the gene expression profile and signature outcome may be anticipated, although we have not formally tested this possibility. In one case, N-vinylpyrrolidone-2 was evaluated at the same dose in both male and female rats and the signature scores were highly similar. It is noteworthy that a number of compounds found to induce liver tumors in female rats only were positively identified by the signature despite using male expression data (e.g., carbon tetrachloride). The inclusion of expression data in the training set from male rats treated with female-specific hepatocarcinogens, such as diethylstilbestrol, 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), or chloroform, may have helped in this regard and increased the sensitivity of the assay to detect potential NGHCs. The effect of gender or strain on signature performance should still be considered unknown and suitably accounted for in the interpretation of any testing.

In addition to evaluating the predictive accuracy of the signature, we demonstrated how the genomic data and the use of similarities in expression profiles could generate hypotheses for potential modes of action for NGHCs. Numerous examples exist that demonstrate the utility of toxicogenomic data to help understand mechanisms of carcinogenesis (Waters et al., 2010); however, it was unclear whether an expression profile of only 22 predictive genes would be sufficient to reveal information indicative of a compound's mode of action. It was surprising then that the expression profiles of well-characterized NGHCs with similar modes of action maintained a high degree of similarity. For example, the two genotoxic carcinogens aflatoxin B₁ and diethylnitrosamine were found to cluster most closely with each other even when clustered among NGHCs. This raises the possibility that compounds with genotoxic activity may be identified through clustering of expression profiles, in addition to specifically evaluating the induction of the DNA damage–responsive genes included on the TaqMan array. Although the interpretation of clustering patterns can be subjective, it nonetheless provides valuable clues to guide more definitive investigative work that can help explain the mode of action of a novel compound. More extensive evaluations using whole genome arrays can provide more data; however, this could make interpretation more difficult because a database of known reference expression profiles from which comparisons can be drawn would not be as readily available as it is with the TaqMan array data set described here.

To add weight of evidence for use of the signature as both a predictive and mechanistic tool, it was of interest to understand if the 22 biomarker genes had a functional role in carcinogenesis, proliferation, and/or related phenotypes. Although the gene members of an algorithm-derived classifier are selected based on performance optimization and assay design, there is an underlying assumption that their classifying power depends on a real, if not always obvious, connection to the underlying biology associated with the predicted phenotype. Failure to identify any known functional connection may cast doubt on the validity of the signature, although it is recognized that our knowledge of carcinogenesis and gene function is incomplete. Nonetheless, the combination of literature mining, pathway analysis, and transcription factor binding site analysis together provided support for a linkage between these genes and cellular processes associated with cell proliferation, growth regulation, injury repair, and cancer, all of which when dysregulated could lead to carcinogenesis. The possibility that compounds that induce liver tumors via a nongenotoxic mechanism may be eliciting a common transcriptional regulatory response, such as the possible activation of the homeobox transcription factor Pbx1, is an intriguing one that warrants further investigation. In any case, an incomplete understanding of the biology underlying the genes in the signature should not prohibit practical application of this tool.

By comparison to other approaches for predicting NGHCs, the predictive accuracy of the signature is greater than or comparable to that of histology-based endpoints that have been proposed and evaluated for their ability to predict carcinogenic outcome (Allen et al., 2004; Elcombe et al., 2002; Ito et al., 2003). The advantage of the current genomic approach is the ability to facilitate early and more efficient evaluation of molecules because it relies on a short-term repeat dose rat toxicity study rather than histological indices following chronic treatment (Allen et al., 2004; Elcombe et al., 2002) or a laborious initiation, partial hepatectomy, and promotion phase of treatment (Ito et al., 2003). This approach also provides a means to generate mechanistic information that other proposed predictive methods fail to provide (Contrera et al., 2003; Lee et al., 1995; Mauthe et al., 2001).

The challenge with evaluating this or other methods intended to predict carcinogenic outcome is the reliance on the rodent bioassay as the gold standard against which accuracy is defined. Due to the variable nature of the bioassay itself and the influences of dose, route of administration, strain, gender, and/or other experimental variables known to influence the outcome of the bioassay, the determined accuracy of the signature is subject to not only the intrinsic variation in the genomic assay but also the variation in the benchmark against which the signature is measured. As a result, we cannot discount the possibility that false positives reported by the signature are true signals or mechanistic events relevant to proliferative potential, which did not happen to materialize into a phenotypic effect in the rodent bioassay due to differences in the aforementioned variables used between the assays. Likewise, false negatives may arise when low doses or early time points are evaluated that do not produce drug exposures or cumulative effects sufficient to perturb the biomarker genes. Therefore, the sensitivity and specificity of the signature reported here are composite estimates and should be used as a guide rather than an absolute measure of performance. This could be said of other attempts to derive signatures predictive of hepatocarcinogenic activity (Ellinger-Ziegelbauer et al., 2008; Uehara et al., 2008), which notably have not been reported to be 100% accurate either. Perhaps training and test sets composed of samples from longer term treatments would result in gene expression changes that are more prognostic of chronic lifetime changes in carcinogenic outcome, although this would limit the value of obtaining early predictions and mechanistic data as presented here. In practice, each compound should be evaluated individually in light of its dose-response, concurrent pathology, genotoxic potential, and any mechanistic data available.

A comparison of the approach presented here with alternative predictive assays or approaches designed to predict nongenotoxic hepatocarcinogens reveals dramatic differences in the relative sensitivity and specificity for prediction, utility for screening, and the degree of mechanistic data provided. For example, the Ito Medium Term bioassay reveals a higher accuracy for prediction (92%) (Ito et al., 2003); however, it does not afford much mechanistic information or provide a means to rapidly screen compounds. Other methods relying on histological endpoints from chronic studies suffer from poor accuracy and low throughput and do not provide mechanistic insight (Allen et al., 2004; Elcombe et al., 2002). More recent efforts utilizing a similar genomic approach have reported favorable prediction accuracy (Ellinger-Ziegelbauer et al., 2008; Uehara et al., 2008); however, the reported performance should be viewed with caution because validation on a wider set of diverse samples has not been reported and the use of smaller training and test sets will increase the likelihood of bias in the performance estimates. Therefore, we believe the currently proposed assay system offers the advantages of reasonable predictive accuracy, moderate throughput, and a means to begin to understand mode of action.

Previous studies have illustrated the application of gene expression dose-response data to establish benchmark dose values for nongenotoxic carcinogens in order to determine a threshold, or point of departure, for risk assessment (Bercu et al., 2010; Thomas et al., 2007, 2010). These approaches utilized the dose-response of genes aggregated in pathways and Gene Ontology processes, which assume that changes in these groups of genes are key events in the mode of action for these carcinogens. In a similar manner, the genes in the current signature could be used to establish benchmark doses from short-term dose-response studies to estimate points of departure for nongenotoxic events driving hepatocarcinogenicity. Further evaluation would be needed to assess this possibility. Therefore, the outcome of this predictive assay should be viewed solely as a hazard identification tool. In this context of use, it is advantageous to consider compounds in the training set that cause liver tumors in any strain, gender, or dose in order to increase the sensitivity of the assay. False positives could be better tolerated in a predictive tool because the outcome would not necessarily limit development of a positive compound. Instead, a positive result would initiate investigations or development strategies to build a weight of evidence for carcinogenic risk and understand the potential modes of action before obtaining results from the 2-year rodent bioassay. Considering the frequency with which liver weight elevation and hepatocellular hypertrophy are observed in preclinical drug discovery, this approach may enable a rapid understanding of the potential mechanism(s) and relevance of the finding for humans. In addition to prospective applications in drug discovery and development, the signature would also be of use retrospectively when tumors or preneoplastic lesions are observed in chronic toxicology studies and a mechanistic understanding is needed to inform the risk to humans.
Additionally, it would be useful to differentiate and prioritize molecules when structurally related chemicals have been identified as having a hepatocarcinogenic risk.

Hepatic adenomas and carcinomas are the most frequent neoplastic lesions in the 2-year rodent bioassay (Gold et al., 2005); however, a broad range of tumor types is observed. Unfortunately, methods to predict carcinogenicity in tissues outside the liver still remain limited, although genomic approaches have shown promise for the prediction of lung carcinogens (Thomas et al., 2009). Previous results, albeit limited, have suggested that hepatic gene expression data may be predictive of carcinogenic potential in extrahepatic tissues (Nie et al., 2006). Although the biological rationale for how hepatic expression could predict carcinogenic outcome in other tissues is currently unclear, this possibility was intriguing because it would significantly expand the utility of the current genomic signature. As the current data set included a number of nongenotoxic carcinogens that caused tumors in tissues outside the liver, we applied the current signature to these compounds to test this hypothesis. The results, however, indicate that the current model trained to detect hepatocarcinogens is unable to accurately predict carcinogens in other tissues (data not shown). Alternative models trained specifically on nongenotoxic carcinogens from Table 2, regardless of target tissue, also failed to appreciably predict carcinogens from the independent data set (data not shown). It is likely that alternative approaches will be needed to identify extrahepatic carcinogens.

In summary, we have developed and extensively evaluated a hepatic gene expression-based signature for NGHCs on a moderate throughput, cost-effective, and well-validated TaqMan array platform using a training set derived from short-term rat toxicology studies and tested on a large heterogeneous test set. These results, in conjunction with previous publications demonstrating the predictive and mechanistic utility of the genes (Fielden et al., 2007, 2008; Nie et al., 2006), add to the weight of evidence demonstrating the practical application of genomic biomarkers for use in the assessment of potential hepatocarcinogens. The classification results on a large heterogeneous data set underscore the influence of protocol on the boundaries of use for the signature and the importance of utilizing samples that most closely follow the protocol established by the training set. Dissemination of the underlying expression data and commercial availability of the TaqMan array assay described here should facilitate further evaluation of this research tool.

SUPPLEMENTARY DATA

Supplementary data are available online at http://toxsci.oxfordjournals.org/.

ACKNOWLEDGMENTS

The authors would like to acknowledge Iconix Biosciences (now Entelos) for donating the liver RNA samples; Asuragen Services for performing experiments; Applied Biosystems (Life Technologies) for providing custom TaqMan arrays; Deepa Eveleigh, Sandi Calhoun, Michael McMillian, Joanne Tran, Rong Hu, Marnie Higgins-Garn, Rita Ciurlionis, and Olimpia Disorbo for laboratory support; Cassandra Mtine, Lindsay Lehman, Phil Rossi, and Elizabeth Walker for administrative support; and many others within the Predictive Safety Testing Consortium for constructive feedback and encouragement. This article may be the work product of an employee or group of employees of the National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health (NIH); however, the statements, opinions, or conclusions contained therein do not necessarily represent the statements, opinions, or conclusions of NIEHS, NIH, or the U.S. government. J.S. is an employee of Life Technologies, a company that sells the TaqMan array. A.A. and A.K. are employees of Asuragen, a company that offers gene expression and TaqMan array services.

FUNDING

Member contributions to the Predictive Safety Testing Consortium of the Critical Path Institute.

REFERENCES

Allen DG, Pearse G, Haseman JK, and Maronpot RR (2004). Prediction of rodent carcinogenesis: an evaluation of prechronic liver lesions as forecasters of liver tumors in NTP carcinogenicity studies. Toxicol. Pathol 32, 393–401.
Auerbach SS, Shah RR, Mav D, Smith CS, Walker NJ, Vallant MK, Boorman GA, and Irwin RD (2010). Predicting the hepatocarcinogenic potential of alkenylbenzene flavoring agents using toxicogenomics and machine learning. Toxicol. Appl. Pharmacol 243, 300–314.
Bercu JP, Jolly RA, Flagella KM, Baker TK, Romero P, and Stevens JL (2010). Toxicogenomics and cancer risk assessment: a framework for key event analysis and dose-response assessment for nongenotoxic carcinogens. Regul. Toxicol. Pharmacol 58, 369–381.
Booth ED, Jones E, and Elliott BM (2007). Review of the in vitro and in vivo genotoxicity of dichlorvos. Regul. Toxicol. Pharmacol 49, 316–326.
Brambilla G, and Martelli A (2009). Update on genotoxicity and carcinogenicity testing of 472 marketed pharmaceuticals. Mutat. Res 681, 209–229.
Chen HH, Chen TW, and Lin H (2010). Pravastatin attenuates carboplatin-induced nephrotoxicity in rodents via peroxisome proliferator-activated receptor alpha-regulated heme oxygenase-1. Mol. Pharmacol 78, 36–45.
Christensen FM, Eisenreich SJ, Rasmussen K, Sintes JR, Sokull-Kluettgen B, and Van de Plassche EJ (2011). European experience in chemicals management: integrating science into policy. Environ. Sci. Technol 45, 80–89.
Cohen SM (2004). Human carcinogenic risk evaluation: an alternative approach to the two-year rodent bioassay. Toxicol. Sci 80, 225–259.
Cohen SM (2010). Evaluation of possible carcinogenic risk to humans based on liver tumors in rodent assays: the two-year bioassay is no longer necessary. Toxicol. Pathol 38, 487–501.
Contrera JF, Matthews EJ, and Daniel Benz R (2003). Predicting the carcinogenic potential of pharmaceuticals in rodents using molecular structural similarity and E-state indices. Regul. Toxicol. Pharmacol 38, 243–259.
Davies TS, and Monro A (1995). Marketed human pharmaceuticals reported to be tumorigenic in rodents. J. Amer. Coll. Toxicol 14, 90–107.
DiMartino JF, Selleri L, Traver D, Firpo MT, Rhee J, Warnke R, O’Gorman S, Weissman IL, and Cleary ML (2001). The Hox cofactor and proto-oncogene Pbx1 is required for maintenance of definitive hematopoiesis in the fetal liver. Blood 98, 618–626.
Elcombe CR, Odum J, Foster JR, Stone S, Hasmall S, Soames AR, Kimber I, and Ashby J (2002). Prediction of rodent nongenotoxic carcinogenesis: evaluation of biochemical and tissue changes in rodents following exposure to nine nongenotoxic NTP carcinogens. Environ. Health Perspect 110, 363–375.
Ellinger-Ziegelbauer H, Gmuender H, Bandenburg A, and Ahr HJ (2008). Prediction of a carcinogenic potential of rat hepatocarcinogens using toxicogenomics analysis of short-term in vivo studies. Mutat. Res 637, 23–39.
Ellinger-Ziegelbauer H, Stuart B, Wahle B, Bomann W, and Ahr HJ (2004). Characteristic expression profiles induced by genotoxic carcinogens in rat liver. Toxicol. Sci 77, 19–34.
Fielden MR, Brennan R, and Gollub J (2007). A gene expression biomarker provides early prediction and mechanistic assessment of hepatic tumor induction by nongenotoxic chemicals. Toxicol. Sci 99, 90–100.
Fielden MR, Nie A, McMillian M, Elangbam CS, Trela BA, Yang Y, Dunn RT II, Dragan Y, Fransson-Stehen R, Bogdanffy M, et al. (2008). Interlaboratory evaluation of genomic signatures for predicting carcinogenicity in the rat. Toxicol. Sci 103, 28–34.
Gold LS, Manley NB, Slone TH, Rohrbach L, and Garfinkel GB (2005). Supplement to the Carcinogenic Potency Database (CPDB): results of animal bioassays published in the general literature through 1997 and by the National Toxicology Program in 1997–1998. Toxicol. Sci 85, 747–808.
Haseman JK, Huff JE, Zeiger E, and McConnell EE (1987). Comparative results of 327 chemical carcinogenicity studies. Environ. Health Perspect 74, 229–235.
Ito N, Tamano S, and Shirai T (2003). A medium-term rat liver bioassay for rapid in vivo detection of carcinogenic potential of chemicals. Cancer Sci 94, 3–8.
Jacobs A (2005). Prediction of 2-year carcinogenicity study results for pharmaceutical products: how are we doing? Toxicol. Sci 88, 18–23.
Jacobs A, and Jacobson-Kram D (2004). Human carcinogenic risk evaluation, part III: assessing cancer hazard and risk in human drug development. Toxicol. Sci 81, 260–262.
Kirkland D, Aardema M, Henderson L, and Muller L (2005). Evaluation of the ability of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens and non-carcinogens I. Sensitivity, specificity and relative predictivity. Mutat. Res 584, 1–256.
Kitchin KT, Brown JL, and Kulkarni AP (1993). Predicting rodent carcinogenicity of halogenated hydrocarbons by in vivo biochemical parameters. Teratog. Carcinog. Mutagen 13, 167–184.
Kitchin KT, Brown JL, and Kulkarni AP (1994). Complementarity of genotoxic and nongenotoxic predictors of rodent carcinogenicity. Teratog. Carcinog. Mutagen 14, 83–100.
Lee Y, Buchanan BG, Mattison DM, Klopman G, and Rosenkranz HS (1995). Learning rules to predict rodent carcinogenicity of nongenotoxic chemicals. Mutat. Res 328, 127–149.
Maronpot RR, Flake G, and Huff J (2004). Relevance of animal carcinogenesis findings to human cancer predictions and prevention. Toxicol. Pathol 32(Suppl. 1), 40–48.
Mauthe RJ, Gibson DP, Bunch RT, and Custer L (2001). The Syrian hamster embryo (SHE) cell transformation assay: review of the methods and results. Toxicol. Pathol 29(Suppl.), 138–146.
Melnick RL, Thayer KA, and Bucher JR (2008). Conflicting views on chemical carcinogenesis arising from the design and evaluation of rodent carcinogenicity studies. Environ. Health Perspect 116, 130–135.
Michel C, Roberts RA, Desdouets C, Isaacs KR, and Boitier E (2005). Characterization of an acute molecular marker of nongenotoxic rodent hepatocarcinogenesis by gene expression profiling in a long term clofibric acid study. Chem. Res. Toxicol 18, 611–618.
Montero R, and Ostrosky P (1997). Genotoxic activity of praziquantel. Mutat. Res 387, 123–139.
Nie AY, McMillian M, Parker JB, Leone A, Bryant S, Yieh L, Bittner A, Nelson J, Carmen A, Wan J, et al. (2006). Predictive toxicogenomics approaches reveal underlying molecular mechanisms of nongenotoxic carcinogenicity. Mol. Carcinog 45, 914–933.
Nioi P, Pardo ID, Sherratt PJ, Fielden MR, Gollub J, Nie A, and Snyder RD (2008). Prediction of non-genotoxic carcinogenesis in rats using changes in gene expression following acute dosing. Chem. Biol. Interact 176, 252–260.
Tatematsu M, Tsuda H, Shirai T, Masui T, and Ito N (1987). Placental glutathione S-transferase (GST-P) as a new marker for hepatocarcinogenesis: in vivo short-term screening for hepatocarcinogens. Toxicol. Pathol 15, 60–68.
Thomas RS, Allen BC, Nong A, Yang L, Bermudez E, Clewell HJ III, and Andersen ME (2007). A method to integrate benchmark dose estimates with genomic data to assess the functional effects of chemical exposure. Toxicol. Sci 98, 240–248.
Thomas RS, Bao W, Chu TM, Bessarabova M, Nikolskaya T, Nikolsky Y, Andersen ME, and Wolfinger RD (2009). Use of short-term transcriptional profiles to assess the long-term cancer-related safety of environmental and industrial chemicals. Toxicol. Sci 112, 311–321.
Thomas RS, Clewell HJ III, Allen BC, Wesselkamper SC, Wang NC, Lambert JC, Hess-Wilson JK, Zhao QJ, and Andersen ME (2010). Application of transcriptional benchmark dose values in quantitative cancer and noncancer risk assessment. Toxicol. Sci 120, 194–205.
Uehara T, Hirode M, Ono A, Kiyosawa N, Omura K, Shimizu T, Mizukawa Y, Miyagishima T, Nagao T, and Urushidani T (2008). A toxicogenomics approach for early assessment of potential nongenotoxic hepatocarcinogenicity of chemicals in rats. Toxicology 250, 15–26.
Vanparys P, Corvi R, Aardema M, Gribaldo L, Hayashi M, Hoffmann S, and Schechtman L (2011). ECVAM prevalidation of three cell transformation assays. ALTEX 28, 56–59.
Waites CR, Dominick MA, Sanderson TP, and Schilling BE (2007). Nonclinical safety evaluation of muraglitazar, a novel PPARalpha/gamma agonist. Toxicol. Sci 100, 248–258. [DOI] [PubMed] [Google Scholar]
Ward JM (2008). Value of rodent carcinogenesis bioassays. Toxicol. Appl. Pharmacol 226, 212. [DOI] [PubMed] [Google Scholar]
Waters MD, Jackson M, and Lea I (2010). Characterizing and predicting carcinogenicity and mode of action using conventional and toxicogenomics methods. Mutat. Res 705, 184–200. [DOI] [PubMed] [Google Scholar]
Whysner J, and Williams GM (1996a). D-limonene mechanistic data and risk assessment: absolute species-specific cytotoxicity, enhanced cell proliferation, and tumor promotion. Pharmacol. Ther 71, 127–136. [DOI] [PubMed] [Google Scholar]
Whysner J, and Williams GM (1996b). Saccharin mechanistic data and risk assessment: urine composition, enhanced cell proliferation, and tumor promotion. Pharmacol. Ther 71, 225–252. [DOI] [PubMed] [Google Scholar]
Yamasaki H, Ashby J, Bignami M, Jongen W, Linnainmaa K, Newbold RF, Nguyen-Ba G, Parodi S, Rivedal E, Schiffmann D, et al. (1996). Nongenotoxic carcinogens: development of detection methods based on mechanisms: a European project. Mutat. Res 353, 47–63. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary results

NIHMS2153729-supplement-Supplementary_results.doc (1.3 MB, doc)

Supplementary table 1

NIHMS2153729-supplement-Supplementary_table_1.xls (245.5 KB, xls)

Supplementary data file 004

NIHMS2153729-supplement-Supplementary_data_file_004.xls (33.5 KB, xls)

Supplementary materials and methods

NIHMS2153729-supplement-Supplemenary_materials_and_methods.doc (33.5 KB, doc)
