Scientific Reports. 2024 Aug 2;14:17974. doi: 10.1038/s41598-024-68717-9

A competing risks machine learning study of neutron dose, fractionation, age, and sex effects on mortality in 21,000 mice

Eric Wang 1, Igor Shuryak 1, David J Brenner 1
PMCID: PMC11297256  PMID: 39095647

Abstract

This study explores the impact of densely-ionizing radiation on non-cancer and cancer diseases, focusing on dose, fractionation, age, and sex effects. Using historical mortality data from approximately 21,000 mice exposed to fission neutrons, we employed random survival forest (RSF), a powerful machine learning algorithm accommodating nonlinear dependencies and interactions, treating cancer and non-cancer outcomes as competing risks. Unlike traditional parametric models, RSF avoids strict assumptions and captures complex data relationships through decision tree ensembles. SHAP (SHapley Additive exPlanations) values and variable importance scores were employed for interpretation. The findings revealed clear dose–response trends, with cancer being the predominant cause of mortality. SHAP value dose–response shapes differed, showing saturation for cancer hazard at high doses (> 2 Gy) and a more linear pattern at lower doses. Non-cancer responses remained more linear throughout the entire dose range. There was a potential inverse dose rate effect for cancer, while the evidence for non-cancer was less conclusive. Sex and age effects were less pronounced. This investigation, utilizing machine learning, enhances our understanding of the patterns of non-cancer and cancer mortality induced by densely-ionizing radiations, emphasizing the importance of such approaches in radiation research, including space travel and radioprotection.

Subject terms: Cancer, Computational biology and bioinformatics

Introduction

Densely-ionizing radiations, such as neutrons, are encountered in various fields, including space exploration, nuclear safety, and therapeutic applications1–7. While much of the existing research, including our own previous studies like Wang et al.8, has focused on the carcinogenic effects of these radiations9–13, there is a growing need to understand their impact on non-cancer diseases14–20. These diseases, particularly those affecting the cardiovascular and nervous systems, can contribute significantly to radiation-induced mortality, and understanding them is crucial for designing effective radioprotection strategies20–25.

As humans continue to explore space and undertake long-distance missions to other planets, understanding the effects of space radiations becomes increasingly important to ensure the health and safety of astronauts. In particular, understanding the effects of dose fractionation and protraction on non-cancer diseases induced by densely ionizing radiations, such as those found in space, is crucial for assessing whether prolonged radiation exposure over many months or years can be more (or less) detrimental than short-term exposure to a similar dose. Investigating the role of sex in radiation sensitivity to non-cancer and cancer diseases is also very important for estimating the risks to female vs. male personnel, such as astronauts, involved in activities where exposures to densely ionizing radiations occur. More research in this area can fill this knowledge gap and improve our overall understanding of the health effects of radiation, which is particularly important for public health and radiation risk assessment.

The Janus archive26 is a comprehensive dataset that provides unique insights into the impact of densely ionizing radiation on mortality from cancers and various organ dysfunctions. The data is based on lifespan studies of laboratory rodents, specifically B6CF1 mice, exposed to fission neutrons. Neutrons can be considered a form of densely-ionizing radiation, whereas photons (X or gamma rays) with energies typically used for animal experiments can be considered sparsely-ionizing. These radiation types deposit energy differently on a microscopic level due to their distinct interactions with matter. Neutrons are uncharged particles and do not interact with orbital electrons. They can travel considerable distances through matter without interacting. Neutrons interact with atomic nuclei through several mechanisms, including elastic scatter, inelastic scatter, nonelastic scatter, neutron capture, and spallation. For instance, when a fast neutron hits a hydrogen nucleus (proton), the proton can recoil with a portion of the neutron’s energy. The recoil protons then lose their energy when they pass through the material. In contrast, high-energy photons deposit their energy primarily through interactions with electrons. When a gamma ray passes through matter, it can interact with an electron and transfer some or all of its energy to that electron. The excited electron then causes ionization or excitation of atoms in the material, leading to energy deposition. Thus, while gamma rays deposit energy by interacting with electrons, neutrons deposit energy primarily by interacting with atomic nuclei, often generating recoil protons in the process.

Due to these differences in physical properties, the radiobiology of neutrons and photons also differs significantly. Neutrons are generally more biologically hazardous than photons. For instance, if neutrons contribute 10–20% of the physical dose to irradiated individuals, they will undoubtedly play a major role in the resulting biological effects27. In fact, it is estimated that roughly half of the biological effect observed will be due to neutrons, with the other half due to photons28. The Relative Biological Effectiveness (RBE) of neutron versus photon exposures changes based on the biological endpoint, neutron energy, and neutron dose27,29. Neutrons can also result in an inverse dose rate effect, meaning that reducing the dose rate (prolonging the exposure time) can increase the biological damage per unit dose29,30. This contrasts with photons, for which the damage per unit dose generally decreases when the dose rate is reduced.

Specifically for the Janus experimental series, numerous studies examined overall life shortening in mice as a function of radiation dose and/or radioprotective chemical agents, e.g.26,31,32. Comparisons between different rodent species were also performed33. While the majority of deaths in Janus mice can be attributed to neoplasms, a large number of mouse records still list death from non-cancer diseases, which are important to investigate as well34. For example, there is a growing concern that non-cancer diseases, especially cognitive dysfunction and other central nervous system damage, pose a significant impediment to human-crewed space exploration missions to Mars and other planets14,17,18,35.

Most radiobiological studies conducted so far relied on traditional parametric statistical models such as Cox regression. Such models rely on a set of strict assumptions, such as proportional hazards, where the hazard ratio should be constant across time, and linearity, where the relationship between the log hazard and the covariates should be linear. Somewhat more complex models, which are used when competing risks from different diseases (e.g. cancer vs. non-cancer) are investigated, such as the Fine-Gray model36, also rely on a set of assumptions, e.g. that the effects of covariates are proportional on the subdistribution hazard scale. These assumptions are often violated in complex data sets found in radiation research, where several variables (e.g. dose fractionation, biological and demographic characteristics of the subjects) can affect the survival time and can interact in complicated ways. State-of-the-art machine learning (ML) approaches provide a much more flexible alternative for modeling such data using fewer assumptions, and hold great promise for enhancing our understanding of the intricacies of radiation effects and their interactions with biological variables. Here we use such ML methods to investigate the impact of densely-ionizing radiation (neutrons) on non-cancer and cancer diseases in the large mouse studies from the Janus archive, focusing on dose, fractionation, age, and sex effects. In our previous work on this data set (Wang et al.8) we modeled all-cause mortality, much of which is attributable to cancer, whereas in the current paper we model non-cancer and cancer diseases separately as distinct causes of death, treating them as competing risks. The competing risks approach necessitated the use of a different machine learning method than for all deaths, as described in more detail below in the Methods section.

Methods

Data collection and selection

Our study utilized a detailed dataset of mice from the Janus archive26. We selected individual-level data on mice exposed to fission neutrons or unexposed controls (labeled as “C” for Control, or “N” for neutron), focusing on variables like radiation dose, duration, sex, and age at exposure. Records with incomplete data or non-neutron irradiation methods were excluded for consistency. We only considered mice of the Mus musculus species, specifically the B6CF1 strain. Additionally, we only included mice that were autopsied and had a cause of death listed as either “Died” or “Sacrifice, moribund”. This ensured that the recorded deaths were not due to scheduled sacrifice, transfer to another experiment, cannibalism, or unknown causes. We excluded animals that were exposed solely to gamma radiation. The irradiation setup for the Janus experiments primarily involved exposure to fission neutrons, with a minor component of gamma radiation. The gamma-ray doses were consistently low and considered minor contaminants relative to the neutron doses. For our analysis, we focused on neutron doses (excluding the gamma ray component) because the gamma-ray contribution was minimal and did not significantly influence the outcomes. Specifically, the gamma-ray component typically constituted less than 3% of the total dose. This small proportion ensures that the observed effects are predominantly due to neutron exposure. Furthermore, for any entry where the dose was recorded as 0, we automatically set the number of fractions and duration of fraction to 0.

Additionally, we classified the mice into two groups: cancer vs. non-cancer diagnoses, based on the diagnosis codes and descriptions provided in the Janus archive documentation. The list of non-cancer diagnoses can be seen in Table S1 and Fig. S1 from the supplementary material, with the five most commonly known diagnoses being hemothorax/ascites, pneumonitis, pneumonia, anemia, and hydronephrosis.

The resulting dataset was further preprocessed to ensure accuracy and consistency using the R programming language (RStudio, Version 2023.06.0+421)37. The dataset was divided into a training set (75%) and a testing set (25%) to ensure robust model training and validation.
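
As an illustration only, a 75%/25% split of this kind could look like the Python sketch below. The text does not say how the split was drawn, so the simple random shuffle, the fixed seed, the function name `train_test_split`, and the integer `mouse_ids` are all assumptions made for this example rather than details of the authors' R workflow.

```python
import random

def train_test_split(records, train_fraction=0.75, seed=0):
    """Shuffle records reproducibly and split them into training and testing lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(records)           # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical stand-in for the mouse records: plain integer IDs.
mouse_ids = list(range(1000))
train, test = train_test_split(mouse_ids)
print(len(train), len(test))
```

Every record lands in exactly one of the two sets, so the testing set stays unseen during model fitting.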

Data analysis and modeling

The cancer and non-cancer disease categories in this data set represent competing risks (causes) of death. Competing risk analysis is a special type of survival analysis that aims to correctly estimate the marginal probability of an event in the presence of competing events. Traditional methods to describe the survival process, such as the Kaplan–Meier method, are not designed to accommodate the competing nature of multiple causes to the same event, and tend to produce inaccurate estimates when analyzing the marginal probability for cause-specific events. To address this issue, the cause specific cumulative hazard function (CSCHF) and the cumulative incidence function (CIF) are generally used in competing risks analysis.

To perform competing risks modeling of the Janus mouse data, we initially considered the Fine-Gray regression model implemented in the cmprsk R package36,38. The model is based on the concept of the subdistribution hazard, which is used to model the CIF of an event of interest. The Fine-Gray model has several key assumptions: (1) The hazard of each event is proportional. (2) The occurrence of a competing event does not provide any information about the hazard of the event of interest. (3) The covariates have a multiplicative effect on the subdistribution hazard. However, these assumptions (especially the first one) were significantly violated in all our attempts to use the Fine-Gray model (with different covariate selection and/or interaction terms) on the Janus data. In addition, this model type has known limitations, e.g. the sum of the lifetime probabilities of the individual events can exceed 1, and interpretation of the model’s coefficients is not always straightforward.
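
For reference, the standard textbook form of the Fine-Gray model can be written as below. The notation is not taken from the paper and is introduced here only for illustration: T is the time of death, ε is the cause of death, k indexes the cause (cancer or non-cancer), and x is the covariate vector.

```latex
% Cumulative incidence function (CIF) for cause k
F_k(t) = \Pr(T \le t,\ \varepsilon = k)

% Subdistribution hazard for cause k
\lambda_k(t) = -\frac{d}{dt}\,\log\bigl(1 - F_k(t)\bigr)

% Proportional subdistribution hazards model and the CIF it implies
\lambda_k(t \mid x) = \lambda_{k0}(t)\,\exp\bigl(\beta_k^{\top} x\bigr)
\quad\Longrightarrow\quad
F_k(t \mid x) = 1 - \exp\!\Bigl(-\Lambda_{k0}(t)\,e^{\beta_k^{\top} x}\Bigr),
\qquad
\Lambda_{k0}(t) = \int_0^t \lambda_{k0}(s)\,ds
```

Because each cause-specific CIF is modeled by its own equation, nothing in this formulation forces the CIFs for all causes to sum to at most 1, which is the limitation noted above.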

For these reasons, we switched to a machine learning analysis method: random survival forests (RSF) for competing risks39, implemented in the randomForestSRC R package. RSF, a non-parametric machine learning method, constructs an ensemble of decision trees to model survival outcomes, proficiently managing censored data without strict assumptions on covariate effects39. This approach is well-suited for untangling nonlinear relationships and interactions among predictors in the presence of competing events such as cancer and non-cancer mortality. The ‘randomForestSRC’ package specifically accommodates the subtleties of competing risks, estimating cumulative incidence functions and providing variable importance metrics that reflect each predictor’s influence on the outcome. Notably, RSF’s ensemble nature not only bolsters predictive accuracy but also offers inherent validation through out-of-bag (OOB) error estimates and concordance indices, critical for confirming the model's prognostic integrity. OOB error estimates are a measure of prediction error for a random forest model, calculated using the subset of data not included in the bootstrap sample for each tree. This method provides an unbiased estimate of the model’s performance without the need for a separate validation set. By leveraging RSF, we aim to enhance the predictive modeling of radiation-induced mortality risks, offering a nuanced analysis that traditional parametric models may not sufficiently provide.
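
The out-of-bag idea can be sketched in a few lines of Python. This is a toy illustration that assumes classic bootstrap sampling with replacement; the function `bootstrap_and_oob` and the sizes used are hypothetical, and it does not reproduce the sampling settings of the randomForestSRC package.

```python
import random

def bootstrap_and_oob(n, rng):
    """Draw a bootstrap sample of size n; return (in-bag indices, out-of-bag indices)."""
    in_bag = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
    oob = sorted(set(range(n)) - set(in_bag))       # rows this tree never saw
    return in_bag, oob

rng = random.Random(42)
n_obs, n_trees = 2000, 50                           # hypothetical sizes
oob_fractions = []
for _ in range(n_trees):
    in_bag, oob = bootstrap_and_oob(n_obs, rng)
    oob_fractions.append(len(oob) / n_obs)

mean_oob = sum(oob_fractions) / n_trees
print(f"mean OOB fraction per tree: {mean_oob:.3f}")
```

With sampling with replacement, roughly a third of the observations are left out of each tree, so every observation can be predicted by the trees that did not train on it.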

The random survival forest algorithm

Decision trees are the fundamental building blocks of the Random Survival Forest (RSF) model. A decision tree splits data into branches to make predictions based on certain conditions. Each internal node represents a decision based on a feature, each branch represents the outcome of that decision, and each leaf node represents a final prediction. Each tree in the RSF is constructed using a bootstrap sample of the data. The tree splits the data at each node based on a feature (or predictor variable) that maximizes the separation of survival times or event occurrences (cancer or non-cancer death, in our case). A feature, in the context of machine learning and decision tree learning, refers to an individual measurable property or characteristic of the data being analyzed (i.e. a predictor variable like radiation dose, age at irradiation). Splitting continues until a specified minimum node size is reached, resulting in terminal nodes, or leaves, that represent the predicted outcome for the observations within that leaf. In our analysis, we use regression trees rather than classification trees. Regression trees predict a continuous outcome (e.g., survival time) rather than a categorical outcome (e.g., death or no death). Each node in a regression tree is associated with a predicted survival outcome based on the mean of the observations within that node. The “random” aspect of RSF comes from two sources: (1) each tree is trained on a different bootstrap sample of the data, and (2) at each split in the tree, a random subset of the features is considered for splitting. This randomness helps to ensure that the model is not overly dependent on any single feature and improves the generalizability of the model. Features that frequently appear near the top of the trees and contribute to significant splits are often those with higher VIMP and SHAP scores (see below for definitions), indicating their strong influence on the survival outcomes being modeled.

The RSF modeling results were visualized through the Cause-Specific Cumulative Hazard Function (CSCHF) and Cumulative Incidence Function (CIF) plots. CSCHF describes the rate of death from one event type in the presence of others, while CIF estimates the marginal probability of a death as a function of its cause-specific probability and overall survival probability. Both metrics are crucial for understanding the survival experience of multiple competing events and provide more interpretable estimates compared to traditional survival analysis methods.
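
To make the two quantities concrete, the sketch below computes the standard nonparametric estimators on a tiny invented data set: the Nelson-Aalen estimator for the CSCHF and the Aalen-Johansen estimator for the CIF. It operates on a pooled sample rather than on forest terminal nodes. The function name, the coding of causes (0 = censored, 1 = cancer, 2 = non-cancer), and the six example records are assumptions made for illustration.

```python
def competing_risk_curves(times, causes, n_causes=2):
    """Nelson-Aalen cause-specific cumulative hazards and Aalen-Johansen CIFs.

    causes: 0 = censored, 1..n_causes = cause of death.
    Returns one (time, CSCHF dict, CIF dict, overall survival) tuple per event time.
    """
    event_times = sorted({t for t, c in zip(times, causes) if c > 0})
    cum_hazard = {k: 0.0 for k in range(1, n_causes + 1)}
    cum_inc = {k: 0.0 for k in range(1, n_causes + 1)}
    surv = 1.0                                   # overall survival just before t
    rows = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_total = 0
        for k in cum_hazard:
            d_k = sum(1 for u, c in zip(times, causes) if u == t and c == k)
            cum_hazard[k] += d_k / at_risk       # cause-specific hazard increment
            cum_inc[k] += surv * d_k / at_risk   # survive to t, then die of cause k
            d_total += d_k
        surv *= 1.0 - d_total / at_risk          # update overall survival after t
        rows.append((t, dict(cum_hazard), dict(cum_inc), surv))
    return rows

# Invented example: ages at death (days) and causes for six animals.
times = [2, 3, 3, 5, 7, 8]
causes = [1, 2, 1, 0, 1, 2]
for t, hazard, cif, surv in competing_risk_curves(times, causes):
    print(t, hazard, cif, round(surv, 4))
```

Each CIF increment is the overall survival probability just before that time multiplied by the cause-specific hazard at that time, so the CIFs for all causes plus the remaining survival probability always sum to 1.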

Variable Importance (VIMP) scores measure the importance of a variable in predicting each outcome (cancer and non-cancer). VIMP is calculated by measuring the increase in prediction error when the feature's values are randomly permuted. The greater the increase in prediction error, the more important the feature is considered. Large positive VIMP values indicate strong predictive importance. VIMP values around 0.03 suggest that the variables provide some predictive power and should not be disregarded, especially in the context of complex interactions within the model39.
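
The permutation idea behind VIMP can be sketched as follows. This is a generic illustration, not the randomForestSRC computation. It swaps in mean squared error for the survival-specific error measure, and the toy model, the feature names, and the simulated rows are hypothetical.

```python
import random

def mean_squared_error(predict, rows, targets):
    """Average squared difference between predictions and targets."""
    return sum((predict(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, feature, rng):
    """Increase in prediction error after randomly shuffling one feature column."""
    baseline = mean_squared_error(predict, rows, targets)
    shuffled_col = [r[feature] for r in rows]
    rng.shuffle(shuffled_col)                        # break the feature-outcome link
    permuted = [{**r, feature: v} for r, v in zip(rows, shuffled_col)]
    return mean_squared_error(predict, permuted, targets) - baseline

# Toy setting: the outcome depends on dose only, and the "model" knows this.
rng = random.Random(1)
rows = [{"dose": rng.uniform(0, 3), "sex": rng.choice([0, 1])} for _ in range(500)]
targets = [2.0 * r["dose"] for r in rows]
predict = lambda r: 2.0 * r["dose"]

vimp_dose = permutation_importance(predict, rows, targets, "dose", rng)
vimp_sex = permutation_importance(predict, rows, targets, "sex", rng)
print(f"dose: {vimp_dose:.3f}, sex: {vimp_sex:.3f}")
```

Shuffling a feature the model relies on raises the error, while shuffling an unused feature leaves it unchanged. Values near zero or slightly negative therefore indicate little predictive contribution.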

SHAP (SHapley Additive exPlanations) values provide a consistent and interpretable way to distribute a model’s prediction among its features, offering both local and global insights40. These values decompose a prediction into contributions from each feature, maintaining additivity and consistency, and they have the same units as the outcome variable. Higher SHAP values indicate a greater influence on the model’s prediction, with increasing SHAP values corresponding to increasing evidence for radiation risk. Based on the principles of cooperative game theory, specifically the Shapley value concept, SHAP values fairly allocate the model’s prediction (“payout”) among the features (“players”). The calculation involves determining a baseline prediction (the average prediction when no features are present), computing the marginal contribution of each feature by considering all possible combinations of features, and averaging these contributions to obtain the Shapley value. We calculated SHAP values with the fastshap R package, a model-agnostic tool that efficiently computes approximate SHAP values for any supervised learning model, using 20 Monte Carlo repeats. More Monte Carlo repeats enhance the stability and accuracy of the SHAP value estimates.
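
A Monte Carlo approximation of a Shapley value can be sketched as below. It follows the general permutation-sampling idea and is not the fastshap implementation. The function `mc_shap`, the linear `toy_model`, and the background rows are hypothetical and chosen only to keep the example small.

```python
import random

def mc_shap(predict, x, background, feature, n_repeats, rng):
    """Monte Carlo estimate of one feature's Shapley value for instance x."""
    names = list(x)
    total = 0.0
    for _ in range(n_repeats):
        z = rng.choice(background)        # random reference row
        order = names[:]
        rng.shuffle(order)                # random order in which features "join"
        before = set(order[:order.index(feature)])
        # Hybrid rows: features in `before` take x's values, the rest take z's.
        with_f = {n: (x[n] if n in before or n == feature else z[n]) for n in names}
        without_f = {n: (x[n] if n in before else z[n]) for n in names}
        total += predict(with_f) - predict(without_f)   # marginal contribution
    return total / n_repeats

def toy_model(row):
    """Hypothetical linear risk score used only for this illustration."""
    return 1.5 * row["dose"] + 0.2 * row["age"]

rng = random.Random(0)
background = [{"dose": rng.uniform(0, 3), "age": rng.uniform(1, 5)} for _ in range(200)]
x = {"dose": 2.0, "age": 1.0}
for name in x:
    print(name, round(mc_shap(toy_model, x, background, name, 20, rng), 3))
```

Each repeat yields one sampled marginal contribution, and averaging over more repeats reduces the sampling noise in the estimate.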

Handling of death from cancer or non-cancer causes

The RSF model addresses competing risks by simultaneously considering multiple types of events—in this case, cancer and non-cancer mortality. Each tree within the forest is built to partition the data based on the feature that most effectively separates the time to each type of event. This process involves:

  1. Splitting Criteria: At each node, the model evaluates the potential splits using a modified log-rank test that accounts for the presence of competing risks. This allows the tree to choose the split that best separates the causes of death (cancer vs. non-cancer).

  2. Segmentation of Age and Dose: The model segments the age at death and radiation dose continuously. For example, a node might split the data based on whether the age at death is above or below a certain threshold, or whether the neutron dose exceeds a specific value. These splits create branches that reflect different risk profiles for cancer and non-cancer outcomes.

  3. Node Predictions: Each terminal node in the tree represents a subset of the data with similar characteristics and predicted outcomes. The predicted survival function for the observations in a terminal node is based on the cumulative incidence function (CIF) specific to that node, taking into account the distribution of cancer and non-cancer events.

  4. Ensemble Predictions: The RSF combines predictions from all trees in the forest to estimate the overall CIF for each cause of death. This ensemble approach enhances the model’s ability to predict the risk of each type of event across different age intervals and dose levels.

By using this approach, the RSF model can effectively handle the complex interplay between different predictors and outcomes, providing a nuanced analysis of how age at death, radiation dose, and other factors influence cancer and non-cancer mortality. This methodology leverages the strengths of RSF to model non-linear effects and interactions, offering robust predictions and valuable insights into the competing risks framework39.

Evaluation and interpretation of the RSF model

The RSF results (described below in the “Results” section) did not appear sensitive to details of the tree splitting criteria used in the RSF analysis, or to the number of trees in the forest. In the randomForestSRC R package, splitrule=“logrankCR” (default) represents a modified weighted log-rank splitting rule modeled after Gray’s test. It is used to find all variables that are informative and when the goal is long term prediction. By comparison, splitrule=“logrank” represents weighted log-rank splitting where each event type is treated as the event of interest and all other events are treated as censored. It is used to find variables that affect a specific cause of interest (non-cancer in this case) and when the goal is a targeted analysis of a specific cause. The RSF version for which results are shown here used splitrule=“logrankCR” with ntree = 1000, but we also tried cause = 2 (non-cancer, whereas 1 = cancer) and splitrule=“logrank” options, with ntree varied between 100 and 2000, and obtained similar concordance score results.

To evaluate the predictive accuracy of our competing risks model, we relied on the concordance score, commonly referred to as the concordance index (C-index) or the concordance probability estimate. The concordance score evaluates how well the model ranks the survival times of pairs of subjects. A pair is comparable when the subject with the shorter observed time experienced the event. A comparable pair is concordant if the model assigns the higher predicted risk to the subject who experienced the event earlier, and discordant if it assigns the higher predicted risk to the subject who survived longer, which indicates a disagreement between the model's prediction and the observed outcomes. The concordance index is calculated as the proportion of concordant pairs among all comparable pairs (concordant and discordant). The index ranges from 0 to 1, where 1 indicates perfect concordance (ideal ranking of survival times), and 0.5 represents a model that performs no better than random chance.
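
A minimal Harrell-style version of this calculation is sketched below. As simplifying assumptions, it treats deaths from the other cause as censored and counts tied risk scores as half-concordant. It is not the cause-specific concordance routine used by the randomForestSRC package, and the four example subjects are invented.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs that the risk scores rank correctly.

    events: 1 = event of interest observed, 0 = censored.
    A higher risk score should go with an earlier event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: subject i had the event strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0          # correctly ranked
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5          # tie in predicted risk
    return concordant / comparable

# Invented example: ages at death (days), event indicators, predicted risks.
times = [100, 200, 300, 400]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_ = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
print(perfect, reversed_, constant)
```

A perfectly ordered set of risk scores gives 1, a perfectly reversed set gives 0, and uninformative (constant) scores give 0.5.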

Results

Overview of variables in the data set

In this study, we separated the large number of mouse records identified from the Janus database into two groups—cancer and non-cancer—according to diagnoses made at the time of death. The purpose was to model these outcomes as competing risks and investigate how neutron dose, fractionation, age and sex of the mice influenced the hazard and incidence functions for these two disease categories. The list of non-cancer diagnoses is detailed in Table S1 and Fig. S1 from the supplementary material, with hemothorax/ascites, pneumonitis, pneumonia, anemia, and hydronephrosis being the five most prevalent conditions.

Summary statistics for key variables are provided in Table 1, separately for mice that died of cancer and mice that died of non-cancer causes; variable meanings are given in the Table 1 caption. As expected, the variables related to experimental conditions (dose, number of fractions, duration of fraction, treatment age) had similar distributions for the cancer and non-cancer mouse groups. However, the median age at death was considerably lower for the non-cancer group (815 days) than for the cancer group (913 days; Table 1). Due to the large size of the data set, this difference in age at death distributions is highly statistically significant by Wilcoxon rank sum test with continuity correction (W = 21,971,395, p-value < 2.2 × 10⁻¹⁶). A more detailed look at these age at death distributions by disease category is illustrated in Fig. 1. A visual analysis of the figure reveals that cancer is the predominant cause of mortality, with its distribution peaking around the middle age range of the cohort. In contrast, non-cancer deaths are less frequent and display a distribution that suggests a relatively earlier onset of mortality, but a wide range. The overlap of the distributions indicates a range of ages where both cancer and non-cancer deaths occur, while also highlighting the differing frequency and age distribution patterns between the two event types. This figure underscores the importance of distinguishing between cancer and non-cancer outcomes when considering the lifespan and mortality risks in populations exposed to radiation.

Table 1.

Summary table of variables in both the cancer and non-cancer data sets.

Variable | Minimum | 25th percentile | Median | Mean | 75th percentile | Maximum
Cancer (19,441 total mice)
 Treatment age (days) | 93.0 | 108.0 | 113.0 | 128.3 | 116.0 | 519.0
 Age at death (days) | 144.0 | 770.0 | 913.0 | 903.7 | 1048.0 | 1517.0
 Dose (Gy) | 0.00 | 0.00 | 0.1884 | 0.5519 | 0.7536 | 3.2379
 Number of fractions | 0.00 | 0.00 | 1.00 | 18.01 | 24.00 | 180.00
 Duration of fraction (min) | 0.00 | 20.00 | 20.00 | 61.56 | 45.00 | 1320.00
Non-cancer (1,854 total mice)
 Treatment age (days) | 93.0 | 108.0 | 114.0 | 128.8 | 117.0 | 519.0
 Age at death (days) | 113.0 | 613.5 | 815.0 | 809.0 | 1002.8 | 1498.0
 Dose (Gy) | 0.00 | 0.00 | 0.1385 | 0.6654 | 1.5072 | 3.1877
 Number of fractions | 0.00 | 0.00 | 1.00 | 16.2 | 24.00 | 156.00
 Duration of fraction (min) | 0.00 | 20.00 | 20.00 | 63.83 | 45.00 | 1320.00

The variable meanings are as follows. Treatment Age, mouse age at the start of irradiation. Age at Death, mouse age at death. Dose, neutron dose. Number of Fractions, number of fractions into which the dose was split. Duration of Fraction, duration of each dose fraction.

Figure 1.


Frequency distribution of age at death for two distinct event types among the mice: cancer (red) and non-cancer (blue) mortalities in histogram form, as a function of Age at Death in days. The frequency distribution is shown separately for unexposed and exposed mice. White numbers within the histogram bars indicate the corresponding numbers of mouse deaths.

Cause-specific cumulative hazard and cumulative incidence functions for cancer and non-cancer

The data set was analyzed using the RSF machine learning method for competing risks, and the results were visualized through the Cause-Specific Cumulative Hazard Function (CSCHF) and the Cumulative Incidence Function (CIF) plots, as described above. The CSCHF and CIF generated by the RSF model are shown in Fig. 2. The ‘Time (d)’ axis represents the mouse age in days from birth. The ‘mean’ CSCHF and CIF values represent the average of the CSCHF or CIF predictions for a given data point over all the trees in the random forest. This averaging is done over all the trees in a single run of the random forest algorithm, ensuring that the displayed curves reflect the ensemble prediction for the whole dataset, not individual mice. The black curves denote cancer mortality, while the red curves represent non-cancer mortality. These results clearly show that cancer was the dominant cause of death in the selected mouse population. However, the non-cancer disease hazard continued to increase up to the oldest mouse ages. Over a large portion of the age range (roughly between 600 and 1200 days), the hazards for both cancer and non-cancer diseases followed a sigmoid pattern, appearing roughly linear in this specific region. The plots shown in Figs. 1 and 2 lump together mouse groups exposed to different neutron doses and fractionation regimens, but they give a good overall idea of the age dependences of cancer and non-cancer deaths in the population.

Figure 2.


Cause-Specific Cumulative Hazard Function (CSCHF) and Cumulative Incidence Function (CIF) plots for cancer (red) and non-cancer (blue) diseases. These functions were generated by the RSF model fit to the training data.

Concordance scores

The concordance scores for the RSF model fitted to this data set are as follows. On the training dataset, the model achieved concordance scores of 0.659 for cancer outcomes and 0.602 for non-cancer outcomes. These scores indicate that the model has some (although far from perfect) predictive capability, with higher accuracy in distinguishing cancer-related outcomes compared to non-cancer ones. When applied to the testing dataset, the model achieved concordance scores of 0.649 for cancer and 0.552 for non-cancer outcomes. These results suggest that the model performed better for cancer than for non-cancer, likely because there were many more cancer events in the data set, so the fit was more heavily influenced by them. For cancer, there is almost no reduction in concordance between training and testing data (0.659 vs. 0.649), suggesting that the model is both stable and generalizable for cancer. For non-cancer, there was some reduction in concordance on testing data compared to training, which is also likely due to the smaller number of non-cancer instances within our dataset. Although this reduction was only 0.05 (0.602 vs. 0.552), it is substantial because a further reduction of similar magnitude would bring the score close to the value indicative of random chance (0.5). This underscores the need to interpret the performance difference between training and testing data cautiously.

Variable importance scores (VIMP)

VIMP scores for this analysis are shown in Table 2. The variables called “Experiment” represent binary (0 or 1) indicators for the different Janus neutron exposure experiments included in the analysis. The different Janus neutron exposure experiments varied in their objectives and methodologies, including differences in dose rates, fractionation schedules, total doses, and exposure durations. For instance, some experiments tested single doses while others explored fractionated doses given over weeks or months. Specific experiments focused on comparing high-dose-rate exposures to low-dose-rate exposures, investigating the effects of age at the time of exposure, and examining the efficacy of radioprotective agents. These variations allowed for a comprehensive evaluation of the biological effects of neutron radiation under different conditions. The specific meanings of the listed 'Experiment' variables are as follows:

  • Experiment 2 (JM-2): This was the first and largest experiment, testing the additivity of small increments of neutron dose delivered in different patterns over 24 weeks. It involved five different exposure patterns, including single high-dose-rate exposures and fractionated low-dose-rate exposures, to evaluate their effects on life shortening and neoplastic disease incidence.

  • Experiment 3 (JM-3): This was a straightforward single-dose study with seven replications, including a small dose-rate comparison where one group received a single dose of 240 cGy of neutrons over 20 min and another over 8 h.

  • Experiment 4 (JM-4K): This experiment used a 24-week once-weekly exposure procedure with different total doses to test dose-rate and protraction factors, involving multiple replications to evaluate life shortening and neoplastic disease incidence.

  • Experiment 8 (JM-8): This was the only duration-of-life exposure experiment in the JM series, comparing protraction factors between 24- and 60-week once-weekly exposure paradigms, with varying weekly dose levels for both γ-rays and neutrons.

Table 2.

Variable importance (VIMP) scores for both cancer and non-cancer, and their lower and upper 95% confidence intervals (CI).

| Feature | Cancer VIMP | Lower CI | Upper CI | Non-cancer VIMP | Lower CI | Upper CI |
|---|---|---|---|---|---|---|
| Dose | 0.207 | 0.184 | 0.223 | 0.131 | 0.092 | 0.172 |
| Sex | 0.014 | 0.009 | 0.018 | 0.012 | −0.007 | 0.031 |
| Treatment age | 0.047 | 0.032 | 0.063 | 0.024 | −0.006 | 0.057 |
| Number of fractions | 0.095 | 0.076 | 0.123 | 0.034 | 0.017 | 0.061 |
| Duration of fraction | 0.032 | 0.023 | 0.046 | −0.004 | −0.026 | 0.023 |
| Experiment 2 | 0.038 | 0.027 | 0.048 | 0.038 | 0.018 | 0.063 |
| Experiment 3 | 0.010 | 0.005 | 0.016 | 0.004 | −0.020 | 0.030 |
| Experiment 4 | 0.009 | 0.006 | 0.013 | 0.002 | −0.008 | 0.013 |
| Experiment 8 | 0.035 | 0.018 | 0.067 | 0.018 | −0.006 | 0.052 |

The experiment descriptions above clarify what each 'Experiment' indicator represents and how it may contribute to the model's predictions.

For cancer mortality, dose is the most influential variable, with a VIMP value of 0.207 (CI 0.184–0.223), underscoring the role of neutron dose in predicting cancer outcomes. Number of fractions follows with a VIMP value of 0.095 (CI 0.076–0.123), a substantial but smaller influence. ‘Treatment Age’, ‘Experiment 2’, and ‘Experiment 8’ also contribute to the model’s predictions, with VIMP values of 0.047 (CI 0.032–0.063), 0.038 (CI 0.027–0.048), and 0.035 (CI 0.018–0.067), respectively. Although these values are lower than that of ‘Dose’, they indicate a meaningful, albeit smaller, impact on the model’s accuracy: VIMP values around 0.03 suggest that a variable provides some predictive power and should not be disregarded, especially given the complex interactions within the model. ‘Sex’, ‘Experiment 3’, and ‘Experiment 4’ show minimal VIMP values, suggesting limited impact on cancer mortality predictions within this model.

For non-cancer mortality, dose remains the pivotal feature, with a VIMP value of 0.131 (CI 0.092–0.172). The VIMP values for ‘Sex’, ‘Treatment Age’, and ‘Duration of Fraction’ are 0.012 (CI −0.007 to 0.031), 0.024 (CI −0.006 to 0.057), and −0.004 (CI −0.026 to 0.023), respectively; all three intervals include zero, indicating that these features contribute minimally to the prediction of non-cancer outcomes.

Overall, we can see that dose was the predominant factor in both models, as expected. Interestingly, number of fractions, duration of fraction, and treatment age appeared to be important for cancer, but not important for non-cancer outcome predictions. There also appeared to be more contribution of differences between experiments for cancer than for non-cancer. Sex did not appear very important for either outcome.
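One simple way to read Table 2 is to check which VIMP confidence intervals exclude zero. The sketch below does this with the values transcribed from the table. The screening rule itself (lower bound above zero) is an assumption made here for illustration, not a criterion stated in the analysis, and the function name `ci_excludes_zero` is hypothetical.

```python
# VIMP estimates and 95% CIs transcribed from Table 2: (estimate, lower, upper).
vimp = {
    "cancer": {
        "Dose": (0.207, 0.184, 0.223),
        "Sex": (0.014, 0.009, 0.018),
        "Treatment age": (0.047, 0.032, 0.063),
        "Number of fractions": (0.095, 0.076, 0.123),
        "Duration of fraction": (0.032, 0.023, 0.046),
        "Experiment 2": (0.038, 0.027, 0.048),
        "Experiment 3": (0.010, 0.005, 0.016),
        "Experiment 4": (0.009, 0.006, 0.013),
        "Experiment 8": (0.035, 0.018, 0.067),
    },
    "non-cancer": {
        "Dose": (0.131, 0.092, 0.172),
        "Sex": (0.012, -0.007, 0.031),
        "Treatment age": (0.024, -0.006, 0.057),
        "Number of fractions": (0.034, 0.017, 0.061),
        "Duration of fraction": (-0.004, -0.026, 0.023),
        "Experiment 2": (0.038, 0.018, 0.063),
        "Experiment 3": (0.004, -0.020, 0.030),
        "Experiment 4": (0.002, -0.008, 0.013),
        "Experiment 8": (0.018, -0.006, 0.052),
    },
}


def ci_excludes_zero(outcome):
    """Return features whose lower CI bound is above zero, ranked by VIMP.

    Assumption: "lower bound > 0" is used here only as a rough screening rule.
    """
    kept = [(name, est) for name, (est, lo, _hi) in vimp[outcome].items() if lo > 0]
    kept.sort(key=lambda row: row[1], reverse=True)
    return [name for name, _est in kept]


for outcome in vimp:
    print(outcome, ci_excludes_zero(outcome))
```

Under this rough rule, every cancer interval in Table 2 lies above zero, whereas for non-cancer only dose, Experiment 2, and number of fractions do.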

SHAP value analysis for cancer and non-cancer mortality

The application of SHAP values to our competing risks model provides a granular understanding of feature importance, quantifying the impact of each predictor on the model's output for each mouse. SHAP values, rooted in cooperative game theory, offer an interpretable measure of the contribution of each variable to the prediction of both cancer and non-cancer mortality risks in the cohort of B6CF1 mice. The aggregated median SHAP values for different variables in the RSF model are listed in Table 3, with variables listed in alphabetical order. As expected, the SHAP analysis highlights ‘Dose’ as the most impactful predictor for both cancer and non-cancer mortality, with other features exerting varying smaller effects, which are not completely consistent between cancer and non-cancer outcomes.
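To make the game-theoretic idea behind SHAP concrete, the following minimal sketch computes exact Shapley values for a toy three-feature score by averaging each feature's marginal contribution over all feature orderings. The scoring function `toy_hazard` and its numbers are hypothetical and are not taken from the RSF model; practical SHAP implementations use approximations rather than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    phi = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            phi[f] += (after - before) / len(orderings)
    return phi


def toy_hazard(present):
    """Hypothetical score for one mouse given the set of 'known' features."""
    score = 0.0
    if "dose" in present:
        score += 0.4
    if "fractions" in present:
        score += 0.1
    if "dose" in present and "fractions" in present:
        score += 0.2  # interaction term, shared equally by the two features
    return score


features = ["dose", "fractions", "sex"]
phi = shapley_values(toy_hazard, features)
print(phi)
```

In this toy case the contributions add up to the difference between the score with all features and the score with none, which is the additivity property that makes per-mouse SHAP values interpretable.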

Table 3.

Summary of SHAP value contributions for different features, for cancer and non-cancer diseases.

| Feature | Cancer: medians | Cancer: abs medians | Non-cancer: medians | Non-cancer: abs medians |
|---|---|---|---|---|
| Dose | −0.267 | 0.462 | −0.020 | 0.036 |
| Duration of fraction | −0.024 | 0.065 | 0.001 | 0.009 |
| Experiment 2 | −0.057 | 0.101 | −0.003 | 0.008 |
| Experiment 3 | 0.000 | 0.005 | 0.000 | 0.001 |
| Experiment 4 | −0.003 | 0.010 | 0.001 | 0.003 |
| Experiment 8 | 0.000 | 0.000 | 0.000 | 0.000 |
| Number of fractions | −0.009 | 0.062 | −0.001 | 0.008 |
| Sex | 0.006 | 0.063 | 0.001 | 0.017 |
| Treatment age | 0.015 | 0.065 | 0.000 | 0.012 |

Medians represent median SHAP values for a given feature on the training data set. Abs medians represent medians of absolute values of the SHAP values (i.e. medians calculated after the sign is removed). The abs medians provide a summary measure that ignores the direction of the effect but focuses on the magnitude, which is appropriate for assessing overall feature importance.
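The distinction between the two summaries can be illustrated with a few made-up numbers. In this sketch the per-mouse SHAP values are hypothetical; it only shows why a signed median near zero can coexist with a much larger median of absolute values.

```python
from statistics import median

# Hypothetical per-mouse SHAP values for a single feature (illustrative only).
shap_values = [-0.50, -0.30, -0.10, 0.20, 0.40]

signed_median = median(shap_values)               # keeps the direction of the effect
abs_median = median(abs(v) for v in shap_values)  # magnitude only, sign removed

print(signed_median, abs_median)
```

Because positive and negative contributions partly cancel in the signed summary, the absolute-value summary is the more suitable of the two for ranking features by overall importance.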

Figure 3 shows how the SHAP values of each continuous feature changed as a function of the feature value for cancer and non-cancer diseases, giving a visual representation of each feature's impact on the model’s predictions. For dose, the SHAP values show the expected positive relationship with the predicted hazard of both cancer and non-cancer outcomes (Fig. 3): as the dose increases, so does the risk of both cancer and non-cancer mortality, in line with the established understanding of dose–response relationships in radiation exposure. However, the shape of the SHAP value trend with dose differed between the two outcomes. For cancer, the trend was non-linear, with the effect plateauing and then slightly decreasing at higher doses, possibly indicating a saturation point beyond which additional dose increments do not correspond to a proportionally higher risk. In contrast, non-cancer SHAP values continued to increase (linearly or even with upward curvature) up to the highest tested doses, with no sign of saturation. This difference in dose–response shapes may reflect a difference in the biological effects of neutrons on cancer vs non-cancer diseases, but the saturation of the cancer response may also arise because, at the highest doses, the majority of mice developed cancer, so little further increase was possible.

Figure 3.

Visualizations of how the SHAP values of each continuous feature change as a function of the feature values for cancer (red) and non-cancer (blue) diseases. To facilitate comparison of the shapes, rather than the magnitudes, of the SHAP value dependences for cancer vs non-cancer, all SHAP values were normalized by dividing by the absolute value of the mean of the SHAP values for each feature.
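The normalization described in the caption can be sketched as follows. The SHAP values below are hypothetical and chosen so that the two outcomes differ in magnitude by a factor of ten but share the same shape; the function name `normalize_by_abs_mean` is likewise an illustrative choice.

```python
from statistics import fmean


def normalize_by_abs_mean(shap_values):
    """Divide each SHAP value by |mean of the SHAP values| for that feature."""
    scale = abs(fmean(shap_values))
    if scale == 0:
        raise ValueError("mean SHAP value is zero; normalization is undefined")
    return [v / scale for v in shap_values]


# Hypothetical dose SHAP values for the two outcomes (illustrative only).
cancer_dose_shap = [-0.6, -0.2, 0.1, 0.3]
noncancer_dose_shap = [-0.06, -0.02, 0.01, 0.03]

cancer_norm = normalize_by_abs_mean(cancer_dose_shap)
noncancer_norm = normalize_by_abs_mean(noncancer_dose_shap)
print(cancer_norm)
print(noncancer_norm)
```

After rescaling, the two hypothetical curves coincide, which is the sense in which the normalization lets shapes be compared independently of magnitude. Dividing by the absolute value of the mean preserves the sign of every SHAP value.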

For age at treatment (i.e. at the start of irradiation), the SHAP values for non-cancer and cancer both tended towards decreases at older ages, with more scattering for non-cancer. Complexity in the trends could reflect the interplay between age-related disease susceptibility and the decreasing probability of disease manifestation as mice reach advanced ages.

For non-cancer, the number of fractions exhibited relatively small and unclear effects on the predicted hazard, with the SHAP values clustering around zero, implying a limited influence of this treatment-related factor on non-cancer mortality. Conversely, for cancer outcomes, the risk initially increased with the number of fractions, reached a peak around 100 fractions, and decreased at higher fraction numbers.

These trends, especially for cancer, may be consistent with the concept of an inverse dose rate or protraction effect30, where increased number of fractions in the intermediate range and/or increased fraction duration were associated with higher SHAP values compared with single-fraction exposures. The subsequent decrease in SHAP values at the highest fraction numbers might indicate a saturation effect, where additional fractions do not correspond to increased risk. For non-cancer events, this peak is not visible, suggesting that the inverse dose rate effect does not apply in the same way or is not present for non-cancer outcomes.

The duration of fraction feature seemed to have little effect on either cancer or non-cancer SHAP values. Regarding categorical (binary) features, the analysis indicated a reduction in SHAP values for males compared to females for both disease outcomes. This suggests that being male is associated with a lower hazard of both cancer and non-cancer mortality compared to being female, within the context of this study population. The binary indicator variables for various experiments also contributed, suggesting that there was some variability between mouse cohorts.

Pearson and Spearman correlation analyses were used to investigate the extent to which the SHAP value contributions of the different variables to the RSF model were independent or correlated with each other (Figs. S2–S3 from the supplementary material). The correlations were generally not very strong, suggesting largely independent contributions. The only exception was the correlation between SHAP values for dose and number of fractions, particularly for the cancer outcome, which may indicate some redundancy in the SHAP contributions of these variables.
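For readers less familiar with the two coefficients, the sketch below implements both using only the Python standard library and applies them to hypothetical SHAP values for two features. Pearson measures linear association, while Spearman applies the same formula to ranks and therefore captures any monotonic association. The data and function names are illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-mouse SHAP values for two features (illustrative only).
shap_dose = [0.1, 0.2, 0.4, 0.8, 1.6]
shap_fractions = [0.01, 0.02, 0.03, 0.05, 0.50]

r_pearson = pearson(shap_dose, shap_fractions)
r_spearman = spearman(shap_dose, shap_fractions)
print(r_pearson, r_spearman)
```

In this made-up example the relationship is perfectly monotonic but not linear, so the Spearman coefficient equals 1 while the Pearson coefficient is lower, which is one reason to report both.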

CSCHF summaries for different dose bins

In addition to the SHAP analysis, we also visualized the model predictions by plotting the distributions of CSCHF at 900, 1000 or 1100 days for each disease in different dose bins (0–0.5 Gy, 0.5–1 Gy, and so on). These results are shown in Fig. 4. The trends in this visualization are generally consistent with those from the SHAP analysis (Fig. 3), indicating clear increases in predicted hazards for both diseases with increasing neutron dose, along with substantial variability in hazard values between individual mice even within the same dose bin.

Figure 4.

Cumulative hazard function error bar plots for cancer (left) and non-cancer (right) diseases by dose bin at time points 900, 1000, and 1100 days.
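The dose-bin summary behind this kind of plot can be sketched as below. Several details are assumptions made for illustration: the bins are treated as left-closed (a dose of exactly 0.5 Gy falls in the 0.5–1 Gy bin), each bin is summarized by its mean and sample standard deviation, and the dose and hazard values are hypothetical rather than model output.

```python
from collections import defaultdict
from statistics import fmean, stdev


def dose_bin(dose_gy, width=0.5):
    """Return the (lower, upper) edges of the left-closed bin containing dose_gy."""
    lower = int(dose_gy // width) * width
    return (lower, lower + width)


def summarize_by_bin(doses, hazards, width=0.5):
    """Group predicted hazards by dose bin and return (mean, SD) per bin."""
    groups = defaultdict(list)
    for dose, hazard in zip(doses, hazards):
        groups[dose_bin(dose, width)].append(hazard)
    return {
        edges: (fmean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for edges, vals in sorted(groups.items())
    }


# Hypothetical doses (Gy) and predicted cumulative hazards at one time point.
doses = [0.0, 0.2, 0.4, 0.6, 0.9, 1.2, 1.4]
hazards = [0.5, 0.6, 0.7, 0.9, 1.1, 1.4, 1.6]

summary = summarize_by_bin(doses, hazards)
for edges, (mean, sd) in summary.items():
    print(edges, round(mean, 3), round(sd, 3))
```

Repeating this summary for each time point (900, 1000, and 1100 days) and each outcome would give the inputs for an error bar plot of this type.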

Discussion

Densely-ionizing radiations, encountered in space travel and radioprotection, have primarily been studied for their cancer-related effects, but their role in non-cancer diseases remains underexplored. This study, using mortality data for approximately 21,000 B6CF1 mice from historical Janus experiments (Table S1 and Fig. S1 from the supplementary material), fills this gap by investigating how densely-ionizing radiation dose, fractionation, age, and sex influence the hazards of non-cancer and cancer diseases. The SHAP values in Fig. 3 clearly show increasing dose–response trends, particularly for cancer outcomes1,9. This trend is less evident in the cumulative hazard and incidence functions in Fig. 2, which aggregate across different dose groups. The SHAP values directly quantify the impact of dose on the risk, providing a clearer picture of how increasing doses influence cancer and non-cancer mortality risks20,22.

SHAP values quantify the contribution of each feature to the model's predictions, with higher values indicating greater risk. For radiation protection, these values help identify the most critical factors influencing mortality. The increasing SHAP values with dose demonstrate a clear dose–response relationship, highlighting the heightened risk of both cancer and non-cancer mortality at higher doses. This underscores the importance of minimizing exposure to densely-ionizing radiation. Although non-cancer events are less frequent than cancer events (Figs. 1 and 2), they still represent a significant portion of the overall risk profile. Comprehensive radiation protection strategies must address both cancer and non-cancer risks to ensure holistic safety measures. By focusing on minimizing both types of risks, protection strategies can be more effective in safeguarding long-term health.

Despite the moderate concordance scores (0.55–0.66), the RSF model provides valuable insights into the risk factors associated with radiation exposure. These scores indicate that the model performs better than random chance and captures essential patterns in the data. By integrating machine learning approaches like RSF, health risk predictions can be refined, leading to more tailored and effective radiation protection strategies. It is important to note the uncertainties involved in translating findings from mice to humans. While mice provide a valuable model for studying radiation effects, differences in biology, lifespan, and environmental factors can influence how these results apply to human populations. Future research should focus on bridging these gaps, possibly through additional animal studies, human epidemiological data, and improved modeling techniques.
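As a reminder of what a concordance score measures, here is a minimal sketch of Harrell's concordance index for a single event type with right censoring. It is a simplified stand-in and not the competing-risks concordance computed by the RSF software; the survival times, event indicators, and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i has an observed event and an
    earlier time than subject j. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical survival times (days), event indicators, and predicted risks.
times = [300, 500, 700, 900]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]

c_model = concordance_index(times, events, risk)
c_constant = concordance_index(times, events, [0.5] * 4)
print(c_model, c_constant)
```

A model that assigns every mouse the same risk scores 0.5, which is the "random chance" baseline against which values of 0.55–0.66 are judged.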

The vast preponderance of cancer as a cause of death in the data analyzed is notable, particularly given the mouse strain used, B6CF1 mice. This strain is known for its susceptibility to cancer40–42, which likely contributed to the high proportion of cancer-related deaths observed in our study. This inherent predisposition to cancer in B6CF1 mice could have influenced the model's performance, particularly the modest concordance scores obtained for radiation dose. The high frequency of cancer outcomes may have skewed the model’s predictive capability, making it more attuned to detecting cancer-related mortality while potentially underperforming for non-cancer outcomes. This imbalance in the data could have diluted the model’s ability to distinguish subtle dose–response relationships for non-cancer diseases, leading to lower concordance scores for these outcomes. Future studies might consider using mouse strains with different susceptibilities to cancer and non-cancer diseases to assess whether these factors influence the predictive performance of machine learning models in radiation research. Additionally, incorporating external validation datasets with varied disease profiles could provide a more robust evaluation of the model's generalizability and accuracy in different biological contexts.

Interestingly, the SHAP value-based dose–response shapes differed for cancer and non-cancer outcomes, showing saturation for cancer hazard at high doses (> 2 Gy), while non-cancer responses displayed a more linear pattern (Fig. 3). This difference was not as clear when distributions of the cause-specific cumulative hazard function (CSCHF) for cancer and non-cancer were plotted as a function of dose bin (Fig. 4). Nevertheless, such observations suggest a potential variation in the biological mechanisms of densely-ionizing radiation effects on different disease types, which is crucial for medical monitoring and intervention strategies in space missions17,18. The RSF model's flexibility and reliability, as opposed to traditional models, allowed for a more nuanced understanding of these complex relationships36,39. SHAP values and variable importance scores provided deeper insights into the contribution of each variable to the model’s predictions, underscoring the predominance of dose as a predictive factor for both disease outcomes. Although the effects of sex and age at the start of irradiation were not very pronounced, their inclusion offers a more comprehensive view of potential risks, which could be important for designing effective radioprotection strategies and assessing radiation risk for different astronaut demographics16,19.

In conclusion, this study not only elucidates the cancer-related effects of densely-ionizing radiations, which are typically studied, but also explores their role in non-cancer diseases. This comprehensive understanding is crucial for ensuring the health and safety of astronauts during long-duration space missions. The use of the RSF machine learning algorithm enhances the accuracy of health risk predictions associated with densely-ionizing radiation exposures, potentially leading to more effective radiation protection strategies.

Author contributions

E.W.: Conceptualization, calculations and writing of the manuscript. I.S.: Conceptualization, investigation, feedback on the manuscript. D.B.: Supervision, resources, feedback on the manuscript. All authors reviewed and contributed to writing of the manuscript.

Funding

This work was supported by the National Aeronautics and Space Administration (NASA, grant #80NSSC23M0099) and Columbia University. Data used in this study were sourced from the Janus Archive for Mouse Aging Studies at Argonne National Laboratory.

Data availability

The datasets generated during and/or analyzed during the study are available from the corresponding author upon request.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-024-68717-9.

References

  • 1. Cucinotta, F. A. & Durante, M. Cancer risk from exposure to galactic cosmic rays: Implications for space exploration by human beings. Lancet Oncol. 7, 431–435. 10.1016/s1470-2045(06)70695-7 (2006).
  • 2. Ainsworth, E. J. Early and late mammalian responses to heavy charged particles. Adv. Space Res. 6, 153–165. 10.1016/0273-1177(86)90288-7 (1986).
  • 3. Cucinotta, F. A. Review of NASA approach to space radiation risk assessments for mars exploration. Health Phys. 108, 131–142. 10.1097/HP.0000000000000255 (2015).
  • 4. Chang, P. Y. et al. Harderian gland tumorigenesis: Low-dose and LET response. Radiat. Res. 185, 449–460. 10.1667/rr14335.1 (2016).
  • 5. Datta, K., Suman, S., Kallakury, B. V. S. & Fornace, A. J. Heavy ion radiation exposure triggered higher intestinal tumor frequency and greater β-catenin activation than γ radiation in APCMin/+ mice. PLoS ONE 8, e59295. 10.1371/journal.pone.0059295 (2013).
  • 6. Suman, S. et al. Relative biological effectiveness of energetic heavy ions for intestinal tumorigenesis shows male preponderance and radiation type and energy dependence in APC(1638N/+) mice. Int. J. Radiat. Oncol. Biol. Phys. 95, 131–138. 10.1016/j.ijrobp.2015.10.057 (2016).
  • 7. Weil, M. M. et al. Comparative efficacy of intracoronary allogeneic mesenchymal stem cells and cardiosphere-derived cells in swine with hibernating myocardium. Circ. Res. 9, 22–37 (2014).
  • 8. Wang, E., Shuryak, I. & Brenner, D. J. Quantifying the effects of neutron dose, dose protraction, age and sex on mouse survival using parametric regression and machine learning on a 21,000-mouse data set. Sci. Rep. 13, 21841. 10.1038/s41598-023-49262-3 (2023).
  • 9. Brenner, D. J. et al. Cancer risks attributable to low doses of ionizing radiation: Assessing what we really know. Proc. Natl. Acad. Sci. U. S. A. 100, 13761–13766. 10.1073/pnas.2235592100 (2003).
  • 10. Shuryak, I. et al. Quantitative modeling of carcinogenesis induced by single beams or mixtures of space radiations using targeted and non-targeted effects. Sci. Rep. 187, 476–482 (2017).
  • 11. Shuryak, I. & Brenner, D. J. Mechanistic modeling predicts no significant dose rate effect on heavy-ion carcinogenesis at dose rates relevant for space exploration. Radiat. Protect. Dosimetry 183, 403. 10.1093/rpd/ncz003 (2019).
  • 12. Shuryak, I. et al. A practical approach for continuous in situ characterization of radiation quality factors in space. Sci. Rep. 12, 1453. 10.1038/s41598-022-04937-1 (2022).
  • 13. Chappell, L. J., Elgart, S. R., Milder, C. M. & Semones, E. J. Assessing nonlinearity in harderian gland tumor induction using three combined HZE-irradiated mouse datasets. Radiat. Res. 194, 38–51. 10.1667/RR15539.1 (2020).
  • 14. Cucinotta, F. A. & Cacao, E. Risks of cognitive detriments after low dose heavy ion and proton exposures. Int. J. Radiat. Biol. 95, 985–998. 10.1080/09553002.2019.1623427 (2019).
  • 15. Elgart, S. R. et al. Local generation and efficient evaluation of numerous drug combinations in a single sample. Elife 12, e85439. 10.7554/eLife.85439 (2018).
  • 16. Cacao, E. & Cucinotta, F. A. Meta-analysis of cognitive performance by novel object recognition after proton and heavy ion exposures. Radiat. Res. 192, 463–472. 10.1667/RR15419.1 (2019).
  • 17. Acharya, M. M. et al. Roles of the functional interaction between brain cholinergic and dopaminergic systems in the pathogenesis and treatment of schizophrenia and Parkinson’s disease. Int. J. Mol. Sci. 22, 4299. 10.3390/ijms22094299 (2019).
  • 18. Parihar, V. K. et al. Cosmic radiation exposure and persistent cognitive dysfunction. Sci. Rep. 6, 34774. 10.1038/srep34774 (2016).
  • 19. Shuryak, I., Brenner, D. J., Blattnig, S. R., Shukitt-Hale, B. & Rabin, B. M. Modeling space radiation induced cognitive dysfunction using targeted and non-targeted effects. Sci. Rep. 11, 1–8 (2021).
  • 20. Little, M. P., Azizova, T. V. & Hamada, N. Ionising radiation and cardiovascular disease: Systematic review and meta-analysis. BMJ 2021, 380 (2021).
  • 21. Kadhim, M. et al. Non-targeted effects of ionising radiation—Implications for low dose risk. Mutat. Res./Rev. Mutat. Res. 752, 84–98. 10.1016/j.mrrev.2012.12.001 (2013).
  • 22. Pariset, E., Malkani, S., Cekanaviciute, E. & Costes, S. V. Ionizing radiation-induced risks to the central nervous system and countermeasures in cellular and rodent models. Int. J. Radiat. Biol. 97, 1–19 (2020).
  • 23. Yasuda, T. et al. Neurocytotoxic effects of iron-ions on the developing brain measured in vivo using medaka (Oryzias latipes), a vertebrate model. Int. J. Radiat. Biol. 87, 915–922. 10.3109/09553002.2011.584944 (2011).
  • 24. Gillies, M. et al. Mortality from circulatory diseases and other non-cancer outcomes among nuclear workers in France, the United Kingdom and the United States (INWORKS). Radiat. Res. Soc. 188, 276–290. 10.1667/RR14608.1 (2017).
  • 25. Little, M. P. et al. Estimating risk of circulatory disease: Little et al. respond. Environ. Health Perspect. 120, 1503–1511. 10.1289/ehp.1204982 (2012).
  • 26. Carnes, B. A. & Grahn, D. Issues about neutron effects: The JANUS program. Radiat. Res. 128, S141-146. 10.2307/3578017 (1991).
  • 27. Stricklin, D. L., VanHorne-Sealy, J., Rios, C. I., Scott Carnell, L. A. & Taliaferro, L. P. Neutron radiobiology and dosimetry. Radiat. Res. 195, 480–496. 10.1667/RADE-20-00213.1 (2021).
  • 28. Garty, G. et al. Mice and the A-Bomb: Irradiation systems for realistic exposure scenarios. Radiat. Res. 187, 465–475. 10.1667/RR008CC.1 (2017).
  • 29. Hall, E. J. & Brenner, D. J. The biological effectiveness of neutrons. Implic. Radiat. Protect. 44, 1–9. 10.1093/rpd/44.1-4.1 (1992).
  • 30. Brenner, D. J. & Hall, E. J. The inverse dose-rate effect for oncogenic transformation by neutrons and charged particles: A plausible interpretation consistent with published data. Int. J. Radiat. Biol. 58, 745–758. 10.1080/09553009014552131 (1990).
  • 31. Carnes, B. A., Grahn, D. & Thomson, J. F. Dose-response modeling of life shortening in a retrospective analysis of the combined data from the JANUS program at Argonne National Laboratory. Radiat. Res. 119, 39–56. 10.2307/3577366 (1989).
  • 32. Grdina, D. J., Wright, B. J. & Carnes, B. A. Protection by WR-151327 against late-effect damage from fission-spectrum neutrons. Radiat. Res. 128, S124-127. 10.2307/3578014 (1991).
  • 33. Liu, W. et al. Comparing radiation toxicities across species: An examination of radiation effects in Mus musculus and Peromyscus leucopus. Int. J. Radiat. Biol. 89, 391–400. 10.3109/09553002.2013.767994 (2013).
  • 34. Zander, A., Paunesku, T. & Woloschak, G. E. Analyses of cancer incidence and other morbidities in gamma irradiated B6CF1 mice. PLoS One 15, e0231510. 10.1371/journal.pone.0231510 (2020).
  • 35. Cacao, E. & Cucinotta, F. A. Meta-analysis of cognitive performance by novel object recognition after proton and heavy ion exposures. Radiat. Res. 192, 463–472. 10.1667/rr15419.1 (2019).
  • 36. Fine, J. P. & Gray, R. J. A proportional hazards model for the subdistribution of a competing risk. J. Am. Stat. Assoc. 94, 496–509. 10.1080/01621459.1999.10474144 (1999).
  • 37. Woloschak, G. Northwestern University Radiobiology Archives (NURA) Data. Northwestern University. https://sites.northwestern.edu/nura/data/ (2011).
  • 38. Robert, J. G. A class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann. Stat. 16, 1141–1154. 10.1214/aos/1176350951 (1988).
  • 39. Ishwaran, H. et al. Random survival forests for competing risks. Biostatistics 15, 757–773. 10.1093/biostatistics/kxu010 (2014).
  • 40. Maronpot, R. R. Biological basis of differential susceptibility to hepatocarcinogenesis among mouse strains. J. Toxicol. Pathol. 22, 11–33. 10.1293/tox.22.11 (2009).
  • 41. King-Herbert, A. & Thayer, K. NTP workshop: Animal models for the NTP rodent cancer bioassay: Stocks and strains–should we switch? Toxicol. Pathol. 34, 802–805. 10.1080/01926230600935938 (2006).
  • 42. Haseman, J. K., Hailey, J. R. & Morris, R. W. Spontaneous neoplasm incidences in Fischer 344 rats and B6C3F1 mice in two-year carcinogenicity studies: A National Toxicology Program update. Toxicol. Pathol. 26, 428–441. 10.1177/019262339802600318 (1998).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
