Abstract
Background:
Stereotactic radiosurgery (SRS) is widely used for managing brain metastases (BMs), but an adverse effect, radionecrosis, complicates post-SRS management. Differentiating radionecrosis from tumor recurrence non-invasively remains a major clinical challenge, as conventional imaging techniques often necessitate surgical biopsy for accurate diagnosis. Machine learning and deep learning models have shown potential in distinguishing radionecrosis from tumor recurrence. However, their clinical adoption is hindered by a lack of explainability, limiting understanding and trust in their diagnostic decisions.
Purpose:
To utilize a novel neural ordinary differential equation (NODE) model for discerning BM post-SRS radionecrosis from recurrence. This approach integrates image deep features, genomic biomarkers, and non-image clinical parameters within a synthesized latent feature space. The trajectory of each data sample towards the diagnosis decision can be visualized within this feature space, offering a new angle on radiogenomic data analysis foundational for AI explainability.
Methods:
By hypothesizing that deep feature extraction can be modeled as a spatiotemporally continuous process, we designed a novel model based on heavy ball NODE (HBNODE) in which deep feature extraction was governed by a second-order ODE. This approach enabled tracking of deep neural network (DNN) behavior by solving the HBNODE and observing the stepwise derivative evolution. Consequently, the trajectory of each sample within the Image-Genomic-Clinical (I-G-C) space became traceable. A decision-making field (F) was reconstructed within the feature space, with its gradient vectors directing the data samples’ trajectories and intensities showing the potential. The evolution of F reflected the cumulative feature contributions at intermediate states to the final diagnosis, enabling quantitative and dynamic comparisons of the relative contribution of each feature category over time. A velocity curve was designed to determine key intermediate states (locoregional ∇F=0) that are most predictive. Subsequently, a non-parametric model aggregated the optimal solutions from these key states to predict outcomes.
Our dataset included 90 BMs from 62 NSCLC patients. For each BM, 3-month post-SRS T1+c MR image features, seven NSCLC genomic features, and seven clinical features were analyzed. An 8:2 train/test assignment was employed, and five independent models were trained to assess robustness. Performance was benchmarked in sensitivity, specificity, accuracy, and ROCAUC, and results were compared against 1) a DNN using only image-based features, and 2) a DNN using combined ‘I+G+C’ features without the HBNODE design.
Results:
The temporal evolution of gradient vectors and potential fields in F suggested that clinical features contribute the most during the initial stages of the HBNODE implementation, followed by imagery features taking dominance in the later stages, while genomic features contribute the least throughout the process. The HBNODE model successfully identified and assembled key intermediate states, exhibiting competitive performance with an ROCAUC of 0.88±0.04, sensitivity of 0.79±0.02, specificity of 0.89±0.01, and accuracy of 0.84±0.01, where the uncertainties represent standard deviations. For comparison, the image-only DNN model achieved an ROCAUC of 0.71±0.05 and sensitivity of 0.66±0.32 (p=0.086), while the ‘I+G+C’ model without HBNODE reported an ROCAUC of 0.81±0.02 and sensitivity of 0.58±0.11 (p=0.091).
Conclusion:
The HBNODE model effectively identifies BM radionecrosis from recurrence, enhancing explainability within XAI frameworks. Its performance encourages further exploration in clinical settings and suggests potential applicability across various XAI domains.
Keywords: Neural ODE, deep learning, visualization, explainability, treatment response
1. Introduction
Brain metastases (BMs) are the most common intracranial tumors in adults and are increasing in incidence as survival improves with more effective anti-neoplastic therapies [1 2]. It has been estimated that 20–40% of patients with cancer may develop BMs during their clinical course. This prevalence is on the rise, owing not only to improved survival outcomes for many extracranial tumors but also to improvements in early detection via advanced imaging techniques [1]. Radiation therapy with stereotactic radiosurgery (SRS) or whole brain radiation therapy remains a critical component in the management of brain metastases [3 4]. The use of SRS is increasingly common in BM management and provides exceptional local control [5]. However, the dose-limiting adverse effect of SRS is radionecrosis, a delayed complication due to injury and inflammation to normal brain parenchyma. Radionecrosis remains a growing neuro-oncologic challenge for patients and providers, and can manifest months to years after treatment with the potential for significant morbidity [4] [6]. Unfortunately, radionecrosis is indistinguishable from tumor recurrence using conventional imaging alone, leading to diagnostic uncertainty and potential delays in appropriate management [7 8]. Accurate and timely differentiation between radionecrosis and tumor recurrence using non-invasive techniques is crucial to facilitate appropriate management and improve patient outcomes [7 9].
Many advanced imaging modalities, including MR perfusion [10], MR spectroscopy [11], and Positron Emission Tomography (PET) using amino acid tracers [12] have been extensively studied for radionecrosis/tumor recurrence differentiation. Despite their potential, these advanced techniques have not yet been standardized in clinical practice due to challenges such as low resolution, machine-to-machine variability, and the need for specialized equipment [13–15]. Currently, invasive surgical biopsy remains the diagnostic gold standard [8 16] for radionecrosis. However, these procedures carry risk for morbidity and mortality, and can delay timely treatment interventions. Additionally, obtaining biopsies from multiple lesions is often impractical [16]. Early and non-invasive diagnosis of tumor recurrence is therefore crucial, and computational image analysis has emerged as a potential solution to this challenge. One well-investigated technique is radiomics analysis, which focuses on the high-throughput mining of high-dimensional medical imaging data to identify quantitative diagnostic and prognostic features [17]. Previous studies have demonstrated the effectiveness of radiomic features as prognostic indicators [18]. Nevertheless, radiomic features face limitations in terms of robustness and reproducibility [19–21], as well as challenges in handling imbalanced datasets [22], leading to the relatively low sensitivity observed in these studies [23 24].
Recent advancements in image analysis algorithms and increased computational power have allowed deep learning methods [25 26] to become a promising avenue for radionecrosis detection. Unlike traditional handcrafted radiomic features, deep learning algorithms can automatically extract relevant imaging features [26]. To differentiate tumor recurrence and radionecrosis, currently available deep learning models incorporate convolutional neural networks (CNNs), which learn image features directly at multiple scales through repeated convolutional layers, pooling layers, and fully connected layers. Despite promising results in prior research, the clinical application of these deep learning models has faced many challenges, particularly in explainability [27], defined as the extent to which the internal mechanics of a deep network can be explained in human terms from a clinical perspective [28]. A good explanation provides insights into how a neural network arrives at its decision and/or renders that decision understandable [29]. Without such explainability, deep learning models can be opaque, lacking in intuitiveness, and often surpassing human logical capacities for causality [27 30]. This is owing to their inherent “black box” nature [30 31], leading to their limited adoption in the clinical setting. A pilot study suggested that neural ordinary differential equation (ODE) models were suitable for visualizing deep neural network behavior and enhancing explainability for brain MR image analysis [28]. Unlike traditional CNN models that have finite layers and discrete spatial and depthwise representations [32 33], neural ODE models define forward inference passes as the solution of an initial value problem, which can be considered a continuous evolution. Therefore, the intermediate stages of model decision and corresponding data utilization can be visualized by solving the ODE. Through this spatiotemporally continuous process, an unlimited number of solutions could be obtained, unveiling the models’ underlying interaction within the “black box”.
In this work, we developed a novel neural ODE model to differentiate BM post-SRS radionecrosis from tumor recurrence using radiogenomic data and to provide enhanced explainability. Image deep features, genomic biomarkers, and non-image clinical parameters were integrated into a synthesized latent feature space. We hypothesized that feature trajectories continuously evolve toward a binary state (i.e., radionecrosis or recurrence), and the evolution can be modeled as a spatiotemporally continuous process. To track the evolution of the deep feature space and understand the model’s decision-making process, we employ a second-order neural ODE model, heavy ball neural ODE (HBNODE) [34], to construct a feature space and track data trajectories that evolve synchronously over time. Compared to classic 1st-order neural ODE models, HBNODE models do not suffer from vanishing gradient issues and have been shown to achieve faster convergence and improved generalization in benchmark image classification and sequential learning tasks [35]. Additionally, we introduced a velocity curve derived from the continuous data trajectory, which evaluated the radionecrosis/tumor recurrence differentiation process on a temporal basis. This comprehensive approach provided valuable insights into the model’s decision-making process and highlighted the significance of various features.
2. Materials and Methods
Patient Data
This study retrospectively analyzed 90 BMs from 62 patients diagnosed with non-small cell lung cancer (NSCLC). These patients were treated with a single course of stereotactic radiosurgery (SRS) at our institution between October 2013 and October 2019. Among these BMs, 27 were confirmed by biopsy to have developed local recurrence. The training/test split used an 8:2 ratio counted over individual BMs rather than over patients. To prevent data leakage between the two sets, all metastases from the same patient were placed in the same set, either training or test. We utilized a stratified sampling method to ensure a balanced distribution of labels between the training and test sets. This standalone test set was determined before model training, ensuring that these samples were entirely independent and not ‘seen’ by the model during training. This study was approved by the Duke University Health System Institutional Review Board (IRB). SRS was prescribed to each BM, with a 1 mm planning target volume (PTV) margin, in a single fraction of 20 Gy or 18 Gy, or in 5 fractions for a total of 27.5 Gy or 25 Gy. The SRS procedures were delivered using a Varian™ Edge LINAC equipped with a 6D couch table. For each patient, the high-resolution T1 contrast-enhanced (T1-CE) MR volume at the 3-month post-SRS follow-up was selected as the primarily studied MR volume of this work. The post-SRS MR images, the planning target volumes (PTVs), and the planned volumetric dose distribution were resampled and registered to the pre-SRS MR scan. Seven clinical features and seven genomic features (specifically, NSCLC driver mutations) were collected and summarized in Table 1 and Figure 1. A Chi-Square Test [36] with Bonferroni correction [37] revealed that none of the investigated non-imagery features were significantly associated with the radionecrosis/tumor recurrence outcomes.
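The patient-grouped, label-stratified split described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `grouped_stratified_split`, the tuple layout, and the simplification of stratifying patients by whether any of their lesions recurred are all assumptions made for illustration, and the lesion list is synthetic.

```python
import random
from collections import defaultdict

def grouped_stratified_split(lesions, test_fraction=0.2, seed=0):
    """Split lesions roughly 8:2 while keeping each patient's lesions together.

    lesions: list of (lesion_id, patient_id, label), label 1 = recurrence, 0 = radionecrosis.
    """
    by_patient = defaultdict(list)
    for lesion_id, patient_id, label in lesions:
        by_patient[patient_id].append((lesion_id, label))

    # Assumed simplification: a patient's stratum is 1 if any lesion recurred.
    strata = defaultdict(list)
    for patient_id, items in by_patient.items():
        strata[max(label for _, label in items)].append(patient_id)

    rng = random.Random(seed)
    test_patients = set()
    for key in sorted(strata):
        patients = sorted(strata[key])
        rng.shuffle(patients)
        # The 8:2 ratio is counted over lesions, not over patients.
        target = test_fraction * sum(len(by_patient[p]) for p in patients)
        count = 0
        for p in patients:
            if count >= target:
                break
            test_patients.add(p)
            count += len(by_patient[p])

    train = [l for l in lesions if l[1] not in test_patients]
    test = [l for l in lesions if l[1] in test_patients]
    return train, test

# Synthetic example: 40 lesions, 2 per patient, every fifth lesion labeled as recurrence.
lesions = [(i, i // 2, 1 if i % 5 == 0 else 0) for i in range(40)]
train, test = grouped_stratified_split(lesions)
```

Because whole patients are assigned to one side, the realized ratio can deviate slightly from 8:2 when patients carry different numbers of lesions.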
Table 1.
Summary of patient clinical features.
| Characteristic | Lesions (n=90) |
|---|---|
| Gender | |
| Male | 33/46 patients/lesions |
| Female | 29/44 patients/lesions |
| Histology | |
| NSCLC | 90 lesions |
| Dose fractionation | |
| 1 fraction | 65 lesions |
| 5 fractions | 25 lesions |
| Lesion location | |
| Infratentorium | 12 lesions |
| Supratentorium | 78 lesions |
| Outcome | |
| Local Recurrence | 27 lesions |
| Radionecrosis | 63 lesions |
| KPS[38] | |
| KPS≥70 | 76 lesions |
| KPS<70 | 14 lesions |
| Chemo/targeted therapy | |
| Yes | 56 lesions |
| No | 34 lesions |
| Immunotherapy | |
| Yes | 16 lesions |
| No | 74 lesions |
| Steroid | |
| Yes | 34 lesions |
| No | 56 lesions |
Fig 1.

Summary of patient genomic features.
Deep Learning Model Design
I. Mathematical Modeling
To predict radionecrosis/tumor recurrence, we hypothesized that feature trajectories continuously evolve towards a binary state, which can be modeled as a spatiotemporally continuous process. We then referred to this derivative evolution as a feature flow $h(t)$, which can be represented using Neural Ordinary Differential Equations (Neural ODEs) [38]. In a conventional Neural ODE solver [39], as shown in Equation 1, an initial condition $h(t_0)$, a sequence of time points $t$ to solve for, and a function $f$ representing the right-hand-side of the differential equation are required,
$$\frac{dh(t)}{dt} = f\big(h(t), t\big) \tag{1}$$
Unlike conventional Neural ODE modeling, a Heavy Ball Neural ODE (HBNODE) solver deviates by replacing the first-order ODE with a heavy ball ODE (HBODE), which is a second-order ODE incorporating an appropriate damping term [34],
$$\frac{d^2 h(t)}{dt^2} + \gamma \frac{dh(t)}{dt} = f\big(h(t), t\big) \tag{2}$$
Here, $\gamma$ is the damping parameter, which can be set as a tunable or a learnable hyperparameter with positivity constraint. Equivalently, the HBNODE initial value problem (Equation 2) could be rewritten by the system of Equations 3,
$$\frac{dh(t)}{dt} = m(t), \qquad \frac{dm(t)}{dt} = -\gamma\, m(t) + f\big(h(t), t\big) \tag{3}$$
subject to an initial condition $\big(h(t_0), m(t_0)\big)$.
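Equation 4 represents the relationship between the feature flow $h(t)$ and the momentum $m(t) = dh(t)/dt$,
$$h(t) = h(t_0) + \int_{t_0}^{t} m(s)\, ds \tag{4}$$
In this work, $t = 0$ and $t = 1$ represent the initial and final stages of $h(t)$, respectively, while $t$ serves as the transitional stage variable (i.e., “time”) between 0 and 1. Let $h(0)$ denote the initial position and $h(t)$ represent the intermediate feature distribution, then Equation 5 illustrates the final feature representation $h(1)$,
$$h(1) = h(0) + \int_{0}^{1} m(t)\, dt \tag{5}$$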
Equation 3 forms the core function of our mathematical formulation, representing the continuous dynamics of the feature flow $h(t)$, parametrized by a neural network $f$ with initial position $h(0)$ and momentum $m(0)$. The visualization of the deep neural network (DNN) behavior is achieved by solving the HBNODE and acquiring $h(t)$ at multiple stages. Such visualization provides visual clues of feature space evolution during DNN implementation, serving as a natural semantics to explain deep neural network behavior that can be understood through human experience [28 40 41].
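To make the idea of retrieving intermediate states concrete, the following minimal sketch integrates a heavy ball ODE in its first-order form (position $h$, momentum $m$, damping $\gamma$) from $t = 0$ to $t = 1$ and keeps every intermediate $h(t)$. It rests on several assumptions that are not from this study: a fixed-step classical Runge-Kutta (RK4) scheme is swapped in for the adaptive Dormand–Prince solver used in the study, `toy_f` is a hypothetical hand-written function standing in for the trained neural network, and the three-dimensional state is arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def add_scaled(x, scale, y):
    """Return x + scale * y, element-wise."""
    return [xi + scale * yi for xi, yi in zip(x, y)]

def rhs(h, m, t, f, gamma):
    """First-order form of the heavy ball ODE: dh/dt = m, dm/dt = -gamma*m + f(h, t)."""
    force = f(h, t)
    return list(m), [-gamma * mi + fi for mi, fi in zip(m, force)]

def solve_hbnode(h0, m0, f, gamma, n_steps=100):
    """Integrate from t=0 to t=1 with fixed-step RK4, keeping every intermediate h(t)."""
    dt = 1.0 / n_steps
    h, m = list(h0), list(m0)
    trajectory = [list(h)]
    for step in range(n_steps):
        t = step * dt
        k1h, k1m = rhs(h, m, t, f, gamma)
        k2h, k2m = rhs(add_scaled(h, dt / 2, k1h), add_scaled(m, dt / 2, k1m), t + dt / 2, f, gamma)
        k3h, k3m = rhs(add_scaled(h, dt / 2, k2h), add_scaled(m, dt / 2, k2m), t + dt / 2, f, gamma)
        k4h, k4m = rhs(add_scaled(h, dt, k3h), add_scaled(m, dt, k3m), t + dt, f, gamma)
        h = [hi + dt / 6 * (a + 2 * b + 2 * c + d) for hi, a, b, c, d in zip(h, k1h, k2h, k3h, k4h)]
        m = [mi + dt / 6 * (a + 2 * b + 2 * c + d) for mi, a, b, c, d in zip(m, k1m, k2m, k3m, k4m)]
        trajectory.append(list(h))
    return trajectory

def toy_f(h, t):
    """Hypothetical stand-in for the neural network: pulls h toward the point (1, 0, 0)."""
    target = [1.0, 0.0, 0.0]
    return [ti - hi for ti, hi in zip(target, h)]

gamma = sigmoid(-3.0)  # damping at the initialization theta = -3 given in the Methods
trajectory = solve_hbnode([0.0, 0.5, 0.5], [0.0, 0.0, 0.0], toy_f, gamma)
```

With `n_steps=100` the returned list holds the initial state plus 100 intermediate states, matching the 0.01 step size at which results were saved in this study; any of these states can be inspected or plotted.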
II. HBNODE Model Design
Figure 2 shows our workflow. As shown in (A), a 3D volume-of-interest (VOI) centered on each BM was first determined on the 3-month post-SRS high-resolution T1+c scan. Specifically, each original 3D MR image was cropped into a 6.4 × 6.4 × 6.4 cm3 VOI, with the target located at the center and V60% (derived from the SRS plan dose distribution) fully contained in the VOI. Data augmentation, including rotation, translation, reflection, and noise adjustment, was applied with different augmentation factors for tumor recurrence and radionecrosis to prevent potential overfitting due to the imbalanced dataset. A bespoke DNN resembling the U-net’s [42] encoding path was then trained for radionecrosis/tumor recurrence prediction using the 3D VOI. The DNN feature extraction part is a stack of 4 convolutional blocks. Each block consists of 2 convolutional layers, followed by a rectified linear unit (ReLU) and a max-pooling operation. Every convolutional layer is performed with a filter of size 3 × 3 × 3, a stride of 1, and padding. The max-pooling operation is conducted over a 2 × 2 × 2 window with a stride of 2. Prior to the binary prediction output, latent variables in the DNN were extracted as 1024 deep features.
Fig 2.

The workflow of XAI model design. (A) Feature extraction and fusion process; (B) The conceptual design of the proposed HBNODE model architecture. The model describes the feature flow from 3 different sources (I, G, C) evolving from the initial stage (t = 0) to the final stage (t = 1); (C) the non-parametric model design that generates the final prediction results.
The deep features, genomic features, and clinical features were fused and then fed into an HBNODE solver (Fig. 2B). To overcome the dimensionality mismatch problem that arises when fusing data from various sources, we employed our group’s positional encoding (PE) method as a vector-growing scheme for the optimized feature space sizes [43]. The red, orange, and blue bands demonstrate the feature flow of image, genomic, and clinical features, respectively. The gradient color of each DNN block represents the increasing integration interval of “time” $t$.
Conventional CNNs operate with a finite number of layers, leading to discrete model evolution and an opaque, inexplicable decision-making process. In contrast, the HBNODE solver is founded on a spatiotemporally continuous approach, yielding an infinite number of intermediate states corresponding to different selections of $t$. We hypothesize that these intermediate states contain valuable insights elucidating the model’s decision-making process and enhancing predictive accuracy. In practice, throughout the stepwise derivative evolution in solving HBNODE, each data sample’s trajectory within the Image-Genomic-Clinical (I-G-C) space can be calculated. To explain how each type of input feature is utilized, a decision-making field $F$ was then reconstructed in this space, with its gradient vectors directing the trajectories and intensities showing the potential. The evolution of $F$ reflects the cumulative feature contributions at intermediate states to the final diagnosis.
Given the theoretically infinite intermediate states retrievable from the HBNODE solver, and the potential variance in information across states regarding the model’s decision, selecting the most informative and relevant stages (i.e., “key states”) of the decision-making process is crucial. To identify key states, we first designed a velocity curve from the newly available trajectories compared to previous trajectories at predefined time intervals. Subsequently, we computed the gradient from the trajectory, where time steps at local maxima of the velocity curve (where $\nabla F = 0$ at training sample locations) are hypothesized to contribute most to the model’s decision making. Mathematically, the relationship between the decision-making field $F$ and velocity can be described using Equation 6,
| (6) |
A non-parametric model (Fig. 2C) then aggregated the optimal solutions from key intermediate states to predict radionecrosis or recurrence. Key intermediate states were then synthesized via an ensemble learning scheme to generate the final prediction results using a support vector machine (SVM) model with a linear kernel. A 5-fold cross-validation within the training set was employed to further refine the model and validate its performance.
III. Positional Encoding
To obtain comprehensive insights into patient treatment responses, we collected image-based features extracted from 3D MRI scans alongside non-image features encompassing clinical characteristics and genomic profiles. However, simply concatenating low-dimensional clinical/genomic features with high-dimensional image features can lead to a dimensionality problem [44], as the prominence of deep image features may overshadow the contributions of clinical and genomic features, resulting in imbalanced model outputs from individual data sources. We therefore employed a positional encoding (PE) scheme [44 45] to expand clinical/genomic features into an embedding space of optimal dimension. Essentially, PE minimized discrepancies between the two feature sources within a Hilbert space, thereby emphasizing their complementary roles in the combined probability density function.
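The vector-growing idea can be illustrated with a minimal sketch. The exact PE formula used in this work is given in the cited references rather than in this section, so the sinusoidal form below (sine and cosine at doubling frequencies, as commonly used in the positional encoding literature) is an assumption, as are the function names, the choice of four frequencies, and the example feature values.

```python
import math

def positional_encode(x, n_freqs):
    """Expand one scalar into 2 * n_freqs values using sines and cosines at doubling frequencies."""
    out = []
    for k in range(n_freqs):
        out.append(math.sin((2 ** k) * math.pi * x))
        out.append(math.cos((2 ** k) * math.pi * x))
    return out

def encode_features(features, n_freqs):
    """Encode every low-dimensional feature and concatenate the results."""
    encoded = []
    for x in features:
        encoded.extend(positional_encode(x, n_freqs))
    return encoded

# Hypothetical example: seven clinical features already scaled to [0, 1].
clinical = [0.0, 1.0, 0.5, 0.25, 1.0, 0.0, 0.75]
encoded = encode_features(clinical, n_freqs=4)  # 7 features -> 56 dimensions
```

Raising `n_freqs` grows the embedding, which is the lever for bringing seven clinical or seven genomic features closer in dimensionality to the 1024 deep image features.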
Model Evaluation and Comparison Study
All calculations in this study were carried out in Python 3.7 with eight Intel Xeon Silver 4112 CPU cores @ 2.6 GHz, 32 GB RAM, and an Nvidia Quadro P5000 graphics card (FP32 computation power: 8.873 teraflops). The HBNODE model was developed with the TorchDyn library [46] in a PyTorch environment with an 8:2 ratio of training/test set assignment and a standard data augmentation factor of 30. The Adam optimizer [47] and the Dormand–Prince-45 ODE solver were applied. The damping parameter was set to be $\gamma$ = sigmoid(θ), where θ is a trainable weight initialized as θ = −3. During training, $h(t)$, F, and ∇F results were saved at a step size of 0.01 from t = 0 to t = 1 (i.e., 100 time steps were simulated in this model), and velocities were computed based on consecutive trajectories. Five model versions were trained with random validation sample assignments to assess model robustness. Performance evaluation included sensitivity, specificity, accuracy, and ROC analysis, with DeLong’s test employed when applicable [48].
To quantify the improvements of our model designs, two other deep learning models were trained and compared with the HBNODE results: 1) a DNN using only image-based features without the HBNODE design, and 2) a DNN using combined ‘I+G+C’ features without the HBNODE design. Both models employed a 17-layer 3D convolutional neural network architecture [49] as the backbone. For model training, the Stochastic Gradient Descent (SGD) optimizer with a learning rate of $10^{-7}$ was chosen. The model weights were initialized using the Glorot (Xavier) initialization method. Additionally, the model was optimized by minimizing the mean absolute error (MAE), ensuring that the model learns effectively and generalizes well to new data. The same training/test groups were used in these 2 model training processes. The aforementioned four evaluators from each model were then compared side-by-side for comprehensive analysis.
Explainability Analysis
To quantify HBNODE feature utilization and enhance its explainability, we first employed the Shapley Additive Explanations (SHAP) method [50], which facilitates the analysis of Shapley values [51] for each individual feature within the dataset utilized for training and testing a model. In this study, we used the SHAP library to calculate the Shapley values of clinical and genomic features, determining their importance to the model and how they affect the outcome. By utilizing reverse projection from positionally-encoded features back to the original clinical and genomic features, each feature in the embedding space could be attributed to a specific original feature. Consequently, for each clinical and genomic feature, the average Shapley values of its corresponding positionally-encoded features represented its importance in the final output of HBNODE. This process allows Shapley values to effectively attribute the final prediction of the trained model to its base features, providing a clear and interpretable measure of feature importance and therefore model explainability.
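The reverse-projection step can be sketched as a simple group-wise average. This sketch does not call the SHAP library; it assumes the per-dimension Shapley values have already been computed, and the function name `aggregate_shap`, the feature names, and the numbers are hypothetical.

```python
def aggregate_shap(encoded_shap, owner):
    """Average the Shapley values of encoded dimensions back onto their original feature.

    encoded_shap: one Shapley value per positionally-encoded dimension.
    owner: for each encoded dimension, the name of the original feature it came from.
    """
    totals, counts = {}, {}
    for value, name in zip(encoded_shap, owner):
        totals[name] = totals.get(name, 0.0) + value
        counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}

# Hypothetical example: two original features, each expanded into two encoded dimensions.
owner = ["age", "age", "kps", "kps"]
encoded_shap = [1.0, 2.0, 0.5, 1.5]
importance = aggregate_shap(encoded_shap, owner)
```

Keeping an explicit `owner` list alongside the encoding is one straightforward way to make the mapping from embedding dimensions to base features reversible.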
In addition to the SHAP analysis that evaluates the overall importance of each individual feature, we expanded our investigation to include the feature contributions of each data category from a broader and temporal perspective. The HBNODE implementation enables the tracking of model dynamics over time, and the category-wise feature contributions can be derived from the temporal evolution of the I-G-C feature space. Specifically, we analyzed the decision-making field $F$ at each time step to assess both its potential and gradient fields. Our hypothesis suggested that a stronger contribution would correlate with higher potential values and a more extensive distribution of gradient vectors. By examining the decision-making field $F$, we could semi-quantitatively determine the relative contribution $R_c(t)$ of a feature class $c$ (where $c$ could be $I$, $G$, or $C$). Mathematically, the absolute contribution $A_c(t)$ of a feature category $c$ at time $t$ is calculated as the sum of all gradient vector projections along its corresponding dimension in $F$. The relative contribution at each time point is then normalized as follows:
$$R_c(t) = \frac{A_c(t)}{A_I(t) + A_G(t) + A_C(t)} \tag{7}$$
Therefore, the relative contributions from imagery, clinical, and genomic features can be plotted as three curves evolving over time, allowing for a quantitative comparison of their contributions.
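The normalization described above can be sketched for a single time step as follows. The sketch assumes each gradient vector of the decision-making field is available as an (I, G, C) triple at a sample location, and it sums the magnitudes of the projections so that opposing directions do not cancel; that choice, the function name, and the numbers are assumptions for illustration.

```python
def relative_contributions(gradients):
    """Relative contribution of the I, G and C categories at one time step.

    gradients: list of (gI, gG, gC) gradient vectors of the decision-making field.
    """
    names = ("I", "G", "C")
    # Absolute contribution: summed projection magnitude along each category axis.
    absolute = [sum(abs(g[axis]) for g in gradients) for axis in range(3)]
    total = sum(absolute)
    if total == 0:
        raise ValueError("all gradient vectors are zero; contributions are undefined")
    return {name: value / total for name, value in zip(names, absolute)}

# Hypothetical gradient vectors at two sample locations.
contrib = relative_contributions([(1.0, 1.0, 2.0), (-1.0, 1.0, 2.0)])
```

Evaluating this function at every saved time step yields the three curves whose values sum to one at each time point.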
3. Results
Figure 3 and Table 2 provide a summary of the quantitative results, including sensitivity, specificity, accuracy, and AUCROC comparisons. As seen, the DNNimage-only model (blue curve in Fig. 3) achieved an acceptable AUCROC of 0.71±0.05; however, it demonstrated imbalanced sensitivity/specificity results, resulting in a relatively low accuracy of 0.61±0.11. Integrating positionally-encoded clinical and genomic features improved the performance of the model (orange curve in Fig. 3) with an AUCROC = 0.81±0.02 and accuracy = 0.77±0.005, albeit with compromised sensitivity of 0.58±0.11. This indicated that the simple concatenation of features from three data sources did not yield the optimal combination of feature contributions. Due to the lack of model explainability, it was challenging to discern the contribution from each feature category separately. In contrast, the proposed HBNODE model (red curve in Fig. 3) achieved competitive performance, with an AUCROC = 0.88±0.04, sensitivity = 0.79±0.02, specificity = 0.89±0.01, and accuracy = 0.84±0.01, where the uncertainties represent standard deviations. This underscores the effective utilization of key intermediate states identified by the HBNODE solver through ensemble learning. Fig. 3 illustrates the ROC curves of the three investigated models with standard deviations shown as shaded bands. The HBNODE ROC curve (red) demonstrates the best performance compared to the other two studied models (I+C+G model p=0.091; DNNimage-only model p=0.086).
Fig. 3.

The ROC curve results for the studied models. Shaded areas represent standard deviations.
Table 2:
RN/TR diagnosis results from all 3 investigated models
| Model | AUCROC | ACC. | SEN. | SPE. |
|---|---|---|---|---|
| HBNODE | 0.88±0.04 | 0.84±0.01 | 0.79±0.02 | 0.89±0.01 |
| DNNimage-only | 0.71±0.05 | 0.61±0.11 | 0.66±0.32 | 0.59±0.27 |
| I+C+G Model | 0.81±0.02 | 0.77±0.05 | 0.58±0.11 | 0.86±0.09 |
Figures 4(A) and 4(B) illustrate the trajectories of training and testing data samples within the I-G-C space at six different time steps. At the initial stage ($t = 0$), the tumor recurrence (TR, blue) and radionecrosis (RN, red) cases are closely positioned. As time progresses, the intermediate states exhibit improved separation between the two groups. By the final stage ($t = 1$), the data shows clear and distinct clustering. These trajectories indicate that the HBNODE model can make good predictions at intermediate states, not just at the final state ($t = 1$). This observation aligns with our initial hypothesis that several intermediate key states contain more relevant and predictive information. Animations of the trajectories over time are available as Supplementary files.
Fig. 4.

(A) trajectory plot of the training data sample within the I-G-C space at six different time steps; (B) trajectory plot of the testing data sample within the I-G-C space at six different time steps.
Fig. 5 demonstrates the selection of key intermediate states based on the gradient from the trajectory. The green curve illustrates the velocity of all training samples over time, with local maxima (where $\nabla F = 0$) determined via peak detection after 7-point median filtering (no preset peak numbers). The corresponding time steps $t$ are then considered key states of the HBNODE model. It is worth mentioning that both the tumor recurrence velocity curve (blue) and the radionecrosis velocity curve (orange) exhibit similar shapes to the green curve. Consequently, bias is mitigated as the model treats tumor recurrence and radionecrosis in a comparable manner.
Fig. 5.

Velocity results of all True Recurrence (blue), Radionecrosis (yellow) and all combined (green) training samples at t:0→1. The identified key states are marked with red crosses.
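The key-state selection can be illustrated with a minimal sketch: a velocity curve is computed from consecutive saved states, smoothed with a 7-point median filter, and searched for local maxima without presetting their number. The function names, the edge handling of the filter (shortened windows), the plateau rule (take the middle index), and the toy curve are assumptions; the study's own peak detector may differ.

```python
import math
import statistics

def velocity_curve(trajectories, dt):
    """Mean speed over all samples between consecutive saved states.

    trajectories[s][k] is the feature vector of sample s at saved time step k.
    """
    curve = []
    for k in range(1, len(trajectories[0])):
        speeds = []
        for traj in trajectories:
            step = math.sqrt(sum((a - b) ** 2 for a, b in zip(traj[k], traj[k - 1])))
            speeds.append(step / dt)
        curve.append(statistics.mean(speeds))
    return curve

def median_filter(values, window=7):
    """Sliding median; windows are shortened at both ends of the curve."""
    half = window // 2
    return [statistics.median(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]

def local_maxima(values):
    """Indices of local maxima; a flat-topped peak is reported at its middle index."""
    peaks, i, n = [], 1, len(values)
    while i < n - 1:
        if values[i] > values[i - 1]:
            j = i
            while j + 1 < n and values[j + 1] == values[i]:
                j += 1
            if j + 1 < n and values[j + 1] < values[i]:
                peaks.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return peaks

# Hypothetical velocity curve with a single bump centred on index 10.
toy_velocity = list(range(11)) + list(range(9, -1, -1))
key_states = local_maxima(median_filter(toy_velocity))
```

With a step size of 0.01, a peak index `k` in the velocity curve corresponds approximately to the time step t = 0.01 × (k + 1), since the first velocity value is computed between the first two saved states.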
Figure 6 shows the decision-making fields $F_{\text{I-C}}$, $F_{\text{I-G}}$, and $F_{\text{C-G}}$ reconstructed in the I-C, I-G, and C-G planes, respectively. Each feature combination’s $F$ at 6 different time steps is depicted, with gradient vectors guiding the trajectories and background color reflecting intensities (corresponding to potential values). In Fig. 6(A), a notable vertical alignment of vectors is observed at t = 0, 0.1, and 0.25, indicating clinical features’ predominant influence in early decision-making stages. By t = 0.5, imagery and clinical features exhibit roughly equal contributions, resembling concentric circles. Subsequently, horizontal vectors progressively dominate, signifying the transition to imagery feature predominance. Fig. 6(B) displays similar trends, with genomic features gradually ceding dominance to imagery features. However, the overall intensities are lower than those observed in $F_{\text{I-C}}$. In Fig. 6(C), vertical vectors maintain a majority presence over horizontal ones, suggesting that genomic features are overshadowed by clinical features in decision-making. Notably, the $F_{\text{C-G}}$ field exhibits the lowest intensities among the three fields. In summary, such pairwise comparison indicates that genomic features have a comparatively limited contribution to radionecrosis/tumor recurrence prediction results compared to clinical and imagery features. The construction of these decision-making fields not only enables visualization of the model’s decision-making process over time but also allows for a semi-quantitative evaluation of each feature category’s overall contribution.
Fig. 6.

(A) the decision-making field $F_{\text{I-C}}$ at six different time steps; (B) the decision-making field $F_{\text{I-G}}$ at six different time steps; (C) the decision-making field $F_{\text{C-G}}$ at six different time steps.
To gain a more quantitative understanding of the model’s input utilization, we computed the relative contributions of each feature group at each time step and plotted them in Figure 7. The results show that clinical features have the highest contribution during the first half of the time, after which imagery features become the dominant contributors for the remainder of the process. Genomic features never dominate, indicating their relatively limited contribution to the model’s decision-making. This observation is consistent with our findings from Figure 6. The relative contributions derived from the decision-making field provide valuable insights into the model’s intrinsic dynamics and data usage over time. Table 3 summarizes the Shapley values of clinical and genomic features of the investigated HBNODE model. Among the clinical features, patient age exhibits the highest Shapley value, followed by chemo/targeted therapy use and immunotherapy use. For genomic features, PD-L1 demonstrates the highest Shapley value. Nevertheless, no overwhelming numerical differences were observed in the reported Shapley values.
Fig. 7.

Relative feature group contribution of I versus C versus G from the decision-making field $F_{\text{I-C-G}}$.
Table 3.
The original Shapley values of the studied features.
| Clinical Feature (C) | Shapley Value (C) | Genomic Feature (G) | Shapley Value (G) |
|---|---|---|---|
| Age | 1.29 | ALK | 1.15 |
| Chemo/targeted therapy use | 1.26 | BRAF | 1.26 |
| Dose | 1.15 | EGFR | 1.12 |
| Immunotherapy | 1.23 | KRAS | 1.20 |
| KPS | 1.20 | NRAS | 1.17 |
| BM Location | 1.18 | PD-L1 | 1.29 |
| Steroid use | 1.23 | RET | 1.23 |
4. Discussion
In this study, a novel deep neural model for discerning BM post-SRS radionecrosis from tumor recurrence using radiogenomic data was developed. To our knowledge, this is the first study to both propose a model for radionecrosis prediction using deep learning techniques and to provide insight and explainability of the model inputs (Fig. 6 and 7). Although SRS is generally well-tolerated, approximately 20–30% of patients will demonstrate radiographic progression; when such progression occurs, it is crucial to distinguish whether this represents true disease progression (which may necessitate further SRS or alteration in systemic therapy) or radionecrosis (in which case further SRS is contraindicated) [52]. Radionecrosis can radiographically mimic tumor recurrence in the post-SRS follow-up MRs [9], and clinicians currently lack non-invasive modalities to reliably distinguish radionecrosis from tumor recurrence. Efforts to develop reliable non-invasive biomarkers for radionecrosis remain a leading area of investigation in neuro-oncology. Our work addresses this unmet clinical need and offers a potential solution to guide clinical decision-making. The best-performing model, incorporating imagery features, genomic biomarkers, and non-image clinical parameters, achieved competitive predictive performance, with an AUCROC of 0.88±0.04, sensitivity of 0.79±0.02, specificity of 0.89±0.01, and accuracy of 0.84±0.01. Compared to the current literature in both radiomics (sensitivity: 0.65~0.70, AUC: 0.73~0.81) and deep learning-based approaches (sensitivity: ~0.75, AUCROC: ~0.85) [16–18, 24], our method’s performance ranks among the best reported for differentiating post-SRS brain metastasis radionecrosis from true recurrence.
Although the observed p-values in DeLong’s tests (0.091 and 0.086) did not reach the significance threshold after a Holm-Bonferroni correction [53], likely because of the small sample size (albeit a substantial one for BM radionecrosis studies), the improvements in model performance statistics remain noteworthy. This is further supported by the ROC results shown in Figure 3.
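The Holm-Bonferroni step-down adjustment mentioned above can be illustrated with a minimal sketch. It assumes that the two reported DeLong p-values form the complete family of comparisons and that the significance level is 0.05; neither assumption is confirmed in this section, and the function name `holm_bonferroni` is introduced here for illustration only.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down adjustment.

    Returns (adjusted_p, reject), both in the original order of p_values.
    """
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiplier is m for the smallest p-value, m-1 for the next, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must be non-decreasing along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    reject = [p <= alpha for p in adjusted]
    return adjusted, reject


# Assumption: the family consists of exactly the two p-values quoted in the text.
adjusted, reject = holm_bonferroni([0.091, 0.086])
print(adjusted, reject)
```

Under these assumptions both adjusted p-values are about 0.172, so neither comparison is declared significant at 0.05, in line with the statement above.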
One of the key innovations of this work is the adoption of the 2nd-order heavy ball ODE in deep neural network architecture design for governing the hypothesized spatiotemporal continuity of neural network behavior. Traditional CNN models have finite layers and discrete spatial and depthwise representations, which limit their ability to fully capture data complexity [32, 33]. In contrast, neural ODE models define the forward inference pass as the solution of an initial value problem, which allows for a continuous-depth framework and makes NODEs well suited whenever the underlying dynamics are known to evolve according to differential equations [39]. Therefore, NODE emerges as a natural candidate for our hypothesized spatiotemporally continuous classification modeling: it involves only one independent variable of the system (i.e., “time”), and the function employed in the numerical solver of a NODE can be specified as nearly any differentiable function in a regular supervised learning procedure [28]. While the NODE framework offers certain advantages for visualization and explainability, it is not meant to imply superiority over other neural network architectures in terms of theoretical approximation capabilities.
Instead of traditional NODEs, our design adopted an HBNODE model, which leverages the continuous limit of classical momentum-accelerated gradient descent to improve NODE training and inference [34]. By integrating image-derived deep features, genomic biomarkers, and non-image clinical parameters into a synthesized latent feature space, our HBNODE model tracks the trajectory of each data sample within the deep space after its interaction with the deep neural network. The ability to directly visualize these trajectories provides a new perspective on the utilization of input data: it reveals how the neural network differentially interprets image, genomic, and clinical features across various intermediate states. To our knowledge, this is the first time that investigators have been able to illuminate the nuanced ways in which deep models engage with complex, multidimensional data samples, providing a new route to establishing radiogenomic explainability for deep learning.
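To make the second-order dynamics concrete, the following minimal sketch integrates the heavy ball ODE, h'' + γh' = f(h, t), after rewriting it as the first-order system h' = m, m' = −γm + f(h, t), and stores every intermediate state so that a sample's trajectory can be inspected. The dynamics function `f`, the damping value `gamma`, the zero initial momentum, the three-dimensional state, and the fixed-step integrator are all illustrative assumptions; they are not the trained network or the solver used in this study.

```python
import numpy as np


def heavy_ball_trajectory(h0, f, gamma=0.5, t_end=1.0, n_steps=50):
    """Integrate h'' + gamma * h' = f(h, t) with a fixed-step scheme.

    Returns an array of shape (n_steps + 1, dim) holding every intermediate state.
    """
    h = np.asarray(h0, dtype=float)
    m = np.zeros_like(h)  # assumption: the momentum starts at zero
    dt = t_end / n_steps
    states = [h.copy()]
    for k in range(n_steps):
        t = k * dt
        # Semi-implicit Euler: update the momentum first, then the state.
        m = m + dt * (-gamma * m + f(h, t))
        h = h + dt * m
        states.append(h.copy())
    return np.stack(states)


# Toy stand-in for the learned dynamics: one tanh layer with random weights.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(3, 3))
b = rng.normal(scale=0.1, size=3)


def f(h, t):
    return np.tanh(W @ h + b)


# One hypothetical sample embedded in a 3-D latent space.
traj = heavy_ball_trajectory(np.array([0.2, -0.1, 0.4]), f)
print(traj.shape)  # (51, 3): the initial state plus 50 intermediate states
```

Each row of `traj` is one intermediate state, which is the kind of per-sample path that the trajectory visualizations rely on.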
Throughout the stepwise derivative evolution in solving the HBNODE, the trajectory of each data sample within the I-G-C space can be calculated, which enables an ensemble design by identifying key intermediate states along the integration interval. It is important to note that the dynamic computations in our HBNODE model are purely an internal characteristic of the model and do not reflect input data collected at multiple time points. Unlike conventional CNNs that generate only one output state, the spatiotemporal continuity of NODE models allows for an infinite number of intermediate states, with some potentially being representative of others. To identify these key states, we used velocity as a metric. The velocity, derived from consecutive trajectory states, represents the extent to which the model’s decision-making process varies. By hypothesizing that local maxima in velocity indicate a significant change in model prediction, we considered such intermediate states to be meaningful as ‘key intermediate states’, providing a basis for ensemble learning. As suggested by prior studies, the generalization ability of an ensemble is usually much stronger than that of a single learner [43, 54]. Thus, the ensemble of all key intermediate states is hypothesized to outperform a single model with the same input. Compared with the single model, our non-parametric ensemble model demonstrated superior performance, confirming that the key states complement each other towards a better output.
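The key-state selection described above can be sketched as follows. The sketch assumes that velocity is the mean Euclidean displacement of all samples between consecutive states, that a key state is the state reached at a strict local maximum of that curve, and that the non-parametric aggregation is a plain average of per-state probabilities. These are illustrative choices, and all function names are hypothetical.

```python
import numpy as np


def trajectory_velocity(traj, dt=1.0):
    """traj has shape (n_states, n_samples, n_dims).

    Returns the mean speed of all samples for each of the n_states - 1 steps.
    """
    step = np.diff(traj, axis=0)                # displacement between states
    speed = np.linalg.norm(step, axis=-1) / dt  # shape (n_states - 1, n_samples)
    return speed.mean(axis=1)


def local_maxima(curve):
    """Indices of strict interior local maxima of a 1-D curve."""
    c = np.asarray(curve)
    interior = np.arange(1, len(c) - 1)
    mask = (c[1:-1] > c[:-2]) & (c[1:-1] > c[2:])
    return interior[mask]


def average_ensemble(probs_per_state, key_states):
    """Assumed aggregation: average the per-state probabilities of the key states."""
    return probs_per_state[key_states].mean(axis=0)


# Synthetic 1-D trajectory of a single sample with two bursts of movement.
speeds = np.array([1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 4.0, 2.0, 1.0])
positions = np.concatenate([[0.0], np.cumsum(speeds)])
traj = positions.reshape(-1, 1, 1)              # (10 states, 1 sample, 1 dim)

velocity = trajectory_velocity(traj)
peaks = local_maxima(velocity)
print(peaks)  # [2 6]

# Velocity index k describes the step from state k to state k + 1, so the
# state reached at each peak is taken as the key state (an assumption).
key_states = peaks + 1
probs_per_state = np.linspace(0.0, 0.9, 10).reshape(10, 1).repeat(2, axis=1)
ensemble_prob = average_ensemble(probs_per_state, key_states)
print(ensemble_prob)
```

Averaging is only one possible non-parametric rule; a majority vote over the key states would fit the same interface.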
In addition to the enhanced performance achieved by assembling the key intermediate states, the construction of a decision-making field increased the model’s explainability. This approach allowed for both qualitative and quantitative investigation of feature utilization by the deep learning model. More specifically, the gradient vectors and potential fields derived from F represent the model’s intrinsic dynamics from the initial to the final stage. By examining the vector directions and background intensities, one can semi-quantitatively analyze the model’s input utilization. A pairwise comparison helped determine the relative contribution of each feature category at different time steps. Subsequently, each category’s relative contribution was calculated from the gradient vector projections, providing a quantitative assessment of how feature importance changes over time. Consistent with real-world clinical practice, our work found that clinical features contributed most during the first half of the time course, while radiographic features dominated in the latter half. Consequently, the developed HBNODE model not only characterizes the behavior of deep neural networks, but also raises confidence in its application in clinical settings.
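One simple way to turn gradient vector projections into category-level shares is sketched below. It assumes that the decision-making field is sampled on a regular three-dimensional grid whose axes correspond to the image, genomic, and clinical directions, and that a category's share is its mean absolute gradient component divided by the sum over all three axes. The pairwise comparison procedure used in this work is not reproduced, and the two field snapshots are synthetic.

```python
import numpy as np


def relative_contributions(field, spacing=1.0,
                           names=("image", "genomic", "clinical")):
    """Share of the mean absolute gradient carried by each axis of the field."""
    grads = np.gradient(field, spacing)  # one gradient array per axis
    magnitudes = np.array([np.abs(g).mean() for g in grads])
    total = magnitudes.sum()
    if total == 0:
        # A flat field has no preferred direction.
        return {name: 0.0 for name in names}
    return {name: float(m / total) for name, m in zip(names, magnitudes)}


# Synthetic snapshots of a field at two time steps (illustrative values only).
axis = np.arange(5, dtype=float)
img, gen, cli = np.meshgrid(axis, axis, axis, indexing="ij")
snapshots = {
    "early": 1.0 * img + 1.0 * gen + 2.0 * cli,  # clinical axis is steepest
    "late": 3.0 * img + 0.5 * gen + 0.5 * cli,   # image axis is steepest
}

shares = {label: relative_contributions(f) for label, f in snapshots.items()}
for label, share in shares.items():
    print(label, share)
```

In this toy setup the clinical axis carries half of the gradient magnitude in the early snapshot and the image axis carries three quarters in the late one, which mirrors the qualitative pattern described above without reproducing any study data.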
We also investigated feature importance within each feature category using SHAP analysis. Among clinical features, age, chemo/targeted therapy, and immunotherapy emerged as the most important factors contributing to the model’s predictions, aligning with prior studies regarding the correlation with radionecrosis outcomes post-SRS [55–58]. Although genomic features were the least important inputs, the most important genomic feature identified by SHAP, PDL-1, has been previously associated with clinical outcomes following SRS [59, 60]. The SHAP analysis further confirms the model’s effective utilization of input data and boosts confidence in its results. However, Shapley values, while providing valuable insights into feature importance, have limitations. They reflect the overall contribution of each feature to the final prediction but do not capture the dynamic evolution of the model’s decision-making process. Unlike the continuous insights depicted in Figure 6, Shapley values are calculated at discrete points, leaving some aspects of feature dynamics unaddressed. Our proposed F-based method analyzes the contribution of clinical and genomic features at a group level rather than comparing each feature individually. However, to better capture temporal feature contributions, we recognize the need for a new method that operates synchronously with HBNODE. Resolving this issue is of significant clinical interest, although it involves more complex mathematical modeling to project the collapsed features back to their original clinical/genomic forms, resulting in a higher computational cost. Addressing this challenge is part of our ongoing research efforts and is expected to yield a more comprehensive analysis than SHAP. Additionally, identifying the contribution of individual image features and relating them back to the original image representation would be a valuable next step beyond the category-based contributions shown in Figure 7.
This advancement is also part of our future research roadmap, which will require a more detailed mathematical framework to move from category-level to individual feature-level analysis within the HBNODE model.
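As a reminder of what a Shapley value measures, and of why it summarizes a feature's contribution to a single final prediction rather than its evolution over time, the sketch below computes exact Shapley values by enumerating all feature subsets for a toy three-feature model. It does not use the SHAP library; the model, its weights, and the all-zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one input x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


# Hypothetical linear score over three features (weights are made up).
def toy_model(z):
    return 2.0 * z[0] + 1.0 * z[1] - 3.0 * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [2.0, 1.0, -3.0] for this linear model
```

The values sum to the difference between the model output at `x` and at the baseline, a single number per feature for one prediction, with no notion of intermediate states.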
This study utilized radiographic, clinical, and genomic data to predict BM SRS outcomes with enhanced interpretability. Future studies should expand the current dataset with additional data inputs to improve the training of the proposed HBNODE model and provide deeper insights into the contribution of each data source. Furthermore, this retrospective cohort was limited to patients with NSCLC, potentially limiting the generalizability of these findings. To address this limitation and enhance the robustness of predictive models for BMs, future research should incorporate additional histologies. The small cohort size from a single institution presents another notable limitation. The limited dataset size may have contributed to the discrepancies in clustering behavior between training and testing cases, particularly for the RN class, where testing samples formed multiple distinct clusters. This could reflect unmodeled heterogeneity within the RN group that the model is beginning to differentiate over time. Additionally, while our proposed model demonstrated trends suggestive of improved performance, the small sample size prevented the results from achieving statistical significance. Validation with larger, independent datasets will be essential to confirm these findings, improve robustness, and establish statistical confidence. We therefore plan to prospectively validate this model in collaboration with other institutions using a standardized data collection protocol. High-performance computing (HPC) resources will be essential to handle the substantial computational demands of these prospective studies. Despite these limitations, the current HBNODE model represents an advancement over existing radiomic and DNN models, with potential clinical implications for improved non-invasive radionecrosis detection and patient management.
5. Conclusion
To our knowledge, this is the first report of a novel deep neural model that distinguishes BM radionecrosis from tumor recurrence within an explainable AI (XAI) framework. The results demonstrate that the non-parametric model achieved an AUCROC result of 0.88±0.04 and a sensitivity of 0.79±0.02, showing competitive performance relative to the alternative models presented in the comparison. The model performance warrants further validation and suggests its potential application across various XAI domains.
Funding Statement:
This work is partially supported by NIH P30 CA014236, NIH R01 EB029431, and NIH/NCI R38 Award 5R38-CA245204.
Footnotes
Conflict of Interest: The authors have no conflicts to disclose.
Supplementary File
Animations of the trajectories over time are available as GIFs at the following Google Drive folder: https://drive.google.com/drive/folders/1RtcJGIGog386pQNA30oMASJmUuNexpjM?usp=sharing
References
- 1. Bertolini F, Spallanzani A, Fontana A, Depenni R, Luppi G. Brain metastases: an overview. CNS Oncol 2015;4(1):37–46 doi: 10.2217/cns.14.51.
- 2. Hardesty DA, Nakaji P. The Current and Future Treatment of Brain Metastases. Front Surg 2016;3:30 doi: 10.3389/fsurg.2016.00030 [published Online First: 20160525].
- 3. Gondi V, Bauman G, Bradfield L, et al. Radiation Therapy for Brain Metastases: An ASTRO Clinical Practice Guideline. Pract Radiat Oncol 2022;12(4):265–82 doi: 10.1016/j.prro.2022.02.003 [published Online First: 20220506].
- 4. Mayo ZS, Billena C, Suh JH, Lo SS, Chao ST. The dilemma of radiation necrosis from diagnosis to treatment in the management of brain metastases. Neuro-Oncology 2024;26(Supplement_1):S56–S65 doi: 10.1093/neuonc/noad188.
- 5. Ladbury C, Pennock M, Yilmaz T, et al. Stereotactic Radiosurgery in the Management of Brain Metastases: A Case-Based Radiosurgery Society Practice Guideline. Advances in Radiation Oncology 2024;9(3):101402 doi: 10.1016/j.adro.2023.101402.
- 6. Vaios EJ, Winter SF, Shih HA, et al. Novel Mechanisms and Future Opportunities for the Management of Radiation Necrosis in Patients Treated for Brain Metastases in the Era of Immunotherapy. Cancers (Basel) 2023;15(9) doi: 10.3390/cancers15092432 [published Online First: 20230424].
- 7. Vellayappan B, Tan CL, Yong C, et al. Diagnosis and Management of Radiation Necrosis in Patients With Brain Metastases. Front Oncol. 2018;8:395. Published 2018 Sep 28. doi: 10.3389/fonc.2018.00395.
- 8. Verma N, Cowperthwaite MC, Burnett MG, Markey MK. Differentiating tumor recurrence from treatment necrosis: a review of neuro-oncologic imaging strategies. Neuro-Oncology 2013/May;15(5) doi: 10.1093/neuonc/nos307.
- 9. Arvold ND, Lee EQ, Mehta MP, et al. Editor’s choice: Updates in the management of brain metastases. Neuro-Oncology 2016/August;18(8) doi: 10.1093/neuonc/now127.
- 10. Hu LS, Baxter LC, Smith KA, et al. Relative cerebral blood volume values to differentiate high-grade glioma recurrence from posttreatment radiation effect: direct correlation between image-guided tissue histopathology and localized dynamic susceptibility-weighted contrast-enhanced perfusion MR imaging measurements. AJNR Am J Neuroradiol 2009;30(3):552–8 doi: 10.3174/ajnr.A1377 [published Online First: 20081204].
- 11. Zeng QS, Li CF, Zhang K, Liu H, Kang XS, Zhen JH. Multivoxel 3D proton MR spectroscopy in the distinction of recurrent glioma from radiation injury. J Neurooncol 2007;84(1):63–9 doi: 10.1007/s11060-007-9341-3 [published Online First: 20070214].
- 12. Galldiks N, Stoffels G, Filss CP, et al. Role of O-(2-(18)F-fluoroethyl)-L-tyrosine PET for differentiation of local recurrent brain metastasis from radiation necrosis. J Nucl Med 2012;53(9):1367–74 doi: 10.2967/jnumed.112.103325 [published Online First: 20120807].
- 13. Chernov MF, Ono Y, Abe K, et al. Differentiation of Tumor Progression and Radiation-Induced Effects Aft. Acta Neurochirurgica Supplement 2013. doi: 10.1007/978-3-7091-1376-9_29.
- 14. Chuang M-T, Liu Y-S, Tsai Y-S, Chen Y-C, Wang C-K. Differentiating Radiation-Induced Necrosis from Recurrent Brain Tumor Using MR Perfusion and Spectroscopy: A Meta-Analysis. PLoS ONE 2016;11(1) doi: 10.1371/journal.pone.0141438.
- 15. Zach L, Guez D, Last D, et al. Delayed contrast extravasation MRI: a new paradigm in neuro-oncology. Neuro-Oncology 2015/March;17(3) doi: 10.1093/neuonc/nou230.
- 16. Peng L, Parekh V, Huang P, et al. Distinguishing True Progression From Radionecrosis After Stereotactic Radiation Therapy for Brain Metastases With Machine Learning and Radiomics. International journal of radiation oncology, biology, physics 2018/November/11;102(4) doi: 10.1016/j.ijrobp.2018.05.041.
- 17. Jalalifar SA, Soliman H, Sahgal A, Sadeghi-Naini A. Predicting the outcome of radiotherapy in brain metastasis by integrating the clinical and MRI-based deep learning features. Med Phys 2022;49(11):7167–78 doi: 10.1002/mp.15814 [published Online First: 20220706].
- 18. Du P, Liu X, Shen L, et al. Prediction of treatment response in patients with brain metastasis receiving stereotactic radiosurgery based on pre-treatment multimodal MRI radiomics and clinical risk factors: A machine learning model. Front Oncol. 2023;13:1114194. Published 2023 Mar 13. doi: 10.3389/fonc.2023.1114194.
- 19. Mackin D, Fave X, Zhang L, et al. Measuring CT scanner variability of radiomics features. Investigative radiology 2015/November;50(11) doi: 10.1097/RLI.0000000000000180.
- 20. Lee J, Steinmann A, Ding Y, et al. Radiomics feature robustness as measured using an MRI phantom. Scientific Reports 2021. 11:1 2021–02-17;11(1) doi: 10.1038/s41598-021-83593-3.
- 21. Foy JJ, Robinson KR, Li H, Giger ML, Al-Hallaq H, Samuel G. Armato I. Variation in algorithm implementation across radiomics software. Journal of Medical Imaging 2018/October;5(4) doi: 10.1117/1.JMI.5.4.044505.
- 22. Avanzo M, Wei L, Stancanello J, et al. Machine and Deep Learning Methods for Radiomics. Medical physics 2020/June;47(5) doi: 10.1002/mp.13678.
- 23. Peng L, Parekh V, Huang P, et al. Distinguishing True Progression From Radionecrosis After Stereotactic Radiation Therapy for Brain Metastases With Machine Learning and Radiomics. Int J Radiat Oncol Biol Phys 2018;102(4):1236–43 doi: 10.1016/j.ijrobp.2018.05.041 [published Online First: 20180524].
- 24. Zhang Z, Yang J, Ho A, et al. A Predictive Model for Distinguishing Radiation Necrosis from Tumor Progression after Gamma Knife Radiosurgery based on Radiomic Features from MR Images. European radiology 2018/June;28(6) doi: 10.1007/s00330-017-5154-8.
- 25. Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL. Artificial intelligence in radiology. Nature reviews. Cancer 2018/August;18(8) doi: 10.1038/s41568-018-0016-5.
- 26. Cho H-h, Lee HYE, et al. Radiomics-guided deep neural networks stratify lung adenocarcinoma prognosis from CT scans. Communications Biology 2021. 4:1 2021–11-12;4(1) doi: 10.1038/s42003-021-02814-7.
- 27. Yang CC. Explainable Artificial Intelligence for Predictive Modeling in Healthcare. J Healthc Inform Res. 2022;6(2):228–239. Published 2022 Feb 11. doi: 10.1007/s41666-022-00114-1.
- 28. Yang Z, Hu Z, Ji H, et al. A neural ordinary differential equation model for visualizing deep neural network behaviors in multi-parametric MRI-based glioma segmentation. Med Phys 2023;50(8):4825–38 doi: 10.1002/mp.16286 [published Online First: 20230302].
- 29. van der Velden BHM, Kuijf HJ, Gilhuijs KGA, Viergever MA. Explainable artificial intelligence (XAI) in deep learning-based medical image analysis. Medical Image Analysis 2022;79:102470 doi: 10.1016/j.media.2022.102470.
- 30. Mohammed MA-O, Abdulkareem KA-O, Dinar AA-O, Zapirain BA-O. Rise of Deep Learning Clinical Applications and Challenges in Omics Data: A Systematic Review. doi: 10.3390/diagnostics13040664.
- 31. Alzubaidi L, Zhang J, Humaidi AJ, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data 2021;8(1):53 doi: 10.1186/s40537-021-00444-8.
- 32. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44 doi: 10.1038/nature14539.
- 33. Xu Y, Vaziri-Pashkam M. Limits to visual representational correspondence between convolutional neural networks and the human brain. Nature Communications 2021;12(1):2065 doi: 10.1038/s41467-021-22244-7.
- 34. Xia H, Suliafu V, Ji H, Nguyen TM, Bertozzi AL, Osher SJ, Wang B. Heavy ball neural ordinary differential equations. In: Proceedings of the 35th International Conference on Neural Information Processing Systems (NIPS ‘21). Curran Associates Inc.; 2024:18646–18659. Article 1425.
- 35. Baker J, Cherkaev E, Narayan A, Wang B. Learning Proper Orthogonal Decomposition of Complex Dynamics Using Heavy-ball Neural ODEs. J. Sci. Comput. 2023;95(2):27 doi: 10.1007/s10915-023-02176-8.
- 36. Tallarida RJ, Murray RB. Chi-Square Test. In: Tallarida RJ, Murray RB, eds. Manual of Pharmacologic Calculations: With Computer Programs. New York, NY: Springer New York, 1987:140–42.
- 37. Abdi H. Bonferroni and Šidák corrections for multiple comparisons. Encyclopedia of measurement and statistics 2007;3(01):2007.
- 38. Goldberg M, Mondragon-Soto MG, Altawalbeh G, et al. Enhancing outcomes: neurosurgical resection in brain metastasis patients with poor Karnofsky performance score - a comprehensive survival analysis. Frontiers in Oncology 2023;13 doi: 10.3389/fonc.2023.1343500.
- 39. Valle R, Reda FA, Shoeybi M, LeGresley P, Tao A, Catanzaro B. Neural ODEs for Image Segmentation with Level Sets. ArXiv 2019;abs/1912.11683.
- 40. Chen RTQ, Rubanova Y, Bettencourt J, Duvenaud D. Neural ordinary differential equations. Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc., 2018:6572–83.
- 41. Ancona M, Öztireli C, Gross MH. Explaining Deep Neural Networks with a Polynomial Time Algorithm for Shapley Values Approximation. ArXiv 2019;abs/1903.10992.
- 42. Fong R, Vedaldi A. Explanations for Attributing Deep Neural Network Predictions. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning: Springer-Verlag, 2022:149–67.
- 43. Ronneberger O, Fischer P, Brox T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab N, Hornegger J, Wells W, Frangi A (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham. doi: 10.1007/978-3-319-24574-4_28.
- 44. Zhao J, Vaios E, Wang Y, et al. Dose-Incorporated Deep Ensemble Learning for Improving Brain Metastasis Stereotactic Radiosurgery Outcome Prediction. Int J Radiat Oncol Biol Phys 2024. doi: 10.1016/j.ijrobp.2024.04.006 [published Online First: 20240412].
- 45. Vogenberg FR, Isaacson Barash C, Pursel M. Personalized medicine: part 1: evolution and development into theranostics. P t 2010;35(10):560–76.
- 46. Wang Y, Li X, Konanur M, et al. Towards optimal deep fusion of imaging and clinical data via a model-based description of fusion quality. Medical Physics 2023/June/01;50(6) doi: 10.1002/mp.16181.
- 47. Poli M, Massaroli S, Yamashita A, Asama H, Park J. Torchdyn: A neural differential equations library. arXiv preprint arXiv:2009.09346 2020.
- 48. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 2014.
- 49. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44(3):837–45.
- 50. Zunair H, Rahman A, Mohammed N, Cohen JP. Uniformizing Techniques to Process CT Scans with 3D CNNs for Tuberculosis Prediction. Predictive Intelligence in Medicine: Third International Workshop, PRIME 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 8, 2020, Proceedings. Lima, Peru: Springer-Verlag, 2020:156–68.
- 51. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, California, USA: Curran Associates Inc., 2017:4768–77.
- 52. Janzing D, Minorics L, Bloebaum P. Feature relevance quantification in explainable AI: A causal problem. In: Chiappa S, Calandra R, eds. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research. PMLR; 2020:2907–2916. Available from: https://proceedings.mlr.press/v108/janzing20a.html.
- 53. Sharma M, Jia X, Ahluwalia M, et al. First follow-up radiographic response is one of the predictors of local tumor progression and radiation necrosis after stereotactic radiosurgery for brain metastases. Cancer Medicine 2017/September/01;6(9) doi: 10.1002/cam4.1149.
- 54. Aickin M, Gensler H. Adjusting for multiple testing when reporting research results: the Bonferroni vs Holm methods. Am J Public Health 1996;86(5):726–8 doi: 10.2105/ajph.86.5.726.
- 55. Huang F, Xie G, Xiao R. Research on ensemble learning. 2009 International Conference on Artificial Intelligence and Computational Intelligence; 2009:249–252. doi: 10.1109/AICI.2009.235.
- 56. Yamamoto M, Serizawa T, Shuto T, et al. Stereotactic radiosurgery for patients with multiple brain metastases (JLGK0901): a multi-institutional prospective observational study. Lancet Oncol 2014;15(4):387–95 doi: 10.1016/s1470-2045(14)70061-0 [published Online First: 20140310].
- 57. Vellayappan B, Lim-Fat MJ, Kotecha R, et al. A Systematic Review Informing the Management of Symptomatic Brain Radiation Necrosis After Stereotactic Radiosurgery and International Stereotactic Radiosurgery Society Recommendations. Int J Radiat Oncol Biol Phys 2024;118(1):14–28 doi: 10.1016/j.ijrobp.2023.07.015 [published Online First: 20230722].
- 58. Rubino S, Oliver DE, Tran ND, et al. Improving Brain Metastases Outcomes Through Therapeutic Synergy Between Stereotactic Radiosurgery and Targeted Cancer Therapies. Front Oncol 2022;12:854402 doi: 10.3389/fonc.2022.854402 [published Online First: 20220302].
- 59. Vaios EJ, Shenker RF, Hendrickson PG, et al. Long-Term Intracranial Outcomes With Combination Dual Immune-Checkpoint Blockade and Stereotactic Radiosurgery in Patients With Melanoma and Non-Small Cell Lung Cancer Brain Metastases. Int J Radiat Oncol Biol Phys 2024;118(5):1507–18 doi: 10.1016/j.ijrobp.2023.12.002 [published Online First: 20231212].
- 60. Eguren-Santamaria I, Sanmamed MF, Goldberg SB, et al. PD-1/PD-L1 Blockers in NSCLC Brain Metastases: Challenging Paradigms and Clinical Practice. Clin Cancer Res 2020;26(16):4186–97 doi: 10.1158/1078-0432.Ccr-20-0798 [published Online First: 20200430].
- 61. Koenig JL, Shi S, Sborov K, et al. Adverse Radiation Effect and Disease Control in Patients Undergoing Stereotactic Radiosurgery and Immune Checkpoint Inhibitor Therapy for Brain Metastases. World Neurosurgery 2019;126:e1399–e411 doi: 10.1016/j.wneu.2019.03.110.
