GTsurvival: A Hybrid GCN-Neural Decision Tree Model for Restricted Mean Survival Time Prediction with Complex Censored Data

Jingyi Zhang; Shishun Zhao; Dongmei Lu; Jianhua Cheng

doi:10.3390/e28010028

. 2025 Dec 25;28(1):28. doi: 10.3390/e28010028

GTsurvival: A Hybrid GCN-Neural Decision Tree Model for Restricted Mean Survival Time Prediction with Complex Censored Data

Jingyi Zhang ¹, Shishun Zhao ^1,², Dongmei Lu ^3,^*, Jianhua Cheng ^1,^*

Editor: Zhi-Ping Liu

PMCID: PMC12840287 PMID: 41593936

Abstract

Chronic diseases, particularly those with progressive neurological impairment, present a significant challenge in healthcare due to their impact on millions globally and the limited availability of effective therapies. Addressing this challenge requires innovative approaches, such as leveraging individuals’ genetic features for early intervention and treatment strategies. Due to the irregular intervals of patient visits, clinical data typically appear as censored, necessitating advanced analytical methods. Thus, this study introduces GTsurvival, a novel network architecture that combines graph convolutional networks (GCN) with a neural decision tree, providing promising advancements in disease prediction. GTsurvival utilizes restricted mean survival time (RMST) as pseudo-observations and directly connects them with baseline variables. Through the joint simulation of RMST, GTsurvival can effectively utilize shared information and enhance its predictive ability for patients’ future survival status. Firstly, GTsurvival is introduced to handle complex censored data, emphasizing the crucial role of graphs utilized in GCNs for processing related information among samples. Secondly, the neural decision tree within GTsurvival enhances decision-making by mitigating uncertainty at split nodes, effectively minimizing the global loss function and optimizing survival analysis in high-dimensional datasets. Thirdly, evaluations on simulated datasets and a real-world neurodegenerative disease cohort verify that the proposed GTsurvival method surpasses existing approaches. This superiority is partly attributed to the inclusion of a generalized score test during feature selection, which helps capture variants associated with disease progression.

Keywords: survival analysis, graph convolution network, neural decision tree, restricted mean survival time

1. Introduction

Chronic progressive diseases, particularly those affecting neurological function, present significant challenges to global healthcare systems. These conditions often involve complex interactions between genetic predisposition and environmental factors, leading to gradual decline in patient health. According to the World Health Organization [1], approximately 55 million people are currently suffering from Alzheimer’s disease (AD), with projections indicating a significant rise to 78 million by 2030 and 139 million by 2050. Due to the limited effectiveness of disease treatments, early prediction plays a crucial role in enhancing the quality of life for patients during disease advancement. Notably, genetic-based predictive models have recently garnered considerable attention. Existing models predominantly rely on polygenic risk scores [2] (PRSs) to handle genetic variations; however, these scores are essentially weighted linear sums of genetic factors. Despite their ease of implementation, PRSs ignore complex nonlinear interactions among high-dimensional genetic variants. The methods cannot automatically handle complex relationships among high-dimensional covariates, for instance, nonlinear associations between covariates and outcomes, or interplays among covariates. Furthermore, due to intermittent subject examinations, it was not feasible to observe the precise time of disease onset, posing a challenge for the analysis of the longitudinal cohort data [3]. These limitations mentioned above, particularly in modeling nonlinear genetic interactions and handling censored data, highlight the need for prediction models that can address both high-dimensional nonlinearity and censoring mechanisms.

To address these challenges, we propose integrating restricted mean survival time [4] (RMST) analysis with nonlinear machine learning algorithms. RMST has received widespread attention in the fields of medicine and biology owing to its direct and effective clinical interpretability while circumventing strong parametric assumptions required by traditional survival metrics. Specifically, RMST calculates the average survival time up to a specific observation time point through numerical integration of the survival curve, making it particularly suitable for handling the censored data. Statistical approaches based on Cox regression have demonstrated efficacy in analyzing RMST within simple covariate settings, as evidenced by applications in conditions like acute coronary syndrome [5] and chronic liver disease [6]. Recent pseudo-observation methods [7] have been extended to RMST analysis with time-varying covariates through generalized linear models, yet these methods remain limited by the requirement for explicit functional structures when utilized in high-dimensional genomic studies. Bouaziz (2023) [8] developed asymptotic formulas to accelerate computations for RMST in the context of right-censored and interval-censored data.

Deep learning offers novel approaches for high-dimensional biological data analysis, using flexible architectures to overcome traditional modeling constraints. Notable developments include Zhao’s [9] multi-timepoint RMST prediction framework using inverse probability of censoring weighting, and Sun and Ding’s [10] neural network for interval-censored survival analysis and identification of subgroups with differential progression risks. However, these approaches exhibit limited capacity for processing non-Euclidean data structures inherent to biological systems, such as cellular interaction graphs. The foundational work of Kipf and Welling [11] established graph convolutional networks (GCNs) as a framework for handling graph-structured data, with subsequent biological applications by Peng et al. [12] in cellular classification modeling and Xu et al. [13] in cell graph construction. Further advancements, such as FGCNSurv [14], employ dual-fused convolutional architectures for multi-omics survival prediction, and Ling et al. [15] developed a sparse geometric graph approach with sequential feature selection. While existing GCN-based survival models have shown progress, their implementations primarily focus on right-censored data. Based on the research mentioned above, we may face two problems: (1) progression data for many chronic conditions often exhibit complex censored types (e.g., interval censoring from irregular clinical visits combined with right censoring), which existing GCN architectures might not fully accommodate, and (2) conventional survival frameworks might inadequately capture nonlinear genetic interactions. These limitations could restrict the clinical translation of graph-based models for disease prediction.

To overcome these problems, our work includes the key components: (1) Loss function for complex censoring: we utilize a subject-specific mean squared error loss based on RMST pseudo-values, constructed via the Jackknife method. This approach directly models the relationship between covariates and RMST for interval-censored data. It differs from and extends prior works like Zhao et al. [9], which designed loss functions primarily for right-censored data. (2) Graph construction for patient similarity: We construct a patient similarity graph (nodes as patients, edges via KNN based on genomic Euclidean distance) as input to a GCN. This allows the model to leverage relational structure among patients, contrasting with standard neural networks (e.g., Sun et al. [10]) that process covariates independently. (3) Integration of the GCN with the neural decision tree: We combine the GCN for relational feature learning with subsequent neural decision tree for final prediction. This hybrid architecture, which integrates graph-based representation learning with tree-based prediction, is not a feature of existing pure GCN survival models (e.g., Ling et al. [15]). Specifically, the model jointly optimizes parameters of all splitting nodes and predictive values of leaf nodes via global risk minimization, thus ensuring the overall optimality of the model. (4) To enhance interpretability, we combine the generalized score test for identifying disease-associated genetic variants, with post-hoc explanation methods (e.g., LIME [16]) to interpret the importance of input features for individual model predictions. Notably, the GCN module in our model is critical to genomic data processing: by iteratively propagating and aggregating information across the gene network, each node integrates neighbor information. This information propagation and aggregation process inherently reduces feature redundancy (corresponding to entropy reduction in information theory) and strengthens the relevance between genetic features and survival outcomes. Conceptually, this aligns with the network communication paradigm in information theory, which aims to efficiently extract “signals” most predictive of survival outcomes from high-dimensional, complex genomic data while attenuating “noise” (i.e., irrelevant variations). Evaluations on simulated and real-world genomic datasets demonstrate GTsurvival’s superiority over existing survival methods.

The paper is structured as follows: Section 2 introduces GTsurvival, a survival prediction algorithm that merges GCN with network decision trees [17] to effectively handle complex censored data. Section 3 outlines detailed simulation experiments, with numerical results presented in the tabular format. In addition, the performance of GTsurvival is evaluated on a biomedical dataset, and a generalized score test is conducted to identify representative genetic features. Lastly, Section 4 offers general remarks and discusses our findings.

2. Methods

2.1. Data Description

Herein, a novel approach GTsurvival is utilized to analyze datasets from the Alzheimer’s Disease Neuroimaging Initiative (ADNI, https://adni.loni.usc.edu/ (accessed on 5 March 2025)), incorporating blood-based gene expression profiles generated by Bristol-Myers Squibb laboratories. The present study initially divided participants into three cohorts according to their cognitive status: cognitively normal (CN), mild cognitive impairment (MCI), and AD. AD-related brain changes begin even before MCI onset. Therefore, this investigation extends beyond MCI patients to include the broader CN/MCI population, aiming to facilitate early detection of AD and enhance the quality of life both prior to and throughout disease progression. Considering that the ADNI cohort includes individuals aged 55 years or above, and AD onset seldom occurs before age 55 [18], 55 years is established as the baseline for time-to-event measurement (years until AD diagnosis). Our dataset comprises 744 genomically sequenced participants with censoring distributed as follows: left-censored (n = 44), interval-censored (n = 192), and right-censored (n = 508). Microarray gene expression datasets are available at https://ida.loni.usc.edu/explore/jsp/search/search.jsp?project=ADNI#geneticFiles (accessed on 5 March 2025). The raw gene expression values were pre-processed utilizing the Robust Multi-chip Average (RMA) normalization method, with duplicate genes aggregated by mean expression to yield 20,093 final genetic features.

2.2. Interval Censoring

In this study, it is assumed that rather than directly obtaining survival time T, a random interval $[L, R]$ (where $L \geq 0$ and $L \leq R$ ) is observed, which almost certainly encloses the event time, as denoted by $P (T \in [L, R]) = 1$ [19]. The right end of the interval can take infinite values, classified as follows:

If $0 < L < R < \infty$ , the data are strictly interval-censoring;
If $0 = L < R < \infty$ , the data are left-censored;
If $0 < L < R = \infty$ , the data are right-censored;
If $0 < L = R < \infty$ , the data are right-censored.

2.3. Restricted Mean Survival Times

2.3.1. Problem Formulation and Notation

Event Time: let $T_{i} \geq 0$ denote the true (possibly unobserved) event time for the $i^{t h}$ subject.
Observed Interval: due to intermittent assessment, we only observe a random interval $[L_{i}, R_{i}]$ such that $P (T_{i} \in [L_{i}, R_{i}]) = 1$ , covering left-, right-, and interval-censoring (Section 2.2).
RMST: for a pre-specified, clinically meaningful upper bound of the time window $τ$ , the RMST $μ (τ)$ is defined as the expected survival time restricted to $[0, τ]$ :
$μ (τ) = E [min (T, τ)] = \int_{0}^{τ} S (t) d t,$
where $S (t) = P (T > t)$ is the survival function.

2.3.2. Construction of the Response Variable via Pseudo-Values

To handle interval-censored data and obtain a response variable suitable for regression analysis, we generate RMST pseudo-observations for each individual through the following three-step procedure.

Step 1: Survival Function Estimation via EM-ICM.

Given interval-censored data ${(L_{i}, R_{i}]}_{i = 1}^{n}$ , we obtain the nonparametric maximum likelihood estimator (NPMLE) of the survival function $S (t)$ using the hybrid EM-ICM algorithm [20,21,22], which combines Expectation-Maximization (EM) and Iterative Convex Minorant (ICM) algorithms for optimizing non-increasing functions. Let ${s_{j}}_{j = 0}^{m}$ be the ordered unique points from the set ${0, L_{i}, R_{i}}_{i = 1}^{n}$ , and define:

α_{i j} = I (s_{j} \in (L_{i}, R_{i}]), ρ_{j} = S (s_{j - 1}) - S (s_{j}) .

The likelihood function is:

L_{s} (ρ) = \prod_{i = 1}^{n} [S (L_{i}) - S (R_{i})] = \prod_{i = 1}^{n} \sum_{j = 1}^{m} α_{i j} ρ_{j} .

The product form arises because $S (L_{i}) - S (R_{i})$ (the probability that $T_{i} \in [L_{i}, R_{i}]$ ) equals the sum of $ρ_{j}$ for all intervals $(s_{j - 1}, s_{j}]$ contained within $[L_{i}, R_{i}]$ .

The EM-ICM algorithm estimates $ρ$ (and thus $S (t)$ ) via iterative updates until convergence:

E-step (Expectation): Since $T_{i}$ is unobserved (only $[L_{i}, R_{i}]$ is known), we compute the expected number of events at each time point $s_{j}$ , given the current estimate of $ρ$ (denoted $ρ^{(c)}$ ). This expectation is quantified by:
$d_{j} (ρ^{(c)}) = \sum_{i = 1}^{n} \frac{α_{i j} ρ_{j}^{(c)}}{\sum_{l = 1}^{m} α_{i l} ρ_{l}^{(c)}} .$
M-step (Iterative Convex Minorant, ICM): This step updates $ρ$ to maximize the log-likelihood while ensuring the survival function $S (t)$ remains valid. First, we convert $ρ_{j}$ to cumulative distribution function (CDF) estimates $β_{j} = \sum_{l = 1}^{j} ρ_{l}$ (where $β_{j} = 1 - S (s_{j})$ , requiring non-decreasing $β_{j}$ ), then use the Pool Adjacent Violators Algorithm (PAVA) to solve the constrained optimization of maximizing a quadratic approximation of the log-likelihood under $β_{1} \leq β_{2} \leq \dots \leq β_{m}$ . Finally, we obtain the survival function estimator as $\hat{S} (t) = 1 - {\hat{β}}_{j}$ for $t \in [s_{j - 1}, s_{j})$ . Detailed steps can be found in Section 3 of Sun [19].

Step 2: Population-level RMST Calculation

Using the estimated survival function $\hat{S} (t)$ (obtained from all n subjects), we compute the population-level RMST at each time point $τ_{k}$ (the average survival duration of the entire cohort within $[0, τ_{k}]$ ):

\hat{μ} (τ_{k}) = \int_{0}^{τ_{k}} \hat{S} (t) d t .

Step 3: Individual-level RMST Generation

To derive individual-specific RMST estimates for each subject i and time point $τ_{k}$ , we employ the Jackknife method [9]:

{\tilde{y}}_{i} (τ_{k}) = n \cdot \hat{μ} (τ_{k}) - (n - 1) \cdot {\hat{μ}}_{(- i)} (τ_{k}),

(1)

where ${\hat{μ}}_{(- i)} (τ_{k})$ is the ‘leave-one-out’ estimator computed after excluding the $i^{t h}$ subject. The pseudo-values ${{\tilde{y}}_{i} (τ_{k})}$ form a fully-observed, continuous response suitable for standard regression modeling.

2.4. Network Architecture

2.4.1. Build Connection Graph

This study employs GCN layers to model the relationships and neighborhood information among subjects. A connection graph is first constructed to serve as the input topology for the GCN. In this graph, each node represents a subject (totaling 744 nodes in our cohort), characterized by its gene expression profile as detailed in Section 3.3.1. An undirected edge with a uniform weight of 1 is established between a pair of subjects if one is among the $k_{n}$ -nearest neighbors of the other, signifying sufficient gene expression similarity without further weighting. Subject similarity is quantified using the Euclidean distance, computed as $D_{i j} = \sqrt{\sum_{m = 1}^{p} {(X_{i m} - X_{j m})}^{2}}$ , where $X_{i m}$ denotes the expression value of the $m^{t h}$ gene for subject i. A smaller $D_{i j}$ indicates higher similarity in expression patterns. The graph construction relies on the K-nearest neighbors (KNN) algorithm, which connects each node to its $k_{n}$ closest neighbors, with the neighborhood parameter $k_{n}$ controlling the local connectivity density. The optimal neighborhood size $k_{n}$ was determined via grid search combined with cross-validation on the training data, selecting from the candidate set {5, 10, 15, 20} based on MSE. This process yields an adjacency matrix that encodes these connections, where an entry of 1 indicates the presence of an edge and 0 its absence.

2.4.2. Graph Convolution Network with Decision Trees

This study designed a neural network architecture that integrates GCNs with stochastic neural decision trees [17] for survival prediction. The integration specifically addresses two needs in studies: (1) GCNs overcome limitations of conventional models in capturing complex interaction networks through graph-structured learning, while (2) the stochastic decision trees provide probabilistic branching at split nodes to enhance decision-making interpretability that pure deep learning models may lack. By jointly optimizing the global objective function, this framework preserves interpretable decision pathways while enhancing the predictive performance in survival analysis.

GCNs extend the convolution operator to the graph domain, demonstrating significant performance in various domains by effectively aggregating information among nodes. The essence of the GCN layer lies not in gathering information from the original nodes but in aggregating information from nodes linked by edges. The computation of the GCN layer is as follows:

H^{(ν + 1)} = f (H^{(ν)}, A) = σ (\hat{A} H^{(ν)} W^{(ν)}) .

Here, $ν$ indicates the layer index, and $W^{(ν)}$ denotes the trainable parameter matrix. $\hat{A}$ is a symmetric normalization of the graph’s adjacency matrix, defined as ${\tilde{D}}^{- 1 / 2} \tilde{A} {\tilde{D}}^{- 1 / 2}$ , where $\tilde{A}$ and $\tilde{D}$ represent the adjacency and degree matrices of the connection graph, respectively. The adjacency matrix $\tilde{A}$ is obtained by adding self-connections to the original adjacency matrix A, which is achieved through the operation $\tilde{A} = A + I$ , where I denotes the identity matrix. The activation function $σ (\cdot)$ is the Rectified Linear Unit [23] (ReLU). $H^{(ν)}$ represents the input matrix of the $v^{t h}$ GCN layer, with $H^{(0)} = X$ , where X denotes the gene expression matrix.

Using multiple layers of GCNs enables the aggregation of multi-order neighbor information. A two-layer GCN is employed, yielding the following outcomes:

H^{(2)} = f (f (X, A), A) = R e L U (\hat{A} R e L U (\hat{A} X W^{(0)}) W^{(1)}) .

Then a back propagation-compatible decision tree is incorporated into GCNs to guide representative learning in deeper layers. The structure of the GTsurvival network is depicted in Figure 1.

Illustration of the proposed GTsurvival architecture, comprising two GCN layers and a neural decision tree. FC layer: The fully connected (FC) layer provides the function $f_{e} (x; θ)$ . Decision Node: Each decision node $e \in E$ corresponds to a split node within a tree, yielding the routing decision $d_{e} (x, θ) = σ (f_{e} (x; θ))$ that dictates whether the sample x is assigned to the left or right subtree. Leaf Node: Circles at the bottom represent leaf nodes, offering RMST predictions at K time points.

Decision trees consist of internal decision nodes and leaf nodes. Here, E represents the set of internal decision nodes, and L denotes the set of leaf nodes situated at the bottom of the tree. Each decision node $e \in E$ is associated with a parameterized decision function $d_{e} (\cdot; Θ) : X \to [0, 1]$ , guiding samples through the tree:

d_{e} (X_{i}; Θ) = d_{s} (f_{e} (X_{i}; Θ)),

where $d_{s} (x) = {(1 + e^{- x})}^{- 1}$ stands for the sigmoid function, and $f_{e} (\cdot; Θ)$ constitutes a real-valued mapping from the previous network layer. As a sample $X_{i}$ arrives at an internal decision node e, the value of $d_{e} (X_{i}, Θ)$ determines its subsequent routing, either to the left or right subtree.

To provide an explicit form for the routing function, the following binary relations are introduced based on the tree’s structure:

p_{l} (X_{i} | Θ) = \prod_{e \in E} d_{e} {(X_{i}; Θ)}^{I_{l ↙ e}} {\bar{d}}_{e} {(X_{i}; Θ)}^{I_{e ↘ l}},

where the sample $X_{i}$ is assigned to either the left branch (denoted as $l ↙ e$ ) or the right branch (denoted as $e ↘ l$ ) of the subtree. Here, ${\bar{d}}_{e} (x; Θ) = 1 - d_{e} (x; Θ)$ , with $I_{l ↙ e}$ representing the indicator function.

The routing function $p_{l} (X_{i} | Θ)$ dictates the path for sample $X_{i}$ to reach leaf node l. Upon reaching the $l^{t h}$ leaf node, the corresponding RMST prediction at the $k^{t h}$ time point ( $τ_{k}$ ) is given by $λ_{l k}$ , and $Λ = \sum_{l = 1}^{L} \sum_{k = 1}^{K} λ_{l k}$ . Predictions from all L leaf nodes are then aggregated, yielding the final prediction for sample $X_{i}$ :

g (X_{i}, τ_{k}; Θ, Λ) = \sum_{l \in L} λ_{l k} p_{l} (X_{i} | Θ) .

(2)

By integrating the RMST contributions from all leaf nodes as in Equation (2), the model predicts the future RMST for subject i at the $k^{t h}$ observation time point.

2.4.3. Loss Function

Network output: for the $i^{t h}$ subject at time $τ_{k}$ , the model output $g (X_{i}, τ_{k}; Θ, Λ)$ is the predicted RMST, computed by aggregating contributions from all leaf nodes of the neural decision tree (Equation (2) in the manuscript).
Training objective: The core goal of the network is to minimize the difference between the predicted RMST $g (X_{i}, τ_{k}; Θ, Λ)$ and the RMST pseudo-observations ${\tilde{y}}_{i} (τ_{k})$ (constructed via the Jackknife method as detailed in Section 2.3). For the $i^{t h}$ subject, the loss (capturing prediction error across K time points) is:
${MSE}_{i} = \sum_{k = 1}^{K} {({\hat{y}}_{i k} - g (X_{i}, τ_{k}; Θ, Λ))}^{2} .$

The overall loss function for model training is the average mean squared error (MSE) across all subjects and time points:
$L (Θ, Λ) = \frac{1}{n K} \sum_{i = 1}^{n} {MSE}_{i} = \frac{1}{n K} \sum_{i = 1}^{n} \sum_{k = 1}^{K} {({\hat{y}}_{i k} - g (X_{i}, τ_{k}; Θ, Λ))}^{2} .$ (3)

2.4.4. Gradient-Based Learning for Decision Tree Parameters

The key innovation of the GTsruvival model is its end-to-end joint optimization of the GCN and the neural decision tree. This integration is achieved by computing and backpropagating gradients through all trainable parameters of the decision tree module. This subsection details the learning mechanism for the decision tree parameters, the decision node parameters $Θ$ and the leaf node parameters $Λ$ . We derive the gradients that connect the global loss function to parameter updates, ensuring convergence within a unified optimization framework while preserving the interpretable structure of the decision tree. The gradient calculations rely on the chain rule to trace error signals from the loss function back to each parameter (see Section S3 of the Supplementary Document for the complete derivation). We present the final gradient expressions below and discuss their roles in the training dynamics.

(1)
Gradient for Decision Nodesr

The gradient for a decision node parameter $Θ$ is a function of three components: the prediction error, the node’s classification confidence, and the performance difference between its subtrees. The gradient formula is:

\frac{\partial L (Θ, Λ)}{\partial f_{e} (X_{i}; Θ)} = \frac{2}{n K} \cdot d_{e} (X_{i}; Θ) \cdot {\bar{d}}_{e} (X_{i}; Θ) \cdot \sum_{k = 1}^{K} (r_{i k} \cdot (A_{e_{r}}^{(k)} - A_{e_{l}}^{(k)}))

where $r_{i k} = g_{i} (X_{i}, τ_{k}; Θ, Λ) - {\hat{y}}_{i k}$ is the prediction residual for the $i^{t h}$ sample at the $k^{t h}$ time point. The term $d_{e} \cdot {\bar{d}}_{e}$ (with ${\bar{d}}_{e} = 1 - d_{e}$ ) quantifies the node’s decision uncertainty. $A_{e_{r}}^{(k)}$ and $A_{e_{l}}^{(k)}$ denote the final output of the right and left subtrees of node e at time $τ_{k}$ , respectively. Their difference $(A_{e_{r}}^{(k)} - A_{e_{l}}^{(k)})$ indicates which subtree currently offers a better prediction. This gradient formula can be understood as comprising three instructional signals for the decision node. First, the term $d_{e} \cdot {\bar{d}}_{e}$ acts as an “attention” signal, which peaks when the node’s decision entropy is maximized ( $d_{e} \approx 0.5$ ), directing learning effort toward the most uncertain splits. Second, the residual $r_{i k}$ serves as a “correction” signal, scaling updates proportionally to the magnitude of the prediction error. Third, the difference $(A_{e_{r}}^{(k)} - A_{e_{l}}^{(k)})$ functions as a “guidance” signal, steering samples toward the better-performing subtree. By prioritizing high-entropy nodes while incorporating precise error correction and clear routing directions, this mechanism leads to efficient and targeted training.

(2)
Gradient of Leaf Node Parameters

The gradient for leaf node parameter $λ_{l k}$ is:

\frac{\partial L (Θ, Λ)}{\partial λ_{l k}} = \frac{2}{n K} \cdot r_{i k} \cdot p_{l} (X_{i} | Θ),

where $p_{l} (X_{i} | Θ)$ is the routing probability of sample $X_{i}$ to leaf node l (governed by network parameters $Θ$ ), and $r_{i k} = {\hat{y}}_{i k} - g (X_{i}, τ_{k}; Θ, Λ)$ denotes the model’s prediction residual for sample $X_{i}$ at $τ_{k}$ . This gradient implements a probability-weighted update rule for leaf nodes: the update magnitude of $λ_{l k}$ is scaled by $p_{l} (X_{i} | Θ)$ . Leaves frequently visited by samples (with large $p_{l} (X_{i} | Θ)$ values) get consistent updates. This lets them optimize predictions for common data. In contrast, rarely accessed leaves receive minimal adjustments, which prevents disruptive updates from sparse samples or outliers. The update magnitude is also modulated by the prediction residual $r_{i k}$ . Meaningful parameter changes only occur when both prediction error and routing probability are high. This adaptive scaling focuses updates on the most relevant leaf-sample pairs, stabilizing model training and accelerating the convergence of the neural decision tree.

2.4.5. Implementation Details

For evaluation, we apply fivefold cross-validation for the selection of key hyperparameters: we randomly split the data into a training set (80%) and a test set (20%), with 10% of the training set allocated as the validation set. The GTsurvival model consists of two GCN layers and a neural decision tree component. The model is trained by back-propagation with the Adam optimizer of a learning rate of 0.001 for 1000 epochs. The dropout probability of 0.1 and the weight decay of 1 are applied. Early stopping is performed based on the validation loss to avoid overfitting. Model weights are initialized using TensorFlow’s default scheme, and the loss function is mean squared error in Equation (3), which directly minimizes the difference between the RMST pseudo-observations and the network predictions.

The optimization of hyperparameters is performed individually for each fold by a grid search, and the configuration is selected such that the corresponding model achieves the best performance on the validation set. The search spaces for the hyperparameters are as follows: (1) For graph construction: the number of neighbors $k_{n}$ in the KNN algorithm is selected from the candidate set {5, 10, 15, 20}; (2) For GCN architecture: the dimensions of the two hidden layers are tuned within the range {[64, 32], [128, 64], [256, 128]}, with ReLU activation function applied after each layer; (3) For neural decision tree component: the depth of the tree is optimized from {2, 3, 4, 5}, where each decision node e is equipped with a learnable routing function $d_{e} (X_{i}; Θ) = d_{s} (f_{e} (X_{i}; Θ))$ ( $d_{s}$ denotes the sigmoid function, and $f_{e}$ is a fully connected layer that maps the GCN output to the decision tree), enabling soft decision-making where samples are assigned to left or right subtrees based on the value of $d_{e}$ .

3. Results

3.1. Simulation and Experimental Design

Simulations were performed to evaluate the performance of the GTsurvival algorithm. In this study, GTsurvival was compared with several benchmark methods including: (i) a two-layer GCN without neural decision tree, (ii) Convolutional Neural Network (CNN) with two layers of convolution and pooling, and (iii) Deep Neural Network (DNN), to demonstrate its effectiveness. To ensure a fair comparison, all methods were optimized via grid search. Hyperparameter tuning was conducted within the fivefold cross-validation framework detailed in Section 2.4.5, adhering to the predefined search spaces, and optimal tuning parameters for each model were systematically selected based on the MSE metric. A fixed random seed was used to guarantee identical training/validation/test splits across all compared methods. Evaluation results were summarized over 100 replications to provide a robust performance assessment, and detailed descriptions of the three simulation settings are provided below.

Each simulated dataset was generated from a specified multivariate normal distribution $MVN (0, Σ)$ with mean zero and a covariance matrix $Σ$ where the $(l, m)$ element was equal to $0 . 5^{| l - m |}$ . The survival function was assumed to be $S (t | X) = exp (- η (X) t)$ , where $η (X)$ in the three experiments is as follows:

Experiment 1:
$η_{1} (X) = | X_{1} | + {(X_{2} - 0.5)}^{2} + | X_{3} - X_{4} |$ ,
Experiment 2:
$η_{2} (X) = \sum_{p = 1}^{P} γ_{p} X_{p}$ ,
Experiment 3:
$η_{3} (X) = \sum_{p = 1}^{P} γ_{p} X_{p} + X_{3}^{2} + X_{4}^{2}$ ,

where $γ_{p}$ follows $MVN (0.2, 0.01 Σ)$ . The failure time T was derived via the inverse probability sampling approach based on the respective survival functions from Experiments 1 to 3. For simulating censored data, 10 visits ( $K = 10$ ) were constructed, where $V_{1}$ follows a uniform distribution over the interval $(0, 2)$ and $V_{K}$ is defined as $V_{k} = V_{k - 1} + U (0, 1)$ . Observations satisfying $T < V_{1}$ were treated as left-censored, with $L = 0$ and $R = V_{1}$ ; those meeting $T > V_{K}$ were right-censored, with $L = V_{k}$ and $R = \infty$ ; and observations falling into the range $V_{k - 1} < T < V_{k}$ were classified as strictly interval-censored, where $L = V_{k - 1}, R = V_{k}$ .

The proposed approach GTsurvival was evaluated by varying the sample size $n = 200 / 400$ and features counts $p = 50 / 100 / 500$ . MSE and mean absolute error (MAE) metrics were employed to assess the performance of various methods, where

\begin{matrix} MSE = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} {({\hat{y}}_{i k} - g (X_{i}, τ_{k}; Θ, Λ))}^{2}, \\ MAE = \frac{1}{n} \sum_{i = 1}^{n} \sum_{k = 1}^{K} |{\hat{y}}_{i k} - g (X_{i}, τ_{k}; Θ, Λ)| . \end{matrix}

Here ${\hat{y}}_{i k}$ denotes the observed RMST at the $k^{t h}$ observed time point for the $i^{t h}$ sample, and $g (X_{i}, τ_{k}; Θ, Λ)$ denotes the corresponding predicted value from GTsurvival.

3.2. Simulation Results

Throughout the simulation process, K observation points were set for estimating RMST. For $K \in \{1, 3, 5\}$ , the specific observation times ${τ_{1}, τ_{2}, . . ., τ_{K}}$ were selected as {1}, ${1, 3, 5}$ , and ${1, 3, 5, 7, 9}$ years, respectively. Then the impact of various sample sizes $n = 200 / 400$ and feature counts $p = 50 / 100 / 500$ was explored on the predictive performance of the model. Model performance was evaluated across three groups of experiments, comparing GTsurvival against CNN, DNN, and GCN based on MSE and MAE metrics. Generally, a model with lower MSE and MAE values is considered more favorable.

In Experiment 1, Table 1 presented the MSE and MAE results of several algorithms. Supplementary Tables S1 and S2 corresponded to the results of Experiments 2 and 3. Figure 2 and Figure 3 illustrated box plots for Experiment 1 at $K = 3$ to offer a clearer visualization of the data whilst box plots for the remaining simulation experiments were displayed in Supplementary Figures S1–S16.

Table 1.

Simulation experiments on GTsurvival and the other three methods in Experiment 1. The sample sizes were set as $n = 200 / 400$ , and feature counts were set as $p = 50 / 100 / 500$ . The RMST was estimated at K time points, where $K \in {1, 3, 5}$ . The specific observation times $τ_{1}, τ_{2}, \dots, τ_{K}$ were selected as follows: ${1}$ , ${1, 3, 5}$ , and ${1, 3, 5, 7, 9}$ years. The employed evaluation metrics are MSE and MAE, where smaller values indicate better prediction performance. The top performance is highlighted in bold, and the second-top result is underlined.

		MAE (SD)			MSE (SD)
	Method	$p = 50$	$p = 100$	$p = 500$	$p = 50$	$p = 100$	$p = 500$
$K = 1$
$n = 200$	DNN	0.305 (0.122)	0.333 (0.121)	0.415 (0.125)	0.144 (0.072)	0.168 (0.070)	0.297 (0.073)
	CNN	0.245 (0.050)	0.259 (0.051)	0.279 (0.055)	0.091 (0.039)	0.098 (0.040)	0.111 (0.066)
	GCN	0.200 (0.026)	0.214 (0.029)	0.247 (0.087)	0.063 (0.012)	0.071 (0.016)	0.091 (0.052)
	GTsurvival	0.149 (0.022)	0.156 (0.019)	0.167 (0.029)	0.034 (0.009)	0.038 (0.008)	0.043 (0.013)
$n = 400$	DNN	0.273 (0.123)	0.305 (0.124)	0.378 (0.127)	0.114 (0.062)	0.142 (0.073)	0.232 (0.075)
	CNN	0.234 (0.048)	0.252 (0.045)	0.278 (0.055)	0.085 (0.034)	0.095 (0.036)	0.111 (0.062)
	GCN	0.185 (0.017)	0.198 (0.024)	0.230 (0.021)	0.055 (0.008)	0.062 (0.012)	0.080 (0.014)
	GTsurvival	0.141 (0.013)	0.150 (0.021)	0.161 (0.017)	0.030 (0.005)	0.034 (0.008)	0.040 (0.008)
$K = 3$
$n = 200$	DNN	0.359 (0.077)	0.380 (0.079)	0.451 (0.082)	0.205 (0.059)	0.232 (0.062)	0.351 (0.069)
	CNN	0.304 (0.042)	0.313 (0.035)	0.320 (0.067)	0.162 (0.043)	0.172 (0.042)	0.174 (0.076)
	GCN	0.283 (0.050)	0.297 (0.045)	0.316 (0.050)	0.153 (0.046)	0.169 (0.049)	0.177 (0.045)
	GTsurvival	0.264 (0.039)	0.266 (0.042)	0.280 (0.046)	0.138 (0.036)	0.144 (0.045)	0.149 (0.045)
$n = 400$	DNN	0.359 (0.076)	0.370 (0.075)	0.425 (0.076)	0.204 (0.059)	0.214 (0.059)	0.297 (0.059)
	CNN	0.312 (0.035)	0.316 (0.032)	0.334 (0.044)	0.171 (0.034)	0.172 (0.033)	0.189 (0.058)
	GCN	0.296 (0.042)	0.292 (0.047)	0.323 (0.043)	0.167 (0.044)	0.160 (0.042)	0.188 (0.042)
	GTsurvival	0.283 (0.034)	0.275 (0.033)	0.292 (0.038)	0.156 (0.037)	0.148 (0.034)	0.164 (0.037)
$K = 5$
$n = 200$	DNN	0.367 (0.059)	0.381 (0.054)	0.436 (0.059)	0.222 (0.052)	0.239 (0.047)	0.324 (0.055)
	CNN	0.314 (0.037)	0.322 (0.027)	0.323 (0.046)	0.183 (0.035)	0.191 (0.034)	0.185 (0.058)
	GCN	0.277 (0.046)	0.292 (0.046)	0.307 (0.042)	0.151 (0.039)	0.165 (0.046)	0.169 (0.039)
	GTsurvival	0.258 (0.040)	0.265 (0.042)	0.270 (0.037)	0.136 (0.040)	0.145 (0.045)	0.142 (0.037)
$n = 400$	DNN	0.382 (0.053)	0.382 (0.057)	0.422 (0.041)	0.240 (0.052)	0.238 (0.054)	0.293 (0.052)
	CNN	0.335 (0.032)	0.336 (0.028)	0.348 (0.032)	0.205 (0.039)	0.204 (0.030)	0.217 (0.038)
	GCN	0.299 (0.042)	0.297 (0.040)	0.320 (0.040,)	0.174 (0.038)	0.171 (0.037)	0.190 (0.041)
	GTsurvival	0.284 (0.030)	0.279 (0.034)	0.291 (0.033)	0.162 (0.035)	0.158 (0.035)	0.169 (0.035)

Open in a new tab

Box plots of MAE and MSE metrics in Experiment 1. Subfigures (A–F) correspond to the results of different feature count (p) settings (50, 100, 500) with a fixed sample size $n = 200$ in Experiment 1, where (A–C) present the MAE metric results and (D–F) present the MSE metric results across the three time points ( $τ_{1} = 1$ , $τ_{2} = 3$ , $τ_{3} = 5$ years).

Firstly, the different methods in Experiment 1 were compared, and the numerical results were listed in Table 1. As anticipated, GTsurvival significantly outperformed CNN, DNN and GCN, as evidenced by its lowest MSE and MAE values. Specifically, GTsurvival was superior to GCN, suggesting that combining a neural decision tree with GCN is more favorable than using GCN alone for survival analysis.

Secondly, as presented in Supplementary Table S1 for Experiment 2, GTsurvival exhibited superior performance on both metrics compared to other algorithms, validating its robust predictive capabilities. Despite marginally underperforming compared to GTsurvival, GCN outperformed CNN and DNN on the majority of simulated datasets, highlighting the significant role of GCN’s graph structure in the network’s functionality.

Thirdly, GTsurvival exhibited strong generality in Experiment 3, even with varying samples and feature counts. Meanwhile, GTsurvival and GCN exhibited comparable MSE metrics at $K = 3$ . Except for $(n, p) = (400, 500)$ , where GCN (MAE = 0.646) performed slightly better than GTsurvival (MAE = 0.651), GTsurvival outperformed the other models in all other settings. Therefore, these results collectively indicated that GTsurvival provided a reliable approach to analyzing and predicting survival data.

3.3. Application

3.3.1. Data Preprocessing and Genetic Feature Selection

We applied the GTsurvival method outlined in Section 2 to analyze a genomic dataset from ADNI. First, raw gene expression values extracted directly from Affymetrix HG U219 Array CEL files were preprocessed using the Robust Multi-chip Average (RMA) algorithm, which includes background correction, quantile normalization, and probe set summarization to ensure cross-sample comparability. Following this, we discarded genes with missing or zero expression across all subjects to obtain the initial expression matrix. Subsequently, to select genetic features associated with disease progression prior to graph construction and model training, we adopted the generalized score test proposed by Sun and Ding [10]. Setting p-value thresholds at 0.02, 0.005, and 0.002 resulted in 592, 175, and 80 genetic features, respectively. These selected features were used for subsequent graph construction and model training.

When applying the generalized score test for feature selection, our objective was to identify genetic variants that maximize the reduction of uncertainty in predicting disease progression. This selection process aligns with entropy-constrained information extraction in information theory: by testing the statistical association between genetic variants and disease progression, the test selects features with the highest information contribution to survival prediction and eliminates redundant or irrelevant variations. These selected features (i.e., genes) convey the most survival outcome-relevant information, enabling subsequent graph neural networks to learn efficiently on a more information-dense feature subset.

3.3.2. Construction of Gene Interaction Network

A group of representative genes with p-value < 0.02 were extracted, and the similarity of gene expression patterns across all samples was measured using Pearson and Spearman correlation coefficients. Cytoscape (version 3.10.0) was utilized to construct the gene interaction network, which is open source and freely available from https://cytoscape.org/ (accessed on 5 March 2025), under the GNU LGPL (Lesser General Public License). In the network, edges between genetic nodes were connected when the correlation coefficient between gene pairs exceeded 0.6. Notably, this study specifically explored the genes ABCA7, ADD3, NARG2, and ACAA1, as identified in Figure 4 and Figure 5. Evidence from human-based studies [24] has confirmed ABCA7 as one of the most prominent genes linked to AD risk, harboring both common and rare risk variants with a substantial impact on AD risk. Notably, ADD3, alternatively known as Adducin 3, is a protein-coding gene involved in disorders including cerebral palsy, spastic quadriplegic 3, and spastic quadriplegic cerebral palsy. Liang et al. [25] conducted a weighted gene co-expression network analysis and indicated that it could be a potential therapeutic target for AD. Furthermore, genetic analyses [26] of cortical grey matter thickness and fractional anisotropy showed a notable correlation between the NARG2 gene and gray matter. In addition, a recent study [27] discovered a novel missense variant in ACAA1, linked to early-onset AD, impaired lysosomal function, and the worsening of amyloid- $β$ pathology and cognitive decline. These novel findings may aid in enhancing the comprehension of AD pathogenesis.

The gene interaction network diagram with a significance threshold of the p-value < 0.02. Pearson correlation coefficients are utilized in the main figure to define the network’s edges, providing a clear representation of gene associations. The subgraph in the bottom right corner showcases the sub-networks derived from the genes ADD3, ACAA1, NARG2, and ABCA7. Nodes are color-coded based on their degree of connectivity: red points indicate nodes with high degrees, representing strong relationships with other genes, while blue points indicate nodes with low degrees, representing weaker relationships.

This gene interaction network diagram with a significance threshold of the p-value < 0.02. Spearman correlation coefficients were used to establish the network’s edges, ensuring a robust representation of gene associations. A detailed subgraph, positioned in the bottom-right corner, highlights sub-networks originating from the genes ADD3, ACAA1, NARG2, and ABCA7. Nodes are color-coded based on their connectivity: red points indicate nodes with high degrees, signifying strong associations with other genes, while blue points represent nodes with low degrees, indicating fewer connections.

3.3.3. Model Training and Performance Comparison

We developed a predictive model GTsurvival using representative genetic features. The ADNI dataset was evaluated under the same fivefold cross-validation described in Section 2.4.5. All methods were applied to the same data splits. Hyperparameter optimization used the predefined search spaces from Section 2.4.5, ensuring a consistent and fair comparison across all models. In addition to the methods in Section 3.1, we further include FastPseudo [8] for comparison. This model accelerates convergence via asymptotic formulas for computing pseudo-observations of the survival function and RMST under right-censored and interval-censored data.

Table 2 presented the results of GTsurvival across various p-value ranges. For p-values less than 0.02, GTsurvival outperformed other models, demonstrating the lowest MSE and MAE metrics. In contrast, CNN performed poorly in both MSE and MAE metrics with p-value less than 0.005. Overall, GTsurvival outperformed other models in the majority of scenarios. For p-values less than 0.002, GTsurvival achieved the lowest MAE value of 0.577, which was 35.1% lower than that of DNN. To verify whether the improvements of the proposed method over other models are statistically significant, we also conducted Wilcoxon signed-rank tests. The detailed results are presented in Supplementary Table S3. Notably, all corresponding p-values from the Wilcoxon tests are less than 0.01, confirming that the performance advantages of our proposed GTsurvival model over baseline methods are statistically significant.

Table 2.

Application results of GTsurvival and three comparative methods across different p-value ranges. The restricted mean survival time (RMST) was estimated at K time points ( $K \in {1, 3, 5}$ ), with observation times $τ_{1}, τ_{2}, \dots, τ_{K}$ set as ${1}$ , ${1, 3, 5}$ , and ${1, 3, 5, 7, 9}$ years, respectively. Evaluation metrics include mean squared error (MSE) and mean absolute error (MAE), where smaller values indicate better predictive performance. The best performance is highlighted in bold, and the second-best result is underlined.

	MAE			MSE
	$K = 1$	$K = 3$	$K = 5$	$K = 1$	$K = 3$	$K = 5$
p-value < 0.02
FastPseudo	1.924 (0.243)	1.509 (0.519)	1.718 (0.481)	5.948 (1.467)	4.693 (3.306)	5.504 (3.137)
DNN	1.352 (0.885)	1.547 (0.479)	1.187 (0.376)	3.105 (1.953)	5.760 (3.518)	4.047 (2.608)
CNN	1.607 (0.936)	1.800 (0.470)	1.775 (0.304)	3.730 (2.043)	7.787 (3.536)	6.820 (2.384)
GCN	1.315 (0.837)	1.435 (0.430)	1.330 (0.345)	2.986 (1.846)	4.941 (3.062)	4.896 (2.413)
GTsurvival	0.587 (0.094)	1.030 (0.142)	1.033 (0.159)	0.533 (0.265)	2.682 (1.362)	2.416 (0.942)
p-value < 0.005
FastPseudo	2.396 (0.833)	1.523 (0.484)	1.707 (0.467)	8.747 (7.876)	4.746 (3.311)	5.343 (2.938)
DNN	1.534 (0.895)	1.366 (0.483)	1.211 (0.321)	3.673 (1.650)	4.558 (2.891)	4.465 (2.488)
CNN	1.607 (0.770)	1.800 (0.470)	1.687 (0.306)	3.732 (1.854)	7.980 (3.538)	6.146 (2.398)
GCN	0.592 (0.180)	1.777 (0.407)	1.433 (0.385)	0.546 (0.207)	7.927 (3.310)	5.093 (2.665)
GTsurvival	0.581 (0.089)	1.013 (0.131)	1.025 (0.150)	0.532 (0.263)	2.930 (1.324)	2.406 (0.891)
p-value < 0.002
FastPseudo	2.000 (0.453)	1.362 (0.408)	1.570 (0.400)	6.010 (3.968)	3.965 (2.615)	4.803 (2.677)
DNN	0.889 (0.547)	1.499 (0.486)	1.192 (0.295)	1.446 (1.910)	6.172 (2.602)	3.390 (2.045)
CNN	1.424 (0.775)	1.748 (0.340)	1.799 (0.297)	3.122 (1.610)	7.760 (3.554)	6.624 (2.321)
GCN	1.117 (0.619)	1.484 (0.467)	1.413 (0.302)	2.343 (1.866)	6.419 (3.614)	4.878 (2.478)
GTsurvival	0.577 (0.088)	1.063 (0.133)	0.994 (0.122)	0.515 (0.237)	2.749 (1.388)	2.252 (0.898)

Open in a new tab

3.3.4. Interpretation of Results via LIME

To interpret GTsurvival’s predictions, we applied Local Interpretable Model-agnostic Explanations [16] (LIME) to evaluate predictor importance. This approach constructs simplified local models that approximate the behavior of complex neural networks. The top 15 important predictors are visualized in Figure 6. Our analysis identified several notable genes, including ADAMTS1, ACACB, and BRCA1. Existing evidence supports their roles in neurodegenerative processes. Specifically, Gurses et al. [28] demonstrated that ADAMTS genes play a significant role in neuroplasticity regulation and are implicated in nervous system pathologies including AD. These findings suggest therapeutic potential for ADAMTS family members in addressing central nervous system disorders, ischemic injuries, and neurodegenerative conditions. Furthermore, Liu et al. [29] identified ACACB as a hub gene through protein-protein interaction network analysis, suggesting its potential diagnostic value as an AD biomarker. The BRCA1 gene, known for its involvement in DNA damage response (DDR) mechanisms, has been linked to the pathogenesis of AD. Nakamura et al. [30] proposed that DDR dysfunction resulting from cytoplasmic sequestration of BRCA1 is involved in the pathogenesis of tauopathies. These findings may provide insights into the mechanisms of AD.

Feature importance plot based on LIME method.

3.3.5. External Validation on TCGA-LGG Dataset

To further validate the predictive performance, we also analyzed the brain lower-grade glioma (LGG) dataset from The Cancer Genome Atlas (TCGA). This dataset contains 531 samples and 20,530 genetic features, with patient death as the event of interest. To reduce the influence of noise-sensitive features, we first filtered out genetic features with very low variation across samples (as measured by variance). The remaining features were then ranked by information gain relative to the outcome variable in descending order, and the top 600 genes were selected for further analysis. As shown in Supplementary Table S4, GTsurvival demonstrates better prediction performance compared to GCN, FastPseudo, and other methods.

3.3.6. Ablation Study

To verify the contribution of the dual fusion strategy in GTsurvival to survival prediction, we conducted ablation experiments on the ADNI and TCGA-LGG datasets by modifying different components of the model configuration. We adopted the following configurations to evaluate the effectiveness of each component:

GCNet: Only graph convolutional network (GCN) layers, without the neural decision tree.
NDF: Two fully connected (FC) layers combined with the neural decision tree (no GCN layers).
FastPseudo: Utilizes the RMST estimation method proposed by Bouaziz [8], instead of the method described in Section 2.3.

As shown in Supplementary Table S5, GTsurvival integrates GCN and neural decision tree via the dual fusion strategy and achieves the best performance on both datasets, with smaller MAE and MSE indicating superior predictive ability.

4. Discussion and Concluding Remarks

Chronic progressive diseases such as AD are multifactorial disorders driven by intricate interactions between genetic variants and environmental factors. Traditional linear models are often limited in their ability to capture non-linear interactions and high-dimensional genomic complexity, which can lead to a loss of critical biological signals relevant to disease progression. Our proposed model, GTsurvival, which integrates GCNs with the neural decision tree, addresses these limitations by leveraging graph-structured learning to model genetic interactions and using pseudo-observations based on RMST to handle censored data effectively. The improved performance of GTsurvival over conventional methods, demonstrated in both simulated data and the ADNI cohort, supports its potential to advance the early prediction of diseases.

A key advantage of GTsurvival is its ability to identify genetic variants strongly associated with AD progression through generalized score testing and feature importance analysis (Figure 6). The most significant genes identified include ABCA7, ADD3, NARG2, ACAA1, ADAMTS1, ACACB and BRCA1, each of which has a documented role in AD-related biological processes, as supported by prior literature and reinforced by our network analysis (Figure 4 and Figure 5). For example, ABCA7 is a well-recognized AD risk gene [24]. A recently identified missense variant in ACAA1 was shown to impair lysosomal function, thereby aggravating amyloid- $β$ pathology and accelerating cognitive decline in early-onset AD [27]. Other genes also contribute to key dysfunctional processes in AD: ADAMTS1 is implicated in neuroplasticity [28], ACACB has been proposed as a potential biomarker [29], and BRCA1 participates in DNA damage repair mechanisms relevant to disease pathogenesis [30]. Together, these findings confirm that GTsurvival effectively captures biologically meaningful signals relevant to disease progression.

Noteworthily, survival prediction remains a challenging field requiring further optimization despite the significant predictive capability of GTsurvival. First, the model was developed and validated on the genetic dataset; integrating additional multimodal data (e.g., DNA methylation or neuroimaging features) could improve its accuracy and generalizability. Second, the predominantly White cohort limits the extrapolation of findings [31], as AD mechanisms may vary across ethnicities, and future work should include more diverse populations. Third, the graph construction relies on KNN with uniform edge weights, which may overlook variations in interaction strength between genetic features and limit the GCN’s ability to capture key biological relationships. Finally, expanding other evaluation metrics would provide a more comprehensive assessment of model performance. Addressing these aspects will be important for further refining and validating the model.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/e28010028/s1. The following are available online. We evaluated the performance of GTsurvival and other methods in Experiments 1–3. Supplementary Figures S1–S4 are designed for Experiment 1. We observed the performance of GTsurvival in a variety of sample sizes $n = 200 / 400$ and feature counts $p = 50 / 100 / 500$ . The results of Supplementary Figures S1–S4 indicated that GTsurvival can apply to datasets with non-linear relationship. Supplementary Figures S5–S10 are used to observe the generalization capability of GTsurvival in Experiment 2. GTsurvival performed best even under the different K groups. As for Experiment 3, GTsurvival obtained lower MSE and MAE than other methods in Supplementary Figures S11–S16. Supplementary Tables S1 and S2 presented the MSE and MAE results of several algorithms used in Experiments 2 and 3. The results indicated the advantage of GTsurvival in predicting survival in complex censored data. Supplementary Table S3. Statistical test results of GTsurvival vs. Second-Top baseline methods. Supplementary Table S4. Comparison of predictive performance among different methods on the LGG dataset (smaller values of MAE and MSE indicate better performance). Supplementary Table S5. Results of ablation experiments on ADNI and TCGA-LGG datasets. Section S3. Detailed process of gradient derivation.

entropy-28-00028-s001.zip^{(490.3KB, zip)}

Author Contributions

Formal analysis, J.Z.; Investigation, D.L.; Methodology, J.C.; Software, S.Z. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset in this paper is from https://adni.loni.usc.edu/ (accessed on 5 March 2025). One can registered for a download account and apply for the data.

Conflicts of Interest

The authors declare no conflict of interest.

Funding Statement

This work was funded by National Natural Science Foundation of China (NSFC) (12071176).

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

1.World Health Organization . Global Status Report on the Public Health Response to Dementia 2017–2025. World Health Organization; Geneva, Switzerland: 2021. pp. 1–27. [Google Scholar]
2.de Rojas I., Moreno-Grau S., Tesi N., Grenier-Boley B., Andrade V., Jansen I.E., Pedersen N.L., Stringa N., Zettergren A., Hernández I., et al. Common variants in Alzheimer’s disease and risk stratification by polygenic risk scores. Nat. Commun. 2021;12:3417. doi: 10.1038/s41467-021-22491-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Mueller S.G., Weiner M.W., Thal L.J., Petersen R.C., Jack C., Jagust W., Trojanowski J.Q., Toga A.W., Beckett L., Saykin A.J., et al. The Alzheimer’s disease neuroimaging initiative. Neuroimaging Clin. N. Am. 2005;15:869. doi: 10.1016/j.nic.2005.09.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Zucker D.M. Restricted Mean Life with Covariates: Modification and Extension of a Useful Survival Analysis Method. J. Am. Stat. Assoc. 1998;93:702–709. doi: 10.1080/01621459.1998.10473722. [DOI] [Google Scholar]
5.Chen P., Tsiatis A.A. Causal Inference on the Difference of the Restricted Mean Lifetime Between Two Groups. Biometrics. 2001;57:1030–1038. doi: 10.1111/j.0006-341X.2001.01030.x. [DOI] [PubMed] [Google Scholar]
6.Zhang M., Schaubel D.E. Estimating Differences in Restricted Mean Lifetime Using Observational Data Subject to Dependent Censoring. Biometrics. 2011;67:740–749. doi: 10.1111/j.1541-0420.2010.01503.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Zhong Y., Schaubel D.E. Restricted mean survival time as a function of restriction time. Biometrics. 2022;78:192–201. doi: 10.1111/biom.13414. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Bouaziz O. Fast approximations of pseudo-observations in the context of right censoring and interval censoring. Biom. J. 2023;65:e2200071. doi: 10.1002/bimj.202200071. [DOI] [PubMed] [Google Scholar]
9.Zhao L. Deep neural networks for predicting restricted mean survival times. Bioinformatics. 2021;36:5672–5677. doi: 10.1093/bioinformatics/btaa1082. [DOI] [PMC free article] [PubMed] [Google Scholar]
10.Sun T., Ding Y. Neural network on interval-censored data with application to the prediction of Alzheimer’s disease. Biometrics. 2023;79:2677–2690. doi: 10.1111/biom.13734. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Kipf T., Welling M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv. 20161609.02907 [Google Scholar]
12.Peng H., Li Y., Zhang W. SCAFG: Classifying Single Cell Types Based on an Adaptive Threshold Fusion Graph Convolution Network. Mathematics. 2022;10:3407. doi: 10.3390/math10183407. [DOI] [Google Scholar]
13.Xu C., Cai L., Gao J. An efficient scRNA-seq dropout imputation method using graph attention network. BMC Bioinform. 2021;22:582. doi: 10.1186/s12859-021-04493-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Wen G., Li L. FGCNSurv: Dually fused graph convolutional network for multi-omics survival prediction. Bioinformatics. 2023;39:btad472. doi: 10.1093/bioinformatics/btad472. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Ling Y., Liu Z., Xue J. Survival Analysis of High-Dimensional Data With Graph Convolutional Networks and Geometric Graphs. IEEE Trans. Neural Netw. Learn. Syst. 2024;35:4876–4886. doi: 10.1109/TNNLS.2022.3190321. [DOI] [PubMed] [Google Scholar]
16.Ribeiro M.T., Singh S., Guestrin C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; San Francisco, CA, USA. 13–17 August 2016; pp. 1135–1144. KDD ’16. [Google Scholar]
17.Kontschieder P., Fiterau M., Criminisi A., Rota Bulo S. Deep Neural Decision Forests; Proceedings of the IEEE International Conference on Computer Vision (ICCV); Santiago, Chile. 7–13 December 2015; pp. 1467–1475. [Google Scholar]
18.Reitz C., Rogaeva E., Beecham G. Late-Onset vs Nonmendelian Early-Onset Alzheimer Disease: A Distinction Without a Difference? Neurol. Genet. 2020;6:e512. doi: 10.1212/NXG.0000000000000512. [DOI] [PMC free article] [PubMed] [Google Scholar]
19.Sun J. The Statistical Analysis of Interval-Censored Failure Time Data. Springer; New York, NY, USA: 2006. [Google Scholar]
20.Wellner J.A., Zhan Y. A Hybrid Algorithm for Computation of the Nonparametric Maximum Likelihood Estimator from Censored Data. J. Am. Stat. Assoc. 1997;92:945–959. doi: 10.1080/01621459.1997.10474049. [DOI] [Google Scholar]
21.Jongbloed G. The Iterative Convex Minorant Algorithm for Nonparametric Estimation. J. Comput. Graph. Stat. 1998;7:310–321. doi: 10.1080/10618600.1998.10474778. [DOI] [Google Scholar]
22.Anderson B.C. An Efficient Implementation of the EMICM Algorithm for the Interval Censored NPMLE. J. Comput. Graph. Stat. 2017;26:463–467. doi: 10.1080/10618600.2016.1208616. [DOI] [Google Scholar]
23.Glorot X., Bordes A., Bengio Y. Deep Sparse Rectifier Neural Networks. In: Gordon G., Dunson D., Dudík M., editors. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics; Fort Lauderdale, FL, USA. 11–13 April 2011; Fort Lauderdale, FL, USA: PMLR; 2011. pp. 315–323. [Google Scholar]
24.De Roeck A., Van Broeckhoven C., Sleegers K. The role of ABCA7 in Alzheimer’s disease: Evidence from genomics, transcriptomics and methylomics. Acta Neuropathol. 2019;138:201–220. doi: 10.1007/s00401-019-01994-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
25.Liang J.W., Fang Z.Y., Huang Y., Liuyang Z.Y., Zhang X.L., Wang J.L., Wei H., Wang J.Z., Wang X.C., Zeng J., et al. Application of Weighted Gene Co-Expression Network Analysis to Explore the Key Genes in Alzheimer’s Disease. J. Alzheimer’s Dis. 2018;65:1353–1364. doi: 10.3233/JAD-180400. [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Kochunov P., Glahn D.C., Nichols T.E., Winkler A.M., Hong E.L., Holcomb H.H., Stein J.L., Thompson P.M., Curran J.E., Carless M.A., et al. Genetic analysis of cortical thickness and fractional anisotropy of water diffusion in the brain. Front. Neurosci. 2011;5:120. doi: 10.3389/fnins.2011.00120. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Luo R., Fan Y., Yang J., Ye M., Zhang D.-F., Guo K., Li X., Bi R., Xu M., Yang L.-X., et al. A novel missense variant in ACAA1 contributes to early-onsand otherszheimer’s disease, impairs lysosomal function, and facilitates amyloid-beta pathology and cognitive decline. Signal Transduct. Target. Ther. 2021;6:325. doi: 10.1038/s41392-021-00748-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
28.Gurses M.S., Ural M.N., Gulec M.A., Akyol O., Akyol S. Pathophysiological Function of ADAMTS Enzymes on Molecular Mechanism of Alzheimer’s Disease. Aging Dis. 2016;7:479–490. doi: 10.14336/AD.2016.0111. [DOI] [PMC free article] [PubMed] [Google Scholar]
29.Liu L., Wu Q., Zhong W., Chen Y., Zhang W., Ren H., Sun L., Sun J. Microarray Analysis of Differential Gene Expression in Alzheimer’s Disease Identifies Potential Biomarkers with Diagnostic Value. Med. Sci Monit. 2020;26:e919249. doi: 10.12659/MSM.919249. [DOI] [PMC free article] [PubMed] [Google Scholar]
30.Nakamura M., Kaneko S., Dickson D.W., Kusaka H. Aberrant Accumulation of BRCA1 in Alzheimer Disease and Other Tauopathies. J. Neuropathol. Exp. Neurol. 2020;79:22–33. doi: 10.1093/jnen/nlz107. [DOI] [PubMed] [Google Scholar]
31.Morris J.C., Schindler S.E., McCue L.M., Moulder K.L., Benzinger T.L., Cruchaga C., Fagan A.M., Grant E., Gordon B.A., Holtzman D.M., et al. Assessment of Racial Disparities in Biomarkers for Alzheimer Disease. JAMA Neurol. 2019;76:264–273. doi: 10.1001/jamaneurol.2018.4249. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

entropy-28-00028-s001.zip^{(490.3KB, zip)}

Data Availability Statement

The dataset in this paper is from https://adni.loni.usc.edu/ (accessed on 5 March 2025). One can registered for a download account and apply for the data.

[B1-entropy-28-00028] 1.World Health Organization . Global Status Report on the Public Health Response to Dementia 2017–2025. World Health Organization; Geneva, Switzerland: 2021. pp. 1–27. [Google Scholar]

[B2-entropy-28-00028] 2.de Rojas I., Moreno-Grau S., Tesi N., Grenier-Boley B., Andrade V., Jansen I.E., Pedersen N.L., Stringa N., Zettergren A., Hernández I., et al. Common variants in Alzheimer’s disease and risk stratification by polygenic risk scores. Nat. Commun. 2021;12:3417. doi: 10.1038/s41467-021-22491-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B3-entropy-28-00028] 3.Mueller S.G., Weiner M.W., Thal L.J., Petersen R.C., Jack C., Jagust W., Trojanowski J.Q., Toga A.W., Beckett L., Saykin A.J., et al. The Alzheimer’s disease neuroimaging initiative. Neuroimaging Clin. N. Am. 2005;15:869. doi: 10.1016/j.nic.2005.09.008. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B4-entropy-28-00028] 4.Zucker D.M. Restricted Mean Life with Covariates: Modification and Extension of a Useful Survival Analysis Method. J. Am. Stat. Assoc. 1998;93:702–709. doi: 10.1080/01621459.1998.10473722. [DOI] [Google Scholar]

[B5-entropy-28-00028] 5.Chen P., Tsiatis A.A. Causal Inference on the Difference of the Restricted Mean Lifetime Between Two Groups. Biometrics. 2001;57:1030–1038. doi: 10.1111/j.0006-341X.2001.01030.x. [DOI] [PubMed] [Google Scholar]

[B6-entropy-28-00028] 6.Zhang M., Schaubel D.E. Estimating Differences in Restricted Mean Lifetime Using Observational Data Subject to Dependent Censoring. Biometrics. 2011;67:740–749. doi: 10.1111/j.1541-0420.2010.01503.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B7-entropy-28-00028] 7.Zhong Y., Schaubel D.E. Restricted mean survival time as a function of restriction time. Biometrics. 2022;78:192–201. doi: 10.1111/biom.13414. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B8-entropy-28-00028] 8.Bouaziz O. Fast approximations of pseudo-observations in the context of right censoring and interval censoring. Biom. J. 2023;65:e2200071. doi: 10.1002/bimj.202200071. [DOI] [PubMed] [Google Scholar]

[B9-entropy-28-00028] 9.Zhao L. Deep neural networks for predicting restricted mean survival times. Bioinformatics. 2021;36:5672–5677. doi: 10.1093/bioinformatics/btaa1082. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B10-entropy-28-00028] 10.Sun T., Ding Y. Neural network on interval-censored data with application to the prediction of Alzheimer’s disease. Biometrics. 2023;79:2677–2690. doi: 10.1111/biom.13734. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B11-entropy-28-00028] 11.Kipf T., Welling M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv. 20161609.02907 [Google Scholar]

[B12-entropy-28-00028] 12.Peng H., Li Y., Zhang W. SCAFG: Classifying Single Cell Types Based on an Adaptive Threshold Fusion Graph Convolution Network. Mathematics. 2022;10:3407. doi: 10.3390/math10183407. [DOI] [Google Scholar]

[B13-entropy-28-00028] 13.Xu C., Cai L., Gao J. An efficient scRNA-seq dropout imputation method using graph attention network. BMC Bioinform. 2021;22:582. doi: 10.1186/s12859-021-04493-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B14-entropy-28-00028] 14.Wen G., Li L. FGCNSurv: Dually fused graph convolutional network for multi-omics survival prediction. Bioinformatics. 2023;39:btad472. doi: 10.1093/bioinformatics/btad472. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B15-entropy-28-00028] 15.Ling Y., Liu Z., Xue J. Survival Analysis of High-Dimensional Data With Graph Convolutional Networks and Geometric Graphs. IEEE Trans. Neural Netw. Learn. Syst. 2024;35:4876–4886. doi: 10.1109/TNNLS.2022.3190321. [DOI] [PubMed] [Google Scholar]

[B16-entropy-28-00028] 16.Ribeiro M.T., Singh S., Guestrin C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; San Francisco, CA, USA. 13–17 August 2016; pp. 1135–1144. KDD ’16. [Google Scholar]

[B17-entropy-28-00028] 17.Kontschieder P., Fiterau M., Criminisi A., Rota Bulo S. Deep Neural Decision Forests; Proceedings of the IEEE International Conference on Computer Vision (ICCV); Santiago, Chile. 7–13 December 2015; pp. 1467–1475. [Google Scholar]

[B18-entropy-28-00028] 18.Reitz C., Rogaeva E., Beecham G. Late-Onset vs Nonmendelian Early-Onset Alzheimer Disease: A Distinction Without a Difference? Neurol. Genet. 2020;6:e512. doi: 10.1212/NXG.0000000000000512. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B19-entropy-28-00028] 19.Sun J. The Statistical Analysis of Interval-Censored Failure Time Data. Springer; New York, NY, USA: 2006. [Google Scholar]

[B20-entropy-28-00028] 20.Wellner J.A., Zhan Y. A Hybrid Algorithm for Computation of the Nonparametric Maximum Likelihood Estimator from Censored Data. J. Am. Stat. Assoc. 1997;92:945–959. doi: 10.1080/01621459.1997.10474049. [DOI] [Google Scholar]

[B21-entropy-28-00028] 21.Jongbloed G. The Iterative Convex Minorant Algorithm for Nonparametric Estimation. J. Comput. Graph. Stat. 1998;7:310–321. doi: 10.1080/10618600.1998.10474778. [DOI] [Google Scholar]

[B22-entropy-28-00028] 22.Anderson B.C. An Efficient Implementation of the EMICM Algorithm for the Interval Censored NPMLE. J. Comput. Graph. Stat. 2017;26:463–467. doi: 10.1080/10618600.2016.1208616. [DOI] [Google Scholar]

[B23-entropy-28-00028] 23.Glorot X., Bordes A., Bengio Y. Deep Sparse Rectifier Neural Networks. In: Gordon G., Dunson D., Dudík M., editors. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics; Fort Lauderdale, FL, USA. 11–13 April 2011; Fort Lauderdale, FL, USA: PMLR; 2011. pp. 315–323. [Google Scholar]

[B24-entropy-28-00028] 24.De Roeck A., Van Broeckhoven C., Sleegers K. The role of ABCA7 in Alzheimer’s disease: Evidence from genomics, transcriptomics and methylomics. Acta Neuropathol. 2019;138:201–220. doi: 10.1007/s00401-019-01994-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B25-entropy-28-00028] 25.Liang J.W., Fang Z.Y., Huang Y., Liuyang Z.Y., Zhang X.L., Wang J.L., Wei H., Wang J.Z., Wang X.C., Zeng J., et al. Application of Weighted Gene Co-Expression Network Analysis to Explore the Key Genes in Alzheimer’s Disease. J. Alzheimer’s Dis. 2018;65:1353–1364. doi: 10.3233/JAD-180400. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B26-entropy-28-00028] 26.Kochunov P., Glahn D.C., Nichols T.E., Winkler A.M., Hong E.L., Holcomb H.H., Stein J.L., Thompson P.M., Curran J.E., Carless M.A., et al. Genetic analysis of cortical thickness and fractional anisotropy of water diffusion in the brain. Front. Neurosci. 2011;5:120. doi: 10.3389/fnins.2011.00120. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B27-entropy-28-00028] 27.Luo R., Fan Y., Yang J., Ye M., Zhang D.-F., Guo K., Li X., Bi R., Xu M., Yang L.-X., et al. A novel missense variant in ACAA1 contributes to early-onsand otherszheimer’s disease, impairs lysosomal function, and facilitates amyloid-beta pathology and cognitive decline. Signal Transduct. Target. Ther. 2021;6:325. doi: 10.1038/s41392-021-00748-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B28-entropy-28-00028] 28.Gurses M.S., Ural M.N., Gulec M.A., Akyol O., Akyol S. Pathophysiological Function of ADAMTS Enzymes on Molecular Mechanism of Alzheimer’s Disease. Aging Dis. 2016;7:479–490. doi: 10.14336/AD.2016.0111. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B29-entropy-28-00028] 29.Liu L., Wu Q., Zhong W., Chen Y., Zhang W., Ren H., Sun L., Sun J. Microarray Analysis of Differential Gene Expression in Alzheimer’s Disease Identifies Potential Biomarkers with Diagnostic Value. Med. Sci Monit. 2020;26:e919249. doi: 10.12659/MSM.919249. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B30-entropy-28-00028] 30.Nakamura M., Kaneko S., Dickson D.W., Kusaka H. Aberrant Accumulation of BRCA1 in Alzheimer Disease and Other Tauopathies. J. Neuropathol. Exp. Neurol. 2020;79:22–33. doi: 10.1093/jnen/nlz107. [DOI] [PubMed] [Google Scholar]

[B31-entropy-28-00028] 31.Morris J.C., Schindler S.E., McCue L.M., Moulder K.L., Benzinger T.L., Cruchaga C., Fagan A.M., Grant E., Gordon B.A., Holtzman D.M., et al. Assessment of Racial Disparities in Biomarkers for Alzheimer Disease. JAMA Neurol. 2019;76:264–273. doi: 10.1001/jamaneurol.2018.4249. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

GTsurvival: A Hybrid GCN-Neural Decision Tree Model for Restricted Mean Survival Time Prediction with Complex Censored Data

Jingyi Zhang

Shishun Zhao

Dongmei Lu

Jianhua Cheng

Roles

Abstract

1. Introduction

2. Methods

2.1. Data Description

2.2. Interval Censoring

2.3. Restricted Mean Survival Times

2.3.1. Problem Formulation and Notation

2.3.2. Construction of the Response Variable via Pseudo-Values

2.4. Network Architecture

2.4.1. Build Connection Graph

2.4.2. Graph Convolution Network with Decision Trees

Figure 1.

2.4.3. Loss Function

2.4.4. Gradient-Based Learning for Decision Tree Parameters

2.4.5. Implementation Details

3. Results

3.1. Simulation and Experimental Design

3.2. Simulation Results

Table 1.

Figure 2.

Figure 3.

3.3. Application

3.3.1. Data Preprocessing and Genetic Feature Selection

3.3.2. Construction of Gene Interaction Network

Figure 4.

Figure 5.

3.3.3. Model Training and Performance Comparison

Table 2.

3.3.4. Interpretation of Results via LIME

Figure 6.

3.3.5. External Validation on TCGA-LGG Dataset

3.3.6. Ablation Study

4. Discussion and Concluding Remarks

Supplementary Materials

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Funding Statement

Footnotes

References

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases