Parameter-efficient convolutional neural network for drug treatment outcome studies of pediatric epilepsy

Cailei Zhao; Zhao Liao; Dian Jiang; Xia Zhao; Bixia Yuan; Rongbo Lin; Jinyun Tang; Benxin Gong; Jianxiang Liao; Ling Lin; Zhanqi Hu

doi:10.1038/s41598-026-39728-5

2026 Feb 11;16:8410. doi: 10.1038/s41598-026-39728-5


Cailei Zhao ¹,#, Zhao Liao ²,#, Dian Jiang ³,#, Xia Zhao ⁴, Bixia Yuan ⁵, Rongbo Lin ⁶, Jinyun Tang ², Benxin Gong ², Jianxiang Liao ⁴, Ling Lin ⁷,✉, Zhanqi Hu ²,✉

PMCID: PMC12972117 PMID: 41667798

Abstract

This research aimed to develop and validate a convolutional neural network leveraging T2-weighted (T2W) and FLAIR MRI images to predict pharmacological treatment outcomes for pediatric epilepsy in tuberous sclerosis complex (TSC) patients. Using the EfficientNet3D-B0 architecture, we developed Efficient Tuberous Sclerosis Complex-Net (eTSC-Net), a weighted ensemble network that trains binary classification models with T2W and T2FLAIR images to differentiate between controlled and non-controlled TSC patients based on one-year post-anti-seizure medication (ASM) outcomes. Of the 95 patients, 39 (41.1%) achieved seizure control, while 56 (58.9%) continued having seizures after one year of ASM treatment. The dataset was split into training (67 patients), validation (9 patients), and test (19 patients) sets. We developed several models, including a baseline ResNet3D and various EfficientNet3D-B0 configurations. The baseline ResNet3D model achieved an AUC of 0.652. All EfficientNet3D-B0 models outperformed the baseline, with the optimized eTSC-Net achieving an AUC of 0.833 in the testing cohort. eTSC-Net can aid clinicians, including epileptologists, neurologists, neurosurgeons, and other physicians caring for TSC patients, by assisting in formulating targeted treatments.

Keywords: Epilepsy, Pediatric, Tuberous sclerosis complex, Drug treatment outcome, Convolutional neural network

Subject terms: Paediatric research, Epilepsy

Introduction

Tuberous sclerosis complex (TSC) is a rare autosomal dominant disorder caused by loss-of-function mutations of TSC1 or TSC2 mTOR pathway genes, which can affect multiple organ systems^1,2, and is frequently associated with seizures and TSC-associated neuropsychiatric disorders, including autism spectrum disorder and cognitive disability³. Epilepsy is the most common and challenging manifestation of TSC, affecting approximately 85% of patients^4,5. The goal of epilepsy treatment in TSC is to control seizures as soon as possible after the diagnosis, which may improve cognitive neurodevelopment and enhance the quality of life. However, while some studies have shown cognitive improvements with early treatment, more recent studies, such as the PREVeNT trial by Bebin et al.⁶, found no significant improvement in cognition following early treatment with vigabatrin. The classic treatment for epilepsy is anti-seizure medications (ASMs)⁷, but over 50% of patients with TSC develop drug-resistant epilepsy (DRE)⁸, and it may take a long time to increase the doses of ASMs and to identify drug resistance^9,10. Therefore, development of biomarkers that can predict outcomes of pharmacological treatment of epilepsy in patients with TSC is critical.

Nervous system manifestations are observed in almost all TSC cases. Magnetic resonance imaging (MRI) provides excellent tissue contrast in clinical settings, and is a modality used routinely to diagnose TSC¹¹. Yang et al.⁹ predicted DRE patients based on MRI lesion location and type of information features, and achieved an area under curve (AUC) of 0.812 with multilayer perceptron. However, the features of MRI were also typically extracted manually, and the description of these features was usually qualitative, subjective, and non-specific. Zhao et al. (2022) explored machine learning and statistical analysis techniques to predict drug treatment outcomes in pediatric epilepsy patients with TSC, identifying several key factors that contributed to predicting the likelihood of drug resistance¹². Their study primarily focused on clinical and radiological data but still faced challenges in scalability and the integration of diverse data sources. Similarly, Hu et al. (2023) conducted a clinical radiomics study to predict drug treatment outcomes in children with TSC-related epilepsy, utilizing quantitative imaging features derived from MRI scans¹³. Their results highlighted the importance of radiomic features, such as texture and shape, in predicting treatment response. However, these studies still relied heavily on traditional machine learning techniques and did not fully exploit the potential of deep learning methods that can capture more complex patterns in high-dimensional data.

Deep neural networks (DNNs) are a class of machine learning models capable of automatic feature extraction and predictive learning. Among them, convolutional neural networks (CNNs) are a specialized type designed for processing spatial data, such as medical images¹⁴. Deep neural networks have been used to detect cortical tubers to augment the ability of clinical radiologists to assess TSC¹⁵.

Previous studies have demonstrated the ability of deep convolutional neural network (CNN) models to classify lung cancer and bone lesions^16,17 and predict brain age on MRI with high accuracy^18,19. However, these studies required hundreds to thousands of images to train the CNNs, which are challenging to obtain in the context of rare disorders such as TSC. In the contemporary literature, there is a paucity of studies that have used CNN to predict epilepsy drug treatment outcomes in TSC patients.

In this study, based on the existing treatment outcomes, we hypothesize that multidimensional features of the hyperplastic lesions can be extracted and utilized to predict the efficacy of drug therapy. To the best of our knowledge, this study is the first to propose a parameter-efficient CNN for predicting anti-seizure medication outcomes in children with TSC. Parameter-efficient networks require fewer trainable parameters while maintaining comparable performance, thereby reducing the risk of overfitting in small-sample scenarios. Here, image-based models were generated using the EfficientNet3D-B0 architecture²⁰. Subsequently, a weighted average ensemble model was constructed by combining the output scores of two modality-specific models—T2W and FLAIR MRI—via weighted summation to enhance the final prediction. T2W and FLAIR sequences are routinely used in TSC evaluation due to their high sensitivity to cortical tubers and related pathologies (Russo et al., 2020). T1-weighted images were not selected, as they are generally less sensitive to TSC-related lesions, particularly cortical tubers, and tend to exhibit lower lesion conspicuity compared to T2W and FLAIR. This final model, named Efficient Tuberous Sclerosis Complex-Net (eTSC-Net), was validated on MRI data from 95 children with TSC-related epilepsy.

The results showed that the proposed method can successfully predict the outcomes of pharmacological therapy in children with TSC-related epilepsy.

Methods

Network architectures

The proposed method is based on the EfficientNet architecture²⁰, a state-of-the-art classification network that improves accuracy while significantly reducing the number of network parameters, which makes it suitable for classification on small datasets. The image-based model is a 3D version of the 2D EfficientNet-B0, adapted for our 3D T2W and FLAIR MRI datasets. Specifically, the 2D convolutional layers and MBConv blocks were replaced with their 3D counterparts (Conv3D, MBConv3D) to process volumetric MRI inputs. One EfficientNet3D-B0 took T2W images as input and produced a prediction score as output; a second EfficientNet3D-B0 took FLAIR images as input. The eTSC-Net model took the prediction scores from the T2W and FLAIR models as input and produced a final classification using a simple weighted average. The EfficientNet3D-B0 architecture was applied to the drug treatment outcome classification task with a single input modality (Fig. 1A). When eTSC-Net was used, both modalities from the current dataset, T2W and FLAIR, were used as inputs (Fig. 1B).

Fig. 1 — Proposed model and pipeline. **(A)** This section of the model takes a single input modality, either T2W or FLAIR 3D images, for processing through the EfficientNet3D-B0 network. The architecture employs a series of convolutional and pooling layers, specifically 3D Convolutional Layer (Conv3D), 3D Mobile Inverted Residual Bottleneck Block (MBConv3D), and Global Average Pooling (GA Pooling), culminating in a Fully Connected Layer (FC) that outputs a prediction score. This score determines the classification of patients into two categories: seizure-controlled patients and refractory patients. **(B)** The eTSC-Net pipeline schematic illustrates the incorporation of both T2W and FLAIR modalities. Each modality’s images pass through the EfficientNet3D-B0 and produce individual prediction scores. These scores are then weighted (w₁ for T2W and w₂ for FLAIR) and combined in an ensemble to refine the classification accuracy between seizure-controlled and refractory patient groups.

Ensemble model

The main building block of the imaging models is EfficientNet3D²⁰. The FLAIR and T2W imaging models were combined using a weighted average ensemble network named eTSC-Net, using Eq. 1:

$$S = w_1 \cdot \mathrm{T2W} + w_2 \cdot \mathrm{FLAIR} \tag{1}$$

where T2W and FLAIR represent the prediction scores of the T2W and FLAIR models, respectively; S represents the output prediction score of eTSC-Net; and w₁ and w₂ are the weights of the imaging models. w₁ and w₂ were optimized on the validation set to prevent test set leakage. Details of the optimization are provided in the Training and Evaluation section.

The proposed eTSC-Net can be extended to other imaging modalities. eTSC-Net receives the prediction scores of the individual modality-specific models as inputs and outputs a weighted mean of those scores. Therefore, Eq. 1 can be extended as follows:

$$S = w_1 M_1 + w_2 M_2 + \cdots + w_n M_n \tag{2}$$

where S represents the output prediction score of eTSC-Net; w₁, w₂, …, w_n are the weights of the individual modalities; and M₁, M₂, …, M_n are the prediction scores of the models trained on the individual modalities (such as T1W, T2W, FLAIR, and DWI). The proposed eTSC-Net is thus an extensible framework. The modality weights are set to the values found to be optimal in the subsequent experiments.
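The weighted fusion described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `fuse_scores`, the dictionary layout, and the example scores are hypothetical, and only the weights (0.6 for T2W, 0.4 for FLAIR) are taken from the values reported as optimal in the Results.

```python
from typing import Mapping


def fuse_scores(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of per-modality prediction scores: S = sum_i w_i * M_i."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same modalities")
    return sum(weights[name] * scores[name] for name in scores)


# Two-modality case. The weights are those reported as optimal in the Results;
# the prediction scores are made-up values for illustration only.
example_scores = {"T2W": 0.80, "FLAIR": 0.30}
example_weights = {"T2W": 0.6, "FLAIR": 0.4}
fused = fuse_scores(example_scores, example_weights)  # 0.6 * 0.80 + 0.4 * 0.30
```

Adding a further modality under this sketch only requires one more entry in each dictionary, which mirrors the extensibility argument made in the text.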

Currently, FLAIR and T2W are the main MRI sequences used by pediatric neurologists based on clinical experience in diagnosing TSC. In this study, each eTSC-Net classification model consisted of the outputs of an EfficientNet3D trained with T2W images and another EfficientNet3D trained with FLAIR images, as seen in Fig. 1B.

Materials and experiments

Drug treatment outcomes were defined according to the criteria set forth by GülMert et al. Patients were considered controlled if they had not experienced clinical seizures for at least one year, and uncontrolled if they had experienced at least one seizure in the past year. Seizure phenotypes were classified as generalized or partial seizures, with infantile spasms (IS) and Lennox-Gastaut syndrome (LGS) as the two main epileptic syndromes.

Dataset

Patients diagnosed with TSC-related epilepsy at the Shenzhen Children’s Hospital between January 2013 and September 2018 were eligible for inclusion in this study²¹. Written informed consent was obtained from the parents or legal caregivers of all subjects before the study, and the study protocols were approved by the Institutional Review Board (IRB) of the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences. All scans included FLAIR and T2W sequences obtained before treatment with ASMs. MRI was performed using a 3-T scanner (Skyra, Siemens, Erlangen, Germany) with an 8-channel head coil. The parameters for FLAIR were: repetition time/echo time = 9000/132 ms; field of view = 230 × 194 mm²; slice thickness = 5 mm; slice gap = 1 mm; and voxel size = 0.7 × 0.7 × 5.0 mm³. The parameters for T2W were: repetition time/echo time = 2790/93 ms; field of view = 258 × 250 mm²; slice thickness = 5 mm; slice gap = 1 mm; and voxel size = 0.6 × 0.6 × 5.0 mm³. Patients who met the following criteria were included: (1) availability of FLAIR and T2W MRI scans; (2) an EEG recorded on admission or in the outpatient department; (3) ASM treatment for at least one year; (4) age at MRI imaging > 6 months; and (5) absence of any other neurological disorder. Of the 103 eligible patients, 95 met the inclusion criteria and were included in the study.

Figure 2 shows the inclusion criteria. The dataset was split into training (67 patients), validation (9 patients), and test (19 patients) sets, an approximate 7:1:2 ratio. In the training set, 27 patients achieved seizure control, while 40 patients continued to have seizures. In the validation set, 4 patients had seizure control and 5 patients still had seizures. In the test set, 8 patients achieved seizure control and 11 patients continued having seizures after one year of ASM treatment.
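A split with these per-class counts could be produced as sketched below. The paper does not describe how patients were assigned to the three sets, so the random shuffling, the seed, the function name `stratified_split`, and the patient identifiers are all assumptions; only the group sizes come from the text.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Per-class (train, validation, test) counts as reported for this cohort.
SPLIT_COUNTS: Dict[str, Tuple[int, int, int]] = {
    "controlled": (27, 4, 8),
    "uncontrolled": (40, 5, 11),
}


def stratified_split(
    ids_by_class: Dict[str, Sequence[str]],
    counts: Dict[str, Tuple[int, int, int]],
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle each class separately and cut it into train/val/test chunks."""
    rng = random.Random(seed)  # assumption: a fixed seed for reproducibility
    splits: Dict[str, List[str]] = {"train": [], "val": [], "test": []}
    for label, ids in ids_by_class.items():
        n_train, n_val, n_test = counts[label]
        if len(ids) != n_train + n_val + n_test:
            raise ValueError(f"counts for {label!r} do not match the class size")
        shuffled = list(ids)
        rng.shuffle(shuffled)
        splits["train"] += shuffled[:n_train]
        splits["val"] += shuffled[n_train:n_train + n_val]
        splits["test"] += shuffled[n_train + n_val:]
    return splits


# Hypothetical patient identifiers; only the group sizes come from the paper.
patients = {
    "controlled": [f"C{i:03d}" for i in range(39)],
    "uncontrolled": [f"U{i:03d}" for i in range(56)],
}
splits = stratified_split(patients, SPLIT_COUNTS)
```

Splitting each outcome group separately keeps the proportion of controlled patients close to 41% in all three sets, which matters when the validation set has only nine patients.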

According to the outcome after one year of taking ASMs, TSC patients were divided into a controlled group and a non-controlled group. Patients were considered controlled if they did not have clinical seizures after 1 year of taking ASMs, and those who had at least one seizure in the past year constituted the uncontrolled group.

Data processing

First, the skull was removed from the MRI images using HD-BET software, as shown in Fig. 1. All 3D MRI images were resized to (256, 256, 128), and the image intensity was then normalized to the range between 0 and 1, using Eq. 3:

$$\mathrm{Normalized}(x) = \frac{x - \mathrm{Min}(x)}{\mathrm{Max}(x) - \mathrm{Min}(x)} \tag{3}$$

where Max(x) and Min(x) are the maximum and minimum intensities of the brain-extracted MRI image x, and Normalized(x) is the normalized MRI image.
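The intensity normalization to the range [0, 1] can be illustrated with a short sketch, assuming the standard min-max form implied by the definitions of Max(x) and Min(x). The function name `min_max_normalize` is hypothetical, the volume is represented as a flat list of voxel intensities for simplicity (in practice an array library would be used), and the handling of a constant image is an assumption that the paper does not discuss.

```python
from typing import List, Sequence


def min_max_normalize(voxels: Sequence[float]) -> List[float]:
    """Rescale intensities linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(voxels), max(voxels)
    if hi == lo:
        # Assumption: a constant image maps to all zeros to avoid division by zero.
        return [0.0 for _ in voxels]
    scale = hi - lo
    return [(v - lo) / scale for v in voxels]


# Made-up intensities standing in for a brain-extracted volume.
normalized = min_max_normalize([0.0, 50.0, 200.0])
```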

In addition to the above steps, we applied data augmentation techniques, including random rotations and flips, to improve model generalization. Together, these preprocessing procedures (skull stripping using HD-BET, resizing to (256, 256, 128), intensity normalization to [0, 1], and augmentation) enhance the consistency, robustness, and reproducibility of model training.

Baseline

We compared our methods with a Residual Network 3D (ResNet3D) model²². Among the different variants of CNNs, ResNet3D is a classic classification network with a large number of parameters, and has exhibited remarkable performance in image classification. ResNet3D was chosen as a widely used baseline for 3D medical image classification due to its proven performance in related neuroimaging tasks (e.g., Chen et al., 2019; Ji et al., 2020). The model used four residual blocks with 18 layers in total.

We trained a 3D-ResNet on our T2W and FLAIR data. Each model was trained with the same training/testing split.

Training and evaluation

Binary classification models were trained to distinguish patients in the non-controlled group from those in the controlled group on T2W and FLAIR images. All models were trained on an Nvidia RTX A6000 graphics processing unit (GPU) with the Adam optimizer, a learning rate of 0.001, a batch size of 4, 100 epochs, and a binary cross-entropy loss function. All models were trained from scratch due to the modality-specific nature of the MRI data, and no layers were frozen. Data augmentation included random rotation and flipping, alongside the intensity normalization described in the Data processing section.
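For reference, the binary cross-entropy loss named above averages the negative log-likelihood of the true labels under the predicted probabilities. The sketch below shows the standard formula in plain Python; the function name and the clipping constant `eps` are assumptions made for numerical safety here and are not details given in the paper.

```python
import math
from typing import Sequence


def binary_cross_entropy(
    labels: Sequence[int], probs: Sequence[float], eps: float = 1e-7
) -> float:
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    if len(labels) != len(probs) or len(labels) == 0:
        raise ValueError("labels and probs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # assumption: clip to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# An uninformative prediction of 0.5 for every patient gives a loss of ln(2).
chance_loss = binary_cross_entropy([1, 0], [0.5, 0.5])
```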

To find the combination of T2W and FLAIR weights that maximized the AUC of the proposed eTSC-Net, experiments were performed for values of w₁ and w₂ between 0.1 and 0.9, with a step of 0.1. The training, validation, and testing of the model were implemented in a Python (version 3.8.10) and PyTorch (version 1.9.0) environment.
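The weight search described above might look like the following sketch. It assumes that the two weights sum to 1 (consistent with the reported optimum of 0.6 and 0.4, but not stated explicitly), that the first-listed score list is T2W, and that AUC is computed with the pairwise-ranking definition; the function names and the toy scores are hypothetical.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def search_weights(
    labels: Sequence[int],
    t2w_scores: Sequence[float],
    flair_scores: Sequence[float],
) -> Tuple[float, float, float]:
    """Try w1 = 0.1 ... 0.9 in steps of 0.1 and return (w1, w2, best AUC)."""
    best = (0.0, 0.0, -1.0)
    for step in range(1, 10):
        w1 = step / 10
        w2 = round(1.0 - w1, 1)  # assumption: the two weights sum to 1
        fused = [w1 * t + w2 * f for t, f in zip(t2w_scores, flair_scores)]
        auc = roc_auc(labels, fused)
        if auc > best[2]:
            best = (w1, w2, auc)
    return best


# Toy validation scores, made up for illustration only.
val_labels = [1, 1, 0, 0]
val_t2w = [0.9, 0.4, 0.5, 0.1]
val_flair = [0.2, 0.9, 0.3, 0.6]
best_w1, best_w2, best_auc = search_weights(val_labels, val_t2w, val_flair)
```

Running the search on validation scores only, and applying the selected weights once to the test set, is what keeps the test AUC free of selection bias.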

Performance measurements

Model performance was evaluated using the area under the ROC curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). These metrics were computed from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
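The threshold-based metrics follow directly from the four confusion-matrix counts, as the sketch below shows; balanced accuracy, which appears in Table 2, is taken here to be the mean of sensitivity and specificity (its usual definition). The function name and the example counts are hypothetical, and which outcome group is treated as the positive class is an assumption left to the caller.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and balanced accuracy from counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sen = tp / (tp + fn)  # true-positive rate
    spe = tn / (tn + fp)  # true-negative rate
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": sen,
        "SPE": spe,
        "BalancedACC": (sen + spe) / 2,
    }


# Made-up counts for a 19-patient test set; not the paper's confusion matrix.
metrics = classification_metrics(tp=6, tn=9, fp=2, fn=2)
```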

Statistical analysis

In this study, categorical variables were summarized as frequencies (percentages), and continuous variables were expressed as median (range). All subgroup comparisons were performed using the Mann–Whitney U test for continuous variables due to non-normal distribution, and the Pearson chi-square test for categorical variables. Statistical significance was defined as p < 0.05. To quantify the magnitude of differences in proportions, Cohen’s h³⁵ was employed. This standardized effect size measure is widely used in behavioral and medical sciences to assess practical significance beyond statistical significance, with values of 0.2, 0.5, and 0.8 typically interpreted as small, medium, and large effects, respectively. All statistical analyses were conducted using Python 3.8.10.
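Cohen's h is the difference between two arcsine-transformed proportions. A minimal sketch of the calculation is given below; the function name is hypothetical, and the example proportions are the left parieto-occipital EEG counts from Table 1. Values computed this way may differ slightly from those reported in the Results, depending on rounding and on how the proportions were entered.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2)) for two proportions."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p1)) - 2.0 * math.asin(math.sqrt(p2))


# Uncontrolled group (25 of 56) versus controlled group (5 of 39).
h_example = cohens_h(25 / 56, 5 / 39)
```

The arcsine transform makes a given h correspond to roughly the same detectability regardless of where on the 0 to 1 scale the two proportions sit, which a raw difference in percentages does not.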

Results

Patient characteristics

The main patient characteristics are summarized in Table 1. Of the 95 enrolled patients, 39 (41.1%) had achieved seizure control and 56 (58.9%) were still having seizures after one year of treatment with ASMs. Of the 95 patients with TSC, 54 (56.8%) were male and 41 (43.2%) were female. Age of onset, age at MRI imaging, proportion of patients with infantile spasms, and epileptiform discharge in left parietooccipital area on EEG were significantly different between the uncontrolled and controlled groups (p < 0.05). Specifically, 10.3% of the controlled group (4/39) exhibited infantile spasms, compared to 53.6% of the uncontrolled group (30/56). Similarly, 12.8% of the controlled group (5/39) showed epileptiform discharge in the left parietooccipital area, while 44.6% of the uncontrolled group (25/56) had this finding. These differences highlight a substantial effect, with higher proportions of both infantile spasms and epileptiform discharges observed in the uncontrolled group compared to the controlled group.

Table 1.

Patient characteristics. *P-values indicate statistically significant between-group differences (p < 0.05). Continuous variables were analyzed using the Mann–Whitney U test due to non-normal distribution, and categorical variables (e.g., gene mutation type) were assessed using the chi-square test. SD, standard deviation.

Characteristics	Controlled (N = 39)	Uncontrolled (N = 56)	p-value
Male, n (%)	24 (61.5%)	30 (53.6%)	0.445
Age at onset (months), mean (range)	31.17 (2–156)	10.34 (0–96)	< 0.001*
Age at MRI imaging (months), mean (range)	57.88 (6–190)	35.94 (6–151)	0.019*
Infantile spasms, n (%)	4 (10.3%)	30 (53.6%)	< 0.001*
Gene mutation type (Negative/TSC1/TSC2)	5/10/24	3/7/46	0.080
Epilepsy, n (%)	39 (100.0%)	56 (100.0%)
Number of tubers, mean (range, SD)	17.43 (0–36, SD: 10.50)	19.07 (0–51, SD: 12.61)	0.164
ASM numbers (≥ 3), n (%)	17 (43.6%)	36 (62.3%)	0.018*
Epileptiform discharge in left frontal area of EEG, n (%)	17 (43.6%)	29 (51.8%)	0.437
Epileptiform discharge in right frontal area of EEG, n (%)	18 (46.2%)	27 (48.2%)	0.845
Epileptiform discharge in left temporal area of EEG, n (%)	15 (38.5%)	25 (44.6%)	0.553
Epileptiform discharge in right temporal area of EEG, n (%)	16 (41.0%)	20 (35.7%)	0.604
Epileptiform discharge in left parietooccipital area of EEG, n (%)	5 (12.8%)	25 (44.6%)	< 0.001*
Epileptiform discharge in right parietooccipital area of EEG, n (%)	9 (23.1%)	14 (25.0%)	0.831


To better illustrate the magnitude of these differences, effect sizes were calculated using Cohen’s h³⁵, which is an appropriate measure for comparing proportions between two groups. The effect size for infantile spasms was 0.948, indicating a large effect (above the conventional 0.8 threshold). In contrast, the effect size for epileptiform discharge in the left parietooccipital area of the EEG was 0.728, reflecting a medium-to-large effect.

Model performance

In Table 2, “dual-modality” refers to the weighted fusion of FLAIR and T2W MRI sequences, with the optimal combination defined in Eq. 1 as S = 0.4×FLAIR + 0.6×T2W. Table 2 summarizes the classification performance across all compared models. For FLAIR-only inputs, EfficientNet3D-B0 achieved a higher AUC than the ResNet3D baseline, whereas for T2W-only inputs, ResNet3D obtained a slightly higher AUC than EfficientNet3D-B0. To enable a fair multi-modality comparison, we also implemented a dual-modality ResNet3D baseline. However, the dual-modality configuration with the EfficientNet3D-B0 backbone (eTSC-Net, 0.4×FLAIR + 0.6×T2W) yielded the best overall performance, with an AUC of 0.833, an accuracy of 0.900, a sensitivity (true-positive rate) of 0.875, and a specificity (true-negative rate) of 0.917 on the testing cohort, substantially outperforming the dual-modality ResNet3D baseline.

Table 2.

Detailed performance of different models on the testing cohorts.

Input	Model	AUC	ACC	SEN	SPE	Balanced Accuracy
FLAIR only	ResNet3D²²	0.652	0.710	0.849	0.617	0.733
FLAIR only	EfficientNet3D-B0	0.770	0.800	0.750	0.833	0.792
T2W only	ResNet3D²²	0.788	0.726	0.875	0.513	0.694
T2W only	EfficientNet3D-B0	0.729	0.800	0.749	0.833	0.791
Dual-modality	ResNet3D²²	0.728	0.750	0.623	0.900	0.771
Dual-modality	optimal eTSC-Net	0.833	0.900	0.875	0.917	0.896


Figure 3 shows the ROC curve for the proposed eTSC-Net models with different weights on the test set.

Fig. 3 — Receiver operating characteristic (ROC) curves for the proposed eTSC-Net with different weights on the testing cohort.

Figure 4 shows the ROC curves for T2W, FLAIR, and the optimal eTSC-Net model’s performance with the EfficientNet3D-B0 architecture on the testing cohort.

Fig. 4 — Receiver operating characteristic (ROC) curves for FLAIR, T2W, and the optimal eTSC-Net (0.4×FLAIR + 0.6×T2W) with the EfficientNet3D-B0 architecture on the testing cohort.

Model parameters size

Table 3 presents the model complexity comparison across all architectures. Compared with the baseline ResNet3D model, which requires 244.07 MB of parameters, 33.14 million trainable parameters, and 864.11 MB of VRAM, the EfficientNet3D-B0 backbone substantially reduces computational cost, requiring only 17.89 MB of parameters and 4.69 million trainable parameters. The dual-modality ResNet3D configuration is even more resource-intensive, reaching 488.14 MB of parameters and 66.28 million trainable parameters. In contrast, the proposed eTSC-Net requires only 35.78 MB of parameters and 9.38 million trainable parameters, with a VRAM usage of 370.90 MB—representing a large reduction in model size and memory footprint while still achieving superior classification performance.
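The relation between parameter count and parameter size can be checked with a back-of-the-envelope calculation. The sketch below assumes 32-bit floating-point weights and that "MB" means 2^20 bytes; neither is stated in the paper. Under these assumptions the estimate agrees with the EfficientNet3D-B0 and eTSC-Net rows of Table 3, whereas the ResNet3D sizes in the table are larger than this estimate gives, so they may have been measured differently.

```python
BYTES_PER_FLOAT32 = 4        # assumption: weights stored as 32-bit floats
BYTES_PER_MIB = 1024 ** 2    # assumption: "MB" in Table 3 means 2**20 bytes


def parameter_size_mib(n_params: float, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Approximate storage needed for the weights alone, in MiB."""
    return n_params * bytes_per_param / BYTES_PER_MIB


single_backbone = parameter_size_mib(4.69e6)  # one EfficientNet3D-B0
dual_backbone = parameter_size_mib(9.38e6)    # eTSC-Net: two such backbones
```

Because eTSC-Net is a late-fusion ensemble of two independent backbones, its parameter count is simply twice that of a single EfficientNet3D-B0; the fusion weights add only two scalars.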

Table 3.

Model complexity comparison.

Model	Parameters size (MB)	Trainable Params	Memory Usage (VRAM)
ResNet3D²²	244.07 MB	33.14 M	864.11 MB
EfficientNet3D-B0	17.89 MB	4.69 M	241.45 MB
Dual-modality ResNet3D	488.14 MB	66.28 M	1496.22 MB
eTSC-Net	35.78 MB	9.38 M	370.90 MB


Discussion

Reliable predictors of ASM treatment outcomes can allow for more targeted treatment. Patients who are likely to be drug-resistant can be considered for surgical procedures or other treatment modalities to increase the cure rate and reduce mortality^23,24. However, determining ASM treatment outcomes from clinical presentation and treatment response imposes an inherent lag. Thus, a model capable of predicting ASM treatment outcomes before treatment begins can offer a distinct advantage in clinical settings. MRI is the primary advanced imaging modality for diagnosing TSC. Yang et al.⁹ evaluated the quantity, location, and type of MRI lesions and concluded that lesion type is the most important, followed by location and quantity. Deep-learning algorithms can extract a hierarchy of features from the input data via flexible and non-linear transformations²⁵.

Therefore, we developed a parameter-efficient CNN framework to predict the outcomes of pharmacological treatment in patients with TSC-related epilepsy; the framework extracts descriptive features directly from T2W and FLAIR structural MRI images. The optimal eTSC-Net configuration achieved the best AUC, 0.833, in the current testing cohort. Notably, all of our models with EfficientNet3D feature extractors outperformed the baseline model while using significantly fewer parameters. We believe that the improved AUC is primarily attributable to this reduction in model complexity. Parameter-efficient networks have far fewer trainable parameters than conventional 3D CNN models^19,25. Thus, a parameter-efficient network may require less training data and may be easier to optimize. This is an advantage, particularly when working with rare diseases like TSC, where data are limited. EfficientNet3D-B0 employs fewer parameters than other CNN architectures, which helps limit overfitting.

We observed significant differences between the groups with controlled and non-controlled epilepsy with respect to the age at onset, age at MRI imaging, infantile spasms, and epileptiform discharge in the left parietooccipital area on EEG (p < 0.05). The mean number of tubers did not differ significantly between the controlled and non-controlled epilepsy groups (17.43 and 19.07, respectively). Nevertheless, it is worth mentioning that the discrepancy in tuber quantities approached statistical significance (p = 0.068), suggesting the possibility of a divergence in the distribution pattern of tubers between the two groups. In our cohort, approximately 58.9% of patients still had seizures after one year of treatment with ASMs. A high proportion of TSC patients have DRE²⁶. In a study of the natural history of 2034 TSC patients, Jeong et al. found that 59.6% of patients with focal epilepsy had drug resistance²⁷. In addition, the drug resistance rate of TSC-related epilepsy has been reported to be as high as 60%²⁴, 63.7%²⁸, and 62%⁵. These reported rates are similar to that in our study, which suggests no ethnic predisposition for drug resistance.

In our study, 30 (53.6%) patients in the uncontrolled group and 4 (10.3%) patients in the controlled group had experienced infantile spasms. Previous studies have shown similar results. About 30%–60% of patients with TSC-related epilepsy experience infantile spasms. Among those who have experienced infantile spasms, 75.4% still have seizures after one year of ASM treatment, while among those who have not experienced infantile spasms, 39.8% have uncontrolled epilepsy after one year⁴. Jeong et al. also reported that TSC patients with previous infantile spasms are more likely to be resistant to ASMs⁵, which is consistent with our finding that the uncontrolled group had a higher proportion of patients with infantile spasms. Importantly, the model’s performance on infantile spasm patients was consistent with that on the overall cohort; however, due to the limited subgroup size, further statistical analysis was not feasible in the current study.

Our finding of a higher proportion of patients with TSC2 gene mutations in the uncontrolled group is consistent with previous studies. Compared with TSC1 pathogenic mutations, TSC2 pathogenic mutations are associated with a more severe clinical phenotype^27,29. Jeong et al. also found that epilepsy patients with TSC2 mutations have a higher drug resistance rate than those with TSC1 mutations²⁷. In our study, the mean age at seizure onset was 31.17 months in the controlled group and 10.34 months in the uncontrolled group. TSC patients with focal seizures before the age of 1 year are more likely to develop drug-resistance than patients with onset after the age of 1 year⁵. In our cohort, the mean age at MRI imaging was 57.88 months in the controlled group, and 35.94 months in the uncontrolled group. Epilepsy is a symptom that is easily noticed by parents and affects children the most. Most TSC patients seek medical treatment for seizures or convulsions, and have an MRI scan done². The younger age at MRI imaging in the uncontrolled group was likely attributable to the younger mean age at seizure onset compared to that in the controlled group.

In our study, we also found that the proposed 3D CNN models achieved better performance than the traditional machine learning approach described by Yang et al.⁹. The performance improvement is primarily attributable to the full utilization of the spatial features of 3D MRI. Compared with the use of single-modality images with EfficientNet3D approaches, the proposed eTSC-Net can improve the classification performance. This indicates that combining multiple-contrast MRI can enable the use of complementary information among multiple-contrast images. In addition, several studies^16,30 have illustrated that the late fusion model can most effectively grasp the data distribution and ultimately yield the best prediction performance.

Despite the promising results, several limitations warrant consideration. First, we did not perform statistical significance testing (e.g., DeLong’s test) to determine whether the observed AUC improvement of eTSC-Net over the baseline model is statistically meaningful. This was due to current experimental constraints and the lack of sufficiently independent prediction distributions. We acknowledge the value of such validation and plan to include DeLong’s test in future studies, especially when evaluating the model on larger, multicenter datasets^12,13.

Second, the relatively small sample size remains a key limitation, which is a common challenge in rare disease research such as TSC^1,3. Although the test set (n = 19) was held out exclusively for final evaluation, its limited size may affect the statistical robustness of our findings^9,12. Furthermore, the dataset was derived from a single clinical center, introducing potential biases related to scanner variability, imaging protocols, and patient demographics. To improve generalizability and minimize such biases, future work will focus on external validation across multiple institutions and more diverse patient cohorts^13,15.

Third, while ResNet3D served as a standard 3D classification baseline, it may not reflect the latest advances in network architecture. Future comparisons with more powerful backbones, such as DenseNet3D, Swin UNETR, or transformer-based models, may yield improved performance, particularly when scaled to larger and more heterogeneous datasets^20,22,25.

Finally, although eTSC-Net demonstrated strong predictive performance, further evaluation is needed to determine its clinical utility in real-world settings, where factors such as patient comorbidities, treatment compliance, and imaging variability can affect model performance^6,7. Moreover, deploying the model on resource-constrained hardware platforms (e.g., FPGA, mobile edge devices) could facilitate real-time inference and seamless integration into clinical workflows. Hardware-oriented solutions have proven effective in other AI-driven healthcare applications^31–34, and may serve as valuable references for translating eTSC-Net into practical decision-support tools.

Conclusion

In this work, we proposed a parameter-efficient CNN method, named eTSC-Net, to predict the drug treatment outcomes of pediatric patients with TSC-related epilepsy. The experimental results suggest that the proposed method is a non-invasive, efficient, and reliable approach for outcome prediction in this rare disease population.

However, the relatively small sample size may limit the generalizability of our findings and increase the risk of overfitting. Given the rarity of TSC, assembling large datasets is inherently challenging. Therefore, future work should focus on validating the model in larger, multicenter cohorts with more diverse patient populations to enhance its robustness and clinical applicability.

We believe that eTSC-Net can serve as a strong baseline for future research on treatment outcome prediction in rare pediatric neurological disorders.

Acknowledgements

We are grateful to Hekai Wang and Yixuan Bo for conducting supplementary experiments to address the reviewers’ comments, preparing the response letter, and revising the manuscript. Their expertise significantly strengthened this work.

Author contributions

CZ and DJ conducted the experiments, performed the computations, and drafted the main manuscript. CZ, ZL, DJ, ZH, and XZ summarized the results and prepared the figures. CZ, ZH, and JL verified the analytical methods and revised the manuscript. BY, JT, and BG collected the patients’ information. CZ, ZH, XZ, LL, and RL collected the data and helped revise the manuscript. DJ and CZ contributed equally to this work. All authors discussed the results and contributed to the final manuscript. All authors have read and approved the final manuscript.

Funding

This work was supported by the Sanming Project of Medicine in Shenzhen Guangming (SZGMTD2025001), the National Natural Science Foundation of China (62271474), the International Partnership Program of the Chinese Academy of Sciences (321GJHZ2023246GC), the Natural Science Foundation of Guangdong Province (2023B1515120007 and 2024A1515012138), the Guangdong Provincial Key Laboratory of Multimodality Non-Invasive Brain-Computer Interfaces (2024B1212010010), and the Shenzhen Science and Technology Program (KJZD20230923113259001 and JCYJ20220530160005012).

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Code availability

The custom code used to train the models and generate the main results reported in this study is archived on Zenodo (10.5281/zenodo.18217768), corresponding to release v1.0.1. The source repository is available at https://github.com/TSCI-SIAT/etscNet-dual-mri. The archived release contains scripts for training, inference, and weighted late-fusion evaluation used to compute the reported metrics. No restrictions apply to accessing the code. MRI data and trained model checkpoints are not included due to privacy/size constraints.

Declarations

Competing interests

The authors declare no competing interests.

Ethical considerations

The study protocols were approved by the Institutional Review Board (IRB) of the Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, and all experiments were performed in accordance with the Declaration of Helsinki.

Consent to participate

Written informed consent was obtained from the parents or legal caregivers of all subjects before the study.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally to this work: Cailei Zhao, Zhao Liao and Dian Jiang.

Contributor Information

Ling Lin, Email: l.ling@siat.ac.cn.

Zhanqi Hu, Email: huzhanqi1983@aliyun.com.

References

1. Randle, S. C. Tuberous sclerosis complex: A review. Pediatr. Ann. 46, e166–e171 (2017).
2. Holmes, G. L. & Stafstrom, C. E. Tuberous sclerosis complex and epilepsy: recent developments and future challenges. Epilepsia 48, 617–630 (2007).
3. Henske, E. P. et al. Tuberous sclerosis complex. Nat. Rev. Dis. Primers 2, 16035 (2016).
4. Curatolo, P. et al. Management of epilepsy associated with tuberous sclerosis complex: updated clinical recommendations. Eur. J. Paediatr. Neurol. 22, 738–748 (2018).
5. Chu-Shore, C. J. et al. The natural history of epilepsy in tuberous sclerosis complex. Epilepsia 51, 1236–1241 (2010).
6. Bebin, E. M. et al. Early treatment with vigabatrin does not decrease focal seizures or improve cognition in tuberous sclerosis complex: the PREVeNT trial. Ann. Neurol. 95, 15–26 (2024).
7. van der Poest Clement, E., Jansen, F. E., Braun, K. P. J. & Peters, J. M. Update on drug management of refractory epilepsy in tuberous sclerosis complex. Paediatr. Drugs 22, 73–84 (2020).
8. Jesmanas, S. et al. Different MRI-defined tuber types in tuberous sclerosis complex: quantitative evaluation and association with disease manifestations. Brain Dev. 40, 196–204 (2018).
9. Yang, J. et al. Machine learning in epilepsy drug treatment outcome prediction using multi-modality data in children with tuberous sclerosis complex. 6th International Conference on Big Data and Information Analytics (BigDIA). 100–103 (2020).
10. An, S. et al. Predicting drug-resistant epilepsy - A machine learning approach based on administrative claims data. Epilepsy Behav. 89, 118–125 (2018).
11. Russo, C. et al. Neuroimaging in tuberous sclerosis complex. Childs Nerv. Syst. 36, 2497–2509 (2020).
12. Zhao, X. et al. Machine learning and statistic analysis to predict drug treatment outcome in pediatric epilepsy patients with tuberous sclerosis complex. Epilepsy Res. 188, 107040 (2022).
13. Hu, Z. et al. Predicting drug treatment outcomes in children with tuberous sclerosis complex–related epilepsy: A clinical radiomics study. Am. J. Neuroradiol. 44, 853–860 (2023).
14. Hu, J. et al. Squeeze-and-Excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 42, 2011–2023 (2020).
15. Sanchez Fernandez, I. et al. Deep learning in rare disease. Detection of tubers in tuberous sclerosis complex. PLoS One 15, e0232376 (2020).
16. Eweje, F. R. et al. Deep learning for classification of bone lesions on routine MRI. EBioMedicine 68, 103402 (2021).
17. Grossman, R. et al. Differentiating small-cell lung cancer from non-small-cell lung cancer brain metastases based on MRI using EfficientNet and transfer learning approach. Technol. Cancer Res. Treat. 20, 15330338211004919 (2021).
18. Jonsson, B. A. et al. Brain age prediction using deep learning uncovers associated sequence variants. Nat. Commun. 10, 5409 (2019).
19. Peng, H. et al. Accurate brain age prediction with lightweight deep neural networks. Med. Image Anal. 68, 101871 (2021).
20. Tan, M. & Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. International Conference on Machine Learning. 6105–6114 (2019).
21. Shinnar, S. The new ILAE classification. Epilepsia 51, 715–717 (2010).
22. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778 (2016).
23. Hsieh, D. T., Jennesson, M. M. & Thiele, E. A. Epileptic spasms in tuberous sclerosis complex. Epilepsy Res. 106, 200–210 (2013).
24. Fohlen, M. et al. Refractory epilepsy in preschool children with tuberous sclerosis complex: early surgical treatment and outcome. Seizure 60, 71–79 (2018).
25. Spasov, S. et al. A parameter-efficient deep learning approach to predict conversion from mild cognitive impairment to Alzheimer’s disease. Neuroimage 189, 276–287 (2019).
26. Canevini, M. P. et al. Current concepts on epilepsy management in tuberous sclerosis complex. Am. J. Med. Genet. C Semin. Med. Genet. 178, 299–308 (2018).
27. Jeong, A., Nakagawa, J. A. & Wong, M. Predictors of drug-resistant epilepsy in tuberous sclerosis complex. J. Child. Neurol. 32, 1092–1098 (2017).
28. Chen, Z., Brodie, M. J., Liew, D. & Kwan, P. Treatment outcomes in patients with newly diagnosed epilepsy treated with established and new antiepileptic drugs: A 30-year longitudinal cohort study (75, Pg 279, year 2017). JAMA Neurol. 75, 384–384 (2018).
29. Salussolia, C. L., Klonowska, K., Kwiatkowski, D. J. & Sahin, M. Genetic etiologies, diagnosis, and treatment of tuberous sclerosis complex. Annu. Rev. Genomics Hum. Genet. 20, 217–240 (2019).
30. Liang, G. et al. Alzheimer’s disease classification using 2D convolutional neural networks. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2021, 3008–3012 (2021).
31. Wang, Y., Li, R., Wang, Y. & Sun, J. Military UCAV 3-D path planning based on multistrategy developed human evolutionary optimization algorithm. IEEE Internet Things J. 12, 16735–16747. 10.1109/jiot.2025.3532749 (2025).
32. Movassagh, A. A. et al. Artificial neural networks training algorithm integrating invasive weed optimization with differential evolutionary model. J. Ambient Intell. Humaniz. Comput. 10.1007/s12652-020-02623-6 (2021).
33. Wang, Y., Su, P., Wang, Z. & Sun, J. FN-HNN coupled with tunable multistable memristors and encryption by Arnold mapping and diagonal diffusion algorithm. IEEE Trans. Circuits Syst. I-Regular Papers. 10.1109/tcsi.2024.3516722 (2024).
34. Wang, Y., Tao, K., Wang, Z. & Sun, J. Memristor-based GFMM neural network circuit of biology with multiobjective decision and its application in industrial autonomous firefighting. IEEE Trans. Industr. Inf. 21, 5777–5786. 10.1109/tii.2025.3558347 (2025).
35. Cohen, J. Statistical Power Analysis for the Behavioral Sciences 2nd edn (Lawrence Erlbaum Associates, 1988).

