To the Editors
We appreciate the opportunity to respond to the letter regarding our study.1 Although we welcome constructive discussion, the concerns raised reflect a misunderstanding of our methodology and of the principles of external validation.
First, external validation assesses a model's generalizability in an independent clinical setting, and our study followed best practices by evaluating the deep learning model on an external dataset. The suggestion that our process introduced confirmation bias is misleading. Expert review of artificial intelligence (AI)‐detected events is a widely accepted practice in clinical AI validation, particularly in the absence of a universally accepted gold standard for interictal epileptiform discharge (IED) detection.2 Our inclusion of a multiexpert panel further strengthens the validation process and mitigates individual bias.
Second, although one of the original model developers participated in reviewing IEDs flagged by the model, final adjudication was conducted by a panel of five experts. This is a standard and accepted approach in electroencephalographic (EEG) studies, where interrater agreement naturally varies.3 The assertion that this process introduces bias disregards that clinical neurophysiology often relies on expert consensus in the absence of a definitive ground truth.
Third, the claim that two authors who achieved perfect agreement (Cohen κ = 1.0) were involved in both training and external validation, indicating a lack of “assessor independence,” misrepresents our study. Interrater variability in the internal validation set ranged from .71 to 1.0. The two experts who achieved perfect agreement were from different institutions, each with >20 years of experience in EEG interpretation. Their high κ value reflects expertise, not a lack of independence. Furthermore, only one of them was involved in data labeling for training, internal validation, and the external validation panel, whereas the other was involved solely in internal validation.
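For readers less familiar with the statistic, Cohen's κ compares observed rater agreement with the agreement expected by chance, so a κ of 1.0 simply means two raters never disagreed on the sampled events. A minimal sketch of the computation, using hypothetical binary IED labels (not study data):

```python
# Cohen's kappa for two raters' labels (illustrative values only, not study data)
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two equal-length label sequences."""
    assert len(rater1) == len(rater2)
    n = len(rater1)
    # observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # agreement expected by chance, from each rater's marginal label frequencies
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[k] * c2[k] for k in set(rater1) | set(rater2)) / n**2
    return (observed - expected) / (1 - expected)

# hypothetical labels: 1 = IED present, 0 = absent
a = [1, 0, 1, 1, 0, 0, 1, 0]
b = [1, 0, 1, 0, 0, 0, 1, 0]
print(round(cohens_kappa(a, b), 2))  # 0.75
```

Identical label sequences yield κ = 1.0 regardless of which institution or training set the raters come from, which is why a perfect κ reflects convergent expert judgment rather than a shared bias.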
Fourth, the suggestion that our validation process is susceptible to overfitting stems from a fundamental misunderstanding of the concept, as overfitting pertains to the training phase rather than validation. Overfitting occurs during training, when a model becomes excessively tailored to a specific dataset, reducing its ability to generalize. External validation, by definition, does not involve retraining, making such concerns both misplaced and irrelevant.4
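To make the distinction concrete: external validation scores a frozen model on an independent dataset, with no parameter updates, so no training-time overfitting can be introduced at that stage. A minimal sketch, with a toy threshold "model" and invented data standing in for a real network:

```python
# Sketch of external validation: the model is fixed; only predictions are scored.
def external_validation(model_predict, external_data, labels):
    """Evaluate a frozen model on an independent dataset; no parameters change."""
    preds = [model_predict(x) for x in external_data]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# toy frozen "model": flags values above a threshold fixed during earlier training
threshold = 0.5
frozen_model = lambda x: int(x > threshold)

data = [0.2, 0.8, 0.6, 0.1]    # hypothetical external samples
labels = [0, 1, 1, 0]          # hypothetical ground-truth labels
print(external_validation(frozen_model, data, labels))  # 1.0
```

Because `external_validation` never touches `threshold`, nothing about the external dataset can be "fit"; the evaluation can only reveal, not cause, poor generalization.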
Finally, speculation about potential commercial influence is both unsubstantiated and misleading. Our study was conducted independently, with all affiliations transparently disclosed. Although commercial entities contribute to AI development in medicine, this does not inherently compromise scientific integrity when conflicts of interest are properly managed,5 as was the case in our study. The insinuation of bias lacks supporting evidence and disregards the fundamental principles of independent scientific inquiry.
In conclusion, our study offers a rigorous and clinically meaningful external validation of AI‐based IED detection, adhering to best practices in clinical neurophysiology and AI validation.1 We appreciate the opportunity to further substantiate these points. We welcome others to evaluate our AI system using their own external datasets.
CONFLICT OF INTEREST STATEMENT
M.J.A.M.v.P. is cofounder of Clinical Science Systems, a supplier of EEG systems for Medisch Spectrum Twente. Clinical Science Systems offered no funding and was not involved in the design, execution, analysis, interpretation, or publication of the study. M.C.T.‐C. has no conflict of interest. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.
ACKNOWLEDGMENTS
None.
DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
REFERENCES
- 1. Tjepkema‐Cloostermans MC, Tannemaat MR, Wieske L, van Rootselaar A, Stunnenberg BC, Keijzer HM, et al. Expert level of detection of interictal discharges with a deep neural network. Epilepsia. 2025;66(1):184–194.
- 2. Rajpurkar P, Chen E, Banerjee O, Topol EJ. AI in health and medicine. Nat Med. 2022;28(1):31–38.
- 3. Jing J, Herlopian A, Karakis I, Ng M, Halford JJ, Lam A, et al. Interrater reliability of experts in identifying interictal epileptiform discharges in electroencephalograms. JAMA Neurol. 2019;77(1):49.
- 4. da Silva Lourenço C, Tjepkema‐Cloostermans MC, van Putten MJAM. Efficient use of clinical EEG data for deep learning in epilepsy. Clin Neurophysiol. 2021;132(6):1234–1240. doi:10.1016/j.clinph.2021.01.035
- 5. Ramspek CL, Jager KJ, Dekker FW, Zoccali C, Van Diepen M. External validation of prognostic models: what, why, how, when and where? Clin Kidney J. 2020;14(1):49–58. doi:10.1093/ckj/sfaa188
