In Response To: “Neurosymbolic AI framework for explainable retinal disease classification from OCT images”.
Miladinovic and colleagues describe a tightly coupled neurosymbolic framework for optical coherence tomography (OCT) B-scan classification that combines an ImageNet-pretrained ResNet-50 with differentiable first-order logic in Scallop.1–3 The model outputs traceable “sign → rule → diagnosis” rationales and shows improved performance on an external OCT dataset (OCTID) relative to a purely convolutional baseline, a timely contribution given ongoing concerns about real-world generalizability of ophthalmic artificial intelligence (AI).1,4
One translational opportunity, however, is to use reasoning provenance not only to explain predictions but also to signal when predictions may be fragile. The authors adopt top-k differentiable provenance with k = 1 (a single most probable proof path per diagnosis) to support interpretability.1–3 In clinical practice, a single best explanation is often less informative than knowing whether multiple competing explanations, or competing diagnoses, are nearly as plausible, because this commonly corresponds to borderline findings, device-dependent cues, imaging artifacts, or presentations that fall outside the intended diagnostic frame.
In neurosymbolic systems, uncertainty can be “read” from how dispersed the proof support is, not only from the final class probabilities, and Scallop's provenance semantics make this dispersion measurable.2,3 A practical extension would be to compute auxiliary top-k proof paths (k > 1) and summarize them into a clinically interpretable conflict index, while leaving the deployed decision rule unchanged (still predicting with k = 1) so that the diagnostic probabilities reported in the current framework are not inadvertently altered by changing k.1–3 Two simple indices are (1) a proof margin, the top-1 minus top-2 proof score, when multiple paths exist, and (2) a normalized entropy over the top-k proof scores. Because several of the paper's diagnosis rules are conjunction-heavy and may yield few alternative paths, an additional and often more informative signal is cross-diagnosis dispersion (for example, the entropy over diagnosis-level supports, or the gap between the top-1 and top-2 diagnoses), which captures cases where rule-satisfaction chains for different diagnoses are similarly supported.
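The proposed indices can be sketched in plain Python over hypothetical proof scores; the diagnosis labels and score values below are illustrative only and are not taken from the paper or from Scallop's actual output format.

```python
import math

def proof_margin(proof_scores):
    """Top-1 minus top-2 proof score; None when only one proof path exists."""
    s = sorted(proof_scores, reverse=True)
    return s[0] - s[1] if len(s) > 1 else None

def normalized_entropy(scores):
    """Shannon entropy of the normalized scores, divided by log(n) to lie in
    [0, 1]: 0 means one proof dominates, 1 means all are equally supported."""
    total = sum(scores)
    if total == 0 or len(scores) < 2:
        return 0.0
    probs = [s / total for s in scores]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(scores))

# Hypothetical top-k proof scores (k = 3) for one B-scan, one list per diagnosis.
proofs_by_dx = {
    "CNV":    [0.62, 0.58, 0.11],  # two near-tied proof paths: within-diagnosis conflict
    "DME":    [0.55, 0.07, 0.02],
    "Normal": [0.04, 0.01, 0.00],
}

# The deployed decision rule stays k = 1: predict on the strongest single proof.
dx_support = {dx: max(scores) for dx, scores in proofs_by_dx.items()}
predicted = max(dx_support, key=dx_support.get)

# Within-diagnosis conflict for the predicted class.
margin = proof_margin(proofs_by_dx[predicted])    # small margin -> fragile proof
within_h = normalized_entropy(proofs_by_dx[predicted])

# Cross-diagnosis dispersion: gap and entropy over diagnosis-level supports.
ranked = sorted(dx_support.values(), reverse=True)
cross_gap = ranked[0] - ranked[1]                 # small gap -> competing diagnoses
cross_h = normalized_entropy(ranked)
```

In this toy case the top two proof paths for the predicted diagnosis are nearly tied, so the proof margin is small and the normalized entropies are high; a case with one dominant proof and one dominant diagnosis would score near zero on all three indices.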
This idea can be tested largely with existing study assets. The authors already provide evaluation annotations for 10 intermediate signs,1 enabling post hoc analyses of whether high conflict concentrates misclassifications, correlates with disagreement between predicted and annotated signs, and increases under external validation. Beyond retrospective correlation, provenance-guided selective prediction (abstaining or triaging above a conflict threshold) could quantify an accuracy–coverage trade-off that is directly actionable for deployment: does the external robustness advantage persist when restricted to low-conflict cases, and can high-conflict cases be reliably routed for expert review? This is especially relevant because the study intentionally excluded mixed or coexisting pathologies to ensure diagnostic purity, whereas such mixtures and “out-of-rule” presentations are common in routine retina clinics.
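The accuracy–coverage trade-off described above reduces to a simple threshold sweep. The sketch below uses invented per-case data (conflict index, correctness), not results from the study:

```python
def accuracy_coverage(records, thresholds):
    """Selective-prediction sweep. records: list of (conflict_index, is_correct)
    pairs. For each threshold, abstain on cases at or above it and report
    (threshold, coverage of retained cases, accuracy on retained cases)."""
    out = []
    for t in thresholds:
        kept = [correct for conflict, correct in records if conflict < t]
        coverage = len(kept) / len(records)
        accuracy = sum(kept) / len(kept) if kept else None
        out.append((t, coverage, accuracy))
    return out

# Hypothetical per-case results on an external test set: (conflict index, correct?).
records = [(0.05, True), (0.10, True), (0.20, True), (0.35, False),
           (0.40, True), (0.60, False), (0.75, False), (0.90, False)]

curve = accuracy_coverage(records, thresholds=[0.3, 0.5, 1.0])
```

If high conflict does concentrate errors (as in these toy numbers), a stricter threshold retains fewer cases for automated reading but with higher accuracy, and the abstained high-conflict cases are exactly those routed for expert review.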
By treating provenance not only as “why the model predicted X,” but also as “how internally consistent and clinically unambiguous the evidence is,” neurosymbolic OCT classifiers may move from interpretable to operationally trustworthy, supporting safer triage and more principled handling of borderline and distribution-shifted cases.
References
1. Miladinovic A, Biscontin A, Ajcevic M, et al. Neurosymbolic AI framework for explainable retinal disease classification from OCT images. Transl Vis Sci Technol. 2026;15(1):6.
2. Li Z, Huang J, Naik M. Scallop: a language for neurosymbolic programming. Proc ACM Program Lang. 2023;7:1463–1487.
3. Li Z, Huang J, Liu J, Naik M. Neurosymbolic programming in Scallop: principles and practice. Found Trends Program Lang. 2024;8(2):118–249.
4. Rashidisabet H, Sethi A, Jindarak P, et al. Validating the generalizability of ophthalmic artificial intelligence models on real-world clinical data. Transl Vis Sci Technol. 2023;12(11):8.
