Abstract
Keratoconus is a progressive eye disease characterized by the thinning and bulging of the cornea, leading to visual impairment. Early and accurate diagnosis is crucial for effective management and treatment. This study investigates the application of machine learning models to identify keratoconus based on corneal topography and biomechanical data. We collected a dataset comprising 144 corneal scans from adults aged 18–35, including a nearly equal proportion of keratoconus and normal cases. Various machine learning algorithms were trained and evaluated on datasets containing different parameters obtained using the Pentacam device. The Random Forest algorithm demonstrated the highest reliability, achieving an accuracy of 98% during training and 96% on the test set, while also identifying the most diagnostically relevant measurements. Unlike prior studies, our approach enables detailed comparison between model-selected features and clinically recognized diagnostic parameters. This interpretability provides a clinically meaningful bridge between AI-driven predictions and expert-based decision-making. The results suggest that machine learning models, particularly Random Forest, can effectively aid in the early detection of keratoconus in young individuals, potentially improving patient outcomes through timely intervention.
Keywords: Keratoconus, Machine learning, Corneal topography, Biomechanical data, Random forest, Early diagnosis, Medical imaging
Subject terms: Medical research, Mathematics and computing
Introduction
The cornea is the anterior, convex part of the eyeball and plays a critical role in the optical system of the human eye. Keratoconus is the most common form of corneal ectasia, typically manifesting in the second or third decade of life. It is a progressive, bilateral, and asymmetrical condition characterized by thinning and protrusion of the cornea, which leads to increased refractive errors and high-grade irregular astigmatism. As a result, affected individuals often experience significant visual impairment that cannot be adequately corrected with glasses.
Recent studies suggest that keratoconus has an inflammatory component and is associated with allergic or atopic conditions, a positive family history, as well as genetic and environmental factors. Chronic eye rubbing, often due to itching, may further contribute to biomechanical weakening of the cornea.
As the disease progresses, central or paracentral corneal thinning and steepening become more pronounced. Given the cornea’s central role in ocular optics, these changes cause irregular astigmatism and high-order aberrations, leading to a substantial reduction in visual quality. While typically bilateral, the disease progresses asymmetrically and may continue until the end of the third decade of life. A recent meta-analysis covering over 50 million individuals from 15 countries estimated the global prevalence of keratoconus at 138 cases per 100,000 people1. However, in some populations, prevalence rates may reach as high as 4.79%2. This variability is likely due to differences in diagnostic criteria, classification systems, study methodologies, and the imaging technologies used.
Keratoconus is characterized by physicochemical alterations in the corneal tissue, including changes in its viscoelastic and biomechanical properties. Disease progression is associated with disruption of the collagen lamellae that comprise the corneal extracellular matrix. These structural changes contribute to the characteristic thinning and protrusion of the cornea.
A key contributing factor is mechanical eye rubbing, which can trigger the release of multiple proinflammatory cytokines (IL-1β, IL-4, IL-5, IL-6, IL-8, IL-13, IL-17, TNF-α, IFN-γ) from the corneal epithelium. This cytokine cascade enhances the activity of tissue-degrading enzymes, particularly matrix metalloproteinase-9 (MMP-9), and reduces the expression of lysyl oxidase (LOX)—an essential enzyme involved in the formation of collagen cross-links. The combined effect of increased proteolysis and impaired collagen synthesis leads to progressive weakening of the corneal stroma.
Diagnosis of corneal ectasia is based on a combination of clinical evaluation, including medical history, refraction testing, slit-lamp examination, and advanced imaging techniques. Corneal topography and anterior segment optical coherence tomography (OCT) are particularly useful for assessing disease severity and progression. Additionally, changes in corneal biomechanics—such as those measured by dynamic Scheimpflug imaging—have emerged as valuable diagnostic markers, allowing earlier detection of subclinical keratoconus. Early diagnosis is essential, as it enables timely intervention with treatments such as corneal collagen cross-linking, which can stabilize the disease and prevent further visual deterioration.
The diagnosis of keratoconus is based on clinical examination and imaging techniques. During slit-lamp evaluation, clinicians should assess for characteristic signs such as corneal protrusion, scissors reflex, localized thinning, prominent corneal nerve fibers, Charleux’s oil droplet reflex, Fleischer’s ring, and Vogt’s striae3,4. While these features are useful in identifying advanced disease, their presence and severity do not always correlate consistently with disease progression. Early diagnosis and monitoring of keratoconus are critical for timely therapeutic intervention. In particular, detecting progression at an early stage enables the application of corneal collagen cross-linking, which can halt or slow disease advancement. Delayed diagnosis, on the other hand, often limits treatment options to corneal transplantation in cases of scarring or severe thinning, where keratoplasty may remain the only viable solution5. This challenge is especially pronounced in low-resource settings, where early detection is hindered by limited access to diagnostic technologies and a shortage of corneal donor tissue5.
In cases where keratoconus is diagnosed at an advanced stage, corneal transplantation may remain the only viable treatment option. This procedure involves prolonged waiting times for donor tissue, more complex surgical intervention, and the need for long-term—often lifelong—immunosuppressive therapy. Although still under active investigation, the viscoelastic properties of the cornea play a critical role in the diagnosis and management of corneal ectatic disorders. These biomechanical parameters are also valuable in preoperative screening for refractive surgery, treatment planning, outcome prediction, and risk assessment for postoperative complications.
Currently, early keratoconus diagnosis is primarily based on corneal topography, supplemented by pachymetry and aberrometry. These imaging techniques allow for the detection of subtle morphological abnormalities in corneal shape and curvature, even before clinical signs appear3,6. However, the major diagnostic challenge lies in identifying the most sensitive and specific parameters capable of detecting subclinical keratoconus—particularly when topographic patterns appear physiologically normal3,7.
Currently, the gold standard for screening, diagnosing, and assessing keratoconus is corneal topography and tomography. One of the most widely used diagnostic tools is the Pentacam, which utilizes a rotating Scheimpflug camera system. During the imaging process, the anterior segment of the eye is illuminated with monochromatic blue slit light, allowing the device to capture a sequence of cross-sectional images. These images are then reconstructed into a high-resolution three-dimensional model of the cornea. This enables detailed analysis of both anterior and posterior corneal surfaces, as well as the generation of pachymetric maps essential for evaluating corneal thickness distribution and detecting early ectatic changes.
Topographic imaging systems can be broadly classified based on their operating principle into two categories: reflection-based systems and slit-light projection systems8. Reflection-based systems assess anterior corneal topography by analyzing the reflection of Placido’s rings on the corneal surface8. These systems are particularly effective at detecting localized steepening of the anterior cornea, a hallmark feature of early-stage keratoconus5. However, they are limited by relatively low measurement repeatability and their inability to evaluate the posterior corneal surface3,5,8. In contrast, slit-light projection systems overcome these limitations by providing detailed imaging of both the anterior and posterior corneal surfaces. A unique variation of reflection-based systems—using multicolor light-emitting diodes (LEDs)—is represented by the Cassini Color LED Corneal Analyzer (i-Optics, The Netherlands), which enables more precise analysis of corneal curvature8. Slit-light-based systems operate using a two-step approach: projection of Placido rings followed by slit-light scanning. This enables comprehensive imaging of the entire anterior segment, including the posterior corneal curvature, which is often altered even in the earliest stages of keratoconus3,8. The implementation of such technologies has enabled the development of advanced diagnostic algorithms, such as the Belin/Ambrósio Enhanced Ectasia Display (BAD-D) available in the Pentacam (Oculus), and the Corneal Objective Risk of Ectasia Screening (CORE) developed for the Orbscan (Bausch & Lomb)5,9,10. These tools integrate multiple corneal parameters—such as posterior asphericity, anterior curvature, and pachymetric progression—into a comprehensive risk profile, revolutionizing the early diagnosis of keratoconus.
The Scheimpflug imaging technique, when combined with Placido disc analysis, enables the measurement and evaluation of up to 138,000 real elevation points, offering one of the highest-resolution assessments of the anterior segment. Each measurement generates a full cross-sectional scan of the corneal surface. Key advantages of rotational Scheimpflug imaging include precise central corneal measurement, correction for eye movements, straightforward patient positioning, and a rapid acquisition time—typically under two seconds.
The demand for early detection of keratoconus, particularly in its subclinical stages, has driven interest in alternative technologies capable of revealing subtle structural changes. One such approach is high-resolution ultrasound pachymetry, which enables the generation of detailed stromal and epithelial thickness maps. Notably, epithelial thickness profiling has been reported as the only method with 100% sensitivity for detecting preclinical keratoconus5,11. This technique identifies specific epithelial remodeling patterns—central thinning and compensatory peripheral thickening in an annular distribution—that are characteristic of early disease5,11.
Another promising modality is anterior segment optical coherence tomography (AS-OCT), which provides high-resolution, cross-sectional imaging of all corneal layers12. Compared to elevation-based maps, AS-OCT offers true 3D visualization of the cornea and enables reliable measurement of epithelial thickness, both centrally and peripherally—approaching the accuracy of high-resolution ultrasound5,10–13. These combined capabilities make AS-OCT a valuable tool in the detection of subclinical and early-stage keratoconus.
One important parameter that cannot be measured using standard non-invasive imaging techniques is corneal hysteresis5. This metric reflects the viscoelastic response of the cornea to mechanical stress—typically assessed through air-puff tonometry—and represents the difference between corneal deformation and recovery14,15. Reduced corneal hysteresis has been associated with biomechanical weakening and may serve as an early indicator of ectatic changes. The analysis of corneal biomechanics, including hysteresis, provides complementary data that can enhance and support the findings obtained from topography, tomography, and pachymetry5,14. Incorporating such biomechanical metrics into the diagnostic process may improve sensitivity for detecting subclinical keratoconus and monitoring its progression.
Recent studies from the past five years (2020–2025) highlight the increasing role of machine learning (ML) in the early detection and classification of keratoconus. A recent review16 emphasizes the potential of ML algorithms to improve early-stage diagnosis, while other works17 compare different classification techniques, illustrating the diagnostic gains possible with automated models. Various data sources have been used to build such models, including spectral-domain optical coherence tomography (SD-OCT), ultra-high-resolution OCT, air-puff tonometry, and Scheimpflug imaging18–20. Machine learning models have shown promising results in differentiating keratoconic from normal corneas19,21. However, many algorithms still struggle to identify form fruste keratoconus (FFKC), a very early and subtle stage of the disease19. Notably, Yang et al.20 report a model capable of successfully distinguishing FFKC from physiological corneas. A recent systematic review and meta-analysis by Bodmer et al.22 concluded that the overall diagnostic performance of deep learning models for keratoconus detection is strong, although many studies suffer from methodological limitations. Other detailed investigations23–26 further evaluate the performance of both classical ML algorithms and deep neural networks. For example, study23 demonstrated the high accuracy (up to 98%) of the Random Forest classifier in distinguishing normal eyes from those with keratoconus, as well as in grading disease severity. Subsequent research24 explored the use of convolutional neural networks (CNNs) and autoencoders to augment datasets and improve diagnostic accuracy. In a recent multicentre study, CNNs achieved 97.8% accuracy in detecting and grading keratoconus using colour-coded maps generated by Scheimpflug imaging, based on axial, elevation, and pachymetric data27. A hybrid approach25 combining CNNs with traditional ML techniques further improved performance in classifying corneal topographic patterns. 
Lastly, study26 demonstrated that complex deep learning models could effectively detect keratoconus based on subtle morphological changes in the corneal endothelium, achieving high scores across multiple performance metrics. These developments underscore the potential of ML and deep learning in improving keratoconus diagnostics. However, additional work is needed to compare algorithms using structured, interpretable data, particularly from widely used tools like the Pentacam. This motivates the present study, which aims to evaluate and compare several machine learning classifiers trained on topographic and biomechanical parameters obtained from the Pentacam device. A special focus is placed on identifying clinically relevant features and assessing their overlap with expert-defined diagnostic indicators.
The aim of this study was to differentiate healthy corneas from those affected by keratoconus using structured data obtained from the Pentacam device and a set of supervised machine learning algorithms. The analysis focused on evaluating the diagnostic performance of multiple classifiers across different groups of topographic and biomechanical parameters. An additional goal was to identify the most informative features contributing to classification performance and compare them with diagnostic indicators routinely used by clinicians, thereby improving model interpretability and clinical relevance.
Materials and methods
Study participants
A pilot clinical study was conducted to assess the anterior segment of the eye using the Pentacam device in individuals diagnosed with keratoconus and healthy controls. This single-center study took place at the Department of Ophthalmology, Medical University of Lublin, between March and August 2024. A total of 72 healthy eyes and 76 eyes with keratoconus were examined. All patients were evaluated on an outpatient basis and provided written informed consent prior to participation. The study adhered to the principles of Good Clinical Practice (GCP), was approved by the Local Ethics Committee of the Medical University of Lublin (approval no. KE-0254/98/03/2023), and complied with the Declaration of Helsinki. Each participant underwent a comprehensive ophthalmic examination, including objective and subjective refraction, slit-lamp biomicroscopy, intraocular pressure measurement, and corneal tomography.
Inclusion criteria
(A) Healthy eyes:
Normal corneal tomography and topography (K max < 47 D, inferior–superior difference < 1.5 D, skewed radial axis index < 22°).
Normal elevation maps of the anterior and posterior corneal surfaces.
Normal corneal thickness distribution (CCT > 480 μm).
No corneal scars.
No clinical signs of keratoconus.
No family history of keratoconus.
(B) Eyes with keratoconus:
Abnormal corneal tomography and topography (K max > 47 D, inferior–superior difference > 1.5 D, skewed radial axis index > 22°).
Abnormal elevation maps of the anterior and posterior corneal surfaces.
Central or inferior steepening of the cornea.
Corneal thinning.
No corneal scars.
Exclusion criteria
Other corneal ectasias (e.g., Pellucid Marginal Degeneration, Keratoglobus): These disorders differ from keratoconus in pathophysiology, clinical presentation, and progression. Including them could introduce heterogeneity and compromise the specificity of keratoconus-related findings.
Corneal endothelial disorders (e.g., Fuchs’ endothelial dystrophy, posterior polymorphous corneal dystrophy [PPCD], congenital hereditary endothelial dystrophy [CHED], X-linked endothelial corneal dystrophy [XECD], or significant endothelial cell loss): These conditions may independently affect corneal thickness, transparency, and biomechanics, potentially confounding keratoconus-specific analysis.
History of ocular surgery (e.g., LASIK, cataract extraction): Surgical procedures can alter corneal shape and biomechanics, possibly mimicking or masking keratoconus-related changes.
Infectious eye diseases: Current or prior ocular infections can result in scarring, thinning, or corneal irregularities that may interfere with accurate diagnosis and classification of keratoconus.
Data acquisition
All participants were instructed to discontinue contact lens use at least 14 days prior to the examination.
The Pentacam device enables comprehensive assessment of the anterior segment of the eye through a three-dimensional mathematical model. This allows for the analysis of multiple structural and biomechanical parameters, including:
Corneal Topography: Evaluation of the anterior and posterior corneal surfaces, enabling assessment of curvature and shape to detect irregularities typical of keratoconus.
Pachymetry: Measurement of corneal thickness at various points to identify areas of thinning, which are characteristic of ectatic disorders.
Densitometry: Analysis of corneal and lens transparency to detect early signs of opacities, which may correlate with keratoconus progression.
Three-Dimensional Anterior Segment Imaging: High-resolution 3D reconstructions based on rotating Scheimpflug imaging, facilitating detailed visualization of corneal structure and pathology.
Anterior Chamber Analysis: Quantification of chamber depth, angle, and volume, providing additional insight into corneal shape and anterior segment changes.
The resulting measurements were categorized into feature groups and saved as tabular datasets in CSV format, corresponding to the following directories (original English names):
BADisplay: Belin/Ambrósio Enhanced Ectasia Display parameters, supporting early detection of ectatic corneal disorders through composite indices and detailed maps.
CHAMBER: Anterior chamber measurements (depth, volume, angle), important for glaucoma evaluation and preoperative planning for intraocular lenses.
COR-PWR: Corneal power maps assessing refractive strength and curvature, essential for refractive surgery planning.
EccSag: Parameters describing sagittal and tangential curvature, relevant to corneal shape analysis and keratoconus diagnosis.
Fourier: Fourier decomposition of corneal surfaces, used to detect subtle irregularities not evident on standard topography.
INDEX: Various topography-based indices, including keratoconus-specific metrics and surface irregularity scores.
KEIO: Features used in the KEIO classification system for categorizing corneal morphologies and anomalies.
PACHY: Pachymetric data providing detailed corneal thickness profiles for diagnostic and surgical purposes.
SUMMARY: A compiled summary of key diagnostic metrics, offering a global overview of corneal health.
ALL: A comprehensive dataset combining all the above feature sets.
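As an illustrative sketch (not the study's actual code), the per-category CSV exports listed above can be loaded with pandas and merged column-wise into the combined ALL dataset; the file names, paths, and the deduplication step are assumptions for illustration.

```python
# Hypothetical sketch: load the per-category Pentacam CSV exports and
# merge them into a combined "ALL" dataset. File names are assumed to
# match the directory names given in the text.
import pandas as pd

CATEGORIES = ["BADisplay", "CHAMBER", "COR-PWR", "EccSag", "Fourier",
              "INDEX", "KEIO", "PACHY", "SUMMARY"]

def load_category(name: str) -> pd.DataFrame:
    """Load one feature-group CSV; rows are eyes, columns are Pentacam features."""
    return pd.read_csv(f"{name}.csv")

def build_all(frames: list[pd.DataFrame]) -> pd.DataFrame:
    """Concatenate feature groups column-wise, keeping each repeated
    feature only once (features may occur in several categories)."""
    combined = pd.concat(frames, axis=1)
    return combined.loc[:, ~combined.columns.duplicated()]
```

Dropping duplicated columns after concatenation mirrors the fact that the ALL dataset counts a feature only once even when it appears in several measurement categories.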
A team of ophthalmology specialists from the Department of General Ophthalmology extracted diagnostically relevant topographic and tomographic data from all participants using the Pentacam® AXL Wave device (Oculus Optikgeräte GmbH, Germany). Only scans marked as “OK” by the device’s internal Quality Specification (QS) function were included in the analysis, ensuring that only high-quality data were used. Table 1 presents the extracted parameters along with basic descriptive statistics and the results of the Student’s t-test, which was used to compare the means between the keratoconus and healthy eye groups. This test determines whether observed differences in means are statistically significant—that is, unlikely to have occurred by chance. All statistical analyses were conducted using Statistica 13 software, with a significance level set at α = 0.05; p-values below this threshold were considered statistically significant.
Table 1.
Comparison of selected clinical, tomographic, and biomechanical parameters between healthy and keratoconus groups.
| Parameter | Healthy: Mean | Healthy: SD | Healthy: 95% CI, lower | Healthy: 95% CI, upper | Keratoconus: Mean | Keratoconus: SD | Keratoconus: 95% CI, lower | Keratoconus: 95% CI, upper | p-value |
|---|---|---|---|---|---|---|---|---|---|
| SEX | M: 69, F: 7 | | | | M: 34, F: 38 | | | | |
| AGE | 29.21 | 9.681 | 8.292 | 11.63 | 28.94 | 10.44 | 8.94 | 12.55 | 0.872 |
| K max | 44.33 | 1.421 | 1.216 | 1.711 | 57.51 | 9.609 | 8.239 | 11.53 | 0.000 |
| K1 | 42.26 | 1.173 | 1.003 | 1.411 | 46.86 | 5.508 | 4.723 | 6.609 | 0.000 |
| K2 | 43.78 | 1.329 | 1.137 | 1.600 | 50.31 | 6.514 | 5.585 | 7.816 | 0.000 |
| Astigm | 1.533 | 1.303 | 1.115 | 1.568 | 3.447 | 2.663 | 2.283 | 3.195 | 0.000 |
| Thinnest pachy | 548.5 | 29.62 | 25.34 | 35.64 | 463.7 | 47.14 | 40.42 | 56.57 | 0.000 |
| ISV | 19.76 | 9.452 | 8.087 | 11.37 | 90.37 | 42.95 | 36.83 | 51.54 | 0.000 |
| IVA | 0.128 | 0.059 | 0.051 | 0.072 | 0.920 | 0.418 | 0.358 | 0.501 | 0.000 |
| KI | 1.019 | 0.022 | 0.019 | 0.027 | 1.247 | 0.152 | 0.131 | 0.183 | 0.000 |
| CKI | 1.008 | 0.005 | 0.004 | 0.006 | 1.073 | 0.077 | 0.066 | 0.092 | 0.000 |
| IHA | 7.316 | 7.172 | 6.137 | 8.631 | 33.10 | 28.26 | 24.23 | 33.91 | 0.000 |
| IHD | 0.012 | 0.007 | 0.006 | 0.009 | 0.134 | 0.084 | 0.072 | 0.101 | 0.000 |
| KISA | 7.465 | 10.16 | 8.697 | 12.23 | 6430 | 5513 | 7716 | 768.6 | 0.000 |
| BADDf | 0.329 | 0.959 | 0.821 | 1.154 | 11.20 | 10.07 | 8.641 | 12.09 | 0.000 |
| BADDb | − 0.036 | 0.621 | 0.531 | 0.747 | 9.110 | 6.626 | 5.681 | 7.951 | 0.000 |
| BADDp | 0.425 | 0.864 | 0.7393 | 1.039 | 8.794 | 7.391 | 6.337 | 8.868 | 0.000 |
| BADDt | − 0.260 | 0.799 | 0.684 | 0.962 | 2.661 | 2.006 | 1.720 | 2.407 | 0.000 |
| BADDam | 0.221 | 0.694 | 0.594 | 0.835 | 2.948 | 0.829 | 0.711 | 0.995 | 0.000 |
| BADDy | 0.513 | 0.796 | 0.681 | 0.958 | 0.613 | 1.618 | 1.387 | 1.941 | 0.648 |
| BADD | 0.847 | 0.600 | 0.513 | 0.722 | 9.054 | 5.153 | 4.418 | 6.183 | 0.000 |
| BADDbHyp | − 0.858 | 0.614 | 0.526 | 0.740 | 8.199 | 6.560 | 5.625 | 7.871 | 0.000 |
| BFSFront8mm | 7.936 | 0.192 | 0.164 | 0.231 | 7.449 | 0.581 | 0.498 | 0.697 | 0.000 |
| BFSBack8mm | 6.500 | 0.198 | 0.170 | 0.239 | 6.114 | 0.514 | 0.441 | 0.617 | 0.000 |
| ARTMin | 830.2 | 150.0 | 128.3 | 180.5 | 365.2 | 198.3 | 170.0 | 238.0 | 0.000 |
| ARTMax | 463.7 | 75.97 | 65.00 | 91.42 | 165.5 | 90.72 | 77.79 | 108.8 | 0.000 |
| ARTAvg | 577.3 | 87.96 | 75.26 | 105.8 | 242.7 | 122.6 | 105.1 | 147.1 | 0.000 |
| PachyMin | 548.5 | 29.62 | 25.34 | 35.64 | 463.7 | 47.14 | 40.42 | 56.57 | 0.000 |
| DistApexThinLocmm | 0.757 | 0.238 | 0.203 | 0.286 | 0.836 | 0.627 | 0.537 | 0.752 | 0.336 |
| PachyMinLocOrientation | 101.1 | 0.492 | 0.421 | 0.593 | 101.2 | 0.760 | 0.652 | 0.912 | 0.125 |
| ACDepth | 2.974 | 0.264 | 0.225 | 0.321 | 3.147 | 0.318 | 0.271 | 0.384 | 0.001 |
| ACVolume | 269.2 | 313.3 | 268.1 | 377.0 | 247.7 | 282.5 | 242.2 | 339.0 | 0.673 |
| ChAngle | 37.44 | 7.315 | 6.259 | 8.803 | 36.82 | 6.265 | 5.371 | 7.517 | 0.591 |
| RMS (CF) | 2.654 | 3.643 | 3.145 | 4.331 | 12.64 | 7.058 | 6.063 | 8.445 | 0.000 |
| RMS HOA (CF) | 0.549 | 0.755 | 0.651 | 0.897 | 3.129 | 1.821 | 1.564 | 2.178 | 0.000 |
| RMS LOA (CF) | 2.590 | 3.569 | 3.081 | 4.243 | 12.24 | 6.838 | 5.875 | 8.182 | 0.000 |
| RMS (CB) | 0.996 | 0.735 | 0.634 | 0.874 | 3.119 | 1.558 | 1.338 | 1.864 | 0.000 |
| RMS HOA (CB) | 0.224 | 0.188 | 0.162 | 0.224 | 0.862 | 0.443 | 0.381 | 0.530 | 0.000 |
| RMS LOA (CB) | 0.969 | 0.712 | 0.615 | 0.847 | 2.992 | 1.503 | 1.291 | 1.798 | 0.000 |
| RMS (Cornea) | 2.370 | 3.321 | 2.866 | 3.948 | 11.52 | 6.562 | 5.638 | 7.852 | 0.000 |
| RMS HOA (Cornea) | 0.536 | 0.667 | 0.575 | 0.792 | 2.862 | 1.743 | 1.498 | 2.086 | 0.000 |
| RMS LOA (Cornea) | 2.299 | 3.260 | 2.814 | 3.875 | 11.15 | 6.350 | 5.455 | 7.598 | 0.000 |
| Z 2 2 (CF) | − 1.305 | 1.315 | 1.135 | 1.563 | − 2.776 | 2.363 | 2.030 | 2.828 | 0.000 |
| Z 2 0 (CF) | 0.248 | 2.714 | 2.343 | 3.226 | − 3.533 | 5.991 | 5.147 | 7.169 | 0.000 |
| Z 2 −2 (CF) | 0.021 | 0.846 | 0.730 | 1.005 | − 0.172 | 2.257 | 1.939 | 2.701 | 0.483 |
| Z 3 3 (CF) | 0.009 | 0.149 | 0.128 | 0.177 | − 0.099 | 0.520 | 0.447 | 0.623 | 0.081 |
| Z 3 1 (CF) | − 0.009 | 0.332 | 0.286 | 0.394 | − 0.030 | 1.020 | 0.877 | 1.221 | 0.869 |
| Z 3 −1 (CF) | − 0.122 | 0.628 | 0.542 | 0.746 | − 2.356 | 1.497 | 1.286 | 1.791 | 0.000 |
| Z 3 −3 (CF) | − 0.061 | 0.134 | 0.115 | 0.159 | 0.304 | 0.657 | 0.565 | 0.787 | 0.000 |
| Z 2 2 (CB) | 0.366 | 0.232 | 0.200 | 0.275 | 0.796 | 0.593 | 0.509 | 0.710 | 0.000 |
| Z 2 0 (CB) | − 0.560 | 0.568 | 0.490 | 0.675 | 0.511 | 1.145 | 0.984 | 1.371 | 0.000 |
| Z 2 −2 (CB) | − 0.017 | 0.161 | 0.139 | 0.191 | − 0.042 | 0.412 | 0.354 | 0.494 | 0.615 |
| Z 3 3 (CB) | − 0.014 | 0.061 | 0.053 | 0.073 | − 0.008 | 0.188 | 0.161 | 0.225 | 0.772 |
| Z 3 1 (CB) | 0.007 | 0.092 | 0.079 | 0.110 | 0.025 | 0.284 | 0.244 | 0.340 | 0.587 |
| Z 3 −1 (CB) | 0.004 | 0.171 | 0.148 | 0.204 | 0.611 | 0.364 | 0.313 | 0.436 | 0.000 |
| Z 2 2 (Cornea) | − 1.111 | 1.277 | 1.102 | 1.518 | − 2.371 | 2.180 | 1.872 | 2.608 | 0.000 |
| Z 2 0 (Cornea) | 0.008 | 2.476 | 2.138 | 2.944 | − 2.970 | 5.729 | 4.922 | 6.855 | 0.000 |
| Z 2 −2 (Cornea) | 0.008 | 0.811 | 0.700 | 0.965 | − 0.230 | 2.297 | 1.974 | 2.749 | 0.393 |
| Z 3 3 (Cornea) | − 0.003 | 0.170 | 0.146 | 0.202 | − 0.116 | 0.587 | 0.505 | 0.703 | 0.107 |
| Z 3 1 (Cornea) | − 0.004 | 0.295 | 0.254 | 0.351 | − 0.008 | 0.918 | 0.789 | 1.099 | 0.966 |
| Z 3 −1 (Cornea) | − 0.129 | 0.561 | 0.484 | 0.667 | − 2.099 | 1.445 | 1.242 | 1.730 | 0.000 |
| Z 3 −3 (Cornea) | − 0.088 | 0.157 | 0.135 | 0.187 | 0.157 | 0.672 | 0.578 | 0.805 | 0.002 |
Values are reported as mean, standard deviation (SD) with 95% confidence intervals (CI). Statistically significant differences (p < 0.05) are indicated.
A comparison of the selected study parameters revealed no statistically significant difference in mean age between the two groups (p = 0.872), whereas significant differences were observed in most of the remaining parameters. Maximum keratometry (K max), K1, and K2 values, which reflect corneal curvature, were significantly higher in the keratoconus group (p < 0.001), as was astigmatism. The thinnest pachymetry readings indicated a markedly thinner cornea in the keratoconus group. Topographic indices such as the Index of Surface Variance (ISV), Index of Vertical Asymmetry (IVA), Keratoconus Index (KI), and Central Keratoconus Index (CKI) were also significantly elevated (p < 0.001). Numerous additional parameters showed statistically significant differences, including IHA, IHD, KISA, BADDf, BADDb, BADDp, BADDt, BADDam, BADD, BADDbHyp, BFSFront8mm, BFSBack8mm, ARTMin, ARTMax, ARTAvg, PachyMin, ACDepth, and various root mean square (RMS) and Zernike components measured across the different corneal surfaces (CF, CB, Cornea). In contrast, several parameters did not differ significantly between groups (p > 0.05), including BADDy, DistApexThinLocmm, PachyMinLocOrientation, ACVolume, ChAngle, and selected Zernike terms such as Z 2 −2, Z 3 3, and Z 3 1 for the corneal front, back, and total corneal surfaces.
While these indicators represent a substantial portion of the clinically relevant parameters analyzed, they do not include all variables used in subsequent machine learning models.
Machine learning models
Training machine learning models on each individual dataset, as well as on their combined form, enabled the identification of the most diagnostically relevant parameters—both in terms of measurement categories and specific indices.
A summary of the number of features included in each dataset is provided in Table 2.
As shown in Table 2, the dataset containing corneal thickness measurements (PACHY) includes the highest number of features (248), while the Fourier parameter set contains the fewest (19). The total feature count in the combined dataset (ALL) is lower than the sum of the individual sets because certain features occur in more than one category and are counted only once in ALL.
Table 2.
Number of features included in each dataset derived from Pentacam measurements.
| Data | BADisplay | CHAMBER | COR-PWR | EccSag | Fourier | INDEX | KEIO | PACHY | SUMMARY | ALL |
|---|---|---|---|---|---|---|---|---|---|---|
| Number of features | 34 | 79 | 167 | 43 | 19 | 86 | 113 | 248 | 72 | 707 |
Additionally, the input data underwent preliminary preprocessing. Missing values in quantitative features were imputed using the mean, while missing values in categorical variables were filled with the mode (most frequently occurring value).
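The imputation step described above can be sketched in pandas as follows; the column names are illustrative, not the actual Pentacam feature names.

```python
# Sketch of the preprocessing described above: mean imputation for
# numeric features, mode (most frequent value) imputation for
# categorical ones. Column names are illustrative.
import numpy as np
import pandas as pd

def impute(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values: column mean for numeric, column mode otherwise."""
    df = df.copy()
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].mean())
        else:
            df[col] = df[col].fillna(df[col].mode().iloc[0])
    return df
```

The same behavior can also be obtained with scikit-learn's `SimpleImputer` (strategies `"mean"` and `"most_frequent"`), which integrates directly into model pipelines.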
Each of the aforementioned data groups can be used individually or in combination as input datasets for machine learning models performing binary classification (0 = absence, 1 = presence of keratoconus). In medical diagnostics involving artificial intelligence, the most frequently applied algorithms include Logistic Regression (LR)28,29, Decision Tree (DT)30,31, Random Forest (RF)29,32,33, Support Vector Machine (SVM)34,35, K-Nearest Neighbors (KNN)36,37, Gradient Boosting (GB)38,39, and Stochastic Gradient Descent (SGD)40,41.
These methods are commonly selected for medical data classification due to the following advantages:
Effectiveness: High classification accuracy across complex and multidimensional datasets.
Flexibility: Applicable to both continuous and categorical variables.
Interpretability: Many of these models offer interpretable outputs, which is essential in clinical settings where understanding the basis of a decision can be critical.
Scalability: Capable of handling large datasets, often required in medical applications.
Optimization potential: Focused on maximizing diagnostic performance, which is vital given the consequences of misclassification in healthcare.
Description of applied machine learning models
Logistic Regression (LR): A statistical model that estimates the probability of class membership using the logistic function. It offers high interpretability, with regression coefficients reflecting the influence of each predictor variable on the outcome.
Decision Tree (DT): A hierarchical model in which internal nodes represent decision rules based on feature values, and terminal nodes indicate predicted classes. It supports both numerical and categorical data and is easily interpretable and visualizable.
Random Forest (RF): An ensemble method that constructs multiple decision trees using bootstrapped data and random feature selection. Predictions are aggregated through majority voting, increasing robustness and reducing overfitting compared to individual trees.
Support Vector Machine (SVM): A model that identifies the optimal hyperplane to separate classes with maximum margin. Kernel functions allow for transformation into higher-dimensional spaces to handle non-linearly separable data.
K-Nearest Neighbors (KNN): A non-parametric model that assigns class labels based on the majority class among the k nearest data points in the feature space, using a chosen distance metric.
Gradient Boosting (GB): An iterative ensemble technique where each new weak learner (typically a decision tree) is trained to correct the residual errors of the previous model. The method minimizes loss through additive model updates.
Stochastic Gradient Descent (SGD): A linear model optimization technique that updates weights incrementally for each data sample, improving efficiency. Regularization is used to prevent overfitting, making it suitable for high-dimensional data.
Each selected machine learning model identifies data patterns differently, which is why their performance was evaluated empirically. Training and validating models on datasets containing different groups of features made it possible to assess which measurement parameters provided the highest diagnostic value. However, there is a risk that a model may become overfitted to the specific characteristics of a given dataset. To address this, a two-step approach was adopted:
First Implementation (T1): In this stage, all selected machine learning algorithms were tested on datasets representing individual measurement categories (corresponding to the source directory names). The goal was to identify which groups of features were the most informative for classification.
Second Implementation (T2): The most effective model from the T1 phase was then trained and evaluated on a dataset combining all available features from all measurement categories. This step aimed to determine the most relevant parameters for generalizing to new, unseen data.
The target variable consisted of a binary vector: 0 representing the absence and 1 the presence of keratoconus. The list of patient labels was prepared by ophthalmology specialists from Professor R. Rejdak’s research group at the Medical University of Lublin.
Comparison of classifier parameters
Model evaluation was performed using standard classification metrics: accuracy (ACC), precision (PRC), recall (REC), F1-score (F1), specificity (SPEC), confusion matrix (CM), and Brier score (BS). A brief explanation of each metric is provided below:
Accuracy (ACC): The proportion of correctly classified cases among all observations, reflecting the model’s overall performance.
Precision (PRC): The proportion of true positives among all predicted positives, indicating how well the model avoids false alarms.
Recall (REC)/Sensitivity: The proportion of true positives among all actual positives, measuring the model’s ability to detect positive cases.
F1-score (F1): The harmonic mean of precision and recall, balancing both false positives and false negatives.
Specificity (SPEC): The proportion of true negatives among all actual negatives, indicating the model’s ability to correctly reject non-cases.
Confusion Matrix (CM): A summary table of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), from which all other metrics are derived.
Brier Score (BS): The mean squared difference between predicted probabilities and actual outcomes. Lower scores indicate better calibration. In clinical contexts, a low Brier score implies that the model’s confidence levels are reliable and suitable for supporting risk-based decisions.
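All of these metrics are available in scikit-learn; a small sketch on toy labels (the values below are illustrative, not study results):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, brier_score_loss)

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
y_prob = [0.1, 0.7, 0.9, 0.8, 0.4, 0.2]  # predicted probability of class 1

acc = accuracy_score(y_true, y_pred)
prc = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
f1  = f1_score(y_true, y_pred)

# sklearn's confusion matrix is [[TN, FP], [FN, TP]]; specificity is derived from it.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
spec = tn / (tn + fp)

bs = brier_score_loss(y_true, y_prob)  # mean squared probability error
```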
The initial evaluation of model performance was based solely on the accuracy (ACC) metric. In all cases, the dataset was split into 80% for training and 20% for testing. Additionally, 5-fold stratified cross-validation was applied to the training set to ensure robust performance estimation. This method divides the data into five equally sized subsets (folds), maintaining similar class distribution in each. In each iteration, four folds are used for training and one for validation. The process is repeated five times, with each fold serving once as the validation set. The final performance score is obtained by averaging the results across all folds. This approach provides more stable performance estimates and ensures more efficient use of the available data, especially in relatively small datasets.
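The evaluation protocol above can be sketched under stated assumptions (synthetic data standing in for the Pentacam feature vectors; 144 samples chosen only to mirror the study's size):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = make_classification(n_samples=144, n_features=20, random_state=0)

# 80% training / 20% test, preserving class proportions.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

# 5-fold stratified cross-validation on the training portion.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0),
                         X_tr, y_tr, cv=cv, scoring="accuracy")
mean_acc = scores.mean()  # final score: average over the five folds
```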
Results
Accuracy (ACC) was used as the primary metric to evaluate model performance across the different feature sets, with 5-fold cross-validation applied in each case. Table 2 summarizes the individual ACC values obtained for each dataset during the T1 testing procedure, along with their mean values presented in the final column.
Table 3 presents a comparative analysis of accuracy scores achieved by various binary classification models across datasets containing different parameter groups. Overall, the models demonstrated excellent performance, with nearly all achieving an average accuracy above 90%, confirming their ability to accurately detect keratoconus in most cases. Among the tested models, K-Nearest Neighbors (KNN) showed the lowest average accuracy (91%), while Random Forest (RF) achieved the highest (96%). Several other models, including Decision Tree (DT), Logistic Regression (LR), Support Vector Machine (SVM), and Gradient Boosting (GB), also performed reliably, each reaching an average accuracy of 94%.
Table 3.
Classification accuracy (ACC) of machine learning models across different feature groups in the T1 training procedure, evaluated using 5-fold cross-validation.
| Model\Data | BADisplay (%) | CHAMBER (%) | COR-PWR (%) | EccSag (%) | Fourier (%) | INDEX (%) | KEIO (%) | PACHY (%) | SUMMARY (%) | Mean (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| LR | 93 | 92 | 96 | 91 | 95 | 94 | 94 | 94 | 94 | 94 |
| DT | 98 | 89 | 92 | 94 | 93 | 94 | 92 | 96 | 97 | 94 |
| RF | 98 | 96 | 94 | 96 | 97 | 96 | 93 | 94 | 97 | 96 |
| SVM | 94 | 91 | 94 | 94 | 94 | 95 | 94 | 93 | 95 | 94 |
| KNN | 95 | 79 | 88 | 86 | 93 | 94 | 92 | 93 | 95 | 91 |
| GB | 97 | 92 | 92 | 95 | 93 | 94 | 87 | 96 | 97 | 94 |
| SGD | 94 | 88 | 93 | 90 | 93 | 92 | 92 | 89 | 92 | 91 |
| Mean | 96 | 90 | 93 | 92 | 94 | 94 | 92 | 94 | 95 | 93 |
Significant values are in bold.
A more detailed breakdown shows:
RF and DT models consistently performed well across datasets, with RF achieving top results in CHAMBER (96%) and Fourier (97%).
LR and SVM also showed stable performance, particularly in the COR-PWR and Fourier feature sets.
KNN had comparatively weaker results in CHAMBER (79%) and EccSag (86%), indicating sensitivity to feature scaling or data distribution.
GB and SGD models produced mixed results: GB performed well on the SUMMARY dataset (97%), while SGD showed lower accuracy in PACHY (89%).
Figure 1 provides a box plot summarizing the distribution of accuracy scores for each classification model across the evaluated datasets.
Fig. 1.
Box plot illustrating the distribution of classification accuracy (ACC) for each machine learning model across all tested feature sets during the training phase with 5-fold cross-validation.
Analysis of Fig. 1 indicates that the Random Forest (RF) model exhibits the highest median accuracy among all evaluated classifiers. This suggests that its diagnostic performance is more consistently concentrated around higher values compared to the other models.
In addition, the results presented in Table 3 can be used to identify which types of Pentacam-derived measurements yield the highest diagnostic accuracy for keratoconus detection, independent of the classification algorithm applied (see Fig. 2).
Fig. 2.
Box plot showing the distribution of classification accuracy (ACC) across different feature sets used as input datasets for machine learning models.
Figure 2 demonstrates that, overall, each feature group yields high classification accuracy across models. The highest mean accuracy was observed for the BADisplay dataset (96%). A more detailed breakdown reveals the following:
BADisplay: Consistently high performance across all models, with no significant outliers.
CHAMBER: High variability across models, with KNN performing notably worse than others.
COR-PWR: Moderate variability, with most models achieving stable results.
EccSag: Moderate variability, with RF showing the best performance.
Fourier: Uniformly strong results across all classifiers.
INDEX: Consistently high accuracy across models.
KEIO: Moderate variation, with GB performing less favorably.
PACHY: Similar moderate variability, with SGD achieving lower accuracy.
SUMMARY: Stable and consistently high accuracy across all models.
Additional insight into model performance can be gained through analysis of the confusion matrix. Figure 3 presents the graphical representation of confusion matrices for the best-performing model, Random Forest (RF), evaluated on the BADisplay and SUMMARY datasets.
Fig. 3.
Confusion matrices for the Random Forest (RF) classifier evaluated on the BADisplay (left) and SUMMARY (right) feature sets.
As shown in Fig. 3, the Random Forest model correctly classified 15 healthy eyes and 12 eyes with keratoconus in both the BADisplay and SUMMARY datasets. In both cases, two keratoconus cases were misclassified as healthy, while no healthy eyes were incorrectly classified as diseased.
Based on the confusion matrices, several additional evaluation metrics relevant to medical classification tasks were calculated. These values are summarized in Table 4.
Table 4.
Evaluation metrics for the Random Forest (RF) classifier applied to the BADisplay and SUMMARY datasets.
| Data/metrics | ACC (%) | PRC (%) | REC (%) | F1 (%) | SPEC (%) |
|---|---|---|---|---|---|
| BADisplay | 93 | 100 | 86 | 92 | 100 |
| SUMMARY | 93 | 100 | 86 | 92 | 100 |
Both the BADisplay and SUMMARY datasets produced identical results across all evaluated metrics, indicating that the Random Forest model demonstrates high stability and robustness when trained on either set. In both cases, the model achieved excellent precision (PRC) and specificity (SPEC), each at 100%. These results confirm that the Random Forest classifier trained on BADisplay and SUMMARY features is highly reliable and well suited for clinical applications where maximum precision and specificity are required.
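As a quick check, the values in Table 4 follow directly from the confusion-matrix counts reported in Fig. 3 (TN = 15, TP = 12, FN = 2, FP = 0):

```python
# Recomputing the Table 4 metrics from the confusion-matrix counts in Fig. 3.
tn, tp, fn, fp = 15, 12, 2, 0

acc  = (tp + tn) / (tp + tn + fp + fn)  # 27/29 -> 93%
prc  = tp / (tp + fp)                   # 12/12 -> 100%
rec  = tp / (tp + fn)                   # 12/14 -> 86%
f1   = 2 * prc * rec / (prc + rec)      # -> 92%
spec = tn / (tn + fp)                   # 15/15 -> 100%
```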
The second stage of analysis (T2) involved evaluating classifier performance on a dataset that combined all available measurements obtained from the Pentacam device. Unlike the T1 procedure, which focused on specific feature groups, this stage aimed to identify which individual features most strongly influence model performance and are thus most relevant for keratoconus detection. Table 5 presents the classification accuracy of each model, calculated using the final fold of 5-fold cross-validation.
Table 5.
Classification accuracy (ACC) of individual machine learning models evaluated on the training set using the final fold of 5-fold cross-validation.
| Model | LR (%) | DT (%) | RF (%) | SVM (%) | KNN (%) | GB (%) | SGD (%) |
|---|---|---|---|---|---|---|---|
| ACC | 95 | 95 | 96 | 95 | 94 | 95 | 95 |
When evaluated on the training set containing all features, the Random Forest (RF) model achieved the highest classification accuracy at 96%. All models, however, performed exceptionally well, with accuracies exceeding 94%.
Figure 4 presents the confusion matrices for the two top-performing models: Random Forest (RF), which achieved the highest accuracy, and Gradient Boosting (GB), included for comparative purposes. Both were evaluated on the independent test set.
Fig. 4.
Confusion matrices for the Random Forest (RF) classifier (left) and Gradient Boosting (GB) classifier (right), evaluated on the full feature set containing all 707 measurements obtained from the Pentacam device.
The classification results show that both models correctly identified 14 healthy individuals and 12 patients with keratoconus. Each model also produced the same number of misclassifications, incorrectly labeling one healthy individual as having keratoconus and two keratoconus cases as healthy. A summary of the evaluation metrics obtained during testing is presented in Table 6.
Table 6.
Quality assessment metrics for the Random Forest (RF) and Gradient Boosting (GB) classifiers evaluated on the test set using the full feature set.
| Model/Metrics | ACC (%) | PRC (%) | REC (%) | F1 (%) | SPEC (%) | BS |
|---|---|---|---|---|---|---|
| RF | 90 | 92 | 86 | 89 | 93 | 0.0606 |
| GB | 90 | 92 | 86 | 89 | 93 | 0.0612 |
Results correspond to the final (fifth) fold of 5-fold cross-validation.
The Brier Score for both models (RF: 0.0606; GB: 0.0612) indicates high-quality probabilistic predictions, confirming good calibration and accuracy.
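For reference, the Brier score is simply the mean squared gap between predicted probability and actual outcome; a minimal sketch (the probabilities below are made up for illustration, not the models' actual outputs):

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and outcomes."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

# Confident, well-calibrated predictions give a score near 0:
low = brier_score([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])   # 0.025
# Uninformative 0.5 predictions give 0.25:
flat = brier_score([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])  # 0.25
```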
The final stage of the analysis focused on evaluating feature importance—both at the individual variable level and across broader measurement groups. Feature importance analysis identifies which variables have the greatest impact on model predictions and overall performance. Features with high importance contribute most to model accuracy, while those with low importance may be candidates for removal, potentially simplifying the model and improving efficiency.
In addition to guiding model refinement, this analysis enhances interpretability by providing insights into the model’s decision-making process. Figure 5 presents the top 30 most influential features for both the Random Forest (RF) and Gradient Boosting (GB) models.
Fig. 5.
Feature importance rankings for the Random Forest (RF) model (top) and Gradient Boosting (GB) model (bottom), evaluated on the full feature set containing all 707 measurements.
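Importance rankings like those in Fig. 5 can be read from fitted scikit-learn tree ensembles via the `feature_importances_` attribute; a sketch on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=144, n_features=10,
                           n_informative=3, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sorting descending yields the ranking
# (the study plots the top 30 of 707 features).
order = np.argsort(rf.feature_importances_)[::-1]
ranking = [(int(i), float(rf.feature_importances_[i])) for i in order]
```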
For the Random Forest (RF) model, the most influential feature was Pachy Prog Index Max. with an importance score of 0.0795, followed by RPI Max (0.0599) and D4mm Prog (0.0452). Other features with relatively high importance included R Min B (mm), KI, and D1.6 mm Prog. Features such as BAD Dam, D4.8 mm Prog, and BAD D showed moderate contributions, indicating some influence on model decisions but less than the top-ranked variables. Features like D2.4 mm Prog, BAD Dp, and D5.6 mm Prog were among the least influential.
In contrast, the Gradient Boosting (GB) model assigned the highest importance to BAD D (0.3679), followed by RPI Max (0.2920) and BAD Dam (0.1522). Additional features such as ART Max and Pachy Prog Index Max. also contributed substantially to the model’s predictions. Features including TNP R2mm P330°, Sag. Nas 25, and TNP R4mm P324° had moderate importance scores. On the lower end, variables such as Ecc. Nas 25, r4.0 324°, and Asph. Q F exhibited minimal impact on model output.
Both models identified RPI Max and Pachy Prog Index Max. as influential features, although their relative importance varied. The GB model assigned substantially greater weight to BAD D and concentrated predictive influence on a few key features, suggesting that its decision-making relies more heavily on select measurements, whereas the RF model distributed importance more evenly across a broader set of features, without a single variable dominating predictions. A comparison of the distribution of important features by measurement type, as recorded by the Pentacam device, is shown in Fig. 6.
Fig. 6.
Distribution of the most important features by examination type for the Random Forest (RF) model (left) and Gradient Boosting (GB) model (right), evaluated on the complete feature set.
A comparison of the contribution of individual measurement groups shows that, in the case of the Random Forest (RF) model, the most influential features originated from corneal topography (INDEX) and pachymetric data (PACHY), each appearing 9 times, followed by Belin/Ambrósio Enhanced Ectasia Display (BADisplay) parameters with 8 occurrences. For the Gradient Boosting (GB) model, the most frequently represented group was the Corneal Power Map (COR-PWR), with 11 features, followed by BADisplay (6 occurrences) and KEIO-based parameters (5 occurrences), which relate to the classification of corneal shapes and irregularities. Notably, the GB model draws from a broader range of measurement categories, suggesting a more complex structure and greater capacity to integrate diverse input variables.
Discussion and conclusion
The results of this study demonstrate the strong potential of machine learning models in the early detection of keratoconus. Multiple feature sets derived from Pentacam measurements were evaluated using various classifiers. Most models achieved high diagnostic performance, with mean accuracy scores exceeding 91%. Among them, the Random Forest (RF) algorithm proved to be the most effective, achieving a mean accuracy of 96%, with values ranging from 93 to 98% depending on the feature set. Notably, the highest overall accuracy was obtained using the Belin/Ambrósio Enhanced Ectasia Display (BADisplay) dataset, where RF reached 98% accuracy, and even the least effective model—Logistic Regression—achieved 93%. These results suggest that BADisplay provides particularly informative parameters for keratoconus classification. In summary, the Random Forest model applied to BADisplay measurements appears to be the optimal combination for early corneal diagnosis. However, high accuracy values across other datasets indicate that a wide range of Pentacam-derived parameters can also be effectively used, provided the model is appropriately selected and tuned.
In the next stage of the study, all previously tested machine learning models were applied to a combined dataset containing all available features. Once again, the Random Forest (RF) model outperformed the others, achieving a diagnostic accuracy of 96%. Other models also demonstrated strong performance, with accuracy values ranging between 94% and 95%. When evaluated on the independent test set, both the optimal model (RF) and the comparative model (Gradient Boosting) showed reduced classification accuracy (90%) compared to the training set. This drop may be attributed to the limited sample size relative to the high dimensionality of the feature space, which can hinder the models’ generalization ability and lead to overly complex decision boundaries. Despite this, both models achieved high precision (PRC = 92%) and specificity (SPEC = 93%). High precision indicates that positive predictions were accurate, while high specificity confirms that negative cases were correctly identified. Overall, the models maintained robust diagnostic performance across key metrics, including accuracy, recall, and F1-score. However, the presence of false negatives suggests room for improvement, particularly in capturing all keratoconus cases. In clinical diagnostics, balancing precision and recall is essential to ensure accurate detection while minimizing missed diagnoses.
The final stage of the analysis focused on comparing the importance of individual features in the predictions generated by the Random Forest (RF) and Gradient Boosting (GB) models. The results indicate that both the type and specific identity of influential features vary depending on the algorithm used. The GB model relies on a broader range of features drawn from multiple measurement categories, with a particularly strong emphasis on corneal power map parameters. In contrast, the RF model places greater weight on a smaller subset of features, with some types, such as anterior chamber measurements and sagittal or tangential curvature values, having minimal influence on its predictions. In both models, the most impactful features were consistently derived from corneal topography, elevation, and pachymetry data, confirming the clinical relevance of these parameters in the diagnosis of keratoconus.
A comparison between the diagnostic parameters identified by clinicians (Table 1) and those selected by the machine learning models as most significant (using an equal number of features) revealed substantial overlap. Shared parameters include corneal relative thickness indicators (ART Avg., ART Max., ART Min.) and ectasia-related indices such as BAD-D (total ectasia risk), BAD-Dam (anterior corneal surface deviation), BAD-Dp (pachymetric progression), BAD-Dt (corneal thickness deviation), and BAD-Dy (posterior surface deviation). In addition, both sources identified the BFS Front 8 mm parameter, representing the radius of the best-fit sphere for the central anterior corneal surface, as diagnostically relevant. Other shared features include key indices of corneal irregularity: CKI (Central Keratoconus Index), IHD (Index of Height Decentration), ISV (Index of Surface Variance), IVA (Index of Vertical Asymmetry), and KI (Keratoconus Index). Overall, the overlapping parameters primarily concern corneal topography, including surface shape, curvature, elevation, thickness, and asymmetry, highlighting a strong alignment between model-derived and clinically established diagnostic criteria.
Limitations
This study has several limitations. First, the dataset was relatively small (144 eyes) and collected at a single clinical center, which may limit the generalizability of the findings. Although 5-fold cross-validation was employed to reduce overfitting, external validation using larger and more diverse datasets is both necessary and planned for future research.
Second, the feature space was high-dimensional, comprising 707 variables, which increases the risk of overfitting. While cross-validation and feature importance analysis partially addressed this issue, no formal dimensionality reduction or regularization techniques (e.g., PCA, L1/L2 penalties) were applied. These approaches will be incorporated in future work.
Third, the analysis was based exclusively on structured tabular data derived from Pentacam measurements. Deep learning methods were not implemented due to the absence of raw imaging data and the relatively limited sample size. However, a separate study utilizing a hybrid CNN-RNN architecture applied to dynamic corneal images from the Corvis device is currently under review.
Future directions will include external validation of the current models, exploration of dimensionality reduction and calibration methods, and the integration of multimodal data, including raw tomographic images and corneal biomechanics. Ultimately, the implementation of interpretable machine learning models in clinical settings may enhance early diagnosis, support more consistent clinical decisions, and improve the detection of subclinical keratoconus cases that are often missed using conventional diagnostic approaches.
Summary
This study confirms the effectiveness of machine learning methods for the early detection of keratoconus using corneal topography and biomechanical data derived from the Pentacam device. Among the evaluated classifiers, the Random Forest (RF) model achieved the highest diagnostic accuracy (96%), outperforming other models, which reached between 91% and 95%. The best results were obtained with the Belin/Ambrósio Enhanced Ectasia Display (BAD Display) dataset, where RF achieved an accuracy of 98%.
Further evaluation showed that both RF and Gradient Boosting (GB) models maintained strong performance. On an independent test set, accuracy decreased to 90%, likely due to the relatively small sample size and high dimensionality of the feature space. Nonetheless, the models retained high precision (92%) and specificity (93%), confirming their clinical reliability. The presence of some false negatives indicates room for improvement in capturing all keratoconus cases.
Feature importance analysis revealed that RF emphasized corneal topography, elevation, and pachymetry, while GB incorporated a broader range of features, including corneal power maps. A comparison between machine-selected and clinician-identified diagnostic parameters showed substantial overlap, with key indicators including ART metrics, BAD-D subindices, BFS Front 8 mm, and topographic irregularity indices (CKI, IHD, ISV, IVA, KI).
In conclusion, the findings highlight Random Forest as a robust and interpretable model for keratoconus detection, with results that are consistent with clinical expertise. Future research will expand on this foundation by incorporating deep learning methods and dynamic biomechanical data from the Corvis device to support more comprehensive and scalable diagnostic solutions.
Acknowledgements
This research was supported by the Polish Ministry of Education and Science, grant no. MEiN/2023/DPI/2194, project title: Lublin Digital Union—Use of Digital Solutions and Artificial Intelligence in Medicine—Research Project.
Author contributions
A.S., K.J., and A.P. designed the study. T.C. and D.W. prepared and conducted the study and described the study group. M.M. and D.G. acquired and prepared the data. J.G., A.S., and R.K. selected the artificial intelligence models and conducted their training and validation. A.S., R.K., D.W., and J.G. analyzed the results. K.E.J. created the charts and translated the text. R.R. and K.J. administered the project and provided substantive supervision. All authors reviewed the manuscript.
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to patient privacy and ethical restrictions but are available from the corresponding author on reasonable request.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Lucas, S. E. M. & Burdon, K. P. Genetic and environmental risk factors for keratoconus. Annu. Rev. Vis. Sci.6, 25–46 (2020). [DOI] [PubMed] [Google Scholar]
- 2.Torres Netto, E. A. et al. Prevalence of keratoconus in paediatric patients in riyadh, Saudi Arabia. Br. J. Ophthalmol.102, 1436–1441 (2018). [DOI] [PubMed] [Google Scholar]
- 3.Santodomingo-Rubido, J. et al. Keratoconus: An updated review. Contact Lens Anterior Eye. 45, 101559 (2022). [DOI] [PubMed] [Google Scholar]
- 4.Naderan, M., Jahanrad, A. & Farjadnia, M. Clinical biomicroscopy and retinoscopy findings of keratoconus in a middle Eastern population. Clin. Experimental Optometry. 101, 46–51 (2018). [DOI] [PubMed] [Google Scholar]
- 5.Masiwa, L. E. & Moodley, V. A review of corneal imaging methods for the early diagnosis of pre-clinical keratoconus. J. Optometry. 13, 269–275 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Zhang, X., Munir, S. Z., Karim, S., Munir, W. & S. A. & M. A review of imaging modalities for detecting early keratoconus. Eye35, 173–187 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Said, O. M., Kamal, M., Tawfik, S. & Saif, A. T. S. Comparison of corneal measurements in normal and keratoconus eyes using anterior segment optical coherence tomography (AS-OCT) and Pentacam HR topographer. BMC Ophthalmol.23, 194 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Cavas-Martínez, F., De La Cruz Sánchez, E., Nieto Martínez, J., Fernández Cañavate, F. J. & Fernández-Pacheco, D. G. Corneal topography in keratoconus: State of the Art. Eye Vis.3, 5 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Smadja, D. et al. Detection of subclinical keratoconus using an automated decision tree classification. Am. J. Ophthalmol.156, 237–246e1 (2013). [DOI] [PubMed] [Google Scholar]
- 10.Ambrósio, R. et al. Novel pachymetric parameters based on corneal tomography for diagnosing keratoconus. J. Refract. Surg.27, 753–758 (2011). [DOI] [PubMed] [Google Scholar]
- 11.Silverman, R. H. et al. Epithelial remodeling as basis for machine-based identification of keratoconus. Invest. Ophthalmol. Vis. Sci.55, 1580 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Asam, J. S. et al. in In High Resolution Imaging in Microscopy and Ophthalmology. 285–299 (eds Bille, J. F.) (Springer International Publishing, 2019). 10.1007/978-3-030-16638-0_13 [PubMed]
- 13.Li, Y., Gokul, A., McGhee, C. & Ziaei, M. Repeatability of corneal and epithelial thickness measurements with anterior segment optical coherence tomography in keratoconus. PLoS ONE. 16, e0248350 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Girard, M. J. A. et al. Translating ocular biomechanics into clinical practice: Current state and future prospects. Curr. Eye Res.40, 1–18 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Scarcelli, G., Besner, S., Pineda, R. & Yun, S. H. Biomechanical characterization of keratoconus Corneas ex vivo with Brillouin microscopy. Invest. Ophthalmol. Vis. Sci.55, 4490 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Maile, H. et al. Machine learning algorithms to detect subclinical keratoconus: Systematic review. JMIR Med. Inf.9, e27363 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Mustapha, A., Mohamed, L., Hamid, H. & Ali, K. Machine learning techniques in keratoconus classification: A systematic review. IJACSA 14, (2023).
- 18.Shi, C. et al. Machine learning helps improve diagnostic ability of subclinical keratoconus using Scheimpflug and OCT imaging modalities. Eye Vis.7, 48 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.Lu, N. J. et al. Combining spectral-domain OCT and air-puff tonometry analysis to diagnose keratoconus. J. Refract. Surg.38, 374–380 (2022). [DOI] [PubMed] [Google Scholar]
- 20.Yang, L. et al. Diagnosis of forme fruste keratoconus using Corvis ST sequences with digital image correlation and machine learning. Bioengineering11, 429 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Awwad, S. T. et al. Thickness speed progression index: Machine learning approach for keratoconus detection. Am. J. Ophthalmol.271, 188–201 (2025). [DOI] [PubMed] [Google Scholar]
- 22.Bodmer, N. S. et al. Deep learning models used in the diagnostic workup of keratoconus: A systematic review and exploratory meta-analysis. Cornea43, 916–931 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Aatila, M., Lachgar, M., Hamid, H. & Kartit, A. Keratoconus severity classification using features selection and machine learning algorithms. Comput. Math. Methods Med.2021, 1–26 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Agharezaei, Z. et al. Computer-aided diagnosis of keratoconus through VAE-augmented images using deep learning. Sci. Rep.13, 20586 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Al-Timemy, A. H. et al. A hybrid deep learning construct for detecting keratoconus from corneal maps. Trans. Vis. Sci. Tech.10, 16 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Wan, Q. et al. Deep learning-based automatic diagnosis of keratoconus with corneal endothelium image. Ophthalmol. Ther.12, 3047–3065 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Chen, X. et al. Keratoconus detection of changes using deep learning of colour-coded maps. BMJ Open. Ophth. 6, e000824 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Cox, D. R. The regression analysis of binary sequences. J. Royal Stat. Soc. Ser. B Stat. Methodol.20, 215–232 (1958). [Google Scholar]
- 29.Mourgues, E., Saunier, V., Smadja, D., Touboul, D. & Saunier, V. Forme fruste keratoconus detection with OCT corneal topography using artificial intelligence algorithms. J. Cataract Refract. Surg.50, 1247–1253 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Quinlan, J. R. Induction of decision trees. Mach. Learn. 1, 81–106 (1986).
- 31. Song, P., Ren, S., Liu, Y., Li, P. & Zeng, Q. Detection of subclinical keratoconus using a novel combined tomographic and biomechanical model based on an automated decision tree. Sci. Rep. 12, 5316 (2022).
- 32. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
- 33. Ambrósio, R. et al. Optimized artificial intelligence for enhanced ectasia detection using Scheimpflug-based corneal tomography and biomechanical data. Am. J. Ophthalmol. 251, 126–142 (2023).
- 34. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
- 35. Arbelaez, M. C., Versaci, F., Vestri, G., Barboni, P. & Savini, G. Use of a support vector machine for keratoconus and subclinical keratoconus detection by topographic and tomographic data. Ophthalmology 119, 2231–2238 (2012).
- 36. Fix, E. & Hodges, J. L. Discriminatory analysis. Nonparametric discrimination: Consistency properties. Int. Stat. Rev. 57, 238–247 (1989).
- 37. Chaari, A. et al. Automated feature selection for early keratoconus screening optimization. Biomed. Phys. Eng. Express 11, 015039 (2025).
- 38. Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001).
- 39. Xu, Z. et al. Evaluation of artificial intelligence models for the detection of asymmetric keratoconus eyes using Scheimpflug tomography. Clin. Exp. Ophthalmol. 50, 714–723 (2022).
- 40. Robbins, H. & Monro, S. A stochastic approximation method. Ann. Math. Stat. 22, 400–407 (1951).
- 41. Mustapha, A., Mohamed, L. & Ali, K. An overview of gradient descent algorithm optimization in machine learning: Application in the ophthalmology field. In Smart Applications and Data Analysis (eds Hamlich, M., Bellatreche, L., Mondal, A. & Ordonez, C.), Vol. 1207, 349–359 (Springer International Publishing, Cham, 2020).
Associated Data
Data Availability Statement
The datasets generated and/or analyzed during the current study are not publicly available due to patient privacy and ethical restrictions but are available from the corresponding author on reasonable request.