2024 Nov 6;103(3):272–280. doi: 10.1111/aos.16788

Predicting the risk of treatment‐requiring retinopathy of prematurity in preterm infants in Greece. External validation of DIGIROP prognostic models

Stella Moutzouri 1, Aldina Pivodic 2,3, Anna‐Bettina Haidich 4, Aikaterini K Seliniotaki 1, Maria Lithoxopoulou 5, Christos Tsakalidis 5, Ann Hellström 2,3, Nikolaos Ziakas 1, Asimina Mataftsi 1
PMCID: PMC11986392  PMID: 39503477

Abstract

Purpose

To assess the predictive performance of DIGIROP‐v1.0 models in identifying treatment‐requiring ROP among infants undergoing ROP screening at a tertiary neonatal intensive care unit in Greece.

Methods

Retrospective cohort analysis of 640 consecutive screened preterm infants with gestational age (GA) 24 0/7 to 30 6/7 weeks and known ROP outcome in the 2nd Neonatology Department of Aristotle University of Thessaloniki (2009–2021). The primary outcome was the development of type 1 ROP according to the Early Treatment of ROP criteria or treatment based on the ophthalmologist's judgement. Sensitivity, specificity, area under the curve (AUC) with corresponding 95% confidence intervals (CI) and calibration plots for the DIGIROP‐v1.0 models were displayed.

Results

The DIGIROP‐Birth‐v1.0 model correctly identified 35/43 treated infants (sensitivity 81.4% [95% CI, 66.6%–91.6%], specificity 61.5% [95% CI, 57.4%–65.4%], AUC 0.82 [95% CI, 0.75–0.90]). During postnatal weeks 6–14, the sensitivity of the DIGIROP‐Screen‐v1.0 model ranged from 82.6% to 100%. Eleven treated infants were missed by the models. All of them had severe comorbidities (congenital malformation(s), syndrome(s), hydrocephalus or history of intestinal surgery) and met the criteria for screening according to both the DIGIROP‐v1.0 models' recommendations and our unit's routine standards.

Conclusion

The DIGIROP‐v1.0 models resulted in lower sensitivity and higher specificity in this Greek cohort compared with the Swedish development group. Despite higher GA and BW, infants in our cohort had a higher incidence of treated ROP than in Sweden, resulting in an under‐estimation of their risk for treatment‐requiring ROP. Further validation of the DIGIROP‐v2.0 models and potential adjustment are recommended to maximize generalizability in populations with different characteristics.

Keywords: paediatric ophthalmology, prognosis, prognostic model(s), retina, retinopathy of prematurity

1. INTRODUCTION

Retinopathy of prematurity (ROP) is a neurovascular disease of preterm infants and is considered a major cause of preventable childhood blindness worldwide (Blencowe et al., 2013; Zhang et al., 2022). Currently established screening criteria in most countries rely on characteristics such as gestational age (GA) and birth weight (BW) to identify infants at risk for developing sight‐threatening disease (Kościółek et al., 2022). However, these criteria often result in examining too many infants, of whom only a small percentage require treatment for ROP (Larsen et al., 2021; Moutzouri et al., 2021).

In our unit, infants with GA < 32 weeks and/or BW < 1501 g, as well as infants with greater GA and BW but with comorbidities (e.g. sepsis, prolonged oxygen supplementation), are routinely scheduled for ROP screening examinations. Screening commences at 30 to 31 weeks postmenstrual age or 4 weeks after birth, whichever is later, and is repeated weekly or biweekly, according to the disease severity, until treatment‐requiring ROP (TR‐ROP) develops or the retina is fully vascularized (Mataftsi et al., 2020; Wilkinson et al., 2008).

A prediction model, which would individually predict the risk of developing TR‐ROP, would allow clinicians to limit the number of unnecessary examinations, directing screening only to infants who are at high risk for sight‐threatening ROP. This approach would not only enhance infant well‐being but also optimize resource allocation.

In recent years, a variety of prediction models for ROP have been published (Athikarisamy et al., 2021). These models aim to calculate the individual risk for developing TR‐ROP in preterm infants by incorporating various factors such as GA, BW, postnatal weight gain, history of hydrocephalus, blood transfusion or prolonged oxygen supplementation (Binenbaum et al., 2012, 2018; Cao et al., 2016; Eckert et al., 2012; Hellström et al., 2009; McCauley et al., 2018). External validation of these models is necessary before their application in other populations, as the performance of the model depends greatly on the medical setting in which it is applied (Binenbaum et al., 2017; Choi et al., 2013; Piermarocchi et al., 2017; Wu et al., 2012).

Recently, Pivodic et al. developed a series of prediction models for infants born at 24–30 weeks of gestation that do not rely on neonatal data. The first one (DIGIROP‐Birth‐v1.0) is a risk estimating model for TR‐ROP that is based only on birth characteristics (GA, BW, sex) (Pivodic et al., 2020). DIGIROP‐Birth‐v1.0 was developed using Poisson regression for time‐varying data and was validated both internally and externally in a Swedish cohort (temporal validation) as well as a European and a US cohort (geographical validations) (Pivodic et al., 2020). Independent validations of the model were also performed in a Portuguese and a Chinese cohort (Almeida et al., 2022; Chen et al., 2021).

The second model (DIGIROP‐Screen‐v1.0) comprises a set of models that incorporate time of first ROP diagnosis as an additional factor to assess the risk of TR‐ROP over time from 6 to 14 postnatal weeks. DIGIROP‐Screen‐v1.0 models aim to identify infants at low risk for TR‐ROP who can be safely released from ROP screening examinations at an earlier time point, that is, either at birth or between the 6th and 14th postnatal weeks. Internal and external validations of the DIGIROP‐Screen‐v1.0 set of models were performed on a Swedish cohort (temporal validation), and on a German and two US cohorts (geographical validation) in the original publication (Pivodic et al., 2021). Another external validation of DIGIROP‐Screen‐v1.0 models was performed by the same authors in a different Swedish cohort (Pivodic et al., 2022).

DIGIROP models were found to achieve similar or higher discriminative ability when compared with four other prediction models (Weight, Insulin‐like growth factor‐1, Neonatal, ROP [WINROP], Colorado‐ROP [CO‐ROP], Children's Hospital of Philadelphia‐ROP [CHOP‐ROP], Omaha‐ROP [OMA‐ROP]) in a cohort of preterm infants in the United States (Pivodic et al., 2020, 2021).

The aim of this study was to externally validate the DIGIROP‐v1.0 models in a cohort of preterm infants undergoing screening for ROP in a tertiary neonatal intensive care unit in Greece.

2. MATERIALS AND METHODS

This study was conducted in accordance with the tenets of the Declaration of Helsinki and followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis statement (TRIPOD). The TRIPOD checklist is presented in Table S1. Approval of the study was granted by the Institutional Bioethics Committee of Papageorgiou General Hospital of Thessaloniki (344/14‐07‐2021) and by the Swedish Ethical Review Authority (Dnr 2019‐02321, amendment for validation Dnr 2021‐05222).

2.1. Participants

The study population comprised all infants with GA between 24 0/7 and 30 6/7 weeks that were screened for ROP in the 2nd Neonatology Department of Aristotle University of Thessaloniki from 1 January 2009 to 31 December 2021. Only infants with a known ROP outcome (i.e. type 1 or 2 ROP or TR‐ROP in either eye, or auto‐regressed ROP without reaching criteria for treatment, or no ROP in both eyes) were included in the study. Screened infants with GA under 24 weeks (n = 3, 3 treated) or over 31 weeks (n = 483, 0 treated) were excluded from the analysis, because DIGIROP‐v1.0 models are not intended for use in this subpopulation. There were no missing data for any of the subjects. The Swedish cohort, as previously described, was used for comparison (Pivodic et al., 2021).

2.2. Outcome

The primary outcome was the development of type 1 ROP according to the Early Treatment of ROP criteria (Good & Early Treatment for Retinopathy of Prematurity Cooperative Group, 2004) or treatment based on the ophthalmologist's judgement. ROP was classified according to International Classification of ROP guidelines (Chiang et al., 2021).

2.3. Predictors

Data extracted from medical records were: GA (based on foetal ultrasonography), BW, sex, ROP characteristics (most central zone and maximum stage of ROP, presence of plus disease in either eye), type of treatment, age at first sign of ROP and at treatment. BW Standard Deviation Score (BWSDS) was calculated based on gender‐specific growth charts (Niklasson & Albertsson‐Wikland, 2008). Additional data on comorbidities (i.e. congenital malformation(s), syndrome(s), hydrocephalus, history of intestinal surgery, sepsis, respiratory distress syndrome, bronchopulmonary dysplasia and intraventricular haemorrhage) and parenteral nutrition duration (PND) were collected post hoc for infants who were treated for ROP and were not classified as high risk by the DIGIROP‐v1.0 models.

2.4. Estimated risk predictions

DIGIROP‐Birth‐v1.0 risk estimates were calculated using the Poisson regression model developed for the DIGIROP‐Birth‐v1.0 prediction model (Pivodic et al., 2020). DIGIROP‐Screen‐v1.0 risk estimates were calculated using the logistic regression models continuously developed over postnatal ages 6 to 14 weeks for the DIGIROP‐Screen‐v1.0 models (Pivodic et al., 2021).

2.5. Statistical analysis

Continuous variables were described by mean, standard deviation (SD), median, range and categorical variables with frequency and percentage. For tests between two groups Fisher's exact test was used for dichotomous variables, Mantel–Haenszel Chi‐square trend test for ordered categorical variables, and Mann–Whitney U‐test for continuous variables.
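As an illustration of the first of these tests, the sketch below computes a two‐sided Fisher's exact p‐value for a 2×2 table in plain Python. The counts are the ROP treatment row of Table 1 (287 of 6991 Swedish and 43 of 640 Greek infants treated). The function names are hypothetical, and the two‐sided rule used here (summing all tables that are no more probable than the observed one) is an assumption about the convention, so the result need not match the SAS output digit for digit.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Conditions on both margins and sums the hypergeometric probabilities
    of every table that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    log_denominator = log_comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given the margins.
        return exp(log_comb(row1, x) + log_comb(row2, col1 - x) - log_denominator)

    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(p for p in map(prob, range(x_min, x_max + 1))
                if p <= p_observed * tolerance)
    return min(1.0, total)


# Rows: Swedish development cohort, Greek validation cohort.
# Columns: treated for ROP, not treated (counts taken from Table 1).
p_treatment = fisher_exact_two_sided(287, 6991 - 287, 43, 640 - 43)
print(f"ROP treatment, Swedish vs. Greek cohort: p = {p_treatment:.4f}")
```

For the treatment counts, the result should be of the same order as the p‐value of 0.0031 reported in Table 1.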

Sensitivity, specificity, cumulative specificity, positive predictive value, negative predictive value, model accuracy, area under the curve (AUC), with corresponding 95% confidence intervals (CI) using binomial distribution were displayed for validation purpose, along with the calibration plots.
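A minimal sketch of how such measures and their exact binomial (Clopper–Pearson) intervals can be computed is given below. The 2×2 counts are not stated as such in the paper; they are derived here from Tables 1 and 2 (265 infants flagged for screening, 375 not flagged, 35 of 43 treated infants flagged), and the function names are illustrative. The original analysis was run in SAS, so this is an independent re‐computation under those assumptions, not the authors' code.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def _bisect(func, target, increasing):
    """Solve func(p) = target on (0, 1) for a monotone func by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson (exact binomial) interval for x events out of n."""
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


def performance(tp, fn, tn, fp):
    """Point estimate and exact 95% CI for each measure of a 2x2 table."""
    counts = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
        "accuracy": (tp + tn, tp + fn + tn + fp),
    }
    return {name: (x / n, exact_ci(x, n)) for name, (x, n) in counts.items()}


# Counts derived from Tables 1 and 2 (an assumption of this sketch):
# 35 of 43 treated infants flagged; 265 flagged and 375 not flagged overall.
results = performance(tp=35, fn=8, tn=375 - 8, fp=265 - 35)
for name, (estimate, (low, high)) in results.items():
    print(f"{name}: {100 * estimate:.1f}% ({100 * low:.1f}-{100 * high:.1f})")

# Lower limit of the interval if all 43 treated infants were detected.
print(round(exact_ci(43, 43)[0], 2))
```

With these counts the point estimates agree with the DIGIROP‐Birth row of Table 2 (81.4%, 61.5%, 13.2%, 97.9% and 62.8%). When all 43 treated infants are detected, the lower limit solves p^43 = 0.025, which gives about 0.92.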

The number of treated infants (n = 43) allows for a lower 95% CI for 100% sensitivity of 0.92. The AUC, which is the standard description of discrimination for a binary outcome, displays the trade‐off between specificity (the probability that a not treated infant is identified to be at low risk for TR‐ROP) and sensitivity (the probability that a treated infant is identified to be at high risk for TR‐ROP).
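One way to read the AUC is as the probability that a randomly chosen treated infant receives a higher risk estimate than a randomly chosen untreated infant, with ties counted as one half. The sketch below computes it that way. The risk values are made up for illustration and are not study data, and the function name is hypothetical.

```python
def auc_from_scores(treated_scores, untreated_scores):
    """AUC as the share of (treated, untreated) pairs ranked correctly.

    A pair counts 1 if the treated infant has the higher risk estimate,
    0.5 if the two estimates are tied, and 0 otherwise.
    """
    wins = 0.0
    for treated in treated_scores:
        for untreated in untreated_scores:
            if treated > untreated:
                wins += 1.0
            elif treated == untreated:
                wins += 0.5
    return wins / (len(treated_scores) * len(untreated_scores))


# Hypothetical risk estimates, for illustration only.
treated = [0.30, 0.12, 0.05]
untreated = [0.01, 0.02, 0.08, 0.003]
print(auc_from_scores(treated, untreated))
```

In this toy example 11 of the 12 pairs are ordered correctly, so the AUC is 11/12. The pairwise loop is quadratic in the number of infants, which is acceptable for a cohort of this size; a rank‐based formula would be preferable for much larger data sets.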

An AUC of 1.0 indicates perfect discrimination. AUC values above 0.70 are generally considered acceptable, while a value of 0.50 indicates no discriminative capacity. The required values of sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) depend on the application. For instance, a sensitivity of 100% is required for models such as DIGIROP‐v1.0, so that no infant who could develop TR‐ROP is missed, as that could lead to blindness. As this is a validation study, we hoped to obtain results similar to those of the original study, that is, an AUC of about 0.90, a sensitivity of 100% and a specificity of about 50% (Pivodic et al., 2021).

Logistic regression was applied for a post hoc analysis of the interaction between sex and cohort (Greek validation and Swedish development cohorts) with respect to ROP treatment. The terms included in the model were GA, sex, cohort, GA*sex, GA*cohort and sex*cohort.
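Written out, the post hoc interaction model described above takes roughly the following form. The coefficient symbols and the coding of sex and cohort as 0/1 indicators are assumptions of this sketch, since the paper lists only the terms.

```latex
\[
\operatorname{logit} \Pr(\text{ROP treatment})
  = \beta_0
  + \beta_1\,\mathrm{GA}
  + \beta_2\,\mathrm{sex}
  + \beta_3\,\mathrm{cohort}
  + \beta_4\,(\mathrm{GA}\times\mathrm{sex})
  + \beta_5\,(\mathrm{GA}\times\mathrm{cohort})
  + \beta_6\,(\mathrm{sex}\times\mathrm{cohort})
\]
```

Under this parameterization, the test of the sex by cohort interaction corresponds to the null hypothesis that the last coefficient is zero.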

All tests were two‐tailed and conducted at the 0.05 significance level. All analyses were performed in SAS software version 9.4 (SAS Institute).

3. RESULTS

3.1. Participants

A total of 640 preterm infants (49.4% females) with GA between 24 0/7 and 30 6/7 weeks were examined for ROP in our unit and were included in this external validation study (Figure 1). Median (range) GA was 29.0 (24.0–30.9) weeks, median (range) BW was 1140 (405–2540) grams and median (range) BWSDS was −0.9 (−7.4–5.2). Median (range) GA was similar between female and male infants (29.0 [24.0–30.9] weeks vs. 29.1 [24.0–30.9] weeks; p = 0.29, Mann–Whitney U‐test), whereas BW was lower in female infants (1070 [470–2190] grams vs. 1225 [405–2540] grams; p < 0.0001, Mann–Whitney U‐test). Any ROP was detected in 167 (26.1%) infants, in 49 (29.3%) of whom it was detected at the first eye examination. The median (range) postnatal age (PNA) at first diagnosis of ROP was 7.0 (2.1–13.1) weeks.

FIGURE 1. Study flow chart of the Greek validation cohort (January 2009–December 2021). ROP, Retinopathy of Prematurity; GA, Gestational Age.

Forty‐two (6.6%) infants developed type 1 ROP and received treatment (laser photocoagulation, n = 35; anti‐VEGF injection, n = 7). One infant with type 2 ROP received one intravitreal anti‐VEGF injection based on the treating ophthalmologist's clinical judgement due to difficulty in attending follow‐up visits. Among treated infants, the median (range) GA was similar between female and male infants (25.9 [24.0–30.4] weeks vs. 26.4 [24.0–30.0] weeks; p = 0.60), whereas the BW was lower in females (700 [470–1170] grams vs. 865 [530–1485] grams; p = 0.02).

Treatment for ROP occurred at a median (range) of 10.1 (6.0–15.0) weeks postnatally, after a median (range) of 12 (7–24) eye examinations. Untreated infants were examined a median (range) of 3 (1–15) times in total.

Infants' characteristics for the validation cohort compared with the Swedish development cohort are presented in Table 1 and for treated and not treated infants in the validation cohort in Table S2. The estimated probabilities for ROP treatment were higher in the Greek cohort than in the Swedish cohort, with Greek females showing the highest estimates, revealing a significant interaction between sex and cohort (p = 0.04) (Figure S1).

TABLE 1. Infants' characteristics for the Swedish cohort 2007–2018 used for development of DIGIROP‐v1.0 models and the Greek cohort 2009–2021 used for validation.

| Variable | SWEDROP development cohort 2007–2018 (N = 6991) | Greek validation cohort 2009–2021 (N = 640) | p‐value (a) |
| --- | --- | --- | --- |
| Sex, No. (%) | | | |
| Boy | 3833 (54.8) | 324 (50.6) | 0.042 |
| Girl | 3158 (45.2) | 316 (49.4) | |
| Gestational age at birth, weeks | | | |
| Mean (SD) | 28.3 (1.9) | 28.7 (1.7) | <0.0001 |
| Median (range) | 28.6 (24.0–30.9) | 29.0 (24.0–30.9) | |
| Gestational age (full weeks), No. (%) | | | |
| 24 | 427 (6.1) | 13 (2.0) | <0.0001 |
| 25 | 597 (8.5) | 40 (6.3) | |
| 26 | 781 (11.2) | 46 (7.2) | |
| 27 | 914 (13.1) | 74 (11.6) | |
| 28 | 1141 (16.3) | 119 (18.6) | |
| 29 | 1419 (20.3) | 149 (23.3) | |
| 30 | 1712 (24.5) | 199 (31.1) | |
| Birth weight, grams | | | |
| Mean (SD) | 1146 (339) | 1170 (328) | 0.08 |
| Median (range) | 1135 (307–3245) | 1140 (405–2540) | |
| Birth weight SDS | | | |
| Mean (SD) | −1.0 (1.4) | −1.2 (1.6) | 0.0011 |
| Median (range) | −0.8 (−8.6–4.9) | −0.9 (−7.4–5.2) | |
| Maximum ROP stage, No. (%) | | | |
| No ROP | 4965 (71.0) | 473 (73.9) | 0.035 |
| ROP stage 1 | 661 (9.5) | 75 (11.7) | |
| ROP stage 2 | 775 (11.1) | 41 (6.4) | |
| ROP stage 3 | 588 (8.4) | 51 (8.0) | |
| ROP stage 4B | 2 (0.0) | 0 (0.0) | |
| Any ROP, No. (%) | 2026 (29.0) | 167 (26.1) | 0.13 |
| Postnatal age at first ROP diagnosis, weeks | | | |
| Mean (SD) | 8.4 (2.2) | 7.0 (2.2) | <0.0001 |
| Median (range) | 8.1 (3.4–18.7) | 7.0 (2.1–13.1) | |
| ROP treatment, No. (%) | 287 (4.1) | 43 (6.7) | 0.0031 |
| Postnatal age at first ROP treatment, weeks | | | |
| Mean (SD) | 12.8 (2.8) | 10.1 (2.1) | <0.0001 |
| Median (range) | 12.4 (7.0–21.9) | 10.1 (6.0–15.0) | |
| DIGIROP‐Birth risk estimate | | | |
| Mean (SD) | 0.041 (0.078) | 0.024 (0.057) | <0.0001 |
| Median (range) | 0.005 (0.000–0.720) | 0.003 (0.000–0.509) | |
| DIGIROP‐Birth decision support tool, No. (%) | | | |
| No screening | 3562 (51.0) | 375 (58.6) | 0.0002 |
| Screening | 3429 (49.0) | 265 (41.4) | |

Abbreviations: DIGIROP, Digital ROP; ROP, retinopathy of prematurity; SD, standard deviation; SDS, SD score; SWEDROP, Swedish National Register for ROP.

a: For tests between groups, Fisher's exact test was used for dichotomous variables, Mantel–Haenszel Chi‐square trend test for ordered categorical variables, and Mann–Whitney U‐test for continuous variables.

3.2. Model performance

Performance measures for DIGIROP‐v1.0 models are presented in Table 2 and individual risk estimates and outcome from decision support tool in Figure S2A–I.

TABLE 2. Sensitivity, specificity, cumulative specificity, positive predictive value, negative predictive value, model accuracy, area under the receiver operating characteristic curve with 95% confidence interval for DIGIROP‐Birth‐v1.0 and DIGIROP‐Screen‐v1.0 (Greek validation cohort 2009–2021).

| Model and timepoint | Sensitivity, n/N | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Cumulative specificity, % (95% CI) | Positive predictive value, % (95% CI) | Negative predictive value, % (95% CI) | Model accuracy, % (95% CI) | Area under the ROC curve, AUC (95% CI) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DIGIROP‐Birth | 35/43 (a, b) | 81.4 (66.6–91.6) | 61.5 (57.4–65.4) | 61.5 (57.4–65.4) | 13.2 (9.4–17.9) | 97.9 (95.8–99.1) | 62.8 (58.9–66.6) | 0.82 (0.75–0.90) |
| DIGIROP‐Screen PNA6w | 36/43 (b, c) | 83.7 (69.3–93.2) | 59.8 (55.7–63.8) | 61.6 (57.6–65.6) | 13.0 (9.3–17.6) | 98.1 (96.1–99.2) | 61.4 (57.5–65.2) | 0.82 (0.76–0.89) |
| DIGIROP‐Screen PNA7w | 36/41 (b, d) | 87.8 (73.8–95.9) | 60.0 (55.9–63.9) | 63.3 (59.3–67.2) | 13.1 (9.3–17.7) | 98.6 (96.8–99.6) | 61.8 (57.9–65.5) | 0.85 (0.79–0.91) |
| DIGIROP‐Screen PNA8w | 32/36 (b, e) | 88.9 (73.9–96.9) | 68.5 (64.6–72.2) | 71.2 (67.4–74.8) | 14.5 (10.2–19.9) | 99.0 (97.5–99.7) | 69.7 (65.9–73.2) | 0.89 (0.84–0.94) |
| DIGIROP‐Screen PNA9w | 24/29 (b, f) | 82.8 (64.2–94.2) | 73.2 (69.5–76.7) | 76.5 (72.9–79.9) | 13.0 (8.5–18.8) | 98.9 (97.4–99.6) | 73.6 (70.0–77.1) | 0.88 (0.82–0.94) |
| DIGIROP‐Screen PNA10w | 19/23 (b, g) | 82.6 (61.2–95.0) | 78.1 (74.5–81.3) | 81.4 (78.0–84.4) | 12.7 (7.8–19.1) | 99.1 (97.8–99.8) | 78.2 (74.8–81.4) | 0.91 (0.86–0.96) |
| DIGIROP‐Screen PNA11w | 15/17 (b, h) | 88.2 (63.6–98.5) | 79.9 (76.5–83.0) | 82.9 (79.7–85.8) | 11.1 (6.4–17.7) | 99.6 (98.5–99.9) | 80.1 (76.8–83.2) | 0.94 (0.90–0.98) |
| DIGIROP‐Screen PNA12w | 8/9 (b, i) | 88.9 (51.8–99.7) | 80.4 (77.0–83.5) | 83.1 (79.8–86.0) | 6.4 (2.8–12.2) | 99.8 (98.8–100.0) | 80.5 (77.1–83.6) | 0.93 (0.87–0.98) |
| DIGIROP‐Screen PNA13w | 3/3 | 100.0 (29.2–100.0) | 86.8 (83.8–89.4) | 87.6 (84.7–90.1) | 3.7 (0.8–10.3) | 100.0 (99.3–100.0) | 86.8 (83.9–89.4) | 0.97 (0.95–0.99) |
| DIGIROP‐Screen PNA14w | 2/2 | 100.0 (15.8–100.0) | 85.9 (82.9–88.6) | 87.6 (84.7–90.1) | 2.3 (0.3–8.1) | 100.0 (99.3–100.0) | 86.0 (82.9–88.7) | 0.97 (0.95–0.99) |

Abbreviations: AUC, area under the curve; CI, confidence interval; DIGIROP, digital retinopathy of prematurity; PNA, postnatal age; w, weeks; ROC, receiver operating characteristics.

a: Eight infants missed by DIGIROP‐Birth: history of hydrocephalus (n = 2); surgical necrotizing enterocolitis (n = 3); syndrome (n = 1); other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 2).

b: All missed infants had parenteral nutrition duration of 14 days or more.

c: Seven infants missed by DIGIROP‐Screen at PNA 6 weeks: same infants as for DIGIROP‐Birth [history of hydrocephalus (n = 2); surgical necrotizing enterocolitis (n = 3); syndrome (n = 1); respiratory distress syndrome (n = 1)].

d: Five infants missed by DIGIROP‐Screen at PNA 7 weeks: same infants as for DIGIROP‐Birth [history of hydrocephalus (n = 2); surgical necrotizing enterocolitis (n = 2); syndrome (n = 1)].

e: Four infants missed by DIGIROP‐Screen at PNA 8 weeks: two infants as for DIGIROP‐Birth [surgical necrotizing enterocolitis (n = 1); other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 1)] and two additional infants [other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 2)].

f: Five infants missed by DIGIROP‐Screen at PNA 9 weeks: three infants as for DIGIROP‐Birth [history of hydrocephalus (n = 1); surgical necrotizing enterocolitis (n = 1); other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 1)] and the two as for DIGIROP‐Screen at PNA 8 weeks.

g: Four infants missed by DIGIROP‐Screen at PNA 10 weeks: two infants as for DIGIROP‐Birth [history of hydrocephalus (n = 1); surgical necrotizing enterocolitis (n = 1)], one infant as for DIGIROP‐Screen at PNA 8 weeks [other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 1)] and one additional infant [other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 1)].

h: Two infants missed by DIGIROP‐Screen at PNA 11 weeks: one infant as for DIGIROP‐Birth [history of hydrocephalus (n = 1)] and one infant as for DIGIROP‐Screen at PNA 8 weeks [other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 1)].

i: One infant missed by DIGIROP‐Screen at PNA 12 weeks: one infant as for DIGIROP‐Birth [history of hydrocephalus (n = 1)].

3.2.1. DIGIROP‐Birth‐v1.0 model

The DIGIROP‐Birth‐v1.0 model correctly identified 35 of 43 treated infants, displaying lower sensitivity (81.4%; 95% CI, 66.6%–91.6%) than in the Swedish development cohort (Pivodic et al., 2020). The higher incidence of treatment‐requiring ROP among girls, and overall across GA, in Greek infants (as shown in Figure S1) results in an under‐estimation of their risk for ROP treatment, which is evident in the calibration plot (Figure S3). Seven of the treated infants missed by the model had a history of hydrocephalus (n = 2), surgical necrotizing enterocolitis (n = 3), a syndrome (n = 1) or other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage) (n = 1). These infants should have been scheduled for ROP screening based on both the current criteria of our unit and the recommendations of DIGIROP‐v1.0 models to routinely examine all infants with congenital malformations/syndromes, hydrocephalus, or history of intestinal surgery. The one remaining infant, who was flagged by the model as not needing screening at birth, was unable to attend follow‐up visits, so treatment was applied for type 2 disease. All missed infants had PND of 14 days or more. No treated infant with a GA < 27 weeks was missed by the DIGIROP‐Birth‐v1.0 model (Figure 2). Specificity was estimated to be 61.5% (95% CI, 57.4%–65.4%), and AUC 0.82 (95% CI, 0.75–0.90). Values of specificity for the DIGIROP‐Birth‐v1.0 model per GA are displayed in Figure 3a.

FIGURE 2. Validation of DIGIROP‐Birth‐v1.0 prediction model on Greek data (2009–2021). History of comorbidity includes history of hydrocephalus or surgical necrotizing enterocolitis or syndrome or other severe comorbidities (e.g. respiratory distress syndrome, bronchopulmonary dysplasia, intraventricular haemorrhage). ROP, Retinopathy of Prematurity.

FIGURE 3. Specificity and cumulative specificity for DIGIROP decision support tool in the Greek validation cohort (2009–2021) (a) by gestational age based on DIGIROP‐Birth‐v1.0, (b) by postnatal age for 6–14 weeks of life based on DIGIROP‐Birth‐v1.0 and DIGIROP‐Screen‐v1.0.

3.2.2. DIGIROP‐Screen‐v1.0 models

The sensitivity of DIGIROP‐Screen‐v1.0 models ranged between 82.6% and 100% during postnatal weeks 6–14. Three additional treated infants, all with severe comorbidities and PND over 14 days, would have been incorrectly released from screening before TR‐ROP developed. Cumulative specificity increased from 61.6% (95% CI, 57.6%–65.6%) in postnatal week 6 to 87.6% (95% CI, 84.7%–90.1%) in postnatal week 14 (Table 2). Cumulative specificity per GA over 6 to 14 weeks postnatally is displayed in Figure 3b. The AUC ranged between 0.82 (95% CI, 0.76–0.89) and 0.97 (95% CI, 0.95–0.99) over 6–14 weeks postnatally (Table 2).

4. DISCUSSION

The discriminative ability of DIGIROP‐v1.0 models in this Greek cohort of 640 consecutive preterm infants ranged between an AUC of 0.82 at birth and 0.97 at 14 weeks PNA. Used in isolation, the models provide insufficient sensitivity to detect all at‐risk infants; however, sensitivity rises to acceptable levels if the models are used together with the recommendation to additionally screen infants with specific comorbidities. Analysis of the data provides insight into differences in participants' characteristics that may be responsible for this variation in the performance of the models.

The AUC of the models in this validation cohort, although lower at birth, increased during screening to levels equal to or higher than those in both the Swedish development group (0.91–0.93) and the Swedish, United States and German validation groups (0.88–0.90) in the original publication (Pivodic et al., 2021). Higher values were reported in a more recent Swedish validation cohort (0.93–0.97) (Pivodic et al., 2022). In ROP screening, missing even one infant who develops treatment‐requiring disease is unacceptable, so a model must detect all infants with severe ROP. Thus, the models would only be applicable in our cohort if combined with clinical criteria of comorbidities.

Using the selected cut‐offs for achieving 100% sensitivity in the development cohort, the models displayed higher specificity in the Greek cohort (61.5% at birth to 85.9% cumulatively at 14 weeks PNA) compared with the Swedish development cohort (53.1%–80.6%) (Pivodic et al., 2021). This means that 61.5% of the infants could be safely released from ROP screening as early as birth, corresponding to 40.0% (1016/2552) of visits spared by using only the DIGIROP‐Birth‐v1.0 model. An additional 24.4% of the infants would require at least one eye examination to determine the risk of TR‐ROP according to the decision support tool (i.e. total of 85.9%). The specificity of DIGIROP‐v1.0 models in the validation performed in the Swedish, German and US cohorts in the original publication was estimated to vary from 46.3% at birth to 75.2% cumulatively at 14 weeks PNA (Pivodic et al., 2021). Similar values were obtained in the external validation performed in a contemporary Swedish cohort (49.9%–76.3%) (Pivodic et al., 2022).

The sensitivity of DIGIROP models in our validation cohort ranged between 81.4% and 100%. Among treated subjects, eight infants at birth and three infants at 6–14 weeks PNA would have incorrectly been released from screening before TR‐ROP developed. All 11 infants had severe medical comorbidities and, because of these, qualified for serial ROP examinations based on both the recommendations of DIGIROP models and the current screening criteria in our unit. One infant who was categorized at birth as not needing screening was eventually treated for type 2 ROP at 6.0 weeks PNA, so it remains unknown whether type 1 disease would have developed had the infant not been treated this early. Considering this, the corresponding sensitivity of DIGIROP‐v1.0 models in our cohort increases to 97.7%. The updated DIGIROP‐v2.0 models, which additionally include PND of ≥14 days as a proxy for the comorbidity status of the infants, further improve the sensitivity to 100%. This updated DIGIROP‐v2.0 clinical decision support tool also showed 100% sensitivity in the Swedish validation cohort (Pivodic et al., 2023).
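The sensitivity figures quoted here follow from simple ratios over the 43 treated infants. The worked arithmetic below rests on one reading of the text, which is an assumption: once the comorbidity recommendation is applied, the ten missed infants with type 1 ROP count as detected, and the infant treated for type 2 ROP remains the only miss. Excluding that infant leaves the 42 infants with type 1 ROP.

```latex
\begin{align*}
\text{DIGIROP-Birth-v1.0 alone:} \quad
  & \frac{35}{43} \approx 81.4\% \\
\text{models plus comorbidity recommendation:} \quad
  & \frac{42}{43} \approx 97.7\% \\
\text{type 1 ROP only:} \quad
  & \frac{42}{42} = 100\%
\end{align*}
```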

The lower predictive ability of DIGIROP‐v1.0 models in our population stems from the higher incidence of TR‐ROP in this Greek cohort (6.7%) compared with Sweden (4.1%) despite higher GA. This discrepancy may be attributed to differences in neonatal care practices between the two countries. Also, among treated infants a higher percentage of girls is reported in the Greek vs. the Swedish cohort (58.1% vs. 38.3%). The association of sex with TR‐ROP has been investigated in many studies, with conflicting results (Choi et al., 2013; Koçak et al., 2016; Pivodic et al., 2020; Shiraki et al., 2019). In a recent sub‐group meta‐analysis of 50 studies, the pooled percentage of treated male infants was higher than that of treated female infants (54% vs. 46%) (Hoyek et al., 2022). However, in that population the weighted mean of GA and BW was similar among male and female infants, while in our cohort treated girls had similar GA but lower BW compared with treated boys; thus, they were at higher risk for ROP a priori. Another potential explanation is the relatively higher ratio of females to males among the screened subjects compared with the Swedish cohort (0.98 vs. 0.82).

A further element that differs in this Greek cohort is the timing of first ROP diagnosis (median PNA 7.0 vs. 8.1 weeks) and of initial treatment (median PNA 10.1 vs. 12.4 weeks). The literature is heterogeneous regarding the chronological age of infants at first ROP treatment. Pivodic et al. reported that the momentary risk for ROP treatment peaks at 12 weeks PNA irrespective of GA (Pivodic et al., 2020), a finding that was confirmed in previous validations of DIGIROP models, namely median PNA 12.3 weeks (Pivodic et al., 2021) and 12.6 weeks (Pivodic et al., 2022). Similar values were also reported in a German (44 treated infants; median PNA 11.4 weeks), a Swiss (76 treated infants; mean PNA 12.6 weeks) and a French (81 treated infants; mean PNA 12.3 weeks) multicentre study (Barjol et al., 2022; Gerull et al., 2018; Winter et al., 2023). Conversely, studies from Turkey and India show a mean PNA at initial ROP treatment of 9.5 and 9.0 weeks respectively, which is similar to that obtained in our cohort (Özen Tunay et al., 2016; Thomas et al., 2021). Moreover, unpublished national data on the incidence of TR‐ROP in Greece also show a median (range) PNA at first ROP treatment of 9.6 (5.6–27.0) weeks.

This is the first validation of DIGIROP‐v1.0 models and the clinical decision support tool in a population that includes more mature preterm infants. Previous external validations have been performed on infants from Sweden, the United States and Germany, who had lower GA and lower BW compared with the development cohort (Pivodic et al., 2021, 2022). Only the DIGIROP‐Birth‐v1.0 model has also been externally validated in a Portuguese (n = 257; 23 treated infants) and a Chinese (n = 442; 93 treated infants) cohort with a reported AUC of 0.63 and 0.70 respectively (Almeida et al., 2022; Chen et al., 2021). However, these measures were calculated based on modified cut‐offs to maximize sensitivity in the respective cohorts, rather than the published cut‐offs in the original development study, and thus cannot be used for comparison with the performance measures of the current study.

Among strengths of this validation cohort is that it comprises all infants with GA between 24 and 30 weeks that were consecutively screened for ROP at a tertiary neonatal intensive care unit over a 13‐year period. Moreover, no missing data were identified for any of the predictors, indicating the absence of selection bias. Finally, all examinations and treatments were performed by the same experienced paediatric ophthalmologist (AM) throughout the study period, resulting in no interobserver variability.

The retrospective nature and the low number of events are limitations; however, the latter is inevitable, since treatment for ROP is rarely required among preterm infants. In any case, we included all subjects screened in our unit during a 13‐year period, in order to provide sufficient data for analysis. Another issue is that data on infants who died before undergoing ROP screening were unavailable, and data on infants' comorbidities were not collected for all participants. Had these data been available, we could have assessed both the observed ratio of females to males at birth among all born infants, and comorbidities in females compared with males. This could have elucidated possible correlations and the differences observed in this cohort compared with the Swedish one. Finally, since participants originated from a single neonatal unit, the results may not be generalizable to other settings in Greece with a potentially different level of neonatal care.

In conclusion, this validation study demonstrated that the estimates provided by the DIGIROP‐v1.0 prognostic models were insufficient to predict the risk of TR‐ROP in a population of preterm infants with higher GA and BW, unless clinical recommendations were also considered. Considering these recommendations increased the sensitivity of the models to 97.7%. If only type 1 disease were considered, sensitivity would be 100%, which is the desired performance. Using the DIGIROP‐v1.0 models would have resulted in 61.5% of the infants not requiring screening at all and would have reduced the number of eye examinations in the screened infants by 40%. Further validation and adjustment of the models are therefore recommended to maximize their performance and generalizability.
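
The sensitivity figures quoted in this study follow from simple counts, and a minimal Python sketch may help make the arithmetic explicit. It uses the 35 of 43 treated infants identified by the DIGIROP‐Birth‐v1.0 model, as reported in the Results. The count of 42 of 43 behind the 97.7% figure is inferred here from the percentage and is not stated in the text. The function names are hypothetical, and the exact (Clopper-Pearson) interval is an assumed method; the interval method actually used by the authors is not described in this section.

```python
from math import comb


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of treated infants that the model flagged."""
    return true_pos / (true_pos + false_neg)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval found by bisection (assumed method)."""

    def solve(k: int, target: float) -> float:
        # binom_cdf(k, n, p) decreases as p grows, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper


# Model alone: 35 treated infants identified, 8 missed (from the Results).
print(f"{sensitivity(35, 8):.1%}")  # 81.4%
# Model plus clinical recommendations: 42 of 43 is an inferred count.
print(f"{sensitivity(42, 1):.1%}")  # 97.7%
lower, upper = exact_ci(35, 43)
print(f"95% CI: {lower:.1%} to {upper:.1%}")
```

With only 43 events, the interval around the sensitivity estimate is wide, which illustrates why the low number of events is listed among the limitations.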

FUNDING INFORMATION

None.

Supporting information

Figure S1. Estimated probability for ROP treatment by cohort, GA and sex (Greek cohort 2009–2021). Dashed and solid lines are the regression lines for the Greek validation and the Swedish development cohort, respectively. Coloured areas represent 95% confidence intervals.

Figure S2. DIGIROP‐Birth‐v1.0 risk estimates and outcome from decision support tool at (A) birth, (B) postnatal age of 6 weeks, (C) postnatal age of 7 weeks, (D) postnatal age of 8 weeks, (E) postnatal age of 9 weeks, (F) postnatal age of 10 weeks, (G) postnatal age of 11 weeks, (H) postnatal age of 12 weeks, (I) postnatal age of 13 weeks, (J) postnatal age of 14 weeks (Greek cohort 2009–2021).

Figure S3. Calibration plot for DIGIROP‐Birth risk estimates (Greek cohort 2009–2021). The yellow diagonal line represents the line of perfect calibration. Systematic deviation from the diagonal line indicates that the model is not well calibrated. Dots represent observed risk on the y‐axis and estimated risk on the x‐axis; bars represent 95% confidence intervals for the estimated risk.

AOS-103-272-s002.docx (1MB, docx)

Table S1. TRIPOD checklist for prediction model validation studies.

Table S2. Characteristics of infants treated and not treated for retinopathy of prematurity in the Greek cohort 2009–2021 used for validation of the DIGIROP‐v1.0 models.

AOS-103-272-s001.docx (99KB, docx)

ACKNOWLEDGEMENTS

The authors have nothing to report.

Moutzouri, S., Pivodic, A., Haidich, A.‐B., Seliniotaki, A.K., Lithoxopoulou, M., Tsakalidis, C. et al. (2025) Predicting the risk of treatment‐requiring retinopathy of prematurity in preterm infants in Greece. External validation of DIGIROP prognostic models. Acta Ophthalmologica, 103, 272–280. Available from: 10.1111/aos.16788

REFERENCES

  1. Almeida, A.C., Borrego, L.M., Brízido, M., de Figueiredo, M.B., Teixeira, F., Coelho, C. et al. (2022) DIGIROP efficacy for detecting treatment‐requiring retinopathy of prematurity in a Portuguese cohort. Eye, 36, 463–469.
  2. Athikarisamy, S., Desai, S., Patole, S., Rao, S., Simmer, K. & Lam, G.C. (2021) The use of postnatal weight gain algorithms to predict severe or type 1 retinopathy of prematurity: a systematic review and meta‐analysis. JAMA Network Open, 4, e2135879.
  3. Barjol, A., Lux, A.L., Dureau, P., Chapron, T., Metge, F., Abdelmassih, Y. et al. (2022) Evaluation and modification of French screening guidelines for retinopathy of prematurity. Acta Ophthalmologica, 100, e1451–e1454.
  4. Binenbaum, G., Bell, E.F., Donohue, P., Quinn, G., Shaffer, J., Tomlinson, L.A. et al. (2018) Development of modified screening criteria for retinopathy of prematurity: primary results from the postnatal growth and retinopathy of prematurity study. JAMA Ophthalmology, 136, 1034.
  5. Binenbaum, G., Ying, G., Quinn, G.E., Huang, J., Dreiseitl, S., Antigua, J. et al. (2012) The CHOP postnatal weight gain, birth weight, and gestational age retinopathy of prematurity risk model. Archives of Ophthalmology, 130, 1560–1565.
  6. Binenbaum, G., Ying, G., Tomlinson, L.A. & for the Postnatal Growth and Retinopathy of Prematurity (G‐ROP) Study Group. (2017) Validation of the Children's Hospital of Philadelphia retinopathy of prematurity (CHOP ROP) model. JAMA Ophthalmology, 135, 871.
  7. Blencowe, H., Lawn, J.E., Vazquez, T., Fielder, A. & Gilbert, C. (2013) Preterm‐associated visual impairment and estimates of retinopathy of prematurity at regional and global levels for 2010. Pediatric Research, 74(Suppl 1), 35–49.
  8. Cao, J.H., Wagner, B.D., McCourt, E.A., Cerda, A., Sillau, S., Palestine, A. et al. (2016) The Colorado–retinopathy of prematurity model (CO‐ROP): postnatal weight gain screening algorithm. Journal of American Association for Pediatric Ophthalmology and Strabismus, 20, 19–24.
  9. Chen, S., Wu, R., Chen, H., Ma, W., Du, S., Li, C. et al. (2021) Validation of the DIGIROP‐birth model in a Chinese cohort. BMC Ophthalmology, 21, 236.
  10. Chiang, M.F., Quinn, G.E., Fielder, A.R., Ostmo, S.R., Paul Chan, R.V., Berrocal, A. et al. (2021) International classification of retinopathy of prematurity, third edition. Ophthalmology, 128, e51–e68.
  11. Choi, J.‐H., Löfqvist, C., Hellström, A. & Heo, H. (2013) Efficacy of the screening algorithm WINROP in a Korean population of preterm infants. JAMA Ophthalmology, 131, 62.
  12. Eckert, G.U., Fortes Filho, J.B., Maia, M. & Procianoy, R.S. (2012) A predictive score for retinopathy of prematurity in very low birth weight preterm infants. Eye, 26, 400–406.
  13. Gerull, R., Brauer, V., Bassler, D., Pfister, R.E., Nelle, M., Müller, B. et al. (2018) Prediction of ROP treatment and evaluation of screening criteria in VLBW infants–a population based analysis. Pediatric Research, 84, 632–638.
  14. Good, W.V. & Early Treatment for Retinopathy of Prematurity Cooperative Group. (2004) Final results of the early treatment for retinopathy of prematurity (ETROP) randomized trial. Transactions of the American Ophthalmological Society, 102, 233–248.
  15. Hellström, A., Hård, A.‐L., Engström, E., Niklasson, A., Andersson, E., Smith, L. et al. (2009) Early weight gain predicts retinopathy in preterm infants: new, simple, efficient approach to screening. Pediatrics, 123, e638–e645.
  16. Hoyek, S., Peacker, B.L., Acaba‐Berrocal, L.A., Al‐Khersan, H., Zhao, Y., Hartnett, M.E. et al. (2022) The male to female ratio in treatment‐warranted retinopathy of prematurity: a systematic review and meta‐analysis. JAMA Ophthalmology, 140, 1110–1120.
  17. Koçak, N., Niyaz, L. & Ariturk, N. (2016) Prediction of severe retinopathy of prematurity using the screening algorithm WINROP in preterm infants. Journal of American Association for Pediatric Ophthalmology and Strabismus, 20, 486–489.
  18. Kościółek, M., Kisielewska, W., Ćwiklik‐Wierzbowska, M., Wierzbowski, P. & Gilbert, C. (2022) Systematic review of the guidelines for retinopathy of prematurity. European Journal of Ophthalmology, 33, 112067212211262.
  19. Larsen, P.P., Müller, A., Lagrèze, W.A., Holz, F.G., Stahl, A. & Krohne, T.U. (2021) Incidence of retinopathy of prematurity in Germany: evaluation of current screening criteria. Archives of Disease in Childhood. Fetal and Neonatal Edition, 106, 189–193.
  20. Mataftsi, A., Moutzouri, S., Karagianni, P., Ziakas, N., Soubasi, V., Brazitikos, P. et al. (2020) Retinopathy of prematurity occurrence and evaluation of screening policy in a large tertiary Greek cohort. International Ophthalmology, 40, 385–391.
  21. McCauley, K., Chundu, A., Song, H., High, R. & Suh, D. (2018) Implementation of a clinical prediction model using daily postnatal weight gain, birth weight, and gestational age to risk stratify ROP. Journal of Pediatric Ophthalmology and Strabismus, 55, 326–334.
  22. Moutzouri, S., Haidich, A.‐B., Seliniotaki, A.K., Tsakalidis, C., Soubasi, V., Ziakas, N. et al. (2021) Optimization of retinopathy of prematurity screening in a tertiary neonatal unit in northern Greece based on 16‐year data. Journal of Perinatology, 42, 365–370.
  23. Niklasson, A. & Albertsson‐Wikland, K. (2008) Continuous growth reference from 24th week of gestation to 24 months by gender. BMC Pediatrics, 8, 8.
  24. Özen Tunay, Z., Özdemir, Ö., Ergintürk Acar, D., Petriçli, İ.S. & Oğuz, Ş.S. (2016) Clinical features of infants treated for severe retinopathy of prematurity: 8‐year study from a large tertiary neonatal intensive care unit in Turkey. Turkish Journal of Medical Sciences, 46, 42–47.
  25. Piermarocchi, S., Bini, S., Martini, F., Berton, M., Lavini, A., Gusson, E. et al. (2017) Predictive algorithms for early detection of retinopathy of prematurity. Acta Ophthalmologica, 95, 158–164.
  26. Pivodic, A., Smith, L.E.H., Hård, A.‐L., Löfqvist, C., Almeida, A.C., Al‐Hawasi, A. et al. (2022) Validation of DIGIROP models and decision support tool for prediction of treatment for retinopathy of prematurity on a contemporary Swedish cohort. The British Journal of Ophthalmology, 107(8), 1132–1138.
  27. Pivodic, A., Hård, A.‐L., Löfqvist, C., Smith, L.E.H., Wu, C., Bründer, M.C. et al. (2020) Individual risk prediction for sight‐threatening retinopathy of prematurity using birth characteristics. JAMA Ophthalmology, 138, 21–29.
  28. Pivodic, A., Holmström, G., Smith, L.E.H., Hård, A.‐L., Löfqvist, C., Al‐Hawasi, A. et al. (2023) Prognostic value of parenteral nutrition duration on risk of retinopathy of prematurity: development and validation of the revised DIGIROP clinical decision support tool. JAMA Ophthalmology, 141, 716–724.
  29. Pivodic, A., Johansson, H., Smith, L.E.H., Hård, A.‐L., Löfqvist, C., Yoder, B.A. et al. (2021) Development and validation of a new clinical decision support tool to optimize screening for retinopathy of prematurity. The British Journal of Ophthalmology, 106(11), 318719.
  30. Shiraki, A., Fukushima, Y., Kawasaki, R., Sakaguchi, H., Mitsuhashi, M., Ineyama, H. et al. (2019) Retrospective validation of the postnatal growth and retinopathy of prematurity (G‐ROP) criteria in a Japanese cohort. American Journal of Ophthalmology, 205, 50–53.
  31. Thomas, D., Madathil, S., Thukral, A., Sankar, M.J., Chandra, P., Agarwal, R. et al. (2021) Diagnostic accuracy of WINROP, CHOP‐ROP and ROPScore in detecting type 1 retinopathy of prematurity. Indian Pediatrics, 58, 915–921.
  32. Wilkinson, A.R., Haines, L., Head, K. & Fielder, A.R. (2008) UK retinopathy of prematurity guideline. Early Human Development, 84, 71–74.
  33. Winter, K., Pfeil, J.M., Engmann, H., Aisenbrey, S., Lorenz, B., Hufendiek, K. et al. (2023) Comparability of input parameters in the German Retina.net ROP registry and the EU‐ROP registry: an exemplary comparison between 2011 and 2021. Acta Ophthalmologica, 102(3), e314–e321.
  34. Wu, C., Löfqvist, C., Smith, L.E.H., VanderVeen, D.K., Hellström, A. & for the WINROP Consortium. (2012) Importance of early postnatal weight gain for normal retinal angiogenesis in very preterm infants: a multicenter study analyzing weight velocity deviations for the prediction of retinopathy of prematurity. Archives of Ophthalmology, 130, 992–999.
  35. Zhang, R.‐H., Liu, Y.‐M., Dong, L., Li, H.Y., Li, Y.F., Zhou, W.D. et al. (2022) Prevalence, years lived with disability, and time trends for 16 causes of blindness and vision impairment: findings highlight retinopathy of prematurity. Frontiers in Pediatrics, 10, 735335.

Articles from Acta Ophthalmologica are provided here courtesy of Wiley
