We thank Olson et al. (1) for their close reading of our recent study (2), in which we examined the sensitivity and specificity of automated diabetic retinopathy detection and demonstrated an area under the receiver operating characteristic curve of 0.87. A limited, 500-patient sample of all 10,000 photographic exams was examined by multiple masked experts. We felt uncomfortable recommending for clinical practice a system whose patient safety relative to an accepted (gold) standard could not be established, and we concluded that it should be tested against widely accepted clinical standards, if practical. We have recently presented studies of an improved algorithm on a new, larger dataset of 15,000 exams with an area under the curve of 0.90 (3). Most of these results support the work of Olson and colleagues (4). In our view, however, the fact that their system detects only small hemorrhages and microaneurysms is a serious limitation, and sensitivity and specificity based on photographic interpretation by a single reader are unlikely to become widely accepted.
Failure to detect large, rare, and/or advanced lesions deserves disproportionate attention. If a patient with isolated neovascularization of the disc (<1:5,000 in our series) were missed by a system but would not have been missed by a person, that failure would likely lead to vision loss or blindness for that patient, potential litigation, and a backlash against implementation of automated detection.
Groups translating automated diabetic retinopathy detection into clinical practice operate in environments that differ on regulatory, legal, budgetary, and reimbursement aspects, but we disagree that “a recommendation against automated grading is only valid if it is shown that there is a higher performing and readily alternative methodology” (1). The currently established practice is human expert reading, and the burden of proof is therefore on the new system to be introduced, which is automated reading. For automated reading to gain widespread acceptance, no shortcuts regarding safety concerns will likely be permitted by regulatory agencies, payers, and patients.
One study's entry criteria may be perceived as another's selection bias. The target population of the EyeCheck project consists of patients who had not previously been identified as having diabetic retinopathy. In most settings, patients identified with diabetic retinopathy are referred for evaluation or treatment, which removes them from the screened population. Establishing any other inclusion criteria would have constituted selection and would have limited the applicability of these data to current clinical practice.
The potential positive effect of camera resolution on algorithmic performance is intriguing, although with less costly cameras presently offering at least 1,024 × 1,024 pixels, this debate may be self-limiting. We believe that comparisons of algorithms to standardized datasets (http://roc.healthcare.uiowa.edu) as well as to the gold standard are required and should include: 1) demonstration in a prospective multicenter study of similar or better detection on populations with defined race and ethnicity distributions; 2) acceptable comparison of detection to standard multifield stereo photographs read according to the Early Treatment Diabetic Retinopathy Study standard; and 3) sensitivity/specificity analysis with standard and severity-weighted receiver operating characteristic curves.
In summary, we agree that automated detection of diabetic retinopathy can make the prevention of blindness and vision loss objective, more accessible, and more cost-effective, provided safety issues are not overlooked.
References
- 1. Olson JA, Sharp PF, Fleming A, Philip S: Evaluation of a system for automatic detection of diabetic retinopathy from color fundus photographs in a large population of patients with diabetes: response to Abràmoff et al. (Letter). Diabetes Care 31:e63, 2008. DOI: 10.2337/dc08-0827
- 2. Abràmoff MD, Niemeijer M, Suttorp-Schulten MS, Viergever MA, Russell SR, van Ginneken B: Evaluation of a system for automatic detection of diabetic retinopathy from color fundus photographs in a large population of patients with diabetes. Diabetes Care 31:193–198, 2008
- 3. Abràmoff MD, Van Ginneken B, Suttorp MSA, Russell SR, Niemeijer M: Improved computer aided detection of diabetic retinopathy evaluated on 10,000 screening exams (Abstract). Invest Ophthalmol Vis Sci 49:2735, 2008
- 4. Philip S, Fleming AD, Goatman KA, Fonseca S, McNamee P, Scotland GS, Prescott GJ, Sharp PF, Olson JA: The efficacy of automated “disease/no disease” grading for diabetic retinopathy in a systematic screening programme. Br J Ophthalmol 91:1512–1517, 2007