AMIA Annual Symposium Proceedings. 2014 Nov 14;2014:1902–1910.

Development and Evaluation of Reference Standards for Image-based Telemedicine Diagnosis and Clinical Research Studies in Ophthalmology

Michael C Ryan 1, Susan Ostmo 1, Karyn Jonas 3, Audina Berrocal 4, Kimberly Drenser 5, Jason Horowitz 6, Thomas C Lee 7, Charles Simmons 8, Maria-Ana Martinez-Castellanos 9, RV Paul Chan 3, Michael F Chiang 1,2
PMCID: PMC4419970  PMID: 25954463

Abstract

Information systems managing image-based data for telemedicine or clinical research applications require a reference standard representing the correct diagnosis. Accurate reference standards are difficult to establish because of imperfect agreement among physicians, and discrepancies between clinical and image-based diagnosis. This study describes the development and evaluation of reference standards for image-based diagnosis that combine the diagnostic impressions of multiple image readers with the actual clinical diagnoses. We show that agreement between image reading and clinical examination was imperfect (689 [32%] discrepancies in 2148 image readings), as was inter-reader agreement (kappa 0.490–0.652). This was improved by establishing an image-based reference standard defined as the majority diagnosis given by three readers (13% discrepancies with image readers). It was further improved by establishing an overall reference standard that incorporated the clinical diagnosis (10% discrepancies with image readers). These principles of establishing reference standards may be applied to improve the robustness of real-world systems supporting image-based diagnosis.

Introduction

Medical diagnosis has traditionally required in-person examination by a physician. However, advances in imaging technology have transformed specialties such as ophthalmology, dermatology, radiology, and cardiology to the point where clinical decision-making is based largely on review of imaging studies. Meanwhile, there have been persistent concerns about the accessibility of health care, particularly in rural and medically underserved areas. As a result, store-and-forward telemedicine has emerged as a potential strategy for improving the delivery and cost of health care by replacing some in-person physician examinations with remote image-based evaluations.1,2 Real-world implementation of these strategies will require appropriate validation of diagnostic accuracy, as well as agreement among different image readers.

At the same time, institutional and regulatory pressures are placing increased emphasis on quality and adherence to evidence-based practice guidelines.3,4 Clinical examination by a physician is generally considered the gold standard in medical practice. However, there are often significant variations in diagnosis and management among physicians, even when they are presented with identical clinical scenarios.5,6 Similarly, although image-based diagnosis is made from the appearance of structural and morphological features, numerous studies in ophthalmology have demonstrated that there may be significant discrepancies in image reading, even among experts looking at the same images.7–9

This variability among experts creates challenges for the implementation of image-based clinical information systems. For telemedicine systems, an accurate reference standard must be defined for proper validation, yet different remote image readers may disagree with regard to diagnosis. Furthermore, it is often unclear whether the actual clinical examination or remote interpretation is more correct. For clinical research systems, it is critical to define a reference standard with the highest accuracy, yet there may be discrepancies between the interpretations of clinical examination and imaging data. Understanding the factors contributing to accurate diagnosis, as well as having a clear definition of a reference standard defining the correct diagnosis, is essential for managing image-based data for applications such as telemedicine and clinical research.

The purpose of this paper is to describe the development and evaluation of reference standards for image-based diagnosis, which combine the diagnostic impressions of multiple image readers with the actual clinical diagnoses by expert physicians. Sources of discrepancy are identified and analyzed. Retinopathy of prematurity (ROP), an ophthalmic disease affecting low birth-weight infants during the first several months of life, is used as the study domain. Results from clinical ophthalmoscopic exams by an expert on a study cohort of infants are compared to results from image interpretation by three readers. In this way, we evaluate the variability among multiple readers performing image-based diagnosis, the impact of integrating diagnoses by multiple readers into an image-based reference standard for telemedicine applications, and the impact of integrating clinical diagnosis into an overall reference standard for clinical research applications.

Study Domain: Retinopathy of Prematurity (ROP)

ROP is diagnosed from dilated fundoscopic examination by an ophthalmologist, and there are established guidelines for identifying high-risk premature infants who need serial screening examinations.10 When ROP occurs, approximately 90% of cases improve spontaneously and require only close follow-up examinations every 1–2 weeks. However, approximately 10% are at high risk for complications leading to blindness and require treatment.11,12

ROP has several characteristics that make it an ideal topic for research in telemedicine, biomedical informatics, and clinical research: (1) Diagnosis is based solely on the appearance of disease in the retina. (2) There is a universally-accepted, evidence-based, diagnostic classification standard for ROP.13 (3) Although it is treatable if detected early, ROP continues to be a leading cause of childhood blindness throughout the world because of inadequacies in screening.14 (4) Current ROP exam methods are time-intensive and physiologically stressful to infants. (5) Clinical expertise is often limited to larger academic centers, and is therefore unavailable at the point of care. Therefore, once an appropriate reference standard is established, telemedical diagnosis should be reliable and reproducible through the application of objective diagnostic criteria. Furthermore, the physiologic stress on infants may be reduced.15,16 Finally, while the expertise required to diagnose and treat ROP tends to be found only at large academic medical centers, the skill needed to acquire images adequate for diagnosis can be taught and disseminated across practice settings and communities of varying size, thereby increasing access to care.17–20

Methods

Ophthalmoscopic Examination and Image Capture

This is a multicenter study with eight participating academic medical centers: (1) Oregon Health & Science University (OHSU), (2) Weill Cornell Medical College, (3) University of Miami, (4) Beaumont Health System, (5) Columbia University Medical Center, (6) Children's Hospital Los Angeles, (7) Cedars-Sinai Medical Center, and (8) Asociación para Evitar la Ceguera en México (APEC). Each institution’s IRB approved the study protocol. Subject enrollment began in July 2011. All infants admitted to a participating Neonatal Intensive Care Unit (NICU) were eligible for the study if they met published criteria for ROP screening examination, or if they were transferred to the study center for specialized ophthalmic care.10

Study infants underwent serial examinations in accordance with the most recent evidence-based ROP guidelines.10 Dilated ophthalmoscopic examinations were performed by an expert ophthalmologist, and findings were documented according to the international classification standard.13 Retinal images were captured by a trained photographer after each eye examination with a wide-angle camera (RetCam; Clarity Medical Systems, Pleasanton, CA), using a standard protocol that followed manufacturer guidelines. De-identified clinical and image data were uploaded to a secure database (ASP.net, C#; State33, Portland, OR).

Image-based Reading

Remote image-based readings were conducted by three study authors (MFC, RVPC, SO) using an SSL-encrypted web-based grading module. In some cases, the image readers were the same ophthalmologists who had performed the ophthalmic examination. To best simulate ophthalmoscopy, where both eyes are examined sequentially before a final diagnosis is made, images from both eyes were displayed side-by-side (Figure 1). Demographic information, such as gestational age, postmenstrual age, and birth weight, was also visible during image reading. Image readings were graded on an ordinal scale based on criteria from NIH-funded clinical trials: (1) No ROP; (2) Mild ROP; (3) Type-2 (moderate) ROP; and (4) treatment-requiring (severe) ROP.11,12

Figure 1. Example of the web-based interface used for image evaluation and image-based diagnosis of ROP.

Development and Rationale for Reference Standards

Two reference standards were developed for this study: (1) The image-based reference standard, defined as the diagnosis given by a majority of the three readers. The rationale for this definition is that pooling the expertise of multiple image readers may improve the overall diagnosis. (2) The overall reference standard, which integrated the image-based reference standard with the actual clinical diagnosis provided by the examining ophthalmologist. The rationale for this definition is that combining information from clinical and telemedicine data may provide the most accurate diagnosis possible, and that this may be applicable in settings such as rigorous clinical research. In instances when there were discrepancies between the image-based reference standard and the clinical diagnosis, all medical records were reviewed by the three image readers and a moderator (KJ) to reach a consensus for the overall reference standard.
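These definitions reduce to a simple decision rule: a majority vote across the three readings, with clinical-versus-image discrepancies escalated to consensus review. The sketch below illustrates this in Python; the record layout and function names are assumptions for illustration, not the study's actual software.

```python
# A minimal sketch of the two reference standards, assuming diagnoses are
# recorded as strings; discrepant cases are flagged for the manual consensus
# review described above rather than resolved automatically.
from collections import Counter
from typing import Optional

def image_based_reference(readings: list[str]) -> Optional[str]:
    """Diagnosis given by a majority of readers, or None if no majority."""
    diagnosis, votes = Counter(readings).most_common(1)[0]
    return diagnosis if votes > len(readings) // 2 else None

def overall_reference(readings: list[str], clinical_dx: str) -> str:
    """Integrate the image-based reference standard with the clinical exam."""
    image_dx = image_based_reference(readings)
    if image_dx is not None and image_dx == clinical_dx:
        return image_dx
    return "ADJUDICATE"  # joint record review with a moderator in the study

# Two of three readers agree, but the majority differs from the clinical exam:
print(image_based_reference(["Mild ROP", "Mild ROP", "Type-2 ROP"]))            # Mild ROP
print(overall_reference(["Mild ROP", "Mild ROP", "Type-2 ROP"], "Type-2 ROP"))  # ADJUDICATE
```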

Data Analysis

Data were analyzed using spreadsheet software (Excel 2011; Microsoft, Redmond, WA). All records that had been graded by all three image readers as of December 3, 2013 were analyzed. Additional analysis was conducted on the subset of these records that also had a submitted overall reference standard.

Inter-reader agreement in diagnostic classification was calculated for each pair of image readers using absolute agreement, kappa (κ) statistic for chance-corrected agreement, and weighted κ. Agreement was also investigated for the following comparisons: (1) individual readers vs. clinical diagnosis, (2) individual readers vs. image-based reference standard, (3) individual readers vs. overall reference standard, (4) image-based reference standard vs. clinical diagnosis, (5) image-based reference standard vs. overall reference standard, and (6) clinical diagnosis vs. overall reference standard.
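For concreteness, the sketch below shows how these statistics can be computed for a pair of raters on the 4-level ordinal scale. The paper does not report its weighting scheme or software; linear disagreement weights and the conventional Landis–Koch interpretation bands are assumptions for illustration.

```python
# A minimal sketch, assuming linear weights and 0..3 integer codes for the
# four ordinal ROP categories; not the study's actual implementation.
import numpy as np

LEVELS = 4  # 0=No ROP, 1=Mild, 2=Type-2, 3=Treatment-requiring

def agreement_stats(a, b):
    """Absolute agreement, Cohen's kappa, and linearly weighted kappa for two
    raters' ordinal codes (integer arrays with values in 0..LEVELS-1)."""
    obs = np.zeros((LEVELS, LEVELS))
    for i, j in zip(a, b):                     # observed contingency table
        obs[i, j] += 1
    obs /= len(a)                              # normalize to proportions
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))  # chance expectation
    idx = np.arange(LEVELS)                    # linear disagreement weights:
    w = np.abs(idx[:, None] - idx[None, :]) / (LEVELS - 1)  # 0 diag -> 1 corner
    return {"absolute": obs.trace(),
            "kappa": 1 - (1 - obs.trace()) / (1 - exp.trace()),
            "weighted_kappa": 1 - (w * obs).sum() / (w * exp).sum()}

def landis_koch(k):
    """Conventional interpretation bands for kappa values."""
    for cut, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                       (0.80, "substantial"), (1.00, "near-perfect")]:
        if k <= cut:
            return label

# Synthetic illustration (not study data): two raters agreeing ~70% of the time.
rng = np.random.default_rng(0)
a = rng.integers(0, LEVELS, 716)
b = np.where(rng.random(716) < 0.7, a, rng.integers(0, LEVELS, 716))
stats = agreement_stats(a, b)
print(stats, landis_koch(stats["kappa"]))
```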

In all instances where there were diagnostic discrepancies with reference standards, all study data were reviewed by the authors (MCR, SO) and the reason for discrepancy was classified as one or more of the following: (1) no ROP identified by ophthalmoscopic exam, (2) no ROP identified by image-based exam, (3) disagreement in classification of ROP severity (“stage”), (4) disagreement in classification of ROP location (“zone”), and/or (5) disagreement in classification of blood vessel appearance (“dilation” and “tortuosity”).
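As a hypothetical illustration of this bookkeeping (the review itself was manual), reviewer-assigned reason codes might be tallied as below. The code strings are assumptions; the example counts are the 14 discrepancies reported in the Results.

```python
# A hypothetical tally of reviewer-assigned discrepancy reason codes;
# the study's classification was performed manually, not automated.
from collections import Counter

REASONS = ("no ROP on clinical exam", "no ROP on image-based exam",
           "severity (stage)", "location (zone)",
           "vessel appearance (dilation/tortuosity)")

def tally(reason_codes):
    """Count how often each reason was assigned on discrepancy review."""
    counts = Counter(reason_codes)
    total = sum(counts.values())
    return {r: f"{counts[r]}/{total}" for r in REASONS if counts[r]}

# e.g., the 14 image-based vs. overall reference standard discrepancies:
codes = (["location (zone)"] * 6
         + ["vessel appearance (dilation/tortuosity)"] * 5
         + ["no ROP on image-based exam"] * 2 + ["severity (stage)"])
print(tally(codes))  # {'no ROP on image-based exam': '2/14', ...}
```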

Results

Characteristics of Study Population

A total of 150 infants who underwent 358 clinical exams met eligibility criteria for analysis. Both eyes of each infant underwent ophthalmoscopic examination at each visit, for a total of 716 study eyes. Based on clinical exam, 335 (47%) had no ROP, 283 (40%) had mild ROP, 67 (9%) had Type-2 ROP, and 31 (4%) had treatment-requiring ROP.

Inter-Reader Reliability and Reader Agreement with Clinical Exam

Table 1 summarizes agreement for each pair of readers based on ordinal ROP classification. Readers 1 & 2 had moderate agreement, while readers 1 & 3 and readers 2 & 3 had substantial agreement. Overall absolute agreement among the three readers was 73%. Table 2 summarizes agreement between individual readers and the actual clinical diagnosis. Readers 1 and 2 demonstrated moderate agreement with the clinical diagnosis, while reader 3 had substantial agreement with the clinical diagnosis.

Table 1.

Inter-reader agreement for ordinal ROP classification, expressed as κ, weighted κ, and absolute agreement.

Reader Pair κ (SE) Weighted κ (SE) Absolute Agreement (%)
1 vs 2 0.490 (0.027) 0.586 (0.024) 67%
1 vs 3 0.591 (0.026) 0.668 (0.023) 74%
2 vs 3 0.652 (0.025) 0.723 (0.021) 67%

Table 2.

Agreement between individual readers and clinical diagnosis for ordinal ROP classification, expressed as κ, weighted κ, and absolute agreement.

Reader κ (SE) Weighted κ (SE) Absolute Agreement (%)
1 0.460 (0.028) 0.542 (0.026) 66%
2 0.465 (0.028) 0.556 (0.024) 66%
3 0.553 (0.027) 0.624 (0.024) 72%

Agreement of Image-based Reference Standard with Individual Readers and Clinical Exam

Table 3 displays agreement for individual readers with the image-based reference standard. Reader 1 had moderate agreement with the image-based reference standard, while readers 2 and 3 had near-perfect agreement. Overall absolute agreement between the image-based reference standard and clinical diagnosis was 72%, while the κ (SE) and weighted κ (SE) were 0.551 (0.027) and 0.621 (0.024) respectively, indicating substantial agreement.

Table 3.

Agreement between individual readers and the image-based reference standard for ordinal ROP classification expressed as κ, weighted κ, and absolute agreement.

Reader κ (SE) Weighted κ (SE) Absolute Agreement (%)
1 0.713 (0.023) 0.769 (0.020) 80%
2 0.773 (0.021) 0.824 (0.018) 85%
3 0.900 (0.015) 0.920 (0.012) 92%

Agreement of Overall Reference Standard

Table 4 displays agreement for individual readers with the overall reference standard. There was near-perfect agreement for all three readers. Absolute agreement between the overall reference standard and clinical diagnosis was 81%. The κ (SE) and weighted κ (SE) were 0.679 (0.032) and 0.751 (0.027) respectively, indicating substantial agreement.

Table 4.

Agreement between individual readers and the overall reference standard for ordinal ROP classification, expressed as κ, weighted κ, and absolute agreement.

Reader κ (SE) Weighted κ (SE) Absolute Agreement (%)
1 0.802 (0.026) 0.850 (0.021) 88%
2 0.829 (0.025) 0.877 (0.019) 90%
3 0.839 (0.024) 0.877 (0.020) 90%

Classification of Discrepancies with Reference Standards

Among 434 eye exams in this study with an overall reference standard, there were 14 (3%) discrepancies with the image-based reference standard derived from majority vote among the three image readers. On review of these medical records, 6/14 discrepancies were due to disagreements over disease location, 5/14 to disagreements over blood vessel morphology, 2/14 to the image-based reference standard not identifying any signs of ROP, and 1/14 to a disagreement in disease severity. Table 5 summarizes the absolute number of discrepancies and discrepancy rates for each study comparison.

Table 5.

Summary of absolute number of discrepancies and the discrepancy rate for each comparison in the study.

Comparison No. of Eye Exams No. of Disagreements Disagreement Rate
Individual Readers vs Clinical Exam 2148 689 32%
Individual Readers vs Image-based Reference Standard 2148 269 13%
Individual Readers vs Overall Reference Standard 1302 133 10%
Clinical Exam vs Image-based Reference Standard 716 198 30%
Clinical Exam vs Overall Reference Standard 434 82 19%
Overall Reference Standard vs Image-based Reference Standard 434 14 3%

Discussion

Summary of Key Findings

To our knowledge, this is the first study to develop and evaluate the accuracy of reference standards in ophthalmology that integrate clinical diagnosis with a consensus image-based diagnosis from multiple image readers. The performance of the reference standard was evaluated against the telemedical ROP diagnosis of three image readers, the clinical exam, and an image-based reference standard. The key findings from this study are: (1) There is imperfect agreement in image-based diagnosis among experts, and between image-based and clinical diagnoses. (2) Using multiple image readers is a potentially useful approach to increase the accuracy and reliability in telemedical diagnosis of ROP. (3) Use of an overall reference standard that integrates information from the clinical exam with image-based diagnoses may be of value in image-based clinical research.

Telemedicine holds great promise for improving the accessibility and quality of health care.2,21 However, variation in telemedical image interpretation and the lack of definitive reference standards have presented challenges to the implementation of telemedicine systems for the diagnosis and management of retinopathy of prematurity (ROP).7

Inter-reader Agreement

This study confirms findings from previously published research showing that inter-reader agreement in image-based ROP diagnosis is imperfect (Table 1). For example, a previous study examining interphysician agreement in the telemedical diagnosis of ROP found weighted κ statistics ranging from 0.38 (fair agreement) to 0.81 (near-perfect agreement).22 That study’s higher maximum κ may be a consequence of using only ophthalmologists for image interpretation, whereas one of the image readers in the current study was not a physician.23 However, inter-reader agreement for the non-ophthalmologist reader (reader 3) was comparable to what was observed for the two ophthalmologist readers. Interestingly, agreement between the non-ophthalmologist reader and the clinical diagnosis was actually higher than what was observed for the ophthalmologist readers.

The degree of inter-reader agreement noted in the current study is also consistent with what has been observed in non-ROP ophthalmologic studies investigating image-based diagnosis. In a major multicenter study of diabetic retinopathy, the weighted κ for intergrader reliability ranged from 0.41 (moderate agreement) to 0.80 (substantial agreement), depending on the type of retinal lesion being observed.24 In a study examining interobserver agreement in the diagnosis of age-related macular degeneration based on fluorescein angiography, the κ was 0.37 to 0.40 (fair agreement).25

Additional published literature suggests that this finding is generalizable across other medical domains. In one dermatologic study examining agreement in the evaluation and diagnosis of skin tumors, the κ statistic was 0.32 (fair agreement).26 In a radiographic study comparing expert radiologists and pulmonologists in their diagnoses of upper lobe-predominant emphysema, the κ ranged from 0.20 (slight agreement) to 0.60 (moderate agreement).27

When considered in conjunction with the existing evidence base, findings from this study suggest that the reliability of telemedical ROP diagnosis, while imperfect, is comparable to or better than interobserver agreement for other ophthalmic or medical diagnoses. This supports the validity of telemedicine programs for ROP diagnosis.

Image-Based Reference Standard Combining Multiple Readers

While the telemedicine literature examining inter-reader agreement and reader agreement against a gold standard is relatively robust, data investigating the utility of pooling telemedical diagnoses are sparse. A 2010 study investigating the accuracy of “non-expert” graders in diagnosing ROP noted slight improvements in sensitivity and specificity when using the majority diagnosis of the three best non-expert readers, although the results were not statistically significant: sensitivity in diagnosing treatment-requiring ROP increased from 0.82 to 1.00, while specificity increased from 0.92 to 0.94.28 Similarly, a 2013 teledermatology study investigating the diagnostic accuracy of remote reflectance confocal microscopy found that sensitivity improved by more than eight percentage points when reader diagnoses were combined.29
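To make the pooling intuition concrete, here is a small simulation in Python with synthetic data and assumed per-reader sensitivity and specificity (not study data): with independent errors, a majority vote of three readers detects treatment-requiring disease more often than any single reader while also rejecting more true negatives.

```python
# A hedged sketch: majority voting among three simulated readers, each with
# assumed sensitivity 0.85 and specificity 0.93 (illustrative values only).
import numpy as np

rng = np.random.default_rng(1)
truth = rng.random(100_000) < 0.04  # ~4% treatment-requiring, as in this cohort

def reader(truth: np.ndarray, sens: float = 0.85, spec: float = 0.93) -> np.ndarray:
    """Simulate one reader's binary call given true disease status."""
    r = rng.random(truth.shape)
    return np.where(truth, r < sens, r > spec)

calls = np.stack([reader(truth) for _ in range(3)])
majority = calls.sum(axis=0) >= 2  # diagnosis given by at least 2 of 3 readers

def sens_spec(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    return ((pred & truth).sum() / truth.sum(),
            (~pred & ~truth).sum() / (~truth).sum())

print("single reader:", sens_spec(calls[0], truth))  # ~ (0.85, 0.93)
print("majority of 3:", sens_spec(majority, truth))  # ~ (0.94, 0.99)
```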

This study confirms and expands upon these earlier findings (Tables 1 and 3). By merging the image-based diagnoses of the individual readers to form the image-based reference standard, the reliability of telemedical ROP diagnosis was significantly improved: the range of individual readers’ weighted κ increased from 0.586–0.723 for inter-reader agreement to 0.769–0.920 for agreement with the image-based reference standard. Importantly, there was little-to-no improvement in the agreement of the clinical diagnosis with the image-based reference standard, with only a 2-percentage-point increase in absolute agreement (69% vs. 71%).

Overall Reference Standard Combining Clinical and Image-Based Diagnoses

While the establishment of an image-based reference standard significantly improved the reliability of telemedical diagnosis, the relative lack of improvement in agreement between the clinical and image-based diagnoses suggests the need for further refinement of potential reference standards. A reference standard that allows for direct comparison of image-based and ophthalmoscopic diagnoses would be particularly useful in clinical research, where both diagnostic approaches need to be evaluated simultaneously. We created such a reference standard by incorporating the clinical diagnosis into the image-based reference standard. The utility of this approach is demonstrated by the increased levels of agreement (Table 4) and the decreased levels of discrepancy (Table 5). Each time more information was aggregated to form a new reference standard, there were commensurate increases in the reliability and accuracy of telemedical diagnosis.

Limitations

There are several limitations of this study. (1) While the data in this study were analyzed by eye, the classification and diagnosis of ROP in the right and left eyes of the same patient is not independent. At the time of image interpretation, images of both eyes were presented to the study readers simultaneously so as to simulate ophthalmoscopy, where both eyes are examined together. This approach minimizes bias that might favor either examination and allows for the analysis of both eyes of the infant. (2) There was no standardization of image reading conditions, such as resolution, contrast, and luminance. Previous radiographic studies have demonstrated the ability of these parameters to affect diagnostic accuracy.30 (3) While two of the readers involved in this study have extensive clinical experience with ROP (RVPC, MFC), the third was a research coordinator who was trained to interpret images (SO). While the use of “non-expert” readers is common in the literature, it is unclear what effect, if any, this may have had on the study findings. Future studies examining the influence of reader background on inter-reader reliability and diagnostic accuracy may be revealing. (4) The sample size of three image readers, while consistent with the published literature, is still relatively small. As such, it is difficult to generalize to the broader community of ophthalmologists. (5) As some of the ophthalmic examinations included in this study were performed by the image readers, there is the potential for recall bias. However: (a) Study images were collected at eight sites, and readers only worked at two of those sites. (b) Study images were reviewed by readers several months after the clinical exams. (c) No clinical data beyond the retinal images and basic demographic information (e.g., birth weight and gestational age) were shown to readers. For these reasons, we suspect that the impact of recall bias is minimal.

Conclusions

Telemedicine has the potential to improve the quality, cost, and accessibility of medical care, particularly in image-oriented specialties. Previous research investigating the accuracy and reliability of telemedicine systems in ophthalmology has typically compared telemedical diagnosis to the current gold standard of a dilated fundoscopic exam by an ophthalmologist. However, defining absolute reference standards for the diagnosis of ROP has been difficult and has hindered the implementation of image-based clinical information systems. This study demonstrates that the reliability and accuracy of ROP diagnosis can be improved through the implementation of an overall reference standard that integrates diagnostic information from the clinical examination and an image-based telemedicine system. Findings from this study have important implications for groups designing and implementing image-based telemedicine systems and for future research aimed at improving the accuracy of ROP diagnosis.

Acknowledgments

Supported by grant EY19474 from the National Institutes of Health, Bethesda, MD (MFC, RVPC); by unrestricted departmental funding from Research to Prevent Blindness, New York, NY (MCR, SO, KJ, RVPC, MFC); by the St. Giles Foundation (RVPC); by a departmental grant from Research to Prevent Blindness (RVPC, KEJ); and by the iNsight Foundation (RVPC, KEJ).

References

1. Grigsby J, Sanders JH. Telemedicine: where it is and where it’s going. Ann Intern Med. 1998;129:123–7. doi: 10.7326/0003-4819-129-2-199807150-00012.
2. Bashshur RL, Reardon TF, Shannon GW. Telemedicine: a new health care delivery system. Annu Rev Public Health. 2000;21:613–37. doi: 10.1146/annurev.publhealth.21.1.613.
3. Sittig DF, Singh H. Electronic health records and national patient safety goals. N Engl J Med. 2012;367:1854–60. doi: 10.1056/NEJMsb1205420.
4. Blumenthal D, Tavenner M. The “meaningful use” regulation for electronic health records. N Engl J Med. 2010;363:501–4. doi: 10.1056/NEJMp1006114.
5. Peabody JW, Luck J, Glassman P. Comparison of vignettes, standardized patients, and chart abstraction: a prospective study of 3 methods for measuring quality. JAMA. 2000;283:1715–22. doi: 10.1001/jama.283.13.1715.
6. Veloski J, Tai S, Evans AS, Nash DB. Clinical vignette-based surveys: a tool for assessing physician practice variation. Am J Med Qual. 2005;20:151–7. doi: 10.1177/1062860605274520.
7. Chiang MF, Jiang L, Gelman R, et al. Inter-expert agreement of plus disease diagnosis in retinopathy of prematurity. Arch Ophthalmol. 2007;125:875–80. doi: 10.1001/archopht.125.7.875.
8. Moss SE, Klein R, Kessler SD, Richie KA. Comparison between ophthalmoscopy and fundus photography in determining severity of diabetic retinopathy. Ophthalmology. 1985;92:62–7. doi: 10.1016/s0161-6420(85)34082-4.
9. Kinyoun JL, Martin DC, Fujimoto WY, Leonetti DL. Ophthalmoscopy versus fundus photographs for detecting and grading diabetic retinopathy. Invest Ophthalmol Vis Sci. 1992;33:1888–93.
10. Fierson WM, American Academy of Pediatrics, American Academy of Ophthalmology, et al. Screening examination of premature infants for retinopathy of prematurity. Pediatrics. 2013;131:189–95. doi: 10.1542/peds.2012-2996.
11. Cryotherapy for ROP Cooperative Group. Multicenter trial of cryotherapy for retinopathy of prematurity: preliminary results. Arch Ophthalmol. 1988;106:471–9. doi: 10.1001/archopht.1988.01060130517027.
12. Early Treatment for ROP Cooperative Group. Revised indications for the treatment of ROP. Arch Ophthalmol. 2003;121:1684–94. doi: 10.1001/archopht.121.12.1684.
13. Committee for the Classification of Retinopathy of Prematurity. The international classification of ROP revisited. Arch Ophthalmol. 2005;123:991–9. doi: 10.1001/archopht.123.7.991.
14. Steinkuller PG, Du L, Gilbert C, et al. Childhood blindness. J AAPOS. 1999;3:26–32. doi: 10.1016/s1091-8531(99)70091-1.
15. Laws DE, Morton C, Weindling M, Clark D. Systemic effects of screening for retinopathy of prematurity. Br J Ophthalmol. 1996;80(5):425–8. doi: 10.1136/bjo.80.5.425.
16. Moral-Pumarega MT, Caserío-Carbonero S, De-la-Cruz-Bértolo J, Tejada-Palacios P, Lora-Pablos D, Pallás-Alonso CR. Pain and stress assessment after retinopathy of prematurity screening examination: indirect ophthalmoscopy versus digital retinal imaging. BMC Pediatr. 2012;12:132. doi: 10.1186/1471-2431-12-132.
17. Paul Chan RV, Williams SL, Yonekawa Y, Weissgold DJ, Lee TC, Chiang MF. Accuracy of retinopathy of prematurity diagnosis by retinal fellows. Retina. 2010;30(6):958–65. doi: 10.1097/IAE.0b013e3181c9696a.
18. Wong RK, Ventura CV, Espiritu MJ, et al. Training fellows for retinopathy of prematurity care: a Web-based survey. J AAPOS. 2012;16(2):177–81. doi: 10.1016/j.jaapos.2011.12.154.
19. Murakami Y, Jain A, Silva RA, Lad EM, Gandhi J, Moshfeghi DM. Stanford University Network for Diagnosis of Retinopathy of Prematurity (SUNDROP): 12-month experience with telemedicine screening. Br J Ophthalmol. 2008;92(11):1456–60. doi: 10.1136/bjo.2008.138867.
20. Weaver DT. Telemedicine for retinopathy of prematurity. Curr Opin Ophthalmol. 2013;24(5):425–31. doi: 10.1097/ICU.0b013e3283645b41.
21. Ekeland AG, Bowes A, Flottorp S. Effectiveness of telemedicine: a systematic review of reviews. Int J Med Inform. 2010;79(11):736–71. doi: 10.1016/j.ijmedinf.2010.08.006.
22. Scott KE, Kim DY, Wang L, et al. Telemedical diagnosis of retinopathy of prematurity: intraphysician agreement between ophthalmoscopic examination and image-based interpretation. Ophthalmology. 2008;115(7):1222–1228.e3. doi: 10.1016/j.ophtha.2007.09.006.
23. Chiang MF, Keenan JD, Starren J, et al. Accuracy and reliability of remote retinopathy of prematurity diagnosis. Arch Ophthalmol. 2006;124(3):322–7. doi: 10.1001/archopht.124.3.322.
24. Early Treatment Diabetic Retinopathy Study Research Group. Grading diabetic retinopathy from stereoscopic color fundus photographs—an extension of the modified Airlie House classification. ETDRS report number 10. Ophthalmology. 1991;98(suppl):786–806.
25. Holz FG, Jorzik J, Schutt F, et al. Agreement among ophthalmologists in evaluating fluorescein angiograms in patients with neovascular age-related macular degeneration for photodynamic therapy eligibility (FLAP-Study). Ophthalmology. 2003;110:400–5. doi: 10.1016/S0161-6420(02)01770-0.
26. Phillips CM, Burke WA, Allen MH, Stone D, Wilson JL. Reliability of telemedicine in evaluating skin tumors. Telemed J. 1998;4(1):5–9. doi: 10.1089/tmj.1.1998.4.5.
27. Hersh CP, Washko GR, Jacobson FL, et al. Interobserver variability in the determination of upper lobe-predominant emphysema. Chest. 2007;131(2):424–31. doi: 10.1378/chest.06-1040.
28. Williams SL, Wang L, Kane SA, et al. Telemedical diagnosis of retinopathy of prematurity: accuracy of expert versus non-expert graders. Br J Ophthalmol. 2010;94(3):351–6. doi: 10.1136/bjo.2009.166348.
29. Rao BK, Mateus R, Wassef C, Pellacani G. In vivo confocal microscopy in clinical practice: comparison of bedside diagnostic accuracy of a trained physician and distant diagnosis of an expert reader. J Am Acad Dermatol. 2013;69(6):e295–300. doi: 10.1016/j.jaad.2013.07.022.
30. Herron JM, Bender TM, Campbell WL, Sumkin JH, Rockette HE, Gur D. Effects of luminance and resolution on observer performance with chest radiographs. Radiology. 2000;215(1):169–74. doi: 10.1148/radiology.215.1.r00ap34169.
