JAMA Pediatr. 2024 Mar 4;178(4):401–407. doi: 10.1001/jamapediatrics.2024.0011

Development and Validation of an Automated Classifier to Diagnose Acute Otitis Media in Children

Nader Shaikh 1, Shannon J Conway 1, Jelena Kovačević 2, Filipe Condessa 3, Timothy R Shope 1, Mary Ann Haralam 1, Catherine Campese 1, Matthew C Lee 1, Tomas Larsson 4, Zafer Cavdar 4, Alejandro Hoberman 1
PMCID: PMC10985552  PMID: 38436941

This diagnostic study analyzes the validity of an artificial intelligence decision-support tool for the diagnosis of acute otitis media.

Key Points

Question

Can an artificial intelligence decision support tool be used in a primary care setting to enhance accuracy in the diagnosis of acute otitis media in young children?

Findings

In this diagnostic study using 1151 videos of the tympanic membrane from 635 children, the decision-support tool had a sensitivity of 93.8% and specificity of 93.5%.

Meaning

These findings suggest that given its high accuracy, the decision-support tool could reasonably be used in primary care or acute care settings to aid with decisions regarding treatment of acute otitis media.

Abstract

Importance

Acute otitis media (AOM) is a frequently diagnosed illness in children, yet the accuracy of diagnosis has been consistently low. Multiple neural networks have been developed to recognize the presence of AOM with limited clinical application.

Objective

To develop and internally validate an artificial intelligence decision-support tool to interpret videos of the tympanic membrane and enhance accuracy in the diagnosis of AOM.

Design, Setting, and Participants

This diagnostic study analyzed otoscopic videos of the tympanic membrane captured using a smartphone during outpatient clinic visits at 2 sites in Pennsylvania between 2018 and 2023. Eligible participants included children who presented for sick visits or wellness visits.

Exposure

Otoscopic examination.

Main Outcomes and Measures

Using the otoscopic videos that were annotated by validated otoscopists, a deep residual-recurrent neural network was trained to predict both features of the tympanic membrane and the diagnosis of AOM vs no AOM. The accuracy of this network was compared with a second network trained using a decision tree approach. A noise quality filter was also trained to prompt users that the video segment acquired may not be adequate for diagnostic purposes.

Results

Using 1151 videos from 635 children (majority younger than 3 years of age), the deep residual-recurrent neural network had almost identical diagnostic accuracy as the decision tree network. The finalized deep residual-recurrent neural network algorithm classified tympanic membrane videos into AOM vs no AOM categories with a sensitivity of 93.8% (95% CI, 92.6%-95.0%) and specificity of 93.5% (95% CI, 92.8%-94.3%), and the decision tree model had a sensitivity of 93.7% (95% CI, 92.4%-94.9%) and specificity of 93.3% (95% CI, 92.5%-94.1%). Of the tympanic membrane features outputted, bulging of the tympanic membrane most closely aligned with the predicted diagnosis; bulging was present in 230 of 230 cases (100%) in which the diagnosis was predicted to be AOM in the test set.

Conclusions and Relevance

These findings suggest that given its high accuracy, the algorithm and medical-grade application that facilitates image acquisition and quality filtering could reasonably be used in primary care or acute care settings to aid with automated diagnosis of AOM and decisions regarding treatment.

Introduction

Acute otitis media (AOM) is the second most frequently diagnosed illness in children in the US following the common cold1,2 and is the most commonly cited indication for antimicrobials.3 Despite the high prevalence of AOM, accuracy of diagnosis has been consistently 75% or lower across primary care and pediatric practitioners.4,5,6,7 Methods used to enhance accuracy and facilitate diagnosis of AOM have evolved over time. Training programs, such as the Enhancing Proficiency in Otitis Media curriculum,8 were developed to improve practitioners’ skills in diagnosing otitis media. Other clinical tools have included pneumatic otoscopy, tympanometry, smartphone-based otoscope attachments, serum biomarkers,9 and novel imaging technologies such as a light field otoscope10 and optical coherence tomography.11 Despite these efforts, diagnostic accuracy remains low and, accordingly, further innovation is warranted.

Recently, efforts toward improving diagnostic accuracy of AOM have focused on developing artificial intelligence algorithms. Multiple research teams12,13,14,15,16 have used deep learning to train neural networks to recognize the presence of AOM and, in some instances, other ear-related diagnoses. Some previously developed neural networks have limited clinical applications due to training with ideal, nonobstructed images.14,15,16,17 Other training data sets were collected in specialty clinics or intraoperatively on sedated patients, which limits generalizability.15 Although most cases of AOM are diagnosed in a primary care setting, only 1 previous model was developed using data collected from a primary care clinic.12 The Kuruvilla et al12 model is based on a relatively small sample size, which limits its clinical application. There are no studies, to our knowledge, that use a large training set collected from a primary care setting and include nonideal or partially obstructed images. Systematic reviews17,18 and editorials19 have supported the overall development of neural networks to assist with the diagnosis of otitis media as a promising avenue requiring further exploration. Our objective was to develop an artificial intelligence decision support tool to interpret videos of the tympanic membrane (TM) and enhance accuracy in the diagnosis of AOM.

Methods

This diagnostic study was approved by the University of Pittsburgh institutional review board as a minimal-risk study with a waiver of written informed consent under the Common Rule because the ear examination is a standard part of a physical examination for young children, and the device used to take the picture (an otoscope) could be used in a standard examination. Verbal permission was obtained using an institutional review board–approved script. This study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.

Application Development

To capture and annotate videos, we developed a medical-grade smartphone (iPhone [Apple]) application (app) that uses the main camera on the smartphone to capture video images of the TM. In the app, the user can adjust focus and brightness to obtain the best possible image for further processing. Voice recognition software is embedded to allow users to take video clips of the TM using voice commands (eg, capture and stop). After recording the video, the app allows cropping to facilitate further processing. Optionally, the app also allows users to record their impressions regarding the appearance of the TM and their presumptive diagnosis.

Recruitment

We developed a training library of otoscopic assessments on children presenting for well or sick visits to outpatient pediatric offices (University of Pittsburgh Medical Center Children’s Hospital Primary Care Center and Children’s Community Pediatrics) near Pittsburgh, Pennsylvania, between 2018 and 2023. Children were selected through convenience sampling; a large proportion were enrolled in AOM clinical trials and were younger than 36 months of age. A video clip was taken of the child’s TM using an endoscope (Storz Hopkins) or an otoscope (Hillrom Macroview Plus) connected to a smartphone with an appropriate adapter (Video). Saved videos were reviewed by 2 validated otoscopists (A.H. or N.S.) who assigned a final diagnosis. Disagreements were resolved by discussion. Videos for which experts could not arrive at a diagnosis (because of near complete occlusion of the TM with cerumen or because the video was completely out of focus) were excluded. Expert consensus was used as the reference standard because myringotomy and tympanocentesis are invasive and, accordingly, not practical for use in a large cohort of unselected children. Furthermore, the diagnosis of AOM cannot be accurately made based on the presence or absence of middle-ear effusion, which is also present in children with otitis media with effusion (OME). Our target sample size was 1000.20

Video. Demonstration of the Pitt CMU iTM App as an Automated Classifier to Diagnose Acute Otitis Media (AOM) in Children.


Demonstration of the Pitt CMU iTM app. As video images of the tympanic membrane of a child are acquired, the automated classifier is executed, and images and the diagnosis with confidence levels are shared with the parent and additional health care professionals and trainees. Voice recognition can be used to start and stop the video recording; longer videos can be trimmed to submit a 3-second to 5-second segment for execution of the application programming interface. The app provides an answer of AOM vs no AOM with levels of confidence.

To gauge how parents felt about the use of this decision support tool for the diagnosis of AOM, we administered a short survey to a convenience sample of parents whose child’s examination included use of the classifier between November 2022 and July 2023. The survey comprised 5 rated questions and 2 open-ended questions. Results of both the rated and open-ended questions are outlined in eTable 1 in Supplement 1.

Diagnostic Classifier

Using TM videos as inputs and the diagnosis assigned by experts as the reference standard, we trained a deep residual-recurrent neural network (DR-RNN), a combined multilayer model that outputs features of the TM (position, color, translucency, distinct erythema, and air fluid interface) and the diagnosis (AOM vs no AOM). Mobility was not outputted by the model. Of 1151 videos, 921 (80.0%) were used for training and 230 (20.0%) for testing. The model learns to predict features and diagnosis at the same time (ie, combined learning); to improve diagnostic accuracy, the model also needs to improve feature predictions. The combined learning is implemented via weighted loss functions for both features and diagnosis predictions. As shown in Figure 1, the DR-RNN also has a diagnosis layer (final decision layer) which provides supervision to predict features more accurately. The DR-RNN model outputs the probability of AOM for each video. If the probability is 50% or greater, the model diagnoses AOM. A cutoff of 50% is frequently used because, in well-trained models, most cases will have been assigned a probability of close to 100% and most controls a probability of close to 0%. To confirm our choice of cutoff, we also estimated the Youden index (difference between the true-positive rate [sensitivity] and false-positive rate) at different thresholds between 0% and 100%. To investigate whether using a different methodology would yield different results, we also developed a decision tree (DT) model. Inputs for the DT model were the TM features predicted by the DR-RNN.
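As an illustration of the threshold analysis described above (not the study's code), the following sketch scans candidate cutoffs between 0% and 100% and reports the one that maximizes the Youden index; the toy probabilities and labels are invented for the example.

```python
import numpy as np

def youden_by_threshold(y_true, y_prob, thresholds=None):
    """Youden index (sensitivity minus false-positive rate) at each cutoff."""
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)  # 0%, 1%, ..., 100%
    y_true = np.asarray(y_true, dtype=bool)
    y_prob = np.asarray(y_prob, dtype=float)
    results = []
    for t in thresholds:
        y_pred = y_prob >= t  # diagnose AOM when probability meets the cutoff
        tp = np.sum(y_pred & y_true)
        fn = np.sum(~y_pred & y_true)
        fp = np.sum(y_pred & ~y_true)
        tn = np.sum(~y_pred & ~y_true)
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        results.append((float(t), sens - (1.0 - spec)))
    return results

# Invented toy data: a well-trained model pushes cases toward 1 and controls toward 0.
probs = [0.95, 0.90, 0.85, 0.40, 0.10, 0.05, 0.20, 0.08]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
best_t, best_j = max(youden_by_threshold(labels, probs), key=lambda r: r[1])
```

When probabilities cluster near 0% and 100% as described above, a wide band of cutoffs yields nearly the same Youden index, which is why the 50% default performs about as well as the empirically optimal threshold.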

Figure 1. Computation Flow and Model Architecture.


The learning and decision-making paradigm of a deep residual-recurrent neural network (DR-RNN), shown in panel A, is similar to visual cognition of a human brain. Each layer’s responsibility can be simplified as follows: The DR-RNN is like a team of experts who specialize in recognizing different parts of an image and passes on their findings (eg, colors, edges, shapes, objects) to the next expert. If 1 of the experts is not certain, they can also ask the previous expert for help. This way, the team gets really good at recognizing all the details of an image. For the long short-term memory recurrent neural network (LSTM), the diagnosis decision is not based on information in a single image but on a sequence of images. One frame may tell more about color while another frame gives translucency information. LSTM remembers what has been seen in the previous frames and pays attention to the sequence of observations. The attention layer is like a spotlight and helps the next layers of the model focus on the most important parts of the information remembered by LSTM. It looks at clues found earlier, and it says, "Pay extra attention to this part right here!" to the fully connected neural network. The fully connected neural network is like a team of experts who take all the important clues and put them together. They talk to each other, share their thoughts, and come to a conclusion about features or diagnosis. Decision tree models (B) work by making a series of binary decisions based on the input features, ultimately leading to classification. Our decision tree model takes expert feature estimations from the DR-RNN model, computes the importance of these features for the final decision (diagnosis), and creates decision-making rules that maximize the accuracy.

Because we were using video files, we also compared various methods of extracting frames from the videos: equal-width sampling, sharpness maximization, blurriness minimization, contrast maximization and diversity maximization. Diversity maximization included visual representation of all frames extracted from a pretrained DR neural network model and a new clustering model trained for each frame collection; most central frames were selected from each cluster.
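The frame-selection strategies compared above can be sketched as follows. This is a hypothetical illustration (equal-width sampling plus a crude sharpness proxy), not the study's implementation, which used learned embeddings and clustering for diversity maximization.

```python
import numpy as np

def equal_width_indices(n_frames, k):
    """Equal-width sampling: k frame indices spaced evenly across the video."""
    return np.linspace(0, n_frames - 1, k).round().astype(int)

def sharpest_indices(frames, k):
    """Sharpness maximization with a crude proxy: variance of row-to-row
    pixel differences (a stand-in for a proper focus measure)."""
    scores = [float(np.var(np.diff(np.asarray(f, dtype=float), axis=0)))
              for f in frames]
    return np.argsort(scores)[::-1][:k]  # indices of the k sharpest frames
```

Blurriness minimization and contrast maximization follow the same pattern with a different per-frame score.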

Image Quality Filter

To warn users that the video they had captured may be of suboptimal quality, we trained and tested an image quality classifier. We used 344 420 frames from 754 noncropped recordings. We sampled 34 442 of 344 420 frames (10.0%; 1 frame from each consecutive 10 frames, reducing 30 frames per second to 3 frames per second at equal intervals), extracted multidimensional data21 from the frames, reduced these data to 2 dimensions,22 and allocated frames into 100 clusters using a k-means algorithm. Based on 36 frames in each cluster, experts annotated each cluster as accept or reject (Figure 2); this was used to train a frame-level, binary quality filter (ie, a deep learning image classification model) with an output of accept or reject. We used 27 554 of 34 442 frames (80.0%) for training and 6888 of 34 442 frames (20.0%) for testing and ensured that training and test data sets did not contain frames from the same recording. Videos in which 70% or more of the frames are rejected generate a prompt that encourages users to obtain a new recording.
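Two steps of the quality-filter pipeline lend themselves to a short sketch: the 1-in-10 frame subsampling and the 70% rejection rule that triggers a re-recording prompt. This is an illustrative reconstruction, not the study's code; the per-frame accept/reject labels would come from the trained frame-level classifier.

```python
import numpy as np

def subsample_frames(frames, step=10):
    """Keep 1 of every `step` consecutive frames (30 fps -> 3 fps at step=10)."""
    return frames[::step]

def should_reprompt(frame_labels, reject_threshold=0.70):
    """Prompt for a new recording when the rejected fraction meets the threshold.
    `frame_labels`: 1 = reject, 0 = accept, one label per sampled frame."""
    labels = np.asarray(frame_labels, dtype=float)
    return bool(labels.mean() >= reject_threshold)
```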

Figure 2. Development of the Quality Filter.


The figure shows the visual landscape with tympanic membrane frames clustered based on similarity (A), an example of 1 cluster evaluated by experts (B), and final landscape with expert determinations (C). The X and Y axes in panel C represent the transformed coordinates of the high-dimensional image features.

Statistical Analysis

For each model (DR-RNN and DT), we compared the diagnosis outputted by the model with the diagnosis assigned by experts and used this to calculate sensitivity, specificity, positive predictive value, and negative predictive value. We calculated confidence intervals by training the same model architecture on 20 different train and test splits and evaluating each model on its own test data set. For the DR-RNN model, we also generated a receiver operating characteristic curve by plotting true-positive and false-positive rates at different probability thresholds on the test set of the original run. Because the DT model is not probabilistic, a receiver operating characteristic curve was not generated. Evaluation metrics (sensitivity, specificity, Youden index, and area under the curve) and confidence intervals for all repeated experiments via cross-validation were computed in Python using the SciPy version 1.10 (Python Software Foundation) and NumPy version 1.23 (Python Software Foundation) libraries.
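The evaluation described above reduces to confusion-matrix arithmetic. The sketch below is illustrative, not the study's code; the normal-approximation interval over repeated splits is an assumption, since the paper does not specify its exact CI formula.

```python
import numpy as np

def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def normal_ci(values, z=1.96):
    """Approximate 95% CI for a metric measured across repeated train/test
    splits (normal approximation; an assumed, not reported, method)."""
    values = np.asarray(values, dtype=float)
    half = z * values.std(ddof=1) / np.sqrt(len(values))
    return float(values.mean() - half), float(values.mean() + half)
```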

Results

Of 1561 videos, 410 (26.2%) were excluded, mostly due to occlusion of the tympanic membrane with cerumen, leaving 1151 videos from 635 children (majority younger than 3 years of age) included in the analyses. These videos were obtained during a relatively extended period and with a variety of devices. Experts labeled 305 of 1151 videos (26.5%) as AOM and 846 of 1151 (73.5%) as not AOM (ie, OME or normal middle-ear status) (Figure 3). We collected 60 parent questionnaires. Results from the parent questionnaires were favorable; 48 of 60 parents (80.0%) wanted the doctor to reuse the classifier in future visits. Most comments on open-ended interviews were positive (eTable 1 in Supplement 1).

Figure 3. Flow Diagram of Combined, Training, and Test Study Populations.


Diagnostic Classifier

DR-RNN and DT models had almost identical accuracy (Table). The DR-RNN model had a sensitivity of 93.8% (95% CI, 92.6%-95.0%) and specificity of 93.5% (95% CI, 92.8%-94.3%), and the DT model had a sensitivity of 93.7% (95% CI, 92.4%-94.9%) and specificity of 93.3% (95% CI, 92.5%-94.1%). The area under the curve for the DR-RNN model was 0.973 (95% CI, 0.967-0.978) (eFigure 1 in Supplement 1). Of the various strategies for frame selection, diversity maximization provided the most accurate results. Removal of low-resolution recordings did not appear to improve the models (eTable 2 in Supplement 1). Recordings of less than 2 seconds were harder to categorize compared with longer recordings. Mean prediction time was 4.6 seconds (95% CI, 3.2-6.0 seconds). eFigure 2 in Supplement 1 shows the Youden index for each threshold value on the x-axis (0%-100%). The maximum Youden index (0.880) occurred at a threshold of 42%, which was almost equivalent to the Youden index (0.876) at the 50% threshold. Of the TM features outputted, bulging of the TM most closely aligned with the predicted diagnosis; bulging was present in 230 of 230 cases (100.0%) in which the diagnosis was predicted to be AOM.

Table. Accuracy of the 2 Networks in the Diagnosis of Acute Otitis Media (Test Data Set).

Network | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive predictive value, % (95% CI) | Negative predictive value, % (95% CI)
Deep residual-recurrent neural network^a | 93.8 (92.6-95.0) | 93.5 (92.8-94.3) | 84.5 (83.1-85.9) | 97.6 (97.2-98.1)
Decision tree model^b | 93.7 (92.4-94.9) | 93.3 (92.5-94.1) | 84.0 (82.7-85.4) | 97.6 (97.1-98.0)

a. A fully connected neural layer that predicts tympanic membrane features (eg, position, color, translucency, distinct erythema, and air fluid interface) and diagnosis (acute otitis media vs no acute otitis media).

b. A model that takes tympanic membrane feature estimations (eg, position, color, translucency, distinct erythema, and air fluid interface) from the deep residual-recurrent neural network and makes a diagnosis estimation.

Image Quality Filter

Sensitivity of the quality filter was 92.3% (95% CI, 88.3%-96.2%). Specificity of the quality filter was 78.3% (95% CI, 71.6%-84.8%).

Discussion

In this diagnostic study that used short videos of the TM as input and used diagnosis provided by validated otoscopists as the reference standard, we developed an artificial intelligence algorithm that classifies video images of the TM into AOM vs no AOM (ie, OME and normal middle-ear status) categories (eFigure 3 in Supplement 1) with an overall sensitivity of 93.8% and specificity of 93.5%. The algorithm exhibited higher accuracy than pediatricians,23 primary care physicians24,25 and advanced practice clinicians26 and, accordingly, could reasonably be used in these settings to aid with decisions regarding treatment. Given its high specificity, it could be used at the time of triage (by a trained nonphysician) to preclude repeated examinations that may occur in some teaching environments. Additionally, video images of the TM could be used to enhance teaching of otoscopic examination, discussion with colleagues, documentation in the electronic health record, and discussion with parents.

Our study differs from previous artificial intelligence studies in several respects. First, we enrolled a representative cross-sectional sample of young children presenting for primary care visits. Many studies to date have been conducted in children referred to specialty clinics who could presumably have more severe and easier to diagnose findings.14,19,27,28,29,30 Other studies have included only clear-cut cases and controls,15,31 which can lead to overestimation of diagnostic accuracy.32 We did not exclude cases that were difficult to diagnose or cases in which some cerumen was present.13,15,16 Second, our algorithm focused on diagnosing a treatable condition (AOM) compared with nontreatable conditions (OME and normal middle-ear status). Algorithms that try to diagnose every finding (eg, perforation of the TM, cholesteatoma, chronic serous otitis media) are less applicable to a primary care setting.33,34,35 Moreover, because many of these conditions are visually distinct, inclusion of such entities could lead to overly optimistic accuracy values. We enrolled a large number of children, which protects against model overfitting that could have been an issue in smaller studies.34,36 Of the studies we reviewed, only Kuruvilla et al12 conducted their study in a primary care setting, focused on diagnosis of AOM, and avoided a case-control design. Comert et al37 and Alhudhaif et al38 developed tools with high accuracy rates of 99% and 98%, respectively, but both studies used the same data set (a publicly accessible set of TM images collected from volunteers at a hospital), which may limit their use in primary care.

Diagnostic accuracy of AOM varies from 30% to 84%.4,7,12,23,24,26,39 This wide variation results from differences in study methodology. Pediatric clinicians’ diagnostic certainty of AOM decreases substantially in children younger than 2 years of age,25 which is the age group with the highest incidence of AOM and which constituted most children in this study. Generally, accuracy improves with recent specific training8,39 and level of experience with children, incrementally increasing among general practitioners,24 nurse practitioners,26 pediatric residents,5,39 pediatricians,23 and otolaryngologists.4 However, pediatricians shown 9 perfectly focused and clear 30-second video images of various TM states (no typical TM images) achieved an accuracy of only 50%.4 Pediatricians also consistently overdiagnose no effusion and OME as AOM, accounting for an unnecessary 12% to 27% of antibiotics prescribed for children diagnosed with AOM.4,7,12

Our algorithm appears to be independent of hardware used to obtain the video. We used 2 different instruments (endoscope and otoscope), and various versions of smartphones to develop the algorithm. In the internal validation data set, our model worked regardless of the device that was used. Strengths of this study include development of a quality filter that warns users that the image they obtained may be suboptimal, the cross-sectional design, use of images obtained during primary care, and our use of validated otoscopists. A unique strength of our algorithm resides in its ability to predict TM features used by expert otoscopists in stringently diagnosing AOM.40,41,42 As in clinical practice, the feature that was most closely aligned with the predicted diagnosis was bulging of the TM. DR-RNN or DT preference in the final diagnosis layer had little effect because the most challenging problem was to predict TM features (ie, position, color, translucency, distinct erythema, and air fluid interface) accurately, a challenge mostly solved by the initial complex component (common for both DR-RNN and DT final layers) of the model. Although integration of the quality filter into the mobile app did not improve accuracy, by filtering out noisy frames, it enabled rejection of recordings with more than 70% noise, prompting the user to retake or recrop recordings to enhance input quality.

The automated classifier developed here promises to improve the accuracy of AOM diagnosis in primary care. TM images could be obtained by trained medical personnel (medical assistants, registered nurses, research assistants) and uploaded into the electronic health record before the clinician (physician or advanced practice clinician) examines the child. The built-in quality filter can alert the clinician that cerumen removal or retaking the image may be necessary. Improved diagnosis can help reduce inappropriate use of antimicrobials for this frequently diagnosed condition.

Limitations

This study has limitations. These include our use of convenience sampling, lack of external validation at another center, and lack of data collection regarding participant demographics and reason for visit (sick vs well).

Conclusion

In this diagnostic study, our findings suggest that the artificial intelligence decision support tool we developed and validated can enhance accuracy of diagnosing AOM in young children. In addition, with appropriate training, this tool could be used by a wide range of medical personnel to enhance teaching of otoscopic examination, discussion with colleagues, documentation in the electronic health record, and discussion with parents. Finally, more accurate diagnosis of AOM may help reduce unnecessary prescriptions of antimicrobials in young children.

Supplement 1.

eTable 1. Parent Satisfaction Survey

eFigure 1. Receiver Operating Characteristic (ROC) Curve Generated by Calculating True Positive Results and False Positive Results at Different Probability Thresholds for the Deep Residual-Recurrent Neural Network

eTable 2. Accuracy According to Resolution of Images (Test Dataset)

eFigure 2. Youden Index According to Probability Threshold Value for Binary Decision (AOM vs. No AOM)

eFigure 3. Screen Capture of Pitt-CMU-iTM App

Supplement 2.

Data Sharing Statement

References

1. Schappert SM, Rechtsteiner EA. Ambulatory medical care utilization estimates for 2007. Vital Health Stat 13. 2011;(169):1-38.
2. Young DE, Ten Cate WJ, Ahmad Z, Morton RP. The accuracy of otomicroscopy for the diagnosis of paediatric middle ear effusions. Int J Pediatr Otorhinolaryngol. 2009;73(6):825-828. doi: 10.1016/j.ijporl.2009.02.012
3. Finkelstein JA, Metlay JP, Davis RL, Rifas-Shiman SL, Dowell SF, Platt R. Antimicrobial use in defined populations of infants and young children. Arch Pediatr Adolesc Med. 2000;154(4):395-400. doi: 10.1001/archpedi.154.4.395
4. Pichichero ME, Poole MD. Assessing diagnostic accuracy and tympanocentesis skills in the management of otitis media. Arch Pediatr Adolesc Med. 2001;155(10):1137-1142. doi: 10.1001/archpedi.155.10.1137
5. Pichichero ME. Diagnostic accuracy, tympanocentesis training performance, and antibiotic selection by pediatric residents in management of otitis media. Pediatrics. 2002;110(6):1064-1070. doi: 10.1542/peds.110.6.1064
6. Jones WS, Kaleida PH. How helpful is pneumatic otoscopy in improving diagnostic accuracy? Pediatrics. 2003;112(3 Pt 1):510-513. doi: 10.1542/peds.112.3.510
7. Rosenfeld RM. Diagnostic certainty for acute otitis media. Int J Pediatr Otorhinolaryngol. 2002;64(2):89-95. doi: 10.1016/S0165-5876(02)00073-3
8. Kaleida PH, Ploof DL, Kurs-Lasky M, et al. Mastering diagnostic skills: enhancing proficiency in otitis media, a model for diagnostic skills training. Pediatrics. 2009;124(4):e714-e720. doi: 10.1542/peds.2008-2838
9. Pichichero ME, Morris MC, Almudevar A. Three innate cytokine biomarkers predict presence of acute otitis media and relevant otopathogens. Biomark Appl. 2018;2(1).
10. Bedard N, Shope T, Hoberman A, et al. Light field otoscope design for 3D in vivo imaging of the middle ear. Biomed Opt Express. 2016;8(1):260-272. doi: 10.1364/BOE.8.000260
11. Preciado D, Nolan RM, Joshi R, et al. Otitis media middle ear effusion identification and characterization using an optical coherence tomography otoscope. Otolaryngol Head Neck Surg. 2020;162(3):367-374. doi: 10.1177/0194599819900762
12. Kuruvilla A, Shaikh N, Hoberman A, Kovačević J. Automated diagnosis of otitis media: vocabulary and grammar. Int J Biomed Imaging. 2013;2013:327515. doi: 10.1155/2013/327515
13. Tran TT, Fang TY, Pham VT, Lin C, Wang PC, Lo MT. Development of an automatic diagnostic algorithm for pediatric otitis media. Otol Neurotol. 2018;39(8):1060-1065. doi: 10.1097/MAO.0000000000001897
14. Shie CK, Chang HT, Fan FC, Chen CJ, Fang TY, Wang PC. A hybrid feature-based segmentation and classification system for the computer aided self-diagnosis of otitis media. Annu Int Conf IEEE Eng Med Biol Soc. 2014;2014:4655-4658.
15. Crowson MG, Hartnick CJ, Diercks GR, et al. Machine learning for accurate intraoperative pediatric middle ear effusion diagnosis. Pediatrics. 2021;147(4):e2020034546. doi: 10.1542/peds.2020-034546
16. Wu Z, Lin Z, Li L, et al. Deep learning for classification of pediatric otitis media. Laryngoscope. 2021;131(7):E2344-E2351. doi: 10.1002/lary.29302
17. Myburgh HC, van Zijl WH, Swanepoel D, Hellström S, Laurent C. Otitis media diagnosis for developing countries using tympanic membrane image-analysis. EBioMedicine. 2016;5:156-160. doi: 10.1016/j.ebiom.2016.02.017
18. Byun H, Yu S, Oh J, et al. An assistive role of a machine learning network in diagnosis of middle ear diseases. J Clin Med. 2021;10(15):3198. doi: 10.3390/jcm10153198
19. Pichichero ME. Can machine learning and AI replace otoscopy for diagnosis of otitis media? Pediatrics. 2021;147(4):e2020049584. doi: 10.1542/peds.2020-049584
20. Alwosheel A, van Cranenburgh S, Chorus CG. Is your dataset big enough? Sample size requirements when using artificial neural networks for discrete choice analysis. J Choice Model. 2018;28:167-182. doi: 10.1016/j.jocm.2018.07.002
21. Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. Poster presented at: International Conference on Learning Representations; May 2, 2016; San Juan, Puerto Rico. doi: 10.48550/arXiv.1511.06434
22. Van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008;9(11):2579-2605.
23. Pichichero ME. Diagnostic accuracy of otitis media and tympanocentesis skills assessment among pediatricians. Eur J Clin Microbiol Infect Dis. 2003;22(9):519-524. doi: 10.1007/s10096-003-0981-8
24. Blomgren K, Pitkäranta A. Is it possible to diagnose acute otitis media accurately in primary health care? Fam Pract. 2003;20(5):524-527. doi: 10.1093/fampra/cmg505
25. Jensen PM, Lous J. Criteria, performance and diagnostic problems in diagnosing acute otitis media. Fam Pract. 1999;16(3):262-268. doi: 10.1093/fampra/16.3.262
26. Sorrento A, Pichichero ME. Assessing diagnostic accuracy and tympanocentesis skills by nurse practitioners in management of otitis media. J Am Acad Nurse Pract. 2001;13(11):524-529. doi: 10.1111/j.1745-7599.2001.tb00019.x
27. Cao Z, Chen F, Grais EM, et al. Machine learning in diagnosing middle ear disorders using tympanic membrane images: a meta-analysis. Laryngoscope. 2023;133(4):732-741. doi: 10.1002/lary.30291
28. Habib AR, Kajbafzadeh M, Hasan Z, et al. Artificial intelligence to classify ear disease from otoscopy: a systematic review and meta-analysis. Clin Otolaryngol. 2022;47(3):401-413. doi: 10.1111/coa.13925
29. Livingstone D, Talai AS, Chau J, Forkert ND. Building an otoscopic screening prototype tool using deep learning. J Otolaryngol Head Neck Surg. 2019;48(1):66. doi: 10.1186/s40463-019-0389-9
30. Sundgaard JV, Harte J, Bray P, et al. Deep metric learning for otitis media classification. Med Image Anal. 2021;71:102034. doi: 10.1016/j.media.2021.102034
31. Cai Y, Yu JG, Chen Y, et al. Investigating the use of a two-stage attention-aware convolutional neural network for the automated diagnosis of otitis media from tympanic membrane images: a prediction model development and validation study. BMJ Open. 2021;11(1):e041139. doi: 10.1136/bmjopen-2020-041139
32. Crowson MG, Bates DW, Suresh K, Cohen MS, Hartnick CJ. “Human vs machine” validation of a deep learning algorithm for pediatric middle ear infection diagnosis. Otolaryngol Head Neck Surg. 2023;169(1):41-46. doi: 10.1177/01945998221119156
33. Livingstone D, Chau J. Otoscopic diagnosis using computer vision: an automated machine learning approach. Laryngoscope. 2020;130(6):1408-1413. doi: 10.1002/lary.28292
34. Zeng X, Jiang Z, Luo W, et al. Efficient and accurate identification of ear diseases using an ensemble deep learning model. Sci Rep. 2021;11(1):10839. doi: 10.1038/s41598-021-90345-w
  • 35.Başaran E, Cömert Z, Çelik Y. Convolutional neural network approach for automatic tympanic membrane detection and classification. Biomed Signal Process Control. 2020;56:101734. doi: 10.1016/j.bspc.2019.101734 [DOI] [Google Scholar]
  • 36.Huang YK, Huang CP. A depth-first search algorithm based otoscope application for real-time otitis media image interpretation. Paper presented at: 18th International Conference on Parallel and Distributed Computing, Applications and Technologies; December 18, 2017; Taipei, Taiwan. Accessed January 30, 2024. [Google Scholar]
  • 37.Cömert Z. Fusing fine-tuned deep features for recognizing different tympanic membranes. Biocybern Biomed Eng. 2020;40(1):40-51. doi: 10.1016/j.bbe.2019.11.001
  • 38.Alhudhaif A, Cömert Z, Polat K. Otitis media detection using tympanic membrane images with a novel multi-class machine learning algorithm. PeerJ Comput Sci. 2021;7:e405. doi: 10.7717/peerj-cs.405
  • 39.Paul CR, Keeley MG, Rebella GS, Frohna JG. Teaching pediatric otoscopy skills to pediatric and emergency medicine residents: a cross-institutional study. Acad Pediatr. 2018;18(6):692-697. doi: 10.1016/j.acap.2018.02.009
  • 40.Paradise JL. On classifying otitis media as suppurative or nonsuppurative, with a suggested clinical schema. J Pediatr. 1987;111(6 Pt 1):948-951. doi: 10.1016/S0022-3476(87)80226-3
  • 41.Shaikh N, Hoberman A, Rockette HE, Kurs-Lasky M. Development of an algorithm for the diagnosis of otitis media. Acad Pediatr. 2012;12(3):214-218. doi: 10.1016/j.acap.2012.01.007
  • 42.Lieberthal AS, Carroll AE, Chonmaitree T, et al. The diagnosis and management of acute otitis media. Pediatrics. 2013;131(3):e964-e999. Published correction appears in Pediatrics. 2014;133(2):346. doi: 10.1542/peds.2012-3488

Associated Data


Supplementary Materials

Supplement 1.

eTable 1. Parent Satisfaction Survey

eFigure 1. Receiver Operating Characteristic (ROC) Curve Generated by Calculating True Positive Results and False Positive Results at Different Probability Thresholds for the Deep Residual-Recurrent Neural Network

eTable 2. Accuracy According to Resolution of Images (Test Dataset)

eFigure 2. Youden Index According to Probability Threshold Value for Binary Decision (AOM vs. No AOM)

eFigure 3. Screen Capture of Pitt-CMU-iTM App

Supplement 2.

Data Sharing Statement


Articles from JAMA Pediatrics are provided here courtesy of American Medical Association
