Diagnostics. 2026 Mar 1;16(5):731. doi: 10.3390/diagnostics16050731

Deep Learning-Based Dental Caries Diagnosis: A Modality-Stratified Systematic Review and Meta-Analysis of Faster R-CNN and Mask R-CNN

Quang Tuan Lam 1,2,3, Minh Huu Nhat Le 2,4, Fang-Yu Fan 5, Nguyen Quoc Khanh Le 2,6,7,*, I-Ta Lee 1,*
Editor: Francesco Inchingolo
PMCID: PMC12985255  PMID: 41828006

Abstract

Background: Deep convolutional neural networks (DCNNs) are increasingly used in computer-aided dental diagnostics. However, the relative diagnostic performance of commonly applied architectures, particularly Faster R-CNN and Mask R-CNN, has not been systematically synthesized across imaging modalities. This systematic review and meta-analysis compared the diagnostic accuracy of Faster R-CNN and Mask R-CNN for dental caries detection using radiographic and photographic images. Methods: PubMed (MEDLINE), EMBASE, Web of Science, and Scopus were systematically searched for studies published up to 15 June 2025. Studies applying Faster R-CNN and/or Mask R-CNN to dental caries detection were included. Binary diagnostic data were extracted, and pooled sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were estimated using a bivariate random-effects model. Study quality was assessed with QUADAS-AI, and radiomics-based radiographic studies were additionally evaluated using the Radiomics Quality Score (RQS). The protocol was registered in PROSPERO (CRD420251074443). Results: Seventeen studies met the inclusion criteria. Across all imaging modalities, Mask R-CNN showed significantly higher pooled sensitivity (85.6% vs. 71.7%, p = 0.0244), specificity (94.2% vs. 81.4%, p = 0.00089), and AUC (0.95 vs. 0.84, p = 0.0053) than Faster R-CNN. In radiographic images, Mask R-CNN consistently outperformed Faster R-CNN in sensitivity (86.3% vs. 67.2%, p = 0.0497), specificity (96.5% vs. 85.0%, p = 0.00105), and AUC (0.97 vs. 0.86, p = 0.0067). In photographic images, Mask R-CNN achieved a higher AUC (0.91 vs. 0.83, p = 0.048), whereas differences in pooled sensitivity (83.5% vs. 77.3%, p = 0.435) and specificity (86.0% vs. 75.1%, p = 0.156) were not statistically significant. 
Conclusions: Faster R-CNN and Mask R-CNN both show potential for dental caries detection, but current evidence is limited by substantial heterogeneity, predominantly retrospective designs, and variability in imaging and labeling. Across the included studies, Mask R-CNN showed higher pooled performance estimates than Faster R-CNN, with the clearest differences in radiographic applications; however, this comparison is indirect and should be considered suggestive rather than definitive given study-level heterogeneity and uncertainty in the reference standard in a sizable proportion of studies. Prospective, multi-center studies with standardized imaging protocols, rigorous annotation, and independent external validation are required to support reliable clinical implementation.

Keywords: dental caries, deep learning, diagnostic accuracy, faster R-CNN, mask R-CNN

1. Introduction

Growing evidence indicates that oral health is closely linked not only to oral diseases but also to overall health. More specifically, untreated oral diseases have been consistently associated with a range of systemic disorders [1]. According to the Global Burden of Disease Study (GBD), only limited population-level improvement has been observed in oral health outcomes over the past three decades, with relatively small changes in oral condition estimates between 1990 and 2021 [2]. Consequently, oral diseases and the delivery of dental care are increasingly recognized as important public health priorities.

Dental caries is a biofilm-mediated, sugar-driven, multifactorial, dynamic disease resulting in the phasic demineralization and remineralization of dental hard tissues [3]. A 2025 GBD-based analysis estimated that, globally, 2.24 billion individuals had permanent tooth caries and 520 million children had primary tooth caries [4]. Early and accurate detection of carious lesions is essential for informing clinical decision-making and optimizing treatment outcomes [5]. Dental caries is currently diagnosed through visual–tactile examination combined with radiographic assessment, a workflow that demands substantial expertise and is time-consuming. Consequently, diagnostic variability and human error can lead to missed or misclassified lesions, underscoring the need for emerging technologies to augment conventional workflows and improve early detection sensitivity [6].

Artificial intelligence (AI) is a domain of applied computer science that employs computational systems to emulate human behaviors such as intelligent reasoning, critical thinking, and decision-making [7]. In recent years, AI has attracted considerable scientific interest for its capacity to transform fields that traditionally depend on manual labor. Medicine is a prominent example, with accumulating evidence supporting AI’s utility as a diagnostic aid [8]. Innovative, technology-enabled diagnostic and therapeutic methods can shorten clinical workflows and anticipate potential adverse events [9]. Deep learning (DL)—a subfield of AI characterized by multilayer neural networks and automatic feature learning—has largely superseded earlier AI techniques and has become the dominant paradigm in healthcare. DL-based developments can function as clinical decision support and provide second-opinion capabilities in routine practice [8]. Convolutional neural networks (CNNs), a class of DL models, have achieved state-of-the-art performance [10] in dental radiologic tasks, including assessment of periodontal bone loss [11], detection of carious lesions [12], segmentation of apical pathology [13], and identification of dental plaque [14].

Faster R-CNN (Figure 1), introduced by Ren et al. at Microsoft Research, is a two-stage object detection framework that unifies region proposal generation and object classification within a single convolutional architecture. A Region Proposal Network (RPN) efficiently generates candidate regions while sharing convolutional features with the detection head, thereby reducing computational cost [15]. By eliminating external proposal methods such as selective search, this design substantially improves speed, particularly on GPUs. Faster R-CNN attains high accuracy and performs well on small objects due to its high-resolution, region-focused processing [16]. It has become a widely adopted baseline in object detection benchmarks such as PASCAL VOC and MS COCO, and its flexibility facilitates adaptation to diverse backbones and application domains, including medical imaging and remote sensing [17]. In dental applications using bitewing radiographs and intraoral photographs, Faster R-CNN is commonly initialized with ResNet-50 or ResNet-101 backbones pretrained on ImageNet and subsequently fine-tuned on domain-specific datasets [18]. The RPN is frequently customized by reducing anchor sizes to 8–32 pixels and adopting aspect ratios of 1:1, 1:2, and 2:1 to better reflect the small-scale morphology of interproximal caries, which differs from natural-image corpora such as MS COCO; high-IoU anchors are then passed to a two-stage classifier and regressor to produce Regions of Interest (ROIs) [19]. Faster R-CNN has seen growing use in dental imaging for tooth detection, numbering, and lesion identification. For example, Mima et al. used zone-specific detectors on panoramic radiographs to detect and classify all 32 permanent teeth, achieving 98.9% sensitivity and 91.7% accuracy while reducing false positives [20]. Similarly, Sari et al. identified dens invaginatus, a rare dental anomaly, on panoramic images with 0.91 precision and 0.90 sensitivity, outperforming YOLOv8 in diagnostic accuracy [21]. These findings underscore Faster R-CNN’s suitability for detecting small, irregular dental structures, a key advantage for clinical diagnostics, and support its integration into computer-aided diagnosis to enhance charting consistency, early pathology detection, and treatment planning.
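As a rough illustration of the anchor customization described above, the following Python sketch builds anchors at sizes of 8, 16, and 32 pixels with aspect ratios of 1:2, 1:1, and 2:1 and ranks them against a lesion box by intersection over union (IoU). The function names, the area-preserving anchor convention, and the 12 × 16 pixel lesion box are assumptions made for illustration; they are not taken from the cited studies.

```python
from itertools import product
from math import sqrt


def make_anchors(cx, cy, sizes=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centred on (cx, cy).

    Assumed convention: each anchor has area size**2 and
    ratio = height / width.
    """
    anchors = []
    for size, ratio in product(sizes, ratios):
        w = size / sqrt(ratio)
        h = size * sqrt(ratio)
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical 12 x 16 px lesion box centred on the anchor location.
lesion = (94.0, 92.0, 106.0, 108.0)
anchors = make_anchors(100.0, 100.0)
best = max(anchors, key=lambda box: iou(box, lesion))
print(best, iou(best, lesion))
```

With anchors this small, a lesion only a dozen pixels wide can still overlap an anchor strongly, which is the motivation the text gives for shrinking the default scales.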

Figure 1. Faster R-CNN architecture diagram.

Mask R-CNN (Figure 2) extends Faster R-CNN by adding a third, fully convolutional mask head that predicts a binary segmentation map for each Region of Interest (ROI), while RoIAlign preserves sub-pixel alignment; the network therefore outputs bounding boxes, class labels, and per-instance masks in a single pass. Conceptually, it integrates Faster R-CNN’s two-stage detector (RPN plus classifier/regressor) with the encoder–decoder principles of U-Net-style fully convolutional networks, enabling first localization and then high-resolution decoding—an approach well suited to the small, irregular structures common in dentistry [22]. Fatima et al. reduced RPN anchor scales to 8 pixels with aspect ratio multipliers of 0.5:1:2, allowing Mask R-CNN to capture tiny cavitated regions in periapical radiographs; their implementation, based on a MobileNet-v2 + FPN (or ResNet-50) encoder fine-tuned end-to-end for 50 epochs, demonstrated that lightweight backbones can deliver speed without sacrificing accuracy [23]. On CBCT volumes, Ma et al. transferred COCO-pretrained ResNet-50 weights into Mask R-CNN and, after 200–300 epochs, increased mean average precision from 53% (trained from scratch) to 81% while more than halving training time [24]. Clinically, Özbay et al. applied Mask R-CNN to 1050 periapical radiographs to detect fractured endodontic instruments, reporting mAP of 98.8% and an F1 score of 96.97%, surpassing YOLOv8 and matching specialist performance [25]. Kanwal et al. used Mask R-CNN on 1500 panoramic radiographs for tooth segmentation, achieving 98% accuracy, an F1 score of 88%, and 99% specificity, outperforming ten unsupervised baselines [26]. In 3-D terms, Cui et al. incorporated a cascaded 3-D Mask R-CNN into an AI system that segmented individual teeth and alveolar bone on 4938 CBCT scans with mean Dice coefficients of 91.5–94% and a throughput roughly 500-fold faster than manual annotation [27]. 
Collectively, these studies demonstrate Mask R-CNN’s versatility for dental detection, numbering, pathology mapping, and surgical planning, reinforcing its central role in contemporary computer-aided diagnostics.
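The sub-pixel alignment attributed to RoIAlign above rests on bilinear interpolation of the feature map at fractional coordinates, rather than rounding coordinates to the nearest cell. The Python sketch below shows only that interpolation step on a toy 2 × 2 feature map; the function name and the toy values are hypothetical, and a full RoIAlign layer would additionally average several such samples per output bin.

```python
def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (list of rows) at fractional (y, x)."""
    height, width = len(feature), len(feature[0])
    # Clamp the sampling point to the valid range of the map.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


toy_map = [[0.0, 1.0],
           [2.0, 3.0]]
print(bilinear_sample(toy_map, 0.5, 0.5))  # centre of the four cells
```

Rounding the same point to an integer cell would return one of the four corner values instead, which is the quantization error that RoIAlign is designed to avoid.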

Figure 2. The Mask R-CNN framework for instance segmentation.

Despite the close conceptual kinship between Faster R-CNN and Mask R-CNN, the literature lacks a systematic, head-to-head synthesis of their performance specifically for dental caries detection. Crucially, existing AI syntheses have not conducted a modality-stratified, head-to-head evaluation of Faster R-CNN versus Mask R-CNN (radiographs vs. photographs), which remains essential for translating model choice into practice. Most reports evaluate a single architecture or use heterogeneous metrics and datasets, preventing direct comparison and offering little guidance on when the instance-segmentation head in Mask R-CNN yields a tangible advantage over Faster R-CNN’s two-stage detector. This evidentiary gap is clinically consequential: selecting the right algorithm for a given imaging workflow can influence early lesion detection, minimally invasive treatment planning, and resource allocation in both screening and diagnostic settings.

Accordingly, this systematic review and meta-analysis (i) compares the diagnostic accuracy of Faster R-CNN and Mask R-CNN for caries detection, and (ii) performs modality-stratified analyses (radiographs versus photographs) and context-specific analyses (general screening vs. comprehensive diagnostic) of these CNN models. Our overarching clinical objective is to determine which algorithm performs better for each imaging modality and under which circumstances, thereby providing actionable recommendations for deployment in real-world dental care.

2. Methods

This systematic review and meta-analysis report was prepared in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy Studies (PRISMA-DTA) guidelines [28]. The protocol was prospectively registered in the PROSPERO International Prospective Register of Systematic Reviews (CRD420251074443).

2.1. Literature Search

A comprehensive search was performed in PubMed (MEDLINE), Embase, Web of Science, and Scopus, with the final update on 15 June 2025. Search terms combined concepts for dental caries (e.g., “oral”, “dental”, “tooth disease”, “dental caries”, “dental cavities”) with deep-learning/detection keywords (e.g., “convolutional neural networks”, “Faster R-CNN”, “Mask R-CNN”). The full database-specific strategies are provided in Supplementary File S1. Searches were limited to English-language publications, with no restrictions on publication year, participant age, or country.

2.2. Inclusion Criteria

Studies were eligible if they:

  1. Reported original diagnostic accuracy evaluations of Faster R-CNN and/or Mask R-CNN for dental caries detection using dental images (radiographic and/or photographic) compared against a reference standard (or clearly defined alternative verification).

  2. Used cross-sectional or retrospective designs in which model outputs were evaluated against an established caries status.

  3. Reported sufficient performance information to allow extraction of sensitivity/specificity and/or derivation of 2 × 2 data (TP/FP/TN/FN).

  4. Included comparative evaluations against other automated approaches or human assessment, where applicable.
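Criterion 3 refers to deriving 2 × 2 data when only summary rates are reported. One plausible way to do this, sketched below in Python, is to multiply the reported sensitivity and specificity by the number of diseased and non-diseased units and round to whole counts. The function name is hypothetical, and the review does not describe the exact procedure used, so this is an assumption rather than the authors' method. The example inputs are the rounded rates and class totals implied by the Chen et al. row of Table 2 (613 carious and 1474 non-carious units).

```python
def counts_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Back-calculate a 2 x 2 table from reported rates and class totals."""
    tp = round(sensitivity * n_positive)
    tn = round(specificity * n_negative)
    return {
        "TP": tp,
        "FN": n_positive - tp,
        "TN": tn,
        "FP": n_negative - tn,
    }


table = counts_from_rates(0.724, 0.925, 613, 1474)
print(table)
```

Because published rates are rounded, back-calculated counts can differ from the true counts by a unit or two, so they are best cross-checked against any reported accuracy or precision.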

2.3. Exclusion Criteria

We excluded:

  5. Non-original reports (e.g., case reports, commentaries, editorials, letters, narrative/systematic reviews) and conference abstracts, as well as animal experiments.

  6. Studies lacking enough information to extract or compute diagnostic performance measures.

  7. Studies not addressing caries detection (or not applying Faster R-CNN/Mask R-CNN as the index test of interest).

2.4. Study Selection and Data Extraction

All records were imported into EndNote® v21 (Clarivate, Philadelphia, PA, USA) for de-duplication. Two authors (Quang Tuan Lam and Minh Huu Nhat Le) independently conducted the search and study selection. Titles and abstracts were screened to identify potentially eligible articles, followed by full-text assessment against the inclusion criteria by the same two authors. Reference lists of included studies were manually searched to identify additional records. Quang Tuan Lam extracted data using a standardized Microsoft Excel form, which was pilot-tested on five included studies and approved by four of the authors; the extracted data were then cross-checked by another author. Discrepancies were resolved through discussion or, when needed, consultation with co-authors (Nguyen Quoc Khanh Le, I-Ta Lee, and Minh Huu Nhat Le). Inter-reviewer agreement for title/abstract screening and full-text selection was assessed using Cohen’s kappa statistic (κ = 0.85); disagreements were resolved through discussion and consensus.
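For readers unfamiliar with the agreement statistic, the Python sketch below computes Cohen's kappa for two reviewers making include/exclude decisions. The function name and the screening counts are hypothetical; the review reports only the resulting value (κ = 0.85), not the underlying agreement table.

```python
def cohens_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two raters and a binary include/exclude decision."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    # Agreement expected by chance from each rater's marginal rates.
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    if expected == 1.0:
        return 1.0  # both raters gave one identical label to every record
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 100 records screened by reviewers A and B.
kappa = cohens_kappa(both_include=20, only_a=5, only_b=5, both_exclude=70)
print(round(kappa, 3))
```

Kappa discounts the agreement that two reviewers would reach by chance, which matters in screening because most records are excluded by both.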

For general characteristics of the included studies, we extracted authors/year, machine learning algorithms, number of annotators/qualifications, consensus approach, blinding, and caries definition. For performance metrics, we recorded study characteristics (first author, country, year); the evaluated algorithms (Faster R-CNN, Mask R-CNN); imaging modality (e.g., bitewing radiograph, periapical radiograph, intraoral photograph); dataset size and any train/validation/test splitting; reported performance metrics (accuracy, precision/PPV, sensitivity/recall, F1-score, specificity, AUC); and, where possible, 2 × 2 contingency data (TP, FP, TN, FN) for caries detection. TP/FP/TN/FN were extracted at the operating threshold applied in each study as reported by the authors. If a paper reported multiple thresholds, we prioritized the threshold used for the primary (main) analysis. Because this meta-analysis used a bivariate random-effects model and the HSROC framework, differences in thresholds across studies are inherently accommodated in the pooled estimates and HSROC visualization.
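The extracted metrics all follow from the four cells of the 2 × 2 table. The Python sketch below applies the standard definitions to the counts reported for Chen et al. in Table 2 (TP = 444, TN = 1363, FP = 111, FN = 169); the function name is hypothetical and is used here only to make the relationships explicit.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Standard binary diagnostic metrics from a 2 x 2 contingency table."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),     # positive predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


chen = diagnostic_metrics(tp=444, tn=1363, fp=111, fn=169)
for name, value in chen.items():
    print(f"{name}: {value:.3f}")
```

Rounded to the precision used in Table 2, these values agree with the accuracy, sensitivity, specificity, precision, and F1 score listed for that study.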

2.5. Methodological Quality and Risk of Bias Assessment

Two authors (Quang Tuan Lam and Minh Huu Nhat Le) independently evaluated the methodological quality of each included study. Details regarding the imaging types, image resolution, equipment, camera settings, and standardization processes for each paper are provided in Supplementary Files S2 and S3. The risk of bias and applicability concerns were assessed using the Quality Assessment of Diagnostic Accuracy Studies—Artificial Intelligence (QUADAS-AI) tool [29], built on the QUADAS-2 framework [30] for AI-centered diagnostic accuracy studies. Two reviewers independently evaluated each included study across the QUADAS-AI domains—patient selection, index test (AI model), reference standard, and flow and timing—using signaling questions to guide judgments on risk of bias and applicability. For patient selection, we examined sampling strategy, clarity of inclusion/exclusion criteria, and whether image-quality-related exclusions could introduce selection/spectrum bias. For the index test, we assessed AI-specific risks such as transparency of train/validation/test splits, potential data leakage (e.g., patient-level overlap), prespecification of operating thresholds, and reporting of preprocessing/augmentation. For the reference standard, we evaluated the appropriateness and description of ground-truth labeling (criteria, annotator credentials, and consensus procedures) and blinding where relevant. For flow/timing, we assessed completeness of case inclusion, consistency of the reference standard, handling of missing/uninterpretable data, and any timing issues that could plausibly bias results. Each domain was rated as low, high, or unclear risk of bias, and applicability was rated as low, high, or unclear concern. Disagreements were resolved by discussion and, if needed, adjudicated by a third reviewer (Nguyen Quoc Khanh Le). The full results of Quality Assessment of Diagnostic Accuracy Studies-AI (QUADAS-AI) are provided in Supplementary Table S1.

For studies utilizing radiographic images and radiomics workflows, we additionally assessed methodological quality and reporting rigor using the Radiomics Quality Score (RQS) proposed by Lambin et al. [31]. RQS consists of 16 items covering key aspects of radiomics study design, analysis, and validation. Specifically, we evaluated: (i) image acquisition and protocol reporting (including whether acquisition parameters were sufficiently described and whether robustness/reproducibility was assessed across scanners or settings); (ii) segmentation procedures and feature robustness (e.g., use of multiple segmentations/readers, assessment of inter-/intra-observer variability, and stability of extracted features); (iii) feature engineering and statistical control (appropriate feature reduction/selection, correction for multiple testing, and avoidance of overfitting); (iv) model development and validation (clear train/validation/test strategy, internal validation such as cross-validation/bootstrapping, and external validation on independent cohorts where available); (v) clinical utility and comparison to relevant baselines (e.g., comparison to clinician performance or standard clinical models, and decision-curve analysis where applicable); and (vi) transparency and evidence level (e.g., availability of code/data or sufficient methodological detail for reproducibility, and higher-level evidence such as prospective or multi-center design). Each item was scored according to the published RQS rubric to yield a total score ranging from 0 to 36, with higher scores indicating better methodological rigor and reproducibility. RQS scoring was performed independently by the same two reviewers, and discrepancies were resolved by consensus or, when necessary, adjudication by the third reviewer (Nguyen Quoc Khanh Le). The final assessment of Radiomics Quality Score (RQS) can be found in Supplementary Table S2.

2.6. Data Synthesis and Analysis

The primary aim of this systematic review and meta-analysis was to quantify the overall diagnostic accuracy of Faster R-CNN and Mask R-CNN for dental caries detection using standard performance measures, and the secondary aim was to examine whether study- or image-level characteristics influenced model performance. Because Faster R-CNN and Mask R-CNN were evaluated across independent study cohorts rather than within-study head-to-head designs, all between-model comparisons were treated as indirect. We synthesized extracted diagnostic accuracy data using a bivariate random-effects (Reitsma) model to generate pooled estimates of sensitivity and specificity with corresponding 95% confidence intervals (CIs), accounting for both within-study sampling error and between-study variability. To illustrate the sensitivity–specificity trade-off, we constructed a summary ROC (sROC) curve within a hierarchical diagnostic accuracy framework. Between-study heterogeneity was summarized by reporting τ² for logit-transformed sensitivity and specificity; for interpretability, we additionally fitted univariate random-effects models for sensitivity and specificity and reported I² for each outcome. To explore potential sources of heterogeneity and adjust for major study-level confounders, prespecified meta-regression and stratified analyses were conducted within the same hierarchical framework to evaluate the impact of covariates on pooled sensitivity, specificity, and AUC. Covariates included imaging modality (radiographic vs. photographic), dataset size, presence of external validation, and study design (retrospective vs. prospective). Subgroup analyses were additionally performed by imaging modality. All analyses were conducted in R (version 4.4.2; R Foundation for Statistical Computing, Vienna, Austria), and statistical significance was defined as p < 0.05.
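To make the univariate random-effects step more concrete, the Python sketch below pools proportions on the logit scale and reports τ² and I². It is not the authors' code: the analysis was run in R with a bivariate Reitsma model, and the source does not name the estimator used for the univariate models, so the DerSimonian–Laird estimator and the function names here are assumptions. The example input uses the sensitivity counts (TP, TP + FN) of three Faster R-CNN studies from Table 2.

```python
from math import exp, log


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate sampling variance."""
    non_events = total - events
    if events == 0 or non_events == 0:
        # Continuity correction, applied only when a cell is empty.
        events, non_events = events + 0.5, non_events + 0.5
    return log(events / non_events), 1 / events + 1 / non_events


def pool_random_effects(pairs):
    """DerSimonian-Laird pooling of (events, total) pairs.

    Returns (pooled proportion, tau2 on the logit scale, I2 in percent).
    """
    ys, vs = zip(*(logit_and_variance(e, n) for e, n in pairs))
    w = [1 / v for v in vs]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    df = len(ys) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in vs]
    pooled_logit = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return 1 / (1 + exp(-pooled_logit)), tau2, i2


# (TP, TP + FN) for Chen et al., Estai et al., and Kunt et al. (Table 2).
sensitivity_data = [(444, 613), (680, 764), (53, 69)]
pooled, tau2, i2 = pool_random_effects(sensitivity_data)
print(f"pooled sensitivity={pooled:.3f}, tau2={tau2:.3f}, I2={i2:.1f}%")
```

A bivariate model differs from this sketch in that it pools logit sensitivity and logit specificity jointly and estimates their correlation, so its pooled values need not match separate univariate fits.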

3. Results

3.1. Study Selection

Faster R-CNN: The search identified 616 records. After removing duplicates, 354 unique records remained for title/abstract screening. We assessed 23 full texts for eligibility, of which 11 studies [32,33,34,35,36,37,38,39,40,41,42] met the inclusion criteria and were included in the qualitative synthesis; all 11 provided sufficient data for meta-analysis (Figure 3).

Figure 3. PRISMA-DTA flow diagram summarizing the identification, screening, eligibility assessment, and inclusion of studies evaluating Faster R-CNN for caries detection.

Mask R-CNN: A similar screening and eligibility process was applied. Six studies [43,44,45,46,47,48] satisfied the inclusion criteria and were included in both the qualitative synthesis and the meta-analysis (Figure 4).

Figure 4. PRISMA-DTA flow diagram summarizing the identification, screening, eligibility assessment, and inclusion of studies evaluating Mask R-CNN for caries detection.

3.2. Study Characteristics

Sixteen of the 17 included studies were published from 2022 onward and originated from researchers in nearly 20 countries, reflecting increasing global interest in these DL models. Characteristics of included studies such as authors/year, machine learning algorithms, number of annotators/qualifications, consensus approach, blinding and caries definition can be found in Table 1. Across studies, a total of 41,384 tooth images were analyzed (after augmentation), with sample sizes ranging from 90 to 12,750 images per study. Faster R-CNN was used in 11/17 studies (64.7%), while Mask R-CNN was evaluated in the remaining 6/17. Radiographic images were used in 10/17 studies (58.8%) and intraoral photographs in 8/17 (47.1%), with Rashid et al. [48] including both modalities. Data augmentation was reported in all studies to expand and/or balance training data. Only 6 studies (35.3%) incorporated an explicit segmentation step, consistent with the Mask R-CNN workflow. Overall, the included studies aimed to develop AI-based tools to improve the accuracy and efficiency of caries detection in clinical settings.

Table 1.

Characteristics of Included Studies.

Authors/Year | Machine Learning Algorithms | Number of Annotators/Qualifications | Consensus Approach | Blinding | Caries Definition
N. Cauás et al., 2023 [32] | YOLOv5 and Faster R-CNN | Manually annotated | Not reported | Not reported | Not reported
X. T. Chen et al., 2022 [33] | Faster R-CNN | 2 endodontists and 1 radiologist | Independent labeling; disagreements resolved by discussion | Not reported | Radiolucent area between adjacent contacts with specified radiographic appearance
M. Estai et al., 2022 [34] | Faster R-CNN and Inception-ResNet-v2 | 3 qualified dentists | 2 dentists draw rectangles; consensus required. Discrepancies resolved by 3rd dentist | 2 dentists were blinded to each other during initial review | Detection per WHO standard + ICDAS
S. Fan et al., 2023 [35] | YOLOv5, Faster R-CNN, RetinaNet | Manually annotated | Not reported | Not reported | Artificial demineralization model on 30 bovine teeth with “score1, score2, score3” groups (3 phases)
A. Juyal et al., 2023 [36] | YOLOv3, Faster R-CNN | Not reported | Not reported | Not reported | “Dental cavities/caries” in camera images; no operational definition or grading stated
L. Kunt et al., 2023 [37] | YOLOv5, Faster R-CNN, RetinaNet, EfficientDet | 1 specialist in cariology | No consensus (single annotator) | Not reported | 7257 “carious lesions” annotated on bitewing radiographs using minimal bounding boxes
Mahaveerakannan R et al., 2024 [38] | YOLOv3, Faster R-CNN, RetinaNet, SSD | 1 qualified dentist | No consensus (single annotator) | Not reported | ICCMS-based classes: 0 sound (NSC), 1 visually non-cavitated (VNC), 2 cavitated (moderate), 3 late cavitated (extensive)
E. Y. Park et al., 2022 [39] | U-Net, ResNet-18, Faster R-CNN | 1 board-certified dentist | No consensus (single annotator) | Not reported | Ground truth per ICDAS; only distinct caries ICDAS codes 4–6 annotated as “caries cases”
M. T. G. Thanh et al., 2022 [40] | Faster R-CNN, YOLOv3, RetinaNet, SSD | 1 experienced dentist | No consensus (single annotator) | Not reported | ICCMS-based classes: 0 sound (NSC), 1 VNC, 2 cavitated with localized enamel breakdown or dentin shadow, 3 late cavitated with visible dentin
J. Velusamy et al., 2024 [41] | Faster R-CNN, YOLOv3 | Not reported | Not reported | Not reported | “Three distinct caries level” (no explicit operational definition provided)
Yuang Zhu et al., 2022 [42] | Faster R-CNN | 1 doctor | No consensus (single annotator) | Not reported | Not reported (caries labeled as “caries positions/locations”; no case definition stated)
E. T. Chaves 2024 [43] | Mask R-CNN | 2 graduate students; 3 PhD students; 2 caries experts | Final annotations in joint sessions; disagreements resolved by consensus | Patient-level de-identification: radiographs de-identified; database contained only images | “Primary caries lesions” on tooth surfaces and “secondary caries around restorations” on bitewings
Yanbin Guo 2024 [44] | Mask R-CNN | 3 experienced dentists | Not reported | Not reported | “Non-normal teeth” includes caries; dataset categories include “caries” and other abnormalities (residual root, retainer, filling, etc.), later merged into “abnormal” for training
K. Moutselos 2019 [45] | Mask R-CNN | 2 ICDAS dental experts | Explicit consensus noted for superpixel-based lesion segmentation settings (two ICDAS experts) | Not reported | Occlusal caries detection and classification across full ICDAS 7-class scale (0–6) on intraoral images
N. van Nistelrooij 2024 [46] | Mask R-CNN | 2 PhD students, 3 senior dentists, 1 caries expert | Discrepancies resolved via joint discussions | Not reported | Severity score based on lesion key points: 0.0 = lesion not reaching dentine, 1.0 = lesion reaching pulp (staging of secondary caries)
Lizheng Liu 2020 [47] | Mask R-CNN | Not reported | Not reported | Not reported | Caries operationalized as “decayed tooth” among 7 disease classes
Umer Rashid 2022 [48] | Mask R-CNN | 2 qualified dentists | Comparing 2 dentists’ labeling correctness | Not reported | Defines dental cavity as “destruction of a tooth’s tissue”

3.3. Methodological Quality and Risk of Bias of Included Studies

Overall, the methodological quality of the included studies was fairly good. According to Figure 5, the mean RQS for studies involving radiographic images was 74.4% (268/360) of the maximum score (range: 55.6–91.7%), indicating generally well-structured radiomics study design and reporting while still leaving room for improvement. The lowest RQS was reported in Velusamy et al. (2024) (20/36; 55.6%) [41], whereas the highest score was observed in E. Chaves et al. (2024) (33/36; 91.7%) [43]. This variability in RQS suggests meaningful differences in study rigor that may partly contribute to the observed between-study heterogeneity. The most common shortcomings were the lack of multiple segmentations, omission of calibration statistics, absence of prospective design or robust validation, and failure to conduct cost-effectiveness analyses or provide open data. In contrast, nearly all studies reported basic discrimination metrics (e.g., AUC or accuracy), compared model outputs against an appropriate ground-truth “gold standard,” and discussed potential clinical applications.
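The percentages quoted above are raw RQS totals divided by the 36-point maximum. The short Python sketch below reproduces the three figures given in the text; the function name is hypothetical, and reading the 360-point denominator as ten radiographic studies scored out of 36 each is an inference from the reported totals.

```python
RQS_MAX = 36  # maximum total score under the published RQS rubric


def rqs_percent(score, max_score=RQS_MAX):
    """Express an RQS total as a percentage of the maximum, to one decimal."""
    return round(100 * score / max_score, 1)


lowest = rqs_percent(20)            # Velusamy et al. [41]
highest = rqs_percent(33)           # Chaves et al. [43]
mean_all = rqs_percent(268, 360)    # summed score over summed maxima
print(lowest, highest, mean_all)
```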

Figure 5. Quality assessment of included studies using the Radiomics Quality Score (RQS) and QUADAS-AI. (a) Mean RQS across included radiomics studies, shown as the percentage of the maximum score by domain. (b) Risk of bias and applicability concerns summary: distribution of reviewers’ QUADAS-AI judgments for each item, presented as percentages across all included studies (n = 17). QUADAS-AI = Quality Assessment of Diagnostic Accuracy Studies–Artificial Intelligence.

Using QUADAS-AI, we found generally low concern for applicability across studies, although several risks of bias were identified. For patient selection, four studies were judged to have a high risk of bias, most commonly because images or patients were not selected randomly (e.g., retrospective sampling or exclusion of certain cases), which may introduce selection bias. The index test domain was rated as low risk of bias in all studies, as caries detection was performed using an objective algorithm without knowledge of the reference standard outcomes. Likewise, the flow and timing domain was low risk in most studies (13/17), since the index test and reference standard were typically applied to the same images (making the time interval not applicable) and/or all cases received reference standard verification of caries status. By contrast, the reference standard domain was unclear in approximately half of the studies (47.1%), largely because papers did not consistently report whether reference assessments (e.g., clinical examination or expert radiographic reading) were blinded to the algorithm results or how rigorously the reference diagnosis was established.

Regarding applicability, patient selection raised the greatest concern, with 76.5% of studies classified as high/unclear concern. A similar pattern was observed for the reference standard, where 9 of 17 studies were assessed as low concern. In contrast, the index test domain showed the most favorable applicability profile, with no studies rated as having high concern.

3.4. Quantitative Analysis (Meta-Analysis)

We included 11 studies [32,33,34,35,36,37,38,39,40,41,42] and 6 studies [43,44,45,46,47,48] in our meta-analysis evaluating the diagnostic accuracy of the Faster R-CNN and Mask R-CNN algorithms, respectively, for detecting dental caries. Performance metrics of Faster R-CNN papers and Mask R-CNN papers can be found in Table 2 and Table 3, respectively. Additionally, we categorized the included studies based on the type of dental images used (radiographic or photographic) to examine potential differences in performance between the two algorithms across these imaging modalities. Importantly, all between-algorithm contrasts are indirect because the Faster R-CNN and Mask R-CNN estimates come from different sets of studies, and no head-to-head evaluations on the same datasets under identical annotation protocols and operating thresholds were available. Between-study heterogeneity was consistently high (I² > 90% for most outcomes), with larger between-study variance for specificity in Mask R-CNN overall (τ² = 2.577), while modality-specific analyses reduced, but did not eliminate, heterogeneity (radiograph τ² for specificity = 0.885; photograph τ² for specificity = 0.856). The results of the meta-analysis are presented in Figures 6–11 and summarized in Table 4.

Table 2.

Performance metrics of Faster R-CNN papers.

| Authors/Year | Country | Imaging Modality | N | TP | TN | FP | FN | Accuracy | Sensitivity (95% CI) | Specificity (95% CI) | Precision | F1 Score | ROC AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N. Cauás et al., 2023 [32] | Brazil | dental radiographs | 294 | 42 | 206 | 15 | 31 | 0.84 | 0.575 (0.50–0.65) | 0.93 (0.87–0.97) | 0.73 | 0.65 | 0.752 |
| X. T. Chen et al., 2022 [33] | China | bitewing | 2087 | 444 | 1363 | 111 | 169 | 0.87 | 0.724 (0.69–0.76) | 0.925 (0.91–0.94) | 0.8 | 0.76 | 0.824 |
| M. Estai et al., 2022 [34] | Australia | bitewing | 2468 | 680 | 1465 | 239 | 84 | 0.87 | 0.89 (0.87–0.91) | 0.86 (0.84–0.88) | 0.74 | 0.81 | 0.875 |
| S. Fan et al., 2023 [35] | China | optical coherence tomography | 100 | 31 | 53 | 4 | 12 | 0.85 | 0.72 (0.65–0.79) | 0.93 (0.87–0.97) | 0.88 | 0.79 | 0.82 |
| A. Juyal et al., 2023 [36] | India | intraoral photos | 300 | 113 | 113 | 32 | 42 | 0.75 | 0.73 (0.65–0.81) | 0.78 (0.70–0.86) | 0.78 | 0.76 | 0.76 |
| L. Kunt et al., 2023 [37] | Czech Republic | bitewing | 598 | 53 | 497 | 32 | 16 | 0.92 | 0.77 (0.72–0.82) | 0.94 (0.88–0.97) | 0.62 | 0.69 | 0.855 |
| Mahaveerakannan R et al., 2024 [38] | India | intraoral photos | 5630 | 785 | 4271 | 273 | 301 | 0.90 | 0.723 (0.65–0.79) | 0.94 (0.88–0.98) | 0.742 | 0.73 | 0.831 |
| E. Y. Park et al., 2022 [39] | South Korea | intraoral photographic images | 2348 | 546 | 1428 | 182 | 192 | 0.84 | 0.74 (0.68–0.80) | 0.887 (0.83–0.93) | 0.75 | 0.745 | 0.814 |
| M. T. G. Thanh et al., 2022 [40] | Vietnam, Japan | intraoral photos | 750 | 136 | 519 | 40 | 55 | 0.87 | 0.712 (0.64–0.78) | 0.929 (0.87–0.97) | 0.773 | 0.741 | 0.82 |
| J. Velusamy et al., 2024 [41] | India | panoramic | 600 | 317 | 201 | 47 | 35 | 0.86 | 0.90 (0.85–0.95) | 0.81 (0.75–0.87) | 0.87 | 0.88 | 0.86 |
| Yuang Zhu et al., 2022 [42] | China | periapical | 800 | 203 | 482 | 36 | 79 | 0.86 | 0.72 (0.65–0.79) | 0.93 (0.87–0.97) | 0.85 | 0.78 | 0.83 |
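
As an illustration of how the derived columns in Table 2 relate to the 2 × 2 counts, the following minimal Python sketch recomputes accuracy, sensitivity, specificity, precision, and F1 from TP, TN, FP, and FN. The class and function names are hypothetical and this is not the authors' extraction code; the counts are those listed for Cauás et al. [32].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Cells of a 2 x 2 diagnostic table (hypothetical helper type)."""
    tp: int
    tn: int
    fp: int
    fn: int


def diagnostic_metrics(c: Counts) -> dict:
    """Standard accuracy measures derived from the four cells."""
    total = c.tp + c.tn + c.fp + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),      # true positive rate
        "specificity": c.tn / (c.tn + c.fp),      # true negative rate
        "precision": c.tp / (c.tp + c.fp),        # positive predictive value
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Counts as listed in Table 2 for Cauás et al. [32].
cauas = Counts(tp=42, tn=206, fp=15, fn=31)
metrics = diagnostic_metrics(cauas)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values can be compared against the tabulated row; small discrepancies in the last digit may reflect rounding or truncation in the source studies.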

Table 3.

Performance metrics of Mask R-CNN papers.

| Authors/Year | Country | Imaging Modality | N | TP | TN | FP | FN | Accuracy | Sensitivity (95% CI) | Specificity (95% CI) | Precision | F1 Score | ROC AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E. T. Chaves 2024 (Primary caries) [43] | Netherlands, Brazil, Germany | bitewing | 12,750 | 193 | 12,404 | 86 | 67 | 0.988 | 0.742 (0.68–0.82) | 0.99 (0.98–0.99) | 0.687 | 0.712 | 0.849 |
| E. T. Chaves 2024 (Secondary caries) [43] | Netherlands, Brazil, Germany | bitewing | 12,750 | 377 | 12,068 | 160 | 145 | 0.976 | 0.722 (0.622–0.782) | 0.987 (0.98–0.99) | 0.702 | 0.713 | 0.804 |
| Yanbin Guo 2024 [44] | China, USA | panoramic | 1008 | 733 | 150 | 25 | 100 | 0.876 | 0.879 (0.86–0.90) | 0.857 (0.81–0.91) | 0.967 | 0.921 | 0.868 |
| K. Moutselos 2019 [45] | Greece | Intraoral images | 4909 | 2539 | 846 | 339 | 1185 | 0.691 | 0.682 (0.67–0.70) | 0.714 (0.69–0.74) | 0.882 | 0.767 | 0.698 |
| N. van Nistelrooij 2024 (all lesions) [46] | Netherlands, Germany, Denmark | bitewing | 2612 | 282 | 2147 | 78 | 105 | 0.932 | 0.729 (0.69–0.78) | 0.966 (0.96–0.97) | 0.783 | 0.755 | 0.851 |
| N. van Nistelrooij 2024 (dentine lesions) [46] | Netherlands, Germany, Denmark | bitewing | 2612 | 246 | 2240 | 79 | 47 | 0.951 | 0.839 (0.80–0.88) | 0.964 (0.96–0.97) | 0.757 | 0.796 | 0.902 |
| Lizheng Liu 2020 [47] | China, Sweden | Intraoral images | 2556 | 634 | 1660 | 178 | 84 | 0.898 | 0.883 (0.85–0.92) | 0.903 (0.83–0.97) | 0.780 | 0.828 | 0.893 |
| Umer Rashid 2022 (photographic images) [48] | Pakistan, United Kingdom | photographic images | 90 | 71 | 10 | 1 | 8 | 0.90 | 0.90 (0.85–0.95) | 0.91 (0.84–0.94) | 0.986 | 0.940 | 0.903 |
| Umer Rashid 2022 (radiographs) [48] | Pakistan, United Kingdom | X-ray radiographs | 936 | 881 | 41 | 5 | 9 | 0.985 | 0.989 (0.97–1.00) | 0.891 (0.84–0.95) | 0.994 | 0.992 | 0.946 |
| Umer Rashid 2022 (mixed set) [48] | Pakistan, United Kingdom | photographic images, X-ray radiographs | 210 | 139 | 47 | 4 | 20 | 0.886 | 0.874 (0.81–0.92) | 0.922 (0.87–0.98) | 0.972 | 0.920 | 0.898 |

Figure 6.

Forest plots of studies evaluating Faster R-CNN (k = 11) [32,33,34,35,36,37,38,39,40,41,42] and Mask R-CNN (k = 6) [43,44,45,46,47,48] for dental caries detection.

Figure 7.

HSROC curves comparing Faster R-CNN and Mask R-CNN for dental caries detection across all included studies.

Figure 8.

Forest plots of radiographic-image studies evaluating Faster R-CNN (k = 6) [32,33,34,37,41,42] and Mask R-CNN (k = 4) [43,44,46,48] for dental caries detection.

Figure 9.

HSROC curves comparing Faster R-CNN and Mask R-CNN for dental caries detection using radiographic images.

Figure 10.

Forest plots of photographic image studies evaluating Faster R-CNN (k = 5) [35,36,38,39,40] and Mask R-CNN (k = 3) [45,47,48] for dental caries detection.

Figure 11.

HSROC curves comparing Faster R-CNN and Mask R-CNN for dental caries detection using photographic images.

Table 4.

Meta-Regression of Faster R-CNN and Mask R-CNN in detecting dental caries.

| Covariate | Subgroup | Sensitivity (95% CI) | p-Value | Specificity (95% CI) | p-Value | AUC | p-Value | I² (Sens/Spec) | τ² (Sens/Spec) |
|---|---|---|---|---|---|---|---|---|---|
| All images | Faster R-CNN | 71.7% (62%; 79.7%) | 0.0244 * | 81.4% (74.8%; 86.6%) | 0.00089 *** | 0.84 | 0.0053 ** | 96.8%/97.9% | 0.297/0.422 |
| | Mask R-CNN | 85.6% (75.5%; 92%) | | 94.2% (87.9%; 97.3%) | | 0.95 | | 97.5%/99.5% | 0.484/2.577 |
| Radiographic images | Faster R-CNN | 67.2% (51.3%; 79.9%) | 0.0497 * | 85% (78.5%; 89.8%) | 0.00105 ** | 0.86 | 0.0067 ** | 97.2%/95.7% | 0.405/0.327 |
| | Mask R-CNN | 86.3% (68.4%; 94.9%) | | 96.5% (90.8%; 98.7%) | | 0.97 | | 96.9%/98.2% | 0.547/0.885 |
| Photographic images | Faster R-CNN | 77.3% (66.9%; 85.2%) | 0.435 | 75.1% (64.6%; 83.3%) | 0.156 | 0.83 | 0.048 * | 96.8%/94.6% | 0.437/0.128 |
| | Mask R-CNN | 83.5% (67.3%; 92.5%) | | 86% (70.5%; 94.1%) | | 0.91 | | 98.3%/98.8% | 0.792/0.856 |

AUC, area under the curve. * p < 0.05. ** p < 0.01. *** p < 0.001.
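
To make the heterogeneity statistics in Table 4 more concrete, the sketch below shows one simple way that τ² and I² can be obtained for sensitivity alone. It is a deliberately simplified, univariate DerSimonian-Laird pooling on the logit scale, not the bivariate random-effects model used in this review, so its output is not expected to match Table 4. The function names are hypothetical, the 0.5 continuity correction applied to every study is an assumption, and the (TP, FN) pairs are copied from Table 2.

```python
import math

# (TP, FN) pairs for the 11 Faster R-CNN studies, copied from Table 2.
FASTER_RCNN_TP_FN = [
    (42, 31), (444, 169), (680, 84), (31, 12), (113, 42), (53, 16),
    (785, 301), (546, 192), (136, 55), (317, 35), (203, 79),
]


def logit_with_variance(events, non_events):
    """Empirical logit and its approximate variance (0.5 added to each cell)."""
    a = events + 0.5
    b = non_events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def dersimonian_laird(effects, variances):
    """Univariate random-effects pooling; returns (pooled, tau2, i2)."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q measures dispersion around the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)                 # between-study variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0  # share of variation beyond chance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, tau2, i2


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


pairs = [logit_with_variance(tp, fn) for tp, fn in FASTER_RCNN_TP_FN]
effects = [p[0] for p in pairs]
variances = [p[1] for p in pairs]
pooled_logit, tau2, i2 = dersimonian_laird(effects, variances)
pooled_sensitivity = inv_logit(pooled_logit)

print(f"pooled sensitivity: {pooled_sensitivity:.3f}")
print(f"tau^2 (logit scale): {tau2:.3f}, I^2: {100 * i2:.1f}%")
```

A bivariate model additionally estimates the correlation between sensitivity and specificity across studies, which this sketch ignores.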

3.5. Evaluation of the Diagnostic Accuracy of Faster R-CNN and Mask R-CNN Algorithms

We generated forest plots (Figure 6) and hierarchical sROC curves (Figure 7) to summarize study-level and pooled diagnostic performance for both algorithms. Pooled estimates are reported with 95% CIs. Given the high heterogeneity across studies, pooled estimates and indirect contrasts should be interpreted cautiously, since observed differences may reflect study-level factors such as dataset composition, lesion definitions, annotation procedures, and threshold selection rather than algorithm effects alone.

Sensitivity: Faster R-CNN achieved a pooled sensitivity of 71.7% (62.0–79.7%), with study-level estimates ranging from 47.0% to 89.7%. Mask R-CNN yielded a pooled sensitivity of 85.6% (75.5–92.0%), with a range of 68.2% to 99.0%. The indirect Z-test for proportions suggested a statistical difference between pooled sensitivities (p = 0.0244); however, this reflects an indirect contrast across non-overlapping study sets and should be interpreted in the context of very high between-study heterogeneity.

Specificity: Faster R-CNN showed a pooled specificity of 81.4% (74.8–86.6%), with study-level estimates ranging from 65.0% to 93.3%. Mask R-CNN achieved a pooled specificity of 94.2% (87.9–97.3%), with a range of 71.4% to 99.3%. The indirect Z-test suggested a statistical difference between pooled specificities (p = 0.00089), but this finding remains subject to the same limitations of indirect comparison and substantial heterogeneity.

AUC: Mask R-CNN had a higher pooled AUC than Faster R-CNN (0.95 vs. 0.84). DeLong’s test suggested a statistical difference (p = 0.0053). This AUC contrast is also indirect and should be interpreted cautiously given the high residual heterogeneity.
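
For readers who wish to see the mechanics of an indirect contrast between two pooled proportions, the following sketch performs a Z-test on the logit scale, recovering each standard error from the reported 95% CI. This is one common formulation and an assumption on our part: the exact computation behind the reported p-values is not reproduced here, so the result need not equal p = 0.0244. Function names are hypothetical; the inputs are the pooled sensitivities and CIs from Table 4.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def se_from_ci(lower, upper, level=0.95):
    """Standard error on the logit scale implied by a symmetric logit CI."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (logit(upper) - logit(lower)) / (2.0 * z)


def indirect_z_test(p1, ci1, p2, ci2):
    """Two-sided Z-test for the difference of two independent pooled logits."""
    se1 = se_from_ci(*ci1)
    se2 = se_from_ci(*ci2)
    z = (logit(p2) - logit(p1)) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Pooled sensitivity, all images (Table 4): Faster R-CNN vs. Mask R-CNN.
z_stat, p_value = indirect_z_test(
    0.717, (0.620, 0.797),
    0.856, (0.755, 0.920),
)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.4f}")
```

The test treats the two pooled estimates as independent, which holds here because the study sets do not overlap, but it does not adjust for differences in case mix or thresholds between those sets.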

3.6. Comparison of Diagnostic Accuracy of Faster R-CNN and Mask R-CNN Algorithms with Radiographic Images

In radiographic studies, Faster R-CNN had a pooled sensitivity of 67.2% (95% CI: 51.3–79.9%), with study-level estimates ranging from 47.0% to 89.7%. Mask R-CNN had a pooled sensitivity of 86.3% (95% CI: 68.4–94.9%), with estimates ranging from 72.2% to 99.0%. The indirect Z-test suggested a statistical difference (p = 0.0497). This contrast remains indirect, since the two algorithms were assessed in different study sets, and heterogeneity persisted despite modality stratification.

For specificity, Faster R-CNN yielded a pooled estimate of 85.0% (95% CI: 78.5–89.8%) with a range of 71.3% to 92.5%. Mask R-CNN yielded 96.5% (95% CI: 90.8–98.7%) with a range of 85.7% to 99.3%. The indirect Z-test suggested a statistical difference (p = 0.00105), and interpretation should remain cautious because residual heterogeneity was not eliminated.

For overall discrimination, Mask R-CNN showed a higher pooled AUC than Faster R-CNN (0.97 vs. 0.86). DeLong’s test suggested a statistical difference (p = 0.0067). This result reflects an indirect contrast and may be influenced by study-level differences across radiographic datasets and evaluation thresholds.

3.7. Comparison of Diagnostic Accuracy of Faster R-CNN and Mask R-CNN Algorithms with Photographic Images

In photographic studies, Faster R-CNN showed a pooled sensitivity of 77.3% (95% CI: 66.9–85.2%), with study-level estimates ranging from 66.7% to 87.0%. Mask R-CNN showed a pooled sensitivity of 83.5% (95% CI: 67.3–92.5%), with estimates ranging from 68.2% to 89.9%. The indirect Z-test did not indicate a statistical difference (p = 0.435).

For specificity, Faster R-CNN yielded a pooled estimate of 75.1% (95% CI: 64.6–83.3%) with a range of 65.0% to 93.3%, while Mask R-CNN yielded 86.0% (95% CI: 70.5–94.1%) with a range of 71.4% to 90.9%. The indirect Z-test did not indicate a statistical difference (p = 0.156).

For overall discrimination, Mask R-CNN had a higher pooled AUC than Faster R-CNN (0.91 vs. 0.83). DeLong’s test suggested a statistical difference (p = 0.048). This AUC result should be interpreted cautiously because it is an indirect contrast, it is near the conventional significance threshold, sensitivity and specificity differences were not statistically significant, and residual heterogeneity remained. Therefore, this finding may not translate into clinically meaningful improvement without prospective head-to-head evaluations that apply standardized annotation protocols and uniform operating thresholds within the same photographic datasets.

3.8. Meta-Regression and Subgroup Analysis

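Table 4 presents the results of the subgroup analysis, which compared the pooled estimates of the two algorithms (Faster R-CNN vs. Mask R-CNN) within each imaging-modality subgroup (all images, radiographic images, and photographic images).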

4. Discussion

Dental caries (tooth decay) remains one of the most common oral diseases globally and a major contributor to the worldwide burden of oral conditions [49]. Its development is multifactorial, but a key driver is frequent consumption of free sugars, which fuels acid production by dental plaque biofilms and is consistently associated with a higher risk of caries [50]. Caries and other oral diseases are also strongly socially patterned—poverty and constrained access to care amplify risk and untreated disease—while the economic burden is substantial, with estimates that direct treatment costs for dental diseases accounted for nearly 5% of global health expenditure [49]. Because lesions can progress and ultimately compromise tooth structure and function, timely identification of lesion location/severity is essential for appropriate preventive or minimally invasive management, particularly in settings with high baseline risk and limited resources [51].

Dental informatics applies computer and information sciences to dentistry to support and augment clinical workflows, including diagnostic decision-making [52]. Automated systems for detecting and classifying dental pathology can enable earlier diagnosis, reduce dependence on time-consuming manual assessment, and alleviate clinician workload, thereby improving oral health and preventing complications [53]. While machine-learning methods have long been applied to medical imaging, deep learning (DL) has gained prominence because these models can learn hierarchical feature representations directly from raw images, often improving performance in tasks such as detection, segmentation, and classification [54]. Convolutional neural networks (CNNs)—a core class of deep learning models—are now widely used for medical and dental computer vision tasks (e.g., classification, detection, and segmentation), and have become a methodology of choice in medical image analysis [54]. In dental radiology, CNN-based CAD systems are increasingly discussed as workflow support tools that can help clinicians interpret images more efficiently and consistently, with the potential to reduce diagnostic workload and human error when used as an adjunct to expert review [55]. Among DL detectors, Faster R-CNN and Mask R-CNN are prominent for object detection and segmentation. Faster R-CNN is a two-stage framework that first proposes regions and then refines them for accurate bounding box predictions [15]. Mask R-CNN extends this architecture with a mask branch for pixel-level instance segmentation, enabling simultaneous detection, classification, and mask generation [22]. Both architectures use a shared convolutional backbone to extract feature maps and leverage region-based representations to capture object appearance and spatial context for localization and segmentation.

Following database searching, title/abstract screening, and full-text assessment, 11 Faster R-CNN studies and 6 Mask R-CNN studies were included in the systematic review; all 17 studies contributed data to the meta-analysis. All papers were published between 2019 and 2024 (Table 2 and Table 3) and represent contributions from multiple countries, underscoring the growing global interest in applying Faster R-CNN and Mask R-CNN to caries detection. Across studies, the shared aim was to develop AI-enabled tools to enhance diagnostic accuracy in clinical practice.

Regarding RQS, included radiographic studies performed well on criteria such as basic discrimination statistics, comparison with a “gold standard,” and articulation of potential clinical applications (>95%). However, multiple segmentations, calibration statistics, prospective design, and cost-effectiveness analyses were infrequently addressed. The mean RQS was 74.4% (range 55.6–91.7%). QUADAS-AI results were likewise encouraging: the “index test” domain generally showed low risk of bias and low applicability concerns, as did “flow and timing.” The least favorable ratings appeared in “patient selection,” followed by “reference standard.” This pattern likely reflects reliance on retrospectively sampled public datasets without clear reporting of consecutive or random inclusion, limiting assessment of selection bias. In addition, ground-truth labeling procedures were often insufficiently documented (e.g., single- vs. multi-expert annotation), raising questions about the adequacy and independence of the reference standard. Heterogeneity may also arise from differences in design (retrospective vs. prospective) and imaging modality. For example, bitewing radiograph studies showed lower heterogeneity in specificity (narrower CIs in subgroup analyses), consistent with standardized acquisition, whereas photographic studies exhibited greater variability, potentially driven by lighting inconsistency and annotation subjectivity [56]. Overall, RQS and QUADAS-AI profiles indicate generally well-structured designs and reporting while highlighting the need for modality-tailored guidelines for DL applications in oral health.

Both Faster R-CNN and Mask R-CNN achieved satisfactory diagnostic performance, but Mask R-CNN was superior across metrics, with higher pooled sensitivity (85.6% vs. 71.7%, p = 0.0244), specificity (94.2% vs. 81.4%, p = 0.00089), and AUC (0.95 vs. 0.84, p = 0.0053). These gains are consistent with several architectural advantages. Mask R-CNN preserves the two-stage proposal pipeline of Faster R-CNN while adding a mask head that provides pixel-level supervision, enabling true instance segmentation—crucial for delineating subtle enamel–dentin borders. RoIAlign further preserves spatial precision relative to RoIPooling, avoiding rounding artifacts and aligning with the higher pooled sensitivity observed for Mask R-CNN. In addition, backbones are typically paired with a Feature Pyramid Network, granting multiscale representation that improves detection of small, low-contrast cavities commonly seen on bitewings [24]. Beyond caries, results from IEEE BIBM 2018 suggest the Mask R-CNN framework transfers effectively to other oral mucosal lesions, indicating versatility for subtle dental imaging tasks [57]. Collectively, precise alignment, multiscale feature fusion, and mask-level supervision likely underpin the statistically significant advantage of Mask R-CNN, positioning it as a robust option for automated screening pipelines.
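
The difference between RoIPooling-style quantization and RoIAlign-style sampling can be illustrated with a toy example. The sketch below compares a single sampling point on a small synthetic feature map: one function snaps a fractional coordinate to the nearest grid cell, the other interpolates bilinearly. It is a simplified illustration with hypothetical function names, not the full RoIAlign operator (which averages several sampling points per output bin), and the feature map is invented for the example.

```python
import math


def quantized_sample(feature_map, y, x):
    """RoIPooling-style lookup: round the coordinate to the nearest cell."""
    row = min(max(int(round(y)), 0), len(feature_map) - 1)
    col = min(max(int(round(x)), 0), len(feature_map[0]) - 1)
    return feature_map[row][col]


def bilinear_sample(feature_map, y, x):
    """RoIAlign-style lookup: bilinear interpolation at the exact coordinate."""
    height, width = len(feature_map), len(feature_map[0])
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


# Synthetic 3 x 4 feature map whose value rises by 10 per column.
feature_map = [[10.0 * col for col in range(4)] for _ in range(3)]

aligned = bilinear_sample(feature_map, 1.0, 1.3)   # follows the gradient
snapped = quantized_sample(feature_map, 1.0, 1.3)  # value of column 1 only

print(f"bilinear: {aligned:.1f}, quantized: {snapped:.1f}")
```

On this map the interpolated value tracks the sub-cell position while the quantized value does not, which is the kind of misalignment that matters most for small lesions occupying only a few feature-map cells.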

Modality-specific analyses revealed important differences. For radiographic images, Mask R-CNN outperformed Faster R-CNN in pooled sensitivity (86.3% vs. 67.2%, p = 0.0497), specificity (96.5% vs. 85.0%, p = 0.00105), and AUC (0.97 vs. 0.86, p = 0.0067), mirroring the overall results. In photographic images, Mask R-CNN again showed higher pooled sensitivity (83.5% vs. 77.3%) and specificity (86.0% vs. 75.1%), but these differences were not statistically significant (p = 0.435 and 0.156, respectively); AUC favored Mask R-CNN (0.91 vs. 0.83, p = 0.048). These findings suggest Mask R-CNN is particularly well suited to radiographic workflows, where its mask branch can disentangle overlapping anatomic shadows intrinsic to bitewings and other radiographs. RoIAlign and FPN help resolve faint radiolucent margins indicating early dentinal demineralization—features that a box-only pipeline may blur. By contrast, photographic images often contain strong color/texture cues that allow bounding box detectors to localize lesions adequately; segmentation adds less incremental information and may be hindered by glare, saliva, and depth-of-field artifacts that disrupt mask continuity, yielding non-significant sensitivity–specificity margins [56]. Moreover, the photographic subgroup results likely reflect substantial heterogeneity in image acquisition and labeling. Unlike radiographs, intraoral photographs are frequently captured under non-standardized conditions, with large variability in image quality driven by differing lighting (illumination intensity, color temperature, shadows), camera angle, focus/depth-of-field, motion blur, and saliva. These factors can obscure lesion boundaries and introduce inconsistent visual cues, further disrupting pixel-level mask continuity and increasing between-study variability. 
Furthermore, many photographic datasets are relatively small (<3000 frames) and annotated at coarse tooth-level granularity, limiting generalization for mask heads, whereas radiographic corpora more often provide dense, accurate contours that improve training [58]. Accordingly, the additional mask loss may act as an effective regularizer only when ground-truth masks are reliable; this is more typical in calibrated X-ray imagery than in variably illuminated intraoral photos, where boundaries can be visually unstable and labels relatively coarse. Consistent with this, He et al. reported that mask heads confer the greatest benefit when spatial precision is paramount [22]. Taken together, the integration of rich grayscale structural details, dense annotations, and inherent overlap artifacts renders radiographs particularly well-suited for Mask R-CNN, whereas photographic workflows yield only marginal improvements over Faster R-CNN. Importantly, comparisons between Faster R-CNN and Mask R-CNN in this meta-analysis are indirect because the two architectures were not evaluated head-to-head on the same datasets under identical annotation protocols and operating thresholds. Accordingly, differences in pooled performance may be driven by study-level heterogeneity, including imaging modality, dataset composition, lesion definition, annotation strategy, validation design, and threshold selection, rather than architecture alone. To mitigate this, we performed modality-stratified analyses and prespecified meta-regression adjusting for these covariates; however, residual confounding is likely. Definitive conclusions regarding architectural superiority require within-study head-to-head evaluations under identical experimental settings, which were not available in the current literature. In addition, very high heterogeneity in this meta-analysis means pooled sensitivity, specificity, and AUC should be interpreted as context specific summaries rather than universal benchmarks. 
Accuracy differed across studies due to variation in modality, acquisition, lesion definitions, annotation and reference standards, operating thresholds, and validation design. Thus, pooled estimates reflect an average across heterogeneous research settings and may not predict performance in a given clinic or deployment workflow. Consequently, for implementation, the most informative results are setting-matched analyses, such as modality-specific subgroups and externally validated cohorts, and future work should report prediction intervals to quantify the expected dispersion of performance across new populations and institutions.
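
As a rough illustration of what such a prediction interval might look like, the sketch below derives an approximate 95% prediction interval for sensitivity in a new setting from a pooled estimate, its CI, and τ². It assumes that the τ² values in Table 4 are expressed on the logit scale and uses a normal critical value by default; a t quantile with k − 2 degrees of freedom is often preferred and can be passed in through the hypothetical `crit` argument. Function names are ours, and the interval is indicative only.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def prediction_interval(pooled, ci, tau2, crit=None):
    """Approximate 95% prediction interval for a proportion in a new study.

    pooled: pooled proportion; ci: (lower, upper) 95% CI of the pooled value;
    tau2: between-study variance, assumed to be on the logit scale.
    """
    z = NormalDist().inv_cdf(0.975)
    se = (logit(ci[1]) - logit(ci[0])) / (2.0 * z)  # SE of the pooled logit
    if crit is None:
        crit = z  # normal approximation; a t quantile may be substituted
    half_width = crit * math.sqrt(tau2 + se ** 2)
    centre = logit(pooled)
    return inv_logit(centre - half_width), inv_logit(centre + half_width)


# Faster R-CNN pooled sensitivity, all images (Table 4).
lower, upper = prediction_interval(0.717, (0.620, 0.797), tau2=0.297)
print(f"approximate 95% prediction interval: {lower:.1%} to {upper:.1%}")
```

Because the between-study variance is added to the uncertainty of the mean, the prediction interval is necessarily wider than the confidence interval of the pooled estimate.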

Among radiographic modalities, bitewings were most frequently employed, consistent with prior literature. Bitewing radiographs are often used as a reference standard in clinical caries diagnosis [59] and are recommended for detecting early-stage and interproximal caries given their high sensitivity, low radiation dose, low cost, and rapid acquisition [60]. Consequently, radiograph-based caries detection can help prevent disease progression and reduce invasive treatments, while facilitating personalized care in high-risk patients [61]. Nonetheless, dental radiographs should be used judiciously due to ionizing radiation exposure, even though risks are generally low for most individuals [62]. Photographic imaging, by contrast, is radiation-free and highly accessible, including in rural settings with limited dental infrastructure. The ubiquity and affordability of smartphones in many countries support smartphone-based diagnostic tools that could expand screening reach. Indeed, recent biomedical research has leveraged mobile device sensors to deliver cost-effective solutions for remote dental care [63].

This publication should be interpreted alongside our prior YOLO-focused systematic review and meta-analysis on dental caries detection, which synthesized evidence for one-stage detectors tailored to rapid, real-time screening, where low latency, deployment simplicity, and coarse lesion localization often determine clinical feasibility (for example, chairside or mobile screening). By contrast, the present study was intentionally conducted as a separate analysis because Faster R-CNN and Mask R-CNN belong to a distinct family of two-stage, region-based detectors, and Mask R-CNN further adds an instance-segmentation head that can materially change localization precision, annotation burden, and error modes relative to one-stage box prediction. These architectural differences introduce model-specific heterogeneity (for example, region proposal behavior, reliance on RoIAlign, and sensitivity to mask supervision quality) that would be obscured if two-stage detectors were pooled with one-stage YOLO studies. Consistent with this tradeoff, YOLO can process dental radiographs in about 15 milliseconds on a mid-range GPU [64], whereas two-stage approaches typically require about 90 milliseconds, with runtimes extending to 270 milliseconds when deeper backbones are used or when CBCT volumes are processed [65]. Our prior meta-analysis of 14 studies reported that YOLO achieved a pooled sensitivity of 79%, specificity of 85%, and an area under the ROC curve (AUC) of 0.83 for caries-level detection, indicating that its rapid processing is accompanied by moderate diagnostic performance [66]. These estimates are lower than those observed for Mask R-CNN in the present study (85.6%, 94.2%, and 0.95, respectively) but close to those for Faster R-CNN (71.7%, 81.4%, and 0.84); because the comparison spans separate meta-analyses, it is descriptive rather than a formal test.
Accordingly, the distinct technical contribution of this study is modality-stratified performance characterization for two-stage detectors and an indirect comparative synthesis of Faster R-CNN versus Mask R-CNN within a diagnostic accuracy meta-analytic framework, while explicitly recognizing that statistical differences across pooled study sets are not head-to-head evidence and may be confounded by study-level factors unless within-study comparisons are available. Clinically, these results inform when two-stage detection and, when appropriate, instance segmentation is more defensible for workflows requiring higher spatial precision and boundary delineation on radiographs (for example, subtle radiolucent lesions, interproximal or deeper disease, and longitudinal monitoring around restorations), while YOLO-based systems remain appropriate for fast detection-only screening in low-resource settings.

To the best of our knowledge, this is the first systematic review and meta-analysis to synthesize the diagnostic accuracy of Faster R-CNN and Mask R-CNN for dental caries detection, supporting earlier detection across diverse clinical settings. Our strengths include the inclusion of commonly used imaging modalities for caries assessment, enabling modality-specific comparisons across different technical and clinical settings. We also interpreted the results in light of the technical characteristics of the two algorithms and of DL architectures more generally, to provide a more comprehensive overview. However, several limitations should be noted, including the predominance of retrospective studies and substantial heterogeneity in patient characteristics and imaging parameters. Given the substantial between-study variability (high I²), we interpreted the pooled estimates cautiously, emphasizing that they provide only an overall summary and may not represent any single clinical setting. We therefore focused on the HSROC curve and reported subgroup-specific results rather than relying solely on a single pooled estimate. Additionally, we observed limited methodological rigor (suboptimal RQS and QUADAS-AI), limited reproducibility, potential publication bias, and insufficient standardization of imaging and labeling protocols. Lastly, no included study directly compared Faster R-CNN and Mask R-CNN on the same dataset under identical experimental conditions, precluding true head-to-head subset analysis and limiting causal attribution of performance differences to model architecture alone. These limitations should be explicitly acknowledged and addressed in future research to enhance methodological rigor and the reliability of evidence.

Future directions may further elevate performance through architectural, data, and deployment strategies. Beyond model design, future studies should strengthen methodological rigor through standardized acquisition/reporting for photographic and radiographic data, harmonized annotation (clear lesion definitions and labeling granularity, double-reading/consensus, and reporting inter-/intra-rater reliability), and prospective or consecutively sampled multi-center cohorts with prespecified analysis plans and independent external validation. Because the number of eligible studies on Faster R-CNN (k = 11) and Mask R-CNN (k = 6) remains limited, excluding studies with unclear reporting may further reduce statistical power. Nevertheless, future work should conduct sensitivity analyses that exclude studies with unclear or high risk of bias in the reference standard and/or patient selection domains and report the impact on pooled estimates. Moreover, because QUADAS-AI highlighted patient selection as the weakest domain, future meta-analyses should consider sensitivity analyses restricted to studies at low risk of bias for patient selection. Authors should also explicitly discuss spectrum bias, as variations in disease prevalence, the distribution of lesion severity, and image quality filtering or exclusion criteria can inflate apparent performance and reduce generalizability to routine clinical settings. Given the substantial residual heterogeneity after modality stratification, future work should incorporate influence and outlier diagnostics to identify the studies driving heterogeneity, including leave-one-out analyses in the bivariate random-effects model and hierarchical influence checks to detect high-leverage studies. In addition, reporting prediction intervals for sensitivity and specificity, rather than only the mean pooled estimate, will better communicate the expected variability of performance in new settings. 
To improve comparability and reproducibility, we propose the following minimum standards: reporting of key acquisition parameters and quality control criteria; transparent annotation procedures (definitions/thresholds, labeling level, annotator credentials, agreement, and adjudication); prespecified data splits to avoid leakage, with reported operating thresholds; explicit reporting of the threshold strategy per study (fixed vs. optimized, and whether optimization was validation-based vs. test set-based), with specific disclosure of any test set-optimized thresholds that may inflate performance; clear justification for pooled group comparisons using DeLong or Z tests under threshold variation and study-level heterogeneity, or the use of hierarchical approaches (e.g., meta-regression or interaction models) that explicitly account for these sources of variability; and external validation with transparent reporting and, where feasible, code/model sharing. From a technical perspective, several improvements may further enhance performance: boundary-preserving mask heads (e.g., BMask R-CNN) [67], deformable or cascade detection heads for Faster R-CNN [68], and 3-D extensions for volumetric CBCT when applicable [24]. Together, these avenues point toward faster, safer, and more generalizable caries detection systems.
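
The concern about test set-optimized thresholds can be made concrete with a small example. The sketch below selects an operating threshold by maximizing Youden's J on a validation split and then contrasts the resulting test performance with that obtained when the threshold is instead tuned on the test split itself. All scores and labels are invented for illustration, the function names are hypothetical, and Youden's J is only one of several possible selection criteria.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive.

    Assumes both classes are present in `labels`.
    """
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(scores, labels):
    """Threshold (among observed scores) maximizing sensitivity + specificity - 1."""
    def youden_j(t):
        sens, spec = sens_spec(scores, labels, t)
        return sens + spec - 1.0
    return max(sorted(set(scores)), key=youden_j)


# Invented detector confidence scores; 1 = carious, 0 = sound.
val_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
val_labels = [0, 0, 0, 1, 1, 1]
test_scores = [0.15, 0.3, 0.4, 0.45, 0.7, 0.8]
test_labels = [0, 0, 0, 1, 1, 1]

# Validation-based strategy: choose on validation data, report on test data.
val_threshold = youden_threshold(val_scores, val_labels)
honest = sens_spec(test_scores, test_labels, val_threshold)

# Test set-based strategy: choose and report on the same test data.
test_threshold = youden_threshold(test_scores, test_labels)
optimistic = sens_spec(test_scores, test_labels, test_threshold)

print(f"validation-based threshold {val_threshold}: sens/spec = {honest}")
print(f"test-optimized threshold {test_threshold}: sens/spec = {optimistic}")
```

In this toy data the test-optimized threshold yields higher apparent sensitivity than the validation-based one, which is why disclosing the threshold strategy matters when study-level estimates are pooled.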

5. Conclusions

Faster R-CNN and Mask R-CNN show promise for dental caries detection, but the current evidence is limited by substantial heterogeneity, predominantly retrospective designs, and variable imaging and labeling practices. Across the included studies, Mask R-CNN yielded higher pooled performance estimates overall, with the clearest differences in radiographic workflows; however, these findings arise from indirect comparisons and should be interpreted as suggestive rather than definitive given between-study heterogeneity and uncertainty in the reference standard. In photographic images, differences in pooled sensitivity and specificity were not statistically significant, which may reflect variability in image quality, limited standardization, and smaller or less comparable datasets. Overall, these models may serve as adjunctive decision-support tools, but stronger conclusions will require prospective, multi-center studies with standardized acquisition and annotation, prespecified analysis plans, and independent external validation, alongside sensitivity and meta-regression analyses to evaluate the stability of observed performance differences.

Acknowledgments

We thank our respective institutions for their support and colleagues for constructive discussion and feedback during manuscript preparation.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics16050731/s1, Table S1: Quality Assessment of Diagnostic Accuracy Studies-AI (QUADAS-AI); Table S2: Radiomics Quality Score (RQS); File S1: The full database-specific strategies; File S2: Details regarding the imaging types, image resolution, equipment, camera settings, and standardization processes for each paper; File S3: PRISMA 2020 Checklist.

Author Contributions

Q.T.L.: conceived the study and designed the systematic review and meta-analysis protocol. Q.T.L. and M.H.N.L.: conducted the literature search, study selection, data extraction, and methodological quality assessment. F.-Y.F.: contributed to data verification and assisted in the interpretation of diagnostic accuracy outcomes. N.Q.K.L. and I.-T.L.: served as corresponding authors, provided methodological and clinical oversight, supervised the analytical framework, and critically reviewed the interpretation of results. Q.T.L. drafted the initial manuscript. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study were extracted from published studies cited in the reference list. The extracted dataset and analytical files are available from the corresponding author upon reasonable request due to the secondary nature of the data and copyright considerations.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding Statement

This research received no external funding.





Articles from Diagnostics are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
