2025 Sep 26;29(10):477. doi: 10.1007/s00784-025-06511-1

Intraoral vs. extraoral bitewing radiography for approximal caries detection: A multi-observer ex vivo ROC study using thin-section microscopy as gold standard

Julia Caroline Quintus 1, Ralf Kurt Willy Schulze 2
PMCID: PMC12474592  PMID: 41003781

Abstract

Objectives

This ex vivo study aimed to compare the accuracy in detection of interproximal natural carious lesions between intraoral (iBWR) and extraoral bitewing radiographs (eBWR) using a multi-observer design and a rigorous gold standard.

Materials and methods

Eighty extracted teeth (40 premolars, 40 molars) were arranged in anatomical sequence within a simulated jaw composed of PMMA and modified gypsum, with an emphasis on creating natural interproximal contacts. Approximately 50% of the teeth exhibited enamel caries, while the remaining 50% were caries-free. Image acquisition was performed using a custom-designed PMMA phantom. iBWR were obtained with a CMOS intraoral sensor (XIOS XG Supreme, Sirona Dental Systems, Bensheim, Germany), and eBWR with a digital panoramic device (Orthophos SL 3D, Dentsply Sirona, Bensheim, Germany).

Twenty-seven licensed dentists assessed caries presence and depth on 120 approximal surfaces (each surface assessed twice using both modalities) using a 5-point confidence scale and a 4-point lesion depth scale. Observers were blinded to the true caries status, which was determined through histological serial sectioning and brightfield microscopy. Diagnostic accuracy was evaluated via ROC analysis, with Youden’s index used to calculate sensitivity, specificity, predictive values, and likelihood ratios. Statistical analyses were conducted at a significance level of α = 0.05.

Results

Overall accuracy was higher for iBWR (pooled Az = 0.58) than for eBWR (pooled Az = 0.54). Both intra-rater reliability (test-retest; Spearman's ρ: eBWR = 0.44, iBWR = 0.48) and inter-rater reliability (mean ICC: eBWR = 0.19, iBWR = 0.27) were low. For enamel caries detection, iBWR outperformed eBWR in terms of specificity and positive predictive values, while eBWR achieved significantly higher sensitivity in the first reading round.

Conclusions

Overall, our multi-observer ex vivo study using microscopy as ground truth revealed higher diagnostic accuracy for intraoral bitewing radiography as compared to its extraoral counterpart.

Clinical Relevance

Our results from a highly standardized study using a rigorous gold standard support the assumption that intraoral bitewing radiography still represents the radiographic state of the art in interproximal caries detection. For minute enamel lesions, the diagnostic accuracy of both methods is just above random guessing.

Keywords: Interproximal caries detection, Bitewing radiography, Extraoral bitewing radiography, Receiver operating characteristic (ROC) analysis

Introduction

Despite their long history, intraoral bitewing radiographs (iBWR) are still the standard of care, particularly for diagnosing interproximal caries [1, 2]. Dental caries is a prevalent oral disease: caries of permanent teeth has an estimated prevalence of 33.6% and affects 294 million people in the European Region [3]. For the US, the prevalence of total dental caries (untreated and treated) in primary or permanent teeth among youth aged 2–19 years was 45.8% in 2015 to 2016 [4].

These figures indicate that simple and readily available techniques to detect carious lesions are required. BWR taken with typical intraoral radiographic tubes and equipment provide a solution. Compared with apical projections, iBWR were shown to provide significantly better sensitivity for all levels of caries progression (iBWR: 94.5 for dentin caries, 90.43 to 82.7 for enamel caries, periapical: 69.7 for dentin caries, 39.01 to 56.2 for enamel caries) [5]. However, interestingly the authors observed no difference regarding specificity [5]. One important factor in favor of BWR may be the relatively standardized projection geometry with the image receptor placed parallel to the long axes of the crowns. However, the horizontal projection angle is a critical point, and thus longer receptors (such as size 3) are not recommended since they increase proximal surface overlapping in their periphery [6]. Intraoral BWR may induce patient discomfort and the possibility of cross-contamination.

An extraoral solution termed “extraoral bitewings” (eBWR), based on a specialized panoramic image acquisition technique, was introduced around 2012 [7]. Since then, a variety of authors have investigated the technique with inconsistent observations. While Kamburoglu and colleagues using film as detector for iBWR found superior performance of iBWR compared to extraoral bitewing and panoramic radiography in diagnosing proximal caries of premolar and molar teeth ex vivo [7], Terry and colleagues observed no significant difference in posterior proximal surface caries detection between the modalities [8]. Terry et al. used storage phosphor plates as iBWR-receptors and a “gold standard” established by two experts on the basis of the iBWR. The latter is known to provide a weak standard since the method under investigation is identical to the method used to define the gold standard [9]. For artificially induced “caries lesions”, Abu El-Ela and colleagues in 2016 observed no significant differences between iBWR (phosphor plates and CMOS-sensor) and eBWR [10]. This ambiguity in the different studies may be due to design differences, yet also due to sample size.

This study was designed to investigate the topic further. Unlike the studies described above, our investigation was based on (i) multiple observers and (ii) histological sections in combination with microscopy to achieve the most accurate gold standard possible. A multi-observer design was used to reduce bias due to the well-known inter-observer variance in radiological diagnosis [11, 12].

The aim of this study was to investigate, in a multi-observer set-up, possible differences between iBWR (acquired with a digital intraoral detector) and eBWR (acquired with a state-of-the-art digital panoramic device) with respect to the detection of interproximal caries. A rigorous accuracy analysis was conducted by comparison with the best possible gold standard, i.e. histological sections of the teeth. Our null hypothesis was that the diagnostic accuracy for interproximal caries detection does not differ between the two modalities.

Materials and methods

As this was a phantom study, no ethical approval was required. The teeth for the phantom were obtained from regular extractions with the patients' consent for secondary scientific use.

Phantom

The test phantom, developed with the support of Sirona Dental Systems GmbH, consisted of a transparent polymethylmethacrylate (PMMA) head phantom (Fig. 1A and C) serving as a scatter body, with scaffolds for inserting self-fabricated, tooth-bearing maxillary and mandibular units. The acrylic phantom had a thickness of 2.5 to 3 cm in the simulated cheek region. A large opening allowed external routing of the intraoral sensor cable and holder.

Fig. 1.

A: Drawing of the PMMA-phantom containing the tooth-bearing maxillary and mandibular units (B) made of superhard stone. The yellow insert at the bottom of the phantom fits into the respective trough in the panoramic machine. C displays the entire phantom with tooth-bearing unit plus the radiographic holding device for iBWR in place

PMMA inserts with 2-mm-thick walls, precisely designed to fit into the phantom scaffolds in the upper and lower jaw regions, served as the tooth-containing units for positioning in the phantom. Preliminary tests evaluated substances and mixtures to replicate the radiographic appearance of human maxillary and mandibular bone.

A total of 80 teeth (40 premolars and 40 molars) from regular extractions were arranged in their natural anatomical sequence in the jaws. Approximately 50% of the teeth had enamel caries (visually identified as white/brown discolorations or small cavitations), while the remaining 50% were visually assessed as caries-free. Teeth with restorations, extensive caries, or prior endodontic treatments were excluded.

The teeth were initially fixed in the PMMA inserts using hot wax to mimic natural dentition, with particular attention to creating “natural” interproximal contacts between the teeth. A silicone impression (Dentalsilikon Monosil Twin 90, HLW Dentalinstruments GmbH, Wernberg-Köblitz, Germany) was used to stabilize the position of the teeth, enabling the removal of the wax and its replacement with the final embedding medium. To simulate a periodontal ligament gap, the tooth roots were briefly immersed in hot wax, forming a thin wax layer.

Preliminary tests demonstrated that a mixture of 100 g superhard stone (HS-Superhartgips gelbbraun, Henry Schein Dental Deutschland GmbH, Langen, Germany) and 22 ml water containing a calcium effervescent tablet produced a realistic bone-like radiographic appearance. This mixture was poured into the PMMA inserts, and the silicone impression containing the arranged teeth was placed into the still-fluid stone. After setting, the models (Fig. 1B) were consecutively positioned in the PMMA phantom for radiographic exposure (Fig. 1C). A total of five upper and five lower jaw models, i.e. five patient phantoms, were produced from a total of 80 teeth.

Image acquisition

iBWR were acquired using a complementary metal-oxide-semiconductor (CMOS) intraoral sensor (XIOS XG Supreme, Sirona Dental Systems, Bensheim, Germany) with a square pixel size of 0.015 mm and an active area of 36 mm × 25.6 mm (1200 × 868 pixels). The PMMA inserts containing the teeth were sequentially positioned in the phantom. iBWR were produced at 60 kV and 7 mA using a typical aiming system (Aimright, Sirona Dental Systems, Bensheim, Germany). Due to the holder system, the teeth of the upper and lower jaw models were, as in the clinical situation, not in occlusal contact, and a source-to-receptor distance of approximately 268 mm was maintained.

eBWR were exposed using a digital panoramic device (Orthophos SL 3D, Dentsply Sirona, Bensheim, Germany). The PMMA phantom, containing the jaw inserts, was positioned in the panoramic device using a custom-designed metal mount. To simulate the spacing that typically occurs when patients bite on an anterior bite block, pink modeling wax (1 × 3 cm; Henry Schein Dental Deutschland GmbH, Langen, Germany) was bilaterally placed as a 1.5 cm spacer between the teeth. The “BW 1” program, specifically designed for lateral eBWR, was used for imaging. According to the manufacturer, this program applies a slightly wider primary slit collimation compared to standard panoramic radiography and uses a modified motion trajectory for the X-ray source and detector. eBWR were acquired at 60 kV and 8 mA with an exposure time of 8.8 s, producing images with a pixel size of 0.1 mm (1708 × 956 pixels).
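The two pixel sizes stated above set an upper bound on the spatial frequency each image can represent. The following minimal sketch computes that sampling (Nyquist) limit as 1/(2 × pixel pitch). It is an illustration added here, not part of the study's analysis, and the function name is hypothetical. The resolution actually achieved by either system will be lower than this bound because of focal spot size, motion, scatter and detector blur.

```python
def nyquist_lp_per_mm(pixel_pitch_mm: float) -> float:
    """Upper bound on resolvable line pairs per mm for a given pixel pitch.

    One line pair (one dark and one bright line) needs at least two pixels.
    """
    return 1.0 / (2.0 * pixel_pitch_mm)


# Pixel sizes as stated in the text above.
ibwr_limit = nyquist_lp_per_mm(0.015)  # intraoral CMOS sensor
ebwr_limit = nyquist_lp_per_mm(0.1)    # extraoral bitewing image

print(f"iBWR sampling limit: {ibwr_limit:.1f} lp/mm")
print(f"eBWR sampling limit: {ebwr_limit:.1f} lp/mm")
```

Under these assumptions the sampling limit of the intraoral sensor is several times higher than that of the extraoral image, which only describes the sampling grid and says nothing about measured system resolution.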

Figure 2A and B display an extraoral and an intraoral bitewing radiograph of the same test teeth examined in this study.

Fig. 2.

Exemplary eBWR (A) and iBWR (B) as obtained from the phantom. Note that the size of the two images in this Figure does not correctly reflect the actual 1:1 display that was used for image assessment

Image evaluation/viewing sessions

Raters were selected from the dentists available at the Dental Hospital of the University Medical Center of the Johannes Gutenberg-University of Mainz. We aimed to include a sufficiently large rater sample comprising different sub-specialties. Twenty-seven licensed dentists (mean radiographic interpretation experience: 6.2 years; range: 2.7–17.0 years) participated. Of these, 25 worked in departments of the Dental Hospital (five in Periodontology and Conservative Dentistry, 12 in Prosthodontics and Dental Materials, and eight in Oral and Maxillofacial Surgery), while two were not employed at the time of the viewing sessions.

Image evaluations were conducted on a 19-inch LED display (Model B19-7 LED, Fujitsu, Minato, Tokyo, Japan [1280 × 1024 pixels, luminance = 250 cd/m²]) in a quiet, darkened room. Quality checks were performed on evaluation days according to the German DIN 6868-157 standard using the TG18-QC test pattern [13]. Images were displayed in 1:1 format, meaning that one sensor pixel is represented by one monitor pixel.

Observers received standardized instructions, including graphical explanations and example radiographs (not part of the study sample) to illustrate caries depth classification. Each observer, blinded to the true caries status, evaluated 20 radiographs (10 iBWR, 10 eBWR) corresponding to 80 teeth and 120 proximal surfaces. Radiographs were presented in a randomized order using proprietary software (Sidexis XG, Dentsply Sirona, Bensheim, Germany), routinely used at the University Medical Center Mainz. Observers could freely adjust image size, contrast, and brightness without time constraints. Raters were asked to evaluate the interproximal surfaces between the two premolars, between the premolar and first molar, and between the two molars, coronal to the cemento-enamel junction [14]. Possible carious lesions at the occlusal surfaces or in the cervical roots were not considered. Caries presence was rated on a 5-point confidence scale, ranging from 1 (“approximal caries definitely not present”) to 5 (“approximal caries definitely present”) [15]. Lesion depth was classified on a 4-point scale: 1 = outer enamel, 2 = inner enamel, 3 = outer dentin, and 4 = inner dentin.

Per session, a total of 240 approximal surfaces were evaluated (120 surfaces from 80 teeth, each assessed twice using two radiographic techniques).

To examine reliability, 24 observers repeated the evaluation after at least 30 days, while three raters were unavailable due to scheduling conflicts.

Histological assessment

To validate the caries status, 78 study teeth (two teeth were lost during histological processing) were embedded in an MMA-based plastic embedding system (Technovit 9100, Kulzer Technik GmbH, Wehrheim, Germany) following the recommendations of Willbold and Witte (2010) [16]. The teeth were serially sectioned perpendicular to the central axis in a mesiodistal direction using the Exakt-Trennschleifsystem (EXAKT Advanced Technologies GmbH, Norderstedt, Germany) based on the sawing and grinding technique described by Donath and Breuner in 1982 [17]. Depending on tooth diameter, four to 12 thin sections were prepared per tooth.

Sections were analyzed using brightfield microscopy (Biorevo BZ-9000 microscope, Keyence, Osaka, Japan) at 2× magnification. Caries depth on proximal surfaces above the cemento-enamel junction was assessed by one of the study’s authors. Based on the definition by Hintze and Wenzel in 2003, a “whitish, opaque to brown/dark discoloration” was classified as a carious lesion [18]. Lesions were categorized into five classes: 0 = no lesion, 1 = lesion in outer enamel, 2 = lesion in inner enamel, 3 = lesion in outer dentin, and 4 = lesion in inner dentin [18].

Statistical analysis

Intrarater reliability was assessed by means of the Spearman’s rank correlation coefficient, while inter-rater reliability was assessed using the intraclass correlation coefficient (ICC) according to Shrout and Fleiss [19].
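As an illustration of the test-retest measure named above, the following minimal sketch computes Spearman's rank correlation between one observer's ratings from two sessions. The ratings are hypothetical, the function names are chosen for this example only, and the code is not the authors' R analysis. Tied ratings, which are unavoidable on a 5-point scale, receive the mean of the ranks they span.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-corrected ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical 5-point confidence ratings of the same surfaces in two sessions.
session_1 = [1, 2, 2, 4, 5, 3, 1, 4]
session_2 = [2, 2, 1, 5, 4, 3, 1, 3]
rho = spearman_rho(session_1, session_2)
print(f"test-retest rho = {rho:.2f}")
```

In a study of this design, one such coefficient would be obtained per observer and modality and then summarized across observers.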

To assess the diagnostic accuracy of both imaging techniques, observers’ evaluations were compared to the histological gold standard through receiver operating characteristic (ROC) analysis. All statistical analyses were performed using R 4.3.2 [20] with RStudio 4.0.735 [21]. ROC analysis was conducted using the pROC package (v1.18.5 [22]). To calculate sensitivity, specificity, positive and negative predictive values, as well as positive and negative likelihood ratios, the threshold with the highest Youden index was selected from the ROC analyses. A significance level of α = 0.05 was applied for all statistical calculations. To assess the influence of relevant factors (imaging method, lesion depth) on performance differences, an analysis of variance was computed.
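The procedure described above can be sketched as follows. This is a minimal illustration in Python with hypothetical ratings, not the authors' pROC-based R code, and all function names are assumptions made for this example. Each cut-off "rating ≥ threshold" on the 5-point confidence scale yields one sensitivity/specificity pair, the cut-off maximizing the Youden index J = sensitivity + specificity − 1 is selected, and the predictive values and likelihood ratios are derived at that cut-off.

```python
def roc_points(ratings, truth, scale=(1, 2, 3, 4, 5)):
    """Confusion counts, sensitivity and specificity per cut-off 'rating >= t'."""
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    points = []
    for t in scale:
        tp = sum(1 for r, y in zip(ratings, truth) if y == 1 and r >= t)
        tn = sum(1 for r, y in zip(ratings, truth) if y == 0 and r < t)
        points.append({"threshold": t, "tp": tp, "fn": n_pos - tp,
                       "tn": tn, "fp": n_neg - tn,
                       "sensitivity": tp / n_pos, "specificity": tn / n_neg})
    return points


def auc(ratings, truth):
    """AUC as the probability that a carious surface is rated higher
    than a sound one, counting ties as one half."""
    pos = [r for r, y in zip(ratings, truth) if y == 1]
    neg = [r for r, y in zip(ratings, truth) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_youden(points):
    """Cut-off with the highest Youden index (first one wins on ties)."""
    return max(points, key=lambda p: p["sensitivity"] + p["specificity"] - 1.0)


def derived_metrics(p):
    """PPV, NPV, LR+ and LR- at one cut-off; None where undefined."""
    sens, spec = p["sensitivity"], p["specificity"]
    return {
        "ppv": p["tp"] / (p["tp"] + p["fp"]) if p["tp"] + p["fp"] else None,
        "npv": p["tn"] / (p["tn"] + p["fn"]) if p["tn"] + p["fn"] else None,
        "lr_plus": sens / (1.0 - spec) if spec < 1.0 else None,
        "lr_minus": (1.0 - sens) / spec if spec > 0.0 else None,
    }


# Hypothetical data: 1 = carious by histology, 0 = sound.
truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
ratings = [5, 4, 2, 3, 1, 1, 2, 4, 1, 2]

auc_value = auc(ratings, truth)
best = best_youden(roc_points(ratings, truth))
metrics = derived_metrics(best)
print(auc_value, best["threshold"], metrics)
```

Note that the predictive values computed this way depend on the lesion prevalence in the sample, which in this study was close to 50% by design.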

Results

Histological evaluation

Of the 117 examined proximal surfaces, 58 (49.57%) showed carious lesions: 19 in outer enamel (D1, 32.76%), 23 in inner enamel (D2, 39.66%), 15 in outer dentin (D3, 25.86%), and one in inner dentin (D4, 1.72%).

Intra- and interrater reliability

Intrarater reliability, assessed using Spearman’s rank correlation coefficient, showed a mean ρ = 0.44 (SD = 0.16, range ρ = 0.18–0.70) for extraoral and a mean ρ = 0.48 (SD = 0.16, range ρ = 0.12–0.90) for intraoral bitewing radiographs. Although the mean intrarater reliability coefficients were significantly different from zero, they indicated only low reliability according to Koo and Li’s classification [20]. We found no significant difference between the two modalities (z = −0.16, p = 0.87).

Interrater reliability was assessed using the intraclass correlation coefficient ICC(1,2) as described by Shrout and Fleiss [21]. For extraoral radiographs, ICC(1,2) was 0.19 (p < 0.01, CI = [0.14; 0.24]) at both sessions. For intraoral radiographs, ICC(1,2) was 0.28 (p < 0.01, CI = [0.23; 0.35]) at the first session and 0.25 (p < 0.01, CI = [0.20; 0.32]) at the second. Although the mean interrater reliability coefficients were significantly different from zero, they also indicated only low reliability according to Koo and Li’s classification [23].

Diagnostic accuracy

As shown in Table 1, the AUC values for diagnostic accuracy were 0.527 (T1) and 0.545 (T2) for extraoral bitewing radiographs, and 0.580 (T1) and 0.586 (T2) for intraoral bitewing radiographs, indicating a poor diagnostic performance for both imaging techniques [24]. However, AUC values for intraoral bitewing radiographs were significantly higher than those for extraoral bitewing radiographs (Fig. 3), as determined by t-tests (T1: t(23) = 5.369, p < 0.01, d = 1.01; T2: t(23) = 4.332, p < 0.01, d = 0.88), with large effect size at T1 and medium effect size at T2 [25].
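The comparison above rests on a paired design in which each observer contributes one AUC per modality. The following minimal sketch shows how a paired t statistic and a Cohen's d can be computed. The per-observer AUC values are hypothetical and the function names are illustrative. The text does not state which variant of Cohen's d the authors used, so the pooled-SD form below is an assumption. Plugging the T2 means and standard deviations from Table 1 into this form gives about 0.89, near the reported d = 0.88, but that agreement does not establish the authors' formula.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def cohens_d_pooled(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the root mean square of the two SDs (an assumed variant)."""
    return (mean_a - mean_b) / sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)


# Hypothetical per-observer AUC values (one pair per observer).
ibwr_auc = [0.60, 0.57, 0.62, 0.55, 0.59, 0.61]
ebwr_auc = [0.54, 0.52, 0.57, 0.53, 0.51, 0.56]
t_stat, df = paired_t(ibwr_auc, ebwr_auc)
print(f"t({df}) = {t_stat:.2f}")

# Summary values for T2 as listed in Table 1 (iBWR vs. eBWR).
d_t2 = cohens_d_pooled(0.586, 0.050, 0.545, 0.042)
print(f"d (T2, pooled-SD variant) = {d_t2:.2f}")
```

With 24 observers completing both sessions, a paired test of this kind has 23 degrees of freedom, which matches the t(23) reported in the text.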

Table 1.

Comparison of diagnostic metrics between extraoral (highlighted in light gray) and intraoral bitewing radiographs

Parameter     Method   Observation (T)   Mean ± SD        95%-CI
AUC           eBWR     1 (T1)            0.527 ± 0.045    0.440–0.638
AUC           eBWR     2 (T2)            0.545 ± 0.042    0.463–0.654
AUC           iBWR     1 (T1)            0.580 ± 0.050    0.511–0.699
AUC           iBWR     2 (T2)            0.586 ± 0.050    0.476–0.683
Sensitivity   eBWR     1 (T1)            0.504 ± 0.248    0.103–0.983
Sensitivity   eBWR     2 (T2)            0.439 ± 0.218    0.069–0.845
Sensitivity   iBWR     1 (T1)            0.309 ± 0.140    0.103–0.569
Sensitivity   iBWR     2 (T2)            0.360 ± 0.145    0.069–0.586
Specificity   eBWR     1 (T1)            0.586 ± 0.237    0.085–0.915
Specificity   eBWR     2 (T2)            0.666 ± 0.203    0.288–1.000
Specificity   iBWR     1 (T1)            0.870 ± 0.096    0.576–0.983
Specificity   iBWR     2 (T2)            0.823 ± 0.124    0.559–0.983
PPV           eBWR     1 (T1)            0.553 ± 0.036    0.509–0.633
PPV           eBWR     2 (T2)            0.597 ± 0.109    0.510–1.000
PPV           iBWR     1 (T1)            0.723 ± 0.098    0.569–0.909
PPV           iBWR     2 (T2)            0.695 ± 0.105    0.553–0.947
NPV           eBWR     1 (T1)            0.569 ± 0.073    0.509–0.833
NPV           eBWR     2 (T2)            0.556 ± 0.039    0.509–0.654
NPV           iBWR     1 (T1)            0.565 ± 0.036    0.520–0.658
NPV           iBWR     2 (T2)            0.570 ± 0.034    0.514–0.648

SD: standard deviation; 95%-CI: 95% confidence interval; AUC: area under the curve; PPV: positive predictive value; NPV: negative predictive value; LR+: positive likelihood ratio; LR−: negative likelihood ratio

Fig. 3.

ROC-curves for accuracy analysis pooled over all raters yet separated for the two observations. Both ROC curves for extraoral bitewing radiographs are positioned just above the 45° diagonal at both test sessions, indicating low diagnostic accuracy. Clearly, both ROC curves for intraoral bitewing radiographs demonstrate a greater distance from the 45° diagonal, reflecting significantly yet only slightly higher diagnostic accuracy

Sensitivity values, however, were higher for extraoral than for intraoral bitewing radiographs. At T1, this difference was significant (W = 142, p < 0.01, r = 0.62), corresponding to a medium effect size, while at T2 no significant difference appeared (W = 125, p = 0.09).

Specificity values were in turn significantly higher for intraoral bitewing radiographs at both test sessions (T1: W = 168, p < 0.01, r = 0.73; T2: W = 172, p < 0.01, r = 0.62), corresponding to a medium effect size.

Positive predictive values and positive likelihood ratios were also significantly higher for intraoral bitewing radiographs at both sessions (all Ws ≥ 170, all ps < 0.01, all rs ≤ 0.88) with medium to large effect sizes. In contrast, no significant differences were found between the two imaging techniques for negative predictive values and negative likelihood ratios (all Ws ≥ 89, all ps ≥ 0.10).

Table 1 presents the diagnostic metrics for assessing the diagnostic accuracy of extraoral and intraoral bitewing radiographs, separated by both test sessions.

Enamel versus dentin caries

To examine whether the diagnostic accuracy of both techniques was influenced by caries depth, a follow-up analysis differentiated between enamel and dentin caries. Of the 117 approximal surfaces, 42 surfaces (35.90%) showed carious lesions confined to enamel, and 16 surfaces (13.68%) showed lesions extending into dentin.

iBWR demonstrated significantly higher AUC (range: 0.513 to 0.875, mean: 0.684 versus 0.475 to 0.723, mean: 0.563 for eBWR), specificity, LR+, PPV, and NPV values for detecting dentin caries at both time points and higher sensitivity at T2 (all Ws ≥ 86, all ps ≤ 0.04, Fig. 4).

Fig. 4.

ROC-curves of all 27 readers (gray lines) and a pooled curve over all readers (red curve) for eBWR (A) versus iBWR (B) for lesions reaching into dentine. Here, iBWR clearly outperforms eBWR

For enamel caries detection, iBWR outperformed eBWR in specificity, PPV, and LR+ at both time points (all Ws ≥ 110, all ps ≤ 0.03, Fig. 5). AUC ranged between 0.481 and 0.633 (mean: 0.541) for iBWR, versus 0.377 to 0.606 (mean: 0.518) for eBWR. However, eBWR demonstrated significantly higher sensitivity at T1 (mean ± standard deviation: eBWR 0.564 ± 0.23 vs. iBWR 0.268 ± 0.14, p < 0.01). No significant differences were observed between iBWR and eBWR in AUC, NPV, and LR− for detecting enamel caries (all Ws ≥ 64, all ps ≥ 0.06). Two-factorial analysis of variance revealed a significant influence of lesion depth (p < 0.001) and of the combination of lesion depth and method (p < 0.01) on sensitivity. The imaging method alone did not show a significant influence on sensitivity.

Fig. 5.

ROC-curves of all 27 readers (gray lines) and a pooled curve over all readers (red curve) for eBWR (A) versus iBWR (B) for lesions confined to enamel. Clearly, the lines indicate a performance that on average is just above chance

Discussion

It is well known that the diagnostic accuracy of bitewing radiography for the detection of minute first-stage carious lesions is low [26–28]. Nevertheless, these radiographs continue to be acquired in large numbers in clinical practice. This is understandable, since intraoral BWR, despite their questionable performance for early-stage lesions, represent the state of the art for general interproximal caries detection in patients [29]. Not only from a workflow perspective, it would be beneficial to acquire these radiographs without the rather complex placement of an intraoral image receptor in the patient’s oral cavity. Patients would certainly also favor such an option if it were proven equivalent to the state of the art. Patients with a strong gag reflex or narrow anatomical conditions in particular would benefit from extraoral imaging.

Here extraoral BWR comes into play. Although some authors question the term “extraoral bitewing radiography” per se [30], various panoramic radiography manufacturers now implement programs using this term. This study investigated the diagnostic accuracy of iBWR versus eBWR in the detection of real carious lesions in an ex vivo scenario. To mitigate the well-known inter-observer variance in radiological diagnosis [11, 12], a multi-observer design was employed. Twenty-seven dentists rated the radiographs on a five-point Likert scale. One important difference between the modalities is the resulting image size: because eBWR are acquired in a panoramic machine with a line detector, they contain more pixels than iBWR (Fig. 2). We used a 1:1 pixel display, which should be applied to medical imagery so as not to waste image information. As a consequence, however, the displayed size of the assessment regions (the tooth crowns) is smaller in eBWR, as a considerable amount of anatomy (i.e. the alveolar ridge) is included above and below the teeth. Yet this also reflects the clinical situation. How this influences diagnostic accuracy cannot be concluded from our study.

Our results indicate that both intraoral and extraoral BWR showed an overall low performance in the detection of interproximal carious lesions. Yet intraoral BWR (AUC values of ca. 0.58) slightly outperformed their extraoral counterparts (AUC values of ca. 0.54). In particular, specificity was higher for iBWR, meaning that sound interproximal surfaces are more likely to be correctly identified on them. In a clinical context, this would reduce the number of unnecessary interventions. Interestingly, for enamel caries the performance difference was even more pronounced. On the other hand, in the first observation round we observed a higher sensitivity for enamel lesions in eBWR. We cannot explain this finding. Higher sensitivity would yield more correctly detected enamel lesions. As unnecessary interventions, particularly for enamel caries, should be avoided, we feel that the risk of false-positive “detections” in a patient setting would be less favorable. As Az values reflect the performance of a diagnostic system by iteratively shifting the trade-off between sensitivity and specificity according to the observers’ confidence levels, iBWR altogether outperformed eBWR. This may be partly explained by the differences in spatial resolution between the systems. While intraoral sensor-based digital radiography may offer a spatial resolution of between 6 lp/mm and 15 lp/mm [31], the values for digital panoramic radiography range roughly between 2 lp/mm and 3 lp/mm [32]. Abu El-Ela and colleagues, in their study using artificially induced acid lesions, observed no difference between eBWR and iBWR [10]. Although they used the same panoramic device for the eBWR, their study had only two observers, and no real caries was assessed. The authors observed strong interobserver agreement [10], while in our study both inter- and intra-rater reproducibility were low.

This can also be attributed to the fact that the diagnostic accuracy of bitewing radiography for the detection of minute first-stage carious lesions is low [26–28]. Hence agreement between observers will also be low because they are very uncertain in their judgment. Obviously, the chance of a lower overall agreement rises with the number of observers involved. Obuchowski in 2004 concluded “From the earliest phase to the final phases of assessment, multiple-observer studies are critical to clinical studies of medical imaging” [33]. It seems likely that specific training with both modalities would enhance agreement between observers. Here, the focus should lie on initial enamel lesion detection. It would also be interesting to investigate how a well-trained deep learning system performs on this specific task.

We believe that the relatively small differences in performance of the two modalities could only be detected due to the high number of observers involved in our study. Another study comparing eBWR versus iBWR in real patients found higher sensitivity yet lower specificity for caries detection in the former modality [34]. However, this study suffers from the severe, yet in a patient study unavoidable, drawback that a carious lesion was defined by consensus radiographic diagnosis from 5 observers. This lack of a gold standard has been shown to overestimate accuracy and to be potentially misleading [9]. We identified two studies using histological sections for assessment of ground truth [7, 35]. Both studies observed substantially higher Az values (> 0.8 [7] and > 0.7 [35]). We believe the low values we observed can be explained by the high prevalence of lesions confined to the enamel and the low proportion extending into dentin. This made the diagnostic task for the observers rather complex. In summary, our findings suggest that intraoral bitewing radiography performs somewhat better in the generally difficult task of detecting an initial interproximal carious lesion. The small but significant difference raises the question of the extent to which it is clinically relevant. This is particularly true with regard to today’s much less invasive caries therapy. Thus, the question of clinical relevance needs to be addressed in the future by patient outcome-centered studies.

Despite enhanced patient comfort, extraoral bitewing radiography also has some drawbacks. Effective dose has been shown to be higher compared with that imposed by intraoral BWR [36]. In addition, eBWR requires a suitably equipped panoramic radiography device, which is certainly not as globally available as intraoral radiography equipment.

Our study also has limitations. Although we took great care to produce a model closely resembling the natural situation, the natural teeth were placed in plaster models and not in real bone. This altered the radiographic appearance at least in the root region of the resulting BWR. It can only be speculated whether this had an influence on observer performance and thus on the resulting accuracy measures. The PMMA phantom also does not reproduce a real-world scattering environment. Hence, the absolute accuracy values (sensitivity and specificity) likely overestimate diagnostic accuracy in a real-world scenario. Using real patient radiographs would be a solution here, but with the considerable disadvantage of a missing gold standard [18]. Unfortunately, there is as yet no approach that combines real patient radiographs with a robust, reliable gold standard for caries status. Despite this obvious shortcoming, our study due to its design was able to detect rather small differences in diagnostic performance of the two modalities. Thus, subject to the above restrictions, we believe that the results are relevant for daily practice. However, it is also clear that future studies should focus on patient outcome to provide more insight into the clinical implications of the small differences.

In conclusion, our results indicate a slightly yet significantly higher overall accuracy in detecting carious lesions for intraoral BWR. The study also confirms that diagnostic accuracy for minute enamel lesions is at best slightly higher than random guessing. For dentin lesions, both intraoral and extraoral BWR showed equal performance except for specificity, which was higher for iBWR. Altogether our results indicate that the gold standard for radiographic interproximal caries detection remains intraoral bitewing radiography. However, the patient-centered effect caused by the performance difference still needs to be investigated.

Acknowledgements

The authors gratefully acknowledge Dentsply Sirona for providing the PMMA phantom. We also thank all observers for their valuable time and experience and also the staff (particularly Mrs. Christina Babel) of the Dept. of Oral and Maxillofacial Surgery, Plastic Surgery Department, University Medical Centre of the Johannes Gutenberg University Mainz, Germany for their help in preparing the specimens for histological analysis. This work includes parts of the Dr. med. dent. Thesis of Julia Caroline Quintus entitled "An ex vivo comparison of extraoral and intraoral bitewing radiographs in the detection of approximal caries".

Author contributions

The study concept was designed by RS and JQ. RS and JQ wrote the manuscript text. JQ acquired and evaluated the data. Both authors reviewed the manuscript.

Funding

Open access funding provided by University of Bern. This study was funded by Dentsply Sirona, which provided the panoramic machine and the PMMA phantom.

Data Availability

The scientific raw data of this study can be provided as an Excel file upon reasonable request.

Declarations

Ethics approval

As this was a phantom study, no ethical approval was required. The teeth for the phantom were obtained from routine extractions with the patients' consent for secondary scientific use.

Consent

All patients from whom the teeth had been extracted gave their general consent for secondary scientific use of material and data.

Competing interests

Financial interests: RS (The University Medical Center of the Johannes Gutenberg-University of Mainz) received financial support from Dentsply Sirona for the project. Non-financial interests: RS is an unpaid member of various national (DIN) and international (IEC) standardization committees.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Grieco P, Jivraj A, Silva JD, Kuwajima Y, Ishida Y, Ogawa K, Ohyama H, Ishikawa-Nagai S (2022) Importance of bitewing radiographs for the early detection of interproximal carious lesions and the impact on healthcare expenditure in Japan. Ann Transl Med 10:2. 10.21037/atm-21-2197
  • 2. Schwendicke F, Göstemeyer G (2020) Conventional bitewing radiography. Clin Dent Rev 4:22. 10.1007/s41894-020-00086-8
  • 3. Global oral health status report (2023) Towards universal health coverage for oral health by 2030. Regional summary of the Eastern Mediterranean Region. World Health Organization, Geneva. https://iris.who.int/bitstream/handle/10665/375771/9789240070806-eng.pdf?sequence=1 Licence: CC BY-NC-SA 3.0 IGO
  • 4. Fleming E, Afful J (2018) Prevalence of total and untreated dental caries among youth: United States, 2015–2016. NCHS Data Brief, no 307. Hyattsville, MD: National Center for Health Statistics https://www.cdc.gov/nchs/data/databriefs/db307.pdf
  • 5. Takahashi N, Lee C, Silva JDD, Ohyama H, Roppongi M, Kihara H, Hatakeyama W, Ishikawa-Nagai S, Izumisawa M (2019) A comparison of diagnosis of early stage interproximal caries with bitewing radiographs and periapical images using consensus reference. Dentomaxillofac Radiol 48:20170450. 10.1259/dmfr.20170450
  • 6. Wenzel A (2009) Dental caries. In: White SC, Pharoah MJ (eds) Oral radiology, principles and interpretation, 6th edn. Mosby, St Louis, MO, pp 270–281
  • 7. Kamburoglu K, Kolsuz E, Murat S, Yüksel S, Ozen T (2012) Proximal caries detection accuracy using intraoral bitewing radiography, extraoral bitewing radiography and panoramic radiography. Dentomaxillofac Radiol 41:450–459. 10.1259/dmfr/30526171
  • 8. Terry GL, Noujeim M, Langlais RP, Moore WS, Prihoda TJ (2016) A clinical comparison of extraoral panoramic and intraoral radiographic modalities for detecting proximal caries and visualizing open posterior interproximal contacts. Dentomaxillofac Radiol 45:20150159. 10.1259/dmfr.20150159
  • 9. Wenzel A, Hintze H (1999) The choice of gold standard for evaluating tests for caries diagnosis. Dentomaxillofac Radiol 28:132–136. 10.1259/dmfr.28.3.10740465
  • 10. Abu El-Ela WH, Farid MM, Mostaf MS (2016) Intraoral versus extraoral bitewing radiography in detection of enamel proximal caries: an ex vivo study. Dentomaxillofac Radiol 45:20150326. 10.1259/dmfr.20150326
  • 11. Goldman M, Pearson AH, Darzenta N (1974) Reliability of radiographic interpretations. Oral Surg Oral Med Oral Pathol 38:287–293. 10.1016/0030-4220(74)90070-x
  • 12. Kaffe I, Gratt BM (1988) Variations in the radiographic interpretation of the periapical dental region. J Endod 14:330–335. 10.1016/S0099-2399(88)80193-6
  • 13. Madsack B, Walz M, Weisser G (2014) Abnahme und Konstanzprüfung an Bildwiedergabesystemen – was ändert sich mit der neuen DIN V 6868-157? Radiopraxis 7:195–210. 10.1055/s-0034-1390974
  • 14. Alkurt MT, Peker I, Bala O, Altunkaynak B (2007) In vitro comparison of four different dental X-ray films and direct digital radiography for proximal caries detection. Oper Dent 32:504–509. 10.2341/06-148
  • 15. Hanley JA (1989) Receiver operating characteristic (ROC) methodology: the state of the art. Crit Rev Diagn Imaging 29:307–335
  • 16. Willbold E, Witte F (2010) Histology and research at the hard tissue–implant interface using Technovit 9100 New embedding technique. Acta Biomater 6:4447–4455. 10.1016/j.actbio.2010.06.022
  • 17. Donath K, Breuner G (1982) A method for the study of undecalcified bones and teeth with attached soft tissues. The Säge-schliff (sawing and grinding) technique. J Oral Pathol 11:318–326. 10.1111/j.1600-0714.1982.tb00172.x
  • 18. Hintze H, Wenzel A (2003) Diagnostic outcome of methods frequently used for caries validation. A comparison of clinical examination, radiography and histology following hemisectioning and serial tooth sectioning. Caries Res 37:115–124. 10.1159/000069016
  • 19. Shrout PE, Fleiss JL (1979) Intraclass correlations: uses in assessing rater reliability. Psychol Bull 86:420–428. 10.1037//0033-2909.86.2.420
  • 20. R Core Team (2024) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
  • 21. RStudio Team (2020) RStudio: integrated development for R. RStudio, PBC, Boston, MA. http://www.rstudio.com/
  • 22. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, Müller M, Siegert S, Doering M, Billings Z (2023) pROC: display and analyze ROC curves [Internet]. https://cran.r-project.org/web/packages/pROC/pROC.pdf [accessed 1 November 2024]
  • 23. Koo TK, Li MY (2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 15:155–163. 10.1016/j.jcm.2016.02.012
  • 24. Šimundić A-M (2009) Measures of diagnostic accuracy: basic definitions. Electron J Int Fed Clin Chem Lab Med 19(4):203–211
  • 25. Cohen J (1988) Statistical power analysis for the behavioral sciences. Lawrence Erlbaum Associates,
  • 26. Hintze H, Wenzel A, Danielsen B, Nyvad B (1998) Reliability of visual examination, fibre-optic transillumination, and bite-wing radiography, and reproducibility of direct visual examination following tooth separation for the identification of cavitated carious lesions in contacting approximal surfaces. Caries Res 32:204–209. 10.1159/000016454
  • 27. Machiulskiene V, Nyvad B, Baelum V (2004) Comparison of diagnostic yields of clinical and radiographic caries examinations in children of different age. Eur J Paediatr Dent 5:157–162
  • 28. Devlin H, Williams T, Graham J, Ashley M (2021) The ADEPT study: a comparative study of dentists’ ability to detect enamel-only proximal caries in bitewing radiographs with and without the use of AssistDent artificial intelligence software. Br Dent J 231:481–485. 10.1038/s41415-021-3526-6
  • 29. Wenzel A (2021) Radiographic modalities for diagnosis of caries in a historical perspective: from film to machine-intelligence supported systems. Dentomaxillofac Radiol 50:20210010. 10.1259/dmfr.20210010
  • 30. Johnson KB, Mol A, Tyndall DA (2021) Extraoral bite-wing radiographs: a universally accepted paradox. J Am Dent Assoc 152:444–447. 10.1016/j.adaj.2021.02.015
  • 31. Hellén-Halme K, Johansson C, Nilsson M (2016) Comparison of the performance of intraoral X-ray sensors using objective image quality assessment. Oral Surg Oral Med Oral Pathol Oral Radiol 121:e129–137. 10.1016/j.oooo.2016.01.016
  • 32. Yeom HG, Kim JE, Huh KH, Yi WJ, Heo MS, Lee SS, Choi SC (2020) Correlation between spatial resolution and ball distortion rate of panoramic radiography. BMC Med Imaging 68. 10.1186/s12880-020-00472-5
  • 33. Obuchowski NA (2004) How many observers are needed in clinical studies of medical imaging? AJR Am J Roentgenol 182:867–869. 10.2214/ajr.182.4.1820867
  • 34. Chan M, Dadul T, Langlais R, Russell D, Ahmad M (2018) Accuracy of extraoral bite-wing radiography in detecting proximal caries and crestal bone loss. J Am Dent Assoc 149:51–58. 10.1016/j.adaj.2017.08.032
  • 35. Gaalaas L, Tyndall D, Mol A, Everett ET, Bangdiwala A (2016) Ex vivo evaluation of new 2D and 3D dental radiographic technology for detecting caries. Dentomaxillofac Radiol 45:20150281. 10.1259/dmfr.20150281
  • 36. Dorsey AK, Mol A, Green P, Ludlow J, Johnson B (2024) Radiation doses in extraoral bitewing radiography compared with intraoral bitewing and panoramic radiography. Oral Surg Oral Med Oral Pathol Oral Radiol 137(2):182–189. 10.1016/j.oooo.2023.09.002

Articles from Clinical Oral Investigations are provided here courtesy of Springer
