International Journal of Dentistry. 2024 Aug 10;2024:3965641. doi: 10.1155/2024/3965641

Dental Cavity Grading: Comparing Algorithm Reliability and Agreement with Expert Evaluation

Abubaker Qutieshat,1,2 Abdurahman Salem,3 Melina N. Kyranides4
PMCID: PMC11330331  PMID: 39157299

Abstract

Aim

The current study introduces a novel, algorithm-based software developed to objectively evaluate dental cavity preparations. The software aims to provide an alternative or complement to traditional, subjective assessment methods used in operative dentistry education.

Materials and Methods

The software was tested on cavity preparations carried out by 70 participants on artificial molar teeth. These cavities were also independently assessed by an experienced academic panel. The software, using 3D imaging, calculated cavity dimensions and assigned an error score based on deviation from ideal measurements. Statistical analyses included sensitivity, specificity, positive predictive value, negative predictive value, Cohen's kappa, the intraclass correlation coefficient (ICC3k), Spearman's rho, Kendall's tau correlation coefficients, and a confusion matrix.

Results

The software demonstrated a high degree of accuracy and agreement with the panel assessments. The average software and panel scores were 64.1 and 60.91, respectively. Sensitivity (0.98) was high, specificity (0.55) was moderate, and the ICC3k value (0.857) indicated a strong agreement between the software and the panel. Further, Spearman's rho (0.73) and Kendall's tau (0.56) suggested a strong correlation between the two grading methods.

Conclusion

The results support the algorithm-based software as a valid and reliable tool for dental cavity preparation assessments. The software's potential use in dental education is promising, though future research is necessary to validate and optimize this technology for wider application.

1. Introduction

Dental education traditionally relies on hands-on training and practical assessments to evaluate students' skills and competence in various procedures. One critical aspect of dental education is the preparation of dental cavities, which demands a high degree of precision and accuracy to ensure successful restorations. A panel of experienced dental professionals typically carries out the assessment of students' performance in cavity preparation. They evaluate the quality of the work based on various criteria, such as the cavity's shape, depth, and preservation of the tooth structure.

However, the conventional assessment method involving an expert panel can be time-consuming, subjective, and potentially influenced by human error or bias [1, 2, 3]. Additionally, the growing number of dental students and the demand for more efficient and objective assessment methods have prompted investigations into alternative approaches, such as technology-aided assessment systems [4, 5, 6, 7, 8, 9]. These tools have the potential to provide more objective, accurate, and efficient evaluations than traditional panel-based assessments. Moreover, the use of technology-aided assessment systems can standardize the assessment process, enhance student feedback, and facilitate the identification of areas for improvement [10, 11, 12].

The aim of this study was to compare the efficacy of our specially developed open-source algorithm software with that of traditional expert panel evaluations in assessing dental cavity preparations. The null hypothesis proposed was that there would be no significant difference in the assessment results produced by the algorithm-based software and the expert panel.

In this study, we developed specially programmed open-source algorithm software, available at https://cephcad.com/dentalign/, for evaluating dental cavity preparations. Its performance was compared with the assessments made by a panel of experienced academics specializing in restorative dentistry.

The findings from this study have the potential to contribute significantly to the expanding body of literature on technology's role in dental education, providing valuable insights into the benefits and limitations of implementing software applications for assessing dental cavity preparations. Ultimately, these results may equip dental educators and institutions with crucial knowledge for making informed decisions on adopting technology-enhanced assessment methods, thus enhancing the overall quality of dental education.

2. Methodology

This study was approved by the Institutional Review Board under protocol number ODC-AE-2022-170, ensuring strict adherence to ethical guidelines for research involving human subjects. A cohort of 70 fourth-year dental students participated in this study. Participants were asked to prepare a cavity, a common exercise in operative dentistry, on an artificial mandibular third molar tooth (Nissin Dental Prod. Inc.) mounted in a traditional phantom head. The students were provided with precise instructions following ideal amalgam cavity preparation criteria. Regarding the outline, the participants were tasked with preparing the mesial and distal grooves, including the mesial, central, and distal pits. Clear specifications for the ideal depth and width were provided: a depth of 1.50 mm and a width of 1.25 mm. Extending the cavity into the secondary and developmental grooves was optional, and fulfilling the cavity convergence criterion was not requested. All participants used the same preparation bur (no. 330 bur, Komet) and were equipped with a standardized Williams graduated periodontal probe (Hu-Friedy).

2.1. Evaluating Performance on the Task

After completing the task, the prepared teeth were collected and optically scanned using a Ceramill Map400 scanner (Amann Girrbach). The Standard Tessellation Language (STL) images obtained from the CAD/CAM scanner were then aligned against a reference "unprepared" tooth STL image using custom software developed by the authors (Figure 1) [13]. The software performs the alignment with an iterative closest point (ICP) algorithm: it takes the surface information from two sets of 3D point vertices (i.e., reference tooth vs. prepared tooth) and calculates the rigid body transformation using singular value decomposition. The ICP algorithm minimizes the mean squared error of matching the two point sets, evaluated as the 3D Euclidean distances between the closest surface points on the two images. Part of the occlusal surface was excluded from the alignment calculation to prevent cavity preparation values from interfering with the initial alignment (Figure 2).
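The rigid body step of this alignment can be sketched as the SVD-based least-squares solution for corresponding point pairs (the Kabsch solution; full ICP additionally re-estimates the closest-point correspondences and iterates). This is an illustrative Python sketch, not the authors' JavaScript implementation:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid body transform (rotation R, translation t)
    mapping corresponding 3D points src -> dst, computed via SVD."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

def mean_squared_error(src, dst, R, t):
    """Mean squared 3D Euclidean distance after applying (R, t) to src."""
    residual = (src @ R.T + t) - dst
    return float(np.mean(np.sum(residual**2, axis=1)))
```

With exact correspondences this recovers the rotation and translation to machine precision; in the full ICP loop, `mean_squared_error` is the quantity driven toward a minimum at each iteration.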

Figure 1.

Figure 1

Source code for the dental cavity assessment algorithm: (a) the rigid body transformation algorithm essential for aligning 3D dental scans; (b) the worker code responsible for processing cavity dimensions and evaluating deviations from ideal measurements.

Figure 2.

Figure 2

A screenshot from the custom software displaying the aligned prepared tooth (b) and the reference, unprepared tooth model (a). The depth of the preparation is visualized in the corresponding heatmap at (c), allowing for an in-depth comparison with the ideal tooth geometry.

Cavity preparations were assessed with specially programmed algorithm software, developed by the authors and written in JavaScript (the software is open source and available at https://cephcad.com/dentalign/) [13]. The software was programmed to measure cavity depth, width, and mesiodistal extension. Measurements included three depth readings (mesial, middle, and distal), two cavity isthmus width readings (mesial and distal), and two readings of the remaining tooth structure at the marginal ridge area (mesial and distal marginal ridge). The algorithm defined an error as a 0.25 mm deviation from the ideal geometry and assigned one error value per 0.25 mm increment (e.g., an error value of one for a cavity that falls 0.25 mm short of including the mesial pit, and an error value of three for a distal cavity isthmus that is 2 mm wide). The final score for each tooth was calculated as the sum of the error values across the seven readings (Figure 3).
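A minimal sketch of this scoring rule, assuming (per the examples given) one error value per full 0.25 mm of deviation, summed over the readings; function and variable names are illustrative, not taken from the published JavaScript source:

```python
TOLERANCE = 0.25  # mm of deviation per error point

def error_value(measured_mm, ideal_mm, tolerance=TOLERANCE):
    """One error value per full 0.25 mm deviation from the ideal
    dimension (a small epsilon guards against float rounding)."""
    deviation = abs(measured_mm - ideal_mm)
    return int((deviation + 1e-9) // tolerance)

def cavity_score(readings):
    """Sum of error values over (measured, ideal) reading pairs;
    the study used seven readings per tooth."""
    return sum(error_value(m, ideal) for m, ideal in readings)
```

For example, a 2.00 mm distal isthmus against the 1.25 mm ideal deviates by 0.75 mm and contributes three error values, matching the example in the text.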

Figure 3.

Figure 3

Screenshots from the authors' custom assessment software showcasing the measurement process for cavity preparations. Key dimensions are highlighted: cavity depth readings at mesial, middle, and distal points; isthmus width at mesial and distal points; and remaining tooth structure at mesial and distal marginal ridge areas. These measurements provide comprehensive insight into the quality of the cavity preparation.

Cavity preparations were assessed by a panel consisting of three experienced academics specializing in restorative dentistry, selected for their extensive experience and educational roles. Prior to the study, the panel members were oriented to a unified framework of evaluation through a validated assessment sheet, informed by the University of Dundee's criteria for amalgam cavity preparations. This sheet detailed the specific dimensions and characteristics of ideal cavity preparations, ensuring that all evaluators' judgments were grounded in a shared understanding of the assessment criteria.

The panel's calibration process was designed to closely simulate the environment of a standard examination setting, emphasizing the panel's familiarity with the criteria to reflect the natural variance seen in academic evaluations while maintaining the assessment's integrity and relevance. In cases of divergent evaluations, panel members engaged in a consensus discussion, leveraging their collective expertise to determine the final grades. This procedure mirrors the collaborative decision-making typical in academic settings, ensuring fair and comprehensive student evaluations.

2.2. Standard Setting and Cutoff Point Determination

To further solidify the reliability of the assessment, the fail mark for cavity preparations was established using the borderline group method, which involved an additional group of 10 experienced dentists. This group independently reviewed a selection of cavity preparations to define the threshold between passing and failing, ultimately identifying a score range of 40–50. The average score within this borderline group was calculated and set as the new cutoff score for pass/fail decisions, which was established at 44. This process provided a clear and empirically grounded benchmark for software comparison, contributing to the study's objective of evaluating the software's effectiveness in a rigorous academic setting.
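The borderline group calculation reduces to averaging the scores of the preparations judged borderline. A sketch; the scores below are hypothetical illustrations in the reported 40–50 range, not the study's data:

```python
def borderline_cutoff(borderline_scores):
    """Pass/fail cutoff = mean score of the preparations the expert
    group judged borderline (borderline group method of standard setting)."""
    return sum(borderline_scores) / len(borderline_scores)

# Hypothetical borderline scores in the 40-50 range:
scores = [41, 43, 44, 45, 42, 46, 44, 45, 43, 47]
cutoff = borderline_cutoff(scores)   # 44.0 for this illustrative set
```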

2.3. Statistical Analysis

The grades assigned by both the panel and the software were given out of 100 points. For certain statistical analyses, these grades were converted into a letter grading system: A (80–100), B (60–79.99), C (40–59.99), D (20–39.99), and E (0–19.99).
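The band conversion described above can be expressed directly (a small sketch; the function name is illustrative):

```python
def letter_grade(score):
    """Map a 0-100 score to the study's letter bands:
    A (80-100), B (60-79.99), C (40-59.99), D (20-39.99), E (0-19.99)."""
    for lower, grade in [(80, "A"), (60, "B"), (40, "C"), (20, "D")]:
        if score >= lower:
            return grade
    return "E"
```

For instance, the mean software score of 64.1 falls in band B under this mapping.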

The software's performance in assessing the dental cavity preparations was evaluated using a series of statistical tests. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), the area under the receiver operating characteristic (ROC) curve, and Cohen's kappa were calculated to understand the software's discriminative ability and its agreement with the panel. To complement these analyses, a precision–recall curve was generated, helping identify an optimal cutoff score for the software.
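The first four of these metrics follow directly from a 2×2 pass/fail table. A sketch assuming "positive" codes a passing preparation and the panel decision serves as the reference standard (the paper does not spell out this coding, so treat it as an assumption):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from 2x2 confusion
    counts, with the panel's pass/fail decision as ground truth."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # precision
        "npv": tn / (tn + fn),
    }
```

Sweeping the software's cutoff score and recomputing these counts at each threshold is what traces out the ROC and precision–recall curves.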

Additionally, the intraclass correlation coefficient (ICC3k) was computed to measure the degree of agreement between the software and the panel. The ICC3k value offers insight into the consistency of the average of the panel and software's measurements in this specific comparison.
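One common formulation of ICC(3,k) (the Shrout–Fleiss two-way mixed, average-measures, consistency form) computes it from the two-way ANOVA mean squares. A sketch, not the R routine used in the study:

```python
import numpy as np

def icc3k(scores):
    """ICC(3,k): (MS_subjects - MS_error) / MS_subjects, from a
    two-way layout `scores` of shape (n_subjects, k_raters)."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / ms_rows
```

Because this is a consistency (not absolute-agreement) form, a rater with a constant offset still yields an ICC of 1; systematic offsets are instead what the Bland–Altman bias reveals.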

Further insight into the agreement between the panel and the software was sought by conducting a Bland–Altman analysis. This plot-based approach provides a visual representation of the differences between the panel and software scores against their averages, revealing any potential fixed or proportional bias between the two methods.
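The bias and 95% limits of agreement underlying such a plot reduce to a few lines (a sketch; the paired scores in the usage test are hypothetical, not the study's data):

```python
import statistics

def bland_altman(panel, software):
    """Bias (mean of paired differences) and 95% limits of agreement
    (bias +/- 1.96 * SD of the differences)."""
    diffs = [p - s for p, s in zip(panel, software)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

Each point of the plot is then drawn at x = (panel + software) / 2 and y = panel − software, with horizontal lines at the bias and the two limits.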

Further analysis of the agreement between the panel and the software used the rank correlation coefficients Spearman's rho and Kendall's tau, which measure the strength and direction of association between the panel and software scores. The associated p-values indicated the statistical significance of these correlations.
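Both rank coefficients can be sketched for untied data (real score data contain ties, for which standard packages apply tau-b and mid-rank corrections; this sketch omits those):

```python
from itertools import combinations

def _sign(a):
    return (a > 0) - (a < 0)

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) over all pairs."""
    n = len(x)
    s = sum(_sign(x[i] - x[j]) * _sign(y[i] - y[j])
            for i, j in combinations(range(n), 2))
    return s / (n * (n - 1) / 2)

def spearman_rho(x, y):
    """Spearman's rho for untied data: 1 - 6*sum(d^2) / (n*(n^2-1)),
    with d the per-item difference in ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Both coefficients depend only on score orderings, which is why they complement the ICC: they capture whether the two methods rank students the same way, regardless of scale.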

Lastly, to assess the software's ability to assign grades consistent with the panel's assessments on the A–E grading system, a confusion matrix was constructed. Linear and quadratic weighted kappa values were calculated from this matrix to provide more nuanced insight into agreement, given the ordinal nature of the grades.
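Weighted kappa follows directly from such a confusion matrix (a Python sketch, not the R routine used in the study; rows and columns index grades in scale order, so adjacent-grade disagreements are penalized less than distant ones):

```python
import numpy as np

def weighted_kappa(confusion, weighting="quadratic"):
    """Weighted Cohen's kappa for ordinal categories from a square
    confusion matrix (rows: panel grade, cols: software grade)."""
    m = np.asarray(confusion, dtype=float)
    k = m.shape[0]
    i, j = np.indices((k, k))
    if weighting == "linear":
        w = np.abs(i - j) / (k - 1)          # penalty grows linearly
    else:
        w = ((i - j) / (k - 1)) ** 2         # quadratic penalty
    observed = m / m.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    return 1 - (w * observed).sum() / (w * expected).sum()
```

The quadratic weights punish large grade discrepancies more heavily, which is why the study's quadratic value (0.66) exceeds the linear one (0.51): most disagreements were only one band apart.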

All statistical analyses were performed using the R software (version 3.6.2; R Foundation for Statistical Computing, Vienna, Austria). For an illustrative representation of the data, heatmaps were generated using Python (Python Software Foundation, version 3.11) with Seaborn 0.11.2 and Matplotlib 3.7.1 packages.

3. Results

Table 1 provides a concise comparison of key statistical metrics, including mean, median, mode, standard deviation, variance, and interquartile range (IQR), for both the panel and software assessments of cavity preparations.

Table 1.

Summary of statistical measures for panel and software scores.

Metric Panel scores Software scores
Mean 60.91 64.1
Median 67.5 66.5
Mode 69 69
Standard deviation 18.24 16.1
Variance 332.76 259.32
Minimum score 3 8
Maximum score 87 94
Range 84 86
1st quartile (Q1) 53.5 56.25
3rd quartile (Q3) 72.75 74.0
Interquartile range (IQR) 19.25 17.75

Table 2 presents the discriminative metrics of the software, encompassing sensitivity, specificity, PPV, NPV, and the area under the ROC curve (AUC). For the ROC curve's visual representation, see Figure 4. The precision–recall tradeoffs at various thresholds are depicted in Figure 5.

Table 2.

Discriminative metrics of the software.

Metric Score
Sensitivity 0.98
Specificity 0.55
Positive predictive value (PPV) 0.92
Negative predictive value (NPV) 0.86
Area under the ROC curve (AUC) 0.82

Figure 4.

Figure 4

Receiver operating characteristic (ROC) curve for the software's assessment of dental cavity preparations. The curve plots the true positive rate (sensitivity, on the y-axis) against the false positive rate (1 − specificity, on the x-axis) at various threshold settings. The area under the curve (AUC) represents the software's discriminative ability in distinguishing between correctly and incorrectly prepared cavities. A larger AUC indicates better discriminative performance.

Figure 5.

Figure 5

Precision–recall curve illustrating the performance of the software in assessing dental cavity preparations. The curve demonstrates the tradeoff between the precision (y-axis) and the recall (x-axis) at various threshold settings. The area under the curve (AUC) represents the overall performance of the software in recognizing correct cavity preparations, with a larger AUC indicating better performance.

Table 3 details the agreement metrics between software and panel assessments, offering a precise comparison between the two methods' scores. Further insight into the agreement was gained through a Bland–Altman analysis, visualizing the differences between the panel and software scores against their averages (Figure 6). This approach gave a more detailed picture of the level of agreement between the two methods, revealing any potential fixed or proportional bias. For further analysis on the agreement levels between the panel and software across different grading categories, refer to Figure 7. This heatmap of the confusion matrix visualizes the distribution and frequency of agreement and disagreement on grades, with color intensity highlighting the prevalence of each grade combination.

Table 3.

Agreement metrics between software and panel assessment.

Metric Value Note
ICC3k 0.857 High level of agreement; 95% CI = [0.77, 0.91], p < 0.01
Cohen's kappa 0.62 Substantial agreement
Spearman's rho 0.73 Strong positive correlation; p < 0.01
Kendall's tau 0.56 Strong positive correlation; p < 0.01
Agreement on worst 5% 80.00%
Agreement on top 20% 65.00%
Linear weighted kappa 0.51 Moderate agreement
Quadratic weighted kappa 0.66 Substantial agreement

Note. The intraclass correlation coefficient (ICC3k) measures the overall agreement between the software and the panel. Cohen's kappa, Spearman's rho, and Kendall's tau indicate the consistency in ranking and grading between the two methods. The linear and quadratic weighted kappa values assess the degree of agreement, factoring in the severity of disagreements.

Figure 6.

Figure 6

Bland–Altman plot illustrating the agreement between the panel and software scores. The x-axis represents the mean of the panel and software scores, and the y-axis represents the differences between the two scores. The middle line represents the mean difference (bias), and the outer lines represent the limits of agreement (mean difference ± 1.96 × SD of the difference), encompassing 95% of the differences. The plot reveals any potential fixed or proportional bias between the two assessment methods.

Figure 7.

Figure 7

Heatmap of the confusion matrix representing the comparison of grades assigned by the expert panel and the algorithm-based software. The x-axis corresponds to the grades assigned by the software, while the y-axis corresponds to the grades assigned by the panel. The color intensity reflects the frequency of cases in each grade combination. The diagonal from the top left to the bottom right represents cases where the software and the panel assigned the same grade. Off-diagonal elements represent disagreements.

4. Discussion

Our study revealed significant findings regarding the assessment of dental cavity preparations through an innovative integration of algorithm-based software and traditional expert panel evaluations. Key outcomes include the software's high sensitivity (0.98), indicating exceptional accuracy in identifying correct and incorrect cavity preparations, and an impressive intraclass correlation coefficient (ICC3k) of 0.857, demonstrating strong agreement with the expert panel's assessments. These results highlight the software's capability to offer precise, objective measurements of cavity depth, width, and deviations from ideal geometry, closely aligning with the experienced academics' validated assessments in operative dentistry education.

The decision to use a mandibular third molar in our study was primarily to minimize the inherent biases associated with expert judgment. Opting for a tooth that is less frequently encountered in routine evaluations allowed us to create a level playing field for both human and software assessments. This approach was particularly important to reduce familiarity bias, ensuring that the evaluations were based solely on the objective criteria set forth by the study, rather than prior experience or subjective judgment.

The software tended to assign slightly higher scores on average than the panel, which might reflect a more lenient evaluation by the algorithm, focusing on quantifiable aspects of cavity preparations and possibly overlooking some subjective criteria valued by experts [14, 15]. Despite this leniency, the variation in scores from both the software and the panel underscored a moderate performance level overall, pointing to the complex nature of operative dentistry and the diversity in technique. Such variability is indicative of the challenges inherent in standardizing assessments in a field where individual technique plays a significant role.

A noteworthy aspect of the software's performance was its high sensitivity and precision, indicating significant accuracy in distinguishing between successful and unsuccessful cavity preparations. Although there was a slight inclination toward false-positive errors, the high PPV and NPV suggested that the software-assigned grades were largely reliable indicators of performance.

The software's discriminative ability, as illustrated by the ROC curve, was good, and the AUC further reinforced this result. The precision–recall curve suggested that a balance between the precision and recall of the software can be achieved at various thresholds, making it a valuable tool for real-world assessment scenarios.

The analysis of agreement between the software and the panel yielded intriguing results. A high level of agreement, as reflected by the ICC3k, suggests that the software and panel's assessments were largely consistent. This outcome gives promising evidence for the algorithm's capacity to replicate human judgment in a standardized and unbiased manner. Bland–Altman analysis, Cohen's kappa, and correlation coefficients further supported this finding by demonstrating a strong agreement and positive correlation between the two methods.

It is worth noting that the software agreed more on the lower-performing samples than on the higher-performing ones. This observation implies that while the software can effectively identify underperformance, it may not be as proficient at recognizing top performers, possibly due to the intricacies of optimal preparations that are more readily appreciated by human experts. However, arguably, identifying lower-performing students can be more critical for educational purposes, as these students would greatly benefit from immediate intervention and feedback to improve their skills [16]. Early identification of struggling students has been evidenced to result in significantly better outcomes in academic performance and skill development [17, 18]. Therefore, our software's tendency to reliably detect underperformance is not necessarily a drawback but rather a valuable feature for effective pedagogical practice.

The analysis of the ordinal agreement, demonstrated through the linear weighted kappa of 0.51 and quadratic weighted kappa of 0.66, indicated a moderate-to-substantial agreement when considering the grades' ordinal nature. These findings suggest that while the software can effectively categorize performance into grades, it still requires refinement to match the panel's grading precision accurately. This analysis underscores the need for further development of the software to enhance its capacity for evaluating dental cavity preparations with precision akin to that of expert evaluators.

Interestingly, our findings resonate with those of related projects (i.e., E4D Compare Software and PrepCheck) that noted a growing affinity toward technology-facilitated evaluations, particularly in dental education [3, 19, 20, 21]. Dental academics' favorable reception of these technologies can be attributed to their perceived objectivity and consistency, attributes that align with the features of our algorithm-based software. Hence, our study further emphasizes the potentially transformative impact of such tools on the evaluation process within dental education.

This inclination of academics toward technological tools for evaluation also addresses the pervasive issue of skepticism toward subjective judgment. By providing a standardized, unbiased evaluation, our software could alleviate these concerns and circumvent potential disputes over grading legitimacy.

Digital assessment limitations highlighted previously in the literature [19] mirror our own and point to areas for improvement, especially regarding cavity convergence, divergence, and outline form. This underlines the need for further validation of the software's accuracy, an aspect our study also acknowledges. The software is a valuable tool for assessing the primary parameters of cavity preparations; however, it requires further development to evaluate other significant factors in dental preparations thoroughly.

In our study, both precision, defined as the measure of how often the software's positive predictions are correct, and sensitivity, indicating how well the software identifies actual positives, were remarkable [22, 23]. These robust metrics indicate that the software could potentially function as a standalone assessment tool, capable of delivering consistent and accurate evaluations without the need for an accompanying expert panel. This independence aligns with previous findings in dental education, where digital assessment tools have shown promise in achieving positive outcomes across differing methodologies [5, 12, 20, 24, 25].

This does not negate the value of an expert panel's insights and experience, but rather, it proposes an efficient alternative for instances where immediate, objective, and scalable assessment is advantageous. Our findings are corroborated by those of another study, which reported that excellent repeatability of digital evaluations does not necessarily equate to valid grading [24]. As such, striking a balance between human expertise and the software's precision and sensitivity could provide comprehensive, accurate, and efficient assessments, ultimately improving the quality of dental education.

Our findings underscore the value of algorithm-based software in dental cavity preparation assessments, offering a valuable supplement to traditional evaluation methods. There is room for improvement in the software's ability to recognize top-tier preparations, ensuring alignment with expert panel assessments. Such advancements will enhance the software's reliability and utility in operative dentistry education. Additionally, the software's capacity for customization and adaptability underlines its potential for wider educational use. By allowing for the personalization of comparison criteria and standardizing teaching methodologies, our software paves the way for its further development, promising a more targeted and comprehensive feedback mechanism.

One notable limitation of our study lies in its focus on cavity preparations using a mandibular third molar and specifically for amalgam restorations. This approach, while providing a rigorous test of precision, may not fully represent the broad spectrum of clinical scenarios encountered in contemporary dental practice. As the dental field continues to evolve with a shift toward composite restorations and inlays, moving away from amalgam, the specific choice of tooth and restoration material in this study may limit the generalizability of our findings. However, we emphasize the software's capability to adapt to a variety of restorative procedures, including inlay preparations and composite restorations, provided these are defined with clear evaluative criteria. This adaptability suggests that while our study used amalgam restorations for its initial assessment, the software's utility is not confined to this material alone, offering potential applicability across different restorative techniques. Additionally, the study was conducted within a controlled academic setting, which might not fully capture the complexities and variations of real-world clinical environments. Future studies could expand the scope by incorporating a wider range of teeth and restoration materials, including composites, to better reflect current dental practices and materials. Moreover, integrating the software's assessment capabilities into a clinical setting could provide insights into its practical utility and areas for further refinement.

5. Conclusion

The outcomes of this investigation underscore the algorithm-based software's efficacy in closely aligning with the assessments of well-trained and calibrated dental educators. This alignment reaffirms the high standards of teaching and evaluation upheld by experienced educators in the field of dentistry. While our findings do not suggest the software is superior to human evaluation, they underscore its potential as a complementary tool that enhances traditional assessment methods.

By integrating the objectivity and consistency of algorithmic assessment with the comprehensive understanding and personalized feedback from skilled educators, this software can enrich the evaluative process, making it more robust and comprehensive. Thus, the study contributes to the evolving landscape of dental education by illustrating how technology can support and augment the expert judgment of dental educators, rather than replace it. Future research may focus on quantifying the added value of such technologies in terms of efficiency, bias reduction, and educational outcomes to further define their role in dental education.

Acknowledgments

The authors extend their gratitude to Dr. Rayan Arfaoui for his exceptional coding skills and contributions to the development of the software used in this study. His expertise was invaluable, enabling precise functionality and features that significantly enhanced our research efforts.

Data Availability

The data that support the findings of this study are available on request from the corresponding author.

Ethical Approval

This research was approved by the Oman Dental College Ethics, Research, and Innovation Committee (ERIC), reference AE-2022-170.

Consent

Informed consent was obtained from all participating students through a cover letter that explained the study's objectives.

Conflicts of Interest

The authors declare they have no conflicts of interest.

Authors' Contributions

Abubaker Qutieshat was responsible for conceptualization (lead), data collection (lead), data analysis (lead), writing—original draft (lead), writing—review and editing (equal). Abdurahman Salem was responsible for data analysis (supporting), writing—review and editing (equal). Melina N. Kyranides was responsible for methodology (lead), data analysis (supporting), writing—review and editing (equal).

References

1. Sharaf A. A., AbdelAziz A. M., El Meligy O. A. S. Intra- and inter-examiner variability in evaluating preclinical pediatric dentistry operative procedures. Journal of Dental Education. 2007;71(4):540–544. doi: 10.1002/j.0022-0337.2007.71.4.tb04307.x
2. Satterthwaite J. D., Grey N. J. A. Peer-group assessment of pre-clinical operative skills in restorative dentistry and comparison with experienced assessors. European Journal of Dental Education. 2008;12(2):99–102. doi: 10.1111/j.1600-0579.2008.00509.x
3. Renne W. G., McGill S. T., Mennito A. S., et al. E4D compare software: an alternative to faculty grading in dental education. Journal of Dental Education. 2013;77(2):168–175. doi: 10.1002/j.0022-0337.2013.77.2.tb05459.x
4. Kateeb E. T., Kamal M. S., Kadamani A. M., Abu Hantash R. O., Abu Arqoub M. M. Utilising an innovative digital software to grade pre-clinical crown preparation exercise. European Journal of Dental Education. 2017;21(4):220–227. doi: 10.1111/eje.12204
5. Schepke U., van Wulfften Palthe M. B. E., Meisberger E. W., Kerdijk W., Cune M. S., Blok B. Digital assessment of a retentive full crown preparation—an evaluation of prepCheck in an undergraduate pre-clinical teaching environment. European Journal of Dental Education. 2020;24(3):407–424. doi: 10.1111/eje.12516
6. Matthisson L., Zitzmann N. U., Zaugg L. K., Joda T. Potential of intraoral optical scanning to evaluate motor skills' improvement for tooth preparation: a prospective cohort study. European Journal of Dental Education. 2022;26(4):669–675. doi: 10.1111/eje.12745
7. Kim Y.-K., Kim J.-H., Jeong Y., Yun M.-J., Lee H. Comparison of digital and conventional assessment methods for a single tooth preparation and educational satisfaction. European Journal of Dental Education. 2023;27(2):262–270. doi: 10.1111/eje.12799
8. Tahani B., Rashno A., Haghighi H., Monirifard R., Khomami H. N., Kafieh R. Automatic evaluation of crown preparation using image processing techniques: a substitute to faculty scoring in dental education. Journal of Medical Signals & Sensors. 2020;10(4):239–247. doi: 10.4103/jmss.JMSS_5_20
9. Truchetto T., Dumoncel J., Nabet C., Galibourg A. Computer-assisted evaluation and feedback of a complete student class for preclinical tooth preparation. Journal of Dental Education. 2023;87(S3):1776–1779. doi: 10.1002/jdd.13183
10. Browning W. D., Reifeis P., Willis L., Kirkup M. L. Including CAD/CAM dentistry in a dental school curriculum. Journal of Indian Dental Association. 2013;92
11. Ardila C. M., González-Arroyave D. Efficacy of CAD/CAM technology in dental procedures performed by students: a systematic scoping review of randomized clinical trials. Heliyon. 2023;9(4):e15322. doi: 10.1016/j.heliyon.2023.e15322
12. Choi S., Choi R., Peters O. A., Peters C. I. Design of an interactive system for access cavity assessment: a novel feedback tool for preclinical endodontics. European Journal of Dental Education. 2023;27(4):1031–1039. doi: 10.1111/eje.12895
13. Qutieshat A., Aouididi R., Salem A., et al. Personality, learning styles and handedness: the use of the non-dominant hand in pre-clinical operative dentistry training. European Journal of Dental Education. 2021;25(2):397–404. doi: 10.1111/eje.12616
14. Casko J. S., Vaden J. L., Kokich V. G., et al. Objective grading system for dental casts and panoramic radiographs. American Journal of Orthodontics and Dentofacial Orthopedics. 1998;114(5):589–599. doi: 10.1016/S0889-5406(98)70179-9
15. Basting R. T., Trindade R. S., Flório F. M. Comparative study of smile analysis by subjective and computerized methods. Operative Dentistry. 2006;31(6):652–659. doi: 10.2341/06-24
16. Hattie J., Timperley H. The power of feedback. Review of Educational Research. 2007;77(1):81–112. doi: 10.3102/003465430298487
17. Jayaprakash S. M., Moody E. W., Lauría E. J. M., Regan J. R., Baron J. D. Early alert of academically at-risk students: an open source analytics initiative. Journal of Learning Analytics. 2014;1(1):6–47. doi: 10.18608/jla.2014.11.3
  • 18.Marbouti F., Diefes-Dux H. A., Madhavan K. Models for early prediction of at-risk students in a course using standards-based grading. Computers & Education . 2016;103:1–15. doi: 10.1016/j.compedu.2016.09.005. [DOI] [Google Scholar]
  • 19.Sly M. M., Barros J. A., Streckfus C. F., Arriaga D. M., Patel S. A. Grading class I preparations in preclinical dental education: E4D compare software vs. the traditional standard. Journal of Dental Education . 2017;81(12):1457–1462. doi: 10.21815/JDE.017.107. [DOI] [PubMed] [Google Scholar]
  • 20.Kunkel T. C., Engelmeier R. L., Shah N. H. A comparison of crown preparation grading via PrepCheck versus grading by dental school instructors. International Journal of Computerized Dentistry . 2018;21(4):305–311. [PubMed] [Google Scholar]
  • 21.Lenherr P., Marinello C. P. prepCheck computer-supported objective evaluation of students preparation in preclinical simulation laboratory. Swiss Dental Journal SSO—Science and Clinical Topics . 2014;124(10):1085–1092. doi: 10.61872/sdj-2014-10-06. [DOI] [PubMed] [Google Scholar]
  • 22.Powers D. M. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. 2020. arXiv preprint arXiv: 201016061.
  • 23.Saito T., Rehmsmeier M., Brock G. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLOS ONE . 2015;10(3) doi: 10.1371/journal.pone.0118432.e0118432 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Kwon S. R., Restrepo-Kennedy N., Dawson D. V., et al. Dental anatomy grading: comparison between conventional visual and a novel digital assessment technique. Journal of Dental Education . 2014;78(12):1655–1662. doi: 10.1002/j.0022-0337.2014.78.12.tb05844.x. [DOI] [PubMed] [Google Scholar]
  • 25.Esser C., Kerschbaum T., Winkelmann V., Krage T., Faber F.-J. A comparison of the visual and technical assessment of preparations made by dental students. European Journal of Dental Education . 2006;10(3):157–161. doi: 10.1111/j.1600-0579.2006.00408.x. [DOI] [PubMed] [Google Scholar]

Associated Data


Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author.


Articles from International Journal of Dentistry are provided here courtesy of Wiley
