Dentomaxillofacial Radiology
. 2023 Jan 24;52(3):20220209. doi: 10.1259/dmfr.20220209

Detection of the separated root canal instrument on panoramic radiograph: a comparison of LSTM and CNN deep learning methods

Cansu Buyuk 1, Burcin Arican Alpay 2, Fusun Er 3
PMCID: PMC9944016  PMID: 36688738

Abstract

Objectives:

A separated endodontic instrument is one of the challenging complications of root canal treatment. The purpose of this study was to compare two deep learning methods, the convolutional neural network (CNN) and long short-term memory (LSTM), in detecting separated endodontic instruments on dental radiographs.

Methods:

Panoramic radiographs from the hospital archive were retrospectively evaluated by two dentists. A total of 915 teeth were included, of which 417 were labeled as “separated instrument” and 498 as “healthy root canal treatment”. Six deep learning models were trained: four CNN variants (Raw-CNN, Augmented-CNN, Gabor filtered-CNN, Gabor-filtered-augmented-CNN) and two LSTM variants (Raw-LSTM, Augmented-LSTM), differing in the feature extraction method and in whether a data augmentation procedure was applied. The diagnostic performances of the models were compared in terms of accuracy, sensitivity, specificity, and positive- and negative-predictive values using 10-fold cross-validation. McNemar’s tests were employed to determine whether there were statistically significant differences between the performances of the models. Receiver operating characteristic (ROC) curves were developed to assess the performance of the most promising model (the Gabor filtered-CNN model) by exploring different cut-off levels in its last decision layer.

Results:

The Gabor filtered-CNN model showed the highest accuracy (84.37 ± 2.79), sensitivity (81.26 ± 4.79), positive-predictive value (84.16 ± 3.35) and negative-predictive value (84.62 ± 4.56), with a confidence interval of 80.6 ± 0.0076. McNemar’s tests showed that the performance of the Gabor filtered-CNN model was significantly different from that of both LSTM models (p < 0.01).

Conclusions:

Both CNN and LSTM models achieved a high predictive performance in distinguishing separated endodontic instruments on radiographs. The Gabor filtered-CNN model without data augmentation gave the best predictive performance.

Keywords: Artificial intelligence, CNN, deep learning, LSTM, panoramic radiograph, separated endodontic instruments

Introduction

The common causes of failure in endodontic treatments are the persistence of bacteria, inadequate root canal filling, poor obturation quality, microleakage, and complications of instrumentation such as separated endodontic instruments (SEI).1 The incidence of the SEI in the root canal was reported to range from 0.4 to 7.4%.2 In teeth with a SEI, the success of retreatment is determined by the recognition of the situation, a true radiographic detection, localization of the instrument, and the presence of necrotic pulp tissue apical to the instrument.3 Radiographic detection of the SEI in the root canal supports the rational decision-making process during the treatment planning which directly affects the survival rate of the involved tooth.3,4

Because it allows the evaluation of physiological and pathological conditions of the dental and skeletal structures from a wide perspective, panoramic radiography is the primary diagnostic imaging modality.5 However, owing to the tendency to focus on the patient’s main complaint, especially among inexperienced dentists or clinicians working in busy clinics, fine details such as the presence of a SEI in the root canal may not be recognized early on the radiographs.4,6

Machine learning (ML) is a branch of computer science in which algorithms are built that can learn from data and make predictions.7 Because it consists of digitally coded images, dentomaxillofacial radiology is well suited to adopting ML models.8 Studies in the field have focused on the detection, segmentation, and classification of anatomical and pathological conditions on radiographic images by deep learning (DL) methods.9

Although several DL methods have been proposed for the detection of dental pathologies, such as early detection of dental caries, oral cancer, or dental plaque segmentation, no study in the literature has yet focused on the detection of SEI by DL methods.10–12 An algorithm that predicts the SEI with high accuracy would contribute to the diagnosis and treatment of the case as a warning sign. The similarity in shape and radiopacity of root canal filling materials and SEIs complicates the differentiation of these two situations. The purpose of the present study was to compare the diagnostic performances of two different DL methods, CNN (convolutional neural network) and LSTM (long short-term memory), in the detection of SEIs on dental radiographs. The null hypotheses were that there would be no difference between (1) the Gabor filtered-CNN and Raw-LSTM models and (2) the Gabor-filtered-augmented-CNN and Augmented-LSTM architectures regarding the detection of SEIs.

Methods and materials

Study design

Two independent observers, a dentomaxillofacial radiologist (CB) and an endodontist (BAA), evaluated approximately 10,000 panoramic radiographs on the Picture Archiving and Communication System (PACS) of Istanbul Okan University, Faculty of Dentistry. The inclusion and exclusion criteria are summarized in Table 1.

Table 1.

The inclusion and the exclusion parameters of the study

Inclusion criteria:
  • Presence of a single- or multirooted tooth with a successful RCT on the radiograph (healthy group).
  • Presence of a separated instrument determined by the observers on the radiograph (SEI group).
  • Presence of a periapical radiograph of the relevant tooth confirming the SEI.
  • Absence of imaging artifacts related to motion, positioning, or foreign bodies.
  • Radiographs obtained with the standard exposure parameters of 66 kVp, 8 mA, 16 s.

Exclusion criteria:
  • Radiographs with no RCT.
  • Presence of an RCT with post or pin material on the radiograph.
  • Radiographs with an RCT or SEI that could not be precisely differentiated by the observers.
  • Presence of imaging artifacts related to motion, positioning, or foreign bodies.
  • Radiographs obtained with non-standard exposure parameters.

RCT, root canal treatment; SEI, separated endodontic instruments.

For the sample size estimation, a power analysis was performed. It revealed that at least 775 samples were needed to detect a 5% accuracy benefit of the CNN models over LSTM models with an estimated accuracy of 80%. In light of previous studies, 915 out of the 10,000 panoramic radiographs that met the inclusion criteria were used, comprising 417 SEIs and 498 healthy root canal treatments (hRCTs).13,14

Collecting radiographic scan data

Under dim light conditions, the radiographs were evaluated by the observers on the same medical monitor (Radioforce MX216, EIZO Corporation, Ishikawa, Japan). Panoramic radiographs with hRCT or SEI were selected. The presence of SEI on panoramic radiographs was confirmed by periapical radiographs (Figure 1a, Figure 1b). A panoramic radiograph was not included in the study if there was no periapical radiograph of the relevant tooth in the system; periapical radiographs were used only for SEI confirmation. Panoramic radiographs of patients who met the inclusion criteria were anonymized and saved in JPEG format via the imaging software (Planmeca Romexis, Helsinki, Finland). The observers then marked the X and Y coordinates of the hRCT and SEI on the panoramic radiographs in the ImageJ program (National Institute of Mental Health, Maryland) (Figure 1c). Pixel coordinates of the start and end points of the line regions were saved in a text file with the same name as the image file. When there was more than one hRCT and/or SEI on an image, each was saved as a separate row in the text file.

Figure 1.

Figure 1.

(a) An example of the panoramic radiograph containing a separated endodontic instrument on the right upper first molar (black arrow), (b) the periapical radiograph of the same case; the black arrow points to the separated instrument located on the apical third, (c) the processes of the labeling and noting the X and Y coordinates.

For the assessment of intraobserver agreement, each observer re-evaluated a randomly selected 20% of the data set 1 month later. Results were rated by the intraclass correlation coefficient (ICC). Cohen’s κ analysis was used to evaluate interobserver agreement.

Defined region of interest data sets

MATLAB (The MathWorks, Inc., Natick, MA) was used for data analysis and software development. A region of interest (ROI) image data set, called DataSetRaw, was prepared for developing and evaluating the DL models for SEI detection.

A procedure mimicking the behavior of a user who wants to know whether a line with marked start and end pixels indicates a SEI was employed to extract ROIs and prepare the raw data for training the DL models. In other words, the labeled linear ROIs were converted into squared ROIs. Thus, a total of 915 ROIs, of which 417 were labeled as SEI and 498 as hRCT, were used in this study.
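In outline, the linear-to-squared ROI conversion samples a square grid aligned with the labeled line, centered on its midpoint. The study was implemented in MATLAB; the following NumPy/SciPy sketch (the function name, bilinear sampling, and boundary handling are our assumptions) illustrates the idea:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def extract_square_roi(image, p1, p2, size=64):
    """Sample a size x size square ROI whose horizontal mid-axis lies
    along the labeled line from p1 to p2 (points given as (x, y))."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    mid = (p1 + p2) / 2                                # pivot point
    theta = np.arctan2(p2[1] - p1[1], p2[0] - p1[0])   # line angle vs x-axis
    u = np.array([np.cos(theta), np.sin(theta)])       # along-line direction
    v = np.array([-np.sin(theta), np.cos(theta)])      # perpendicular direction
    s = np.linspace(-size / 2, size / 2, size)
    # (x, y) sample positions of the rotated square grid in the original image
    xs = mid[0] + np.add.outer(s * v[0], s * u[0])
    ys = mid[1] + np.add.outer(s * v[1], s * u[1])
    # map_coordinates expects (row, col) = (y, x); order=1 is bilinear
    return map_coordinates(image, [ys, xs], order=1, mode='nearest')
```

Sampling along a rotated grid combines the rotation around the pivot point and the square crop into a single interpolation step.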

Extracting features

Different types of DL algorithms accept different types of input data to train a model. A two-dimensional (2D) CNN model requires an image data set, i.e. a set of 2D arrays, to feed its input layer; the input layer of an LSTM network, on the other hand, takes time-series data.

To train the CNN models, two copies of the DataSetRaw data set were generated. The first was used directly as raw features of the ROI images without any image processing operation (ImageTypeRawFeatures, the data set of the Raw-CNN). The other copy was filtered using a Gabor filter with a wavelength of 4 and an orientation of 90 degrees to enhance the horizontal lines in the ROI images before resizing to a resolution of 64 × 64 pixels (ImageTypeGaborFeatures, the data set of the Gabor-filtered-CNN).
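The wavelength of 4 and the orientation of 90 degrees come from the study; the remaining kernel parameters (sigma, aspect ratio, kernel size) and the naive correlation loop in this sketch are illustrative choices:

```python
import numpy as np

def gabor_kernel(wavelength=4, theta_deg=90, sigma=2.0, gamma=0.5, ksize=9):
    """Real part of a Gabor kernel. Wavelength and orientation follow the
    study; sigma, gamma (aspect ratio) and ksize are illustrative."""
    theta = np.deg2rad(theta_deg)
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # carrier axis
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def gabor_filter(image, kernel):
    """Naive valid-mode 2D correlation, enough for a demonstration."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

With theta_deg=90 the sinusoidal carrier varies vertically, so horizontal stripes, such as a horizontally aligned instrument in the squared ROI, produce the strongest response.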

In order to train an LSTM model, the image data set was converted into a time-series data set of numerical features using the proposed sliding window procedure. To achieve this, a copy of the DataSetRaw data set was prepared for the LSTM model, called TimeSeriesResNETFeatures (the data set of the Raw-LSTM). A 32 × 32 pixel sliding window mask was moved over the center horizontal axis of each ROI image of the DataSetRaw data set in steps of 16 pixels to define time-series patches (Figure 2).
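The windowing step can be sketched as follows; only the patch extraction is shown (the data set name suggests the 512 per-patch features were then taken from a pretrained ResNet, and the function name here is illustrative):

```python
import numpy as np

def time_series_patches(roi, win=32, step=16):
    """Slide a win x win mask along the horizontal mid-axis of the ROI,
    one patch per time point; per-patch feature extraction is omitted."""
    h, w = roi.shape
    top = max(h // 2 - win // 2, 0)
    patches = [roi[top:top + win, cx - win // 2:cx + win // 2]
               for cx in range(win // 2, w - win // 2 + 1, step)]
    return np.stack(patches)          # shape: (time_points, win, win)
```

For a 64 × 64 ROI, window centers fall at columns 16, 32 and 48, giving three time points per ROI.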

Figure 2.

Figure 2.

The green box with a red dot in the center represents the patch at the time point zero masked with a 32 × 32 sliding window. The red center point moves 16 pixels to the right with each time step along the blue line indicated in this figure.

Augmenting the data set

The ImageTypeRawFeatures and ImageTypeGaborFeatures sets were artificially augmented by creating modified versions of the original data set: horizontal shifting by a randomly selected number of pixels in the range of [−10, 10] without changing the size of the ROI, and rotating the ROI around its center by a randomly selected angle in the range of [−10, 10] degrees. In this way, the data sets of the Augmented-CNN and Gabor-filtered-augmented-CNN were generated for the raw and Gabor filtered architectures, respectively. Data augmentation mimics the variations that can occur during data labeling, since the same SEI or hRCT may be expressed by slightly different start and end points. The TimeSeriesResNETFeatures set was augmented in a similar way, with the augmentation applied before extracting the time-point features; thus the data set of the Augmented-LSTM was generated.
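A hedged sketch of producing one such augmented copy, assuming SciPy's ndimage routines for the shift and rotation (the study's MATLAB implementation may differ in interpolation and boundary handling):

```python
import numpy as np
from scipy.ndimage import rotate, shift

rng = np.random.default_rng(0)

def augment(roi):
    """One augmented copy of an ROI: random horizontal shift and rotation,
    both drawn from [-10, 10] (pixels / degrees), size preserved."""
    dx = int(rng.integers(-10, 11))
    shifted = shift(roi, (0, dx), mode='nearest')    # horizontal shift only
    angle = float(rng.uniform(-10, 10))
    return rotate(shifted, angle, reshape=False, mode='nearest')
```

reshape=False keeps the rotated output at the original ROI size, matching the size-preserving augmentation described above.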

Design of the CNN model

The proposed model consisted of four convolutional layers, each followed by a max-pooling layer. Batch normalization and a rectified linear unit were applied to the output of each convolutional layer. The convolutional layers included 8, 16, 32 and 64 filters of size 3 × 3, respectively. All max-pooling layers were configured with a pooling size of 2 × 2 and a stride of two pixels. Finally, a fully connected layer followed by a dropout layer was added to align the learned features into a 2D vector feeding a classification layer. The output of the fully connected layer was normalized using a softmax layer.
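Assuming 'same' padding for the 3 × 3 convolutions and a 64 × 64 input as used for the Gabor-filtered images (the paper does not state the padding), the feature-map sizes through the four conv + pool stages can be traced as:

```python
def conv_out(n, k=3, stride=1, pad=1):
    """Output side length of a convolution (pad=1 gives 'same' for k=3)."""
    return (n + 2 * pad - k) // stride + 1

def pool_out(n, k=2, stride=2):
    """Output side length of a 2x2 max pool with stride 2."""
    return (n - k) // stride + 1

def trace_cnn(size=64, filters=(8, 16, 32, 64)):
    """Feature-map shapes (height, width, channels) after each stage."""
    shapes = [(size, size, 1)]
    n = size
    for f in filters:
        n = pool_out(conv_out(n))   # conv + batch norm + ReLU, then pool
        shapes.append((n, n, f))
    return shapes
```

Under these assumptions the spatial size halves at each stage, 64 → 32 → 16 → 8 → 4, so the fully connected layer would see a 4 × 4 × 64 map before dropout and the two-class softmax.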

Design of the LSTM model

In the design of the LSTM model, the input layer was designed to receive 512 features for each time point. The model included an LSTM layer with 250 hidden nodes, followed by a drop-out layer with a 0.25 drop ratio. Finally, a fully connected layer converted the output of the drop-out layer into feature vectors feeding the classification layer, after normalization by a softmax layer.
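With these sizes, the layer's weight count can be checked by hand; the formulas below assume a standard LSTM cell (four gates, each with input, recurrent and bias weights) and a two-class output head:

```python
def lstm_params(input_size=512, hidden=250):
    """Weight count of one LSTM layer: four gates, each with input,
    recurrent and bias terms (standard LSTM cell assumed)."""
    return 4 * (input_size * hidden + hidden * hidden + hidden)

def head_params(hidden=250, classes=2):
    """Fully connected layer (after the 0.25 dropout) feeding the softmax."""
    return hidden * classes + classes
```

For 512 input features and 250 hidden nodes this gives 763,000 LSTM weights plus a 502-parameter head, a useful sanity check when reimplementing the model.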

Constructed deep learning models

In summary, the six DL models were named as follows: four based on CNNs (Raw-CNN, Augmented-CNN, Gabor filtered-CNN, Gabor-filtered-augmented-CNN) and two based on LSTMs (Raw-LSTM, Augmented-LSTM). Figure 3 summarizes the workflow of the constructed models.

Figure 3.

Figure 3.

The workflow of the constructed deep learning models. First, ROIs were extracted from the data set containing all labeled SEI (green tooth) and hRCT (blue tooth) panoramic radiographs as input. The ROI extraction procedure comprised four steps: (1) calculating the angle (Θ) needed to align the line with the horizontal axis, (2) calculating the mid-point of the line as the pivot point, (3) rotating the raw image by Θ around the pivot point, and (4) extracting a squared ROI. The extracted ROIs were then duplicated as three identical data sets. The first raw data set was used directly with the CNN architecture, the second was first processed with the Gabor filter and then run by the CNN, and the third was converted to time-series data and processed by the LSTM architecture. Those three main branches were also processed with the augmented data sets. At the end of all procedures, each of the six algorithms output whether an SEI or hRCT was present in the input radiograph. CNN, convolutional neural network; hRCT, healthy root canal treatment; LSTM, long short-term memory; ROI, region of interest; SEI, separated endodontic instrument.

Performance evaluation of the deep learning models

A DL model for predicting whether a given ROI image included a SEI was constructed. To assess the performance of the models, the 10-fold cross-validation method was employed, such that one tenth of the data set was used for testing a model trained on the remaining data. In each fold, five percent of the training data set was split off as a validation data set used during the training procedure. Finally, the average of the measured performance values over the 10 folds was reported as the final performance of the model. The diagnostic performance of the models was measured using the evaluation metrics of accuracy, sensitivity (recall), specificity, positive-predictive value (PPV; precision), negative-predictive value (NPV), positive likelihood ratio (LR+) and negative likelihood ratio (LR−). These performance metrics were calculated using the following formulas:

Accuracy = (TP + TN) / (TP + FP + TN + FN)
Sensitivity (Recall) = TP / (TP + FN)
Specificity = TN / (FP + TN)
Positive Predictive Value (Precision) = TP / (TP + FP)
Negative Predictive Value = TN / (TN + FN)
Positive Likelihood Ratio (LR+) = [TP / (TP + FN)] / [FP / (FP + TN)] = Sensitivity / (1 − Specificity)
Negative Likelihood Ratio (LR−) = [FN / (TP + FN)] / [TN / (FP + TN)] = (1 − Sensitivity) / Specificity

In these formulas, there are true and false classes, where the “SEI class” is considered as the “true class”. Thus, TP is the abbreviation for “true positive” which is the number of correctly predicted “true class”; and false positive (FP) is the number of data classified as “true class” which is in fact “false class”. Similarly, true negative (TN) and false negative (FN) are the numbers of data predicted as “hRCT” which are in fact “hRCT class” and “SEI class”, respectively.

Accuracy is the ratio of all correctly identified data to the total amount of data in the test set. Sensitivity refers to the proportion of correctly identified “SEI” among those belonging to the “SEI” class. Specificity measures the predictive ability on “hRCT”, in contrast to sensitivity. PPV is the proportion of actual “SEI” among all data that the DL model predicts as “SEI”; likewise, the NPV relates to the “hRCT” class. The likelihood ratios (LR+ and LR−), which range from zero to infinity, assess how likely a datum belongs to the true or false class. A higher LR+ value indicates that the datum has the positive condition, and vice versa for LR−.
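These definitions translate directly into code; the function below (the name and the division-by-zero guard are our own) computes all seven metrics from a confusion matrix with SEI as the positive class:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """All seven metrics from a confusion matrix, SEI as the positive class."""
    sens = tp / (tp + fn)
    spec = tn / (fp + tn)
    return {
        'accuracy': (tp + tn) / (tp + fp + tn + fn),
        'sensitivity': sens,
        'specificity': spec,
        'ppv': tp / (tp + fp),
        'npv': tn / (tn + fn),
        'lr+': sens / (1 - spec) if spec < 1 else float('inf'),  # no FPs
        'lr-': (1 - sens) / spec,
    }
```

The infinity guard mirrors the "inf ± inf" LR+ entry in Table 2, which arises when a fold contains no false positives.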

Statistical methods

All the methods were evaluated using the same folding while performing the cross-validation procedure. In other words, the predictions of each learning model were obtained on the same test data, while training with the same group of data. McNemar’s tests were conducted to detect any significant prediction difference at a significance level of 0.01. Finally, the most promising model among the six was trained and tested 10 times using k-fold cross-validation with the shuffle function to randomize the partitions each time. Then, a 95% confidence interval was computed for the 100 accuracy values of the 10 different 10-fold cross-validation procedures.
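A minimal sketch of this protocol, with a shared seed so every model sees the same folds, and McNemar's test with the usual continuity correction (both function names are illustrative):

```python
import math
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Shuffled k-fold split; a shared seed gives every model the same folds."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    return [(np.concatenate([f for j, f in enumerate(folds) if j != i]),
             folds[i]) for i in range(k)]

def mcnemar(b, c):
    """McNemar chi-square with continuity correction; b = samples model A
    classifies correctly and model B misclassifies, c = the reverse."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    p = math.erfc(math.sqrt(chi2 / 2))   # chi-square survival function, 1 df
    return chi2, p
```

Because McNemar's test uses only the discordant pairs (b and c), evaluating all models on identical folds is essential for the comparison to be valid.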

Results

In this study, six different DL models, four based on CNN and two based on LSTM, were built and evaluated on classification performance metrics to determine their predictive performance on the data sets.

A total of 925 ROIs were labeled as either hRCT or SEI by the two observers independently; 417 and 498 of them were labeled as SEI and hRCT, respectively, by both observers. The observers could not reach agreement on the remaining 10 ROIs, so those teeth were excluded. Cohen’s κ statistic was 0.977, indicating near-perfect agreement between observers. The ICC value was 0.985 for the first observer and 0.986 for the second; accordingly, both observers showed excellent intraobserver reliability.
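For two binary raters, Cohen's κ reduces to a few lines; this sketch (illustrative, two-class case only) shows the computation behind such an agreement statistic:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters with binary labels (0/1)."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    pa, pb = sum(a) / n, sum(b) / n                   # marginal positive rates
    pe = pa * pb + (1 - pa) * (1 - pb)                # chance agreement
    return (po - pe) / (1 - pe)
```

κ corrects the raw agreement po for the agreement pe expected by chance from each rater's marginal rates, which is why it is preferred over simple percent agreement.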

Performance evaluation of the CNN

Table 2 shows the performance metrics of the CNN models.

Table 2.

Performance measures of the CNN models presented in the mean and standard deviation (mean ± std) between 10 folds

CNN Model Accuracy Sensitivity Specificity PPV NPV LR+ LR-
Raw-CNN 69.72 ± 8.44 59.54 ± 21.21 78.89 ± 6.46 69.17 ± 7.45 71.23 ± 10.31 2.87 ± 0.94 0.50 ± 0.23
Augmented-CNN 58.03 ± 4.79 17.57 ± 22.68 90.88 ± 9.83 33.96 ± 31.03 57.75 ± 3.88 inf ± inf 0.89 ± 0.18
Gabor-filtered-CNN 84.37 ± 2.79 81.26 ± 4.79 87.16 ± 2.82 84.16 ± 3.35 84.62 ± 4.56 6.71 ± 1.99 0.22 ± 0.06
Gabor-filtered-augmented-CNN 78.15 ± 5.19 76.00 ± 18.71 79.66 ± 15.19 78.66 ± 10.44 82.41 ± 10.51 6.29 ± 4.88 0.28 ± 0.19

CNN, convolutional neural network; PPV, positive-predictive value; NPV, negative-predictive value; LR+, positive likelihood ratio; LR-, negative likelihood ratio.

All performance metrics of the Gabor filtered models (Gabor filtered-CNN and Gabor-filtered-augmented-CNN) were higher than those of both raw models (Raw-CNN and Augmented-CNN), except the specificity of Augmented-CNN. No statistically significant difference was observed between the Gabor filtered-CNN and Gabor-filtered-augmented-CNN models (p = 0.25). Among the non-filtered models, however, the difference was significant: the non-augmented model (Raw-CNN) performed better than the augmented one (Augmented-CNN) (p < 0.01) (Table 4). Similarly, all parameters of the non-augmented models (Raw-CNN and Gabor filtered-CNN) were higher than those of the augmented models (Augmented-CNN and Gabor-filtered-augmented-CNN), except the specificity of Augmented-CNN.

Considering our results, the Gabor filtered-CNN showed the best performance metrics among the CNN models, and its accuracy, sensitivity, PPV, and NPV values were very close to each other. According to the repeated 10-fold cross-validation tests of the Gabor filtered-CNN model, the mean of the hundred accuracy values obtained was 80.6 ± 0.0385, and the confidence interval was calculated as 80.6 ± 0.0076. To plot the ROC curves of each of the 10 folds of the cross-validation procedure, the false-positive rate (FPR) and the true-positive rate (TPR) were collected at different thresholds of the output of the last softmax layer of the CNN model. Figure 4 shows the average ROC curves and corresponding AUC values of each fold.
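Collecting FPR/TPR at different softmax cut-offs and integrating gives the AUC; the sketch below ignores tied scores and uses trapezoidal integration (both simplifications are ours):

```python
import numpy as np

def roc_points(scores, labels):
    """(FPR, TPR) pairs as the cut-off sweeps down the positive-class scores."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    y = np.asarray(labels)[order]
    tpr = np.concatenate(([0.0], np.cumsum(y) / y.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1 - y) / (1 - y).sum()))
    return fpr, tpr

def auc(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

Running this once per cross-validation fold, with that fold's softmax outputs as scores, yields the per-fold curves and AUC values reported in Figure 4.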

Figure 4.

Figure 4.

Average ROC curves and AUC values of each fold of the cross-validation of the Gabor filtered-CNN model. AUC, area under the curve; ROC, receiver operating characteristic.

Feature activations

The proposed model has four convolutional layers, of which the deepest learns high-level features. Figure 5 shows the activations of eight channels of the first convolutional layer of the models trained on ImageTypeRawFeatures and ImageTypeGaborFeatures, each for a randomly selected image of the corresponding data set. Since the deeper layers contain more channels, only those with the highest activation values are displayed in Figure 6.

Figure 5.

Figure 5.

The subfigures show (a) one randomly selected image of the ImageTypeRawFeatures, (b) activations of the eight channels of the first convolutional layer after the feed-forward process of the image shown in the subfigure-a, (c) input of the ImageTypeGaborFeatures data, (d) activations of the eight channels of the first convolutional layer after the feed-forward process of the image shown in the subfigure-c.

Figure 6.

Figure 6.

Randomly selected sample inputs from the ImageTypeRawFeatures and ImageTypeGaborFeatures data sets of the proposed CNN model and sample of the outputs with the highest activation values of each convolutional layer (conv). CNN, convolutional neural network.

Performance evaluation of the LSTM

Table 3 presents the performance metrics of the LSTM models calculated for each fold of the cross-validation in the form of mean and standard deviation (mean ± std). The performance evaluation procedure was repeated for the proposed LSTM models separately on the TimeSeriesResNETFeatures data sets without augmentation (Raw-LSTM) and with augmentation (Augmented-LSTM). The prediction performance of the LSTM models significantly differed in favor of the Raw-LSTM model (p < 0.01).

Table 3.

Performance measures of the LSTM models presented in the mean and standard deviation (mean ± std) between 10 folds

LSTM Model Accuracy Sensitivity Specificity PPV NPV LR+ LR-
Raw-LSTM 80.42 ± 5.85 71.02 ± 11.44 88.23 ± 6.80 84.17 ± 6.39 78.79 ± 7.07 7.42 ± 3.02 0.33 ± 0.12
Augmented-LSTM 79.88 ± 4.78 83.13 ± 7.92 77.29 ± 12.07 76.60 ± 8.30 84.91 ± 5.42 4.51 ± 1.93 0.21 ± 0.08

LSTM, long short-term memory; PPV, positive-predictive value; NPV, negative-predictive value; LR+, positive likelihood ratio; LR-, negative likelihood ratio.

Comparison of the performance of the CNN and LSTM

McNemar’s test was applied to determine whether the predictions made by two different models on the same data differed significantly. McNemar’s test statistics for the paired models are shown in Table 4. The Gabor filtered CNN models and the LSTM models showed statistically significant differences between paired groups (p < 0.01); only the Gabor-filtered-augmented-CNN and Augmented-LSTM pair showed no significant difference (p = 0.014).

Table 4.

McNemar’s test statistics for each pair of models

McNemar’s statistics χ2 p-value
Raw-CNN vs Augmented-CNN 141.06 <0.01
Raw-CNN vs Gabor-filtered-CNN 8.11 <0.01
Augmented-CNN vs Gabor-filtered-augmented-CNN 223.88 <0.01
Gabor-filtered-CNN vs Gabor-filtered-augmented-CNN 1.34 0.25
Gabor-filtered-CNN vs Raw-LSTM 11.51 <0.01
Gabor-filtered-CNN vs Augmented-LSTM 15.14 <0.01
Gabor-filtered-augmented-CNN vs Raw-LSTM 16.99 <0.01
Gabor-filtered-augmented-CNN vs Augmented-LSTM 6.03 0.014
Raw-LSTM vs Augmented-LSTM 63.25 <0.01

CNN, convolutional neural network; LSTM, long short-term memory.

Discussion

Removing a SEI from the root canal is a challenging procedure for dentists. Although the main approach in these cases is to remove the SEI, according to the American Association of Endodontists (AAE) guidelines the treatment protocol varies depending on the third of the root canal in which the instrument is broken.15 For this reason, localizing the SEI on radiographs plays a major role in determining the treatment protocol. The length of the instrument and the clinician’s experience may influence interpretation. Therefore, the development of a DL model would be beneficial to improve the clinician’s diagnostic ability.

The main idea behind independently employing the CNN and LSTM methods, with or without augmentation, in the present study was to measure whether the different DL models would achieve results similar to those of experienced dentists, and thus to compare the robustness of the architectures. From this starting point, the present study aimed to build a DL model to detect SEI on panoramic radiographs. According to our results, the Gabor filtered-CNN model performed better than the Raw-LSTM model in the detection of SEI, while no statistically significant difference was observed between the Gabor-filtered-augmented-CNN and Augmented-LSTM models. Considering these results, the first null hypothesis was rejected while the second was accepted.

A total of six different DL models were presented, four based on CNN and two based on LSTM. The CNN models were divided into raw/filtered and augmented/non-augmented variants. The Gabor filtered-CNN model showed the highest accuracy, sensitivity, PPV, NPV, LR+ and LR-. In the literature, the Gabor filter is generally used for the enhancement of large medical image data sets in ML studies.16,17 In the study by Kaya and Bebek, the Gabor filter was applied for line filtering in the localization of biopsy needles on ultrasonography images.17 They reported that the Gabor filter produced better results than the other filtering methods since it enhanced the needle’s pixels more effectively. Although panoramic radiography and ultrasonography are different imaging modalities, the close radiodensities of biopsy needles and endodontic instruments make the two problems similar. Consistent with the study of Kaya and Bebek,17 the results of the present study revealed that the Gabor filtered-CNN algorithm showed better diagnostic performance than the unfiltered CNN models. On the other hand, the CNN model trained on the raw, augmented data set (Augmented-CNN) gave the lowest performance metrics, and the Gabor-filtered-augmented-CNN scored lower than the Gabor filtered-CNN. These results indicate that augmentation of the data set reduced the detection rate of the SEI by the CNN. The LSTM models were likewise arranged according to whether they were trained on the raw or the augmented data sets. Both LSTM models showed close accuracy results, whereas the other parameters varied. Although the difference in accuracy between these models was small, McNemar’s test showed a significant difference between them.

Practitioners’ radiographic image interpretation depends largely on the dentist’s experience, the viewing environment and the imaging tools. Recently, many DL methods such as CNN or LSTM have been proposed for radiographic image analysis as clinical decision-support systems.13,14,18,19 There are 2D radiographic imaging studies that combine these two methods, using the CNN as a feature extractor and the LSTM as a classifier.20,21 Both methods have strengths and weaknesses. The CNN is a well-known DL algorithm successfully applied to image classification problems; on the other hand, CNN models are not powerful at detecting objects under different variations within images. The LSTM model is applied in the literature when the data are a collection of features at different time intervals, which makes it slower than other DL models.21 Singh and Senghal22 used the combined CNN-LSTM methods in the algorithm they developed to detect and classify dental caries. On this basis, our research aimed to compare both methods on the problem of differentiating healthy root canal treatments and separated instruments, which have a similar appearance on radiographs. In this study, a windowed feature extraction mechanism was introduced to make the use of the LSTM model possible on this data set. Features were extracted along the line suspected to be a SEI at each defined sliding window, to decrease sensitivity to variation in orientation and size.

The current study has some limitations, and there are aspects to improve in future work. In a recent scoping review of artificial intelligence algorithms and models in endodontics, the quality of studies was assessed using a modified Quality Assessment of Studies of Diagnostic Accuracy Risk tool.23 According to this report, the highest-scoring study in the literature had a score of 6, whereas the present study scores 7 points out of 8, which is regarded as a high score. The point was lost because the data set was obtained from a single center and the panoramic units were of the same brand. Multicenter studies conducted with different X-ray units might increase the inclusivity of the algorithm through data variation. To mitigate this limitation, we conducted an augmentation procedure to simulate different variations. Another limitation was the exclusion of hRCTs with post or pin materials. These metallic materials are easily differentiated from a SEI by the human eye, whereas this might be a complex problem for an algorithm; excluding them restricts the decision-support ability of the algorithm. Although the incidence of encountering a SEI in clinical conditions is lower than that of hRCTs, the use of a large data set and two different neural network architectures reduced the risks of overfitting or underfitting in this study. Future studies conducted with different data sets of 2D and 3D imaging modalities will facilitate the SEI detection problem.

Despite these limitations, one important strength of the present study is that the presence of the SEI was confirmed with a periapical radiograph of the related area. Additionally, two experienced dentists from different specialties evaluated the radiographs, and each observer re-evaluated 20% of the data set after a 1-month interval to assess intraobserver agreement. Radiographs on which both specialists did not agree that there was a SEI in the root canal were not included in the study. The panoramic radiographs were obtained using the same X-ray unit with standard parameters, which had a positive effect on standardization. In addition, the data set of the study was large, and the numbers of samples in the healthy and non-healthy classes were almost equal. Although accuracy has been reported to be a good parameter for evaluating the diagnostic performance of DL models in such balanced settings,24,25 multiple performance metrics were also evaluated so that the models can be used reliably in clinical conditions.

Conclusion

Within the limitations of this study, both the CNN and LSTM models achieved high predictive performance in terms of accuracy in distinguishing SEI in the root canal. When the results of this study are applied in the clinic, the detection of the SEI can be made more reliable by models integrated into imaging systems, rather than relying only on the physician’s experience and knowledge. Detection of the SEI before starting treatment will lead to more conservative treatments under AAE guidance. Future studies will use a varied data set to improve the predictive performance of the features extracted for the CNN model, using various Gabor filter banks with different orientations and angles. Furthermore, the idea of converting spatial 2D dental images into a windowed stream of features will be developed further for the LSTM model by applying different pre-trained networks and incorporating the indisputable power of CNN models.

Footnotes

Funding: This research project was funded by the Scientific and Technological Research Council of Turkey (TUBITAK) under the (1002) Short-Term R&D Funding Program (project no: 220S755).

Ethics Approval: This retrospective study was conducted in compliance with the 1964 Helsinki Declaration on medical research ethics. The study protocol was approved by the Local Ethics Committee (Decision no: 56665618–204.01.07).

Contributor Information

Cansu Buyuk, Email: cansubuyuk@yahoo.com.

Burcin Arican Alpay, Email: burcin.aricanalpay@dent.bau.edu.tr.

Fusun Er, Email: fer@pirireis.edu.tr.

REFERENCES

  1. Tabassum S, Khan FR. Failure of endodontic treatment: the usual suspects. Eur J Dent 2016; 10: 144–47. doi: 10.4103/1305-7456.175682
  2. Panitvisai P, Parunnit P, Sathorn C, Messer HH. Impact of a retained instrument on treatment outcome: a systematic review and meta-analysis. J Endod 2010; 36: 775–80. doi: 10.1016/j.joen.2009.12.029
  3. Ramos Brito AC, Verner FS, Junqueira RB, Yamasaki MC, Queiroz PM, Freitas DQ, et al. Detection of fractured endodontic instruments in root canals: comparison between different digital radiography systems and cone-beam computed tomography. J Endod 2017; 43: 544–49. doi: 10.1016/j.joen.2016.11.017
  4. Rosen E, Azizi H, Friedlander C, Taschieri S, Tsesis I. Radiographic identification of separated instruments retained in the apical third of root canal-filled teeth. J Endod 2014; 40: 1549–52. doi: 10.1016/j.joen.2014.07.005
  5. Vandenberghe B, Jacobs R, Bosmans H. Modern dental imaging: a review of the current technology and clinical applications in dental practice. Eur Radiol 2010; 20: 2637–55. doi: 10.1007/s00330-010-1836-1
  6. Ariji Y, Yanashita Y, Kutsuna S, Muramatsu C, Fukuda M, Kise Y, et al. Automatic detection and classification of radiolucent lesions in the mandible on panoramic radiographs using a deep learning object detection technique. Oral Surg Oral Med Oral Pathol Oral Radiol 2019; 128: 424–30. doi: 10.1016/j.oooo.2019.05.014
  7. Hwang JJ, Jung YH, Cho BH, Heo MS. An overview of deep learning in the field of dentistry. Imaging Sci Dent 2019; 49: 1–7. doi: 10.5624/isd.2019.49.1.1
  8. Hung K, Montalvao C, Tanaka R, Kawai T, Bornstein MM. The use and performance of artificial intelligence applications in dental and maxillofacial radiology: a systematic review. Dentomaxillofac Radiol 2020; 49: 20190107. doi: 10.1259/dmfr.20190107
  9. Yaji A, Prasad S, Pai A. Artificial intelligence in dento-maxillofacial radiology. Acta Sci Dent Sci 2019; 3: 116–21.
  10. Saini D, Jain R, Thakur A. Dental caries early detection using convolutional neural network for tele-dentistry. 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS); Coimbatore, India; 2021. pp. 958–63. doi: 10.1109/ICACCS51430.2021.9442001
  11. Goswami M, Maheshwari M, Baruah PD, Singh A, Gupta R. Automated detection of oral cancer and dental caries using convolutional neural network. 9th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO); 2021. pp. 1–5. doi: 10.1109/ICRITO51393.2021.9596537
  12. Li S, Pang Z, Song W, Guo Y, You W, Hao A, et al. Low-shot learning of automatic dental plaque segmentation based on local-to-global feature fusion. IEEE 17th International Symposium on Biomedical Imaging (ISBI); 2020. pp. 664–68. doi: 10.1109/ISBI45749.2020.9098741
  13. Saghiri MA, Asgar K, Boukani KK, Lotfi M, Aghili H, Delvarani A, et al. A new approach for locating the minor apical foramen using an artificial neural network. Int Endod J 2012; 45: 257–65. doi: 10.1111/j.1365-2591.2011.01970.x
  14. Fukuda M, Inamoto K, Shibata N, Ariji Y, Yanashita Y, Kutsuna S, et al. Evaluation of an artificial intelligence system for detecting vertical root fracture on panoramic radiography. Oral Radiol 2020; 36: 337–43. doi: 10.1007/s11282-019-00409-x
  15. AAE guideline. 2021. Available from: https://www.aae.org/specialty/communique/broken-instruments-clinical-decision-making-algorithm/ (accessed 17 Dec 2021)
  16. Bourkache N, Laghrouch M, Sidhom S. Gabor filter algorithm for medical image processing: evolution in big data context. 2020 International Multi-Conference on "Organization of Knowledge and Advanced Technologies" (OCTA). IEEE; 2020. pp. 1–4.
  17. Kaya M, Bebek O. Needle localization using Gabor filtering in 2D ultrasound images. 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE; 2014. pp. 4881–86.
  18. Braman N, Beymer D, Dehghan E. Disease detection in weakly annotated volumetric medical images using a convolutional LSTM network. arXiv 2018. doi: 10.48550/arXiv.1812.01087
  19. Wahyuningrum RT, Anifah L, Purnama IKE, Purnomo MH. A new approach to classify knee osteoarthritis severity from radiographic images based on CNN-LSTM method. 2019 IEEE 10th International Conference on Awareness Science and Technology (iCAST); 2019.
  20. Islam MZ, Islam MM, Asraf A. A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using X-ray images. Inform Med Unlocked 2020; 20: 100412. doi: 10.1016/j.imu.2020.100412
  21. Manogaran U, Wong YP, Ng BY. CapsNet vs CNN: analysis of the effects of varying feature spatial arrangement. In: Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, Vol. 1251. Cham: Springer; 2021. doi: 10.1007/978-3-030-55187-2_1
  22. Singh P, Sehgal P. G.V. Black dental caries classification and preparation technique using optimal CNN-LSTM classifier. Multimed Tools Appl 2021; 80: 5255–72. doi: 10.1007/s11042-020-09891-6
  23. Umer F, Habib S. Critical analysis of artificial intelligence in endodontics: a scoping review. J Endod 2022; 48: 152–60. doi: 10.1016/j.joen.2021.11.007
  24. Hossin M, Sulaiman MN. A review on evaluation metrics for data classification evaluations. IJDKP 2015; 5: 01–11. doi: 10.5121/ijdkp.2015.5201
  25. Handelman GS, Kok HK, Chandra RV, Razavi AH, Huang S, Brooks M, et al. Peering into the black box of artificial intelligence: evaluation metrics of machine learning methods. AJR Am J Roentgenol 2019; 212: 38–43. doi: 10.2214/AJR.18.20224
