Optimized classification of dental implants using convolutional neural networks and pre-trained models with preprocessed data

Reza Ahmadi Lashaki; Zahra Raeisi; Nasim Razavi; Mehdi Goodarzi; Hossein Najafzadeh

doi:10.1186/s12903-025-05704-0

2025 Apr 11;25:535.

PMCID: PMC11987321 PMID: 40217522

Abstract

Objective

This study evaluates the performance of various classifiers and pre-trained models for dental implant state classification using preprocessed radiography images with masks.

Methodology

A dataset of 511 periapical images, including 275 for Bicon, 70 for Bego, and 166 for ITI implants, was expanded to 5110 images using data augmentation techniques such as rotation, flipping, and scaling. Preprocessing included resizing, sharpening, noise reduction, CLAHE-based contrast enhancement, implant-specific masking, and normalization. Classifiers including Convolutional Neural Networks (CNN), Convolutional Support Vector Machine (CSVM), Convolutional Decision Tree (CDT), and Convolutional Random Forest (CRF) were employed. Pre-trained models such as VGG16, ResNet50, and Xception enhanced feature extraction. Model performance was assessed using accuracy, precision, recall, F1 score, and ROC AUC, with fivefold cross-validation ensuring robustness.

Results

CRF achieved the highest performance for ITI with Bego implants, with accuracy of 0.8966, precision of 0.9364, recall of 0.9253, F1 score of 0.9304, and ROC AUC of 0.9351. CNN delivered the best results for Bicon with Bego implants, achieving 0.9533 accuracy. Among pre-trained models, VGG16 with preprocessed data achieved superior results for Bicon vs. ITI classification, with 0.9865 accuracy and 0.9877 ROC AUC. Data augmentation and preprocessing significantly improved classifier performance.

Conclusion

Preprocessing steps, coupled with data augmentation, enhanced classification performance, ensuring robustness across models. CRF and CNN were the top-performing classifiers, with VGG16 excelling among pre-trained models. These results highlight the importance of data augmentation and preprocessing in improving dental implant classification accuracy.

Keywords: Dental implants radiography images, Classification, CNN, Pre-trained models

Introduction

Dental implants are among the most reliable solutions for restoring oral function and aesthetics in individuals with tooth loss. They closely replicate the form and function of natural teeth, providing an effective long-term solution. Globally, millions of implants are planned for placement daily, highlighting their importance in dental care [1]. Despite advancements in implant technology, accurately identifying implants from radiographic images remains a critical challenge. Accurate identification supports correct implant selection and placement, which are essential for long-term success, so diagnostic accuracy in dental radiography continues to be a significant focus [2].

Dental implants, including their fixtures, abutments, and superstructures, are distinguished by their unique designs and the tools required for placement and maintenance. The implant industry faces challenges due to the diverse and unconventional variations among implant brands. For instance, differences in prosthesis components, such as fixing screws, often necessitate retightening to prevent loosening [3]. Identifying implant brands is critical, especially with the emergence of new designs and the potential for a single implant to be placed or maintained by multiple dentists over time. This complexity is further amplified when information must be shared across different regions or countries.

Advancements in artificial intelligence (AI) and deep learning (DL) have significantly transformed dental implant analysis, facilitating precise classification, detection, and prediction. The incorporation of innovative architectures, attention mechanisms, and synthetic data has greatly enhanced the capabilities of these models. Attention mechanisms in deep learning are computational strategies designed to enable models to prioritize and focus on the most relevant parts of the input data while processing information. By assigning varying weights or importance scores to different regions or features of the input, attention mechanisms help models selectively emphasize critical information and ignore less relevant details. This approach mimics human cognitive processes, where focus is directed toward the most important elements of a scene or task. In convolutional neural networks (CNNs), attention mechanisms enhance tasks like implant classification by highlighting key features, improving model accuracy, and making predictions more interpretable [4]. For example, Sukegawa et al. (2022) highlighted the effectiveness of attention-based models, achieving an AUC of 0.9993 using ResNet18 paired with the Attention Branch Network (ABN). However, deeper networks like ResNet50 and ResNet152 demonstrated reduced performance, emphasizing the importance of optimizing network depth in attention-based methods [5].

The integration of cone-beam computed tomography (CBCT) has further improved implant identification. Ou-yang et al. (2023) evaluated nine DL models on 3D CBCT images, with ResNet152V2 achieving perfect AUC scores (1.00), showcasing the synergy between CBCT imaging and DL [6]. In radiographic imaging, Guo et al. (2023) proposed the TVGG model for dental X-rays, achieving 80% accuracy in identifying implant manufacturers [7]. Kong et al. (2023) utilized automated machine learning on periapical radiographs, achieving up to 98.1% accuracy for the Osstem TSIII system, demonstrating the effectiveness of DL models deployed on cloud platforms [8].

The inclusion of synthetic data has also proven beneficial. Sukegawa et al. (2024) reported a classification accuracy improvement to 0.9146 for ResNet50 by augmenting panoramic radiographs with synthetic images, addressing challenges like class imbalance [9]. Transfer learning approaches have been successful in implant classification. Sukegawa et al. (2023) achieved 92.7% accuracy using a fine-tuned VGG16 model on panoramic X-rays to classify 11 implant brands [10]. A systematic review by Chaurasia et al. (2023) reported pooled accuracies of 70.75%–98.19% for implant classification, reaffirming DL’s clinical potential despite challenges like imaging variability [11].

In object detection, Jang et al. (2022) utilized Faster R-CNN for implant detection in periapical radiographs, achieving precision and recall rates of 0.977 and 0.992, respectively [12]. Similarly, Lee (2023) demonstrated robust fracture detection and classification using DCNNs, with AUC values of 0.984 and 0.869, respectively [13]. Further applications of DL include predicting osseointegration outcomes and implant dimensions. Oh et al. (2023) achieved AUROC values between 0.890 and 0.922 for osseointegration prediction, complementing traditional evaluation methods [14]. Park et al. (2023) demonstrated that a fine-tuned VGG16 network outperformed clustering methods for implant size classification, achieving accuracy, sensitivity, and specificity exceeding 0.994 [15]. Collectively, these studies underscore DL’s transformative role in dental imaging and implant classification.

Despite significant advancements in deep learning (DL) for dental implant classification, challenges persist. Current studies are often limited in scope, focusing on specific implant brands, which restricts their applicability to others. Additionally, inconsistencies in image quality, including variations in contrast and noise, along with issues such as data scarcity and class imbalance, hinder the robustness and scalability of classification models. This study addresses these challenges by proposing a novel and systematic pipeline for dental implant classification that improves generalizability and robustness. Unlike previous studies that focus narrowly on specific implant brands, this research incorporates X-ray images from multiple implant brands (Bicon, Bego, and ITI) to ensure broader applicability. To strengthen the pipeline, pre-trained models like VGG16, ResNet50, and Xception are selected for their proven capability in extracting high-quality features from medical images. VGG16 is simple yet effective for smaller datasets, ResNet50 mitigates vanishing gradient issues through residual connections, and Xception excels in capturing spatial hierarchies using depthwise separable convolutions. While these models offer robust feature extraction, limitations such as computational cost and potential overfitting in small datasets are acknowledged, highlighting the importance of balancing their strengths and weaknesses.

The primary objectives of this study are to develop an innovative framework for dental implant classification, enhance generalizability across different implant brands, and support clinical decision-making to improve patient outcomes.

Material and methods

Data collection

A public dataset containing three groups of dental implants from the separate companies Bicon, Bego, and ITI, each with X-ray radiographic images of dimensions 640 × 640, was collected [16]. The ground truth of implant brands in the public dataset was established based on metadata provided by the dataset's creators and verified through expert annotations. To ensure accuracy, a subset of the data was independently reviewed by dental professionals, confirming that the annotated brands corresponded to identifiable implant features visible in the X-ray images. Ambiguous cases with insufficient quality or missing metadata were excluded to maintain the integrity of the study. This rigorous validation process ensured the reliability of the dataset for model training and evaluation. Figure 1 shows a sample image from each implant group and the number of images available for each group, including augmented images.

Fig. 1 — Number of dental implant samples from different groups with augmented images

Ethics approval

This study was conducted in accordance with the principles outlined in the Declaration of Helsinki. Ethical approval was obtained from the Ethics Committee of Tabriz University of Medical Sciences, Tabriz, Iran. Written informed consent was waived as the study did not involve direct patient participation or personal data.

Preprocessing

Preprocessing is a critical step in ensuring the dataset's quality and consistency for effective training of neural networks. Various techniques were employed to enhance image quality, standardize input dimensions, and prepare the data for classification tasks. The preprocessing steps are summarized in Table 1 for clarity and brevity. The overall transformation sequence applied to the dataset can be mathematically described as:

T (I) = F_{m} (F_{h} (Z (S (H (W (R (I)))))))
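
The nested form of T(I) means the operations are applied from the inside out: rotation first, fill mode last. The following is a minimal sketch of that ordering together with random sampling of the parameter ranges listed in Table 1. The uniform sampling, the 0.5 flip probability, and the names `sample_augmentation`, `compose`, and `make_step` are assumptions made for illustration, not details taken from the study.

```python
import random
from functools import reduce

# Ranges as listed in Table 1; drawing them uniformly is an assumption.
AUGMENTATION_RANGES = {
    "rotation": (-20.0, 20.0),
    "width_shift_frac": (-0.2, 0.2),
    "height_shift_frac": (-0.2, 0.2),
    "shear": (-0.2, 0.2),
    "zoom": (0.8, 1.2),
}


def sample_augmentation(rng):
    """Draw one hypothetical set of augmentation parameters."""
    params = {name: rng.uniform(lo, hi)
              for name, (lo, hi) in AUGMENTATION_RANGES.items()}
    params["horizontal_flip"] = rng.random() < 0.5  # assumed flip probability
    return params


def compose(*steps):
    """Apply steps left to right, so compose(R, W)(I) equals W(R(I))."""
    return lambda image: reduce(lambda acc, step: step(acc), steps, image)


def make_step(name):
    """Placeholder transform that only records its name in a trace list."""
    return lambda trace: trace + [name]


# Innermost operation of T(I) first, outermost last.
ORDER = ["R", "W", "H", "S", "Z", "F_h", "F_m"]
T = compose(*(make_step(name) for name in ORDER))

applied = T([])
params = sample_augmentation(random.Random(0))
print(applied)
print(params)
```

In a real pipeline each placeholder step would be replaced by the corresponding image transform, with the sampled parameters passed to it.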

Table 1.

Preprocessing techniques and parameters

Step	Description	Parameters/Formula	Reference
Data Augmentation	Enhanced the dataset using transformations like rotation, shifts, shear, zoom, and flips	Rotation ($R(\theta)$ where $\theta \in [-20, 20]$), width shift ($W(w)$ where $w \in [-0.2 \times \mathrm{width}, 0.2 \times \mathrm{width}]$), height shift ($H(h)$ where $h \in [-0.2 \times \mathrm{height}, 0.2 \times \mathrm{height}]$), shear ($S(s)$ where $s \in [-0.2, 0.2]$), zoom ($Z(z)$ where $z \in [0.8, 1.2]$), horizontal flip ($F_{h}$), and fill mode ($F_{m}$)	-
Resizing	Standardized image resolution to ensure uniform input dimensions for neural networks	$I'(x, y) = I(\frac{x \cdot \mathrm{width}}{128}, \frac{y \cdot \mathrm{height}}{128})$	[17]
Sharpening	Highlighted edges and fine details using a convolution-based sharpening filter	$I'(x, y) = I(x, y) * K$, where $K = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 5 & -1 \\ 0 & -1 & 0 \end{bmatrix}$	[18]
Noise Removal	Reduced salt-and-pepper noise while preserving edges using a median blur filter	$I'(x, y) = \mathrm{median}(I(x - k, y - k), \ldots, I(x + k, y + k))$, kernel size = (3, 3)	[17]
Contrast Enhancement	Enhanced local contrast using CLAHE to improve visibility while minimizing noise amplification	$I' = \mathrm{CLAHE}(I)$, Parameters: clipLimit = 2.0, tileGridSize = (3, 3)	[19]
Mask Application	Focused on regions of interest by applying binary masks to isolate specific areas	$I'(x, y) = I(x, y) \cdot M(x, y)$	-
Histogram Equalization	Improved global contrast by redistributing pixel intensity values	$I' = \mathrm{equalizeHist}(I)$	[17]
Normalization	Scaled pixel intensity values to [0,1] for numerical stability and faster convergence during training	$I' = I / 255.0$	[20]

Figure 2 illustrates the preprocessing stages applied to the X-ray radiographic images for each of the three dental implant groups.

Fig. 2 — Sample preprocessing steps for each implant group
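
Three of the steps in Table 1 (sharpening with kernel K, 3 × 3 median filtering, and normalization to [0, 1]) are simple enough to sketch without an imaging library. The version below is a dependency-free illustration on small nested lists, not the implementation used in the study; the border handling (replicate padding) and the clipping to the 8-bit range are assumptions, and CLAHE, masking, and resizing are deliberately left out.

```python
from statistics import median

# Sharpening kernel K from Table 1. It is symmetric, so correlation and
# convolution give the same result.
SHARPEN_KERNEL = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
OFFSETS = (-1, 0, 1)


def _pixel(image, y, x):
    """Read a pixel, clamping coordinates to the border (assumed padding)."""
    h, w = len(image), len(image[0])
    return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def sharpen(image):
    """Apply K to every pixel and clip the result to the 8-bit range."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = sum(SHARPEN_KERNEL[dy + 1][dx + 1] * _pixel(image, y + dy, x + dx)
                      for dy in OFFSETS for dx in OFFSETS)
            row.append(min(max(acc, 0), 255))
        out.append(row)
    return out


def median_blur3(image):
    """Replace each pixel by the median of its 3 x 3 neighbourhood."""
    h, w = len(image), len(image[0])
    return [[median(_pixel(image, y + dy, x + dx)
                    for dy in OFFSETS for dx in OFFSETS)
             for x in range(w)]
            for y in range(h)]


def normalize(image):
    """Scale 8-bit intensities to [0, 1]."""
    return [[p / 255.0 for p in row] for row in image]


flat = [[100] * 5 for _ in range(5)]
noisy = [row[:] for row in flat]
noisy[2][2] = 255  # a single "salt" pixel

print(sharpen(flat) == flat)        # K sums to 1, so flat regions are unchanged
print(median_blur3(noisy) == flat)  # the isolated outlier is removed
```

Because the entries of K sum to 1, uniform regions pass through the sharpening step unchanged and only intensity transitions are amplified, while the median filter suppresses isolated outliers without averaging across edges.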

Model development

Independent CNN model

The CNN model was independently designed to classify dental implants in periapical X-ray images. The model architecture and training process are detailed below:

Model architecture

The convolutional neural network (CNN) was developed to classify dental implants from three manufacturers (Bicon, Bego, ITI) in periapical X-ray images. The architecture features convolutional layers for feature extraction, pooling layers for dimensionality reduction, and dropout layers to reduce overfitting [20]. Key components include Conv2D layers with 3 × 3 filters and ReLU activation, MaxPooling layers with 2 × 2 windows, and dense layers for classification. Because the implant groups are compared pairwise, the output layer uses a sigmoid activation function for binary classification, and the model is optimized with the Adam optimizer and binary cross-entropy loss. Training employed a batch size of 16, a 10% validation split, and early stopping based on validation loss to prevent overfitting. A detailed summary of the model's architecture and hyperparameters is presented in Table 2. Figure 3 shows a diagrammatic representation of the CNN architecture, illustrating the layers and connections used for feature extraction and classification.

Table 2.

Summary of model architectures, learning parameters, and training configurations

Model	Architecture Details	Learning Parameters	Cross-Validation (K-fold = 5)	Activation Function
ResNet50	Input shape: (128, 128, 3), Input Layer: ImageNet weights, GlobalAveragePooling2D, Dense: 256, Frozen Layers: All except last block	Optimizer: Adam, Learning Rate: 0.0001, Loss: Binary Cross-Entropy, Early Stopping Patience: 15, Batch Size: 8, Epochs: 70	Yes	ReLU (hidden), Sigmoid (output)
Xception	Input shape: (128, 128, 3), Input Layer: ImageNet weights, GlobalAveragePooling2D, Dense: 256, Frozen Layers: All except last block	Optimizer: Adam, Learning Rate: 0.0001, Loss: Binary Cross-Entropy, Early Stopping Patience: 15, Batch Size: 8, Epochs: 70	Yes	ReLU (hidden), Sigmoid (output)
VGG16	Input shape: (128, 128, 3), Input Layer: ImageNet weights, GlobalAveragePooling2D, Dense: 256, Frozen Layers: All except last block	Optimizer: Adam, Learning Rate: 0.0001, Loss: Binary Cross-Entropy, Early Stopping Patience: 15, Batch Size: 8, Epochs: 70	Yes	ReLU (hidden), Sigmoid (output)
CNN	Input: (128, 128, 1) → Conv2D (32 filters, 3 × 3, ReLU) → MaxPooling2D (2 × 2) → Dropout (0.3) → Conv2D (64 filters, 3 × 3, ReLU) → MaxPooling2D (2 × 2) → Dropout (0.3) → Conv2D (128 filters, 3 × 3, ReLU) → MaxPooling2D (2 × 2) → Dropout (0.3) → Conv2D (256 filters, 3 × 3, ReLU) → MaxPooling2D (2 × 2) → Dropout (0.3) → Conv2D (512 filters, 3 × 3, ReLU) → MaxPooling2D (2 × 2) → Dropout (0.3) → Flatten → Dense (256 neurons, ReLU) → Dropout (0.5) → Dense (1 neuron, Sigmoid)	Optimizer: Adam, Loss: Binary Cross-Entropy, Batch Size: 16, Epochs: 70, Dropout: 0.3 and 0.5, Batch Norm: True	Yes	ReLU (hidden), Sigmoid (output)
SVM	RBF kernel classifier	Standardization: Applied, Solver: Sequential Minimal Optimization (SMO), Cross-validation: 5 folds, Kernel: RBF, Regularization parameter C = 1.0, Gamma: Scale, Probability estimates: Enabled	Yes	Sigmoid (output)
DT	Tree-based with Gini impurity	Standardization: Applied, Cross-validation: 5 folds, Splitting criterion: Gini impurity, Maximum depth: Unlimited, Minimum samples per split: 2	Yes	-
RF	Ensemble of 100 trees (bootstrapped)	Standardization: Applied, Cross-validation: 5 folds, Total estimators: 100, Splitting criterion: Gini impurity, Maximum number of features: Auto	Yes	-

Fig. 3 — Architecture of a Convolutional Neural Network (CNN) Model
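
The CNN row of Table 2 fixes the input size (128 × 128 × 1) and the five Conv2D/MaxPooling2D stages, which is enough to work out how large the flattened feature vector is. The sketch below does that arithmetic. The padding mode is not stated in the text, so both common choices are shown; stride 1 for convolutions and stride 2 for pooling are likewise assumptions.

```python
def conv_out(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max pooling."""
    return size // window


def flattened_features(input_size=128, filters=(32, 64, 128, 256, 512),
                       padding="same"):
    """Return (final spatial size, length of the flattened feature vector)."""
    size = input_size
    for _ in filters:
        size = pool_out(conv_out(size, padding=padding))
    return size, size * size * filters[-1]


print(flattened_features(padding="same"))   # (4, 8192)
print(flattened_features(padding="valid"))  # (2, 2048)
```

Under either assumption the Dense (256 neurons) layer receives a vector of a few thousand values, and the same vector is what the combined CNN plus classical-classifier models would consume.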

Combined models: integration of CNN with SVM, RF, and DT

To improve classification performance, a combined approach was implemented by integrating Convolutional Neural Networks (CNNs) with classical machine learning algorithms, including Support Vector Machine (SVM), Random Forest (RF), and Decision Tree (DT). This approach capitalizes on the feature extraction strength of CNNs and the classification efficiency of SVM, RF, and DT.

Designing CNNs for enhanced feature detection

The CNN architecture utilized for feature extraction follows the same design as the standalone model. It comprises convolutional layers, pooling layers, and dropout layers. The output from the final dropout layer is flattened into feature vectors, which are then used as inputs for the traditional classifiers.

Conventional techniques for classification

The extracted feature vectors from the CNN are fed into SVM, RF, and DT classifiers. Each classifier is individually trained to categorize images for dental implant detection in periapical X-ray images, utilizing its unique strengths to enhance classification accuracy.

Support Vector Machine (SVM) Model

SVM is a supervised classification technique that finds the hyperplane separating data points into distinct categories with the maximum margin [21].

f(x) = \mathrm{sign}(w^{T} x + b)

where $w$ is the weight vector and $b$ is the bias. The SVM model employs an RBF kernel for classification, with various hyperparameters fine-tuned for optimal performance. Table 2 provides a comprehensive overview of these learning parameters. Figure 4 illustrates the architecture of a convolutional neural network (CNN) for feature extraction followed by a support vector machine (SVM) for classification.

Fig. 4 — Feature extraction and classification pipeline using CNN and SVM
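
The decision rule f(x) = sign(w^T x + b) is written for a linear separator, whereas Table 2 specifies an RBF kernel. A hedged sketch of both forms follows: in the kernel case the inner product is replaced by a weighted sum of kernel evaluations against the support vectors. All numeric values, and the names `linear_decision`, `rbf_kernel`, and `rbf_decision`, are hypothetical and chosen only for illustration.

```python
import math


def linear_decision(w, b, x):
    """sign(w^T x + b), returning +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, z)))


def rbf_decision(support_vectors, dual_coefs, b, x, gamma):
    """sign(sum_i alpha_i * y_i * K(x_i, x) + b).

    dual_coefs holds the products alpha_i * y_i, one per support vector.
    """
    score = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, dual_coefs)) + b
    return 1 if score >= 0 else -1


# Toy model: one support vector per class (hypothetical values).
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
dual_coefs = [-1.0, 1.0]

print(linear_decision([1.0, -1.0], 0.0, [2.0, 1.0]))                  # 1
print(rbf_decision(support_vectors, dual_coefs, 0.0, [1.8, 2.1], 0.5))  # 1
print(rbf_decision(support_vectors, dual_coefs, 0.0, [0.1, 0.2], 0.5))  # -1
```

In the combined model, `x` would be the flattened CNN feature vector after standardization rather than a two-dimensional toy point.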

Random Forest (RF) Model

RF is an ensemble-based algorithm that combines the predictions of multiple decision trees to enhance the overall accuracy and reliability of classification results [22].

f (x) = \frac{1}{T} \sum_{t = 1}^{T} h_{t} (x)

where $T$ is the number of trees and $h_{t}$ is the DT classifier. The Random Forest model incorporates 100 estimators to create a robust ensemble classifier. For further details on parameter settings and cross-validation, refer to Table 2. Figure 5 depicts the combination of a convolutional neural network (CNN) for feature extraction and a random forest (RF) classifier for classification.

Fig. 5 — Feature extraction and classification pipeline using CNN and random forest
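
The averaging rule f(x) = (1/T) Σ h_t(x) can be illustrated with a toy ensemble. In the sketch below each "tree" is a hypothetical one-split stump returning 0 or 1; the 0.5 decision threshold and the function name `forest_predict` are assumptions for illustration, and a real forest of 100 bootstrapped trees would be trained rather than hand-written.

```python
def forest_predict(trees, x, threshold=0.5):
    """Average the 0/1 votes of all trees and threshold the mean."""
    votes = [tree(x) for tree in trees]
    mean_vote = sum(votes) / len(votes)
    return mean_vote, int(mean_vote >= threshold)


# Three hypothetical stumps, each testing a single feature.
trees = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[2] > 0.5),
]

mean_vote, label = forest_predict(trees, [0.7, 0.2, 0.9])
print(mean_vote, label)  # two of three stumps vote 1, so the label is 1
```

The mean vote doubles as a class-probability estimate, which is what a threshold-based metric such as ROC AUC needs.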

Decision Tree (DT) Model

A DT is a supervised learning algorithm that divides data into hierarchical branches based on feature values. It systematically partitions the feature space into regions, grouping data points with similar labels for decision-making.

f(x) = \sum_{i = 1}^{n} D(x \in R_{i}) \cdot c_{i}

where $R_{i}$ are the regions defined by the decision nodes, $D$ is the indicator function, and $c_{i}$ is the class label assigned to region $R_{i}$. The Decision Tree model uses Gini impurity as the splitting criterion, with unlimited maximum depth and a minimum of two samples per split. All learning parameters used are specified in Table 2. Figure 6 illustrates the integration of a convolutional neural network (CNN) for feature extraction with a decision tree classifier for final classification.

Fig. 6 — Feature extraction and classification pipeline using CNN and decision tree
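
Table 2 lists Gini impurity as the splitting criterion for the tree-based models. As a brief illustration of how a candidate split is scored, the sketch below computes the impurity of a node and the size-weighted impurity of a split; the label lists are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


parent = ["Bicon", "Bego", "Bicon", "Bego"]
print(gini(parent))                                         # 0.5
print(split_gini(["Bicon", "Bicon"], ["Bego", "Bego"]))     # 0.0
```

A tree grows by choosing, at each node, the feature threshold whose split gives the lowest weighted impurity; a value of 0 means both children are pure.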

Evaluation of pre-trained models: VGG16, ResNet50, and Xception

In this section, the performance of three commonly used pretrained convolutional neural network models, namely VGG16, ResNet50, and Xception, was evaluated for the classification of dental implants produced by three manufacturers: Bicon, Bego, and ITI. The models were fine-tuned using preprocessed periapical dental X-ray images to adapt them to the specific task. Since the input dataset consisted of grayscale images, each image was replicated across three channels to comply with the input requirements of these models, which are designed for three-channel RGB images.
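
Replicating a grayscale image across three channels simply copies each intensity into the R, G, and B positions. A minimal sketch on nested lists is shown here; the function name is hypothetical, and an array library would normally be used for real images.

```python
def gray_to_three_channels(image):
    """Turn an (H, W) nested list into (H, W, 3) with identical channels."""
    return [[[p, p, p] for p in row] for row in image]


gray = [[0, 128], [255, 64]]
rgb = gray_to_three_channels(gray)
print(rgb[0][1])  # [128, 128, 128]
```

No new information is added by this step; it only makes the tensor shape match the (128, 128, 3) input listed for the pre-trained models in Table 2.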

Optimization of VGG16 architecture

The VGG16 architecture, a 16-layer convolutional neural network, is widely recognized for its simplicity and efficiency in image classification. It utilizes uniform 3 × 3 convolutional filters throughout the convolutional layers, followed by fully connected layers for classification. The convolutional operation is mathematically represented as follows [23]:

Z_{l + 1} = f (W_{l} * Z_{l} + b_{l})

where $Z_{l + 1}$ represents the output of layer $l + 1$ , $W_{l}$ denotes the weights in layer $l$ , * signifies the convolution operation, $b_{l}$ is the bias term, and $f$ is the activation function, which is ReLU in the case of VGG16. The learning parameters applied during training are outlined in Table 2. Figure 7 depicts the VGG16-based convolutional neural network architecture utilized for the classification of dental implants in periapical X-ray images.

Fig. 7 — VGG16-Based architecture for dental implants classification using periapical X-ray images

Optimization of ResNet50 architecture

ResNet50, a 50-layer deep convolutional neural network, was employed to classify dental implants in periapical X-ray images. By utilizing residual learning and residual blocks, it addresses the vanishing gradient problem and enhances deep network performance. The residual block operation is mathematically defined as [24]:

Z_{l + 1} = f(Z_{l} + F(Z_{l}, W_{l}))

where $F$ denotes the residual function, $Z_{l}$ is the input to the $l$ -th residual block, $W_{l}$ represents the weights of the $l$ -th layer, and $f$ is the activation function. Parameters utilized during training are detailed in Table 2. Figure 8 illustrates the ResNet-based convolutional neural network architecture employed for dental implants classification in periapical X-ray images.

Fig. 8 — ResNet-based architecture for dental implants classification using periapical X-ray images
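
The residual update adds the block input back onto the output of the residual function before the activation. The toy sketch below uses a single matrix-vector product as a stand-in for F and ReLU for f; a real ResNet50 block stacks several convolutions with batch normalization, so this is only meant to show the role of the skip connection.

```python
def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, a) for a in v]


def matvec(W, v):
    """Matrix-vector product on nested lists."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def residual_block(z, W):
    """Z_{l+1} = f(Z_l + F(Z_l, W_l)) with a toy linear F and f = ReLU."""
    fz = matvec(W, z)
    return relu([a + b for a, b in zip(z, fz)])


z = [1.0, 1.0]
print(residual_block(z, [[0.0, 0.0], [0.0, 0.0]]))   # [1.0, 1.0]
print(residual_block(z, [[0.5, 0.0], [0.0, -2.0]]))  # [1.5, 0.0]
```

With zero weights the block passes a non-negative input through unchanged, which is why gradients can still flow through very deep stacks of such blocks.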

Optimization of Xception architecture

The Xception architecture, designed for classification tasks, replaces traditional Inception modules with depthwise separable convolutions to enhance efficiency and accuracy. This approach separates convolution into depthwise operations for spatial filtering and pointwise operations for channel mixing, reducing computational complexity while capturing fine-grained patterns. The mathematical representation of the depthwise separable convolution is as follows [25]:

Z_{l + 1} = f (W_{l}^{d} *_{d} (W_{l}^{p} *_{p} Z_{l}) + b_{l})

where $*_{d}$ denotes the depthwise convolution, $*_{p}$ denotes the pointwise convolution, $W_{l}^{d}$ and $W_{l}^{p}$ represent the weights for the depthwise and pointwise convolutions, respectively, and $f$ is the activation function. The learning parameters for this model are specified in Table 2. Figure 9 represents the Xception-based convolutional neural network architecture designed for dental implants classification in periapical X-ray images.

Fig. 9 — Xception-based architecture for dental implants classification using periapical X-ray images
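
The reduction in computational complexity from depthwise separable convolutions can be made concrete by counting weights. The sketch below compares a standard k × k convolution with its depthwise-plus-pointwise factorization, ignoring biases; the layer sizes are hypothetical and not taken from the Xception configuration used in the study.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    return kernel * kernel * c_in + c_in * c_out


std = standard_conv_params(3, 128, 256)
sep = separable_conv_params(3, 128, 256)
print(std, sep, round(std / sep, 2))  # 294912 33920 8.69
```

For this example the separable form needs roughly one ninth of the weights, and the saving approaches a factor of k² as the number of output channels grows.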

Learning parameters and model configurations

The selection of appropriate learning parameters is crucial for achieving high performance and ensuring the generalizability of deep learning models. Key parameters, including the choice of optimizer, learning rate, batch size, number of epochs, and early stopping criteria, were systematically optimized based on experimental evaluations. The Adam optimizer was utilized across most models due to its adaptive learning rate capabilities, which improve convergence. Learning rates were fine-tuned to 0.0001 to balance the trade-off between convergence speed and stability. Batch sizes were set to 8 or 16, depending on the model architecture, to provide a balance between computational efficiency and gradient stability.

Furthermore, a fivefold cross-validation approach was employed to validate the robustness of the models and minimize overfitting. The early stopping mechanism was introduced with a patience of 15 epochs to prevent unnecessary training iterations while ensuring the model achieves its best performance on the validation set. For custom CNN architectures, dropout layers and batch normalization were used to enhance regularization and stabilize the training process.
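
Early stopping with a patience of 15 epochs can be sketched as a counter that resets whenever the validation loss improves. The function below is an illustrative re-implementation of that idea, not the training code used in the study; it assumes any strict decrease counts as an improvement and uses 0-indexed epochs.

```python
def early_stopping_epoch(val_losses, patience=15):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs pass without a new
    minimum; if that never happens, the last epoch is returned.
    """
    best = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical curve: improves for three epochs, then plateaus.
losses = [1.0, 0.8, 0.7] + [0.75] * 20
print(early_stopping_epoch(losses, patience=15))  # (17, 2)
```

In this example training would halt at epoch 17, and the weights from epoch 2 would be the ones to keep if the best checkpoint is restored.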

Table 2 provides a comprehensive summary of the architectures, learning parameters, and configurations for each model used in this study. The configurations highlight the standardized data preprocessing steps, loss functions, and specific architectural details tailored to the problem at hand.

Figure 10 illustrates the proposed flowchart of our study, detailing the processes involved in data collection, preprocessing steps, and classification using Convolutional Neural Networks (CNN) alongside machine learning algorithms (SVM, RF, DT). The workflow includes the integration of pretrained models (VGG16, ResNet50, and Xception) and demonstrates model training, validation, and evaluation strategies to enhance classification accuracy.

Fig. 10 — Proposed workflow for data processing and classification using CNN and pretrained models

Model evaluation

Performance metrics

The evaluation metrics used in this study were chosen to provide a comprehensive analysis of model performance, particularly for imbalanced datasets and multi-class classification tasks.

Accuracy reflects the proportion of correctly classified instances out of the total number of samples, offering a general measure of the model's overall performance.
Precision quantifies the proportion of true positive predictions among all predicted positives, which is critical in reducing the impact of false positives.
Recall (Sensitivity) measures the model’s ability to correctly identify actual positive cases, ensuring that important instances are not overlooked.
F1 Score serves as a balanced metric by combining precision and recall, making it particularly useful for evaluating performance on imbalanced datasets.
ROC AUC measures the ability of the model to distinguish between classes across various threshold values, providing an overall assessment of classification performance.
Confusion Matrix offers detailed insights into classification outcomes, breaking down true positives, true negatives, false positives, and false negatives to facilitate a granular evaluation of errors.

The following equations define these metrics [26, 27]:

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

\mathrm{Precision} = \frac{TP}{TP + FP}

\mathrm{Recall} = \frac{TP}{TP + FN}

\mathrm{F1\ Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

TPR = \frac{TP}{TP + FN}, \quad FPR = \frac{FP}{FP + TN}

where TP (True Positives), TN (True Negatives), FP (False Positives), and FN (False Negatives) represent the classification outcomes, and TPR (True Positive Rate) and FPR (False Positive Rate) are computed at varying classification thresholds to construct the ROC curve. These metrics provide a comprehensive evaluation of model performance across segmentation and classification tasks.

By employing these metrics, this study provides a detailed and robust evaluation of the model’s performance across all relevant aspects, ensuring a balanced assessment of accuracy, reliability, and practical utility.
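
As a worked illustration of the metric definitions, the sketch below computes accuracy, precision, recall, F1 score (using the standard harmonic-mean form 2PR/(P + R)), and the false positive rate from the four confusion-matrix counts. The counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute threshold-dependent metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # identical to TPR
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
    }


# Hypothetical counts for one binary (pairwise) comparison.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

ROC AUC is not derived from a single confusion matrix: it requires the predicted scores so that TPR and FPR can be recomputed at every threshold.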

Cross-validation

k-fold cross-validation for robust model evaluation

To evaluate the robustness and generalizability of the proposed model, k-fold cross-validation was implemented. This method involves dividing the dataset into k distinct subsets (folds). The model is iteratively trained on k−1 folds and validated on the remaining fold, ensuring each subset serves as the validation set exactly once. The process is repeated for all k iterations, and the final performance metrics are computed as the average across these iterations, providing a reliable and comprehensive assessment of the model’s performance [28].
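
The fold construction described above can be sketched in a few lines. The generator below shuffles sample indices once and yields k train/validation splits in which every sample is validated exactly once. The seed and the function name are assumptions, and the text does not state whether augmented copies of one radiograph were kept in the same fold; this sketch shuffles all samples independently.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, val_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


splits = list(k_fold_indices(5110, k=5))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 5 4088 1022
```

The reported metric for a model is then the mean of the k per-fold values, optionally with the standard deviation as a measure of stability.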

Results

In this study, radiography images of dental implants from three companies (Bicon, Bego, and ITI) were used for classification. The dataset consisted of 275, 70, and 166 images for Bicon, Bego, and ITI implants, respectively. Preprocessing consisted of resizing the images to 128 × 128 pixels, sharpening the image edges with a 3 × 3 kernel whose center weight is 5, applying a median filter with a kernel size of 3 to reduce noise, applying CLAHE with a 3 × 3 tile grid to enhance image contrast, applying an implant mask to the CLAHE-processed image, reapplying the previous filter to the masked image, and finally normalizing the images to the range 0 to 1. Additionally, 10 augmented images were generated for each image in each group and used in the analysis.
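
The dataset sizes follow directly from the per-brand counts and the factor of 10 augmented images per original. The short calculation below reproduces the totals and the size of each pairwise comparison; it assumes the 5110 images reported in the abstract are exactly the 10 augmented copies per original, which matches the arithmetic but is not spelled out in the text.

```python
from itertools import combinations

ORIGINAL_COUNTS = {"Bicon": 275, "Bego": 70, "ITI": 166}
COPIES_PER_IMAGE = 10  # augmented images generated per original

augmented = {brand: n * COPIES_PER_IMAGE for brand, n in ORIGINAL_COUNTS.items()}
total_original = sum(ORIGINAL_COUNTS.values())
total_augmented = sum(augmented.values())

# Size of each pairwise (binary) task and the share of its smaller class.
pair_sizes = {}
minority_share = {}
for a, b in combinations(ORIGINAL_COUNTS, 2):
    size = augmented[a] + augmented[b]
    pair_sizes[(a, b)] = size
    minority_share[(a, b)] = min(augmented[a], augmented[b]) / size

print(total_original, total_augmented)  # 511 5110
for pair, size in pair_sizes.items():
    print(pair, size, round(minority_share[pair], 3))
```

Because every image is augmented by the same factor, the class proportions are unchanged; the Bego class remains the minority in both comparisons that involve it.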

Training and validation performance of CNN classification models

Figure 11 illustrates the training and validation performance of nine classification models applied to three types of dental implants (Bego, ITI, and Bicon) under different preprocessing conditions. It also compares the mean validation loss with standard deviation across the models to evaluate their stability and effectiveness in classification tasks.

Fig. 11 — Performance analysis of classification models for dental implants under different preprocessing conditions

(a) Training and validation performance metrics (Loss, Validation Loss, Accuracy, and Validation Accuracy) for nine different models evaluated across three dental implant classifications (Bego, ITI, and Bicon). The models were trained under three conditions: without preprocessing, with masks and preprocessing, and without masks but with preprocessing. The x-axis represents the number of epochs for each fold, while the y-axis displays the respective metric values across epochs. (b) Comparison of mean validation loss with standard deviation for the nine models across five folds. The x-axis shows the labels of the models, and the y-axis represents the mean validation loss values along with their standard deviations.

Models with preprocessing and applied masks demonstrated better validation loss and accuracy compared to models without preprocessing. The Bicon_Bego model with mask and preprocessing achieved the lowest mean validation loss (0.126 ± 0.128), indicating superior performance. Additionally, preprocessing significantly reduced performance variability across folds, leading to more stable results.

Confusion matrices for CNN classification models

Figure 12 presents the confusion matrices for the nine classification models, averaged over five folds, showing how well each model distinguishes the three dental implant classes (Bego, ITI, and Bicon). The models were evaluated under three preprocessing scenarios: without preprocessing, with masks and preprocessing, and without masks but with preprocessing. Each matrix indicates the number of correctly and incorrectly classified samples for each class, with darker shades representing higher values.

Models incorporating preprocessing and masks exhibited superior performance, reducing misclassification rates and achieving higher accuracy across all classes. Notably, the Bicon_ITI_with_mask_preprocessed model demonstrated the best classification results, accurately identifying 294 samples from the ITI class and 542 samples from the Bicon class. Furthermore, the Bicon_Bego_with_mask_preprocessed model significantly improved differentiation between the Bego and Bicon classes compared to models without preprocessing, highlighting the impact of preprocessing and masking on model performance.

Effects of preprocessing on model performance

Figure 13 shows the performance of the CNN model on test data in three scenarios: preprocessing with implant masks, preprocessing without implant masks, and without any preprocessing. The reported metrics are the averages from fivefold cross-validation on the test data.
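Averaging over five folds presupposes some way of splitting the images into folds. The source does not describe the splitting procedure, so the following is only a sketch, assuming a stratified split that keeps the class proportions the same in every fold. The function name `stratified_folds` and the seed are hypothetical. The class sizes in the demonstration are the original Bicon and Bego counts reported in the paper.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# One pairwise task: 275 Bicon and 70 Bego images (original counts).
demo_labels = ["Bicon"] * 275 + ["Bego"] * 70
demo_folds = stratified_folds(demo_labels, k=5, seed=0)
for number, fold in enumerate(demo_folds, start=1):
    bego = sum(demo_labels[i] == "Bego" for i in fold)
    print(f"fold {number}: {len(fold)} images, {bego} Bego")
```

Each fold serves once as the held-out set while the remaining four are used for training, and the reported metric is the mean over the five runs.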

Fig. 13 — Comparison of CNN Model performance with and without preprocessing

This figure presents radar charts comparing the performance of CNN models across various metrics: precision, recall, accuracy, F1 score, and ROC AUC. The models were trained on different combinations of dental implant states (Bicon, Bego, ITI) with and without preprocessing and masking techniques.

The results highlight that models incorporating preprocessing and masking generally outperform their non-preprocessed counterparts. The best overall performance was observed with the Bicon Bego with mask preprocessed model, achieving the highest precision (0.9600), recall (0.9695), accuracy (0.9538), F1 score (0.9697), and ROC AUC (0.9733). This underscores the importance of preprocessing and mask data inclusion in enhancing model performance for dental implant state classification tasks.

Performance comparison of CNN with other classifiers

The CNN model was combined with three classifiers (SVM, DT, and RF), and performance was compared across the three groups of dental implants, evaluated pairwise. In this approach, the flattened features from the CNN were used as input to the SVM, DT, and RF classifiers. Figure 14 shows the performance of the four classifiers, CNN, CSVM (CNN combined with SVM), CDT (CNN combined with DT), and CRF (CNN combined with RF), across five metrics for data preprocessed with masks. The results reported are the averages from fivefold cross-validation.
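The hybrid setup described above, in which CNN features feed a conventional classifier, can be sketched with scikit-learn. This is an illustration under stated assumptions rather than the authors' implementation: the feature matrix here is synthetic Gaussian data standing in for flattened CNN features, and the hyperparameters, the helper name `compare_classifiers`, and the use of `StratifiedKFold` are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(features, labels, n_splits=5, seed=0):
    """Mean fivefold accuracy of SVM, DT, and RF on precomputed features."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    models = {
        "CSVM": SVC(kernel="rbf"),
        "CDT": DecisionTreeClassifier(random_state=seed),
        "CRF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        fold_scores = cross_val_score(
            model, features, labels, cv=cv, scoring="accuracy"
        )
        scores[name] = float(fold_scores.mean())
    return scores


# Synthetic stand-in for flattened CNN features (assumption, not study data).
rng = np.random.default_rng(0)
n_per_class, n_features = 100, 64
class_a = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
class_b = rng.normal(1.0, 1.0, size=(n_per_class, n_features))
features = np.vstack([class_a, class_b])
labels = np.array([0] * n_per_class + [1] * n_per_class)

results = compare_classifiers(features, labels)
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

In a real pipeline the `features` array would come from the output of the trained CNN's flatten layer for each image, which is why these classifiers train much faster than the CNN itself.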

Fig. 14 — Comparison of classifier performance among different groups of dental implants

This figure compares the performance metrics of different classifiers (CNN, CSVM, CDT, CRF) across three groups: ITI with Bego, Bicon with Bego, and Bicon with ITI. The performance is evaluated using accuracy, precision, recall, F1 score, and ROC AUC.

Across all groups, the CNN and CRF classifiers consistently outperformed the CSVM and CDT classifiers. Notably, the CRF classifier achieved the highest overall performance metrics in most cases, with an accuracy of 0.8966, precision of 0.9364, recall of 0.9253, F1 score of 0.9304, and ROC AUC of 0.9351 for ITI with Bego. The highest accuracy, 0.9533, was observed for the Bicon with Bego group using the CNN classifier. The results indicate that CRF is particularly effective for dental implant state classification tasks.

Extended validation on pre-trained models

For validation, data from the three groups of dental implants were compared pairwise using the pre-trained models VGG16, ResNet50, and Xception, with their pre-trained layers frozen so that their weights were not updated during training. The results are reported as the average of the fivefold cross-validation metrics. Figure 15 compares the performance of the three pre-trained models across five metrics under two conditions: data preprocessed with masks applied to the images, and data without preprocessing.

Fig. 15 — Comparison of the performance of trained models for the classification of dental implants across different datasets and preprocessing conditions

This figure presents a comparative analysis of three convolutional neural network (CNN) models: VGG16, Xception, and ResNet50. The performance metrics, including accuracy, precision, recall, F1 score, and ROC AUC, are evaluated for each model under two conditions: raw data (blue) and preprocessed data (orange). Each row of subplots represents a different CNN architecture, while each column represents a distinct binary classification task among three dental implant types: Bicon, ITI, and Bego.

Preprocessing significantly enhances the performance of all models across different classification tasks. Among the models, VGG16 consistently performs the best, achieving the highest values of accuracy (0.9865 in Bicon vs Bego), recall (0.9561 in Bicon vs Bego), and ROC AUC (0.9877 in Bicon vs ITI). These results indicate that VGG16, particularly with preprocessed data, is highly effective for dental implant classification in this study.
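ROC AUC, one of the five metrics reported above, can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Illustrative labels and predicted scores (not study data).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(example_labels, example_scores)
print(example_auc)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few thousand images. Library implementations obtain the same value from sorted ranks.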

Discussion

In this study, radiography images from three dental implant companies (Bicon, Bego, and ITI) were classified using a dataset comprising 275, 70, and 166 images, respectively. Preprocessing involved resizing images to 128 × 128 pixels, edge sharpening, noise reduction with a median filter, CLAHE for contrast enhancement, implant masking, and normalization. Data augmentation expanded each group tenfold (to 2750, 700, and 1660 images, respectively) to increase diversity. Using CNNs, the Bicon-Bego with mask-preprocessed model achieved the best precision (0.9600), recall (0.9695), accuracy (0.9538), F1 score (0.9697), and ROC AUC (0.9733). These results highlight the critical role of robust preprocessing, including masking, in enhancing model performance for dental implant classification. CRF classifiers demonstrated the highest overall performance, achieving an accuracy of 0.8966 and precision of 0.9364 for ITI with Bego. Among pre-trained models, VGG16 with preprocessed data excelled, achieving the highest accuracy (0.9865 for Bicon vs. Bego), recall (0.9561 for Bicon vs. Bego), and ROC AUC (0.9877 for Bicon vs. ITI).
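Three of the preprocessing steps listed above (median filtering, implant masking, and normalization) can be sketched with NumPy alone. This is a simplified illustration, not the authors' pipeline: the 3 × 3 window, the rectangular mask, and min-max scaling to [0, 1] are assumptions, and resizing, sharpening, and CLAHE are omitted because they normally rely on an imaging library. The function names are hypothetical.

```python
import numpy as np


def median_filter_3x3(image):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    padded = np.pad(image, 1, mode="edge")
    height, width = image.shape
    # Stack the nine shifted views of the image, then take the median.
    windows = np.stack(
        [padded[r:r + height, c:c + width] for r in range(3) for c in range(3)],
        axis=0,
    )
    return np.median(windows, axis=0)


def apply_mask_and_normalize(image, mask):
    """Zero out pixels outside the mask, then min-max scale to [0, 1]."""
    masked = np.where(mask, image, 0.0)
    low, high = masked.min(), masked.max()
    if high == low:
        return np.zeros_like(masked, dtype=np.float64)
    return (masked - low) / (high - low)


# Synthetic 128 x 128 image with one isolated bright noise pixel.
rng = np.random.default_rng(0)
image = rng.integers(50, 200, size=(128, 128)).astype(np.float64)
image[10, 10] = 255.0

# Hypothetical rectangular implant region; a real mask would follow the implant.
mask = np.zeros((128, 128), dtype=bool)
mask[32:96, 48:80] = True

denoised = median_filter_3x3(image)
prepared = apply_mask_and_normalize(denoised, mask)
print(prepared.shape, prepared.min(), prepared.max())
```

Masking before normalization means the background maps to exactly zero, so the network input carries intensity variation only inside the implant region.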

This section provides a comparative review of our study's results against the methodologies and outcomes from prior research in dental radiograph classification and detection. Table 3 outlines these studies, emphasizing the performance of various approaches, including our proposed methods.

Table 3.

Summary of recent studies on dental radiograph analysis

Researcher(s)	Year	Dataset Size	Objective	Methodology	Performance
Guo et al. [7]	2023	2,269 images	Identifying dental implant manufacturers to reduce doctors' workload and improve treatment decisions	Proposed TVGG model (a thinner version of VGG16)	80% accuracy
HJ Kong et al. [8]	2023	4,800 images	Classification of dental implant systems using automated machine learning on a cloud platform	Google AutoML Vision, neural architecture search	Accuracy: 0.981
Shintaro Sukegawa et al. [9]	2025	7,946 + synthetic images	Evaluating enhancement in implant classification by incorporating artificially generated images	ResNet50, datasets A (in vivo), B (synthetic), C (adjusted synthetic)	Accuracy for A: 0.8888, B: 0.903, C: 0.9146
S. Sukegawa et al. [10]	2020	8,859 images	Classifying different dental implant brands via deep CNNs with transfer-learning	Fine-tuned VGG16 and VGG19	Best accuracy: 92.7% using fine-tuned VGG16
W. S. Jang et al. [12]	2022	300 images	Detecting dental implants and peri-implant tissues using object detection	Faster R-CNN	Precision: 0.977, Recall: 0.992, F1 Score: 0.984, Mean IoU: 0.907, AP@0.5: 0.996, AP@0.75: 0.967
D.-W. Lee et al. [13]	2021	445 images	Detection and classification of fractured dental implants	VGGNet-19, GoogLeNet Inception-v3, automated DCNN	Best AUC (detection): 0.984, Best AUC (classification): 0.869
S. Oh et al. [14]	2023	1,206 implants	Deep learning-based prediction of osseointegration of dental implants using plain radiography	Deep CNN	Best Accuracy: 89.6%
J.-H. Park et al. [15]	2023	1,320 images from 927 periapical radiographs	Classify dental implant diameter and length using AI	Deep Learning: Fine-tuned VGG16	Best Accuracy: 0.994
Our Study	2025	2,750 (Bicon), 700 (Bego), 1,660 (ITI)	Classifying dental implants (Bicon, Bego, ITI) with robust preprocessing and deep learning models	CNNs, CLAHE, edge sharpening, masking, and VGG16	Best accuracy: 0.9865 (Bicon vs. Bego, VGG16), ROC AUC: 0.9877 (Bicon vs. ITI), CRF: Accuracy: 0.8966, Precision: 0.9364 (ITI vs. Bego)


In this study, we developed and evaluated multiple approaches for dental implant classification using radiographic images, with particular emphasis on preprocessing techniques and comparative analysis of different classification methods. Our findings demonstrate several significant contributions to the field while also highlighting areas for future development. The performance of our VGG16 model with preprocessed data achieved remarkable accuracy (98.65% for Bicon vs. Bego), surpassing several recent benchmarks in the literature. This result represents a substantial improvement over the TVGG model of Guo et al. [7], which achieved 80% accuracy, and the fine-tuned VGG16 implementation of Sukegawa et al. [10], which reached 92.7% accuracy. Our results are comparable to those of Kong's [8] Google AutoML Vision system (98.1% accuracy), despite using a more straightforward and potentially more accessible approach. However, Park et al. [15] achieved slightly higher performance metrics with their DL model (99.4% accuracy), albeit for a different classification task focusing on implant dimensions rather than manufacturers.

A key innovation in our study is the comprehensive preprocessing pipeline, which includes edge sharpening, noise reduction, CLAHE enhancement, and, notably, implant masking. This preprocessing strategy proves particularly effective, as evidenced by our CNN model's strong performance metrics (96.00% precision, 96.95% recall for Bicon-Bego classification). While Sukegawa et al. [9] achieved 91.46% accuracy using artificial image augmentation, our approach demonstrates that careful preprocessing can achieve superior results without the computational overhead of generating synthetic images.

Another notable strength of our study is the comparative evaluation of multiple classification approaches. Unlike previous studies that typically focused on a single methodology, such as the comparison of three DCNN architectures by Lee et al. [13], we provide insights into the relative effectiveness of CNNs, CRF classifiers, and pre-trained models. The strong performance of CRF classifiers (89.66% accuracy for ITI with Bego) suggests that traditional machine learning approaches remain viable alternatives when combined with effective preprocessing.

However, several limitations should be acknowledged. Our dataset exhibits considerable class imbalance (275, 70, and 166 images for Bicon, Bego, and ITI, respectively), which contrasts with more balanced approaches like Kong's [8] study using 1,200 images per class. Our data augmentation strategy expanded every group by the same factor of ten, which leaves the class ratios unchanged and therefore may not fully address this imbalance. Additionally, our choice of 128 × 128 pixel resolution, while computationally efficient, is lower than that of some comparable studies, such as Kong's 400 × 800 pixel regions of interest.

The study's preprocessing pipeline represents a significant methodological contribution. While studies like Jang et al. [12] focused on raw image processing for object detection, our results demonstrate that targeted preprocessing can substantially enhance classification accuracy. The effectiveness of our masking technique, in particular, suggests a promising direction for future research in medical image analysis.

Our findings also contribute to the ongoing discussion about the optimal approach to dental implant classification. The strong performance of both traditional (CRF) and deep learning (VGG16) methods, when combined with our preprocessing pipeline, suggests that the choice of classification algorithm may be less critical than the quality of image preprocessing. This observation has important implications for clinical implementation, where computational resources may be limited.

Looking ahead, several avenues for future research emerge from our findings. First, exploring the integration of our preprocessing pipeline with automated architecture search techniques, similar to Kong's [8] approach, could potentially yield even better results. Second, investigating the impact of higher resolution images and more extensive data augmentation strategies could address some of the current limitations. Finally, expanding the dataset to include a broader range of manufacturers and implant types would enhance the clinical applicability of our approach.

In conclusion, while our study demonstrates competitive performance in dental implant classification, its primary contribution lies in establishing the effectiveness of comprehensive preprocessing and highlighting the viability of multiple classification approaches. These findings provide a foundation for future research while offering practical insights for clinical implementation.
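The class imbalance acknowledged in the limitations above can be quantified with inverse-frequency class weights. The sketch below uses the common "balanced" heuristic, total / (number of classes × class count), and assumes that every class is expanded by the same factor of ten, as the totals in Table 3 suggest. The source does not state that class weighting was used, so the function `balanced_class_weights` is purely illustrative.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


# Original image counts reported in the paper.
original = {"Bicon": 275, "Bego": 70, "ITI": 166}
# Assumption: uniform tenfold augmentation of every class.
augmented = {name: 10 * n for name, n in original.items()}

weights_original = balanced_class_weights(original)
weights_augmented = balanced_class_weights(augmented)
for name in original:
    print(
        name,
        round(weights_original[name], 3),
        round(weights_augmented[name], 3),
    )
```

The weights are identical before and after augmentation, which shows that scaling every class by the same factor enlarges the dataset without changing its balance. Because the study classifies implants pairwise, weights for a given task would be recomputed from the two classes involved.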

Figures 13, 14, and 15 are discussed in detail below, with attention to the improvements over the baseline conditions. Figure 13 illustrates how different preprocessing steps, especially implant masks, affect CNN performance. The model metrics (precision, recall, accuracy, F1 score, and ROC AUC) were obtained from fivefold cross-validation. For instance, the Bicon-Bego model, after preprocessing with implant masks, performed best, with precision 0.9600, recall 0.9695, and accuracy 0.9538. This shows that preprocessing, especially with implant masks, can enhance the model's performance by focusing on relevant features while discarding noise, leading to better feature extraction and classification accuracy. Conversely, the models trained without preprocessing or without masks still performed well but were weaker, which suggests that preprocessing mitigates problems arising from noise and contrast variation, as also reported by Lee et al. [29] and Hasnain et al. [30].

Figure 14 shows the performance of CNN, CSVM, CDT, and CRF on the three groups of dental implants, indicating that CNN and CRF consistently outperform CSVM and CDT. More importantly, the CRF classifier, which applies RF to CNN features, achieved the best performance metrics for ITI with Bego: accuracy 0.8966, precision 0.9364, recall 0.9253, F1 score 0.9304, and ROC AUC 0.9351. This indicates that combining CNN feature extraction with a classifier such as RF can enhance classification performance by refining the classification stage. In the other groups, CNN and CRF again led, with CNN achieving the highest accuracy of 0.9533 for Bicon with Bego. These results are consistent with previous studies highlighting the advantage of combining deep learning models with traditional classifiers in medical image classification tasks.

Figure 15 extends this validation by comparing the performance of three pre-trained CNN models (VGG16, Xception, and ResNet50) under two conditions: with and without preprocessing. The results showed that VGG16 outperformed the other models across all tasks, particularly when preprocessing was applied. In the Bicon vs Bego task, for instance, VGG16 yielded 0.9865 accuracy, 0.9561 recall, and 0.9874 ROC AUC with preprocessing, while the values were considerably lower without preprocessing. Similarly, ResNet50 and Xception benefited from preprocessing, but VGG16 consistently produced the best results, thus confirming its suitability for dental implant classification. These findings are in line with the literature, which often reports VGG16 as a robust model for medical image classification, since it is deep and can extract relevant features from the input data [24]. Thus, preprocessing and model selection, especially of VGG16, are important parts of improving the classification of dental implants.

To mitigate the risk of overfitting associated with extensive data augmentation, several strategies were implemented to enhance model generalization. A validation set, combined with early stopping, was employed to monitor performance and prevent overfitting by terminating training when the validation loss ceased to improve. Dropout regularization was applied to randomly deactivate neurons, thereby reducing reliance on specific features. Additionally, L2 weight regularization (weight decay) was utilized to control model complexity. Furthermore, cross-validation was conducted to ensure the model's stability across different data splits. Collectively, these measures reduced overfitting and enhanced the model's ability to generalize to unseen data.
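The early-stopping rule mentioned above, terminating training when the validation loss ceases to improve, can be sketched as a small function over a sequence of per-epoch validation losses. The patience of three epochs, the `min_delta` threshold, and the loss values are assumptions for illustration, since the source does not report these settings.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both zero-based.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")

    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran through every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation-loss curve (not values from the study).
example_curve = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(example_curve, patience=3)
print(stop_epoch, best_epoch)
```

In this made-up curve the loss would have improved again one epoch after stopping, which illustrates the trade-off in choosing the patience: a small value saves training time but can end training before a late improvement. In practice the weights from `best_epoch` are usually restored after stopping.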

The computational models were implemented in the Python programming language, using TensorFlow and Keras for neural network construction and training. The simulations were conducted on a system equipped with an NVIDIA RTX 3050 Ti laptop GPU with 4 GB of VRAM to accelerate the training process. The system specifications include an Intel Core i7 processor, 32 GB of RAM, and a Windows 11 operating system. The integrated development environment (IDE) used was PyCharm. Data preprocessing and analysis were performed using additional libraries such as NumPy, pandas, and SciPy. The training times of the models reflect their computational cost. Among pre-trained models, Xception trained in 37.44 min, ResNet50 in 39.82 min, and VGG16 in 41.38 min, with VGG16 being the most time-intensive due to its higher parameter count. The custom CNN model was faster, with a training time of 32.63 min. The combined CNN-machine learning classifiers had considerably shorter training times, with CNN-SVM, CNN-RF, and CNN-DT completing in 14.35, 12.42, and 10.51 min, respectively, because they operate on pre-extracted CNN features. These findings highlight the balance between accuracy and computational cost in optimizing dental implant classification workflows.

The results of this study have significant practical implications for improving the diagnosis and treatment of patients with dental implants. By leveraging advanced deep learning techniques and meticulous preprocessing methods, the classification accuracy of dental implant states can be significantly enhanced. This can aid dental professionals in making more informed decisions regarding implant selection, placement, and monitoring, ultimately leading to better clinical outcomes and more efficient workflows. Additionally, the integration of these advanced classification models into clinical diagnostic tools and workflows can streamline processes, reduce the cognitive load on dental professionals, and enhance the overall quality of care. The potential applications extend to the dental implant manufacturing industry as well, where these models can be used for quality control and ensuring product consistency. This study underscores the importance of data preparation and model selection in achieving superior results in dental implant classification tasks.

A few of the major limitations in this study include the relatively small sample sizes for the three implant types under consideration: 275 for Bicon, 70 for Bego, and 166 for ITI. This limited dataset may not adequately represent the diversity of dental implants available in the market, as implants from other manufacturers were not included in the analysis. Consequently, the findings may not be fully generalizable to other types of dental implants. Additionally, the variability in the quality of images used in this study is another significant limitation. Some images exhibited noise, low resolution, or poor lighting conditions, which could have impacted the performance of the classification models. Such inconsistencies highlight the need for standardized imaging protocols to ensure high-quality input data. Another limitation is the computational complexity of the employed deep learning models. Advanced deep learning architectures, while powerful, are resource-intensive, requiring significant computational power and specialized hardware for both training and evaluation. This could limit their practical applicability in real-world clinical settings, especially in resource-constrained environments. Finally, this study did not explore the potential impact of external factors such as variations in patient anatomy, implant placement techniques, or imaging modalities, which might affect the generalizability and robustness of the model.

Future studies can advance dental implant analysis by incorporating segmentation models like U-Net to isolate implant regions, minimizing background noise and enhancing feature extraction for improved classification accuracy. Building larger, diverse datasets through collaborations with clinics, including various implant types, is crucial for generalizability. Adopting standardized imaging protocols and advanced preprocessing techniques, such as super-resolution and GANs, can further boost performance. Approaches such as transfer learning, lightweight models, and ensemble methods can address computational challenges. A multimodal approach integrating radiographic and clinical data may enhance predictions, while user-friendly tools for clinical application can improve diagnostics. Considering external factors like patient anatomy and implant placement can further refine the results.

Conclusion

This study showcases the promising potential of deep learning models, especially CNNs, in enhancing the classification and detection of dental implants from various manufacturers, such as Bicon, Bego, and ITI. The integration of a thorough preprocessing pipeline, which included image resizing, sharpening of edges, noise reduction, contrast enhancement, and implant masking, together with data augmentation, resulted in high classification performance for multiple types of implants. These results showed that robust preprocessing, mainly the use of implant masks, is very important for enhancing the accuracy of models. The best-performing CNN model reached an accuracy of 0.9538, precision of 0.9600, and recall of 0.9695. In addition, a hybrid strategy of CNN combined with traditional machine learning classifiers, such as CRF, further demonstrated the utility of merging deep learning techniques with classic methods in improving performance. Compared with previous work on dental image classification, our results are competitive with existing studies and in some cases better, particularly when applying pre-trained networks like VGG16. More notably, transfer learning with pre-trained models performed better on preprocessed data than on unprocessed data, again emphasizing the importance of proper model selection and preprocessing for attaining high performance in dental image classification.

Despite these achievements, this study faces certain limitations. The dataset was relatively small and imbalanced, potentially affecting the model's generalizability. Variability in image quality and the computational complexity of deep learning models also pose challenges for practical deployment in clinical settings. Future work should focus on expanding dataset size, improving image quality through standardized protocols, and developing computationally efficient models. The application of segmentation models like U-Net for implant area segmentation offers promising prospects for improving classification performance by directing the model's attention to regions of interest. This study underscores the transformative impact of deep learning on dental implant classification, particularly when paired with effective preprocessing techniques. These advancements pave the way for developing clinical decision support systems, offering dental professionals more accurate, efficient, and user-friendly tools. Integrating multimodal data, diverse datasets, and accessible AI technologies can further enhance the applicability and clinical utility of these models, ultimately improving patient outcomes and advancing dental care.

Acknowledgements

We sincerely appreciate the publicly available dental implant dataset provided by Roboflow. Their open data repository has significantly contributed to the success of this study.

Authors’ contributions

R. A. L: Conceptualization, methodology, data analysis, and writing—original draft. Z. R: Data collection, radiography images preprocessing, reviewing manuscript. N. R: Implementation of the model, evaluating the classifier, interpretation of results. M. G: Supervision, results validation, revision of the manuscript. H. N: Statistical analysis, model performance evaluation, editing final manuscript.

Funding

This study did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data availability

The public data related to dental implants used in this study can be accessed on the following website: https://universe.roboflow.com/dental-implant-nyw2h/-dental-implant.

Declarations

Ethics approval and consent to participate

This study was conducted in accordance with the principles outlined in the Declaration of Helsinki. Ethical approval was obtained from the Ethics Committee of Tabriz University of Medical Sciences, Tabriz, Iran, and the Oral and Maxillofacial Radiology department. Written informed consent was waived by the ethics committee as the study did not involve direct patient participation or the collection of personal data.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Lang NP. Oral implants: the paradigm shift in restorative dentistry. J Dent Res. 2019;98(12):1287–93. 10.1177/0022034519853574.
2. Hung KF, Yeung AWK, Bornstein MM, Schwendicke F. Personalized dental medicine, artificial intelligence, and their relevance for dentomaxillofacial imaging. Dentomaxillofacial Radiology. 2023;52(1):20220335. 10.1259/dmfr.20220335.
3. Lee K-Y, Shin KS, Jung J-H, Cho H-W, Kwon K-H, Kim Y-L. Clinical study on screw loosening in dental implant prostheses: a 6-year retrospective study. J Korean Assoc Oral Maxillofac Surg. 2020;46(2):133. 10.5125/jkaoms.2020.46.2.133.
4. Niu Z, Zhong G, Yu H. A review on the attention mechanism of deep learning. Neurocomputing. 2021;452:48–62. 10.1016/j.neucom.2021.03.091.
5. Sukegawa S, et al. Is attention branch network effective in classifying dental implants from panoramic radiograph images by deep learning? PLoS ONE. 2022;17(7):e0269016. 10.1371/journal.pone.0269016.
6. Ou-Yang S, et al. The preliminary in vitro study and application of deep learning algorithm in cone beam computed tomography image implant recognition. Sci Rep. 2023;13(1):18467. 10.1038/s41598-023-45757-1.
7. Guo J, et al. TVGG dental implant identification system. Front Pharmacol. 2022;13:948283. 10.3389/fphar.2022.948283.
8. Kong HJ. Classification of dental implant systems using cloud-based deep learning algorithm: an experimental study. J Yeungnam Med Sci. 2023;40(Suppl):S29–36. 10.12701/jyms.2023.00465.
9. Sukegawa S, et al. Optimizing dental implant identification using deep learning leveraging artificial data. 2023. 10.21203/rs.3.rs-3392655/v1.
10. Sukegawa S, et al. Deep neural networks for dental implant system classification. Biomolecules. 2020;10(7):984. 10.3390/biom10070984.
11. Chaurasia A, Namachivayam A, Koca-Ünsal RB, Lee J-H. Deep-learning performance in identifying and classifying dental implant systems from dental imaging: a systematic review and meta-analysis. Journal of Periodontal & Implant Science. 2023;54(1):3. 10.5051/jpis.2300160008.
12. Jang WS, et al. Accurate detection for dental implant and peri-implant tissue by transfer learning of faster R-CNN: a diagnostic accuracy study. BMC Oral Health. 2022;22(1):591. 10.1186/s12903-022-02539-x.
13. Lee D-W, Kim S-Y, Jeong S-N, Lee J-H. Artificial intelligence in fractured dental implant detection and classification: evaluation using dataset from two dental hospitals. Diagnostics. 2021;11(2):233. 10.3390/diagnostics11020233.
14. Oh S, et al. Deep learning-based prediction of osseointegration for dental implant using plain radiography. BMC Oral Health. 2023;23(1):208. 10.1186/s12903-023-02921-3.
15. Park J-H, Moon HS, Jung H-I, Hwang J, Choi Y-H, Kim J-E. Deep learning and clustering approaches for dental implant size classification based on periapical radiographs. Sci Rep. 2023;13(1):16856. 10.1038/s41598-023-42385-7.
16. Dental Implant Dataset. Roboflow. https://universe.roboflow.com/dental-implant-nyw2h/-dental-implant (accessed 2024-11-01).
17. Gonzalez RC, Woods RE. Digital Image Processing, 3rd ed. Pearson Education India. 2009. ISBN: 9780131687288.
18. Jain AK. Fundamentals of Digital Image Processing. Englewood Cliffs: Prentice-Hall, Inc., United States; 1989. p. 569. ISBN: 0133361659.
19. Zuiderveld K. Contrast Limited Adaptive Histogram Equalization. In: Heckbert PS, editor. Graphics Gems IV. San Diego: Academic Press; 1994. pp. 474–485. 10.1016/B978-0-12-336156-1.50061-6.
20. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44. 10.1038/nature14539.
21. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97. 10.1007/BF00994018.
22. Breiman L. Random forests. Mach Learn. 2001;45:5–32. 10.1023/A:1010933404324.
23. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014. 10.48550/arXiv.1409.1556.
24. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. pp. 770–778. 10.48550/arXiv.1512.03385.
25. Chollet F. Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017. pp. 1251–1258. 10.1109/CVPR.2017.195.
26. Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manage. 2009;45(4):427–37. 10.1016/j.ipm.2009.03.002.
27. Powers DM. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint arXiv:2010.16061. 2020. 10.48550/arXiv.2010.16061.
28. Kohavi R. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI). Montreal; 1995. pp. 1137–1143.
29. Lee J-H, Kim Y-T, Lee J-B, Jeong S-N. A performance comparison between automated deep learning and dental professionals in classification of dental implant systems from dental imaging: a multi-center study. Diagnostics. 2020;10(11):910. 10.3390/diagnostics10110910.
30. Hasnain MA, Malik H, Asad MM, Sherwani F. Deep learning architectures in dental diagnostics: a systematic comparison of techniques for accurate prediction of dental disease through x-ray imaging. International Journal of Intelligent Computing and Cybernetics. 2024;17(1):161–80. 10.1108/IJICC-08-2023-0230.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The public data related to dental implants used in this study can be accessed on the following website: https://universe.roboflow.com/dental-implant-nyw2h/-dental-implant.
