Table 1.
Summary of related works.
| Author | Title | Summarized description |
|---|---|---|
| Welikala R. A., et al. (2020)11 | Automated detection and classification of oral lesions using deep learning for early detection of oral cancer | Dataset: Binary classes (lesion vs. no-lesion, referral vs. non-referral) and a multiclass set including high-risk and low-risk OPMD categories were presented. Originality: Automated detection and classification for the early detection of oral cancer was conducted with ResNet101 for classification and Faster R-CNN for object detection. Result: Image classification achieved an F1 score of 87.07% for identifying images that contained lesions and 78.30% for identifying images that required referral. Object detection achieved an F1 score of 41.18% for detecting lesions that required referral. |
| Lin H., et al. (2021)12 | Automatic detection of oral cancer in smartphone-based images using deep learning for early diagnosis | Dataset: A multiclass dataset of oral lesions, including aphthous ulcer, low-risk OPMD, high-risk OPMD, and cancer, was used; oral lesion images were collected from four different smartphone types. Originality: Three CNN architectures (VGG16, ResNet50, and DenseNet169) were implemented and compared with the proposed method (HRNet). Result: The proposed method (HRNet) achieved a sensitivity of 83.0%, specificity of 96.6%, precision of 84.3%, and F1 score of 83.6% on 455 test images. |
| Tanriver G., et al. (2021)13 | Automated detection and classification of oral lesions using deep learning to detect oral potentially malignant disorders | Dataset: Photographic images of three classes of oral lesions (benign, OPMD, and carcinoma) were used to construct the models. Originality: Classification with different CNNs, segmentation with U-Net and Mask R-CNN (ResNet backbones), and object detection with YOLOv5 were implemented. Result: U-Net with an EfficientNet-b7 backbone and test-time augmentation (TTA) achieved a Dice score of 0.929 on the test set. Mask R-CNN with ResNet backbones achieved approximately 80% box AP50 and 70–80% mask AP50 on the test set. YOLOv5l achieved an AP50 of 0.953 with TTA. Classification results for four CNN models were reported as precision, recall, and F1 score, all of which fell in the 80–90% range across the implemented models. |
| Warin K., et al. (2021)14 | Automatic classification and detection of oral cancer in photographic images using deep learning algorithms | Dataset: A balanced binary set of oral photographic images, comprising 350 images of oral squamous cell carcinoma and 350 images of normal oral mucosa, was used to construct the models. Originality: The DenseNet121 CNN architecture for classification and Faster R-CNN for object detection were trained on the balanced binary dataset for oral cancer screening. Result: The DenseNet121 classification model achieved a precision of 99%, recall of 100%, F1 score of 99%, sensitivity of 98.75%, specificity of 100%, and area under the receiver operating characteristic curve of 99%. The Faster R-CNN detection model achieved a precision of 76.67%, recall of 82.14%, F1 score of 79.31%, and area under the precision-recall curve of 0.79. |
| Warin K., et al. (2022)15 | AI-based analysis of oral lesions using novel deep convolutional neural networks for early detection of oral cancer | Dataset: Oral photographic images of oral potentially malignant disorders (OPMDs) and oral squamous cell carcinoma (OSCC). Originality: Classification models were built with DenseNet-169, ResNet-101, SqueezeNet, and Swin-S; object detection models were built with Faster R-CNN, YOLOv5, RetinaNet, and CenterNet2. The trained models were additionally assessed against oral and maxillofacial surgeons and general practitioners. Result: DenseNet-169 yielded the best multiclass image classification performance, with an AUC of 1.00 on OSCC and 0.98 on OPMDs. The best multiclass object detection model, Faster R-CNN, achieved an AUC of 0.88 on OSCC and 0.64 on OPMDs. Compared with experts and general practitioners, the CNN-based models showed diagnostic performance comparable to expert level in classifying OSCC and OPMDs on oral photographic images. |
| Das M., et al. (2023)16 | Automatic detection of oral squamous cell carcinoma from histopathological images of oral mucosa using deep convolutional neural network | Dataset: A binary-class set of histopathological images, including cancerous and non-cancerous lesions, was used; the images were preprocessed with a Gaussian filter. Originality: The authors proposed a 10-layer CNN architecture and compared its classification performance with seven CNN architectures: VGG16, VGG19, AlexNet, ResNet50, ResNet101, MobileNet, and Inception Net. Result: The proposed 10-layer CNN model outperformed the comparative models with the highest accuracy of 0.9782. AlexNet, ResNet50, ResNet101, MobileNet, and Inception Net achieved accuracies of 0.88, 0.91, 0.89, 0.93, and 0.92, respectively. VGG16 and VGG19 yielded the lowest performance, with accuracies of 0.74 and 0.71, respectively. |
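The studies summarized above report an overlapping set of screening metrics (precision, recall/sensitivity, specificity, F1 score). As a quick reference for how these relate, the sketch below derives them from binary confusion-matrix counts; the example counts are purely illustrative and are not taken from any of the cited works.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Common screening metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # recall is the same quantity as sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}

# Illustrative counts only (hypothetical, not from any cited study)
metrics = binary_metrics(tp=80, fp=10, tn=95, fn=15)
print({name: round(value, 3) for name, value in metrics.items()})
```

Note that a high F1 score can coexist with modest specificity, which is why several of the studies above report both.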