Abstract
Objective
To overcome the scarcity of annotated dental X-ray datasets, this study presents a novel pipeline for generating high-resolution synthetic orthopantomography (OPG) images using customized generative adversarial networks (GANs).
Methods
A total of 4777 real OPG images were collected from clinical centres in Pakistan, Thailand, and the U.S., covering diverse anatomical features. Twelve GAN models were initially trained, with four top-performing variants selected for further training on both combined and region-specific datasets. Synthetic images were generated at 2048 × 1024 pixels, maintaining fine anatomical detail. The evaluation was conducted using (1) a YOLO-based object detection model trained on real OPGs to assess feature representation via mean average precision, and (2) expert dentist scoring for anatomical and diagnostic realism.
Results
All selected models produced realistic synthetic OPGs. The YOLO detector achieved strong performance on these images, indicating accurate structural representation. Expert evaluations confirmed high anatomical plausibility, with models M1 and M3 achieving over 50% of the reference scores assigned to real OPGs.
Conclusion
The developed GAN-based pipeline enables the ethical and scalable creation of synthetic OPG images, suitable for augmenting datasets used in artificial intelligence-driven dental diagnostics.
Clinical Significance
This method provides a practical solution to data limitations in dental artificial intelligence, supporting model development in privacy-sensitive or low-resource environments.
Keywords: Orthopantomography (OPG) X-ray, Generative Adversarial Networks (GAN), Dental Imaging, Data Augmentation, Artificial Intelligence
Introduction
Orthopantomography (OPG) is a dental imaging technique also referred to as panoramic dental radiography or panoramic X-ray. It provides a panoramic view of the upper and lower jaws, teeth, and surrounding structures in a single image. OPG X-rays are used in dentistry for diagnosing dental conditions, assessing jaw alignment, planning orthodontic treatments, and evaluating oral surgery procedures.1 OPG images play a central role in the development of artificial intelligence (AI) models for various dental diagnostic tasks. Prior studies have demonstrated the use of deep learning models for detecting a range of conditions from panoramic radiographs with high accuracy and agreement with expert diagnoses (Orhan et al, 2023), as well as for identifying specific pathologies like idiopathic osteosclerosis2 and classifying periodontal defects using radiomic features.3 AI has also shown strong potential in diagnosing dental caries through convolutional neural network (CNN)-based models4 and in automating clinically relevant assessments such as Kennedy's classification using Mask R-CNN.5 Additionally, in our previous work, object detection models were used to diagnose multiple conditions from OPGs, such as root fractures and periodontal disease, demonstrating the diagnostic value of OPGs in AI-assisted dentistry.6 However, we encountered a scarcity of OPGs for certain diagnostic classes, which motivated the present study.
Deep learning methods for medical image analysis face significant challenges due to the limited availability of medical image datasets, which can lead to issues like data bias and overfitting.7 These models require large, high-quality datasets for effective training, but the scarcity of medical data, caused by factors such as privacy concerns, limited data sharing between organizations, and high storage costs, poses a major barrier to the adoption of modern machine learning tools in healthcare and medical research.8,9 To address these challenges, various data augmentation strategies have been employed to expand datasets and improve model performance. Traditional image transformation techniques such as cropping, rotating, and flipping, as well as deep learning-based image generation methods, have proven effective in enhancing model accuracy.10 Tailoring augmentation techniques to the specific characteristics of the image data yields the best results.7 However, certain types of medical images, such as OPGs, limit the applicability of traditional methods like cropping and flipping, underscoring the need for more specialized augmentation approaches. Advanced generative models such as variational autoencoders, generative adversarial networks (GANs), and diffusion models can be employed for medical image augmentation and can enable the creation of realistic and diverse datasets. Generating synthetic OPGs using advanced generative models not only addresses data scarcity but also mitigates ethical and privacy concerns associated with real patient data.
This work focuses on developing and evaluating advanced GAN models specifically tailored for producing high-resolution synthetic OPG images. By generating anatomically realistic panoramic dental X-rays, this approach aims to augment existing datasets, thereby addressing data scarcity and enhancing the training of deep learning models for dental image analysis.
Literature review
AI is rapidly transforming dental practice, education, and research, driven by advancements in Large Language Models, Large Vision Models, and multimodal systems. The International Dental Journal has launched a dedicated section, ‘AI and Dentistry’, to showcase emerging research in areas such as disease prediction, teledentistry, and ethical AI use, recognizing AI as a catalyst for innovation across the field.11 A comprehensive two-part narrative review further explores AI’s expanding role in diagnostics, treatment planning, clinical documentation, patient care, and education, while emphasizing ethical considerations, regulatory needs, and the importance of AI literacy for successful integration into dental practice.12,13 Another series of recent contributions underscores the expanding role of AI in various dental specialties. These include its impact on clinical diagnostics, patient care, and workflow optimization,14 as well as its integration into orthodontic treatment planning and decision support systems.15 In oral medicine, AI shows promise in diagnosing image-rich conditions such as oral cancer and lichen planus, highlighting the need for increased AI literacy among clinicians.16 Alongside these clinical advances, the importance of addressing ethical and legal concerns surrounding bias, autonomy, and data governance in AI-driven healthcare is emphasized.17
GANs, introduced by Goodfellow et al,18 are a type of generative model designed to create realistic synthetic data by training two neural networks, the generator and the discriminator, in an adversarial framework. The generator attempts to produce data indistinguishable from the original dataset, while the discriminator learns to differentiate between real and synthetic data. This adversarial process helps GANs produce high-quality synthetic data that closely resembles the target dataset. GANs infer the data distribution in a latent vector space during training and use random sampling in this space to generate new, realistic samples. Traditional data augmentation methods, such as cropping, rotating, and flipping, often result in synthetic data with limited diversity. GANs overcome this limitation by creating highly realistic and diverse synthetic data. However, GANs face challenges such as mode collapse, where the generator produces limited variations of the same data, and lengthy training times due to the adversarial nature of the framework. To address these issues, Arjovsky et al introduced the Wasserstein GAN (WGAN), which replaced the traditional binary cross-entropy loss with a loss function based on the Wasserstein distance. This improvement enhanced the stability of training, reduced mode collapse, and provided meaningful learning curves, making it easier to debug and tune hyperparameters.19
Over the years, GANs have gained significant attention for their applications in medical imaging, a domain where datasets are often small, imbalanced, and difficult to obtain due to privacy concerns and limited data sharing.20,21 Researchers have demonstrated the utility of GANs for data augmentation to address these challenges. Bowles et al22 showed that GAN-generated synthetic data could improve the performance of brain segmentation tasks, highlighting the potential of GANs in specific areas of medical imaging. Similarly, Han et al applied GANs to enhance CNN-based classification and segmentation tasks, generating synthetic chest X-rays and augmenting datasets for liver lesion detection. These techniques significantly improve performance compared to traditional augmentation methods.20,21,23 Specialized GAN architectures have further expanded the applicability of this technology. For example, Radford et al introduced deep convolutional GANs (DCGANs), which used convolutional layers instead of traditional fully connected layers. This modification allows DCGANs to effectively learn hierarchical representations, making them highly suitable for image-based tasks.24 Conditional GANs extended this capability by generating images that meet specific conditions, such as incorporating labelled information alongside the latent vector during training. Innovative approaches like CycleGANs have also been used for unique medical imaging tasks. For instance, researchers employed CycleGANs to transform contrast computed tomography images into noncontrast images, providing a novel approach to data augmentation.25 Dumagpi and Jeong26 addressed the issue of data imbalance in X-ray security images by using DCGAN and CycleGAN to synthesize new images, reducing the false-positive rate by up to 19.9%. Latent augmentation techniques have also emerged as a way to enhance the diversity and fidelity of synthetic images by modifying latent vectors, further reducing the likelihood of mode collapse.27 The use of GAN-synthesized periapical radiographs has shown potential for improving the classification accuracy of C-shaped root canals by enhancing data diversity. Synthetic images have proven effective as data augmentation tools in dental imaging tasks, particularly in scenarios with limited or imbalanced datasets.28 GANs have also shown strong potential in dental image reconstruction, offering improved accuracy and efficiency over traditional methods. Recent studies demonstrate the effectiveness of GANs and other generative models in reconstructing partial and full dental structures from intraoral and plaster scans, with diffusion models and wavelet-based approaches further enhancing outcomes.29 Additionally, GAN-based systems have shown promising results in designing biomimetic single-molar prostheses, suggesting potential for streamlining CAD workflows in prosthetic dentistry.30
One significant and very recent work related to this study has been reported in ref. 31. The authors used a dataset of 2322 OPG images and applied extensive preprocessing, such as cropping to exclude nondental regions, which potentially limited anatomical completeness and yielded images with limited realism and structural fidelity. While the presented model was capable of generating images at 256 × 256 resolution, their applicability for diagnostic or augmentation purposes remains constrained by the lack of fine detail and the low resolution. Our work focuses on generating highly realistic and high-resolution panoramic dental radiographs using a more advanced GAN architecture. Unlike previous studies that relied on small datasets and low-resolution images, we utilize a significantly larger dataset with diverse samples to improve model generalization. Instead of cropping specific regions, our approach generates complete OPG images, preserving critical anatomical structures for better clinical relevance, thus surpassing existing synthetic dental radiographs in quality and realism. A detailed technical comparison of our work with ref. 31 is given in the discussion section of this manuscript.
Another relevant study employed StyleGAN2-ADA to generate synthetic panoramic radiographs and validated them through clinician-based visual Turing tests, demonstrating high visual realism and effective deidentification.32 While their work emphasizes perceptual quality and privacy-preserving use, our approach complements it by focusing on clinical applicability and diagnostic task-readiness, supported by automated and expert-based evaluations. A comparative discussion is provided later in the manuscript.
The application of GANs to OPG X-rays holds significant promise, as prior successes with similar imaging tasks suggest favourable outcomes. Overall, GANs have revolutionized data augmentation by generating realistic and diverse synthetic data, addressing limitations of traditional methods, and improving the performance of deep learning models in both computer vision and medical domains.33,34
Anatomy of OPG X-ray
OPG, also known as panoramic dental radiography or panoramic X-ray, is a dental imaging technique that captures a single panoramic view of the upper and lower jaws, teeth, and surrounding structures. Widely used in dentistry, OPG X-rays aid in diagnosing dental conditions, assessing jaw alignment, planning orthodontic treatments, and evaluating oral surgery procedures. Its ability to provide a comprehensive view of the entire oral cavity in a single exposure makes it an invaluable tool for dental care. However, OPGs are part of the diagnostic process and do not provide complete clinical information on their own. While interpreting OPG X-rays requires a thorough understanding of anatomy, this summary aims to make interpretation more accessible to researchers without a medical background. For a detailed guide to OPG evaluation, interested readers are encouraged to refer to Pandolfo.1
A sample OPG X-ray is shown in Figure 1 to highlight key areas of interest. Different dentists may prioritize specific features based on the reason the OPG was conducted, leading to variations in their focus and analysis. This differing analysis can sometimes result in interpretations that may not align, causing a lack of consensus among practitioners. This highlights the importance of understanding the context in which the OPG is used and the potential for subjectivity in interpreting the images.
Fig. 1.
Basic anatomy and nomenclature for the OPG image.
The X-ray highlights some important parts of anatomy that will be relevant to our discussion.
• The red box indicates the maxillary (upper) and mandibular (lower) teeth. Teeth are typically identified by their position and type, ie, incisors (front teeth), canines (pointed teeth next to the incisors), premolars (behind the canines), and molars (large back teeth). The highlighted red box in the image encompasses the entire set of teeth, indicating the area of primary dental interest. Dentists often use this box to look for cavities, tooth decay, gum disease, impacted teeth, and the alignment of teeth and jaws. The teeth are divided into four quadrants and then assigned numbers from one to eight in each quadrant. The maxilla and mandible are also identified along with their teeth.
• The curved, yellow-highlighted areas above the upper teeth point to the maxillary sinuses, which are air-filled spaces near the nose. They appear as darker hollow areas in the X-ray because of their air content. Abnormalities in these sinuses could indicate sinusitis or other sinus conditions.
• The mandibular nerve (indicated as a red-highlighted curved area) is part of the trigeminal nerve and runs within the mandibular canal along the mandible. The nerve is responsible for sensation in the lower lip, chin, and lower teeth. Dentists typically account for it while performing dental surgeries such as tooth extractions or root canals to avoid complications. The green-highlighted areas on the right side of the image indicate the path of this nerve.
• Temporomandibular joints are located on both sides of the head where the lower jaw connects to the skull, near the ears. These joints allow the jaw to move up and down and side to side. Issues with these joints can lead to temporomandibular disorders, causing pain and dysfunction in the jaw.
• The darker grey regions represent less dense areas (soft tissues).
• The lighter grey to white areas indicate denser structures like bones and teeth. Dentists and specialists analyse these variations to check for signs of bone loss, fractures, or other structural anomalies.
As stated, experienced radiologists will be able to point out even more details and can present us with medical nomenclature. We believe that the above description will be relevant and sufficient for the readers to interpret this study. We now discuss various examples of OPG X-ray images in the context of GANs, supported by Figure 2. These examples, sampled from our training data, represent the general categories of classes present in the dataset and provide a basis for the descriptions that follow.
• Artefacts: Foreign objects are sometimes captured in OPG images, and while these are easily identified by radiologists and dentists, they can pose challenges for training algorithms. For example, Figure 2A and B shows a nose pin with a necklace, and earrings, respectively. Some artefacts, such as the white patch in Figure 2A, are less obvious and may obscure critical anatomical structures, like the jaw midline, complicating image interpretation for dentists.
• Zoom ratio: OPG X-ray images can be captured at varying distances, resulting in zoomed-in or zoomed-out images. While both types are easily interpretable by radiologists, they can create challenges for training algorithms, which typically require images of uniform resolution. For instance, Figure 2C illustrates a zoomed-in image that excludes the sinus cavities. This might be sufficient for a dentist focused on examining the lower jaw, but could be problematic for one needing to assess the upper jaw and sinus cavities, who might label it as artefact-ridden. Conversely, Figure 2D depicts a zoomed-out image encompassing all regions, offering a more comprehensive view suitable for high-level examination.
• Dental fillings and root canals: As noted above, white indicates hard surfaces and grey indicates soft tissue. The white in Figure 2E represents relatively hard tooth fillings and root canals. Similarly, Figure 2F shows orthodontic braces.
Fig. 2.
Classifications of orthopantomography X-ray images with the presence of artefacts. (A) Presence of a nose pin and necklace as foreign objects. (B) Presence of earrings as foreign objects. (C) Zoomed-in orthopantomography image excluding sinus cavities. (D) Zoomed-out orthopantomography image including the cervical vertebrae. (E) Orthopantomography image with tooth fillings and root canals, and (F) orthopantomography image with orthodontic braces.
Synthetic image generation using deep learning models
GANs are widely used generative models for synthetic image generation. A GAN consists of two competing neural networks: the generator, which transforms latent noise into synthetic images, and the discriminator, a CNN that distinguishes real from generated images. Both networks are trained adversarially until Nash equilibrium is reached, iteratively improving their performance.35 The discriminator is a CNN-based binary classifier with multiple convolutional layers, progressively increasing filters to capture complex patterns. It employs ReLU activations and downsampling, followed by a final Sigmoid or Hyperbolic Tangent (Tanh) layer for binary classification. The generator, designed for image synthesis, accepts a latent noise vector and uses transpose convolutional layers to upsample the input until it matches the target image dimensions, essentially the reverse of the discriminator’s architecture. GAN training follows a min-max loss function, as proposed by Ian Goodfellow et al,18 where the generator minimizes the ability of the discriminator to differentiate real from fake images:
$$\min_{G}\max_{D} V(D,G) = \mathbb{E}_{x\sim p_{\mathrm{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z\sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right] \qquad (1)$$
The generator takes a random noise vector $z$, where $z$ is sampled from a prior distribution of latent noise $p_z(z)$, and generates a fake sample $G(z)$. $D(x)$ is the output of the discriminator and indicates whether image $x$ is real or fake. $p_{\mathrm{data}}(x)$ is the distribution of the real data. The first term in the equation represents the discriminator's ability to detect real images, ie, the discriminator tries to increase the probability of correctly identifying a real image. The second term represents the discriminator's ability to classify generated data as fake. The generator tries to minimize $\log(1 - D(G(z)))$ to make it difficult for the discriminator to identify fake images.
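As an illustration, a single adversarial training step implementing Eq. (1) can be sketched with the Keras/TensorFlow stack used later in this study. This is a minimal sketch: the generator, discriminator, and optimizer objects are assumed to be created elsewhere, and the generator update shown is the common non-saturating variant of the min-max objective.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()
LATENT_DIM = 128  # input latent dimension used by all four models (Table 2)

@tf.function
def train_step(real_images, generator, discriminator, g_opt, d_opt):
    batch = tf.shape(real_images)[0]
    z = tf.random.normal([batch, LATENT_DIM])                   # z ~ p_z(z)
    with tf.GradientTape() as d_tape, tf.GradientTape() as g_tape:
        fake_images = generator(z, training=True)               # G(z)
        real_out = discriminator(real_images, training=True)    # D(x)
        fake_out = discriminator(fake_images, training=True)    # D(G(z))
        # Discriminator maximises log D(x) + log(1 - D(G(z)))
        d_loss = bce(tf.ones_like(real_out), real_out) + \
                 bce(tf.zeros_like(fake_out), fake_out)
        # Non-saturating generator loss: maximise log D(G(z))
        g_loss = bce(tf.ones_like(fake_out), fake_out)
    d_opt.apply_gradients(zip(
        d_tape.gradient(d_loss, discriminator.trainable_variables),
        discriminator.trainable_variables))
    g_opt.apply_gradients(zip(
        g_tape.gradient(g_loss, generator.trainable_variables),
        generator.trainable_variables))
    return d_loss, g_loss
```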
DCGANs were introduced in 2015 as a significant advancement in GAN architecture. They leverage deep convolutional networks in both the generator and discriminator. DCGANs have become quite popular in data augmentation tasks due to their ability to generate high-resolution, realistic images from learned data distributions. Figure 3 illustrates the general architecture of DCGAN.
Fig. 3.
General architecture of DCGAN trained for 64 × 64 three-channel images (Creative Commons License).
The choice of the number of layers depends on the size of the images. The discriminator takes the image as input and applies convolution layers until the last layer, where the features are flattened into a fully connected layer and a classifier outputs a binary value identifying the image as real or fake. The generator reverses this process: it takes an input latent noise vector and applies transpose convolution layers until it reaches the original image size. Just as in the original GAN, training is performed until equilibrium is reached in the min-max loss. GANs are challenging to train due to issues like mode collapse and the diminishing gradient problem. Mode collapse occurs when the generator produces limited diversity in its outputs, focusing on a small subset of the target data distribution, ie, the GAN fails to capture the complete distribution of the data. The diminishing gradient problem arises when the discriminator becomes too strong, causing its feedback to the generator to diminish significantly and making it difficult for the generator to learn effectively.
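To make the layer structure concrete, the following sketch builds a DCGAN generator/discriminator pair for the 64 × 64 three-channel setting of Figure 3. It is illustrative only: the filter counts are assumptions, and it does not reproduce the exact models trained in this study.

```python
from tensorflow.keras import layers, models

def build_generator(latent_dim=128):
    return models.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(4 * 4 * 512),
        layers.Reshape((4, 4, 512)),                                                   # 4 x 4 seed
        layers.Conv2DTranspose(256, 4, strides=2, padding="same", activation="relu"),  # 8 x 8
        layers.Conv2DTranspose(128, 4, strides=2, padding="same", activation="relu"),  # 16 x 16
        layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),   # 32 x 32
        layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="tanh"),    # 64 x 64 x 3
    ])

def build_discriminator():
    return models.Sequential([
        layers.Input(shape=(64, 64, 3)),
        layers.Conv2D(64, 4, strides=2, padding="same", activation="relu"),   # 32 x 32
        layers.Conv2D(128, 4, strides=2, padding="same", activation="relu"),  # 16 x 16
        layers.Conv2D(256, 4, strides=2, padding="same", activation="relu"),  # 8 x 8
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),                                # real vs fake
    ])
```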
WGANs, introduced as an enhancement to DCGANs, mitigate the challenges of mode collapse and diminishing gradients. WGAN does this by replacing the conventional binary cross-entropy loss function used in DCGAN with the Wasserstein distance metric. The Wasserstein distance measure quantifies the effort needed to transform the generator’s distribution into the real data distribution, providing more meaningful and smoothed gradients to guide the optimization process. WGANs also enforce a Lipschitz continuity constraint, either by clipping the weights of the discriminator or using a gradient penalty, which ensures stable and reliable updates during training. The architectural modifications proposed in WGAN stop the discriminator from saturating and encourage the generator to produce diverse samples.
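A hedged sketch of the WGAN critic objective with gradient penalty is given below. Note that the final models in this study used binary cross-entropy (Table 2), so this illustrates the variant discussed above rather than our exact configuration; the penalty weight of 10 follows common practice.

```python
import tensorflow as tf

def critic_loss_with_gp(critic, real, fake, gp_weight=10.0):
    # Wasserstein estimate: the critic minimises E[D(fake)] - E[D(real)]
    w_loss = tf.reduce_mean(critic(fake, training=True)) - \
             tf.reduce_mean(critic(real, training=True))
    # Gradient penalty enforcing the 1-Lipschitz constraint on interpolates
    eps = tf.random.uniform([tf.shape(real)[0], 1, 1, 1], 0.0, 1.0)
    interp = eps * real + (1.0 - eps) * fake
    with tf.GradientTape() as tape:
        tape.watch(interp)
        interp_scores = critic(interp, training=True)
    grads = tape.gradient(interp_scores, interp)
    grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    penalty = tf.reduce_mean(tf.square(grad_norm - 1.0))
    return w_loss + gp_weight * penalty
```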
Methodology
This study proposes a pipeline for generating and evaluating synthetic OPG images using GANs. The process begins with data collection from three diverse sources, followed by preprocessing and augmentation to prepare the dataset. Twelve different GAN models with varying architectures and hyperparameters were trained, from which four were shortlisted based on the visual quality of the generated images. Each selected model was used to generate 1000 synthetic OPGs, and the top 50 images from each were used for evaluation. The generated images were assessed through two methods: an automated test using a real-world YOLO-based detection model,6 and manual evaluation by expert dentists using a structured assessment rubric. The complete methodology is outlined in the flow diagram given in Figure 4. All data related to this research project, including the synthetic OPG images generated by the models, selected labelled images, codebase, and trained model files, are available online.36
Fig. 4.
Workflow of the GAN-based synthetic OPG image generation and evaluation pipeline.
Data collection and preparation
To ensure dataset diversity and representative variability, a total of 4777 OPG images were collected from three different sources across Pakistan, Thailand, and the United States. The images from Pakistan and Thailand were obtained from private clinical datasets, which are now publicly available at Mendeley Data,37,38 respectively, while the U.S. dataset was sourced from the Tufts Dental Database, which is also publicly available.39 All images were examined to exclude worn-out or low-quality scans, followed by normalization to standardize intensity ranges. The dataset exhibited a range of resolutions, from 1024 × 527 to 3006 × 1859 pixels. Specifically, the Pakistani dataset comprised 3209 images with varying resolutions, the Thai dataset 568 images with varying resolutions, and the Tufts dataset 1000 images with a uniform resolution of 1615 × 840 pixels. These details are summarized in Table 1.
Table 1.
Summary of OPG image datasets collected from three countries, including source type, number of images, and resolution range.
| Country | Number of images | Resolution (pixels) | |
|---|---|---|---|
| | | Minimum | Maximum |
| Pakistan | 3209 | 2190 × 1236 | 3006 × 1859 |
| Thailand | 568 | 1024 × 527 | 2872 × 1504 |
| USA | 1000 | 1615 × 840 | 1615 × 840 |
| Total images | 4777 | 1024 × 527 | 3006 × 1859 |
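The normalization and rescaling steps described above can be sketched as follows. The directory name and the choice of OpenCV are illustrative assumptions; intensities are scaled to [-1, 1] to suit a tanh generator output.

```python
import cv2
import numpy as np
from pathlib import Path

TARGET_W, TARGET_H = 2048, 1024   # training resolution for M1-M3 (Table 2)

def load_opg(path: Path) -> np.ndarray:
    img = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (TARGET_W, TARGET_H), interpolation=cv2.INTER_AREA)
    # Scale intensities to [-1, 1] to match a tanh generator output
    return img.astype(np.float32) / 127.5 - 1.0

# "opg_images" is an assumed directory of quality-checked scans
dataset = np.stack([load_opg(p) for p in sorted(Path("opg_images").glob("*.png"))])
dataset = dataset[..., np.newaxis]  # add a channel axis for greyscale training
```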
Training and selection of GAN models
A total of 12 GAN models were developed and trained for synthetic OPG image generation, each using distinct architectures and hyperparameters. The models were implemented using Keras and TensorFlow libraries in Python 3.10.12, and experiments were conducted on a workstation equipped with a quad-core Intel i7-6700 CPU (3.4 GHz), 32 GB RAM, and an NVIDIA RTX A4000 GPU with 16 GB of graphics memory, utilizing CUDA version 12.2 for accelerated training. Initial designs followed simple GAN structures referenced from existing literature, starting with seven layers each for the generator and discriminator. Various configurations were tested, such as weak vs strong generator/discriminator, colour vs greyscale image training, different data augmentation techniques, and training on single-source vs mixed-source datasets with varying image quality and resolution. Several models failed to converge or produce meaningful outputs.
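As an example of the augmentation configurations tested, the following sketch reproduces the horizontal-flip and hue-adjustment scheme listed for model M1 in Table 2, which triples the dataset from 2888 to 8664 images; the hue shift magnitude is an assumed value not reported in the paper.

```python
import tensorflow as tf

def expand_dataset(images):
    """Triple the dataset: originals, horizontally flipped, and hue-shifted copies."""
    flipped = tf.image.flip_left_right(images)       # mirror left/right anatomy
    hue_shifted = tf.image.adjust_hue(images, 0.05)  # assumed hue delta; colour images only
    return tf.concat([images, flipped, hue_shifted], axis=0)  # 2888 -> 8664 for M1
```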
From these experiments, four GAN models were shortlisted based on manual inspection of the generated images, specifically focusing on visual quality and anatomical plausibility. These models are referred to as M1, M2, M3, and M4 in the remainder of this text. For these models, training was gradually scaled up to determine convergence behaviour and learning progression. We began training with 10 epochs and incrementally increased the count to 50, 75, 100, and so on, until a model seemed to stop learning further. It was observed that the models began capturing tooth structures by 50 epochs, with joint formations and upper jaw details becoming more prominent by 75 to 150 epochs. One representative model was selected to illustrate this progression in Figure 5, which shows how training quality evolved visually over time. For each model, the epochs were gradually increased until no significant qualitative improvements were observed (Table 2).
Fig. 5.
Deep learning model progression during training. The images show model’s output after (A) 50 epochs, (B) 75 epochs, (C) 150 epochs, and (D) 175 epochs.
Table 2.
Hyperparameters, dataset characteristics, and architectural configurations of the four final models (M1-M4) used for training and evaluation.
| Model ID | M1 | M2 | M3 | M4 |
|---|---|---|---|---|
| Dataset | | | | |
| Sources | Pakistan only | All three | All three | All three |
| Resolution range | 2190 × 1236 to 3006 × 1859 | 1024 × 527 to 3006 × 1859 | 1024 × 527 to 3006 × 1859 | 1024 × 527 to 3006 × 1859 |
| Size | 2888 | 4777 | 4777 | 4777 |
| Augmentation | | | | |
| Techniques | Horizontal flip, hue adjustment | None | None | Horizontal flip |
| Size after augmentation | 8664 | 4777 | 4777 | 9554 |
| Hyperparameters | | | | |
| Grey scale/colour | Colour | Colour | Grey scale | Grey scale |
| Scaled Resolution | 2048 × 1024 | 2048 × 1024 | 2048 × 1024 | 1024 × 512 |
| Epochs | 100 | 290 | 450 | 200 |
| Batch Size | 4 | 16 | 16 | 16 |
| Kernel size for discriminator | 4 | 4 | 4 | 3 |
| Kernel size for generator | 4 | 4 | 3 | 3 |
| Input latent dimension | 128 | 128 | 128 | 128 |
| Dense | 16 × 16 × 128 | 16 × 16 × 128 | 16 × 16 × 128 | 16 × 16 × 128 |
| Reshape | 8 × 16 × 256 | 8 × 16 × 256 | 4 × 8 × 1024 | 2 × 4 × 4096 |
| Number of layers in discriminator | 9 | 9 | 9 | 10 |
| Number of layers in generator | 8 | 8 | 9 | 9 |
| Discriminator activation function | ReLU, Sigmoid | ReLU, Sigmoid | ReLU, Sigmoid | ReLU, Sigmoid |
| Generator activation function | ReLU, Sigmoid | ReLU, Tanh | ReLU, Tanh | ReLU, Tanh |
| Loss function | Binary cross entropy | Binary cross entropy | Binary cross entropy | Binary cross entropy |
| Optimizer learning rate | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
| Optimizer beta_1 | 0.9 | 0.5 | 0.5 | 0.5 |
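The dimension bookkeeping in Table 2 can be made concrete with a generator sketch consistent with the M1 column: latent dimension 128, a 16 × 16 × 128 dense layer reshaped to an 8 × 16 × 256 grid, kernel size 4, and a sigmoid colour output at 2048 × 1024. The per-layer filter counts are assumptions, as Table 2 does not list them; seven stride-2 upsamplings carry the 8 × 16 seed to 1024 × 2048.

```python
from tensorflow.keras import layers, models

def build_generator_m1(latent_dim=128):
    g = models.Sequential([
        layers.Input(shape=(latent_dim,)),
        layers.Dense(16 * 16 * 128),    # 32,768 units, as in the "Dense" row
        layers.Reshape((8, 16, 256)),   # same 32,768 values, as in the "Reshape" row
    ])
    for filters in (512, 256, 128, 64, 32, 16):   # six intermediate upsamplings (assumed widths)
        g.add(layers.Conv2DTranspose(filters, 4, strides=2,
                                     padding="same", activation="relu"))
    # Seventh upsampling reaches 1024 x 2048 (H x W), ie 2048 x 1024 pixels
    g.add(layers.Conv2DTranspose(3, 4, strides=2, padding="same",
                                 activation="sigmoid"))
    return g

build_generator_m1().summary()  # final output shape: (None, 1024, 2048, 3)
```

Together with the dense layer, the seven transpose convolutions give the eight generator layers reported for M1.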
Evaluation of GAN models
Each of the four shortlisted models, M1, M2, M3, and M4, was used to generate 1000 synthetic OPG images. Based on visual assessment, 50 high-quality images per model were manually selected for further analysis. It was observed that models M1, M2, and M3 produced some ‘trash’ images, synthetic outputs that visually resembled OPGs but exhibited unrealistic or distorted teeth and other anatomical structures, while model M4 generated no such trash images but produced images at a lower resolution and with less detail. Figure 6 presents visual examples of the synthetic images generated by each of the four models alongside real OPG images for qualitative comparison. Two complementary evaluation strategies were adopted to assess the realism and clinical relevance of the generated images, as described below.
Fig. 6.
Visual comparison of real and synthetic orthopantomogram (OPG) images. The top row displays three real OPGs, followed by three synthetic samples from each of the four models (M1-M4) in successive rows. The figure highlights anatomical quality, visual coherence, and structural realism across models relative to real images.
Automated evaluation
To evaluate the quality and realism of the synthetic images, we employed an automated evaluation method using the real-world YOLO-based object detection model developed in our earlier work.6 This model, trained to detect common dental conditions, such as broken roots, periodontally compromised teeth, and partially edentulous arches, in OPG X-ray images, demonstrated reliable performance on real clinical data despite a limited dataset, owing to carefully optimized architectures and a piecewise annotation approach. For the current evaluation, a set of 200 synthetic images, 50 from each of the four models (M1-M4), was manually annotated using the Computer Vision Annotation Tool (CVAT) to create ground truth labels. The trained YOLO model was then used to detect objects in these images, and its performance was assessed based on the number of predicted instances and mean average precision (mAP) scores.
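Under the assumption of an Ultralytics-style YOLO checkpoint and dataset configuration file (the detector from ref. 6 may use a different framework version, and the file names here are illustrative), the automated check reduces to a short validation run:

```python
from ultralytics import YOLO

model = YOLO("opg_detector.pt")                  # detector trained on real OPGs (ref. 6)
metrics = model.val(data="synthetic_opg.yaml")   # the 200 annotated synthetic images
print(f"mAP50 over all classes: {metrics.box.map50:.3f}")
print("per-class mAP values:", metrics.box.maps)
```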
Table 3 summarizes the automated evaluation results for the 200 synthetic images using the YOLO-based dental object detection model. Ground truth annotations covered ten dental classes spanning broken roots (class 0), periodontally compromised teeth (class 1), and various edentulous areas in the maxillary (classes 2-5) and mandibular (classes 6-9) arches, as reported in ref. 6. The table reports ground truth and predicted instances per class, and mAP scores. Model M1 achieved a moderate mAP of 43.3%, with its limited correct detections concentrated in the mandibular classes 6, 7, and 8. Model M2 performed worse than M1, with an mAP score of 23.4%, producing many false positives and lower detection accuracy. Model M3 showed the highest overall performance with an mAP score of 53.7% and accurate predictions across multiple classes, especially class 8. Model M4, which produced lower-resolution images with less detail, resulted in zero predicted instances, indicating the YOLO model could not recognize relevant features in these images, though it was able to process the lower-resolution input.
Table 3.
Automated evaluation of synthetic images generated by models M1 to M4 using a previously trained YOLO-based dental object detection model.
| Model | Class | Images | Ground truth instances | Predicted instances | mAP | Model | Class | Images | Ground truth instances | Predicted instances | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M1 | All | 50 | 56 | 7 | 0.433 | M2 | All | 50 | 19 | 38 | 0.234 |
| | 0 | 50 | 0 | 0 | - | | 0 | 50 | 3 | 2 | - |
| | 1 | 50 | 0 | 0 | - | | 1 | 50 | 0 | 0 | - |
| | 2 | 50 | 4 | 0 | 0 | | 2 | 50 | 0 | 1 | - |
| | 3 | 50 | 2 | 0 | 0.149 | | 3 | 50 | 1 | 0 | 0 |
| | 4 | 50 | 5 | 0 | 0.558 | | 4 | 50 | 0 | 0 | - |
| | 5 | 50 | 0 | 0 | - | | 5 | 50 | 0 | 0 | - |
| | 6 | 50 | 6 | 2 | 0.559 | | 6 | 50 | 0 | 1 | - |
| | 7 | 50 | 11 | 2 | 0.47 | | 7 | 50 | 2 | 4 | 0.135 |
| | 8 | 50 | 28 | 3 | 0.865 | | 8 | 50 | 16 | 5 | 0.568 |
| | 9 | 50 | 0 | 0 | - | | 9 | 50 | 0 | 0 | - |
| M3 | All | 50 | 35 | 31 | 0.537 | M4 | All | 50 | 58 | 0 | - |
| | 0 | 50 | 6 | 6 | 0.007 | | 0 | 50 | 0 | 0 | - |
| | 1 | 50 | 0 | 0 | - | | 1 | 50 | 0 | 0 | - |
| | 2 | 50 | 2 | 0 | 0.516 | | 2 | 50 | 0 | 0 | - |
| | 3 | 50 | 2 | 1 | 0 | | 3 | 50 | 0 | 0 | - |
| | 4 | 50 | 0 | 0 | - | | 4 | 50 | 1 | 0 | - |
| | 5 | 50 | 1 | 0 | 0.498 | | 5 | 50 | 0 | 0 | - |
| | 6 | 50 | 6 | 7 | 0.749 | | 6 | 50 | 10 | 0 | - |
| | 7 | 50 | 1 | 0 | 0.995 | | 7 | 50 | 5 | 0 | - |
| | 8 | 50 | 17 | 17 | 0.995 | | 8 | 50 | 42 | 0 | - |
| | 9 | 50 | 0 | 0 | - | | 9 | 50 | 0 | 0 | - |
The table reports ground truth and predicted instances, and mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP50).
A detailed per-class analysis of model M3’s performance reveals notably higher mAP scores for classes corresponding to the mandibular arch, specifically classes 6, 7, and 8, which represent various edentulous areas in the mandibular region. Classes 6 and 8 achieved mAP50 scores above 70%, with class 8 reaching as high as 99.5%, indicating clear and consistent anatomical features in the synthetic images. Conversely, classes related to the maxillary arch (classes 2, 3, 4, and 5) showed lower detection performance, suggesting less accurate formation of these regions in the generated images. Class 9, representing edentulous areas crossing the centre of the mandibular arch, had very low detection accuracy, consistent with the original YOLO model’s performance in this class due to insufficient training data at the time. This per-class evaluation suggests that model M3 more reliably replicates mandibular arch structures compared to maxillary arch features, possibly reflecting anatomical complexity or limitations of the training dataset.
In addition to its superior detection performance, model M3 also exhibited greater diversity in the generated samples, covering a broader range of classes compared to the other models. This variety indicates better generalization and suggests that M3 was more effective in capturing the distribution of dental conditions present in the training data.
Expert evaluation
For this evaluation, a few images were selected, on which two complementary evaluation strategies were employed. The first was a qualitative assessment, in which field experts, including medical professionals and radiologists, were informed that the images were synthetic and asked to provide open-ended feedback. Representative comments are presented in Table 4. While experts noted minor anomalies in dental structures, they emphasized that such irregularities are often present even in real OPG images, making it difficult to distinguish synthetic from real images based solely on tooth formations. Their comments suggested that the better-performing models showed a promising ability to replicate dental anatomy. However, some images exhibited blurred or incomplete jaw structures, which could limit their clinical diagnostic utility. Though not ideal for clinical use, these images are considered useful for data augmentation in machine learning pipelines.
Table 4.
Feedback from field experts on the quality and usability of generated OPG X-ray images for data augmentation.
| Synthetic OPG image | Remarks of the field expert |
|---|---|
| (Synthetic OPG image 1) | The OPG X-ray has good teeth structure. The mandibular joints are present but do not show the normal socket-like structure needed for smooth jaw movement. The canal that carries the nerve in the lower jaw is not visible, which could be due to developmental issues or imaging limitations. |
| (Synthetic OPG image 2) | The teeth structure is slightly blurred, but this can also happen in real images, so it is hard to recognize. The asymmetric mandibular joints at the top left and top right are extremely blurred, and structure recognition is poor. Dentists might categorize this as a poor-quality OPG image and could potentially ask for another one. |
| (Synthetic OPG image 3) | The teeth structure seems to have defects, but it looks like a real image since teeth can have such abnormalities. There is asymmetry in the mandibular joints. The temporomandibular joint (TMJ), which connects the jawbone to the skull and allows the movements needed for chewing and speaking, is absent on the right side. Some structures like the nasal bones and mandibular canal are very blurry or not present, but this can generally happen with a poorly postured X-ray. |
| (Synthetic OPG image 4) | The OPG has captured tooth fillings, but again, the image has asymmetry in the mandibular joints. |
The second evaluation strategy involved a quantitative, rubric-based review of 15 selected OPG images, including both real and synthetic examples (shown in Figure 6). These were assessed by experienced dentists, radiologists, and domain experts using a standardized 5-point scale, where 5 represented the highest clinical plausibility and image quality. Table 5 summarizes the average percentage scores across multiple anatomical features, with the final row presenting the overall average score for each image set. Figure 7 provides a graphical representation of these results for enhanced visualization.
Table 5.
Expert evaluation of OPG image quality and clinical plausibility.
| Criteria | Consolidated per cent score | | | | |
|---|---|---|---|---|---|
| | Real | M1 | M2 | M3 | M4 |
| Teeth morphology/formation | 90.67 | 45.33 | 30.67 | 38.67 | 26.67 |
| Maxillary arch formation | 85.33 | 49.33 | 37.33 | 49.33 | 37.33 |
| Mandibular arch formation | 90.67 | 54.67 | 38.67 | 52 | 40 |
| Temporomandibular joints formation | 65.33 | 49.33 | 21.33 | 34.67 | 24 |
| Maxillary sinuses formation | 77.33 | 41.33 | 22.67 | 34.67 | 20 |
| Mandibular nerve formation | 69.33 | 33.33 | 21.33 | 22.67 | 20 |
| Maxillary bones formation | 69.33 | 38.67 | 24 | 30.67 | 22.67 |
| Mandibular bones formation | 81.33 | 44 | 24 | 37.33 | 29.33 |
| Hyoid bone | 68 | 56 | 34.67 | 38.67 | 36 |
| Inferior alveolar canal | 77.33 | 37.33 | 20 | 20 | 20 |
| Soft tissues formation | 60 | 37.33 | 20 | 20 | 20 |
| Mental Foramen | 77.33 | 33.33 | 20 | 22.67 | 20 |
| Overall clarity of the image | 76 | 42.67 | 21.33 | 36 | 22.67 |
| How real does it seem? | 85.33 | 42.67 | 22.67 | 36 | 22.67 |
| Total average per cent score | 76.67 | 43.24 | 25.62 | 33.81 | 25.81 |
| Qualitative assessment | Real | M1 | M2 | M3 | M4 |
|---|---|---|---|---|---|
| Best anatomical features | Teeth, arches, realism | Mandibular and maxillary arches, hyoid bone | Hyoid bone (moderate) | Mandibular arch (moderate) | Hyoid bone (moderate) |
| Features to Improve | Soft tissues, nerve canal | Soft tissues, canals | Most fine structures | Fine anatomical features | Most anatomical details |
| Clinical Implication | Gold standard | Promising fidelity for key structures | Needs enhancement | Shows basic structure | Limited anatomical detail |
This table presents the average percentage scores awarded by domain experts (dentists and radiologists) to 15 selected OPG images, real and synthetic, based on a 5-point rubric. The final row indicates the mean score across all evaluations, which reflects the overall quality of the model’s synthetic image outputs.
Fig. 7.
Comparison of expert evaluation scores for real and synthetic OPG images across multiple anatomical features.
The results highlight the promising potential of synthetic image generation models. Among them, Model M1 achieved the highest overall rating (43.24%), showing notable fidelity in replicating major anatomical structures such as the mandibular and maxillary arches. Model M3 also performed encouragingly, scoring 33.81% with balanced outputs across several key features, including the hyoid bone and maxillary region. This suggests M3 could serve as a stable baseline with room for targeted enhancement. Even the lower-scoring models (M2 and M4) successfully generated recognizable anatomical outlines, establishing a foundation for further optimization. Overall, the expert evaluations affirm the clinical plausibility of synthetic OPG images, reinforcing their potential for use in AI training, data augmentation, and educational applications, especially where access to large annotated datasets is limited.
Importantly, although evaluators were aware that the dataset included synthetic images, they were blinded to which specific images were real or generated. Interestingly, even the real OPG images did not receive perfect scores, scoring 76.67% on average, indicating that even clinically captured images may not always reflect complete anatomical detail. When compared relatively, Models M1 and M3 scored over 50% in reference to the real images, suggesting a reasonable degree of anatomical plausibility. In particular, the arch formations (both mandibular and maxillary) were consistently well-represented in the synthetic outputs, structures that are often the focus of clinical assessments, such as in cases of missing teeth or arch deformities.
These findings reinforce the utility of synthetic OPG images in supporting downstream applications like AI model training, dataset augmentation, and anatomical education, especially in contexts where large, annotated real datasets are scarce.
Web application interface for the model
We developed a simple web application to host our model, utilizing Python’s FastAPI for the backend, vanilla JavaScript for image serving, and a basic HTML interface for user interaction. The application is deployed in a containerized Docker environment, ensuring easy setup on any device. Figure 8 provides snapshots of the interface.
Fig. 8.
Snapshots of the web application interface for generating and downloading synthetic OPG X-ray images.
Users can generate synthetic OPG X-ray images by clicking the ‘Generate Image’ button. Each click triggers the application to randomly generate a new latent vector, which produces a fresh image from our trained DCGAN. Additionally, users have the option to download the generated images to their preferred directory for further use. Overall, the application is designed to be user-friendly, allowing easy setup and efficient data generation for prospective users.
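A minimal sketch of the endpoint behind the ‘Generate Image’ button is shown below, assuming the trained generator is exported as "generator.keras"; the route name and file paths are illustrative rather than the deployed code.

```python
import io

import numpy as np
import tensorflow as tf
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from PIL import Image

app = FastAPI()
generator = tf.keras.models.load_model("generator.keras")  # assumed export path
LATENT_DIM = 128

@app.get("/generate")
def generate_image():
    # Each request samples a fresh latent vector, so every click yields a new OPG
    z = np.random.normal(size=(1, LATENT_DIM)).astype("float32")
    img = generator.predict(z)[0]                    # assumed tanh output in [-1, 1]
    img = ((img + 1.0) * 127.5).clip(0, 255).astype("uint8")
    buf = io.BytesIO()
    Image.fromarray(img.squeeze()).save(buf, format="PNG")
    buf.seek(0)
    return StreamingResponse(buf, media_type="image/png")
```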
Discussion
Our research addresses a critical need in dental AI by leveraging GAN models to generate high-resolution synthetic OPG images, providing a valuable resource for data augmentation in deep learning applications. High-quality OPG images, often ranging from 2000 × 1000 to over 3000 × 1500 pixels, are essential for accurate dental diagnostics, especially for tasks such as disease classification, object detection, and anatomical localization. Similarly, Yang et al developed and evaluated a GAN (StyleGAN2-ADA) to synthesize periapical radiographs for augmenting training datasets aimed at classifying C-shaped root canals. They generated realistic C-shaped and non-C-shaped mandibular second molar images and assessed their quality using Frechet Inception Distance (FID) and a visual Turing test by experienced radiologists. Furthermore, they tested the impact of these synthetic images on classification performance by training an EfficientNet-B0 CNN under multiple scenarios, including combinations of real and GAN-generated data. Their results demonstrated that the GAN-generated images were visually realistic and effectively improved classification accuracy, especially in situations involving limited or imbalanced data.28 Another study further validated the potential of synthetic data in dental imaging by generating realistic panoramic radiographs using a progressive growing GAN model. The synthetic images achieved high visual fidelity, to the extent that experts found it difficult to differentiate them from real radiographs. These findings highlight the value of GAN-generated data in augmenting limited datasets, improving diagnostic model performance, and supporting dental education and training.40
However, training AI models on such high-resolution images requires large volumes of annotated data, which are often difficult to acquire due to privacy constraints and limited availability. By generating realistic OPGs at resolutions suitable for both diagnostic and machine learning purposes, up to 2048 × 1024 for clinical realism, our synthetic dataset can significantly enhance model generalizability and performance. This flexibility supports scalable training pipelines that maintain critical anatomical details while accommodating varying computational resources. Thus, our approach offers a scalable and ethically sound solution to a common bottleneck in dental AI development.
As indicated in the literature review, the work presented in ref. 31 was the closest we could find that aligned with our research objectives. However, our study significantly advances the state of the art in several respects:
• The dataset in ref. 31 consisted of 2322 images, whereas our dataset includes approximately 5000 raw images, expanded to almost 10,000 with augmentation, enabling more robust model training and better generalization.
• The synthetic images generated in ref. 31 were limited to 256 × 256 resolution, while our model produces high-resolution images at 2048 × 1024, delivering greater anatomical detail and clinical relevance.
• The prior study cropped images to focus on the dentoalveolar region, excluding key structures like the temporomandibular joints and maxillary sinus; in contrast, we utilize full OPG images, preserving comprehensive anatomical context.
• Their model employed a 7-layer generator and discriminator architecture, whereas we utilized a deeper, more sophisticated 9-layer GAN, enhancing image fidelity.
• For evaluation, ref. 31 relied on FID scores and human Turing testing, while we incorporate human Turing testing alongside a customized real-world object detection and localization model to directly assess the functional usability of generated images. This approach underscores the practical utility of synthetic data for downstream clinical applications. Notably, while FID is a standard quantitative metric, it does not always correlate with human perception or task-specific effectiveness, particularly in medical imaging contexts.41, 42, 43
In addition to ref. 31, the authors of ref. 32 employed StyleGAN2-ADA to generate synthetic panoramic radiographs and validated their visual realism through Turing-style tests conducted with clinicians and students. Their results highlight the potential of using advanced pretrained GAN architectures for high-fidelity, privacy-preserving image generation. We appreciate the visual quality achieved in their outputs and recognize the importance of their work in demonstrating that synthetic images can support educational and anonymization goals in dentistry. Our approach complements this by adopting a custom-built, lightweight GAN architecture that, while simpler, successfully generates high-resolution full OPGs with diverse anatomical details across populations. Furthermore, we extend the evaluation paradigm by incorporating automated, task-based object detection metrics alongside expert rubric-based assessment to assess not just realism, but also diagnostic relevance. The modularity of our model allows for future enhancement using conditional GANs and active learning strategies, positioning it for scalable use in focused clinical applications.
The results of both automated and expert-based evaluations point to the growing maturity of synthetic OPG image generation using GANs. While Model M3 excelled in object detection tasks with high mAP scores, especially for mandibular structures, expert reviewers found Model M1 more anatomically plausible overall. This divergence highlights a key insight: technical performance (eg, detection accuracy) and human-perceived realism may not always align, suggesting the need to balance these perspectives when evaluating generative models for medical use.
The expert rubric-based assessment further strengthens the case for using synthetic OPGs in clinical training and AI development. Although real images predictably scored the highest, it is notable that even they did not receive perfect evaluations, emphasizing that clinical imaging itself is subject to variability and limitations. In this context, synthetic models like M1 and M3, despite scoring below real images, achieved over 50% of the real-image benchmark, particularly in key anatomical regions such as the arches and jaw structures. These areas are clinically significant, frequently used in diagnoses involving tooth loss, misalignment, or bone degeneration. Their accurate replication in synthetic images enhances the practical value of these models for specific diagnostic and educational applications.
Moreover, the models’ ability to generate structurally coherent outputs, even when fine anatomical details are less precise, suggests a strong foundation for further improvement. This is especially important for resource-constrained settings, where access to large, annotated datasets is limited. The synthetic images offer a viable alternative for data augmentation, supporting model robustness without the ethical and logistical complications of acquiring real patient scans. While we acknowledge that the diversity of GAN-generated images is inherently limited by the distribution of the training data, it is important to note that the dataset used in this study remains well-suited for certain types of labelling and diagnostic tasks. GANs can be custom-trained on focused datasets, allowing for targeted augmentation that enhances performance for specific applications. In this regard, our approach offers a notable advantage over traditional augmentation techniques by producing more realistic and varied samples, even within a constrained domain.
Finally, compared to prior work,31 our GAN model demonstrates superior performance in preserving high-frequency details and reducing background noise, further advancing the realism and utility of generated images. While challenges remain in achieving full anatomical completeness, these findings represent a meaningful step towards bridging the data gap in dental AI and expanding access to quality training tools.
Conclusion
This study demonstrates the effectiveness of using GANs for augmenting dental datasets through the generation of high-resolution OPG images. Our deep 9-layer GAN architecture successfully produces realistic synthetic images at resolutions up to 2048 × 1024, preserving crucial anatomical features such as the temporomandibular joints, maxillary sinuses, and complete dental arches. These synthetic images can meaningfully support downstream tasks like disease classification, object detection, and anatomical localization, especially in scenarios where annotated real-world data are scarce.
Visual Turing tests conducted with professional dentists indicated that the generated images closely resemble real OPGs, particularly those captured under suboptimal conditions. This suggests that the model has successfully learned to replicate essential anatomical features, especially the dentition, with a level of realism that makes the images suitable for training and diagnostic support purposes. While some visual artefacts remain, the overall fidelity of the outputs demonstrates strong potential for real-world applicability and highlights the model’s capacity to generalize across diverse imaging conditions. Compared to prior studies, particularly the most closely aligned work discussed in ref. 31, our approach offers improvements in dataset size, resolution, anatomical coverage, architectural depth, and evaluation rigour. These enhancements make our GAN model a strong candidate for generating training data to support AI applications in dentistry, such as tooth classification or anomaly detection.
Looking forward, retraining this model on a larger volume of high-quality, standardized OPG data could further enhance its realism and generalizability. Incorporating more advanced generative techniques such as conditional GANs or diffusion models could allow finer control over specific anatomical features or pathologies, even with limited domain data. Combining conditional GANs with active learning strategies, by systematically identifying and including the most informative samples during training, offers a promising direction for improving the diversity and representativeness of synthetic datasets, mitigating the limitations imposed by small or biased datasets and enabling task-specific augmentation tailored to particular diagnostic objectives. We also envision expanding the model’s utility by generating annotated datasets for specific clinical tasks, enabling the development of AI systems capable of accurate diagnosis and treatment planning in resource-constrained settings. Ultimately, this work sets a strong foundation for ethically sound, scalable data augmentation in dental AI and represents a significant step towards bridging the data gap in medical image analysis.
Funding
This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (Grant No. KFU252806).
Author contributions
Maria Waqas: Conceptualization, methodology, validation, formal analysis, investigation, project administration, and writing – original draft. Shehzad Hasan: Conceptualization, methodology, validation, formal analysis, investigation, project administration, writing – review and editing. Ammar Farid Ghori: Design and training of model M1, writing – a few portions of the original draft. Amal Alfaraj and Muhammad Faheemuddin: Validation, formal analysis, evaluation, writing – review and editing. Zohaib Khurshid: Data collection, validation, formal analysis, investigation, project administration, and writing – original draft.
Declaration of generative AI and AI-assisted technologies in the writing process
During the revision of this work, the author(s) used ChatGPT-4 to improve the English language in a few paragraphs, not the whole manuscript. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.
Footnotes
Supplementary material associated with this article can be found in the online version at doi:10.1016/j.identj.2025.103878.
Appendix. Supplementary materials
The file titled ‘Supplementary Material – Model Evaluation by Field Experts’ includes the original rubric sheets completed by all five domain experts, along with a consolidated summary of their quantitative feedback on the synthetic OPG images. This supplementary material supports the reproducibility of our expert evaluation process and provides deeper insight into the clinical plausibility and anatomical assessment of the generated images.
References
- 1. Pandolfo I., Mazziotti S., Raffaele A., D’Angelo G. Orthopantomography. Springer-Verlag Italia; Milano: 2013.
- 2. Jagtap R., Yesiltepe S., Bayrakdar I.S., Orhan K., Çelik Ö. A deep-learning model for idiopathic osteosclerosis detection on dental panoramic radiographs. Oral Surg Oral Med Oral Pathol Oral Radiol. 2022;134(3):e77. doi: 10.1159/000527145.
- 3. Karacaoglu F., Kolsuz M.E., Bagis N., Evli C., Orhan K. Development and validation of intraoral periapical radiography-based machine learning model for periodontal defect diagnosis. Proc Inst Mech Eng. 2023;237(5):607–618. doi: 10.1177/09544119231162682.
- 4. Lee J.H., Kim D.H., Jeong S.N., Choi S.H. Detection and diagnosis of dental caries using a deep learning-based convolutional neural network algorithm. J Dent. 2018;77:106–111. doi: 10.1016/j.jdent.2018.07.015.
- 5. Meine H., Metzger M.C., Weingart P., et al. Determination of Kennedy’s classification in panoramic X-rays by automated tooth labeling. Int J Comput Assist Radiol Surg. 2025:1–9. doi: 10.1007/s11548-025-03469-z.
- 6. Khurshid Z., Waqas M., Hasan S., Kazmi S., Faheemuddin M. Deep learning architecture to infer Kennedy classification of partially edentulous arches using object detection techniques and piecewise annotations. Int Dent J. 2025;75(1):223–235. doi: 10.1016/j.identj.2024.11.005.
- 7. Goceri E. Medical image data augmentation: techniques, comparisons and interpretations. Artif Intell Rev. 2023;56(11):12561–12605. doi: 10.1007/s10462-023-10453-z.
- 8. Kebaili A., Lapuyade-Lahorgue J., Ruan S. Deep learning approaches for data augmentation in medical imaging: a review. J Imaging. 2023;9(4):81. doi: 10.3390/jimaging9040081.
- 9. Taylor L., Nitschke G. Improving deep learning with generic data augmentation. In: 2018 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE; Bangalore, India: 2018. pp. 1542–1547.
- 10. Perez L., Wang J. The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621. 2017.
- 11. Samaranayake L. IDJ pioneers efforts to reframe dental health care through artificial intelligence (AI). Int Dent J. 2024;74(2):177. doi: 10.1016/j.identj.2024.03.006.
- 12. Samaranayake L., Tuygunov N., Schwendicke F., et al. The transformative role of artificial intelligence in dentistry: a comprehensive overview. Part 1: fundamentals of AI, and its contemporary applications in dentistry. Int Dent J. 2025;75(2):383–396. doi: 10.1016/j.identj.2025.02.005.
- 13. Tuygunov N., Samaranayake L., Khurshid Z., et al. The transformative role of artificial intelligence in dentistry: a comprehensive overview. Part 2: the promise and perils, and the International Dental Federation communiqué. Int Dent J. 2025;75(2):397–404. doi: 10.1016/j.identj.2025.02.006.
- 14. Orhan K., Ünsal G. Artificial intelligence in dentistry. In: Digital Dentistry: An Overview and Future Prospects. Springer International Publishing; Cham, Switzerland: 2024. pp. 285–301.
- 15. Orhan K., Amasya H. AI orthodontics. In: Orhan K., Jagtap R., editors. Artificial Intelligence in Dentistry. Springer International Publishing; Cham, Switzerland: 2024. pp. 131–141.
- 16. Keser G., Namdar Pekiner F., Orhan K. AI on oral mucosal lesion detection. In: Artificial Intelligence in Dentistry. Springer International Publishing; Cham, Switzerland: 2024. pp. 143–176.
- 17. Orhan K., Gülbeş M.M., Jadhav A., Jagtap R. Medico-legal problems of artificial intelligence. In: Artificial Intelligence in Dentistry. Springer International Publishing; Cham, Switzerland: 2024. pp. 259–282.
- 18. Goodfellow I., Pouget-Abadie J., Mirza M., et al. Generative adversarial nets. In: Advances in Neural Information Processing Systems 27; Montreal, Quebec, Canada: 2014.
- 19. Arjovsky M., Chintala S., Bottou L. Wasserstein generative adversarial networks. In: International Conference on Machine Learning. PMLR; Sydney, Australia: 2017. pp. 214–223.
- 20. Han C., Murao K., Satoh S., Nakayama H. Learning more with less: GAN-based medical image augmentation. Med Imaging Technol. 2019;37(3):137–142.
- 21. Godbin A.B., Jasmine S.G. Pediatric pneumonia detection in chest X-ray images: a deep feature analysis approach enhanced with LightGBM. In: 2024 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT). IEEE; Bangalore, India: 2024. pp. 1–6.
- 22. Bowles C., Chen L., Guerrero R., et al. GAN augmentation: augmenting training data using generative adversarial networks. arXiv preprint arXiv:1810.10863. 2018.
- 23. Frid-Adar M., Diamant I., Klang E., Amitai M., Goldberger J., Greenspan H. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing. 2018;321:321–331.
- 24. Radford A., Metz L., Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434. 2015.
- 25. Sandfort V., Yan K., Pickhardt P.J., Summers R.M. Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks. Sci Rep. 2019;9(1). doi: 10.1038/s41598-019-52737-x.
- 26. Dumagpi J.K., Jeong Y.J. Evaluating GAN-based image augmentation for threat detection in large-scale X-ray security images. Appl Sci. 2020;11(1):36.
- 27. Tronchin L., Vu M.H., Soda P., Löfstedt T. LatentAugment: data augmentation via guided manipulation of GAN’s latent space. arXiv preprint arXiv:2307.11375. 2023.
- 28. Yang S., Kim K.D., Ariji E., Takata N., Kise Y. Evaluating the performance of generative adversarial network-synthesized periapical images in classifying C-shaped root canals. Sci Rep. 2023;13(1). doi: 10.1038/s41598-023-45290-1.
- 29. Broll A., Goldhacker M., Hahnel S., Rosentritt M. Generative deep learning approaches for the design of dental restorations: a narrative review. J Dent. 2024:104988. doi: 10.1016/j.jdent.2024.104988.
- 30. Chau R.C.W., Hsung R.T.C., McGrath C., Pow E.H.N., Lam W.Y.H. Accuracy of artificial intelligence-designed single-molar dental prostheses: a feasibility study. J Prosthet Dent. 2024;131(6):1111–1117. doi: 10.1016/j.prosdent.2022.12.004.
- 31. Pedersen S., Jain S., Chavez M., Ladehoff V., de Freitas B.N., Pauwels R. Pano-GAN: a deep generative model for panoramic dental radiographs. J Imaging. 2025;11(2):41. doi: 10.3390/jimaging11020041.
- 32. Schoenhof R., Blumenstock G., Lethaus B., Hoefert S. Synthetic, non-person related panoramic radiographs created by generative adversarial networks in research, clinical, and teaching applications. J Dent. 2024;146:105042. doi: 10.1016/j.jdent.2024.105042.
- 33. Yi X., Walia E., Babyn P. Generative adversarial network in medical imaging: a review. Med Image Anal. 2019;58:101552. doi: 10.1016/j.media.2019.101552.
- 34. Chen Y., Yang X.H., Wei Z., et al. Generative adversarial networks in medical image augmentation: a review. Comput Biol Med. 2022;144:105382. doi: 10.1016/j.compbiomed.2022.105382.
- 35. Elgendy M. Deep Learning for Vision Systems. Simon and Schuster; New York: 2020.
- 36. Waqas M., Hasan S., Khurshid Z. Synthetic Dental Orthopantomography (OPG) Data Generated by GAN Models. Mendeley Data. 2025;IV. doi: 10.17632/y35z46ccw6.1.
- 37. Waqas M., Hasan S., Khurshid Z., Kazmi S. OPG Dataset for Kennedy Classification of Partially Edentulous Arches. Mendeley Data. 2024;V1. doi: 10.17632/ccw5mvg69r.1.
- 38. Khurshid Z., Waqas M., Faridoon F., Porntaveetus T., Trachoo V. Annotated OPG Dataset for Dental Fillings, Prostheses (Crowns & Bridges), Endodontic Treatments, Endodontic Posts, and Dental Implants. Mendeley Data. 2025;V1. doi: 10.17632/mdvs6mjgf2.1.
- 39. Panetta K., Rajendran R., Ramesh A., Rao S.P., Agaian S. Tufts Dental Database: a multimodal panoramic X-ray dataset for benchmarking diagnostic systems. IEEE J Biomed Health Inform. 2021;26(4):1650–1659. doi: 10.1109/JBHI.2021.3117575.
- 40. Kokomoto K., Okawa R., Nakano K., Nozaki K. Intraoral image generation by progressive growing of generative adversarial network and evaluation of generated image quality by dentists. Sci Rep. 2021;11(1). doi: 10.1038/s41598-021-98043-3.
- 41. Woodland M., Castelo A., Al Taie M., et al. Feature extraction for generative medical imaging evaluation: new evidence against an evolving trend. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; Marrakesh, Morocco: 2024. pp. 87–97.
- 42. Konz N., Chen Y., Gu H., Dong H., Mazurowski M.A. Rethinking perceptual metrics for medical image translation. arXiv preprint arXiv:2404.07318. 2024.
- 43. Wu Y., Liu F., Yilmaz R., Konermann H., Walter P., Stegmaier J. A pragmatic note on evaluating generative models with Fréchet inception distance for retinal image synthesis. arXiv preprint arXiv:2502.17160. 2025.