Abstract
The proportion of women dying from cervical cancer in middle- and low-income countries is over 60%, twice that of their high-income counterparts. A primary screening strategy to eliminate this burden is cervix visualization and application of 3–5% acetic acid, inducing contrast in potential lesions. Recently, machine learning tools have emerged to aid visual diagnosis. As low-cost visualization tools expand, it is important to maximize the quality of images at the time of the exam or of images used in algorithms. Objective: We present the use of an object detection algorithm, the YOLOv5 model, to localize the cervix and describe blur within a multi-device image database. Methods: We took advantage of the Fourier domain to provide pseudo-labeling of training and testing images. A YOLOv5 model was trained using Pocket Colposcope, Mobile ODT EVA, and standard of care digital colposcope images. Results: When tested on all devices, this model achieved a mean average precision (mAP) score, sensitivity, and specificity of 0.9, 0.89, and 0.89, respectively. Mobile ODT EVA and Pocket Colposcope hold out sets yielded mAP scores of 0.81 and 0.83, respectively, reflecting the generalizability of the algorithm. Compared to physician annotation, it yielded an accuracy of 0.72. Conclusion: This method provides an informed, quantitative, generalizable analysis of captured images that is highly concordant with expert annotation. Significance: This quality control framework can assist in the standardization of colposcopy workflow, data acquisition, and image analysis, and in doing so increase the availability of usable positive images for the development of deep learning algorithms.
1. Introduction
Cervical cancer is preventable and curable when caught early and managed effectively. However, there is a disproportionate burden of this illness in low- and middle-income countries (LMICs). In 2020, there were 604,000 new cases of cervical cancer and 342,000 deaths [1]. Without intervention, new cases of cervical cancer are expected to rise to 700,000 by 2030, resulting in up to 400,000 deaths [2,3]. The high incidence and mortality rate of cervical cancer can be mitigated by effective, site-appropriate interventions targeting screening and treatment [4,5]. Visual inspection with acetic acid (VIA) has been established as a low-cost method for identification of precancerous and/or cancerous lesions in settings where high-risk Human Papilloma Virus (HPV) testing and/or cytology is limited. The application of 3–5% acetic acid to the cervix results in the acetowhitening of lesions, caused by the precipitation of nuclear proteins [6,7]. Though the specificity of VIA can be as high as 80%, its sensitivity is highly variable, ranging anywhere from 30% to 70%, missing lesions that could advance to cancer [8].
A colposcope, a low power upright microscope mounted on a stand, is widely used for imaging the cervix in high-income settings (HICs). However, this device is not practical for use in LMIC settings owing to its cost and complexity. Low-cost devices are now available to address the lack of access to colposcopy. Imaging devices including the Enhanced Visual Assessment (EVA) system and Gynocular serve as portable, low-cost alternatives to colposcopy [9–13]. Like the colposcope, these devices are placed outside the vaginal canal at a 15 to 30 cm distance from the cervix. Our group has developed a Pocket Colposcope, which is inserted into the vaginal canal to image the cervix. Increasing proximity to the cervix through insertion allows for high contrast imaging without potential obstruction of the field of view from vaginal walls, non-uniform illumination owing to an external light source, or significant glare due to incidence of the light on reflective objects such as the speculum [14–17]. The images captured with these colposcopy alternatives are poised to improve visualization of lesions on the cervix over VIA. Further, the growing repository of images collected with these devices provides the opportunity to develop automated diagnostic algorithms in areas with limited or no access to expert physicians (to either evaluate in person or through telemedicine) [8].
The emergence of deep learning has transformed the field of image analysis, notably in clinical decision-making. Algorithms have the potential to aid provider decisions while also expanding the scale and scope of such tasks [8,18,19]. As private and public cervix image databases grow, algorithms for cervical cancer diagnosis and/or lesion detection/segmentation have continued to evolve [15,20,21]. However, a major challenge is the number of usable positive images to train these algorithms [15]. This is due to the low prevalence of disease, which is approximately 2–4% in the general population and 10–15% in screen-positive populations (those who test positive for high-risk HPV) [3–5,22]. Therefore, deep learning algorithms for automated detection of cervical lesions will necessarily rely on unconstrained datasets to take advantage of as many positive images as possible, collected across different devices and studies [23]. The inclusion of poorer-quality colposcopy images can negatively impact the verification and generalizability of these diagnostic models. Specifically, the introduction of blurry images can lead to unreliable predictive diagnoses if models do not have the capacity to handle such degradations [11,24,25]. A standardized manner of identifying colposcopy images that are not interpretable for diagnosis is a key step towards quality control in this domain.
An additional challenge to the development of deep learning algorithms is the number of interpretable cervical images. Interpretability can be impacted by several factors. The first factor is obstruction of the cervix by the vaginal walls or the speculum (a challenge for external imaging devices). The second issue is the lack of visibility of the squamocolumnar junction (SCJ) (an anatomical issue), which can render image interpretation inconclusive. The third factor is image quality, particularly blur, which can be impacted by the operator using the device, the stability of the device, and anatomical variations across patients. This factor alone has been shown to render a large fraction of cervical images unusable. Blur, caused by motion or lack of focus, has a significant impact on image recognition, both by humans and machines [26]. When images are blurry, deep learning networks may struggle to extract meaningful features and patterns, leading to decreased performance, particularly false-negative diagnoses [10,11,27]. However, unlike the other two cases, there is an opportunity to develop a quantitative metric of blur. This metric can prompt the provider to retake the image at the time of the exam and establish a threshold for the quality of images used in deep learning algorithms.
Few previous works have developed algorithms to evaluate the quality of cervical images. Guo et al. made use of an ensemble of deep learning architectures to select cervix images and prevent inclusion of non-cervix captures, i.e., object detection [11]. Guo et al. additionally assessed image focus using RetinaNet on Mobile ODT EVA system data alone [10]. This work was followed by Xue et al.'s use of ensemble architectures to classify cellphone, Mobile ODT EVA system, and digital colposcope images using a four-category label scheme [28]. These works are further summarized in Supplementary Table 1. These previous studies have presented important findings that motivate the need for image assessment, either for algorithm development or feedback during image capture. However, there are several challenges to translating these methods into a practical clinical setting. First, these techniques use complex architectures that are computationally expensive. Second, these methods require some level of manual annotation, which is subjective and time-intensive. Third, only one of the three previously published papers evaluated multiple devices, albeit using manual annotation, which is likely to be variable across devices. That study, by Xue et al., highlighted the data diversity needed for algorithm generalizability in this domain [28]. Our work seeks to build on these key investigations as well as address their limitations towards the goal of assessing cervical image quality at the point of care or for data selection for deep learning algorithms.
In this work, we describe a lightweight, simple model that can be integrated into image capture at the point of care, a method to automate ground truth generation without the need for expert annotation of input images, and strategies to generalize image assessment across multiple devices. This algorithm includes object detection to delineate the cervix from non-relevant areas such as the vaginal walls and the speculum. A Fast Fourier transform (FFT) based analysis serves as a simple, yet powerful method for automatic pseudo-labeling. This serves as the ground truth for our YOLOv5 blur classification algorithm. The YOLOv5 model was trained using Pocket Colposcope, Mobile ODT EVA, and standard of care (SOC) digital colposcope images. When tested with all devices, this model achieves a mean average precision score, sensitivity, and specificity of 0.9, 0.89, and 0.89, respectively. These YOLOv5 outputs were also compared to physician annotation, yielding an accuracy of 0.72 and a specificity of 0.92. Our model can serve as an automatic method for the detection of images with excessive blur that should either be recaptured at the time of the exam or removed from training sets for algorithm development. This method sets a standard for consistent identification of images too blurry for provider interpretation. This information can be propagated for the easy compilation of datasets for diagnostic algorithms. Taken together, these methods can assist in efficient data acquisition and image analysis, refining the development of deep learning algorithms.
2. Methods
2.1. Datasets and Annotation
Images used in this study were sourced from three different image acquisition tools: the Pocket Colposcope, a standard of care (SOC) digital colposcope, and Mobile ODT’s EVA smartphone.
The Pocket colposcopy image data set that was used to develop the blur detection algorithm consisted of 1,350 images collected from 1,235 patients across six countries (U.S., Peru, India, Tanzania, Zambia, and Honduras). SOC images using a digital colposcope (Leisegang Optik2 with an 18-megapixel camera; CooperSurgical, Inc., Trumbull, CT, USA) were collected in tandem with Pocket colposcope images at three of the six clinical sites (U.S., Peru, and India). Images were collected in public and private clinics where patients met positive screening criteria with either VIA, a Pap smear, or an HPV test [15,29]. The full diagnostic breakdown of the dataset used in this study is shown in Supplementary Figure 1. When imaging was performed with the Pocket Colposcope and the SOC colposcope, a speculum was used to orient and allow cervix visualization. After application of acetic acid, the SOC colposcope was used to capture images followed by the Pocket Colposcope or vice versa. The imaging procedure of the Pocket Colposcope was different from that of the digital colposcope. The slender form factor of the Pocket colposcope allowed for insertion through the speculum, into the vaginal canal, for image capture at a working distance of about 3.5 to 4.5 cm from the cervix. The camera feed on a connected cellphone or camera was used for image capture.
The Mobile ODT EVA images were sourced from the publicly available Kaggle cervix image database [30]; we used 1,651 of these images for algorithm development. This dataset was previously provided in the 2017 "Intel & MobileODT Cervical Cancer Screening Competition" towards the goal of classifying the images as three cervix types. These images were recorded by multiple providers in various regions, mostly for the purposes of documentation rather than clinical interpretation. The Mobile ODT EVA system can be handheld or used with a stabilizing stand [31,32].
2.2. Ground Truth Labeling with Fast Fourier Transform (FFT) Based Blur Analysis
Image annotation was conducted using knowledge of the frequency content of an image to produce a single score of image blur caused by either motion or loss of focus [33–35]. First, images were transformed to the Fourier domain. The transform was shifted and multiplied with a high-pass box filter to zero out its center and suppress low frequencies, as shown in Figure 1(b). The image was then reconstructed, and the magnitude spectrum was computed. A logarithmic transform was applied to the spectrum to compress the dynamic range. The mean of the log magnitude values was used to numerically score the focus and clarity of the image. Images of a life-size educational mannequin cervix were captured using the Pocket Colposcope and initially used to evaluate this threshold range. Images were captured at low and high clarity. To capture low clarity images, the Pocket Colposcope was placed outside of its working range to induce out-of-focus blur. High clarity images were then subsequently blurred using Gaussian filters with sigma (σ) ranging from 9 to 21. We used thresholds in the range of the global minimum and maximum to delineate each blurred image set from its clear counterpart. We calculated the accuracy, sensitivity, and specificity resulting from the use of each blur threshold. An overall threshold value was empirically determined by selecting a value that captured the peak accuracy for each blur level. A generous threshold value of 0 was chosen to prioritize classification with high accuracy and specificity. Images below this threshold value were classified as blurry. Images that were blurry at capture were compared to artificially blurred images. Approximately half of the Mobile ODT EVA and SOC colposcope images were blurry. However, only 25% of Pocket colposcope images had blur based on initial labeling. To ensure clear and blurry class balance, 50% of the Pocket Colposcope clear images in the training and testing sets were randomly blurred with a Gaussian filter with σ of 21.
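The labeling procedure above can be sketched in a few lines of numpy. The half-width of the zeroed center box and the small constant added before the logarithm are illustrative assumptions, not values reported in this work:

```python
import numpy as np

def fft_blur_score(gray, box=60):
    """Mean log magnitude after high-pass filtering in the Fourier domain.
    Lower scores indicate blurrier images; `box` (the half-width of the
    zeroed center region) is an assumed value."""
    h, w = gray.shape
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    cy, cx = h // 2, w // 2
    spectrum[cy - box:cy + box, cx - box:cx + box] = 0   # suppress low frequencies
    recon = np.fft.ifft2(np.fft.ifftshift(spectrum))     # reconstruct the image
    return float(np.mean(np.log(np.abs(recon) + 1e-12)))  # compress dynamic range
```

An image whose score falls below the chosen threshold (0 in this work) would then be pseudo-labeled as blurry.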
Figure 1. Fourier domain-based blur analysis.
The Pocket Colposcope was used to collect two sets of cervix mannequin images of high (N=41) and low (N=41) clarity. Artificial blur was applied to images with high visual clarity using Gaussian filters with increasing standard deviations of 9, 15, 17, 19, and 21. (a). Example images of a clear mannequin image, an artificially blurred mannequin image, and a low clarity mannequin image. The artificially blurred example was blurred with a standard deviation of 21. The average log image magnitude was analyzed for its use as a pseudo label with all seven sets of images. (b). Use of the Fourier transform space to quantify image clarity for a clear and blurry example image. Grayscale input images are transformed, the smooth image component is attenuated with a high-pass filter, and the mean magnitude is calculated upon reconstruction. (c). An analysis of variance yielded significant difference among the groups. A post hoc Tukey test was used to compare the multiple image treatments and images captured for low visual clarity. Groups are labeled with letters. Groups with the same letter label were not found to be significantly different from each other. Red plus signs indicate outliers. Images captured for high visual clarity differed from all others with a p<0.001. Similarly, images simulated with a Gaussian filter with σ=21 also differed from all other groups at the same significance. Images blurred with filters with σ=17 and σ=19 were not found to be statistically different from each other. These group means were higher than images blurred with σ=15 and images captured for low clarity. (d-i). Single feature classification results for the different blurred image groups, artificially and at capture, and high clarity images. Sensitivity, specificity, and accuracy curves across different thresholds are shown for each comparison in (d) clear and σ=9, (e) clear and σ=15, (f) clear and σ=17, (g) clear and σ=19, (h) clear and σ=21, and (i) clear and low visual clarity at capture.
Across the classification results with artificial blur, accuracy consistently peaked in a similar range, though this peak moved towards a lower mean log magnitude with increased blur. At σ=21, the comparison with the most distorted images, accuracy, sensitivity, and specificity overlapped over a wide range. Finally, when classifying with captured blurry images, accuracy, sensitivity, and specificity overlapped at a blur score of about 2 (Figure 1(i)). This analysis demonstrates the ability to use this single feature to conduct preliminary classification of clear and blurry images. A threshold can be established to differentiate clear from blurry images, especially at the extremes of image clarity.
2.3. Implementation of YOLOv5 Blur Detection
YOLOv5 is a later version of the popular YOLO object detection algorithm and uses several neural network optimization strategies [36–38]. Upon input, images undergo data preprocessing steps such as mosaic data augmentation, scaling, and color transformations [39]. Images were not processed to attenuate any additional image quality factors such as specular reflection or non-uniform illumination. The YOLOv5 algorithm is composed of three main components: the backbone, which formulates image features; the neck, which aggregates these features; and the head, which interprets them to perform class prediction. The YOLOv5s6 model was selected due to its balance of larger image input size and lower number of parameters. Images were randomly divided into training and testing sets with 80% of images in the training set and 20% of images in the test set. Images were resized to 1280×1280. The model used stochastic gradient descent (SGD) as an optimizer with a learning rate of 0.01, a mini-batch size of 32, and 300 epochs. The trained model predicted image clarity using binary labels with class-specific confidence scores. The precision, recall, F1 score, and mean average precision were calculated for the trained model. The algorithm was assessed on two external holdout sets, one consisting of Mobile ODT EVA images and another consisting of Pocket colposcope images from Kenya.
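As a sketch, the 80/20 split and the reported hyperparameters can be organized as below; the YOLOv5 command in the trailing comment follows the public ultralytics/yolov5 `train.py` interface, and the data file and weight names are hypothetical:

```python
import random

# Hyperparameters reported in this work
HYPERPARAMS = {"optimizer": "SGD", "lr": 0.01, "batch": 32,
               "epochs": 300, "img_size": 1280}

def split_dataset(image_ids, train_frac=0.8, seed=0):
    """Randomly divide image IDs into training (80%) and test (20%) sets."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

# A YOLOv5s6 training invocation would then resemble (paths hypothetical):
# python train.py --img 1280 --batch 32 --epochs 300 \
#     --data cervix_blur.yaml --weights yolov5s6.pt --optimizer SGD
```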
2.4. Validation of YOLOv5 with Expert Analysis
We collaborated with an expert physician and head gynecologist at the Duke University Hospital System (DUHS) to annotate a set of 135 Pocket Colposcope clinical images from a clinical study we recently performed in Kenya (used as the holdout Pocket colposcope dataset). Towards image quality, the physician was asked to classify the blur content and glare content using descriptions of "Small amounts," "Medium amounts," "Large amounts," or "None." In addition, the visibility of the transformation zone and/or the squamocolumnar junction (SCJ) was annotated. The physician was then asked if the image was interpretable and given a list of issues to select from if not. Beyond image quality, the physician also provided a colposcopic impression. They were asked about the presence of a lesion and asked to characterize possible lesions as cervical intraepithelial neoplasia (CIN) 1 or CIN 2/3+. The physician determined whether a biopsy, endocervical curettage, or no procedure should be performed. Finally, a rating of 1–10 was provided for their colposcopic impression confidence. The entirety of this schema is described in Supplementary Figure 2. The physician used this question set to annotate images using an interactive annotation tool. Images were also separately evaluated using the established YOLOv5 algorithm and the Fourier domain-based blur score. The results of the physician review were compared to both analyses for validation.
3. Results
3.1. FFT Based Blur Analysis
We implemented an FFT-based image analysis tool for ground truth labeling of image clarity. Here, images are described interchangeably as high clarity or clear and low clarity or blurry. The low frequency content in the Fourier transform of an image is responsible for the appearance of smooth areas in the image. Manipulating this content can provide a rough assessment of the impact of blur on an image. Information from the image Fourier domain was used to create criteria for describing image blur. The algorithm was assessed with high clarity and low clarity cervix mannequin images captured with the Pocket Colposcope. The high clarity images were sequentially blurred for controlled evaluation. Figure 1(a) shows an example high clarity image, artificially blurred image, and an example low clarity image. The log of the average magnitude of the manipulated image was used as a metric to describe blur degradation. As shown in Figure 1(b), the more blur an image contains, the lower this value is, as more content would have been removed by the applied high-pass filter. Figure 1(c) shows that the blur metric is depressed with applications of Gaussian blur with increasing sigma (σ) to the originally clear images. A post hoc Tukey test indicated a significant difference between the high clarity image scores and the scores of all image sets with simulated blur distortion. The low clarity images were not found to be significantly different from the image set blurred with a Gaussian filter with σ=15. Among the degraded images, the blur metric average did not decrease linearly, making it difficult to use as a graded measure of blur on its own. However, as this Fourier domain-based blur metric demonstrated the ability to delineate image clarity, we evaluated a threshold that could serve as a pseudo-label for a more robust blur algorithm.
In Figure 1(d–h), the accuracies, sensitivities, and specificities of this blur metric at different average image magnitude thresholds were determined by comparing the clear images with their blurred image counterparts as well as with images that were blurry at capture. Clear images could be differentiated from highly blurred images (σ=21) with high accuracy, specificity, and sensitivity over a broad range compared to low blur images (for example, σ=9). In Figure 1(i), the blur metric was able to differentiate clear images from images that were blurry at capture with an accuracy, sensitivity, and specificity of 0.98, 1.0, and 0.95, respectively, at an average image magnitude value of 4. These values were most comparable to images artificially blurred with σ=15, which at a threshold of 4 had an accuracy, sensitivity, and specificity of 0.97, 1.0, and 0.95, respectively. All groups were able to achieve an accuracy of at least 0.8. The best threshold on which to binarize the FFT algorithm output varied for each group with artificial blur. A threshold of 0 prioritized a high specificity (1.0 for all blur degradations) and captured the breadth of this metric's performance on varied datasets. We were able to demonstrate the suitability of this method as an initial blur analysis tool.
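The threshold sweep behind Figure 1(d–i) can be sketched as follows, treating blurry as the positive class and classifying any score below the threshold as blurry (function and variable names are our own):

```python
import numpy as np

def sweep_thresholds(clear_scores, blurry_scores, thresholds):
    """Accuracy/sensitivity/specificity at each candidate blur-score
    threshold. A score below the threshold is classified as blurry
    (the positive class)."""
    clear = np.asarray(clear_scores, dtype=float)
    blurry = np.asarray(blurry_scores, dtype=float)
    results = []
    for t in thresholds:
        tp = int(np.sum(blurry < t))   # blurry correctly flagged
        fn = blurry.size - tp
        tn = int(np.sum(clear >= t))   # clear correctly passed
        fp = clear.size - tn
        acc = (tp + tn) / (clear.size + blurry.size)
        sens = tp / (tp + fn) if tp + fn else float("nan")
        spec = tn / (tn + fp) if tn + fp else float("nan")
        results.append({"threshold": t, "accuracy": acc,
                        "sensitivity": sens, "specificity": spec})
    return results
```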
3.2. Implementation of the YOLOv5 Blur Detection
We developed a YOLOv5 object detection model to isolate the cervix region of interest (ROI) and determine the image clarity of that region. We have previously used the YOLOv3 model to pre-process images by detecting an ROI to be analyzed by a diagnostic algorithm [15]. A portion of the Pocket Colposcope data used here was applied towards that previous goal. A full diagnostic breakdown of the dataset is shown in Supplementary Figure 1. A total of 3,425 multi-device images, summarized in Table 1, were split into a training set with 2,740 images and a test set with 685 images. Additional images were reserved as hold out sets to further evaluate the model. Here, training and testing sets refer to images used to develop the YOLOv5 model; hold out sets refer to images used to run inference on the YOLOv5 model that were not involved in model training. Images were labeled as either blurry or not blurry using an average image magnitude threshold of zero calculated from the FFT analysis. This threshold was selected based on evaluation of this score on simulated and non-simulated datasets and thresholds that produced peak accuracy, as shown in Figure 1. On initial review, approximately 50% of SOC and 40% of Mobile ODT EVA images were blurry. However, less than 25% of the Pocket Colposcope dataset contained some blur at capture. To ensure a class balance, 50% of the entire Pocket Colposcope image set, 675 images, were randomly blurred with σ=21 to ensure an equal balance in the clear and blurry image subsets. A sigma of 21 was chosen to ensure that blur would be generously induced and images truly degraded. No blur simulation was applied to hold out images. Representative images of low and high clarity for each device are shown in Figure 2.
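The class-balancing step can be sketched as below. The separable Gaussian convolution is a plain-numpy stand-in for whichever blur implementation was actually used, and the function names are illustrative:

```python
import random
import numpy as np

def gaussian_blur(img, sigma=21):
    """Separable Gaussian blur (numpy stand-in for the blur filter)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    out = np.apply_along_axis(np.convolve, 1, img, kernel, mode="same")
    out = np.apply_along_axis(np.convolve, 0, out, kernel, mode="same")
    return out

def balance_with_blur(clear_images, blurry_images, frac=0.5, sigma=21, seed=0):
    """Randomly blur a fraction of the clear images and move them to the
    blurry class (frac=0.5 mirrors the Pocket Colposcope balancing step)."""
    rng = random.Random(seed)
    order = list(range(len(clear_images)))
    rng.shuffle(order)
    to_blur = set(order[:int(len(order) * frac)])
    new_clear, new_blurry = [], list(blurry_images)
    for i, img in enumerate(clear_images):
        if i in to_blur:
            new_blurry.append(gaussian_blur(img, sigma))
        else:
            new_clear.append(img)
    return new_clear, new_blurry
```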
Table 1. Training and testing colposcopy dataset.
The Pocket and standard of care colposcopy dataset was compiled from a screen-positive population across six countries in Asia, Africa, North America, and South/Central America. Due to class imbalance, half of the total Pocket Colposcope dataset was artificially blurred. The low clarity subset of the Pocket Colposcope dataset includes these artificially blurred images and any blurry-at-capture images, allowing for more representation of degraded images in this portion of the dataset. The Mobile ODT EVA dataset was compiled from the publicly available Kaggle dataset. Images were compiled to maximize the size of each dataset. Images were randomly separated into training and testing subsets using a common 80/20 split. Images were pseudo-labeled as clear or blurry using the FFT-based blur metric with a threshold of zero. Gynocular data was not included as there is no publicly available database.
| Imaging Device | Training/Testing Set: High Clarity (N=1834) | Training/Testing Set: Low Clarity (N=1591) | Training/Testing Set: Total (N=3425) | Pocket Colposcope Hold Out Set | Mobile ODT EVA Hold Out Set | Expert Reviewed Hold Out Set |
|---|---|---|---|---|---|---|
| Pocket Colposcope | 648 | 702 | 1350 | 252 | 0 | 135 |
| SOC Digital | 217 | 207 | 424 | 0 | 0 | 0 |
| Mobile ODT EVA | 969 | 682 | 1651 | 0 | 239 | 0 |
Figure 2. Example device images.
Example high clarity and low clarity images for each imaging device. Multiple factors can affect the size and shape of the cervix, but the average cervix measures about 3–4 cm in length and 2.5 cm in diameter.
For each training and testing iteration, we trained with a batch size of 32 over 300 epochs. Model training was halted when validation loss plateaued or began to increase, to prevent overfitting. Loss curves were inspected for insight into this aspect of model performance (Supplementary Figure 4). Figure 3 displays the performance and evaluation metrics of the test set when trained on images for a given device (Pocket colposcope or Mobile ODT EVA), and in the case where images from all three devices are represented in the training set. Based on the evaluation metrics, the YOLOv5 model performed the best when trained and tested on the Pocket Colposcope. With this data subset, it achieved a mean average precision (mAP) of 0.98 and an F1-score of 0.95. For both classes within the Mobile ODT EVA dataset, the YOLOv5 model performed with an overall mean average precision of 0.85 and an F1-score of 0.85. Standard of care images were not evaluated alone because of the small size of this subset. When the model was trained and tested on the complete dataset including Pocket Colposcope images, Mobile ODT EVA images, and SOC images, the mAP and F1-score were both 0.9.
Figure 3. Training and testing of YOLOv5 blur detection model.
The YOLO model was trained and tested on three dataset iterations: Mobile ODT EVA images alone; Pocket colposcope images alone; and the full dataset of Pocket colposcope images, Mobile ODT EVA images, and standard of care images. The standard of care digital colposcope dataset was not evaluated alone due to its limited size. Images from each imaging source were divided between training and validation on an 80/20 split. Images that were used for training in a single device evaluation remained in that split for the training on the complete dataset and vice versa. The model achieved the highest mean average precision (mAP) of 0.98 when trained and tested with Pocket Colposcope images alone. Additionally, this model attained an accuracy of 0.94 and an F1-score of 0.95, with similar mAP scores for blurry and clear images individually. Training on the entirety of the available dataset provided the second-best performance with a mAP score of 0.9. For training with Mobile ODT EVA data alone, the model achieved a mAP of 0.85 for all images. It performed slightly better on clear image prediction with a mAP of 0.9. Training with this dataset produced an accuracy of 0.86 and an F1-score of 0.85. The Mobile ODT EVA training dataset was slightly larger than that of the Pocket images alone (see Table 1).
3.3. Validation of YOLOv5 on Additional Datasets
A portion of the Mobile ODT EVA and Pocket images were withheld for validation of the trained and tested YOLOv5 model. The withheld Pocket colposcope images were all obtained from clinical studies conducted in Kenya. These images were not sourced from the same sites as the training and test sets and therefore serve as a true holdout set. However, the Mobile ODT EVA image holdout set consisted of a reserved portion of the original training/testing set. Further, we ensured that there was no image or patient overlap between the training/validation and holdout datasets for the Mobile ODT EVA images. These additional datasets are also summarized in Table 1. These image sets were separately evaluated using the three different models trained in Figure 3. They were labeled in the same manner as the training datasets, and the results are shown in Figure 4. The best performance is achieved when the images in the holdout and training data sets are from the same device, for both the Pocket colposcope (Figure 4(a)) and the Mobile ODT EVA (Figure 4(b)). The converse is true when the images in the hold out set are from a different device from that used in the training set. Training across multiple devices improved the performance for both the Pocket colposcope (mAP of 0.81) and Mobile ODT EVA (mAP of 0.83) images. For Pocket Colposcope images, training across devices produced accuracy, sensitivity, and specificity of 0.78, 0.6, and 0.98, respectively. For Mobile ODT EVA images, validation with this model produced accuracy, sensitivity, and specificity of 0.85, 0.81, and 0.88, respectively. Evaluation of Mobile ODT EVA images showed that the increased training data diversity from a single device to all three devices caused a drop in precision, possibly due to image batches with higher false positive calls. Example images for true negatives (true clear), true positives (true blurry), false negatives, and false positives are shown in Figure 4(c).
Figure 4. Hold out set performance metrics by dataset.
To evaluate the impact of varying data contributions, YOLO models trained with different data iterations were validated on different hold out sets. The hold out set image counts are given in Table 1. Pocket colposcope images were acquired more recently in an imaging study in Kenya. The Mobile ODT EVA images came from the same publicly available, previously described Kaggle dataset. (a). Validation of trained models on the Pocket colposcope image hold out set (N=252). As in Figure 3, the model was trained on Mobile ODT EVA images alone, Pocket colposcope images alone, and the complete dataset. SOC data contribution was not evaluated in this manner due to the limited size of the dataset. The model trained with the complete dataset achieved a mAP of 0.81; in addition, it produced accuracy, sensitivity, and specificity of 0.78, 0.6, and 0.98, respectively. (b). Validation of trained models on the Mobile ODT EVA image hold out set (N=239). The model was trained on Pocket colposcope images alone, Mobile ODT EVA images alone, and the complete train/test dataset. As expected, performance increased with additional data, achieving the highest mAP, 0.83, with training on all compiled data. The introduction of more data also caused some drops in precision, as evidenced by the pattern of the precision-recall curve. Validation with this model also produced accuracy, sensitivity, and specificity of 0.85, 0.81, and 0.88, respectively. (c). Example true negative, true positive, false negative, and false positive validation outputs from the YOLO model trained and validated on the complete dataset. Images are output with an ROI box, the predicted class, and the predicted class probability. The lighter pink boxes indicate detection of blurry cervix ROIs while the dark red boxes indicate detection of clear cervix ROIs.
3.4. Comparison of Blur Detection Methods with Expert Analysis
A subset of Pocket Colposcope images (N=135) from the same Kenya clinical site was provided to an expert physician for evaluation. Like the holdout set, this dataset was from a new clinical site that was not represented in the YOLOv5 training/test sets. For these images, the expert physician completed a provided annotation schema (Supplementary Figure 2) that included questions on image quality metrics and diagnostic information. Examples of annotated images are shown in Figure 5. The FFT-based blur method and the YOLOv5 model were further evaluated against this expert annotation. To approximate accuracy, images annotated with “No blur” or “Small amount of blur” were labeled as clear while the remaining images were labeled as blurry. Figure 6 outlines the results of these comparisons. In Figure 6(a), we show the binarized FFT blur metric output in each blur annotation category. The strongest classification using this blur detection can be made at the extremes. Results are displayed on a continuous scale in Supplementary Figure 3. As a result, this blur detection method may be able to predict which images can be annotated with confidence, i.e., those with a blur score above 0. Using a binary prediction threshold of 0 yielded an accuracy of 0.69 and a specificity of 0.6. Figure 6(b) displays some delineation between the FFT log average magnitude blur score of images that could not be annotated, i.e., images labeled “Cannot determine,” and that of the others. However, this difference is limited, especially when considering images that were labeled as having a lesion. Figure 6(c) compares the four-category qualitative blur annotation of the physician with the binary output of the YOLO algorithm. Most images determined to be clear by the YOLOv5 model were concentrated in the “no blur,” “small amount of blur,” and “medium amount of blur” physician annotations.
Further, the majority of images determined to be blurry were concentrated in the “large amount of blur” physician category. The greatest discrepancy occurred for images described as having “medium amounts of blur.” These YOLOv5 binary outputs were also compared to the physician annotation, yielding an accuracy of 0.72 and a specificity of 0.92. Results shown in Figure 6(d) reinforce the utility of the YOLOv5 automatic blur detection method by comparing the binary blur designation with the physician’s ability to outline a lesion. Regardless of the blur description, the blur designation of the algorithm is a good indicator of the provider’s ability to outline a lesion.
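The FFT-based blur score used in these comparisons rates an image by its residual high-frequency content. A minimal numpy sketch, assuming the score is the log mean spectral magnitude after suppressing a central low-frequency block; the `keep_out` parameter and the absolute score scale (and hence the placement of the 0 threshold) are assumptions, as they depend on image size and normalization:

```python
import numpy as np

def fft_blur_score(gray):
    """Log mean magnitude of high-frequency FFT content of a 2-D grayscale image.

    A central low-frequency block is zeroed out so the score reflects only
    high-frequency detail; sharper images yield higher scores. The half-width
    of the block (keep_out) is an assumed parameter, not the study's value.
    """
    keep_out = 8
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    cy, cx = spectrum.shape[0] // 2, spectrum.shape[1] // 2
    spectrum[cy - keep_out:cy + keep_out, cx - keep_out:cx + keep_out] = 0
    mag = np.abs(spectrum)
    return np.log(mag[mag > 0].mean())

def is_clear(gray, threshold=0.0):
    """Binarize the score; the study used a clarity threshold of 0 on its scale."""
    return fft_blur_score(gray) > threshold
```

Because blurring of any kind attenuates high frequencies, the same score serves as a pseudo label for both motion and focus blur, while smooth anatomy (e.g., mature squamous epithelium) can depress the score even in focused images, which is the limitation discussed below.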
Figure 5. Example annotation of Pocket Colposcope images.
Representations of images annotated as (a). No blur, (b). Small amount of blur, (c). Medium amount of blur, and (d). Large amounts of blur by an expert physician.
Figure 6. Expert physician validation of blur detection methods.
(a). Comparison of the binarized FFT-based algorithm output and expert physician annotations of blur. The log average image magnitude was calculated for a Pocket Colposcope image set (N=135) and a clarity threshold of 0 was applied. This blur detection method worked best when used to isolate the extremes of the annotated blur range, “no blur” and “large amounts of blur.” (b). Comparison of the binarized FFT-based algorithm output and the lesion determination annotation. While the averages of these scores provided some ability to predict whether an image could be annotated, the difference between groups was difficult to consistently define. (c). Comparison of YOLO detection and expert provider image evaluation. Inference was run on the same additional set of patient images collected with the Pocket Colposcope and the outcome compared to provider image interpretation. There was almost complete alignment on physicians’ interpretation of images with no blur. There were discrepancies for images the physician identified as having some amount of blur. (d). Comparison of lesion determination annotation and YOLO detection. Despite the discrepancies in blur description, the YOLO model had a high accuracy in detecting images within which physicians felt comfortable identifying a lesion. In the “Cannot determine” category, most images were rendered uninterpretable due to the degree of blur degradation. The remaining images in this category that were found to be clear were primarily uninterpretable due to the orientation of the squamocolumnar junction. This indicates alignment between the model’s object detection and colposcopic impression confidence.
4. Discussion
When fed substantial amounts of data, neural networks can automatically extract diagnostic patterns that indicate the presence of premalignant lesions of the cervix [40,41]. While convenient and effective, these models are affected by input image quality. Due to the diversity of medical equipment, settings, and users, image artifacts such as blur are frequently introduced. Towards the goal of detecting low-clarity images, we have developed an algorithm that provides a binary designation of blurry or clear images across three devices: a standard colposcope and two low-cost point-of-care devices, the Mobile ODT EVA (an external imaging platform) and the Pocket Colposcope (an insertable device). This algorithm includes an FFT-based analysis for ground truth labeling and a YOLOv5 object detection algorithm for blur classification and ROI delineation. We show that the algorithm-based designation of image quality aligns well with the ability of an expert physician to interpret the image.
The Mobile ODT EVA system is an external device and, therefore, its images can be affected by motion blur owing to free-hand imaging. The system can also focus on irrelevant elements, resulting in focus blur. The Pocket Colposcope, by contrast, rests on the speculum, minimizing motion artifacts; however, placement of the device outside of the working distance can lead to focus blur. Additionally, for both devices, the orientation of the cervix may cause some portions to be more in focus than others [17,31]. Both motion and focus blur have similar impacts on the high frequency content of the Fourier domain [33–35]. Therefore, FFT analysis can be used to create a pseudo label for images with both motion and focus blur using metrics quantified in the Fourier domain. Here, we used this FFT analysis to automatically generate ground truth labels and constrain bias that could otherwise impact training. The pseudo label was validated with artificial blurring, a technique useful for controlling model inputs. This method was useful for initial analysis, especially when identifying very clear and very blurry images, and provides a significant advantage over the manual annotation used by previous groups.
Image features exclusive to cervix colposcopy images limit the use of this simple quantitative score. For example, some cervixes, especially in older populations, have smooth mature squamous epithelium and less of the more textured columnar epithelium [42–44]. These smoother regions contribute to the low frequency information in an image and confound an averaged metric. Similarly, the presence of columnar epithelium on the ectocervix can contribute textural edges that are not accounted for when using this method to describe blur [21,45]. Further, the angle of the cervix and the resulting distance from the detector can result in non-uniform blur. As per our expert gynecologist’s annotation, the quality of these kinds of images is determined by the placement of this blur. The limitations of the Fourier algorithm motivated the development of the YOLOv5 model for blur detection.
The deep learning YOLOv5 blur detection technique provides binary feedback on image clarity, designating images as either “Clear” or “Blurry.” Three models were trained using different combinations of the available Pocket Colposcope, Mobile ODT EVA, and standard of care images. The model with the most data heterogeneity, which included images from all devices, had a mean average precision of 0.9 and an F1-score of 0.9, comparable to the few other studies in this arena [10,28]. We validated this model on holdout sets which, though captured with the same imaging devices, did not overlap with the training/testing sets. A holdout set was not created for standard colposcopy images because of the relatively small sample size compared to that available for the other two devices. The Mobile ODT EVA images in the holdout set consisted of a reserved portion of the original training/testing set. In the case of the Pocket Colposcope, however, the holdout images were collected at a clinical site that was completely independent from those represented in the training/testing sets. Therefore, the Pocket Colposcope holdout set provided a more rigorous evaluation of the blur detection algorithm.
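Training and holdout evaluation with the public YOLOv5 repository [39] use its standard command-line interface; the following is a sketch only, in which the dataset YAML files (`cervix_all.yaml`, `pocket_holdout.yaml`) describing the two classes (clear and blurry cervix ROIs) and the weight paths are hypothetical names, not the study's actual files:

```shell
# Clone the YOLOv5 repository and install its dependencies
git clone https://github.com/ultralytics/yolov5 && cd yolov5
pip install -r requirements.txt

# Train the small model (yolov5s) on the combined multi-device set
python train.py --img 640 --batch 16 --epochs 100 \
    --data cervix_all.yaml --weights yolov5s.pt

# Evaluate the trained weights on a device-specific holdout set
python val.py --weights runs/train/exp/weights/best.pt \
    --data pocket_holdout.yaml --img 640
```

The `val.py` script reports the per-class and mean average precision along with the precision-recall curves used in Figures 3 and 4.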
Most images identified as blurry by the algorithm were not interpretable by the provider. On the other hand, images identified as clear were more likely to be interpretable, regardless of the provider’s blur assessment. In addition, of the 18 images that the YOLO algorithm found to be “Clear” but that could not be evaluated diagnostically, 13 had SCJ visibility reported as the primary quality issue; the remaining five reported glare as the primary issue impacting colposcopic evaluation. This indicates the importance of these additional anatomical features, which can render blur-free images non-interpretable. All images where blur alone interfered with physician interpretability were identified by the YOLOv5 algorithm as “Blurry.” There were 15 images that the YOLOv5 algorithm labeled as “Blurry” that the provider was able to interpret in some manner. Of these images, four were labeled as having small amounts of blur, five were labeled as having medium amounts of blur, and five were labeled as having large amounts of blur. Of the five images described as having large amounts of blur, the provider reported the possible presence of a lesion but could not ascertain the lesion type, e.g., low-grade or high-grade squamous intraepithelial lesion (SIL).
The impact of blur on diagnostic model outputs is documented in prior research referenced in our work. In Guo et al.’s work on automatic cervix image selection, blur was noted to cause misclassification of cervix images [11]. Xue et al. also noted the detrimental impact of blur on automatic visual analysis of smartphone colposcopy images [31]. While this impact has been acknowledged in the domain, to our knowledge there is limited systematic analysis of the effect of blur on the confidence of image capture and evaluation in a clinical setting, the most widely used method for cervical cancer detection. Our study therefore focused on developing a set of automated blur assessment tools that could be compared to the confidence of an expert colposcopist assessing cervix images. This method sets a standard for consistent identification of images too blurry for provider interpretation, information that can be propagated for the easy compilation of datasets for diagnostic algorithms. Future study in this area can improve image acquisition and underscore the importance of image pre-processing.
Given the sparse number of positive images in the general population, it is critical to ensure quality control of every image collected during the clinical exam. The blur algorithm can be integrated into the image capture software to provide real-time feedback to the provider so that they can retake the image if required. A pre-trained model can be deployed directly on a cellphone to provide immediate image description at a high inference speed, averaging 12.8 ms per image, to providers of varied medical backgrounds. This enhances patient care by improving the quality of patient records and making post-examination image diagnostics possible. An important attribute of any blur detection algorithm used at the time of image capture is therefore the speed at which it provides feedback. Guo et al. assessed multiple deep learning architectures for binary focus detection in cervix images collected with the Mobile ODT EVA system [10,31]. Images were annotated for clarity using quantized outputs from trained classifiers, including RetinaNet, Inception, VGG, and transfer learning models, and cervix regions were automatically detected using the pre-trained Faster R-CNN. These detected regions were used as ground truth for the examined object detection network, RetinaNet, and as inputs for the other architectures. Their evaluation found that RetinaNet, the object detection model, performed better than the other evaluated deep learning architectures. Our work takes an alternate approach in which a rapid FFT algorithm is used for pseudo labeling. We use this labeling approach to develop and validate the extended use of object detection networks, with the additional evaluation of the computationally efficient, portable, and modifiable YOLOv5 [39]. The smaller YOLOv5 model, with a relatively low 12.3 million parameters, provides the potential for fast image examination with reduced computational complexity.
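Because feedback speed matters at the point of capture, per-image latency is worth measuring directly. A small, framework-agnostic sketch in which `infer` stands in for any loaded detector; the 12.8 ms figure above was measured on the study's own setup, not with this snippet:

```python
import time

def mean_latency_ms(infer, images, warmup=3):
    """Average per-image inference time in milliseconds.

    `infer` is any callable detector (for example, a loaded YOLOv5 model);
    a few warmup calls are discarded so startup costs do not skew the mean.
    """
    for img in images[:warmup]:
        infer(img)
    start = time.perf_counter()
    for img in images:
        infer(img)
    return (time.perf_counter() - start) * 1000.0 / len(images)
```

Profiling on the target phone hardware, rather than a workstation, is what determines whether real-time retake prompts are feasible in the clinical workflow.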
Blur detection using the YOLOv5 model shows particular promise for real-time detection in resource-constrained environments, as its smaller number of model parameters allows faster inference speed while maintaining accuracy [46,47].
The impact of image quality issues such as specular reflection and non-uniform illumination remains a confounding factor. The Pocket Colposcope can mitigate the effects of specular reflection and non-uniform lighting, as it is placed near the cervix to obtain well-lit images and the angle of the device can be easily manipulated to minimize blur. The degree to which these issues affect image analysis is greater with external imaging devices. For specular reflection, pre-processing techniques that accurately isolate and inpaint impacted pixels can attenuate its impact. Poor lighting is a key issue in colposcopy image quality. In the study reported by Guo et al., poor lighting was found to be the primary obstacle to identifying the presence of a cervix in an image. Images used in that study experienced variations in lighting partially due to the distance from the cervix at which images were captured [11]. Gao et al. and Xiong et al. demonstrate potential methods for the extraction and correction of image illumination components that can reduce shadow distortion and enhance image detail [48,49].
Further research is warranted to make the blur algorithm usable in a practical clinical setting for image quality control and/or assessment. Future efforts will focus on continued development of a quality control toolbox that addresses the primary issues affecting cervix image capture, which can propagate to any machine learning analysis, using the methods described above as well as those developed by our group. We will investigate how these techniques can be integrated to maximize usable image datasets while standardizing poor-quality image removal. We will also investigate and quantify the impact of each of these degradations, together and alone, on cervical cancer diagnostic algorithm performance. This work will contribute the information needed to better inform medical imaging and artificial intelligence applications in this domain.
4.1. Conclusion
Alongside global HPV vaccination efforts, the best strategy for lowering the cervical cancer burden is increasing access to cervical cancer screening and care. For colposcopy screening, strides have been made to provide practitioners in low-resource contexts with reliable, portable imaging tools. With these advances, however, standardization of image collection, whether for further provider review or for deep learning diagnostic algorithm development, has become a burgeoning need. This work examines blur detection methods applied to images collected with multiple devices, using portable, accessible algorithms. Incorporation of this algorithm in the clinical workflow can allow providers at all levels of training to capture high quality images at the point of care. Consequently, colposcopy image datasets can grow and require less data removal due to poor quality at the time of data collection. In addition, as colposcopy datasets are compiled, there is a need for pre-processing tools to review image quality. Currently, in deep learning colposcopy classification tasks, manual review is widely used to apply any quality criteria. This blur detection algorithm can reduce the subjectivity of, and the time required for, image annotation. Taken together, the clinically informed quality control process we have developed can be used at the point of care, can enhance the healthcare pipeline, especially by increasing the feasibility of telehealth options, and can aid in the curation of high-quality data for diagnostic algorithm development.
Supplementary Material
Highlights.
Deep learning algorithms for automatic cervical cancer diagnosis are limited by the quality of input images.
The use of Fourier domain-based image analysis and the YOLOv5 object detection network is proposed for automatic, standardized blur detection across multiple point-of-care imaging devices.
Experimental results coupled with expert physician validation demonstrate favorable performance of this blur detection method.
Proposed methods can be used to curate current databases and refine image capture with various devices at the time of capture.
Acknowledgements
Clinical partners that supported data collected in this study included: the study team at PATH, Manuel Sandoval with Asociación Hondureña de Planificación de Familia and Jacqueline Figueroa with the Secretaría de Salud de Honduras; Neerja Bhatla with the All India Institute of Medical Sciences Department of Obstetrics and Gynecology; Silvia San Jose and Jose Jeronimo with the National Institutes of Health; Gino Venegas Rodriguez with Liga Peruana de Lucha Contra El Cancer; Groesbeck Parham with the UNC Department of Obstetrics and Gynecology; Megan Huchko and John Schmitt with the Duke University Department of Obstetrics and Gynecology; and Olola Oneko with the Kilimanjaro Christian Medical Center Hospital. The work of this study was partially supported by NSF, ONR, Simons Foundation, and NGA.
References
- [1]. Sung H et al., “Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries,” CA Cancer J Clin, vol. 71, no. 3, pp. 209–249, 2021. 10.3322/CAAC.21660.
- [2]. Zhao M et al., “Global, regional, and national burden of cervical cancer for 195 countries and territories, 2007–2017: findings from the Global Burden of Disease Study 2017,” BMC Womens Health, vol. 21, no. 1, 2021. 10.1186/S12905-021-01571-3.
- [3]. Arbyn M et al., “Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis,” Lancet Glob Health, vol. 8, no. 2, pp. e191–e203, 2020. 10.1016/S2214-109X(19)30482-6.
- [4]. Tao L et al., “Prevalence and risk factors for cervical neoplasia: a cervical cancer screening program in Beijing,” BMC Public Health, vol. 14, no. 1, 2014. 10.1186/1471-2458-14-1185.
- [5]. Gargano JW et al., “Trends in High-grade Cervical Lesions and Cervical Cancer Screening in 5 States, 2008–2015,” Clin Infect Dis, vol. 68, no. 8, pp. 1282–1291, 2019. 10.1093/CID/CIY707.
- [6]. World Health Organization, “WHO guideline for screening and treatment of cervical pre-cancer lesions for cervical cancer prevention, Second Edition,” Geneva: World Health Organization, 2021.
- [7]. Balli C et al., “Transformation Zone Assessment Using Visual Inspection With Acetic Acid Before and After Thermal Ablation: Implications for Follow-Up,” JCO Glob Oncol, no. 9, 2023. 10.1200/go.22.00241.
- [8]. Xue P, Ng MTA, and Qiao Y, “The challenges of colposcopy for cervical cancer screening in LMICs and solutions by artificial intelligence,” BMC Med, vol. 18, no. 1, pp. 1–7, 2020. 10.1186/S12916-020-01613-X.
- [9]. Mink J and Peterson C, “MobileODT: a case study of a novel approach to an mHealth-based model of sustainable impact,” Mhealth, vol. 2, p. 12, 2016. 10.21037/MHEALTH.2016.03.10.
- [10]. Guo P, Singh S, Xue Z, Long R, and Antani S, “Deep learning for assessing image focus for automated cervical cancer screening,” 2019 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI 2019), May 2019. 10.1109/BHI.2019.8834495.
- [11]. Guo P et al., “Ensemble Deep Learning for Cervix Image Selection toward Improving Reliability in Automated Cervical Precancer Screening,” Diagnostics (Basel), vol. 10, no. 7, 2020. 10.3390/DIAGNOSTICS10070451.
- [12]. Nessa A et al., “Evaluation of the accuracy in detecting cervical lesions by nurses versus doctors using a stationary colposcope and Gynocular in a low-resource setting,” BMJ Open, vol. 4, no. 11, p. e005313, 2014. 10.1136/BMJOPEN-2014-005313.
- [13]. Nessa A et al., “Evaluation of Stationary Colposcope and the Gynocular, by the Swede Score Systematic Colposcopic System in VIA Positive Women: A Crossover Randomized Trial,” International Journal of Gynecological Cancer, vol. 24, no. 2, p. 339, 2014. 10.1097/IGC.0000000000000042.
- [14]. Lam CT et al., “An integrated strategy for improving contrast, durability, and portability of a Pocket Colposcope for cervical cancer screening and diagnosis,” PLoS One, vol. 13, no. 2, 2018. 10.1371/JOURNAL.PONE.0192530.
- [15]. Skerrett E et al., “Multicontrast Pocket Colposcopy Cervical Cancer Diagnostic Algorithm for Referral Populations,” BME Front, 2022. 10.34133/2022/9823184.
- [16]. Asiedu MN et al., “Development of Algorithms for Automated Detection of Cervical Pre-Cancers with a Low-Cost, Point-of-Care, Pocket Colposcope,” IEEE Trans Biomed Eng, vol. 66, no. 8, pp. 2306–2318, 2019. 10.1109/TBME.2018.2887208.
- [17]. Mueller JL et al., “Portable Pocket colposcopy performs comparably to standard-of-care clinical colposcopy using acetic acid and Lugol’s iodine as contrast mediators: an investigational study in Peru,” BJOG, vol. 125, no. 10, pp. 1321–1329, 2018. 10.1111/1471-0528.15326.
- [18]. Saini SK, Bansal V, Kaur R, and Juneja M, “ColpoNet for automated cervical cancer screening using colposcopy images,” Mach Vis Appl, vol. 31, no. 3, pp. 1–15, 2020. 10.1007/S00138-020-01063-8.
- [19]. Bai B, Du Y, Liu P, Sun P, Li P, and Lv Y, “Detection of cervical lesion region from colposcopic images based on feature reselection,” Biomed Signal Process Control, vol. 57, p. 101785, 2020. 10.1016/J.BSPC.2019.101785.
- [20]. Cho BJ et al., “Classification of cervical neoplasms on colposcopic photography using deep learning,” Sci Rep, vol. 10, no. 1, pp. 1–10, 2020. 10.1038/s41598-020-70490-4.
- [21]. Yue Z, Ding S, Li X, Yang S, and Zhang Y, “Automatic Acetowhite Lesion Segmentation via Specular Reflection Removal and Deep Attention Network,” IEEE J Biomed Health Inform, vol. 25, no. 9, pp. 3529–3540, 2021. 10.1109/JBHI.2021.3064366.
- [22]. Perkins RB et al., “Use of risk-based cervical screening programs in resource-limited settings,” Cancer Epidemiol, vol. 84, p. 102369, 2023. 10.1016/J.CANEP.2023.102369.
- [23]. Castiglioni I et al., “AI applications to medical images: From machine learning to deep learning,” Physica Medica, vol. 83, pp. 9–24, 2021. 10.1016/J.EJMP.2021.02.006.
- [24]. Vasiljevic I, Chakrabarti A, and Shakhnarovich G, “Examining the Impact of Blur on Recognition by Convolutional Networks,” Nov. 2016. Available: https://arxiv.org/abs/1611.05760v2
- [25]. Grm K, Struc V, Artiges A, Caron M, and Ekenel HK, “Strengths and weaknesses of deep learning models for face recognition against image degradations,” IET Biom, vol. 7, no. 1, pp. 81–89, 2018. 10.1049/IET-BMT.2017.0083.
- [26]. Dodge S and Karam L, “Understanding how image quality affects deep neural networks,” 2016 8th International Conference on Quality of Multimedia Experience (QoMEX 2016), Jun. 2016. 10.1109/QOMEX.2016.7498955.
- [27]. Liu R, Li Z, and Jia J, “Image partial blur detection and classification,” 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008. 10.1109/CVPR.2008.4587465.
- [28]. Xue Z et al., “Image Quality Classification for Automated Visual Evaluation of Cervical Precancer,” Medical Image Learning with Limited and Noisy Data (MILLanD 2022, held in conjunction with MICCAI 2022), vol. 13559, p. 206, 2022. 10.1007/978-3-031-16760-7_20.
- [29]. Mueller JL et al., “International Image Concordance Study to Compare a Point of Care Tampon Colposcope to a Standard-of-Care Colposcope,” J Low Genit Tract Dis, vol. 21, no. 2, p. 112, 2017. 10.1097/LGT.0000000000000306.
- [30]. “Intel & MobileODT Cervical Cancer Screening Competition,” Kaggle.
- [31]. Xue Z et al., “A demonstration of automated visual evaluation of cervical images taken with a smartphone camera,” Int J Cancer, vol. 147, no. 9, pp. 2416–2423, 2020. 10.1002/IJC.33029.
- [32]. Madiedo M, Contreras S, Villalobos O, Kahn BS, Safir A, and Levitz D, “Mobile colposcopy in urban and underserved suburban areas in Baja California,” Proc. SPIE, vol. 9699, pp. 56–60, 2016. 10.1117/12.2218697.
- [33]. Mavridaki E and Mezaris V, “No-reference blur assessment in natural images using Fourier transform and spatial pyramids,” 2014 IEEE International Conference on Image Processing (ICIP 2014), pp. 566–570, 2014. 10.1109/ICIP.2014.7025113.
- [34]. Yitzhaky Y and Kopeika NS, “Identification of Blur Parameters from Motion Blurred Images,” Graphical Models and Image Processing, vol. 59, no. 5, pp. 310–320, 1997. 10.1006/GMIP.1997.0435.
- [35]. Wichmann FA and Henning GB, “No role for motion blur in either motion detection or motion-based image segmentation,” J Opt Soc Am A Opt Image Sci Vis, vol. 15, no. 2, p. 297, 1998. 10.1364/JOSAA.15.000297.
- [36]. Chen S et al., “Automatic detection of stroke lesion from diffusion-weighted imaging via the improved YOLOv5,” Comput Biol Med, vol. 150, p. 106120, 2022. 10.1016/J.COMPBIOMED.2022.106120.
- [37]. Almufareh MF, Imran M, Khan A, Humayun M, and Asim M, “Automated Brain Tumor Segmentation and Classification in MRI Using YOLO-Based Deep Learning,” IEEE Access, vol. 12, pp. 16189–16207, 2024. 10.1109/ACCESS.2024.3359418.
- [38]. Den H, Ito J, and Kokaze A, “Diagnostic accuracy of a deep learning model using YOLOv5 for detecting developmental dysplasia of the hip on radiography images,” Sci Rep, vol. 13, no. 1, pp. 1–10, 2023. 10.1038/s41598-023-33860-2.
- [39]. Jocher G, “YOLOv5,” GitHub repository. Available: https://github.com/ultralytics/yolov5
- [40]. Egemen D et al., “Artificial intelligence–based image analysis in clinical testing: lessons from cervical cancer screening,” JNCI: Journal of the National Cancer Institute, 2023. 10.1093/JNCI/DJAD202.
- [41]. Ahmed SR et al., “Reproducible and clinically translatable deep neural networks for cervical screening,” Sci Rep, vol. 13, no. 1, pp. 1–18, 2023. 10.1038/s41598-023-48721-1.
- [42]. Arora M, Dhawan S, and Singh K, “Deep Neural Network for Transformation Zone Classification,” ICSCCC 2018 – 1st International Conference on Secure Cyber Computing and Communications, pp. 213–216, 2018. 10.1109/ICSCCC.2018.8703327.
- [43]. Prendiville W and Sankaranarayanan R, “The effect of oncogenic HPV on transformation zone epithelium,” in Colposcopy and Treatment of Cervical Cancer, Lyon (FR): International Agency for Research on Cancer, 2017. Available: https://www.ncbi.nlm.nih.gov/books/NBK568360/
- [44]. Arora M, Dhawan S, and Singh K, “Exploring Deep Convolution Neural Networks with Transfer Learning for Transformation Zone Type Prediction in Cervical Cancer,” Advances in Intelligent Systems and Computing, vol. 1053, pp. 1127–1138, 2020. 10.1007/978-981-15-0751-9_104.
- [45]. Alush A, Greenspan H, and Goldberger J, “Automated and interactive lesion detection and segmentation in uterine cervix images,” IEEE Trans Med Imaging, vol. 29, no. 2, pp. 488–501, 2010. 10.1109/TMI.2009.2037201.
- [46]. Tan L, Huangfu T, Wu L, and Chen W, “Comparison of RetinaNet, SSD, and YOLO v3 for real-time pill identification,” BMC Med Inform Decis Mak, vol. 21, no. 1, pp. 1–11, 2021. 10.1186/S12911-021-01691-8.
- [47]. Rahman A, Lu Y, and Wang H, “Performance evaluation of deep learning object detectors for weed detection for cotton,” Smart Agricultural Technology, vol. 3, p. 100126, 2023. 10.1016/J.ATECH.2022.100126.
- [48]. Gao Y, Hu HM, Li B, and Guo Q, “Naturalness preserved nonuniform illumination estimation for image enhancement based on retinex,” IEEE Trans Multimedia, vol. 20, no. 2, pp. 335–344, 2018. 10.1109/TMM.2017.2740025.
- [49]. Xiong X and Shang Y, “An adaptive method to correct the non-uniform illumination of images,” Proc. SPIE, vol. 11567, pp. 173–179, 2020. 10.1117/12.2575590.