Abstract
Testicular histology based on testicular biopsy is an important factor for determining appropriate testicular sperm extraction surgery and predicting sperm retrieval outcomes in patients with azoospermia. Therefore, we developed a deep learning (DL) model to establish the associations between testicular grayscale ultrasound images and testicular histology. We retrospectively included two-dimensional testicular grayscale ultrasound from patients with azoospermia (353 men with 4357 images between July 2017 and December 2021 in The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China) to develop a DL model. We obtained testicular histology during conventional testicular sperm extraction. Our DL model was trained based on ultrasound images or fusion data (ultrasound images fused with the corresponding testicular volume) to distinguish spermatozoa presence in pathology (SPP) and spermatozoa absence in pathology (SAP) and to classify maturation arrest (MA) and Sertoli cell-only syndrome (SCOS) in patients with SAP. Areas under the receiver operating characteristic curve (AUCs), accuracy, sensitivity, and specificity were used to analyze model performance. DL based on images achieved an AUC of 0.922 (95% confidence interval [CI]: 0.908–0.935), a sensitivity of 80.9%, a specificity of 84.6%, and an accuracy of 83.5% in predicting SPP (including normal spermatogenesis and hypospermatogenesis) and SAP (including MA and SCOS). In the identification of SCOS and MA, DL on fusion data yielded better diagnostic performance with an AUC of 0.979 (95% CI: 0.969–0.989), a sensitivity of 89.7%, a specificity of 97.1%, and an accuracy of 92.1%. Our study provides a noninvasive method to predict testicular histology for patients with azoospermia, which would avoid unnecessary testicular biopsy.
Keywords: azoospermia, deep learning, male infertility, testicular histology, ultrasound
INTRODUCTION
Azoospermia is the most severe form of male infertility, accounting for 10%–15% of males seeking medical care for infertility.1 It is typically classified as obstructive azoospermia (OA) or nonobstructive azoospermia (NOA). Whereas OA is usually characterized by normal spermatogenesis, NOA has a variety of testicular histological types including hypospermatogenesis, maturation arrest (MA), Sertoli cell-only syndrome (SCOS), seminiferous tubule hyalinization, and mixed pattern of testicular histology.2,3,4 Some OA patients can produce offspring through microsurgical reconstruction of the reproductive tract, and many patients with azoospermia (OA and NOA) can father their own children through sperm retrieval techniques such as conventional testicular sperm extraction (cTESE) or microsurgical TESE (micro-TESE) combined with intracytoplasmic sperm injection (ICSI).5,6
According to the European Association of Urology (EAU) guidelines panel for male sexual and reproductive health in 2022, the probability of finding sperm at testicular sperm retrieval varies according to testicular histology.7 Notably, the sperm retrieval rate also depends on the sperm retrieval techniques employed for testicular histology. We agree with some authors that cTESE could be the first choice for patients with MA, hypospermatogenesis, and normal spermatogenesis, and micro-TESE could be the first choice for patients with SCOS.8,9 However, to prevent injury to the testis, diagnostic testicular biopsy before surgical sperm retrieval is not recommended according to the EAU guidelines of 2022.7,10,11,12 Therefore, it is necessary to find a noninvasive technology that can predict testicular histology for the recommendation of testicular sperm extraction surgery type.
Testicular grayscale ultrasound, which can identify seminiferous tubules ranging from 100 μm to 300 μm in diameter, has also been investigated as a noninvasive, widely accessible technology to evaluate testicular sperm retrieval.13 It can identify tubules that are larger and more opaque than their neighboring tubules as these two characteristics are evidence of active spermatogenesis and a higher sperm retrieval rate.9,14 In our previous research, we reported that the intensity and inhomogeneity of grayscale ultrasound images changed during the process of impairment and recovery of spermatogenic function in testicular injury animal models.15 Similarly, testicular histology varies with the diameter and cell composition of seminiferous tubule, resulting in different testicular grayscale ultrasound images.15 Deep learning (DL) techniques are attracting considerable critical attention due to their superior performance in the field of medical images,16 as DL can find image details that tend to be hard for the naked eye to recognize.17 The purpose of this study was to investigate the potential of DL to establish the associations between testicular grayscale ultrasound images and testicular histology in patients with azoospermia.
PARTICIPANTS AND METHODS
Study design and dataset
This study complied with the Declaration of Helsinki and was approved by the institutional Clinical Research Ethics Committee of The First Affiliated Hospital of Sun Yat-sen University (Guangzhou, China; Approval No. [2019]165). Informed consent from patients was waived for retrospectively collected ultrasound images. This single-center study included data from patients aged 18 years or older who were diagnosed with azoospermia and underwent testicular grayscale ultrasound and testicular histology between July 2017 and December 2021 in The First Affiliated Hospital of Sun Yat-sen University. We obtained testicular histology during cTESE. The eligibility criteria were as follows: (1) OA with obstruction or congenital bilateral absence of the male reproductive tract, and (2) idiopathic NOA (iNOA),18 which was defined as NOA without a previous history of cryptorchidism or a genetic etiology (normal karyotype and absence of Y chromosome microdeletions and genetic mutations) or other etiologies that could explain NOA (infections, radiotherapy, chemotherapy, and hypogonadotropic hypogonadism).
Patients were excluded if they met the following criteria: (1) testicular microlithiasis, which affects the evaluation of testicular echotexture; (2) seminiferous tubule hyalinization or mixed pattern of testicular histology, both of which lack enough samples for statistical analysis; or (3) incomplete images or images of poor quality. The flowchart of patient selection is presented in Figure 1.
Figure 1.
The flowchart of patient selection. MA: maturation arrest; iNOA: idiopathic nonobstructive azoospermia; OA: obstructive azoospermia; SPP: spermatozoa presence in pathology; SAP: spermatozoa absence in pathology; SCOS: Sertoli cell-only syndrome.
In this study, with the Johnsen score (JS) as the reference standard,19,20 testicular histology was divided into two groups: spermatozoa presence in pathology (SPP) with JS ≥ 8, including normal spermatogenesis and hypospermatogenesis; and spermatozoa absence in pathology (SAP) with 2 ≤ JS < 8, including MA (2 < JS < 8) and SCOS (JS = 2). A JS of 1 was excluded because its corresponding histological pattern is seminiferous tubule hyalinization.
On the basis of the above categories, our classification models were divided into two tasks: SPP vs SAP and MA vs SCOS. Considering that testicular histology plays an important role in selecting the type of sperm extraction surgery in patients with iNOA, the model was validated not only in patients with azoospermia (consisting of OA and iNOA) but also in those with iNOA.
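The grouping rule above can be written as a small lookup. The following is a minimal sketch, assuming integer Johnsen scores; the function name `histology_group` is hypothetical and is not part of the study code.

```python
from typing import Optional, Tuple


def histology_group(johnsen_score: int) -> Tuple[str, Optional[str]]:
    """Map a Johnsen score (JS) to (primary group, SAP subtype).

    Assumes integer scores. JS = 1 (seminiferous tubule hyalinization)
    was excluded from the study, so it is rejected here.
    """
    if not 2 <= johnsen_score <= 10:
        raise ValueError("JS must be between 2 and 10 for this grouping")
    if johnsen_score >= 8:
        return ("SPP", None)    # normal spermatogenesis or hypospermatogenesis
    if johnsen_score == 2:
        return ("SAP", "SCOS")  # Sertoli cell-only syndrome
    return ("SAP", "MA")        # JS 3-7: maturation arrest
```

The first element corresponds to the SPP vs SAP task, and the second element, when present, corresponds to the MA vs SCOS task.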
Ultrasound examinations
All examinations were performed by a single radiologist (ZW, who had 17 years of experience in testicular ultrasound) with a 9.0-MHz ultrasound scanner (DC-8; Mindray, Shenzhen, China) before testicular biopsy (cTESE). The Digital Imaging and Communications in Medicine (DICOM) format was used to store grayscale images of the maximum longitudinal section and clips covering the left pole and right pole of the testis.21 As a part of the ultrasound examination process, testicular volume (TV) was calculated via the Lambert formula: length (cm) × width (cm) × depth (cm) × 0.71.22
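As an illustration of the Lambert formula, the sketch below computes TV from the three testicular dimensions. The function name and the example dimensions are hypothetical and chosen only to show the arithmetic.

```python
def lambert_testicular_volume(length_cm: float, width_cm: float, depth_cm: float) -> float:
    """Testicular volume in ml (cm^3): length x width x depth x 0.71."""
    if min(length_cm, width_cm, depth_cm) <= 0:
        raise ValueError("all dimensions must be positive")
    return length_cm * width_cm * depth_cm * 0.71


# Hypothetical testis measuring 4.0 cm x 2.5 cm x 2.0 cm.
print(f"{lambert_testicular_volume(4.0, 2.5, 2.0):.1f} ml")  # prints "14.2 ml"
```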
Data preparation
All patients’ ultrasound examinations, histological results, and clinical data, which included age, semen parameters, and serum hormones, were collected from medical records.
The data between July 2017 and January 2020 were static images of the maximum longitudinal section (1–3 images per person), and the data between February 2020 and December 2021 were clips. Clips were converted into consecutive images via the native function of MicroDicom DICOM viewer 2.9.2 (MicroDicom Ltd., Sofia, Bulgaria). Representative images were manually selected from each clip according to the following criteria: (1) images near the middle dorsal part of the testis corresponding to the biopsy site (near the maximum longitudinal section) and (2) images where the mediastinum was avoided. Each clip provided at least ten images, with the specific number varying based on the size of the testis and the quality of converted images. Finally, 4357 original images for DL development were included in this study and randomly divided into a training cohort and a test cohort at a ratio of 7:3. The random image assignment was made at patient level, ensuring that the test cohort did not contain any images originating from the training cohort.
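A patient-level split of this kind can be sketched as follows. This is an illustrative sketch rather than the study's code: it assumes a hypothetical mapping from image identifiers to patient identifiers and applies the 7:3 ratio to patients, so the resulting image ratio is only approximately 7:3. No stratification by histology is attempted.

```python
import random
from collections import defaultdict


def split_by_patient(image_to_patient, train_fraction=0.7, seed=0):
    """Return (train_images, test_images) with no patient in both cohorts."""
    # Group image identifiers by the patient they belong to.
    by_patient = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        by_patient[patient_id].append(image_id)

    # Shuffle patients (not images) so that a patient's images stay together.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_train = round(train_fraction * len(patients))
    train_patients = set(patients[:n_train])

    train_images, test_images = [], []
    for patient_id, images in by_patient.items():
        target = train_images if patient_id in train_patients else test_images
        target.extend(images)
    return train_images, test_images
```

Because the assignment is made per patient, every image of a given patient lands in the same cohort, which is the property the text above relies on to avoid leakage between training and testing.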
Development of the DL model
The whole pipeline of our model is shown in Figure 2. The comprehensive DL model had two branch networks: Network 1 (ResNet-152), a residual network,23 which was selected for the classification task of SPP and SAP; and Network 2 (ConvNeXt-Large),24 which was selected for the classification task of MA and SCOS. The input of Network 1 was a tensor of an ultrasound image, whereas the input of Network 2 was a fusion tensor in which the ultrasound image was fused with the corresponding TV. The fusion data were obtained by multiplying the original image tensor by TV as a coefficient. All inputs were resized and normalized separately for the corresponding network.
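The fusion step amounts to an element-wise scaling of the image tensor by TV. The sketch below illustrates this on a nested-list tensor in plain Python; the function name is hypothetical. Whether TV was rescaled beforehand, and whether fusion was applied before or after normalization, is not stated here, so the sketch simply multiplies the values it is given.

```python
def fuse_with_volume(image, testicular_volume_ml):
    """Multiply every value of a nested-list image tensor by the testicular volume."""
    if isinstance(image, (int, float)):
        return image * testicular_volume_ml
    # Recurse through channels, rows, and columns of the nested lists.
    return [fuse_with_volume(part, testicular_volume_ml) for part in image]


# One channel, 2 x 2 pixels, already scaled to [0, 1]; TV of 8.0 ml (hypothetical values).
fused = fuse_with_volume([[[0.0, 0.5], [1.0, 0.25]]], 8.0)
print(fused)  # [[[0.0, 4.0], [8.0, 2.0]]]
```

With an array or tensor library the same operation is a single multiplication of the image by a scalar.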
Figure 2.
The overall pipeline of the model. DL: deep learning; TV: testicular volume; L: length of the testis; W: width of the testis; D: depth of the testis; FC: fully-connected.
The input image required the following preprocessing operations: (1) the pixel matrix of the image was converted into a tensor, rescaling the pixel values of the three RGB channels from the range [0, 255] to the range [0, 1.0]; and (2) the tensor was normalized channel by channel using a per-channel mean and standard deviation. The means (standard deviation [s.d.]) used for the R, G, and B channels were 0.485 (0.229), 0.456 (0.224), and 0.406 (0.225), respectively.
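The two preprocessing operations can be illustrated on a single RGB pixel. This is a minimal sketch in plain Python with a hypothetical function name; the resizing step is omitted because the target size is not given here. The channel statistics are the ones listed above, which coincide with the widely used ImageNet values.

```python
CHANNEL_MEAN = (0.485, 0.456, 0.406)  # R, G, B means from the text
CHANNEL_STD = (0.229, 0.224, 0.225)   # R, G, B standard deviations from the text


def preprocess_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [value / 255.0 for value in rgb]
    return [(s - mean) / std for s, mean, std in zip(scaled, CHANNEL_MEAN, CHANNEL_STD)]


# A mid-gray pixel, as is typical of grayscale ultrasound stored in three channels.
print(preprocess_pixel((128, 128, 128)))
```

In a full pipeline the same arithmetic is applied to every pixel of the image at once.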
The two networks were optimized by the Adam optimizer with an initial learning rate of 0.001. Training and testing were performed on four Tesla V100 GPUs (CUDA version 11.6; Nvidia, Santa Clara, CA, USA) with a batch size of 40.
Statistical analyses
We used the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) to estimate model performance. The receiver operating characteristic (ROC) curve was created by plotting sensitivity against 1-specificity with GraphPad Prism 8.0 software (GraphPad Software, La Jolla, CA, USA). Comparisons between the groups were made via an independent samples t-test or the Mann-Whitney U test. All the statistical tests were two-sided permutation tests, and P < 0.05 was considered statistically significant. We used SPSS 26.0 software (IBM, Armonk, NY, USA) for the analysis of demographic characteristics.
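For reference, the performance measures listed above can be computed from a confusion matrix and from per-image scores as sketched below. The function names and the example counts are hypothetical and do not reproduce the study's data. The AUC is computed with the rank-based (Mann-Whitney) formulation, which is one standard way to obtain it and is not necessarily the procedure used by the software named above.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc_from_scores(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical counts, for illustration only.
print(diagnostic_metrics(tp=80, fp=10, tn=90, fn=20))
print(auc_from_scores([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

When SPP is the negative reference, as in the first task, a "positive" in these formulas corresponds to an SAP image.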
RESULTS
Patient demographics
As shown in Table 1, 353 men with azoospermia and testicular histological results and 4357 ultrasound images were enrolled for analysis between July 2017 and December 2021. According to the JS results, 259 patients (3011 images) were confirmed to have SPP, and 94 patients (1346 images) were confirmed to have SAP, among which 37 patients (462 images) had MA and 57 patients (884 images) had SCOS. The SPP group consisted of 223 OA patients (2562 images) and 36 iNOA patients (449 images), whereas the SAP group consisted solely of iNOA patients.
Table 1.
Clinical characteristics of azoospermia patients
Characteristic | SPP | SAP (all) | SAP: MA | SAP: SCOS |
---|---|---|---|---|
Total number of patients (n) | 259 | 94 | 37 | 57 |
OA | 223 | 0 | 0 | 0 |
iNOA | 36 | 94 | 37 | 57 |
Total number of images (n) | 3011 | 1346 | 462 | 884 |
Average number of images per patient (n) | 11 | 14 | 12 | 15 |
Age (year), median (IQR) | 30 (27–33) | 31 (28–33) | 30 (28–34) | 31 (29–33) |
Semen volume (ml), median (IQR) | 1.6 (1.3–2.5) | 3.5 (2.8–4.5)a | 3.5 (2.6–4.2) | 3.5 (3.0–4.6) |
pH, median (IQR) | 7.5 (6.0–7.5) | 7.5 (7.5–7.5)a | 7.5 (7.5–7.5) | 7.5 (7.5–7.5) |
Sperm concentration (×10⁶ ml⁻¹) | 0 | 0 | 0 | 0 |
FSH (IU l⁻¹), median (IQR) | 4.0 (3.0–4.6) | 15.3 (10.1–18.9)a | 14.9 (6.9–15.7) | 15.5 (13.8–19.9)b |
LH (IU l⁻¹), median (IQR) | 3.1 (2.2–3.6) | 5.7 (4.7–6.5)a | 5.5 (3.8–5.7) | 5.7 (5.4–6.8)b |
TT (ng ml⁻¹), median (IQR) | 5.3 (3.7–5.7) | 4.7 (3.0–5.5)a | 4.7 (3.5–5.7) | 4.7 (2.4–5.3) |
TV# (ml), median (IQR) | 13.2 (11.0–16.4) | 7.7 (5.8–9.9)a | 9.0 (6.9–11.5) | 6.9 (5.6–8.5)b |
#Testicular volume was calculated for the testis corresponding to the biopsy site. a: P < 0.05, the indicated value compared with that in the SPP group. b: P < 0.05, the indicated value compared with that in the MA group. FSH: follicle-stimulating hormone; iNOA: idiopathic nonobstructive azoospermia; LH: luteinizing hormone; MA: maturation arrest; OA: obstructive azoospermia; SPP: spermatozoa presence in pathology; SAP: spermatozoa absence in pathology; SCOS: Sertoli cell-only syndrome; TT: total testosterone; TV: testicular volume; IQR: interquartile range
Performance of the DL model between SPP and SAP
Adopting SPP as the negative reference standard, 2979 images were randomly assigned as the training cohort, and the other 1378 images were assigned to the test cohort of patients with azoospermia (consisting of OA and iNOA). Compared with DL on fusion data, DL based on images achieved better diagnostic performance, with an AUC of 0.922 (95% confidence interval [CI]: 0.908–0.935, P = 0.016), a sensitivity of 80.9%, a specificity of 84.6%, and an accuracy of 83.5% (Table 2 and Figure 3a). DL on fusion data reached an AUC of 0.901 (95% CI: 0.884–0.916), but its sensitivity was not satisfactory at 63.2%, with a specificity of 95.0% and an accuracy of 85.0% (Table 2 and Figure 3a).
Table 2.
Diagnostic performance of the deep learning model
Diagnostic performance | AUC (95% CI) | Sensitivity (%) | Specificity (%) | Accuracy (%) | PPV (%) | NPV (%) |
---|---|---|---|---|---|---|
Patients with azoospermia (consisting of OA and iNOA), SPP vs SAP | ||||||
DL on images | 0.922 (0.908–0.935) | 80.9 | 84.6 | 83.5 | 70.9 | 90.6 |
DL on fusion data | 0.901 (0.884–0.916)a | 63.2 | 95.0 | 85.0 | 85.4 | 84.8 |
Patients with iNOA, SPP vs SAP | ||||||
DL on images | 0.912 (0.890–0.935) | 88.9 | 63.0 | 83.0 | 89.1 | 62.6 |
DL on fusion data | 0.854 (0.818–0.891)b | 83.0 | 79.7 | 82.3 | 93.3 | 57.8 |
Patients with azoospermia (consisting solely of iNOA), MA vs SCOS | ||||||
DL on images | 0.683 (0.624–0.742) | 91.8 | 49.6 | 77.9 | 78.7 | 75.0 |
DL on fusion data | 0.979 (0.969–0.989)c | 89.7 | 97.1 | 92.1 | 98.4 | 82.3 |
a: P = 0.016, DeLong’s test in comparison with DL on images in patients with azoospermia. b: P = 0.002, DeLong’s test in comparison with DL on images in patients with iNOA. c: P < 0.001, DeLong’s test in comparison with DL on images in patients with iNOA. AUC: area under the receiver operating characteristic curve; DL: deep learning; iNOA: idiopathic nonobstructive azoospermia; MA: maturation arrest; NPV: negative predictive value; OA: obstructive azoospermia; SPP: spermatozoa presence in pathology; SAP: spermatozoa absence in pathology; PPV: positive predictive value; SCOS: Sertoli cell-only syndrome; CI: confidence interval
Figure 3.
Performance of the DL model in predicting testicular histology. (a) ROC curves for distinguishing SPP and SAP in patients with azoospermia. (b) ROC curves for distinguishing SPP and SAP in patients with iNOA. (c) ROC curves for distinguishing MA and SCOS in patients with iNOA. ROC: receiver operating characteristic; AUC: area under ROC curve; CI: confidence interval; DL: deep learning; MA: maturation arrest; iNOA: idiopathic nonobstructive azoospermia; SPP: spermatozoa presence in pathology; SAP: spermatozoa absence in pathology; SCOS: Sertoli cell-only syndrome.
In addition, we validated the above model in patients with iNOA. DL based on images achieved an AUC of 0.912 (95% CI: 0.890–0.935), a sensitivity of 88.9%, a specificity of 63.0%, and an accuracy of 83.0%. The AUC of DL on fusion data decreased slightly and reached 0.854 (95% CI: 0.818–0.891, P = 0.002), with a sensitivity of 83.0%, a specificity of 79.7%, and an accuracy of 82.3% (Table 2 and Figure 3b).
Performance of the DL model between MA and SCOS
The SAP group was further divided into the MA group and the SCOS group. When MA was adopted as the negative reference standard, 927 images were assigned to the training cohort and 419 images were assigned to the test cohort. DL on fusion data outperformed DL based on images: the fusion model achieved a sensitivity of 89.7%, a specificity of 97.1%, and an accuracy of 92.1%, whereas the image-only model achieved a sensitivity of 91.8%, a specificity of 49.6%, and an accuracy of 77.9% (Table 2). The AUC of the fusion model also exceeded that of the image-only model (0.979 [95% CI: 0.969–0.989] vs 0.683 [95% CI: 0.624–0.742], P < 0.001; Table 2 and Figure 3c).
Visualization of the DL model
To better explain the predictive performance of the DL model, heatmaps were produced by applying gradient-weighted class activation mapping++ (Grad-CAM++), which identifies the pixels in an ultrasound image to which the DL model pays attention for each prediction; examples are shown for an SPP instance, an MA instance, and an SCOS instance (Figure 4). The Grad-CAM++ images were drawn by extracting the weighted output of the last convolutional layer of the network. On the basis of the region highlighted by Grad-CAM++, one can infer why the model makes a given prediction for each image and whether the region of model interest is consistent with the region of clinician interest, thus supporting the reliability of the model. These heatmaps showed that the DL model focused on the testicular parenchyma and therefore used visual features therein to make its decision. The red and yellow regions represent areas strongly activated by the DL model, which have the greatest predictive significance. In contrast, the green and blue regions represent areas with weaker predictive value. The redder a region is, the more strongly it contributes to the predicted class.
Figure 4.
Testicular grayscale ultrasound images (the upper panel) and Grad-CAM++ maps (the lower panel) of different testicular histologies in patients with azoospermia. The region of interest, located in the testicular parenchyma, was consistent between the model and clinicians, and the DL model correctly identified (a) SPP, (b) MA, and (c) SCOS. DL: deep learning; MA: maturation arrest; SPP: spermatozoa presence in pathology; SCOS: Sertoli cell-only syndrome.
Clinical application of the DL model
During ultrasound examination, in addition to detecting signs of obstruction of the male reproductive tract, we could also assess TV and echotexture. Using the DL model to predict testicular histology, we complemented ultrasound reports with additional information. The computational time for testing the DL model was 1 s per image, and there was no additional cost to finish the testing. The DL model can be applied to the preoperative examination of azoospermia, enabling patients to be well informed about the chances of successful sperm retrieval. The DL model is helpful to urological surgeons for determining the most appropriate testicular sperm extraction surgery.
DISCUSSION
This study demonstrated that an association exists between testicular grayscale ultrasound images and different testicular histologies, and a DL model was developed to noninvasively predict testicular histology in patients with azoospermia, including SPP (normal spermatogenesis and hypospermatogenesis), MA, and SCOS. For azoospermia, DL based on images showed favorable discriminability between patients with SPP and those with SAP (MA and SCOS; AUC = 0.922). In addition, the model was validated in iNOA patients and achieved similar results with good diagnostic efficacy (AUC = 0.912). Encouragingly, DL on fusion data achieved better diagnostic performance than DL on images in distinguishing between patients with MA and those with SCOS (AUC = 0.979 vs 0.683, P < 0.001). Therefore, the model could be used to predict testicular histology in iNOA patients for the recommendation of testicular sperm extraction surgery type.
Testicular histology is closely related to sperm retrieval outcomes and is vital for determining appropriate testicular sperm extraction surgery.3,25,26 Notably, we endorse the perspective put forth by some authors that cTESE may be the preferred option for patients with MA, hypospermatogenesis, and normal spermatogenesis, whereas micro-TESE could potentially be the prime choice for those with SCOS.8,25 However, few previous studies have sought to determine the relationship between preoperative clinical data and testicular histology. Follicle-stimulating hormone (FSH), the most commonly used parameter in the evaluation of male infertility, could predict both hypospermatogenesis and SCOS (AUC = 0.705 and 0.709, respectively) but could not correctly classify MA.27 Previous studies have attempted to evaluate the possible role of shear wave elastography in predicting successful sperm retrieval and differentiating histologic categories by correlating tissue stiffness with histological tissue alterations.28,29 However, this method was not accurate for identifying different testicular histology categories, with an AUC of 0.79.29 In addition, several recent studies have assessed the potential role of diffusion-weighted magnetic resonance imaging and proton magnetic resonance spectroscopy in distinguishing OA from NOA and predicting sperm retrieval together with histological alterations, but the prediction of spermatid arrest was poor at only 40%.30 In our study, the DL model achieved better diagnostic performance in predicting SPP (normal spermatogenesis and hypospermatogenesis), SCOS, and MA, with AUCs of 0.912–0.979.
Previous studies have reported that echotexture alterations, such as inhomogeneity and reduced echogenicity, could be indicators of reduced testicular spermatogenic function and could discriminate between normozoospermia and impaired semen quality in infertile patients.31,32 Lenz et al.33 proposed a 5-point scale ranging from 1 (normal testis) to 5 (testis affected by a solid lesion) and reported that a higher score was related to irregular echotexture. A new and comprehensive scoring system, named the testicular ultrasound score (TU score), was subsequently developed by analyzing the correlations between ultrasound characteristics and all clinical parameters and was shown to be a more accurate predictor of impaired spermatogenic function than the Lenz score was (AUC = 0.73 vs 0.66, P < 0.001).34 However, the interpretation of ultrasound images is often operator-dependent and subjective. In a recent study, radiomics was first applied to the testis in an attempt to quantify this subjective evaluation; the correlation between testicular echotexture and semen parameters, including sperm number and motility, was confirmed, suggesting that testicular echotexture was able to predict spermatogenic capability.35 However, the discriminability between normozoospermic patients and oligozoospermic patients, between normozoospermic patients and asthenozoospermic patients, and between normozoospermic patients and teratozoospermic patients was not good, with AUCs ranging from 0.50 to 0.73.35 The poor discriminability may be explained by the similar histological conditions of these groups, which rarely produce marked echotexture changes.
In this study, we applied DL to ultrasound images to provide objective interpretation results and engineered a DL model to establish the association between testicular echotexture and testicular histology, enabling the evaluation of testicular spermatogenic function in patients with azoospermia. Our study yielded good diagnostic performance when DL was applied to images for the classification of SPP and SAP. This result can be attributed to the histological condition, in which SAP (MA and SCOS) is characterized by varying degrees of reduction in the number of spermatogenic cells or even completely absent germ cells with only Sertoli cells in the seminiferous tubules compared with SPP (normal spermatogenesis and hypospermatogenesis). Accordingly, echotexture changes occur in these histological conditions, which are imperceptible to humans but can be well recognized by DL.
Nevertheless, in the SAP group, DL based on images was not accurate enough to classify MA and SCOS, with a poor AUC. In addition to the small sample size, another possible explanation for this finding is the overlap of echotexture alterations between the conditions. MA, a diverse histological type of severe male infertility with JS of 3–7, can be classified into early MA, in which the testes only contain spermatogonia or spermatocytes, and late MA, in which they contain round spermatids but no spermatozoa.20 SCOS, the most severe histological type, is characterized by totally absent germ cells with only Sertoli cells visible in the seminiferous tubules.36 It is reasonable to presume that patients with early MA whose testes contain only spermatogonia would display an echotexture indistinguishable from that of patients with SCOS, given that the difference in cell composition between the two conditions is expected to be small. Therefore, the overlapping echotexture alteration between MA and SCOS prevents DL on images from being an efficient method of distinguishing them.
In both MA and SCOS, obvious differences exist in the diameters of seminiferous tubules in addition to the alteration of testicular cell composition, resulting in the variance of TV. In this study, we added TV to further improve the prediction performance of the model by fusing the ultrasound images with the corresponding TVs. Encouragingly, DL on fusion data showed significantly better diagnostic performance than DL on images. However, when distinguishing SPP and SAP, the performance of DL on fusion data did not improve significantly, and the sensitivity decreased instead. The reason is that most azoospermia patients with normal spermatogenesis, hypospermatogenesis, and late MA have normal-sized testes,4 indicating that TV plays a less critical role in distinguishing SPP (normal spermatogenesis and hypospermatogenesis) from SAP (including MA). Taken together, the findings showed that DL on images had the best diagnostic performance in distinguishing between SPP and SAP, whereas DL on fusion data achieved better diagnostic performance in distinguishing SCOS and MA.
Our research has several limitations. First, this was a single-center retrospective study including OA and iNOA patients. Prospective studies including multicenter data and different NOA types are needed to validate the potential role of DL models in the management of azoospermia. Second, one of the purposes of our study was to provide more diagnostic information on ultrasound, which is why we did not integrate clinical parameters with our model. Further studies are needed to collaboratively use clinical information and ultrasound image features to build a model for the prediction of sperm retrieval rate. Third, a testicular biopsy sample may not reflect the entire testicular tissue and the testicular histology is usually heterogeneous throughout the parenchyma of the testis in patients with NOA, although testicular biopsy is the standard approach for assessing testicular histology.26,37,38
CONCLUSIONS
We showed that different testicular histology types of patients with azoospermia could be identified by DL with testicular grayscale ultrasound images, suggesting the possibility of noninvasive prediction of testicular histology and a new direction for research on the prediction of sperm extraction.
AUTHOR CONTRIBUTIONS
ZW designed the study. XYX and MDL supervised the study. JYH, LD, ZXZ, and WLH performed the research and collected the data. ZZL and SSH built the deep learning algorithms, trained the model, and validated the data. JYH, YG, ZZL, and BL analyzed the data. JYH and ZZL were involved in drafting the manuscript. ZW, YG, and HTL were involved in reviewing the manuscript. All authors read and approved the final manuscript.
COMPETING INTERESTS
All authors declare no competing interests.
ACKNOWLEDGMENTS
This work was supported by the Key Research and Development Program of Guangzhou Science and Technology Projects (No. 2023B03J1260).
REFERENCES
- 1.Corona G, Minhas S, Giwercman A, Bettocchi C, Dinkelman-Smit M, et al. Sperm recovery and ICSI outcomes in men with non-obstructive azoospermia:a systematic review and meta-analysis. Hum Reprod Update. 2019;25:733–57. doi: 10.1093/humupd/dmz028. [DOI] [PubMed] [Google Scholar]
- 2.Tournaye H, Krausz C, Oates RD. Concepts in diagnosis and therapy for male reproductive impairment. Lancet Diabetes Endocrinol. 2017;5:554–64. doi: 10.1016/S2213-8587(16)30043-2. [DOI] [PubMed] [Google Scholar]
- 3.Toksoz S, Kizilkan Y. Comparison of the histopathological findings of testis tissues of non-obstructive azoospermia with the findings after microscopic testicular sperm extraction. Urol J. 2019;16:212–5. doi: 10.22037/uj.v16i2.4756. [DOI] [PubMed] [Google Scholar]
- 4.Andrade DL, Viana MC, Esteves SC. Differential diagnosis of azoospermia in men with infertility. J Clin Med. 2021;10:3144. doi: 10.3390/jcm10143144. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Practice Committee of the American Society for Reproductive Medicine. Electronic address:asrm@asrmorg. Management of nonobstructive azoospermia:a committee opinion. Fertil Steril. 2018;110:1239–45. doi: 10.1016/j.fertnstert.2018.09.012. [DOI] [PubMed] [Google Scholar]
- 6.Practice Committee of the American Society for Reproductive Medicine in collaboration with the Society for Male Reproduction and Urology. Electronic address:asrm@asrmorg. The management of obstructive azoospermia:a committee opinion. Fertil Steril. 2019;111:873–80. doi: 10.1016/j.fertnstert.2019.02.013. [DOI] [PubMed] [Google Scholar]
- 7.Salonia A, Bettocchi C, Carvalho J, Corona G, Jones TH, et al. EAU Guidelines on Sexual and Reproductive Health. Arnhem: EAU Guidelines Office; 2024. [Google Scholar]
- 8.Deruyver Y, Vanderschueren D, Van der Aa F. Outcome of microdissection TESE compared with conventional TESE in non-obstructive azoospermia:a systematic review. Andrology. 2014;2:20–4. doi: 10.1111/j.2047-2927.2013.00148.x. [DOI] [PubMed] [Google Scholar]
- 9.Bernie AM, Mata DA, Ramasamy R, Schlegel PN. Comparison of microdissection testicular sperm extraction, conventional testicular sperm extraction, and testicular sperm aspiration for nonobstructive azoospermia:a systematic review and meta-analysis. Fertil Steril. 2015;104:1099–103. doi: 10.1016/j.fertnstert.2015.07.1136. e1–3. [DOI] [PubMed] [Google Scholar]
- 10.Schlegel PN, Sigman M, Collura B, De Jonge CJ, Eisenberg ML, et al. Diagnosis and treatment of infertility in men: AUA/ASRM guideline part I. Fertil Steril. 2021;115:54–61. doi: 10.1016/j.fertnstert.2020.11.015. [DOI] [PubMed] [Google Scholar]
- 11. Alkandari MH, Zini A. Medical management of non-obstructive azoospermia: a systematic review. Arab J Urol. 2021;19:215–20. doi: 10.1080/2090598X.2021.1956233.
- 12. Amer MK, Ahmed AR, Abdel Hamid AA, GamalEl Din SF. Can spermatozoa be retrieved in non-obstructive azoospermic patients with high FSH level? A retrospective cohort study. Andrologia. 2019;51:e13176. doi: 10.1111/and.13176.
- 13. Najari BB. Ultrasound evaluation of seminiferous tubules: a promising prognostic tool for men with nonobstructive azoospermia undergoing microsurgical testicular sperm extraction. Fertil Steril. 2020;113:73–4. doi: 10.1016/j.fertnstert.2019.09.015.
- 14. Nariyoshi S, Nakano K, Sukegawa G, Sho T, Tsuji Y. Ultrasonographically determined size of seminiferous tubules predicts sperm retrieval by microdissection testicular sperm extraction in men with nonobstructive azoospermia. Fertil Steril. 2020;113:97–104.e2. doi: 10.1016/j.fertnstert.2019.08.061.
- 15. Huang WL, Ding L, Yao JH, Hu HT, Gao Y, et al. Testicular quantitative ultrasound: a noninvasive monitoring method for evaluating spermatogenic function in busulfan-induced testicular injury mouse models. Andrologia. 2021;53:e13927. doi: 10.1111/and.13927.
- 16. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44. doi: 10.1038/nature14539.
- 17. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278:563–77. doi: 10.1148/radiol.2015151169.
- 18. Agarwal A, Baskaran S, Parekh N, Cho CL, Henkel R, et al. Male infertility. Lancet. 2021;397:319–33. doi: 10.1016/S0140-6736(20)32667-2.
- 19. Johnsen SG. Testicular biopsy score count – a method for registration of spermatogenesis in human testes: normal values and results in 335 hypogonadal males. Hormones. 1970;1:2–25. doi: 10.1159/000178170.
- 20. Cerilli LA, Kuang W, Rogers D. A practical approach to testicular biopsy interpretation for male infertility. Arch Pathol Lab Med. 2010;134:1197–204. doi: 10.5858/2009-0379-RA.1.
- 21. Huang W, Xiang Y, Yang Y, Tang Q, Liu G, et al. Expert recommendations on data collection and annotation of two dimensional ultrasound images in azoospermic males for evaluation of testicular spermatogenic function in intelligent medicine. Intell Med. 2022;2:97–102.
- 22. Sakamoto H, Ogawa Y. Does a clinical varicocele influence the relationship between testicular volume by ultrasound and testicular function in patients with infertility? Fertil Steril. 2009;92:1632–7. doi: 10.1016/j.fertnstert.2008.08.105.
- 23. He KM, Zhang XY, Ren SQ, Sun J. Deep residual learning for image recognition. [Last accessed on 2024 Sep 14]. Available from: https://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf
- 24. Liu Z, Mao HZ, Wu CY, Feichtenhofer C, Darrell T, et al. A ConvNet for the 2020s. [Last accessed on 2024 Sep 14]. Available from: https://arxiv.org/pdf/2201.03545
- 25. Caroppo E, Colpi EM, Gazzano G, Vaccalluzzo L, Scroppo FI, et al. Testicular histology may predict the successful sperm retrieval in patients with non-obstructive azoospermia undergoing conventional TESE: a diagnostic accuracy study. J Assist Reprod Genet. 2017;34:149–54. doi: 10.1007/s10815-016-0812-3.
- 26. Abdel Raheem A, Garaffa G, Rushwan N, De Luca F, Zacharakis E, et al. Testicular histopathology as a predictor of a positive sperm retrieval in men with non-obstructive azoospermia. BJU Int. 2013;111:492–9. doi: 10.1111/j.1464-410X.2012.11203.x.
- 27. Caroppo E, Colpi EM, D’Amato G, Gazzano G, Colpi GM. Prediction model for testis histology in men with non-obstructive azoospermia: evidence for a limited predictive role of serum follicle-stimulating hormone. J Assist Reprod Genet. 2019;36:2575–82. doi: 10.1007/s10815-019-01613-8.
- 28. Abdelaal AM, El-Azizi HM, GamalEl Din SF, Abdulsalam Mohammad Azzazi O, Shokr Mohamed M. Evaluation of the potential role of shear wave elastography as a promising predictor of sperm retrieval in non-obstructive azoospermic patients: a prospective study. Andrology. 2021;9:1481–9. doi: 10.1111/andr.13005.
- 29. Hu JY, Huang WL, Gao Y, Yang Z, Ding L, et al. Preliminary investigation of the diagnostic value of shear wave elastography in evaluating the testicular spermatogenic function in patients with azoospermia. Andrologia. 2021;53:e14039. doi: 10.1111/and.14039.
- 30. Hesham Said A, Ragab A, Zohdy W, Ibrahim AS, Abd El Basset AS. Diffusion-weighted magnetic resonance imaging and magnetic resonance spectroscopy for non-invasive characterization of azoospermia: a prospective comparative single-center study. Andrology. 2023;11:1096–106. doi: 10.1111/andr.13392.
- 31. Harris RD, Chouteau C, Partrick M, Schned A. Prevalence and significance of heterogeneous testes revealed on sonography: ex vivo sonographic-pathologic correlation. AJR Am J Roentgenol. 2000;175:347–52. doi: 10.2214/ajr.175.2.1750347.
- 32. Spaggiari G, Granata AR, Santi D. Testicular ultrasound inhomogeneity is an informative parameter for fertility evaluation. Asian J Androl. 2020;22:302–8. doi: 10.4103/aja.aja_67_19.
- 33. Lenz S, Giwercman A, Elsborg A, Cohr KH, Jelnes JE, et al. Ultrasonic testicular texture and size in 444 men from the general population: correlation to semen quality. Eur Urol. 1993;24:231–8. doi: 10.1159/000474300.
- 34. Pozza C, Kanakis G, Carlomagno F, Lemma A, Pofi R, et al. Testicular ultrasound score: a new proposal for a scoring system to predict testicular function. Andrology. 2020;8:1051–63. doi: 10.1111/andr.12822.
- 35. De Santi B, Spaggiari G, Granata AR, Romeo M, Molinari F, et al. From subjective to objective: a pilot study on testicular radiomics analysis as a measure of gonadal function. Andrology. 2022;10:505–17. doi: 10.1111/andr.13131.
- 36. McLachlan RI, Rajpert-De Meyts E, Hoei-Hansen CE, de Kretser DM, Skakkebaek NE. Histological evaluation of the human testis – approaches to optimizing the clinical value of the assessment: mini review. Hum Reprod. 2007;22:2–16. doi: 10.1093/humrep/del279.
- 37. Schoor RA, Elhanbly S, Niederberger CS, Ross LS. The role of testicular biopsy in the modern management of male infertility. J Urol. 2002;167:197–200.
- 38. Kavoussi PK, West BT, Chen SH, Hunn C, Gilkey MS, et al. A comprehensive assessment of predictors of fertility outcomes in men with non-obstructive azoospermia undergoing microdissection testicular sperm extraction. Reprod Biol Endocrinol. 2020;18:90. doi: 10.1186/s12958-020-00646-4.