Table 6.
Study | ART process | Outcomes of interest | Dataset | AI methods | Results |
---|---|---|---|---|---|
Khosravi et al. (2019)85 | Prediction of blastocyst quality (poor vs. good). | -Classification of blastocyst quality at 110 h post insemination. | Retrospective dataset consisting of 12,001 time-lapse images at 110 h post insemination. | Deep learning (CNN) | -Development of AI model (STORK) to predict blastocyst quality. -Predicted blastocyst quality with AUC above 0.98. -AUC of 0.90 and 0.76 achieved on validation with two external datasets. |
Dimitriadis et al. (2019)81 | Determination of normal fertilization (2PN vs. non-2PN embryos). | -Categorization of embryos based on fertilization outcomes. | Retrospective dataset of 3469 embryos (2893 2PN; 576 non-2PN). | Deep learning (CNN) | -AUC of 0.90, with PPV of 96.2% and NPV of 78.1%. -Trained CNN capable of automated fertilization check with high accuracy. |
Fukunaga et al. (2020)82 | Pronuclei determination. | -Categorization of oocytes based on pronuclei status. | Retrospective dataset of 900 embryos (300 each 0PN, 1PN, and 2PN). | Deep learning (CNN) | -Precision of machine learning equivalent to that of expert embryologist. -Sensitivity for detection of 0PN, 1PN, and 2PN: 99%, 82%, and 99%, respectively. |
Coticchio et al. (2021)83 | Cytoplasmic movement to predict blastocyst development. | -Deep learning methods based on cytoplasmic movements at early cleavage stage to predict development to blastocyst. | Retrospective analysis of 230 embryo time-lapse sequential images. | Deep learning ANN extended by k-NN. | -Combination of blind operator assessment and deep learning models led to prediction accuracy of 82.6%, sensitivity of 79.4%, and specificity of 85.7%. -Highlights importance of cytoplasm dynamics as novel source of data. |
Zhao et al. (2021)84 | Labeling of segmented day-1 embryos. | -Performance of CNN labeling of zona pellucida, cytoplasm, and pronuclei compared with manual labeling by a clinical embryologist. | 1218 images from 24 day-1 embryos of 14 subjects. | Deep learning (CNN) | -Good precision in measurement of cytoplasm, pronuclei, and zona pellucida (97%, 84%, and 80% accuracy, respectively), comparable with morphometrics reported in the literature. -Rapid labeling of all images: 130 h for manual labeling versus 12.18 s for the CNN. |
Thirumalaraju et al. (2021)86 | Blastocyst classification based on morphological data. | -Classifying blastocysts based on morphological data using eight different neural network architectures. | 742 embryo images used for validation. | Deep learning (CNN) | -Xception CNN architecture correctly classified > 99.5% of the highest quality blastocysts as good embryos. -Accuracy of Xception model in distinguishing blastocysts from non-blastocysts was 90.9%. |
Berntsen et al. (2022)87 | Embryo selection for transfer. | -Prediction of implantation outcome with fully automated deep learning tool. | 115,832 embryo time-lapse sequences (validation set of 17,249 embryos, 2212 with known outcomes). | Deep learning (CNN; iDAScore v1). | -AUC of 0.95 in predicting implantation when all embryos are considered together (including 1510 embryos labeled as discarded due to manual deselection by embryologist or aneuploidy). -Inclusion of discarded embryos in model training aids deep learning. |
Hickman et al. (2022)95* | Embryo selection for transfer. | -CHLOE EQ™ score based on embryo bioinformatics and relation to expert embryologist grading, implantation, and live birth. | 799 day-5 embryo time-lapse videos. | Not disclosed | -CHLOE EQ™ score was directly related to embryologist ranking of morphology. -CHLOE EQ™ score differentiated between embryos that implanted and those that did not. -Strong correlation between human and AI-determined morphokinetic labeling. -Was not predictive of live birth. |
Diakiw et al. (2023)89 | Embryo selection for transfer. | -AI model using deep CNN and Grad-CAM++ mapping. | 9359 day-5 blastocyst images from 4709 women who underwent IVF. | Deep learning (CNN) | -Heat maps generated for regions relating to viable and nonviable embryo classification, and AI score generated. -Positive linear correlation of AI scores with pregnancy outcomes was found, leading to 12.2% reduction in time to pregnancy in comparison with standard morphological grading methods. -AI scores significantly correlated with Gardner morphological score and were associated with embryo ploidy status. |
Meseguer Escriva et al. (2022)99* | Aneuploidy assessment. | -AI model using 5 feature extraction models to predict ploidy status (abnormal morphokinetic patterns, an embryo grading classification algorithm, differential cell division activity, mitochondrial DNA content, and quantification of blastocoelic contractions). | Retrospective dataset of 2502 embryo time-lapse sequences with known ploidy status. | Deep learning (CNN) | -Integration of all 5 features led to 90% accuracy in prediction of ploidy status. -Non-invasive AI-guided PGT triage could be a useful adjunct to conventional embryo selection or recommendation for PGT. |
Barnes et al. (2023)100 | Aneuploidy assessment. | -Prediction of ploidy status based on static images, morphokinetic parameters, morphological assessments, and maternal age. | Retrospective dataset of 10,378 annotated blastocysts from 1385 patients with known ploidy status. | Deep learning (CNN) | -‘STORK-A’ automated embryo evaluation predicted aneuploid versus euploid embryos with an accuracy of 69.3% (AUC 0.761) when using images, maternal age, morphokinetics, and blastocyst score. -Accuracy increased to 77.6% in prediction of complex aneuploidy vs. euploidy. -On two external test datasets, the model achieved accuracies of 63.4% and 65.7%, showing generalizability. |
Summary of studies using artificial intelligence (AI) and machine learning (ML) methods for embryo assessment, prediction, and selection. The asterisk (*) indicates studies from conference proceedings. PN pronuclear, AUC area under curve, CNN convolutional neural network, ANN artificial neural network, k-NN k-nearest neighbor.
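
Several entries in the Results column report accuracy, sensitivity, specificity, PPV, and NPV, all of which can be derived from a binary confusion matrix. The following is a minimal Python sketch of those definitions. The `ConfusionMatrix` class and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from any study in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary-classifier counts (hypothetical helper, not from any cited study)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    def sensitivity(self) -> float:
        # Fraction of actual positives that were predicted positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of actual negatives that were predicted negative.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Positive predictive value: fraction of positive calls that are correct.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Negative predictive value: fraction of negative calls that are correct.
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        # Fraction of all predictions that are correct.
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


# Illustrative counts only (assumed, not reported by any study above).
cm = ConfusionMatrix(tp=90, fp=10, tn=80, fn=20)

if __name__ == "__main__":
    for name in ("sensitivity", "specificity", "ppv", "npv", "accuracy"):
        print(f"{name}: {getattr(cm, name)():.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of positive cases in the dataset, so values reported on differently balanced datasets may not be directly comparable across studies.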