2024 Mar 1;7:55. doi: 10.1038/s41746-024-01006-x

Table 2.

Ovarian stimulation assessment studies using artificial intelligence

Study | Aims of study | Outcomes of interest | Dataset | AI methods | Results
Andersen et al. (2017)22

-Assess the efficacy of individualized dosing of follitropin delta (r-FSH) by body weight and AMH vs. conventional follitropin alpha dosing.

-Ongoing pregnancy and implantation rates.

-Patient safety and level of OHSS.

-Randomized, multi-center, assessor-blinded, noninferiority trial across 11 countries with 1329 women aged 18–40 years.

-Cycles were under a fixed day-6 GnRH-ant protocol with a GnRH-a or hCG trigger for oocyte maturation.

Proprietary algorithm

-Ongoing pregnancy (30.7% vs. 31.6%) and implantation rates (29.8% vs. 30.7%) were similar.

-2.3% of patients required measures against OHSS, compared to 4.5% in conventional dosing.

-Similar efficacy and improved safety were observed, with significantly less FSH used.

Letterie and Mac Donald (2020)30

-CDSS to conduct the day-to-day management of OS.

Acc., TPR, and PPV of the algorithm to support critical decision-points:

(1) stop or continue stimulation?

(2) if stop, trigger or cancel the cycle?

(3) if continue, after how many days to follow-up?

(4) any FSH dose adjustments?

-Retrospective dataset with 3,159 IVF cycles from a single center. 556 cycles (17.6%) held out for independent testing of the CDSS vs. 12 clinicians.

-Cycles were either the flexible GnRH-ant protocol or MDL, with an hCG trigger for oocyte maturation. Input variables: E2 levels, follicle sizes during scans, cycle day during OS, FSH dose during OS.

Evaluated 5 model types: Decision trees, random forest, support vector machine, logistic regression, ANN.

(1) Acc. 0.92: continue (TPR 0.94; PPV 0.95), stop (TPR 0.85; PPV 0.85).

(2) Acc. 0.96: trigger (TPR 0.98; PPV 0.97), cancel (TPR 0.75; PPV 0.78).

(3) Acc. 0.87: 1 day (TPR 0.89; PPV 0.86), 2 days (TPR 0.84; PPV 0.88), 3 days (TPR 0.91; PPV 0.86).

(4) Acc. 0.82: same (TPR 0.96; PPV 0.84), decrease (TPR 0.23; PPV 0.67), increase (TPR 0.25; PPV 0.55).

-A seminal proof-of-concept for a CDSS during OS which generally agreed with evidence-based decisions.

Qiao et al. (2021)24

-Assess the efficacy of individualized dosing of follitropin delta (r-FSH) by body weight and AMH vs. conventional follitropin alpha dosing in Asian patients.

-Ongoing pregnancy rate and LBR.

-Patient safety and incidence of OHSS.

-Randomized, multi-center, assessor-blinded, noninferiority trial across China, South Korea, Vietnam, and Taiwan, with 1009 women aged 20–40 years.

-Cycles were under a fixed day-6 GnRH-ant protocol with a GnRH-a or hCG trigger for oocyte maturation.

Proprietary algorithm

-Ongoing pregnancy rate (31.3% vs. 25.7%) was similar between individualized and conventional dosing.

-LBR was significantly higher in individualized dosing (31.3% vs. 24.7%).

-Incidence of early OHSS was significantly lower (5.0% vs. 9.6%).

-Similar efficacy and improved safety were observed, with significantly less FSH used.

Ishihara et al. (2021)23

-Assess the efficacy of individualized dosing of follitropin delta (r-FSH) by body weight and AMH vs. conventional follitropin beta dosing in Japanese patients.

-Number of oocytes retrieved and LBR.

-Patient safety and incidence of OHSS.

-Randomized, multi-center, assessor-blinded, noninferiority trial in Japan with 347 women aged 20–40 years.

Proprietary algorithm

-Noninferiority was shown in the number of oocytes retrieved with individualized dosing (9.3 vs. 10.5).

-LBR was 23.5% with individualized dosing vs. 18.6%.

-Occurrence of OHSS was significantly less with individualized dosing (11.2% vs. 19.8%).

-Similar efficacy and improved safety were observed, with significantly less FSH used.

Fanton et al. (2022)11

-Optimize starting OS dose with respect to maximizing the number of MIIs and usable blastocysts.

(1) Number of MIIs retrieved (MAE).

(2) Expected benefits in reduced total OS dose requirements, and increased number of MIIs, 2PNs, and usable blastocysts, when comparing to propensity-matched patients under an optimal dosing strategy.

-Retrospective dataset with 18,591 cycles from three centers (1229, 11,233, 6129 cycles respectively).

-87% of cycles had OS including Menopur; 13% had only pure FSH administered.

-Input variables: age, BMI, baseline AMH and AFC.

k-NN regression (k = 100) using 5-fold CV. Logistic regression for propensity matching.

(1) MAE of 3.79 MIIs with R2 = 0.45.

(2) 30% of cycles were dose-responsive and 64% flat-responsive.

-Dose-responsive patients with an optimal dosing strategy were predicted to have 1.5 more MIIs, 1.2 2PNs, and 0.6 usable blastocysts, with 195 IU less total FSH.

-Flat-responsive patients were predicted to have 0.3 more MIIs, 0.3 2PNs, and 0.2 usable blastocysts, with 1375 IU less total FSH.

-Demonstrates potential for individualized OS with significantly reduced dosing requirements.

Correa et al. (2022)16

-Optimize starting OS dose with respect to the number of MIIs.

-Mean performance score in comparison to a clinician’s dosing strategy. The performance range was defined from -1 (dose too low) to +1 (dose too high), with 0 being ideal.

-A retrospective dataset of first cycles from 5 centers was analyzed: 2713 patients with a mean age of 37.7 ± 4.6 years, and a further 774 patients with a mean age of 38.3 ± 4.4 years held out for independent testing.

-Input variables: age, BMI, AFC, AMH, and proven fertility (Yes/No).

Linear regression with 5-fold CV

-Algorithm aimed to optimize dosing strategy to achieve 12 MIIs.

-Demonstrated potential to surpass the performance of standard practice.

-Mean performance score in the test set was 0.89 (95% CI 0.88–0.90) versus the clinicians’ suggestions 0.84 (95% CI 0.82–0.86).

Xu et al. (2023)17

-Predicting (A) starting and (B) adjustment of OS dose with respect to the number of oocytes retrieved.

-Developing an online tool for clinicians to use.

-Generalized R2 and RMSE of models A and B.

-Development of a practical online tool for clinicians.

-Retrospective dataset with 621 antagonist cycles from a single center. 30% held out for independent testing.

-Input variables for (A): AMH, AFC, basal FSH, age.

-Input variables for (B): AMH, AFC, age, change in inhibin B.

Lasso regression

-Model (A) had R2 = 0.923 and RMSE = 0.224 in the test set. AMH contributed the most.

-Model (B) had R2 = 0.909 and RMSE = 0.231 in the test set. Change in inhibin B contributed the most.

-A practical online tool (‘POvaStim’) was successfully developed incorporating both models, which now awaits RCT validation.

Zieliński et al. (2023)18

-Predict the number of MIIs retrieved using both clinical and genetic features.

-RMSE (primary metric), MAE, and MAPE of models based solely on clinical data (A) and when augmented with genetic data (B).

-Retrospective dataset across 9 clinics from 6,043 patients, who had 9,090 IVF treatment cycles.

-264 of the patients had genetic data available (with 516 IVF cycles).

-Clinical and genetic features were iteratively added to reduce the RMSE of the models.

-Light Gradient Boosting Machine with 5-fold CV and ‘SHAP’ predictor analysis.

-Genetic features were generated using classical bioinformatics analyses (e.g., haplotype construction).

-(A) RMSE = 3.53, MAE = 2.58, MAPE = 2.71. The final included features were AMH, AFC, age, number of cumulus-denuded oocytes and MIIs in the previous cycle attempt, and PCOS diagnosis. AMH was the most important predictor.

-(B) RMSE = 3.35, MAE = 2.48, MAPE = 0.68. The final included features were the same as (A) in addition to IV8-6, IV41-8, and IV22-2 genetic features. AMH remained the most important predictor and was correlated to IV8-6. Haplotypes IV41-8 and IV22-2 both contributed to increasing the number of MIIs retrieved.

-Seminal demonstration that genetic data can augment the performance of clinical predictor models.

Ferrand et al. (2023)13

-Predict the number of oocytes retrieved from OS without transferring sensitive data.

(1) Number of oocytes retrieved (MAE) and MAPE.

(2) Range of oocyte number, determined by 2 clinicians: (A) {0, 1–3, 4–7, 8–12, 13–20, 21–29, 30+} oocytes; (B) {0, 1–5, 6–10, 11–18, 19–25, 25+} oocytes.

-Retrospective dataset with 11,286 cycles from a single center. 20% of cycles held out for independent testing.

-16 input variables considered: age, AMH, BMI, initial OS dose, basal E2, AFC, basal FSH, basal LH, infertility types, number of previous pregnancies, number of oocytes retrieved, protocol, OS drug type, WHO ovulatory disorder status, smoking status, basal testosterone, basal thyroid stimulating hormone.

-Light Gradient Boosting Machine and 5-fold CV.

-‘SHAP’ predictor analysis was used.

(1) MAE = 4.21 oocytes; MAPE = 0.52.

(2) A: MAE 0.73 bins of deviation. B: MAE 0.62 bins of deviation.

-Overall 5 most important features across models: AFC, AMH, basal FSH, initial OS dose, and number of previous pregnancies.

-Demonstrates the feasibility of using federated learning to develop an oocyte prediction model.

Studies which use machine learning (ML) techniques to optimize gonadotropin dosing and duration during OS. IVF in vitro fertilization, CDSS clinical decision support system, OS ovarian stimulation, Acc. accuracy, TPR true positive rate (sensitivity), PPV positive predictive value, FSH follicle-stimulating hormone, r-FSH recombinant FSH, GnRH-ant gonadotropin-releasing hormone antagonist, MDL microdose leuprolide (flare), GnRH-a gonadotropin-releasing hormone agonist, hCG human chorionic gonadotropin, E2 estradiol, P4 progesterone, AFC antral follicle count, AMH anti-Müllerian hormone, LH luteinizing hormone, ANN artificial neural network, MAE mean absolute error, R2 coefficient of determination, RMSE root-mean-squared error, MAPE mean absolute percentage error, PCOS polycystic ovary syndrome, CV cross-validation, BMI body-mass index, IU international units, MIIs metaphase-II oocytes, 2PNs two-pronuclear embryos, k-NN k-nearest neighbor, LBR live birth rate, cLBR cumulative LBR.
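The performance metrics abbreviated in the table (Acc., TPR, PPV, MAE, RMSE, MAPE, R2) follow standard definitions, which can be sketched in a few lines of Python. This is a minimal illustration only, not the code used by any of the listed studies. The function names and the example values are hypothetical. MAPE is returned as a fraction and skips cases where the true value is zero, which is an assumption of this sketch rather than something stated in the table.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (as a fraction), and R2 for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    # Assumption: cases with a true value of 0 are skipped, because the
    # percentage error is undefined there.
    nonzero = [(t, e) for t, e in zip(y_true, errors) if t != 0]
    mape = sum(abs(e) / abs(t) for t, e in nonzero) / len(nonzero)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


def classification_metrics(y_true, y_pred, positive):
    """Return overall accuracy plus TPR and PPV for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")  # sensitivity
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")  # precision
    return {"Acc": acc, "TPR": tpr, "PPV": ppv}


# Hypothetical example values, for illustration only.
oocytes_true = [10, 8, 12, 4]
oocytes_pred = [9, 10, 12, 6]
reg = regression_metrics(oocytes_true, oocytes_pred)

decisions_true = ["continue", "continue", "stop", "stop", "continue"]
decisions_pred = ["continue", "stop", "stop", "continue", "continue"]
cls = classification_metrics(decisions_true, decisions_pred, positive="continue")
```

Computing TPR and PPV per class, as in the second function, matches how the table reports a separate pair of values for each decision option (for example, "continue" vs. "stop").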