2024 Mar 1;7:55. doi: 10.1038/s41746-024-01006-x

Table 1.

Rules-of-thumb for most suitable machine learning algorithms

| AI method | Learning type | Common tasks | Most suitable data types | Quantity of data required | Interpretability | Example use in ART? |
|---|---|---|---|---|---|---|
| Linear/Logistic regression | Supervised | C&R | Numerical | ++ | +++ | Optimizing trigger day timing [27] |
| Decision tree | Supervised | C&R | Numerical, categorical | ++ | +++ | Decision-making during OS [30] |
| k-NN | Supervised | C&R | Numerical, categorical | + | ++ | Optimizing starting dose during OS [11] |
| SVM | Supervised | C&R | Numerical, categorical | ++ | ++ | Streamlining monitoring of patients during OS [31] |
| Random forest | Supervised | C&R | Numerical, categorical | ++ | ++ | Predicting risk of OHSS during OS [32] |
| CNN | Supervised, unsupervised | C&R, clustering | Image, audio, text | +++ | + | Predicting ploidy status of an embryo [100] |
| k-means | Unsupervised | Clustering | Numerical | ++ | ++ | Effect of sperm parameters on IVF outcomes [64] |
| GAN | Unsupervised | Generative | Image, time-series, text | +++ | + | Generating synthetic embryo images [73] |
| LLM | Unsupervised | Generative | Text | +++ | + | Pre-treatment counseling [6] |

Rules of thumb for determining the most suitable machine learning algorithm for a task, with relevant examples of their application. Three plus signs (+++) indicate the highest requirement or capacity, and one plus sign (+) the lowest. For example, the convolutional neural network (CNN) supports several data types and generally requires large quantities of data (i.e., thousands of samples) for adequate performance, but exhibits poor interpretability (i.e., it is a 'black box'). Conversely, k-nearest neighbors (k-NN) can work well even with only hundreds of data samples, and the weighting of predictors can be reasonably estimated for interpretability purposes.

AI, artificial intelligence; ART, assisted reproductive technology; C&R, classification and regression; SVM, support vector machine; CNN, convolutional neural network; GAN, generative adversarial network; LLM, large language model; OS, ovarian stimulation; OHSS, ovarian hyperstimulation syndrome.
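
To illustrate how the rules of thumb in Table 1 might be applied, the rows can be encoded as a small lookup and filtered by task, data type, data budget, and interpretability needs. This is only a sketch: the `Method` record, the `shortlist` function, and the mapping of plus signs to the integers 1 to 3 are assumptions made for illustration and are not part of the original article.

```python
from dataclasses import dataclass

# C&R in Table 1 stands for classification and regression.
CR = ("classification", "regression")


@dataclass(frozen=True)
class Method:
    """One row of Table 1 (hypothetical encoding)."""
    name: str
    learning: tuple
    tasks: tuple
    data_types: tuple
    data_needed: int        # assumed scale: 1 = "+", 2 = "++", 3 = "+++"
    interpretability: int   # same assumed 1-3 scale


TABLE_1 = (
    Method("Linear/Logistic regression", ("supervised",), CR, ("numerical",), 2, 3),
    Method("Decision tree", ("supervised",), CR, ("numerical", "categorical"), 2, 3),
    Method("k-NN", ("supervised",), CR, ("numerical", "categorical"), 1, 2),
    Method("SVM", ("supervised",), CR, ("numerical", "categorical"), 2, 2),
    Method("Random forest", ("supervised",), CR, ("numerical", "categorical"), 2, 2),
    Method("CNN", ("supervised", "unsupervised"), CR + ("clustering",),
           ("image", "audio", "text"), 3, 1),
    Method("k-means", ("unsupervised",), ("clustering",), ("numerical",), 2, 2),
    Method("GAN", ("unsupervised",), ("generative",),
           ("image", "time-series", "text"), 3, 1),
    Method("LLM", ("unsupervised",), ("generative",), ("text",), 3, 1),
)


def shortlist(task, data_type, max_data_needed=3, min_interpretability=1):
    """Return names of Table 1 methods matching the given constraints."""
    return [
        m.name
        for m in TABLE_1
        if task in m.tasks
        and data_type in m.data_types
        and m.data_needed <= max_data_needed
        and m.interpretability >= min_interpretability
    ]


if __name__ == "__main__":
    # Small categorical dataset: only methods rated "+" for data quantity.
    print(shortlist("classification", "categorical", max_data_needed=1))
    # Image classification with no constraint on data or interpretability.
    print(shortlist("classification", "image"))
```

Such a filter only narrows the candidates according to the table's coarse ratings; choosing among the shortlisted methods still depends on the specific clinical task and dataset.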