Reproductive Biology and Endocrinology: RB&E
2026 Mar 11;24:36. doi: 10.1186/s12958-026-01542-z

The potential, perils and pitfalls of Artificial intelligence (AI) in Assisted Reproductive Technologies (ART)

Akanksha Garg 1, David B Seifer 1,2
PMCID: PMC13003707  PMID: 41814354

Abstract

The applications of Artificial intelligence (AI) have increased exponentially in all aspects of life, including within Assisted Reproductive Technologies (ART). Although AI-assisted tools have demonstrated tremendous promise, there are concerning limitations and cautions to heed before widespread use is adopted into everyday clinical practice. The aim of this review is to offer healthcare providers and researchers an overview of AI applications and to assist them in independently reviewing and appraising AI research in its application to ART.

We review different types of AI technologies while exploring current research of applications of AI within ART. Specific applications include embryo and gamete selection and AI as a clinical decision-making support system. We summarize the benefits of integrating AI into an ART setting and describe the known limitations and challenges of this technology in its application to our missions as clinicians, educators and scholars. We consider the ethical implications of AI within ART and equip clinicians and researchers with a framework to critically appraise, evaluate and review research utilizing AI specifically within the setting of ART. We present current international guidance and governance that can be adapted to publishing and reporting AI research in ART.

Keywords: Artificial Intelligence, AI, AI limitations, Assisted reproduction, ART, IVF, Critical appraisal, Benefits, Risks

Introduction

Artificial intelligence (AI) has infiltrated all aspects of day-to-day life, including medicine, research, and beyond [1]. The accessibility, functionality, and ease of use make AI an appealing and potentially transformational tool that is anticipated to positively impact patient care and clinical outcomes. However, these early days of AI are characterized by minimal administrative regulation and a convergence of influences that are leading to growth at an exponential rate due to billions in annual capital expenditure by global private enterprise. All of this contributes to a sense of wonder and excitement about the potential for transformations in societal thought and productivity. One touted aspiration of AI is to reduce the financial burden of healthcare by lowering the cost of intelligence and cognitive labor while augmenting clinical decision-making. However, there are serious concomitant concerns about the impact on job security, trustworthiness of the product, data confidentiality, and how AI may diminish our deep thinking as a species [2]. As users of this novel and powerful technology, we have a responsibility to approach this rapid change in society with both curiosity and prudent caution.

In the field of Reproductive Endocrinology and Infertility (REI), Assisted reproductive technologies (ART) have been at the forefront of newly adopted technologies directed at improving patient reproductive treatment outcomes [3]. Infertility is estimated to affect 186 million people worldwide [4]. Huge technical, social, and clinical advances have resulted in the widespread use of in vitro fertilization (IVF) over the course of almost 50 years since the birth of the first IVF baby in 1978, with roughly 2.6% of all babies born in 2023 in the United States and more than 10 million children worldwide being conceived via IVF [5]. As medical professionals within the field of ART who are leading users of such evolving medical technologies, we have a responsibility to be early vetters.

We must carefully examine the limited data at hand, and if our due diligence shows such advances to be promising with benefits outweighing the risks, we often become early adopters of cutting-edge technologies. This article addresses the benefits and applications of AI to the field of ART while highlighting the potential perils, pitfalls and challenges of this technology in its application to our missions as clinicians, educators and scholars. Most importantly, it aims to equip REI providers with tools to critically appraise, evaluate and review ART research utilizing AI methodologies while providing a broad overview of relevant AI platforms.

Definition of AI technologies being used in ART

What is Artificial intelligence (AI)?

For decades, advances in computer programming have allowed day-to-day human tasks to become efficient, automated and precise. AI describes a set of computer systems that can perform more complex tasks generally requiring higher-order human thinking (e.g. decision-making, reasoning, creative thought) [6] and that can apply pre-existing knowledge/experience to solving new problems. AI systems can perform tasks in unpredictable circumstances using experience to improve performance, demonstrate human-like higher-level cognitive thinking skills, and act rationally to achieve these goals [6].

Characterizing the different AI technologies is challenging due to rapid advancements and the overlap between various models. The earliest AI models, in the 1950s, were rule-based: logic, search trees and hand-coded rules were used to tackle narrow problems such as playing a game of chess. This was followed by a brief period of waning funding and interest in AI, known as the 'AI winters'. In the late 1990s and early 2000s, machine learning (ML) based models emerged. The key shift with this technology was a switch from rules to large volumes of data, and the ability to learn new patterns rather than follow pre-programmed rules. Within machine learning (i.e. AI that learns from data), systems can use 'supervised', 'unsupervised' or 'reinforcement' learning. Supervised learning involves learning from pre-defined labeled examples and is often used for prediction or classification. Unsupervised learning involves finding patterns without pre-defined labels. Reinforcement learning describes a framework in which learning happens via trial and error.
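The distinction between supervised and unsupervised learning can be made concrete with a minimal sketch. The example below is illustrative only (not from the article, and far simpler than any clinical model): the same toy one-dimensional dataset is handled once with labels (supervised) and once without (unsupervised).

```python
# Toy 1-D data; the labels are used ONLY in the supervised case.
data = [2, 3, 4, 18, 20, 22]
labels = ["low", "low", "low", "high", "high", "high"]  # pre-defined examples

# Supervised learning: use the labeled examples to learn a decision rule
# (here, a threshold at the midpoint between the two class means).
low_mean = sum(x for x, y in zip(data, labels) if y == "low") / 3
high_mean = sum(x for x, y in zip(data, labels) if y == "high") / 3
threshold = (low_mean + high_mean) / 2

def classify(x):
    """Predict the label of a new, unseen value."""
    return "high" if x > threshold else "low"

# Unsupervised learning: find structure without any labels, e.g. split the
# points into two groups around the overall mean (one crude clustering step).
overall_mean = sum(data) / len(data)
cluster_a = [x for x in data if x <= overall_mean]
cluster_b = [x for x in data if x > overall_mean]

print(classify(19))          # → "high" (prediction for an unseen case)
print(cluster_a, cluster_b)  # → [2, 3, 4] [18, 20, 22] (no labels used)
```

Real ART models apply the same two paradigms at scale: supervised models learn from cycles with known outcomes, while unsupervised models look for structure (e.g. patient subgroups) without outcome labels.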

The majority of AI technologies in IVF today utilize a subset of ML called 'deep learning' (DL), in which massive datasets are used to train models built from neural networks (NN) with multiple layers, loosely analogous to the human brain. There are numerous types of NN, including convolutional NN (CNNs), which use convolutional filters to detect local patterns, and artificial NN (ANN), which consist of artificial nodes acting as neurons. Convolutional filters are small learned matrices that scan input data to detect patterns, and artificial nodes are computational units within a NN that receive input and transmit data to other layers, much like biological neurons [7]. Other commonly encountered AI technologies include natural language processing (NLP), which understands and generates language; generative AI, which creates new content; and multimodal AI, which can process multiple data types simultaneously. Generative AI includes large language models (LLMs) and large multimodal models (LMMs), which respectively utilize language and multi-media inputs [7, 8]. Table 1 compares the two types of generative AI tools and their possible uses. It is important to note that none of the generative AI models are validated as diagnostic medical tools, so they should be used with caution in the context of medicine, much as clinicians/researchers would use any other reference, such as a search engine or textbook. OpenEvidence, a specialized generative AI, can be used to inform medical decision making in a similar manner to other platforms such as UpToDate.
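A convolutional filter is easier to grasp with a tiny worked example. The sketch below is illustrative only (not from the article): a 2x2 vertical-edge filter is slid across a 4x4 "image", and the output is largest exactly where the local pattern (an intensity edge) occurs, which is the basic operation a CNN layer performs on embryo or sperm images.

```python
# 4x4 grayscale "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# A small learned matrix (filter): responds when left pixels differ from right.
kernel = [
    [-1, 1],
    [-1, 1],
]

def apply_filter(img, k):
    """Slide the 2x2 filter over every 2x2 patch (valid cross-correlation,
    the operation most DL libraries call 'convolution')."""
    out = []
    for i in range(len(img) - 1):
        row = []
        for j in range(len(img[0]) - 1):
            response = sum(
                img[i + di][j + dj] * k[di][dj]
                for di in range(2) for dj in range(2)
            )
            row.append(response)
        out.append(row)
    return out

feature_map = apply_filter(image, kernel)
print(feature_map[0])  # → [0, 18, 0]: strongest response where the edge sits
```

In a trained CNN, the filter values are not hand-written as here but are learned from data, and hundreds of such filters are stacked in layers.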

Table 1.

A comparison of commonly used generative AI tools

Foundational: large, generalized models trained on broad datasets and used by the general public

- Examples: ChatGPT [9], Gemini [10], Claude [11]
- Pros: broader knowledge base; more powerful, as trained on larger datasets; better at generalized reasoning
- Cons: lacks focus and topic-specific depth; lacks industry- or application-specific safeguards; not HIPAA compliant
- Possible uses: generate summaries on a topic; generate bodies of text; generate images

Specialized: models fine-tuned to specific use cases/industry verticals and used by verified medical professionals in a specific field of expertise

- Example: OpenEvidence [12]
- Pros: embeds specific knowledge and guardrails; deeper knowledge of its area of medicine; HIPAA compliant
- Cons: cannot generate images; limited functionality
- Possible uses: explore specific medical questions using high-precision medical search and literature synthesis, with primary sources drawn from high-impact peer-reviewed journals

To confirm the quality and legitimacy of an AI response to any query, we recommend checking and validating the primary references cited as the basis for the conclusions and recommendations offered by the generative AI platform. Once that step verifies legitimate references of acceptable quality, it is advisable to add a review layer by confirming the advice against another trusted independent source (e.g. professional society guidelines and/or Cochrane Reviews) before taking clinical action.

The applications of AI in healthcare can be generally categorized into 5 broad domains [7]. AI can be used as a predictive tool (using data to predict future outcomes), a classification tool (sorting data into normal/abnormal), an association tool (finding relations between variables to draw clinical conclusions), a regression tool (assessing the strength of the relation between one variable and others), or an optimization tool (automating practical tasks, e.g. administrative tasks) [7]. In the next section, a brief overview of AI in ART is provided, which mostly involves harnessing the predictive, classification and optimization capabilities of AI.

Applications of AI in ART

Embryo selection

High-quality embryo selection is a critical step in IVF to maximize the chances of a successful pregnancy [13]. Traditionally, this is done via morphology-based embryo selection, during which a trained embryologist visually inspects and selects the healthiest appearing embryos; however, this is prone to intra-observer variability and subjectivity [14]. Traditional embryo selection relies on the annotation of embryo imaging and ranking of quality within a given cohort of embryos [13]. Images can either be static (e.g. cleavage-stage embryos graded based on specific criteria) or can utilize time-lapse videography (TLV), which is thought to reduce inter-operator subjectivity and allow novel viability biomarker identification [13, 15]. TLV involves optical microscopy imaging captured at set time intervals to allow assessment of embryo development [16]. However, TLV alone has not been shown to improve clinical pregnancy or live birth rates compared with standard static-image morphological selection [17]. Multiple AI technologies incorporating static images, TLV and/or additional clinical parameters have been developed with a view to improving embryo selection.

Several individual studies report on different outcomes when comparing AI models against current practice or as prediction models for embryo selection. One of the largest randomized double-blind non-inferiority trials compared standard morphological assessment versus a DL AI model across 14 centers in Australia and Europe. Across 1066 patients, the clinical pregnancy rate was 46.5% in the AI group versus 48.2% in the morphological grading group, failing to demonstrate non-inferiority of deep learning in terms of clinical pregnancy rate [18]. Wang et al. prospectively compared embryologist grading (Gardner grading system) versus AI (using TLV and deep learning models) and demonstrated improved implantation rates in the AI group (81% versus 68%, p = 0.02); however, there was no difference in live birth or miscarriage rates [19]. A meta-analysis of 20 studies demonstrated that AI models predicted the likelihood of successful clinical pregnancy with greater accuracy (median accuracy 81.5%) than clinical embryologists (median accuracy 51%) [20], suggesting that AI-based embryo selection may outperform human-driven selection. However, data from large-scale, high-quality randomized studies assessing live birth rate as an outcome are currently lacking.

AI technologies utilizing morphology and/or clinical parameters can also be used as prediction models for embryo quality, ploidy status, miscarriage, pregnancy and live birth rates. One retrospective study demonstrated that a static image-based AI embryo selection model had 67% accuracy in predicting first-trimester abortions [21]. Several studies have assessed AI-based algorithms utilizing ML and ANN technologies, demonstrating area under the curve (AUC) values ranging between 0.70 and 0.84 for predicting implantation potential [22–26]. Another DL-based AI model trained on 10,378 embryos used static images and clinical information to predict ploidy status and showed an accuracy of 70% [27]. AI technologies have also been applied to whole-genome DNA methylation, which is thought to be a biomarker of embryo quality; preliminary data suggest that AI-based models can successfully predict live-birth rates and aid embryo selection (with an AUC of 0.9) [28].
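For readers appraising AUC figures like those above, it helps to recall what the number measures: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative one (0.5 = chance, 1.0 = perfect). A minimal sketch, with entirely hypothetical scores and labels (not from any cited study):

```python
def auc(scores, labels):
    """AUC = P(score of a positive case > score of a negative case),
    counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical model scores: 1 = implantation occurred, 0 = it did not.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1,   1,   1,    0,   0,   0]
print(auc(scores, labels))  # 7 of 9 positive/negative pairs ranked correctly
```

Here the AUC is 7/9 (about 0.78), i.e. in the middle of the 0.70–0.84 range reported for implantation prediction. Note that AUC measures ranking quality only; it says nothing about calibration or clinical utility.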

Gamete selection

Selecting the highest quality sperm and oocytes during an IVF cycle is essential to achieving optimal fertilization. Assessing gamete quality and predicting fertilization rates is also useful to counsel patients, for example, for those undergoing oocyte cryopreservation. The ability to predict outcomes based on the quality/number of retrieved oocytes can guide decision making in terms of planning for the number of required oocytes to tailor to the patient’s family planning goals [29].

Traditional sperm selection techniques include assessing the morphology, motility and DNA integrity, while advanced sperm techniques lack high-quality data demonstrating improvement in clinical outcomes [30]. In most clinical settings, an embryologist will manually select sperm based on the World Health Organization (WHO) criteria [31]. This is a time-consuming process, with significant inter- and intra-observer variability.

Several AI systems have been developed to assess sperm morphology, with the most common models utilizing deep neural networks (DNNs). Computer-aided sperm analyzers (CASA) have been around since the 1980s and can reduce intra-operator subjectivity; indeed, the latest WHO manual on sperm analysis recognizes that CASA can accurately determine sperm concentration and progressive motility parameters through fluorescent DNA stains and tail-detection algorithms [32]. ML prediction algorithms utilizing neural networks can achieve up to 92.9% accuracy in predicting abnormal sperm concentration and 85.7% accuracy in predicting sperm abnormalities [33]. Beyond prediction, DL AI technologies can also assess motility, morphology and DNA fragmentation. DL methodologies applied to video have been able to correlate kinetic motility patterns to aid sperm selection for intracytoplasmic sperm injection (ICSI) [34]. In terms of morphology, the WHO describes 11 different sperm head abnormalities; however, it can be challenging to accurately classify sperm into these categories by eye [35]. DL advancements using neural networks can objectively detect and classify these morphological abnormalities in real time [36]. Current tests that assess DNA fragmentation are invasive, which means that selected individual sperm cannot subsequently be used for ICSI. There are no guidelines on using DNA fragmentation analysis in selecting sperm; however, AI technologies using neural networks have been shown to successfully predict DNA integrity from single images, omitting the need for invasive testing [37].

Alongside healthy sperm, the selection of a mature oocyte is also vital to allow fertilization. To truly assess nuclear maturity, the extruded polar body must be confirmed and requires removal of the cumulus [38]. Currently, nuclear maturity is assessed non-invasively and visually by embryologists much like other aspects of fertilization described above. Oocytes are scored based on various parameters. Numerous DL based AI systems have been developed to assist this process. Examples include VIOLET™ (Future Fertility), which can analyze morphological features of 2D images and predict fertilization with 91.2% accuracy, thus being faster and more accurate than expert embryologists [39]. Beyond image-based analysis, AI models can also enhance non-invasive gene expression tests. The ‘OsteraTest’ uses ML and bioinformatics to non-invasively predict oocyte development with 86% accuracy [40]. Although large scale RCTs and subsequent meta-analyses are required, such applications of AI in terms of enhancing sperm and oocyte selection are promising avenues.

Clinical decision making

One of the biggest challenges in IVF is the prediction and optimization of outcomes, stratification of risk of adverse effects, and therefore, the personalization of treatment protocols to balance and assess these. For example, one could maximize the number of oocytes retrieved by using a more aggressive approach to ovarian stimulation; however, this may increase the risk of ovarian hyperstimulation syndrome (OHSS). Therefore, a large part of the clinician-patient relationship is fine-tuning protocols to individual medical and personal circumstances (e.g. patient’s tolerance of risk, access to funding, insurance status, personal timelines and motivation) and providing holistic care that encompasses the patient’s short and long-term goals. Such complex multi-layer decisions are often made using clinical judgment, experience and shared decision making. AI technologies have shown promise in assisting at various crossroads of clinical decision-making.

During pre-treatment counselling, numerous parameters are used to inform patients of their chance of success, including age, ovarian reserve, race and previous treatment history. AI models are being developed and used by patients, insurance companies and clinics to provide access to care (e.g. offer shared-risk financial programs) [41]. Such models can also reduce the time during initial intake and scoping of services. For example, an LLM based model titled ‘Fast Track to Fertility’ has shown to increase patient access by 24% and decrease the time to treatment by 50%, with increased patient engagement [42].

The ovarian stimulation and trigger protocols used are key aspects of an IVF cycle. Various permutations of protocols exist, and optimizing drug regimens can maximize follicle yield and the chance of a successful live birth [43]. Utilizing AI models (mostly regression-based) to aid gonadotropin dosing has been shown to improve safety, lower cancellation and OHSS rates, and improve oocyte yield, clinical pregnancy and live birth rates [8]. A large study of the national US database (SART CORS), with 365,473 patients, demonstrated how AI modelling using similar patients' data can be used to improve oocyte yield and live birth rates [44]. ML techniques can also aid decision-making on when to trigger oocyte maturation during the stimulation cycle, incorporating both ultrasound images and lab values. Retrospective studies show that follicles between 16 and 19 mm on the day of trigger are most likely to be associated with mature oocytes, and utilizing AI technologies can improve the number of mature oocytes and blastocysts; however, live birth data is currently lacking [44–46].

Benefits of AI

There are numerous practical and clinical benefits of AI technologies, specifically where there is access to big data [47], as demonstrated in the above sections. AI is capable of detecting, recognizing and classifying large amounts of data in order to analyze and predict outcomes beyond what may be expected from any human endeavor, while reducing time to task completion and reducing subjective bias [24]. This is reflected in the clinical outcome comparisons described above, where AI technologies generally match or outperform current human-led tasks. The fact that AI technologies learn in real time from large datasets can also improve standardization, reproducibility and scalability. AI may also be less prone than humans to certain errors, such as those arising from fatigue or inconsistency, thus potentially improving the validity, reproducibility and reliability of results.

AI technologies may also improve accessibility to IVF services, which in many countries remain privately funded. AI is able to perform functions much faster than human-driven tasks, which in turn can improve productivity in a clinical setting [42]. For example, AI using DL can automate embryo scoring without needing human annotation, potentially eliminating the need for manual scoring and saving a significant amount of time [48, 49]. Although there are costs associated with setting up AI technologies, once functional, they can help reduce long-term workforce and labor expenses while improving productivity. For example, if AI is integrated into clinical practice and can truly improve gamete and embryo selection, it could theoretically reduce the number of cycle failures. AI-powered tools such as voice-to-text transcription and automated charting can save 37–46% of provider time on administrative tasks such as charting admissions, transfers and discharges [50]. Integration of an AI-driven prognostic tool was also found to increase IVF usage amongst new patients in a fertility clinic [51], suggesting that AI can help broaden access and awareness of IVF treatments. Furthermore, generative AI technologies (e.g. ChatGPT) are able to adapt their interface based on the user's language, health literacy and personality, which makes them powerful tools for broadening access to care and interacting with patients from different backgrounds [52]. For example, ChatGPT-assisted surgical consent prior to total knee arthroplasty was shown to reduce anxiety scores and improve the overall patient hospitalization experience compared with traditional, surgeon-led consent [53]. Generative AI has also been shown to accurately generate notes and discharge instructions in other languages, which can reduce language barriers in healthcare [54].

Numerous predictive models aim to anticipate an individual patient's response to ovarian stimulation and suggest ways to improve upon it; however, there is no single established way to 'personalize' treatment plans in order to maximize patients' chances of success [55, 56]. For example, a large multi-center study utilizing AI was able to predict which follicle sizes improved individual live birth rates [8]. Since AI can analyze far more data points than any human process, it carries tremendous promise for moving toward personalized individual treatments that maximize patients' treatment outcomes.

Thus, while AI appears promising in each of the above working examples, there is a considerable way to go before these tools can be reliably incorporated into clinical use. The next section discusses current challenges with AI platforms, and the need for more robust high-quality data for training and validation to produce better models that are accurate, precise and tested.

Perils of AI

AI’s limited proven track record

As exciting and promising as the above AI technologies are, we need to use caution when considering their extensive role in clinical practice. Although there are numerous studies comparing AI-driven tools to current practice, there is a lack of high-quality, large, prospective randomized controlled trials. The existing studies show significant heterogeneity in design and quality, as well as a lack of outcome standardization, which limits the conclusions and generalizability that can be drawn from meta-analyses. Depending on the intervention being tested, outcomes can vary from 'non-inferiority' to the impact on clinical pregnancy and live birth rates. Ultimately, the purpose of IVF is not only a successful live birth, but also a healthy pregnancy and a healthy baby. Very few studies consider whether AI technologies specifically improve live birth rates or impact the health of the pregnancy and neonate, all of which are important clinical considerations when considering the purpose of ART.

Lack of transparency of how decisions are made by the AI model

ML and DL AI systems contain a multitude of layers, which are often non-transparent and hide the exact logic and pathways that drive decision making. This has created a sense of a 'black box', whereby the lack of transparency, interpretability and traceability makes it challenging for professionals to have sufficient trust in these systems [13]. An important question to ask is whether these additional hidden layers actually improve clinical outcomes [57]. For example, a comparison of 12 algorithms that predict blastocyst viability showed that logistic regression outperformed the more complicated, 'black-box' machine learning systems [58]. If the clinician utilizing a specific AI tool to make a clinical decision cannot understand how that tool reaches its conclusion, it creates a sense of mistrust [59]. This natural hesitance to use an 'unexplainable' tool is valid, as AI technologies are at risk of bias and overfitting depending on the presence or absence of bias in the data used to build the model. Clinician trust and acceptance depend upon the transparency of the AI platform's validation and of the patient populations its output is based upon, as these two items will impact the generalizability of the AI's recommendations. In other words, transparency enhances clinical judgement while opacity demands clinical vigilance.

Limitations due to the quality and bias in data used to train and validate AI models

Given that AI models are trained using a particular dataset, any limitations in the data itself are reflected in the model's functionality. Overfitting describes a phenomenon where a model learns the training data 'too well' and includes background 'noise' that is irrelevant to the question asked. Importantly, this is not a critique of the methodology itself, but of the data the AI learns from and is then validated against. Ultimately, the ability of AI to accurately predict an outcome or act as a diagnostic tool is limited by the quality and quantity of the data it has been trained and validated on [13, 60]. There is indeed a high risk of bias with ML and DL due to biases in the dataset, which can include discrimination based on gender, race and socioeconomic status [61]. For example, one study demonstrated how an ML-based model used to predict asthma exacerbations in children incorrectly overpredicted for children of lower socioeconomic status, suggesting worse predictive performance driven by bias [62]. Marginalized, underrepresented populations are often not included due to economic and cultural barriers to access, limiting their representation in any training or validation database used to develop an AI model [63].
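Overfitting can be shown in miniature. The sketch below is illustrative only (all data are hypothetical and not from the article): an "overfit" model that memorizes its training set, noise included, scores perfectly on the data it has seen but fails on new cases, while a simpler rule that captures the broad trend generalizes.

```python
# Hypothetical training set: (marker value, outcome); (6, 0) is a noisy label.
train = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 0)]
test = [(6, 1), (7, 1), (8, 1)]  # unseen cases follow the true trend

# Overfit model: a lookup table that reproduces the training data exactly,
# noise included, and defaults to 0 on anything it has not memorized.
lookup = dict(train)
def overfit(x):
    return lookup.get(x, 0)

# Simple model: one threshold learned from the broad trend (value > 3 -> 1).
def simple(x):
    return 1 if x > 3 else 0

train_acc = sum(overfit(x) == y for x, y in train) / len(train)
test_acc = sum(overfit(x) == y for x, y in test) / len(test)
simple_test = sum(simple(x) == y for x, y in test) / len(test)
print(train_acc, test_acc, simple_test)  # → 1.0 0.0 1.0
```

This is why reported training-set accuracy alone is uninformative when appraising an AI study: performance must be shown on data the model never saw.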

Accountability and risk of misinformation due to hallucinating, fabricated data, data leakage, and/or fragility

Within DL, LLMs and LMMs carry a high risk of 'hallucinating', whereby they fabricate false data, draw from data that does not exist, or suggest incorrect outcomes that may seem plausible on the surface but are false. This also raises the question of accountability. Who is held accountable for a particular AI system? Is it the team that developed it, or the individual using it? This raises deeper ethical questions (discussed later); however, AI must be viewed like any other helpful tool in a clinical setting (e.g. a serum full blood count panel), and ultimate clinical accountability lies with the clinician.

The handling of data by AI to draw clinical conclusions also carries risk. For example, ML applications are at risk of data leakage, underfitting and data poisoning. Data leakage occurs when data used for training is 'leaked' into the testing dataset. Data poisoning refers to the intentional and malicious introduction of harmful data into the system. Given that ML and DL AI rely on being 'trained' with a dataset, another concern is the external validity, scalability and transferability of an unbiased algorithm. If the dataset is not representative of the wider population or is inherently biased, this can reduce the real-world applicability of an algorithm (e.g. a prognostic AI tool trained on data derived purely from a Caucasian population but tested on a wider population), resulting in inaccurate predictions for marginalized groups.
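Data leakage is often subtle rather than malicious. One common form, sketched below with hypothetical numbers (not from the article), is fitting a preprocessing step on the full dataset before splitting it, so that the test values silently influence the features the model trains on:

```python
# Hypothetical lab values; the last two will end up in the test set.
data = [10, 12, 11, 13, 50, 55]

# Leaky pipeline: the scaling constant is computed on ALL data, so the
# training features depend on values that belong to the test set.
leaky_max = max(data)                       # 55 -- includes test values
train, test = data[:4], data[4:]
leaky_train = [x / leaky_max for x in train]

# Correct pipeline: fit the scaler on the training split only.
train_max = max(train)                      # 13 -- training data alone
clean_train = [x / train_max for x in train]

# The same raw value yields different features under the two pipelines,
# and the leaky one will make reported test performance look better.
print(leaky_train[0], clean_train[0])
```

When appraising an AI study, it is therefore worth checking that every data-dependent step (scaling, feature selection, imputation) was fitted on the training split alone.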

AI fragility, also referred to as AI brittleness, is a model's tendency to produce aberrant responses when it encounters data with minor deviations from its original training data. The AI performs well on familiar data but degrades and produces inaccurate information when triggered by noise. For example, LLMs may produce contradictory information or misinformation when interacting with idiosyncratic phrasing that differs from their training set.

Ethical considerations

Information decoupled from morality

E.O. Wilson stated that “the problem with humanity is that we have Paleolithic brains, medieval institutions and 21st-century God-like technology.” Given this insightful observation, we need to be intentionally vigilant in our approach to adopting AI and mindful of its impact on humanity. As sophisticated as AI is, a machine is still just a machine: it lacks semantic understanding and the aspects that define humanity, such as morals, compassion, comfort and care. Thus, AI applications in medicine are, and should remain, tools used by humans with professional training and humanity, rather than systems acting autonomously of human supervision.

Propagation of embedded and inherent bias

In an ever-evolving society with growing disparities and social inequalities, clinicians carry enormous ethical obligations to deliver safe, resourceful and equitable healthcare. As highlighted above, AI systems carry a significant risk of introducing and amplifying pre-existing biases inherent in the data used to train models. For example, if poor-quality data that is not representative of real-world populations is used to train a model, then performance biases can further harm already marginalized and underrepresented groups [64]. This is challenging as such biases are often structural and are embedded in multiple social and political levels. Therefore, clinicians need to be mindful of such possible biases when interpreting or using AI tools.

Breach of privacy

Patient confidentiality and safe data handling are integral to any healthcare system. A major concern with any technology that relies on the storage of, and access to, vast amounts of data is the risk of breaches in data security and privacy. AI is particularly at risk of cyber-attacks, data leakage and poisoning. Higher-level technical security is not always accessible or understandable, which leaves patients and providers vulnerable to substantial potential harm. Data show growing mistrust of data handling and concerns around privacy and cybersecurity amongst patients; it is therefore crucial to ensure that healthcare systems utilizing AI technologies have robust security measures [65].

The impact on employment

The possible impact of AI on the structure and functioning of society is monumental. With AI being more efficient than humans at many tasks, industries are being reshaped and jobs redefined. AI is leading both to the creation of new jobs and to the loss of current ones. As AI grows and transforms industries, job profiles will change significantly, requiring humans to reskill and incorporate AI into their skillset. Ethically speaking, this widens existing disparities, as not everyone has the means and access to learn AI technologies in order to remain relevant. It also raises the question of whose responsibility it is to provide training on emerging AI technologies. Within the realm of ART, for example, many AI models can perform embryo selection at a level similar or superior to embryologists. This could reshape the role of embryologists, whereby part of their current job could be automated by AI [66].

The impact on the environment

AI also impacts our planet in that energy-intensive computations leave a significant carbon footprint. Indeed, 5.4 million liters of water were used to train GPT-3 models in Microsoft’s United States data centers [67]. It is therefore vital to consider the renewable energy sources that will power AI systems and how sustainable this is for our planet. There is growing concern about the impact of AI data centers on public health and the environment. Harmful air pollutants, such as nitrogen oxides, are released by on-site generators and are linked to respiratory diseases such as asthma, as well as lung cancer [68]. Given the 24/7 nature of AI data centers, diesel is currently used to power backup generators, adding further burden to the environment. AI data centers are projected to account for up to 12% of the United States’ electricity consumption by 2028, compared with 4.4% in 2023, demonstrating the rapid increase in the carbon footprint associated with AI. The use of renewable energy to power such data centers is vital, and we must remain cognizant of the environmental impact of incorporating AI into everyday life on a larger scale [69]. Alternative emerging non-renewable power sources for data centers include small modular reactors (SMRs), which supply carbon-free nuclear energy.

The patient’s perspective

As AI becomes more integrated into all aspects of life and healthcare, people’s experiences and expectations of AI will also evolve. A key ethical aspect to consider is how patients pursuing ART perceive AI involvement in their care. One survey of 200 patients undergoing IVF reported lower trust in AI-informed reproductive care and demonstrated a preference for physician-based recommendations in treatment-related decisions; however, more patients favored AI-generated recommendations for gamete and embryo selection. Patients were also unwilling to pay more for AI-informed IVF care [70]. It is important to note that this study was subject to selection bias, as the patient population had high health and technological literacy levels, and it may not represent the views of the general population. Ethically speaking, it is important to consider how patients perceive AI involvement in their care and how this affects their engagement and trust in their treatment and clinicians.

WHO ethical guidelines on AI in healthcare

In view of the above concerns, the World Health Organization (WHO) has developed ethical principles to guide the use of AI in healthcare. The key ethical principles include [71]:

  • Protect human autonomy: With AI being able to compete with and even outperform human cognitive thinking, it is vital that humans remain in control of medical decision-making and that patients make autonomous decisions with transparency in the use of AI. This expands beyond provider-patient interactions into the realm of medical research, whereby human intellectual thinking and analysis remain vital in assessing and using AI. AI designs that specifically require a human-in-the-loop (HITL) prioritize integrating a specific AI system with human judgement and oversight for accuracy and ethical decision-making.

  • Promoting human wellbeing and safety: At the very core of the medical community lies protecting the wellbeing of our population, and the mantra of ‘do no harm’. AI should not harm people, and should follow the safety, accuracy and efficacy standards that are applicable to any other aspect of society.

  • Transparency, explainability and intelligibility: AI should be understandable (to different extents) by any human stakeholder (e.g. developers, providers, patients, regulators), with the extent of understanding depending on the context and on the capacity of those to whom it is explained. Furthermore, an AI tool should be required to identify itself as an AI tool in any interaction with both AI and non-AI (i.e. human) parties, so that all parties are aware of the source of the interaction at all times.

  • Responsibility and Accountability: AI technologies should be evaluated by stakeholders, including patients and providers.

  • Inclusiveness and Equity: AI should be designed “to encourage the widest possible appropriate, equitable use and access, irrespective of age, sex, gender, income, race, ethnic group, sexual orientation, ability, or other characteristics protected under human rights codes”. AI should actively avoid perpetuating biases against already minoritized groups.

  • Responsive and sustainable: AI should constantly be re-evaluated during use by users and developers. Governing organizations remain responsible for ensuring that the environmental consequences of incorporating AI are minimal and that workplace disruptions, including job losses, training of healthcare workers and restructuring of the workplace, are appropriately navigated.

A framework for assessing AI technologies in ART

As highlighted above, AI tools have tremendous potential but also carry a need for caution and vigilance as they are integrated into clinical practice. Importantly, clinicians and researchers should have the ability to interpret and critically appraise ART research involving AI. This section aims to equip IVF researchers and clinicians with a framework for reading and interpreting manuscripts that utilize AI systems within the context of ART. There are numerous appraisal tools (e.g. APPRAISE-AI) that can evaluate the methodology and reporting quality of AI studies, with a recent systematic review identifying at least 26 studies and 9 reporting checklists [72, 73]; however, to our knowledge, none exists within the context of ART.

The first step when evaluating AI tools in ART is to classify what the AI system does. In ART, AI systems generally serve as prediction models, diagnostic tools, or clinical decision support (as described above). It is important to ensure that the study design and the methods used to develop the AI tool are robust. For example, AI methodologies must be externally validated, and the dataset should be large, diverse, transparent and traceable. If a small, cherry-picked dataset is used for training, the risk of bias and untrustworthy results is amplified. It is also important to consider whether the AI tool is useful and being compared against a fair parameter. For example, AI tools for embryo selection should be evaluated against the gold standard (multiple blinded embryologist assessments) with robust methodologies, not used as a standalone parameter. Otherwise, there is a risk of ‘overclaiming’ or overgeneralizing, whereby the AI model claims to ‘improve outcomes’ (e.g. improve live birth rates) without actual clinical comparators to assess its true impact. Table 2 highlights 8 key domains to consider when reading a manuscript describing AI technologies in ART [7].
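The requirement that development and evaluation data be cleanly separated can be made concrete with a minimal sketch. This is a hypothetical illustration using simulated records, not code from any study reviewed here; the field names, cohort sizes and split ratio are our own assumptions. The key point is that IVF datasets are typically embryo-level but clustered by patient, so splitting must be done at the patient level to avoid patient-level information leaking from training into testing.

```python
import random

# Hypothetical embryo-level records; several embryos share one patient,
# as is typical in IVF datasets. All fields are illustrative only.
random.seed(0)
records = [
    {"patient": f"P{i % 50:02d}",         # 50 patients, 4 embryos each
     "embryo_quality": random.random(),    # stand-in feature
     "live_birth": random.randint(0, 1)}   # stand-in outcome
    for i in range(200)
]

# Split by PATIENT, not by embryo, so that no patient contributes
# records to both the training and the test cohort.
patients = sorted({r["patient"] for r in records})
random.shuffle(patients)
test_patients = set(patients[: len(patients) // 4])  # hold out 25% of patients

train = [r for r in records if r["patient"] not in test_patients]
test = [r for r in records if r["patient"] in test_patients]

# The two cohorts are disjoint at the patient level, so patient-level
# confounders cannot leak from training into evaluation.
assert {r["patient"] for r in train}.isdisjoint({r["patient"] for r in test})
print(len(train), len(test))  # -> 152 48
```

A naive record-level split of the same data would scatter each patient's embryos across both cohorts and tend to overstate model performance, which is one reason the framework below asks how data splitting was handled.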

Table 2.

A framework to critically appraise AI research in ART

1. What is the clinical question and purpose of the AI system?

Consider in a ‘PICO’ (Population/Problem, Intervention, Comparison, and Outcome) format the following:

- What is the population? Is it different from the one you plan to apply it to?

- What is the intervention/control?

- Are the outcomes meaningful and relevant to your clinical practice?

- What is the AI system being used for? (prediction, classification, association, regression, optimization)

2. Data quality and AI methodology

Data quality:

- Single center or multiple centers?

- How representative is the population of the question being asked?

- What were the inclusion/exclusion criteria, and how was missing data handled?

- How was the data handled and acquired? Is the AI system described clearly including the software/hardware used and underlying algorithm?

- IVF-specific pitfalls include selection bias (e.g. only accounting for embryos that were transferred) and clinician/practice confounding factors (e.g. a bias towards the protocols/choices made by that clinic) (think ‘rubbish in, rubbish out’)

AI methodology:

- Where in the clinical workflow was the AI system used? This can introduce anchoring bias (e.g. were users blinded from the AI system prior to decision making, if this is what is being assessed?)

3. Model training and development

- How was the data split between training and validation? Depending on the clinical question, this may require splitting by patient, time etc.

- How was data splitting handled with a multi-center design and was one center left out for external testing?

- What is the risk of the same data appearing in the training, validation and test cohorts?

- Who developed the model? A diverse team should consist of scientists, clinicians, statisticians and engineers who are able to appropriately handle the model.

4. Model hygiene and failure mode

- Were features (input variables that affect the outcome) pre-specified? E.g. was it prespecified that age, BMI etc. would be input variables in a prediction model?

- How were hyperparameters (values which are used to configure the learning process) tuned? Was this done on training/validation and test sets? If images are involved (e.g. embryo selection), are the pre-processing/augmentation parameters well defined?

- Where relevant, was a confusion matrix described?

- How were errors approached? Were there any algorithm errors, software/hardware errors or user errors?

- Are there examples of where the model fails and when to not use it?

5. Comparators - Was the comparison appropriate? E.g. was the model compared against current practice (embryologist grading, diagnostic test, clinical decision making)?
6. Performance outcomes

- Is the outcome valid for the question asked? E.g. for a prediction model, can you trust the results? Look beyond reported AUC, including whether sensitivity/specificity/positive predictive value/negative predictive value were calculated where appropriate

- Was there calibration for risk prediction?

- Were safety concerns and instances of harm/adverse outcomes reported?

7. Real world applicability and transparency

Generalizability and transportability

- Were the methods translatable to other IVF settings (e.g. protocols used, embryo development)?

- Was appropriate subgroup analysis performed?

Bias and equity

- Were socioeconomic status, ethnicity and other demographic data accounted for? Does the model worsen disparities? (e.g. underpredicts success in one group purely based on ethnicity)

Study applicability and transparency

- For decision support tools, what is the study design? Randomised trials are the strongest, compared with retrospective designs

- Can this AI tool be audited? Is the model code available or described in enough detail to be reproduced (should usually be available in the supplementary material)?

- Is there data transparency?

8. Ethical use

- How was bias assessed and mitigated when developing the model? E.g. were adjustments made in model development to correct for possible bias?

- How was data handled in terms of security and privacy?

- If the AI system involved direct patient care or decision making, how was this communicated to patients and how was their consent utilized with regards to decision making?
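As a companion to the performance-outcomes domain in the framework above, the following sketch shows how sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) fall out of a 2x2 confusion matrix; looking beyond a single reported AUC to these quantities gives a fuller picture of model behavior. The counts are invented for illustration and are not drawn from any study discussed here.

```python
# Hypothetical 2x2 confusion matrix for a binary prediction task
# (e.g. predicted vs. observed implantation); all counts are invented.
tp, fp = 80, 20   # predicted positive: correct / incorrect
fn, tn = 10, 90   # predicted negative: incorrect / correct

sensitivity = tp / (tp + fn)   # proportion of true positives detected
specificity = tn / (tn + fp)   # proportion of true negatives detected
ppv = tp / (tp + fp)           # positive predictive value
npv = tn / (tn + fn)           # negative predictive value

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} "
      f"ppv={ppv:.3f} npv={npv:.3f}")
# -> sensitivity=0.889 specificity=0.818 ppv=0.800 npv=0.900
```

Note that PPV and NPV depend on outcome prevalence in the evaluation population, so they may not transport from one clinic to another even when sensitivity and specificity do; this connects directly to the generalizability and transportability questions in domain 7.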

Guidelines for reporting AI research

Transparency on the use of AI in research

Given the exponential rise in AI research, there have been numerous guidelines on what and how authors should report when publishing research assisted by AI. Firstly, researchers must disclose and declare whether AI has been used to assist in generating non-AI research (i.e. research in which AI itself is not the methodology under study). Standard submission, peer review and publication of research require disclosures of interest, and this must be extended to the use of AI with a view to improving transparency and trust. For example, the publisher Frontiers has released a framework for responsible AI governance which includes transparency about when AI is used, training and capacity building, ethical guardrails, equity and access governance, community feedback, and monitoring and audit of AI in research [74].

Appropriate use of AI for the research question asked

For research utilizing AI methodologies, there are numerous reporting checklists available to authors, and it is crucial that the correct one is used. For example, the TRIPOD-AI checklist is for prediction studies, STARD-AI for diagnostic accuracy studies and SPIRIT-AI/CONSORT-AI for randomized controlled trials [75]. Additional guidelines that can be tailored to the AI methodology are being developed, such as CHART (chatbot assessment studies), TRIPOD-LLM (LLM studies), CLAIM (imaging), and CHEERS-AI (economic evaluation studies) [7, 72]. Table 3 lists various reporting checklists available for AI research.

Table 3.

Reporting checklists available for AI research in healthcare

Name | Study design
TRIPOD-AI (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis Or Diagnosis) | Prediction model evaluation
STARD-AI (Standards for Reporting of Diagnostic Accuracy Studies) | Diagnostic accuracy studies
DECIDE-AI (Developmental and Exploratory Clinical Investigations of Decision Support Systems Driven by Artificial Intelligence) | Multiple (e.g. prospective cohort studies, non-randomised studies)
SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials) | Randomised controlled trials
CONSORT-AI (Consolidated Standards of Reporting Trials) | Randomised controlled trials
CLAIM (Checklist for Artificial Intelligence in Medical Imaging) | Medical imaging
CHART (Chatbot Assessment Reporting Tool) | Chatbot assessment studies
CHEERS-AI (Consolidated Health Economic Evaluation Reporting Standards) | Economic studies

AI governance in healthcare and ART

At present, no dedicated government agency oversees the use of AI in healthcare, although the WHO has released the ethical guidance described above. Each country is adapting its own regulatory framework to incorporate AI in different ways. For example, in the United Kingdom, AI in healthcare falls under the umbrella of the Data Protection Act 2018 and the UK Medical Device Regulations 2002 [76]. The Medicines and Healthcare products Regulatory Agency (MHRA) created the “Software and AI as a Medical Device Change Programme” as a regulatory framework to oversee AI medical devices. The European Union (EU) created the AI Act, which establishes a risk-based framework for the safe use of AI [77]. In the United States, the Food and Drug Administration (FDA) has authorized over 1000 medical devices which utilize AI; however, there is no formal regulatory body to oversee this [78]. Per the FDA’s “AI/ML-based SaMD (software as a medical device) Action Plan”, developers are held accountable for the real-world performance of AI systems. At the time of writing this article in early 2026, there are no proposed Federal regulations in the United States, despite several different state regulations. In Australia and New Zealand, the regulation of SaMDs falls under the Therapeutic Goods Administration (TGA) [77]. From a data security perspective, AI-handled data falls under the umbrella of various data-governing frameworks such as HIPAA (Health Insurance Portability and Accountability Act) and GDPR (General Data Protection Regulation) in the US and Europe, respectively. Within the realm of ART, two major international organizations are the European Society of Human Reproduction and Embryology (ESHRE) and the American Society for Reproductive Medicine (ASRM). At the time of writing, neither organization has released formal guidance on the regulation of AI in ART.

Conclusions and recommendations

If used responsibly and with caution, AI has tremendous potential to revolutionize ART. However, all AI models are only as robust as the quality of the data used to train them. If the training data is unrepresentative, the output is likely to be biased and less accurate, depending upon the nature of the population to which it is applied. Researchers have an ethical and social obligation to reduce bias and be as transparent as possible about the use and the limitations of the data used to create the AI models that they utilize. Likewise, clinicians should be AI literate, equipped with the knowledge and skillsets to critically appraise AI research in ART. To confirm the quality and legitimacy of an AI response to any query, we recommend not placing blind trust in the information provided by AI and verifying it using independent resources. Verification can take place by checking and validating the primary references provided (as well as seeking relevant references that were not provided) and adding an additional review layer of checking against another known, trusted independent source (e.g. professional society guidelines, Cochrane Reviews, and/or an experienced subject matter expert). When appraising AI research in ART, carefully review the framework summarized in Table 2.

Furthermore, emerging AI technologies should be subjected to rigorous methodological analysis and meet appropriate ethical and governance standards. With some degree of regulation and careful development, AI technologies have numerous promising applications in ART with a view to improving patient care and outcomes.

Acknowledgements

We would like to thank Samuel Faulls for his expertise on artificial intelligence.

AI statement

No generative AI was used to write this manuscript.

Authors’ contributions

The study was designed, conducted, analyzed, and reported entirely by the authors. The views expressed are those of the author(s). There are no other competing interests to declare.

Funding

Not applicable.

Data availability

Not applicable.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230–43. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Fischer A, Rietveld A, Teunissen P, Hoogendoorn M, Bakker P. What is the future of artificial intelligence in obstetrics? A qualitative study among healthcare professionals. BMJ Open. 2023;13(10):e076017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Niederberger C, Pellicer A, Cohen J, Gardner DK, Palermo GD, O’Neill CL, et al. Forty years of IVF. Fertil Steril. 2018;110(2):185–324. e5. [DOI] [PubMed] [Google Scholar]
  • 4.Inhorn MC, Patrizio P. Infertility around the globe: new thinking on gender, reproductive technologies and global movements in the 21st century. Hum Reprod Update. 2015;21(4):411–26. [DOI] [PubMed] [Google Scholar]
  • 5.Sean Tipton AH, Benjamin J. US IVF usage increases in 2023, leads to over 95,000 babies born. ASRM; 2025 [Available from: https://www.asrm.org/news-and-events/asrm-news/press-releasesbulletins/us-ivf-usage-increases-in-2023-leads-to-over-95000-babies-born/].
  • 6.NASA. Defining artificial intelligence [Available from: https://www.nasa.gov/what-is-artificial-intelligence/].
  • 7.Dijkstra P, Greenhalgh T, Mekki YM, Morley J. How to read a paper involving artificial intelligence (AI). BMJ Med. 2025;4(1):e001394. [Google Scholar]
  • 8.Hanassab S, Nelson SM, Akbarov A, Yeung AC, Hramyka A, Alhamwi T, et al. Explainable artificial intelligence to identify follicles that optimize clinical outcomes during assisted conception. Nat Commun. 2025;16(1):296. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Tangsrivimol JA, Darzidehkalani E, Virk HUH, Wang Z, Egger J, Wang M, et al. Benefits, limits, and risks of ChatGPT in medicine. Front Artif Intell. 2025;8:1518049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Karnan N, Francis J, Vijayvargiya I, Rubino Tan C. Analyzing the effectiveness of AI-generated patient education materials: a comparative study of ChatGPT and Google Gemini. Cureus. 2024;16(11):e74398. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Mavrych V, Yaqinuddin A, Bolgova O. Claude, ChatGPT, Copilot, and Gemini performance versus students in different topics of neuroscience. Adv Physiol Educ. 2025;49(2):430–7. [DOI] [PubMed] [Google Scholar]
  • 12.Patel N, Grewal H, Buddhavarapu V, Dhillon G. OpenEvidence: enhancing medical student clinical rotations with AI but with limitations. Cureus. 2025;17(1):e76867. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Lee T, Natalwala J, Chapple V, Liu Y. A brief history of artificial intelligence embryo selection: from black-box to glass-box. Hum Reprod. 2024;39(2):285–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Sundvall L, Ingerslev HJ, Breth Knudsen U, Kirkegaard K. Inter- and intra-observer variability of time-lapse annotations. Hum Reprod. 2013;28(12):3215–21. [DOI] [PubMed] [Google Scholar]
  • 15.Liu Y, Sakkas D, Afnan M, Matson P. Time-lapse videography for embryo selection/de-selection: a bright future or fading star? Hum Fertil (Camb). 2020;23(2):76–82. [DOI] [PubMed] [Google Scholar]
  • 16.Gardner DK, Meseguer M, Rubio C, Treff NR. Diagnosis of human preimplantation embryo viability. Hum Reprod Update. 2015;21(6):727–47. [DOI] [PubMed] [Google Scholar]
  • 17.Meng Q, Xu Y, Zheng A, Li H, Ding J, Xu Y, et al. Noninvasive embryo evaluation and selection by time-lapse monitoring vs. conventional morphologic assessment in women undergoing in vitro fertilization/intracytoplasmic sperm injection: a single-center randomized controlled study. Fertil Steril. 2022;117(6):1203–12. [DOI] [PubMed] [Google Scholar]
  • 18.Illingworth PJ, Venetis C, Gardner DK, Nelson SM, Berntsen J, Larman MG, et al. Deep learning versus manual morphology-based embryo selection in IVF: a randomized, double-blind noninferiority trial. Nat Med. 2024;30(11):3114–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Wang S, Chen L, Sun H. Interpretable artificial intelligence-assisted embryo selection improved single-blastocyst transfer outcomes: a prospective cohort study. Reprod Biomed Online. 2023;47(6):103371. [DOI] [PubMed] [Google Scholar]
  • 20.Salih M, Austin C, Warty RR, Tiktin C, Rolnik DL, Momeni M, et al. Embryo selection through artificial intelligence versus embryologists: a systematic review. Hum Reprod Open. 2023;2023(3):hoad031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Chavez-Badiola A, Farias AF, Mendizabal-Ruiz G, Silvestri G, Griffin DK, Valencia-Murillo R, et al. Use of artificial intelligence embryo selection based on static images to predict first-trimester pregnancy loss. Reprod Biomed Online. 2024;49(2):103934. [DOI] [PubMed] [Google Scholar]
  • 22.Bori L, Paya E, Alegre L, Viloria TA, Remohi JA, Naranjo V, et al. Novel and conventional embryo parameters as input data for artificial neural networks: an artificial intelligence model applied for prediction of the implantation potential. Fertil Steril. 2020;114(6):1232–41. [DOI] [PubMed] [Google Scholar]
  • 23.Canosa S, Licheri N, Bergandi L, Gennarelli G, Paschero C, Beccuti M, et al. A novel machine-learning framework based on early embryo morphokinetics identifies a feature signature associated with blastocyst development. J Ovarian Res. 2024;17(1):63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Fordham DE, Rosentraub D, Polsky AL, Aviram T, Wolf Y, Perl O, et al. Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity? Hum Reprod. 2022;37(10):2275–90. [DOI] [PubMed] [Google Scholar]
  • 25.Fruchter-Goldmeier Y, Kantor B, Ben-Meir A, Wainstock T, Erlich I, Levitas E, et al. An artificial intelligence algorithm for automated blastocyst morphometric parameters demonstrates a positive association with implantation potential. Sci Rep. 2023;13(1):14617. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Milewski R, Kuczynska A, Stankiewicz B, Kuczynski W. How much information about embryo implantation potential is included in morphokinetic data? A prediction model based on artificial neural networks and principal component analysis. Adv Med Sci. 2017;62(1):202–6. [DOI] [PubMed] [Google Scholar]
  • 27.Barnes J, Brendel M, Gao VR, Rajendran S, Kim J, Li Q, et al. A non-invasive artificial intelligence approach for the prediction of human blastocyst ploidy: a retrospective model development and validation study. Lancet Digit Health. 2023;5(1):e28–40. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Zhan J, Chen C, Zhang N, Zhong S, Wang J, Hu J, et al. An artificial intelligence model for embryo selection in preimplantation DNA methylation screening in assisted reproductive technology. Biophys Rep. 2023;9(6):352–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Edgar DH, Gook DA. How should the clinical efficiency of oocyte cryopreservation be measured? Reprod Biomed Online. 2007;14(4):430–5. [DOI] [PubMed] [Google Scholar]
  • 30.Ribas-Maynou J, Barranco I, Sorolla-Segura M, Llavanera M, Delgado-Bermudez A, Yeste M. Advanced sperm selection strategies as a treatment for infertile couples: a systematic review. Int J Mol Sci. 2022. 10.3390/ijms232213859. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Bjorndahl L, Kirkman Brown J, other Editorial Board Members of the WHO Laboratory Manual for the Examination and Processing of Human Semen. The sixth edition of the WHO Laboratory Manual for the Examination and Processing of Human Semen: ensuring quality and standardization in basic examination of human ejaculates. Fertil Steril. 2022;117(2):246–51. [DOI] [PubMed] [Google Scholar]
  • 32.Gallagher MT, Cupples G, Ooi EH, Kirkman-Brown JC, Smith DJ. Rapid sperm capture: high-throughput flagellar waveform analysis. Hum Reprod. 2019;34(7):1173–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Badura A, Marzec-Wroblewska U, Kaminski P, Lakota P, Ludwikowski G, Szymanski M, et al. Prediction of semen quality using artificial neural network. J Appl Biomed. 2019;17(3):167–74. [DOI] [PubMed] [Google Scholar]
  • 34.Mendizabal-Ruiz G, Chavez-Badiola A, Aguilar Figueroa I, Martinez Nuno V, Flores-Saiffe Farias A, Valencia-Murilloa R, et al. Computer software (SiD) assisted real-time single sperm selection associated with fertilization and blastocyst formation. Reprod Biomed Online. 2022;45(4):703–11. [DOI] [PubMed] [Google Scholar]
  • 35.Shaker F, Monadjemi SA, Alirezaie J, Naghsh-Nilchi AR. A dictionary learning approach for human sperm heads classification. Comput Biol Med. 2017;91:181–90. [DOI] [PubMed] [Google Scholar]
  • 36.Javadi S, Mirroshandel SA. A novel deep learning method for automatic assessment of human sperm images. Comput Biol Med. 2019;109:182–94. [DOI] [PubMed] [Google Scholar]
  • 37.McCallum C, Riordon J, Wang Y, Kong T, You JB, Sanner S, et al. Deep learning-based selection of human sperm with high DNA integrity. Commun Biol. 2019;2:250. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Abbara A, Clarke SA, Dhillo WS. Novel concepts for inducing final oocyte maturation in in vitro fertilization treatment. Endocr Rev. 2018;39(5):593–628. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Nayot D, Meriano J, Casper R, Alex K. An oocyte assessment tool using machine learning; predicting blastocyst development based on a single image of an oocyte. Hum. Reprod. 2020;35:129–30.
  • 40.Link CA, Von Mengden L, De Bastiani MA, Faller M, Dorneles L, Pedo R, et al. P-246 A novel non-invasive tool for oocyte selection using gene expression and artificial intelligence. Hum Reprod. 2022;37(Supplement_1):deac107.236.
  • 41.Jenkins J, van der Poel S, Krussel J, Bosch E, Nelson SM, Pinborg A, et al. Empathetic application of machine learning may address appropriate utilization of ART. Reprod Biomed Online. 2020;41(4):573–7. [DOI] [PubMed] [Google Scholar]
  • 42.Senapati S, Asch DA, Merchant RM, Rosin R, Seltzer E, Mancheno C, et al. The fast track to fertility program: rapid cycle innovation to redesign fertility care. NEJM Catal Innov Care Deliv. 2022;3(10):CAT220065. [Google Scholar]
  • 43.Abbara A, Patel A, Hunjan T, Clarke SA, Chia G, Eng PC, et al. FSH requirements for follicle growth during controlled ovarian stimulation. Front Endocrinol (Lausanne). 2019;10:579. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Fanton M, Baker VL, Loewke KE. Selection of optimal gonadotropin dose using machine learning may be associated with improved outcomes and reduced utilization of FSH. Fertil Steril. 2022;118(4):e80–10015. [Google Scholar]
  • 45.Abbara A, Hunjan T, Ho VNA, Clarke SA, Comninos AN, Izzi-Engbeaya C, et al. Endocrine requirements for oocyte maturation following hCG, GnRH agonist, and kisspeptin during IVF treatment. Front Endocrinol (Lausanne). 2020;11:537205. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Fanton M, Nutting V, Solano F, Maeder-York P, Hariton E, Barash O, et al. An interpretable machine learning model for predicting the optimal day of trigger during ovarian stimulation. Fertil Steril. 2022;118(1):101–8. [DOI] [PubMed] [Google Scholar]
  • 47.Abbaoui W, Retal S, El Bhiri B, Kharmoum N, Ziti S. Towards revolutionizing precision healthcare: a systematic literature review of artificial intelligence methods in precision medicine. Inform Med Unlocked. 2024;46:101475. [Google Scholar]
  • 48.Ueno S, Berntsen J, Ito M, Uchiyama K, Okimura T, Yabuuchi A, et al. Pregnancy prediction performance of an annotation-free embryo scoring system on the basis of deep learning after single vitrified-warmed blastocyst transfer: a single-center large cohort retrospective study. Fertil Steril. 2021;116(4):1172–80. [DOI] [PubMed] [Google Scholar]
  • 49.Ueno S, Berntsen J, Okimura T, Kato K. Improved pregnancy prediction performance in an updated deep-learning embryo selection model: a retrospective independent validation study. Reprod Biomed Online. 2024;48(1):103308. [DOI] [PubMed] [Google Scholar]
  • 50.Leung F, Lau YC, Law M, Djeng SK. Artificial intelligence and end user tools to develop a nurse duty roster scheduling system. Int J Nurs Sci. 2022;9(3):373–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Yao MWM, Nguyen ET, Retzloff MG, Gago LA, Copland S, Nichols JE, et al. Improving IVF utilization with patient-centric artificial intelligence-machine learning (AI/ML): a retrospective multicenter experience. J Clin Med. 2024. 10.3390/jcm13123560. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Blease CR, Locher C, Gaab J, Hagglund M, Mandl KD. Generative artificial intelligence in primary care: an online survey of UK general practitioners. BMJ Health Care Inform. 2024. 10.1136/bmjhci-2024-101102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Gan W, Ouyang J, She G, Xue Z, Zhu L, Lin A, et al. ChatGPT’s role in alleviating anxiety in total knee arthroplasty consent process: a randomized controlled trial pilot study. Int J Surg. 2025;111(3):2546–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Reategui-Rivera CM, Finkelstein J. Leveraging generative AI to overcome language barriers in healthcare. Stud Health Technol Inform. 2025;328:86–90. [DOI] [PubMed] [Google Scholar]
  • 55.Alper MM, Fauser BC. Ovarian stimulation protocols for IVF: is more better than less? Reprod Biomed Online. 2017;34(4):345–53. [DOI] [PubMed] [Google Scholar]
  • 56.Olawade DB, Teke J, Adeleye KK, Weerasinghe K, Maidoki M, Clement David-Olawade A. Artificial intelligence in in-vitro fertilization (IVF): a new era of precision and personalization in fertility treatments. J Gynecol Obstet Hum Reprod. 2025;54(3):102903. [DOI] [PubMed] [Google Scholar]
  • 57.Gleicher N, Gayete-Lafuente S, Barad DH, Patrizio P, Albertini DF. Why the hypothesis of embryo selection in IVF/ICSI must finally be reconsidered. Hum Reprod Open. 2025;2025(2):hoaf011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Bamford T, Easter C, Montgomery S, Smith R, Dhillon-Smith RK, Barrie A, et al. A comparison of 12 machine learning models developed to predict ploidy, using a morphokinetic meta-dataset of 8147 embryos. Hum Reprod. 2023;38(4):569–81. [DOI] [PubMed] [Google Scholar]
  • 59.Linardatos P, Papastefanopoulos V, Kotsiantis S. Explainable AI: a review of machine learning interpretability methods. Entropy. 2020. 10.3390/e23010018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Bormann CL, Thirumalaraju P, Kanakasabapathy MK, Kandula H, Souter I, Dimitriadis I, et al. Consistency and objectivity of automated embryo assessments using deep neural networks. Fertil Steril. 2020;113(4):781–7.e1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Gurupur V, Wan TTH. Inherent bias in artificial intelligence-based decision support systems for healthcare. Medicina (Kaunas). 2020. 10.3390/medicina56030141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Juhn YJ, Ryu E, Wi CI, King KS, Malik M, Romero-Brufau S, et al. Assessing socioeconomic bias in machine learning algorithms in health care: a case study of the HOUSES index. J Am Med Inform Assoc. 2022;29(7):1142–51. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Boyd AD, Gonzalez-Guarda R, Lawrence K, Patil CL, Ezenwa MO, O’Brien EC, et al. Equity and bias in electronic health records data. Contemp Clin Trials. 2023;130:107238. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Parikh RB, Teeple S, Navathe AS. Addressing bias in artificial intelligence in health care. JAMA. 2019;322(24):2377–8. [DOI] [PubMed] [Google Scholar]
  • 65.Alhammad N, Alajlani M, Abd-Alrazaq A, Epiphaniou G, Arvanitis T. Patients’ perspectives on the data confidentiality, privacy, and security of mHealth apps: systematic review. J Med Internet Res. 2024;26:e50715. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Coticchio G, Cimadomo D, Rienzi L. Do we still need embryologists? Reprod Biomed Online. 2025;50(4):104790. [DOI] [PubMed] [Google Scholar]
  • 67.Li P, Yang J, Islam MA, Ren S. Making AI less “thirsty”. Commun ACM. 2025;68(7):54–61. [Google Scholar]
  • 68.So R, Andersen ZJ, Chen J, Stafoggia M, de Hoogh K, Katsouyanni K, et al. Long-term exposure to air pollution and mortality in a Danish nationwide administrative cohort study: beyond mortality from cardiopulmonary disease and lung cancer. Environ Int. 2022;164:107241. [DOI] [PubMed] [Google Scholar]
  • 69.Ren S, Wierman A. Mitigating the public health impacts of AI data centers. 2025. Available from: https://hbr.org/2025/11/mitigating-the-public-health-impacts-of-ai-data-centers.
  • 70.Cromack SC, Lew AM, Bazzetta SE, Xu S, Walter JR. The perception of artificial intelligence and infertility care among patients undergoing fertility treatment. J Assist Reprod Genet. 2025;42(3):855–63. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 71.World Health Organization. Ethics and governance of artificial intelligence for health. 2021. Available from: https://www.who.int/publications/i/item/9789240029200.
  • 72.Kolbinger FR, Veldhuizen GP, Zhu J, Truhn D, Kather JN. Reporting guidelines in medical artificial intelligence: a systematic review and meta-analysis. Commun Med. 2024;4(1):71. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 73.Kwong JCC, Khondker A, Lajkosz K, McDermott MBA, Frigola XB, McCradden MD, et al. APPRAISE-AI tool for quantitative evaluation of AI studies for clinical decision support. JAMA Netw Open. 2023;6(9):e2335377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Frontiers Media. Unlocking AI’s untapped potential: responsible innovation in research and publishing. 2025.
  • 75.Nagendran M, Chen Y, Lovejoy CA, Gordon AC, Komorowski M, Harvey H, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Kakkar P, Gupta S, Paschopoulou KI, Paschopoulos I, Paschopoulos I, Siafaka V, et al. The integration of artificial intelligence in assisted reproduction: a comprehensive review. Front Reprod Health. 2025;7:1520919. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Palaniappan K, Lin EYT, Vogel S. Global regulatory frameworks for the use of artificial intelligence (AI) in the healthcare services sector. Healthcare (Basel). 2024;12(5):562. [DOI] [PMC free article] [PubMed]
  • 78.American Medical Association (AMA). Augmented intelligence in medicine. 2025. Available from: https://www.ama-assn.org/practice-management/digital-health/augmented-intelligence-medicine.

Associated Data


Data Availability Statement

Not applicable.

