Abstract
This narrative literature review synthesized evidence to address gaps in knowledge regarding artificial intelligence (AI) performance and its integration into surgical operations. The purpose of the review was to assess AI accuracy and reliability, benchmark real-time guidance technologies, identify data and ethical issues, compare model performance across different specialties, and review the role of AI in improving surgical accuracy and safety. It reviewed 28 studies conducted across various geographic and disciplinary contexts and discussed machine learning (ML) and deep learning (DL) as applied to major surgeries. The results show that AI models perform well in intraoperative (IOP) decision-making, with five of six studies reporting area under the curve (AUC) values of 0.85-0.95, indicating strong discriminatory power. Moreover, the 22 studies reporting accuracy showed high predictive performance of AI models in surgical settings, with accuracies ranging from 80% to 99%, except for one study, which reported an accuracy below 70%. These findings emphasize the practical feasibility of AI in IOP decision-making and suggest a promising role for AI in assisting surgeons in the operating room. ML and DL were highly precise in anatomic detection, surgical-phase detection, complication prediction, and real-time event detection. Developments in DL algorithms, such as convolutional neural networks and generative adversarial networks, have enabled more accurate surgical guidance and the prediction of IOP events, thereby increasing surgical accuracy and potentially reducing errors. However, model performance needs to be validated through long-term computational and real-time clinical study designs, with appropriate strategies for data validation and model performance assessment.
The narrative review design focused solely on narrative synthesis, rather than on data validation (internal or external) or quality assessment of the included studies. Therefore, future researchers should conduct a systematic review to validate the findings, and readers should interpret the findings with caution. AI use in surgical training and workflow optimization has the potential to improve surgical performance and patient outcomes, but scalability and long-term outcomes have yet to be demonstrated. Although AI technologies can improve the accuracy and reliability of decisions made in IOP settings, it is critical to address methodological, infrastructural, and ethical constraints to enable safe and effective clinical application in major surgeries.
Keywords: artificial intelligence in surgery, decision-making, intraoperative, reliability, surgery
Introduction and background
Surgery accounts for a large share of global morbidity and mortality, with low- and middle-income countries (LMICs) suffering the most significant burden due to inadequate access to timely and quality surgery [1]. Artificial Intelligence (AI) has significant applications in surgery, specifically in improving intraoperative (IOP) decision-making, but validation quality, data standardization, and public data availability remain suboptimal. Currently, AI applications are mainly focused on preoperative risk assessment and are suggested to improve decision-making [2,3]. The exploration of AI in IOP decision-making systems during large-scale surgery is a debatable topic.
Over the last few years, the focus of research and practice in surgery has shifted from preoperative risk assessment to real-time IOP decision-making enabled by machine learning (ML) and deep learning (DL) [4,5]. During surgery, AI uses computational systems and algorithms to simulate human cognitive functions, e.g., decision-making, learning, and problem-solving. ML algorithms support predictive analysis, while DL assists with interpreting image and video data and with reinforcement learning for real-time decision-making in autonomous surgical settings [5]. AI assists in real-time anatomical recognition, hazard detection, surgical phase classification, and predictive analysis in surgical decision-making. These algorithms help improve surgeons' decision accuracy, reduce errors, and improve patient outcomes by providing data-driven recommendations during surgery [6]. AI would be beneficial because IOP complication rates are high and surgeons must make decisions under dynamic conditions that are difficult to manage without assistive tools [6,7]. Because surgical procedures represent a significant source of morbidity and mortality worldwide, AI-driven approaches could also help streamline resource management and patient care [8,9].
Despite growing interest, critical issues remain with the use of AI in the IOP environment. The existing body of knowledge reveals a gap in understanding the accuracy and reliability of ML and DL models in surgery, particularly for real-time decision support [10,11]. Some studies demonstrate promising predictive performance and autonomous capability, but others highlight limitations such as poor data standardization, insufficient external validation, and ethical concerns [12,13]. Concerns persist about the safety of AI augmenting or replacing human judgment during an operation, as does debate about overreliance on the technology and the interpretability of its outputs [14]. These gaps contribute to delays in clinical adoption and missed opportunities to improve surgery-related outcomes [15].
This review contextualizes IOP AI, covering ML algorithms, DL models, and their utility in surgical processes [16]. ML underpins predictive analytics applied to historical data, whereas DL underpins pattern recognition applied to streams of image and video data [17]. No single technology can address every dimension of improving IOP decision-making; rather, these technologies work together, each contributing to three heuristic dimensions: real-time anatomic recognition, hazard visualization, and autonomous robotic assistance [18]. A template is provided to systematically evaluate the validity and reliability of AI in surgery.
The purpose of this literature review was to critically assess the current status of ML and DL technology with respect to its accuracy, reliability, and clinical applicability in IOP decision-making associated with major surgery. By addressing the knowledge gaps outlined above, this review can also serve as a guide for future research on the responsible use of AI in surgery and, by extension, for improving patient care [19].
Objective
The paper synthesized evidence on AI (ML and DL) uses to identify technological advances, assess their clinical effectiveness, and identify challenges to be overcome to achieve safe and effective application of AI in major surgical operations.
Review
Methodology
The literature review was conducted using topic-specific keywords such as “artificial intelligence”, “machine learning”, “deep learning”, “surgery”, “intra-operative”, “decision making”, “accuracy”, “reliability”, “low resource setting”, and “implication of AI”. Boolean operators were used to combine the keywords, and searches were run across PubMed, Embase, and Google Scholar for articles published from January 2015 to August 2025. All original articles, including randomized controlled trials (RCTs) and cross-sectional, cohort, and longitudinal studies, that were relevant to the keywords, scope, and objectives of the review were included. Editorials, letters to the editor, abstracts, and conference papers were excluded before evidence synthesis. The literature search was limited to peer-reviewed English-language literature. The evidence was synthesized solely using a narrative approach. The study selection process is shown in Figure 1.
Figure 1. PRISMA flow chart showing the study selection process.
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
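The Methodology states that Boolean operators were used to combine the keywords but does not give the exact search string. The following is a minimal sketch of how such a string could be assembled, assuming that synonyms are joined with OR and concept groups with AND. The grouping into `TECHNOLOGY`, `SETTING`, and `OUTCOME`, the helper names, and the omission of the two remaining keywords are illustrative assumptions, not the authors' actual query.

```python
# Keywords are quoted from the Methodology; grouping them into three
# concepts is an assumption made for illustration only.
TECHNOLOGY = ["artificial intelligence", "machine learning", "deep learning"]
SETTING = ["surgery", "intra-operative"]
OUTCOME = ["decision making", "accuracy", "reliability"]


def or_group(terms):
    """Join synonyms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine concept groups with AND so that every concept must match."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(TECHNOLOGY, SETTING, OUTCOME)
print(query)
```

Each database has its own field tags and syntax, so a string like this would normally be adapted per database before use.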
Results and discussion
Upon synthesizing the evidence, the following studies were analyzed to determine the role of AI in IOP decision-making (Table 1).
Table 1. Evaluation of the literature to assess the performance metrics and applicability of AI predictive models.
AI: artificial intelligence, AUC: area under the curve, F1 score: a measure of a model's accuracy, ML: machine learning, DL: deep learning, CNN: convolutional neural network, EMR: electronic medical records, EHR: electronic health records, RSI: retained surgical items, SSI: surgical site infection, YOLOv5: You Only Look Once Version 5 (a DL object detection model), IOP: intraoperative
| Study | Accuracy metrics | Reliability and robustness | Clinical integration feasibility | Ethical and regulatory compliance | Impact on surgical outcomes |
| Hossain et al. (2024) [5] | Moderate (63.2%) to high accuracy (93.4%) in anatomy identification and phase classification | Challenges due to data quality and standardization | Requires substantial data and infrastructure; ethical concerns noted | Ethical and patient acceptance issues highlighted | Improved IOP guidance and complication prediction (p < 0.05) |
| Cheruvu et al. (2024) [20] | IOP AI guidance accuracy of up to 95% reported | Limited clinical validation; mostly retrospective data | Early-stage integration; regulatory and ethical challenges remain | Emphasizes the need for ethical frameworks | Potential to transform surgical phases and outcomes |
| Shetti et al. (2024) [4] | Demonstrated enhanced surgical precision and error reduction on review of case studies | Consistent improvements across case studies | Integration feasible with current surgical workflows | Ethical and future trend considerations discussed | Significant improvements in patient outcomes |
| Knudsen et al. (2024) [18] | High accuracy in robotic surgery metrics (60-95% accuracy, AUC = 0.88) and automation | Robustness shown in ex vivo and in silico models | Infrastructure-intensive; early clinical adoption | Ethical dilemmas in autonomy addressed | Enhanced surgical education and IOP feedback |
| Madani et al. (2022) [6] | F1 scores up to 0.83 for zone identification in laparoscopic surgery | Validated across diverse international datasets | Real-time application feasible with video processing | Data privacy and annotation ethics considered | Reduced risk of adverse events intraoperatively |
| Loftus et al. (2023) [10] | Variable accuracy; some models with AUC < 0.83 | Mostly internal validation; limited external and real-time validation | Clinical implementation frameworks proposed but untested | Lack of equity and demographic performance reporting | Limited evidence on direct outcome improvements |
| Andras et al. (2020) [21] | Encouraging accuracy in robotic surgery skill feedback and guidance (93% for ML vs. 72% for the clinical approach) | Consistent performance in skill acquisition and process efficiency | Integration with robotic platforms feasible | Ethical and regulatory frameworks under development | Improved surgical training and precision |
| Taher et al. (2022) [12] | Identified technical and clinical challenges limiting DL accuracy to 80% | Reliability affected by data scarcity and complexity | Infrastructure and surgeon education critical | Ethical and business challenges noted | Potential hindered by current limitations |
| Celotto et al. (2024) [8] | Superior predictive power for anastomotic leak prevention (76.7-91.9% accuracy) | Robustness in clinical datasets for risk factors | Feasible IOP feedback integration | Ethical use in patient safety emphasized | Reduced complication rates and improved decision-making |
| Rodler et al. (2024) [22] | Emerging generative AI shows promising accuracy in data synthesis | Reliability depends on task-specific training | Real-time feedback and documentation feasible | Ethical considerations in data use highlighted | Enhances IOP decision support and documentation |
| Othman and Kaleem (2024) [11] | Moderate accuracy (75.7-82%) in IOP guidance and training | Limited data availability affects robustness | Early-stage clinical integration; validation tools lacking | Ethical concerns and data limitations significant | Potential to enhance surgical training and autonomy |
| Checcucci et al. (2023) [23] | Over 90% accuracy in bleeding event prediction | Reliable performance comparable to human assessment | Real-time IOP application feasible | Ethical use in patient safety emphasized | Improved bleeding management and surgical safety |
| Rus et al. (2023) [24] | High accuracy (90.63%) in real-time hemorrhage detection using YOLOv5 | Robust detection with low false positives | Real-time AR integration feasible; hardware limits noted | Ethical use and surgeon interaction considered | Enhanced hazard detection and patient safety |
| Mascagni et al. (2024) [7] | The feasibility of real-time AI assistance demonstrated a mean accuracy of 71.4% | Early-stage validation with multidisciplinary input | Technical and cultural barriers identified | Ethical and regulatory challenges noted | Potential to improve IOP assistance |
| Celotto et al. (2025) [19] | High accuracy (80-94% with an F1 score of 0.90 ± 0.11) in IOP guidance and complication prediction | Robust across colorectal surgery datasets | Integration feasible with imaging and EHR systems | Ethical and regulatory challenges discussed | Improved surgical precision and postoperative outcomes |
| Zarghami (2024) [14] | High accuracy (> 90%) in imaging and physiological monitoring | Robustness challenged by data quality and interpretability | Integration requires infrastructure and clinician engagement | Ethical, privacy, and regulatory challenges significant | Improved IOP decision-making and personalized care |
| Kuemmerli et al. (2023) [25] | Promising accuracy (71-94%) in pancreatic surgery AI applications | Robustness limited by the evidence level | Integration feasible in pre-, intra-, and postoperative phases | Ethical and regulatory challenges noted | Improved diagnosis, decision support, and risk stratification |
| Demir et al. (2023) [17] | High accuracy (82-85%) in surgical phase and step recognition | Robust temporal modeling with DL | Integration with surgical workflow analysis feasible | Ethical concerns less emphasized | Enhanced workflow recognition and surgical assistance |
| Mehta et al. (2024) [26] | Variable accuracy (70-85%) in perioperative ML interventions | Reliability depends on the intervention type and data | Clinical integration in perioperative care feasible | Ethical and implementation challenges noted | Improved perioperative outcomes in some settings |
| Morris et al. (2023) [16] | Broad AI applications with promising accuracy (>80%) | Robustness varies with application and data | Integration feasible with training and decision support | Ethical and interpretability challenges discussed | Enhanced surgical training and decision-making |
| Henn et al. (2022) [27] | ML (reported 97.8%) outperforms conventional decision-making in abdominal surgery | Robustness limited by data heterogeneity | Integration feasible with EHR and clinical workflows | Ethical and interpretability challenges noted | Enhanced clinical decision-making and risk assessment |
| Ladinez et al. (2024) [28] | ML algorithms (78-96%, AUC of 0.9) outperform conventional methods in complication prediction | Robustness varies with dataset and algorithm | Integration feasible in postoperative care | Ethical and interpretability challenges noted | Enhanced postoperative complication prediction |
| Abo-Zahhad et al. (2024) [29] | High accuracy (99.9%, with AUC 0.81-0.85) in RSI detection and prevention | Robustness enhanced by large datasets | Integration feasible with real-time monitoring | Ethical and data privacy challenges noted | Reduced RSI and improved safety |
| Spence et al. (2023) [30] | Neural networks (89.4% accuracy) outperform industry standards in surgery duration prediction | Robustness demonstrated in multiple studies | Integration feasible with scheduling systems | Ethical concerns minimal | Improved operating room efficiency (p < 0.05) |
| Wu et al. (2024) [31] | Significant improvement in surgical performance with AI coaching (accuracy 11% to 78%) | Robustness shown in randomized controlled trial | Integration feasible in surgical education | Ethical concerns minimal | Enhanced surgical safety and training outcomes (p = 0.021) |
| Chen et al. (2020) [32] | CNN and self-attention models achieve AUC ~0.88 in SSI risk | Robust internal and external validation | Integration feasible with EMR data | Ethical and privacy considerations addressed | Improved SSI risk prediction |
| Tanzi et al. (2020) [33] | Encouraging results in DL (> 85% accuracy) for IOP management | Robustness across surgical subfields | Integration feasible in intelligent operating rooms | Ethical and workflow challenges noted | Improved surgical workflow and context detection |
| Ahmad (2023) [34] | ML shows a significant edge (> 75% accuracy) over clinical diagnosis in neurosurgery | Robustness limited by study design variability | Integration feasible with radiology workflows | Ethical and variability challenges noted | Enhanced diagnosis and treatment planning (p < 0.05) |
Performance Metrics
AI models performed IOP tasks with moderate to very high accuracy, with F1 scores up to 0.83 and AUCs over 0.90 [6,32]. Other studies reported that AI outperformed conventional approaches in predictive power, particularly for complications and surgical planning [8,28]. Some early-stage or experimental studies reported variable accuracy, so further validation is required [10,35]. The overall performance of AI models in IOP decision-making is substantial, with five of six studies reporting AUC values of 0.85-0.95, indicating strong discriminatory power (Figure 2). Moreover, the 22 studies reporting accuracy showed high predictive performance, with accuracy ranging from 80% to 99%, except for one study below 70%, emphasizing the practical feasibility of AI in IOP decision-making. AI's role in assisting surgeons' decision-making in the operating room is therefore promising (Figure 3).
Figure 2. AUC for AI-based IOP decision-making.
Only these six studies reported AUC values directly; therefore, only they were used to draw the AUC diagram. AUC was chosen because it summarizes overall model performance across decision thresholds and is well suited to evaluating model robustness and general performance in clinical practice. These findings are interpreted from a narrative review, so readers should be cautious when interpreting exact AI model performance. The purpose of this graph is to illustrate overall general performance.
AUC: area under the curve, AI: artificial intelligence, IOP: intraoperative
Figure 3. Accuracy curve diagram illustrating the predictability of AI models for IOP decision-making.
The accuracy metrics from 22 studies were selected to demonstrate the AI model's predictive performance. These studies did not provide a direct computation of the AUC. The reader must be cautious when interpreting the AI model's accuracy, given the narrative review study design. Readers must consider the quality of the studies when evaluating the accuracy of the performance metric. Overall, the graph illustrates the comprehensive practical feasibility of using AI in IOP decision-making.
AUC: area under the curve, AI: artificial intelligence, IOP: intraoperative
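The three metrics summarized in this section (accuracy, F1 score, and AUC) measure different things, which is one reason the studies reporting AUC and those reporting accuracy are presented separately. The following minimal sketch computes each metric from scratch for a binary task. The labels and scores are invented toy values, not data from any included study, and the 0.5 decision threshold is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Share of all cases that were classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def auc(y_true, scores):
    """Probability that a random positive case outscores a random negative
    case (ties count as one half). No decision threshold is involved."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = event occurred, 0 = no event (invented values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, whereas AUC is computed from the ranking of scores alone, so the same model can look different depending on which metric a study reports.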
Reliability and Robustness
The included studies reported broadly consistent AI performance across surgical cases and datasets, although data quality was often poor, datasets were heterogeneous, and external validation was limited [5,27]. Promising robustness in autonomous and robotic surgery applications has so far been shown only in animal and in silico models; clinical robustness is yet to be demonstrated [35]. Multiple reviews have noted the importance of large, well-labeled datasets for model stability and generalization [12,29].
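One simple way to make the distinction between internal and external validation concrete is to report performance per cohort and the gap between cohorts. The sketch below does this for accuracy. The cohort labels, the helper names, and every record are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def accuracy_by_cohort(records):
    """Compute accuracy separately for each cohort.

    records: iterable of (cohort, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for cohort, y_true, y_pred in records:
        total[cohort] += 1
        correct[cohort] += int(y_true == y_pred)
    return {cohort: correct[cohort] / total[cohort] for cohort in total}


def generalization_gap(per_cohort, internal="internal", external="external"):
    """Drop in accuracy when moving from the internal to the external cohort."""
    return per_cohort[internal] - per_cohort[external]


# Hypothetical predictions: the development site versus an unseen site.
records = [
    ("internal", 1, 1), ("internal", 0, 0), ("internal", 1, 1),
    ("internal", 0, 0), ("internal", 1, 0),
    ("external", 1, 1), ("external", 0, 1), ("external", 1, 0),
    ("external", 0, 0), ("external", 1, 1),
]

per_cohort = accuracy_by_cohort(records)
print(per_cohort, generalization_gap(per_cohort))
```

A large positive gap would suggest that a model has adapted to its development data and may not transfer to other clinical sites. The same per-group breakdown could be applied to demographic subgroups.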
Clinical Integration Feasibility
Research also suggested that AI can be integrated with existing surgical processes, particularly robotic surgery, IOP guidance, and perioperative risk assessment [18,19,31]. Some of these systems were shown to be capable of real-time processing, but infrastructure and computing power requirements remain significant obstacles [24]. Adoption also depends on user acceptance and training, and on evidence of positive surgeon feedback during use [7,23].
Impact on Surgical Outcomes
The reviewed studies suggested that AI can improve the quality of IOP judgment, complication prevention, and patient safety [4,19,31]. Under AI coaching, surgical skill acquisition and surgical safety improved during training [31]. Because the research is at an early stage, only a limited number of studies have shown concrete improvements in outcomes (Table 2), so further clinical confirmation is required [7,10].
Table 2. Convergence and divergence across studies.
AI: artificial intelligence, ML: machine learning, DL: deep learning, CNN: convolutional neural network
| Comparison criterion | Studies in convergence | Studies in divergence | Potential explanations |
| Accuracy metrics | Many studies report high accuracy, sensitivity, and specificity of AI models in IOP tasks, such as real-time anatomy identification and complication prediction (e.g., 90%+ accuracy in bleeding detection, 89-95% sensitivity in guidance systems) [6,23,32] | Some studies highlight moderate or variable accuracy and emphasize challenges in achieving consistent, reliable metrics across diverse surgical phases and settings [5,10] | Differences stem from AI model types (ML vs. DL), surgical specialties, dataset size and quality, and the stage of development (experimental vs. clinical). Also, some focus on early-phase validation, while others report mature system performance |
| Reliability and robustness | Consensus exists that AI tools show promising robustness in controlled or experimental settings, such as ex vivo and animal models for robotic surgery and IOP prediction [18,23] | Divergence arises in real-world reliability; many systems lack external validation, real-time clinical testing, and show risk of overfitting or less robustness in heterogeneous clinical environments [10,11] | Variability is attributed to limited external validation, small or homogeneous datasets, and a lack of large-scale clinical trials. Differences in infrastructure support and surgical workflow integration also affect robustness |
| Clinical integration feasibility | Agreement that AI has potential for IOP decision support and workflow improvement, with some initial clinical implementations and promising real-time applications [6,7,16] | Disagreement on readiness: many reviews stress the infancy of clinical adoption, citing infrastructure limitations, real-time processing constraints, and user acceptance barriers [10,11,35]. Also, some highlight hardware limitations in devices like HoloLens [24] | Divergence due to varying surgical environments, technological readiness, and differences in AI system design (standalone vs. server-based). Clinical workflow disruption and surgeon trust issues also contribute |
| Ethical and regulatory compliance | Most papers acknowledge important ethical concerns, such as patient data privacy, algorithm transparency, and the need for regulatory frameworks. Implementation requires addressing these issues for safe AI adoption [5,10,12,14] | Some studies provide more detailed ethical frameworks or call for standardized guidelines, while others focus mainly on technical performance without extensive ethical discussion [11,18,28] | Differences reflect the scope of reviews (technical vs. comprehensive), geographic regulatory environments, and the maturity of AI applications in clinical contexts. Ethical considerations are evolving alongside technology development |
| Impact on surgical outcomes | General consensus that AI enhances surgical precision, reduces errors, and improves patient safety, supported by improvements in surgical performance scores and complication prediction [4,19,31] | Some divergence in evidence strength for direct clinical outcome improvements; a few highlight limited prospective clinical trials and lack of long-term outcome data [10,27] | Variability due to predominance of retrospective analyses, limited randomized controlled trials, and early-stage AI tools, mostly validated in simulated or animal models rather than extensive human trials |
| AI model comparison | Several studies agree that DL techniques (e.g., CNNs) often outperform traditional ML in visual recognition and IOP guidance tasks [6,16,32]. Gradient boosting and random forests are favored for risk prediction | Contrasting perspectives exist on the best algorithms depending on specific tasks; some emphasize interpretability of simpler models over accuracy of complex DL models [12,27] | Differences arise from task-specific requirements, dataset characteristics, need for interpretability vs. accuracy, and computational resource availability. Different surgical applications demand tailored AI approaches |
| Data quality and availability | Strong agreement that high-quality, large, and standardized datasets are critical for model training, with data limitations being a major bottleneck [5,10-12] | Some studies differ on the sufficiency of current datasets; while some report large multicenter datasets, others highlight scarcity and lack of annotated data as barriers [27,29] | Variances depend on surgical specialty data-sharing cultures, regulatory constraints on patient data, and the availability of annotated surgical videos or imaging. The nascent stage of data infrastructure contributes to disparities |
Theoretical Implications
Overall, the findings suggest that AI, specifically ML and DL, is increasingly accurate and reliable in IOP decision-making. This supports earlier hypotheses that AI can assist with human cognitive tasks during complex surgeries by providing high-quality real-time guidance and risk prediction [4,5,20]. The literature reviewed demonstrates the importance of integrating data and state-of-the-art algorithms, such as convolutional neural networks and generative adversarial networks, to capture surgical workflow and predict subsequent surgical stages. In this way, context-aware intelligent systems and surgical workflow analysis can be achieved [17,33]. Ethical considerations and the black-box nature of some AI models call into question the transparency and interpretability of clinical decision-making, which will require theoretical clarification of explainability and trust-building in AI systems used intraoperatively [10,12,14]. Existing AI applications in robotic surgery offer limited autonomy, mainly at the assistance or task-autonomy level, which is consistent with theories of human-robot collaboration and the gradual adoption of autonomy. The shift to AI-specific measures to process outcomes with greater autonomy is consistent with developmental models of surgical AI implementation. The reported superiority of AI models over conventional statistical methods in forecasting surgical complications and outcomes supports the supposition that AI can uncover nonlinear, complex interactions in clinical data that those methods can overlook [8,28]. The emergence of generative AI as a potential source of real-time feedback and a means of synthesizing IOP data suggests novel theoretical framings of AI as an interactive co-worker in the operating room, rather than a passive decision-support system.
Practical Implications
AI-based IOP decision support systems can improve precision, reduce IOP errors, and improve patient safety, potentially leading to widespread adoption by primary surgical specialties once the issues associated with data quality and clinical validation are resolved [5,6]. Although the application of AI in robotic surgery is still at an early phase and does not involve high levels of autonomy, its potential benefits, such as advanced metrics, fully automated task performance, and improved surgical training, warrant investment in infrastructure and training [18]. AI could become a standard operating room tool for reducing risk and facilitating communication among colleagues through real-time applications such as bleeding detection and hazard identification using computer vision and augmented reality (AR) [23,24]. The lack of external validation, the very small sample sizes, and the inadequate reporting of AI model performance across multiple populations underscore the need for a standardized assessment framework and regulatory rules to support the deployment of AI in clinical settings [10,27]. AI coaching programs for surgeons have been tested in a randomized controlled trial and provide an opportunity to develop high-quality surgical skills and ensure adherence to safety standards; such programs can therefore be introduced into surgical training practice [31]. Advances in perioperative risk stratification and the ability of AI models to predict the duration of each surgical case suggest that they can be used in practice to better organize operating room workflow, allocate resources, and schedule specific cases [26,30].
Current Landscape of AI in Surgical Care
AI and ML applications in surgery span various domains, including preoperative assessment, IOP guidance, and postoperative monitoring [9]. Predictive analytics can assess patient risk factors and optimize surgical planning, while AI-driven imaging technologies, such as AR and computer vision, enhance surgical precision [36]. Robotics-assisted surgery, though still in its infancy in LMICs, holds promise for improving surgical accuracy and accessibility [26]. Moreover, AI-powered telemedicine and remote surgical mentoring can address the shortage of specialized surgeons in rural and underserved areas [37].
AR is another rapidly emerging technology that enhances IOP visualization, allowing surgeons to overlay digital images onto real-time surgical fields [38]. AI-integrated AR can provide real-time guidance, reducing IOP complications and improving surgical efficiency [39]. Such technologies have already been widely adopted in high-income countries and have the potential to be scaled for use in LMICs with appropriate investment and policy support.
Challenges to AI Adoption in LMICs
Despite the promise of AI in surgery, LMICs face various challenges to its adoption. The unavailability of high-speed internet, cloud computing, and AI-compatible surgical equipment in many LMICs is a significant issue [40]. Without this core infrastructure, deploying surgical AI applications is challenging. Another obstacle is data gaps and the need for local context adaptation. In LMICs, the lack of locally relevant surgical data restricts the accuracy and applicability of AI models [41].
A shortage of AI competence is another issue. The scarcity of AI-literate healthcare workers delays AI-driven surgical interventions and technological integration [42]. Because AI applications require significant investments in technology and training, financial and policy restrictions are further barriers. Without regulatory frameworks for AI in healthcare, integrating AI into surgical practice is difficult, and uncertainty about AI-based solutions increases. AI-driven surgical treatment may worsen health disparities if not adequately regulated, raising ethical and equity concerns. Without strategic planning, AI developments may benefit urban centers while neglecting rural and underprivileged communities, exacerbating the healthcare divide within LMICs [39].
Limitations
Even with strong performance metrics, the generalizability and robustness of AI models are limited by data quality, heterogeneity, and weaknesses in standardization. The relative lack of large, multi-institutional, and well-annotated IOP datasets severely constrains the design of models that can be applied successfully across different clinical sites. Ethical issues, such as patient privacy, algorithmic transparency, and the impartiality of AI outputs, are not new, but they underscore the need for comprehensive regulatory oversight and interpretable AI software, both of which will be key to building trust among clinicians and reassuring patients. The lack of standardized reporting on demographic equity and external validation further complicates the clinical adoption of AI.
Future of AI and ML in Surgical Care for LMICs
AI-driven surgery in LMICs has a bright future if targeted investment and collaboration can overcome these constraints. We advocate the following: training healthcare workers in AI and ML through interdisciplinary collaborations with academic institutions and technology enterprises; encouraging government, private-sector, and international partnerships to finance AI-based surgical initiatives; designing local AI solutions using LMIC-specific data to increase diagnostic accuracy and relevance; expanding digital health infrastructure to support AI-powered surgical tools and remote surgery; establishing policy and ethics frameworks that make surgical AI adoption ethical, equitable, and sustainable; and developing affordable AI-assisted surgical instruments that fit LMIC healthcare budgets and skills.
Conclusions
This narrative literature review found that AI models' overall performance in IOP decision-making is substantial, with five of six studies reporting AUC values of 0.85-0.95, indicating strong discriminatory power. Moreover, the 22 studies that reported accuracy showed high predictive performance, with accuracy ranging from 80% to 99%, except for one study that reported an accuracy below 70%. These results support the practical feasibility of AI in IOP decision-making, and AI shows promise as an aid to surgeons' decision-making in the operating room. In the reviewed studies, ML and DL models achieved high precision in anatomic detection, surgical-phase detection, complication prediction, and real-time event detection. Developments in DL architectures, such as convolutional neural networks and generative adversarial networks, have enabled more accurate surgical guidance and the prediction of IOP events, thereby increasing surgical accuracy and potentially reducing errors. However, model performance needs to be validated through long-term computational and real-time clinical study designs, with appropriate strategies for data validation and model performance assessment. Because this review was limited to narrative synthesis and did not assess data validation (internal or external) or the quality of the included studies, future researchers are encouraged to perform systematic reviews to validate the evidence. Although AI tools are still in the early stages of clinical integration, they offer potential, especially in robotic surgery, IOP guidance, and perioperative risk assessment. Technical demonstrations of real-time AI applications have occurred, and initial surgeon feedback has been overwhelmingly positive, particularly for optimizing workflows and surgical education through AI-assisted coaching programs.
Nevertheless, infrastructural, regulatory, and cultural barriers hinder adoption in low-resource environments, including the need for substantial computational resources, surgeon training, and changes to existing workflows.
Disclosures
Conflicts of interest: In compliance with the ICMJE uniform disclosure form, all authors declare the following:
Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work.
Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work.
Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.
Author Contributions
Concept and design: Nicolás Idárraga Ruiz, Israel Cardona Salazar, Lincoln Xavier Naranjo Palacio, Carolina Agudelo Agudelo, Alfonso Miguel Ledesma Parra, Julio Cesar Flores Rodriguez
Acquisition, analysis, or interpretation of data: Nicolás Idárraga Ruiz, Israel Cardona Salazar, Lincoln Xavier Naranjo Palacio, Carolina Agudelo Agudelo, Alfonso Miguel Ledesma Parra, Julio Cesar Flores Rodriguez
Drafting of the manuscript: Nicolás Idárraga Ruiz, Israel Cardona Salazar, Lincoln Xavier Naranjo Palacio, Carolina Agudelo Agudelo, Alfonso Miguel Ledesma Parra, Julio Cesar Flores Rodriguez
Critical review of the manuscript for important intellectual content: Nicolás Idárraga Ruiz, Israel Cardona Salazar, Lincoln Xavier Naranjo Palacio, Carolina Agudelo Agudelo, Alfonso Miguel Ledesma Parra, Julio Cesar Flores Rodriguez
Supervision: Nicolás Idárraga Ruiz
References
- 1. World Bank Group. Disease control priorities. Washington (DC): World Bank Group; 2015. Essential Surgery.
- 2. Artificial intelligence in surgery: a systematic review of use and validation. Kenig N, Monton Echeverria J, Muntaner Vives A. J Clin Med. 2024;13:7108. doi: 10.3390/jcm13237108.
- 3. Artificial intelligence and healthcare: a journey through history, present innovations, and future possibilities. Hirani R, Noruzi K, Khuram H, et al. Life (Basel) 2024;14:557. doi: 10.3390/life14050557.
- 4. The role of artificial intelligence in enhancing surgical precision and outcomes. Shetti A, Ingale P, Mavi S. IP J Surg Allied Sci. 2024;6:78–81.
- 5. Machine learning perioperative applications in visceral surgery: a narrative review. Hossain I, Madani A, Laplante S. Front Surg. 2024;11:1493779. doi: 10.3389/fsurg.2024.1493779.
- 6. Artificial intelligence for intraoperative guidance: using semantic segmentation to identify surgical anatomy during laparoscopic cholecystectomy. Madani A, Namazi B, Altieri MS, et al. Ann Surg. 2022;276:363–369. doi: 10.1097/SLA.0000000000004594.
- 7. Early-stage clinical evaluation of real-time artificial intelligence assistance for laparoscopic cholecystectomy. Mascagni P, Alapatt D, Lapergola A, et al. Br J Surg. 2024;111:353. doi: 10.1093/bjs/znad353.
- 8. Application and use of artificial intelligence in colorectal cancer surgery: where are we? Celotto F, Capelli G, Ferrari S, Scarpa M, Pucciarelli S, Spolverato G. Art Int Surg. 2024;348:63.
- 9. Artificial intelligence in perioperative management of major gastrointestinal surgeries. Solanki SL, Pandrowala S, Nayak A, Bhandare M, Ambulkar RP, Shrikhande SV. World J Gastroenterol. 2021;27:2758–2770. doi: 10.3748/wjg.v27.i21.2758.
- 10. Artificial intelligence-enabled decision support in surgery: state-of-the-art and future directions. Loftus TJ, Altieri MS, Balch JA, et al. Ann Surg. 2023;278:51–58. doi: 10.1097/SLA.0000000000005853.
- 11. The intraoperative role of artificial intelligence within general surgery: a systematic review. Othman D, Kaleem A. Cureus. 2024;16:73006. doi: 10.7759/cureus.73006.
- 12. The challenges of deep learning in artificial intelligence and autonomous actions in surgery: a literature review. Taher H, Grasso V, Tawfik S, Gumbs A. Art Int Surg. 2022;2:144–158.
- 13. Impact of machine learning prediction on intraoperative transfusion in cranial operation: classification, regression, and decision curve analysis. Tunthanathip T, Sae-Heng S, Oearsakul T, Kaewborisutsakul A, Taweesomboonyat C. Int J Nutr Pharmacol Neurol Dis. 2022;12:186–194.
- 14. Role of artificial intelligence in surgical decision-making: a comprehensive review. Zarghami A. Galen Med J. 2024;13:3332.
- 15. A review on the current applications of artificial intelligence in the operating room. Birkhoff DC, van Dalen AS, Schijven MP. Surg Innov. 2021;28:611–619. doi: 10.1177/1553350621996961.
- 16. Deep learning applications in surgery: Current uses and future directions. Morris MX, Rajesh A, Asaad M, Hassan A, Saadoun R, Butler CE. Am Surg. 2023;89:36–42. doi: 10.1177/00031348221101490.
- 17. Deep learning in surgical workflow analysis: a review of phase and step recognition. Demir KC, Schieber H, Weise T, Roth D, May M, Maier A, Yang SH. IEEE J Biomed Health Inform. 2023;27:5405–5417. doi: 10.1109/JBHI.2023.3311628.
- 18. Clinical applications of artificial intelligence in robotic surgery. Knudsen JE, Ghaffar U, Ma R, Hung AJ. J Robot Surg. 2024;18:102. doi: 10.1007/s11701-024-01867-0.
- 19. Machine learning and deep learning to improve prevention of anastomotic leak after rectal cancer surgery. Celotto F, Bao QR, Capelli G, Spolverato G, Gumbs AA. World J Gastrointest Surg. 2025;17:101772. doi: 10.4240/wjgs.v17.i1.101772.
- 20. Wp2.10 - the future of artificial intelligence (AI) in surgery. Cheruvu C, Davies A, Lu Y, Mak R, Ingley P. Br J Surg. 2024;111:197–132.
- 21. Artificial intelligence and robotics: a combination that is changing the operating room. Andras I, Mazzone E, van Leeuwen FW, et al. World J Urol. 2020;38:2359–2366. doi: 10.1007/s00345-019-03037-6.
- 22. Generative artificial intelligence in surgery. Rodler S, Ganjavi C, De Backer P, et al. Surgery. 2024;175:1496–1502. doi: 10.1016/j.surg.2024.02.019.
- 23. Development of bleeding artificial intelligence detector (BLAIR) system for robotic radical prostatectomy. Checcucci E, Piazzolla P, Marullo G, et al. J Clin Med. 2023;12:7355. doi: 10.3390/jcm12237355.
- 24. Artificial intelligence-based hazard detection in robotic-assisted single-incision oncologic surgery. Rus G, Andras I, Vaida C, et al. Cancers (Basel) 2023;15:3387. doi: 10.3390/cancers15133387.
- 25. Artificial intelligence in pancreatic surgery: current applications. Kuemmerli C, Rössler F, Berchtold C, et al. J Pancreatol. 2023;6:74–81.
- 26. Machine learning-augmented interventions in perioperative care: a systematic review and meta-analysis. Mehta D, Gonzalez XT, Huang G, Abraham J. Br J Anaesth. 2024;133:1159–1172. doi: 10.1016/j.bja.2024.08.007.
- 27. Machine learning to guide clinical decision-making in abdominal surgery-a systematic literature review. Henn J, Buness A, Schmid M, Kalff JC, Matthaei H. Langenbecks Arch Surg. 2022;407:51–61. doi: 10.1007/s00423-021-02348-w.
- 28. Efficacy of machine learning algorithms versus conventional assessment techniques in predicting postoperative complications in general surgery: a comprehensive literature review. Ladinez MJV, Figueroa CBD, Guartatanga PGP, Veloz BAA, Arias CAL, Dominguez YF. Ibero-Am J Health Sci Research. 2024;4:89–86.
- 29. Minimization of occurrence of retained surgical items using machine learning and deep learning techniques: a review. Abo-Zahhad M, El-Malek AH, Sayed MS, Gitau SN. BioData Min. 2024;17:17. doi: 10.1186/s13040-024-00367-z.
- 30. Machine learning models to predict surgical case duration compared to current industry standards: scoping review. Spence C, Shah OA, Cebula A, Tucker K, Sochart D, Kader D, Asopa V. BJS Open. 2023;7:113. doi: 10.1093/bjsopen/zrad113.
- 31. Impact of an AI-based laparoscopic cholecystectomy coaching program on the surgical performance: a randomized controlled trial. Wu S, Tang M, Liu J, et al. Int J Surg. 2024;110:7816–7823. doi: 10.1097/JS9.0000000000001798.
- 32. Artificial intelligence-based multimodal risk assessment model for surgical site infection (Amrams): development and validation study. Chen W, Lu Z, You L, Zhou L, Xu J, Chen K. JMIR Med Inform. 2020;8:18186. doi: 10.2196/18186.
- 33. Intraoperative surgery room management: a deep learning perspective. Tanzi L, Piazzolla P, Vezzetti E. Int J Med Robot. 2020;16:1–12. doi: 10.1002/rcs.2136.
- 34. Artificial intelligence and machine learning in neurosurgery: a review of diagnostic significance and treatment planning efficiency. Ahmad RG. West Afr J Radiol. 2023;30:29–40.
- 35. Intraoperative applications of artificial intelligence in robotic surgery: a scoping review of current development stages and levels of autonomy. Vasey B, Lippert KA, Khan DZ, et al. Ann Surg. 2023;278:896–903. doi: 10.1097/SLA.0000000000005700.
- 36. Unveiling the influence of AI predictive analytics on patient outcomes: a comprehensive narrative review. Dixon D, Sattar H, Moros N, et al. Cureus. 2024;16:59954. doi: 10.7759/cureus.59954.
- 37. A comprehensive review on exploring the impact of telemedicine on healthcare accessibility. Anawade PA, Sharma D, Gahane S. Cureus. 2024;16:55996. doi: 10.7759/cureus.55996.
- 38. Application of extended reality in pediatric neurosurgery: a comprehensive review. Chang YZ, Wu CT. Biomed J. 2025;48:100822. doi: 10.1016/j.bj.2024.100822.
- 39. Artificial intelligence in surgical care for low- and middle-income countries: challenges, opportunities, and the path forward. Nkenguye W. Surg Pract Sci. 2025;22:100290. doi: 10.1016/j.sipas.2025.100290.
- 40. Exploring the impact of artificial intelligence on global health and enhancing healthcare in developing nations. Zuhair V, Babar A, Ali R, et al. J Prim Care Community Health. 2024;15:21501319241245847. doi: 10.1177/21501319241245847.
- 41. Generalizability assessment of AI models across hospitals in a low-middle and high income country. Yang J, Dung NT, Thach PN, et al. Nat Commun. 2024;15:8270. doi: 10.1038/s41467-024-52618-6.
- 42. Challenges and strategies for wide-scale artificial intelligence (AI) deployment in healthcare practices: a perspective for healthcare organizations. Esmaeilzadeh P. Artif Intell Med. 2024;151:102861. doi: 10.1016/j.artmed.2024.102861.



