Author manuscript; available in PMC: 2023 Feb 1.
Published in final edited form as: Med Image Anal. 2021 Nov 18;76:102306. doi: 10.1016/j.media.2021.102306

Surgical data science – from concepts toward clinical translation

Lena Maier-Hein a,b,c,1,*, Matthias Eisenmann a,1, Duygu Sarikaya d,e, Keno März a, Toby Collins f, Anand Malpani g, Johannes Fallert h, Hubertus Feussner i, Stamatia Giannarou j, Pietro Mascagni k,l, Hirenkumar Nakawala m, Adrian Park n,o, Carla Pugh p, Danail Stoyanov q, Swaroop S Vedula g, Kevin Cleary s, Gabor Fichtinger t, Germain Forestier u,v, Bernard Gibaud e, Teodor Grantcharov w,x, Makoto Hashizume y,z, Doreen Heckmann-Nötzel a, Hannes G Kenngott r, Ron Kikinis A, Lars Mündermann h, Nassir Navab B,C, Sinan Onogur a, Tobias Roß a,c, Raphael Sznitman D, Russell H Taylor C, Minu D Tizabi a, Martin Wagner r, Gregory D Hager g,C, Thomas Neumuth E, Nicolas Padoy k,l, Justin Collins F, Ines Gockel G, Jan Goedeke H, Daniel A Hashimoto I,J, Luc Joyeux K,L,M,N, Kyle Lam O, Daniel R Leff P,Q,R, Amin Madani S, Hani J Marcus T, Ozanan Meireles U, Alexander Seitel a, Dogu Teber V, Frank Ückert W, Beat P Müller-Stich r, Pierre Jannin e,1, Stefanie Speidel X,Y,1
PMCID: PMC9135051  NIHMSID: NIHMS1802848  PMID: 34879287

Abstract

Recent developments in data science in general and machine learning in particular have transformed the way experts envision the future of surgery. Surgical Data Science (SDS) is a new research field that aims to improve the quality of interventional healthcare through the capture, organization, analysis and modeling of data. While an increasing number of data-driven approaches and clinical applications have been studied in the fields of radiological and clinical data science, translational success stories are still lacking in surgery. In this publication, we shed light on the underlying reasons and provide a roadmap for future advances in the field. Based on an international workshop involving leading researchers in the field of SDS, we review current practice, key achievements and initiatives as well as available standards and tools for a number of topics relevant to the field, namely (1) infrastructure for data acquisition, storage and access in the presence of regulatory constraints, (2) data annotation and sharing and (3) data analytics. We further complement this technical perspective with (4) a review of currently available SDS products and the translational progress from academia and (5) a roadmap for faster clinical translation and exploitation of the full potential of SDS, based on an international multi-round Delphi process.

Keywords: Surgical data science, Artificial intelligence, Deep learning, Computer aided surgery, Clinical translation

1. Introduction

More than 15 years ago, in 2004, leading researchers in the field of computer aided surgery (CAS) organized the workshop “OR2020: Operating Room of the Future”. Around 100 invited experts including physicians, engineers, and operating room (OR) personnel attended the workshop (Cleary et al., 2004) to define the OR of the future, with 2020 serving as target time frame. Interestingly, many of the problems and challenges identified back in 2004 do not differ substantially from those we are facing today. Already then, researchers articulated the need for “integration of technologies and a common set of standards”, “improvements in electronic medical records and access to information in the operating room”, as well as “interoperability of equipment”. In the context of data-driven approaches, they criticized the lack of an “ontology or standard” for “high-quality surgical informatics systems” and underlined the need for “clear understanding of surgical workflow and modeling tools”. Broadly speaking, the field has not made progress as quickly as researchers had hoped for at the time.

More recently, the renaissance of data science techniques in general and deep learning (DL) in particular has given new momentum to the field of CAS. In response to the general artificial intelligence (AI) hype, a consortium of international experts joined forces to discuss the role of data-driven methods for the OR of the future. Based on a workshop held in 2016 in Heidelberg, Germany, the consortium defined Surgical Data Science (SDS) as a scientific discipline with the objective of improving “the quality of interventional healthcare and its value through capture, organization, analysis, and modelling of data” (Maier-Hein et al., 2017). In this context, “data may pertain to any part of the patient care process (from initial presentation to long-term outcomes), may concern the patient, caregivers, and/or technology used to deliver care, and are analyzed in the context of generic domain-specific knowledge derived from existing evidence, clinical guidelines, current practice patterns, caregiver experience, and patient preferences”. Importantly, SDS involves the physical “manipulation of a target anatomical structure to achieve a specified clinical objective during patient care” (Maier-Hein et al., 2018a). In contrast to general biomedical data science, it also includes procedural data as depicted in Fig. 1.

Fig. 1.


Building blocks of a surgical data science (SDS) system. Perception: Relevant data is perceived by the system (Section 3). In this context, effectors include humans and/or devices that manipulate the patient including surgeons, operating room (OR) team, anesthesia team, nurses and robots. Sensors are devices for perceiving patient- and procedure-related data such as images, vital signals and motion data from effectors. Data about the patient includes preoperative images and laboratory data, for example. Domain knowledge serves as the basis for data interpretation (Section 4). It comprises factual knowledge, such as previous findings from studies, clinical guidelines or hospital-specific standards related to the clinical workflow as well as practical knowledge from previous procedures. Interpretation: The perceived data is interpreted in a context-aware manner (Section 5) to provide real-time assistance (Section 6). Applications of SDS are manifold, ranging from surgical education to various clinical tasks, such as early detection, diagnosis, and therapy assistance.

Three years later, in 2019, an international poll revealed that no commonly recognized surgical data science success stories exist to date, while success stories in other fields have been dominating media reports for years, as detailed in Section 2. The purpose of this paper was therefore to go beyond the broad discussion of the potential of SDS by providing an extensive review of the field and identifying concrete measures to pave the way for clinical success stories. The paper is based on an international workshop that took place in June 2019 in Rennes, France, and structured according to core topics discussed at the workshop. In Section 2, we will review the questionnaire that served as the basis for the workshop as well as an international 4-round Delphi process (Hsu and Sandford, 2007) we conducted with 50 clinical and technical stakeholders from 51 institutions to present concrete goals for the future. In the ensuing sections, we will present the current practice, key initiatives and achievements, standards, platforms and tools as well as current challenges and next steps for the main building blocks of SDS, namely technical infrastructure for data acquisition, storage and access (Section 3), methods for data annotation and sharing (Section 4) as well as data analytics (Section 5). A section about achievements, pitfalls and current challenges related to clinical translation of SDS (Section 6) and a discussion of our findings (Section 7) will close the manuscript. While, by definition, SDS encompasses multiple interventional disciplines, such as interventional radiology and gastroenterology, the present paper puts a strong focus on surgery.

2. Lack of success stories in surgical data science

Machine learning (ML) has begun to revolutionize almost all areas of healthcare. Success stories cover a wide variety of application fields ranging from radiology and dermatology to gastroenterology and mental health applications (Miotto et al., 2018; Topol, 2019). Strikingly, such success stories appear to be lacking in surgery.

The international Surgical Data Science Initiative (Maier-Hein et al., 2017) was founded in 2015 with the mission to pave the way for AI success stories in surgery. The key results of the first workshop, which was inspired by current open space and think tank formats, were a common definition of SDS (Maier-Hein et al., 2017) and a thorough description of the challenges in applying AI in interventional healthcare. The second edition of the workshop in 2019 focused on a comprehensive overview of the field including key research initiatives, industrial perspectives and first success stories. Prior to the workshop, the registered participants were asked to fill out a questionnaire covering various aspects related to SDS. 43% of the 77 participants were professors/academic group leaders (clinical or engineering), while the remainder were mostly either from industry (14%) or PhD students/postdocs (36%). The majority of participants (61%) agreed that the most important developments since the last workshop in 2016 were related to advances in AI. Notably, however, when participants were asked about the most impressive SDS paper, only a single paper (Maier-Hein et al., 2017) (the position paper from the first workshop) was mentioned more than twice (primarily by non-co-authors). The majority of participants agreed that the lack of representative annotated data is the main obstacle in the field and the main reason for the failure of previous SDS projects. Likewise, when referring to their personal experience, 33% attributed the failure of an SDS project primarily to a lack of data, followed by underestimation of the problem complexity (29%). EndoVis (28%), Cholec80 (Twinanda et al., 2017) (21%) and JIGSAWS (Gao et al., 2014) (17%) were mentioned as the most useful publicly available data sets, but the small size/limited representativeness of the data sets was identified as a core issue (45%).

Based on the replies to the questionnaire and the subsequent workshop discussions, we identified four areas that are essential for moving the field forward: (1) Technical infrastructure for data acquisition, storage and access, (2) data annotation and sharing, (3) data analytics, and (4) aspects related to clinical translation. These are reflected in the four main sections of this paper. We then conducted a Delphi process involving a consortium of 50 medical and technical experts from 51 institutions (see list of co-authors) to formulate, for each of the four areas, a mission statement along with a set of goals that are necessary to accomplish the respective mission (see Tables 2, 3, 4 and 7). More specifically, the coordinating team of the Delphi process (eight members from five institutions; non-voting) put forth an initial mission statement and an initial set of goals for each of the four missions based on the workshop discussions. In a 4-round Delphi process, the remaining consortium members then iteratively refined the phrasing of the mission statements and goals and added further proposals for goals. This process yielded a set of 6–9 goals per mission, each of which received support from at least two thirds of the voting members. Finally, the consortium collaboratively compiled a list of relevant stakeholders (Table 1) and then rated their importance for the four missions (Appendix F). To avoid redundancy, the consortium further agreed on the following:

Table 2.

Mission statement corresponding to technical infrastructure (Sec. 3) along with corresponding goals. The distribution of priorities (from left to right: not a priority, low priority, medium priority, high priority, essential priority) as rated by the participants of the Delphi process is depicted for each goal.


Table 3.

Mission statement corresponding to data annotation and sharing (Sec. 4) along with corresponding goals. The distribution of priorities (from left to right: not a priority, low priority, medium priority, high priority, essential priority) as rated by the participants of the Delphi process is depicted for each goal.


Table 4.

Mission statement corresponding to data analytics (Sec. 5) along with corresponding goals. The distribution of priorities (from left to right: not a priority, low priority, medium priority, high priority, essential priority) as rated by the participants of the Delphi process is depicted for each goal.


Table 7.

Mission statement corresponding to clinical translation (Sec. 6) along with corresponding goals. The distribution of priorities (from left to right: not a priority, low priority, medium priority, high priority, essential priority) as rated by the participants of the Delphi process is depicted for each goal.


Table 1.

List of relevant SDS stakeholders.

Surgical Data Science Stakeholders
Clinical stakeholders
  • Hospital administration

  • Healthcare information technology (HIT)

  • Data-generating units in healthcare institutions (e.g. imaging departments, laboratories, centers of clinical studies)

  • Surgical teams (e.g. surgeons, nurses, anesthesiologists)

  • Medical professional bodies (e.g. the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) and the European Association of Endoscopic Surgery (EAES))

Research stakeholders
  • Researchers (including clinician scientists)

  • Research institutions (e.g. university hospitals)

  • Scientific societies (e.g. the international Society for Medical Image Computing and Computer Assisted Intervention (MICCAI))

  • Journals/editors

  • Funding agencies/institutions (e.g. the European Research Council (ERC))

Industrial stakeholders
  • Medtech companies - large

  • Medtech companies - medium-sized

  • Medtech companies - small-sized

  • Industry federations

  • Investors

Regulatory stakeholders
  • Lawmakers

  • Regulatory agencies (e.g. the U.S. Food and Drug Administration (FDA))

  • Institutional review boards

  • Insurance companies

Public and private stakeholders
  • Patients and/or their legal guardians/family

  • Charities and donors

  • Public health organizations (e.g. the World Health Organization (WHO))

  • Media

  • Citizens

Context statement: Unless otherwise specified, in all of the following text, a) surgical data science (SDS) represents the general context of the suggested phrases and b) “data” may pertain to any part of the patient care process (from initial presentation to long-term outcomes), may concern the patient, caregivers and/or technology used to deliver care and must be acquired, stored, and shared in accordance with both local and international regulatory constraints. In general, c) data handling should comply with the FAIR (Findability, Accessibility, Interoperability, and Reuse) principles (Wilkinson et al., 2016) and d) user-friendliness should be a guiding principle in all processes related to data handling. Finally, e) the term SDS stakeholders refers to clinical, research, industrial, regulatory, public and private stakeholders.

Based on the international questionnaire, the on-site workshop and the subsequent Delphi process, the following sections present the perspective of the members of the international data science initiative on the identified key aspects for generating SDS success stories.

3. Technical infrastructure for data acquisition, storage and access

To date, the application of data science in interventional medicine (e.g. surgery, interventional radiology, endoscopy, radiation therapy) has received comparatively little attention in the literature. This can partly be attributed to the fact that only a fraction of patient-related data and information is being digitized and stored in a structured manner (Hager et al., 2020) and that doing so is often infeasible in modern ORs. This section focuses on current hurdles in creating an environment that can record and structure highly heterogeneous surgical data for long-term usage.

3.1. Current practice

Different types of data pose different types of challenges:

Not all data can currently be acquired:

The OR is a highly dynamic environment where a team of health workers with varying specializations (e.g. surgeons, anesthesia team) continuously makes decisions based on device data, observation of the patient, and the outcome of previous actions. However, much of the information that the healthcare workers perceive by interacting with the patient and each other is currently not acquired at all, although it crucially affects decision making. This information relates to different human senses including vision, touch (e.g. palpation and tactile feedback from tissue) and hearing (e.g. acoustic signals resulting from instrument-tissue interaction (Ostler et al., 2020), communication in the OR). First initiatives have begun addressing these issues (see Section 3.2) but the infrastructure is not yet widely available.

Not all data that can be acquired is recorded and permanently stored:

Surgical data in minimally invasive surgery (MIS) routinely involves live image data of high resolution and frame rate. Modern stereoscopic endoscopes create two Full High Definition (HD) video streams at 60 Hz. If this data is to be stored uncompressed, it can quickly exceed 50 GB per video, with much larger file sizes possible depending on the situation and additional sensory input, and even larger again considering 4K resolutions. Healthcare information technology (HIT) is currently not designed to prospectively record and store such large data files.
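The storage figures above can be checked with a short back-of-the-envelope calculation. The following is a minimal sketch, assuming a Full HD frame of 1920 × 1080 pixels and a 24-bit color depth (neither is stated in this paragraph); the function names are illustrative only.

```python
# Rough storage estimate for uncompressed stereoscopic endoscope video.
# Assumptions (not stated in this paragraph): Full HD = 1920 x 1080 pixels,
# 24 bits (3 bytes) per pixel. The two streams and 60 Hz come from the text.
WIDTH_PX, HEIGHT_PX = 1920, 1080
BYTES_PER_PIXEL = 3
FRAME_RATE_HZ = 60
STREAMS = 2  # stereoscopic endoscope: one stream per eye


def bytes_per_second(streams: int = STREAMS) -> int:
    """Raw data rate in bytes per second for the given number of streams."""
    return WIDTH_PX * HEIGHT_PX * BYTES_PER_PIXEL * FRAME_RATE_HZ * streams


def seconds_to_reach(size_gb: float, streams: int = STREAMS) -> float:
    """Recording time in seconds until size_gb (decimal gigabytes) is reached."""
    return size_gb * 1e9 / bytes_per_second(streams)


if __name__ == "__main__":
    print(f"Raw data rate: {bytes_per_second() / 1e6:.1f} MB/s")
    print(f"Time to reach 50 GB: {seconds_to_reach(50):.0f} s")
```

Under these assumptions, two uncompressed streams produce roughly 0.75 GB per second, so the 50 GB mark is passed after little more than a minute of recording.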

Not all acquired data is digitized and stored in a structured manner:

A large proportion of documentation in the hospital is still unstructured. Reports, doctors’ letters, transcripts from examinations, treatment strategy plans and many more need to be documented in their original form for legal reasons (Kilian et al., 2015). When creating such documents, it is not uncommon to use printouts or Portable Document Format (PDF) documents that then form the basis of discussions between healthcare personnel or with patients. Resulting decisions are subsequently entered into the most relevant information systems as scans, unstructured, or semi-structured documents. As a result, all processes are documented in a manner satisfactory for legal aspects, but largely inaccessible to computation. This is especially true for information related to the surgical procedure, where the decision process leading up to the final operation strategy may not be stored at all (in simple cases) or only in the form of handwritten plans (in complex cases). Additionally, the exact parameters recorded for a specific intervention may differ between hospitals, leading to missing values if such data sets are merged. A host of information is potentially available from the actual surgery, including the exact steps taken, instruments used, information exchanged between personnel, haptic feedback, distractions, adaptations of the strategy plan, etc., much of which is not documented at all in OR reports, or documented incompletely. This is evidenced, for example, by reports of similar length for the same procedure even though the corresponding surgeries differ radically in duration. Additionally, problems during surgery may systematically be underreported (Hamilton et al., 2018).

Not all data that is stored can be exchanged between systems:

Perioperative data is distributed over varying information systems. For example, Picture Archiving and Communication Systems (PACS) contain image data and videos, Radiology Information Systems (RIS) contain reports, findings and radiotherapy plans, and Laboratory Information Systems (LIS) contain laboratory data. Information systems that focus on a single aspect, e.g. laboratory data, can implement efficient storage, manipulation and retrieval methods specific to the given data types. At the same time, user interaction can be kept as simple as possible, with a large degree of workflow optimization for relevant personnel interfacing with the information systems. Linking data from several systems effectively complicates these models. The more data types are incorporated in a model, the more special cases need to be considered, making the model less accessible and harder to query. However, a strict semantic annotation is a prerequisite for guaranteeing retrievability and interoperability (Lehne et al., 2019). As a result, data exchange between information systems is rare. A positive example has been set in radiology, where the Digital Imaging and Communications in Medicine (DICOM) standard has enabled the structured exchange of imaging data. OR data recording systems have also started to offer connection to other hospital infrastructure systems like electronic medical records (EMR), e.g. NUCLeUS (Sony Corporation, Tokyo, Japan). At present, however, this connectivity is typically not utilized widely or effectively. Also, stored OR data is generally not labeled and hence has limited utilization for SDS projects without significant efforts to restructure it.

Regulatory constraints make data acquisition, storage and access challenging:

SDS data collection, management and use must comply with standards in security and fidelity which typically vary depending on the data type and level of patient-specific information. Data governance in healthcare and specifically in surgery is still challenging and less mature compared to other domains (Tse et al., 2018). In the European Union (EU), the General Data Protection Regulation (GDPR) covers issues pertaining to personal data both within the EU and its entry to or exit out of the EU since 2018 (European Parliament and Council of European Union, 2016). Similarly, in the United States of America (USA) the healthcare-specific Health Insurance Portability and Accountability Act of 1996 (HIPAA) protects the confidentiality and integrity of patient data. In the United Kingdom (UK), the Data Protection Act (2018) was put in place for the National Health Service (NHS). In other countries, equivalents for data protection exist and are related to the legal frameworks of the respective healthcare system.

From an ethico-legal perspective, it is worth noting that companies commonly obtain surgical data either through contracts with individual consulting surgeons, licensing agreements with hospitals or in exchange for discounted pricing of their products. This current practice raises important issues regarding power imbalances and the democratization of data access (August et al., 2021).

3.2. Key initiatives and achievements

This section presents prominent SDS initiatives with a specific focus on data acquisition, access and exchange.

Data acquisition:

Several industrial and academic initiatives have been proposed to overcome the bottleneck of prospective surgical data acquisition.

The DataLogger (KARL STORZ SE & Co. KG, Tuttlingen, Germany) is a technical platform for synchronously capturing endoscopic video and device data from surgical devices, such as the endoscopic camera, light source, and insufflator (Wagner et al., 2017). The DataLogger has served as a basis for the development of a Smart Data Platform as part of the InnOPlan project (Roedder et al., 2016) and has been continuously expanded to support an increasing number of medical devices and clinical information systems. It has also been used to collect data for Endoscopic Vision challenges (e.g. EndoVis-Workflow; EndoVis-Workflow and Skill; EndoVis-ROBUST-MIS).

The OR Black Box® (Goldenberg et al., 2017) is a platform that allows healthcare professionals to identify, understand, and mitigate risks that impact patient safety. It combines input from video cameras, microphones, and other sensors with human and automated processing to produce insights that lead to improved efficiency and reduced adverse events. The OR Black Box has been in operation in Canada since 2014, in Europe since 2017 and in the USA since 2019. An early analysis of the OR Black Box use in laparoscopic procedures of over 100 patients has demonstrated that errors and distractions as annotated by experts viewing the procedures took place in every case, and often went unnoticed or were at least not recalled by the surgeon at the time (Jung et al., 2020).

In Strasbourg, France, the Nouvel Hôpital Civil (NHC), the Institut de Recherche contre les Cancers de l’Appareil Digestif (IRCAD) and the Institut hospitalo-universitaire (IHU) record surgery videos for education and research purposes. These are curated and used mainly for IRCAD’s WebSurg (Mutter et al., 2011), a free online reference for video-based surgery training with over 370,000 members.

The Surgical Metrics Project began in October 2019 at the Annual Clinical Congress meeting of the American College of Surgeons (ACS). Over 200 board certified surgeons were equipped with wearable technology while they performed a simulated open bowel repair on porcine intestines. Multi-modal data, including electroencephalography (EEG), audio and video data were acquired to quantify efficient and successful operative approaches (Pugh et al., 2020).

The CDEGenerator is an online platform that addresses the need to create and share definitions of joint Core Data Elements (CDE) (Varghese et al., 2018). These definitions combine a list of recorded parameters together with an exact semantic description. By agreeing on a common CDE, two hospitals can guarantee that the collected data is compatible to the degree of the described acquisition processes.
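To illustrate why a shared definition matters, the following minimal sketch compares the parameter definitions of two hypothetical hospitals before a merge. It does not reflect the CDEGenerator's actual data model or interface; the `DataElement` class, the `compare_definitions` function and all example parameters are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical data element: a recorded parameter plus its semantics."""
    name: str
    unit: str
    definition: str


def compare_definitions(site_a, site_b):
    """Split parameter names into shared, conflicting and one-sided sets.

    shared:      same name and identical unit/definition at both sites
    conflicting: same name but differing unit or definition
    one_sided:   recorded at only one site (missing values after a merge)
    """
    a = {e.name: e for e in site_a}
    b = {e.name: e for e in site_b}
    common = a.keys() & b.keys()
    shared = sorted(n for n in common if a[n] == b[n])
    conflicting = sorted(n for n in common if a[n] != b[n])
    one_sided = sorted(a.keys() ^ b.keys())
    return shared, conflicting, one_sided


# Invented example definitions for two sites.
HOSPITAL_A = [
    DataElement("operating_time", "min", "incision to closure"),
    DataElement("blood_loss", "ml", "estimated intraoperative blood loss"),
    DataElement("conversion_to_open", "bool", "procedure converted to open"),
]
HOSPITAL_B = [
    DataElement("operating_time", "min", "patient enters to leaves the OR"),
    DataElement("blood_loss", "ml", "estimated intraoperative blood loss"),
]

if __name__ == "__main__":
    shared, conflicting, one_sided = compare_definitions(HOSPITAL_A, HOSPITAL_B)
    print("Compatible:", shared)
    print("Same name, different semantics:", conflicting)
    print("Recorded at one site only:", one_sided)
```

In this invented example, only `blood_loss` could be merged directly: `operating_time` carries the same name but a different meaning at each site, and `conversion_to_open` would yield missing values for one hospital.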

Data access and exchange:

In the perioperative environment, the nonprofit organization Integrating the Healthcare Enterprise (IHE, Oak Brook, Illinois, USA) has been a driving force in forming a set of standards that facilitate data exchange (Grimes, 2005). It identifies clinical use cases, their requirements and relevant standards, and publishes guidelines (called “profiles”) on how to fulfill such use cases. IHE does not publish standards by itself, but rather identifies sets of standards (e.g. DICOM for image exchange and Logical Observation Identifiers Names and Codes (LOINC) (Forrey et al., 1996) for nomenclature) that are best suited to solve specific aspects of healthcare interoperability. Additionally, IHE regularly hosts “Connectathons”, where vendors present services with IHE profile implementations and test their systems against those of other vendors, verifying correct data exchange.

Inside the OR, efforts for transmitting and centralizing data have been explored for some time, particularly with integrated OR solutions provided by endoscopic device manufacturers and medical technology providers (KARL STORZ: OR1; Olympus Medical Systems (Tokyo, Japan): ENDOALPHA; Stryker (Michigan, USA): iSuite; Getinge AB (Getinge, Sweden): Tegris®; Richard Wolf GmbH (Knittlingen, Germany): core nova; STERIS plc (Derby, UK): Harmony iQ®; Brainlab AG (Munich, Germany): Digital O.R.; caresyntax GmbH (Berlin, Germany): PRIME365; Medtronic plc (Dublin, Ireland): Touch Surgery Enterprise; Sony: NUCLeUS; General Electric Company (Boston, USA): Edison; EIZO Corporation (Hakusan, Japan): CuratOR®). The wide availability of such systems should be an enabling technology for SDS efforts, not only allowing capturing of data from the OR but also setting a precedent on data management, security, storage and transmission.

Frequently, integrated ORs only provide technical interoperability for connecting image sources with displays (sinks) by using video and broadcasting standards such as Video Graphics Array (VGA), Digital Visual Interface (DVI), High-Definition Multimedia Interface (HDMI) or DisplayPort (DP). Higher levels of interoperability are easier to achieve with Internet Protocol (IP)-based data exchange standards (see Section 3.3).

In addition to video routing and capture, the integration of data from further devices in the OR is relevant. The German Federal Ministry of Education and Research (BMBF) lighthouse project OR.NET (Rockstroh et al., 2017), now continued as the nonprofit organization OR.NET e.V., worked on cross-manufacturer concepts and standards for the dynamic and secure networking of medical devices and information technology (IT) systems in the OR and clinics (Kricka, 2019; Miladinovic and Schefer-Wenzl, 2018). Initial results laid important foundations in the shape of a service-oriented communication protocol for the dynamic cross-vendor networking of medical devices and resulted in the International Organization for Standardization (ISO)/Institute of Electrical and Electronics Engineers (IEEE) 11073 Service-oriented Device Connectivity (SDC) series of standards (see Section 3.3). The projects InnOPlan (Roedder et al., 2016) (see paragraph “Data acquisition”) and OP 4.1 also used SDC as the basis for device communication. InnOPlan’s Smart Data platform provides and analyzes medical device data in real time to enable data-driven services in the operating room. The project OP 4.1 aimed at developing a platform for the OR - in analogy to an operating system for smartphones - that allows for integration of new technical solutions via apps.

The project Connected Optimized Network & Data in Operating Rooms (CONDOR) is another collaborative endeavor that aims to build a video-driven Surgical Control Tower (Padoy, 2019; Mascagni and Padoy, 2021) within the new surgical facilities of the IRCAD and IHU Strasbourg hospital by developing a novel video standard and new surgical data analytics tools. A similar initiative is The Operating Room of the Future (ORF) that researches device integration in the OR, workflow process improvement, as well as decision support by combining patient data and OR devices for MIS (Stahl et al., 2005).

3.3. Standards, platforms and tools

Standards, platforms and tools have focused on the topics of interoperability as well as data storage and exchange.

3.3.1. Interoperability

Interoperability is defined by IEEE as “the ability of two or more systems or components to exchange information and to use the information that has been exchanged” (IEEE, 1991) or by the Association for the Advancement of Medical Instrumentation (AAMI) as “the ability of medical devices, clinical systems, or their components to communicate in order to safely fulfill an intended purpose” (AAMI, 2012).

Numerous standards have been introduced to provide interoperability including Health Level 7 (HL7), IEEE 11073, American Society for Testing and Materials (ASTM) F2761 (Integrated Clinical Environment (ICE)), DICOM, ISO TC215, European Committee for Standardization (CEN) TC251 and International Electrotechnical Commission (IEC) 62A. Different levels of interoperability can be distinguished, for example through the 7 Level Conceptual Interoperability Model (LCIM) from Tolk et al. (2007), which is defined as follows (Wang et al., 2009):

  • Level 0 – No interoperability: Two systems cannot interoperate.

  • Level 1 – Technical interoperability: Two systems have the means to communicate, but share no understanding of either the structure or the meaning of the data communicated. The systems have common physical and transport layers.

  • Level 2 – Syntactic interoperability: Two systems communicate using an agreed-upon protocol with structure but without any meaning. The systems exchange data using a common format.

  • Level 3 – Semantic interoperability: Two systems communicate with structure and have agreed on the meaning of the exchanged terms. The meaning of only the exchanged data is understood.

  • Level 4 – Pragmatic interoperability: Two systems communicate with a shared understanding of data, the relationships between elements of the data, and the context of the data but these systems do not support changing relationships or context over time. The meaning of the exchanged data and the relationships between pieces of information is understood.

  • Level 5 – Dynamic interoperability: Two systems are able to adapt their information models based on changing meaning and context of data over time. Evolving semantics are understood.

  • Level 6 – Conceptual interoperability: Includes the understanding and exchange of complex concepts. Systems are aware of each other’s underlying assumptions, models and processes.

The number of interoperability levels varies from model to model and depends on the goal of the intended classification. For example, Lehne et al. (2019) use only four levels, the first two being identical to those listed above; the third, also called “semantic interoperability”, addresses the complexities mentioned in levels 3 to 5 here, and the fourth puts forth the concept of “Organisational Interoperability”, which includes aspects of levels 5 and 6. The following paragraphs use the LCIM to classify the standards of interest to this paper.

(1). Technical interoperability:

Modern hospitals typically have sophisticated networks, which makes technical interoperability the most achievable level (Lehne et al., 2019). The main challenge inside the OR, where real-time capability is often critical, is the available bandwidth. An uncompressed Full HD video stream at 60 fps in a color depth of 24 bit requires a bandwidth of approximately 2.99 Gigabit per second (Gbps, not to be confused with Gigabyte per second (GBps), which is eight times larger). Available Ethernet ports typically have a data transfer rate of 1 Gbps. While more modern installations may reach Ethernet data transfer rates of 10 Gbps, this technology is still expensive and typically reserved for networks in data centers. Wireless networks are even slower: Modern devices often support theoretical speeds between 0.45 Gbps and 1.3 Gbps, with an effective bandwidth of around 50% of the theoretical limit. The newest Wi-Fi (Wireless Fidelity) 6 standard, released in late 2019, increases this theoretical limit to over 10 Gbps under laboratory conditions, but the effective speeds and adoption rate remain to be seen. In general, Wi-Fi suffers from a higher rate of associated uncertainties as well as latency, depending on a number of environmental factors. Critically, Wi-Fi packets may get lost if interference between networks is too high, causing latency spikes of potentially several hundreds of milliseconds, which may negatively affect real-time applications. The new 5G standard for wireless communication can potentially ease some of these problems by reaching theoretical speeds of 20 Gbps and avoiding conflicts with other networks since the relevant frequencies are licensed for specific areas. Additionally, 5G as a method of Internet access could enable the transfer of large amounts of data to and from the hospital in relatively short time, something which previously required not readily available fast physical connections like glass fibre.
While limitations of available bandwidth can be mitigated by using data compression, importantly, “losses imperceptible to humans” can still impede algorithm performance.
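
The bandwidth requirement of an uncompressed video stream follows directly from resolution, frame rate and color depth. The following minimal Python sketch reproduces this arithmetic and relates it to the link rates mentioned in the text; the function name and the dictionary of links are illustrative assumptions, not part of any standard.

```python
def uncompressed_video_gbps(width, height, fps, bits_per_pixel):
    """Raw video bitrate in gigabit per second (1 Gbps = 1e9 bit/s)."""
    return width * height * fps * bits_per_pixel / 1e9


# Full HD (1920 x 1080) at 60 fps with 24 bit color depth, as in the text.
full_hd_gbps = uncompressed_video_gbps(1920, 1080, 60, 24)
print(f"Uncompressed Full HD stream: {full_hd_gbps:.3f} Gbps")

# Link rates taken from the text; the Wi-Fi entry applies the stated
# "around 50%" effective bandwidth to the 1.3 Gbps theoretical limit.
links_gbps = {
    "Ethernet, 1 Gbps": 1.0,
    "Ethernet, 10 Gbps": 10.0,
    "Wi-Fi, 1.3 Gbps theoretical": 1.3 * 0.5,
}

# Minimum compression ratio needed to fit one stream on each link.
required_ratio = {name: full_hd_gbps / rate for name, rate in links_gbps.items()}
for name, ratio in required_ratio.items():
    status = "fits uncompressed" if ratio <= 1 else f"needs at least {ratio:.1f}:1 compression"
    print(f"{name}: {status}")
```

Under these assumptions, only the 10 Gbps link carries a single stream without compression, which is why compression, and its possible effect on algorithm performance, becomes relevant in practice.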

It is worth noting that, especially inside the OR, devices still exist that are entirely unable to connect to networks (from basic technical infrastructure like doors or lights to routine medical equipment like certain anesthesia systems) or are not in the network due to missing capacities (e.g. Ethernet sockets) or software add-ons (e.g. a proprietary application programming interface (API)).

(2). Syntactic interoperability:

At this level, the structure of exchanged data is defined with basic semantic information. This level is arguably where most of today’s efforts in medical data interoperability take place, and where a number of standards compete. A major player in the standardization is HL7 (Kalra et al., 2005), which has developed standards for the exchange of patient data since 1987. The eponymous HL7 standard has been continuously updated and most notably includes the Version 3 Messaging Standard, which specifies interoperability for health and medical transactions. HL7 has been criticized for the complexity of its implementation (Goldenberg et al., 2017), resulting in the proposal of HL7 Fast Healthcare Interoperability Resources (FHIR). HL7 FHIR simplifies implementation through the use of widely applied web technologies. Another important standard is provided by the openEHR foundation. In contrast to HL7, openEHR is not only a standard for medical data exchange, but an architecture for a data platform that provides tools for data storage and exchange. With this, however, come added complexity and challenges.

HL7 and openEHR provide the broadest scope of medical data exchange, but both build on standards that solve specific subtasks. While a complete listing is out of scope for this article, one notable example is DICOM, which today is the undisputed standard for the management of medical imaging information. In 2019, DICOM was extended to include real-time video (DICOM Real-Time Video (DICOM-RTV)). This extension is an IP-based DICOM service for transmitting and broadcasting real-time video, with synchronized metadata, to subscribers (e.g. a monitor or SDS application server) with a quality comparable to standard OR video cables.

The previously mentioned standards focus on enabling the exchange of patient-individual data between Hospital Information Systems (HIS). Inside the OR, requirements differ, since a host of devices create a real-time data stream that focuses on sensor input instead of direct patient information (diagnosis, habits, morbidity). Accordingly, data exchange standards inside the OR are geared toward these data types. OpenIGTLink (Tokuda et al., 2009), for example, started as a communication protocol for Image Guided Therapy (IGT) applications. Today, OpenIGTLink has been expanded to exchange arbitrary types of data by providing a general framework for data communication. However, it does not define broad standards for the data format, instead relying on users to implement details according to their needs. Through this model, OpenIGTLink enabled data exchange inside the OR long before broad standards were feasible. Similarly, for the field of robotics, the Robot Operating System (ROS) (Koubaa, 2016) has been proposed.

More recent efforts by the OR.NET initiative (see Section 3.2) produced the IEEE 11073 SDC ISO standard which provides a means for general data and command exchange for devices and enables users to control devices in the OR. Standards less specific to the healthcare environment are also available. Similar to OpenIGTLink, the Internet of Things (IoT), for example, defines a standard for device communication without defining standards for the communicated data. While it has been used for data exchange between information systems (Xie et al., 2018), and between devices in the OR (Miladinovic and Schefer-Wenzl, 2018), it has elicited mixed reactions.

(3). Semantic interoperability:

This is the domain of clinical nomenclatures, terminologies and ontologies. While modern standards like HL7 FHIR and openEHR already define basic semantics in data exchange, extending these annotations to more powerful nomenclatures like SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) (Cornet and de Keizer, 2008) (see Section 4) enables systems to share not only data, but also their exact meaning and scope (i.e. what kind of data exactly falls under the given definition). To illustrate the difference between this level and the previous: HL7 FHIR defines fewer than 200 healthcare concepts (i.e. terms with a well-defined meaning) (Bender and Sartipi, 2013), while SNOMED CT defines more than 340,000 concepts (Miñarro-Giménez et al., 2019). Today, semantic interoperability is largely defined by terminologies (systematic lists of vocabulary), ontologies (definitions of concepts and categories along with their relationships) and taxonomies (classifications of entities, especially organisms), the borders between which are often fluid. Standard languages such as the Resource Description Framework (RDF), Resource Description Framework Schema (RDFS) and the Web Ontology Language (OWL) (Bechhofer, 2009) have been defined by the World Wide Web Consortium (W3C), guaranteeing interoperability between ontology resources and data sets based on these ontologies. The aforementioned SNOMED CT is arguably the most complete terminology, spanning the whole field of clinical terms with a wide set of available translations. However, specialized alternatives may perform better in their respective fields. Additionally, a host of medical ontologies are available. Most notable is the family of ontologies gathered under the Open Biological and Biomedical Ontologies (OBO) Foundry (Smith et al., 2007), which cover a wide array of topics from the biomedical domain and share the Basic Formal Ontology (BFO) (Grenon and Smith, 2004) as a common top-level ontology.
Intraoperatively, the OntoSPM (Gibaud et al., 2018) provides terminology for the annotation of intraoperative processes, and has spawned efforts for the annotation of binary data (Katić et al., 2017). Common to all these efforts is that they serve best in combination with a standard addressing syntactic interoperability, where they can add semantic information to the data exchange. Semantic interoperability goes hand in hand with data annotation, and is expanded upon in Section 4.

It is important to note that semantic interoperability does not guarantee the availability of data. If two hospitals have agreed on a detailed semantic model but record different parameters for a specific procedure, then the two resulting data sets will contain well-defined but empty fields. To avoid this, it is necessary to agree on lists of recorded parameters, e.g. in the form of CDE.

(4). Pragmatic interoperability:

In order to define context, additional modeling is required to capture data context and involved processes. This can in part be achieved by extending modeling efforts from the semantic interoperability level to include these concepts. Furthermore, efforts to formalize the exchange processes themselves are required. IEEE 11073 provides descriptions of architecture and protocol (IEEE 11073–20701) for this purpose, while HL7 provides the IHE Patient Care Device (PCD) implementation guide and the conformance model.

For the remaining two levels, developments are more recent and less formalized. Level (5) Dynamic interoperability requires modeling how the meaning of data changes over time. This can range from simple state changes (planned operations becoming realized, proposed changes becoming effective) to new data types being introduced and old data types changing meaning or being deprecated. The participant key purposes in IEEE 11073 and the workflow descriptions in HL7 were created to support these aspects. Finally, Level (6) Conceptual interoperability allows for exchanging and understanding complex concepts. This requires a means to share the conceptual model of the system, its processes, state, architecture and use cases. This level can be achieved through defining use cases and profiles (e.g. IHE Services-oriented Device Point-of-care Interoperability (SDPi) Profiles) and/or provisioning reference architectures and frameworks.

3.3.2. Data storage and distribution

While current standards have focused on data exchange, they typically do not address data distribution and storage. Typically, data is exchanged between two defined endpoints (e.g. a tracking device and an IGT application, or a computed tomography (CT) scanner and a PACS). To achieve a system that can be dynamically expanded with regard to its communication capabilities, it is necessary to implement messaging technology. Such tools allow arbitrary devices to take part in communication by registering via a message broker, where messages can typically be filtered, for instance, by their origin, type or destination. Examples include Apache Kafka (Kim et al., 2017; Spangenberg et al., 2018) or RabbitMQ® (Ongenae et al., 2016; Trinkūnas et al., 2018). Such systems enable developers to create flexible data exchange architectures using technologies that are mature and usually well documented thanks to their wide application outside the field of healthcare. However, they also create a level of indirection which introduces additional delay (which may be negligible with only a few milliseconds in local networks, or significant with several tens or even hundreds of milliseconds over the Internet or wireless networks).
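
The broker pattern described above can be illustrated with a small in-memory sketch in Python. It does not use the Apache Kafka or RabbitMQ APIs; the class `InMemoryBroker`, the `Message` fields and the device names are hypothetical and only show how registration and filtering by message type decouple senders from receivers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, DefaultDict, List


@dataclass(frozen=True)
class Message:
    origin: str     # hypothetical device identifier
    msg_type: str   # used by the broker for filtering
    payload: Any


class InMemoryBroker:
    """Toy publish/subscribe broker that routes messages by type."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Message], None]]] = defaultdict(list)

    def subscribe(self, msg_type: str, handler: Callable[[Message], None]) -> None:
        # A device or application registers interest in one message type.
        self._subscribers[msg_type].append(handler)

    def publish(self, message: Message) -> int:
        # Deliver to every handler registered for this type; return the count.
        handlers = self._subscribers.get(message.msg_type, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
received: List[Message] = []
broker.subscribe("pose", received.append)

delivered = broker.publish(Message("tracker-1", "pose", {"x": 1.0, "y": 2.0, "z": 3.0}))
ignored = broker.publish(Message("anesthesia-1", "vitals", {"heart_rate": 72}))
print(delivered, ignored, len(received))
```

New consumers can be added by calling `subscribe` without changing any publisher, which is the flexibility referred to above; a production broker adds persistence, network transport and the associated delay.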

Finally, recording of the exchanged data requires distinct solutions as well. High-performance, high-reliability databases form an essential requirement for many modern businesses. Thanks to this demand, a large body of established techniques exists, from which users can select the right tool for their specific needs. Binary medical data (images, videos, etc.) can be stored on premise in modern PACSs, which provide extensive support for data annotation, storage and exchange. For clinical metadata, the selection of technology typically depends on the level of standardization of the recorded data. Highly standardized data can possibly be stored directly through interfaces of e.g. the IHE family of standards. If the target data are not standardized, but homogeneous, then a database model for classical database languages (e.g. Structured Query Language (SQL)) may be suitable. Use cases where a wide array of highly heterogeneous data is recorded may choose modern NoSQL databases. These databases do not (or not exclusively) rely on classical tabular data models, but instead allow the storage and querying of tree-like structures. The JavaScript Object Notation (JSON) Format is a popular choice for NoSQL databases for its wide support in toolkits and the immediate applicability with regard to Representational State Transfer (REST)-APIs. While initially applications of these databases were geared toward data lakes because of the relative ease of application, NoSQL databases have recently seen widespread application in big data and ML (Dasgupta, 2018). A notable example is Elasticsearch (Elastic NV, Amsterdam, the Netherlands), which has achieved widespread distribution and is ranked among the most used search servers (DB-Engines, 2020).
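
To make the notion of storing and querying tree-like structures concrete, the following Python sketch serializes two heterogeneous records as JSON and queries them by path. The records, field names and the helper `get_path` are hypothetical and stand in for the query facilities of an actual NoSQL database.

```python
import json

# Two hypothetical, structurally different records, as might arise when
# heterogeneous devices contribute data for different cases.
records = [
    {"case_id": "c1", "procedure": "cholecystectomy",
     "devices": {"insufflator": {"pressure_mmHg": 12}}},
    {"case_id": "c2", "procedure": "appendectomy",
     "tracking": {"tool": "grasper", "samples": 1200}},
]


def get_path(document, dotted_path, default=None):
    """Follow a dotted path such as 'a.b.c' through nested dictionaries."""
    node = document
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Round trip through JSON, as would happen when storing or using a REST API.
restored = [json.loads(json.dumps(record)) for record in records]

# Query: which cases carry an insufflator pressure reading?
with_pressure = [r["case_id"] for r in restored
                 if get_path(r, "devices.insufflator.pressure_mmHg") is not None]
print(with_pressure)
```

No shared table schema is needed for the two records, which illustrates the benefit for heterogeneous data, while the absence of a schema also means that missing fields must be handled explicitly at query time.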

Through the rising relevance of web technology, storing data in the cloud is increasingly becoming a viable option. A vast array of services are available and have been applied in the medical domain (e.g. Amazon Web Services (AWS) (Holmgren and Adler-Milstein, 2017), Microsoft Azure (Hussain et al., 2013), and others). Storing data in the cloud has the potential to save money on HIT by reducing the locally required storage capacity and maintenance personnel, but brings with it privacy concerns and slower local access to data than from local networks, which may be noticeable especially for large binary data like medical images and video streams. While data privacy options are available for all major services, the implementing personnel have to understand these options and align them with the privacy needs of the institution and the respective data. Since answering these questions is complex, the privacy requirements strict, and the consequences for failing to comply with the law severe, the created solutions are often conservative in nature with regard to privacy. Additionally, downloading large data sets may be costly, as in general, cloud storage providers incentivize performing computations in the cloud.

Finally, solutions to facilitate local storage have been proposed. Commercially available systems such as SCENARA®.STORE (KARL STORZ) compress surgical images and video data over time to decrease storage needs. Alternatively, SDS tools can be used to selectively store critical video sequences instead of entire procedural videos, as recently proposed (Mascagni et al., 2021b).

3.4. Current challenges and next steps

The infrastructure-related mission as well as the corresponding goals generated by the consortium as part of the Delphi process are provided in Table 2. This section elaborates on some of the most fundamental aspects:

How to enable prospective capturing and storing of relevant perioperative data? (goals 1.1/1.2):

A major challenge we face is to capture all relevant perioperative data. While several initiatives and standards are already dedicated to this problem, a particular focus should be put on the recording and integration of patient outcome measures, including measures that need to be captured long after the patient has left the hospital (e.g. 5-year-survival). The field of SDS stands in contrast to the field of radiology, where the DICOM standard now covers the exchange of medical images and related data. This standard can be seen as a direct result of market pressure: Early medical imaging devices did not prioritize communication standards, instead relying on manufacturer-supplied software specific to the hardware purchased. This behaviour did not change until PACSs became widespread, providing specialized software that offered a benefit to clinical workflows, and the ability to transmit images to them became a driving requirement for the purchase of new imaging hardware. However, the previously mentioned domain complexity also affects standard development. For example, the DICOM specification document alone consists of 6,864 pages2, indicating the effort to develop and maintain such a standard. Evolving standards for the exchange of medical data like IEEE 11073 SDC and HL7 FHIR are a step in the right direction, but in order to create a driving force, incentivizing the industry to enable widespread interconnection appears useful.

Storing acquired data is, in theory, largely possible with modern technologies. Missing, however, are standards for storage format, duration and data quality. These should be developed with the involvement of industrial stakeholders and the respective clinical/technical societies and should specifically include recommendations with respect to minimum standards for storage and annotation. The international Society of American Gastrointestinal and Endoscopic Surgeons (SAGES), for example, created an AI task force with the mission to propose and establish best practices for structured video data acquisition and storage, including recommendations for resolution and compression (Feldman et al., 2020). Generally speaking, a clear distribution of roles between different stakeholders, particularly regarding who takes the initiative, as well as a clear definition of the subject matter to be standardized are now needed.

How to link data from different sources and sites? (goal 1.3)

The need for exchanging data between different sources and sites calls for semantic interoperability (Section 3.3): Simply storing all data in a data lake without sufficient metadata management poses the risk of creating a data swamp that makes data extraction hard to impossible (Hai et al., 2016). Data distribution among several systems is a healthy approach since it reduces load on a single system and enables engineers to choose the system best suited for the specific types of data stored within. As long as metadata models (Gibaud et al., 2018; März et al., 2015; Soualmia and Charlet, 2016) exist that are able to sufficiently describe the data and where to find them, retrieval will be possible through querying the model. Accordingly, efforts should focus on enhancing current clinical information infrastructures from the level of syntactic interoperability to semantic interoperability. Metadata also becomes essential for data sharing. An increasingly popular approach to data sharing is federated learning (Konečný et al., 2016; Rieke et al., 2020). Instead of sharing data between institutions, the training of algorithms is distributed among participants. While this presumably reduces the ethical and legal complications associated with large-scale data sharing, it is still necessary to achieve semantic interoperability, and the regulatory issues regarding the exchange of models that contain encoded patient data are not fully understood yet.
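
The principle of federated learning, in which model parameters rather than data leave each institution, can be sketched with plain federated averaging on synthetic data. The Python example below is a minimal illustration under strong assumptions: a two-parameter linear model, noise-free synthetic data for two hypothetical sites, and no privacy-preserving mechanism. It is not the specific method of the cited works.

```python
def local_update(weights, data, lr=0.1, epochs=20):
    """Gradient descent on mean squared error for y ~ w0 + w1 * x.

    Runs entirely on one site's data; only the weights are returned.
    """
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        g0 = 2 * sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = 2 * sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0, w1 = w0 - lr * g0, w1 - lr * g1
    return [w0, w1]


def federated_average(client_weights, client_sizes):
    """Average client parameters, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]


# Synthetic data following y = 1 + 2x; each hypothetical site covers a
# different range of x and never shares its samples.
site_a = [(k / 10, 1 + 2 * k / 10) for k in range(0, 10)]
site_b = [(k / 10, 1 + 2 * k / 10) for k in range(10, 30)]

global_weights = [0.0, 0.0]
for _ in range(50):  # communication rounds
    updates = [local_update(global_weights, site) for site in (site_a, site_b)]
    global_weights = federated_average(updates, [len(site_a), len(site_b)])

print([round(w, 3) for w in global_weights])
```

Both sites must interpret `x` and `y` identically for the averaged model to be meaningful, which is the semantic interoperability requirement noted above. The exchanged weights are also derived from patient data, which relates to the open regulatory question mentioned in the text.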

How to perceive relevant tissue properties dynamically? (goal 1.4)

Surgical imaging modalities should provide discrimination of local tissue with a high contrast-to-noise-ratio, should be quantitative and digital, ideally be radiation- and contrast agent-free, enable fast image acquisition and be easy to integrate into the clinical workflow. The approach of registering 3D medical image data sets to the current patient anatomy for augmented reality visualization of subsurface anatomical details has proven ill-suited for handling tissue dynamics such as perfusion or oxygenation (e.g. for ischemia detection). The emerging field of biophotonics refers to techniques that take advantage of the fact that different tissue components feature unique optical properties for each wavelength. Specifically, spectral imaging uses multiple bands across the electromagnetic spectrum (Clancy et al., 2020) to extract relevant information on tissue morphology, function and pathology (see e.g. Wirkert et al. (2016); Moccia et al. (2018); Ayala et al. (2021)). Benefiting from a lack of ionizing radiation, low hardware complexity and easy integrability into the surgical workflow, spectral imaging could be leveraged to inform surgical operators directly or be used for the generation of relevant input for SDS algorithms (Mascagni et al., 2018). Open research questions are, among others, related to reproducibility of measurements, possible confounders in the data (Dietrich et al., 2021), inter-patient variability and the robust quantification of tissue parameters in clinical settings.

How to enable real-time inference in interventional settings? (goal 1.5)

While processing times of several seconds or even minutes may be acceptable in some scenarios, other SDS applications, such as autonomous robotics, require real-time inference. Real-time inference requires a number of complex prerequisites to be fulfilled. Relevant data needs to be streamed to a common end point where it can be processed; data streams need to be sufficiently formalized to enable fully automatic decoding; the hardware and networks receiving these streams must be sufficiently fast to decode the streams with minimal latency and high resilience, and the algorithms that provide inference need to be implemented efficiently and run on sufficiently fast hardware to enable real-time execution. If additional data (e.g. preoperative imaging, patient-specific data) is required, the algorithms need to be able to access this data, and inferred information needs to be relayed to the OR team in an adequate manner. These problems can potentially be addressed in a variety of ways; however, it seems prudent to integrate the necessary infrastructure (acquisition, computation, display) directly on site in or near the OR. In a first step, test environments such as experimental operating rooms can serve as platforms where technical concepts for real-time inference can be developed, validated and evaluated in a realistic setting.
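
One way to reason about these prerequisites is a per-frame latency budget. The Python sketch below assumes a simple, non-pipelined system in which each frame must be fully processed before the next one arrives; all stage names and latency values are hypothetical placeholders rather than measurements.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def meets_budget(stage_latencies_ms, fps):
    """Return the summed stage latency and whether it fits the frame budget."""
    total = sum(stage_latencies_ms.values())
    return total, total <= frame_budget_ms(fps)


# Hypothetical stage latencies (ms) for processing on site, near the OR.
stages_local = {"network": 2.0, "decode": 3.0, "inference": 8.0, "display": 2.0}
total_local, ok_local = meets_budget(stages_local, fps=60)

# Same pipeline, but with a hypothetical 50 ms network delay, in the range
# the text mentions for Internet or wireless connections.
stages_remote = dict(stages_local, network=50.0)
total_remote, ok_remote = meets_budget(stages_remote, fps=60)

print(total_local, ok_local)
print(total_remote, ok_remote)
```

Under these assumed values, the on-site configuration stays within the budget of roughly 16.7 ms per frame at 60 fps, whereas the remote configuration does not, which is consistent with the argument for placing infrastructure in or near the OR.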

How to overcome regulatory and political hurdles? (goal 1.6)

Timelines and associated costs of data privacy management (discussed further in Section 4.4) and regulatory processes need to be supported in both academic and commercial projects: Academic work requires funding and appropriate provision for delays in the project timeline. Notably, the COVID-19 pandemic may have stimulated rapid response from both academic and regulatory bodies in response to urgent needs, and perhaps some of this expedience will remain (examples in Continuous Positive Airway Pressure (CPAP) devices such as UCL-Ventura). Industry also needs to allocate costs, adhere to and maintain standards, cover liability and have clear expectations on the required resources. While these processes are well developed and supported in large organizations, smaller companies, in particular startups, have less capacity for them at their disposal. A variety of additional standards would also need to be met since a prospective SDS system approaches a medical device as defined by the U.S. Food and Drug Administration (FDA) (USA) or the Medical Device Regulation (MDR) (EU). These may be ISO-certified or require audits and approval from regulatory agencies and notified bodies, compliance with data protection regulations (e.g. GDPR), more stringent (cyber-)security features and testing adherence. As the field of AI and its regulation is increasingly discussed in public venues, political visibility is rising. By clearly identifying the limiting effects of insufficient infrastructure on the one hand, and potential benefits of improving it on the other, it should become possible to convince political and clinical stakeholders that an investment in HIT as well as dedicated data management and processing personnel is key to exploiting the potential of AI for interventional healthcare. Furthermore, industrial engagement in creating the necessary infrastructure needs to be fostered within the boundaries of global standardization while considering the specific market needs.
Healthcare institutions thus need to engage globally with industry to put forth common standards and processes enabling SDS applications compatible with strategic business needs. Of note, existing infrastructures can be leveraged and enhanced in this process. The SDS community should be aware of the complexity of the topic and the messages that are publicized (i.e. premature success stories) and create constructive proposals with realistic outlooks on potential benefits, focusing on long-term investments with the potential to drive change. Specifically, market studies could identify for each individual stakeholder the benefits of SDS solutions compared to their expected costs. Consider for instance a “number needed to treat” type of example, where for every X number of patients for which data insights are applied, one complication costing USD Y may be avoided. By providing estimated returns on investment for improvements to clinical delivery based on reducing person-hours, complications, or duplicative work, such studies would in turn provide key arguments for future investments.
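
The “number needed to treat” type of argument can be written out as a simple calculation. In the Python sketch below, X and Y correspond to the quantities named in the text, while the per-patient cost of the SDS solution and all numeric values are purely hypothetical placeholders chosen for illustration.

```python
def expected_saving_per_patient(patients_per_avoided_complication, complication_cost_usd):
    """If one complication costing Y is avoided per X patients, each patient
    contributes an expected saving of Y / X."""
    return complication_cost_usd / patients_per_avoided_complication


def net_benefit_usd(n_patients, patients_per_avoided_complication,
                    complication_cost_usd, solution_cost_per_patient_usd):
    """Expected savings minus the cost of applying the solution."""
    saving = expected_saving_per_patient(patients_per_avoided_complication,
                                         complication_cost_usd)
    return n_patients * (saving - solution_cost_per_patient_usd)


# Hypothetical placeholder values: X = 50 patients, Y = USD 20,000,
# solution cost of USD 150 per patient, applied to 1,000 patients.
X, Y = 50, 20_000
saving = expected_saving_per_patient(X, Y)
benefit = net_benefit_usd(1_000, X, Y, solution_cost_per_patient_usd=150)
print(saving, benefit)
```

In this simplified model, an investment breaks even when the per-patient cost of the solution equals Y / X; a real market study would additionally account for uncertainty in X and Y and for the other savings named in the text, such as reduced person-hours.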

Overall, local and international collaborations and partnerships involving clinical, patient, academic, industry and political stakeholders are needed (see Table 1). Policies and procedures regarding data governance within an institution have to be defined that involve all stakeholders within the SDS data lifecycle. Already existing multinational political entities or governing bodies, as exemplified by the EU, can be leveraged in a first step toward international collaboration and standardization. When implementing the goals put forth in Table 2, internationally agreed standards should be respected. These include, but are not limited to, ethical guidelines. In fact, the World Health Organization (WHO) recently put forth a guidance document on Ethics & Governance of Artificial Intelligence for Health (WHO, 2021), which was compiled by a multidisciplinary team of experts from the fields of ethics, digital technology, law and human rights, as well as experts from Ministries of Health. The report identifies the ethical challenges and risks associated with the use of AI in healthcare and puts forth several internationally agreed on best practices for both the public and the private sector.

4. Data annotation and sharing

The access to annotated data is one of the most important pre-requisites for SDS. There are different requirements that impact the quality of the annotated data sets. Ideally, they should include multiple centers to capture possible variations using defined protocols regarding acquisition and annotation, preferably linked to patient outcome. In addition, the data set has to be representative for the task to be solved and combined with well-defined criteria for validation and replication of results. Broadly, the key considerations when generating an annotated data set include reliability, accuracy, efficiency, scalability, cost, representativeness and correct specification.

4.1. Current practice

A comprehensive list of available curated data sets that are relevant to the field of SDS is provided in Appendix A. In general, they serve as a good starting point, but are still relatively small, often tied to a single institution, and extremely diverse in structure, nomenclature, and target procedure.

Surgical data such as video involves diverse annotations with different granularity depending on the clinical use case to be solved. Annotations can be spatial, temporal or spatio-temporal. Examples for spatial annotations include image-level classification (e.g. what tissue/tools/events are visible in an image), semantic segmentation (e.g. which pixels belong to which tissue/tools/events in an image) and numerical regression (e.g. what is the tissue oxygenation at a certain location). Temporal annotations involve the surgical workflow and can have different levels of granularity, e.g. surgical phases at the highest level, which consist of several steps, which are in turn composed of activities such as suturing or knot-tying (Lalys and Jannin, 2014). In addition, specific events such as complications, performance or quality assessment of specific tasks complement temporal annotations. Spatio-temporal annotations involve both spatial and temporal information. While simple annotation tasks such as labeling surgical instruments may be accomplished by non-experts (Maier-Hein et al., 2014), more complex tasks such as tissue labeling or quality assessment of anastomoses most likely require domain experts.
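
The hierarchy of temporal annotations (phases containing steps, steps containing activities) maps naturally onto a nested data structure. The following Python sketch is one possible representation; the class `Segment`, its fields and the example labels and frame ranges are assumptions made for illustration and do not follow any particular annotation standard.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    label: str
    level: str        # "phase", "step" or "activity"
    start_frame: int
    end_frame: int    # inclusive
    children: List["Segment"] = field(default_factory=list)

    def contains(self, other: "Segment") -> bool:
        return self.start_frame <= other.start_frame and other.end_frame <= self.end_frame

    def is_consistent(self) -> bool:
        # Every child must lie within its parent, recursively.
        return all(self.contains(c) and c.is_consistent() for c in self.children)


def labels_at(segment: Segment, frame: int) -> List[Tuple[str, str]]:
    """All (level, label) pairs active at a frame, from coarse to fine."""
    if not (segment.start_frame <= frame <= segment.end_frame):
        return []
    found = [(segment.level, segment.label)]
    for child in segment.children:
        found.extend(labels_at(child, frame))
    return found


# Hypothetical annotation of one video excerpt.
phase = Segment("example phase", "phase", 0, 999, [
    Segment("example step", "step", 100, 399, [
        Segment("suturing", "activity", 100, 299),
        Segment("knot-tying", "activity", 300, 399),
    ]),
])

print(phase.is_consistent())
print(labels_at(phase, 350))
```

A spatio-temporal annotation could extend such a segment with per-frame spatial information, for example a bounding box or segmentation mask, at the cost of a considerably larger annotation effort.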

The major bottleneck for data annotation in surgical applications is access to expert knowledge. Reducing the annotation effort is therefore of utmost importance, and various methods have been proposed. Crowdsourcing (Maier-Hein et al., 2014) has proven to be a successful method, but designing the task such that non-experts are able to provide meaningful annotations is still one of the biggest challenges. Recently, active learning approaches that determine which unlabeled data points would provide the most information and thus reduce the annotation effort to these samples have been proposed (Bodenstedt et al., 2019a). Similarly, error detection methods reduce the annotation effort to erroneous samples only (Lecuyer et al., 2020). Data can also be annotated directly during acquisition (Padoy et al., 2012; Sigma Surgical Corporation).
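
The idea behind active learning can be illustrated with generic uncertainty sampling: the samples on which a model is least certain are sent to the expert first. The Python sketch below ranks unlabeled frames by the entropy of their predicted class probabilities. The frame identifiers and probabilities are made up for illustration, and the selection criterion is a common textbook choice that is not necessarily the one used in the cited work.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_for_annotation(predictions, budget):
    """Return the ids of the `budget` most uncertain samples."""
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [sample_id for sample_id, _ in ranked[:budget]]


# Hypothetical model outputs for four unlabeled frames and three classes.
predictions = [
    ("frame_001", [0.98, 0.01, 0.01]),  # confident prediction
    ("frame_002", [0.40, 0.35, 0.25]),
    ("frame_003", [0.70, 0.20, 0.10]),
    ("frame_004", [0.34, 0.33, 0.33]),  # almost uniform, most uncertain
]

chosen = select_for_annotation(predictions, budget=2)
print(chosen)
```

With an annotation budget of two, the expert is asked to label the two frames with the least decisive predictions, while the confidently classified frame is left unlabeled.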

4.2. Key initiatives and achievements

One of the most successful initiatives fostering access to open data sets is Grand Challenge which provides infrastructure and tools for organizing challenges in the context of biomedical image analysis. The platform hosts several challenges including data sets and also serves as a framework for end-to-end development of ML solutions. Notably, the Endoscopic Vision Challenge EndoVis, an initiative that takes place at the international conference hosted by the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society, is the largest source of SDS data collections (Bernal et al., 2017; EndoVis’15 Instrument Subchallenge Dataset, 0000; EndoVis-GIANA, 0000; Allan et al., 2019; Hattab et al., 2020; ALHAJJ et al., 2021; Allan et al., 2020; Maier-Hein et al., 2021; Allan et al., 2021; EndoVis-Workflow and Skill, 0000; Roß et al., 2021b; Zia et al., 2021; Huaulmé et al., 2021; HeiSurf, 0000; GIANA21, 0000; CholecTriplet21, 0000; FetReg, 0000; PETRAW, 0000; SimSurgSkill). It consists of several sub-challenges every year which support the availability of new public data sets for developing and benchmarking methods. Generally speaking, however, quality control in biomedical challenges and data sharing is still an issue (Maier-Hein et al., 2018b; 2020).

The importance of public data sets in general is illustrated through new journals dedicated to only publishing high quality data sets, such as Nature Scientific Data. An important contribution in this context are the FAIR data principles (Wilkinson et al., 2016), already introduced in the context statement above. Recently, the Journal of the American Medical Association (JAMA) Surgery partnered with the Surgical Outcomes Club and launched a series consisting of statistical methodology articles and a checklist that aims to elevate the science of surgical database research (Haider et al., 2018). It also includes an overview of the most prominent surgical registries and databases, e.g. the National Cancer Database (Merkow et al., 2018), the National Trauma Data Bank (Hashmi et al., 2018) or the National Surgical Quality Improvement Program (Raval and Pawlik, 2018).

Annotation of data sets requires consistent ontologies for SDS. The OntoSPM project (Gibaud et al., 2014) is the first initiative focused on modeling the entities of surgical process models; LapOntoSPM (Katić et al., 2016a) is a derivation of it for laparoscopic surgery. OntoSPM is now organized as a collaborative action associating a dozen research institutions in Europe, with the primary goal of specifying a core ontology of surgical processes, thus gathering the basic vocabulary to describe surgical actions, instruments, actors, and their roles. An important endeavor that builds upon current initiatives was recently initiated by SAGES, which hosted an international consensus conference on video annotation for surgical AI. The goal was to define standards for surgical video annotation based on different working groups regarding temporal models, actions and tasks, tissue characteristics and general anatomy as well as software and data structure (Meireles et al., 2021).

4.3. Standards, platforms and tools

In SDS, images or video are typically the main data sources since they are ubiquitous and can be used to capture information at different granularities ranging from cameras observing the whole interventional room or suite to cameras inserted into the body endoscopically or observing specific sites through a microscope (Chadebecq et al., 2020). Different image/video annotation tools regarding spatial, temporal and spatio-temporal annotations already exist (Table C.1), but to date no gold standard framework enabling different annotation types combined with AI-assisted annotation methods exists in the field of SDS.

Consistent annotation requires well-defined standards and protocols taking different clinical applications into account. Current initiatives are working on the topic of standardized annotation, but no widely accepted standards have resulted from the efforts yet. Notable exceptions can be seen in the fields of skill assessment, where annotations have been required for a long time to rate students and can serve as an example for different kinds of SDS annotation protocols (Vedula et al., 2017), and in cholecystectomy, where methods for consistent assessment of photos (Sanford and Strasberg, 2014) and videos (Mascagni et al., 2020a) of the Critical View of Safety (CVS) were developed to favour documentation of this important safety step.

Data annotation also requires a consistent vocabulary, preferably modeled as an ontology (Section 3). Several relevant ontologies with potential use in surgery such as the Foundational Model of Anatomy (FMA), SNOMED CT or RadLex (Langlotz, 2006) are already available. Existing initiatives like the OBO Foundry project that focuses on biology and biomedicine provide further evidence that building and sharing interoperable ontologies stimulate data sharing within a domain. In biomedical imaging, ontologies have been successfully used to promote interoperability and sharing of heterogeneous data through consistent tagging (Gibaud et al., 2011; Smith et al., 2015).

The challenges and needs for gathering large-scale, representative and high-quality annotated data sets are certainly not limited to SDS. In response, a new industry branch has emerged offering online data set annotation services through large organized human workforces. A listing of the major companies is provided in Table C.2. Interestingly, in 2019 the market was estimated to grow to more than USD 1 billion by 2023 (Cognilytica, 2019), whereas the subsequent annual report in 2020 estimated that it would grow to more than USD 4.1 billion by 2024 (Cognilytica, 2020). Most companies recruit non-specialists who can perform conceptually simple tasks on image and video data, such as urban scene segmentation and pedestrian detection for autonomous driving. Recently, several companies such as Telus International (Vancouver, BC, CA) and Edgecase AI LLC (Hingham, MA, US) have started offering medical annotation services performed by networks of medical professionals. However, it is unclear to what extent medical image data annotation can be effectively outsourced to such companies, particularly in the case of surgical data, where important context information may be lost. Furthermore, the associated costs of medical professionals as annotators and annotation reviewers for quality assurance may render these services out of reach for many academic institutes and small companies.

4.4. Current challenges and next steps

The data annotation-related mission as well as corresponding goals generated by the consortium are provided in Table 3. This section elaborates on some of the most fundamental aspects:

How to develop standardized ontologies for surgical data science? (goal 2.1)

As current practices and standards differ greatly between different countries, clinical sites, and healthcare professionals, publicly available surgical data sets generally display vast variation in terms of their annotations. The field, however, is in need of standardized annotations based on a common vocabulary, which can be achieved by shared ontologies. For example, evaluating the efficacy of a particular procedure requires a standardized definition and nomenclature for the different hierarchy levels, e.g. the phases, steps/tasks and activities/actions. A standardized nomenclature along with specifics such as beginning and end of temporal events does not exist yet. Studies can help standardize these definitions and reach a consensus. This is for instance demonstrated by Kaijser et al. (2018), who conducted a Delphi consensus study to standardize the definitions of crucial steps in the common procedures of gastric bypass and sleeve gastrectomy. Such processes could be adopted for other domains, with the Delphi method being a particularly useful tool to agree on terminology. Once available and broadly adopted, a shared ontology would stimulate the community as well as boost data and knowledge exchange in the entire domain of SDS. Less formal options such as terminologies are an alternative but are likely to reach their limits in the long term.
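To make the hierarchy levels mentioned above more concrete, the following is a minimal sketch of how phases, steps and actions with explicit begin and end times could be represented and checked for consistent nesting. The class name `Interval`, its fields and the labels are assumptions made purely for illustration; they are not taken from OntoSPM or any other existing ontology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interval:
    """A temporal annotation with explicit begin and end (seconds into the video)."""
    label: str
    start_s: float
    end_s: float
    children: List["Interval"] = field(default_factory=list)

    def validate(self) -> None:
        """Check that the interval is well-formed and that children nest inside it."""
        if self.end_s <= self.start_s:
            raise ValueError(f"{self.label}: end must be after start")
        for child in self.children:
            if child.start_s < self.start_s or child.end_s > self.end_s:
                raise ValueError(f"{child.label} exceeds the bounds of {self.label}")
            child.validate()


# Hypothetical labels, chosen only for illustration.
action = Interval("grasp", 12.0, 15.5)
step = Interval("expose_target", 10.0, 40.0, [action])
phase = Interval("dissection", 0.0, 120.0, [step])
phase.validate()  # raises ValueError if the hierarchy is inconsistent
```

A shared ontology would fix the permitted labels and relations; a structural check of this kind only guards against temporally inconsistent annotations.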

How to account for biases? (goal 2.2)

Various sources and types of bias with potential relevance to SDS have been identified in the past (Ho and Beyan, 2020). Among the most critical are selection bias and confounding bias. Selection bias, also called sample bias, refers to a selection of contributing data in a way that does not allow for proper randomization or representativeness to be achieved. Crucially, in the context of SDS, representativeness refers to numerous factors including variances related to patients (e.g. age, gender, origin), the surgical procedure (e.g. adverse events), input data (e.g. device type, protocol; preprocessing methods), and surgeons (e.g. level of expertise). Creating a fully representative data set is thus highly challenging and only possible in a multi-center setting. Unrepresentative data, on the other hand, leads to biased algorithms. A recent study published in the context of radiological data science (Larrazabal et al., 2020), for example, showed that the performance of AI algorithms for a specific sex (e.g. female) crucially depends on the ratio of samples from the respective sex in the training data set. Another source of overestimation regarding algorithm performance is confounding bias. Confounding “arises when variables that are not mediators of the effect under study, and that can explain part or all of the observed association between the study exposure and the outcome, are not measured and controlled for during study design or analysis” (Arah, 2017). Recent work in biomedical image analysis (Badgeley et al., 2019; Roberts et al., 2021; Dietrich et al., 2021) showed that knowledge of confounding variables is crucial to the development of successful predictive models. Conversely, a striking recent example of a confounder rendering results meaningless can be seen in the many papers using a particular pneumonia data set as a control group in the development of COVID-19 detection and prognostication models. 
Since this data set solely consists of young paediatric patients, any model using adult COVID-19 patients and these patients as a control group would likely overperform merely by detecting children (Roberts et al., 2021). Other examples of confounders (also called hidden variables) are chest drains and skin markings in the context of pneumothorax (Oakden-Rayner et al., 2020) and melanoma diagnosis (Winkler et al., 2019). Recognizing and minimizing potential biases in SDS by enhancing data sets with, for example, relevant metadata is thus of eminent importance.
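One simple way to surface potential selection bias, provided that relevant metadata has been recorded, is to report both the share of each subgroup in a data set and the algorithm performance per subgroup. The sketch below assumes a hypothetical record layout with the keys `group`, `label` and `prediction`; the toy records are fabricated for illustration and do not represent any real data set.

```python
from collections import Counter, defaultdict


def subgroup_report(records):
    """Return {group: (share_of_samples, accuracy)} for a list of dict records."""
    counts = Counter(r["group"] for r in records)
    correct = defaultdict(int)
    for r in records:
        correct[r["group"]] += int(r["label"] == r["prediction"])
    total = len(records)
    return {g: (counts[g] / total, correct[g] / counts[g]) for g in counts}


# Toy records, fabricated for illustration only.
records = [
    {"group": "female", "label": 1, "prediction": 1},
    {"group": "female", "label": 0, "prediction": 1},
    {"group": "male", "label": 1, "prediction": 1},
    {"group": "male", "label": 0, "prediction": 0},
    {"group": "male", "label": 1, "prediction": 1},
    {"group": "male", "label": 0, "prediction": 0},
]
report = subgroup_report(records)
for group, (share, accuracy) in sorted(report.items()):
    print(f"{group}: share={share:.2f}, accuracy={accuracy:.2f}")
```

Such a report does not remove bias, and it cannot reveal confounders that were never recorded, which is one reason why enriching data sets with metadata matters.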

How to make data annotation more efficient? (goal 2.3)

Overcoming the lack of experienced observers might be possible through embedding clinical data annotation in the education and curricula of medical students. In fact, early evidence suggests that annotating surgical skills during video-based training improves the learning experience (De La Garza et al., 2019). The annotation process could also involve several stages, starting with annotations by non-experts that are reviewed by experts. In a similar fashion, active learning methods reduce the annotation effort to the most uncertain samples (Bodenstedt et al., 2019a; Maier-Hein et al., 2016). An alternative approach to overcome the lack of annotated data sets is to generate realistic synthetic data based on simulations. A challenge in this context is to bridge the domain gap, so that models trained on synthetic data generalize well to real data. Promising approaches already studied in the context of SDS are for example generative adversarial networks (GANs) for image-to-image translation of laparoscopic images (Pfeiffer et al., 2019; Rivoir et al., 2021) or transfer learning-based methods for physiological parameter estimation (Wirkert et al., 2017). In the context of photoacoustic imaging, recent work has further explored the GAN-based generation of plausible tissue geometries from available imaging data (Schellenberg et al., 2021).
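The idea of restricting annotation effort to the most uncertain samples can be sketched with generic entropy-based uncertainty sampling. This is an illustrative assumption and not the specific method of the works cited above; the function names and the frame identifiers are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a discrete probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Return the `budget` sample ids whose predicted class distribution is most uncertain.

    predictions: dict mapping sample id -> list of class probabilities.
    """
    ranked = sorted(predictions, key=lambda sid: entropy(predictions[sid]), reverse=True)
    return ranked[:budget]


# Hypothetical model outputs for three unlabeled video frames.
predictions = {
    "frame_001": [0.98, 0.02],
    "frame_002": [0.55, 0.45],
    "frame_003": [0.70, 0.30],
}
chosen = select_for_annotation(predictions, budget=2)
print(chosen)
```

In this toy case the confidently classified frame is skipped and the two ambiguous frames are forwarded to the annotators.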

How to establish common standards, protocols and best practices for quality-assured data annotation? (goals 2.3–2.6/2.9)

Standardized open-source protocols that include well-defined guidelines for data annotation are needed to provide accurate labels. Ideally, the annotations should be generated by multiple observers and the protocol should be defined to reduce inter-observer variability and bias. A recent study in the context of CT image annotation concluded that more than three annotators might be necessary to establish a reference standard (Joskowicz et al., 2019). Comprehensive labeling guides and extensive training are necessary to ensure consistent annotation. Shankar et al. (2020), for example, proposed a 400-page labeling guide in the context of ImageNet annotations to reduce common human failure modes such as fine-grained distinction of classes. In SDS, a protocol with checklists and examples on how to consistently segment hepatocystic anatomy and assess the CVS in laparoscopic cholecystectomy was recently published to favour reproducibility and trust in the clinical relevance of annotations (Mascagni et al., 2021a). Such detailed annotation protocols and extensive user training supported by adequate training material are now required. However, establishing annotation guides for surgical video data is a particularly challenging task since it involves complex actions that require understanding of the surgical intent based on visual cues. In particular, temporal annotations such as phase transitions are often challenging as the start and end of a specific phase are hard to define. Ward et al. (2021) provide a comprehensive list of challenges associated with surgical video annotation. Taking into account the variety of surgical techniques, this may lead to annotation inconsistencies even amongst experts, but these could also be used as a hint to estimate the difficulty associated with a surgical situation (Ward et al., 2021). In this context, research on the needs with respect to data and annotation quality in the context of the clinical goals is also required. 
As data sets and annotations evolve over time, another aspect to be taken into account involves versioning of data sets and annotations, similar to code, which is a non-trivial task (Marzahl et al., 2021). For all tasks related to data annotation, it will be prudent to establish and enforce best practices, e.g. in the form of standardized annotation protocols, that can easily be integrated into the surgical workflow. Once these are established, adherence to best practices could be increased by journal editors explicitly requesting annotation protocols to be submitted along with a respective paper that is based on annotated data. Journals could also allow for the explicit publication of annotation protocols in analogy to study protocols. Finally, platforms that enable spatial as well as temporal annotation in a collaborative manner and share common annotation standards and protocols as well as ML-based methods to facilitate automatic annotations are crucial. One means is to adapt already existing annotation platforms (see Table C.1) to fit the specific needs of SDS. Funding agencies should explicitly support efforts to make progress in this regard. Overall, a particularly promising approach to generating progress with respect to annotation standards is to start from the respective societies, such as SAGES. Alternatively or additionally, international working groups, similar to the one developing the DICOM standard, should be established. Such working groups should collaborate with existing initiatives, such as DICOM or HL7. In the end, standards will only be successful if enough resources are invested into the actual data annotation. In this case various non-monetary incentives should be considered, including gamification and the issuing of certificates (e.g. for Certified Professional for Medical Data Annotation in analogy to Certified Professional for Medical Software).
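Inter-observer variability, as discussed above, can be quantified once two observers have labeled the same items. A minimal sketch using Cohen's kappa, which corrects the observed agreement for agreement expected by chance, is given below. The phase labels are hypothetical, and for more than two annotators a generalization such as Fleiss' kappa would be needed.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical per-frame phase labels from two annotators.
annotator_1 = ["P1", "P1", "P2", "P2", "P3", "P3"]
annotator_2 = ["P1", "P1", "P2", "P3", "P3", "P3"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

Reporting such an agreement score alongside an annotation protocol is one possible way to document how consistently the protocol can be applied.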

How to incentivize and facilitate data sharing across institutions? (goals 2.7–2.9)

Data anonymization is a key enabler for sharing medical data and advancing the SDS field. By definition, anonymized data cannot be traced back to the individual and in both the USA and EU, anonymized data are not considered personal data, rendering them out of the scope of privacy regulation such as the GDPR. However, achieving truly anonymized data is usually difficult, especially when multiple data sources from an individual are linked in one data set. Removing identifiable metadata such as sensitive DICOM fields linking the patient to the medical image is necessary but not always sufficient for anonymization. For example, removing DICOM fields in a magnetic resonance imaging (MRI) scan of a patient’s head is not sufficient because the individual may be identified from the image data through facial recognition (Schwarz et al., 2019). Full anonymization also has the drawback that potential existing biases in data sets become difficult to identify. Pseudonymization is a weaker form of anonymization where data cannot be attributed to an individual unless it is linked with other data held separately (European Parliament and Council of European Union, 2016). This is often easier to achieve than true anonymization; however, pseudonymized data are still defined as personal data and as such remain within the scope of the GDPR. The public data sets used in SDS research such as endoscopic videos recorded within the patient’s body are generally assumed to be anonymized but clear definitions and regulatory guidance are needed. Recent advances in federated learning could reduce security and privacy concerns since they rely on sharing machine learning models rather than the data itself (Kaissis et al., 2020) (see Section 3). A complementary strategy for bypassing current hurdles related to data sharing is data donation. 
Medical Data Donors e.V., for example, is a registered German non-profit organization, designed to build a large annotated image database which will serve as a basis for medical research. It can be supported by the public via donation of medical imaging data or by shopping at Amazon Smile. In the broader context of data donation, the SDS initiative discussed the concept of a data donor card in analogy to the existing organ donor card. With such a card, patients could explicitly state which kind of data they are willing to share with whom and under which circumstances. Overall, making progress on large public databases will require establishing an interlocking set of standards, technical methods, and data analysis tools tied to metrics to support reproducible SDS (Nichols et al., 2017) and provide value for the community. Clinical registries provide a good example of such a mechanism. In a registry, a specific area of practice agrees on data to be shared, outcome measures to be assessed, and standardized formats as well as quality measures for the data (Arts et al., 2002). Identifying areas of SDS where the value proposition exists to drive the use of registries would provide much-needed impetus to create data archives. So would creating more monetary and non-monetary incentives for institutions, clinical staff and patients to share and annotate data, although particularly the issue of incentivizing patients to share data presents an ethical gray area.
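The distinction drawn above between removing identifying metadata and pseudonymization can be illustrated with a minimal sketch. It assumes that metadata is available as a plain dictionary keyed by DICOM-style attribute names; the short list of identifying fields and the key handling are illustrative assumptions and do not constitute a complete de-identification profile. As noted above, such a step is necessary but not sufficient, since the image content itself may remain identifying.

```python
import hashlib
import hmac

# Illustrative subset only; real de-identification profiles list many more attributes.
IDENTIFYING_FIELDS = {"PatientName", "PatientBirthDate", "PatientAddress"}


def pseudonymize(record, secret_key):
    """Drop direct identifiers and replace PatientID by a keyed hash (pseudonym).

    The pseudonym can only be re-linked to the patient by a party that holds
    both the secret key and the original identifiers, which are kept separately.
    """
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    digest = hmac.new(secret_key, record["PatientID"].encode("utf-8"), hashlib.sha256)
    cleaned["PatientID"] = digest.hexdigest()[:16]
    return cleaned


# Fabricated example record.
record = {
    "PatientID": "12345",
    "PatientName": "Doe^Jane",
    "PatientBirthDate": "19700101",
    "Modality": "MR",
}
key = b"hypothetical-key-stored-separately-from-the-data"
pseudonymized = pseudonymize(record, key)
print(sorted(pseudonymized))
```

Because the same key always yields the same pseudonym, records from one patient can still be linked across data sources, which is precisely why pseudonymized data remain personal data.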

5. Data analytics

Data analytics (addressing the interpretation task in Fig. 1) is often regarded as the core of any SDS system. The perioperative data is processed to derive information addressing a specific clinical need, where applications may range from prevention and training to interventional diagnosis, treatment assistance and follow-up (Maier-Hein et al., 2017).

5.1. Current practice

Surgical practice has traditionally been based on observational learning, and decision making before, during and after surgical procedures highly depends on the domain knowledge and past experiences of the surgical team (Maier-Hein et al., 2017). SDS has the potential to initiate a paradigm shift with a data-driven approach (Hager et al., 2020; Vercauteren et al., 2020). Bishop and others classify data analytics tools as descriptive, diagnostic, predictive, and prescriptive (Bishop, 2006; Tukey, 1977):

Descriptive analytics tools - what happened?

Descriptive analytics primarily provide a global, comprehensive summary of data made available through data communication such as simple reporting features. Syus’ Periop Insight (Syus, Inc., Nashville, TN, USA) is an example of how descriptive analytics are used to access data, view key performance metrics, and support operational decisions through documentation and easy interpretation of historical data on supply costs, delays, idle time etc., relating overall operating room efficiency and utilization. Business Intelligence (BI) (Chen et al., 2012) tools are a typical form of descriptive analysis tools which comprise an integrated set of IT tools to transform data into information and then into knowledge, and have been used in healthcare settings (Ward et al., 2014) (e.g. Sisense (Sisense Ltd., New York City, NY, USA), Domo (Domo, Inc., American Fork, UT, USA), MicroStrategy (MicroStrategy Inc., Tysons Corner, VA, USA), Looker (Looker Data Sciences Inc., Santa Cruz, CA, USA), Microsoft Power BI (Microsoft Corporation, Redmond, WA, USA) and Tableau (Tableau Software Inc., Seattle, WA, USA)). These tools often incorporate features such as interactive dashboards (Upton, 2019) that provide customized graphical displays of key metrics, historical trends, and reference benchmarks and can be used to assist in tasks such as surgical planning, personalized treatment, and postoperative data analysis.

Diagnostic analytics tools - why did it happen?

Diagnostic analytics tools, on the other hand, explore the data, address the correlations and dependencies between variables, and focus on interpreting the factors that contributed to a certain outcome through data discovery and data mining. These tools can facilitate the understanding of complex processes and reveal relationships between variables, or find root causes. For example, clinicians can use data on postoperative care to assess the effectiveness of a treatment (Bowyer and Royse, 2016; Kehlet and Wilmore, 2008).

Predictive and prescriptive analytics tools - What will happen? How can we make it happen?

Predictive analytics uses historical data, performs an in-depth analysis of historical key trends, underlying patterns and correlations, and uses the insights gained to make predictions about what will likely happen next (What will happen?). Prescriptive analytics complement predictive analytics by offering insights into what actions can be taken to achieve target outcomes (How can we make it happen?). ML can meet these needs, but the challenges specific to surgery are manifold, as detailed in Maier-Hein et al. (2017). Importantly, the preoperative, intraoperative and postoperative data processed are potentially highly heterogeneous, consisting of 2D/3D/4D imaging data (e.g. diagnostic imaging data), video data (e.g. from medical devices or room cameras), time series data (e.g. from medical devices or microphones), and more (e.g. laboratory results, patient history, genome information). Furthermore, while the diagnostic process follows a rather regular flow of data acquisition, the surgical process varies significantly and is highly specific to patient and procedure. Finally, team dynamics play a crucial role. In fact, several studies have demonstrated a correlation between nontechnical skills, such as team communication, and technical errors during surgery (Hull et al., 2012). While first steps have been taken to apply ML in open research problems with applications ranging from decision support (e.g. determining surgical resectability (Marcus et al., 2020)) to data fusion for enhanced surgical vision (e.g. Akladios et al. (2020)), and OR logistics (e.g. Twinanda et al. (2019); Bodenstedt et al. (2019b); Hager et al. (2020)), the vast majority of research has not yet made it to clinical trial stages. Section 5.4 highlights several challenges that need to be addressed in order to effectively adopt ML as an integral part of surgical routine.

5.2. Key initiatives and achievements

This section reviews some key initiatives and achievements from both an industrial and an academic perspective.

Industry initiatives:

Commercial platforms and projects have conventionally focused on analysing multidimensional patient data for clinical decision-making - primarily outside the field of surgery. The most widely discussed initiative so far is probably IBM® Watson Health® (International Business Machines Corporation (IBM), Armonk, NY, USA), which initiated several projects such as Watson Medical Sieve, Watson For Oncology or Watson Clinical Matching that apply the Watson cognitive computing technology to different challenges in healthcare (Chen et al., 2016). The goal of Watson Medical Sieve, for example, is to filter relevant information from patient records consisting of multimodal data to assist clinical decision making in radiology and cardiology. Watson Clinical Matching finds clinical studies that match the conditions of individual patients. With its vast capability to reach patient records and medical literature, Watson was believed to be the future of medicine. However, after it was put to use in the real world, it quickly became clear that the powerful technology has its limitations, as reported by Strickland: It performed poorly in India for breast cancer, where only 73% of the treatment recommendations were in concordance with the experts. Another critical example is the Watson-powered Oncology Expert Advisor which had only around 65% accuracy in extracting time-dependent information like therapy timelines from text documents in medical records (Strickland, 2019). Despite its limitations, Watson Health has shown to be efficient in certain, narrow and controlled applications. For example, Watson for Genomics is used by genetics labs that generate reports for practicing oncologists. Given the information on a patient’s genetic mutations, it can generate a report that describes all relevant drugs and clinical trials (Strickland, 2019). 
Other companies, societies and initiatives, such as Google (Mountain View, CA, USA) DeepMind Health (Graves et al., 2016; Tomašev et al., 2019), Intel (Santa Clara, CA, USA) (Healthcare IT News, 2012) and the American Society of Clinical Oncology (ASCO) CancerLinQ® (Sledge et al., 2013) have also been focusing on clinical data, and industrial success stories in surgery at scale are still lacking, as detailed in Section 6.

Academic initiatives:

In academia, interdisciplinary collaborative large-scale research projects have developed data analytics tools to address different aspects of SDS. The Transregional Collaborative Research Center “Cognition Guided Surgery” focused on the development of a technical-cognitive assistance system for surgeons that explores new methods for knowledge-based decision support for surgery (März et al., 2015) as well as intraoperative assistance (Katić et al., 2016b). First steps toward the operating room of the future have recently been taken, focusing on different aspects like advanced imaging and robotics, multidimensional data modelling, acquisition and interpretation, as well as novel human-machine interfaces for a wide range of surgical and interventional applications (e.g. Brigham and Women’s Hospital (BWH), Computer-Integrated Surgical Systems and Technology (CISST) Engineering Research Center, Hamlyn Centre, University College London (UCL), Innovation Center Computer Assisted Surgery (ICCAS), IHU Strasbourg, National Center for Tumor Diseases Dresden (NCT/UCC) and National Center for Tumor Diseases Heidelberg).

Broadly speaking, much of the academic work in SDS is currently focusing on the application of ML methods in various contexts (Navarrete-Welton and Hashimoto, 2020; Zhou et al., 2019b; Alapatt et al., 2020), but clinical impact remains to be demonstrated (see Section 6).

5.3. Standards, platforms and tools

A broad range of software tools are used by the SDS community each day, reflecting the interdisciplinary nature of the field. Depending on the SDS application, tools may be required from the following technical disciplines that intersect with SDS: classical statistics, general ML, deep learning, data visualization, medical image processing, registration and visualization, computer vision, natural language processing (NLP), signal processing, surgery simulation, surgery navigation and augmented reality (AR), robotics, BI and software engineering. Many established and emerging software tools exist within each discipline and a comprehensive list would be vast and continually growing. In Table B.3, we have listed software tools that are commonly used by SDS practitioners today, organized by the technical disciplines mentioned above. In this section, we focus on ML frameworks and the regulatory aspects of software development for SDS.

ML frameworks and model standards:

ML is today one of the central themes of SDS analytics, and many frameworks are used by the SDS community. The scikit-learn library in Python is the most widely used framework for ML-based classification, regression and clustering using non-DL models such as Support Vector Machines (SVMs), decision trees and multi-layer perceptrons (MLPs). DL, the sub-field of ML that uses Artificial Neural Networks (ANNs) with many hidden layers, has exploded over the past 5 years, also due to the mature DL frameworks. The dominating open-source frameworks today are TensorFlow by Google and PyTorch by Facebook (Menlo Park, CA, USA). These provide mechanisms to construct, train and test ANNs with comprehensive and ever-growing APIs and they are backed up by large industrial investment and community involvement. Other important, but less widely used frameworks include Caffe, Caffe2 (now a part of PyTorch), Apache MXNet, Flux, Chainer, MATLAB’s Deep Learning Toolbox and Microsoft’s CNTK. Wrapper libraries have been constructed on top of several frameworks with higher level APIs that simplify DL model design and promote reusable components. These include TensorFlow’s Keras (now native to TensorFlow), TensorLayer, TFLearn, fastai and NiftyNet (specifically for medical image data), and PyTorch’s TorchVision (Nguyen et al., 2019). Other useful tools include training progress visualization with Tensorboard, and AutoML systems for efficient automatic hyperparameter and model architecture search, such as H2O, auto-sklearn, AutoKeras and Google Cloud AutoML. NVIDIA DIGITS takes framework abstraction a step further with a web application to train DL models for image classification, segmentation and object detection, and a graphical user interface (GUI) suitable for non-programmers. Such tools are relevant in SDS where clinical researchers can increasingly train standard DL models without any programming or ML experience (Faes et al., 2019). 
On the one hand this is beneficial for technology democratization, but on the other hand it elevates known risks of treating ML and DL systems as “black boxes” (PHG Foundation, 2020). Recently NVIDIA has released NVIDIA Clara, a software infrastructure to develop DL models specifically for healthcare applications with large-scale collaboration and federated learning.

Each major framework has its own format for representing and storing ML models and associated computation graphs. There are now efforts to standardize formats to improve interoperability, model sharing, and to reduce framework lock-in. Examples for sharing models include the Neural Network Exchange Format (NNEF), developed by the Khronos Group with participation from over 30 industrial partners, Open Neural Network Exchange (ONNX) and Apple’s (Cupertino, CA, USA) Core ML. For sharing source code to train and test these models, GitHub is undeniably the most important platform; it is used extensively by SDS practitioners and greatly helps to promote research code reusability and reproducibility. “Model Zoos” (e.g. Model Zoo, ONNX Model Zoo) are also essential online tools to allow easy discovery and curation of many of the landmark models from research literature.

Regulatory software standards:

The usual research and development pipeline for an SDS software involves software developed at various stages including data collection and curation, model training, model testing, application deployment, distribution, monitoring, model improvement, and finally a medically approved product. For the classification as a medical product, the intended purpose by the manufacturer is more decisive than the functions of the software. Software is a “medical device software” (or “software as a medical device” (SaMD)) if “intended to be used, alone or in combination, for a purpose as specified in the definition of a medical device in the medical devices regulation or in vitro diagnostic medical devices regulation” (MDCG 2019–11), i.e. if intended to diagnose, treat or monitor diseases and injuries. The manufacturer of an SDS software application as SaMD needs to ensure that the safety of the product is systematically guaranteed, prove that they have sufficient competencies to ensure the relevant safety and performance of the product according to the state of the art (and keep evidence for development, risk management, data management, verification and validation, postmarket surveillance and vigilance, service, installation, decommissioning, customer communication, monitoring applicable new or revised regulatory requirements).

Yet, ML-based software requires particular considerations (Gerke et al., 2020). For example, the fact that models can be improved over time with more training data (often called the “virtuous cycle”) is not well handled by these established standards. In 2019, the FDA published a “Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD)”, specifically aimed to clarify this subject (FDA, 2019). In contrast to the previously “locked” algorithms and models, this framework formulates requirements on using Continuous Learning Systems (CLS) and defines a premarket submission to the FDA when the AI/ML software modification significantly affects device performance, or safety and effectiveness; the modification is to the device’s intended use; or the modification introduces a major change to the SaMD algorithm. The implementation of these requirements, especially with regard to the actual product development, is an unsolved problem.

5.4. Current challenges and next steps

The data analytics-related mission as well as corresponding goals generated by the consortium are provided in Table 4. This section elaborates on the most important research questions from a ML methodological perspective:

How to ensure robustness and generalization? (goal 3.1)

Models trained on the data from one clinical site may not necessarily generalize well to others due to variability in devices, individual practices of the surgical team or the patient demographic. While data augmentation (Itzkovich et al., 2019) can address this issue to some extent, an alternative promising approach is to develop architectures designed to generalize across domains. Early approaches focused on domain adaptation (Heimann et al., 2013; Wirkert et al., 2017) or more generically transfer learning (Pan and Yang, 2010) to compensate for domain shifts in the data. Other attempts have focused on converting data into a domain-invariant representation and on decoupling generic task-relevant features from domain-specific ones (Dai et al., 2017; Mitchell, 2019; Sabour et al., 2017; Sarikaya and Jannin, 2020). Generally speaking, however, ML methods trained in a specific setting (e.g. hospital) still tend to fail to generalize to new settings.

How to improve transparency and explainability? (goal 3.2)

The WHO document on Ethics & Governance of Artificial Intelligence for Health (WHO, 2021) (see Section 3) states that “AI technologies should be intelligible […] to developers, medical professionals, patients, users and regulators” and that “two broad approaches to intelligibility are to improve the transparency of AI technology and to make AI technology explainable”. In this context, transparency also relates to the requirement that “sufficient information be published or documented before the design or deployment of an AI technology and that such information facilitate meaningful public consultation and debate on how the technology is designed and how it should or should not be used”. Explainability stems from the urge to understand why an algorithm produced a certain output. In fact, the complexity of neural network architectures with typically millions of parameters makes it difficult for humans to understand how these models reach their conclusions (Reyes et al., 2020). As a result, the EU’s GDPR, implemented in 2018, also discourages the use of black-box approaches, thus providing explicit motivation for the development of models that provide human-interpretable information on how conclusions were reached. Interpretable models are still in their infancy and are primarily studied by the ML community (Adebayo et al., 2018; Bach et al., 2015; Koh and Liang, 2017; Shrikumar et al., 2017). These advances are being adopted within medical imaging communities in applications that are used to make a diagnosis (e.g. detecting/segmenting cancerous tissue, lesions on MRI data) (Gallego-Ortiz and Martel, 2016), and to generate reports that are on a par with human radiologists (Gale et al., 2018), for example. Open research questions are related to how to validate the explanation of the models (lack of ground truth) and how to best communicate the results to non-experts. A concept related to explainability is causality.
To date, it is generally unknown how a given intervention or change is likely to affect outcome, which is influenced by many factors even beyond the surgeon and the patient. Furthermore, randomized controlled trials (RCTs) to evaluate surgical interventions are difficult to perform (McCulloch et al., 2002). Thus, it is hard to provide the same quality of evidence and understanding of surgery as, for example, for a drug treating a common non-life-threatening condition (Hager et al., 2020). While large-scale data may help reveal relationships among many factors in surgery, correlation does not equal causation. Recent work on causal analysis (Peters et al., 2017; Schölkopf, 2019; Castro et al., 2020), however, may help in this regard.

How to address data sparsity? (goal 3.3)

One of the most crucial problems in SDS is data sparsity (see Section 2), which is strongly linked to the lack of robustness and generalization capabilities of algorithms. Several complementary approaches have been proposed to address this bottleneck. These include crowdsourcing (Maier-Hein et al., 2014; 2015; Malpani et al., 2015; Heim et al., 2018; Albarqouni et al., 2016; Maier-Hein et al., 2016) and synthetic data generation (Pfeiffer et al., 2019; Ravasio et al., 2020; Wirkert et al., 2017; Rivoir et al., 2021) briefly mentioned above. Unlabeled data can also be exploited by using self-supervised (see e.g. (Ross et al., 2018)) and semi-supervised learning (see e.g. (Yu et al., 2019; Srivastav et al., 2020)). Self-supervised methods solve an alternate, pretext or auxiliary task, the result of which is a model or representation that can be used in the solution of the original problem. Semi-supervised methods can exploit the unlabeled data in many different ways. In (Yu et al., 2019; Srivastav et al., 2020), for example, pseudo-annotations are generated on the unlabeled data using a teacher model, and the resulting pseudo-annotated dataset is then used to train another (student) model. Recent studies have further shown that exploiting the relationship across different tasks with the concept of multi-task learning (Twinanda et al., 2017) may be used to address data sparsity as well. It has been demonstrated to be beneficial to jointly reason across multiple tasks (Kokkinos, 2017; Long et al., 2017; Yao et al., 2012; Sarikaya et al., 2018) and take advantage of a combination of shared and task-specific representations (Misra et al., 2016). However, the performance of some tasks may also worsen through such a paradigm (Kokkinos, 2017). A possible solution to this problem might lie in the approach of attentive single-tasking (Maninis et al., 2019).
Finally, meta-learning (Vanschoren, 2018; Godau and Maier-Hein, 2021) and more generally lifelong learning (Parisi et al., 2019) are further potential paradigms for addressing the problem of data sparsity in the future. Progress in this field will, at any rate, crucially depend on the availability of more public multi-task data sets, such as described by Maier-Hein et al. (2021).
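The teacher-student pseudo-annotation idea described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the nearest-centroid classifier, the one-dimensional features, the confidence threshold `min_confidence` and all data values are hypothetical and are not taken from the cited studies, which train deep networks on surgical images. The sketch also assumes that the student is trained on the union of labeled and pseudo-labeled samples.

```python
import random


def centroid_classifier(points, labels):
    """Fit a nearest-centroid classifier on 1-D features and return a predict function."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: sums[y] / counts[y] for y in sums}

    def predict(x):
        # Rank classes by distance to their centroid; the relative margin between
        # the two closest centroids serves as a crude confidence in [0, 1].
        ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
        if len(ranked) == 1:
            return ranked[0], 1.0
        d0 = abs(x - centroids[ranked[0]])
        d1 = abs(x - centroids[ranked[1]])
        margin = (d1 - d0) / (d1 + d0) if (d1 + d0) > 0 else 0.0
        return ranked[0], margin

    return predict


def pseudo_label(teacher, unlabeled, min_confidence=0.5):
    """Keep only the unlabeled samples on which the teacher is sufficiently confident."""
    kept_x, kept_y = [], []
    for x in unlabeled:
        label, confidence = teacher(x)
        if confidence >= min_confidence:
            kept_x.append(x)
            kept_y.append(label)
    return kept_x, kept_y


# Hypothetical data: four labeled samples and one hundred unlabeled ones.
rng = random.Random(0)
labeled_x, labeled_y = [0.1, 0.3, 2.9, 3.2], [0, 0, 1, 1]
unlabeled_x = [rng.gauss(0.0, 0.4) for _ in range(50)]
unlabeled_x += [rng.gauss(3.0, 0.4) for _ in range(50)]

teacher = centroid_classifier(labeled_x, labeled_y)
pseudo_x, pseudo_y = pseudo_label(teacher, unlabeled_x)
# The student sees the original labels plus the teacher's confident pseudo-labels.
student = centroid_classifier(labeled_x + pseudo_x, labeled_y + pseudo_y)
```

Filtering by teacher confidence is one common design choice for limiting the propagation of teacher errors to the student; other selection or weighting schemes are equally possible.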

How to detect, represent and compensate for uncertainties and biases? (goal 3.4)

A common criticism of ML-based solutions is the way that they handle “anomalies”. If a measurement is out-of-distribution (ood; i.e. it does not resemble the training data), the algorithm cannot make a meaningful inference, and the probability of failure (error) is high. This type of epistemic uncertainty (Kendall and Gal, 2017) is particularly crucial in medicine as not all anomalies/pathologies can be known beforehand. As a result, current work is dedicated to this challenge of anomaly/novelty/ood detection (Adler et al., 2019). Even if a sample is in the support of the training distribution, a problem may not be uniquely solvable (Ardizzone et al., 2018) or the solution may be associated with high uncertainty. Further research has therefore been directed at estimating and representing the certainty of AI algorithms (Adler et al., 2019; Nölke et al., 2021). Future work should focus on making use of the uncertainty estimates in clinical applications and increasing the reliability of ood methods, as well as systematically understanding and addressing the issue of biases and confounders (see Section 4.4). In this context the increased involvement of statisticians and experts from clinical epidemiology, such as in the biomedical image analysis initiative (Maier-Hein et al., 2020; Roß et al., 2021a), would be desirable. Adopting the necessity of reporting data biases and confounders in publications should be a natural progression for the field of SDS.
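To make the notion of an out-of-distribution sample concrete, the following Python sketch flags a measurement whose features deviate strongly from the training statistics. It is a deliberately simple stand-in based on per-feature z-scores and is not the method used in the cited work; the feature values and the threshold of 4.0 are hypothetical.

```python
from statistics import mean, stdev


def fit_feature_stats(train_rows):
    """Return (mean, standard deviation) for every feature column of the training data."""
    columns = list(zip(*train_rows))
    return [(mean(column), stdev(column)) for column in columns]


def ood_score(row, stats):
    """Largest absolute z-score over all features; higher means less similar to training data."""
    return max(abs(x - m) / max(s, 1e-9) for x, (m, s) in zip(row, stats))


def is_ood(row, stats, threshold=4.0):
    """Flag a sample as out-of-distribution if any feature lies far outside the training range."""
    return ood_score(row, stats) > threshold


# Hypothetical two-feature training set.
train = [(0.9, 10.2), (1.1, 9.8), (1.0, 10.0), (0.95, 10.1), (1.05, 9.9)]
stats = fit_feature_stats(train)
```

In a clinical application, a flagged sample would typically be withheld from automatic inference or passed on with an explicit warning rather than silently processed.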

How to address data heterogeneity and complexity? (goal 3.5)

Surgeon and surgical team dynamics play a significant role in intraoperative care. While the main surgeon has the lead and makes decisions based on domain knowledge, experience and skills, anesthesiologists, assistant surgeons, nurses and further staff play crucial roles at different steps of the workflow. Their smooth, dynamic collaboration and coordination is a crucial factor for the success of the overall process. Data analytics can play a key role in quantifying these intangibles by modeling workflows and processes. Surgeon skill evaluation, personalized and timely feedback during surgical training, optimal surgeon and patient/case or surgeon and surgical team matches are among the issues that can benefit from data analytics tools. Furthermore, data collected from multiple sources such as vital signs from live monitoring devices, electronic health records, patient demographics, or preoperative imaging modalities require analysis approaches that can accommodate their heterogeneity. Recent approaches in fusion of heterogeneous information include the use of specialized frameworks such as iFusion (Guo et al., 2019). Other work has specifically focused on handling incomplete heterogeneous data with Variational Autoencoders (VAEs) (Nazábal et al., 2020). Graph neural networks (Zhou et al., 2019a) appear to be another particularly promising research direction in this regard. Here as well, however, the lack of large amounts of annotated data is a limiting factor (Raghu et al., 2019). Heterogeneity may also occur in labels (Joskowicz et al., 2019). This could potentially be addressed with fuzzy output/references as well as with probabilistic methods capable of representing multiple plausible solutions in the output, as suggested by some early work on the topic (Kohl et al., 2018; Adler et al., 2019; Trofimova et al., 2020).

How to enable real-time assistance? (goal 3.6)

Fast inference in an interventional setting relies on (1) an adequate hardware and communication infrastructure (covered in Section 3) and on (2) fast algorithms. The trade-off between algorithm and software optimization should be finely balanced between the available edge compute power and the latency requirements of the specific application. Moving high resolution video between devices or displays inherently adds delays and should be minimized for dynamic assistance applications or where data inference links to control systems. This means that edge compute solutions should carefully consider the input to the display pipeline and the size of the inference models that can be loaded into an edge processor. Where latency is less critical, cloud execution of AI models has already been shown to be viable in assistive systems (e.g. Cydar EV from Cydar Medical (Cambridge, UK) for endovascular navigation, or CADDIE / CADDU from Odin Vision Ltd (London, UK) for AI assisted endoscopy). Cloud computing for real-time assistance relies on good connectivity to move data but offers the possibility of running potentially large inference models and returning results for assistance to the OR. Recent advances in the emerging research field of Tactile Internet with Human-in-the-Loop (TaHiL) (Fitzek et al., 2021), which involves intelligent telecommunication networks and secure computing infrastructure, are an enabling technology for real-time remote SDS applications. To trigger progress in the field, specific clinical applications requiring real-time support should be identified and focused on. Dedicated benchmarking competitions in the context of these applications could further guide methodological development.
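As a rough illustration of the edge versus cloud trade-off, the following Python sketch sums per-stage delays for the two deployment options and checks them against an application latency budget. All stage names, timings and the 100 ms budget are hypothetical placeholders and are not measurements from any of the systems mentioned above.

```python
def total_latency_ms(stages):
    """Sum the delay of every pipeline stage (all values in milliseconds)."""
    return sum(stages.values())


def meets_budget(stages, budget_ms):
    """Check whether the end-to-end delay stays within the application's latency budget."""
    return total_latency_ms(stages) <= budget_ms


# Hypothetical per-stage delays: a small model on an edge device ...
edge_pipeline = {"capture": 10, "inference": 30, "display": 10}
# ... versus a larger, faster-running model in the cloud that pays for video transfer.
cloud_pipeline = {"capture": 10, "uplink": 40, "inference": 8, "downlink": 40, "display": 10}

BUDGET_MS = 100  # assumed requirement of a dynamic assistance application
edge_ok = meets_budget(edge_pipeline, BUDGET_MS)
cloud_ok = meets_budget(cloud_pipeline, BUDGET_MS)
```

With these assumed numbers the cloud pipeline exceeds the budget solely because of the two transfer stages, although its inference stage is the fastest.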

How to train and apply algorithms under regulatory constraints? (goal 3.7)

When an SDS data set contains personal medical data, an open challenge lies in how to perform data analytics and train ML models without sensitive information being exposed in the results or models. A general solution that is gaining increasing traction in ML is differential privacy (Dwork et al., 2006). This offers a strong protection mechanism against linkage, de-anonymization and data reconstruction attacks, with rigorous privacy guarantees from cryptography theory. A limitation of differential privacy can be seen in the resulting compromise in terms of model accuracy, which may conflict with accuracy targets. Differential privacy may ultimately be mandatory for federated learning (Li et al., 2019) and publicly releasing SDS models built from personal medical data. Since patients have the right to delete their data, privacy questions also arise regarding models that were trained on their data. In addition, it might be an attractive business model for companies to sell their annotated data or make them publicly available for research purposes. This requires methods to detect whether specific data has been used to train models, e.g. using concepts of “radioactive data” (Sablayrolles et al., 2020), or methods that detect whether a model has forgotten specific data (Liu and Tsaftaris, 2020). A complementary approach to preserving privacy is to work with a different representation of the data. For example, Twinanda et al. (2015); Sharghi et al. (2020) evaluate the use of depth images rather than RGB images to recognize human activity in the hospital, while Chou et al. (2018); Srivastav et al. (2019) perform the analysis on low-resolution images.
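A minimal sketch of differential privacy is the Laplace mechanism applied to a counting query, shown below in Python. The record layout, the query and the value of the privacy parameter `epsilon` are hypothetical; differentially private training of ML models requires considerably more machinery than this example. The sketch does, however, make the accuracy compromise tangible: a smaller `epsilon` gives stronger privacy but noisier answers.

```python
import random


def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one patient changes a count by at most 1 (sensitivity 1).
    sensitivity = 1.0
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical data set: 100 procedures, 25 of them with a complication.
records = [{"complication": i % 4 == 0} for i in range(100)]
noisy = private_count(records, lambda r: r["complication"], epsilon=1.0,
                      rng=random.Random(0))
```

Each released answer consumes part of a privacy budget, so repeated queries on the same data weaken the overall guarantee unless the budget is accounted for.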

How to ensure meaningful validation and evaluation? (all goals)

Validation (defined as the demonstration that a system does what it has been designed to do) and evaluation (defined as the demonstration of the short-, mid- and long-term added values of the system) are crucial for the development of SDS solutions. The problem with the assessment of ML methods today is that models trained on a particular data set are evaluated on new data taken from the same distribution as the training data. Although recent efforts have been made in healthcare (McKinney et al., 2020) to include test data from different clinical sites, these still remain limited. This situation poses a challenge particularly for healthcare applications, as real-world test data, after the model is deployed for clinical use, will typically not have ground-truth annotation, making its assessment difficult (Castro et al., 2020). A recent example of this is Google Health’s deep learning system that predicts whether a person might be at risk for diabetic retinopathy. In this case, after its deployment at clinics in rural Thailand, despite having high theoretical accuracy, the tool was reported to be impractical in real-world testing (TechCrunch, 2020). In the future, evaluation of methods should be performed increasingly in multi-center settings and incorporate the important aspects of robustness to domain shifts, data imbalance and bias. Global initiatives such as MLCommons and its Medical Working Group will play a central role in designing benchmarks and proposing best practices in this regard. Furthermore, matching performance metrics to the clinical goals should be more carefully considered, as illustrated in recent work (Reinke et al., 2021). Finally, specific technical aspects (e.g. explainability, generalization) should be comparatively benchmarked with international challenges and covered at dedicated workshops. In this context, acquiring dedicated sponsor money for annotations could help generate more high-quality public data sets.
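One simple way to expose the gap between same-distribution and multi-center performance is to report metrics per clinical site instead of pooling all test cases. The Python sketch below assumes that each test record carries a site identifier; the site names, the labels and the helper `generalization_gap` are hypothetical illustrations rather than an established benchmark protocol.

```python
from collections import defaultdict


def accuracy_by_site(records):
    """Compute accuracy separately per site from (site, y_true, y_pred) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for site, y_true, y_pred in records:
        totals[site] += 1
        hits[site] += int(y_true == y_pred)
    return {site: hits[site] / totals[site] for site in totals}


def generalization_gap(per_site, training_site):
    """Accuracy at the training site minus the worst accuracy at any external site."""
    external = [acc for site, acc in per_site.items() if site != training_site]
    if not external:
        raise ValueError("at least one external site is required")
    return per_site[training_site] - min(external)


# Hypothetical predictions: site A contributed the training data, site B did not.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 1),
]
per_site = accuracy_by_site(records)
gap = generalization_gap(per_site, training_site="A")
```

Reporting the worst external site, rather than the average, reflects the concern that a single poorly served hospital is already a clinical problem.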

6. Clinical translation

The process of clinical translation from bench to bedside has been described as a valley of death, not only for surgical (software) products, but for biomedical research in general (Butler, 2008). In this section, we will begin by describing current practice and key initiatives in clinical translation of SDS. We elaborate on the concept of “low-hanging fruit” that may be reached in a comparatively straightforward manner through collaboration of surgeon scientists, computer scientists and industry leaders. Finally, we will outline current challenges and next steps for those low-hanging fruit to cross the valley of death, turning SDS applications from optional translational research projects into key elements of the product portfolio for modern OR vendors, which in turn will increase engagement on the part of researchers, industry, funding agencies and regulatory bodies alike.

6.1. Current practice

Clinical translation of products developed through SDS is regulated under existing rules and guidelines. Ultimately, systems or products using SDS components must be able to provide value before, during or after surgery or interventions. Validating such capabilities requires prospective clinical trials in real treatment practices, which require ethics and safety approval by relevant bodies as well as adherence to software standards described in Section 5.4. System documentation and reliability are critical for passing such approval procedures; in exceptional cases, however, approval can also be obtained for research purposes without proof of code stability.

From a clinical research perspective, meta-analyses of RCTs are considered the gold standard. However, the field of surgery exhibits a notable lack of high-quality clinical studies as compared to other medical disciplines (McCulloch et al., 2002). While long-term clinical studies are a common prerequisite for clinical translation, despite intense research, the number of existing clinical studies in AI-based medicine is extremely low (Nagendran et al., 2020). As a result, most current clinical studies in the field are based on selected data that are retrospectively analyzed, leading to a lack of high quality evidence that in turn hampers clinical progress. A recent scoping review on AI-based intraoperative decision support in particular named the small size, single-center provenance and questionable representability of the data sets, the lack of accounting for variability among human comparators, the lack of quantitative error analysis, and a failure to segregate training and test data sets as the prevalent methodological shortcomings (Navarrete-Welton and Hashimoto, 2020).

Despite these shortcomings, it should be noted that not all questions that arise in the process of clinical translation of an algorithm necessarily need to be addressed by RCTs. For example, a recent DL algorithm to diagnose diabetic retinopathy was approved by the FDA based on a pivotal cross-sectional study (Abràmoff et al., 2018). Translational research on SDS products for prognosis also leverages existing methodology on prospective and retrospective cohort studies for the purposes of internal and external validation.

Generally speaking, the field of SDS still faces several domain-specific impediments. For instance, digitalization has not permeated the OR and the surgical community in the same way as other areas of medicine (Wilhelm et al., 2020). A lack of standardization of surgical procedures hampers the creation of standardized annotation protocols, an important prerequisite for large-scale multi-center studies. Pioneering clinical success stories are important motivators to help set in motion a virtuous circle of advancement in the OR and beyond.

6.2. Key initiatives and achievements

The following section will provide an overview of existing SDS products and clinical studies in SDS.

SDS products:

Over the past few years, modest success in clinical translation and approval of SDS products has been achieved, as summarized in Table 5. This predominantly includes decision support in endoscopic imaging. Endoscopic AI (AI Medical Service, Tokyo, Japan) and GI Genius (Medtronic, Dublin, Ireland) support gastroenterologists in the detection of cancerous lesions, although the former struggles with a low positive predictive value (Hirasawa et al., 2018). Other successful applications include OR safety algorithms or computer vision-based data extraction.
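Why a detector with good sensitivity and specificity can still have a low positive predictive value (PPV) follows directly from Bayes' rule, as the short Python sketch below illustrates. The sensitivity, specificity and prevalence values are hypothetical and are not those reported by Hirasawa et al. (2018).

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = true positives / all positives, derived from Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical detector with 90% sensitivity and 90% specificity.
ppv_rare = positive_predictive_value(0.9, 0.9, prevalence=0.01)
ppv_common = positive_predictive_value(0.9, 0.9, prevalence=0.5)
```

Under these assumptions, roughly one in twelve alarms is correct when only 1% of the examined frames or patients show a lesion, compared with nine in ten at a prevalence of 50%, which is why the PPV has to be interpreted together with the prevalence in the target population.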

Table 5.

Selection of SDS products with machine learning (ML)-based components as of October 2020.

| Manufacturer | Product | Purpose | SDS functionality | Approval |
|---|---|---|---|---|
| Decision Support | | | | |
| AI Medical Service, Inc. (Tokyo, Japan) | Endoscopic AI | Early detection of gastrointestinal cancers | Data-driven detection of cancer lesions in endoscopic videos | FDA: Breakthrough Device Designation; Europe: none |
| Medtronic plc (Dublin, Ireland) | GI Genius | Early detection of colorectal cancer | Data-driven anomaly detection in colonoscopy videos | FDA: none; Europe: none |
| Gauss Surgical, Inc. (Menlo Park, CA, US) | Triton | Improvement of safety in the operating room | Data-driven obstetric hemorrhage quantification through scans of sponges and canisters and sponge counting through scans of surgical field or counter bags | FDA: De Novo and 510(k); Europe: CE mark |
| Surgical Education | | | | |
| Theator, Inc. (San Mateo, CA, US) | Surgical Intelligence Platform | Surgical training | Computer vision-based key moment extraction and annotation on surgical videos and video-based training | FDA: none; Europe: none |

Translational progress in academia:

While most of the work has focused on preoperative decision support, here, we place a particular focus on intraoperative assistance. Table 6 shows several exemplary studies in academia that illustrate how far SDS products have been translated to clinical practice in this regard.

Table 6.

Selection of SDS clinical studies. Searches were performed in June 2021 using [machine learning] AND [surgery] or [deep learning] AND [surgery] or [artificial intelligence] AND [surgery] or [decision support] AND [surgery] or [surgical data science] AND [clinical] in PubMed and Google. Search results were manually evaluated and all studies that analyzed an intraoperative SDS system with a machine learning (ML)-based component were included.

| Publication | Subject | Type of study | Study size (# patients) |
|---|---|---|---|
| Fan et al. (2016) | ML-based intraoperative somatosensory evoked potential monitoring based on somatosensory evoked potential measurements | Cross-sectional | 10 |
| Harangi et al. (2017) | ML-based classification of uterine artery and the ureter based on video images from gynecologic surgery | Cross-sectional | 35 |
| Korndorffer et al. (2020) | ML-based detection of intraoperative events of interest and case severity based on laparoscopic cholecystectomy videos | Cross-sectional | n/a (1,051 videos) |
| Lundberg et al. (2018) | Explainable ML-based predictions for the prevention of hypoxemia during surgery based on minute-by-minute data from electronic health records | Prospective cohort | n/a (53,126 procedures) |
| Madani et al. (2021) | ML-based segmentation of safe and dangerous zones of dissection based on laparoscopic cholecystectomy videos | Cross-sectional | n/a (290 videos) |
| Mascagni et al. (2020b) | ML-based segmentation of anatomy and assessment of CVS criteria based on laparoscopic cholecystectomy videos | Cross-sectional | n/a (201 videos) |
| Tokuyasu et al. (2021) | ML-based bounding box detection of hepatocystic anatomy on laparoscopic cholecystectomy videos | Cross-sectional | 1 (99 videos) |
| Wijnberge et al. (2020) | ML-based early warning system for intraoperative hypotension based on continuous invasive blood pressure monitoring | Randomized controlled trial | 68 |

Intraoperative assistance:

A recent review on AI for surgery mainly found studies that use ML to improve intraoperative imaging such as hyperspectral imaging or optical coherence tomography (Navarrete-Welton and Hashimoto, 2020). Further notable intraoperative decision support efforts have focused on hypoxemia prevention (Lundberg et al., 2018), sensor monitoring to support anesthesiologists with proper blood pressure management (Wijnberge et al., 2020) and intelligent spinal cord monitoring during spinal surgery (Fan et al., 2016). A number of models have been developed to promote safety in laparoscopic cholecystectomy, a very common and standardized minimally invasive abdominal procedure. For instance, a model for bounding box detection of hepatocystic anatomy was recently tested in the operating room (Tokuyasu et al., 2021). Another example of SDS for safe cholecystectomy is DeepCVS, a neural network trained to semantically segment hepatocystic anatomy and assess the criteria defining the CVS (Mascagni et al., 2020b). A recent study based on 290 laparoscopic cholecystectomy videos from 37 countries showed that DL-based image analysis may be able to identify safe and dangerous zones of dissection (Madani et al., 2021). Finally, a cross-sectional study using DL algorithms developed on videos of the surgical field from more than 1000 cholecystectomy procedures from two institutions showed an association between disease severity and surgeons’ ability to verify the CVS (Korndorffer et al., 2020). Another example of intraoperative decision support is a study by Harangi et al. (2017), who developed a neural network-based method to classify a structure specified by a surgeon (by drawing a line in the image) as either the uterine artery or the ureter. The authors reported a high accuracy, but the study had a cross-sectional design with a convenience sample.
In fact, convenience samples are the norm in most existing studies in SDS addressing recognition of objects or anatomical structures in the surgical field. This sampling mechanism makes the findings susceptible to selection bias, which affects generalizability or external validation of the methods.

Perioperative decision support and prediction:

A selection of studies on perioperative assistance can be found in Appendix D. One important application of academic SDS is clinical decision support systems (CDSS) that integrate various information sources and compute a recommendation for surgeons about the optimal treatment option for a certain patient. Many of these CDSS are prediction systems that integrate clinical, radiological and pathological attributes collected in a routine setting into a mathematical model and weigh these parameters automatically to achieve a novel risk stratification (Shur et al., 2020). Trained with a specifically selected subpopulation of patients, these prediction systems may help improve current classification systems in guiding surgical decisions (Tsilimigras et al., 2020). Relevant information like overall- and recurrence-free survival (Schoenberg et al., 2020) or the likelihood of intra- and postoperative adverse events (Bhandari et al., 2020) can be assessed and obtained quickly via online applications such as pancreascalculator.com (van Roessel et al., 2020). In contrast to these score-based prediction systems, ML-based systems are more flexible. The most prominent ML-based system, IBM’s Watson for Oncology, is based on NLP and iterative features and demonstrated good accordance with treatments selected by a multidisciplinary tumor board in hospitals in India (Somashekhar et al., 2018) and South Korea (Lee et al., 2018). Weaknesses of this system include the necessity of skilled oncologists to operate the program, low generalizability to different regions, and the fact that not all subtypes of a specific cancer can be processed (Yao et al., 2020; Strickland, 2019).

Another important application besides decision support is prediction of adverse events. A widely discussed work showed that DL may predict kidney failure up to 48 hours in advance (Tomašev et al., 2019). In the intensive care unit (ICU), where surgeons face enormous quantities of clinical measurements from multiple sources, such as monitoring systems, laboratory values, diagnostic imaging and microbiology results, data-driven algorithms have demonstrated the ability to predict circulatory failure (Hyland et al., 2020).

Table E.1 provides an overview of currently registered SDS clinical studies. While most aim for evaluation of specific applications, a number of ongoing clinical trials focus on data collection for the original development of future CDSS or other SDS applications.

6.3. Low-hanging fruit

In light of the lack of a critical number of clinical success stories, a viable approach to clinical translation should initially focus on “low-hanging fruit”. We believe the following criteria influence the likelihood of successful translation of an SDS application: high patient safety, technical feasibility (especially regarding data needs and performance requirements), easy workflow integration, high clinical value and high business value to encourage industry adoption. Low-hanging fruit typically also avoid being classified as a high-risk medical product, thereby reducing regulatory demands and development barriers. However, it is difficult to satisfy all of these often conflicting criteria simultaneously. For example, applications of significant clinical value such as real-time decision support are highly technically challenging. By contrast, low-level video processing applications such as uninformative frame detection are technically simple but of limited clinical value. SDS applications that are low-hanging fruit are ones that offer a good balance between most or all of these criteria.

An example of a low-risk medical device in the broader scope of SDS is the aforementioned GI Genius, which uses AI for real-time detection and localization of polyps during colonoscopy, supporting the examination but not replacing the clinical decision making and diagnostics by clinicians. Considering the low risk to patients, GI Genius is classified as a Class II medical device (with special controls) by the FDA (FDA, 2021b).

Different types and opportunities:

In surgery, a framework that may help determine the next steps for low-hanging fruit is the digital technology framework that categorizes data-centric product innovations into descriptive, diagnostic, predictive and prescriptive technologies, as detailed in Section 5. Currently, the overwhelming focus for SDS researchers is in the prescriptive technology area, for example on tools that provide surgical decision support or predict adverse events. Changing the development lens from prescriptive to descriptive SDS applications, however, may open up entirely new avenues. For instance, a low-hanging fruit may lie in a descriptive decision support tool that informs surgeons on how many surgeons performed certain steps within an intervention and the consequences. Such a data-centric SDS product would not require embedded surgical expertise in order to provide value to the surgeon, but only a database of surgical videos and automated recognition of anatomical structures and surgical instruments, which is technically feasible. In essence, instead of the very difficult automation of surgical decisions, value can be found in providing surgeons and surgical teams with moment-to-moment risk stratification data to facilitate their decisions. An additional benefit of this approach is that it can be combined with real-time data acquisition regarding how surgeons interact with the risk stratification data, which would greatly facilitate the development of both predictive and prescriptive decision support tools.

Importantly, presenting statistical data and evidence-based risk stratification information to the surgeon would also have a different regulatory path than a prescriptive SDS product that offers surgical decisions based on an AI database grounded in surgical decision making. The data-focused product leaves the surgeon fully responsible, while the decision-based product makes it questionable who is fully responsible if the surgeon followed an AI-based decision and there was a poor outcome. Another benefit of focusing on descriptive technologies is that there is a much smaller technology adoption hurdle for the surgeon when faced with trusting descriptive statistics compared to an AI-based prescriptive decision support tool.

An ML-based descriptive low-hanging fruit could be data-driven surgical reporting and documentation. Surgical procedures are currently documented as one to two pages of text. While a six to eight hour video will not serve as a report in itself, SDS may help extract relevant information from this video by automatically documenting important steps in the procedure. Here, computer vision algorithms for recognition of surgical phases and instruments may be used to extract metainformation from videos (Mascagni et al., 2021b).
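The documentation step described above can be sketched as follows, assuming that a phase recognition model (not shown) has already produced one phase label per video frame. The function names, the phase labels and the frame rate are hypothetical choices for illustration; the sketch only shows how per-frame labels could be collapsed into a timestamped summary.

```python
from itertools import groupby


def phases_to_segments(frame_phases, fps=1.0):
    """Collapse per-frame phase labels into (phase, start_s, end_s) segments."""
    segments = []
    frame = 0
    for phase, run in groupby(frame_phases):
        length = len(list(run))  # number of consecutive frames with this label
        segments.append((phase, frame / fps, (frame + length) / fps))
        frame += length
    return segments


def format_report(segments):
    """Render the segments as plain-text report lines."""
    return "\n".join(
        f"{start:8.1f} s to {end:8.1f} s  {phase}" for phase, start, end in segments
    )


if __name__ == "__main__":
    # Invented per-frame predictions at one frame per second (placeholder labels).
    predictions = ["preparation"] * 3 + ["dissection"] * 2 + ["closure"]
    print(format_report(phases_to_segments(predictions, fps=1.0)))
```

In practice, frame-wise predictions are noisy, so some temporal smoothing would likely be needed before segmentation; this is omitted here to keep the sketch short.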

An ML-based predictive low-hanging fruit could lie in the optimization of OR logistics. Prediction of procedure time either preoperatively or utilizing intraoperative sensor data may not improve patient outcome, but could provide value to hospital managers if it helps cut down costs in the OR by optimizing patient volume (Aksamentov et al., 2017; Bodenstedt et al., 2019b; Twinanda et al., 2019). This, too, harbors low risk for patients and has a low barrier for market entry. Furthermore, the reference information, i.e., time between incision and suture, is already documented in most hospitals and no laborious annotation by surgical experts is necessary to train the respective ML algorithms. Since OR management tools already exist, SDS applications could even yield success stories within existing tools without having to establish entirely new software tools. Improvements in patient safety may already result from a simple tool that combines SDS algorithms for object recognition in laparoscopic video (e.g. gauze, specimen bag or suture needle) with a warning for surgeons and scrub nurses if these objects are introduced into the patient’s abdomen but not removed afterwards. Since such an SDS application warns clinical staff but does not perform an action on the patient itself, the risk for the patient is inherently low. Here, a combination of surgical knowledge (which objects are at what time introduced into the patient’s body?) with SDS algorithms (which objects can robustly be detected?) and an unobtrusive user interface with a low false alarm rate may result in a low-hanging fruit. Along these lines, automation of the surgical checklist (Conley et al., 2011) would be a technically feasible SDS application with high clinical value.
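The warning tool for objects left in the abdomen can be sketched as a simple tally, assuming that an object recognition model (not shown) emits an event whenever a tracked object is seen entering or leaving the abdomen. The event format and the function name are hypothetical; the object classes are the examples named above.

```python
from collections import Counter


def outstanding_objects(events):
    """Return {object_class: count} for objects introduced but not yet removed.

    `events` is an iterable of (object_class, action) pairs, where action is
    "in" (object introduced) or "out" (object removed). This event format is
    an assumption made for this sketch.
    """
    balance = Counter()
    for object_class, action in events:
        if action == "in":
            balance[object_class] += 1
        elif action == "out":
            balance[object_class] -= 1
        else:
            raise ValueError(f"unknown action: {action!r}")
    # Only positive balances trigger a warning. Negative balances would point
    # to a missed "in" detection and are ignored in this sketch.
    return {obj: n for obj, n in balance.items() if n > 0}


if __name__ == "__main__":
    # Invented event stream, for illustration only.
    stream = [
        ("gauze", "in"),
        ("gauze", "in"),
        ("suture needle", "in"),
        ("gauze", "out"),
        ("suture needle", "out"),
    ]
    for obj, n in outstanding_objects(stream).items():
        print(f"WARNING: {n} x {obj} introduced but not removed")
```

The counting itself is trivial. As noted in the text, the practical difficulty lies in detecting the objects robustly enough to keep the false alarm rate low.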

Surgical robotics as catalyst:

The impending success of next-generation surgical robotics in the OR may bring further opportunities for the clinical translation of SDS. The da Vinci® surgical system (Intuitive Surgical Inc., Sunnyvale, CA, USA) and its upcoming competitors lay the foundation for systematic data capture as well as surgical guidance by information augmentation in the OR. A relatively low-hanging fruit with benefit to the surgeon in the domain of surgical robotics may be an automated camera guidance system, as suggested by Wagner et al. (2021). On the one hand, the risk that poor camera positioning poses to the patient is low compared to that of invasive tasks such as suturing. On the other hand, correcting the camera position is currently a highly disruptive task for the surgeon. The first products for autonomous endoscopic camera control are now emerging in robotic surgery, such as the FDA-approved system from TransEnterix (Morrisville, NC, USA).

6.4. Current challenges and next steps

As highlighted in several previous publications (Maier-Hein et al., 2017; 2018a; Hager et al., 2020), clinical applications for SDS are manifold, ranging from pre- and intraoperative decision support to context-aware assistance and surgical skills training. The clinical translation-related goals generated by the consortium as part of the Delphi process are provided in Table 7. The following aspects deserve particular attention:

How to catalyze clinical translation of SDS? (goals 4.1/4.2)

Clinical data is recognized as “the resource most central to health-care progress” (Institute of Medicine (USA) Roundtable on Value & Science-Driven Health Care, 2010). What is needed is thus a cultural shift toward data acquisition, annotation and analysis within a well-defined data governance framework as a primary clinical task (August et al., 2021). The allocation of economic, infrastructural and personnel resources within hospitals appears to be a non-negotiable requirement for this purpose. The need for creating value from large amounts of representative data, both for de novo development/validation and external validation studies, further necessitates multi-institutional collaborations. Researchers in other domains have achieved such collaborations, for example in genomics and bioinformatics; SDS would benefit from adopting relevant aspects of these domains’ research cultures. In addition, enabling explicit academic recognition for developing rigorously annotated data sets can facilitate data resources for research in SDS, as discussed in Section 4. Paving the way for short-term clinical success stories as well as long-term clinical translation further requires SDS applications to be integrated into clinical workflows. In fact, the sparsity of studies on SDS solutions for intraoperative care illustrates the challenge of conducting multidisciplinary research while prioritizing the patient. Therefore, research on SDS products should consider the impact on workflow early in product development and closely engage relevant stakeholders (see Table 1). Impactful success stories could then be generated by focusing on the low-hanging fruit presented in the previous section. These, in turn, would contribute to building public trust in SDS and boost public enthusiasm to spark patient demand.

How to improve knowledge transfer among different stakeholders? (goal 4.3)

The creation of interdisciplinary networks involving the different stakeholders and the regular organization of SDS events in conjunction with both technical and medical conferences are key to improving knowledge transfer between the groups. Such events should, in part, be dedicated to specific questions, such as annotation guidelines, data structures or good practices with respect to external validation. As a means for actively disseminating, discussing, and promoting new insights in the field of SDS, a well-curated community web platform should be established as the central information hub. One could even go further and offer, for example, a prize for clinical trials demonstrating SDS success. A good means for public outreach could be the hosting of public days focused on a particular topic at major conferences in the field, as a way of creating awareness for that topic, or campaigns in the vein of “Stop the Bleed” (ACS Committee on Trauma).

How to train key SDS personnel? (goal 4.4)

In order to facilitate clinical translation of SDS in the long term, it will further be crucial to promote the transdisciplinary training of future surgical data scientists and thereby establish SDS as a career path. Computer scientists will have to enter ORs on a regular basis to understand real clinical problems and to get an impression of the obstacles in clinical translation. Similarly, surgeons will have to understand the basic principles, capabilities and limits of data science techniques to identify solvable clinical problems and proper applications for SDS. A viable path to improve knowledge transfer would be to establish SDS as a commonly respected career path in hospitals. In this context, both technical and clinical disciplines should be complemented by knowledge and expertise in clinical research methodology, i.e., epidemiology and biostatistics. Moreover, human factors engineering and human-computer interaction researchers should be integrated into the community. Setting up such an SDS career path should also involve defining the qualifications and skills an ‘AI-ready’ clinician should have. A curriculum should put a specific focus on medical statistics covering confounding variables, risk correction and data biases, as well as on regulatory issues (e.g. SaMD). On top of the research-oriented positions, we should further seek to establish SDS-related jobs for data acquisition, management and annotation, specifically in university hospitals.

How to ensure high-quality external validation of SDS applications? (goals 4.5–4.7)

A critical pitfall with clinical prediction models, which include models for diagnosis and prognosis, is the unbridled proliferation of de novo development and validation studies alongside scant external validation studies (Adibi et al., 2020). Research to support regulatory approval of SDS products, i.e., in order to market these products, would typically address external validation. However, advances in clinical care are not restricted to marketed products. Therefore, it is necessary for the research community to conduct not only de novo development and validation studies but also well-designed external validation studies. Past experience with clinical prediction models shows the need for creative solutions. While some solutions, such as “living registries”, have been proposed (Adibi et al., 2020), a proactive effort by the SDS community to develop effective solutions that allow for consistent and uniform external validation can be a transformative contribution. The status quo, summarized in a review of existing literature in AI-based intraoperative decision-making, shows that the SDS community has not addressed the pitfall of inadequate external validation studies (Navarrete-Welton and Hashimoto, 2020). This challenge is systematically addressed when the end-goal for the translational research is regulatory approval to market an SDS product; the regulatory agency serves as a steward in this case. Similar stewardship may benefit translational research in SDS that is not intended to support regulatory approval. Finally, it is important to develop new performance metrics for AI algorithms that quantify clinically relevant parameters currently not accounted for in outcome validation studies. One particular challenge lies in the assessment of long-term outcomes. Many established metrics, such as 5-year survival after a surgical intervention for cancer, may not be immediately available following surgery. Here, ML techniques can help by capturing data patterns that could serve as potential surrogate measures: surgical video or motion data localized to anatomy through imaging studies may be used to identify activities or events that increase the risk of cancer cell seeding and subsequent metastasis and thus predict the long-term outcome.

How to ensure ethical and legal guidance? (goals 4.8/4.9)

With the face of data-driven clinical practice about to change profoundly, unprecedented ethical and legal questions pertaining to both the regulation of medical AI and its practical use will be raised. Moving forward, liability and medical negligence/insurance regulations need to be adapted for data-driven clinical practice. A recent survey among Dutch surgeons revealed privacy and liability concerns as significant grounds for objection to video and audio recording of surgical procedures (van de Graaf et al., 2020), reinforcing the importance of clear regulatory frameworks toward better clinical acceptance. New regulations will have to go much further than these current considerations, with a particular focus to be placed on cases of AI failure, human rejection of AI recommendations, or potentially the omission of AI (European Parliament, 2020). Notably, the FDA recently put forth an Artificial Intelligence and Machine Learning (AI/ML) Software as a Medical Device Action Plan (FDA, 2021a). These regulatory issues are closely interconnected with previously raised issues of trust in, as well as transparency and explainability of, AI models, which have also been addressed in the very recent WHO report Ethics & Governance of Artificial Intelligence for Health (WHO, 2021). An ethical and human rights-based framework intended to guide the development and use of AI was further proposed by Fjeld et al. (2020), taking into account eight key themes: privacy, accountability, safety/security, transparency/explainability, fairness and non-discrimination, human control of technology, professional responsibility, and promotion of human values. Moreover, ethical and moral considerations regarding the democratization of data and/or AI model access will be necessary. In the specific context of surgery, first guidance on the ethical implications of integrating AI algorithms into surgical training workflows has recently become available (Collins et al., 2021). Similarly, new concepts for obtaining patient consent to data sharing that take into account the dynamics and unforeseeability of data usage in future SDS applications need to be established. One way forward might be the introduction of a data donor card, analogous to organ donor cards, as suggested in Section 4.4. Both patient- and healthcare professional-centric ethical and legal considerations are likely to have a large impact on the public perception of and trust in SDS, which needs to be boosted for higher patient demand. Above all, patient safety must be supported by the development of carefully considered regulatory frameworks.

In summary, a multi-pronged approach to address challenges that can catalyze rapid advances in SDS and to develop solutions to problems considered low-hanging fruit will be crucial to the future of SDS as a scientific field. The introduction of initial features that provide clear benefits can facilitate advanced changes. To this end, a compositional approach may be pursued wherein complex SDS products reuse simpler AI models that have been previously approved and adopted in clinical care. Once a number of high value applications are established and there is hospital buy-in, a virtuous circle of SDS can be expected to begin, enabling more applications, higher volume data collection, stronger models, streamlined regulation, and better acceptance.

7. Discussion

Fifteen years have passed since the vision of the OR of the future was sketched for the year 2020 (Cleary et al., 2004). A central goal of the SDS 2019 workshop was to revisit the paper and report produced by Cleary et al. (2005) and Mun and Cleary (2005) and investigate where we stand, what has hindered us from achieving some of the goals envisioned and which new trends had not been considered at the time.

When asked: “What has really changed when you are entering the OR of today as compared to the setting in 2004?”, participants came to the conclusion that they do not perceive any disruptive changes. Improvements were stated to be of a rather incremental nature, including advances in visualization (e.g. 3D visualization and 4K video imaging (Ceccarelli et al., 2018; Dunkin and Flowers, 2015; Rigante et al., 2017)) and improvements in tissue dissection, which is now safer, easier and faster to perform due to ultrasound scissors and impedance controlled electrosurgery, for example. None of these innovations includes a relevant AI or ML aspect. Moreover, some developments did not even deliver the envisioned benefits. For instance, staplers of today are far more sophisticated than 10 years ago, but the problem of anastomotic leakage is still relevant (Stamos and Brady, 2018). The following paragraphs put the six main topics of the 2004 workshop into today’s perspective.

Operational efficiency and workflow:

Core problems identified in 2004 were the “absence of a standard, computerized medical record for patients that documents their histories and their needs” as well as “multiple and disparate systems for tracking related work processes”. While these problems have remained until today (see Section 3), the challenge of integrating the different information sources related to the entire patient pathway has meanwhile been widely acknowledged. Emerging standards like HL7 FHIR and the maturing efforts of IHE form a solid base for future developments. However, standards alone are not sufficient to solve the problem; hospitals need to make data acquisition, exchange and accessibility a requirement. HIT that enables fast deployment of tools for data acquisition, annotation and processing should be seen as a core service to enable cutting edge research. By centralizing such efforts, data pools can be maintained over the scope of many projects instead of creating isolated databases. This brings with it the need to standardize regulatory workflows. Getting access to data for research is often highly challenging. By outlining clear guidelines and codes of conduct, time spent on formalities can be cut while reducing uncertainties regarding what is the right or wrong way to handle sensitive data. Finally, the prevalence of unstructured data needs to be decreased in order to increase data accessibility. At this point, this also seems to be a matter of user interfaces - by providing clinicians with tools to rapidly create structured reports, reliance on free text can be reduced. This, however, requires training and acceptance by clinical personnel - which could be increased through education in data science topics.

Systems integration and technical standards:

OR integration was the aim of multiple international initiatives, such as OR.NET, the Smart Cyber Operating Theater (SCOT) project (Iseki et al., 2012) and the Medical Device “Plug-and-Play” (MD PnP) Interoperability Program. Despite these ongoing efforts we are, however, still far from an OR in which “all machines and imaging modalities can talk to each other”, as postulated in 2004. Again, interoperability with intraoperative devices should be viewed as a prerequisite by clinical management, and as an investment in future workflow and cost optimization. Emerging standards like SDC provide a means to enable data exchange; however, more work needs to be invested in the creation of platforms that enable dynamic reactions to events and complex interactions.

Telecollaboration:

While the OR of the twenty-first century connects many different individuals from various disciplines, telecollaboration has only slightly evolved during the last one and a half decades, and a genuine breakthrough has not yet been achieved (Choi et al., 2018). Many of the impediments can be seen in missing technical developments (e.g. regarding data compression and latency), coordination issues and knowledge gaps on the part of the prospective users as well as the aforementioned lack of data standardization (Mun and Cleary, 2005). It is to be hoped that coming improvements in intelligent telecommunication networks (e.g. 5G) might trigger future progress in telecollaboration.

Robotics and surgical instrumentation:

In 2020, numerous surgical procedures, including major surgery on the esophagus, pancreas or rectum, can feasibly be performed using surgical robots. In striking contrast, the actual use of surgical robotics is still marginal. A number of high-quality controlled trials failed to prove superiority, making the use of surgical robotics in many cases difficult to justify (Roh et al., 2018). Another reason for the poor progress may lie in the lack of competition in hardware. Since the discontinuation of the development of the ZEUS device in 2003, the field has been clearly dominated by the da Vinci system. Only recently have truly competitive systems such as the Senhance (TransEnterix) or the Versius® (Cambridge Medical Robotics Ltd., Cambridge, UK) system begun to emerge (Peters et al., 2018). It will be exciting to see whether a broader range of technical solutions, along with, perhaps, a stronger interlocking with next-generation intraoperative imaging, will stimulate this particular aspect of the next OR.

Intraoperative diagnosis and imaging:

While intraoperative imaging appeared very promising in 2004, the modest successes that have been made in that area are mostly related to mobile X-ray-based devices and drop-in devices in robotics (Diana et al., 2017; Goyal, 2018). The pivotal problem of matching pre- and intraoperative images still remains, as does the unsolved issue of adaptive real-time visualization during intraoperative deformation of soft tissue. One emerging and very promising field is biophotonics (see Section 3). Benefiting from a lack of ionizing radiation, low hardware complexity and easy integrability into the surgical workflow, biophotonics has yielded an increasing number of success stories in intraoperative imaging (Bruins et al., 2020; Neuschler et al., 2017).

Surgical informatics:

In 2004, the term SDS had not been invented. At that time, surgical informatics was defined as the collection, storage/organization, retrieval, sharing, and rendering of biomedical information that is relevant to the care of the surgical patient, with an aim to provide comprehensive support to the entire healthcare team (Mun and Cleary, 2005). Since the beginnings of the field of computer-aided surgery, however, AI and in particular ML have arisen as new enabling techniques that were not in the focus 15 years ago. While these techniques have begun revolutionizing other areas of medicine, in particular radiology (Kickingereder et al., 2019; Shen et al., 2017), SDS still suffers from a notable absence of success stories. This can be attributed to a number of challenges, specifically related to high quality and high volume data annotation, as well as intraoperative data acquisition and analysis and surgical workflow integration, as detailed in Sections 3–6.

Overall, the comparison between the workshop topics discussed in 2004 and 2019 revealed that the most fundamental perceived difference is related to how the future of surgery is envisioned by experts in the field. While discussions in 2004 were mainly centered around devices, AI is now seen as a key enabling technique for the future OR. This article has therefore been centered around technical challenges related to applying AI/ML techniques to surgery. A core challenge now is to put the vision of SDS into clinical practice. The large number of relevant SDS stakeholders (Table 1) as well as the large number of goals with high priority (Tables 2, 3, 4 and 7), as compiled by the international Delphi expert panel, illustrate that the hurdles are high. With the presented concrete recommendations for addressing the complexity of SDS and moving forward, we hope to support the SDS community in overcoming existing barriers and eventually achieving clinical translation.

Acknowledgments

Many thanks to Annika Reinke (DKFZ, Germany) for designing Fig. 1. We further thank all participants of the workshop, in particular those who filled out the questionnaire including Max Allan (Intuitive Surgical Inc., United States), Mark Asselin (Queen’s University, Canada), Steven Bishop (CMR Surgical Ltd., United Kingdom), Sebastian Bodenstedt (National Center for Tumor Diseases (NCT), Germany), Harold Jay Bolingot (Kyushu Institute of Technology, Japan), Elvis Chen (Robarts Research Institute, Canada), Bijan Dastgheib (International Centre for Surgical Safety ICSS, Canada), Roger Daglius Dias (Brigham Health / Harvard Medical School, United States), Luc Duong (Ecole de technologie superieure, Canada), Ulrich Eck (Technical University of Munich, Germany), Isabel Funke (National Center for Tumor Diseases (NCT) Dresden, Germany), Cong Gao (Johns Hopkins University, United States), Pablo Garcia Kilroy (Verb Surgical Inc., United States), Matthias Grimm (Technische Universität München, Germany), Tamas Haidegger (Obuda University, Hungary), Georges Hattab (National Center for Tumor Diseases (NCT), Germany), Changyan He (Johns Hopkins University, United States), Enes Hosgor (surgical.ai, United States), Hassan Ismail Fawaz (Université Haute-Alsace, France), Anthony Jarc (Intuitive Surgical, United States), Leo Joskowicz (The Hebrew University of Jerusalem, Israel), Ertugrul Karademir (German Aerospace Center (DLR), Germany), Tae Soo Kim (Johns Hopkins University, United States), Kirsten Klein (KARL STORZ SE & Co. KG, Germany), Michael Kranzfelder (Klinikum rechts der Isar, TU München, Germany), Shlomi Laufer (Technion, Israel), Greg Nelson (Aesculap AG, Germany), Chinedu Nwoye (University of Strasbourg, France), Molly O’Brien (Johns Hopkins University, United States), Daniel Ostler (Klinikum rechts der Isar of Technical University of Munich, Germany), Micha Pfeiffer (National Center for Tumor Diseases Dresden (NCT), Germany), Mohammad Rahbari (University Hospital Mannheim, Medical Faculty Mannheim of the University of Heidelberg, Germany), Wolfgang Reiter (Wintegral GmbH, Germany), Nicola Rieke (NVIDIA, Germany), Roozbeh Shams (École Polytechnique de Montreal, Canada), Amber Simpson (Memorial Sloan Kettering Cancer Center, United States), Vinkle Srivastav (University of Strasbourg, France), Sarina Thomas (German Cancer Research Center (DKFZ), Germany), Liset Vazquez Romaguera (Polytechnique Montreal, Canada), Tong Yu (University of Strasbourg, France). We further thank Tim Rädsch (German Cancer Research Center (DKFZ), Germany) for his contribution to the data annotation section and Alexander Jenke (NCT Dresden, Germany) for his contribution to the data repository table.

This work was supported by the European Research Council (ERC) starting grant COMBIOSCOPY under the New Horizon Framework Programme grant agreement [ERC-2015-StG-637960]; the Helmholtz Imaging Platform (HIP), a platform of the Helmholtz Incubator on Information and Data Science; the NCT Heidelberg; BPI France (project CONDOR); the Johns Hopkins Science of Learning Institute Research Grant; the National Institutes of Health [NIDCR R01 DE025265, P41 EB015902, P41 EB015898, R01 CA235589]; the Surgical Oncology Program of the National Center for Tumor Diseases (NCT) Heidelberg; KARL STORZ SE & Co. KG; the Royal Society (UF140290) and NIHR Imperial BRC (Biomedical Research Centre); the ERC - H2020 Autonomous Robotic Surgery (ARS) grant agreement [ERC-2016-ADG-742671]; the Surgical Metrics Project - American College of Surgeons (National Society Contract); the Quantified Physician - 7-SIGMA Simulation Systems, Minnesota, MN (Industry Contract); the Tourniquet Master Training - DOD SBIR Phase IIb - Continuation Award [W81XWH-13-C-0021]; the Ontology for Human Motion and Psychomotor Performance - Stanford University Media-X; Motion Analysis for Microvascular Anastomosis - University of Wisconsin (Academic Contract); the Precision Learning Initiative - American Medical Association (National Society Grant); Quantifying the Metrics of Surgical Mastery: An Exploration in Data Science (NIH) [R01DK123445]; the Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) [203145Z/16/Z]; Engineering and Physical Sciences Research Council (EPSRC) [EP/P027938/1, EP/R004080/1, EP/P012841/1]; the Royal Academy of Engineering Chair in Emerging Technologies; the St. Michael’s Hospital; the University of Toronto; the Grant-in-Aid for Scientific Research on Innovative Area from MEXT, Japan; the National Cancer Data Ecosystem, contract number 19X037Q under Task Order HHSN26100071 from NCI; the project ProteCT [BMBF 16SV8568]; the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft) as part of Germany’s Excellence Strategy - EXC 2050/1 - Project ID 390696704 - Cluster of Excellence “Centre for Tactile Internet with Human-in-the-Loop” (CeTI); the Canada Research Chair in Computer Integrated Surgery, Natural Sciences and Engineering Research Council of Canada; the ANR with grants ANR-16-CE33-0009 (project Deep-Surg), ANR-10-IAHU-02 (IHU Strasbourg) and ANR-20-CHIA-0029-01 (Chair AI4ORSafety); and the Federal Ministry of Economics and Energy (BMWi) and the German Aerospace Center (DLR) within the OP 4.1 project [BMWI 01MT17001C].

Appendix A. Publicly accessible and annotated surgical data repositories

Table A.1.

List of publicly accessible and annotated surgical data repositories, assigned to the categories (1) robotic minimally-invasive surgery, (2) laparoscopic surgery, (3) endoscopy, (4) microscopic surgery, (5) surgery in sensor-enhanced OR, and (6) other. Note that each repository occurs only once in the table although some categories overlap.

Source Procedure(s)/Activity(ies) Data Source Data Type Reference/Annotation Year
ROBOTIC MINIMALLY-INVASIVE SURGERY
PETRAW multiple training tasks virtual video, kinematics segmentation of instruments/pegs/blocks, phase, steps, activity 2021
SARAS-MESAD prostatectomy in-vivo/ex-vivo, human/phantom video action bounding boxes 2021
SimSurgSkill multiple training tasks virtual video bounding boxes tool, skill 2021
EndoVis-MISAW micro-surgical anastomosis (suturing, knot-tying) in training setting ex-vivo, phantom kinematics, video phase, step, activity 2020
EndoVis-SurgVisDom needle-driving, knot tying, dissection in training setting virtual/ex-vivo phantom/porcine video activity 2020
SARAS-ESAD (Bawa et al., 2021) prostatectomy in-vivo, human video action bounding boxes 2020
EndoVis-Scared (Allan et al., 2021) exploration of abdominal organs ex-vivo, porcine video depth maps, calibration 2019
EndoVis-RobSeg nephrectomy in-vivo, porcine video segmentation of instrument parts, objects, anatomy/tissue 2018
ATLAS Dione (Sarikaya et al., 2017) ball placement, ring peg transfer, suture pass, suture and knot tie, urethrovesical anastomosis ex-vivo, phantom video activity, skill, instrument bounding box 2017
EndoVis-Kidney (Hattab et al., 2020) partial nephrectomy in-vivo, porcine video kidney boundary 2017
EndoVis-RobInstrument (Allan et al., 2019) different porcine procedures in-vivo, porcine video segmentation of instrument parts, instrument type 2017
Nephrec9 (Nakawala, 2017) partial nephrectomy in-vivo, human video phase 2017
EndoAbS (Penza, 2016) exploration abdominal organs ex-vivo, phantom images 3D surface reconstruction, calibration 2016
JIGSAWS (Gao et al., 2014) suturing, knot-tying, needle passing in training setting ex-vivo, phantom kinematics, video activity, skill 2014
Hamlyn Centre Laparoscopic/Endoscopic Video data sets (Stoyanov et al., 2005; Lerotic et al., 2008; Mountney et al., 2010; Pratt et al., 2010; Stoyanov et al., 2010; Giannarou et al.,
2013; Ye et al., 2017)
diverse procedures, e.g. partial nephrectomy, totally endoscopic coronary artery bypass graft, intra-abdominal exploration in-vivo/ex-vivo, hu-man/porcine/phantom video depth maps, calibration, 3D surface reconstruction 2005 – 2017
LAPAROSCOPIC SURGERY
CholecTriplet21 cholecystectomy in-vivo, human images instrument, verb, target 2021
HeiSurf cholecystectomy in-vivo, human images, video segmentation of 23 different structures, phase, action, tool 2021
GLENDA (Leibetseder et al., 2020) laparoscopic gynecology in-vivo, human images, video segmentation of pathol. endometriosis categories, pathology type 2020
EndoVis-ROBUST-MIS (Roß et al., 2021) laparoscopic rectal resection, proctocolectomy in-vivo, human video multi-instance segmentation of instruments 2019
EndoVis-WorkflowAndSkill cholecystectomy in-vivo, human video phase, action, instrument type, skill 2019
LapGyn4 (Leibetseder et al., 2018) gynecologic laparoscopic surgeries in-vivo, human images, video actions, anatomy, instrument count 2018
Cholec80 (Twinanda et al., 2017) cholecystectomy in-vivo, human video phase, instrument type 2017
EndoVis-Workflow laparoscopic colorectal surgery in-vivo, human video, device signals phase, instrument type 2017
TrackVes (Penza et al., 2017) exploration of abdominal organs in-vivo/ex-vivo, human/porcine/goat video 2D polygon around area of interest, attributes of area 2017
m2cai16-tool (Twinanda et al., 2017) cholecystectomy in-vivo, human video instrument type 2016
m2cai16-tool-locations (Jin et al., 2018) cholecystectomy in-vivo, human video instrument bounding box 2016
m2cai16-workflow (Twinanda et al., 2017; Stauder et al., 2017) cholecystectomy in-vivo, human video phase 2016
EndoVis-Instrument laparoscopic colorectal surgery, robotic minimally invasive surgery in-vivo/ex-vivo, human/porcine video, images segmentation of instrument parts and center, 2D pose 2015
Crowd-Instrument (Maier-Hein et al., 2014) laparoscopic adrenalectomy, pancreas resection in-vivo, human images segmentation of instruments 2014
TMI Dataset (Maier-Hein et al., 2014) exploration of abdominal organs ex-vivo, porcine images 3D surface reconstruction, calibration 2014
Laparoscopy Instrument Sequence (Sznitman et al., 2012) cholecystectomy in-vivo, human video instrument center, scale 2012
MICROSCOPIC SURGERY
EndoVis-CATARACTSSemSeg cataract surgery in-vivo, human images segmentation of anatomical structures and instruments 2020
EndoVis-CATARACTSWorkflow cataract surgery in-vivo, human video phase 2020
Cataract-101 (Schoeffmann et al., 2018) cataract surgery in-vivo, human video phase, experience level of surgeon 2018
EndoVis-CATARACTS cataract surgery and surgical tray in-vivo, human video instrument type 2018
NeuroSurgicalTools data set (Bouget et al., 2015) neurosurgery in-vivo, human images instrument bounding polygon, instrument type 2015
Retinal Microsurgery Instrument Tracking (RMIT) (Sznitman et al., 2012) retinal surgery in-vivo, human video instrument center, scale 2012
ENDOSCOPY
AdaptOR endoscopic heart surgery ex-vivo/in-vivo, phantom/human images landmarks in phantoms 2021
EndoCV21 colonoscopy in-vivo, human video bounding box and pixel-wise segmentation of polyps 2021
EndoSLAM (Ozyoruk et al., 2021) standard/capsule endoscopy ex-vivo/synthetic, porcine/phantom images 6 DoF pose, 3D map ground truth 2021
FetReg fetoscopy in-vivo, human images, video segmentation of vessel/tool/fetus, phase, steps, activity 2021
GIANA21 colonoscopy in-vivo, human images, video polyp masks, classification of polyps 2021
Endoscopy Disease Detection and Segmentation (EDD) gastroscopy, gastro-esophageal, colonoscopy in-vivo, human video bounding boxes and segmentation of multi-class disease regions 2020
HyperKvasir (Borgli et al., 2020) gastro- and colonoscopy in-vivo, human images, video anatomical landmarks, pathologies, partially segmentation mask and bounding boxes 2020
Kvasir-Capsule (Smedsrud et al., 2021) capsule endoscopy in-vivo, human images, video anatomical landmarks, quality of mucosal view and pathological findings 2020
Sinus-Surgery-Endoscopic-Image-Datasets (Qin et al., 2020) endoscopic sinus surgery ex-vivo/in-vivo, human images segmentation of instruments 2020
Endoscopic Artefact Detection (EAD) (Ali et al., 2020) gastroscopy, cystoscopy, gastro-esophageal, colonoscopy in-vivo, human video bounding box and segmentation of multi-class artefacts 2019
NBI-InfFrames (Moccia et al., 2018) laryngeal endoscopy in-vivo, human video informative frames 2018
AIDA-E gastrointestinal confocal endoscopy, gastric chromoendoscopy, esophagus microendoscopy in-vivo, human images bounding box of abnormalities 2017
KID (Koulaouzidis et al., 2017) capsule endoscopy in-vivo, human images, video abnormalities 2017
Laryngeal data set (Moccia et al., 2017) laryngeal endoscopy in-vivo, human images patches healthy/cancerous laryngeal tissues 2017
Hamlyn Centre Laparoscopic / Endoscopic Video data sets (Ye et al., 2016) gastrointestinal endoscopy in-vivo, human video bounding box of optical biopsy sites 2016
Kvasir (Pogorelov et al., 2017) gastro- and colonoscopy in-vivo, human images anatomical landmarks, pathologies 2016
EndoVis-GIANA (Bernal et al., 2017) colonoscopy, wireless capsule endoscopy in-vivo, human video, images segmentation and classification of polyps / angiodysplasia / bowel lesions 2015 – 2018
SURGERY IN SENSOR-ENHANCED OR
Multi-View Operating Room (MVOR) (Srivastav et al., 2019) vertebroplasty, lung biopsy in-vivo, human RGB-D human bounding boxes, 2D/3D human body pose key points 2018
xawAR16 (Loy Rodas et al., 2017) experimental setting for radiation awareness in hybrid operating room ex-vivo, phantom RGB-D poses of the moving camera 2016
OTHER
DeepFluoroLabeling-IPCAI2020 (Grupp et al., 2020) fluoroscopy ex-vivo, human images segmentation of hip in CT and fluoroscopy, anat. landmarks 2020
Curious neurosurgery in-vivo, human images MRI images, intra-op. US with labeled anat. landmarks 2019

Appendix B. Surgical Data Science standards & tools

RabbitMQ is a trademark of VMware, Inc. in the U.S. and other countries. Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S. and in other countries. Kibana is a trademark of Elasticsearch BV, registered in the U.S. and in other countries.

Table B.1.

Selected standards relevant to data acquisition, access, storage and communication in SDS.

Standard Organization Stage of interoperability Acceptance in / outside healthcare Purpose
AVRO Apache Software Foundation syntactic rare / widespread Data serialization format, especially for Apache Hadoop
DICOM National Electrical Manufacturers Association syntactic quasi-universal / none Defines usage of medical imaging information
HDF5 HDF Group syntactic rare / occasional Data format
HL7 FHIR Health Level Seven International (HL7) syntactic / semantic widespread / none Focuses on interoperability of electronic health information in healthcare
HL7 Version 2 & 3 (including CDA) Health Level Seven International (HL7) syntactic / semantic widespread / none Defines exchange, integration, distribution and retrieval of electronic health information
IEEE 11073 SDC standard family IEEE, OR.NET e.V. IEEE 11073–20702 syntactic interoperability, IEEE 11073–20701 binding standard IEEE 11073–20702 is based on industry standard DPWS, other substandards occasional / rare Communication protocol for service-oriented medical devices and IT systems
IoT Public consensus syntactic rare / widespread Collective term describing the interconnection of various systems and actors through the Internet with the purpose of providing intelligent services.
JSON Ecma syntactic occasional / widespread Format for data exchange and serialization, especially in REST-APIs
LOINC Regenstrief Institute at Indiana University School of Medicine (IUSM) in Indianapolis + Community semantic widespread / rare Terminology standard for laboratory and clinical measurements, observations and documents
OpenIGTLink primarily supported by the U.S. National Institutes of Health (NIH R01EB020667, PI: Junichi Tokuda) syntactic occasional / rare Enables communication between various systems and devices in the operating room for image-guided therapy
openEHR openEHR International syntactic / semantic widespread / none Architecture used for modelling patient-centric health data and management of electronic health records with a query language and an open API
Protobuf (Protocol buffers) Google syntactic rare / occasional Data format
RDF RDF Working group from the World Wide Web Consortium (W3C) semantic occasional / widespread Data model for describing resources and their relationship to each other
REST Public consensus syntactic occasional / widespread Set of principles for web services
XML XML Working group from the World Wide Web Consortium (W3C), derived from SGML (ISO 8879) syntactic / semantic widespread / universal Data serialization format for textual information.

Table B.2.

Selected tools relevant to data acquisition, access, storage and communication in SDS.

Tool Organization Acceptance in / outside healthcare Purpose
Amazon AWS Amazon Web Services Inc., Amazon occasional / widespread Cloud Computing
Apache Kafka Apache Software Foundation rare / widespread Streaming platform for message distribution
Docker® Docker Inc. rare / widespread Tool for building software packages, called containers
Docker® Swarm Docker Inc. rare / widespread Orchestration tool for Docker containers
Elasticsearch Elastic rare / occasional Search and Analytics Engine
Google Cloud Platform Google Inc. occasional / widespread Cloud Computing
Hadoop® Apache Software Foundation occasional / occasional Framework for distributed computing
Kibana Elastic rare / occasional Dashboard for data visualization
Kubernetes® Cloud Native Computing Foundation rare / widespread Orchestration tool for Docker containers
LevelDB Google Inc. rare / occasional Key-value storage
Microsoft Azure Microsoft Corporation occasional / widespread Cloud Computing
RabbitMQ® Pivotal Software rare / widespread Message broker
ROS Community occasional / occasional Framework with a set of libraries and tools for robot applications

Table B.3.

Disciplines that intersect with SDS and representative software tools that are commonly used in each discipline.

Discipline Representative software tools
Classical statistics R, Python scipy.stats, Python statsmodels, MATLAB Statistics and Machine Learning Toolbox
General machine learning Python scikit-learn, Python statsmodels, MATLAB Statistics and Machine Learning Toolbox
Deep learning Frameworks: TensorFlow (including Keras), PyTorch, Caffe, Apache MXNet, Microsoft Cognitive Toolkit (CNTK), MATLAB Deep Learning Toolbox, OpenCV, NVIDIA Clara, DLTK, NiftyNet, fastai Pre-trained model repositories: Model Zoo, ONNX Model Zoo, TensorFlow Model Garden, torchvision models
Data visualization Python Matplotlib, Python seaborn, MATLAB
Medical image processing and visualization VTK, ITK, ITK-SNAP, 3D Slicer, MITK Visualization tools survey: Haak et al. (2016)
Classical computer vision OpenCV, PCL, VLFeat, MATLAB Computer Vision Toolbox
Natural language processing Python NLTK and spaCy, PyTorch-NLP, Google Cloud Natural Language, Amazon Comprehend
Signal processing Python scipy.signal, MATLAB Signal Processing Toolbox
Surgical simulation SOFA, iMSTK, OpenSurgSim
Surgery navigation / Augmented Reality SlicerIGT, ImFusion Suite
Robotics ROS
Software engineering Git, Docker, Jupyter Notebook, Data Version Control (DVC)

Appendix C. Surgical Data Science annotation tools & services

Table C.1.

Selection of annotation tools for spatial, spatio-temporal and temporal annotations.

Tool Data type Ontology integration Automatic annotation tools
Spatial annotation
3D Slicer Images - Plugins for AI-assisted annotation
DeepLabel Images - Automatic tagging
LabelMe Images - -
Make Sense Images - Semi-automatic bounding box annotation, detection
MITK Images - Plugins for AI-assisted annotation
NVIDIA Clara Imaging Images - Semi-automatic segmentation + interactive mode
Pixel Annotation Tool Images - Watershed segmentation
Semantic Segmentation Editor Images, point clouds - Polygon (automatic option)
EXACT (Marzahl et al., 2021) Images - Version control system, collaborative
Spatio-temporal annotation
Amazon SageMaker Ground Truth Images, videos, 3D point clouds, text - Interactive mode, semi-automated labeling
CVAT Images, videos - Semi-automatic segmentation, detection, collaborative
SuperAnnotate Desktop Images, videos - Active learning, interactive mode
UltimateLabeling Videos - Semi-automatic detection + tracking
VATIC Videos - Optical flow, crowdsourcing
VoTT Images, videos - Automatic object detection
Temporal annotation
ANVIL Videos, audio - -
b <> com Surgery Workflow Toolbox Videos yes -
Observer XT Multimodal - -
s.w.an Videos yes -

Table C.2.

Leading companies providing data set annotations with managed human workforces.

Company Domain
Alegion, Inc. (Austin, TX, US) General computer vision
Appen Ltd (Chatswood, NSW, Australia) General computer vision
CloudFactory Ltd (Richmond, UK) General computer vision
Cogito Tech LLC (New York, NY, US) General computer vision
General Blockchain, Inc. (San Jose, CA, US) General computer vision
Samasource Impact Solutions, Inc. (San Francisco, CA, US) General computer vision
Scale AI, Inc. (San Francisco, CA, US) General computer vision
CapeStart, Inc. (Cambridge, MA, US) Medical imaging
Edgecase AI LLC (Hingham, MA, US) Specialized computer vision & medical imaging
iMerit Technology Services Pvt Ltd (Kolkata, West Bengal, India) Specialized computer vision & medical imaging
Infolks Pvt Ltd (Mannarkkad, Kerala, India) Specialized computer vision & medical imaging
Labelbox, Inc. (San Francisco, CA, US) Specialized computer vision & medical imaging
Steldia Services Ltd (Limassol Agios Athanasios, Cyprus) Specialized computer vision & medical imaging
SuperAnnotate LLC (Sunnyvale, CA, US) Specialized computer vision & medical imaging
Telus International (Vancouver, BC, CA) Specialized computer vision & medical imaging

Appendix D. Published SDS clinical studies - perioperative

Table D.1.

Selection of perioperative SDS clinical studies. Searches were performed in June 2021 using [machine learning] AND [surgery] or [deep learning] AND [surgery] or [artificial intelligence] AND [surgery] or [decision support] AND [surgery] or [surgical data science] AND [clinical] in PubMed and Google. Search results were manually evaluated and all studies that analyzed a perioperative SDS system with a machine learning (ML)-based component were included.

Publication Subject Type of study Study size (# patients)
Bahl et al. (2017) ML-based prediction of pathological upgrade of high-risk breast lesions and reduction of unnecessary surgical excision based on data such as histologic results and text features from pathologic reports Retrospective cohort 986
Corey et al. (2018) ML-based prediction of postoperative complication risk in surgical patients based on electronic health record data Prospective cohort 66,370
De Silva et al. (2020) ML-based prediction models for postoperative outcomes of lumbar spine surgery based on image features and patient characteristics Retrospective cohort 64
Duke University (2016) ML-based clinical analytical platform for predicting risk of surgical complications and improving surgical outcomes based on patient care parameters Prospective cohort 200
Futoma et al. (2017) ML-based sepsis prediction based on clinical patient data over time Prospective cohort 51,697
Hyland et al. (2020) ML-based early prediction of circulatory failure in the intensive care unit based on physiological (clinical and laboratory) measurements from multiple organ systems Prospective cohort 36,098
Komorowski et al. (2018) ML-based identification of optimal treatment strategies for sepsis in intensive care based on laboratory and clinical patient data Prospective cohort 17,083
Mai et al. (2020) ML-based preoperative prediction of severe liver failure after hemihepatectomy in hepatocellular carcinoma patients based on laboratory and clinical parameters Prospective cohort 353
Marcus et al. (2020) ML-based prediction of surgical resectability in patients with glioblastoma based on preoperative MRI imaging Retrospective cohort 135
Mascagni et al. (2021b) ML-based detection of critical moments in laparoscopic cholecystectomy videos for selective video documentation Cross-sectional n/a (155 videos)
Meyer et al. (2018) ML-based real-time prediction of severe complications in post-cardiosurgical critical care based on electronic health record data Prospective cohort 42,007
Tomašev et al. (2019) ML-based prediction of future acute kidney injury based on electronic health records Prospective cohort 703,782
Vijayan et al. (2019) ML-based automatic pedicle screw planning in cone-beam guided spine surgery based on CT imaging data Cross-sectional 40

Appendix E. Registered SDS clinical studies

Table E.1.

Registered SDS clinical studies at ClinicalTrials.gov as of October 2020. Searches were performed using the following keywords: [machine learning] AND [surgery] or [deep learning] AND [surgery] or [artificial intelligence] AND [surgery] or [decision support] AND [surgery] or [data science] AND [surgery] or [surgical data science]. Search results were manually evaluated and all studies were included that either test an SDS system or component, or collect data to create and test an SDS system or component. ID is the ClinicalTrials.gov identifier.

Study summary Patient data Study type Period # Participants Locations
PREOPERATIVE APPLICATIONS
Evaluation of an ML-based CDSS to help decide if a patient should undergo hip or knee replacement surgery based on functional and health related quality of life (HRQoL) changes. ID: NCT04332055 Preoperative patient questionnaire Interventional, randomized, single-center Oct. 2020 – Oct. 2025 600 Northern Orthopaedic Division, Clinic Farsø, Aalborg University Hospital, Farsø, Northern Jutland, Denmark
Evaluation of an ML-based CDSS (IBM Watson) for hepatocellular carcinoma treatment, prognosis and assessment of surgical resection risk with radiomics. ID: NCT03917017 Preoperative abdominal images and radiomic parameters Interventional, non-randomized, single-center Jan. 2019 – Dec. 2024 100 Zhujiang Hospital of Southern Medical University, Guangzhou, Guangdong, China
Evaluation of an ML-based CDSS to predict ST-segment elevation myocardial infarction (STEMI). ID: NCT03317691 Preoperative ECG Observational, retrospective, single-center Oct. 2017 – Oct. 2018 2,000 Shanghai Tenth People’s Hospital, Shanghai, China
Evaluation of an ML-based CDSS to help assess risk of refractive eye surgery complications from corneal ectasia. ID: NCT04313387 Preoperative corneal tomography parameters Observational, retrospective, single-center Jan. 2012 – Jan. 2018 558 Visum Eye Center, São José do Rio Preto Medical School, São José do Rio Preto, Brazil
Data collection and creation of an ML-based CDSS to detect if a patient has an airway that increases risk of anesthesia related injury. ID: NCT04458220 Preoperative 3D face scans in different positions and from different angles Observational, retrospective, single-center Jul. 2020 – May 2023 4,000 The Ninth People’s Hospital of Shanghai Jiaotong University School of Medicine, Shanghai, China
Data collection and creation of an ML-based CDSS to predict total knee arthroplasty (TKA) surgery outcome. ID: NCT03894514 Demographic, psychosocial and preoperative clinical parameters from the EHR Observational, prospective, single-center May 2019 – May 2020 150 The University of Valencia, Valencia, Spain
Data collection and creation of an ML-based CDSS to assess risk and treatment strategy of patients with acute coronary syndromes in emergency departments. ID: NCT03286491 Unspecified Observational, prospective, single-center Aug. 2017 – Feb. 2018 400 Izmir Bozyaka Training and Research Hospital, Izmir, Turkey
Data collection and creation of an ML-based CDSS to detect if a patient has an airway that increases risk of anesthesia related injury. ID: NCT03125837 Preoperative digital photographs in different positions and from different angles Observational, prospective, single-center May 2017 – May 2022 50,000 School of Medicine, Zhejiang University, Hangzhou, China
Data collection and creation of an ML-based CDSS to predict pain response, opioid response and morphine usage requirements in pediatric patients requiring surgery, using electronic health record and genetic data. ID: NCT01140724 Genetic Observational, prospective, multi-center Apr. 2008 – Aug. 2021 1,200 Children’s Hospital Medical Center, Cincinnati, Ohio, United States
Data collection and creation of an ML-based CDSS to assess patient risk of elective heart valve surgery. ID: NCT03724123 Demographic and preoperative clinical parameters from the EHR Observational, retrospective, single-center Jan. 2008 – Dec. 2014 2,229 Kepler University Hospital, Linz, Austria
INTRAOPERATIVE APPLICATIONS
Evaluation of an ML-based CDSS (Edwards Hemosphere platform) to detect and prevent arterial hypotension during abdominal surgery with the Hypotension Prediction Index (HPI) using the FloTrac system. ID: NCT04301102 Intraoperative hemodynamic parameters Interventional, randomized, multi-center Sep. 2020 – May 2021 80 Hospital de Jerez de la Frontera, Cádiz, Spain
Evaluation of an ML-based CDSS (Edwards Hemosphere platform) to detect and prevent arterial hypotension during lung surgery with the Hypotension Prediction Index (HPI) using the FloTrac system. ID: NCT04149314 Intraoperative hemodynamic parameters Interventional, randomized, single-center Nov. 2019 – Dec. 2022 150 University of Giessen, Giessen, Germany
Evaluation of an ML-based CDSS (AlertWatch Anesthesia Control Tower) to support risk assessment for the anesthesiology team. ID: NCT03923699 Physiological parameters, EHR, anesthesia machine parameters, laboratory results Interventional, randomized, single-center Jul. 2019 – Jul. 2024 40,000 Washington University School of Medicine, Saint Louis, Missouri, United States
Evaluation of an ML-based CDSS to detect intraoperative hypotension, using blood pressure (Nexfin finger cuff). ID: NCT03533205 Intraoperative hemodynamic parameters (blood pressure) Observational, prospective, single-center Apr. 2015 – Apr. 2018 507 The Academic Medical Center, The University of Amsterdam, Amsterdam, Netherlands
Data collection and creation of an ML-based CDSS to recognize healthy and abnormal tissue characteristics in abdominal surgery. ID: NCT04589884 Intraoperative hyperspectral images (HSI) Observational, prospective, single-center Sep. 2020 – Oct. 2024 600 The Digestive and endocrine surgery service, NHC, Strasbourg, France
Multi-objective data collection of colorectal cancer surgery videos and biopsy samples for developing ML-based systems. ID: NCT04220242 Colorectal surgery videos and tissue microsections Observational, prospective and retrospective, multi-center Dec. 2019 – Dec. 2022 250 The Mater Misericordiae University Hospital, Dublin, Ireland
Data collection and creation of an ML-based CDSS to detect cerebral ischemia and reperfusion during cardiac surgery. ID: NCT03919370 Intraoperative hemodynamic and cerebral oxygenation parameters Observational, prospective, single-center Dec. 2019 – Dec. 2022 10 Sahlgrenska University Hospital, Gothenburg, Sweden
Data collection and creation of an ML-based CDSS to predict postoperative outcomes (mortality, morbidity, Intensive Care Unit admission, length of hospital stay, and hospital readmission). ID: NCT04014010 Intraoperative hemodynamic parameters (blood pressure, heart rate), oxygen level, carbon dioxide level and hemodynamic medication records Observational, retrospective, single-center Jan. 2013 – Dec. 2017 35,000 Nova Scotia Health Authority Queen Elizabeth II hospitals, Halifax, Canada
POSTOPERATIVE APPLICATIONS
Evaluation of an ML-based CDSS for real-time vasoactive and inotropic support de-escalation in pediatric patients following cardiac surgery. ID: NCT04600700 Postoperative blood oxygenation parameters (the inadequate oxygen delivery index) Observational, retrospective, single-center Jan. 2021 – Mar. 2022 250 Boston Children’s Hospital, Boston, United States
Evaluation of a gait monitoring system with ML components (GaitSmart) to detect gait deficiencies after total hip or knee replacement surgery, and detect differences from different rehabilitation programs. ID: NCT04289025 Postoperative gait parameters from inertial measurement units (IMUs) Interventional, randomized, single-center Jan. 2021 – Mar. 2021 100 Norfolk and Norwich University Hospital, Norwich, Norfolk, United Kingdom
Evaluation of an ML-based CDSS (AlertWatch Anesthesia Control Tower) for risk forecasting immediately after surgery with telemedicine notifications. ID: NCT03974828 Physiological parameters, EHR, anesthesia machine parameters, laboratory results Interventional, randomized, single-center Nov. 2020 – Jan. 2024 3,375 Washington University School of Medicine, St. Louis, Missouri, United States
Evaluation of an ML-based system (Caption Health/Caption AI) to improve cardiac ultrasound image standardization and quality after surgery (step down unit). ID: NCT04203251 Postoperative cardiac ultrasound Observational, prospective, single-center Mar. 2020 – May 2020 100 University of California San Francisco, San Francisco, California, United States
Evaluation of an at-home ML-based postoperative monitoring system (Smart Angel, 2020) to reduce unplanned recourse. ID: NCT04068584 Postoperative hemodynamic, blood oxygenation and well-being parameters (pain, nausea, vomiting, comfort) Interventional, randomized, multi-center Feb. 2020 – Aug. 2021 1,260 Nîmes University Hospital Centre, Nîmes, France
Evaluation of an ML-based CDSS (CALYPSO) that creates personalized risk predictions to reduce postoperative complications. ID: NCT02828475 Unspecified Observational, prospective, single-center Jun. 2016 – Jan. 2017 200 Duke University Medical Center, Durham, North Carolina, United States
Evaluation of an ML-based CDSS to help manage postoperative cataract surgery patients. ID: NCT04138771 Postoperative visual acuity parameters, intraocular pressure parameters and slit-lamp images Interventional, single-center Jan. 2013 – Mar. 2020 300 Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, Guangdong, China
Data collection and creation of an ML-based CDSS to predict postoperative respiratory failure within 7 days. ID: NCT04527094 Pre- and intraoperative EHR Observational, prospective, single-center Nov. 2020 – Aug. 2021 8,000 Seoul National University Hospital, Seoul, Republic of Korea
Data collection and creation of an ML-based CDSS to predict postoperative outcomes after vascular stent placement using data from a wearable device (ECG bracelet). ID: NCT04455568 Postoperative ECG Observational, prospective, multi-center Jul. 2020 – Jul. 2024 400 Taipei Medical University Shuang Ho Hospital, New Taipei City, Taiwan
Data collection and creation of an ML-based system to compute continuous blood pressure of patients in surgical intensive care non-invasively, using a wearable blood pressure measuring device and a patient monitor (IntelliVue MX700, Philips). ID: NCT04261062 Postoperative hemodynamics (blood pressure) Observational, prospective, single-center May 2020 – Jan. 2022 220 Yonsei University College of Medicine, Department of Anesthesiology and Pain Medicine, Seoul, Republic of Korea
Data collection and creation of an ML-based CDSS to detect and predict opioid induced respiratory compromise (OIRC) events in postoperative pain management. ID: NCT03968094 EHR and postoperative blood oxygenation, ventilation and transcutaneous PCO2 parameters Observational, prospective, single-center Jun. 2019 – Mar. 2020 50 Buffalo General Medical Center, Buffalo, New York, United States
Data collection and creation of an ML-based CDSS to assess postoperative glioblastoma surgery images to distinguish progression from pseudo-progression. ID: NCT04359745 Preoperative and postoperative MRI Observational, prospective, multi-center Mar. 2019 – May 2023 500 Guy’s and St Thomas’ NHS Foundation Trust and King’s College, London, United Kingdom
Data collection and creation of an ML-based CDSS to predict kidney injury after hyperthermic intraperitoneal chemotherapy (HIPEC). ID: NCT03895606 Preoperative and intraoperative physiological parameters including hemodynamics, blood oxygenation, body temperature, cardiac index and stroke volume variation Observational, prospective, single-center Mar. 2019 – Mar. 2020 57 Gangnam Severance Hospital, Seoul, Republic of Korea
Data collection and creation of an ML-based CDSS to predict risk of readmission following discharge after cardiovascular surgery, using data from a wearable device (Snap40 Monitor). ID: NCT03800329 Postoperative hemodynamic, blood oxygenation, respiration, body temperature and movement parameters Interventional, randomized, single-center Mar. 2018 – Mar. 2021 100 Mayo Clinic in Rochester, Rochester, Minnesota, United States
MULTI-STAGE/OTHER APPLICATIONS
Evaluation of a CDSS (Digital Surgery GoSurgery) with ML components for OR workflow assistance and analytics. ID: NCT03955614 Surgery workflow and OR video Interventional, non-randomized, multi-center Oct. 2019 – Oct. 2020 150 Imperial College Hospitals NHS Trust, London, United Kingdom
Evaluation of an ML-based CDSS to predict motor response after subthalamic nucleus deep brain stimulation (STN DBS) therapy in Parkinson patients. ID: NCT04093908 Demographic, clinical and postoperative UPDRS variables Observational, retrospective, multi-center Aug. 2019 – Dec. 2019 322 Maastricht UMC, Maastricht, Limburg, Netherlands
Evaluation of an ML-based CDSS (Kia et al., 2020) to predict if a hospitalized patient requires care escalation within 6 hours. ID: NCT04026555 Admission discharge transfer (ADT) events, structured clinical assessments (e.g. nursing notes), physiological parameters, ECG and laboratory results Interventional, non-randomized, single-center Jun. 2019 – Mar. 2020 2,915 Mount Sinai Hospital, New York, New York, United States
Evaluation of an ML-based CDSS to help report and monitor patients before and after total knee arthroplasty (TKA), using data from a wearable device (unspecified). ID: NCT03406455 Preoperative and postoperative physical activity parameters including step counting and knee range-of-motion Observational, prospective, single-center Jul. 2018 – May 2019 25 Cleveland Clinic, Cleveland, Ohio, United States
Evaluation of a deep brain stimulation surgery navigation system (Surgical Information Sciences) with ML components for enhanced image visualization. ID: NCT02902328 Preoperative MRI Observational, prospective, single-center Mar. 2016 – Sep. 2016 30 Surgical Information Sciences Inc., Minneapolis, Minnesota, United States
Data collection and creation of an ML-based system for early sepsis detection for patients in ICUs including surgical ICUs. ID: NCT04130789 ICU device parameters, microbiology parameters and laboratory results Observational, prospective, multi-center Nov. 2019 – Jun. 2023 17,500 Clinical Microbiology, University Hospital Basel, Basel, Switzerland
Data collection and creation of an ML-based CDSS to predict liver transplant (LT) complication risk using microbial flora data at pre-LT, early post-LT and late post-LT timepoints. ID: NCT03666312 Preoperative and intraoperative microbial flora parameters Observational, prospective, multi-center Sep. 2019 – Aug. 2021 275 IRCCS San Raffaele, Milan, Italy
Multi-objective data collection to create and evaluate ML-based systems for liver volume assessment before and after surgery, and liver lesion detection. ID: NCT03960710 Preoperative and postoperative CT images Observational, retrospective, single-center Apr. 2019 – Sep. 2019 120 Radiology service, Imaging research unit, Edouard Herriot Hospital, Lyon, France
Data collection and creation of an ML-based CDSS to predict risk of postoperative cognitive complications. ID: NCT03175302 Preoperative digital cognitive testing data Observational, prospective, single-center Jun. 2018 – Aug. 2021 25,240 University of Florida, Gainesville, Florida, United States
Data collection and creation of an ML-based CDSS to predict risk of postoperative complications (Clavien-Dindo score). ID: NCT04092933 Patient Data Management System (PDMS) data including physiological parameters (vitals and respiratory), medication, intraoperative events and times Observational, retrospective, single-center May 2014 – Feb. 2022 109,000 The Technical University of Munich, Munich, Germany
Data collection and creation of an ML-based CDSS to predict postoperative acute renal failure after liver resection. ID: NCT01318798 Preoperative and intraoperative physiological data (unspecified) Observational, retrospective, single-center Jan. 2010 – Apr. 2012 549 University Hospital of Zurich, Department of Visceral and Transplantation Surgery, Zurich, Switzerland
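
The keyword combination described in the caption of Table E.1 can be written out as a single Boolean expression. The following is only a minimal sketch: the keyword list is taken from the caption, whereas the function name `build_query`, the quoting style and the parenthesization are illustrative assumptions and do not reflect the exact syntax accepted by ClinicalTrials.gov or the tooling used for the original search.

```python
from typing import List

# Keywords quoted from the Table E.1 caption. Everything else in this
# sketch (names, quoting, parentheses) is a hypothetical illustration.
PAIRED_TERMS = [
    "machine learning",
    "deep learning",
    "artificial intelligence",
    "decision support",
    "data science",
]
STANDALONE_TERMS = ["surgical data science"]


def build_query(paired: List[str], standalone: List[str],
                anchor: str = "surgery") -> str:
    """Join '(term AND anchor)' clauses and standalone terms with OR."""
    clauses = [f'("{term}" AND "{anchor}")' for term in paired]
    clauses += [f'"{term}"' for term in standalone]
    return " OR ".join(clauses)


query = build_query(PAIRED_TERMS, STANDALONE_TERMS)
print(query)
```

Keeping the paired and standalone terms in separate lists makes the structure of the search explicit; the hits returned by such a query would still need the manual screening described in the caption.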

Appendix F. Stakeholder importance

Importance of stakeholders as determined in the Delphi process.

Appendix References

  1. 3D Slicer. 3D Slicer image computing platform. URL: https://www.slicer.org/. Accessed: 2020-08-31.
  2. AdaptOR. Deep generative model challenge for domain adaptation in surgery. URL: https://adaptor2021.github.io/. Accessed: 2021-06-29.
  3. AIDA-E. Analysis of images to detect abnormalities in endoscopy (aida-e). URL: https://isbiaida.grand-challenge.org/. Accessed: 2021-06-23.
  4. Alegion, Inc. (Austin, TX, US). Alegion | Data Labeling Software Platform. URL: https://www.alegion.com/. Accessed: 2020-08-15.
  5. AlertWatch Anesthesia Control Tower. Home | The Future of Patient Monitoring | AlertWatch, Inc. URL: https://www.alertwatch.com/. Accessed: 2020-10-29.
  6. Ali S, Zhou F, Braden B, Bailey A, Yang S, Cheng G, Zhang P, Li X, Kayser M, Soberanis-Mukul RD, et al., 2020. An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy. Scientific reports 10, 1–15.
  7. Allan M, Mcleod J, Wang C, Rosenthal JC, Hu Z, Gard N, Eisert P, Fu KX, Zeffiro T, Xia W, Zhu Z, Luo H, Jia F, Zhang X, Li X, Sharan L, Kurmann T, Schmid S, Sznitman R, Psychogyios D, Azizian M, Stoyanov D, Maier-Hein L, Speidel S, 2021. Stereo Correspondence and Reconstruction of Endoscopic Data Challenge. arXiv:2101.01133 [cs]. URL: http://arxiv.org/abs/2101.01133.
  8. Allan M, Shvets A, Kurmann T, Zhang Z, Duggal R, Su YH, Rieke N, Laina I, Kalavakonda N, Bodenstedt S, Herrera L, Li W, Iglovikov V, Luo H, Yang J, Stoyanov D, Maier-Hein L, Speidel S, Azizian M, 2019. 2017 Robotic Instrument Segmentation Challenge. arXiv:1902.06426 [cs]. URL: http://arxiv.org/abs/1902.06426.
  9. Amazon Comprehend. Amazon Comprehend - Natural Language Processing (NLP) and Machine Learning (ML). URL: https://aws.amazon.com/comprehend/. Accessed: 2020-10-20.
  10. Amazon SageMaker Ground Truth. Amazon SageMaker Ground Truth | AWS. URL: https://aws.amazon.com/sagemaker/groundtruth/. Accessed: 2020-10-22.
  11. ANVIL. ANVIL: The Video Annotation Research Tool. URL: https://www.anvil-software.org/. Accessed: 2020-08-31.
  12. Apache MXNet. Apache MXNet - A flexible and efficient library for deep learning. URL: https://mxnet.apache.org/. Accessed: 2020-08-13.
  13. Appen Ltd (Chatswood, NSW, Australia). Confidence to Deploy AI with World-Class Training Data. URL: https://appen.com/. Accessed: 2020-08-15.
  14. Bawa VS, Singh G, KapingA F, Skarga-Bandurova I, Oleari E, Leporini A, Landolfo C, Zhao P, Xiang X, Luo G, et al., 2021. The saras endoscopic surgeon action detection (esad) dataset: Challenges and methods. arXiv preprint arXiv:2104.03178.
  15. Bernal J, Tajkbaksh N, Sánchez FJ, Matuszewski BJ, Chen H, Yu L, Angermann Q, Romain O, Rustad B, Balasingham I, et al., 2017. Comparative validation of polyp detection methods in video colonoscopy: results from the miccai 2015 endoscopic vision challenge. IEEE transactions on medical imaging 36, 1231–1249.
  16. Borgli H, Thambawita V, Smedsrud PH, Hicks S, Jha D, Eskeland SL, Randel KR, Pogorelov K, Lux M, Nguyen DTD, et al., 2020. Hyperkvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Scientific Data 7, 1–14.
  17. Bouget D, Benenson R, Omran M, Riffaud L, Schiele B, Jannin P, 2015. Detecting surgical tools by modelling local appearance and global shape. IEEE transactions on medical imaging 34, 2603–2617.
      bcom Surgery Workflow Toolbox. bcom *Surgery Workflow Toolbox* [Annotate]. URL: https://b-com.com/en/bcom-surgery-workflow-toolbox-annotate. Accessed: 2020-08-31.
  18. Caffe. Caffe | Deep Learning Framework. URL: http://caffe.berkeleyvision.org/. Accessed: 2020-08-13.
  19. CapeStart, Inc. (Cambridge, MA, US). Medical Image Annotation Experts. URL: https://www.capestart.com/services/data-preparation/medical-imageannotation/. Accessed: 2020-08-15.
  20. Caption Health/Caption AI. Products – Caption Health. URL: https://captionhealth.com/products/caption-ai/. Accessed: 2020-10-29.
  21. CholecTriplet21. Surgical action triplet recognition. URL: https://cholectriplet2021.grand-challenge.org/CholecTriplet2021/. Accessed: 2021-06-30.
  22. CloudFactory Ltd (Richmond, UK). Data labeling for Machine Learning, AI & More | CloudFactory. URL: https://www.cloudfactory.com. Accessed: 2020-08-15.
  23. Cogito Tech LLC (New York, NY, US). Training Data for AI | Data Enrichment Outsourcing Services - Cogito. URL: https://www.cogitotech.com/. Accessed: 2020-08-15.
  24. Curious. Curious: Correction of brain shift with intra-operative ultrasound. URL: https://curious2019.grand-challenge.org/. Accessed: 2021-06-27.
  25. CVAT. Powerful and efficient Computer Vision Annotation Tool (CVAT). URL: https://github.com/opencv/cvat. Accessed: 2020-08-31.
  26. Data Version Control (DVC). Data Version Control DVC - Open-source Version Control System for Machine Learning Projects. URL: https://dvc.org/. Accessed: 2020-10-20.
  27. DeepLabel. A cross-platform image annotation tool for machine learning. URL: https://github.com/jveitchmichaelis/deeplabel. Accessed: 2020-08-31.
  28. DLTK. DLTK - Deep Learning Toolkit. URL: https://dltk.github.io/. Accessed: 2020-10-20.
  29. Docker. Empowering App Development for Developers | Docker. URL: https://www.docker.com/. Accessed: 2020-10-20.
  30. Edgecase AI LLC (Hingham, MA, US). Data Labeling, Training Data for AI & ML - Edgecase. URL: https://www.edgecase.ai/. Accessed: 2020-08-15.
  31. Edwards Hemosphere platform. HemoSphere advanced monitoring platform | Edwards Lifesciences. URL: https://www.edwards.com/devices/hemodynamicmonitoring/hemosphere. Accessed: 2020-10-29.
  32. EndoCV21. Addressing generalisability in polyp detection and segmentation. URL: https://endocv2021.grand-challenge.org/EndoCV2021/. Accessed: 2021-06-29.
  33. Endoscopy Disease Detection and Segmentation (EDD). EDD2020 - Grand Challenge. URL: https://edd2020.grand-challenge.org/. Accessed: 2020-08-31.
  34. EndoVis-CATARACTS. EndoVis CATARACTS - Grand Challenge. URL: https://cataracts2018.grand-challenge.org/. Accessed: 2020-08-31.
  35. EndoVis-CATARACTS-SemSeg. EndoVis CATARACTS Semantic Segmentation sub-challenge - Grand Challenge. URL: https://cataracts-semanticsegmentation2020.grand-challenge.org/. Accessed: 2020-08-31.
  36. EndoVis-CATARACTS-Workflow. EndoVis CATARACTS Workflow Analysis - Grand Challenge. URL: https://cataracts2020.grand-challenge.org/. Accessed: 2020-08-31.
  37. EndoVis-GIANA. Gastrointestinal Image ANAlysis challenge - Grand Challenge. URL: https://giana.grand-challenge.org/. Accessed: 2020-08-31.
  38. EndoVis-Instrument. EndoVisSub-Instrument - Grand Challenge. URL: https://endovissub-instrument.grand-challenge.org/. Accessed: 2020-08-31.
  39. EndoVis-MISAW. MIcro-Surgical Anastomose Workflow recognition on training sessions (MISAW) - syn21776936. URL: https://www.synapse.org/@Synapse:syn21776936/wiki/601700. Accessed: 2020-08-31.
  40. EndoVis-RobSeg. EndoVis Robotic Scene Segmentation Sub-Challenge - Grand Challenge. URL: https://endovissub2018-roboticscenesegmentation.grandchallenge.org/. Accessed: 2020-08-31.
  41. EndoVis-SurgVisDom. EndoVis Surgical Visual Domain Adaptation - Grand Challenge. URL: https://surgvisdom.grand-challenge.org/. Accessed: 2020-08-31.
  42. EndoVis-Workflow. EndoVisSub-Workflow - Grand Challenge. URL: https://endovissub2017-workflow.grand-challenge.org/. Accessed: 2020-08-31.
  43. EndoVis-WorkflowAndSkill. EndoVisSub-WorkflowAndSkill - Grand Challenge. URL: https://endovissub-workflowandskill.grand-challenge.org/.
      fastai. Welcome to fastai. URL: https://docs.fast.ai. Accessed: 2020-10-26.
  44. FetReg. FetReg - Placental Vessel Segmentation and Registration in Fetoscopy 2021. URL: https://www.synapse.org/@Synapse:syn25313156/wiki/. Accessed: 2021-06-30.
  45. FloTrac system. FloTrac system | Edwards Lifesciences. URL: https://www.edwards.com/gb/devices/Hemodynamic-Monitoring/FloTrac. Accessed: 2020-10-29.
  46. GaitSmart. Digital Gait Analysis System. URL: https://www.gaitsmart.com/. Accessed: 2020-10-29.
  47. Gao Y, Vedula SS, Reiley CE, Ahmidi N, Varadarajan B, Lin HC, Tao L, Zappella L, Béjar B, Yuh DD, Chen CCG, Vidal R, Khudanpur S, Hager GD, 2014. JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling.
  48. General Blockchain, Inc. (San Jose, CA, US). Image Annotation Services - Image tagging services for Computer Vision. URL: https://www.imageannotation.ai/. Accessed: 2020-08-15.
  49. GIANA21. Gastrointestinal image analysis. URL: https://giana.grand-challenge.org/. Accessed: 2021-06-30.
  50. Giannarou S, Visentini-Scarzanella M, Yang GZ, 2013. Probabilistic Tracking of Affine-Invariant Anisotropic Regions. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 130–143. doi: 10.1109/TPAMI.2012.81.
      Git. Git. URL: https://git-scm.com/. Accessed: 2020-10-20.
  51. Google Cloud Natural Language. Cloud Natural Language. URL: https://cloud.google.com/natural-language. Accessed: 2020-10-20.
  52. Grupp RB, Unberath M, Gao C, Hegeman RA, Murphy RJ, Alexander CP, Otake Y, McArthur BA, Armand M, Taylor RH, 2020. Automatic annotation of hip anatomy in fluoroscopy for robust and efficient 2d/3d registration. International journal of computer assisted radiology and surgery 15, 759–769.
  53. Haak D, Page CE, Deserno TM, 2016. A Survey of DICOM Viewer Software to Integrate Clinical Research and Medical Imaging. Journal of Digital Imaging 29, 206–215. doi: 10.1007/s10278-015-9833-1.
  54. Hattab G, Arnold M, Strenger L, Allan M, Arsentjeva D, Gold O, Simpfendörfer T, Maier-Hein L, Speidel S, 2020. Kidney edge detection in laparoscopic image data for computer-assisted surgery. International Journal of Computer Assisted Radiology and Surgery 15, 379–387.
  55. HeiSurf. Surgical workflow analysis and full scene segmentation. URL: https://www.synapse.org/@Synapse:syn25101790/wiki/. Accessed: 2021-06-30.
      iMerit Technology Services Pvt Ltd (Kolkata, West Bengal, India). iMerit. URL: https://imerit.net/. Accessed: 2020-08-15.
  56. ImFusion Suite. ImFusion - ImFusion Suite. URL: https://www.imfusion.com/products/imfusion-suite. Accessed: 2020-10-20.
      iMSTK. iMSTK | Interactive Medical Simulation Toolkit. URL: https://www.imstk.org/. Accessed: 2020-10-20.
  57. Infolks Pvt Ltd (Mannarkkad, Kerala, India). Image Annotation for Machine Learning | INFOLKS. URL: https://infolks.info/. Accessed: 2020-10-28.
  58. IntelliVue MX700, Philips. IntelliVue MX700 Bedside patient monitor | Philips Healthcare. URL: https://www.usa.philips.com/healthcare/product/HC865241/intellivue-mx700-patient-monitor. Accessed: 2020-10-27.
  59. ITK. ITK | Insight Toolkit. URL: https://itk.org/. Accessed: 2020-08-05.
  60. ITK-SNAP. ITK-SNAP Home. URL: http://www.itksnap.org/pmwiki/pmwiki.php. Accessed: 2020-10-20.
  61. Jin A, Yeung S, Jopling J, Krause J, Azagury D, Milstein A, Fei-Fei L, 2018. Tool detection and operative skill assessment in surgical videos using region-based convolutional neural networks, in: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE. pp. 691–699.
  62. Jupyter Notebook. Project Jupyter. URL: https://www.jupyter.org. Accessed: 2020-10-20.
  63. Keras. Keras: the Python deep learning API. URL: https://keras.io/. Accessed: 2020-08-13.
  64. Kia A, Timsina P, Joshi HN, Klang E, Gupta RR, Freeman RM, Reich DL, Tomlinson MS, Dudley JT, Kohli-Seth R, Mazumdar M, Levin MA, 2020. MEWS++: Enhancing the Prediction of Clinical Deterioration in Admitted Patients through a Machine Learning Model. Journal of Clinical Medicine 9. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7073544/, doi: 10.3390/jcm9020343.
  65. Koulaouzidis A, Iakovidis DK, Yung DE, Rondonotti E, Kopylov U, Plevris JN, Toth E, Eliakim A, Johansson GW, Marlicz W, et al., 2017. Kid project: an internet-based digital video atlas of capsule endoscopy for research purposes. Endoscopy international open 5, E477.
  66. Labelbox, Inc. (San Francisco, CA, US). Labelbox: The leading training data platform for data labeling. URL: https://labelbox.com. Accessed: 2020-10-28.
  67. LabelMe. LabelMe. The Open annotation tool. URL: http://labelme.csail.mit.edu/Release3.0/. Accessed: 2020-08-31.
      Leibetseder A, Kletz S, Schoeffmann K, Keckstein S, Keckstein J, 2020. Glenda: Gynecologic laparoscopy endometriosis dataset, in: International Conference on Multimedia Modeling, Springer. pp. 439–450.
  68. Leibetseder A, Petscharnig S, Primus MJ, Kletz S, Münzer B, Schoeffmann K, Keckstein J, 2018. Lapgyn4: a dataset for 4 automatic content analysis problems in the domain of laparoscopic gynecology, in: Proceedings of the 9th ACM Multimedia Systems Conference, pp. 357–362.
  69. Lerotic M, Chung AJ, Clark J, Valibeik S, Yang GZ, 2008. Dynamic View Expansion for Enhanced Navigation in Natural Orifice Transluminal Endoscopic Surgery, in: Metaxas D, Axel L, Fichtinger G, Székely G (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2008, Springer, Berlin, Heidelberg. pp. 467–475. doi: 10.1007/978-3-540-85990-1_56.
  70. Loy Rodas N, Barrera F, Padoy N, 2017. See It With Your Own Eyes: Markerless Mobile Augmented Reality for Radiation Awareness in the Hybrid Room. IEEE Transactions on Biomedical Engineering 64, 429–440. doi: 10.1109/TBME.2016.2560761.
  71. Maier-Hein L, Mersmann S, Kondermann D, Bodenstedt S, Sanchez A, Stock C, Kenngott HG, Eisenmann M, Speidel S, 2014. Can Masses of Non-Experts Train Highly Accurate Image Classifiers?, in: Medical Image Computing and Computer-Assisted Intervention - MICCAI 2014, Springer, Cham. pp. 438–445. URL: http://link.springer.com/chapter/10.1007/978-3-319-10470-6_55, doi: 10.1007/978-3-319-10470-6_55.
  72. Make Sense. Make Sense. URL: https://www.makesense.ai/. Accessed: 2020-08-31.
  73. MATLAB. MATLAB - MathWorks. URL: https://www.mathworks.com/products/matlab.html. Accessed: 2020-10-20.
  74. MATLAB Computer Vision Toolbox. Computer Vision Toolbox. URL: https://www.mathworks.com/products/computer-vision.html. Accessed: 2020-10-20.
  75. MATLAB Deep Learning Toolbox. Deep Learning Toolbox. URL: https://www.mathworks.com/products/deep-learning.html. Accessed: 2020-08-13.
  76. MATLAB Signal Processing Toolbox. Signal Processing Toolbox. URL: https://www.mathworks.com/products/signal.html. Accessed: 2020-10-20.
  77. MATLAB Statistics and Machine Learning Toolbox. Statistics and Machine Learning Toolbox. URL: https://www.mathworks.com/products/statistics.html. Accessed: 2020-10-20.
  78. Microsoft Cognitive Toolkit (CNTK). The Microsoft Cognitive Toolkit - Cognitive Toolkit - CNTK. URL: https://docs.microsoft.com/en-us/cognitivetoolkit/. Accessed: 2020-08-13.
  79. MITK. The Medical Imaging Interaction Toolkit (MITK) - mitk.org. URL: https://www.mitk.org/wiki/The_Medical_Imaging_Interaction_Toolkit_MITK/. Accessed: 2020-08-31.
  80. Moccia S, Momi ED, Mattos LS. Laryngeal dataset. URL: https://zenodo.org/record/1003200.WdeQcnBx0nQ, doi: 10.5281/zenodo.1003200. type: dataset. Accessed: 2020-08-31.
  81. Moccia S, Vanone GO, Momi ED, Mattos LS. NBI-InfFrames. URL: https://zenodo.org/record/1162784.WnFzLZOdX6Y, doi: 10.5281/zenodo.1162784.
  82. Model Zoo. Model Zoo - Deep learning code and pretrained models for transfer learning, educational purposes, and more. URL: https://modelzoo.co/. Accessed: 2020-08-04.
  83. Mountney P, Stoyanov D, Yang GZ, 2010. Three-Dimensional Tissue Deformation Recovery and Tracking. IEEE Signal Processing Magazine 27, 14–24. doi: 10.1109/MSP.2010.936728.
  84. Nakawala H. Nephrec9. URL: https://zenodo.org/record/1066831.WmtRmIjOVPY, doi: 10.5281/zenodo.1066831. type: dataset. Accessed: 2020-08-31.
  85. Nexfin finger cuff. BMEYE Nexfin Blood Pressure Monitor. URL: https://medaval.ie/device/bmeye-nexfin/. Accessed: 2020-10-29.
  86. NiftyNet. NiftyNet - An open source convolutional neural networks platform for medical image analysis and image-guided therapy. URL: https://niftynet.io/. Accessed: 2020-10-20.
  87. NVIDIA Clara. NVIDIA Clara - An Application Framework Optimized for Healthcare and Life Sciences Developers. URL: https://developer.nvidia.com/clara. Accessed: 2020-10-20.
  88. NVIDIA Clara Imaging. NVIDIA Clara Imaging. URL: https://developer.nvidia.com/clara-medical-imaging. Accessed: 2020-08-31.
  89. Observer XT. Behavioral coding - Event logging software | The Observer XT. URL: https://www.noldus.com/observer-xt. Accessed: 2020-08-31.
  90. ONNX Model Zoo. onnx/models. URL: https://github.com/onnx/models. Accessed: 2020-08-04.
  91. OpenCV. OpenCV. URL: https://opencv.org/. Accessed: 2020-08-05.
  92. OpenSurgSim. OpenSurgSim. URL: http://www.opensurgsim.org/. Accessed: 2020-10-20.
  93. Ozyoruk KB, Gokceler GI, Bobrow TL, Coskun G, Incetan K, Almalioglu Y, Mahmood F, Curto E, Perdigoto L, Oliveira M, et al., 2021. Endoslam dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos. Medical image analysis 71, 102058.
  94. PCL. Point Cloud Library. URL: https://pointcloudlibrary.github.io/. Accessed: 2020-08-05.
  95. Penza V. EndoAbS Dataset. URL: https://zenodo.org/record/60593, doi: 10.5281/zenodo.60593. type: dataset. Accessed: 2020-08-31.
  96. Penza V, Du X, Stoyanov D, Forgione A, Mattos L, De Momi E. TrackVes Dataset. URL: https://zenodo.org/record/822053, doi: 10.5281/zenodo.822053. type: dataset. Accessed: 2020-08-31.
  97. PETRAW. Peg transfer workflow recognition by different modalities. URL: https://www.synapse.org/@Synapse:syn25147789/wiki/. Accessed: 2021-06-30.
  98. Pixel Annotation Tool. GitHub - abreheret/PixelAnnotationTool: Annotate quickly images. URL: https://github.com/abreheret/PixelAnnotationTool. Accessed: 2020-08-31.
  99. Pogorelov K, Randel KR, Griwodz C, Eskeland SL, de Lange T, Johansen D, Spampinato C, Dang-Nguyen DT, Lux M, Schmidt PT, et al., 2017. Kvasir: A multi-class image dataset for computer aided gastrointestinal disease detection, in: Proceedings of the 8th ACM on Multimedia Systems Conference, pp. 164–169.
  100. Pratt P, Stoyanov D, Visentini-Scarzanella M, Yang GZ, 2010. Dynamic Guidance for Robotic Surgery Using Image-Constrained Biomechanical Models, in: Jiang T, Navab N, Pluim JPW, Viergever MA (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2010, Springer, Berlin, Heidelberg. pp. 77–85. doi: 10.1007/978-3-642-15705-9_10.
  101. Python Matplotlib. Matplotlib: Python plotting – Matplotlib 3.3.2 documentation. URL: https://matplotlib.org/. Accessed: 2020-10-20.
  102. Python NLTK. Natural Language Toolkit – NLTK 3.5 documentation. URL: https://www.nltk.org/. Accessed: 2020-10-20.
  103. Python scikit-learn. scikit-learn: machine learning in Python. URL: https://scikit-learn.org/stable/. Accessed: 2020-08-04.
  104. Python scipy.signal. Signal processing (scipy.signal) – SciPy v1.5.3 Reference Guide. URL: https://docs.scipy.org/doc/scipy/reference/signal.html. Accessed: 2020-10-20.
  105. Python scipy.stats. Statistical functions (scipy.stats)– SciPy v1.5.3 Reference Guide. URL: https://docs.scipy.org/doc/scipy/reference/stats.html. Accessed: 2020-10-20.
  106. Python seaborn. seaborn: statistical data visualization – seaborn 0.11.0 documentation. URL: https://seaborn.pydata.org/. Accessed: 2020-10-28.
  107. Python statsmodels. Introduction – statsmodels. URL: https://www.statsmodels.org/stable/index.html. Accessed: 2020-10-28.
  108. PyTorch. PyTorch. URL: https://www.pytorch.org. Accessed: 2020-08-13.
  109. PyTorch-NLP. PetrochukM/PyTorch-NLP - Basic Utilities for PyTorch Natural Language Processing (NLP). URL: https://github.com/PetrochukM/PyTorch-NLP. Accessed: 2020-10-20.
  110. Qin F, Lin S, Li Y, Bly RA, Moe KS, Hannaford B, 2020. Towards better surgical instrument segmentation in endoscopic vision: multi-angle feature aggregation and contour supervision. IEEE Robotics and Automation Letters 5, 6639–6646.
  111. R. R: The R Project for Statistical Computing. URL: https://www.r-project.org/. Accessed: 2020-10-20.
  112. ROS. Robot Operating System | Powering the world’s robots. URL: https://www.ros.org/. Accessed: 2020-10-20.
  113. Roß T, Reinke A, Full PM, Wagner M, Kenngott H, Apitz M, Hempe H, Mindroc-Filimon D, Scholz P, Tran TN, et al., 2021. Comparative validation of multi-instance instrument segmentation in endoscopy: results of the robust-mis 2019 challenge. Medical image analysis 70, 101920.
  114. Samasource Impact Solutions, Inc. (San Francisco, CA, US). Samasource | Training Data for AI. URL: https://www.samasource.com. Accessed: 2020-08-15.
  115. SARAS-MESAD. Saras challenge on multi-domain endoscopic surgeon action detection. URL: https://saras-mesad.grand-challenge.org/. Accessed: 2021-06-29.
  116. Sarikaya D, Corso JJ, Guru KA, 2017. Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection. IEEE Transactions on Medical Imaging 36, 1542–1549. doi: 10.1109/TMI.2017.2665671.
  117. Scale AI, Inc. (San Francisco, CA, US). Scale AI: The Data Platform for AI. URL: https://scale.com/. Accessed: 2020-10-23.
  118. Schoeffmann K, Taschwer M, Sarny S, Münzer B, Primus MJ, Putzgruber D, 2018. Cataract-101: video dataset of 101 cataract surgeries, in: Proceedings of the 9th ACM Multimedia Systems Conference, pp. 421–425.
  119. Semantic Segmentation Editor. Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor. URL: https://github.com/Hitachi-Automotive-And-Industry-Lab/semantic-segmentation-editor. Accessed: 2020-08-31.
  120. SimSurgSkill. Objective surgical skills assessment in vr simulation. URL: https://www.synapse.org/@Synapse:syn25127311/wiki/. Accessed: 2021-06-30.
  121. SlicerIGT. SlicerIGT | toolkit for navigated interventions. URL: http://www.slicerigt.org/wp/. Accessed: 2020-10-20.
  122. Smart Angel. Support with patient monitoring. URL: https://www.evolucare.com/en/support-with-patient-monitoring/?region=eur. Accessed: 2020-10-29.
  123. Smedsrud PH, Thambawita V, Hicks SA, Gjestang H, Nedrejord OO, Næss E, Borgli H, Jha D, Berstad TJD, Eskeland SL, Lux M, Espeland H, Petlund A, Nguyen DTD, Garcia-Ceja E, Johansen D, Schmidt PT, Toth E, Hammer HL, de Lange T, Riegler MA, Halvorsen P, 2021. Kvasir-Capsule, a video capsule endoscopy dataset. Scientific Data 8, 142. URL: https://www.nature.com/articles/s41597-021-00920-z, doi: 10.1038/s41597-021-00920-z.
  124. Snap40 Monitor. Full-Service Remote Healthcare Platform | Current Health. URL: https://currenthealth.com/. Accessed: 2020-10-29.
  125. SOFA. SOFA - Simulation Open Framework Architecture. URL: https://www.sofa-framework.org/. Accessed: 2020-10-20.
       spaCy. spaCy. Industrial-strength Natural Language Processing in Python. URL: https://spacy.io/. Accessed: 2020-10-20.
  126. Srivastav V, Issenhuth T, Kadkhodamohammadi A, de Mathelin M, Gangi A, Padoy N, 2019. MVOR: A Multi-view RGBD Operating Room Dataset for 2D and 3D Human Pose Estimation. arXiv:1808.08180 [cs]. URL: http://arxiv.org/abs/1808.08180.
  127. Stauder R, Ostler D, Kranzfelder M, Koller S, Feußner H, Navab N, 2017. The TUM LapChole dataset for the M2CAI 2016 workflow challenge. arXiv:1610.09278 [cs]. URL: http://arxiv.org/abs/1610.09278.
  128. Steldia Services Ltd (Limassol Agios Athanasios, Cyprus). Outsourcing Europe | BPO Company Ukraine. URL: https://mindy-support.com/. Accessed: 2020-10-28.
  129. Stoyanov D, Mylonas GP, Deligianni F, Darzi A, Yang GZ, 2005. Soft-Tissue Motion Tracking and Structure Estimation for Robotic Assisted MIS Procedures, in: Duncan JS, Gerig G (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2005, Springer, Berlin, Heidelberg. pp. 139–146. doi: 10.1007/11566489_18.
  130. Stoyanov D, Scarzanella MV, Pratt P, Yang GZ, 2010. Real-Time Stereo Reconstruction in Robotically Assisted Minimally Invasive Surgery, in: Jiang T, Navab N, Pluim JPW, Viergever MA (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2010, Springer, Berlin, Heidelberg. pp. 275–282. doi: 10.1007/978-3-642-15705-9_34.
  131. SuperAnnotate Desktop. SuperAnnotate Desktop. URL: https://opencv.org/superannotate-desktop/. Accessed: 2020-10-22.
  132. SuperAnnotate LLC (Sunnyvale, CA, US). SuperAnnotate | The fastest annotation platform and services for training AI. URL: https://www.superannotate.com/. Accessed: 2020-10-23.
  133. Surgical Information Sciences. Surgical Information Sciences, Inc. URL: http://surgicalis.com/. Accessed: 2020-10-27.
       s.w.an. s.w.an | scientific workflow analysis. URL: http://www.scientific-analysis.com/. Accessed: 2020-08-31.
  134. Sznitman R, Ali K, Richa R, Taylor RH, Hager GD, Fua P, 2012. Data-Driven Visual Tracking in Retinal Microsurgery, in: Ayache N, Delingette H, Golland P, Mori K (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2012, Springer, Berlin, Heidelberg. pp. 568–575. doi: 10.1007/978-3-642-33418-4_70.
  135. Telus International (Vancouver, BC, CA). AI Data Solutions | TELUS International. URL: https://www.telusinternational.com/solutions/ai-data-solutions. Accessed: 2021-07-26.
  136. TensorFlow. TensorFlow - An end-to-end open source machine learning platform. URL: https://www.tensorflow.org/. Accessed: 2020-08-13.
  137. TensorFlow Model Garden. tensorflow/models. URL: https://github.com/tensorflow/models. Accessed: 2020-10-20.
       torchvision models. torchvision.models – PyTorch 1.6.0 documentation. URL: https://pytorch.org/docs/stable/torchvision/models.html. Accessed: 2020-10-20.
  138. Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N, 2017. EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos. IEEE Transactions on Medical Imaging 36, 86–97. doi: 10.1109/TMI.2016.2593957.
  139. UltimateLabeling. alexandre01/UltimateLabeling. URL: https://github.com/alexandre01/UltimateLabeling. Accessed: 2020-08-31.
  140. VATIC. Video Annotation Tool from Irvine, California. URL: http://www.cs.columbia.edu/~vondrick/vatic/. Accessed: 2020-10-29.
  141. VLFeat. VLFeat. URL: https://www.vlfeat.org/. Accessed: 2020-08-05.
  142. VoTT. microsoft/VoTT - Visual Object Tagging Tool. URL: https://github.com/microsoft/VoTT. Accessed: 2020-08-31.
  143. VTK. VTK - The Visualization Toolkit. URL: https://vtk.org/. Accessed: 2020-10-20.
  144. Ye M, Giannarou S, Meining A, Yang GZ, 2016. Online tracking and retargeting with applications to optical biopsy in gastrointestinal endoscopic examinations. Medical Image Analysis 30, 144–157. doi: 10.1016/j.media.2015.10.003.
  145. Ye M, Johns E, Handa A, Zhang L, Pratt P, Yang GZ, 2017. Self-Supervised Siamese Learning on Stereo Image Pairs for Depth Estimation in Robotic Surgery. arXiv:1705.08260 [cs]. URL: http://arxiv.org/abs/1705.08260.

Footnotes

Declaration of Competing Interest

Anand Malpani is a future employee at Mimic Technologies Inc. (Seattle, WA, US). Johannes Fallert and Lars Mündermann are employed at KARL STORZ SE & Co. KG (Tuttlingen, Germany). Hirenkumar Nakawala is employed at CMR Surgical Ltd (Cambridge, UK). Nicolas Padoy is a scientific advisor of Caresyntax (Berlin, Germany). Daniel A. Hashimoto is a consultant for Johnson & Johnson (New Brunswick, NJ, USA), Verily Life Sciences (San Francisco, CA, USA), and Activ Surgical (Boston, MA, USA). He has received research support from Olympus Corporation and the Intuitive Foundation. Carla Pugh is the founder of 10 Newtons Inc. (Madison, WI, US). Danail Stoyanov is employed at Digital Surgery Ltd (London, UK) and Odin Vision Ltd (London, UK). Teodor Grantcharov is the founder of Surgical Safety Technologies Inc. (Toronto, Ontario, Canada). Tobias Roß is employed at Quality Match GmbH (Heidelberg, Germany). All other authors do not declare any conflicts of interest.

CRediT authorship contribution statement

Lena Maier-Hein: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration, Funding acquisition. Matthias Eisenmann: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration. Duygu Sarikaya: Conceptualization, Methodology, Investigation, Writing – original draft, Writing – review & editing. Keno März: Conceptualization, Methodology, Investigation, Writing – original draft, Writing – review & editing. Toby Collins: Methodology, Investigation, Writing – original draft, Writing – review & editing. Anand Malpani: Methodology, Investigation, Writing – original draft, Writing – review & editing. Johannes Fallert: Methodology, Investigation, Writing – original draft, Writing – review & editing. Hubertus Feussner: Methodology, Investigation, Writing – original draft, Writing – review & editing. Stamatia Giannarou: Methodology, Investigation, Writing – original draft, Writing – review & editing. Pietro Mascagni: Methodology, Investigation, Writing – original draft, Writing – review & editing. Hirenkumar Nakawala: Methodology, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing. Adrian Park: Methodology, Investigation, Writing – original draft, Writing – review & editing. Carla Pugh: Methodology, Investigation, Writing – original draft, Writing – review & editing. Danail Stoyanov: Methodology, Investigation, Writing – original draft, Writing – review & editing. Swaroop S. Vedula: Methodology, Investigation, Writing – original draft, Writing – review & editing. Kevin Cleary: Methodology, Investigation, Writing – original draft, Writing – review & editing.
Gabor Fichtinger: Methodology, Investigation, Writing – original draft, Writing – review & editing. Germain Forestier: Methodology, Investigation, Writing – original draft, Writing – review & editing. Bernard Gibaud: Methodology, Investigation, Writing – original draft, Writing – review & editing. Teodor Grantcharov: Methodology, Investigation, Writing – original draft, Writing – review & editing. Makoto Hashizume: Methodology, Investigation, Writing – original draft, Writing – review & editing. Doreen Heckmann-Nötzel: Validation, Formal analysis, Investigation, Data curation, Writing – review & editing, Project administration. Hannes G. Kenngott: Methodology, Investigation, Writing – original draft, Writing – review & editing. Ron Kikinis: Methodology, Investigation, Writing – original draft, Writing – review & editing. Lars Mündermann: Methodology, Investigation, Writing – original draft, Writing – review & editing. Nassir Navab: Methodology, Investigation, Writing – original draft, Writing – review & editing. Sinan Onogur: Methodology, Investigation, Writing – original draft, Writing – review & editing, Visualization. Tobias Roß: Methodology, Investigation, Writing – review & editing. Raphael Sznitman: Methodology, Investigation, Writing – original draft, Writing – review & editing. Russell H. Taylor: Methodology, Investigation, Writing – original draft, Writing – review & editing. Minu D. Tizabi: Methodology, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing. Martin Wagner: Methodology, Investigation, Writing – original draft, Writing – review & editing. Gregory D. Hager: Methodology, Investigation, Writing – original draft, Writing – review & editing. Thomas Neumuth: Methodology, Investigation, Writing – original draft, Writing – review & editing. Nicolas Padoy: Methodology, Investigation, Writing – original draft, Writing – review & editing.
Justin Collins: Investigation, Writing – review & editing. Ines Gockel: Investigation, Writing – review & editing. Jan Goedeke: Investigation, Writing – review & editing. Daniel A. Hashimoto: Investigation, Writing – review & editing. Luc Joyeux: Investigation, Writing – review & editing. Kyle Lam: Investigation, Writing – review & editing. Daniel R. Leff: Investigation, Writing – review & editing. Amin Madani: Investigation, Writing – review & editing. Hani J. Marcus: Investigation, Writing – review & editing. Ozanan Meireles: Investigation, Writing – review & editing. Alexander Seitel: Validation, Formal analysis, Investigation, Data curation, Writing – review & editing, Visualization. Dogu Teber: Investigation, Writing – review & editing. Frank Ückert: Investigation, Writing – review & editing. Beat P. Müller-Stich: Methodology, Investigation, Writing – original draft, Writing – review & editing. Pierre Jannin: Conceptualization, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Supervision, Project administration, Funding acquisition. Stefanie Speidel: Conceptualization, Validation, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Supervision, Project administration, Funding acquisition.

References

  1. AAMI, 2012. Medical device interoperability. Association for the Advancement of Medical Instrumentation, Arlington, VA, USA. https://webstore.ansi.org/Documents/Medical-Device-Interoperability.pdf [Google Scholar]
  2. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC, 2018. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. npj Digital Medicine 1 (1), 39. doi: 10.1038/s41746-018-0040-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. ACS Committee on Trauma,. Stop The Bleed. https://www.stopthebleed.org/.
  4. Adebayo J, Gilmer J, Muelly M, Goodfellow I, Hardt M, Kim B, 2018. Sanity checks for saliency maps. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Curran Associates Inc., Montréal, Canada, pp. 9525–9536. [Google Scholar]
  5. Adibi A, Sadatsafavi M, Ioannidis JPA, 2020. Validation and utility testing of clinical prediction models: time to change the approach. JAMA 324 (3), 235. doi: 10.1001/jama.2020.1230. [DOI] [PubMed] [Google Scholar]
  6. Adler TJ, Ardizzone L, Vemuri A, Ayala L, Gröhl J, Kirchner T, Wirkert S, Kruse J, Rother C, Köthe U, Maier-Hein L, 2019. Uncertainty-aware performance assessment of optical imaging modalities with invertible neural networks. Int J Comput Assist Radiol Surg 14 (6), 997–1007. doi: 10.1007/s11548-019-01939-9. [DOI] [PubMed] [Google Scholar]
  7. AI Medical Service, Inc. (Tokyo, Japan),. AIM. Library Catalog: www.ai-ms.com, https://www.ai-ms.com/en/. [Google Scholar]
  8. Akladios C, Gabriele V, Agnus V, Martel-Billard C, Saadeh R, Garbin O, Lecointre L, Marescaux J, 2020. Augmented reality in gynecologic laparoscopic surgery: development, evaluation of accuracy and clinical relevance of a device useful to identify ureters during surgery. Surg Endosc 34 (3), 1077–1087. doi: 10.1007/s00464-019-06855-2. [DOI] [PubMed] [Google Scholar]
  9. Aksamentov I, Twinanda AP, Mutter D, Marescaux J, Padoy N, 2017. Deep Neural Networks Predict Remaining Surgery Duration from Cholecystectomy Videos. In: Descoteaux M, Maier-Hein L, Franz A, Jannin P, Collins DL, Duchesne S (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2017. Springer International Publishing, Cham, pp. 586–593. doi: 10.1007/978-3-319-66185-8_66. [DOI] [Google Scholar]
  10. Alapatt D, Mascagni P, Srivastav V, Padoy N, 2020. Neural Networks and Deep Learning. Hashimoto DA (Ed.) Artificial Intelligence in Surgery. McGraw Hill, New York. [Google Scholar]
  11. Albarqouni S, Baur C, Achilles F, Belagiannis V, Demirci S, Navab N, 2016. Aggnet: deep learning from crowds for mitosis detection in breast cancer histology images. IEEE Trans Med Imaging 35 (5), 1313–1321. doi: 10.1109/TMI.2016.2528120. [DOI] [PubMed] [Google Scholar]
  12. Alhajj H, Lamard M, Conze P-H, Cochener B, Quellec G, 2021. Cataracts. doi: 10.21227/ac97-8m18. [DOI] [Google Scholar]
  13. Allan M, Kondo S, Bodenstedt S, Leger S, Kadkhodamohammadi R, Luengo I, Fuentes F, Flouty E, Mohammed A, Pedersen M, Kori A, Alex V, Krishnamurthi G, Rauber D, Mendel R, Palm C, Bano S, Saibro G, Shih C-S, Chiang H-A, Zhuang J, Yang J, Iglovikov V, Dobrenkii A, Reddiboina M, Reddy A, Liu X, Gao C, Unberath M, Kim M, Kim C, Kim C, Kim H, Lee G, Ullah I, Luna M, Park SH, Azizian M, Stoyanov D, Maier-Hein L, Speidel S, 2020. 2018 Robotic scene segmentation challenge. arXiv:2001.11190 [cs]. ArXiv: 2001.11190, http://arxiv.org/abs/2001.11190 [Google Scholar]
  14. Allan M, Mcleod J, Wang C, Rosenthal JC, Hu Z, Gard N, Eisert P, Fu KX, Zeffiro T, Xia W, Zhu Z, Luo H, Jia F, Zhang X, Li X, Sharan L, Kurmann T, Schmid S, Sznitman R, Psychogyios D, Azizian M, Stoyanov D, Maier-Hein L, Speidel S, 2021. Stereo correspondence and reconstruction of endoscopic data challenge. arXiv:2101.01133 [cs]. ArXiv: 2101.01133, http://arxiv.org/abs/2101.01133 [Google Scholar]
  15. Allan M, Shvets A, Kurmann T, Zhang Z, Duggal R, Su Y-H, Rieke N, Laina I, Kalavakonda N, Bodenstedt S, Herrera L, Li W, Iglovikov V, Luo H, Yang J, Stoyanov D, Maier-Hein L, Speidel S, Azizian M, 2019. 2017 Robotic instrument segmentation challenge. arXiv:1902.06426 [cs]. ArXiv: 1902.06426, http://arxiv.org/abs/1902.06426 [Google Scholar]
  16. Apache MXNet,. Apache MXNet: A flexible and efficient library for deep learning. https://mxnet.apache.org/.
  17. Arah OA, 2017. Bias analysis for uncontrolled confounding in the health sciences. Annu Rev Public Health 38 (1), 23–38. doi: 10.1146/annurev-publhealth-032315-021644. [DOI] [PubMed] [Google Scholar]
  18. Ardizzone L, Kruse J, Rother C, Köthe U, 2018. Analyzing Inverse Problems with Invertible Neural Networks. In: International Conference on Learning Representations. https://openreview.net/forum?id=rJed6j0cKX [Google Scholar]
  19. Arts DGT, De Keizer NF, Scheffer G-J, 2002. Defining and improving data quality in medical registries: a literature review, case study, and generic framework. Journal of the American Medical Informatics Association: JAMIA 9 (6), 600–611. doi: 10.1197/jamia.m1087. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. August AT, Sheth K, Brandt A, deRuijter V, Fuerch JH, Wall J, 2021. The value of surgical data-impact on the future of the surgical field. Surg Innov doi: 10.1177/15533506211003538. [DOI] [PubMed] [Google Scholar]
  21. auto-sklearn,. auto-sklearn - AutoSklearn 0.10.0 documentation https://automl.github.io/auto-sklearn/master/.
  22. AutoKeras,. AutoKeras: An Efficient Neural Architecture Search System. https://autokeras.com/.
  23. Ayala L, Wirkert S, Vemuri A, Adler T, Seidlitz S, Pirmann S, Engels C, Teber D, Maier-Hein L, 2021. Video-rate multispectral imaging in laparoscopic surgery: first-in-human application. arXiv:2105.13901 [cs, eess]. ArXiv: 2105.13901 http://arxiv.org/abs/2105.13901 [Google Scholar]
  24. Bach S, Binder A, Montavon G, Klauschen F, Müller K-R, Samek W, 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10 (7), e0130140. doi: 10.1371/journal.pone.0130140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Badgeley MA, Zech JR, Oakden-Rayner L, Glicksberg BS, Liu M, Gale W, McConnell MV, Percha B, Snyder TM, Dudley JT, 2019. Deep learning predicts hip fracture using confounding patient and healthcare variables. npj Digital Medicine 2 (1), 31. doi: 10.1038/s41746-019-0105-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Bahl M, Barzilay R, Yedidia AB, Locascio NJ, Yu L, Lehman CD, 2017. High-risk breast lesions: a machine learning model to predict pathologic upgrade and reduce unnecessary surgical excision. Radiology 286 (3), 810–818. doi: 10.1148/radiol.2017170549. [DOI] [PubMed] [Google Scholar]
  27. Bechhofer S, 2009. OWL: Web Ontology Language. In: Liu L, Özsu MT (Eds.), Encyclopedia of Database Systems. Springer US, Boston, MA, pp. 2008–2009. doi: 10.1007/978-0-387-39940-9_1073. [DOI] [Google Scholar]
  28. Bender D, Sartipi K, 2013. HL7 FHIR: An Agile and RESTful approach to healthcare information exchange. In: Proceedings of the 26th IEEE International Symposium on Computer-Based Medical Systems, pp. 326–331. [Google Scholar]
  29. Bernal J, Tajbakhsh N, Sánchez FJ, Matuszewski BJ, Chen H, Yu L, Angermann Q, Romain O, Rustad B, Balasingham I, Pogorelov K, Choi S, Debard Q, Maier-Hein L, Speidel S, Stoyanov D, Brandao P, Córdova H, Sánchez-Montes C, Gurudu SR, Fernández-Esparrach G, Dray X, Liang J, Histace A, 2017. Comparative validation of polyp detection methods in video colonoscopy: results from the MICCAI 2015 endoscopic vision challenge. IEEE Trans Med Imaging 36 (6), 1231–1249. doi: 10.1109/TMI.2017.2664042. [DOI] [PubMed] [Google Scholar]
  30. Bhandari M, Nallabasannagari AR, Reddiboina M, Porter JR, Jeong W, Mottrie A, Dasgupta P, Challacombe B, Abaza R, Rha KH, Parekh DJ, Ahlawat R, Capitanio U, Yuvaraja TB, Rawal S, Moon DA, Buffi NM, Sivaraman A, Maes KK, Porpiglia F, Gautam G, Turkeri L, Meyyazhgan KR, Patil P, Menon M, Rogers C, 2020. Predicting intra-operative and postoperative consequential events using machine-learning techniques in patients undergoing robot-assisted partial nephrectomy: a vattikuti collective quality initiative database study. BJU Int. doi: 10.1111/bju.15087. [DOI] [PubMed] [Google Scholar]
  31. Bishop C, 2006. Pattern recognition and machine learning. Springer-Verlag, New York. https://www.springer.com/gp/book/9780387310732 [Google Scholar]
  32. Bodenstedt S, Rivoir D, Jenke A, Wagner M, Breucha M, Müller-Stich B, Mees ST, Weitz J, Speidel S, 2019. Active learning using deep bayesian networks for surgical workflow analysis. Int J Comput Assist Radiol Surg 14 (6), 1079–1087. doi: 10.1007/s11548-019-01963-9. [DOI] [PubMed] [Google Scholar]
  33. Bodenstedt S, Wagner M, Mündermann L, Kenngott H, Müller-Stich B, Breucha M, Mees ST, Weitz J, Speidel S, 2019. Prediction of laparoscopic procedure duration using unlabeled, multimodal sensor data. Int J Comput Assist Radiol Surg 14 (6), 1089–1095. doi: 10.1007/s11548-019-01966-6. [DOI] [PubMed] [Google Scholar]
  34. Bowyer A, Royse C, 2016. The importance of postoperative quality of recovery: influences, assessment, and clinical and prognostic implications. Canadian Journal of Anesthesia/Journal canadien d’anesthésie 63 (2), 176–183. doi: 10.1007/s12630-015-0508-7. [DOI] [PubMed] [Google Scholar]
  35. Brigham and Women’s Hospital (BWH),. Advanced Multimodality Image-Guided Operating (AMIGO). https://ncigt.org/amigo.
  36. Bruins AA, Geboers DGPJ, Bauer JR, Klaessens JHGM, Verdaasdonk RM, Boer C, 2020. The vascular occlusion test using multispectral imaging: a validation study: the VASOIMAGE study. J Clin Monit Comput doi: 10.1007/s10877-019-00448-z. [DOI] [PubMed] [Google Scholar]
  37. Butler D, 2008. Translational research: crossing the valley of death. Nature 453 (7197), 840–842. doi: 10.1038/453840a. [DOI] [PubMed] [Google Scholar]
  38. Caffe,. Caffe | Deep Learning Framework. http://caffe.berkeleyvision.org/.
  39. Caffe2,. Caffe2 - a new lightweight, modular, and scalable deep learning framework. Library Catalog: caffe2.ai; http://caffe2.ai/. [Google Scholar]
  40. Castro DC, Walker I, Glocker B, 2020. Causality matters in medical imaging. Nat Commun 11 (1), 1–10. doi: 10.1038/s41467-020-17478-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Ceccarelli G, Forgione A, Andolfi E, Rocca A, Giuliani A, Calise F, 2018. Evolving Technologies in the Operating Room for Minimally Invasive Pancreatic Surgery. In: Boggi U (Ed.), Minimally Invasive Surgery of the Pancreas. Springer Milan, Milano, pp. 15–26. doi: 10.1007/978-88-470-3958-2_2. [DOI] [Google Scholar]
  42. Chadebecq F, Vasconcelos F, Mazomenos E, Stoyanov D, 2020. Computer vision in the surgical operating room. Visceral Medicine 1–7. doi: 10.1159/000511934. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Chainer,. Chainer: A flexible framework for neural networks. https://chainer.org/.
  44. Chen H, Chiang RHL, Storey VC, 2012. Business intelligence and analytics: from big data to big impact. MIS Quarterly 36 (4), 1165–1188. doi: 10.2307/41703503. [DOI] [Google Scholar]
  45. Chen Y, Elenee Argentinis J, Weber G, 2016. IBM Watson: how cognitive computing can be applied to big data challenges in life sciences research. Clin Ther 38 (4), 688–701. doi: 10.1016/j.clinthera.2015.12.001. [DOI] [PubMed] [Google Scholar]
  46. Choi PJ, Oskouian RJ, Tubbs RS, 2018. Telesurgery: past, present, and future. Cureus 10 (5). doi: 10.7759/cureus.2716. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. CholecTriplet21,. Surgical action triplet recognition. https://cholectriplet2021.grand-challenge.org/CholecTriplet2021/.
  48. Chou E, Tan M, Zou C, Guo M, Haque A, Milstein A, Fei-Fei L, 2018. Privacy-preserving action recognition for smart hospitals using low-resolution depth images. NeurIPS Workshop on Machine Learning for Health (ML4H). [Google Scholar]
  49. Clancy NT, Jones G, Maier-Hein L, Elson DS, Stoyanov D, 2020. Surgical spectral imaging. Med Image Anal 63, 101699. doi: 10.1016/j.media.2020.101699. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Cleary K, Chung HY, Mun SK, 2004. OR2020 Workshop overview: operating room of the future. International Congress Series 1268, 847–852. doi: 10.1016/j.ics.2004.03.287. [DOI] [Google Scholar]
  51. Cleary K, Kinsella A, Mun SK, 2005. OR 2020 Workshop report: operating room of the future. CARS 2005: Computer Assisted Radiology and Surgery 1281, 832–838. doi: 10.1016/j.ics.2005.03.279. [DOI] [Google Scholar]
  52. Cloud AutoML,. Cloud AutoML - Custom Machine Learning Models. Library Catalog: cloud.google.com, https://cloud.google.com/automl. [Google Scholar]
  53. CNTK,. The Microsoft Cognitive Toolkit - Cognitive Toolkit - CNTK. https://docs.microsoft.com/en-us/cognitive-toolkit/.
  54. Cognilytica, 2019. Data Engineering, Preparation, and Labeling for AI 2019. Library Catalog: www.cognilytica.com Section: AI Market Research https://www.cognilytica.com/2019/03/06/report-data-engineering-preparation-and-labeling-for-ai-2019/. [Google Scholar]
  55. Cognilytica, 2020. Data Engineering, Preparation, and Labeling for AI 2020. Library Catalog: www.cognilytica.com, https://www.cognilytica.com/2020/01/31/data-preparation-labeling-for-ai-2020/. [Google Scholar]
  56. Collins JW, Marcus HJ, Ghazi A, Sridhar A, Hashimoto D, Hager G, Arezzo A, Jannin P, Maier-Hein L, Marz K, Valdastri P, Mori K, Elson D, Giannarou S, Slack M, Hares L, Beaulieu Y, Levy J, Laplante G, Ramadorai A, Jarc A, Andrews B, Garcia P, Neemuchwala H, Andrusaite A, Kimpe T, Hawkes D, Kelly JD, Stoyanov D, 2021. Ethical implications of AI in robotic surgical training: a delphi consensus statement. Eur Urol Focus doi: 10.1016/j.euf.2021.04.006. [DOI] [PubMed] [Google Scholar]
  57. Computer-Integrated Surgical Systems and Technology (CISST),. Computer-Integrated Surgical Systems and Technology | NSF Engineering Research Center. https://cisst.org/.
  58. Conley DM, Singer SJ, Edmondson L, Berry WR, Gawande AA, 2011. Effective surgical safety checklist implementation. J. Am. Coll. Surg 212 (5), 873–879. doi: 10.1016/j.jamcollsurg.2011.01.052. [DOI] [PubMed] [Google Scholar]
  59. Connected Optimized Network & Data in Operating Rooms (CONDOR),. Project CONDOR - Connected Optimized Network & Data in Operating Rooms. https://condor-project.eu/.
  60. Core ML,. Core ML | Apple Developer Documentation. https://developer.apple.com/documentation/coreml.
  61. Corey KM, Kashyap S, Lorenzi E, Lagoo-Deenadayalan SA, Heller K, Whalen K, Balu S, Heflin MT, McDonald SR, Swaminathan M, Sendak M, 2018. Development and validation of machine learning models to identify high-risk surgical patients using automatically curated electronic health record data (pythia): a retrospective, single-site study. PLoS Med. 15 (11), e1002701. doi: 10.1371/journal.pmed.1002701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Cornet R, de Keizer N, 2008. Forty years of SNOMED: a literature review. BMC Med Inform Decis Mak 8 Suppl 1, S2. doi: 10.1186/1472-6947-8-S1-S2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Dai J, Qi H, Xiong Y, Li Y, Zhang G, Hu H, Wei Y, 2017. Deformable convolutional networks. arXiv:1703.06211 [cs]. ArXiv: 1703.06211 http://arxiv.org/abs/1703.06211 [Google Scholar]
  64. Dasgupta N, 2018. Practical big data analytics: Hands-on techniques to implement enterprise analytics and machine learning using hadoop, spark, nosql and r. Packt Publishing Ltd. [Google Scholar]
  65. DB-Engines, 2020. DB-Engines Ranking of Search Engines. https://db-engines.com/en/ranking/search+engine.
  66. De La Garza JR, Schmidt MW, Kowalewski K-F, Benner L, Müller PC, Kenngott HG, Fischer L, Müller-Stich BP, Nickel F, 2019. Does rating with a checklist improve the effect of e-learning for cognitive and practical skills in bariatric surgery? a rater-blinded, randomized-controlled trial. Surg Endosc 33 (5), 1532–1543. doi: 10.1007/s00464-018-6441-4. [DOI] [PubMed] [Google Scholar]
  67. De Silva TS, Vedula SS, Perdomo-Pantoja A, Vijayan RC, Doerr SA, Uneri A, Han R, Ketcha MD, Skolasky RL, Witham T, Theodore N, Siewerdsen JH, 2020. Spinecloud: image analytics for predictive modeling of spine surgery outcomes. J. Med. Imaging 7 (3), 031502. doi: 10.1117/1.JMI.7.3.031502. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. Deep Learning Toolbox,. Deep Learning Toolbox - design, train, and analyze deep learning networks. https://www.mathworks.com/products/deep-learning.html.
  69. Diana M, Soler L, Agnus V, D’Urso A, Vix M, Dallemagne B, Faucher V, Roy C, Mutter D, Marescaux J, Pessaux P, 2017. Prospective evaluation of precision multimodal gallbladder surgery navigation: virtual reality, near-infrared fluorescence, and x-ray-based intraoperative cholangiography. Ann. Surg 266 (5), 890–897. doi: 10.1097/SLA.0000000000002400. [DOI] [PubMed] [Google Scholar]
  70. Dietrich M, Seidlitz S, Schreck N, Wiesenfarth M, Godau P, Tizabi M, Sellner J, Marx S, Knödler S, Allers MM, Ayala L, Schmidt K, Brenner T, Studier-Fischer A, Nickel F, Müller-Stich BP, Kopp-Schneider A, Weigand MA, Maier-Hein L, 2021. Machine learning-based analysis of hyperspectral images for automated sepsis diagnosis. arXiv:2106.08445 [cs, eess]. http://arxiv.org/abs/2106.08445 [Google Scholar]
  71. Digital Imaging and Communications in Medicine (DICOM),. NEMA PS3 / ISO 12052, Digital Imaging and Communications in Medicine (DICOM) Standard. https://www.dicomstandard.org/. [Google Scholar]
  72. Duke University, 2016. Pilot Study for a Clinical Analytical Platform for Surgical Outcomes. Clinical trial registration. clinicaltrials.gov. Submitted: July 7, 2016, https://clinicaltrials.gov/ct2/show/NCT02828475 [Google Scholar]
  73. Dunkin BJ, Flowers C, 2015. 3D in the Minimally Invasive Surgery (MIS) Operating Room: Cameras and Displays in the Evolution of MIS. In: Fong Y, Giulianotti PC, Lewis J, Groot Koerkamp B, Reiner T (Eds.), Imaging and Visualization in The Modern Operating Room: A Comprehensive Guide for Physicians. Springer, New York, NY, pp. 145–155. doi: 10.1007/978-1-4939-2326-7_11. [DOI] [Google Scholar]
  74. Dwork C, McSherry F, Nissim K, Smith A, 2006. Calibrating Noise to Sensitivity in Private Data Analysis. In: Halevi S, Rabin T (Eds.), Theory of Cryptography. Springer, Berlin, Heidelberg, pp. 265–284. doi: 10.1007/11681878_14. [DOI] [Google Scholar]
  75. EndoVis, 2015. EndoVis - Grand Challenge. https://endovis.grand-challenge.org/.
  76. EndoVis-GIANA,. Gastrointestinal Image ANAlysis challenge - Grand Challenge. https://giana.grand-challenge.org/.
  77. EndoVis-ROBUST-MIS,. EndoVis Robust Endoscopic Instrument Segmentation - Grand Challenge. https://robustmis2019.grand-challenge.org/.
  78. EndoVis-Workflow,. EndoVisSub-Workflow - Grand Challenge. https://endovissub2017-workflow.grand-challenge.org/.
  79. EndoVis-Workflow, Skill,. EndoVisSub-Workflow and Skill - Grand Challenge. https://endovissub-workflowandskill.grand-challenge.org/.
  80. EndoVis’15 Instrument Subchallenge Dataset,. Open-cas: EndoVis’15 Instrument Sub-challenge Dataset. https://opencas.webarchiv.kit.edu/?q=node/30.
  81. European Parliament, 2020. The ethics of artificial intelligence: Issues and initiatives. Publications Office of the European Union, LU. https://data.europa.eu/doi/10.2861/6644 [Google Scholar]
  82. European Parliament and Council of European Union, 2016. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). https://eur-lex.europa.eu/eli/reg/2016/679/oj.
  83. Faes L, Wagner SK, Fu DJ, Liu X, Korot E, Ledsam JR, Back T, Chopra R, Pontikos N, Kern C, Moraes G, Schmid MK, Sim D, Balaskas K, Bachmann LM, Denniston AK, Keane PA, 2019. Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study. The Lancet Digital Health 1 (5), e232–e242. doi: 10.1016/S2589-7500(19)30108-6. [DOI] [PubMed] [Google Scholar]
  84. Fan B, Li H-X, Hu Y, 2016. An intelligent decision system for intraoperative somatosensory evoked potential monitoring. IEEE transactions on neural systems and rehabilitation engineering: a publication of the IEEE Engineering in Medicine and Biology Society 24 (2), 300–307. doi: 10.1109/TNSRE.2015.2477557. [DOI] [PubMed] [Google Scholar]
  85. fastai,. Welcome to fastai. https://docs.fast.ai.
  86. FDA, 2019. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) - Discussion Paper and Request for Feedback. https://www.regulations.gov/document?D=FDA-2019-N-1185-0001.
  87. FDA, 2021. Artificial intelligence and machine learning in software as a medical device. FDA. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device [Google Scholar]
  88. FDA, 2021b. FDA Authorizes Marketing of First Device that Uses Artificial Intelligence to Help Detect Potential Signs of Colon Cancer. https://www.fda.gov/news-events/press-announcements/fda-authorizes-marketing-first-device-uses-artificial-intelligence-help-detect-potential-signs-colon.
  89. Feldman LS, Pryor AD, Gardner AK, Dunkin BJ, Schultz L, Awad MM, Ritter EM, 2020. SAGES Video-based assessment (VBA) program: a vision for life-long learning for surgeons. Surg Endosc 34 (8), 3285–3288. doi: 10.1007/s00464-020-07628-y. [DOI] [PubMed] [Google Scholar]
  90. FetReg,. FetReg - Placental Vessel Segmentation and Registration in Fetoscopy 2021. https://www.synapse.org/#!Synapse:syn25313156/wiki/.
  91. Fitzek FH, Li S-C, Speidel S, Strufe T, Simsek M, Reisslein M, 2021. Tactile internet: With human-in-the-Loop. Academic Press. [Google Scholar]
  92. Fjeld J, Achten N, Hilligoss H, Nagy A, Srikumar M, 2020. Principled artificial intelligence: mapping consensus in ethical and rights-based approaches to principles for AI. Berkman Klein Center Research Publication (2020–1). [Google Scholar]
  93. Flux,. FluxML/Flux.jl. Original-date: 2016-04-01T21:11:05Z, https://github.com/FluxML/Flux.jl.
  94. Forrey AW, McDonald CJ, DeMoor G, Huff SM, Leavelle D, Leland D, Fiers T, Charles L, Griffin B, Stalling F, Tullis A, Hutchins K, Baenziger J, 1996. Logical observation identifier names and codes (LOINC) database: a public use set of codes and names for electronic reporting of clinical laboratory test results. Clin. Chem 42 (1), 81–90. doi: 10.1093/clinchem/42.1.81. [DOI] [PubMed] [Google Scholar]
  95. Futoma J, Hariharan S, Heller K, Sendak M, Brajer N, Clement M, Bedoya A, O’Brien C, 2017. An Improved Multi-Output Gaussian Process RNN with Real-Time Validation for Early Sepsis Detection. In: Machine Learning for Healthcare Conference. PMLR, pp. 243–254. http://proceedings.mlr.press/v68/futoma17a.html [Google Scholar]
  96. Gale W, Oakden-Rayner L, Carneiro G, Bradley AP, Palmer LJ, 2018. Producing radiologist-quality reports for interpretable artificial intelligence. arXiv:1806.00340 [cs]. ArXiv: 1806.00340 http://arxiv.org/abs/1806.00340 [Google Scholar]
  97. Gallego-Ortiz C, Martel AL, 2016. Interpreting extracted rules from ensemble of trees: application to computer-aided diagnosis of breast MRI. arXiv:1606.08288 [cs, stat]. ArXiv: 1606.08288, http://arxiv.org/abs/1606.08288 [Google Scholar]
  98. Gao Y, Vedula SS, Reiley CE, Ahmidi N, Varadarajan B, Lin HC, Tao L, Zappella L, Béjar B, Yuh DD, Chen CCG, Vidal R, Khudanpur S, Hager GD, 2014. JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling. In: MICCAI workshop: M2CAI, Vol. 3, p. 3. [Google Scholar]
  99. Gauss Surgical, Inc. (Menlo Park, CA, US),. Gauss Surgical. https://www.gausssurgical.com/.
  100. Gerke S, Babic B, Evgeniou T, Cohen IG, 2020. The need for a system view to regulate artificial intelligence/machine learning-based software as medical device. npj Digital Medicine 3 (1), 1–4. doi: 10.1038/s41746-020-0262-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. GIANA21,. Gastrointestinal image analysis. https://giana.grand-challenge.org/.
  102. Gibaud B, Forestier G, Feldmann C, Ferrigno G, Gonçalves P, Haidegger T, Julliard C, Katić D, Kenngott H, Maier-Hein L, März K, de Momi E, Nagy DA, Nakawala H, Neumann J, Neumuth T, Rojas Balderrama J, Speidel S, Wagner M, Jannin P, 2018. Toward a standard ontology of surgical process models. Int J Comput Assist Radiol Surg 13 (9), 1397–1408. doi: 10.1007/s11548-018-1824-5. [DOI] [PubMed] [Google Scholar]
  103. Gibaud B, Kassel G, Dojat M, Batrancourt B, Michel F, Gaignard A, Montagnat J, 2011. Neurolog: sharing neuroimaging data using an ontology-based federated approach. AMIA … Annual Symposium proceedings. AMIA Symposium 2011, 472–480. [PMC free article] [PubMed] [Google Scholar]
  104. Gibaud B, Penet C, Jannin P, 2014. OntoSPM: a core ontology of surgical procedure models. In: Proceedings of Surgetica, pp. 175–177. Chambéry, France [Google Scholar]
  105. GitHub,. GitHub - Build software better, together. https://github.com.
  106. Godau P, Maier-Hein L, 2021. Task fingerprinting for meta learning in surgical data science. MICCAI International Conference on Medical Image Computing and Computer-Assisted Intervention. To appear [Google Scholar]
  107. Goldenberg MG, Jung J, Grantcharov TP, 2017. Using data to enhance performance and improve quality and safety in surgery. JAMA Surg 152 (10), 972–973. doi: 10.1001/jamasurg.2017.2888. [DOI] [PubMed] [Google Scholar]
  108. Goyal A, 2018. New technologies for sentinel lymph node detection. Breast Care 13 (5), 349–353. doi: 10.1159/000492436. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. van de Graaf FW, Eryigit O, Lange JF, 2020. Current perspectives on video and audio recording inside the surgical operating room: results of a cross-disciplinary survey. Updates Surg doi: 10.1007/s13304-020-00902-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Grand Challenge,. Grand Challenge - A platform for end-to-end development of machine learning solutions in biomedical imaging. https://grand-challenge.org/.
  111. Graves A, Wayne G, Reynolds M, Harley T, Danihelka I, Grabska-Barwińska A, Colmenarejo SG, Grefenstette E, Ramalho T, Agapiou J, Badia AP, Hermann KM, Zwols Y, Ostrovski G, Cain A, King H, Summerfield C, Blunsom P, Kavukcuoglu K, Hassabis D, 2016. Hybrid computing using a neural network with dynamic external memory. Nature 538 (7626), 471–476. doi: 10.1038/nature20101. [DOI] [PubMed] [Google Scholar]
  112. Grenon P, Smith B, 2004. SNAP And SPAN: towards dynamic spatial ontology. Spatial Cognition & Computation 4 (1), 69–104. doi: 10.1207/s15427633scc0401_5. [DOI] [Google Scholar]
  113. Grimes S, 2005. The challenge of integrating the healthcare enterprise. IEEE Eng. Med. Biol. Mag 24 (2), 122–124. doi: 10.1109/MEMB.2005.1411360. [DOI] [PubMed] [Google Scholar]
  114. Guo K, Xu T, Kui X, Zhang R, Chi T, 2019. Ifusion: towards efficient intelligence fusion for deep learning from real-time and heterogeneous data. Information Fusion 51, 215–223. doi: 10.1016/j.inffus.2019.02.008. [DOI] [Google Scholar]
  115. H2O,. AutoML: Automatic Machine Learning - H2O 3.32.0.1 documentation. https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html.
  116. Hager GD, Maier-Hein L, Vedula SS, 2020. Chapter 38 - Surgical Data Science. In: Zhou SK, Rueckert D, Fichtinger G (Eds.), Handbook of Medical Image Computing and Computer Assisted Intervention. Academic Press, pp. 931–952. doi: 10.1016/B978-0-12-816176-0.00043-0. [DOI] [Google Scholar]
  117. Hai R, Geisler S, Quix C, 2016. Constance: An Intelligent Data Lake System. In: Proceedings of the 2016 International Conference on Management of Data. Association for Computing Machinery, New York, NY, USA, pp. 2097–2100. Event-place: San Francisco, California, USA [Google Scholar]
  118. Haider AH, Bilimoria KY, Kibbe MR, 2018. A checklist to elevate the science of surgical database research. JAMA Surg 153 (6), 505–507. doi: 10.1001/jamasurg.2018.0628. [DOI] [PubMed] [Google Scholar]
  119. Hamilton EC, Pham DH, Minzenmayer AN, Austin MT, Lally KP, Tsao K, Kawaguchi AL, 2018. Are we missing the near misses in the OR? - under-reporting of safety incidents in pediatric surgery. J. Surg. Res 221, 336–342. doi: 10.1016/j.jss.2017.08.005. [DOI] [PubMed] [Google Scholar]
  120. Hamlyn Centre,. The Hamlyn Centre, Imperial College London. http://www.imperial.ac.uk/a-z-research/hamlyn-centre/.
  121. Harangi B, Hajdu A, Lampe R, Torok P, 2017. Recognizing Ureter and Uterine Artery in Endoscopic Images Using a Convolutional Neural Network. In: 2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS), pp. 726–727. doi: 10.1109/CBMS.2017.137. ISSN: 2372-9198 [DOI] [Google Scholar]
  122. Hashmi ZG, Kaji AH, Nathens AB, 2018. Practical guide to surgical data sets: national trauma data bank (NTDB). JAMA Surg 153 (9), 852–853. doi: 10.1001/jamasurg.2018.0483. [DOI] [PubMed] [Google Scholar]
  123. Hattab G, Arnold M, Strenger L, Allan M, Arsentjeva D, Gold O, Simpfendörfer T, Maier-Hein L, Speidel S, 2020. Kidney edge detection in laparoscopic image data for computer-assisted surgery. Int J Comput Assist Radiol Surg 15 (3), 379–387. [DOI] [PubMed] [Google Scholar]
  124. Healthcare IT News, 2012. Leveraging Big Data and Analytics in Healthcare and Life Sciences: Enabling Personalized Medicine for High-Quality Care, Better Outcomes. https://www.healthcareitnews.com/resource/leveraging-big-data-and-analytics-healthcare-and-life-sciences-enabling-personalized-medici.
  125. Heim E, Seitel A, Andrulis J, Isensee F, Stock C, Ross T, Maier-Hein L, 2018. Clickstream analysis for crowd-based object segmentation with confidence. IEEE Trans Pattern Anal Mach Intell 40 (12), 2814–2826. doi: 10.1109/TPAMI.2017.2777967. [DOI] [PubMed] [Google Scholar]
  126. Heimann T, Mountney P, John M, Ionasec R, 2013. Learning without Labeling: Domain Adaptation for Ultrasound Transducer Localization. In: Mori K, Sakuma I, Sato Y, Barillot C, Navab N (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013. Springer, Berlin, Heidelberg, pp. 49–56. doi: 10.1007/978-3-642-40760-4_7. [DOI] [PubMed] [Google Scholar]
  127. HeiSurf,. Surgical workflow analysis and full scene segmentation. https://www.synapse.org/#!Synapse:syn25101790/wiki/.
  128. Hirasawa T, Aoyama K, Tanimoto T, Ishihara S, Shichijo S, Ozawa T, Ohnishi T, Fujishiro M, Matsuo K, Fujisaki J, Tada T, 2018. Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer 21 (4), 653–660. doi: 10.1007/s10120-018-0793-2. [DOI] [PubMed] [Google Scholar]
  129. Ho D-A, Beyan O, 2020. Biases in data science lifecycle. arXiv:2009.09795 [cs]. ArXiv: 2009.09795, http://arxiv.org/abs/2009.09795 [Google Scholar]
  130. Holmgren AJ, Adler-Milstein J, 2017. Health information exchange in US hospitals: the current landscape and a path to improved information sharing. J. Hosp. Med 12 (3), 193–198. [DOI] [PubMed] [Google Scholar]
  131. Hsu C-C, Sandford B, 2007. The Delphi technique: making sense of consensus. Practical Assessment, Research and Evaluation 12 (10), 1–8. [Google Scholar]
  132. Huaulmé A, Sarikaya D, Mut KL, Despinoy F, Long Y, Dou Q, Chng C-B, Lin W, Kondo S, Bravo-Sánchez L, Arbeláez P, Reiter W, Mitsuishi M, Harada K, Jannin P, 2021. MIcro-Surgical anastomose workflow recognition challenge report. arXiv:2103.13111 [cs]. ArXiv: 2103.13111, http://arxiv.org/abs/2103.13111 [DOI] [PubMed] [Google Scholar]
  133. Hull L, Arora S, Aggarwal R, Darzi A, Vincent C, Sevdalis N, 2012. The impact of nontechnical skills on technical performance in surgery: a systematic review. J. Am. Coll. Surg 214 (2), 214–230. doi: 10.1016/j.jamcollsurg.2011.10.016. [DOI] [PubMed] [Google Scholar]
  134. Hussain M, Khattak AM, Khan WA, Fatima I, Amin MB, Pervez Z, Batool R, Saleem MA, Afzal M, Faheem M, Saddiqi MH, Lee SY, Latif K, 2013. Cloud-based smart CDSS for chronic diseases. Health Technol. 3 (2), 153–175. [Google Scholar]
  135. Hyland SL, Faltys M, Hüser M, Lyu X, Gumbsch T, Esteban C, Bock C, Horn M, Moor M, Rieck B, Zimmermann M, Bodenham D, Borgwardt K, Rätsch G, Merz TM, 2020. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat. Med 26 (3), 364–373. doi: 10.1038/s41591-020-0789-4. [DOI] [PubMed] [Google Scholar]
  136. IEEE, 1991. IEEE Standard computer dictionary: a compilation of IEEE standard computer glossaries. IEEE Std 610 1–217. doi: 10.1109/IEEESTD.1991.106963. [DOI] [Google Scholar]
  137. IHU Strasbourg,. IHU Strasbourg | Institut de chirurgie guidée par l’image. https://www.ihu-strasbourg.eu/.
  138. Innovation Center Computer Assisted Surgery (ICCAS),. ICCAS - Innovation Center Computer Assisted Surgery. https://www.iccas.de/.
  139. Institute of Medicine (USA) Roundtable on Value & Science-Driven Health Care, 2010. Clinical data as the basic staple of health learning: Creating and protecting a public good: Workshop summary. National Academies Press (USA), Washington (DC). http://www.ncbi.nlm.nih.gov/books/NBK54302/ [PubMed] [Google Scholar]
  140. Iseki H, Muragaki Y, Tamura M, Suzuki T, Yoshimitsu K, Ikuta S, Okamoto J, Chernov M, Izumi K, 2012. SCOT (Smart Cyber Operating Theater) project: Advanced medical information analyzer for guidance of the surgical procedures. In: Society for Information Display - 19th International Display Workshops 2012, IDW/AD 2012, pp. 1880–1883. https://waseda.pure.elsevier.com/en/publications/scot-smart-cyber-operating-theater-project-advanced-medical-infor [Google Scholar]
  141. Itzkovich D, Sharon Y, Jarc A, Refaely Y, Nisky I, 2019. Using Augmentation to Improve the Robustness to Rotation of Deep Learning Segmentation in Robotic-Assisted Surgical Data. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 5068–5075. doi: 10.1109/ICRA.2019.8793963. [DOI] [Google Scholar]
  142. Joskowicz L, Cohen D, Caplan N, Sosna J, 2019. Inter-observer variability of manual contour delineation of structures in CT. Eur Radiol 29 (3), 1391–1399. doi: 10.1007/s00330-018-5695-5. [DOI] [PubMed] [Google Scholar]
  143. Jung JJ, Jüni P, Lebovic G, Grantcharov T, 2020. First-year analysis of the operating room black box study. Ann. Surg 271 (1), 122–127. doi: 10.1097/SLA.0000000000002863. [DOI] [PubMed] [Google Scholar]
  144. Kaijser MA, van Ramshorst GH, Emous M, Veeger NJGM, van Wagensveld BA, Pierie J-PEN, 2018. A Delphi consensus of the crucial steps in gastric bypass and sleeve gastrectomy procedures in the Netherlands. Obes Surg 28 (9), 2634–2643. doi: 10.1007/s11695-018-3219-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  145. Kaissis GA, Makowski MR, Rückert D, Braren RF, 2020. Secure, privacy-preserving and federated machine learning in medical imaging. Nature Machine Intelligence 2 (6), 305–311. [Google Scholar]
  146. Kalra D, Beale T, Heard S, 2005. The openEHR Foundation. Stud Health Technol Inform 115, 153–173. [PubMed] [Google Scholar]
  147. Katić D, Julliard C, Wekerle A-L, Kenngott H, Müller-Stich BP, Dillmann R, Speidel S, Jannin P, Gibaud B, 2016. Erratum to: LapOntoSPM: an ontology for laparoscopic surgeries and its application to surgical phase recognition. Int J Comput Assist Radiol Surg 11 (4), 679. doi: 10.1007/s11548-015-1314-y. [DOI] [PubMed] [Google Scholar]
  148. Katić D, Maleshkova M, Engelhardt S, Wolf I, März K, Maier-Hein L, Nolden M, Wagner M, Kenngott H, Müller-Stich BP, Dillmann R, Speidel S, 2017. What does it all mean? capturing semantics of surgical data and algorithms with ontologies. arXiv:1705.07747 [cs]. ArXiv: 1705.07747, http://arxiv.org/abs/1705.07747 [Google Scholar]
  149. Katić D, Schuck J, Wekerle A-L, Kenngott H, Müller-Stich BP, Dillmann R, Speidel S, 2016. Bridging the gap between formal and experience-based knowledge for context-aware laparoscopy. Int J Comput Assist Radiol Surg 11 (6), 881–888. doi: 10.1007/s11548-016-1379-2. [DOI] [PubMed] [Google Scholar]
  150. Kehlet H, Wilmore DW, 2008. Evidence-based surgical care and the evolution of fast-track surgery. Ann. Surg 248 (2), 189–198. doi: 10.1097/SLA.0b013e31817f2c1a. [DOI] [PubMed] [Google Scholar]
  151. Kendall A, Gal Y, 2017. What uncertainties do we need in Bayesian deep learning for computer vision? In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc., Long Beach, California, USA, pp. 5580–5590. [Google Scholar]
  152. Keras,. Keras: the Python deep learning API. https://keras.io/.
  153. Khronos Group, 2020. The Khronos Group. https://www.khronos.org.
  154. Kickingereder P, Isensee F, Tursunova I, Petersen J, Neuberger U, Bonekamp D, Brugnara G, Schell M, Kessler T, Foltyn M, Harting I, Sahm F, Prager M, Nowosielski M, Wick A, Nolden M, Radbruch A, Debus J, Schlemmer H-P, Heiland S, Platten M, Deimling A.v., Bent M.J.v.d., Gorlia T, Wick W, Bendszus M, Maier-Hein, 2019. Automated quantitative tumour response assessment of MRI in neuro-oncology with artificial neural networks: a multicentre, retrospective study. The Lancet Oncology 20 (5), 728–740. doi: 10.1016/S1470-2045(19)30098-1. [DOI] [PubMed] [Google Scholar]
  155. Kilian M, Avery M, Todd S, Teugels T, Martens F, Bruyndonckx J, 2015. Method and apparatus for data retention in a storage system. https://patents.google.com/patent/US9075851/en.
  156. Kim S, Kim C, Kim J, 2017. Reliable smart energy IoT-cloud service operation with container orchestration. In: 2017 19th Asia-Pacific Network Operations and Management Symposium (APNOMS), pp. 378–381. [Google Scholar]
  157. Koh PW, Liang P, 2017. Understanding Black-box Predictions via Influence Functions. In: International Conference on Machine Learning, pp. 1885–1894. http://proceedings.mlr.press/v70/koh17a.html [Google Scholar]
  158. Kohl S, Romera-Paredes B, Meyer C, De Fauw J, Ledsam JR, Maier-Hein K, Eslami SMA, Jimenez Rezende D, Ronneberger O, 2018. A probabilistic U-Net for segmentation of ambiguous images. Adv Neural Inf Process Syst 31. https://proceedings.neurips.cc/paper/2018/hash/473447ac58e1cd7e96172575f48dca3b-Abstract.html [Google Scholar]
  159. Kokkinos I, 2017. UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5454–5463. doi: 10.1109/CVPR.2017.579. [DOI] [Google Scholar]
  160. Komorowski M, Celi LA, Badawi O, Gordon AC, Faisal AA, 2018. The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care. Nat. Med 24 (11), 1716–1720. doi: 10.1038/s41591-018-0213-5. [DOI] [PubMed] [Google Scholar]
  161. Konečný J, Brendan McMahan H, Yu FX, Richtárik P, Suresh AT, Bacon D, 2016. Federated learning: strategies for improving communication efficiency. CoRR abs/1610.05492. [Google Scholar]
  162. Korndorffer JRJ, Hawn MT, Spain DA, Knowlton LM, Azagury DE, Nassar AK, Lau JN, Arnow KD, Trickey AW, Pugh CM, 2020. Situating artificial intelligence in surgery: a focus on disease severity. Ann. Surg 272 (3), 523–528. doi: 10.1097/SLA.0000000000004207. [DOI] [PubMed] [Google Scholar]
  163. Koubaa A (Ed.), 2016. Robot operating system (ROS): The complete reference (volume 1). Springer, Cham. [Google Scholar]
  164. Kricka LJ, 2019. History of disruptions in laboratory medicine: what have we learned from predictions? Clin. Chem. Lab. Med 57 (3), 308–311. [DOI] [PubMed] [Google Scholar]
  165. Lalys F, Jannin P, 2014. Surgical process modelling: a review. Int J Comput Assist Radiol Surg 9 (3), 495–511. doi: 10.1007/s11548-013-0940-5. [DOI] [PubMed] [Google Scholar]
  166. Langlotz CP, 2006. RadLex: a new method for indexing online educational materials. Radiographics: A Review Publication of the Radiological Society of North America, Inc 26 (6), 1595–1597. doi: 10.1148/rg.266065168. [DOI] [PubMed] [Google Scholar]
  167. Larrazabal AJ, Nieto N, Peterson V, Milone DH, Ferrante E, 2020. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proceedings of the National Academy of Sciences 117 (23), 12592–12594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  168. Lecuyer G, Ragot M, Martin N, Launay L, Jannin P, 2020. Assisted phase and step annotation for surgical videos. Int J Comput Assist Radiol Surg 15 (4), 673–680. doi: 10.1007/s11548-019-02108-8. [DOI] [PubMed] [Google Scholar]
  169. Lee W-S, Ahn SM, Chung J-W, Kim KO, Kwon KA, Kim Y, Sym S, Shin D, Park I, Lee U, Baek J-H, 2018. Assessing concordance with Watson for Oncology, a cognitive computing decision support system for colon cancer treatment in Korea. JCO Clinical Cancer Informatics 2, 1–8. doi: 10.1200/CCI.17.00109. [DOI] [PubMed] [Google Scholar]
  170. Lehne M, Sass J, Essenwanger A, Schepers J, Thun S, 2019. Why digital medicine depends on interoperability. NPJ Digit Med 2, 79. [DOI] [PMC free article] [PubMed] [Google Scholar]
  171. Li W, Milletarì F, Xu D, Rieke N, Hancox J, Zhu W, Baust M, Cheng Y, Ourselin S, Cardoso MJ, Feng A, 2019. Privacy-Preserving Federated Brain Tumour Segmentation. In: Suk H-I, Liu M, Yan P, Lian C (Eds.), Machine Learning in Medical Imaging. Springer International Publishing, Cham, pp. 133–141. doi: 10.1007/978-3-030-32692-0_16. [DOI] [Google Scholar]
  172. Liu X, Tsaftaris SA, 2020. Have you forgotten? a method to assess if machine learning models have forgotten data. arXiv:2004.10129 [cs]. ArXiv: 2004.10129, http://arxiv.org/abs/2004.10129 [Google Scholar]
  173. Long M, Cao Z, Wang J, Yu PS, 2017. Learning multiple tasks with multilinear relationship networks. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc., Long Beach, California, USA, pp. 1593–1602. [Google Scholar]
  174. Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, Liston DE, King-Wai Low D, Newman S-F, Kim J, Lee S-I, 2018. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat. Biomed. Eng 2 (10), 749–760. doi: 10.1038/s41551-018-0304-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  175. Madani A, Namazi B, Altieri MS, Hashimoto DA, Rivera AM, Pucher PH, Navarrete-Welton A, Sankaranarayanan G, Brunt LM, Okrainec A, Alseidi A, 2021. Artificial intelligence for intraoperative guidance: using semantic segmentation to identify surgical anatomy during laparoscopic cholecystectomy. Ann. Surg Publish Ahead of Print. doi: 10.1097/SLA.0000000000004594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  176. Mai R-Y, Lu H-Z, Bai T, Liang R, Lin Y, Ma L, Xiang B-D, Wu G-B, Li L-Q, Ye J-Z, 2020. Artificial neural network model for preoperative prediction of severe liver failure after hemihepatectomy in patients with hepatocellular carcinoma. Surgery 168 (4), 643–652. doi: 10.1016/j.surg.2020.06.031. [DOI] [PubMed] [Google Scholar]
  177. Maier-Hein L, Eisenmann M, Feldmann C, Feussner H, Forestier G, Giannarou S, Gibaud B, Hager GD, Hashizume M, Katic D, Kenngott H, Kikinis R, Kranzfelder M, Malpani A, März K, Müller-Stich B, Navab N, Neumuth T, Padoy N, Park A, Pugh C, Schoch N, Stoyanov D, Taylor R, Wagner M, Swaroop Vedula S, Jannin P, Speidel S, 2018. Surgical data science: a consensus perspective. arXiv e-prints. arXiv:1806.03184 [Google Scholar]
  178. Maier-Hein L, Eisenmann M, Reinke A, Onogur S, Stankovic M, Scholz P, Arbel T, Bogunovic H, Bradley AP, Carass A, Feldmann C, Frangi AF, Full PM, van Ginneken B, Hanbury A, Honauer K, Kozubek M, Landman BA, März K, Maier O, Maier-Hein K, Menze BH, Müller H, Neher PF, Niessen W, Rajpoot N, Sharp GC, Sirinukunwattana K, Speidel S, Stock C, Stoyanov D, Taha AA, van der Sommen F, Wang C-W, Weber M-A, Zheng G, Jannin P, Kopp-Schneider A, 2018. Why rankings of biomedical image analysis competitions should be interpreted with care. Nat Commun 9 (1), 5217. doi: 10.1038/s41467-018-07619-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  179. Maier-Hein L, Kondermann D, Roß T, Mersmann S, Heim E, Bodenstedt S, Kenngott HG, Sanchez A, Wagner M, Preukschas A, Wekerle A-L, Helfert S, März K, Mehrabi A, Speidel S, Stock C, 2015. Crowdtruth validation: a new paradigm for validating algorithms that rely on image correspondences. Int J Comput Assist Radiol Surg 10 (8), 1201–1212. doi: 10.1007/s11548-015-1168-3. [DOI] [PubMed] [Google Scholar]
  180. Maier-Hein L, Mersmann S, Kondermann D, Bodenstedt S, Sanchez A, Stock C, Kenngott HG, Eisenmann M, Speidel S, 2014. Can Masses of Non-Experts Train Highly Accurate Image Classifiers? In: Medical Image Computing and Computer-Assisted Intervention - MICCAI 2014. Springer, Cham, pp. 438–445. doi: 10.1007/978-3-319-10470-6_55. [DOI] [PubMed] [Google Scholar]
  181. Maier-Hein L, Reinke A, Kozubek M, Martel AL, Arbel T, Eisenmann M, Hanbury A, Jannin P, Müller H, Onogur S, Saez-Rodriguez J, van Ginneken B, Kopp-Schneider A, Landman BA, 2020. BIAS: Transparent reporting of biomedical image analysis challenges. Med Image Anal 66, 101796. doi: 10.1016/j.media.2020.101796. [DOI] [PMC free article] [PubMed] [Google Scholar]
  182. Maier-Hein L, Ross T, Gröhl J, Glocker B, Bodenstedt S, Stock C, Heim E, Götz M, Wirkert S, Kenngott H, Speidel S, Maier-Hein K, 2016. Crowd-Algorithm Collaboration for Large-Scale Endoscopic Image Annotation with Confidence. In: Ourselin S, Joskowicz L, Sabuncu MR, Unal G, Wells W (Eds.), Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016. Springer International Publishing, Cham, pp. 616–623. doi: 10.1007/978-3-319-46723-8_71. [DOI] [Google Scholar]
  183. Maier-Hein L, Vedula SS, Speidel S, Navab N, Kikinis R, Park A, Eisenmann M, Feussner H, Forestier G, Giannarou S, Hashizume M, Katic D, Kenngott H, Kranzfelder M, Malpani A, März K, Neumuth T, Padoy N, Pugh C, Schoch N, Stoyanov D, Taylor R, Wagner M, Hager GD, Jannin P, 2017. Surgical data science for next-generation interventions. Nat. Biomed. Eng 1 (9), 691–696. doi: 10.1038/s41551-017-0132-7. [DOI] [PubMed] [Google Scholar]
  184. Maier-Hein L, Wagner M, Ross T, Reinke A, Bodenstedt S, Full PM, Hempe H, Mindroc-Filimon D, Scholz P, Tran TN, Bruno P, Kisilenko A, Müller B, Davitashvili T, Capek M, Tizabi MD, Eisenmann M, Adler TJ, Gröhl J, Schellenberg M, Seidlitz S, Lai TYE, Pekdemir B, Roethlingshoefer V, Both F, Bittel S, Mengler M, Mündermann L, Apitz M, Kopp-Schneider A, Speidel S, Nickel F, Probst P, Kenngott HG, Müller-Stich BP, 2021. Heidelberg colorectal data set for surgical data science in the sensor operating room. Sci Data 8 (1), 101. doi: 10.1038/s41597-021-00882-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  185. Malpani A, Vedula SS, Chen CCG, Hager GD, 2015. A study of crowdsourced segment-level surgical skill assessment using pairwise rankings. Int J Comput Assist Radiol Surg 10 (9), 1435–1447. doi: 10.1007/s11548-015-1238-6. [DOI] [PubMed] [Google Scholar]
  186. Maninis K-K, Radosavovic I, Kokkinos I, 2019. Attentive Single-Tasking of Multiple Tasks. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1851–1860. doi: 10.1109/CVPR.2019.00195. [DOI] [Google Scholar]
  187. Marcus AP, Marcus HJ, Camp SJ, Nandi D, Kitchen N, Thorne L, 2020. Improved prediction of surgical resectability in patients with glioblastoma using an artificial neural network. Sci Rep 10 (1), 1–9. doi: 10.1038/s41598-020-62160-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  188. Marzahl C, Aubreville M, Bertram CA, Maier J, Bergler C, Kröger C, Voigt J, Breininger K, Klopfleisch R, Maier A, 2021. EXACT: a collaboration toolset for algorithm-aided annotation of images with annotation version control. Sci Rep 11 (1), 1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  189. Mascagni P, Alapatt D, Garcia A, Okamoto N, Vardazaryan A, Costamagna G, Dallemagne B, Padoy N, 2021. Surgical data science for safe cholecystectomy: a protocol for segmentation of hepatocystic anatomy and assessment of the critical view of safety. CoRR abs/2106.10916. https://arxiv.org/abs/2106.10916 [Google Scholar]
  190. Mascagni P, Alapatt D, Urade T, Vardazaryan A, Mutter D, Marescaux J, Costamagna G, Dallemagne B, Padoy N, 2021. A computer vision platform to automatically locate critical events in surgical videos: documenting safety in laparoscopic cholecystectomy. Ann Surg 274 (1), e93–e95. [DOI] [PubMed] [Google Scholar]
  191. Mascagni P, Fiorillo C, Urade T, Emre T, Yu T, Wakabayashi T, Felli E, Perretta S, Swanstrom L, Mutter D, Marescaux J, Pessaux P, Costamagna G, Padoy N, Dallemagne B, 2020. Formalizing video documentation of the critical view of safety in laparoscopic cholecystectomy: a step towards artificial intelligence assistance to improve surgical safety. Surg Endosc 34 (6), 2709–2714. doi: 10.1007/s00464-019-07149-3. [DOI] [PubMed] [Google Scholar]
  192. Mascagni P, Longo F, Barberio M, Seeliger B, Agnus V, Saccomandi P, Hostettler A, Marescaux J, Diana M, 2018. New intraoperative imaging technologies: innovating the surgeon’s eye toward surgical precision. J Surg Oncol 118 (2), 265–282. [DOI] [PubMed] [Google Scholar]
  193. Mascagni P, Padoy N, 2021. OR black box and surgical control tower: recording and streaming data and analytics to improve surgical care. J Visc Surg 158 (3S), S18–S25. [DOI] [PubMed] [Google Scholar]
  194. Mascagni P, Vardazaryan A, Alapatt D, Urade T, Emre T, Fiorillo C, Pessaux P, Mutter D, Marescaux J, Costamagna G, Dallemagne B, Padoy N, 2020. Artificial intelligence for surgical safety: automatic assessment of the critical view of safety in laparoscopic cholecystectomy using deep learning. Ann Surg. [DOI] [PubMed] [Google Scholar]
  195. McCulloch P, Taylor I, Sasako M, Lovett B, Griffin D, 2002. Randomised trials in surgery: problems and possible solutions. BMJ 324 (7351), 1448–1451. doi: 10.1136/bmj.324.7351.1448. [DOI] [PMC free article] [PubMed] [Google Scholar]
  196. McKinney SM, Sieniek M, Godbole V, Godwin J, Antropova N, Ashrafian H, Back T, Chesus M, Corrado GC, Darzi A, Etemadi M, Garcia-Vicente F, Gilbert FJ, Halling-Brown M, Hassabis D, Jansen S, Karthikesalingam A, Kelly CJ, King D, Ledsam JR, Melnick D, Mostofi H, Peng L, Reicher JJ, Romera-Paredes B, Sidebottom R, Suleyman M, Tse D, Young KC, Fauw JD, Shetty S, 2020. International evaluation of an AI system for breast cancer screening. Nature 577 (7788), 89–94. doi: 10.1038/s41586-019-1799-6. [DOI] [PubMed] [Google Scholar]
  197. MDCG 2019–11, 2019. Guidance on Qualification and Classification of Software in Regulation (EU) 2017/745 - MDR and Regulation (EU) 2017/746 - IVDR. https://ec.europa.eu/docsroom/documents/37581.
  198. Medtronic plc (Dublin, Ireland),. Intelligentes Endoskopie-Modul GI Genius | Medtronic (Deutschland). https://www.medtronic.com/covidien/de-de/products/gastrointestinal-artificial-intelligence/gi-genius-intelligent-endoscopy.html#.
  199. Meireles OR, Rosman G, Altieri MS, Carin L, Hager G, Madani A, Padoy N, Pugh CM, Sylla P, Ward TM, et al., 2021. SAGES consensus recommendations on an annotation framework for surgical video. Surg Endosc 1–12. [DOI] [PubMed] [Google Scholar]
  200. Merkow RP, Rademaker AW, Bilimoria KY, 2018. Practical guide to surgical data sets: national cancer database (NCDB). JAMA Surg 153 (9), 850–851. doi: 10.1001/jamasurg.2018.0492. [DOI] [PubMed] [Google Scholar]
  201. Meyer A, Zverinski D, Pfahringer B, Kempfert J, Kuehne T, Sündermann SH, Stamm C, Hofmann T, Falk V, Eickhoff C, 2018. Machine learning for real-time prediction of complications in critical care: a retrospective study. The Lancet Respiratory Medicine 6 (12), 905–914. doi: 10.1016/S2213-2600(18)30300-X. [DOI] [PubMed] [Google Scholar]
  202. Miladinovic I, Schefer-Wenzl S, 2018. NFV enabled IoT architecture for an operating room environment. In: 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), pp. 98–102. [Google Scholar]
  203. Miotto R, Wang F, Wang S, Jiang X, Dudley JT, 2018. Deep learning for healthcare: review, opportunities and challenges. Brief. Bioinformatics 19 (6), 1236–1246. doi: 10.1093/bib/bbx044. [DOI] [PMC free article] [PubMed] [Google Scholar]
  204. Misra I, Shrivastava A, Gupta A, Hebert M, 2016. Cross-Stitch Networks for Multi-task Learning. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3994–4003. doi: 10.1109/CVPR.2016.433. [DOI] [Google Scholar]
  205. Mitchell M, 2019. Artificial intelligence: A guide for thinking humans. Farrar, Straus and Giroux. [Google Scholar]
  206. Miñarro-Giménez JA, Cornet R, Jaulent MC, Dewenter H, Thun S, Gøeg KR, Karlsson D, Schulz S, 2019. Quantitative analysis of manual annotation of clinical text samples. Int. J. Med. Inform 123, 37–48. [DOI] [PubMed] [Google Scholar]
  207. MLCommons, 2018. MLCommons - machine learning innovation to benefit everyone. Staging Public. https://mlcommons.org/. [Google Scholar]
  208. Moccia S, Wirkert SJ, Kenngott H, Vemuri AS, Apitz M, Mayer B, De Momi E, Mattos LS, Maier-Hein L, 2018. Uncertainty-aware organ classification for surgical data science applications in laparoscopy. IEEE Trans. Biomed. Eng 65 (11), 2649–2659. doi: 10.1109/TBME.2018.2813015. [DOI] [PubMed] [Google Scholar]
  209. Model Zoo,. Model Zoo - Deep learning code and pretrained models for transfer learning, educational purposes, and more. https://modelzoo.co/.
  210. Mun SK, Cleary K, 2005. The operating room of the future: review of OR 2020 workshop. In: Medical Imaging 2005: PACS and Imaging Informatics. International Society for Optics and Photonics, pp. 73–82. doi: 10.1117/12.604719. [DOI] [Google Scholar]
  211. Mutter D, Vix M, Dallemagne B, Perretta S, Leroy J, Marescaux J, 2011. Websurg: an innovative educational web site in minimally invasive surgery-principles and results. Surg Innov doi: 10.1177/1553350611398880. [DOI] [PubMed] [Google Scholar]
  212. März K, Hafezi M, Weller T, Saffari A, Nolden M, Fard N, Majlesara A, Zelzer S, Maleshkova M, Volovyk M, Gharabaghi N, Wagner M, Emami G, Engelhardt S, Fetzer A, Kenngott H, Rezai N, Rettinger A, Studer R, Mehrabi A, Maier-Hein L, 2015. Toward knowledge-based liver surgery: holistic information processing for surgical decision support. Int J Comput Assist Radiol Surg 10 (6), 749–759. doi: 10.1007/s11548-015-1187-0. [DOI] [PubMed] [Google Scholar]
  213. Nagendran M, Chen Y, Lovejoy CA, Gordon AC, Komorowski M, Harvey H, Topol EJ, Ioannidis JPA, Collins GS, Maruthappu M, 2020. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ 368. doi: 10.1136/bmj.m689. [DOI] [PMC free article] [PubMed] [Google Scholar]
  214. National Center for Tumor Diseases Dresden (NCT/UCC),. National Center for Tumor Diseases Dresden (NCT/UCC). https://www.nct-dresden.de/en.html.
  215. National Center for Tumor Diseases Heidelberg,. Surgical Oncology - National Center for Tumor Diseases Heidelberg. https://www.nct-heidelberg.de/forschung/precision-local-therapy-and-image-guidance/surgical-oncology.html. [PubMed]
  216. Navarrete-Welton AJ, Hashimoto DA, 2020. Current applications of artificial intelligence for intraoperative decision support in surgery. Front Med doi: 10.1007/s11684-020-0784-7. [DOI] [PubMed] [Google Scholar]
  217. Nazábal A, Olmos PM, Ghahramani Z, Valera I, 2020. Handling incomplete heterogeneous data using VAEs. Pattern Recognit 107, 107501. doi: 10.1016/j.patcog.2020.107501. [DOI] [Google Scholar]
  218. Neuschler EI, Butler R, Young CA, Barke LD, Bertrand ML, Böhm-Vélez M, Destounis S, Donlan P, Grobmyer SR, Katzen J, Kist KA, Lavin PT, Makariou EV, Parris TM, Schilling KJ, Tucker FL, Dogan BE, 2017. A pivotal study of optoacoustic imaging to diagnose benign and malignant breast masses: a new evaluation tool for radiologists. Radiology 287 (2), 398–412. doi: 10.1148/radiol.2017172228. [DOI] [PubMed] [Google Scholar]
  219. Nguyen G, Dlugolinsky S, Bobák M, Tran V, López García A, Heredia I, Malík P, Hluchý L, 2019. Machine learning and deep learning frameworks and libraries for large-scale data mining: a survey. Artif Intell Rev 52 (1), 77–124. doi: 10.1007/s10462-018-09679-z. [DOI] [Google Scholar]
  220. Nichols TE, Das S, Eickhoff SB, Evans AC, Glatard T, Hanke M, Kriegeskorte N, Milham MP, Poldrack RA, Poline J-B, Proal E, Thirion B, Essen DCV, White T, Yeo BTT, 2017. Best practices in data analysis and sharing in neuroimaging using MRI. Nat. Neurosci 20 (3), 299–303. doi: 10.1038/nn.4500. [DOI] [PMC free article] [PubMed] [Google Scholar]
  221. NiftyNet,. NiftyNet - An open source convolutional neural networks platform for medical image analysis and image-guided therapy. https://niftynet.io/.
  222. NVIDIA Clara,. NVIDIA Clara - An Application Framework Optimized for Healthcare and Life Sciences Developers. https://developer.nvidia.com/clara.
  223. NVIDIA DIGITS,. NVIDIA Deep Learning GPU Training System (DIGITS). https://developer.nvidia.com/digits. [Google Scholar]
  224. Nölke J-H, Adler TJ, Gröhl J, Maier-Hein L, Kirchner T, Ardizzone L, Rother C, Köthe U, 2021. Invertible neural networks for uncertainty quantification in photoacoustic imaging. In: Photons Plus Ultrasound: Imaging and Sensing 2021. International Society for Optics and Photonics, p. 116421Q. doi: 10.1117/12.2578183. [DOI] [Google Scholar]
  225. Oakden-Rayner L, Dunnmon J, Carneiro G, Ré C, 2020. Hidden stratification causes clinically meaningful failures in machine learning for medical imaging. Proc ACM Conf Health Inference Learn (2020) 2020, 151–159. [DOI] [PMC free article] [PubMed] [Google Scholar]
  226. Ongenae F, Bonte P, Schaballie J, Vankeirsbilck B, De Turck F, 2016. Semantic Context Consolidation and Rule Learning for Optimized Transport Assignments in Hospitals. In: The Semantic Web. Springer International Publishing, pp. 88–92. [Google Scholar]
  227. ONNX Model Zoo, 2020. ONNX Model Zoo - A collection of pre-trained, state-of-the-art models in the ONNX format. https://github.com/onnx/models.
  228. OP 4.1,. OP 4.1 https://op41.de/.
  229. Open Neural Network Exchange (ONNX),. ONNX - the open standard for machine learning interoperability. https://onnx.ai/.
  230. Ostler D, Seibold M, Fuchtmann J, Samm N, Feussner H, Wilhelm D, Navab N, 2020. Acoustic signal analysis of instrument-tissue interaction for minimally invasive interventions. Int J Comput Assist Radiol Surg 15 (5), 771–779. doi: 10.1007/s11548-020-02146-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  231. Padoy N, 2019. Machine and deep learning for workflow recognition during surgery. Minimally invasive therapy & allied technologies: MITAT: official journal of the Society for Minimally Invasive Therapy 28 (2), 82–90. doi: 10.1080/13645706.2019.1584116. [DOI] [PubMed] [Google Scholar]
  232. Padoy N, Blum T, Ahmadi S-A, Feussner H, Berger M-O, Navab N, 2012. Statistical modeling and recognition of surgical workflow. Med Image Anal 16 (3), 632–641. doi: 10.1016/j.media.2010.10.001. [DOI] [PubMed] [Google Scholar]
  233. Pan SJ, Yang Q, 2010. A survey on transfer learning. IEEE Trans Knowl Data Eng 22 (10), 1345–1359. doi: 10.1109/TKDE.2009.191. [DOI] [Google Scholar]
  234. Parisi GI, Kemker R, Part JL, Kanan C, Wermter S, 2019. Continual lifelong learning with neural networks: a review. Neural Networks 113, 54–71. doi: 10.1016/j.neunet.2019.01.012. [DOI] [PubMed] [Google Scholar]
  235. Peters BS, Armijo PR, Krause C, Choudhury SA, Oleynikov D, 2018. Review of emerging surgical robotic technology. Surg Endosc 32 (4), 1636–1655. doi: 10.1007/s00464-018-6079-2. [DOI] [PubMed] [Google Scholar]
  236. Peters J, Janzing D, Schölkopf B, 2017. Elements of causal inference - foundations and learning algorithms. The MIT Press, Cambridge, MA, USA. [Google Scholar]
  237. PETRAW,. Peg transfer workflow recognition by different modalities. https://www.synapse.org/#!Synapse:syn25147789/wiki/.
  238. Pfeiffer M, Funke I, Robu MR, Bodenstedt S, Strenger L, Engelhardt S, Roß T, Clarkson MJ, Gurusamy K, Davidson BR, Maier-Hein L, Riediger C, Welsch T, Weitz J, Speidel S, 2019. Generating Large Labeled Data Sets for Laparoscopic Image Processing Tasks Using Unpaired Image-to-Image Translation. In: Shen D, Liu T, Peters TM, Staib LH, Essert C, Zhou S, Yap P-T, Khan A (Eds.), Medical Image Computing and Computer Assisted Intervention - MICCAI 2019. Springer International Publishing, Cham, pp. 119–127. doi: 10.1007/978-3-030-32254-0_14. [DOI] [Google Scholar]
  239. PHG Foundation, 2020. Black box medicine and transparency. https://www.phgfoundation.org/research/black-box-medicine-and-transparency. [Google Scholar]
  240. Pugh CM, Ghazi A, Stefanidis D, Schwaitzberg SD, Martino MA, Levy JS, 2020. How wearable technology can facilitate AI analysis of surgical videos. Annals of Surgery Open 1 (2), e011. doi: 10.1097/AS9.0000000000000011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  241. PyTorch,. PyTorch - an open source machine learning framework that accelerates the path from research prototyping to production deployment. https://www.pytorch.org.
  242. Raghu M, Zhang C, Kleinberg J, Bengio S, 2019. Transfusion: understanding transfer learning for medical imaging. arXiv:1902.07208 [cs, stat]. ArXiv: 1902.07208, http://arxiv.org/abs/1902.07208 [Google Scholar]
  243. Raval MV, Pawlik TM, 2018. Practical guide to surgical data sets: national surgical quality improvement program (NSQIP) and pediatric NSQIP. JAMA Surg 153 (8), 764–765. doi: 10.1001/jamasurg.2018.0486. [DOI] [PubMed] [Google Scholar]
  244. Ravasio CS, Pissas T, Bloch E, Flores B, Jalali S, Stoyanov D, Cardoso JM, Da Cruz L, Bergeles C, 2020. Learned optical flow for intra-operative tracking of the retinal fundus. Int J Comput Assist Radiol Surg 15 (5), 827–836. doi: 10.1007/s11548-020-02160-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  245. Reinke A, Eisenmann M, Tizabi MD, Sudre CH, Rädsch T, Antonelli M, Arbel T, Bakas S, Cardoso MJ, Cheplygina V, Farahani K, Glocker B, Heckmann-Nötzel D, Isensee F, Jannin P, Kahn CE, Kleesiek J, Kurc T, Kozubek M, Landman BA, Litjens G, Maier-Hein K, Menze B, Müller H, Petersen J, Reyes M, Rieke N, Stieltjes B, Summers RM, Tsaftaris SA, van Ginneken B, Kopp-Schneider A, Jäger P, Maier-Hein L, 2021. Common limitations of image processing metrics: a picture story. arXiv:2104.05642 [cs, eess]. ArXiv: 2104.05642, http://arxiv.org/abs/2104.05642 [Google Scholar]
  246. Reyes M, Meier R, Pereira S, Silva CA, Dahlweid F-M, Tengg-Kobligk H.v., Summers RM, Wiest R, 2020. On the interpretability of artificial intelligence in radiology: challenges and opportunities. Radiology: Artificial Intelligence 2 (3), e190043. doi: 10.1148/ryai.2020190043. [DOI] [PMC free article] [PubMed] [Google Scholar]
  247. Rieke N, Hancox J, Li W, Milletarì F, Roth HR, Albarqouni S, Bakas S, Galtier MN, Landman BA, Maier-Hein K, Ourselin S, Sheller M, Summers RM, Trask A, Xu D, Baust M, Cardoso MJ, 2020. The future of digital health with federated learning. npj Digital Medicine 3 (1), 1–7. doi: 10.1038/s41746-020-00323-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  248. Rigante M, La Rocca G, Lauretti L, D’Alessandris G, Mangiola A, Anile C, Olivi A, Paludetti G, 2017. Preliminary experience with 4K ultra-high definition endoscope: analysis of pros and cons in skull base surgery. Acta Otorhinolaryngologica Italica 37 (3), 237–241. doi: 10.14639/0392-100X-1684. [DOI] [PMC free article] [PubMed] [Google Scholar]
  249. Rivoir D, Pfeiffer M, Docea R, Kolbinger F, Riediger C, Weitz J, Speidel S, 2021. Long-term temporally consistent unpaired video translation from simulated surgical 3D data. arXiv preprint arXiv:2103.17204. [Google Scholar]
  250. Roberts M, Driggs D, Thorpe M, Gilbey J, Yeung M, Ursprung S, Aviles-Rivero AI, Etmann C, McCague C, Beer L, Weir-McCall JR, Teng Z, Gkrania-Klotsas E, Rudd JHF, Sala E, Schönlieb C-B, 2021. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence 3 (3), 199–217. doi: 10.1038/s42256-021-00307-0. [DOI] [Google Scholar]
  251. Rockstroh M, Franke S, Hofer M, Will A, Kasparick M, Andersen B, Neumuth T, 2017. OR.NET: Multi-perspective qualitative evaluation of an integrated operating room based on IEEE 11073 SDC. Int J Comput Assist Radiol Surg 12 (8), 1461–1469. doi: 10.1007/s11548-017-1589-2. [DOI] [PubMed] [Google Scholar]
  252. Roedder N, Dauer D, Laubis K, Karaenke P, Weinhardt C, 2016. The digital transformation and smart data analytics: An overview of enabling developments and application areas. In: 2016 IEEE International Conference on Big Data (Big Data), pp. 2795–2802. doi: 10.1109/BigData.2016.7840927. [DOI] [Google Scholar]
  253. van Roessel S, Strijker M, Steyerberg EW, Groen JV, Mieog JS, Groot VP, He J, De Pastena M, Marchegiani G, Bassi C, Suhool A, Jang J-Y, Busch OR, Halimi A, Zarantonello L, Groot Koerkamp B, Samra JS, Mittal A, Gill AJ, Bolm L, van Eijck CH, Abu Hilal M, Del Chiaro M, Keck T, Alseidi A, Wolfgang CL, Malleo G, Besselink MG, 2020. International validation and update of the Amsterdam model for prediction of survival after pancreatoduodenectomy for pancreatic cancer. European Journal of Surgical Oncology: The Journal of the European Society of Surgical Oncology and the British Association of Surgical Oncology 46 (5), 796–803. doi: 10.1016/j.ejso.2019.12.023. [DOI] [PubMed] [Google Scholar]
  254. Roh HF, Nam SH, Kim JM, 2018. Robot-assisted laparoscopic surgery versus conventional laparoscopic surgery in randomized controlled trials: a systematic review and meta-analysis. PLoS ONE 13 (1), e0191628. doi: 10.1371/journal.pone.0191628. [DOI] [PMC free article] [PubMed] [Google Scholar]
  255. Ross T, Zimmerer D, Vemuri A, Isensee F, Wiesenfarth M, Bodenstedt S, Both F, Kessler P, Wagner M, Müller B, Kenngott H, Speidel S, Kopp-Schneider A, Maier-Hein K, Maier-Hein L, 2018. Exploiting the potential of unlabeled endoscopic video data with self-supervised learning. Int J Comput Assist Radiol Surg 13 (6), 925–933. doi: 10.1007/s11548-018-1772-0. [DOI] [PubMed] [Google Scholar]
  256. Roß T, Bruno P, Reinke A, Wiesenfarth M, Koeppel L, Full PM, Pekdemir B, Godau P, Trofimova D, Isensee F, Moccia S, Calimeri F, Müller-Stich BP, Kopp-Schneider A, Maier-Hein L, 2021. How can we learn (more) from challenges? a statistical approach to driving future algorithm development. arXiv:2106.09302 [cs]. ArXiv: 2106.09302, http://arxiv.org/abs/2106.09302 [Google Scholar]
  257. Roß T, Reinke A, Full PM, Wagner M, Kenngott H, Apitz M, Hempe H, Mindroc-Filimon D, Scholz P, Tran TN, Bruno P, Arbeláez P, Bian G-B, Bodenstedt S, Bolmgren JL, Bravo-Sánchez L, Chen H-B, González C, Guo D, Halvorsen P, Heng P-A, Hosgor E, Hou Z-G, Isensee F, Jha D, Jiang T, Jin Y, Kirtac K, Kletz S, Leger S, Li Z, Maier-Hein KH, Ni Z-L, Riegler MA, Schoeffmann K, Shi R, Speidel S, Stenzel M, Twick I, Wang G, Wang J, Wang L, Wang L, Zhang Y, Zhou Y-J, Zhu L, Wiesenfarth M, Kopp-Schneider A, Müller-Stich BP, Maier-Hein L, 2021. Comparative validation of multi-instance instrument segmentation in endoscopy: results of the ROBUST-MIS 2019 challenge. Med Image Anal 70, 101920. doi: 10.1016/j.media.2020.101920. [DOI] [PubMed] [Google Scholar]
  258. Sablayrolles A, Douze M, Schmid C, Jégou H, 2020. Radioactive data: tracing through training. arXiv:2002.00937 [cs, stat]. ArXiv: 2002.00937, http://arxiv.org/abs/2002.00937 [Google Scholar]
  259. Sabour S, Frosst N, Hinton GE, 2017. Dynamic routing between capsules. arXiv:1710.09829 [cs]. ArXiv: 1710.09829, http://arxiv.org/abs/1710.09829 [Google Scholar]
  260. Sanford DE, Strasberg SM, 2014. A simple effective method for generation of a permanent record of the critical view of safety during laparoscopic cholecystectomy by intraoperative “doublet” photography. J Am Coll Surg 218 (2), 170–178. [DOI] [PubMed] [Google Scholar]
  261. Sarikaya D, Guru KA, Corso JJ, 2018. Joint surgical gesture and task classification with multi-task and multimodal learning. arXiv:1805.00721 [cs]. ArXiv: 1805.00721, http://arxiv.org/abs/1805.00721 [Google Scholar]
  262. Sarikaya D, Jannin P, 2020. Towards generalizable surgical activity recognition using spatial temporal graph convolutional networks. arXiv:2001.03728 [cs]. ArXiv: 2001.03728, http://arxiv.org/abs/2001.03728 [Google Scholar]
  263. Schellenberg M, Gröhl J, Dreher K, Holzwarth N, Tizabi MD, Seitel A, Maier-Hein L, 2021. Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis. arXiv:2103.15510 [physics]. ArXiv: 2103.15510, http://arxiv.org/abs/2103.15510 [Google Scholar]
  264. Schoenberg MB, Bucher JN, Koch D, Börner N, Hesse S, De Toni EN, Seidensticker M, Angele MK, Klein C, Bazhin AV, Werner J, Guba MO, 2020. A novel machine learning algorithm to predict disease free survival after resection of hepatocellular carcinoma. Ann Transl Med 8 (7), 434. doi: 10.21037/atm.2020.04.16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  265. Schwarz CG, Kremers WK, Therneau TM, Sharp RR, Gunter JL, Vemuri P, Arani A, Spychalla AJ, Kantarci K, Knopman DS, Petersen RC, Jack CR, 2019. Identification of anonymous MRI research participants with face-recognition software. N. Engl. J. Med 381 (17), 1684–1686. doi: 10.1056/NEJMc1908881. [DOI] [PMC free article] [PubMed] [Google Scholar]
  266. Schölkopf B, 2019. Causality for machine learning. arXiv:1911.10500 [cs, stat]. ArXiv: 1911.10500, http://arxiv.org/abs/1911.10500 [Google Scholar]
  267. scikit-learn,. scikit-learn: machine learning in Python. https://scikit-learn.org/stable/.
  268. Shankar V, Roelofs R, Mania H, Fang A, Recht B, Schmidt L, 2020. Evaluating Machine Accuracy on ImageNet. In: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, pp. 8634–8644. [Google Scholar]
  269. Sharghi A, Haugerud H, Oh D, Mohareri O, 2020. Automatic operating room surgical activity recognition for robot-assisted surgery. In: Medical Image Computing and Computer Assisted Intervention. In: Lecture Notes in Computer Science, Vol. 12263, pp. 385–395. [Google Scholar]
  270. Shen D, Wu G, Suk H-I, 2017. Deep learning in medical image analysis. Annu Rev Biomed Eng 19 (1), 221–248. doi: 10.1146/annurev-bioeng-071516-044442. [DOI] [PMC free article] [PubMed] [Google Scholar]
  271. Shrikumar A, Greenside P, Kundaje A, 2017. Learning important features through propagating activation differences. In: Proceedings of the 34th International Conference on Machine Learning - Volume 70. JMLR.org, Sydney, NSW, Australia, pp. 3145–3153. [Google Scholar]
  272. Shur J, Orton M, Connor A, Fischer S, Moulton C-A, Gallinger S, Koh D-M, Jhaveri KS, 2020. A clinical-radiomic model for improved prognostication of surgical candidates with colorectal liver metastases. J Surg Oncol 121 (2), 357–364. doi: 10.1002/jso.25783. [DOI] [PubMed] [Google Scholar]
  273. Sigma Surgical Corporation,. Sigma Surgical Corporation. https://www.sigmasurgical.com.
  274. SimSurgSkill,. Objective surgical skills assessment in VR simulation. https://www.synapse.org/#!Synapse:syn25127311/wiki/.
  275. Sledge GW, Miller RS, Hauser R, 2013. CancerLinQ and the future of cancer care. American Society of Clinical Oncology Educational Book. American Society of Clinical Oncology. Annual Meeting 430–434. doi: 10.14694/EdBook_AM.2013.33.430. [DOI] [PubMed] [Google Scholar]
  276. Smith B, Arabandi S, Brochhausen M, Calhoun M, Ciccarese P, Doyle S, Gibaud B, Goldberg I, Kahn CE, Overton J, Tomaszewski J, Gurcan M, 2015. Biomedical imaging ontologies: a survey and proposal for future work. J Pathol Inform 6 (1), 37. doi: 10.4103/2153-3539.159214. [DOI] [PMC free article] [PubMed] [Google Scholar]
  277. Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, Goldberg LJ, Eilbeck K, Ireland A, Mungall CJ, OBI Consortium, Leontis N, Rocca-Serra P, Ruttenberg A, Sansone S-A, Scheuermann RH, Shah N, Whetzel PL, Lewis S, 2007. The OBO foundry: coordinated evolution of ontologies to support biomedical data integration. Nat. Biotechnol 25 (11), 1251–1255. doi: 10.1038/nbt1346. [DOI] [PMC free article] [PubMed] [Google Scholar]
  278. Somashekhar SP, Sepúlveda M-J, Puglielli S, Norden AD, Shortliffe EH, Rohit Kumar C, Rauthan A, Arun Kumar N, Patil P, Rhee K, Ramya Y, 2018. Watson for oncology and breast cancer treatment recommendations: agreement with an expert multidisciplinary tumor board. Annals of Oncology: Official Journal of the European Society for Medical Oncology 29 (2), 418–423. doi: 10.1093/annonc/mdx781. [DOI] [PubMed] [Google Scholar]
  279. Soualmia LF, Charlet J, 2016. Efficient results in semantic interoperability for health care. Findings from the section on knowledge representation and management. Yearb. Med. Inform (1) 184–187. [DOI] [PMC free article] [PubMed] [Google Scholar]
  280. Spangenberg N, Augenstein C, Franczyk B, Wilke M, 2018. Implementation of a Situation Aware and Real-Time Approach for Decision Support in Online Surgery Scheduling. In: 2018 IEEE 31st International Symposium on Computer-Based Medical Systems (CBMS), pp. 417–421. [Google Scholar]
  281. Srivastav V, Gangi A, Padoy N, 2019. Human Pose Estimation on Privacy-Preserving Low-Resolution Depth Images. In: Shen D, Liu T, Peters TM, Staib LH, Essert C, Zhou S, Yap P-T, Khan A (Eds.), Medical Image Computing and Computer Assisted Intervention - MICCAI 2019. Springer International Publishing, Cham, pp. 583–591. doi: 10.1007/978-3-030-32254-0_65. [DOI] [Google Scholar]
  282. Srivastav V, Gangi A, Padoy N, 2020. Self-supervision on Unlabelled OR Data for Multi-person 2D/3D Human Pose Estimation. In: Martel AL, Abolmaesumi P, Stoyanov D, Mateus D, Zuluaga MA, Zhou SK, Racoceanu D, Joskowicz L (Eds.), Medical Image Computing and Computer Assisted Intervention - MICCAI 2020. Springer International Publishing, Cham, pp. 761–771. doi: 10.1007/978-3-030-59710-8_74. [DOI] [Google Scholar]
  283. Stahl JE, Egan MT, Goldman JM, Tenney D, Wiklund RA, Sandberg WS, Gazelle S, Rattner DW, 2005. Introducing new technology into the operating room: measuring the impact on job performance and satisfaction. Surgery 137 (5), 518–526. doi: 10.1016/j.surg.2004.12.015. [DOI] [PubMed] [Google Scholar]
  284. Stamos MJ, Brady MT, 2018. Anastomotic leak: are we closer to eliminating its occurrence? Annals of Laparoscopic and Endoscopic Surgery 3 (8). doi: 10.21037/4632. [DOI] [Google Scholar]
  285. Strickland E, 2019. IBM Watson, heal thyself: how IBM overpromised and under-delivered on AI health care. IEEE Spectr 56 (4), 24–31. doi: 10.1109/MSPEC.2019.8678513. [DOI] [Google Scholar]
  286. Surgical Data Science Initiative, 2015. Surgical Data Science Initiative. http://www.surgical-data-science.org/.
  287. Surgical Outcomes Club,. Surgical Outcomes Club. http://www.surgicaloutcomesclub.com/.
  288. Syus,. Syus Operating Room Analytics | Increase Surgeon Engagement. https://www.syus.com.
  289. TechCrunch, 2020. Google medical researchers humbled when AI screening tool falls short in real-life testing. https://social.techcrunch.com/2020/04/27/google-medical-researchers-humbled-when-ai-screening-tool-falls-short-in-real-life-testing/.
  290. TensorFlow,. TensorFlow - An end-to-end open source machine learning platform. https://www.tensorflow.org/.
  291. TensorLayer,. TensorLayer 2.2.2 documentation. https://tensorlayer.readthedocs.io/en/latest/.
  292. TFLearn,. TFLearn | TensorFlow Deep Learning Library. http://tflearn.org/.
  293. Theator, Inc. (San Mateo, CA, US),. theator - Surgical Intelligence Platform. https://theator.io/surgical-intelligence-platform/.
  294. Tokuda J, Fischer GS, Papademetris X, Yaniv Z, Ibanez L, Cheng P, Liu H, Blevins J, Arata J, Golby AJ, Kapur T, Pieper S, Burdette EC, Fichtinger G, Tempany CM, Hata N, 2009. OpenIGTLink: an open network protocol for image-guided therapy environment. Int. J. Med. Robot 5 (4), 423–434. [DOI] [PMC free article] [PubMed] [Google Scholar]
  295. Tokuyasu T, Iwashita Y, Matsunobu Y, Kamiyama T, Ishikake M, Sakaguchi S, Ebe K, Tada K, Endo Y, Etoh T, Nakashima M, Inomata M, 2021. Development of an artificial intelligence system using deep learning to indicate anatomical landmarks during laparoscopic cholecystectomy. Surg Endosc 35 (4), 1651–1658. [DOI] [PMC free article] [PubMed] [Google Scholar]
  296. Tolk A, Diallo S, Turnitsa C, 2007. Applying the levels of conceptual interoperability model in support of integratability, interoperability, and composability for system-of-systems engineering. Journal of Systems, Cybernetics, and Informatics 5 (5). https://digitalcommons.odu.edu/msve_fac_pubs/27 [Google Scholar]
  297. Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, Mottram A, Meyer C, Ravuri S, Protsyuk I, Connell A, Hughes CO, Karthikesalingam A, Cornebise J, Montgomery H, Rees G, Laing C, Baker CR, Peterson K, Reeves R, Hassabis D, King D, Suleyman M, Back T, Nielson C, Ledsam JR, Mohamed S, 2019. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 572 (7767), 116–119. doi: 10.1038/s41586-019-1390-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  298. Topol EJ, 2019. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med 25 (1), 44–56. doi: 10.1038/s41591-018-0300-7. [DOI] [PubMed] [Google Scholar]
  299. TransEnterix,. TransEnterix Receives FDA Clearance for First Machine Vision System in Robotic Surgery | TransEnterix, Inc. https://ir.transenterix.com/news-releases/news-release-details/transenterix-receives-fda-clearance-first-machine-vision-system.
  300. Trinkūnas J, Tuinylienė E, Puronaitė R, 2018. Research on Hospital Information Systems Integration to National Electronic Health Record System. In: 2018 International Conference BIOMDLORE, pp. 1–6. [Google Scholar]
  301. Trofimova D, Adler T, Kausch L, Ardizzone L, Maier-Hein K, Köthe U, Rother C, Maier-Hein L, 2020. Representing ambiguity in registration problems with conditional invertible neural networks. arXiv:2012.08195 [cs]. ArXiv: 2012.08195, http://arxiv.org/abs/2012.08195 [Google Scholar]
  302. Tse D, Chow CK, Ly TP, Tong CY, Tam KW, 2018. The Challenges of Big Data Governance in Healthcare. In: 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), pp. 1632–1636. doi: 10.1109/TrustCom/BigDataSE.2018.00240. [DOI] [Google Scholar]
  303. Tsilimigras DI, Mehta R, Moris D, Sahara K, Bagante F, Paredes AZ, Farooq A, Ratti F, Marques HP, Silva S, Soubrane O, Lam V, Poultsides GA, Popescu I, Grigorie R, Alexandrescu S, Martel G, Workneh A, Guglielmi A, Hugh T, Aldrighetti L, Endo I, Pawlik TM, 2020. Utilizing machine learning for pre- and postoperative assessment of patients undergoing resection for BCLC-0, A and B hepatocellular carcinoma: implications for resection beyond the BCLC guidelines. Ann. Surg. Oncol 27 (3), 866–874. doi: 10.1245/s10434-019-08025-z. [DOI] [PubMed] [Google Scholar]
  304. Tukey JW, 1977. Exploratory data analysis. Addison-Wesley Pub. Co., Reading, Mass.. OCLC: 3058187 [Google Scholar]
  305. Twinanda AP, Alkan EO, Gangi A, de Mathelin M, Padoy N, 2015. Data-driven spatio-temporal RGBD feature encoding for action recognition in operating rooms. Int J Comput Assist Radiol Surg 10 (6), 737–747. doi: 10.1007/s11548-015-1186-1. [DOI] [PubMed] [Google Scholar]
  306. Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N, 2017. EndoNet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Trans Med Imaging 36 (1), 86–97. doi: 10.1109/TMI.2016.2593957. [DOI] [PubMed] [Google Scholar]
  307. Twinanda AP, Yengera G, Mutter D, Marescaux J, Padoy N, 2019. RSDNet: Learning to predict remaining surgery duration from laparoscopic videos without manual annotations. IEEE Trans Med Imaging 38 (4), 1069–1078. doi: 10.1109/TMI.2018.2878055. [DOI] [PubMed] [Google Scholar]
  308. UCL-Ventura, 2020. UCL-Ventura - UCL Institute of Healthcare Engineering. https://www.ucl.ac.uk/healthcare-engineering/covid-19/ucl-ventura-breathing-aids-covid19-patients/about-ucl-ventura. [Google Scholar]
  309. University College London (UCL),. UCL - University College London, Wellcome / EPSRC Centre for Interventional and Surgical Sciences. https://www.ucl.ac.uk/interventional-surgical-sciences/.
  310. Upton R, 2019. The heart of the matter: how AI can transform cardiovascular health: Ross Upton, CEO and academic co-founder at Ultromics, discusses the potential to implement AI in clinical diagnostics. Sci. Comput. World 16+. [Google Scholar]
  311. Vanschoren J, 2018. Meta-learning: a survey. arXiv:1810.03548 [cs, stat]. ArXiv: 1810.03548 version: 1, http://arxiv.org/abs/1810.03548 [Google Scholar]
  312. Varghese J, Fujarski M, Hegselmann S, Neuhaus P, Dugas M, 2018. CDEGenerator: An online platform to learn from existing data models to build model registries. Clin. Epidemiol 10, 961–970. [DOI] [PMC free article] [PubMed] [Google Scholar]
  313. Vedula SS, Ishii M, Hager GD, 2017. Objective assessment of surgical technical skill and competency in the operating room. Annu Rev Biomed Eng 19, 301–325. doi: 10.1146/annurev-bioeng-071516-044435. [DOI] [PMC free article] [PubMed] [Google Scholar]
  314. Vercauteren T, Unberath M, Padoy N, Navab N, 2020. CAI4CAI: the rise of contextual artificial intelligence in computer-assisted interventions. Proc. IEEE 108 (1), 198–214. doi: 10.1109/JPROC.2019.2946993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  315. Vijayan R, Silva TD, Han R, Zhang X, Uneri A, Doerr S, Ketcha M, Perdomo-Pantoja A, Theodore N, Siewerdsen JH, 2019. Automatic pedicle screw planning using atlas-based registration of anatomy and reference trajectories. Physics in Medicine & Biology 64 (16), 165020. doi: 10.1088/1361-6560/ab2d66. [DOI] [PMC free article] [PubMed] [Google Scholar]
  316. Wagner M, Bihlmaier A, Kenngott HG, Mietkowski P, Scheikl PM, Bodenstedt S, Schiepe-Tiska A, Vetter J, Nickel F, Speidel S, Wörn H, Mathis-Ullrich F, Müller-Stich BP, 2021. A learning robot for cognitive camera control in minimally invasive surgery. Surg Endosc doi: 10.1007/s00464-021-08509-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  317. Wagner M, Mietkowski P, Schneider G, Apitz M, Mayer B, Bodenstedt S, Speidel S, Bergh B, Müller-Stich B, Kenngott H, 2017. Big Data in der Chirurgie: Realisierung einer Echtzeit-Sensordatenanalyse im vernetzten Operationssaal. In: Zeitschrift für Gastroenterologie. Georg Thieme Verlag KG, p. KV577. doi: 10.1055/s-0037-1605317. [DOI] [Google Scholar]
  318. Wang W, Tolk A, Wang W, 2009. The levels of conceptual interoperability model: applying systems engineering principles to M&S. In: Proceedings of the 2009 Spring Simulation Multiconference. Society for Computer Simulation International, San Diego, California, pp. 1–9. [Google Scholar]
  319. Ward MJ, Marsolo KA, Froehle CM, 2014. Applications of business analytics in healthcare. Bus Horiz 57 (5), 571–582. doi: 10.1016/j.bushor.2014.06.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  320. Ward TM, Fer DM, Ban Y, Rosman G, Meireles OR, Hashimoto DA, 2021. Challenges in surgical video annotation. Computer Assisted Surgery 26 (1), 58–68. doi: 10.1080/24699322.2021.1937320. [DOI] [PubMed] [Google Scholar]
  321. WHO, 2021. Ethics and governance of artificial intelligence for health: WHO guidance. Geneva: World Health Organization. https://www.who.int/publications/i/item/9789240029200 [Google Scholar]
  322. Wijnberge M, Geerts BF, Hol L, Lemmers N, Mulder MP, Berge P, Schenk J, Terwindt LE, Hollmann MW, Vlaar AP, Veelo DP, 2020. Effect of a machine learning-derived early warning system for intraoperative hypotension vs standard care on depth and duration of intraoperative hypotension during elective noncardiac surgery: the HYPE randomized clinical trial. JAMA doi: 10.1001/jama.2020.0592. [DOI] [PMC free article] [PubMed] [Google Scholar]
  323. Wilhelm D, Kranzfelder M, Ostler D, Stier A, Meyer HJ, Feussner H, 2020. Digitalization in surgery: what surgeons currently think and know about it-results of an online survey. Chirurg 91 (1), 51–59. doi: 10.1007/s00104-019-01043-3. [DOI] [PubMed] [Google Scholar]
  324. Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten J-W, Santos L.B.d.S., Bourne PE, Bouwman J, Brookes AJ, Clark T, Crosas M, Dillo I, Dumon O, Edmunds S, Evelo CT, Finkers R, Gonzalez-Beltran A, Gray AJG, Groth P, Goble C, Grethe JS, Heringa J, Hoen P.A.C.t., Hooft R, Kuhn T, Kok R, Kok J, Lusher SJ, Martone ME, Mons A, Packer AL, Persson B, Rocca-Serra P, Roos M, Schaik R.v., Sansone S-A, Schultes E, Sengstag T, Slater T, Strawn G, Swertz MA, Thompson M, Lei J.v.d., Mulligen E.v., Velterop J, Waagmeester A, Wittenburg P, Wolstencroft K, Zhao J, Mons B, 2016. The FAIR guiding principles for scientific data management and stewardship. Sci Data 3 (1), 1–9. doi: 10.1038/sdata.2016.18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  325. Winkler JK, Fink C, Toberer F, Enk A, Deinlein T, Hofmann-Wellenhof R, Thomas L, Lallas A, Blum A, Stolz W, Haenssle HA, 2019. Association between surgical skin markings in dermoscopic images and diagnostic performance of a deep learning convolutional neural network for melanoma recognition. JAMA Dermatol 155 (10), 1135–1141. [DOI] [PMC free article] [PubMed] [Google Scholar]
  326. Wirkert SJ, Kenngott H, Mayer B, Mietkowski P, Wagner M, Sauer P, Clancy NT, Elson DS, Maier-Hein L, 2016. Robust near real-time estimation of physiological parameters from megapixel multispectral images with inverse Monte Carlo and random forest regression. Int J Comput Assist Radiol Surg 11 (6), 909–917. doi: 10.1007/s11548-016-1376-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  327. Wirkert SJ, Vemuri AS, Kenngott HG, Moccia S, Götz M, Mayer BFB, Maier-Hein KH, Elson DS, Maier-Hein L, 2017. Physiological Parameter Estimation from Multispectral Images Unleashed. In: Descoteaux M, Maier-Hein L, Franz A, Jannin P, Collins DL, Duchesne S (Eds.), Medical Image Computing and Computer Assisted Intervention - MICCAI 2017. Springer International Publishing, Cham, pp. 134–141. doi: 10.1007/978-3-319-66179-7_16. [DOI] [Google Scholar]
  328. Xie C, Yang P, Yang Y, 2018. Open knowledge accessing method in IoT-based hospital information system for medical record enrichment. IEEE Access 6, 15202–15211. doi: 10.1109/ACCESS.2018.2810837. [DOI] [Google Scholar]
  329. Yao J, Fidler S, Urtasun R, 2012. Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 702–709. doi: 10.1109/CVPR.2012.6247739. [DOI] [Google Scholar]
  330. Yao S, Wang R, Qian K, Zhang Y, 2020. Real world study for the concordance between IBM Watson for oncology and clinical practice in advanced non-small cell lung cancer patients at a lung cancer center in China. Thorac Cancer 11 (5), 1265–1270. doi: 10.1111/1759-7714.13391. [DOI] [PMC free article] [PubMed] [Google Scholar]
  331. Yu T, Mutter D, Marescaux J, Padoy N, 2019. Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition. arXiv:1812.00033 [cs, stat]. ArXiv: 1812.00033, http://arxiv.org/abs/1812.00033 [Google Scholar]
  332. Zhou J, Cui G, Zhang Z, Yang C, Liu Z, Wang L, Li C, Sun M, 2019. Graph neural networks: a review of methods and applications. arXiv:1812.08434 [cs, stat]. ArXiv: 1812.08434, http://arxiv.org/abs/1812.08434 [Google Scholar]
  333. Zhou X-Y, Guo Y, Shen M, Yang G-Z, 2019. Artificial intelligence in surgery. arXiv:2001.00627 [physics]. ArXiv: 2001.00627, http://arxiv.org/abs/2001.00627 [Google Scholar]
  334. Zia A, Bhattacharyya K, Liu X, Wang Z, Kondo S, Colleoni E, van Amsterdam B, Hussain R, Hussain R, Maier-Hein L, Stoyanov D, Speidel S, Jarc A, 2021. Surgical visual domain adaptation: results from the MICCAI 2020 SurgVisDom challenge. arXiv:2102.13644 [cs]. ArXiv: 2102.13644, http://arxiv.org/abs/2102.13644 [Google Scholar]
