2023 Oct 24;42(48):3545–3555. doi: 10.1038/s41388-023-02857-6

Table 1.

The table below draws on some of the less self-explanatory parts of ‘Good Machine Learning Practice for Medical Device Development: Guiding Principles’, created by the FDA, MHRA and Health Canada, to discuss AI design requirements.

Design and development
Design team

Ensure the design team includes technical, clinical and regulatory specialists with relevant experience. Having a multidisciplinary team involved from the beginning helps ensure all aspects are considered early in development. Clinical key opinion leaders are crucial to understanding the intended purpose of the device in the currently accepted workflow and supporting risk assessments during development and final benefit/risk acceptance.
Intended purpose, state of the art and target performance

Clearly define the intended purpose, ground truth, target performance, and model hyperparameters.

This information will be used to determine the state of the art for the device and the statistical rationale for the training, testing and validation datasets.

Data management and machine learning operations

Establish data management plans for the selection and handling of the data sets used for training, testing and validation.

Include requirements for: the statistical rationale for the size of the data sets used (number of samples); quality acceptance criteria; how out-of-specification samples are handled; the tools used for data preparation and analysis; roles and responsibilities for data preparation and approval (with the necessary independence); the metadata that is required for samples and is to be collected during data preparation; a description of the machine learning pipeline; how version control is maintained; and details of error analysis.
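As an illustration of what a statistical rationale for data set size can look like, the sketch below estimates how many positive cases are needed to estimate sensitivity to within a chosen confidence-interval half-width, using the standard normal-approximation formula n = z²·p(1−p)/d². This is a minimal sketch and not a method prescribed by the guiding principles: the function names, the expected sensitivity, the half-width and the prevalence are all hypothetical, and a real rationale should be agreed with a statistician.

```python
import math
from statistics import NormalDist


def samples_for_proportion(expected, half_width, confidence=0.95):
    """Positive cases needed so that the confidence interval around an
    expected proportion (e.g. sensitivity) has the requested half-width.

    Normal-approximation formula; illustrative only.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * expected * (1 - expected) / half_width ** 2)


def total_cases_needed(positives_needed, prevalence):
    """Total cases to collect so that the expected number of positive
    cases reaches positives_needed at the assumed prevalence."""
    return math.ceil(positives_needed / prevalence)


if __name__ == "__main__":
    # Hypothetical targets: expected sensitivity 0.90, half-width 0.05.
    positives = samples_for_proportion(0.90, 0.05)
    # Hypothetical prevalence of positive cases in the source population.
    print(positives, total_cases_needed(positives, 0.25))
```

With these assumed inputs, `samples_for_proportion(0.90, 0.05)` returns 139 positive cases. Recording the inputs alongside the result in the data management plan keeps the rationale traceable.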

Risk management

Risk assessments need to include risks associated with the use of the device in the clinical workflow, use error, security risks and risks associated with AI/ML as a technology, for example bias built into the model by an inadequate sample cohort. BS 34971/AAMI CR 34971 is a useful source for understanding the application of ISO 14971 to artificial intelligence and machine learning.

Ensure cybersecurity risk assessments are initiated early in the development process and are updated as the development evolves. This ensures that vulnerabilities in the AI/ML device and/or connected platforms or environments are considered. Guidance is available, including MDCG 2019-16 [42] and the FDA's Cybersecurity in Medical Devices: Quality System Considerations and Content of Pre-market Submissions (Apr-2022) [43].

Design requirements

Taking time to establish an appropriate design architecture, and making sure design teams communicate so that all data transfer requirements between and within devices are clearly identified, will help avoid unexpected failures in verification.

Review harmonized/consensus standards to identify any device-specific requirements.

Human-AI team

Use scenarios and the human-AI team should be considered when establishing user interfaces and workflow steps. Engaging key user profiles will provide valuable feedback on the user interface and workflows during development, as part of the formative studies that are required for novel devices. Evidence exists [44] to support considering the human-AI team during usability studies, and providing information to users on the scenarios in which AI/ML devices can underperform, to avoid the human becoming reliant on the results of an assistive tool for judgement.
Labelling/User training

Address all regulatory requirements, including information for the user and, in some situations, the patient. EN 82304 and many AI/ML guidances and proposed legislation stress the importance of providing clear and essential information on the model performance, the characteristics of the data used to train and test the model, acceptable inputs, contraindications, limitations of the model, guidance on interpreting results, and clinical workflow integration of the model.

All this information is critical to supporting proper use of the device and to building users' comfort with it.

Clinical deployment
Monitoring and maintenance

Following release of the device to the field for clinical use, post-market surveillance and post-market performance follow-up activities are critical for monitoring device performance, use, safety and security in the field using real world performance data.

Controls can be built into AI/ML devices that enable manufacturers to collect performance data, both to analyse field performance against the claimed performance and to identify drift, overfitting, unintended bias or model degradation.
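One way such a check against claimed performance could be sketched is shown below: for a window of field cases with confirmed ground truth, compute a Wilson score interval for the observed sensitivity and flag the window when even the upper bound falls below the claimed value. This is an assumption-laden illustration, not a method described by the regulators or the cited guidance; the function names, the window size and the claimed value are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def below_claim(successes, n, claimed, confidence=0.95):
    """True when the upper bound of the interval is below the claimed
    performance, i.e. the field data are hard to reconcile with the claim."""
    _, upper = wilson_interval(successes, n, confidence)
    return upper < claimed


if __name__ == "__main__":
    # Hypothetical window: 200 confirmed positive cases, claimed sensitivity 0.95.
    for detected in (188, 170):
        print(detected, below_claim(detected, 200, claimed=0.95))
```

Using the interval rather than the point estimate avoids raising an alert on small windows where the shortfall could be sampling noise. A flagged window would feed into the post-market surveillance review rather than trigger any automatic change to the device.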

The FDA discusses the use of SaMD pre-specifications and change protocols [45], where, as part of the pre-market approval, manufacturers can define anticipated modifications to the device performance as it learns from use data in the field.

The uptake of this is yet to be seen, as the concept of device performance changing in the field may not align with pathology laboratory practices or user comfort: laboratories require qualification processes before a device can be used in clinical practice.

The question remains: do pathologists and other users trust automated devices?

In terms of monitoring for safety and security, maintenance programmes also need to include controls for monitoring the status of SOUP (software of unknown provenance) components, any potential vulnerabilities in them, and the environment in which the device is hosted (cloud or networked).

Risk management

The outputs of post-market surveillance and post-market performance follow-up activities should be used to review the risk assessment, overall residual risk assessment and benefit-risk profile for the device, to confirm its ongoing suitability in terms of safety, performance and acceptability against the state of the art.

Using the real-world data available on the use of the device enables manufacturers to affirm or update their risk management file, and possibly the intended purpose and design of the device.

Given the novel nature of AI/ML medical devices and the increase in DP devices being used in the field, more data will become available on similar devices to highlight any unforeseen or inappropriately addressed aspects of the risk management file.