Table 2. Additional recommendations
| Component | Recommendations |
|---|---|
| Deployment | The deployment strategy should consider the entire process, from node distribution to aggregator interaction, ensuring seamless communication and efficient training rounds. Authors should also consider integrating machine learning operations (MLOps) practices, as these enhance automation, monitoring, and security; ensure seamless integration and deployment; encourage collaboration; and increase the efficiency and reliability of the FL platform. None of the papers we reviewed discussed version control of the global model. Implementing version control in FL enhances traceability, supports asynchronous communication, enables A/B testing, and provides a rollback mechanism. Storing copies of model artifacts across different versions strengthens auditability and facilitates benchmarking and tracking model performance over time (see the registry sketch after this table). |
| Reproducibility | As FL is a rapidly evolving field of innovative research, we recommend that the community work together to develop an FL methodology checklist to improve the documentation of future studies. In the absence of such a checklist, we recommend that authors and reviewers use existing checklists, such as CLAIM,139 for assessing the completeness of data and model descriptions in a medical imaging context. Additionally, tools such as PROBAST149 are recommended for assessing biases in the data and models. Practitioners should develop a new FL codebase only when existing frameworks fundamentally cannot accomplish their aim; otherwise, the complexity of an FL system increases the risk of coding errors (see the framework reuse sketch after this table). Codebases and trained models should be released publicly where possible, so that the community can easily apply the models and validate their performance. |
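To make the version-control recommendation concrete, the sketch below shows one minimal way an aggregator could store a versioned copy of the global model after each round, together with metadata that supports auditability, benchmarking, and rollback. This is an illustrative example only, not drawn from any reviewed study; the `ModelRegistry` class, the directory layout, and the metadata fields are our own assumptions and are framework-agnostic.

```python
"""Minimal sketch of global-model version control for an FL aggregator.

Illustrative only: the registry layout, file names, and metadata fields
are assumptions, not part of any reviewed FL platform.
"""
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


class ModelRegistry:
    """Stores one artifact per training round so any version can be audited,
    benchmarked, or rolled back to."""

    def __init__(self, root: str = "model_registry"):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save_version(self, round_id: int, weights: bytes, metrics: dict) -> Path:
        """Persist the aggregated global model plus metadata for one round."""
        version_dir = self.root / f"round_{round_id:05d}"
        version_dir.mkdir(exist_ok=True)
        (version_dir / "weights.bin").write_bytes(weights)
        metadata = {
            "round": round_id,
            "created_utc": datetime.now(timezone.utc).isoformat(),
            "sha256": hashlib.sha256(weights).hexdigest(),  # supports auditability
            "metrics": metrics,                             # e.g. validation AUC per site
        }
        (version_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
        return version_dir

    def rollback(self, round_id: int) -> bytes:
        """Return the stored weights for an earlier round (rollback mechanism)."""
        return (self.root / f"round_{round_id:05d}" / "weights.bin").read_bytes()


if __name__ == "__main__":
    registry = ModelRegistry()
    # Placeholder bytes; in practice these are the serialised aggregated parameters.
    registry.save_version(1, b"\x00" * 16, {"val_auc": 0.81})
    registry.save_version(2, b"\x01" * 16, {"val_auc": 0.79})
    # Round 2 regressed, so restore the round-1 artifact.
    restored = registry.rollback(1)
```

Keeping the hash and metrics alongside each artifact is what enables the auditing, A/B testing, and performance tracking described above, without requiring any particular MLOps tooling.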
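As an illustration of reusing an existing FL framework rather than writing a bespoke codebase, the sketch below runs a toy federated averaging simulation with Flower. It is a minimal sketch, assuming `flwr[simulation]` (a 1.x release) and NumPy are installed; the `DummyClient`, its placeholder weight vector, and the three simulated clients are hypothetical stand-ins for a real model and data pipeline.

```python
"""Minimal sketch of reusing an existing FL framework (Flower) instead of a
bespoke codebase. Assumes `flwr[simulation]` 1.x; the client logic and the
single placeholder weight vector are illustrative assumptions only.
"""
import numpy as np
import flwr as fl


class DummyClient(fl.client.NumPyClient):
    """Each simulated client holds a toy weight vector standing in for a model."""

    def __init__(self, cid: str):
        self.cid = cid
        self.weights = [np.zeros(4, dtype=np.float32)]

    def get_parameters(self, config):
        return self.weights

    def fit(self, parameters, config):
        # Pretend local training: perturb the received global weights.
        self.weights = [w + 0.1 for w in parameters]
        return self.weights, 10, {}  # (updated weights, number of examples, metrics)

    def evaluate(self, parameters, config):
        loss = float(np.sum(np.abs(parameters[0])))
        return loss, 10, {}


def client_fn(cid: str):
    return DummyClient(cid).to_client()


if __name__ == "__main__":
    # Aggregation (federated averaging) is handled entirely by the framework,
    # so no custom aggregation code needs to be written or debugged.
    fl.simulation.start_simulation(
        client_fn=client_fn,
        num_clients=3,
        config=fl.server.ServerConfig(num_rounds=2),
        strategy=fl.server.strategy.FedAvg(),
    )
```

Because the aggregation, round orchestration, and client-server communication come from the framework, the code that must be released and reviewed for reproducibility is reduced to the model and data handling, which is the intent of the recommendation above.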