In recent years, digitalization and artificial intelligence have made tremendous progress. In medicine, data-driven technologies are especially applicable in areas with a high degree of automation and standardization of data [1,2]. Substantial advances have also been reported in clinical microbiology, but their translation into routine application remains a long process with several technical and regulatory hurdles. Some of the low-hanging fruit for diagnostic scenarios includes (i) dashboards to interconnect and visualize microbiology data [3,4], (ii) automated analysis of images such as microscopy slides [5] or agar plates [6,7], and (iii) association of genome sequences and proteomic profiles with pathogen phenotypes [8,9]. Clinical applications require standardized data formats and ontologies within an interoperable information technology environment [10], infrastructure with sufficient storage and computational capacity, and technical expertise to address the needs of microbiologists and infectious diseases experts.
In the present themed issue, Luz et al. summarize machine learning algorithms for the analysis of routine electronic health records. The authors identified 52 studies covering various aspects of infectious disease management including sepsis, hospital-acquired and surgical site infections, and microbiological test results. The machine learning algorithms used were heterogeneous, ranging from logistic regression, random forests and support vector machines to artificial neural networks. A key gap is the lack of essential information on data handling in these studies [11]. Pfeiffer-Smadja et al. ask whether the time has come for machine learning in the routine practice of clinical microbiology [12]. Across 97 studies, the data sources used were highly diverse, ranging from genomic data and microscopic images to mass spectrometry. Almost 40% of studies were from low- and middle-income countries, highlighting the opportunities that digitalization and digital biomarkers offer given decreasing costs and cloud-based services [13]. However, digital biomarkers also require validation in clinical studies to show their impact on relevant outcomes. The lack of standardized data and algorithms poses an important challenge for reproduction and validation studies [14]. As a result of issues in data handling, two prominent published coronavirus disease 2019 (COVID-19) articles were recently retracted [15,16]. Journals clearly need standards for data and code sharing. The FAIR (findable, accessible, interoperable, reusable) principles provide excellent guidance [17]. Although software code and tools are often shared on GitHub (github.com) [18], the accompanying documentation is often limited, with explanatory code books or instructions missing. Proper data and code-handling policies should be part of the new research quality standard and will allow independent validation of machine learning algorithms and data sets.
Smith and Kirby report on applications in modern image analysis [19]. Machine-learning-based image analysis may revolutionize microscopy for classical Gram stains, ova and parasite preparations, and histopathological slides. For example, a neural network could categorize Gram stains from positive blood cultures with remarkable precision as Gram-positive or Gram-negative and as cocci or rods [5]. Of note, state-of-the-art infrastructure for generating high-quality images and for data storage and processing may be required. However, smartphone devices can help bridge these technology gaps [20,21]. Similarly, based on pattern recognition, single bacterial colonies growing on agar plates can be categorized or even identified [6,7]. Both applications, automated microscopy and agar plate inspection, are likely to radically change the workflow in modern diagnostic laboratories [22]. This shift may parallel the way we have embraced matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry for identification, which has made biochemical tests almost superfluous [23]. There may, however, be potential for extracting additional information from MALDI-TOF spectra. Weis and colleagues look into this key technology [24] and summarize algorithms to link spectral profiles to microbiological phenotypes. Their review identified 36 studies using machine learning for species identification and antibiotic susceptibility testing. The most commonly used machine learning techniques included support vector machines, genetic algorithms, artificial neural networks and quick classifiers. The quality of the identified studies varied widely, and only four validated their findings [24].
All authors highlight the need for validated algorithms. Validation is also a key point in the regulatory process and impacts reimbursement. From May 2021, the medical device and in vitro diagnostic medical device regulations of Europe will govern software with a diagnostic, monitoring or therapeutic purpose (http://ec.europa.eu/growth/sectors/medical-devices/regulatory-framework/), forming the basis for CE labelling, including that of machine-learning-based algorithms in clinical microbiology. Both academia and industry will benefit from standards in data and code handling, as these will support validation and further build trust in computational models and methods [25,26]. This process is additionally fuelled by (i) well-designed clinical studies and (ii) cross-validation against known and well-established statistical approaches. Ethical and legal aspects should also be addressed if such algorithms are to be integrated into personalized and public health medicine [27]. As illustrated during the COVID-19 crisis, multiple models have predicted different outcomes, for example for fatality rates and the impact of the lockdown [28,29]. In public health emergencies, high-quality real-time data must be available in machine-readable formats for the scientific community. Such infrastructure for public health monitoring needs to be further developed. If public health decisions rely on such models, the models should in turn be validated in a similar way to algorithms in personalized medicine, because their impact on the general population and the economy is significant.
Clearly, an interesting and challenging time for clinical microbiology and infectious diseases lies ahead. Standards in data and code handling are a first step that will allow us to use the opportunities of digitalization and machine learning to improve diagnostics and patient care.
Editor: L. Leibovici
References
- 1. Bailey A.L., Ledeboer N., Burnham C.D. Clinical microbiology is growing up: the total laboratory automation revolution. Clin Chem. 2019;65:634–643. doi: 10.1373/clinchem.2017.274522.
- 2. Ezewudo M. Integrating standardized whole genome sequence analysis with a global Mycobacterium tuberculosis antibiotic resistance knowledgebase. Sci Rep. 2018;8:15382. doi: 10.1038/s41598-018-33731-1.
- 3. Graber C.J. Decreases in antimicrobial use associated with multihospital implementation of electronic antimicrobial stewardship tools. Clin Infect Dis. 2019. doi: 10.1093/cid/ciz941. Epub ahead of print.
- 4. Hebert C., Flaherty J., Smyer J., Ding J., Mangino J.E. Development and validation of an automated ventilator-associated event electronic surveillance system: a report of a successful implementation. Am J Infect Contr. 2018;46:316–321. doi: 10.1016/j.ajic.2017.09.006.
- 5. Smith K.P., Kang A.D., Kirby J.E. Automated interpretation of blood culture Gram stains by use of a deep convolutional neural network. J Clin Microbiol. 2018;56:e01521-17. doi: 10.1128/JCM.01521-17.
- 6. Croxatto A. Towards automated detection, semi-quantification and identification of microbial growth in clinical bacteriology: a proof of concept. Biomed J. 2017;40:317–328. doi: 10.1016/j.bj.2017.09.001.
- 7. Van T.T., Mata K., Dien Bard J. Automated detection of Streptococcus pyogenes pharyngitis by use of colorex strep A CHROMagar and WASPLab artificial intelligence chromogenic detection module software. J Clin Microbiol. 2019;57:e00811-19. doi: 10.1128/JCM.00811-19.
- 8. Jamal S. Artificial intelligence and machine learning based prediction of resistant and susceptible mutations in Mycobacterium tuberculosis. Sci Rep. 2020;10:5487. doi: 10.1038/s41598-020-62368-2.
- 9. Lupolova N., Lycett S.J., Gally D.L. A guide to machine learning for bacterial host attribution using genome sequence data. Microb Genom. 2019;5. doi: 10.1099/mgen.0.000317.
- 10. Gansel X., Mary M., van Belkum A. Semantic data interoperability, digital medicine, and e-health in infectious disease management: a review. Eur J Clin Microbiol Infect Dis. 2019;38:1023–1034. doi: 10.1007/s10096-019-03501-6.
- 11. Luz C.F. Machine learning in infection management using routine electronic health records: tools, techniques, and reporting of future technologies. Clin Microbiol Infect. 2020;26:1291–1299. doi: 10.1016/j.cmi.2020.02.003.
- 12. Pfeiffer-Smadja N. Machine learning in the clinical microbiology laboratory: has the time come for routine practice? Clin Microbiol Infect. 2020;26:1300–1309. doi: 10.1016/j.cmi.2020.02.006.
- 13. Karim M.R. Improving data workflow systems with cloud services and use of open data for bioinformatics research. Brief Bioinform. 2018;19:1035–1050. doi: 10.1093/bib/bbx039.
- 14. Vandenberg O. Consolidation of clinical microbiology laboratories and introduction of transformative technologies. Clin Microbiol Rev. 2020;33. doi: 10.1128/CMR.00057-19.
- 15. Mehra M.R., Desai S.S., Kuy S., Henry T.D., Patel A.N. Retraction: cardiovascular disease, drug therapy, and mortality in Covid-19. N Engl J Med. 2020. doi: 10.1056/NEJMoa2007621. N Engl J Med, doi: 10.1056/NEJMc2021225. [Retracted]
- 16. Mehra M.R., Desai S.S., Ruschitzka F., Patel A.N. RETRACTED: hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. Lancet. 2020. doi: 10.1016/S0140-6736(20)31180-6. [Retracted]
- 17. Corpas M., Kovalevskaya N.V., McMurray A., Nielsen F.G.G. A FAIR guide for data providers to maximise sharing of human genomic data. PLoS Comput Biol. 2018;14. doi: 10.1371/journal.pcbi.1005873.
- 18. Hendriksen R.S. Using genomics to track global antimicrobial resistance. Front Public Health. 2019;7:242. doi: 10.3389/fpubh.2019.00242.
- 19. Smith K.P., Kirby J.E. Image analysis and artificial intelligence in infectious disease diagnostics. Clin Microbiol Infect. 2020;26:1318–1323. doi: 10.1016/j.cmi.2020.03.012.
- 20. Linares M. Collaborative intelligence and gamification for on-line malaria species differentiation. Malar J. 2019;18:21. doi: 10.1186/s12936-019-2662-9.
- 21. Perkel J.M. Pocket laboratories. Nature. 2017;545:119–121. doi: 10.1038/545119a.
- 22. Cherkaoui A. Implementation of the WASPLab and first year achievements within a university hospital. Eur J Clin Microbiol Infect Dis. 2020. doi: 10.1007/s10096-020-03872-1.
- 23. Angeletti S., Ciccozzi M. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry in clinical microbiology: an updating review. Infect Genet Evol. 2019;76:104063. doi: 10.1016/j.meegid.2019.104063.
- 24. Weis C.V., Jutzeler C.R., Borgwardt K. Machine learning for microbial identification and antimicrobial susceptibility testing on MALDI-TOF mass spectra: a systematic review. Clin Microbiol Infect. 2020;26:1310–1317. doi: 10.1016/j.cmi.2020.03.014.
- 25. Cabitza F., Campagner A., Balsano C. Bridging the "last mile" gap between AI implementation and operation: "data awareness" that matters. Ann Transl Med. 2020;8:501. doi: 10.21037/atm.2020.03.63.
- 26. Topol E.J. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44–56. doi: 10.1038/s41591-018-0300-7.
- 27. Watson D.S. Clinical applications of machine learning algorithms: beyond the black box. BMJ. 2019;364:l886. doi: 10.1136/bmj.l886.
- 28. Davies N.G. Effects of non-pharmaceutical interventions on COVID-19 cases, deaths, and demand for hospital services in the UK: a modelling study. Lancet Public Health. 2020. doi: 10.1016/S2468-2667(20)30133-X.
- 29. Flaxman S. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature. 2020. doi: 10.1038/s41586-020-2405-7.