Abstract
Introduction
The use of natural language processing (NLP) for a literature search has been poorly investigated in vascular surgery so far. The aim of this pilot study was to test the applicability of an artificial intelligence (AI) based mobile application for literature searching in a topic related to vascular surgery.
Technique
A focused scientific question was defined to evaluate the performance of the AI application for a literature search, and the results were compared with the ground truth provided by a traditional literature search performed by human experts. Using pre-defined keywords, the AI application performed the literature search automatically in several steps: quality assessment based on the information available and on indicators of level of evidence, selection of publications using NLP based relevancy filters, summarisation, and visualisation of the publications via the mobile app. The traditional literature search performed by human experts required 10 hours to check 154 original articles, among which 26 (16.9%) were truly related to the question, 63 (40.9%) were related to the field but not to the specific question, and 65 (42.2%) were unrelated. The AI based search was performed in less than one hour and, compared with the traditional search, identified 17 original articles (48.6%) truly related to the question (p < .001), 18 (51.4%) related to the field but not to the specific question (p = .26), and no unrelated publications (p < .001). Fifteen truly related articles (88.2% of those selected by the AI application) were identified jointly by the two methods. No significant difference was observed regarding the median number of citations, year of publication, and impact factor of journals.
Discussion
The AI based method enabled a targeted, focused, and time saving literature search, although the selection of publications was not completely exhaustive. These results suggest that such an AI driven application is a complementary tool to help researchers and clinicians for continuous education and dissemination of knowledge.
Keywords: Artificial intelligence, Literature search, Natural language processing, Vascular surgery
Introduction
Artificial intelligence (AI) holds great promise in vascular surgery, with various potential applications that could enhance detection and diagnosis, evaluate prognosis, or help plan the treatment of vascular disease.1,2,3 AI encompasses several fields including computer vision (focusing on imaging analysis), machine learning (ML), and natural language processing (NLP, focusing on human language analysis). NLP enables computer technology to process, analyse, understand, and interpret written or spoken human language. Using techniques such as ML and computational linguistics, NLP has mainly been proposed to identify and extract information from health records, and several studies have suggested it could optimise care for patients with vascular diseases.4,5 Recent studies have also highlighted the potential of NLP to build new tools to automate a literature search.6 The field is in its infancy, and the use of NLP in this setting has rarely been reported in vascular surgery so far.
The aim of this study was to test the applicability of an AI based mobile application for literature searching in a topic related to vascular surgery and to compare results with the ground truth provided via a traditional literature search performed by human experts.
Technique
A specific scientific question was formulated to evaluate the performance of the AI application for literature searching and selection of original articles relevant to the topic. The current research of the authors' team (J.R., F.L.) focuses on applications of AI in vascular diseases, and the team has previously published related comprehensive literature reviews and a bibliometric analysis.1,2 Therefore, the authors chose the following scientific question to serve as a use case: “What studies have been published on the use of AI/ML to evaluate the prognosis of patients with aortic aneurysm?” Related keywords were selected by the authors’ research team (J.R., F.L., G.D.L.) and were defined as “Artificial Intelligence”, “Machine Learning”, “Predictive Models”, “Prognosis”, and “Aortic aneurysm” (including thoracic and abdominal aortic aneurysm).
Artificial intelligence based search
At the authors’ request, the pre-defined keywords were entered in a commercialised AI based mobile application (Juisci SAS, Neuilly-sur-Seine, France7) to perform the literature search up to March 2023. The pipeline of the AI based method is depicted in Figure 1. The software checked various formats of publications using several sources, including Medline/PubMed, Europe PMC, peer reviewed journals, and a public database (Fig. 1). During the search and selection process, two consecutive filters were applied to check the quality of articles. First, a content quality filter checked the availability of information related to the publication (title, structured text, authors, publication date, journal, number of citations, DOI, related articles, meta-data, keywords) and of the associated document in pdf format. Second, an objective quality filter integrated indicators of level of evidence (classification as peer reviewed journal, impact factor, SCImago Journal Rank, H index, number of citations). A crawler generated a raw set of publications using all the available information. NLP algorithms and relevancy filters were then applied to parse the content and structure it. Raw data were filtered based on encoded indicators and selection criteria to select the most relevant publications. Associated pdf files were downloaded by the software, and key text sections were extracted to create a ready to summarise dataset of publications. Additional AI algorithms based on NLP and computer vision were applied to enable pdf reading and analysis, and extraction of figures and tables. Each publication in the dataset was then summarised by an NLP algorithm to provide a digest. The final output was displayed on the mobile application, where the user had access to the original publication as well as a summary. If needed, the user could also request an automatic translation of the publication.
Figure 1.
Pipeline of the artificial intelligence (AI) based method for literature search, screening, and selection of publications. (A) The software initiates a literature search of various types of publications based on the user request using several sources. (B) During the selection process, quality is assessed by a parser using two consecutive filters that check the availability of information and indicators of level of evidence. Natural language processing (NLP) and relevancy filters are used for the final selection of publications to fit the user request. (C) Each publication is summarised by the AI application using computer vision, deep learning, and NLP algorithms. All the publications are displayed and can be read on the mobile application. The user gets access to the original publication as well as a summary generated by the software. SJR = SCImago Journal Rank.
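The two consecutive quality filters can be illustrated with a minimal Python sketch. The application's implementation is not described in detail here, so everything in the block is an assumption made for illustration: the `Publication` record, the subset of required fields, and in particular the `min_impact_factor` threshold are hypothetical and are not taken from the software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Publication:
    """Hypothetical record holding a subset of the metadata listed above."""
    title: Optional[str] = None
    authors: Optional[List[str]] = None
    journal: Optional[str] = None
    publication_date: Optional[str] = None
    doi: Optional[str] = None
    has_pdf: bool = False
    peer_reviewed: bool = False
    impact_factor: Optional[float] = None


# Assumed list of fields that must be present for the content quality filter.
REQUIRED_FIELDS = ("title", "authors", "journal", "publication_date", "doi")


def passes_content_filter(pub: Publication) -> bool:
    """Filter 1: descriptive metadata and the pdf document are available."""
    return pub.has_pdf and all(getattr(pub, name) for name in REQUIRED_FIELDS)


def passes_objective_filter(pub: Publication, min_impact_factor: float = 1.0) -> bool:
    """Filter 2: level-of-evidence indicators meet an (assumed) threshold."""
    return (
        pub.peer_reviewed
        and pub.impact_factor is not None
        and pub.impact_factor >= min_impact_factor
    )


def select(publications: List[Publication], min_impact_factor: float = 1.0) -> List[Publication]:
    """Apply the two filters consecutively and keep the publications passing both."""
    return [
        pub
        for pub in publications
        if passes_content_filter(pub) and passes_objective_filter(pub, min_impact_factor)
    ]


# Made-up example records, not data from the study.
candidates = [
    Publication("Complete record", ["A. Author"], "Journal X", "2021", "10.0000/a",
                has_pdf=True, peer_reviewed=True, impact_factor=3.2),
    Publication("Missing DOI", ["B. Author"], "Journal Y", "2020", None,
                has_pdf=True, peer_reviewed=True, impact_factor=4.0),
    Publication("Not peer reviewed", ["C. Author"], "Preprint Z", "2022", "10.0000/c",
                has_pdf=True, peer_reviewed=False, impact_factor=None),
]
selected_titles = [pub.title for pub in select(candidates)]
print(selected_titles)
```

In this sketch only the first record passes both filters. A real pipeline would then apply the NLP relevancy filters to the retained publications, which is beyond what this example attempts.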
Traditional human based search
In parallel, human experts (F.L., J.R., G.D.L.) performed a systematic literature search following the guidelines defined by the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) Group. The authors independently searched PubMed to identify studies reporting the use of AI/ML to develop predictive models in aortic aneurysm using a combination of the pre-defined keywords as follows: Query 1: artificial intelligence OR machine learning; Query 2: aortic aneurysm; Query 3: prognosis OR predictive models. Queries 1 and 2, and then Query 3, were connected using the “AND” operator. The flow chart is depicted in Figure 2. Inclusion criteria were original articles reporting applications of AI/ML in aortic aneurysm to develop predictive models, including prediction of prognosis and outcomes of patients. Review articles, case reports, editorials, letters, and comments were excluded. After titles were identified, the abstracts were checked, and full texts were retrieved. Content related criteria were applied, and eligibility was independently checked by two authors (J.R., G.D.L.). In the few cases of disagreement, the article was discussed with a third author (F.L.) to reach consensus.
Figure 2.
Flow chart depicting the process for the literature search and selection of the studies by traditional search performed by human experts.
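The Boolean combination used for the traditional search can be written out explicitly. The short Python sketch below only assembles the query string from the three keyword groups; the helper names `or_group` and `build_query` are hypothetical, and the exact syntax typed into PubMed by the authors (for example, quoting or field tags) is not reported, so the output is an illustration rather than the authors' literal search string.

```python
def or_group(terms):
    """Join the terms of one query with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Connect the OR groups with the AND operator."""
    return " AND ".join(or_group(group) for group in groups)


QUERY_1 = ["artificial intelligence", "machine learning"]
QUERY_2 = ["aortic aneurysm"]
QUERY_3 = ["prognosis", "predictive models"]

query = build_query(QUERY_1, QUERY_2, QUERY_3)
print(query)
# (artificial intelligence OR machine learning) AND (aortic aneurysm) AND (prognosis OR predictive models)
```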
Comparison between artificial intelligence based search and traditional human based search
The publications proposed by both methods were classified into three categories: original articles truly related to the question, i.e. articles using AI/ML to predict the prognosis and outcomes of patients with aortic aneurysm; original articles related to the field but not specifically to the question, i.e. articles using AI/ML in aortic aneurysm but for an application other than predicting the prognosis; and original articles not related to the question, i.e. articles that did not use AI/ML or that used AI/ML in pathologies other than aortic disease. Statistical analyses were performed using GraphPad Prism software (version 8.00, San Diego, CA, USA). Categorical data were expressed as number and percentage, and continuous variables were expressed as median with interquartile range. Group differences were investigated using the Mann–Whitney test for continuous data and Fisher's exact test for categorical data. A p value < .050 was considered statistically significant.
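For readers who wish to reproduce the categorical comparisons, the following is a minimal, standard-library Python sketch of a two sided Fisher's exact test on a 2 × 2 table. It is not the authors' computation, which was done in GraphPad Prism; the function name is hypothetical, and the two sided p value is assumed to follow the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two search methods; columns are "in category" and
    "not in category". Row and column totals are held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    # Small relative tolerance so that ties are not lost to rounding error.
    p_value = sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Counts of original articles reported in this study (AI based vs. traditional search).
p_truly_related = fisher_exact_two_sided(17, 35 - 17, 26, 154 - 26)
p_field_related = fisher_exact_two_sided(18, 35 - 18, 63, 154 - 63)
p_unrelated = fisher_exact_two_sided(0, 35, 65, 154 - 65)
print(p_truly_related, p_field_related, p_unrelated)
```

Applied to the counts of this study, the sketch should give p values below .001 for the truly related and the unrelated articles and a value close to .26 for the articles related to the field only.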
For the specific scientific question investigated, 184 citations were identified during the traditional search performed by human experts, among which 154 were original articles (Table 1). After checking for eligibility criteria, 26 original articles (16.9%) were found to be truly related to the question, 63 (40.9%) were related to the field but not to the specific question, and 65 (42.2%) were not related. The AI based method selected and recommended 45 publications, among which 35 were original articles (Table 1). Compared with the traditional search, the AI based method identified 17 original articles (48.6%) truly related to the question (p < .001), 18 (51.4%) related to the field but not to the specific question (p = .26), and no unrelated publications (p < .001). The computational time required for the recommendation of the articles by the AI based method was less than one hour, whereas the traditional literature search required approximately 10 hours. No significant difference was observed between the two methods regarding the median impact factor (IF) and the category of the journals, the number of citations, or the year of publication of the articles (Table 1). However, the proportion of articles published in journals with an IF greater than 3.0 tended to be higher with the AI based search method (76.5% vs. 46.2%, p = .064). Comparison of the two selections revealed that 15 original articles were jointly identified by the two methods, among which 11 (73.3%) were published in journals with an IF > 3.0 (Table 1). The AI based method identified two further truly related articles that were not selected by the human experts, both published in journals with an IF > 3.0. The traditional human based search identified 11 truly related articles that were not selected by the AI application, of which only one (9.1%) was published in a journal with an IF > 3.0.
Table 1.
Comparison of performances between the artificial intelligence (AI) based method and traditional search method performed by human experts.
Parameter | AI based method (Juisci application) | Traditional search (Human experts) | p value |
---|---|---|---|
Quantitative analysis | |||
Total number of papers identified in the search results | 45 | 184 | NA |
Number of original articles in the search results | 35 | 154 | NA |
Number of original articles truly related to the question | 17/35 (48.6) | 26/154 (16.9) | <.001 |
Number of original articles related to the field but not to the question | 18/35 (51.4) | 63/154 (40.9) | .26 |
Number of unrelated papers | 0 (0) | 65/154 (42.2) | <.001 |
Estimation of computational time | <1 hour | Approximately 10 hours | NA |
Qualitative analysis of truly related articles | |||
Impact factor of the journal | 3.6 (2.5, 4.7) | 2.7 (1.8, 3.7) | .13 |
Number of articles with impact factor >3.0 | 13/17 (76.5) | 12/26 (46.2) | .064 |
Number of articles published in journals related to cardiovascular disease | 6/17 (35.3) | 13/26 (50) | .37 |
Number of articles published in other journals (general journal or related to engineering and bio-informatics) | 11/17 (64.7) | 13/26 (50) | .37 |
Number of citations of the articles | 21 (4.5, 48.0) | 13 (4.0, 34.0) | .47 |
Year of publication of the articles | 2020 (2017, 2021) | 2020 (2017, 2022) | .54 |
Qualitative analysis of truly related articles identified jointly by the two methods | |||
Number of original articles | 15/17 (88.2) | 15/26 (57.7) | NA |
Impact factor of the journal | 3.6 (1.9, 4.3) | 3.6 (1.9, 4.3) | NA |
Number of articles with impact factor >3.0 | 11/15 (73.3) | 11/15 (73.3) | NA |
Number of citations of the articles | 24 (4.8, 50.5) | 24 (4.8, 50.5) | NA |
Year of publication of the articles | 2020 (2016, 2020) | 2020 (2016, 2020) | NA |
Qualitative analysis of truly related articles identified by only one of the methods | |||
Number of original articles | 2/17 (11.8) | 11/26 (42.3) | NA |
Impact factor of the journal | 5.6 | 1.9 (1.2, 2.5) | NA |
Number of articles with impact factor >3.0 | 2/2 (100) | 1/11 (9.1) | NA |
Number of citations of the articles | 23.5 | 7.5 (1, 19) | NA |
Year of publication of the articles | 2020 | 2021 (2018, 2022) | NA |
Results are expressed as n, n (%), or median with interquartile range. NA = not applicable.
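The overlap figures in Table 1 follow from simple set arithmetic on the two selections of truly related articles. The Python sketch below reproduces them; the article identifiers are placeholders invented for the example (the individual articles are not listed here), and only the set sizes come from the study.

```python
# Placeholder identifiers; only the counts (15 joint, 2 AI only, 11 human only) are from the study.
joint = {f"joint-{i}" for i in range(15)}
ai_selection = joint | {"ai-only-1", "ai-only-2"}
human_selection = joint | {f"human-only-{i}" for i in range(11)}


def percentage(part, whole):
    """Percentage rounded to one decimal, as reported in Table 1."""
    return round(100 * part / whole, 1)


shared = ai_selection & human_selection
ai_only = ai_selection - human_selection
human_only = human_selection - ai_selection

overlap_summary = {
    "shared / AI selection": percentage(len(shared), len(ai_selection)),
    "shared / human selection": percentage(len(shared), len(human_selection)),
    "AI only / AI selection": percentage(len(ai_only), len(ai_selection)),
    "human only / human selection": percentage(len(human_only), len(human_selection)),
}
print(overlap_summary)
```

If the traditional search is taken as the ground truth, the second figure (15 of 26 articles) can also be read as the proportion of the experts' selection that the AI application retrieved.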
Discussion
This pilot study tested an innovative AI based mobile application that automates literature searching and proposes publications related to the user's request. Compared with the ground truth provided by human experts, the results showed that the AI based method enabled a targeted, focused, and time saving literature search. Although the selection of publications by the AI based method was not exhaustive, it gave an appropriate overview of current and high quality publications related to the specific question investigated, and almost 90% of the truly related articles it selected were also selected by the traditional human based search. In addition, the AI method identified two more truly related articles, suggesting its usefulness as a complementary, easy to use, and quick tool for literature searching. The appropriate balance between exhaustiveness and specificity for a literature search and selection of articles may also depend on the needs of the users.
The last decades have witnessed an exponential increase in publications in all areas of medicine. Clinicians and researchers face new challenges in dealing with increasing amounts of information and keeping up to date in their area of expertise.8 A traditional literature search in databases usually returns hundreds to thousands of papers, of which only a small proportion actually matches the topic of interest. The user has to check all the results of the search manually to identify relevant papers. The process can be tedious and time consuming while, at the same time, health professionals face increased pressure regarding quality, efficiency, and cost effectiveness. The AI based method offers a tool that automates the literature search and the selection of publications. It could help health professionals to easily screen scientific content adapted to their use and may have the advantage of being easily accessible, available, and constantly updated through a mobile app. In addition, an automatic NLP driven literature search could help to reduce bias related to an author's experience and familiarity in conducting systematic reviews and meta-analyses.9 AI could potentially help to improve reproducibility and reduce interoperator variation during the literature search process.
This study has some limitations, and several perspectives can be highlighted. It investigated a single specific question based on original articles. It would be worth testing the AI based method on several other questions related to vascular surgery, on topics that have been documented for longer periods, and on other types of publications to assess its performance for literature searching and selection of publications in other fields. The performance of the method for the selection of publications based on full text analysis of the original publications was analysed, and further studies are required to provide a qualitative analysis of the translations and summaries proposed. Compared with human experts, the selection of publications using the AI method was not exhaustive, but the completeness required may depend on the intended use. In addition to the relevance and quality of the publications, further studies would also be of interest to investigate how well the selection matches the users’ needs. Finally, other AI based methods have been reported to enable an automatic literature search, but comparing the results between studies remains extremely challenging due to heterogeneity in study designs, AI techniques, and methodology used.9 Further efforts should be oriented towards building standards and guidelines to evaluate and validate NLP applications.10 Although further research is required, in this study, the AI based method allowed a quick, focused, and easily available overview of current knowledge on a specific topic. In addition to traditional methods, such an AI driven tool could help to complement the continuous education of health professionals and researchers, contribute to knowledge dissemination, and might benefit research and clinical practice. Such technology offers great promise, although its use remains to be evaluated and should be kept under human supervision and responsibility.
Funding
This work has been supported by the French government through the National Research Agency (ANR) with the reference number ANR-22-CE45-0023-01 and through the 3IA Côte d’Azur Investments in the Future project managed with the reference number ANR-19-P3IA-002.
Conflict of interest
None.
References
1. Lareyre F., Le C.D., Ballaith A., Adam C., Carrier M., Amrani S., et al. Applications of artificial intelligence in non-cardiac vascular diseases: a bibliographic analysis. Angiology. 2022;73:606–614. doi: 10.1177/00033197211062280.
2. Raffort J., Adam C., Carrier M., Ballaith A., Coscas R., Jean-Baptiste E., et al. Artificial intelligence in abdominal aortic aneurysm. J Vasc Surg. 2020;72:321–333. doi: 10.1016/j.jvs.2019.12.026.
3. Li B., Feridooni T., Cuen-Ojeda C., Kishibe T., de Mestral C., Mamdani M., et al. Machine learning in vascular surgery: a systematic review and critical appraisal. NPJ Digit Med. 2022;5:7. doi: 10.1038/s41746-021-00552-y.
4. Wu H., Wang M., Wu J., Francis F., Chang Y.H., Shavick A., et al. A survey on clinical natural language processing in the United Kingdom from 2007 to 2022. NPJ Digit Med. 2022;5:186. doi: 10.1038/s41746-022-00730-6.
5. McLenon M., Okuhn S., Lancaster E.M., Hull M.M., Adams J.L., McGlynn E., et al. Validation of natural language processing to determine the presence and size of abdominal aortic aneurysms in a large integrated health system. J Vasc Surg. 2021;74:459–466. doi: 10.1016/j.jvs.2020.12.090.
6. Kwabena A.E., Wiafe O.B., John B.D., Bernard A., Boateng F.A.F. An automated method for developing search strategies for systematic review using natural language processing (NLP). MethodsX. 2023;10. doi: 10.1016/j.mex.2022.101935.
7. JUISCI. Available at: https://www.juisci.com/. [Accessed 18 March 2023].
8. Subbiah V. The next generation of evidence-based medicine. Nat Med. 2023;29:49–58. doi: 10.1038/s41591-022-02160-z.
9. Blaizot A., Veettil S.K., Saidoung P., Moreno-Garcia C.F., Wiratunga N., Aceves-Martins M., et al. Using artificial intelligence methods for systematic review in health sciences: a systematic review. Res Synth Methods. 2022;13:353–362. doi: 10.1002/jrsm.1553.
10. Zhang Y., Liang S., Feng Y., Wang Q., Sun F., Chen S., et al. Automation of literature screening using machine learning in medical evidence synthesis: a diagnostic test accuracy systematic review protocol. Syst Rev. 2022;11:11. doi: 10.1186/s13643-021-01881-5.