PURPOSE
Despite advances in molecular therapeutics, few anticancer agents achieve durable responses. Rational combinations using two or more anticancer drugs have the potential to achieve a synergistic effect and overcome drug resistance, enhancing antitumor efficacy. A publicly accessible biomedical literature search engine dedicated to this domain will facilitate knowledge discovery and reduce manual search and review.
METHODS
We developed RetriLite, an information retrieval and extraction framework that leverages natural language processing and a domain-specific knowledge base to computationally identify highly relevant papers and extract key information. The modular architecture enables RetriLite to benefit from combining information retrieval and natural language processing techniques while remaining flexible to customization. We customized the application and created an informatics pipeline that identifies papers that describe the efficacy of combination therapies in clinical or preclinical studies.
RESULTS
In a small pilot study, RetriLite achieved an F1 score of 0.93. A more extensive validation experiment was conducted to determine agents that have enhanced antitumor efficacy in vitro or in vivo with poly (ADP-ribose) polymerase inhibitors: 95.9% of the papers determined to be relevant by our application were true positives, and the application's feature for distinguishing clinical papers from preclinical papers achieved an accuracy of 97.6%. An interobserver assessment resulted in 100% concordance. The data derived from the informatics pipeline have also been made accessible to the public via a dedicated online search engine with an intuitive user interface.
CONCLUSION
RetriLite is a framework that can be applied to establish domain-specific information retrieval and extraction systems. The extensive and high-quality metadata tags, along with keyword highlighting, help information seekers discover knowledge in the combination therapy domain more effectively and efficiently.
INTRODUCTION
Although there is growing interest in precision oncology, few tumors have a single driver, and durable disease control is rarely achieved with monotherapy. Combination therapy, which simultaneously uses two or more drugs, has been extensively used in chemotherapy and is increasingly being explored in targeted therapy trials. Compared with monotherapy approaches, a combination strategy has the potential to enhance drug efficacy if synergistic or additive effects can be achieved.1 Combination therapies may also prevent or overcome cancer drug resistance, which remains a tremendous challenge in cancer treatment.2 Although these remain important research questions, there is also increased emphasis on personalizing combinations, by strategies such as targeting a driver alteration in combination with standard-of-care chemotherapy or by targeting coalterations with a combination of targeted therapies. Thus, there is a need to rapidly identify combination therapies that have been tested preclinically and found to be additive and/or synergistic and to identify combinations tested clinically and known to be safe.
CONTEXT
Key Objective
To facilitate exploration of combination therapy literature, will an automated information retrieval and extraction system that was developed with experts in the loop render sufficiently good performance and provide value in real-world applications?
Knowledge Generated
We applied RetriLite, an informatics framework that combines information retrieval and natural language processing techniques, to the domain of combination therapy. RetriLite achieved good performance in identifying highly relevant papers, and we made the results of our study, which include rich metadata extracted by our tool, freely available to the public via a web application.
Relevance
RetriLite may guide clinician-investigators interested in novel therapeutic opportunities to combinations that have been explored preclinically but not yet clinically. Furthermore, although guideline-based therapy is encouraged, patients with rare histologies, rare molecular alterations, or combinations of alterations may represent clinical scenarios where all therapeutic options may need to be considered with multidisciplinary guidance.
Driven by growing interest in this topic, the volume and velocity of literature related to combination therapies have expanded significantly over the years. As of the time of writing, a general PubMed search for combination therapy cancer returns 287,497 entries.3 A search with anticancer drug combination, a term in the MeSH (Medical Subject Headings) lexicon, returned 135,740 articles. More specific search terms will certainly help reduce the number of matched articles, yet general literature search engines, such as PubMed, Web of Science, and Google Scholar,4,5 are not designed to address specific search questions. Therefore, the burden of conducting a comprehensive search followed by denoising lies on the information seekers. This requires familiarity with all known terminologies of key domain concepts, which is an unreasonable expectation for human users. The challenges are amplified in knowledge discovery tasks because information seekers' understanding of the subject is expected to grow iteratively, making a manual search effort even more ad hoc and time-consuming. Domain-specific databases, such as DrugBank, Therapeutic Target Database, and DGIdb, provide relevant information about genes, drugs, and their associations but are not integrated with any general literature search capacity.6-8 Other subscription-based resources such as Embase and Cochrane provide support that cross-links biomedical literature with some specific topics, but do not yet include combination therapy.9,10 Saiz et al11 from the IBM Watson Health group developed an automated clinical evidence engine leveraging artificial intelligence to mine clinical oncology–related research. Their application was not yet customized to identify preclinical evidence or the combination therapy context. Because of the proprietary nature of commercial solutions, data availability is limited. Furthermore, the supervised learning method used to establish the system requires large amounts of prelabeled data, which most academic centers cannot practically produce because of a lack of resources.12 Publicly accessible resources do exist. The largest medical wiki, A Hematology Oncology Wiki,13 is one example; its focus is on regimens that are actively prescribed by clinical oncologists, and thus it does not yet cover studies in the research setting.
To explore the possibility of establishing a free resource for public access that integrates a knowledge base and a search engine customized for the combination therapy domain, we developed a novel natural language processing (NLP)–assisted literature retrieval and extraction system called RetriLite and customized it for the precision oncology domain. In developing and evaluating the informatics pipeline, we used an expert-in-the-loop strategy not only to leverage the experts' rich domain expertise but also to explore the real-world application of RetriLite in supporting their knowledge discovery. The tool can be accessed via the Combination Therapy Literature Search Engine.14 Custom dictionaries and lexicons are available in the Data Supplement.
METHODS
RetriLite's Information Retrieval Component
We developed a literature retrieval and extraction application called RetriLite using MEDLINE, the literature database that powers PubMed. Below, we enumerate the features that RetriLite offers:
Query expansion: in biomedicine, a plethora of domain-specific terminologies are used throughout the literature. RetriLite leverages domain-specific lexicons for systematic, automatic query expansion: NCBI's Entrez Gene database for gene symbols and aliases,15 the NCI Thesaurus for drug names and aliases,16 and a cancer disease lexicon developed by a major cancer center for cancer terms (Data Supplement).17 Customized user-defined concept expansion is also available.
Relevancy-based ranking: we used Lucene, a state-of-the-art information retrieval library, as the backbone of our application,18 which provides basic relevance-based ranking. The default ranking strategy is a term frequency–inverse document frequency (TF-IDF) weighting scheme in which the terms that match search keywords contribute to each document's relevancy score.
Keyword highlighting: RetriLite provides key term highlighting, which conveniently conveys the hidden knowledge used in formulating the expanded query and may assist knowledge discovery.
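The three features above can be illustrated with a minimal, standard-library sketch. It is not RetriLite's implementation and does not reproduce Lucene's scoring formula: the alias lexicon, the function names, and the simplified TF-IDF weighting are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical alias lexicon; RetriLite draws on resources such as the NCI
# Thesaurus and Entrez Gene. These entries are illustrative only.
ALIASES = {
    "apatinib": ["apatinib", "rivoceranib"],
    "cancer": ["cancer", "tumor", "carcinoma"],
}


def expand_query(concepts):
    """Return one set of lowercase search terms per concept (OR within a set)."""
    return [set(ALIASES.get(c.lower(), [c.lower()])) for c in concepts]


def tokenize(text):
    """Whitespace tokenizer that strips common punctuation and lowercases."""
    return [tok.strip(".,;:()").lower() for tok in text.split()]


def tf_idf_scores(docs, term_groups):
    """Score each document by summing simplified TF-IDF weights of matched terms."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    query_terms = set().union(*term_groups)
    # Document frequency: number of documents containing each query term.
    df = {t: sum(1 for toks in tokenized if t in toks) for t in query_terms}
    scores = []
    for toks in tokenized:
        counts = Counter(toks)
        score = 0.0
        for term in query_terms:
            if counts[term]:
                # Simplified weighting (an assumption, not Lucene's formula).
                score += counts[term] * math.log(1 + n_docs / df[term])
        scores.append(score)
    return scores


def highlight(text, term_groups):
    """Wrap every matched query term in square brackets."""
    terms = sorted(set().union(*term_groups), key=len, reverse=True)
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text)
```

In this sketch a document that matches more expanded terms, or rarer ones, scores higher, and the bracketed output shows the reader which aliases the expanded query actually matched.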
RetriLite's NLP Component
To refine the retrieval results, we developed a simple rule-based NLP component that eliminates less relevant papers and extracts meaningful metadata. We implemented the NLP module using spaCy, an industrial-strength Python library that is well known for its speed and performance.19
The following strategies were used in our design:
Named entity recognition: we developed a general named entity recognizer mechanism, which takes a dictionary as input to recognize relevant entities in the text and normalize them using the canonical terminology.
Context analysis: one common challenge for information retrieval systems is their inability to analyze contexts of the matched query terms. To address this, we applied text segmentation and eliminated articles where matched keywords do not appear in the same context. Our previous study suggested that sentence-level co-occurrence struck a good balance between precision and recall,20 and we adopted this strategy in RetriLite. A paper is considered to contain the desirable context if at least one sentence exists where all required concepts co-occur.
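A minimal sketch of the two strategies above follows. It assumes a tiny hypothetical dictionary and, to stay dependency-free, uses a naive regular-expression sentence splitter as a stand-in for spaCy's sentence segmentation; none of the names below come from RetriLite itself.

```python
import re

# Hypothetical dictionary mapping surface forms to (concept type, canonical name).
DICTIONARY = {
    "olaparib": ("drug", "olaparib"),
    "lynparza": ("drug", "olaparib"),
    "cisplatin": ("drug", "cisplatin"),
    "in combination with": ("combination", "combination therapy"),
    "combined with": ("combination", "combination therapy"),
}


def split_sentences(text):
    """Naive splitter on '.', '!' or '?' followed by whitespace (not spaCy)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def recognize_entities(sentence):
    """Return the set of (type, canonical name) pairs found in one sentence."""
    lowered = sentence.lower()
    found = set()
    for surface, entity in DICTIONARY.items():
        if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
            found.add(entity)
    return found


def has_required_context(text, required_types):
    """True if at least one sentence contains every required concept type."""
    for sentence in split_sentences(text):
        types = {etype for etype, _ in recognize_entities(sentence)}
        if required_types <= types:
            return True
    return False
```

Because the check is made per sentence, an abstract that names a drug in one sentence and mentions a combination only in another is rejected, which is the behavior the sentence-level co-occurrence rule is meant to produce.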
Applying RetriLite for Combination Therapy
In this article, we customized RetriLite for combination therapy and developed the pipeline with four modules. Figure 1 illustrates the architecture of the pipeline where sources/processes colored in teal are automatically generated and those marked by red are manually compiled.
Retriever: this module took a gene list or drug list as input. For a gene list, our in-house drug database was cross-referenced to identify clinically available drugs directly targeting the gene(s) along with their aliases.17 For a drug list, all the names/aliases associated with these drugs would be retrieved. A conjunctive Boolean search query was then generated, requiring the co-existence of three factors: (1) the drugs of interest, (2) the concept of cancer, and (3) the concept of combination therapy. Keywords related to factor (3) were compiled by two domain experts (coauthors M.K. and D.Y.; Data Supplement). The retriever executed the query to identify papers that satisfied the condition.
Refiner: the NLP component was applied to refine the retrieval results using the named entity recognizer and context analyzer, which considers a paper qualified if it contains at least one sentence where two drug entries co-occur with the combination therapy concept (one of them has to be the drug of interest).
Classifier: in this step, a customized weighted term dictionary developed in house (Data Supplement) was used to classify the main theme of the articles to be either clinical or preclinical, which is meaningful high-level metadata for the audience of this topic per the expert review team. The study type is labeled with the class having the larger score (a tiebreaker defaulted to the preclinical study type).
Tagger: to facilitate expert review, we generated relevant metadata tags to help navigate the sizable corpus. For example, our tool created tags for the general category of cancer types (solid tumor and/or hematologic malignancy), drugs that matched the search query (anchor drugs), any other drugs recognized in the combination context but not included in the original search query (partner drugs, which include specific drugs and broad drug concepts such as chemotherapy and radiation), the types of studies (clinical and/or preclinical), and certain safety-related concepts such as side effects mentioned in the abstracts.
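Two of the module rules above lend themselves to a short sketch: the retriever's conjunctive Boolean query over three factors and the classifier's weighted-term scoring with its preclinical tiebreaker. The query is rendered in Lucene-style syntax; the term lists, weights, and function names are hypothetical, and the authors' actual dictionaries are in their Data Supplement.

```python
def or_group(terms):
    """Join terms into a parenthesized OR group, quoting multi-word phrases."""
    quoted = ['"%s"' % t if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(drug_terms, cancer_terms, combination_terms):
    """Require all three factors: drug AND cancer AND combination therapy."""
    groups = (drug_terms, cancer_terms, combination_terms)
    return " AND ".join(or_group(g) for g in groups)


# Hypothetical weighted term dictionaries (illustrative terms and weights).
CLINICAL_TERMS = {"patients": 1.0, "phase ii": 2.0, "randomized": 2.0}
PRECLINICAL_TERMS = {"xenograft": 2.0, "cell lines": 2.0, "in vitro": 1.5}


def weighted_score(text, weights):
    """Sum weight x occurrence count for every dictionary term in the text."""
    lowered = text.lower()
    return sum(w * lowered.count(term) for term, w in weights.items())


def classify_study_type(text):
    """Label with the higher-scoring class; ties default to preclinical."""
    clinical = weighted_score(text, CLINICAL_TERMS)
    preclinical = weighted_score(text, PRECLINICAL_TERMS)
    return "clinical" if clinical > preclinical else "preclinical"
```

Using a strict greater-than comparison is one simple way to encode the tiebreaker: any paper whose two scores are equal, including a paper that matches no dictionary term at all, falls through to the preclinical label.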
FIG 1.
Architecture of RetriLite pipeline for combination therapy. DB, database; NER, named entity recognizer.
In-Depth Knowledge Curation
To explore a real-world application of RetriLite, we engaged two expert reviewers who actively pursue research in the combination therapy field (coauthors C.X.C.P. and T.S.). They conducted in-depth review (including full manuscripts and the Data Supplement) and knowledge curation on a subset of papers flagged by RetriLite to approximate their own review process. Because of space constraints, a detailed description of the relevant background and process is provided in the Data Supplement.
Web Application for Combination Therapy Literature Search
To expand the impact of our effort, we developed a publicly accessible web application to showcase the results generated by our pipeline. Because of space constraints, we provide a detailed description in the Data Supplement.
RESULTS
Pilot Study on Combination Therapy Studies Using a Vascular Endothelial Growth Factor Inhibitor
To establish a preliminary evaluation of the utility of our application and its performance and configuration, we used the anticancer drug apatinib (rivoceranib) as input to the RetriLite pipeline, simply because apatinib ranked first alphabetically in our drug list. Apatinib is an oral, small-molecule receptor tyrosine kinase inhibitor with potential antiangiogenic and antineoplastic activities; it selectively binds to and inhibits vascular endothelial growth factor receptor 2.16,21,22 In December 2018, when this experiment was executed, the retriever module retrieved 56 papers that met the search criteria (pattern described in the Methods); the NLP components were then used to identify highly relevant papers, resulting in 44 papers being retained. To evaluate the refiner module, two curators developed the gold standard by independently reviewing the 56 papers returned by the retriever and achieved 100% concordance in their assessments. Eighty-nine percent of the articles assessed by the refiner were correctly classified as relevant or irrelevant. The refiner produced four false positives and two false negatives, which yielded a recall of 0.95, a precision of 0.91, an F1 score of 0.93, and a false-negative rate of 0.05. Error analysis revealed that the false negatives were due to the NCI Thesaurus (the drug lexicon that we used) not listing Apa as an alias of apatinib; the false positives resulted from spaCy's sentence segmentation errors or context ambiguities (eg, a paper suggested that exploring apatinib combinations in the future would be beneficial but did not provide actual data). The relatively high recall and low false-negative rate, coupled with a reasonably good precision, led us to believe that the elimination conducted in the refiner module achieved a balanced outcome.
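The reported pilot metrics can be reproduced from the counts given above. The sketch below derives the true-positive and true-negative counts from the reported totals (56 retrieved, 44 retained, four false positives, two false negatives); those two derived counts are not stated explicitly in the text and are an inference.

```python
def refiner_metrics(tp, fp, fn, tn):
    """Standard confusion-matrix metrics, rounded to two decimal places."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": round(precision, 2),
        "recall": round(recall, 2),
        "f1": round(2 * precision * recall / (precision + recall), 2),
        "false_negative_rate": round(fn / (fn + tp), 2),
        "accuracy": round((tp + tn) / (tp + fp + fn + tn), 2),
    }


retrieved, retained = 56, 44
false_positives, false_negatives = 4, 2

# Derived counts (an inference from the reported totals).
true_positives = retained - false_positives                # 44 - 4
true_negatives = retrieved - retained - false_negatives    # 56 - 44 - 2

pilot = refiner_metrics(
    true_positives, false_positives, false_negatives, true_negatives
)
```

Under these counts, the accuracy entry corresponds to the 89% of articles reported as correctly classified, computed over all 56 papers returned by the retriever.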
Validation on Combination With Poly (ADP-ribose) Polymerase Inhibitors
We expanded our experiment to use a family of drugs as input. To provide practical value to our expert reviewers (C.X.C.P. and T.S.) who participated in this round of experiment, we added classifier and tagger modules in the RetriLite pipeline. To balance the need to validate RetriLite and to align with the reviewers' research interests and their time commitment, we focused on the formal assessment of two key functional modules that can be evaluated relatively easily by objective measures, namely, refiner and preclinical/clinical paper classifier. The paper ranking and metadata generated by the tagger were provided to them as nonfunctional features and were not formally assessed.
Extended Assessment of RetriLite
We tested RetriLite using a class of drugs known as poly (ADP-ribose) polymerase (PARP) inhibitors, which were aligned with the expert reviewers' research focus. We provided all approved and commonly investigated PARP inhibitors as input to the RetriLite pipeline. In January 2019, when this experiment was executed, the retriever identified 729 papers meeting basic search criteria and the refiner retained 419 of them, among which 106 were labeled by the classifier as clinical studies and 313 as preclinical.
For manual validation, we followed a process similar to that described by Fathiamini et al.20 Two reviewers with clinical and scientific domain expertise reviewed 211 and 208 papers, respectively, each receiving a random assignment of preclinical and clinical papers. To facilitate their review process, each paper's PubMed Identifier, predicted study type (clinical or preclinical), title, and additional metadata generated by the tagger were provided, including the predicted drug combinations, the biomarkers targeted by these drugs (if any), and the clinical safety terms mentioned in the article (for clinical studies only).
The reviewers evaluated RetriLite's prediction of relevance and the classification of the study type. Among the 419 papers reviewed, 402 (95.9%) papers were considered relevant as they discussed the use of PARP inhibitors in combination therapies, whereas 17 papers were irrelevant (three were correction/comment papers with a relevant title but no specific detail mentioned in the abstracts, and 14 mentioned PARP inhibitors but not in a combination setting). Among the relevant papers, 360 (85.9% of the 419 reviewed papers) extensively discussed studies that formally evaluated the efficacy of combination therapies involving specific PARP inhibitors and reported detailed data and results, whereas the other 42 papers did not provide specific efficacy data about the combination of interest. With respect to the classification of clinical versus preclinical papers, 409 (97.6%) papers were correctly classified, whereas 10 papers were misclassified (five clinical and five preclinical). Error analysis showed that most of the misclassifications resulted from paper records released by MEDLINE that contained only titles but no abstracts, thus skewing the analysis.
To assess inter-rater concordance, we randomly sampled 10% of the entire collection (42 papers), and both raters reviewed the sampled papers. Analysis revealed that the reviewers reached 100% concordance.
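The validation percentages and the concordance check above reduce to simple proportions, sketched here. All percentages are computed over the 419 reviewed papers; the random seed, the integer paper identifiers, and the function names are illustrative assumptions, and concordance is computed as plain percent agreement.

```python
import random


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def percent_agreement(labels_a, labels_b):
    """Share of items on which two raters gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same items")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return pct(matches, len(labels_a))


REVIEWED = 419  # papers retained by the refiner and manually reviewed

relevance_rate = pct(402, REVIEWED)       # papers judged relevant
efficacy_rate = pct(360, REVIEWED)        # relevant papers with detailed efficacy data
classifier_accuracy = pct(409, REVIEWED)  # correct clinical/preclinical labels

# Draw a 10% sample for double review (seed and IDs are illustrative).
rng = random.Random(0)
sample_ids = rng.sample(range(REVIEWED), round(0.10 * REVIEWED))
```

Percent agreement does not correct for chance agreement, so it is best read as a descriptive check on the double-reviewed sample rather than as a chance-adjusted reliability statistic.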
The research focus of our scientific review team is on assessing the efficacy of combining PARP inhibitors with another targeted therapy or chemotherapy. To identify relevant rational combination therapies to explore in their research, expert reviewers conducted an in-depth review of a subset of preclinical and clinical scientific literature studies identified by RetriLite that met their specific needs (61 clinical and 177 preclinical papers). Figure 2 provides some examples of the trend revealed by analysis of our expert reviewers' findings.
FIG 2.

NLP results of (A) selected combination therapies associated with PARP inhibitors with synergistic efficacy from at least three preclinical studies and (B) selected combination therapies associated with PARP inhibitors with increased efficacy from at least one clinical study. NLP, natural language processing; PARP, poly (ADP-ribose) polymerase.
Web Application
RetriLite not only identified highly relevant papers but also produced many metadata tags. We developed a web application to disseminate data generated for the PARP inhibitor experiment.14 Figure 3 illustrates the application's basic features. Users can search by a biomarker or drug of interest from controlled lexicons (the same ones used by our tagger module) via a type-ahead feature, provided that it is found in at least one paper deemed relevant by RetriLite, and the application will return all the papers matching the search criteria from the RetriLite results collected in our study. The matched search terms are highlighted in the text. Details of the matched drugs including their preferred name, aliases, and development phase (US Food and Drug Administration approved or clinical/preclinical development) are provided. Additional metadata are also provided along with the paper's abstract including drug-gene association information (if any) and/or phase information (for clinical studies). Users can filter publications by the publication date, predicted study type, and predicted clinical phase. In addition, custom sorting of the results is provided on the basis of the following options: predicted clinical relevancy, predicted preclinical relevancy, publication date, and impact factor.
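The filtering and sorting options described above can be pictured with a small sketch over tagged paper records. The record fields and function names are hypothetical and do not reflect the web application's actual data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PaperRecord:
    # Hypothetical fields standing in for the tagger's metadata.
    pmid: str
    published: date
    study_type: str  # "clinical" or "preclinical"
    clinical_relevancy: float


def filter_papers(papers, study_type=None, published_on_or_after=None):
    """Keep papers matching the optional study type and earliest date."""
    selected = []
    for paper in papers:
        if study_type is not None and paper.study_type != study_type:
            continue
        if published_on_or_after is not None and paper.published < published_on_or_after:
            continue
        selected.append(paper)
    return selected


def sort_papers(papers, by="published", descending=True):
    """Sort by any record field, newest or highest first by default."""
    return sorted(papers, key=lambda paper: getattr(paper, by), reverse=descending)
```

Keeping filtering and sorting as separate steps mirrors the interface described above, where a user first narrows the result set and then chooses a sort order.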
FIG 3.

Combo web application.
DISCUSSION
We presented a novel and customizable literature mining tool called RetriLite for exploring combination therapies for cancer treatment. Integrating strategies from information retrieval, NLP, and a domain-specific knowledge base, our application can extract an extensive number of domain concepts from free text and normalize them using standardized nomenclature. The performance and utility of the tool were formally evaluated by two separate groups of reviewers. Their feedback confirmed that the tool brings highly relevant results to end users. Our scientific review team allowed us to demonstrate the utility of our technology in a real-world setting. With a public-facing search engine that supports a variety of options to filter and sort the results, our tool enables any user interested in this domain to navigate the results more easily. The identification of rational combinations has several potential uses. First, basic/translational researchers with a particular interest in an agent can quickly review which combinations have been explored and found to have enhanced efficacy in vitro and/or in vivo. Second, this can guide clinician-investigators interested in novel therapeutic opportunities to combinations that have been explored preclinically but might not have been explored in the clinical setting. In addition, investigators can identify combinations deemed safe in early-phase trials that can be explored for future additional indications. Furthermore, although guideline-based therapy is encouraged, patients with rare histologies, rare molecular alterations, or combinations of alterations may represent clinical scenarios where all therapeutic options may need to be considered with multidisciplinary guidance.
We recognize several limitations of our study. First and foremost, the RetriLite index was built from publicly available titles and abstracts released by MEDLINE. Although this source is extensive in breadth, the lack of full texts negatively affected RetriLite's recall because some papers, including some key studies, reviews, and meta-analyses, mention specific drug names only in the full-text body. Although we expected RetriLite's performance to be limited by the scope of its data source, regretfully, we did not perform a systematic study to quantify the false-negative rate of RetriLite's retriever module at the time of our experiments, which is a major limitation of our evaluation. In future work, we plan to explore full text from PubMed Central and pursue a systematic evaluation to quantify the retriever module's recall. In addition, we acknowledge that although the in-depth review generated much insightful domain knowledge, because of the significant time investment, the current study centered only on PARP inhibitor–related studies. Future work is required to expand the scope. Another noteworthy limitation is related to the NLP components. At present, they are unable to disambiguate contexts that mention multiple drugs conjunctively or disjunctively. More sophisticated techniques are needed to achieve this goal; although powerful, such techniques are still computationally intensive and need to be performed offline. Thus far, we have only uploaded results related to PARP inhibitors to our web application. In future work, we will continue to deliver periodic updates to this content and, whenever possible, expand its scope. The NLP library that RetriLite used has not been formally evaluated for the cancer domain, and we did not perform formal comparative studies against other NLP processors such as Pharmspresso.23 Further study is needed to shed more light in this respect.
ACKNOWLEDGMENT
We thank Jeffrey Tacy from the MD Anderson's Enterprise Development and Integration team for his help in enabling our application to be accessible in the public domain.
SUPPORT
Supported in part by The Cancer Prevention and Research Institute of Texas (RP150535), the Sheikh Khalifa Bin Zayed Al Nahyan Institute for Personalized Cancer Therapy, NCATS Grant UL1 TR000371, CTSA-Informatics Core 1UL1TR003167-01 (Center for Clinical and Translational Sciences), and the MD Anderson Cancer Center Support grant (P30 CA016672). The team with which authors J.Z., M.A.S., M.K., and D.Y. are affiliated receives funding and technology support from Royal Philips.
DATA SHARING STATEMENT
We will continue to leverage the web application that we developed as a platform to disseminate data and knowledge generated by our study. We are committed to maintaining support for this publicly accessible resource and to updating it periodically. The latest data update was in September 2021. The source code of RetriLite will not be published.
AUTHOR CONTRIBUTIONS
Conception and design: Jia Zeng, Michael Kahle, Kenna Shaw, Funda Meric-Bernstam
Administrative support: Kenna Shaw
Provision of study materials or patients: Kenna Shaw
Collection and assembly of data: Jia Zeng, Christian X. Cruz Pico, Md Abu Shufean, Michael Kahle, Dong Yang, Kenna Shaw
Data analysis and interpretation: Jia Zeng, Christian X. Cruz Pico, Md Abu Shufean, Michael Kahle, Kenna Shaw, Funda Meric-Bernstam
Manuscript writing: All authors
Final approval of manuscript: All authors
Accountable for all aspects of the work: All authors
AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
The following represents disclosure information provided by authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to www.asco.org/rwc or ascopubs.org/cci/author-center.
Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians.
Jia Zeng
Stock and Other Ownership Interests: Merck, Novocure, Guardant Health
Other Relationship: Philips Healthcare (Inst)
Md Abu Shufean
Research Funding: Royal Philips (Inst)
Michael Kahle
Research Funding: Royal Philips (Inst)
Dong Yang
Employment: Qiagen, Guardant Health
Kenna Shaw
Consulting or Advisory Role: Guidepoint Global
Research Funding: Guardant Health (Inst), Tempus (Inst), Philips Healthcare (Inst)
Funda Meric-Bernstam
Employment: MD Anderson Cancer Center
Honoraria: Rutgers Cancer Institute of New Jersey
Consulting or Advisory Role: Samsung Bioepis, Xencor, Debiopharm Group, Silverback Therapeutics, IBM Watson Health, Roche, PACT Pharma, eFFECTOR Therapeutics, Kolon Life Sciences, Tyra Biosciences, Zymeworks, Puma Biotechnology, Zentalis, Alkermes, Infinity Pharmaceuticals, AbbVie, Black Diamond Therapeutics, Eisai, OnCusp Therapeutics, Lengo Therapeutics, Tallac Therapeutics, Karyopharm Therapeutics, Biovica
Speakers' Bureau: Chugai Pharma
Research Funding: Novartis (Inst), AstraZeneca (Inst), Taiho Pharmaceutical (Inst), Genentech (Inst), Calithera Biosciences (Inst), Debiopharm Group (Inst), Bayer (Inst), Aileron Therapeutics (Inst), Puma Biotechnology (Inst), CytomX Therapeutics (Inst), Jounce Therapeutics (Inst), Zymeworks (Inst), Curis (Inst), Pfizer (Inst), eFFECTOR Therapeutics (Inst), AbbVie (Inst), Boehringer Ingelheim (I), Guardant Health (Inst), Daiichi Sankyo (Inst), GlaxoSmithKline (Inst), Seattle Genetics (Inst), Taiho Pharmaceutical (Inst), Klus Pharma (Inst), Takeda (Inst)
Travel, Accommodations, Expenses: Beth Israel Deaconess Medical Center
No other potential conflicts of interest were reported.
REFERENCES
- 1. Mokhtari RB, Homayouni TS, Baluch N, et al: Combination therapy in combating cancer. Oncotarget 8:38022-38043, 2017
- 2. Leary M, Heerboth S, Lapinska K, et al: Sensitization of drug resistant cancer cells: A matter of combination therapy. Cancers 10:483, 2018
- 3. PubMed: https://www.ncbi.nlm.nih.gov/pubmed
- 4. Web of Science: https://www.webofknowledge.com
- 5. Google Scholar: https://scholar.google.com
- 6. Law V, Knox C, Djoumbou Y, et al: DrugBank 4.0: Shedding new light on drug metabolism. Nucleic Acids Res 42:D1091-D1097, 2014
- 7. Li YH, Yu CY, Li XX, et al: Therapeutic target database update 2018: Enriched resource for facilitating bench-to-clinic research of targeted therapeutics. Nucleic Acids Res 46:D1121-D1127, 2018
- 8. Cotto KC, Wagner AH, Feng YY, et al: DGIdb 3.0: A redesign and expansion of the drug-gene interaction database. Nucleic Acids Res 46:D1068-D1073, 2018
- 9. Embase: https://www.embase.com/
- 10. Cochrane: https://www.cochrane.org/
- 11. Saiz FS, Sanders C, Stevens R, et al: Artificial intelligence clinical evidence engine for automatic identification, prioritization, and extraction of relevant clinical oncology research. JCO Clin Cancer Inform 5:102-111, 2021
- 12. Woo M: Trial by artificial intelligence. Nature 573:S100-S102, 2019
- 13. A Hematology Oncology Wiki. HemOnc.org
- 14. Combination Therapy Literature Search Engine. https://pct.mdanderson.org/combo
- 15. Maglott D, Ostell J, Pruitt KD, et al: Entrez gene: Gene-centered information at NCBI. Nucleic Acids Res 33:D54-D58, 2005
- 16. NCI Thesaurus: https://ncit.nci.nih.gov/ncitbrowser/
- 17. Zeng J, Shufean MA, Khotskaya Y, et al: OCTANE: Oncology clinical trial annotation engine. JCO Clin Cancer Inform 3:1-11, 2019
- 18. Apache Lucene: https://lucene.apache.org
- 19. spaCy: Industrial-strength natural language processing. https://spacy.io/
- 20. Fathiamini S, Johnson AM, Zeng J, et al: Automated identification of molecular effects of drugs (AIMED). J Am Med Inform Assoc 23:758-765, 2016
- 21. Tian S, Quan H, Xie C, et al: YN968D1 is a novel and selective inhibitor of vascular endothelial growth factor receptor-2 tyrosine kinase with potent activity in vitro and in vivo. Cancer Sci 102:1374-1380, 2011
- 22. Li J, Qin S, Xu J, et al: Apatinib for chemotherapy-refractory advanced metastatic gastric cancer: Results from a randomized, placebo-controlled, parallel-arm, phase II trial. J Clin Oncol 31:3219-3225, 2013
- 23. Garten Y, Altman R: Pharmspresso: A text mining tool for extraction of pharmacogenomic concepts and relationships from full text. BMC Bioinformatics 10:S2-S6, 2009