Keywords: Large language models, Drug discovery, Target identification, ADMET
Highlights
• AI-enabled LLMs have flourished and are now used in science, medicine, and other areas of society.
• Different LLMs are used across multiple stages of drug discovery and development.
• Applications of LLMs include de novo drug design, drug target identification and validation, ADME/ADMET, etc.
• These LLMs help make drug discovery faster and more cost-efficient.
Abstract
Background
Due to the recent revolution of artificial intelligence (AI), AI-enabled large language models (LLMs) have flourished and started to be applied in various sectors of science and medicine. Drug discovery and development are time-consuming, complex processes that require high investment. The conventional method of drug discovery is costly and has a high failure rate. AI-enabled LLMs are used in various steps of drug discovery to solve the challenges of time and cost.
Aim of Review
The article aims to provide a comprehensive understanding of AI-enabled LLMs and their use in various steps of drug discovery to ease the challenges.
Key Scientific Concepts of Review
The review provides an overview of the LLMs and their current state-of-the-art application in structure-based drug molecule design and de novo drug design. The different applications of AI-enabled LLMs have been illustrated, such as drug target identification, validation, interaction, and ADME/ADMET. Several domain-specific models of LLMs are developed in this direction and applied in drug discovery and development to speed up the process. We discussed all these domain-specific models of LLMs and their applications in this field. Finally, we illustrated the challenges and future perspectives on the applications of AI-enabled LLMs in drug discovery and development.
Introduction
In 2020, Exscientia, an artificial intelligence/machine learning (AI/ML) and automation-based pharmaceutical company, announced that the first AI-discovered molecule (DSP-1181) had entered a Phase 1 clinical trial. Exscientia, a UK-based drug discovery company, and Sumitomo Dainippon Pharma, a Japan-based pharmaceutical company, developed the molecule through a joint venture [1], [2], [3]. The news drew attention to AI-enabled drug discovery and development across the pharmaceutical industry. Drug discovery and development is a complicated process that requires broad knowledge of basic science research, medicinal chemistry, biological assays, and pharmaceutical technologies. Researchers must also know the methodologies, tools, and technologies from different fields used for drug target identification, target validation, hit-to-lead (H2L) generation using high-throughput screening (HTS), synthesis of the drug molecule, testing in animal models, drug delivery system development, and ADME/ADMET (absorption, distribution, metabolism, excretion/toxicity) analysis. The process is therefore complicated and time-consuming. The use of AI in drug discovery and development, however, has helped reduce the cost and accelerate the timeline. The entry of the AI-discovered molecule from Exscientia and Sumitomo Dainippon Pharma into clinical trials has therefore created new hope in the pharmaceutical sector. As of 2024, several AI-generated molecules have entered clinical trials [4], [5], [6]. AI and deep learning (DL) have been used in different drug discovery and development steps to speed up the process, for example in HTS and molecular docking.
HTS allows large libraries of compounds to be tested experimentally against biological targets, significantly speeding up the identification of active compounds [7]. Molecular docking, in contrast, simulates interactions between small molecules and proteins to predict binding affinities. Quantitative structure–activity relationship (QSAR) models use statistical methods to relate chemical structures to biological activities, providing valuable insights into molecular function [8]. AI-enabled large language models (LLMs) or generative AI (Gen AI) have been used in HTS, molecular docking, QSAR, etc., for drug discovery and development [9].
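The idea of relating a structural descriptor to a measured activity by a statistical fit can be sketched with the simplest possible case, a one-descriptor least-squares line. This is only an illustration of the principle: the descriptor values, the activity values, and the function names `fit_line` and `predict` are hypothetical, and practical QSAR models use many descriptors and more elaborate learners.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical toy data: one descriptor per compound and a made-up activity value.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [5.1, 5.9, 7.1, 7.9]

slope, intercept = fit_line(descriptor, activity)

def predict(x):
    """Predict the activity of a new compound from its descriptor value."""
    return slope * x + intercept

print(f"activity ~ {slope:.2f} * descriptor + {intercept:.2f}")
print(f"predicted activity at descriptor 2.5: {predict(2.5):.2f}")
```

The fitted line can then be used to rank untested compounds by predicted activity before any experiment is run, which is the role QSAR plays alongside HTS and docking.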
AI began to progress after Alan Turing proposed the Turing test in 1950. ML algorithms were introduced in 1959, and the first chatbot, ELIZA, was introduced in 1966 (Fig. 1). After the introduction of DL algorithms, the area advanced very quickly. AI-empowered LLMs have emerged as a transformative force in natural language processing (NLP) and are driving notable progress in diverse domains. An LLM is a DL-based computational model that can perform various NLP-related tasks. As a form of generative AI, an LLM takes text as input, represents it as tokens (words or word pieces), and generates its output one token at a time. LLMs were developed from language models (LMs), and few LMs existed before 2017. In 2017, Vaswani et al. introduced the transformer, a DL-based architecture that allows LMs to process and generate large amounts of text data [10]. In this landmark architecture, text is converted into tokens; each token is given a numerical representation and mapped to a vector [10], [11]. The full potential of LLMs materialized with the introduction of GPT-3 by OpenAI in 2020. Trained on an unparalleled scale, with 175 billion parameters and a dataset comprising nearly a trillion words, GPT-3 demonstrated exceptional performance across a broad spectrum of natural language tasks, including text generation, translation, summarization, and question answering [12]. Models like GPT-3, BERT, and Claude have been trained on billions, or even trillions, of tokens, enabling them to acquire a broad command of human language and its subtleties (Fig. 2). Recently, LLMs have evolved into multimodal large language models (MLLMs). The evolution of MLLMs represents a significant leap in AI, enabling models to process and understand multiple forms of data, such as text, images, audio, and video [13]. MLLMs have also been applied in domains ranging from biology to medicine, including drug discovery and development.
Fig. 1.
The timeline illustrates AI progress and its landmark achievements.
Fig. 2.
The timeline illustrates the introduction of different AI-powered chatbots over time.
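The conversion of text into tokens, and of tokens into vectors, described above for the transformer can be sketched in a few lines. This is a simplified illustration under stated assumptions: the word-level regular-expression tokenizer and the random embedding table are stand-ins (production LLMs use learned subword tokenizers and learned embeddings), and the function names are hypothetical.

```python
import random
import re

def tokenize(text):
    """Split text into lowercase word and punctuation tokens (a simplification)."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def make_embeddings(vocab_size, dim, seed=0):
    """Random embedding table; in a trained model these values are learned."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

text = "Aspirin inhibits COX-1. Aspirin is a drug."
tokens = tokenize(text)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]              # text -> token ids
embeddings = make_embeddings(len(vocab), dim=4)
vectors = [embeddings[i] for i in ids]        # token ids -> vectors

print(tokens)
print(ids)
```

Repeated tokens (here "aspirin") share one id and therefore one vector; the attention layers of a transformer then operate on these vectors rather than on raw characters.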
The rapid growth of LLMs and MLLMs has recently led to significant advancements in various domains. Transformer-enabled LLMs and MLLMs are widely applied in different areas of science, including chemical science, computer science, and biology, due to their impressive capabilities and several advantages [14], [15], [16]. LLMs and MLLMs are also applied in various fields of medical science, including orthopedics, pathology, medicine, radiology, etc. [17], [18], [19], [20]. After the launch of ChatGPT on November 30, 2022, LLMs gained immense importance among users. Within a few days of its launch, the chatbot had gained a few million users. Today, ChatGPT and other chatbots are used by a very large number of people worldwide. Like other users, drug discovery researchers have become highly interested in AI-enabled research and are using ChatGPT and other LLMs or MLLMs to speed up drug discovery and development [1], [21], [22], [23]. Attention has therefore shifted toward the relevance of LLMs in pharmaceutical research.
This article discusses the role of AI-enabled LMs, LLMs, and MLLMs in different drug discovery and development fields such as de novo drug discovery, drug target identification and validation, ADME/ADMET, etc. We also discuss the other applications of LMs, LLMs, and MLLMs in drug discovery and development. Furthermore, the challenges and prospects of LLMs in drug discovery are discussed.
Molecular design using DL and deep generative models
Molecular design using DL and deep generative models has revolutionized drug discovery, material science, and chemistry by enabling efficient exploration of chemical space, prediction of molecular properties, and generation of novel molecules with desired characteristics. These advancements have dramatically accelerated the pace of innovation and discovery across multiple scientific disciplines. HTS is used to test large libraries and identify active compounds experimentally, and scientists are trying to incorporate LLMs or Gen AI into HTS. Similarly, combinatorial chemistry systematically generates large libraries of molecules, which are then screened for biological activity, expanding the potential for discovering new therapeutic agents. Computational methods, such as molecular docking, are critical in this revolution. AI-based docking has changed how candidate molecules are screened, and the use of LLMs is being explored in this area. QSAR models that use AI or DL provide more insight into molecular function; Deep QSAR is one example [24]. LLMs or MLLMs are expected to play a significant role in this area in the future.
DL in molecular design leverages deep neural networks (DNNs) for property prediction and feature extraction. DNNs predict molecular properties such as solubility, toxicity, and binding affinity from molecular structures, while convolutional neural networks (CNNs) and graph neural networks (GNNs) automatically extract relevant features from molecular graphs or SMILES strings [25]. Reinforcement learning (RL) further enhances molecular design by optimizing molecular structures to improve desired properties and guiding the search within chemical space. RL also facilitates inverse design, where molecules with specific properties are iteratively refined (Fig. 3) [26].
Fig. 3.
The overview of a workflow of AI-enabled target-specific drug molecule design.
Deep generative models, including variational autoencoders (VAEs), generative adversarial networks (GANs), recurrent neural networks (RNNs), and transformers, are pivotal in molecular design. VAEs encode molecules into a continuous latent space, allowing for smooth interpolation and generation of novel molecules, and conditional VAEs generate molecules with specific properties by conditioning the latent space on desired characteristics. GANs improve the quality of generated molecules through adversarial training, where a generator creates new molecules and a discriminator distinguishes between real and generated molecules. GANs are particularly useful in drug discovery, generating novel drug-like molecules by learning from existing chemical libraries. RNNs generate new molecules as SMILES strings: they learn the syntax of these strings from training data and build each new string by predicting the next character in the sequence, which favors chemically valid output. With their sequence-to-sequence generation and attention mechanisms, transformers generate molecules as sequences with high efficiency and accuracy, capturing complex dependencies in molecular structures [27].
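The next-character generation loop used by sequence models for SMILES can be illustrated with a deliberately small stand-in. The sketch below replaces the trained RNN with a bigram count table (each character is sampled from the characters that followed the previous one in the training strings), so it shows the sampling loop only, not a neural network. The training strings, the start and end markers, and the function names are assumptions made for the example.

```python
import random
from collections import defaultdict

START, END = "^", "\n"  # markers assumed not to occur in the training strings

def train_bigram(smiles_list):
    """Count how often each character follows another across the training set."""
    counts = defaultdict(lambda: defaultdict(int))
    for smiles in smiles_list:
        seq = START + smiles + END
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_smiles(counts, rng, max_len=40):
    """Generate one string by repeatedly sampling the next character."""
    out, cur = [], START
    for _ in range(max_len):
        candidates = list(counts[cur].keys())
        weights = [counts[cur][c] for c in candidates]
        cur = rng.choices(candidates, weights=weights, k=1)[0]
        if cur == END:
            break
        out.append(cur)
    return "".join(out)

# Tiny hypothetical training set: ethanol, ethylamine, acetic acid, benzene.
training_smiles = ["CCO", "CCN", "CC(=O)O", "c1ccccc1"]
counts = train_bigram(training_smiles)
rng = random.Random(0)
generated = [sample_smiles(counts, rng) for _ in range(5)]
print(generated)
```

A bigram table has no memory of open parentheses or ring-closure digits, so many of its outputs will not be valid SMILES. This is precisely the limitation that RNNs and transformers reduce by conditioning on the whole preceding sequence, and generated strings are in any case checked with a cheminformatics toolkit before use.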
The applications of deep generative models are vast. In drug discovery, these models facilitate de novo drug design, creating new drug candidates with high affinity for specific targets, and lead optimization, improving properties such as potency, selectivity, and pharmacokinetic profiles [28], [29]. In material science, they aid in designing new polymers with specific mechanical, thermal, or electrical properties and generating novel catalysts with high efficiency and selectivity for chemical reactions. In chemistry, these models predict feasible synthetic routes for complex molecules and the outcomes of chemical reactions, assisting in planning experiments [28].
Molecular design using DL and deep generative models has been applied in different drug discovery and development areas, and LLMs or MLLMs are now beginning to be used for molecular design as well. However, several challenges remain. High-quality labeled data is often scarce, especially for novel or rare molecules, necessitating techniques such as data augmentation, virtual screening, and synthetic data generation. The black-box nature of DL models raises concerns about interpretability, making it difficult to understand why specific molecules are generated. Developing explainable AI methods is crucial for interpreting and understanding the decisions made by deep generative models. Integrating these models with experimental workflows for validation and combining them with automated synthesis and testing platforms can ensure the practical utility of generated molecules. Ethical and safety considerations are paramount to prevent the misuse of generative models for creating harmful substances and to ensure compliance with regulatory standards for designing and using novel molecules.
AI-enabled LLMs in drug discovery
AI-enabled LLMs are now widely used in the different steps of drug discovery and development and can perform a wide range of tasks in this domain (Fig. 4). Researchers are developing LLMs to speed up the process. Here, we describe how LLMs perform these tasks.
Fig. 4.
AI-enabled LLMs in the context of drug discovery and development. The figure depicts the different stages of drug discovery and development and the applications of LLMs in these stages.
LLMs have been used in drug design across different fields of chemistry, including structure-based drug molecule generation, de novo drug design, drug target identification and validation, ADME/ADMET studies, etc. (Table 1).
Table 1.
Significant LLMs and MLLMs in drug discovery and development.
| Sl. No. | LLMs/MLLMs | Developer | Remarks | Reference |
|---|---|---|---|---|
| 1. | Med-PaLM 2 | Google | It analyses large volumes of medical literature and clinical trial data to accelerate drug discovery and identify drug targets | [122] |
| 2. | Tx-LLM | Google | It covers a wide variety of chemical and biological entities (datasets, targets) to support various stages of drug discovery | [123] |
| 3. | SynerGPT | OpenAI | This GPT language model is used to learn drug synergy relations and to design novel synergistic drug structures from a personalized dataset | [124] |
| 4. | CancerGPT | Li et al. | It supports drug pair synergy prediction in rare tissues with limited structured data and features | [125] |
| 5. | GenePT | Chen et al. | The fine-tuned model is used for perturbation prediction and drug-gene interaction prediction, as well as for classifying gene properties and cell types | [126] |
| 6. | DTI-BERT | Zheng et al. | It is used for genomic drug discovery and computational drug target prediction, specifically to extract information from drug molecular fingerprints | [127] |
| 7. | Geneformer | Theodoris et al. | It is applied to identify candidate therapeutic targets for cardiomyopathy and can transform public datasets into candidate therapeutic targets | [128] |
| 8. | MOLE-BERT | Xia et al. | It uses molecular graphs as input data and predicts the molecular properties of drug molecules | [129] |
| 9. | LSCPP-BERT | UKPLab | This tool is used for predicting coding lncRNA-sORFs and has the potential to contribute significantly to drug development and agricultural applications | [130] |
| 10. | SMILES-BERT | Wang et al. | It is used to extract drug features alongside fine-tuned large protein models, and offers potential representations of drug-target pairs | [131] |
| 11. | MolGPT | Bagal et al. | Alongside the token prediction task, it integrates an extra training task for conditional prediction and generation of novel and efficacious drug molecules | [39] |
| 12. | C2P2 | Nguyen et al. | It uses datasets for protein–protein interaction (PPI) and chemical-chemical interaction (CCI) tasks to obtain information on intermolecular interactions and then transfers this to affinity prediction | [130], [131] |
| 13. | K-BERT | Wu et al. | The model is applied for atom feature prediction, molecular feature prediction, and contrastive learning across multiple pharmaceutical datasets | [132] |
| 14. | DrugAssist | Ye et al. | This interactive molecule optimization model performs optimization, a critical task in the drug discovery pipeline | [40] |
| 15. | DrugLLM | Liu et al. | This LLM is tailored for drug design and is able to produce novel molecules with predictable properties based on limited examples | [38] |
| 16. | FSM-DDTR | Monteiro et al. | In the framework of drug design, this architecture is capable of exploring the immense chemical representation space to generate novel molecules with enhanced pharmacological properties and target selectivity | [62] |
| 17. | QuoteTarget | Chen et al. | It is an effective sequence-based identifier for drug target proteins and supports novel insights into identifying drug molecule-binding sites | [65] |
| 18. | cMolGPT | Wang et al. | This valuable tool is used for de novo molecule design and has the potential to shorten molecular optimization cycle time | [60] |
Structure-based drug molecule design using AI-enabled LLMs
LLMs can perform a variety of tasks in different areas of chemistry. Pre-trained LLMs can answer questions about chemical structures [30], [31], [32]. Tran et al. used pre-trained LMs to answer different questions in chemistry. They examined how a text-to-text pre-trained LM can assist with chemical classification, information about chemical species and their physical and chemical properties, and applications through a question-answering (QA) system for chemistry. The LM's ability to answer complex questions with high accuracy [33] is a good example. Using DL-based LLMs, researchers embed chemistry knowledge and develop further models for drug discovery and development. With these models, researchers design ligands following the conventional approach [34], also called structure-based ligand design or structure-based drug molecular design (Fig. 5A). For structure-based drug molecular design, SMILES (simplified molecular-input line-entry system) strings are generated from molecular graphs (Fig. 5B). Several molecular structures can therefore be generated through LMs or LLMs using SMILES strings [35], [36]. Sadeghi et al. used two LLMs, LLaMA from Meta AI and GPT from OpenAI, to examine how well they represent SMILES. They found that LLaMA outperformed GPT in both drug-drug interaction (DDI) and molecular property prediction tasks, which shows that LLMs have great potential for generating SMILES embeddings [36]. Zhumagambetov et al. developed an LM called Transmol, which can be used to generate drug molecules and might be helpful for molecular library generation and lead generation. It included an ML-generated molecules database (cheML.io web database) for advanced molecule design with cutting-edge methodology [37]. Similarly, a group of researchers developed DrugLLM, which performs few-shot molecule generation using an open LLM and group-based molecular representation (GMR) [38]. Bagal et al. developed MolGPT, a model based on a transformer-decoder architecture. The model can create molecules with required scaffolds and chosen molecular properties by conditioning generation on scaffold SMILES [39]. Ye et al. recently developed an LLM to optimize drug molecules. The model, called DrugAssist, performs molecule optimization by grasping the underlying patterns in chemical structures [40]. Recently, Wang et al. developed an LLM for drug design that comprehends three-dimensional (3D) structures through tokenization. The model is called Token-Mol 1.0. The researchers proposed a Gaussian cross-entropy (GCE) loss function for use during drug design [41]. The model can help researchers carry out high-quality and rapid drug design.
Fig. 5.
Overview of the application of LLMs from structure-based drug molecule design to de novo drug design. (A) The application of LLMs in structure-based drug molecule design, (B) the generation of SMILES strings from molecular graphs during drug design, and (C) the application of LLMs in de novo drug design.
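The generation of a SMILES string from a molecular graph can be illustrated for the simplest case. The sketch below is a minimal depth-first traversal that assumes an acyclic, connected molecule with single bonds only, atoms from the SMILES organic subset, and implicit hydrogens; rings, bond orders, charges, and stereochemistry would need ring-closure digits and further notation that are omitted here. The function name `tree_to_smiles` is hypothetical, and cheminformatics toolkits implement the complete procedure.

```python
def tree_to_smiles(atoms, bonds, root=0):
    """Write an acyclic, single-bonded molecular graph as a SMILES string.

    atoms: list of element symbols, indexed by atom number
    bonds: list of (i, j) index pairs, each a single bond
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    def visit(node, parent):
        children = [n for n in neighbors[node] if n != parent]
        text = atoms[node]
        # Every child except the last is written as a parenthesised branch.
        for child in children[:-1]:
            text += "(" + visit(child, node) + ")"
        # The last child continues the main chain.
        if children:
            text += visit(children[-1], node)
        return text

    return visit(root, None)

# Ethanol: C-C-O
ethanol = tree_to_smiles(["C", "C", "O"], [(0, 1), (1, 2)])
# Isopropanol: central carbon bonded to two methyl groups and one oxygen
isopropanol = tree_to_smiles(["C", "C", "C", "O"], [(0, 1), (1, 2), (1, 3)])
print(ethanol, isopropanol)
```

Because the traversal order depends on the chosen root and neighbor order, one molecule can yield several equivalent SMILES strings, which is why canonicalization is applied before strings are compared or used as model input.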
De novo drug design using AI-enabled LLMs
De novo drug design is a procedure that involves creating a novel drug-like molecule without a starting template [42]. It is a prioritized area. Researchers are developing drug molecules through de novo drug design using AI-enabled LLM or AI-enabled chemical language models (Fig. 5C) [43]. Haroon et al. recently developed a GPT-based LM for de novo drug design. It might help researchers design drugs with desired properties [44].
Similarly, Wang et al. have developed a conditional GPT-based model for de novo drug design through the generation of SMILES strings. The model is called cMolGPT. It is a significant tool for de novo molecule design that can help shorten the molecular optimization cycle time [45]. Using a chemical language model (CLM), Moret et al. tried to understand the relationship between molecular structure and bioactivity. In this study, CLM-designed molecules were evaluated as ligands of PI3Kγ (phosphoinositide 3-kinase gamma) and showed positive results, indicating that the model can be used for de novo drug design [46]. CLMs are derived from NLP methods. The group further advocated the use of CLMs for de novo molecular structure generation, activity-focused molecular design, and virtual compound screening. Grisoni illustrates that CLMs can generate new molecules and can therefore accelerate de novo drug design [47].
Similarly, Monteiro et al. developed a multi-objective LM for de novo drug design. The model, called FSM-DDTR, uses a transformer-based architecture to perform de novo drug design while exploring the vast chemical space. The generated drug molecules are optimized for drug-likeness, synthetic accessibility score, topological polar surface area, molecular lipophilicity, and molecular weight [48].
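Judging a generated molecule against several property objectives at once can be sketched as a simple threshold check. This is not the scoring used by FSM-DDTR; it is an illustrative ranking by number of objectives met. The molecular weight and lipophilicity cutoffs follow the commonly cited rule of five, the polar surface area cutoff follows a commonly cited oral-bioavailability rule of thumb, and the synthetic accessibility cutoff (on the usual scale from 1, easy, to 10, hard) is an arbitrary assumption. The candidate names and property values are made up.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    mol_weight: float  # g/mol
    logp: float        # lipophilicity (octanol-water partition coefficient)
    tpsa: float        # topological polar surface area, in square angstroms
    sa_score: float    # synthetic accessibility, 1 (easy) to 10 (hard)

# Illustrative thresholds; see the assumptions stated in the lead-in.
OBJECTIVES = {
    "mol_weight": lambda c: c.mol_weight <= 500,
    "logp": lambda c: c.logp <= 5,
    "tpsa": lambda c: c.tpsa <= 140,
    "sa_score": lambda c: c.sa_score <= 6,
}

def objectives_met(candidate):
    """Return the names of the objectives this candidate satisfies."""
    return [name for name, ok in OBJECTIVES.items() if ok(candidate)]

def rank(candidates):
    """Order candidates by how many objectives they satisfy, best first."""
    return sorted(candidates, key=lambda c: len(objectives_met(c)), reverse=True)

# Hypothetical generated molecules with made-up property values.
candidates = [
    Candidate("cand_A", 350.0, 2.1, 75.0, 3.0),
    Candidate("cand_B", 620.0, 6.3, 90.0, 4.5),
    Candidate("cand_C", 480.0, 4.8, 150.0, 5.0),
]
ranked = rank(candidates)
for c in ranked:
    print(c.name, objectives_met(c))
```

Counting satisfied thresholds is the crudest form of multi-objective selection; weighted desirability scores or Pareto ranking are common alternatives when objectives trade off against each other.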
Drug target identification, validation, and interaction using AI-enabled LLMs
Drug target identification, validation, and interaction is one of the essential areas of drug discovery and development. Researchers are developing different LLM-based models for target identification, validation, and interaction. Drug target interaction studies in wet labs are costly and time-consuming, and LLM-based models point toward faster drug discovery methods. LLMs identify drug targets through gene-related literature and explore disease mechanisms and biological pathways [49]. One example is ChatGPT (GPT-4), which has a plug-in that assists with an initial interpretation of protein-related drug target discovery. Sheikholeslami et al. developed DrugGen, an LLM-based model for generating drugs based on their target interactions. In the DrugGen model, drug targets are essential components [50]. To address the same problem, Kalakoti et al. developed a transformer-based LM, called TransDTI, to study drug target interaction (DTI). The model is backed by molecular docking and simulation analysis [51]. Most drug target candidates are proteins, such as GPCRs, ion channel receptors, enzymes, and transporter proteins [52]. Protein language models (PLMs) have been developed to understand the different properties of proteins and are used to identify potentially druggable protein targets. Chen et al. developed a sequence-based transformer PLM, called QuoteTarget, to find druggable protein targets. The model uses joint sequence-enabled self-supervised pretraining, and it identified 1213 possible drug target proteins. The model derived residue-binding weights using the Grad-CAM (gradient-weighted class activation mapping) algorithm and the well-trained network [53]. Some PLMs help to understand the properties of proteins that can be used as protein-based drug targets. Examples are PLMSearch for homologous sequence search [54], xCAPT5 for protein–protein interaction (PPI) [55], LMPhosSite for prediction of phosphorylation sites [56], and LMNglyPred for prediction of N-linked glycosylation sites [57].
ADME/ADMET using AI-enabled LLMs
Understanding a drug molecule's ADME attributes is a significant criterion for drug development. Toxicity is now also measured along with these four properties, so the term ADMET is used [58], [59]. ADMET can be studied through LLM technologies [60]. Several researchers stated that ChatGPT helps to comprehend ADMET properties through the QA model [22], [61], [62]. Wang et al. studied the ADMET properties of drug molecules using ChatGPT, with anti-cocaine addiction drugs as a case study [63]. ChatGPT and other AI-enabled chatbots or LLMs might be a successful platform for understanding a drug molecule's ADME/ADMET properties. Zheng et al. have also studied the ADME attributes of a drug through LLMs [64]. Similarly, Niu et al. developed PharmaBench, a benchmark for ADMET prediction models built with a multi-agent data mining system [65]. Aksamit et al. discussed a hybrid SMILES-fragment tokenization method for ADMET prediction during drug discovery. The method is coupled with two pre-training strategies [66].
Other areas of drug discovery and development and AI-enabled LLMs
Several other LLMs have been developed to bridge gaps in drug discovery and development. Recently, researchers developed an LLM for the pharmaceutical domain. The model is called PharmGPT, and it helps with understanding the biopharmaceutical and chemical sectors [67]. DDIs are one of the critical parameters that researchers need to understand after drug development [68]. Juhi et al. used ChatGPT to study DDIs with two-stage questions. In this study, a total of 40 DDI lists were studied. For the first question, 20 answers were inconclusive, 19 were conclusive, and one answer was incorrect. For the second question, 22 were inconclusive, 17 were conclusive, and one answer was wrong [69]. Liang et al. developed a ChatGPT-like model, called DrugChat, for understanding drug molecule graphs [70]. LLMs have also shown the capacity to search drug safety documentation, and researchers have used them for this purpose [70], [71].
LLM selection
The objective of LLM selection is to increase task performance during drug discovery and development. Because different LLMs can perform the same task, selecting an LLM is an essential step. It is a critical decision that requires a profound understanding of each model's capabilities and of where in the drug discovery process it delivers the most. Most importantly, we should know the LLM's intended application in drug discovery and development. For LLM selection, we should also consider drug compound-specific parameters and biological system-specific parameters. At the same time, we should understand each LLM's advantages and disadvantages. Following this process, we can use LLMs efficiently in drug discovery and development.
MLLMs in drug discovery and development
Multimodal LLMs (MLLMs) are used to handle many kinds of non-text data as well as text. They can handle images, audio, video, and a wide range of other non-text datasets. To explore this promise, researchers have applied MLLMs in different areas of medical science. Using a multimodal DL model, researchers predicted multiclass surgical outcomes in glaucoma. The model is intended for cases where surgery has several possible outcomes rather than a binary result. The model can also provide significant understanding for clinical decision-making [72]. Recently, Xiao et al. developed ProteinGPT, an MLLM-based protein structure understanding and property prediction model. Predicting the properties of protein-based drug targets in this way may be helpful [73]. Besides protein-based drug targets, MLLMs will also help to understand other biological macromolecules that can be used as drug targets [74].
Researchers have recently used an MLLM to understand the mutations that cause neutralizing antibody (nAb) escape. This model might help to detect drug-resistance mutations [75]. However, more efficient models are needed in this direction.
Similarly, Liu et al. created GIT-Mol, an MLLM that combines text, images, and graphs to handle complex information in molecular science and explore further potential in this area. The model can execute downstream tasks, such as chemical reaction prediction and compound name recognition [76], and can also be used to recognize drug molecules. Sirumalla et al. developed a multi-task and multimodal transformer to help discover small-molecule drugs [77]. Similarly, Lu et al. proposed a model called MMFDL (multimodal fused deep learning), which can assist in predicting the properties of drug molecules. The researchers used the model to predict binding constants for protein–ligand complexes of drug molecules, and the model can draw information from different molecular representations [78]. MLLMs hold enormous possibilities for different areas of drug discovery and development because they can use images in addition to text. MLLMs can therefore potentially shape the future landscape of drug discovery and development.
LLMs in other medical use
LLMs can significantly advance medical science through many applications, including improving the accuracy of diagnoses and assisting decision-making processes in clinical settings (Table 2) [79]. These models can improve patient care by supporting crucial medical competencies, encompassing factual knowledge and interpersonal communication skills. ChatGPT has also exhibited a significant comprehension of medical semantics and the capacity to carry out intricate medical reasoning tasks [80]. This proficiency is demonstrated by its impressive performance in medical licensure examinations [81], [82], [83]. AI-driven systems like ChatGPT significantly affect doctor-patient relations and help distribute medical knowledge in today's complex medical field. LLMs serve as initial online consultation tools, offering patients fundamental yet essential information regarding their medical issues, treatment options, and preventive measures [84], [85]. This feature helps patients save crucial time and provides them with basic knowledge and guidance before their face-to-face medical appointments. LLMs excel at simplifying intricate medical terminology, providing lucid and understandable explanations that improve patients' comprehension of medical diagnoses and recommendations [86].
Table 2.
Significant LLMs and MLLMs in the medical and healthcare sectors.
| Sl. No. | LLMs/MLLMs | Developer | Remarks | Reference |
|---|---|---|---|---|
| 1. | Galactica | Meta AI | It is trained on over 48 million papers, reference material, textbooks, proteins, compounds, and additional sources of scientific knowledge, and is intended to help the research community develop personalized medicine | [133], [134] |
| 2. | BioLinkBERT | Yasunaga et al. | The model, with 110 million parameters, is used for extracting data from clinical trial reports | [135] |
| 3. | BioMegatron | NVIDIA | This biomedical LLM is used for named entity recognition, relation extraction, and question answering on diverse biomedical text | [136] |
| 4. | BioMedLM | Bolton et al. | It holds 2.7 billion parameters and is a GPT-style autoregressive model trained entirely on PubMed abstracts and full articles, and MCQs in biomedicine | [137] |
| 5. | Perplexity | Perplexity.ai | It can generate medical information with a certain level of accuracy for questions typically asked by patients with prostate, skin, breast, lung, and colorectal cancers | [138] |
| 6. | ChatDoctor | Li et al. | It uses a large dataset of 100,000 patient-doctor dialogues from an online medical consultation platform (error tolerance) | [139] |
| 7. | PMC-LLaMA | Wu et al. | It is an open-source language model (30 thousand medical textbooks and 4.8 million biomedical academic papers) specifically designed for medical applications | [140] |
| 8. | PubMedBERT | Gu et al. | It uses a PubMed vocabulary and is pretrained on PubMed abstracts for evidence-based medical information extraction, medical notes, and domain-specific biomedical support | [141] |
| 9. | ClinicalCamel | Toma et al. | A dialogue-based knowledge encoding model that draws on dense medical texts for the healthcare domain and clinical applications | [142] |
| 10. | GPT3.5 | OpenAI | It is used in different settings as a diagnostic aid for complex medical cases, cancer, medical questionnaires, and imaging studies | [143] |
| 11. | BioGPT | Microsoft | In biomedical domains, it draws on the biomedical literature to generate fluent descriptions for biomedical terms | [144] |
| 12. | MedAlpaca | Han et al. | It is an open-source collection of medical conversational AI models and training data used for improving medical workflows, patient care, diagnostics, and health education services | [145] |
| 13. | B-LBConA | Yang et al. | This model includes disambiguation clues about the relevance between the reference context and candidate entities via a context-aware mechanism | [146] |
| 14. | GeneGPT | Jin et al. | It uses the Web APIs of the NCBI for answering genomics questions, and it supports improved access to biomedical information | [147] |
| 15. | ClinicalBERT | Huang et al. | It models clinical notes, alongside structured data such as lab values and medications, and associated patient information | [97] |
| 16. | MT-BioNER | Khan et al. | A multi-task learning model that treats biomedical named entity recognition across datasets as slot tagging, with attention to time and memory efficiency | [148] |
Moreover, current research highlights the significance of these tools in enhancing the precision of consultations and combating medical misinformation, specifically in domains like vaccination [87]. LLMs possess a vast knowledge base that enables them to cover various medical specialties, including orthodontics and cardiac surgery. They can also reduce the communication barrier between doctors and patients [88].
Various LLMs optimized explicitly for medical applications demonstrate the potential of transfer learning, domain adaptation, and other specialized approaches in this domain. For example, BioBERT is a pre-trained language representation model designed specifically for biomedical applications. It is built on the BERT architecture and has been further pre-trained on vast biomedical datasets, including PubMed abstracts and PMC full-text publications. This has led to notable improvements in several biomedical NLP tasks, such as named entity recognition, relation extraction, and question answering [89]. ClinicalBERT, a specialized model, has been fine-tuned on the MIMIC-III dataset, which comprises electronic health records of patients in critical care units. This model has improved performance in clinical NLP tasks, such as predicting patient mortality, de-identifying information, and classifying diagnoses [90]. BlueBERT, which shares the same BERT architecture and has been pre-trained on an extensive collection of biomedical text data, has demonstrated strong performance in several biomedical NLP tasks, such as named entity recognition, relation extraction, and biomedical question answering [91].
LLMs are extensively used in medicine and biomedical research for several purposes, such as generating, summarizing, and correcting text. These models can produce substantial amounts of original material, such as templates for clinical documentation, standardized reports, presentation outlines, or sample cover letters, particularly for book submissions. LLMs can summarize intricate academic papers into concise content, allowing readers to grasp challenging concepts, and can automatically draft abstracts. Moreover, they can condense large quantities of clinical data, such as transforming notes into precise summary statements that help in patient evaluations. LLMs can improve the grammar, coherence, readability, and conciseness of written information without changing the meaning and context of the text. This feature is especially advantageous in scholarly writing since it can enhance the clarity and consistency of manuscripts and grant proposals [92]. The introduction of LLMs, particularly models like ChatGPT, has transformed medical writing [93]. Although significant limitations exist in understanding and creating medical texts, LLMs for medical science are skilled at quickly accessing a wide range of interdisciplinary data, which helps researchers rapidly synthesize new discoveries [94]. Researchers from multiple countries performed a series of tests to understand the pattern of ChatGPT-derived answers, and different statistical models were developed for validation. However, the ChatGPT-derived answers showed plagiarism [95]. We therefore previously urged that plagiarism-free LLMs are required for writing [96]. LLMs are highly proficient at composing first drafts of articles and improving the syntax and style of existing publications, thereby improving their clarity and consistency [97], [98].
LLMs in rare disease drug discovery and development
Rare diseases affect few people, so little data is available for them, which makes drug development challenging. Drug development for rare diseases is also complex because many such diseases are linked to multiple variations in genotypic and phenotypic manifestations [99]. Multimodal learning is helping to perform genotype-phenotype mapping. Khodaee et al. developed a multimodal foundation model to understand genotype-phenotype mapping and its relationships; it might provide a higher-resolution way to explore cellular heterogeneity [100]. Gene prioritization is another area of research for rare diseases. Liang et al. developed the Genetic Transformer (GeneT) for identifying causative variants of candidate genes, which should be a helpful solution for gene prioritization in rare diseases [101]. LLMs and MLLMs therefore help to understand the multiple variations in genotypic and phenotypic manifestations. Kafkas et al. illustrated the application of LLMs to phenotype-based causative gene prioritization in rare diseases [102]. Similarly, Kim et al. described the benefit of LLMs for phenotype-based gene prioritization [103].
For rare diseases, few patients can be recruited for clinical trials, so little data is generated. It is therefore very challenging to collect sufficient efficacy and safety data in rare-disease clinical trials [104]. LLMs might help match patients to clinical trials, which could assist trials in rare diseases. Jin et al. developed TrialGPT, an LLM-based framework to support clinical trials. It has three modules: TrialGPT-Retrieval, which performs large-scale filtering to retrieve candidate trials; TrialGPT-Matching, which predicts patient eligibility; and TrialGPT-Ranking, which yields trial-level scores [104]. Using these three modules, patients can be matched to clinical trials efficiently. Such a tool will be particularly helpful where few patients can be recruited, as in trials for rare diseases.
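The retrieve, match, and rank stages described above can be pictured with a small toy pipeline. The sketch below is not TrialGPT's implementation: simple set overlap stands in for the LLM calls, and the `Trial` class, the term names, and the scoring rule are all hypothetical, chosen only to show how the three stages hand results to one another.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    trial_id: str
    keywords: set    # terms describing the condition under study
    inclusion: set   # patient features required for eligibility
    exclusion: set   # patient features that rule a patient out


def retrieve(patient_terms, trials, min_overlap=1):
    """Stage 1: coarse filter keeping trials that share terms with the patient."""
    return [t for t in trials if len(t.keywords & patient_terms) >= min_overlap]


def match(patient_terms, trial):
    """Stage 2: criterion-level check; returns (inclusion met, exclusion hit)."""
    return trial.inclusion & patient_terms, trial.exclusion & patient_terms


def rank(patient_terms, trials):
    """Stage 3: aggregate criterion-level results into one score per trial."""
    scored = []
    for trial in retrieve(patient_terms, trials):
        met, excluded = match(patient_terms, trial)
        if excluded:
            continue  # any exclusion hit removes the trial from the ranking
        score = len(met) / len(trial.inclusion) if trial.inclusion else 0.0
        scored.append((trial.trial_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical trials and patient profile, for illustration only.
trials = [
    Trial("T1", {"fabry"}, {"adult", "gla_variant"}, {"pregnant"}),
    Trial("T2", {"fabry"}, {"adult", "enzyme_naive"}, set()),
    Trial("T3", {"pompe"}, {"adult"}, set()),
    Trial("T4", {"fabry"}, {"adult"}, {"gla_variant"}),
]
patient = {"fabry", "adult", "gla_variant"}
ranking = rank(patient, trials)
print(ranking)
```

In this toy run, T3 is dropped at retrieval, T4 is dropped by an exclusion criterion, and the remaining trials are ordered by the fraction of inclusion criteria met. In an LLM-based system, each stage would instead reason over free-text patient notes and eligibility criteria.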
LLMs for drug repurposing
Drug repurposing finds new indications for molecules that already exist as drugs for other diseases and conditions. It is one of the most cost- and time-saving ways to create new medicines in the drug discovery and development process. In urgent situations, drug repurposing helps to find therapeutic solutions [105], [106]. During the emergency of the COVID-19 pandemic, researchers tried to develop several repurposed drugs [107], [108]. Recently, LLMs have helped in drug repurposing. Wei et al. developed DrugReAlign, an LLM-based drug repurposing framework that uses multi-source prompts. The model can handle extensive training data and vast parameter sizes [109]. Similarly, Inoue et al. developed DrugAgent, another drug-repurposing model, and showed the framework's potential to forecast drug-disease interactions [110]. These LLM-based drug repurposing models reduce costs and time compared with traditional drug discovery methods.
Ethical challenges of utilizing LLMs in drug discovery
The ethical challenges of utilizing LLMs in drug discovery encompass accountability, fairness, and the risk of unforeseen outcomes. A key concern is determining who should be held accountable for decisions shaped or guided by these models. Concerns about privacy in LLMs are significant, as these models can retain information from their training datasets. For instance, in the context of handling sensitive multi-omics data obtained during patient profiling, it is crucial to guarantee that the data is properly anonymized, making it impossible to link back to an individual patient [49].
Many prominent LLM inference platforms implement rate limits, preventing any single client from monopolizing the request queue to maintain equitable handling of client requests. However, this basic approach to fairness can lead to inefficient resource usage and a suboptimal experience for clients, especially when additional capacity is available. Although there is extensive research on fair scheduling, deploying LLMs introduces distinct challenges due to the unpredictability of request durations and the specific batching dynamics on parallel accelerators [111].
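To make the basic rate-limit approach concrete, the following is a minimal sketch of a per-client token bucket, one common way such limits are implemented. It is not taken from any particular inference platform; the class names, the `rate` and `capacity` parameters, and the explicit `now` argument (used here instead of a wall clock so the behavior is reproducible) are assumptions for illustration.

```python
class TokenBucket:
    """One client's bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # Refill for the time elapsed since the previous call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False


class RateLimiter:
    """Keeps an independent bucket per client so no client can starve the others."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_id, now):
        if client_id not in self.buckets:
            self.buckets[client_id] = TokenBucket(self.rate, self.capacity)
        return self.buckets[client_id].allow(now)


limiter = RateLimiter(rate=1.0, capacity=2)
burst = [limiter.allow("client_a", now=0.0) for _ in range(3)]
other = limiter.allow("client_b", now=0.0)
later = limiter.allow("client_a", now=1.0)
print(burst, other, later)
```

The third request in the burst is rejected even if the server is otherwise idle, which illustrates the inefficiency noted above: a static per-client limit cannot hand spare capacity to the client that needs it.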
Future prospects
The potential of AI in drug discovery is rapidly transitioning from theoretical to practical, especially with the emergence of LLMs. Utilizing AI in the drug discovery and development stages is poised to revolutionize the competitive landscape for pharmaceutical companies, enabling them to innovate more effectively by harnessing the capabilities of LLMs [112]. Researchers have been exploring the potential of AI for new drug discovery, employing techniques such as graph neural networks [113], [114] and, more recently, generative models [115]. Numerous AI-driven approaches exist for molecular design and drug development, including GPT-based models that leverage scaffold SMILES strings with desired molecular properties [39]. The T5 architecture has also been utilized for tasks like reaction prediction [116] and translating between molecular captions and SMILES strings [117].
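As a rough illustration of how a scaffold SMILES string and desired properties can be presented to a GPT-style model, the sketch below tokenizes a SMILES string with a regular expression and prepends property tokens. It is a simplified assumption-laden example, not the method of the cited models: the special tokens (`<scaffold>`, `<generate>`), the textual property encoding, and both function names are hypothetical, and the regular expression covers only common SMILES symbols.

```python
import re

# Bracket atoms first, then two-letter halogens, two-digit ring closures,
# single-letter atoms, bond/branch symbols, and single ring-closure digits.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOSPFIbcnosp]|[()=#\-+\\/:.~@*$]|\d)"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_condition(scaffold_smiles, properties):
    """Assemble a conditioning prefix: property tokens, then scaffold tokens."""
    prop_tokens = [f"<{name}={value:.2f}>" for name, value in sorted(properties.items())]
    return prop_tokens + ["<scaffold>"] + tokenize_smiles(scaffold_smiles) + ["<generate>"]


acyl_chloride = tokenize_smiles("CC(=O)Cl")
prefix = build_condition("c1ccccc1", {"logP": 2.5, "QED": 0.8})
print(acyl_chloride)
print(prefix)
```

A generative model trained on such sequences would be asked to continue the prefix with the tokens of a full molecule. Real systems may instead feed properties as continuous embeddings rather than text tokens.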
LLMs now exhibit human-like proficiency in processing text. Significantly, transformer neural networks form the foundation for both text and image processing networks, enabling multimodal AI models that can handle various data types simultaneously. This represents a significant shift from the specialized, niche models of the 2010s [118]. Drug research and development (R&D) is a challenging and intricate process characterized by lengthy timelines, substantial financial investment, and high failure rates. With its robust capability to analyze large datasets and complex networks, ML is increasingly improving efficiency and success rates in drug R&D [119]. In the rapidly evolving landscape of drug discovery, the fusion of multimodal data with AI and ML technologies sets the stage for ground-breaking advancements. This synergy not only expedites traditional processes but also paves the way for innovative approaches to complex, labor-intensive challenges [120]. Looking to the future of clinical trials, AI is a multifaceted technology that is shaping the drug discovery process.
The rationale for using AI-driven tools in clinical trials is to establish a systematic channel for evaluating, with higher accuracy, the vast amounts of information generated during drug research. Researchers are leveraging AI tools to pinpoint drug molecules and detect specific complex disease patterns in patients. Combined with ML, AI enhances researchers' ability to analyze vast datasets, leading to the optimization of drug molecules for better outcomes. AI algorithms streamline various stages of the drug discovery process, making clinical trials for drug approval faster, more accurate, and more efficient. The future of clinical research lies in the broad adoption of digital technologies, virtual learning, and AI. This technological integration will reduce the financial burden on pharmaceutical companies during drug development. Moving forward, pharmaceutical companies will use AI applications to create patient-centric drugs with high precision, effectively closing the gaps between drug discovery, development, clinical trials, approval, and market distribution [121].
Conclusion
Integrating prompt-engineered LMs, LLMs, and MLLMs represents a transformative leap in drug discovery and development. These sophisticated AI technologies facilitate the precise and efficient identification of drug candidates and offer deeper insights into complex biological processes by synthesizing extensive and varied datasets. Using LLMs in drug research accelerates the development pipeline, significantly reduces costs, and enhances the success rates of clinical trials by automating and optimizing numerous stages of the process. As pharmaceutical companies increasingly adopt these AI-driven methodologies, the industry is poised for rapid innovation and development of highly effective, patient-centric therapies. This shift promises to expedite the creation and approval of new drugs and aims to improve healthcare outcomes. By embracing these AI-enabled advanced tools, the pharmaceutical industry can better meet the demands of modern medicine, addressing critical health challenges more efficiently and effectively. In summary, prompt-engineering-powered and multimodal LLMs are set to revolutionize drug discovery and development, paving the way for a future where cutting-edge technology and healthcare innovation go hand in hand and where faster, more cost-effective methods of drug discovery and development soon reshape the pharmaceutical sector.
Compliance with Ethics Requirements
This article does not contain any studies with human or animal subjects.
CRediT authorship contribution statement
Chiranjib Chakraborty: Investigation, Supervision, Writing – original draft, Writing – review & editing. Manojit Bhattacharya: Validation, Writing – original draft. Soumen Pal: Validation, Formal analysis, Writing – original draft. Srijan Chatterjee: Validation, Formal analysis, Writing – original draft. Arpita Das: Validation, Formal analysis. Sang-Soo Lee: Validation, Funding acquisition.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2020R1I1A3074575).
Biographies

Dr. Chiranjib Chakraborty, Chair Professor, Department of Biotechnology at Adamas University, India, is a Tata Innovation Fellow, DBT, Govt. of India and Future Leaders Mentorship Fellow, American Society of Microbiology, USA. He was formerly a Professor at Galgotias University, India, and an Associate Professor at VIT University, Vellore, India. Dr. Chakraborty is also a Research Director in Bioinformatics (as Advisory Professor), Institute of Skeletal Aging (ISA), Hallym University, South Korea. He is an Associate Editor of ‘Frontiers in Pharmacology’ and ‘Frontiers in Bioengineering and Biotechnology’ and an Editorial Board member of ‘Nature Scientific Reports’, ‘Interdisciplinary Sciences: Computational Life Sciences’, etc. His research interests include mutation, infectious diseases, non-coding RNA, medical bioinformatics, AI, etc. He has published more than 325 SCI/SCIE papers, five books, and two edited books.

Manojit Bhattacharya is working as Assistant Professor at the P.G. Department of Zoology, Fakir Mohan University, Odisha, India. He obtained his PhD. degree from Vidyasagar University, West Bengal, India in 2019. He also served as SERB-National Post Doctorate Fellow at ICAR-Central Inland Fisheries Research Institute, Kolkata, India. Presently, his research interest is focused on immunoinformatics, medical biotechnology, and different aspects of computational biology. He has authored some 146 peer-reviewed, SCI/SCIE scientific publications.

Soumen Pal is presently working as a Professor at the School of Mechanical Engineering, Vellore Institute of Technology, Vellore, India. He obtained his PhD from Jadavpur University, India, in the field of materials science and nanotechnology. He has research interests in various cross-disciplinary fields, including material synthesis and processing, structure-property correlations, composite materials, mechanical computational science, emerging scientific languages that can be used in data analysis, AI/ML/LLM applications, quantum computing technologies, protein binding technologies, biological processes at the cellular level in the human body, etc. He has some 45 peer-reviewed publications and has completed one funded project from DST-SERB, Government of India.

Srijan Chatterjee is currently pursuing his Master's degree in Biomedical Science at Hallym University in South Korea. His research interests lie in Immunoinformatics, Bioinformatics, and Bone Biology, and he is dedicated to making significant contributions in these fields. He has authored over 25 peer-reviewed publications in SCI/SCIE journals.

Dr. Arpita Das has been working as an Assistant Professor at the School of Biotechnology, Adamas University, since November 2017. She has more than twelve years of research and teaching experience. Dr. Das completed her PhD at Jadavpur University and was awarded the prestigious Erasmus Mundus Post-Doctoral fellowship in 2015. Her research interests encompass diverse topics in microbiology, viz. functional food, nutraceuticals, probiotic bacteria, disease, and disease mechanisms. She has standardized several important biochemical assays and published several research papers and book chapters in international peer-reviewed journals.

Sang–Soo Lee is a Professor in the Department of Orthopedic Surgery at Hallym University Hospital and director of the Institute of Skeletal Ageing & Orthopedic Surgery. He received his medical degree from the College of Medicine, Hallym University, and his Ph.D. degree in basic orthopedic research from Hallym University Graduate School. He completed his residency at the Hallym University Medical Center, Korea, and a fellowship at the Hospital for Special Surgery, NY, USA. He has co-authored several peer-reviewed scientific publications in the field of orthopaedic research.
Contributor Information
Chiranjib Chakraborty, Email: drchiranjib@yahoo.com.
Sang-Soo Lee, Email: 123sslee@gmail.com.
References
- 1. Pal S., et al. ChatGPT or LLM in next-generation drug discovery and development: pharmaceutical and biotechnology companies can make use of the artificial intelligence-based device for a faster way of drug discovery and development. Int J Surg. 2023;109(12):4382–4384. doi: 10.1097/JS9.0000000000000719.
- 2. Burki T. A paradigm for drug development. Lancet Digit Health. 2020;2(5):e226–e227. doi: 10.1016/S2589-7500(20)30088-1.
- 3. Kirkpatrick P. Artificial intelligence makes a splash in small-molecule drug discovery. Biopharma Deal. 2022;2022:d43747–d44022.
- 4. Pun F.W., Ozerov I.V., Zhavoronkov A. AI-powered therapeutic target discovery. Trends Pharmacol Sci. 2023;44(9):561–572. doi: 10.1016/j.tips.2023.06.010.
- 5. Arnold C. Inside the nascent industry of AI-designed drugs. Nat Med. 2023;29(6):1292–1295. doi: 10.1038/s41591-023-02361-0.
- 6. Chakraborty C., et al. The changing scenario of drug discovery using AI to deep learning: recent advancement, success stories, collaborations, and challenges. Mol Ther Nucleic Acids. 2024;35(3) doi: 10.1016/j.omtn.2024.102295.
- 7. Boldini D., et al. Machine learning assisted hit prioritization for high throughput screening in drug discovery. ACS Cent Sci. 2024;10(4):823–832. doi: 10.1021/acscentsci.3c01517.
- 8. Tian Y.Y., et al. QSAR study, molecular docking and molecular dynamic simulation of aurora kinase inhibitors derived from imidazo[4,5-b]pyridine derivatives. Molecules. 2024;29(8) doi: 10.3390/molecules29081772.
- 9. Chakraborty C., et al. Generative AI in drug discovery and development: the next revolution of drug discovery and development would be directed by generative AI. Ann Med Surg (Lond) 2024;86(10):6340–6343. doi: 10.1097/MS9.0000000000002438.
- 10. Vaswani A., et al. Attention is all you need. Adv Neural Inf Proces Syst. 2017;30.
- 11. Zhao H., et al. Transformer vision-language tracking via proxy token guided cross-modal fusion. Pattern Recogn Lett. 2023;168:10–16.
- 12. Kukreja S., et al. A literature survey on open source large language models. In: Proceedings of the 2024 7th International Conference on Computers in Management and Business. 2024. doi: 10.1145/3647782.3647803.
- 13. Dong M.M., Stratopoulos T.C., Wang V.X. A scoping review of ChatGPT research in accounting and finance. SSRN preprint (December 30, 2023). 2023. doi: 10.2139/ssrn.4680203.
- 14. Bhattacharya M., et al. Large Language Model (LLM) to Multimodal Large Language Model (MLLM): a journey to shape the biological macromolecules to biological sciences and medicine. Mol Therapy-Nucleic Acids. 2024;35(3) doi: 10.1016/j.omtn.2024.102255.
- 15. Shanahan M., McDonell K., Reynolds L. Role play with large language models. Nature. 2023;623(7987):493–498. doi: 10.1038/s41586-023-06647-8.
- 16. Boiko D.A., et al. Autonomous chemical research with large language models. Nature. 2023;624(7992):570–578. doi: 10.1038/s41586-023-06792-0.
- 17. Chakraborty C., et al. Overview of Chatbots with special emphasis on artificial intelligence-enabled ChatGPT in medical science. Front Artif Intell. 2023;6:1237704. doi: 10.3389/frai.2023.1237704.
- 18. Chatterjee S., et al. ChatGPT and large language models in orthopedics: from education and surgery to research. J Exp Orthop. 2023;10(1):128. doi: 10.1186/s40634-023-00700-1.
- 19. Laohawetwanit T., Apornvirat S., Namboonlue C. Thinking like a pathologist: morphologic approach to hepatobiliary tumors by ChatGPT. Am J Clin Pathol. 2024 doi: 10.1093/ajcp/aqae087.
- 20. Sacoransky E., Kwan B.Y.M., Soboleski D. ChatGPT and assistive AI in structured radiology reporting: a systematic review. Curr Probl Diagn Radiol. 2024 doi: 10.1067/j.cpradiol.2024.07.007.
- 21. Zhang H., et al. Large language model-based natural language encoding could be all you need for drug biomedical association prediction. Anal Chem. 2024 doi: 10.1021/acs.analchem.4c01793.
- 22. Chakraborty C., Bhattacharya M., Lee S.S. Artificial intelligence enabled ChatGPT and large language models in drug target discovery, drug discovery, and development. Mol Ther Nucleic Acids. 2023;33:866–868. doi: 10.1016/j.omtn.2023.08.009.
- 23. Ravi Kiran A., Kusuma Kumari G., Krishnamurthy P.T. ChatGPT in drug discovery process. Adv Pharm Bull. 2024;14(1):5–6. doi: 10.34172/apb.2024.007.
- 24. Tropsha A., et al. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat Rev Drug Discov. 2024;23(2):141–155. doi: 10.1038/s41573-023-00832-0.
- 25. Hu W., et al. Deep learning methods for small molecule drug discovery: a survey. IEEE Trans Artif Intell. 2023;5(2):459–479.
- 26. Simm G., Pinsler R., Hernández-Lobato J.M. Reinforcement learning for molecular design guided by quantum mechanics. In: International Conference on Machine Learning. PMLR; 2020.
- 27. Bond-Taylor S., et al. Deep generative modelling: a comparative review of VAEs, GANs, normalizing flows, energy-based and autoregressive models. IEEE Trans Pattern Anal Mach Intell. 2021;44(11):7327–7347. doi: 10.1109/TPAMI.2021.3116668.
- 28. Sousa T., et al. Generative deep learning for targeted compound design. J Chem Inf Model. 2021;61(11):5343–5361. doi: 10.1021/acs.jcim.0c01496.
- 29. Bian Y., Xie X.Q. Generative chemistry: drug discovery with deep learning generative models. J Mol Model. 2021;27(3):71. doi: 10.1007/s00894-021-04674-8.
- 30. Jablonka K.M., et al. 14 examples of how LLMs can transform materials science and chemistry: a reflection on a large language model hackathon. Digit Discov. 2023;2(5):1233–1250. doi: 10.1039/d3dd00113j.
- 31. Jablonka K.M., et al. Leveraging large language models for predictive chemistry. Nat Mach Intell. 2024;6(2):161–169.
- 32. Guo T., et al. What can large language models do in chemistry? a comprehensive benchmark on eight tasks. Adv Neural Inf Proces Syst. 2023;36:59662–59688.
- 33. Tran D., et al. Leveraging text-to-text pretrained language models for question answering in chemistry. ACS Omega. 2024;9(12):13883–13896. doi: 10.1021/acsomega.3c08842.
- 34. Riaz I.B., et al. Applications of artificial intelligence in prostate cancer care: a path to enhanced efficiency and outcomes. Am Soc Clin Oncol Educ Book. 2024;44(3):e438516. doi: 10.1200/EDBK_438516.
- 35. Bran A.M., et al. Augmenting large language models with chemistry tools. Nat Mach Intell. 2024;6(5):525–535.
- 36. Sadeghi S., et al. Can large language models understand molecules? BMC Bioinf. 2024;25(1):225. doi: 10.1186/s12859-024-05847-x.
- 37. Zhumagambetov R., et al. Transmol: repurposing a language model for molecular generation. RSC Adv. 2021;11(42):25921–25932. doi: 10.1039/d1ra03086h.
- 38. Liu X., et al. DrugLLM: open large language model for few-shot molecule generation. arXiv preprint arXiv:2405.06690. 2024.
- 39. Bagal V., et al. MolGPT: molecular generation using a transformer-decoder model. J Chem Inf Model. 2022;62(9):2064–2076. doi: 10.1021/acs.jcim.1c00600.
- 40. Ye G., et al. Drugassist: a large language model for molecule optimization. arXiv preprint arXiv:2401.10334. 2023 doi: 10.1093/bib/bbae693.
- 41. Wang J., et al. Token-Mol 1.0: Tokenized drug design with large language model. arXiv preprint arXiv:2407.07930. 2024 doi: 10.1038/s41467-025-59628-y.
- 42. Chen W., et al. Artificial intelligence for drug discovery: resources, methods, and applications. Mol Ther Nucleic Acids. 2023;31:691–702. doi: 10.1016/j.omtn.2023.02.019.
- 43. Grisoni F. Chemical language models for de novo drug design: challenges and opportunities. Curr Opin Struct Biol. 2023;79 doi: 10.1016/j.sbi.2023.102527.
- 44. Haroon S., Hafsath C.A., Jereesh A.S. Generative Pre-trained Transformer (GPT) based model with relative attention for de novo drug design. Comput Biol Chem. 2023;106 doi: 10.1016/j.compbiolchem.2023.107911.
- 45. Wang Y., et al. cMolGPT: a conditional generative pre-trained transformer for target-specific de novo molecular generation. Molecules. 2023;28(11) doi: 10.3390/molecules28114430.
- 46. Moret M., et al. Leveraging molecular structure and bioactivity with chemical language models for de novo drug design. Nat Commun. 2023;14(1):114. doi: 10.1038/s41467-022-35692-6.
- 47. Grisoni F. Chemical language models for de novo drug design: challenges and opportunities. Curr Opin Struct Biol. 2023;79 doi: 10.1016/j.sbi.2023.102527.
- 48. Monteiro N.R.C., et al. FSM-DDTR: End-to-end feedback strategy for multi-objective De Novo drug design using transformers. Comput Biol Med. 2023;164 doi: 10.1016/j.compbiomed.2023.107285.
- 49. Zheng Y., et al. Large language models in drug discovery and development: from disease mechanisms to clinical trials. arXiv preprint arXiv:2409.04481. 2024.
- 50.Sheikholeslami M., et al. DrugGen: advancing drug discovery with large language models and reinforcement learning feedback. arXiv preprint arXiv:2411.14157. 2024 doi: 10.1038/s41598-025-98629-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Kalakoti Y., Yadav S., Sundar D. TransDTI: transformer-based language models for estimating DTIs and building a drug recommendation workflow. ACS Omega. 2022;7(3):2706–2717. doi: 10.1021/acsomega.1c05203.
- 52.Santos R., et al. A comprehensive map of molecular drug targets. Nat Rev Drug Discov. 2017;16(1):19–34. doi: 10.1038/nrd.2016.230.
- 53.Chen J., et al. QuoteTarget: a sequence-based transformer protein language model to identify potentially druggable protein targets. Protein Sci. 2023;32(2):e4555. doi: 10.1002/pro.4555.
- 54.Liu W., et al. PLMSearch: protein language model powers accurate and fast sequence search for remote homology. Nat Commun. 2024;15(1):2775. doi: 10.1038/s41467-024-46808-5.
- 55.Dang T.H., Vu T.A. xCAPT5: protein-protein interaction prediction using deep and wide multi-kernel pooling convolutional neural networks with protein language model. BMC Bioinf. 2024;25(1):106. doi: 10.1186/s12859-024-05725-6.
- 56.Pakhrin S.C., et al. LMPhosSite: a deep learning-based approach for general protein phosphorylation site prediction using embeddings from the local window sequence and pretrained protein language model. J Proteome Res. 2023;22(8):2548–2557. doi: 10.1021/acs.jproteome.2c00667.
- 57.Pakhrin S.C., et al. LMNglyPred: prediction of human N-linked glycosylation sites using embeddings from a pre-trained protein language model. Glycobiology. 2023;33(5):411–422. doi: 10.1093/glycob/cwad033.
- 58.Mak K.K., Balijepalli M.K., Pichika M.R. Success stories of AI in drug discovery - where do things stand? Expert Opin Drug Discov. 2022;17(1):79–92. doi: 10.1080/17460441.2022.1985108.
- 59.Wang J. Comprehensive assessment of ADMET risks in drug discovery. Curr Pharm Des. 2009;15(19):2195–2219. doi: 10.2174/138161209788682514.
- 60.Ekins S., et al. In silico ADME/tox comes of age: twenty years later. Xenobiotica. 2023:1–7. doi: 10.1080/00498254.2023.2245049.
- 61.Pradhan T., Gupta O., Chawla G. The future of ChatGPT in medicinal chemistry: harnessing AI for accelerated drug discovery. ChemistrySelect. 2024;9(13):e202304359.
- 62.Sharma G., Thakur A. ChatGPT in drug discovery. Theoretical Comput Chem. 2023. doi: 10.26434/chemrxiv-2023-qgs3k.
- 63.Wang R., Feng H., Wei G.W. ChatGPT in drug discovery: a case study on Anticocaine addiction drug development with Chatbots. J Chem Inf Model. 2023;63(22):7189–7209. doi: 10.1021/acs.jcim.3c01429.
- 64.Zheng Y., et al. Large language models for scientific synthesis, inference and explanation. arXiv preprint arXiv:2310.07984. 2023.
- 65.Niu Z., et al. PharmaBench: enhancing ADMET benchmarks with large language models. Sci Data. 2024;11(1):985. doi: 10.1038/s41597-024-03793-0.
- 66.Aksamit N., et al. Hybrid fragment-SMILES tokenization for ADMET prediction in drug discovery. BMC Bioinf. 2024;25(1):255. doi: 10.1186/s12859-024-05861-z.
- 67.Chen L., et al. PharmGPT: domain-specific large language models for bio-pharmaceutical and chemistry. arXiv preprint arXiv:2406.18045. 2024.
- 68.Chakraborty S., et al. Artificial intelligence (AI) is paving the way for a critical role in drug discovery, drug design, and studying drug-drug interactions - correspondence. Int J Surg. 2023;109(10):3242–3244. doi: 10.1097/JS9.0000000000000564.
- 69.Juhi A., et al. The capability of ChatGPT in predicting and explaining common drug-drug interactions. Cureus. 2023;15(3):e36272. doi: 10.7759/cureus.36272.
- 70.Liang Y., et al. Drugchat: towards enabling chatgpt-like capabilities on drug molecule graphs. arXiv preprint arXiv:2309.03907. 2023.
- 71.Painter J.L., et al. Enhancing drug safety documentation search capabilities with Large Language Models: a user-centric approach. In: International Conference on Computational Science and Computational Intelligence Proceedings. 2023. doi: 10.1109/csci62032.2023.00015.
- 72.Lin W.C., et al. Prediction of multiclass surgical outcomes in glaucoma using multimodal deep learning based on free-text operative notes and structured EHR data. J Am Med Inform Assoc. 2024;31(2):456–464. doi: 10.1093/jamia/ocad213.
- 73.Xiao Y., et al. Proteingpt: Multimodal llm for protein property prediction and structure understanding. arXiv preprint arXiv:2408.11363. 2024.
- 74.Bhattacharya M., et al. Large language model to multimodal large language model: a journey to shape the biological macromolecules to biological sciences and medicine. Mol Ther Nucleic Acids. 2024;35(3). doi: 10.1016/j.omtn.2024.102255.
- 75.Chakraborty C., et al. Prompt engineering-enabled LLM or MLLM and instigative bioinformatics pave the way to identify and characterize the significant SARS-CoV-2 antibody escape mutations. Int J Biol Macromol. 2024;287:138547. doi: 10.1016/j.ijbiomac.2024.138547.
- 76.Liu P., et al. GIT-Mol: a multi-modal large language model for molecular science with graph, image, and text. Comput Biol Med. 2024;171 doi: 10.1016/j.compbiomed.2024.108073.
- 77.Sirumalla, S.K., et al. Multi-Modal and Multi-Task Transformer for Small Molecule Drug Discovery. In: ICML'24 Workshop ML for Life and Material Science: From Theory to Industry Applications. https://openreview.net/forum?id=Ya5OHw7lZ8.
- 78.Lu X., et al. Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph. Comput Struct Biotechnol J. 2024;23:1666–1679. doi: 10.1016/j.csbj.2024.04.030.
- 79.Karabacak M., Margetis K. Embracing large language models for medical applications: opportunities and challenges. Cureus. 2023;15(5):e39305. doi: 10.7759/cureus.39305.
- 80.Singhal K., et al. Large language models encode clinical knowledge. Nature. 2023;620(7972):172–180. doi: 10.1038/s41586-023-06291-2.
- 81.Nori H., et al. Capabilities of gpt-4 on medical challenge problems. arXiv preprint arXiv:2303.13375. 2023.
- 82.Kung T.H., et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. doi: 10.1371/journal.pdig.0000198.
- 83.Gilson A., et al. How does ChatGPT perform on the united states medical licensing examination (USMLE)? the implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312. doi: 10.2196/45312.
- 84.Zhang T., Feng T. Application and technology of an open source AI large language model in the medical field. Radiology Science. 2023;2(1):96–104.
- 85.Omiye J.A., et al. Large language models propagate race-based medicine. NPJ Digit Med. 2023;6(1):195. doi: 10.1038/s41746-023-00939-z.
- 86.Thirunavukarasu A.J., et al. Large language models in medicine. Nat Med. 2023;29(8):1930–1940. doi: 10.1038/s41591-023-02448-8.
- 87.Zhang P., Kamel Boulos M.N. Generative AI in medicine and healthcare: promises, opportunities and challenges. Future Internet. 2023;15(9):286.
- 88.Thirunavukarasu A.J. Large language models will not replace healthcare professionals: curbing popular fears and hype. J R Soc Med. 2023;116(5):181–182. doi: 10.1177/01410768231173123.
- 89.Lee J., et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2020;36(4):1234–1240. doi: 10.1093/bioinformatics/btz682.
- 90.Huang K., Altosaar J., Ranganath R. Clinicalbert: modeling clinical notes and predicting hospital readmission. arXiv preprint arXiv:1904.05342. 2019.
- 91.Peng Y., Yan S., Lu Z. Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets. arXiv preprint arXiv:1906.05474. 2019.
- 92.Li H., et al. Ethics of large language models in medicine and medical research. Lancet Digit Health. 2023;5(6):e333–e335. doi: 10.1016/S2589-7500(23)00083-3.
- 93.Peng C., et al. A study of generative large language model for medical research and healthcare. NPJ Digit Med. 2023;6(1):210. doi: 10.1038/s41746-023-00958-w.
- 94.Thapa S., Adhikari S. ChatGPT, bard, and large language models for biomedical research: opportunities and pitfalls. Ann Biomed Eng. 2023;51(12):2647–2651. doi: 10.1007/s10439-023-03284-0.
- 95.Bhattacharya M., et al. ChatGPT’s scorecard after the performance in a series of tests conducted at the multi-country level: a pattern of responses of generative artificial intelligence or large language models. Curr Res Biotechnol. 2024;7.
- 96.Pal S., et al. AI-enabled ChatGPT or LLM: a new algorithm is required for plagiarism-free scientific writing. Int J Surg. 2024;110(2):1329–1330. doi: 10.1097/JS9.0000000000000939.
- 97.Bernstein I.A., et al. Comparison of Ophthalmologist and large language model chatbot responses to online patient eye care questions. JAMA Netw Open. 2023;6(8):e2330320. doi: 10.1001/jamanetworkopen.2023.30320.
- 98.Decker H., et al. Large language model-based Chatbot vs Surgeon-generated informed consent documentation for common procedures. JAMA Netw Open. 2023;6(10):e2336997. doi: 10.1001/jamanetworkopen.2023.36997.
- 99.Alves V.M., et al. Knowledge-based approaches to drug discovery for rare diseases. Drug Discov Today. 2022;27(2):490–502. doi: 10.1016/j.drudis.2021.10.014.
- 100.Khodaee F., et al. Multimodal learning for mapping the genotype-phenotype dynamics. Res Square. 2024 doi: 10.1038/s43588-024-00765-7. rs.3.rs-4355413.
- 101.Liang L., et al. Genetic transformer: An innovative large language model driven approach for rapid and accurate identification of causative variants in rare genetic diseases. medRxiv. 2024.
- 102.Kafkas Ş., et al. The application of Large Language Models to the phenotype-based prioritization of causative genes in rare disease patients. medRxiv. 2023 doi: 10.1038/s41598-025-99539-y.
- 103.Kim J., et al. Assessing the utility of large language models for phenotype-driven gene prioritization in rare genetic disorder diagnosis. arXiv preprint arXiv:2403.14801. 2024 doi: 10.1016/j.ajhg.2024.08.010.
- 104.Kempf L., et al. Challenges of developing and conducting clinical trials in rare disorders. Am J Med Genet. 2018;176(4):773–783. doi: 10.1002/ajmg.a.38413.
- 105.Begley C.G., et al. Drug repurposing: misconceptions, challenges, and opportunities for academic researchers. Sci Transl Med. 2021;13(612) doi: 10.1126/scitranslmed.abd5524.
- 106.De Rosa M.C., et al. Drug repurposing: a nexus of innovation, science, and potential. Sci Rep. 2023;13(1):17887. doi: 10.1038/s41598-023-44264-7.
- 107.Saha R.P., et al. Repurposing drugs, ongoing vaccine, and new therapeutic development initiatives against COVID-19. Front Pharmacol. 2020;11:1258. doi: 10.3389/fphar.2020.01258.
- 108.Alam S., et al. Therapeutic effectiveness and safety of repurposing drugs for the treatment of COVID-19: position standing in 2021. Front Pharmacol. 2021;12 doi: 10.3389/fphar.2021.659577.
- 109.Wei J., et al. DrugReAlign: a multisource prompt framework for drug repurposing based on large language models. BMC Biol. 2024;22(1):226. doi: 10.1186/s12915-024-02028-3.
- 110.Inoue Y., Song T., Fu T. DrugAgent: explainable drug repurposing agent with large language model-based reasoning. arXiv preprint arXiv:2408.13378. 2024.
- 111.Wilkerson, Daniel Shawcross. A proposal for proquints: Identifiers that are readable, spellable, and pronounceable. arXiv preprint arXiv:0901.4016 (2009).
- 112.Hughes, D., AI in Drug Discovery – Harnessing the Power of LLMs. https://www.graphable.ai/blog/ai-in-drug-discovery-and-development/ (accessed on 27 July, 2024). 2024.
- 113.Lv Q., et al. Meta learning with graph attention networks for low-data drug discovery. IEEE Trans Neural Netw Learn Syst. 2023 doi: 10.1109/TNNLS.2023.3250324.
- 114.Lv Q., et al. Meta-molnet: a cross-domain benchmark for few examples drug discovery. IEEE Trans Neural Netw Learn Syst. 2024 doi: 10.1109/TNNLS.2024.3359657.
- 115.Paul D., et al. Artificial intelligence in drug discovery and development. Drug Discov Today. 2021;26(1):80–93. doi: 10.1016/j.drudis.2020.10.010.
- 116.Lu J., Zhang Y. Unified deep learning model for multitask reaction predictions with explanation. J Chem Inf Model. 2022;62(6):1376–1387. doi: 10.1021/acs.jcim.1c01467.
- 117.Edwards C., et al. Translation between molecules and natural language. arXiv preprint arXiv:2204.11817. 2022.
- 118.Truhn D., et al. Large language models and multimodal foundation models for precision oncology. NPJ Precis Oncol. 2024;8(1):72. doi: 10.1038/s41698-024-00573-2.
- 119.Guan S., Wang G. Drug discovery and development in the era of artificial intelligence: from machine learning to large language models. Artif Intell Chem. 2024;2(1).
- 120.Quantori blog. Revolutionizing Drug Discovery with Multimodal Data and AI: A Deep Dive into Use Cases. https://quantori.com/blog/revolutionizing-drug-discovery-with-multimodal-data-and-ai-a-deep-dive-into-use-cases (accessed on 27 July, 2024). 2024.
- 121.Kapila, N., AI in Clinical Trials: The Future of Drug Discovery. https://www.appliedclinicaltrialsonline.com/view/ai-in-clinical-trials-the-future-of-drug-discovery (accessed on 27 July, 2024). 2024.
- 122.Brahmavar S.B., et al. Generating Novel Leads for Drug Discovery using LLMs with Logical Feedback. In: Proceedings of the AAAI Conference on Artificial Intelligence. 2024. doi: 10.1609/aaai.v38i1.27751.
- 123.Zambrano Chaves J.M., et al. Tx-LLM: A Large Language Model for Therapeutics. arXiv preprint arXiv:2406.06316. 2024.
- 124.Edwards C., et al. Synergpt: In-context learning for personalized drug synergy prediction and drug design. arXiv preprint arXiv:2307.11694. 2023.
- 125.Li T., et al. CancerGPT for few shot drug pair synergy prediction using large pretrained language models. NPJ Digit Med. 2024;7(1):40. doi: 10.1038/s41746-024-01024-9.
- 126.Chen Y., Zou J. GenePT: a simple but effective foundation model for genes and cells built from ChatGPT. bioRxiv. 2024.
- 127.Zheng J., Xiao X., Qiu W.R. DTI-BERT: identifying drug-target interactions in cellular networking based on BERT and deep learning method. Front Genet. 2022;13 doi: 10.3389/fgene.2022.859188.
- 128.Theodoris C.V., et al. Transfer learning enables predictions in network biology. Nature. 2023;618(7965):616–624. doi: 10.1038/s41586-023-06139-9.
- 129.Xia, J., et al., Mole-bert: Rethinking pre-training graph neural networks for molecules. 2023. https://openreview.net/forum?id=jevY-DtiZTR.
- 130.Liu J., et al. Large language models in bioinformatics: applications and perspectives. ArXiv. 2024.
- 131.(a) Wang, S., et al. Smiles-bert: large scale unsupervised pre-training for molecular property prediction. In: Proceedings of the 10th ACM international conference on bioinformatics, computational biology and health informatics. 2019; (b) Nguyen, T.M., T. Nguyen, and T. Tran, Mitigating cold-start problems in drug-target affinity prediction with interaction knowledge transferring. Brief Bioinform, 2022. 23(4).
- 132.Wu Z., et al. Knowledge-based BERT: a method to extract molecular features like computational chemists. Brief Bioinform. 2022;23(3):131. doi: 10.1093/bib/bbac131.
- 133.Benary M., et al. Leveraging large language models for decision support in personalized oncology. JAMA Netw Open. 2023;6(11):e2343689. doi: 10.1001/jamanetworkopen.2023.43689.
- 134.Taylor R., et al. Galactica: a large language model for science. arXiv preprint arXiv:2211.09085. 2022.
- 135.Yasunaga M., Leskovec J., Liang P. Linkbert: pretraining language models with document links. arXiv preprint arXiv:2203.15827. 2022.
- 136.Shin H.-C., et al. BioMegatron: larger biomedical domain language model. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020. doi: 10.18653/v1/2020.emnlp-main.379.
- 137.Bolton E., et al. Biomedlm: A 2.7 b parameter language model trained on biomedical text. arXiv preprint arXiv:2403.18421. 2024.
- 138.Gravina A.G., et al. Charting new AI education in gastroenterology: cross-sectional evaluation of ChatGPT and perplexity AI in medical residency exam. Dig Liver Dis. 2024;56(8):1304–1311. doi: 10.1016/j.dld.2024.02.019.
- 139.Yunxiang L., et al. Chatdoctor: a medical chat model fine-tuned on llama model using medical domain knowledge. arXiv preprint arXiv:2303.14070. 2023;2(5):6. doi: 10.7759/cureus.40895.
- 140.Wu C., et al. PMC-LLaMA: toward building open-source language models for medicine. J Am Med Inform Assoc. 2024 doi: 10.1093/jamia/ocae045.
- 141.Gu Y., et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans Comput Healthcare (HEALTH) 2021;3(1):1–23.
- 142.Toma, A., et al., Clinical camel: An open expert-level medical language model with dialogue-based knowledge encoding. arXiv preprint arXiv:2305.12031, 2023.
- 143.Rios-Hoyo A., et al. Evaluation of large language models as a diagnostic aid for complex medical cases. Front Med (Lausanne). 2024;11:1380148. doi: 10.3389/fmed.2024.1380148.
- 144.Luo R., et al. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform. 2022;23(6) doi: 10.1093/bib/bbac409.
- 145.Han, T., et al., MedAlpaca--an open-source collection of medical conversational AI models and training data. arXiv preprint arXiv:2304.08247, 2023.
- 146.Yang S., et al. B-LBConA: a medical entity disambiguation model based on Bio-LinkBERT and context-aware mechanism. BMC Bioinf. 2023;24(1):97. doi: 10.1186/s12859-023-05209-z.
- 147.Jin Q., et al. GeneGPT: augmenting large language models with domain tools for improved access to biomedical information. ArXiv. 2023 doi: 10.1093/bioinformatics/btae075.
- 148.Khan, M.R., M. Ziyadi, and M. AbdelHady, Mt-bioner: Multi-task learning for biomedical named entity recognition using deep bidirectional transformers. arXiv preprint arXiv:2001.08904, 2020.






