Journal of Cheminformatics. 2025 Aug 7;17:121. doi: 10.1186/s13321-025-00978-6

From molecules to data: the emerging impact of chemoinformatics in chemistry

Anup Basnet Chetry 1, Keisuke Ohto 2
PMCID: PMC12333164  PMID: 40775368

Abstract

Chemoinformatics is a rapidly advancing field that integrates chemistry, computer science, and data analysis to enhance the study and application of chemical systems. This interdisciplinary approach leverages computational tools and large datasets to drive innovation in various chemical disciplines, including drug discovery, materials science, and environmental chemistry. Recent advancements in artificial intelligence (AI) and machine learning (ML) have significantly improved the ability to analyze complex datasets, predict molecular properties, and design new compounds. Additionally, the expansion of open-access databases and collaborative platforms has facilitated broader access to chemical data and fostered global research collaboration. Sophisticated molecular modeling techniques, such as multi-scale modeling and free energy calculations, have enhanced the accuracy of predictions, while big data analytics has enabled the extraction of valuable insights from vast datasets. Emerging technologies, including quantum computing, hold promise for further revolutionizing the field by offering new capabilities for simulating and optimizing chemical processes. Despite these advancements, chemoinformatics faces challenges related to data integrity, computational demands, and interdisciplinary collaboration. Addressing these challenges is crucial for the continued growth and effectiveness of chemoinformatics. Overall, the field is poised to play a pivotal role in advancing chemical research and developing innovative solutions to address global challenges.

Scientific contribution: This article highlights the growing impact of chemoinformatics in modern chemistry by integrating computational tools with molecular science to enhance data-driven discovery. It explores advancements in machine learning, artificial intelligence, and big data analytics, which improve molecular property predictions and accelerate chemical innovations. The study also discusses key applications in drug design and materials science, demonstrating how chemoinformatics drives efficiency and sustainability in research. Additionally, it outlines future challenges and opportunities, emphasizing the need for improved algorithms, data standardization, and interdisciplinary collaboration. This work contributes to the evolving role of chemoinformatics as a crucial pillar of modern chemical research.

Keywords: Chemoinformatics, Drug discovery, Analysis, Computing, Modeling

Introduction

Chemoinformatics, as defined by Gasteiger and Engel, is "the application of informatics methods to solve chemical problems" [1]. As an interdisciplinary field that integrates chemistry with computer science and data analysis, chemoinformatics has rapidly become a cornerstone of modern chemical research [2]. The term encompasses a wide array of computational techniques designed to handle chemical data, ranging from molecular modeling to the design of novel compounds and materials [3]. As the digital transformation of the scientific world continues, chemoinformatics has emerged as a critical tool for managing the increasing complexity and volume of chemical information [4, 5]. The origins of chemoinformatics can be traced back to the pharmaceutical industry, where it played a pivotal role in drug discovery and molecular design. Early applications focused on quantitative structure–activity relationships (QSAR), molecular docking, and virtual screening, significantly enhancing the efficiency of drug development. Over time, the field expanded beyond pharmaceuticals to encompass data-driven approaches that facilitate the storage [8], retrieval [9], and analysis of chemical data on an unprecedented scale [10]. The increasing openness of chemical data, driven by initiatives promoting public databases such as PubChem and ChEMBL, has further accelerated research progress [11]. Additionally, the formal integration of chemoinformatics into university curricula reflects its growing importance, ensuring that future researchers are equipped with computational skills essential for modern chemical problem-solving. The advent of high-throughput screening, automated synthesis, and advanced analytical techniques has led to an explosion of chemical data [12]. While this data deluge presents vast opportunities for new discoveries, it also introduces significant challenges in managing, analyzing, and interpreting large datasets [13]. 
Chemoinformatics addresses these challenges by offering a range of solutions, including specialized chemical databases, molecular modeling software, and machine learning algorithms that predict chemical behavior and properties [14, 15]. One of the most impactful applications of chemoinformatics is in drug discovery, where virtual screening and QSAR models enable researchers to predict the biological activity of compounds before synthesis, saving both time and resources [16, 17]. Similarly, in materials science, chemoinformatics facilitates the design of new materials by predicting their properties based on molecular structure [18]. In environmental chemistry, it aids in understanding the fate of chemicals in the environment and assessing their potential risks [19]. Despite its rapid advancement, chemoinformatics faces several challenges. Issues related to data quality and standardization remain critical, particularly in the consistent representation of molecular structures. Molecular notations such as SMILES (Simplified Molecular Input Line Entry System), InChI (International Chemical Identifier), and MOL file formats are widely used for encoding molecular information. Each of these notations serves different purposes: SMILES offers a compact, linear representation ideal for database storage, while InChI provides a standardized, non-proprietary identifier facilitating data exchange. However, the accurate representation of complex chemical information, such as reaction conditions, stereochemistry, metal complexes, and dynamic molecular interactions, often presents challenges due to the limitations of current encoding systems. The need for comprehensive and flexible molecular representations is critical for improving data interoperability and predictive modeling performance [20]. Another crucial aspect of chemoinformatics, particularly in machine learning (ML)-driven chemical modeling, is the incorporation of negative (inactive) data alongside positive datasets. 
Many predictive models, such as QSAR and deep learning approaches, require well-balanced training datasets that include compounds with both desirable and undesirable properties. The availability of high-quality negative data is essential for improving the reliability and generalizability of ML models, particularly in drug discovery, where distinguishing between active and inactive compounds can enhance the accuracy of virtual screening and lead optimization [21]. However, curating negative datasets remains a challenge due to limited reporting of inactive compounds, potential biases in screening assays, and the need for standardization across different chemical domains. Additionally, integrating chemoinformatics tools into traditional laboratory workflows requires close collaboration between chemists, computer scientists, and data analysts. Looking ahead, the integration of artificial intelligence (AI) and machine learning with chemoinformatics is expected to revolutionize the field [22]. These technologies have the potential to enhance predictive modeling [23], automate data analysis [24], and accelerate the discovery of new compounds and materials [25]. Moreover, the rise of big data and cloud computing presents new opportunities for managing and analyzing the massive datasets generated by modern chemical research [26, 27].
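As a rough illustration of the point about negative data, the sketch below checks how balanced a labeled screening set is and derives inverse-frequency class weights that a model could use to compensate for scarce inactives or actives. The compound identifiers and labels are invented for illustration, and the weighting formula (number of samples divided by the product of the number of classes and the class count) is a common heuristic assumed here, not a method prescribed by the cited studies.

```python
from collections import Counter

# Hypothetical screening records: (compound_id, label), 1 = active, 0 = inactive.
records = [
    ("cmpd-001", 1), ("cmpd-002", 0), ("cmpd-003", 0), ("cmpd-004", 0),
    ("cmpd-005", 1), ("cmpd-006", 0), ("cmpd-007", 0), ("cmpd-008", 0),
]


def class_balance(labels):
    """Return the fraction of samples that fall in each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


labels = [label for _, label in records]
balance = class_balance(labels)
weights = balanced_class_weights(labels)
print("class fractions:", balance)
print("class weights:", weights)
```

A dataset with no inactive records at all would show a single class here, which is a quick signal that a classifier trained on it cannot learn to separate actives from inactives.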

This paper explores the emerging impact of chemoinformatics in chemistry, delving into its diverse applications, current challenges, and the advancements shaping its future. By examining the role of chemoinformatics in modern chemical research, this study aims to highlight its potential to drive innovation, enhance sustainability, and contribute to the ongoing evolution of the chemical sciences.

Historical background

Chemoinformatics is a specialized branch of information technology that employs computers and software to aid in the collection, storage, analysis, and manipulation of chemical data. This encompasses chemical formulas, structures, properties, spectra, and biological or biochemical activities [28, 29]. The term "chemoinformatics," a shortened version of "chemical informatics," was introduced by Frank Brown in the late 1990s [30]. However, the foundational ideas of chemoinformatics, such as chemical databases, quantitative structure–activity relationships (QSAR), and the prediction of compound properties or spectra, have been in existence for over four decades. Despite being a relatively niche field with limited academic or industrial recognition until recently, chemoinformatics has gained prominence due to the rise of high-throughput drug screening and the demand for vast chemical libraries containing millions of compounds [31]. It now plays a crucial role in various aspects of drug discovery and development. The evolution of chemoinformatics is deeply intertwined with the development of computational chemistry and the increasing reliance on data-driven approaches in scientific research. Understanding the historical context of chemoinformatics provides insight into how the field has grown to become a fundamental part of modern chemistry [32].

Early beginnings: the rise of computational chemistry

The roots of chemoinformatics can be traced back to the 1960s and 1970s when computational chemistry began to emerge as a distinct discipline. During this period, the focus was primarily on molecular modeling and quantum chemistry [33]. Researchers used computational methods to predict molecular structures, properties, and behaviors, often relying on simplified models due to the limited computational power available at the time. These early efforts laid the groundwork for the integration of computers into chemical research, demonstrating the potential of computational tools to complement experimental work [34].

Emergence of chemoinformatics: bridging chemistry and informatics

The term "chemoinformatics" was formally introduced in the late 1990s, although the practice itself had been evolving for decades. The increasing complexity of chemical data, coupled with advances in computational power and algorithms, led to the need for specialized tools to manage and analyze this information. Chemoinformatics emerged as a response to these challenges, combining elements of chemistry, informatics, and computer science to create a new field dedicated to the efficient handling of chemical data [29]. The development of chemical databases, such as the Cambridge Structural Database (CSD) and PubChem, marked a significant milestone in the evolution of chemoinformatics. These databases provided researchers with easy access to vast amounts of chemical information, enabling more efficient data retrieval and analysis. At the same time, advancements in molecular modeling software allowed for more accurate predictions of molecular properties, further cementing the importance of chemoinformatics in research [35].

Expansion and diversification: the 21st century boom

The twenty-first century saw an explosion in the volume of chemical data generated by new technologies such as high-throughput screening, automated synthesis, and advanced spectroscopy techniques. This data deluge necessitated the development of more sophisticated chemoinformatics tools capable of handling large datasets and performing complex analyses [36]. During this period, chemoinformatics expanded its reach beyond traditional applications in drug discovery to other areas such as materials science, environmental chemistry, and agrochemicals. The field diversified, with new methodologies emerging to address the unique challenges posed by different types of chemical data. For example, chemoinformatics became a specialized subfield focused on small molecules, while bioinformatics dealt with biological macromolecules and systems biology [37, 38].

The integration of artificial intelligence and machine learning

In recent years, the integration of artificial intelligence (AI) and machine learning (ML) into chemoinformatics has represented a major leap forward [39]. These technologies have significantly enhanced the capabilities of chemoinformatics tools, allowing for more accurate predictions, automated data analysis, and the discovery of new patterns in chemical data. AI-driven approaches, such as deep learning, have been applied to tasks ranging from virtual screening to molecular property prediction, opening up new avenues for research and innovation [40]. The historical development of chemoinformatics reflects the broader trends in science and technology, where data-driven approaches and computational tools have become increasingly central to research. As the field continues to evolve, it is poised to play an even more critical role in shaping the future of chemistry [41].

Applications of chemoinformatics

Chemoinformatics has found a wide range of applications across various subfields of chemistry, revolutionizing the way researchers approach complex problems. The integration of computational tools and data analysis techniques has not only accelerated discovery but also enhanced the efficiency and precision of chemical research [42]. This section examines the key applications of chemoinformatics in drug discovery, materials science, and environmental chemistry, highlighting its transformative impact on these fields.

Drug discovery and development

One of the most prominent applications of chemoinformatics is in drug discovery and development. The pharmaceutical industry has long relied on chemoinformatics to streamline the process of identifying and optimizing potential drug candidates [43]. Virtual screening, a technique that uses computational models to predict the biological activity of compounds, allows researchers to rapidly assess large libraries of molecules, identifying those with the highest likelihood of success [44]. Chemoinformatics also plays a crucial role in the design of new drugs through quantitative structure–activity relationship (QSAR) modeling [45]. QSAR models correlate the chemical structure of compounds with their biological activity, enabling the prediction of how modifications to molecular structures will impact their efficacy. This approach reduces the need for costly and time-consuming experimental testing, allowing for the more efficient development of new therapeutic agents [17]. Molecular docking is another key application of chemoinformatics in drug discovery [46]. This technique simulates the interaction between a drug molecule and its target, typically a protein, to predict the strength and specificity of binding. By identifying the best candidates for further development, molecular docking helps to optimize lead compounds and improve the success rate of drug development projects [47].
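The QSAR idea described above can be sketched in its simplest form as a one-descriptor linear fit. This is a minimal illustration under strong assumptions: the descriptor (a hypothetical logP value) and the activities (pIC50) are made-up toy numbers, and real QSAR models typically use many descriptors, curated data, and proper validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy training set (hypothetical values, for illustration only).
logp = [1.0, 2.0, 3.0, 4.0]      # descriptor computed from structure
pic50 = [5.1, 5.9, 7.1, 7.9]     # measured activity

slope, intercept = fit_line(logp, pic50)


def predict(descriptor_value):
    """Predict activity for an unsynthesized compound from its descriptor."""
    return slope * descriptor_value + intercept


print(f"pIC50 = {slope:.2f} * logP + {intercept:.2f}")
print("predicted pIC50 at logP 2.5:", round(predict(2.5), 2))
```

The fitted line is only trustworthy for compounds similar to the training set; predictions far outside the descriptor range covered by the data are extrapolations.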

Materials science

In materials science, chemoinformatics has become an invaluable tool for the design and discovery of new materials [48]. By leveraging computational models and large datasets, researchers can predict the properties of materials before they are synthesized, guiding the development of materials with tailored characteristics for specific applications [18]. One of the key applications of chemoinformatics in materials science is the design of polymers [49], catalysts [50], and nanomaterials [51]. Predictive modeling allows researchers to explore a vast chemical space, identifying promising candidates for further investigation. For example, chemoinformatics can be used to design polymers with specific mechanical, thermal, or chemical properties, enabling the creation of materials for advanced technologies such as flexible electronics [52], high-performance batteries [53], and lightweight composites [54].

Catalyst design is another area where chemoinformatics has made significant contributions [55]. By modeling the interactions between catalysts and reactants, chemoinformatics tools can predict the efficiency and selectivity of catalytic processes. This approach has led to the discovery of more effective and environmentally friendly catalysts, supporting the development of green chemistry practices [56].

Environmental chemistry

Chemoinformatics has also found important applications in environmental chemistry, where it aids in understanding the behavior and impact of chemicals in the environment [58]. One of the primary uses of chemoinformatics in this field is the prediction of chemical fate and transport, which involves modeling how chemicals move through and interact with different environmental media, such as air, water, and soil [59]. Toxicity prediction is another critical application of chemoinformatics in environmental chemistry [60]. By analyzing the structural features of chemicals, chemoinformatics tools can predict their potential toxicity to humans, animals, and ecosystems. This information is essential for assessing the environmental risks associated with chemical substances and for guiding the development of safer alternatives [61]. Additionally, chemoinformatics supports the management of environmental data, providing tools for the analysis and interpretation of large datasets generated by monitoring programs [62]. This capability is particularly valuable for tracking the presence of pollutants, understanding their sources and pathways, and evaluating the effectiveness of remediation efforts [63].

Chemoinformatics in academia and industry

The adoption of chemoinformatics tools is widespread in both academia and industry, where they are used to accelerate research and development across various chemical disciplines. In academia, chemoinformatics is increasingly incorporated into the curriculum, training the next generation of chemists in the use of computational tools and data analysis techniques [64]. Researchers in academic institutions use chemoinformatics to explore fundamental questions in chemistry, from understanding molecular interactions to designing new materials and drugs [65]. In industry, chemoinformatics is a key component of the research and development process, particularly in the pharmaceutical, chemical, and materials sectors [66]. Companies leverage chemoinformatics to improve the efficiency of their R&D efforts, reduce costs, and bring products to market more quickly. The ability to predict the behavior and properties of chemicals before they are synthesized or tested experimentally has become a competitive advantage, driving innovation and sustainability in the chemical industry [67].

Machine learning and AI in chemoinformatics

Chemoinformatics has become a driving force in a variety of domains, including drug discovery, materials science, and sustainability. Recently, machine learning (ML) and artificial intelligence (AI) have emerged as critical tools in advancing these fields [68]. In drug discovery, AI-driven models have enhanced virtual screening processes and QSAR modeling, allowing researchers to predict the biological activity of compounds before their synthesis, thereby saving time and resources. Additionally, ML algorithms have enabled the optimization of lead compounds, which can accelerate the transition from discovery to clinical trials [69]. In materials science, the application of AI has led to the discovery of new materials with desirable properties by analyzing vast datasets of molecular structures and their corresponding characteristics [70]. For instance, graph neural networks (GNNs) have been particularly effective in predicting material properties based on molecular graphs, offering a promising pathway for developing new materials for energy storage, catalysis, and electronics [71]. In the realm of sustainability, ML models have been used to design greener chemical processes by optimizing reaction conditions, predicting chemical interactions, and minimizing waste generation. AI technologies also contribute to the evaluation of the environmental impact of chemicals, helping researchers select eco-friendly alternatives for industrial processes [72]. While the integration of AI and ML in chemoinformatics offers promising advancements, challenges remain. Issues related to data quality, standardization, and the need for negative data for model training must be addressed. Furthermore, representing certain chemical information, such as reaction conditions or metals, continues to be a challenge for many ML models [73]. Overcoming these obstacles will be essential for realizing the full potential of AI in transforming chemoinformatics applications.
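To make the molecular-graph idea behind GNNs more concrete, the sketch below encodes ethanol as a graph of heavy atoms and runs a single round of sum-aggregation message passing. It is deliberately simplified and has no learned weights; the feature choice (atomic number only), the update rule (own feature plus the sum of neighbor features), and the sum readout are assumptions made for illustration, not the architecture of any specific published model.

```python
# Ethanol heavy-atom graph: C0 - C1 - O2 (hydrogens omitted).
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1), (1, 2)]

ATOMIC_NUMBER = {"C": 6, "O": 8}


def build_adjacency(atoms, bonds):
    """Build an undirected adjacency list from a bond list."""
    adjacency = {idx: [] for idx in atoms}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def message_passing_step(features, adjacency):
    """One update: new_feature[v] = feature[v] + sum of neighbor features."""
    return {
        v: features[v] + sum(features[u] for u in adjacency[v])
        for v in adjacency
    }


adjacency = build_adjacency(atoms, bonds)
features = {idx: float(ATOMIC_NUMBER[sym]) for idx, sym in atoms.items()}
updated = message_passing_step(features, adjacency)

# Sum readout: collapse per-atom features into one graph-level value.
graph_embedding = sum(updated.values())
print(updated, graph_embedding)
```

After one step each atom's feature already reflects its bonded neighbors, which is the mechanism that lets such models relate local chemical environments to a property of the whole molecule. Trained GNNs replace the fixed sum with learned transformations and repeat the step several times.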

Current challenges in chemoinformatics

While chemoinformatics has made significant strides in advancing chemical research, the field still faces several challenges that need to be addressed to fully realize its potential. These challenges span data management, computational demands, interdisciplinary collaboration, and the integration of new technologies [74]. Understanding and overcoming these challenges is crucial for the continued growth and effectiveness of chemoinformatics.

Data integrity and standardization

One of the most significant challenges in chemoinformatics is ensuring data integrity and standardization. The quality of the data used in chemoinformatics directly impacts the accuracy and reliability of computational models and predictions. However, chemical data often comes from diverse sources, including experimental results, literature, and databases, which may vary in format, quality, and completeness [75]. Standardizing chemical data across different platforms and sources is essential for effective data sharing and integration. Without standardization, inconsistencies in data formats, naming conventions and units of measurement can lead to errors in data interpretation and analysis. Furthermore, the lack of standardized ontologies and metadata complicates the process of linking related data from different sources, hindering the development of comprehensive datasets [76]. Data curation, which involves cleaning, validating, and annotating data, is a time-consuming and resource-intensive process. As the volume of chemical data continues to grow, automating data curation processes becomes increasingly important. Developing and adopting standardized protocols and formats for chemical data will be critical for ensuring the reliability of chemoinformatics tools and fostering collaboration across the scientific community [77].
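As an illustration of how inconsistent units can corrupt a merged dataset, the sketch below converts potency values reported in mixed concentration units to nanomolar and flags compounds whose normalized values disagree strongly. The record layout, compound identifiers, and the 10-fold disagreement threshold are hypothetical choices for this example.

```python
# Conversion factors from each concentration unit to nanomolar (nM).
TO_NANOMOLAR = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0}

# Hypothetical records merged from two sources.
raw_records = [
    {"compound": "cmpd-001", "value": 0.5, "unit": "uM"},
    {"compound": "cmpd-001", "value": 480.0, "unit": "nM"},
    {"compound": "cmpd-002", "value": 2.0, "unit": "uM"},
    {"compound": "cmpd-002", "value": 2.0, "unit": "nM"},  # possible unit error
]


def normalize(record):
    """Convert one record to nM; an unknown unit raises KeyError."""
    factor = TO_NANOMOLAR[record["unit"]]
    return {"compound": record["compound"], "value_nM": record["value"] * factor}


def flag_conflicts(records, max_fold=10.0):
    """Return compounds whose values differ by more than max_fold.

    Assumes all values are positive concentrations.
    """
    by_compound = {}
    for rec in records:
        by_compound.setdefault(rec["compound"], []).append(rec["value_nM"])
    return sorted(
        compound
        for compound, values in by_compound.items()
        if max(values) / min(values) > max_fold
    )


normalized = [normalize(r) for r in raw_records]
conflicts = flag_conflicts(normalized)
print("needs manual review:", conflicts)
```

Failing loudly on an unrecognized unit, rather than guessing, is a deliberate choice in this sketch: silent defaults are one of the ways curation errors propagate into models.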

Computational demands

The complexity of chemical systems and the vast amount of data involved in chemoinformatics require significant computational resources. As chemoinformatics continues to evolve, the need for more advanced algorithms and computing power becomes increasingly apparent. Many chemoinformatics tasks, such as molecular docking, quantum chemical calculations, and large-scale data mining, are computationally intensive, requiring high-performance computing (HPC) infrastructure [78]. Access to sufficient computational resources is often a limiting factor, particularly for smaller research institutions and laboratories. The cost of HPC infrastructure, along with the expertise required to use it effectively, can be prohibitive for some researchers. Additionally, the development of efficient algorithms that can handle the complexity of chemical data while minimizing computational overhead remains an ongoing challenge [79]. Cloud computing offers a potential solution to some of these issues by providing scalable computational resources on demand. However, the adoption of cloud computing in chemoinformatics also raises concerns about data security, privacy, and the need for specialized software that can efficiently utilize cloud-based resources [80].

Interdisciplinary collaboration

Chemoinformatics is inherently interdisciplinary, requiring expertise in chemistry, computer science, and data analysis. Effective collaboration between these disciplines is essential for the development and application of chemoinformatics tools. However, differences in terminology, methodologies, and research priorities can create barriers to effective collaboration [81]. Chemists may not have the computational expertise needed to develop or use advanced chemoinformatics tools, while computer scientists may lack the chemical knowledge necessary to understand the specific challenges and nuances of chemical data. Bridging this gap requires education and training programs that equip researchers with the skills needed to work across disciplines. Moreover, fostering a collaborative environment where chemists, computer scientists, and data analysts can work together effectively is crucial for the advancement of chemoinformatics. This includes creating opportunities for interdisciplinary research projects, workshops, and conferences that bring together experts from different fields to share knowledge and develop new approaches [82].

Integration of emerging technologies

The rapid advancement of technologies such as artificial intelligence (AI), machine learning (ML), and big data analytics presents both opportunities and challenges for chemoinformatics. While these technologies have the potential to greatly enhance the capabilities of chemoinformatics tools, their integration into existing workflows is not without challenges [83]. One of the main challenges is the need for large, high-quality datasets to train AI and ML models. In many cases, the available chemical data may be insufficient, incomplete, or biased, leading to models that are less accurate or generalizable. Additionally, the "black box" nature of some AI and ML models can make it difficult to interpret their predictions, which is a concern in fields like drug discovery where understanding the underlying mechanisms is crucial [84]. The integration of AI and ML also requires researchers to acquire new skills in data science and computational techniques. This can be a significant hurdle for chemists who are traditionally trained in experimental methods. Furthermore, the rapid pace of technological change means that chemoinformatics tools must be continuously updated and adapted to keep up with new developments [85].

Advancements and emerging trends in chemoinformatics

Despite the challenges, chemoinformatics is a rapidly evolving field that continues to benefit from technological advancements and innovative methodologies. These advancements are not only addressing existing challenges but also opening up new avenues for research and application. This section explores some of the most significant advancements and emerging trends in chemoinformatics, including the integration of AI and machine learning, the growth of open-access databases, and the development of more sophisticated modeling techniques.

Artificial intelligence and machine learning

One of the most transformative trends in chemoinformatics is the integration of artificial intelligence (AI) and machine learning (ML) [86]. These technologies are enhancing the ability to predict molecular properties, analyze large datasets, and discover new compounds. AI and ML models, particularly deep learning techniques, have shown remarkable success in tasks such as virtual screening, molecular design, and structure–activity relationship (SAR) analysis [87]. AI-driven approaches are also being used to automate the generation of chemical data, such as predicting the outcomes of chemical reactions or designing new molecules with desired properties. Generative models, like variational autoencoders (VAEs) [88] and generative adversarial networks (GANs) [89], are being applied to create novel molecular structures that can be further optimized for specific applications. Moreover, AI and ML are helping to address some of the data challenges in chemoinformatics by enabling the extraction of useful information from noisy or incomplete datasets. Transfer learning, for instance, allows models trained on large datasets to be adapted for use with smaller, domain-specific datasets, improving their applicability in specialized areas of chemistry [90].

Growth of open-access databases and collaborative platforms

The increasing availability of open-access chemical databases and collaborative platforms is another significant advancement in chemoinformatics. These resources are democratizing access to chemical data and tools, allowing researchers from around the world to contribute to and benefit from shared knowledge [91]. Open-access databases like PubChem [92], ChEMBL [93], and the Cambridge Structural Database (CSD) have become essential resources for chemoinformatics research [94]. They provide vast amounts of chemical data, including molecular structures, properties, biological activities, and reaction information, which can be used for a wide range of applications, from drug discovery to materials science. Collaborative platforms and initiatives, such as the Open Chemistry initiative and the Chemical Informatics and Cyberinfrastructure Collaboratory (CICC), are fostering a more open and collaborative research environment [95]. These platforms provide tools and frameworks for sharing data, models, and workflows, enabling researchers to work together more effectively and accelerate the pace of discovery.

Advanced molecular modeling techniques

The development of more sophisticated molecular modeling techniques is another key trend in chemoinformatics. These techniques are improving the accuracy and reliability of predictions, particularly in areas such as drug design, materials science, and environmental chemistry. One of the notable advancements is the use of multi-scale modeling, which combines different levels of theory (e.g., quantum mechanics, molecular mechanics, and coarse-grained models) to capture the behavior of complex chemical systems more accurately. This approach allows researchers to study molecular interactions at different scales, from atomic-level details to macroscopic properties, providing a more comprehensive understanding of chemical phenomena [96]. Another advancement is the increased use of free energy calculations, which provide more accurate estimates of binding affinities and other thermodynamic properties. Techniques such as molecular dynamics (MD) simulations and enhanced sampling methods are being integrated into chemoinformatics workflows to improve the prediction of molecular behavior in different environments [97].
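For context on why free energy calculations bear directly on binding affinity, the standard thermodynamic relation between the binding free energy and the dissociation constant is given below. The 1 M standard state and the numerical example are illustrative assumptions added here, not values taken from the cited work.

```latex
\Delta G^{\circ}_{\mathrm{bind}} = RT \ln\!\left(\frac{K_d}{c^{\circ}}\right),
\qquad c^{\circ} = 1\ \mathrm{M}

% Worked example at T = 298.15 K, with RT \approx 0.593 kcal/mol:
% K_d = 1 nM = 10^{-9} M
\Delta G^{\circ}_{\mathrm{bind}} \approx 0.593 \times \ln\!\left(10^{-9}\right)
\approx -12.3\ \mathrm{kcal\,mol^{-1}}
```

Because the dependence is logarithmic, a tenfold change in the dissociation constant corresponds to roughly 1.4 kcal/mol at room temperature, so even modest errors in a computed free energy translate into large errors in predicted affinity.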

Integration with big data analytics

The integration of big data analytics into chemoinformatics is enabling the analysis of increasingly large and complex datasets. With the advent of high-throughput screening, automated synthesis, and advanced spectroscopy techniques, the volume of chemical data generated has grown exponentially. Big data analytics tools are helping researchers to make sense of this data, identifying patterns, trends, and correlations that would be difficult or impossible to detect using traditional methods [98]. Data mining and machine learning techniques are being applied to extract valuable insights from large chemical datasets, such as identifying new drug candidates, predicting reaction outcomes, or discovering new materials. The ability to analyze big data is also enhancing the development of predictive models, improving their accuracy and generalizability [99].

Quantum computing: a frontier in chemoinformatics

Quantum computing represents a frontier in chemoinformatics with the potential to reshape the field. While still in its early stages, it offers the promise of solving chemical problems that are intractable for classical computers. In theory, quantum computers can perform certain calculations exponentially faster than the best known classical algorithms, making them well suited to tasks such as simulating molecular interactions, optimizing chemical processes, and designing new materials [100]. Applied to chemoinformatics, quantum computing could lead to breakthroughs in areas such as drug discovery, where it could be used to model the behavior of large biomolecules accurately or to explore vast chemical spaces in search of new therapeutics. Although quantum hardware capable of such calculations at a practical scale is not yet available, ongoing research and development suggest that it could become a powerful tool in the chemoinformatics arsenal [101].
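One way to see why exact quantum simulation becomes intractable on classical hardware is to count the memory needed to store a full quantum state. The sketch below assumes a brute-force state-vector representation with one double-precision complex amplitude (16 bytes) per basis state. The function name and the chosen qubit counts are illustrative assumptions, and more economical classical approximations exist that this sketch does not consider.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory (bytes) needed to store every amplitude of an n-qubit state.

    Assumes a dense state vector with 2**n_qubits complex amplitudes,
    each stored as two 64-bit floats (16 bytes) by default.
    """
    if n_qubits < 0:
        raise ValueError("n_qubits must be non-negative")
    return bytes_per_amplitude * (2 ** n_qubits)


# Print the storage cost for a few illustrative register sizes.
for n in (10, 20, 30, 40, 50):
    gib = statevector_bytes(n) / 1024 ** 3
    print(f"{n:2d} qubits: {gib:,.6f} GiB")
```

Each added qubit doubles the storage requirement, so a register of 30 qubits already needs 16 GiB under these assumptions. This doubling is the scaling that quantum hardware is hoped to sidestep.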

Conclusion

Chemoinformatics represents a dynamic and rapidly evolving field that bridges the gap between chemistry and computational science. By leveraging advanced computational tools and data-driven approaches, chemoinformatics has transformed the way researchers analyze, predict, and design chemical systems. The field has made significant contributions across various domains, including drug discovery, materials science, and environmental chemistry, enabling more efficient and innovative research practices. The integration of artificial intelligence (AI) and machine learning (ML) has been a game-changer, providing powerful tools for analyzing complex datasets and making accurate predictions. These technologies have enhanced the capabilities of chemoinformatics, leading to breakthroughs in molecular design, virtual screening, and data analysis. The growth of open-access databases and collaborative platforms has democratized access to chemical data, fostering a more inclusive and collaborative research environment. Advancements in molecular modeling techniques, such as multi-scale modeling and free energy calculations, have improved the precision and reliability of predictions, allowing researchers to explore complex chemical systems with greater accuracy. The integration of big data analytics has enabled the analysis of large and complex datasets, uncovering valuable insights and driving innovation in various chemical disciplines. Looking forward, the potential impact of quantum computing on chemoinformatics holds promise for solving some of the most challenging problems in the field. While still in its nascent stages, quantum computing could revolutionize chemoinformatics by providing new capabilities for simulating molecular interactions and optimizing chemical processes. Despite the progress made, the field faces ongoing challenges related to data integrity, computational demands, interdisciplinary collaboration, and the integration of emerging technologies. Addressing these challenges will be crucial for the continued advancement of chemoinformatics and its application in solving complex chemical problems.

In conclusion, chemoinformatics is at the forefront of modern chemistry, driving innovation and discovery through the use of computational tools and data analysis. As the field continues to evolve, it will play an increasingly important role in advancing our understanding of chemical systems and developing new solutions to address global challenges. The continued integration of new technologies and methodologies, along with a focus on overcoming existing challenges, will ensure that chemoinformatics remains a vital and transformative discipline in the years to come.

Author contributions

The responsibilities of the corresponding author (CA) are as follows: ensuring that all listed authors have approved the manuscript before submission and that all authors receive the submission and all substantive correspondence with editors, as well as the full reviews; and verifying that all data, materials (including reagents), and code, even those developed or provided by other authors, comply with the transparency and reproducibility standards of both the field and the journal. This responsibility includes, but is not limited to: (i) ensuring that the original data, materials, and code upon which the submission is based are preserved following best practices in the field so that they are retrievable for reanalysis; (ii) confirming that the presentation of data, materials, and code accurately reflects the original; and (iii) foreseeing and minimizing obstacles to the sharing of the data, materials, and code described in the work.

Availability of data and materials

No datasets were generated or analysed during the current study.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Engel T (2006) Basic overview of chemoinformatics. J Chem Inf Model 46(6):2267–2277. 10.1021/ci600234z
  • 2. Brown N (2009) Chemoinformatics—an introduction for computer scientists. ACM Comput Surv (CSUR) 41(2):1–38. 10.1145/1459352.1459353
  • 3. Willett P (2011) Chemoinformatics: a history. Wiley Interdiscip Rev: Comput Mol Sci 1(1):46–56. 10.1002/wcms.1
  • 4. Varnek A, Baskin II (2011) Chemoinformatics as a theoretical chemistry discipline. Mol Inf 30(1):20–32. 10.1002/minf.201000100
  • 5. Agrafiotis DK, Bandyopadhyay D, Wegner JK, van Vlijmen H (2007) Recent advances in chemoinformatics. J Chem Inf Model 47(4):1279–1293. 10.1021/ci700059g
  • 6. Raslan MA, Raslan SA, Shehata EM, Mahmoud AS, Sabri NA (2023) Advances in the applications of bioinformatics and chemoinformatics. Pharmaceuticals 16(7):1050. 10.3390/ph16071050
  • 7. Maldonado AG, Doucet JP, Petitjean M, Fan BT (2006) Molecular similarity and diversity in chemoinformatics: from theory to applications. Mol Divers 10:39–79. 10.1007/s11030-006-8697-1
  • 8. Sushko I, Novotarskyi S, Körner R, Pandey AK, Rupp M, Teetz W, Tetko IV (2011) Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput-Aided Mol Des 25:533–554. 10.1007/s10822-011-9440-2
  • 9. Chen WL (2006) Chemoinformatics: past, present, and future. J Chem Inf Model 46(6):2230–2255. 10.1021/ci060016u
  • 10. Lo YC, Rensi SE, Torng W, Altman RB (2018) Machine learning in chemoinformatics and drug discovery. Drug Discov Today 23(8):1538–1546. 10.1016/j.drudis.2018.05.010
  • 11. Bajorath J (2024) Milestones in chemoinformatics: global view of the field. J Cheminform 16:124. 10.1186/s13321-024-00922-0
  • 12. Gasteiger J (2016) Chemoinformatics: Achievements and challenges, a personal view. Molecules 21(2):151. 10.3390/molecules21020151
  • 13. Héberger K (2008) Chemoinformatics—multivariate mathematical–statistical methods for data evaluation. In: Vekey K, Telekes A, Vertes A (eds) Medical applications of mass spectrometry. Elsevier, Amsterdam, pp 141–169
  • 14. Saldívar-González FI, Huerta-García CS, Medina-Franco JL (2020) Chemoinformatics-based enumeration of chemical libraries: a tutorial. J Cheminform 12(1):64. 10.1186/s13321-020-00466-z
  • 15. Bunin BA, Siesel B, Morales GA, Bajorath J (2007) Chemoinformatics theory. Springer, Netherlands, pp 1–49
  • 16. Engel T (2014) Chemoinformatik: mit informatik chemische probleme lösen. Chem unserer Zeit 48(6):440–448. 10.1002/ciuz.201400657
  • 17. Srivastava V, Selvaraj C, Singh SK (2021) Chemoinformatics and QSAR. Adv Bioinform. 10.1007/978-981-33-6191-1_10
  • 18. Yosipof A, Shimanovich K, Senderowitz H (2016) Materials informatics: statistical modeling in material science. Mol Inf 35(11–12):568–579. 10.1002/minf.201600047
  • 19. Li H, Yan D, Zhang Z, Lichtfouse E (2019) Prediction of CO2 absorption by physical solvents using a chemoinformatics-based machine learning model. Environ Chem Lett 17:1397–1404. 10.1007/s10311-019-00874-0
  • 20. Xue L, Bajorath J (2000) Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb Chem High Throughput Screen 3(5):363–372. 10.2174/1386207003331454
  • 21. López-López E, Fernández-de Gortari E, Medina-Franco JL (2022) Yes SIR! On the structure–inactivity relationships in drug discovery. Drug Discov Today 27(8):2353–2362. 10.1016/j.drudis.2022.05.005
  • 22. Kutchukian PS, Dropinski JF, Dykstra KD, Li B, DiRocco DA, Streckfuss EC, Dreher SD (2016) Chemistry informer libraries: a chemoinformatics enabled approach to evaluate and advance synthetic methods. Chem Sci 7(4):2604–2613. 10.1039/C5SC04751J
  • 23. Sharma S, Sharma D (2018) Intelligently applying artificial intelligence in chemoinformatics. Curr Top Med Chem 18(20):1804–1826. 10.2174/1568026619666181120150938
  • 24. Wishart DS (2007) Introduction to cheminformatics. Curr Protoc Bioinform 18(1):14–21. 10.1002/0471250953.bi1401s18
  • 25. Prakash N, Gareja DA (2010) Cheminformatics. J Proteomics Bioinform 3:249–252. 10.4172/jpb.1000147
  • 26. Begam BF, Kumar JS (2012) A study on cheminformatics and its applications on modern drug discovery. Procedia Eng 38:1264–1275. 10.1016/j.proeng.2012.06.156
  • 27. McEwen L, Li Y (2014) Academic librarians at play in the field of cheminformatics: building the case for chemistry research data management. J Comput Aided Mol Des 28:975–988. 10.1007/s10822-014-9777-4
  • 28. Gaspar HA, Baskin II, Marcou G, Horvath D, Varnek A (2015) Chemical data visualization and analysis with incremental generative topographic mapping: big data challenge. J Chem Inf Model 55(1):84–94. 10.1021/ci500575y
  • 29. Wishart DS (2016) Introduction to cheminformatics. Curr Protoc Bioinform 53(1):14–21. 10.1002/0471250953.bi1401s53
  • 30. Agrafiotis DK, Holloway MK, Johnson SA, Reynolds CH, Stouch TR, Tropsha A, Waller CL (2018) Chemistry, information and Frank: a tribute to Frank Brown. J Comput Aided Mol Des 32:723–729. 10.1007/s10822-018-0135-9
  • 31. Lenci E, Trabocchi A (2022) Diversity-oriented synthesis and chemoinformatics: a fruitful synergy towards better chemical libraries. Eur J Org Chem 2022(29):e202200575. 10.1002/ejoc.202200575
  • 32. Dreher SD, Krska SW (2021) Chemistry informer libraries: conception, early experience, and role in the future of cheminformatics. Acc Chem Res 54(7):1586–1596. 10.1021/acs.accounts.0c00760
  • 33. Bajorath J (ed) (2008) Chemoinformatics: concepts, methods, and tools for drug discovery. Springer Science & Business Media, Berlin
  • 34. Gasteiger J (2006) Chemoinformatics: a new field with a long tradition. Anal Bioanal Chem 384:57–64. 10.1007/s00216-005-0065-y
  • 35. Gozalbes R, Pineda-Lucena A (2011) Small molecule databases and chemical descriptors useful in chemoinformatics: an overview. Comb Chem High Throughput Screen 14(6):548–558. 10.2174/138620711795767857
  • 36. Stone S, Newman DJ, Colletti SL, Tan DS (2022) Cheminformatic analysis of natural product-based drugs and chemical probes. Nat Prod Rep 39(1):20–32. 10.1039/D1NP00039J
  • 37. Olivares-Amaya R, Amador-Bedolla C, Hachmann J, Atahan-Evrenk S, Sanchez-Carrera RS, Vogt L, Aspuru-Guzik A (2011) Accelerated computational discovery of high-performance materials for organic photovoltaics by means of cheminformatics. Energy Environ Sci 4(12):4849–4861. 10.1039/C1EE02056K
  • 38. Williams AJ, Grulke CM, Edwards J, McEachran AD, Mansouri K, Baker NC, Richard AM (2017) The CompTox chemistry dashboard: a community data resource for environmental chemistry. J Cheminform 9:1–27. 10.1186/s13321-017-0247-6
  • 39. Djoumbou-Feunang Y, Wilmot J, Kinney J, Chanda P, Yu P, Sader A, Kumpatla SP (2023) Cheminformatics and artificial intelligence for accelerating agrochemical discovery. Front Chem 11:1292027. 10.3389/fchem.2023.1292027
  • 40. Saifi I, Bhat BA, Hamdani SS, Bhat UY, Lobato-Tapia CA, Mir MA, Ganie SA (2024) Artificial intelligence and cheminformatics tools: a contribution to the drug development and chemical science. J Biomol Struct Dyn 42(12):6523–6541. 10.1080/07391102.2023.2234039
  • 41. Jayaraman A, Olsen B (2024) Convergence of artificial intelligence, machine learning, cheminformatics, and polymer science in macromolecules. Macromolecules. 10.1021/acs.macromol.4c01704
  • 42. Parvatikar PP, Patil S, Khaparkhuntikar K, Patil S, Singh PK, Sahana R, Raghu AV (2023) Artificial intelligence: machine learning approach for screening large database and drug discovery. Antivir Res. 10.1016/j.antiviral.2023.105740
  • 43. Mannhold R, Kubinyi H, Folkers G (2006) Chemoinformatics in drug discovery. John Wiley & Sons, Hoboken
  • 44. Karthikeyan M, Krishnan S (2002) Chemoinformatics: a tool for modern drug discovery. Int J Inf Technol Manage 1(1):69–82. 10.1504/IJITM.2002.001188
  • 45. Marshall GR (2004) Introduction to chemoinformatics in drug discovery–a personal view. Chemoinform Drug Discov. 10.1002/3527603743
  • 46. Neves BJ, Braga RC, Melo-Filho CC, Moreira-Filho JT, Muratov EN, Andrade CH (2018) QSAR-based virtual screening: advances and applications in drug discovery. Front Pharmacol 9:1275. 10.3389/fphar.2018.01275
  • 47. Rondón-Villarreal P, López WOC (2020) Identification of potential natural neuroprotective molecules for Parkinson’s disease by using chemoinformatics and molecular docking. J Mol Graph Model 97:107547. 10.1016/j.jmgm.2020.107547
  • 48. Wang G, Zhu W (2016) Molecular docking for drug discovery and development: a widely used approach but far from perfect. Future Med Chem 8(14):1707–1710. 10.4155/fmc-2016-0143
  • 49. Babar M, Hassan F, Ijaz M, Mohyuddin MT (2024) Chemoinformatics. Trends Plant Biotechnol. 10.1007/978-981-97-0814-7_11
  • 50. Le TC, Winkler DA (2018) Applications in materials science. Appl Chemoinform: Achiev Future Oppor. 10.1002/9783527806539.ch12
  • 51. Takahashi K, Ohyama J, Nishimura S, Fujima J, Takahashi L, Uno T, Taniike T (2023) Catalysts informatics: paradigm shift towards data-driven catalyst design. Chem Commun 59(16):2222–2238. 10.1039/D2CC05938J
  • 52. Mikolajczyk A, Sizochenko N, Mulkiewicz E, Malankowska A, Rasulev B, Puzyn T (2019) A chemoinformatics approach for the characterization of hybrid nanomaterials: safer and efficient design perspective. Nanoscale 11(24):11808–11818. 10.1039/C9NR01162E
  • 53. Deng S (2024) Machine learning approaches for screening of materials in flexible electronic devices. 10.32657/10356/177761
  • 54. Baskin I, Ein-Eli Y (2022) Electrochemoinformatics as an emerging scientific field for designing materials and electrochemical energy storage and conversion devices—an application in battery science and technology. Adv Energy Mater 12(48):2202380. 10.1002/aenm.202202380
  • 55. Oaki Y, Igarashi Y (2021) Materials informatics for 2D materials combined with sparse modeling and chemical perspective: toward small-data-driven chemistry and materials science. Bull Chem Soc Jpn 94(10):2410–2422. 10.1246/bcsj.20210253
  • 56. Toyao T, Maeno Z, Takakusagi S, Kamachi T, Takigawa I, Shimizu KI (2019) Machine learning for catalysis informatics: recent applications and prospects. ACS Catal 10(3):2260–2297. 10.1021/acscatal.9b04186
  • 57. Bueso-Bordils JI, Antón-Fos GM, Martín-Algarra R, Alemán-López PA (2024) Overview of computational toxicology methods applied in drug and green chemical discovery. J Xenobiotics 14(4):1901–1918. 10.3390/jox14040101
  • 58. Mansouri K, Taylor K, Auerbach S, Ferguson S, Frawley R, Hsieh JH, Sutherland V (2024) Unlocking the potential of clustering and classification approaches: navigating supervised and unsupervised chemical similarity. Environ Health Perspect 132(8):085002. 10.1289/EHP14001
  • 59. Gasteiger J (2014) Some solved and unsolved problems of chemoinformatics. SAR QSAR Environ Res 25(6):443–455. 10.1080/1062936X.2014.898688
  • 60. Sharma AK, Srivastava GN, Roy A, Sharma VK (2017) ToxiM: a toxicity prediction tool for small molecules developed using machine learning and chemoinformatics approaches. Front Pharmacol 8:880. 10.3389/fphar.2017.00880
  • 61. Sosnowska A, Rybinska-Fryca A, Barycki M, Jagiello K, Puzyn T (2018) Chemoinformatic approach to assess toxicity of ionic liquids. Comput Toxicol: Methods Protoc. 10.1007/978-1-4939-7899-1_26
  • 62. Gonzalez-Medina M, Medina-Franco JL (2019) Chemical diversity of cyanobacterial compounds: a chemoinformatics analysis. ACS Omega 4(4):6229–6237. 10.1021/acsomega.9b00532
  • 63. Vorberg S, Tetko IV (2014) Modeling the biodegradability of chemical compounds using the online CHEmical modeling environment (OCHEM). Mol Inf 33(1):73–85. 10.1002/minf.201300030
  • 64. Naveja JJ, Oviedo-Osornio CI, Trujillo-Minero NN, Medina-Franco JL (2018) Chemoinformatics: a perspective from an academic setting in Latin America. Mol Diversity 22:247–258. 10.1007/s11030-017-9802-3
  • 65. Willett P (2020) The literature of chemoinformatics: 1978–2018. Int J Mol Sci 21(15):5576. 10.3390/ijms21155576
  • 66. Martinez-Mayorga K, Madariaga-Mazon A, Medina-Franco JL, Maggiora G (2020) The impact of chemoinformatics on drug discovery in the pharmaceutical industry. Expert Opin Drug Discov 15(3):293–306. 10.1080/17460441.2020.1696307
  • 67. Bajorath J, Chávez-Hernández AL, Duran-Frigola M, Fernández-de Gortari E, Gasteiger J, López-López E, Valli M (2022) Chemoinformatics and artificial intelligence colloquium: progress and challenges in developing bioactive compounds. J Cheminform 14(1):82. 10.1186/s13321-022-00661-0
  • 68. Niazi SK, Mariam Z (2023) Recent advances in machine-learning-based chemoinformatics: a comprehensive review. Int J Mol Sci 24(14):11488. 10.3390/ijms241411488
  • 69. Klenina OV, Chaban TI (2023) Use of chemoinformatics and bioinformatics databases in the processes of computer-aided drug design. Farmatsevtychnyi zhurnal 6:61–82. 10.3235/0367-3057.6.23.05
  • 70. Rodrigues JF, Florea L, de Oliveira MC, Diamond D, Oliveira ON (2021) Big data and machine learning for materials science. Discov Mater 1:1–27. 10.1007/s43939-021-00012-0
  • 71. Tran HT, Joshua Thomas J, Malim NHAH, Ali AM, Huynh SB (2021) Graph neural networks in cheminformatics. In: Intelligent Computing and Optimization: Proceedings of the 3rd International Conference on Intelligent Computing and Optimization 2020 (ICO 2020). Springer International Publishing, pp 823–837. 10.1007/978-3-030-68154-8_71
  • 72. Weber JM, Guo Z, Zhang C, Schweidtmann AM, Lapkin AA (2021) Chemical data intelligence for sustainable chemistry. Chem Soc Rev 50(21):12013–12036. 10.1039/D1CS00477H
  • 73. Miljković F, Medina-Franco JL (2024) Artificial intelligence-open science symbiosis in chemoinformatics. Artif Intell Life Sci. 10.1016/j.ailsci.2024.100096
  • 74. Gasteiger J (2020) Chemistry in times of artificial intelligence. ChemPhysChem 21(20):2233–2242. 10.1002/cphc.202000518
  • 75. Williams AJ, Ekins S, Tkachenko V (2012) Towards a gold standard: regarding quality in public domain chemistry databases and approaches to improving the situation. Drug Discov Today 17(13–14):685–701. 10.1016/j.drudis.2012.02.013
  • 76. Verma R, Taneja T, Singh N, Tanwar N (2024) Chemoinformatics and big data analytics: revolutionizing chemical research-a review. In: AIP Conference Proceedings, vol 3121, no 1. AIP Publishing. 10.1063/5.0221539
  • 77. Gimadiev TR, Lin A, Afonina VA, Batyrshin D, Nugmanov RI, Akhmetshin T, Varnek A (2021) Reaction data curation I: chemical structures and transformations standardization. Mol Inform 40(12):2100119. 10.1002/minf.202100119
  • 78. Miranda-Salas J, Peña-Varas C, Martínez IV, Olmedo DA, Zamora WJ, Chávez-Fumagalli MA, Medina-Franco JL (2023) Trends and challenges in chemoinformatics research in Latin America. Artif Intell Life Sci 3:100077. 10.1016/j.ailsci.2023.100077
  • 79. Banegas-Luna AJ, Ceron-Carrasco JP, Puertas-Martin S, Perez-Sanchez H (2019) BRUSELAS: HPC generic and customizable software architecture for 3D ligand-based virtual screening of large molecular databases. J Chem Inf Model 59(6):2805–2817. 10.1021/acs.jcim.9b00279
  • 80. Karthikeyan M, Vyas R (2014) Cloud computing infrastructure development for chemoinformatics. Pract Chemoinform. 10.1007/978-81-322-1780-0_10
  • 81. Alsenan SA, Al-Turaiki I, Hafez A (2020) Chemoinformatics for data scientists: an overview. In: Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services, pp 456–461. 10.1145/3428757.3429147
  • 82. Pernaa J (2022) Possibilities and challenges of using educational cheminformatics for STEM education: A SWOT analysis of a molecular visualization engineering project. J Chem Educ 99(3):1190–1200. 10.1021/acs.jchemed.1c00683
  • 83. Waseem T, Babar MM, Abdi G, Rajadas J (2024) Use of bioinformatics in high-throughput drug screening. In: Singh V, Kumar A (eds) Advances in bioinformatics. Springer, Singapore, pp 249–260
  • 84. Rodríguez-Pérez R, Miljković F, Bajorath J (2022) Machine learning in chemoinformatics and medicinal chemistry. Ann Rev Biomed Data Sci 5(1):43–65. 10.1146/annurev-biodatasci-122120-124216
  • 85. Bajorath J, Chávez-Hernández AL, Duran-Frigola M, Fernández-de Gortari E, Gasteiger J, López-López E, ... Valli M (2022) Chemoinformatics and artificial intelligence colloquium: progress and challenges to develop bioactive compounds. 10.26434/chemrxiv-2022-nr0dm-v2
  • 86. Niazi SK, Mariam Z (2023) Computer-aided drug design and drug discovery: a prospective analysis. Pharmaceuticals 17(1):22. 10.3390/ph17010022
  • 87. Dong J, Yao ZJ, Zhu MF, Wang NN, Lu B, Chen AF, Cao DS (2017) ChemSAR: an online pipelining platform for molecular SAR modeling. J Cheminform 9:1–13. 10.1186/s13321-017-0215-1
  • 88. Colby SM, Nuñez JR, Hodas NO, Corley CD, Renslow RR (2019) Deep learning to generate in silico chemical property libraries and candidate molecules for small molecule identification in complex samples. Anal Chem 92(2):1720–1729. 10.1021/acs.analchem.9b02348
  • 89. Lavecchia A (2024) Navigating the frontier of drug-like chemical space with cutting-edge generative AI models. Drug Discov Today. 10.1016/j.drudis.2024.104133
  • 90. Micheli A, Podda M (2022) Deep learning in cheminformatics. In: Bacciu D, Paulo J, Lisboa G, Vellido A (eds) Deep learning in biology and medicine. World Scientific, Singapore, pp 157–195
  • 91. Chen J, Swamidass SJ, Dou Y, Bruand J, Baldi P (2005) ChemDB: a public database of small molecules and related chemoinformatics resources. Bioinformatics 21(22):4133–4139. 10.1093/bioinformatics/bti683
  • 92. Ihlenfeldt WD, Bolton EE, Bryant SH (2009) The PubChem chemical structure sketcher. J Cheminform 1:1–9. 10.1186/1758-2946-1-20
  • 93. Willighagen EL, Waagmeester A, Spjuth O, Ansell P, Williams AJ, Tkachenko V, Wild DJ (2013) The ChEMBL database as linked open data. J Cheminform 5:1–12. 10.1186/1758-2946-5-23
  • 94. Jónsdóttir SÓ, Jørgensen FS, Brunak S (2005) Prediction methods and databases within chemoinformatics: emphasis on drugs and drug candidates. Bioinformatics 21(10):2145–2160. 10.1093/bioinformatics/bti314
  • 95. Guha R, Gilbert K, Fox G, Pierce M, Wild D, Yuan H (2010) Advances in cheminformatics methodologies and infrastructure to support the data mining of large, heterogeneous chemical datasets. Curr Comput Aided Drug Des 6(1):50–67. 10.2174/157340910790980115
  • 96. Prieto-Martínez FD, Peña-Castillo A, Méndez-Lucio O, Fernández-de Gortari E, Medina-Franco JL (2016) Molecular modeling and chemoinformatics to advance the development of modulators of epigenetic targets: a focus on DNA methyltransferases. Adv Protein Chem Struct Biol 105:1–26. 10.1016/bs.apcsb.2016.05.001
  • 97. Jaeger-Honz S, Klein K, Schreiber F (2024) Systematic analysis, aggregation and visualisation of interaction fingerprints for molecular dynamics simulation data. J Cheminform 16(1):28. 10.1186/s13321-024-00822-3
  • 98. Tetko IV, Engkvist O, Koch U, Reymond JL, Chen H (2016) BIGCHEM: challenges and opportunities for big data analysis in chemistry. Mol Inf 35(11–12):615–621. 10.1002/minf.201600073
  • 99. Rodrigues JF Jr, Florea L, De Oliveira MC, Diamond D, Oliveira ON Jr (2019) A survey on big data and machine learning for chemistry. arXiv preprint. 10.48550/arXiv.1904.10370
  • 100. Satoh H, Steiner VM, Hutter J (2024) “Quantum-chemoinformatics” for design and discovery of new molecules and reactions. Springer, Berlin
  • 101. Bräse S (2024) Digital chemistry: navigating the confluence of computation and experimentation-definition, status quo, and future perspective. Digit Discov. 10.26434/chemrxiv-2024-249fr

Articles from Journal of Cheminformatics are provided here courtesy of BMC
