Abstract

We introduce a new workflow that relies heavily on chemical quantitative structure-retention relationship (QSRR) models to accelerate method development for micro/mini-scale high-throughput purification (HTP). This provides faster access to new active pharmaceutical ingredients (APIs) through high-throughput experimentation (HTE). By comparing fingerprint structural similarity (e.g., Tanimoto index) with small training data sets containing a few hundred diverse small molecule antagonists of a lipid metabolizing enzyme, we can predict retention time (RT) of new compounds. Machine learning (ML) helps to identify optimal separation conditions for purification without performing the traditional crude QC step involving ultrahigh performance liquid chromatography (UHPLC) analyses of each compound. This green-chemistry approach with the use of predictive tools reduces cost and significantly shortens the design-make-test (DMT) cycle of new drugs by way of HTE.
Keywords: Chromatography, High-throughput experimentation, Machine learning, Microgram scale purification, Tanimoto similarity, Parallel medicinal chemistry, Small molecule drug discovery
Innovative methods for the accelerated production of new synthetic molecules enable pharmaceutical companies to drive the next generation of medicines.1 The advent of high-throughput experimentation (HTE) approaches has unlocked new routes to chemical matter and is changing the practice of drug discovery research.2−4 HTE at the micro/mini-scale decreases the use of synthesis materials, accelerates reaction optimization, and allows for novel chemistry strategies.5,6 While “direct-to-biology” (D2B) approaches can facilitate the generation of quantitative structure–activity relationships (SAR) even in the presence of impurities,7 chromatography still plays a critical role in the design-make-test (DMT) cycle.
The traditional processes in centralized high-throughput purification (HTP) laboratories,8−10 however, have often been seen as a bottleneck for the rapid production of novel compounds and are not amenable to HTE. Recently, high-pressure micro/mini-scale high-performance liquid chromatography (HPLC)11,12 and supercritical fluid chromatography (SFC)13 systems have been specifically developed and configured to support the purification stage of HTE processes. Significant progress has also been made in multidimensional liquid chromatography (LC) systems14,15 that expand the available separation space for HTE analysis and purification. The development of integrated platforms16,17 for automating DMT steps can also increase process efficiencies.
Traditionally, a crude QC check is performed on an ultrahigh performance liquid chromatography (UHPLC) instrument to scout for suitable chromatography conditions during the method development stage of the HTE purification workflow.18 With a typical 3.5 min injection-to-injection cycle, a crude QC check of a 384-compound library takes nearly 24 h to complete (384 × 3.5 min ≈ 22.4 h). Eliminating this step will significantly shorten the HTE DMT cycle time while reducing expenditure of analytical resources and solvents. Efficient ways to employ automation to optimize the HPLC separation have been reported for retention modeling,19,20 impurity determination,21 and selectivity optimization using large chromatographic data sets.22
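The time estimate above follows from simple arithmetic. The short sketch below reproduces it from the two figures quoted in the text; it assumes strictly serial injections on a single instrument with no downtime between runs.

```python
# Time for a serial crude QC pass, using the figures quoted in the text.
CYCLE_MIN = 3.5      # injection-to-injection cycle time, minutes
LIBRARY_SIZE = 384   # compounds in the library

total_min = CYCLE_MIN * LIBRARY_SIZE  # total instrument time in minutes
total_h = total_min / 60              # same span in hours

print(f"{total_min:.0f} min = {total_h:.1f} h")
```

Under these assumptions the crude QC pass occupies the instrument for roughly 22 h, which is the span that retention prediction aims to remove from the cycle.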
Machine learning (ML) can be used to enhance many aspects of HTE processes from route planning and experimental design to data processing and review. In recent years the use of computer-assisted tools that employ ML has emerged as a practical means to predict analyte retention for a range of LC techniques such as reversed-phase liquid chromatography (RPLC), hydrophilic interaction liquid chromatography (HILIC), ion chromatography (IC), and size exclusion chromatography (SEC).23−25 Different data-driven approaches have been reported with the use of commercial software26−29 such as ChromSword, DryLab, and LC Simulator as a substitute for traditional HPLC method development, which can often be laborious and time-consuming. Chemical similarity searching and quantitative structure-retention relationship (QSRR) modeling can be used to establish relationships between structural features and analyte retention, thereby giving retention time (RT) models based on compound structure.30−32 The Tanimoto coefficient T is one of the most prevalent similarity measures for comparing chemical structures.33 It operates on fingerprints, Boolean bit strings in which each bit corresponds to the presence or absence of a particular structural feature. A Tanimoto similarity score of 0 indicates no chemical similarity while a score of 1 indicates perfect structural similarity. Recent work reports on its use to build analyte-specific models capable of predicting RT under different chromatographic conditions.34,35
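For two fingerprints with a and b bits set, of which c are set in both, the Tanimoto coefficient is T = c / (a + b − c). The sketch below is a minimal illustration of that definition, with fingerprints represented as sets of on-bit indices. The example bit indices are invented, and returning 1.0 for two empty fingerprints is a convention chosen here rather than something taken from the study.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # assumed convention for two empty fingerprints
    shared = len(a & b)                         # bits set in both fingerprints (c)
    return shared / (len(a) + len(b) - shared)  # c / (a + b - c)


# Invented example fingerprints (bit positions carry no chemical meaning here).
fp_x = {1, 4, 7, 9, 12}
fp_y = {1, 4, 9, 15}

print(tanimoto(fp_x, fp_y))  # 3 shared bits / 6 distinct bits = 0.5
```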
Our workflow leverages ACD/ChromGenius36,37 as part of the AutoChrom (ACD/Labs, Toronto, ON, Canada) software tool set for establishing databases of molecular structures and employing chemical QSRR modeling and chromatographic simulation. This results in the selection of robust chromatographic parameters and method conditions for downstream HTE processes such as micro/mini-scale purification. To the best of our knowledge, this is the first report focusing on the integration of ML tools with HTE chromatography processes, which is clearly advantageous and allows for faster access to new active pharmaceutical ingredients (APIs).38 In this paper we focus on utilizing ML techniques in HTE to predict RT and chromatography conditions for a distinct structural class of small molecule antagonists of a lipid metabolizing enzyme for rapid screening and hit finding.
In order to more broadly explore the SAR for this class of compounds, we envisioned the execution of a diverse library with an amide coupling as the final step (Scheme 1) by taking advantage of HTE.
Scheme 1. Aryl Amide Coupling Library.

The substituted core 1 was coupled with aryl amine monomers, which were selected after applying filters for multiparameter optimization and structural diversity. Coupling reactions with compound 1 were carried out as mini-scale (26.4 μmol per reaction) parallel medicinal chemistry. After screening several amide coupling conditions with aryl amines, we used amine monomers at 1.28 equiv with 1.2 equiv of hexafluorophosphate azabenzotriazole tetramethyl uronium (HATU) as coupling reagent and 3 equiv of diisopropylethylamine (DIEA) as base in dimethylformamide (DMF). The reactions were agitated at room temperature for 18 h and the resulting mixtures were then concentrated under vacuum. The residues of 2 were reconstituted in 200 μL of dimethyl sulfoxide (DMSO), filtered through a 0.45 μm filter, and then submitted in a 96-well microplate for our micro/mini-scale HTP workflow (Figure 1).
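The reagent loadings implied by these equivalents can be worked out directly. The sketch below uses only the scale and equivalents quoted above; the final concentration is an upper bound that assumes quantitative conversion and complete recovery into the 200 μL of DMSO, which the text does not claim.

```python
SCALE_UMOL = 26.4       # core 1 per reaction (from the text)
DMSO_VOLUME_UL = 200.0  # reconstitution volume (from the text)

# Equivalents relative to core 1, as stated in the procedure.
equivalents = {"aryl amine": 1.28, "HATU": 1.2, "DIEA": 3.0}

# Micromoles of each reagent per well.
amounts_umol = {name: round(eq * SCALE_UMOL, 2) for name, eq in equivalents.items()}

# Upper-bound product concentration, assuming 100% conversion and recovery.
# umol / uL equals mol/L, so multiply by 1000 to get mM.
max_conc_mm = SCALE_UMOL / DMSO_VOLUME_UL * 1000.0

for name, umol in amounts_umol.items():
    print(f"{name}: {umol} umol")
print(f"max product concentration: {max_conc_mm:.0f} mM")
```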
Figure 1.
Method prediction and micro/mini-scale purification workflow that delivers final compound arrays for assay.
Automation of data submission and processing with fingerprint-based structural similarity calculations expressed as Tanimoto similarity index was performed using ACD/Spectrus DB and ACD/ChromGenius software, respectively. ACD/ChromGenius constructs models and predicts RT based on the structures of the compounds to be separated as well as using molecular descriptors such as log P, log D, polar surface area, molecular volume, molecular weight, molar refractivity, and the number of hydrogen bond donor and acceptor sites on the molecule (see Supporting Information Figure S-1). Similarity scores are calculated between the structures based on the Tanimoto similarity index. The calculated RTs allow for automatic assignment of the best focused separation method for each sample prior to purification, instead of a generic broad gradient. These software tools provide important predictive and data visualization capabilities for our HTE workflow.
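As a rough illustration of how a similarity-driven QSRR prediction can work, the sketch below selects the most similar training compounds by Tanimoto score and fits a one-descriptor least-squares line to their measured RTs. This is not the ACD/ChromGenius algorithm, whose internals are not described here. The function name `predict_rt`, the choice of log D as the only descriptor, the value of `k`, and all data values are assumptions made for the example.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    return shared / union if union else 1.0


def predict_rt(query_fp: set[int], query_logd: float,
               training: list[tuple[set[int], float, float]], k: int = 3) -> float:
    """Predict RT (min) from the k most similar training compounds.

    Each training row is (fingerprint, logD, measured RT in min).
    """
    # Keep the k nearest neighbours by structural similarity.
    neighbours = sorted(training, key=lambda row: tanimoto(query_fp, row[0]),
                        reverse=True)[:k]
    xs = [row[1] for row in neighbours]
    ys = [row[2] for row in neighbours]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y  # no descriptor spread: fall back to the neighbour mean
    # Ordinary least-squares slope of RT against logD within the neighbourhood.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y + slope * (query_logd - mean_x)


# Invented training rows: (fingerprint bits, logD, RT in min).
training_set = [
    ({1, 2, 3, 4}, 1.0, 0.70),
    ({1, 2, 3, 5}, 2.0, 0.90),
    ({1, 2, 6, 7}, 3.0, 1.10),
    ({20, 21, 22}, 5.0, 0.30),  # structurally unrelated, excluded when k=3
]

predicted = predict_rt({1, 2, 3, 8}, 2.5, training_set, k=3)
print(f"predicted RT: {predicted:.2f} min")
```

Restricting the fit to structurally similar neighbours is what makes such a model local: the unrelated fourth compound never influences the prediction.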
Our micro/mini-scale HTE chromatography method development strategy utilizes a common C18 column stationary phase between analysis and purification. The mobile phases are LC-MS grade water and acetonitrile (CH3CN) from Fisher Scientific (Fair Lawn, NJ, USA), containing pH modifiers 0.1% trifluoroacetic acid (C2HF3O2) or 0.1% ammonium hydroxide (NH4OH).
A custom configured 1290 Infinity II micro/mini-scale preparative HPLC-MS system (Agilent Technologies, Santa Clara, CA, USA) with 6135B mass detector (see Supporting Information Figure S-2) was used for compound isolation, with full control provided by OpenLab CDS ChemStation Edition software. A Gemini NX C18 (Phenomenex Corporation, Torrance, CA, USA) semipreparative column (150 mm × 10.0 mm i.d., 5 μm, 110 Å) was used at room temperature. Injection volume was 200.0 μL using a 2000 μL sample loop and the flow rate was set at 10 mL/min. The 7 min method gradients are determined based on predicted RTs, and fractions are collected by mass trigger based on the extracted ion chromatogram (XIC).
Aliquots from collected fractions in barcoded 1-dram vials are transferred to a 96-well plate using a MiniTasker robotic workstation by Sirius Automation (Buffalo Grove, IL, USA). Experimental RT data for these compounds were obtained on an Acquity I-Class UHPLC (Waters, Milford, MA, USA) analytical system with an SQ Detector 2 (SQD2) mass spectrometer, with full control provided by MassLynx/FractionLynx 4.1 software. A BEH C18 (Waters, Milford, MA, USA) analytical column (50 mm × 2.1 mm i.d., 1.7 μm, 130 Å) was used at 50 °C. Injection volume was 5.0 μL using a 10 μL sample loop and the flow rate was set at 0.3 mL/min. A standard 2 min QC gradient (10–90% acetonitrile/water) followed by a 1 min equilibration was run for all samples to obtain final fraction purity and RT values. UV detection was carried out at 215 nm.
Collected fractions were dried with a HT-12 Series III centrifugal evaporator by Genevac Inc. (Stone Ridge, NY, USA). Dried vials were weighed on a Multitasker robotic workstation by Sirius Automation (Buffalo Grove, IL, USA) with a 5-place analytical balance. Reconstitution of the purified compounds in DMSO to a standardized 10 mM or 2 mM concentration enabled further downstream testing. No unexpected or unusually high safety hazards were encountered in our experimental procedures.
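The DMSO volume needed to reach a standardized concentration follows from the recovered mass and molecular weight. The sketch below shows the unit conversion; the function name `dmso_volume_ul` and the example mass and molecular weight are illustrative assumptions, not values from the study.

```python
def dmso_volume_ul(mass_mg: float, mw_g_per_mol: float, target_mm: float) -> float:
    """Volume of DMSO (uL) that dissolves mass_mg of compound at target_mm (mM)."""
    micromoles = mass_mg / mw_g_per_mol * 1000.0  # mg / (g/mol) = mmol, then to umol
    return micromoles / target_mm * 1000.0        # umol / (umol/mL) = mL, then to uL


# Illustrative values only: 2.0 mg recovered, molecular weight 400 g/mol.
vol_10mm = dmso_volume_ul(2.0, 400.0, 10.0)
vol_2mm = dmso_volume_ul(2.0, 400.0, 2.0)

print(f"10 mM stock: {vol_10mm:.0f} uL, 2 mM stock: {vol_2mm:.0f} uL")
```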
The accuracy of the QSRR model is dependent on database composition,32,35 and the selection of appropriate structural analogs is important to create a representative training set for a particular project. For this study we created a diverse training set comprising 168 structurally distinct small molecule lipid metabolizing enzyme inhibitors to validate the prediction accuracy of ACD/ChromGenius software. All are proprietary compounds of Merck & Co., Inc., Rahway, NJ, USA (their structures are not presented). Compound physical-chemical property details, calculated with ACD/Percepta software, are provided in Table S-1 (Supporting Information). These compounds underwent our typical purification workflow with final data processing, analysis, and reporting performed on ACD/MSWorkBook suite software. The registration-ready structure data file (SDF) generated during the final analysis step contains information on the chemical structure and associated data including gradient method parameters, recovered weight, final purity and RT of target compounds in the 2 min aqueous UHPLC QC run. This SDF is then imported into ACD/ChromGenius for similarity calculations.
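To show how the data fields of such an SDF can be read outside the vendor software, the sketch below parses the `> <FIELD>` data items of each record with plain Python. It is a minimal reader that ignores the structure block. The field names `QC_RT_MIN` and `PURITY_PCT` and the sample records are hypothetical, and the molblocks are trimmed placeholders rather than valid connection tables.

```python
def read_sdf_fields(text: str) -> list[dict[str, str]]:
    """Return one dict of data-item name -> value per SDF record."""
    records = []
    for block in text.split("$$$$"):          # "$$$$" terminates each record
        lines = block.strip("\n").splitlines()
        fields = {}
        i = 0
        while i < len(lines):
            line = lines[i]
            # A data header looks like "> <FIELD_NAME>".
            if line.startswith(">") and "<" in line and ">" in line[1:]:
                name = line[line.index("<") + 1:line.rindex(">")]
                values = []
                i += 1
                # Value lines run until the blank line that ends the item.
                while i < len(lines) and lines[i].strip():
                    values.append(lines[i].strip())
                    i += 1
                fields[name] = "\n".join(values)
            i += 1
        if fields:
            records.append(fields)
    return records


# Hypothetical two-record file; molblocks are trimmed placeholders.
SAMPLE_SDF = """\
compound-A
  placeholder molblock (trimmed for brevity)
M  END
> <QC_RT_MIN>
0.92

> <PURITY_PCT>
98.5

$$$$
compound-B
  placeholder molblock (trimmed for brevity)
M  END
> <QC_RT_MIN>
1.31

> <PURITY_PCT>
96.0

$$$$
"""

records = read_sdf_fields(SAMPLE_SDF)
rts = [float(r["QC_RT_MIN"]) for r in records]
print(rts)
```

For real files a cheminformatics toolkit would normally be preferable, since it also parses and validates the structures themselves.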
A histogram showing the distribution of pairwise structural similarities and a Tanimoto index heat map are displayed in Figure 2. From the full training data set of 168 compounds, nearly 70% of pairs had similarities less than 0.7 and 0.1% of pairs had similarities below 0.4, indicating a structurally diverse training data set (Figure 2A). As seen from the heat map, regions of high similarity do exist (Figure 2B). A small percentage of pairs (approximately 1%) had similarities above 0.9, indicating a high degree of structural similarity. The correlation of experimentally measured and calculated training set RTs is shown in Figure S-3, Supporting Information.
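Pairwise statistics of this kind can be computed by enumerating every unordered pair of fingerprints. The sketch below is a hedged illustration on invented toy fingerprints; the thresholds mirror those quoted in the text, while the function name `pair_summary` and the data are assumptions. For 168 compounds the number of unique pairs is 168 × 167 / 2 = 14,028.

```python
from __future__ import annotations

from itertools import combinations
from math import comb


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    return shared / union if union else 1.0


def pair_summary(fingerprints: list[set[int]]) -> dict[str, float]:
    """Fractions of unique pairs below 0.4, below 0.7, and above 0.9 similarity."""
    sims = [tanimoto(a, b) for a, b in combinations(fingerprints, 2)]
    n = len(sims)
    return {
        "pairs": n,
        "below_0.4": sum(s < 0.4 for s in sims) / n,
        "below_0.7": sum(s < 0.7 for s in sims) / n,
        "above_0.9": sum(s > 0.9 for s in sims) / n,
    }


# Invented toy fingerprints; the real study used 168 proprietary structures.
toy = [{1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 5, 6}, {7, 8, 9}]
summary = pair_summary(toy)

print(summary)
print("unique pairs for 168 compounds:", comb(168, 2))
```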
Figure 2.
Similarity data for the QSRR training set. (A) Histogram showcasing the distribution of pairwise structural similarities expressed as the Tanimoto index. (B) Heat map visualization for all 168 compounds of the training set.
A 24-member aryl amide coupling compound library from the same project was then used as a test set to validate the model built using data in the 168-member training data set. Structural diversity is confirmed by a UMAP plot of our training and test sets (see Supporting Information Figure S-4). The software found the 50 best structural matches from the training data set for each new test structure based on the Tanimoto similarity score and was able to predict all 24 compound library RTs by using molecular descriptors. The difference between calculated RTs proposed by ACD/ChromGenius and the experimentally observed RTs is shown in Table 1. The RTs calculated by ACD/ChromGenius match well with the experimentally observed RTs from UHPLC, with an R² value of 0.73 (Figure 3). The RT calculations were within 0.2 min of measured experimental values for all but 3 compounds. In addition, 15 compounds exhibited RT discrepancies that were less than 0.10 min. The software-predicted elution RTs can be used to assign the purification method gradient (see Table S-2, Supporting Information), which matches nicely with the actual compound elution in the purification stage based on experimental RTs. More importantly, because gradients are assigned from a table of RT ranges, appropriate focused methods were assigned for all the compounds. Higher purity samples are therefore obtained when compared with using a single broad gradient method (i.e., 25–95% acetonitrile/water) for purification (see Supporting Information Figure S-5).
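Assigning a focused gradient from a predicted RT amounts to a range lookup. The sketch below illustrates the idea with a sorted list of RT boundaries and Python's `bisect`. The boundaries and gradient labels are hypothetical placeholders and do not reproduce the assignments in Table S-2.

```python
from bisect import bisect_right

# Hypothetical upper RT bounds (min) on the 2 min QC gradient; not from Table S-2.
RT_UPPER_BOUNDS = [0.6, 0.9, 1.2, 1.5]

# Hypothetical focused purification gradients, one per RT window.
FOCUSED_GRADIENTS = ["5-35% B", "20-50% B", "35-65% B", "50-80% B", "65-95% B"]


def assign_gradient(predicted_rt_min: float) -> str:
    """Map a predicted QC retention time to the focused gradient for its window."""
    # bisect_right returns the index of the first bound greater than the RT,
    # so each window is closed at its lower bound and open at its upper bound.
    return FOCUSED_GRADIENTS[bisect_right(RT_UPPER_BOUNDS, predicted_rt_min)]


for rt in (0.50, 1.00, 1.08, 1.80):
    print(f"RT {rt:.2f} min -> {assign_gradient(rt)}")
```

Because methods are chosen per window, a prediction error only changes the assignment when it moves the RT across a window boundary, which is why modest RT discrepancies can still yield appropriate focused methods.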
Table 1. QSRR Results for an Aryl Amide Coupling Test Library Purified by the Mini-Scale Purification Workflow.
Figure 3.
Calculated and experimental UHPLC RT correlation of the QSRR test set.
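Agreement between calculated and experimental RTs can be summarized by a squared correlation coefficient and by counting predictions within a tolerance. The sketch below assumes that the reported R² is the squared Pearson correlation, which the text does not specify, and uses invented RT values rather than the data of Table 1.

```python
from math import sqrt


def pearson_r2(calc: list[float], obs: list[float]) -> float:
    """Squared Pearson correlation between calculated and observed values."""
    n = len(calc)
    mean_c = sum(calc) / n
    mean_o = sum(obs) / n
    cov = sum((c - mean_c) * (o - mean_o) for c, o in zip(calc, obs))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    if var_c == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant data")
    r = cov / sqrt(var_c * var_o)
    return r * r


def count_within(calc: list[float], obs: list[float], tol_min: float) -> int:
    """Number of compounds whose absolute RT error is at most tol_min."""
    return sum(abs(c - o) <= tol_min for c, o in zip(calc, obs))


# Invented RTs in minutes, for illustration only.
calc_rt = [0.80, 1.00, 1.20, 1.40]
obs_rt = [0.85, 0.95, 1.45, 1.35]

print(f"R^2 = {pearson_r2(calc_rt, obs_rt):.2f}")
print("within 0.10 min:", count_within(calc_rt, obs_rt, 0.10))
```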
Although the number of compounds in training set databases can vary from tens to thousands, our results confirm the contemporary understanding32,34,35 that smaller training sets with sufficient structural similarity to the analytes of interest can generate accurate models. This demonstrates that RTs predicted by ML trained on a diverse training set can be used for successful method development in the purification of micro/mini-scale HTE samples. The next step is to build databases that include different column parameters, gradient times, temperatures, and pH conditions to make more precise predictions and method selections, as well as to test the accuracy of predictions by expanding to other structural modalities. Implementation of this workflow in more early stage projects will accelerate the DMT cycle by reducing the amount of time and resources spent on purification method development.
We report a new HTE workflow based on ML QSRR models to accelerate purification method development. Predictive HPLC software such as ACD/ChromGenius can be especially helpful for early stage projects with limited information. In addition, poorly characterized chemicals can potentially be identified based on structural similarity to known molecules. Optimal separation conditions and method assignments for purification can be identified without running prior crude analyses. This drives efficiency by providing significant time savings and reduced expenditure of analytical resources and solvents, while decreasing waste generation. We have shown that a properly trained model can tolerate a wide range of functional groups and complex structures found in investigational pharmaceutical compounds and that the method is applicable in modern miniaturized parallel HTE.
Acknowledgments
The authors would like to acknowledge Shane Krska of Chemical Capabilities for Accelerating Therapeutics and Edward Sherer of Analytical Enabling Capabilities, Merck & Co., Inc., Rahway, NJ, USA for manuscript review and providing helpful comments. We would like to thank Mary McKee, Irina Oshchepkova, and Alexander Waked of Advanced Chemistry Development, Inc. (ACD/Labs) for discussions and application support.
Glossary
ABBREVIATIONS
- HTE: high-throughput experimentation
- HTP: high-throughput purification
- D2B: direct-to-biology
- QSRR: quantitative structure-retention relationships
- ML: machine learning
- RT: retention time
- HPLC: high-performance liquid chromatography
- MS: mass spectrometry
- UHPLC: ultrahigh performance liquid chromatography
- UMAP: uniform manifold approximation and projection
Supporting Information Available
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsmedchemlett.4c00145.
Software prediction methodology, purification instrument configuration, training set RT correlation, supplemental tables containing compound physical-chemical property parameters and method assignments, and chromatography spectra (PDF)
Author Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.
The authors declare no competing financial interest.
Special Issue
Published as part of ACS Medicinal Chemistry Letters virtual special issue “Exploring the Use of AI/ML Technologies in Medicinal Chemistry and Drug Discovery”.
References
- Campos K. R.; Coleman P. J.; Alvarez J. C.; Dreher S. D.; Garbaccio R. M.; Terrett N. K.; Tillyer R. D.; Truppo M. D.; Parmee E. R. The Importance of Synthetic Chemistry in the Pharmaceutical Industry. Science 2019, 363 (6424), eaat0805. 10.1126/science.aat0805.
- Buitrago Santanilla A.; Regalado E. L.; Pereira T.; Shevlin M.; Bateman K.; Campeau L.-C.; Schneeweis J.; Berritt S.; Shi Z.-C.; Nantermet P.; Liu Y.; Helmy R.; Welch C. J.; Vachal P.; Davies I. W.; Cernak T.; Dreher S. D. Nanomole-Scale High-Throughput Chemistry for the Synthesis of Complex Molecules. Science 2015, 347 (6217), 49–53. 10.1126/science.1259203.
- Cernak T.; Gesmundo N. J.; Dykstra K.; Yu Y.; Wu Z.; Shi Z. C.; Vachal P.; Sperbeck D.; He S.; Murphy B. A.; Sonatore L.; Williams S.; Madeira M.; Verras A.; Reiter M.; Lee C. H.; Cuff J.; Sherer E. C.; Kuethe J.; Goble S.; Perrotto N.; Pinto S.; Shen D. M.; Nargund R.; Balkovec J.; DeVita R. J.; Dreher S. D. Microscale High-Throughput Experimentation as an Enabling Technology in Drug Discovery: Application in the Discovery of (Piperidinyl)Pyridinyl-1H-Benzimidazole Diacylglycerol Acyltransferase 1 Inhibitors. J. Med. Chem. 2017, 60 (9), 3594–3605. 10.1021/acs.jmedchem.6b01543.
- Krska S. W.; DiRocco D. A.; Dreher S. D.; Shevlin M. The Evolution of Chemical High-Throughput Experimentation to Address Challenging Problems in Pharmaceutical Synthesis. Acc. Chem. Res. 2017, 50 (12), 2976–2985. 10.1021/acs.accounts.7b00428.
- Gesmundo N. J.; Tu N. P.; Sarris K. A.; Wang Y. ChemBeads-Enabled Photoredox High-Throughput Experimentation Platform to Improve C(Sp2)-C(Sp3) Decarboxylative Couplings. ACS Med. Chem. Lett. 2023, 14 (4), 521–529. 10.1021/acsmedchemlett.2c00538.
- Matthews A. D.; Peters E.; Debenham J. S.; Gao Q.; Nyamiaka M. D.; Pan J.; Zhang L.-K.; Dreher S. D.; Krska S. W.; Sigman M. S.; Uehling M. R. Cu Oxamate-Promoted Cross-Coupling of α-Branched Amines and Complex Aryl Halides: Investigating Ligand Function through Data Science. ACS Catal. 2023, 13 (24), 16195–16206. 10.1021/acscatal.3c04566.
- Hendrick C. E.; Jorgensen J. R.; Chaudhry C.; Strambeanu I. I.; Brazeau J. F.; Schiffer J.; Shi Z.; Venable J. D.; Wolkenberg S. E. Direct-to-Biology Accelerates PROTAC Synthesis and the Evaluation of Linker Effects on Permeability and Degradation. ACS Med. Chem. Lett. 2022, 13 (7), 1182–1190. 10.1021/acsmedchemlett.2c00124.
- Liu M.; Chen K.; Christian D.; Fatima T.; Pissarnitski N.; Streckfuss E.; Zhang C.; Xia L.; Borges S.; Shi Z.; Vachal P.; Tata J.; Athanasopoulos J. High-Throughput Purification Platform in Support of Drug Discovery. ACS Comb. Sci. 2012, 14 (1), 51–59. 10.1021/co200138h.
- Weller H. N.; Nirschl D. S.; Paulson J. L.; Hoffman S. L.; Bullock W. H. Addressing the Medicinal Chemistry Bottleneck: A Lean Approach to Centralized Purification. ACS Comb. Sci. 2012, 14 (9), 520–526. 10.1021/co300075g.
- Bryan M. C.; Dillon B.; Hamann L. G.; Hughes G. J.; Kopach M. E.; Peterson E. A.; Pourashraf M.; Raheem I.; Richardson P.; Richter D.; Sneddon H. F. Sustainable Practices in Medicinal Chemistry: Current State and Future Directions. J. Med. Chem. 2013, 56 (15), 6007–6021. 10.1021/jm400250p.
- Hettiarachchi K.; Hayes M.; Desai A. J.; Wang J.; Ren Z.; Greshock T. J. Subminute Micro-Isolation of Pharmaceuticals with Ultra-High Pressure Liquid Chromatography. J. Pharm. Biomed. Anal. 2019, 176, 112794. 10.1016/j.jpba.2019.112794.
- Barhate C. L.; Donnell A. F.; Davies M.; Li L.; Zhang Y.; Yang F.; Black R.; Zipp G.; Zhang Y.; Cavallaro C. L.; Priestley E. S.; Weller H. N. Microscale Purification in Support of High-Throughput Medicinal Chemistry. Chem. Commun. 2021, 57 (84), 11037–11040. 10.1039/D1CC03791A.
- Hayes M.; Hettiarachchi K.; Lang S.; Wang J.; Greshock T. J. Ultra-Fast Microscale Purification of Chiral Racemates and Achiral Pharmaceuticals with Analytical Supercritical Fluid Chromatography. J. Chromatogr. A 2022, 1665, 462829. 10.1016/j.chroma.2022.462829.
- Goyon A.; Masui C.; Sirois L. E.; Han C.; Yehl P.; Gosselin F.; Zhang K. Achiral-Chiral Two-Dimensional Liquid Chromatography Platform to Support Automated High-Throughput Experimentation in the Field of Drug Development. Anal. Chem. 2020, 92 (22), 15187–15193. 10.1021/acs.analchem.0c03754.
- Hettiarachchi K.; Streckfuss E.; Sanzone J. R.; Wang J.; Hayes M.; Kong M.; Greshock T. J. Microscale Purification with Direct Charged Aerosol Detector Quantitation Using Selective Online One- or Two-Dimensional Liquid Chromatography. Anal. Chem. 2022, 94 (23), 8309–8316. 10.1021/acs.analchem.2c00750.
- Baranczak A.; Tu N. P.; Marjanovic J.; Searle P. A.; Vasudevan A.; Djuric S. W. Integrated Platform for Expedited Synthesis-Purification-Testing of Small Molecule Libraries. ACS Med. Chem. Lett. 2017, 8 (4), 461–465. 10.1021/acsmedchemlett.7b00054.
- Ginsburg-Moraff C.; Grob J.; Chin K.; Eastman G.; Wildhaber S.; Bayliss M.; Mues H. M.; Palmieri M.; Poirier J.; Reck M.; Luneau A.; Rodde S.; Reilly J.; Wagner T.; Brocklehurst C. E.; Wyler R.; Dunstan D.; Marziale A. N. Integrated and Automated High-Throughput Purification of Libraries on Microscale. SLAS Technol. 2022, 27 (6), 350–360. 10.1016/j.slast.2022.08.002.
- Dykstra K. D.; Streckfuss E.; Liu M.; Liu J.; Yu Y.; Wang M.; Kozlowski J. A.; Myers R. W.; Buevich A. V.; Maletic M. M.; Vachal P.; Krska S. W. Synthesis of HDAC Inhibitor Libraries via Microscale Workflow. ACS Med. Chem. Lett. 2021, 12 (3), 337–342. 10.1021/acsmedchemlett.0c00596.
- Tyteca E.; Veuthey J. L.; Desmet G.; Guillarme D.; Fekete S. Computer Assisted Liquid Chromatographic Method Development for the Separation of Therapeutic Proteins. Analyst 2016, 141 (19), 5488–5501. 10.1039/C6AN01520D.
- Haddad P. R.; Taraji M.; Szücs R. Prediction of Analyte Retention Time in Liquid Chromatography. Anal. Chem. 2021, 93 (1), 228–256. 10.1021/acs.analchem.0c04190.
- Stafford J. D.; Maloney T. D.; Myers D. P.; Cintron J. M.; Castle B. C. A Systematic Approach to Development of Liquid Chromatographic Impurity Methods for Pharmaceutical Analysis. J. Pharm. Biomed. Anal. 2011, 56 (2), 280–292. 10.1016/j.jpba.2011.05.028.
- Sheridan R.; Schafer W.; Piras P.; Zawatzky K.; Sherer E. C.; Roussel C.; Welch C. J. Toward Structure-Based Predictive Tools for the Selection of Chiral Stationary Phases for the Chromatographic Separation of Enantiomers. J. Chromatogr. A 2016, 1467, 206–213. 10.1016/j.chroma.2016.05.066.
- Tyteca E.; Périat A.; Rudaz S.; Desmet G.; Guillarme D. Retention Modeling and Method Development in Hydrophilic Interaction Chromatography. J. Chromatogr. A 2014, 1337, 116–127. 10.1016/j.chroma.2014.02.032.
- Park S. H.; Talebi M.; Amos R. I. J.; Tyteca E.; Haddad P. R.; Szucs R.; Pohl C. A.; Dolan J. W. Towards a Chromatographic Similarity Index to Establish Localised Quantitative Structure-Retention Relationships for Retention Prediction. II Use of Tanimoto Similarity Index in Ion Chromatography. J. Chromatogr. A 2017, 1523, 173–182. 10.1016/j.chroma.2017.02.054.
- Wen Y.; Talebi M.; Amos R. I. J.; Szucs R.; Dolan J. W.; Pohl C. A.; Haddad P. R. Retention Prediction in Reversed Phase High Performance Liquid Chromatography Using Quantitative Structure-Retention Relationships Applied to the Hydrophobic Subtraction Model. J. Chromatogr. A 2018, 1541, 1–11. 10.1016/j.chroma.2018.01.053.
- Krisko R. M.; McLaughlin K.; Koenigbauer M. J.; Lunte C. E. Application of a Column Selection System and DryLab Software for High-Performance Liquid Chromatography Method Development. J. Chromatogr. A 2006, 1122 (1–2), 186–193. 10.1016/j.chroma.2006.04.065.
- Xiao K. P.; Xiong Y.; Liu F. Z.; Rustum A. M. Efficient Method Development Strategy for Challenging Separation of Pharmaceutical Molecules Using Advanced Chromatographic Technologies. J. Chromatogr. A 2007, 1163 (1–2), 145–156. 10.1016/j.chroma.2007.06.027.
- Wang L.; Zheng J.; Gong X.; Hartman R.; Antonucci V. Efficient HPLC Method Development Using Structure-Based Database Search, Physico-Chemical Prediction and Chromatographic Simulation. J. Pharm. Biomed. Anal. 2015, 104, 49–54. 10.1016/j.jpba.2014.10.032.
- Haidar Ahmad I. A.; Chen W.; Halsey H. M.; Klapars A.; Limanto J.; Pirrone G. F.; Nowak T.; Bennett R.; Hartman R.; Makarov A. A.; Mangion I.; Regalado E. L. Multi-Column Ultra-High Performance Liquid Chromatography Screening with Chaotropic Agents and Computer-Assisted Separation Modeling Enables Process Development of New Drug Substances. Analyst 2019, 144 (9), 2872–2880. 10.1039/C8AN02499E.
- Maggiora G.; Vogt M.; Stumpfe D.; Bajorath J. Molecular Similarity in Medicinal Chemistry. J. Med. Chem. 2014, 57 (8), 3186–3204. 10.1021/jm401411z.
- Kensert A.; Bouwmeester R.; Efthymiadis K.; Van Broeck P.; Desmet G.; Cabooter D. Graph Convolutional Networks for Improved Prediction and Interpretability of Chromatographic Retention Data. Anal. Chem. 2021, 93 (47), 15633–15641. 10.1021/acs.analchem.1c02988.
- Fine J.; Mann A. K. P.; Aggarwal P. Structure Based Machine Learning Prediction of Retention Times for LC Method Development of Pharmaceuticals. Pharm. Res. 2024, 41 (2), 365–374. 10.1007/s11095-023-03646-2.
- Bajusz D.; Rácz A.; Héberger K. Why Is Tanimoto Index an Appropriate Choice for Fingerprint-Based Similarity Calculations? J. Cheminform. 2015, 7 (1), 1–13. 10.1186/s13321-015-0069-3.
- Sagandykova G.; Buszewski B. Perspectives and Recent Advances in Quantitative Structure-Retention Relationships for High Performance Liquid Chromatography. How Far Are We? TrAC - Trends Anal. Chem. 2021, 141, 116294. 10.1016/j.trac.2021.116294.
- Szucs R.; Brown R.; Brunelli C.; Hradski J.; Masár M. Impact of Structural Similarity on the Accuracy of Retention Time Prediction. J. Chromatogr. A 2023, 1707, 464317. 10.1016/j.chroma.2023.464317.
- Tyrkkö E.; Pelander A.; Ojanperä I. Prediction of Liquid Chromatographic Retention for Differentiation of Structural Isomers. Anal. Chim. Acta 2012, 720, 142–148. 10.1016/j.aca.2012.01.024.
- Dossin E.; Martin E.; Diana P.; Castellon A.; Monge A.; Pospisil P.; Bentley M.; Guy P. A. Prediction Models of Retention Indices for Increased Confidence in Structural Elucidation during Complex Matrix Analysis: Application to Gas Chromatography Coupled with High-Resolution Mass Spectrometry. Anal. Chem. 2016, 88 (15), 7539–7547. 10.1021/acs.analchem.6b00868.
- Eyke N. S.; Koscher B. A.; Jensen K. F. Toward Machine Learning-Enhanced High-Throughput Experimentation. Trends Chem. 2021, 3 (2), 120–132. 10.1016/j.trechm.2020.12.001.