Author manuscript; available in PMC: 2026 Feb 12.
Published in final edited form as: ACS Nano. 2025 Sep 11;19(37):33288–33296. doi: 10.1021/acsnano.5c09066

TuNa-AI: A Hybrid Kernel Machine To Design Tunable Nanoparticles for Drug Delivery

Zilu Zhang 1, Yan Xiang 1, Joe Laforet Jr 1, Ivan Spasojevic 2,3, Ping Fan 3, Ava Heffernan 4, Christine E Eyler 4, Kris C Wood 5, Zachary C Hartman 6, Daniel Reker 1,7,*
PMCID: PMC12893383  NIHMSID: NIHMS2112154  PMID: 40934473

Abstract

Artificial intelligence (AI) has the potential to transform nanoparticle development for drug delivery; however, existing strategies typically optimize either material selection or component ratios in isolation. To enable simultaneous optimization of both, we integrated an automated liquid handling platform with machine learning to systematically explore the nanoparticle formulation space. A data set comprising 1275 distinct formulations (spanning drug molecules, excipients, and synthesis molar ratios) was generated, resulting in a 42.9% increase in successful nanoparticle formation through composition optimization. We developed a bespoke hybrid kernel machine that couples molecular feature learning with relative compositional inference, enhancing the modeling of formulation outcomes across chemical spaces. This hybrid kernel significantly improved prediction performance across three kernel-based algorithms, with a support vector machine (SVM) achieving superior performance when using our kernel compared to standard kernels and outperforming all other machine learning architectures, including transformer-based deep neural networks. Using SVM-guided predictions, we successfully formulated the difficult-to-encapsulate venetoclax with optimized taurocholic acid ratios, yielding enhanced in vitro efficacy against Kasumi-1 leukemia cells. In a second case study, our AI-guided platform reduced excipient usage by 75% in a trametinib formulation while preserving the in vitro efficacy and in vivo pharmacokinetics relative to the standard formulation. Taken together, this study establishes a generalizable framework that combines robotic experimentation, kernel machine learning, and experimental validation to accelerate nanoparticle composition optimization for drug delivery.

Keywords: nanoparticle, drug delivery, machine learning, lab automation, tunable composition


INTRODUCTION

In recent years, AI has reshaped the landscape of drug development, leading to the emergence of several AI-developed drug candidates now advancing through clinical trials.1 Significant attention has been given to early drug discovery due to large data availability, while AI for later stages of drug development focusing on safety and delivery still provides an opportunity for further growth.2 Due to limited data accessibility and the complexity of physiological environments and formulations that can impact a drug’s safety and delivery, this high-risk, high-reward endeavor poses a formidable challenge for currently available computational models.

Nevertheless, a few recent pioneering studies have validated the potential for AI to be utilized for large-scale nanoparticle data curation and analysis,3 discovery of novel nanostructures,4, 5 property-oriented formulation optimization,6–8 and design of specialized delivery vehicles.9–12 Notwithstanding these significant advancements, the application of AI to nanoparticle drug delivery is still rare and encounters distinct obstacles. Formulation quality commonly depends not only on selecting suitable excipients but also on the specific amounts of excipients.13 This is a common blind spot of currently available computational nanoparticle design pipelines that focus either on material selection or ratio optimization but not both.4, 5, 12 For instance, several machine learning models have been developed to accelerate material selection for lipid nanoparticles (LNPs), but they commonly learn from large data sets with fixed material ratios, which prevents them from learning the effects of different material ratios to further optimize nanoformulations.9, 11, 12 Similarly, we and others have recently developed platforms for the creation of drug-excipient nanoparticles that have demonstrated the potential of machine learning models to rapidly identify suitable small molecular excipients for drug encapsulation.4, 5 These novel materials stand out thanks to facile synthesis without need for complicated purification and ultrahigh drug loading performance often exceeding 90% compared to the typical <10% drug loading found in most established nanoformulation systems.4 However, similar to progress in related materials, all of the currently existing models that predict drug-excipient nanoparticle formation lack the functionality to optimize relative amounts of drugs and excipients. This limitation can prevent the discovery of nanoformulations for certain drugs that require specific amounts of excipient to achieve stability. It also prevents models from supporting material optimization, such as adjusting excipient amounts.

There is a pressing need for innovative machine learning architectures that can incorporate compositional and formulation-specific information into computer-aided formulation design pipelines. Off-the-shelf machine learning models are now commonly employed in nanoparticle design studies but often fall short in their inherent architectural flexibility to handle multimodal data structures unique to materials science. For example, while deep learning models excel at identifying fine-grained structural patterns and making accurate predictions, they often require massive data sets for effective training, which are not commonly available in drug delivery and materials science. On the other hand, traditional machine learning models such as random forests perform well with small- to medium-sized structured data sets but struggle to differentiate between minor feature variations and the integration of multimodal data types, which are commonly expected for decision making in formulation design and optimization.

Here, to overcome these challenges, we prototype a tunable nanoparticle platform guided by AI (TuNa-AI) that integrates molecular learning with the identification of patterns in relative material amounts. To achieve this, we created a bespoke kernel machine that quantifies relationships between formulations based on the molecular structures of the drugs and excipients as well as their relative amounts used in synthesis. We show that this bespoke model architecture allows for the discovery of novel nanoparticles capable of formulating previously difficult-to-encapsulate drugs. Additionally, we showcase how this platform can be used for further formulation optimization by reducing excipient amounts without compromising the in vitro efficacy and in vivo pharmacokinetic properties.

RESULTS AND DISCUSSION

High-Throughput Synthesis of Tunable Nanoparticle Formulations.

To assess the tunability of drug-excipient nanoparticles and to create a training database for machine learning model development, we first designed and conducted a screen of various drug-excipient pairs to assess whether they could successfully form nanoparticles. A set of 17 drugs and 15 excipients was chosen based on the molecules’ solubility, commercial availability, and clinical relevance (Table S1, Table S2). Testing each drug-excipient pair under five different synthesis conditions yielded a data set of 1275 entries, generated efficiently through our robotic lab automation (Figure 1a,b). Of the 255 molecular pairs tested, 28 formed stable nanoparticles (Figure 1c) using the previously established equimolar synthesis route, resulting in a success rate of approximately 11%, consistent with earlier reported success rates in drug-excipient nanoparticle screening. However, when testing pairs at other synthesis conditions that include more or less of the stabilizing excipient, the number of viable combinations increased to 40, marking a 42.9% improvement (Figure 2a). This indicated that a substantial number of new drug-excipient nanoparticles can be discovered by altering the excipient amounts. For example, the tunable synthesis enabled the encapsulation of erlotinib, gefitinib, selumetinib, and ceritinib. These drugs were not successfully encapsulated with any of our excipients using the standard 1:1 synthesis protocol and, to the best of our knowledge, currently lack injectable formulations due to their poor solubility.
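The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the text; the variable names are illustrative.

```python
# Counts reported in the text for the high-throughput screen.
n_drugs, n_excipients, n_ratios = 17, 15, 5

n_pairs = n_drugs * n_excipients        # unique drug-excipient pairs
n_formulations = n_pairs * n_ratios     # one entry per pair and molar ratio

hits_equimolar = 28   # pairs forming nanoparticles at the 1:1 ratio
hits_tunable = 40     # pairs forming nanoparticles at any tested ratio

equimolar_rate = hits_equimolar / n_pairs
relative_gain = (hits_tunable - hits_equimolar) / hits_equimolar

print(n_pairs, n_formulations)      # 255 1275
print(f"{equimolar_rate:.1%}")      # 11.0%
print(f"{relative_gain:.1%}")       # 42.9%
```

The 42.9% figure is therefore a relative increase in the number of viable pairs (12 additional pairs over the 28 found at 1:1), not an absolute change in the hit rate.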

Figure 1. Overview of the TuNa-AI platform.

(a) Workflow of the nanoparticle synthesis, identification, prediction, and characterization pipeline. (b) Schematic of robotic nanoparticle synthesis. (c) Criteria that were applied here to determine successful nanoparticle formation. (d) Experimental techniques used here for nanoparticle characterization.

Figure 2. Screening results for tunable nanoparticle formation.

(a) Visualization of individual screening results. In each layer, the data matrix comprises all pairwise combinations of 17 drugs and 15 excipients. Each combination was tested at five different excipient-to-drug molar ratios (0.25, 0.5, 1, 2, and 4, from low to high). “No NP formation” indicates combinations that did not meet all criteria for defining a stable nanoformulation. Type I: Within the tested molar ratios, these combinations consistently formed nanoparticles at ratios inclusive of and symmetrical around the equimolar synthesis route (1:1), exhibiting no preferential tunability. Type II: These combinations failed to form nanoparticles at one or more ratios and displayed preferential tunability, favoring either higher or lower excipient usage for stable formation. Note that both type I and type II nanoparticles can be identified using the standard equimolar protocol (excipient/drug = 1:1). Type III: Novel combinations identified exclusively under nonequimolar conditions using the tunable synthesis protocol. (b) Breakdown of all successful nanoparticle formations and representative examples of novel combinations discovered using the tunable synthesis strategy.

Beyond enabling the encapsulation of a broader set of drugs, our screening revealed that different drug-excipient pairs exhibit distinct compositional preferences (Figure 2a). Type I pairs formed nanoparticles at equimolar ratios as well as at other relative material ratios, often with compatibility symmetrically distributed around the equimolar point. While some pairs tolerated a wide range of material ratios, others exhibited more restricted compatibility. For example, type II formulations followed asymmetric optimization paths, favoring either increased or reduced excipient usage for stable formation rather than equimolar ratios, thus displaying “preferential tunability” (Figure 2a). While both type I and type II systems could be identified using the standard equimolar synthesis route, type III combinations produced viable nanoformulations only through the optimized, nonequimolar protocol with tunable composition. This highlights the utility of adjusting the material ratios to discover previously inaccessible nanoparticles. Among these novel nanoparticles, some offered multiple viable excipient-to-drug ratios, whereas others required precise “sweet spots” for successful formation (Figure 2b).
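The three categories can be expressed as a small decision rule over the five screening outcomes of a pair. The sketch below is an illustration rather than the authors' code: the function name is hypothetical, and "symmetrical" is interpreted here as a hit pattern that mirrors around the 1:1 ratio, which is an assumption.

```python
RATIOS = (0.25, 0.5, 1, 2, 4)  # excipient-to-drug molar ratios from the screen


def classify_pair(formed):
    """Classify one drug-excipient pair from its five screening outcomes.

    `formed` holds one boolean per entry of RATIOS (True = stable nanoparticle).
    """
    formed = list(formed)
    if len(formed) != len(RATIOS):
        raise ValueError("expected one outcome per tested ratio")
    if not any(formed):
        return "No NP formation"
    if not formed[RATIOS.index(1)]:
        return "Type III"   # found only under nonequimolar conditions
    # Assumption: "symmetrical" means the hit pattern mirrors around 1:1.
    if formed == formed[::-1]:
        return "Type I"
    return "Type II"        # asymmetric, i.e. preferential tunability


print(classify_pair([False, True, True, True, False]))   # Type I
print(classify_pair([False, False, True, True, True]))   # Type II
print(classify_pair([False, False, False, True, True]))  # Type III
```

Under this rule, only type III pairs are missed entirely by an equimolar-only screen, which matches the distinction drawn in the text.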

Hybrid Kernel Design and Machine Learning Evaluation.

We next trained machine learning models on these data to learn these design patterns and apply them to new formulation discovery. To leverage both high-dimensional structural descriptors and the effective incorporation of compositional information, we developed a bespoke hybrid kernel machine (Figure 3a). Different types of molecular features were extracted for both the drugs and excipients to capture multiple aspects of each compound. These include topological structural information and physicochemical properties, which are expected to influence intermolecular interactions and, consequently, the likelihood of nanoparticle formation. Subsequently, topological and property-based descriptors were transformed by using separate, distinct kernel functions to account for their differing value distributions. Binary extended-connectivity fingerprints (ECFPs), which capture circular structural details, were processed using a Tanimoto kernel, while continuous physicochemical properties were modeled with a radial basis function (RBF) kernel to account for complex nonlinear relationships. Additionally, a dedicated RBF kernel was employed to encode compositional information, enabling the comprehensive representation of both structural and compositional factors for each formulation. This integrated feature set allows the classification architectures to learn more effectively and led to nanoparticle formation estimation with higher resolution (Figure 3b). This model architecture was intentionally designed without training on nanoparticle properties like size or zeta potential, as these characteristics are measurable only after successful synthesis. Instead, our model enables high-throughput in silico screening of millions of potential material combinations.

Figure 3. The hybrid kernel machine design and retrospective evaluation results.

(a) Hybrid kernel machine architecture, featuring inputs from binary topological fingerprints of chemical structures, continuous physicochemical properties of drugs and excipients, and molarity ratios of drug and excipient during synthesis. RBF, radial basis function. (b) Transformed feature space representation through the kernel method. (c) Leave-one-pair-out cross-validation design to assess model generalizability for unseen chemical structures and combinations. (d) Evaluation of kernel-learning models using either their default kernel or our new hybrid kernel for each model. Models with default kernels (SVM and GP with RBF kernel; kNN with Minkowski distance) are shown in gray, while hybrid kernel models are highlighted in blue. ROC-AUC, area under the receiver operating characteristic curve. (e, f) ROC curves and AUC scores of all surveyed models. The best-performing support vector machine (SVM) model is shown in red, while all other models are shown in shades of gray. The dashed diagonal represents random guessing (ROC-AUC = 0.5), and the shaded areas indicate one standard deviation across five independent cross-validation results for each model. SVM, support vector machine; GP, Gaussian process; RF, random forest; MPNN, message-passing neural network; TNN, transformer neural network; kNN, k-nearest neighbors; MLP, multilayer perceptron. (g) Computational cost comparison (normalized to SVM CPU time, except the TNN and MPNN models which use GPU acceleration). Mann-Whitney test (α = 0.05): ****p < 0.0001.

To evaluate the utility of the hybrid kernel, we integrated it into three different kernel-based machine learning models: support vector machine (SVM), Gaussian process (GP), and k-nearest neighbors (kNN). We evaluated the performance of these models retrospectively using leave-one-out tests based on drugs (LODO), excipients (LOEO), or unique drug-excipient pairs (LOPO) to assess the ability of the models to make predictions for unseen molecules not part of the training data (Figure 3c). Performance was quantified using the area under the receiver operating characteristic curve (ROC-AUC) to evaluate the ability of the models to correctly prioritize successful nanoparticle formulations. In all three models, significant performance improvements were observed when using our hybrid kernel compared with the default kernels implemented in the machine learning library scikit-learn (Figure 3d).
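The three leave-one-out schemes differ only in how formulations are grouped before splitting. The following is a minimal, dependency-free sketch under stated assumptions: the helper names and toy records are hypothetical, and ROC-AUC is computed through its rank interpretation (the probability that a successful formulation is scored above an unsuccessful one). How scores were aggregated across folds is not described in the text.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs both classes")
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx) for every distinct group."""
    for g in sorted(set(groups)):
        test = [i for i, x in enumerate(groups) if x == g]
        train = [i for i, x in enumerate(groups) if x != g]
        yield g, train, test


# Toy records: (drug, excipient) for each formulation entry.
records = [("drugA", "exc1"), ("drugA", "exc2"),
           ("drugB", "exc1"), ("drugB", "exc2")]

lodo_groups = [drug for drug, _ in records]   # leave one drug out
loeo_groups = [exc for _, exc in records]     # leave one excipient out
lopo_groups = list(records)                   # leave one pair out

print(len(list(leave_one_group_out(lodo_groups))))    # 2
print(len(list(leave_one_group_out(lopo_groups))))    # 4
print(roc_auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]))    # 0.75
```

In the real data set every pair contributes five entries (one per molar ratio), so grouping by drug, excipient, or pair keeps all ratios of a held-out molecule or combination out of the training fold.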

To further contextualize the performance of the kernel models, we trained and systematically evaluated a range of advanced machine learning models with different architectures (including state-of-the-art deep neural networks utilizing message-passing mechanisms (MPNN)14 and attention-based transformers (TNN)15) on the same data set to assess their ability to accurately identify nanoparticles. Notably, the SVM model showed the highest average performance across all evaluations (Figure 3e,f). While the MPNN slightly outperformed the SVM in the LODO evaluation (ROC-AUC scores of 0.88 and 0.87, respectively), the SVM was the single best model in the LOEO task (ROC-AUC score 0.91) and the LOPO task (ROC-AUC 0.86) and also achieved high performance in additional evaluation tasks including repeated train-test split validation, random split 5-fold cross validation, and stratified split 5-fold cross validation, highlighting the strong predictiveness of the kernel machine and its ability to extrapolate to unseen drug-excipient pairs (Figure S1, Figure S2). Additionally, the SVM model demonstrated exceptional computational efficiency, requiring fewer computational resources compared to all other models except for the model-free nearest neighbor approach (Figure 3g).

To further strengthen our model, we incorporated 1442 previously published drug-excipient nanoparticles4, 5, 16 into the training data to create a larger data set of 2717 nanoformulations. To ensure data quality, completeness, and accuracy, external data sources were selected based on their close relevance and transferability to our nanoparticle systems, particularly with respect to synthesis protocols, hypothesized formation mechanisms (self-assembly by antisolvent addition), and molecular properties (pure small molecule systems without polymers). Each literature-derived record was manually reviewed for relevance and alignment with our experimental work. Importantly, these external data were used exclusively for model training and were not included in the test sets during cross-validation. Incorporating these additional data significantly enhanced the performance of our SVM model across all retrospective evaluations (Figure S3). We further validated that this enhanced model could correctly predict external test data capturing the optimization of a nanoparticle system encapsulating tetraiodophenolphthalein using Congo red as studied by Shoichet and colleagues (Table S3).16

Encapsulation of Difficult-to-Encapsulate Drugs.

With this predictive model in hand, we next set out to evaluate the ability of our hybrid kernel machine to prospectively identify and characterize (Figure 1d) nanoformulations of drugs not characterized in our high-throughput screen. We focused this prospective study on identifying formulations of our in-house approved chemotherapeutics library given the ample validation of the utility of chemotherapeutic nanoparticles.4, 5, 16 One of the drugs that was predicted to form nanoparticles with synthesis conditions outside of the established 1:1 ratio was venetoclax (Figure 4a), a selective BCL-2 inhibitor. Venetoclax is commercially available only in oral formulations, but is classified as a Biopharmaceutics Classification System (BCS) class IV compound with low bioavailability of approximately 5% under fasting conditions.17 Therefore, there is an opportunity to explore other formulation approaches that can circumvent this poor bioavailability. The hybrid kernel machine predicted that taurocholic acid (TCA) (Figure 4a) could encapsulate venetoclax provided the excipient is used in excess over drug during synthesis (Figure 4b). Experimental validation confirmed these predictions, demonstrating that spherical venetoclax-TCA nanoparticles could form stably under high TCA conditions, with a monodispersed size distribution as determined by dynamic light scattering (DLS) (Figure 4c–e). This is noteworthy as it corroborates the predictions of the kernel machine and also highlights that the same venetoclax-TCA nanoparticles could not have been created with the previously established, static, equimolar synthesis route. The nanoformulated venetoclax exhibited improved colloidal stability, which is a key factor for ensuring formulation quality (Figure 4f). Functionally, the formulated venetoclax retained its potent efficacy against Kasumi-1 human acute myeloblastic leukemia cells with a slightly improved pIC50 of 5.39 ± 0.08 over the unformulated drug (pIC50 = 5.22 ± 0.05) (Figure 4g).

Figure 4. Encapsulation of the difficult-to-encapsulate drug venetoclax by increasing taurocholic acid content.

(a) Chemical structures of venetoclax and taurocholic acid (TCA). (b, c) Model predictions and experimental validation of venetoclax-TCA nanoparticles at different molar ratios. (d−f) Transmission electron microscope (TEM) images, size distribution, and dispersion stability of 500 μM venetoclax, both unformulated and TCA-formulated (TCA:venetoclax = 2:1, molar ratio). (g) Venetoclax nanoparticles exhibit improved cytotoxicity over free drugs on Kasumi-1 acute myeloblastic leukemia cells. Mann-Whitney test (α = 0.05): *p < 0.05.

Optimization of Nanoformulations To Reduce Excipient While Maintaining Bioequivalence.

In a separate prospective evaluation, we aimed to determine whether the model could be used for material optimization through excipient reduction. Congo red (Figure 5a) is commonly used to stabilize drugs in self-assembling nanoformulation systems,4, 5, 16, 18 although it is classified by the Occupational Safety and Health Administration (OSHA) as a category 1B carcinogen and a category 2 reproductive toxicant,19 which hinders the translation of Congo red-stabilized nanoparticles. Our machine learning model predicted that trametinib (Figure 5a), a mitogen-activated protein kinase kinase (MEK) inhibitor approved and investigated for the treatment of melanoma, lung cancer and liver cancer,20, 21 could form stable nanoparticles with varying amounts of Congo red, including quantities below the standard equimolar ratio (Figure 5b). Experimental validation confirmed successful nanoparticle formation with as little as 25% of the typical Congo red amount (Congo red/trametinib = 0.25 compared to the usual 1:1 equimolar synthesis; Figure 5c,d). Gratifyingly, these optimized nanoparticles exhibited an increased drug loading of 83.4% compared with 77.2% found in the standard 1:1 equimolar synthesis (Figure 5e). Importantly, the reduction in stabilizing Congo red in the formulation did not compromise the in vitro cancer cell cytotoxicity of the trametinib nanoparticles: both the standard and the TuNa-AI-optimized trametinib nanoparticles exhibited comparable cytotoxicity against HepG2 human liver cancer cells while both significantly outperformed unformulated trametinib (pIC50 [Free Drug] = 4.60 ± 0.03, pIC50 [Standard NP] = 4.94 ± 0.02, pIC50 [Tunable NP] = 4.97 ± 0.02; Figure 5f).

Figure 5. Optimization and characterization of trametinib nanoparticles with reduced amount of Congo red.

(a) Chemical structures of trametinib and Congo red. (b, c) Model predictions and experimental validation of trametinib-Congo red nanoparticles. (d) TEM images of 500 μM trametinib, both unformulated and formulated with Congo red (Congo red:trametinib = 1:4, molar ratio). (e) Drug loading of trametinib nanoparticles at Congo red/trametinib molar ratios of 1:1 (standard nanoparticles, 100% Congo red) and 1:4 (tunable nanoparticles, 25% Congo red). (f) Standard and tunable trametinib nanoparticles (20 μM) exhibit comparable in vitro cytotoxicity against HepG2 human liver cancer cells, significantly outperforming free drug. (g) Schematic of the in vivo pharmacokinetic (PK) experiment. (h) Plasma drug concentration following retro-orbital injection of equal doses of standard and tunable trametinib nanoparticles. (i) Key PK parameters of standard and optimized trametinib nanoparticles derived from plasma drug concentration profiles show largely bioequivalent behavior. Mann-Whitney test (α = 0.05): n.s., p ≥ 0.05. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.

To further assess the pharmacokinetic (PK) impact of reduced excipient usage, the two nanoformulations were studied in vivo in CD-1 mice following retro-orbital injection (Figure 5g), with plasma drug concentrations measured at various postinjection time points (Figure 5h). Encouragingly, both formulations showed near-identical performances in their area under the curve (AUC) with comparable PK parameters (Figure 5i). While the tunable nanoparticles exhibited a slightly shorter half-life, they demonstrated a longer mean residence time, indicating overall prolonged systemic retention despite faster elimination following drug administration. Other PK parameters showed no statistically significant differences, demonstrating that the two formulations were largely bioequivalent in their systemic biodistribution. These findings underscore that adjustments in formulation composition can potentially enhance the safety and sustainability of nanoparticle-based medicines without compromising PK performance and that machine learning can assist in such endeavors.
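For readers unfamiliar with the two exposure metrics discussed here, the sketch below shows how AUC and mean residence time (MRT) are commonly obtained from a concentration-time profile with the linear trapezoidal rule. The text does not state how the PK parameters were computed, so this noncompartmental approach, the function names, and the toy profile are all assumptions for illustration.

```python
def trapezoid(x, y):
    """Integrate y over x with the linear trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def auc_and_mrt(times_h, conc):
    """Return (AUC, MRT) over the sampled interval.

    AUC  = integral of C(t) dt
    MRT  = AUMC / AUC, with AUMC = integral of t * C(t) dt
    """
    auc = trapezoid(times_h, conc)
    aumc = trapezoid(times_h, [t * c for t, c in zip(times_h, conc)])
    return auc, aumc / auc


# Toy profile (not measured data): constant concentration for 2 h.
auc, mrt = auc_and_mrt([0.0, 1.0, 2.0], [2.0, 2.0, 2.0])
print(auc, mrt)  # 4.0 1.0
```

Because MRT weights concentrations by time, a profile can show a shorter terminal half-life and still yield a longer MRT if relatively more drug is present at later sampling times.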

CONCLUSIONS

This study introduces a strategy for tuning drug-excipient nanoparticles by modulating the synthesis stoichiometry, thereby expanding the design and optimization landscape for these nanoformulations. By integrating robotic, high-throughput data generation with machine learning, we demonstrate how lab automation can yield consistent, reproducible data sets that enable a more systematic, data-driven approach to formulation design.

This work highlights the advantages of customizing machine learning architectures for domain-specific challenges, illustrating how innovations in model design can outperform generic, off-the-shelf machine learning approaches. The bespoke hybrid kernel machine tailored to nanoparticle composition learning outperformed all other tested algorithms, including advanced deep learning models. Prospective application of our kernel machine enabled the successful encapsulation of previously challenging drugs, offering new avenues for therapeutic delivery. Furthermore, the model’s capacity to optimize formulations by reducing excipient usage, while maintaining in vitro efficacy and in vivo pharmacokinetics, demonstrates the potential of machine learning to optimize materials, creating formulations with higher drug loading and improved material efficiency.

Overall, our integrated platform further exemplifies the transformative potential of machine learning in nanomedicine. In the future, bespoke computational tools are poised to play an outsized role in accelerating, optimizing, and derisking the development of safer, more effective drug delivery systems with broad implications for healthcare and patient outcomes.

METHODS

Material and Automated Nanoparticle Library Synthesis.

All chemicals were purchased from MedChemExpress and Sigma-Aldrich. Stock solutions of drugs (40 mM) and excipients (10, 20, 40, 80, and 160 mM) were created in sterile DMSO and stored at −20 °C. For every formulation, 1 μL of drug stock solution was mixed with 1 μL of excipient stock solution at a specific concentration in a 96-well plate using an OpenTrons OT-2 liquid-handling station. Mixing of the two droplets was ensured through centrifugation at 2500 rpm (850g) for 20 s. Subsequently, 198 μL of sterile-filtered and degassed phosphate-buffered saline (PBS) was rapidly added to each well for solvent exchange, and the formulation was mixed through repeated pipetting, leading to a final synthesis solution at 200 μM drug concentration in 1% DMSO/PBS. A 75 μL volume of replicate samples was then transferred into a 384-well microplate (Brandtech) for high-throughput DLS assessment on a Wyatt DynaPro plate reader III at 25 °C using five independent acquisitions of 5 s duration each. DLS enabled quantitative assessment of nanoparticle formation by measuring the size distribution and normalized/raw intensity of resulting coaggregates using the “globular protein” model. Data were processed by calculating the median size observed for a specific formulation. Nanoparticle formation was considered to be successful if the formulation met the following criteria: mean radius of ≤200 nm, PDI ≤40%, ratio between raw and normalized intensity of ≥15.
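The dilution arithmetic and the success criteria in this protocol can be written down compactly. The sketch below uses only the volumes, stock concentrations, and thresholds stated above; the function and parameter names are hypothetical.

```python
DRUG_STOCK_MM = 40.0
EXCIPIENT_STOCKS_MM = (10.0, 20.0, 40.0, 80.0, 160.0)
STOCK_VOLUME_UL = 1.0     # volume of each DMSO stock per well
PBS_VOLUME_UL = 198.0


def well_composition(excipient_stock_mm):
    """Final concentrations in one well after solvent exchange."""
    total_ul = 2 * STOCK_VOLUME_UL + PBS_VOLUME_UL
    return {
        "drug_uM": 1000.0 * DRUG_STOCK_MM * STOCK_VOLUME_UL / total_ul,
        "excipient_uM": 1000.0 * excipient_stock_mm * STOCK_VOLUME_UL / total_ul,
        "excipient_to_drug": excipient_stock_mm / DRUG_STOCK_MM,
        "dmso_percent": 100.0 * 2 * STOCK_VOLUME_UL / total_ul,
    }


def is_successful_nanoparticle(radius_nm, pdi_percent, intensity_ratio):
    """Apply the three DLS criteria listed in the protocol."""
    return radius_nm <= 200.0 and pdi_percent <= 40.0 and intensity_ratio >= 15.0


for stock in EXCIPIENT_STOCKS_MM:
    print(well_composition(stock)["excipient_to_drug"])  # 0.25, 0.5, 1.0, 2.0, 4.0
```

Equal 1 μL volumes mean the excipient-to-drug molar ratio is set purely by the excipient stock concentration, which is how the five stocks map onto the five screened ratios.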

Hybrid Kernel Design and Machine Learning.

Chemical structures for all drug and excipient molecules were extracted in a simplified molecular-input line-entry system (SMILES) representation from PubChem. Compounds were described according to radial chemical substructures (binary ECFP, a radius of 4, 2048 bits, rdkit.org) and selected physicochemical properties (rdkit.Chem.Descriptors._descList, rdkit.org). Tanimoto kernels were utilized for the circular fingerprints, and radial basis function (RBF) kernels were used for physicochemical properties to quantify different aspects of chemical similarity. Relative excipient/drug molar ratios were encoded as a single rational number and were compared between different formulations using a separate RBF kernel. The resultant Gram matrices of the fingerprint and descriptor kernels were combined through element-wise addition. The resulting molecular similarity matrix was then multiplied element-wise by the Gram matrix of the RBF kernel on the relative molar ratios (the synthesis condition kernel). The generated output matrices of the hybrid kernel served as inputs for three different kernel machine learning models: support vector machine (SVM), Gaussian process (GP), and k-nearest neighbors (kNN). When using default kernels for the kernel-learning models or when using featurized inputs for other machine learning models (extreme gradient boosting, XGBoost; random forest, RF; multilayer perceptron, MLP), concatenated vectors of fingerprints, descriptors, and excipient/drug molarity ratios were used as inputs. Finally, SMILES strings were fed directly into the graph-based message-passing neural network (MPNN) and the transformer neural network (TNN), with molarity ratios encoded as an additional feature. All models were retrospectively evaluated through five repeats of leave-one-out cross validation based on drugs, excipients, or molecular pairs and by repeated train-test split validation, random split 5-fold cross validation, and stratified 5-fold cross validation. If prior data were added to the model during cross-validation, they were added only to the training fold and not to the test fold, ensuring that models with and without prior data were evaluated on the same test data. When adding prior data to the training fold, we filtered the prior data to remove any drugs or excipients that are also contained in the test set to comply with the leave-one-out format. Code and data for all these calculations can be found at https://github.com/RekerLab/TuNa-AI. A web server to perform nanoparticle predictions with the TuNa-AI model can be accessed at https://github.com/RekerLab/TuNa-AI/blob/main/code/TuNaOnline.md.
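The kernel composition described in this section (Tanimoto plus RBF on molecular features, multiplied element-wise by an RBF on the molar ratio) can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the published implementation: the function names, the `gamma` values, the toy inputs, and the idea that each formulation is represented by one combined fingerprint row and one descriptor row are all hypothetical; the authors' repository is the reference.

```python
import numpy as np


def tanimoto_kernel(A, B):
    """Tanimoto similarity between rows of two binary fingerprint matrices."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    shared = A @ B.T                                  # bits set in both
    union = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - shared
    safe_union = np.where(union > 0, union, 1.0)
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, shared / safe_union, 1.0)


def rbf_kernel(A, B, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between rows of A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def hybrid_kernel(fp_a, desc_a, ratio_a, fp_b, desc_b, ratio_b,
                  gamma_desc=1.0, gamma_ratio=1.0):
    """(Tanimoto + RBF on descriptors), multiplied element-wise by RBF on ratios."""
    molecular = tanimoto_kernel(fp_a, fp_b) + rbf_kernel(desc_a, desc_b, gamma_desc)
    composition = rbf_kernel(np.reshape(ratio_a, (-1, 1)),
                             np.reshape(ratio_b, (-1, 1)), gamma_ratio)
    return molecular * composition


# Toy inputs: formulations 0 and 2 share molecules but differ in molar ratio.
fp = [[1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 1, 0]]
desc = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
ratio = [1.0, 1.0, 4.0]
K = hybrid_kernel(fp, desc, ratio, fp, desc, ratio)
print(K.shape)  # (3, 3)
```

A Gram matrix built this way can be handed to a kernel method that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. The multiplication means two formulations are considered similar only when both their molecules and their molar ratios are similar, whereas the addition lets either fingerprint or descriptor similarity contribute to the molecular term.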

Transmission Electron Microscopy Imaging.

A 10 μL aliquot of a freshly prepared nanoparticle solution was deposited on a 300-mesh carbon-coated copper grid and allowed to adsorb for 90 s. Excess solution was removed using a filter paper, followed by negative staining with 1% uranyl acetate solution for 60 s. After air-drying, the grid was examined using an FEI Tecnai G2 Twin transmission electron microscope operated at an accelerating voltage of 180 kV.

Dispersion Kinetics.

The dispersion stability of venetoclax free drug and nanoparticles was assessed using protocols adapted from the Organization for Economic Co-operation and Development (OECD) guideline “Dispersion Stability of Nanomaterials in Simulated Environmental Media”, as published previously. Specifically, venetoclax was studied at 250 μM. Formulations were generated by mixing 5 μL of drug stock solution with 5 μL of excipient stock in DMSO (nanoparticle formulation) or 5 μL of DMSO (pure drug control) followed by solvent exchange by adding 990 μL of sterile filtered and degassed PBS. All samples were generated in standard 1.5 mL Eppendorf microcentrifuge tubes, briefly vortexed to ensure dispersion, and subsequently stored sealed at room temperature. Formulations were then sampled at predetermined time points by extracting 5 μL of formulation with a standard Eppendorf pipet from the center of the vial. The extracted 5 μL samples were transferred into individual high-performance liquid chromatography (HPLC) sample vials (Sigma-Aldrich) and diluted in DMSO to ensure full solubility of the drug before injection onto the column. Diluted samples were stored at 4 °C and drug concentrations were determined via HPLC within 24 h using the analytical methods described below.

Drug Loading Quantification.

A 200 μL aliquot of nanoparticle solution was centrifuged at 16 100g for 45 min at room temperature to separate the nanoparticles from unencapsulated components. The nanoparticle pellet was dissolved in 200 μL of DMSO to ensure full solubility. The solution was then injected into the HPLC, and the concentrations of drug and excipient were determined using the analytical method described below. Drug loading was calculated as the mass of drug encapsulated in the particles divided by the total mass of the particles, i.e., the combined mass of drug and excipient found in the particles.
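
The drug loading definition above reduces to a one-line ratio. The sketch below assumes both quantities are expressed as masses in the same unit (for example, HPLC mass concentrations multiplied by the 200 μL redissolution volume); the function name is hypothetical.

```python
def drug_loading_percent(drug_mass: float, excipient_mass: float) -> float:
    """Drug loading (%) = drug mass / (drug mass + excipient mass) * 100.

    Both masses must be in the same unit and refer to the redissolved pellet.
    """
    if drug_mass < 0 or excipient_mass < 0:
        raise ValueError("masses must be non-negative")
    total = drug_mass + excipient_mass
    if total == 0:
        raise ValueError("pellet contains neither drug nor excipient")
    return 100.0 * drug_mass / total
```

Because drug and excipient are quantified in the same 200 μL of DMSO, the volume cancels and mass concentrations can be passed in directly; molar concentrations would first need converting with each compound's molecular weight.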

Analytics.

Samples were analyzed using an Agilent 1100 LC system equipped with MassHunter Workstation software (version 10.1). Chromatographic separation was performed on an Agilent EC C-18 Poroshell column (4.6 mm × 50 mm, 2.7 μm particle size) at room temperature. The optimized mobile phase consisted of 30 mM sodium phosphate buffer (pH 5.6) (A) and acetonitrile (B) using a flow rate of 0.75 mL/min. Gradient separation was achieved over a 15 min run at the following parameters: 0 min 80% A and 20% B, 10 min 100% B, 12 min 80% A and 20% B, 15 min 80% A and 20% B. The injection volume was 10 μL, and the selected ultraviolet (UV) detection wavelength was 248 nm (venetoclax and trametinib) and 490 nm (Congo red), with no reference at an acquisition rate of 0.62 Hz.
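
For readers reproducing the method, the gradient program above can be written as a small lookup. The sketch assumes linear ramps between the listed set points, which the text does not state explicitly; `GRADIENT` and `percent_b` are names introduced for this example.

```python
# (time in min, % acetonitrile B) set points taken from the gradient description.
GRADIENT = [(0.0, 20.0), (10.0, 100.0), (12.0, 20.0), (15.0, 20.0)]


def percent_b(t_min: float) -> float:
    """Return %B at time t_min, assuming linear interpolation between set points."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-15 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: set points cover the whole run")


def percent_a(t_min: float) -> float:
    """%A (phosphate buffer) is the remainder of the binary mobile phase."""
    return 100.0 - percent_b(t_min)
```

For example, `percent_b(5.0)` gives the composition halfway through the ramp to 100% B, and the final segment from 12 to 15 min holds the starting composition for re-equilibration.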

MTT Cell Viability Assessment.

Cell stocks were authenticated by short tandem repeat (STR) fingerprinting conducted at the Duke Cell Culture Facility and screened for mycoplasma contamination. Cells were seeded in 96-well plates at a density of 10,000 cells per well in 90 μL of medium and incubated (5% CO2, 37 °C) overnight to allow for adhesion (adherent cells) or adaptation (suspension cells). Cells were cultured in RPMI 1640 medium supplemented with 1% penicillin-streptomycin (10 000 U/mL), and 10% (HepG2) or 20% (Kasumi-1) fetal bovine serum. Following the incubation period, the medium was refreshed, and cells were treated with varying concentrations of nanoparticles or equimolar free drug controls for 48 h. After treatment, the medium was replaced with a drug-free fresh cell culture medium to terminate the drug exposure. A solution of 3 mg/mL MTT was added to each well to constitute 15% of the total volume, and the cells were incubated for an additional 4 h. The supernatant was then removed, and the resulting dark blue formazan crystals were dissolved in 100 μL of DMSO to ensure complete solubility. Cell viability was quantified by measuring absorbance at 570 nm using a Tecan Infinite M Plex multimode microplate reader. Treatment data were normalized to the untreated vehicle control (100% survival, 1% DMSO in PBS) and the positive control (0% survival, 100% DMSO).
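
The two-point normalization described above can be sketched as follows. It assumes that the mean absorbance of the vehicle wells defines 100% survival and the mean absorbance of the 100% DMSO wells defines 0% survival; the function name and argument names are hypothetical.

```python
def percent_viability(a_sample: float, a_vehicle: float, a_positive: float) -> float:
    """Scale a 570 nm absorbance so that vehicle control = 100% and positive control = 0%."""
    span = a_vehicle - a_positive
    if span <= 0:
        raise ValueError("vehicle control must read higher than the positive control")
    return 100.0 * (a_sample - a_positive) / span
```

Values slightly below 0% or above 100% can occur from well-to-well noise; the sketch leaves them unclipped so that such deviations remain visible.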

Trametinib Nanoparticle in vivo Pharmacokinetics Study.

Male CD-1 mice (n = 5 per time point; 28 g body weight on average, range 25.7–32.5 g; 6–8 weeks old) received an intravenous injection (retro-orbital sinus, under mild isoflurane anesthesia) of 1.17 mg/kg of trametinib in 50 μL of either the standard nanoformulation (Congo red/trametinib = 1) or the tunable nanoformulation (Congo red/trametinib = 0.25). At 10 min, 30 min, 1.5 h, 6 h, and 24 h postinjection, blood (~1 mL) was collected from the left heart ventricle into vials containing 10 μL of 75 mg/mL K2EDTA in water, and plasma was separated by centrifugation and frozen until analysis. All animal procedures were performed strictly according to approved animal protocols (Duke University IACUC No.: A107–23-04). A fit-for-purpose liquid chromatography/tandem mass spectrometry (LC/MS/MS) assay to measure trametinib in mouse plasma was performed on an Agilent 1200 LC, AB/Sciex API 5500 Qtrap MS/MS instrument. A 20 μL aliquot of plasma was mixed with 40 μL of 10 ng/mL trametinib-13C,d3 (internal standard) in methanol/acetonitrile (1:1), followed by vigorous agitation in a Fast Prep 120 (Thermo-Savant) at speed 4.0 for 20 s, 2 cycles. After centrifugation, 40 μL of supernatant was transferred to the autosampler, and 10 μL was injected into the LC system. The LC/MS/MS conditions were as follows. Column: Eclipse Plus C18, 4.6 mm × 50 mm, 1.8 μm, at 40 °C. Mobile phase A: 0.5% formic acid, 10 mM ammonium hydroxide, and 2% acetonitrile, in water. Mobile phase B: acetonitrile. Flow: 1 mL/min. Elution gradient (linear): 0–0.2 min 30–95% B, 0.2–0.7 min 95% B, 0.7–0.8 min 95–30% B. The analyte and internal standard were measured in positive electrospray ion mode. The following MS/MS transitions for quantification [and confirmation] of the respective [M + H]+ ions were used. Trametinib: m/z 616/491 [616/254], and trametinib-13C,d3 (internal standard): m/z 620/495 [620/258]. Calibration standards in the 0.4–50 ng/mL range were prepared in corresponding drug-free mouse plasma. The lower limit of quantification (LLOQ) was 0.4 ng/mL.
From the concentration–time data, PK parameters (i.e., Cmax, half-life, AUC, clearance, mean residence time, and volume of distribution) were calculated by a noncompartmental approach using the WinNonlin software (v 2.1, Pharsight).
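
To illustrate what a noncompartmental analysis computes, the sketch below derives Cmax, AUC from the first to the last sampling time, and the terminal half-life from a concentration–time profile. It is not a reproduction of WinNonlin's algorithms or settings: the linear trapezoidal rule, the log-linear fit over the last three points, and all function names are assumptions for this example. With terminal sampling, the input would be the mean plasma concentration at each time point.

```python
import math
from typing import Sequence


def c_max(concs: Sequence[float]) -> float:
    """Highest observed concentration."""
    return max(concs)


def auc_linear_trapezoid(times_h: Sequence[float], concs: Sequence[float]) -> float:
    """AUC from the first to the last time point using the linear trapezoidal rule."""
    pairs = list(zip(times_h, concs))
    return sum((t1 - t0) * (c0 + c1) / 2.0
               for (t0, c0), (t1, c1) in zip(pairs, pairs[1:]))


def terminal_half_life(times_h: Sequence[float], concs: Sequence[float],
                       n_points: int = 3) -> float:
    """Half-life from a least-squares fit of ln(concentration) vs time on the last n_points."""
    t = list(times_h[-n_points:])
    y = [math.log(c) for c in concs[-n_points:]]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    if slope >= 0:
        raise ValueError("concentrations do not decline in the terminal phase")
    return math.log(2) / -slope
```

Clearance, mean residence time, and volume of distribution additionally require extrapolating the AUC to infinity and the administered dose, which are omitted here for brevity.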

Supplementary Material

Supporting Information (PDF)

ACKNOWLEDGMENTS

This research was supported by the NIH NIGMS Grant R35GM151255. This work was performed in part at the Duke University Shared Materials Instrumentation Facility (SMIF), a member of the North Carolina Research Triangle Nanotechnology Network (RTNN), which is supported by the National Science Foundation (Award ECCS-2025064) as part of the National Nanotechnology Coordinated Infrastructure (NNCI).

Footnotes

Preprint Available. A preprint of this work is available at Zhang, Z.; Xiang, Y.; Laforet, J., Jr.; Spasojevic, I.; Fan, P.; Heffernan, A.; Eyler, C. E.; Wood, K. C.; Hartman, Z. C.; Reker, D. TuNa-AI: a hybrid kernel machine to design tunable nanoparticles for drug delivery. ChemRxiv 2025, 10.26434/chemrxiv-2025-r8mvz (accessed March 18, 2025).

The authors declare the following competing financial interest(s): K.C.W. is a co-founder, consultant, and equity holder at Tavros Therapeutics and Celldom, a scientific advisor and/or equity holder at Simple Therapeutics, Decrypt Biomedicine, Retroviral Therapeutics, and Stelexis Biosciences, and has performed consulting work for Guidepoint Global, Bantam Pharmaceuticals, and Apple Tree Partners. D.R. acts as a consultant to the pharmaceutical and biotechnology industry, as a mentor for Start2, and on the scientific advisory board of Areteia Therapeutics.

