Abstract
Synthetic and naturally occurring particles, such as nanoparticles (NPs) and exosomes, a type of extracellular vesicle (EV), have garnered widespread attention across various fields, including biomaterials, oncology, and delivery systems for drugs and vaccines. Traditional methods for identifying NPs and EVs, such as transmission electron microscopy, are often prohibitively expensive and labor-intensive. As an alternative, the assessment of electrokinetic attributes, such as zeta potential or electrophoretic mobility, conductance, and mean count rate, offers a more cost-effective, rapid, and reliable means of characterizing these particles. In this context, we introduce the first application of quantum machine learning (QML)-based electrokinetic mining for the identification of green-synthesized iron- and cobalt-based NPs, as well as exosomes derived from human embryonic stem cells (hESC), human lung cancer (A549) cells, and colorectal cancer (CRC) cells, based solely on their electrokinetic attributes. Comparative analyses involving cross-validation, train-test splits, confusion matrices, and receiver operating characteristic (ROC) curves revealed that classical machine learning (ML) techniques could accurately identify the types of NPs and EVs. Notably, QML proved proficient in differentiating between various NPs and EVs, including distinguishing EVs in the plasma of CRC patients from those of healthy individuals. Furthermore, QML was extended to the identification of NPs along with EVs in the plasma of CRC patients and experimental mice, where it achieved higher prediction performance even with a minimal training dataset. These results demonstrate that QML-based electrokinetic mining can identify NPs or EVs with minimal training data, thereby facilitating novel clinical development in the realm of liquid biopsies.
Keywords: Nanoparticle, Extracellular vesicle, Electrokinetic analysis, Machine learning, Quantum machine learning, Liquid biopsy
Graphical abstract
Highlights
• First use of quantum machine learning (QML)-based electrokinetic mining for distinguishing exosomes and nanoparticles.
• QML outperformed classical methods in distinguishing exosomes from cancer patients and healthy individuals.
• QML's success suggests potential advancements in liquid biopsy techniques for cancer diagnostics.
1. Introduction
Both nanoparticles (NPs) and extracellular vesicles (EVs) have widespread applications in several disciplines of biomedical sciences, including targeted delivery for cancer therapy [1], tissue engineering [2], vaccine delivery, and biosensing [3]. NPs are broadly defined as particles smaller than approximately 100 nm [4]. EVs are a heterogeneous group of small, membrane-bound nanovesicles, such as exosomes and microvesicles, that vary in size, composition, and function [[5], [6], [7]]. Most applications of NPs and EVs, such as in vivo imaging, pharmacokinetic studies, and liquid biopsies, require their identification and detailed characterization [[8], [9], [10]]. As an alternative to the well-established quantitative structure–property relationship (QSPR) technique [[11], [12], [13], [14], [15]], modern approaches for the identification and characterization of NPs rely on a combination of machine learning (ML) algorithms and cutting-edge molecular biophysics techniques, including X-ray diffraction (XRD), surface-enhanced Raman spectroscopy (SERS), tunable resistive pulse sensing (TRPS), nuclear magnetic resonance (NMR) spectroscopy, thermogravimetric analysis (TGA), transmission electron microscopy (TEM), scanning electron microscopy (SEM), and atomic force microscopy (AFM) [16]. EVs are often characterized by similar methods, including TEM, immunogold-EM, nanoparticle tracking analysis (NTA), and biochemical and biophysical techniques such as Western blotting, enzyme-linked immunosorbent assay (ELISA), surface plasmon resonance (SPR), and AFM [5,6]. While these methods yield high-resolution data for characterizing NPs and EVs, several of them necessitate manual imaging and/or feature analysis. These processes are not only costly and time-consuming but also demand specialized equipment and highly trained personnel, which hinders automation. 
Current ML-based technologies suffer from lack of generalizability and scalability [17], as the development of large, well-annotated training datasets is also resource intensive.
ML-based methods have the potential to provide rapid, accurate, and reproducible characterization of NPs and EVs. However, the lack of large training datasets demands the development of novel approaches to improve (i) the scalability of experimental routines and/or (ii) the predictive power of ML models. To this end, we investigate the performance of quantum ML (QML) and classical ML to better characterize NPs (iron and cobalt) and EVs (from hESC, A549, and Caco-2 cells) based on electrokinetic properties (Supplementary Figs. S1A–C).
Electrokinetic properties have proven utility for the characterization of NPs [18,19] and EVs [6,20]. These properties, including zeta potential (ZP), electrophoretic mobility (Mob), conductivity (Cond), and mean count rate (MCR), have been used to characterize NPs subjected to an applied electric field. Electrokinetic analysis provides valuable information regarding the characteristics of NPs and is an important technique for quantifying their chemical constituents. As the applications of NPs in medical technology continue to grow, it is essential to ensure that they are safe and effective [21,22]. ZP, which reflects electrostatic interactions between particles, is crucial for distinguishing cancer cells from non-cancer cells, as well as their EVs. EVs from cancer patients and healthy individuals differ owing to variations in cellular origin, tumor microenvironment, and genetic alterations [23]. Factors such as pH, hypoxia, and oncogenic signaling influence EV biogenesis and cargo, altering their electrokinetic properties, including ZP [6,24]. These changes affect EV surface charge, size, and membrane composition, impacting their function and interaction with recipient cells [25]. The altered surface charge of cancer cells, which is driven by molecular changes and reflects malignancy, can be detected through ZP measurements. This distinction in electrokinetic properties serves as a valuable biomarker for cancer detection, facilitating liquid biopsy-based early diagnosis [6,20]. Electrokinetic properties can be measured conveniently using dynamic light scattering (DLS). DLS offers several advantages, including non-invasive measurements, applicability even to small sample quantities, and sensitivity to trace aggregation, thus providing scalability, cost savings, and more rapid analysis, which addresses aim (i). ML models have been shown to be very useful in disclosing complex patterns in several biological datasets [[26], [27], [28], [29], [30], [31]]. 
As such, ML has been utilized in various aspects of NP studies, including synthesis [32,33], evaluation of dynamics [34], and delivery to tumors [35]. ML has also been applied to study EVs in connection with cancer diagnosis via SERS [36,37] and to elucidate protein interactions contributing to cancer stemness and the integrity of EVs [38]. The variety of applications for characterizing NPs and EVs necessitates new training datasets collected from different experiments. Therefore, for ML methods to be useful, the resource burden required to develop high-quality training data must be reduced.
QML is expected to require smaller training datasets than classical ML to achieve similar performance. This reduction in necessary training data stems from the superposition of states allowed in a two-level quantum-mechanical system. Classical ML algorithms run on classical computers, which operate on bits that can be in one of two states, 0 or 1. Quantum computers, on the other hand, use qubits, which can be in a superposition of both states (0 and 1) at the same time. This may allow QML algorithms to address certain problems that are intractable for classical computers. The Variational Quantum Classifier (VQC) is a hybrid QML algorithm that has been applied in biomedical research, for example in dementia prediction [39]. The VQC has several potential advantages over classical ML algorithms; for example, quantum computers can represent probability distributions more accurately than classical computers. In addition, QML shows improved generalization from limited data compared to classical methods. Caro et al. demonstrated that QML models can effectively learn from small datasets, potentially offering advantages in various applications requiring minimal training data [40,41]. Therefore, successfully applying QML would address both scalability and predictive power with a minimal training dataset.
NPs have been widely utilized as vaccine adjuvants and delivery platforms, enhancing the immunogenicity and stability of antigens and allowing for slow release and targeted delivery [[42], [43], [44]]. Distinguishing between different NPs is crucial owing to their distinct properties, toxicity profiles, reactivities, and performance characteristics [45]. Therefore, a robust method is required for identifying the type of NP. Detecting NPs in blood plasma enables assessment of biodistribution, toxicity, drug delivery efficacy, and interactions with blood components, supporting nanomedicine development and ensuring safety in biomedical applications [46]. Moreover, the heterogeneity and overlapping characteristics of these NPs and exosomes make their accurate identification and characterization challenging, particularly in the context of cancer detection and monitoring [7,47]. In the current study, we validated an electrokinetic approach to identify and characterize NPs and EVs in plasma samples of colorectal cancer (CRC) patients and experimental mice. Given the low cost, ease of implementation, and scalability of this technology, we anticipate that QML-based electrokinetic mining will likely become an auxiliary method to characterize NPs or EVs for biomedical research, and could uncover other types of biological nanovesicles suitable for liquid biopsy research.
2. Results
2.1. Size and electrokinetic characterization of NPs and EVs
The morphology and size of green-synthesized iron and cobalt NPs were characterized by TEM and NTA. Most of the NPs were round, and a few appeared slightly elongated. The diameter of both types of NPs was around 20 nm (Fig. 1A and B). Electrokinetic analysis provides valuable information regarding the characteristics of NPs and is one of the most important techniques used to quantify their chemical constituents. As NPs continue to gain popularity in medical technology, it is essential to ensure that they are safe and effective [21,22]. In addition, electrokinetic analysis can be used to identify and quantify the concentration of chemical components (such as hydroxyl and carboxyl groups) in NPs and to assess their stability under various conditions [48]. This helps to clarify the biological and chemical properties of NPs and to improve the design of these technologies. The technique provides valuable insight into the size, ZP, surface charge, stability, and chemical constituents of NPs [[49], [50], [51], [52]]. Therefore, it is imperative to examine electrokinetic characteristics to distinguish between different types of NPs. Fig. 1C and D show the ZP distributions of iron and cobalt NPs, whose mean ZP values were approximately 10 and 26 mV, respectively (Fig. 1E). Interestingly, the ZP of iron NPs was distributed over a much wider range than that of cobalt NPs (Fig. 1C and D). The electrophoretic mobility (Mob) of iron NPs was significantly lower than that of cobalt NPs (Fig. 1F), which might be related to the lower deposition rate of cobalt NPs. Notably, the atomic masses of iron and cobalt are 55.845 and 58.93 Da, respectively. Therefore, as other factors, including size, appear to be similar based on the TEM analysis (Fig. 1A and B), the significant difference in Mob could be due to the difference in their atomic masses. In general, larger particles with higher mobility often achieve a lower deposition rate, whereas smaller particles with lower mobility often achieve a higher deposition rate [53,54]. The Cond of cobalt NPs was significantly lower than that of iron NPs (Fig. 1G). Further, the time-dependent phase response of iron NPs reached a maximum of around 5 rad, which was significantly lower than that of cobalt NPs (maximum around 60 rad) (Fig. 1H and I), indicating that particles with a steeper phase curve respond more rapidly, whereas particles with a shallower curve respond more slowly. Similarly, silver nanoparticles (AgNPs) were characterized by TEM to assess morphology and by DLS-based electrokinetic analysis (Supplementary Figs. S2A–C). Further, Supplementary Figs. S3A–C depict the size distribution of EVs from hESC, A549, and Caco-2 cells.
Fig. 1.
Characterization and Electrokinetic analysis of iron- and cobalt-nanoparticles. (A, B) Representative transmission electron microscopy (TEM) image of (A) iron NPs, and (B) cobalt NPs. Scale bar: 20 nm. Both NP populations are mostly spherical in shape. (C–D) Representative graphs showing distribution of apparent ZP vs total count of (C) iron NPs, and (D) cobalt NPs for six samples of NPs. Each color represents a sample of NPs. (E–G) Bar graphs showing the mean - (E) zeta potential (ZP), (F) electrophoretic mobility (Mob), and (G) conductivity for iron (black) and cobalt (gray) NPs. Error bars display the standard error of the mean (N = 3). (H, I) Time-dependent phase of (H) iron NPs, and (I) cobalt NPs. Student's t-test was applied for comparison: iron NPs vs. cobalt NPs; Statistical significance was set at ∗P < 0.05, ∗∗P < 0.01.
2.2. Identification of distribution patterns and relation among different electrokinetic features of NPs and EVs
A scatter plot analysis of four electrokinetic features, namely ZP, Mob, Cond, and MCR, reveals the pairwise relationships between these variables (Fig. 2). For both NPs and EVs, ZP and Mob are highly correlated (Spearman correlation, ρ = 1) (Fig. 2Ai, Aii, Bi, Bii, Ei, Eii, Fi, Fii). This can be attributed to the direct proportionality between ZP and Mob defined by the Smoluchowski equation [55]. Each of the four parameters is plotted against the others to visualize the correlation between the variables (Fig. 2A–D i-iv). In the scatter plots, the ZP and Mob data points fall along a single straight line, which indicates that the relationship between these two variables is linear. By contrast, the other parameter pairs show no visible correlation, which could indicate that there is no significant relationship between them. The scatter plots suggest that the data points can be classified into two different types using Cond and MCR, together with one or both of ZP and Mob. Similarly, the scatter plots of the four electrokinetic attributes of EVs from hESC or A549 are shown in Fig. 2E–H i-iv, and those of EVs from hESC or Caco-2 are shown in Supplementary Figs. S4A–D.
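The proportionality between ZP and Mob can be illustrated with a minimal sketch. It assumes the Smoluchowski approximation in the form ζ = ημ/(ε_r ε_0) and an aqueous medium at 25 °C; the permittivity, viscosity, and mobility values below are illustrative assumptions, not measurements from this study, and the function names are hypothetical. Because ZP is then a strictly increasing function of Mob, the Spearman rank correlation between the two is exactly 1.

```python
EPSILON_0 = 8.8541878128e-12      # vacuum permittivity, F/m
REL_PERMITTIVITY_WATER = 78.5     # assumed: water at 25 °C
VISCOSITY_WATER = 0.89e-3         # assumed: water at 25 °C, Pa·s


def zeta_from_mobility(mobility, rel_permittivity=REL_PERMITTIVITY_WATER,
                       viscosity=VISCOSITY_WATER):
    """Smoluchowski approximation: zeta (V) = eta * mu / (eps_r * eps_0)."""
    return viscosity * mobility / (rel_permittivity * EPSILON_0)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Illustrative mobilities in m^2/(V·s); 1e-8 corresponds to 1 µm·cm/(V·s).
mobilities = [0.5e-8, 0.8e-8, 1.0e-8, 1.6e-8, 2.0e-8]
zeta_mv = [1e3 * zeta_from_mobility(m) for m in mobilities]
rho = spearman(mobilities, zeta_mv)
```

Since one feature is a fixed multiple of the other under these assumptions, keeping both adds no information for a classifier, which is consistent with dropping one of them before model training.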
Fig. 2.
Visualization of the data distribution of the raw datasets using scatter plot analysis. (A–D) Pairwise comparisons of electrokinetic features, namely ZP, Mob, Cond, and MCR-for iron (blue) and cobalt (red) NPs (N = 1016). (E–H) Pairwise comparisons of electrokinetic features for A549- (blue) and hESC- (red) EVs (N = 1033).
2.3. Cross-validation, train-test split, confusion matrix and ROC curve analysis established effectiveness and predictive power of classical ML models to identify NPs and EVs
Cross-validation is a resampling method used to evaluate ML models on a limited data set. By testing the model on data that were not used to train it, cross-validation reduces the risk that the estimated performance is inflated by overfitting [56]. As ZP and Mob are linearly correlated [55], which is also supported by our analysis (Fig. 2, Supplementary Figs. S4A–D), we excluded Mob from downstream analysis. K-fold cross-validation [57] was used with 10, 100, and 200 folds to train classical ML models to differentiate between our two populations of NPs and two populations of EVs (Supplementary Table S1). Specifically, we trained five classical ML models for each classification task: linear regression (LR), support vector machine (SVM), K-Nearest Neighbors (KNN), decision tree (DT), and eXtreme Gradient Boosting (XGB). The data were divided into k folds, and the model was trained on k-1 folds and evaluated on the remaining fold. This process was repeated k times, and the average across folds was used to evaluate the model's performance. The train-test split is a method of training and evaluating ML algorithms by splitting the data into training and test sets, here in a 70:30 ratio [58]. The trained LR model showed the lowest prediction performance and the highest variance, whereas DT and XGB achieved the best prediction performance with minimal variance. The 200-fold cross-validation prediction performance of LR, SVM, KNN, DT, and XGB was 0.7554 ± 0.227, 0.9108 ± 0.142, 0.9687 ± 0.098, 0.9987 ± 0.018, and 0.9987 ± 0.018, respectively (Supplementary Figs. S5A–C). Similar trends were found with 100- and 10-fold cross-validation (Supplementary Figs. S4D–I). When training with a train/test split, the prediction performance of LR, SVM, KNN, DT, and XGB on the held-out test set was 0.75, 0.91, 0.96, 0.99, and 0.99, respectively, for the identification of NPs (Supplementary Table S1). A similar strategy was used for the identification of hESC- or A549-EVs, and the prediction performance of LR, SVM, KNN, DT, and XGB was determined on the held-out test set. The cross-validation of the different classical ML models at different fold counts supports their applicability for the prediction of NPs and EVs. Confusion matrices for all the trained classical ML models for NPs and EVs (from hESC and A549 cells) are shown in Fig. 3A, E, I, M and Supplementary Figs. S6A–C, S7A–C. For each model, these demonstrate similar levels of false positives and false negatives. The precision, recall, and F1-score for each model are shown in Fig. 3B, F, J, N and Supplementary Figs. S6D–F, S7D–F. The area under the ROC curve (AUC-ROC) for LR, SVM, KNN, DT, and XGB was 0.74, 0.89, 0.97, 1.0, and 1.0, respectively, for the identification of NPs (Fig. 3C and G, Supplementary Figs. S6G–I). The AUC-ROC for LR, SVM, KNN, DT, and XGB for the identification of hESC- and A549-derived EVs is shown in Fig. 3K and O and Supplementary Figs. S7G–I. All predictions from each model are displayed in three-dimensional feature space in Fig. 3D, H, L, P and Supplementary Figs. S6J–L, S7J–L. By similar analysis, the confusion matrix, precision, recall, F1-score, AUC-ROC, and three-dimensional scatter plots for hESC- or Caco-2-EVs are shown in Supplementary Figs. S8A–O. A learning curve analysis shows that, when the training set was gradually reduced, the performance of classical ML models such as LR, KNN, and DT was somewhat affected and that of SVM and XGB fell below that of QML, whereas the prediction performance of QML remained stable, as depicted by the heatmap and plot (Supplementary Figs. S9A and B). In addition, the classical ML as well as QML models could distinguish AgNPs from FeNPs or CoNPs (Supplementary Figs. S10A–D). A game-theoretic SHAP value analysis showed that Cond and MCR are the most significant features for classifying NPs and EVs, respectively (Fig. 3Q and R).
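The k-fold and 70:30 evaluation procedures described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the data are synthetic (two hypothetical particle classes with made-up ZP, Cond, and MCR distributions), the classifier is a simple k-nearest-neighbours vote, and all function names are hypothetical.

```python
import random
import statistics


def make_synthetic(n_per_class, seed=0):
    """Two hypothetical classes with (ZP, Cond, MCR) features; values are invented."""
    rng = random.Random(seed)
    data = []
    for label, (zp, cond, mcr) in enumerate([(10.0, 0.30, 200.0), (26.0, 0.20, 150.0)]):
        for _ in range(n_per_class):
            features = (rng.gauss(zp, 3.0), rng.gauss(cond, 0.05), rng.gauss(mcr, 30.0))
            data.append((features, label))
    rng.shuffle(data)
    return data


def standardize(train, test):
    """Z-score each feature using statistics from the training rows only."""
    columns = list(zip(*(x for x, _ in train)))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]

    def scale(rows):
        return [(tuple((v - m) / s for v, m, s in zip(x, means, stdevs)), y)
                for x, y in rows]

    return scale(train), scale(test)


def knn_predict(train, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > len(nearest))


def accuracy(train, test, k=5):
    train, test = standardize(train, test)
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def k_fold_accuracy(data, n_folds):
    """Train on n_folds - 1 folds, test on the remaining one, repeat for every fold."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(training, held_out))
    return statistics.fmean(scores), statistics.pstdev(scores)


data = make_synthetic(n_per_class=100)
cv_mean, cv_sd = k_fold_accuracy(data, n_folds=10)

split = int(0.7 * len(data))                      # 70:30 train/test split
holdout_acc = accuracy(data[:split], data[split:])
```

One design point worth noting: as the number of folds grows, each held-out fold contains fewer samples, so per-fold accuracy can take only a few coarse values and the standard deviation across folds tends to increase even when the mean is unchanged.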
Fig. 3.
Performance evaluation of classical ML models to predict the types of NPs and EVs via confusion matrix and ROC curve. (A–H) Confusion matrices, binary classification metrics (precision, recall, F1 score), ROC curves, and three-dimensional scatter plots for each ML model trained for the identification of NPs. (I–P) Confusion matrices, binary classification metrics (precision, recall, F1 score), ROC curves, and three-dimensional scatter plots for each ML model trained for the identification of EVs. (A, E, I, M) The confusion matrix is a table that summarizes the performance of a classification model. Its rows represent the true classes, while its columns represent the classes predicted by the ML models. (B, F, J, N) Precision measures the proportion of correctly predicted positive instances out of all predicted positive instances, while recall measures the proportion of correctly predicted positive instances out of all actual positive instances. The F1 score is the harmonic mean of precision and recall, and it is a measure of a model's overall performance. (C, G, K, O) The AUC-ROC is a measure of the overall performance of the model. A perfect model would have an AUC of 1, while a random model would have an AUC of 0.5. (D, H, L, P) Three-dimensional scatter plots showing the distribution of data predicted via LR and XGB on the held-out 30 % test set. (Q, R) SHAP value analysis showing the contribution of the major attributes to LR-based identification of (Q) NPs and (R) EVs. In the ROC plots (C, G, K, O), the degree of uncertainty is represented by 95 % confidence intervals (CIs).
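The metrics defined in this caption can be made concrete with a small worked example. The labels and scores below are invented for illustration only, the function names are hypothetical, and the AUC is computed from its rank interpretation (the probability that a randomly chosen positive receives a higher score than a randomly chosen negative), which is one standard way to obtain it.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, with 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0  # harmonic mean
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 4 positives and 4 negatives with classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [int(s >= 0.5) for s in scores]          # threshold at 0.5

counts = confusion_counts(y_true, y_pred)          # (3, 1, 1, 3)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)  # 0.75 each
auc = roc_auc(y_true, scores)                      # 15 of 16 pairs -> 0.9375
```

Precision, recall, and F1 depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric can diverge for the same model.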
2.4. Quantum ML based electrokinetic mining identifies NPs and EVs with fewer training sets
Classical ML algorithms run on classical computers, which operate on bits that can be in one of two states, 0 or 1. Quantum computers, on the other hand, use qubits, which can be in a superposition of both states (0 and 1) at the same time (Supplementary Fig. S11). This may allow QML algorithms to address certain problems that are intractable for classical computers. The VQC is a QML algorithm that has been applied to biomedical research problems such as dementia prediction [39]. The VQC has several potential advantages over classical ML algorithms; for example, quantum computers can represent probability distributions more accurately than classical computers. Intriguingly, the confusion matrix and AUC-ROC curve for the prediction of NPs via VQC with a 70 % training set yielded a prediction performance of 94 % on the held-out 30 % test set (Fig. 4A, D, G). In addition, the VQC can be trained on much smaller datasets than classical ML algorithms, because it can learn the parameters of the quantum circuit directly from the data, without the need for a large training set [39]. To evaluate the power of VQC for predicting NPs with a smaller training set, we reduced the training set to 50 % and 10 % of the initial training set, which still yielded prediction performances of 94 % and 92 %, respectively, on the held-out 30 % test set (Fig. 4B,C,E,F, Supplementary Fig. S12). This demonstrates that VQC, a hybrid QML algorithm, could predict the NPs with high prediction performance even after training on a smaller electrokinetic dataset (Fig. 4G–I), suggesting an advantage over classical ML. In a comparative analysis, the prediction performance of QML was significantly better than that of the classical ML models with much smaller training sets (Fig. 4J–P, Supplementary Fig. S13), suggesting that QML is applicable to electrokinetic mining for NP identification even with an extremely small training set.
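The notion of a qubit in superposition can be illustrated with a classical simulation of a single qubit. This sketch is only a toy illustration of superposition and of angle encoding with one trainable rotation; it is not the VQC circuit used in this study, whose architecture is not described in this section, and the feature and weight values are arbitrary assumptions.

```python
import math


def ry_state(theta):
    """Amplitudes of RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>."""
    return (math.cos(theta / 2), math.sin(theta / 2))


def measurement_probabilities(state):
    """Born rule: the probability of each outcome is the squared amplitude magnitude."""
    amp0, amp1 = state
    return (abs(amp0) ** 2, abs(amp1) ** 2)


def encode_and_rotate(feature, weight):
    """Angle-encode a scaled feature, then apply a trainable rotation.

    Two RY rotations about the same axis compose by adding their angles,
    so the combined state is RY(feature + weight)|0>.
    """
    return ry_state(feature + weight)


# A classical bit is either 0 or 1; theta = 0 leaves the qubit in |0>.
basis = measurement_probabilities(ry_state(0.0))

# theta = pi/2 gives an equal superposition: both outcomes have probability 0.5.
equal = measurement_probabilities(ry_state(math.pi / 2))

# Toy "classifier" readout: probability of measuring 1 for an assumed
# scaled feature of 0.3 and an assumed trainable weight of 0.9.
p_class1 = measurement_probabilities(encode_and_rotate(0.3, 0.9))[1]
```

In a variational classifier, a classical optimizer would adjust weights such as the one above so that the measured probabilities match the training labels; real circuits use several qubits and entangling gates, which this single-qubit sketch omits.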
Fig. 4.
Performance evaluation of quantum ML model and comparative analysis of its prediction performance with classical ML models with extremely small training set for the identification of NPs. Representative (A–C) confusion matrix, (D–F) ROC curves with confidence interval, and (G–I) three-dimensional scatter plots for QML with (A, D, G) 70 % training set, (B, E, H) 50 % of initial training set, and (C, F, I) 10 % of initial training set. The test accuracies were evaluated on the held out 30 % test set. The confusion matrix shows the number of correctly and incorrectly classified NPs for each type of NP. The ROC curve shows the trade-off between the true positive rate (TPR) and the false positive rate (FPR) for different thresholds. As can be seen from the figures, the VQC model can achieve high prediction performance in identifying the types of NPs, even with a smaller training set. (J–P) Representative three-dimensional scatter plots showing the NP prediction ability via (J–N) classical ML models: LR, SVM, KNN, DT, and XGB, and (O) VQC, a QML with 0.5 % training set. (P) Bar graph showing that QML outperformed the classical ML models for the identification of NPs, even when trained with 0.5 % training set. For precision estimation, the degree of uncertainty in sample was represented in (D–F) ROC plots via confidence interval (CI) with probability limit of 95 %.
A similar strategy of QML-based electrokinetic mining was utilized for the identification of EVs from hESC, A549, and Caco-2 cells (Supplementary Figs. S14 and S15). EVs play a significant role in intercellular communication and can carry various biological molecules, including proteins, lipids, DNA, and different RNA forms, which mimic the constituents of their parent cells [59]. hESC can be differentiated into specific cell lineages, such as pulmonary neuroendocrine cells (PNECs) and progenitor cells, which can be utilized for disease modeling, drug discovery, and potential therapeutic applications [[60], [61], [62], [63]]. Distinguishing among EVs from hESC, A549, and Caco-2 cells by QML-based electrokinetic mining would offer a better approach for the identification of EVs, requiring smaller training sets. Notably, QML-based electrokinetic mining provided improved prediction performance for the identification of EV types, even with a minimal training set (Supplementary Figs. S14A–K, S15A–E).
2.5. QML detects NPs in mouse plasma and in both healthy and cancer patients via electrokinetic mining with minimal training data
To examine the feasibility of using QML for identifying NPs in plasma, we recorded the electrokinetic parameters of pre-clinical and clinical plasma samples with or without NPs. For the pre-clinical samples, 10 mice were injected with 100 μl of AgNP solution via the tail vein for 2 consecutive days, followed by isolation of EVs from their plasma; EVs from the plasma of 10 untreated mice were used as the control group. The electrokinetic parameters of samples from both groups were then recorded using a Malvern Zetasizer Nano, and the data were split into training and test sets in a 70:30 ratio. After setting the test data aside, the percentage of training data was reduced, and the classical ML and QML models were trained on different percentages of the training set (Fig. 5A). Based on ROC analysis, the performance of QML in identifying NP-containing mouse plasma was 85 % (AUC = 0.85, 95 % CI: 0.75–0.94), 80 % (AUC = 0.80, 95 % CI: 0.69–0.91), 70 % (AUC = 0.85, 95 % CI: 0.70–0.84), and 63 % (AUC = 0.63, 95 % CI: 0.48–0.76) with training sets of 70 %, 50 %, 10 %, and 1 %, respectively. This was further illustrated with 3D scatter plots showing the separation between the two groups for the respective training sets (Fig. 5B–E). The quantitative bar graph demonstrated that QML could maintain prediction performance even with a massively reduced training set (Fig. 5F).
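The data-partitioning protocol described above (hold out 30 % once, then shrink only the training portion) can be sketched as follows. This is an illustration under an explicit assumption: the training percentages are interpreted as percentages of the full dataset, so that 70 % corresponds to the entire training pool. The text does not state whether this or a percentage of the initial training set was intended, and the function name and sample count are hypothetical.

```python
import random


def split_with_reduced_training(n_samples, test_fraction=0.30,
                                train_percentages=(70, 50, 10, 1), seed=0):
    """Hold out a fixed test set, then draw nested training subsets of decreasing size."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)

    n_test = round(test_fraction * n_samples)
    test_idx = indices[:n_test]          # never used for training
    train_pool = indices[n_test:]

    subsets = {}
    for pct in train_percentages:
        # Assumption: pct is a percentage of the full dataset.
        n_train = max(1, min(len(train_pool), round(pct / 100 * n_samples)))
        # Taking a prefix of the shuffled pool makes the subsets nested,
        # so a smaller training set is always contained in a larger one.
        subsets[pct] = train_pool[:n_train]
    return test_idx, subsets


test_idx, subsets = split_with_reduced_training(n_samples=1000)
sizes = {pct: len(idx) for pct, idx in subsets.items()}
```

Keeping the test indices fixed across all training sizes means that differences in performance reflect the amount of training data rather than a change in the evaluation samples.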
Fig. 5.
QML based electrokinetic mining could detect the NPs in mouse plasma, and plasma from healthy individuals and cancer patients. (A) A schematic diagram showing the experimental plan for the collection of plasma from control- and NPs injected-mice, followed by recording of electrokinetic data, and the ML based training and prediction modules. (B–E) ROC and 3D scatter plots, and (F) the quantitative bar graph, showing the prediction performance to identify NPs in the plasma of mice using QML trained with 70 %, 50 %, 10 %, or 1 % data. (G) A schematic diagram showing the experimental plan for the collection of plasma from healthy individuals and CRC patients, and separately spiking NPs in a separate group of plasma from healthy individuals and CRC patients, followed by measurements of electrokinetic data, and the ML based training and prediction modules. (H–K) ROC and 3D scatter plots showing the QML based identification of NPs in plasma of (H) healthy and (J) cancer patients, and the corresponding bar graphs, showing the prediction performance of QML trained with 70 %, 50 %, 10 %, and 5 % data, to identify NPs in the plasma of healthy and cancer patients. (F, I, K) Bar graphs showed that QML could maintain high prediction performance, even with massive reductions in training set size. For precision estimation, the degree of uncertainty in sample was represented in ROC plots via CI (showed with confidence shape) with probability limit of 95 %.
For the clinical samples, plasma was obtained from ten healthy subjects and ten colorectal cancer (CRC) patients. Patients consented under IRB-10-209-A, approved by the University of Chicago Institutional Review Board. The CRC patients ranged in age from 42 to 82; half were male and half were female. They self-identified as either White or African American, except for one patient who declined to identify race. All but one patient were non-Hispanic. All the cancers were identified as low grade, invasive poorly differentiated, or invasive moderately differentiated. They were also categorized by tumor, node, and metastasis (TNM) stage and AJCC stage, as reported in Supplementary Table S2. The control group consisted of 10 healthy subjects aged 49–81, half male and half female, who served as age-matched controls. The healthy subjects were also either White or African American, so that the control group was matched to the patients in both age and race. As in the CRC group, all but one were non-Hispanic (Supplementary Table S2). To examine the capacity of QML to identify NPs in plasma samples of healthy subjects and cancer patients, NPs were spiked into the plasma samples, followed by electrokinetic recording of control and NP-spiked plasma from healthy subjects and cancer patients, and evaluation of prediction performance after training QML with 70 %, 50 %, 10 %, and 5 % of the data (Fig. 5G). As evident from the ROC curve analysis with 70 % and 5 % training sets, QML could identify the NP-spiked plasma samples of healthy subjects and cancer patients (Fig. 5H–J). Quantitative bar graphs showed that the prediction performance of QML could be sustained even after massively reducing the percentage of training data (Fig. 5I–K), suggesting that QML-based electrokinetic mining could be a better approach for the identification of NPs in plasma samples.
2.6. QML distinguishes EVs and NP in healthy individuals vs. cancer patients
To examine how accurately QML-based electrokinetic mining could distinguish between the EVs in the plasma of healthy subjects and CRC patients, the electrokinetic parameters of plasma-derived EVs were measured using the Malvern Zetasizer Nano, and the data were split into training and test sets in a 70:30 ratio. After setting the test data aside, the percentage of training data was reduced sequentially, and the classical ML and QML models were trained on different percentages of the training set. Additionally, the plasma-derived EV samples were spiked with NPs to determine whether QML-based electrokinetic mining could distinguish healthy from cancer EVs with or without NPs (Fig. 6A). The prediction performance of the classical models (LR, KNN, SVM, DT, and XGB) for the identification of NPs with EVs from the plasma of cancer patients is shown in Supplementary Figs. S16A and B. Importantly, the electrokinetic properties of NPs also change when they are diluted in plasma (Supplementary Figs. S17A–D). As per the ROC analysis, the prediction accuracies of QML were 69 % (AUC = 0.69, 95 % CI: 0.44–0.65) and 73 % (AUC = 0.73, 95 % CI: 0.55–0.77) with training sets of 70 % and 5 %, respectively, for distinguishing healthy EVs from cancer EVs. This suggests that, unlike classical ML, the VQC could distinguish the EVs from the plasma of healthy subjects and cancer patients even when trained on a 5 % training set (Fig. 6B–E). A similar analysis was carried out to distinguish between healthy EVs and healthy EVs spiked with NPs. As evident from the ROC analysis, the prediction accuracies of QML were 89 % (AUC = 0.89, 95 % CI: 0.68–0.86) and 77 % (AUC = 0.77, 95 % CI: 0.45–0.64) with training sets of 70 % and 5 %, respectively. This suggests that, unlike classical ML, the VQC could also distinguish healthy EVs from healthy EVs spiked with NPs even when trained on a 5 % training set (Fig. 6C–E). A comparable investigation was conducted to differentiate between cancer EVs and cancer EVs spiked with NPs. The prediction accuracies of QML were 73 % (AUC = 0.73, 95 % CI: 0.52–0.74) and 65 % (AUC = 0.65, 95 % CI: 0.65–0.89) with training sets of 70 % and 5 %, respectively, suggesting that QML could also distinguish between plasma-derived EVs with and without NPs (Fig. 6D and E). Supplementary Figs. S18A–E show that QML-based electrokinetic mining with fewer observations could distinguish healthy EVs with NPs from cancer EVs with NPs. The quantitative bar graph of training set (%) versus prediction performance (%) demonstrated that QML could provide superior prediction performance even with a massively reduced training set (Fig. 6E).
Fig. 6.
QML based electrokinetic mining with fewer observations identifies EVs from the plasma of healthy subjects and cancer patients, and EVs mixed with NPs. (A) A schematic diagram showing the experimental plan for the collection of plasma from healthy subjects and cancer patients, followed by the isolation of EVs and their spiking with NPs. Then, electrokinetic recording was conducted on the plasma-derived EVs with or without NPs. (B–D) ROC curves and confusion matrices showing the classification of (B) healthy EVs vs cancer EVs, (C) healthy EVs vs healthy EVs spiked with NPs, and (D) cancer EVs vs cancer EVs spiked with NPs by QML trained with 70 % and 5 % data. (E) Quantitative bar graphs showing the prediction performance by QML at training sets of 70 % and 5 %, indicating that QML could sustain its prediction performance even with a massive decrease in the training set. For precision estimation, the degree of uncertainty in the sample was represented in the ROC plots via the CI (shown as a shaded band) at a probability limit of 95 %. In Fig. 6E, statistical significance was determined via multiple t-tests, followed by the Holm-Sidak method, with P value = 0.05.
3. Discussion
NPs and EVs are pivotal in a wide array of industrial domains, including biomedical research, nano-biomedicine, drug delivery, and vaccine development. They serve not only as effective diagnostic tools but also as efficient vehicles for drug delivery. With their ability to target specific molecules and structures, they can be used to identify and quantify biomarkers, remove waste, and help mitigate diseases. In addition, they could contribute to developing advanced materials like more durable paints, coatings, and additives. NPs also have a great potential in the energy sector, providing materials with higher energy outputs [22,26,28].
NPs have unique physical and chemical properties compared to larger particles, due to their small size. Therefore, appropriate identification and characterization of these particles is very important. This will help scientists understand their potential applications, risk assessment of their use, and help design effective strategies for safe handling. It is also important for drug manufacturers to ensure the quality of nanomaterials used in their products [22,32,64].
EVs are increasingly recognized for their potential in liquid biopsy applications, particularly in the field of oncology. EVs are secreted by most of the cell types including stem cells and cancer cells, and circulate in biofluid like blood, carrying a diverse array of biomolecules reflective of their cells of origin [5,65]. This makes them valuable for non-invasive cancer diagnosis, monitoring treatment response, and detecting therapeutic resistance through blood samples. The complexity and heterogeneity of EVs owing to their diverse size, biogenesis pathway, and overlapping biomarkers, pose huge challenges for their identification and analysis [66]. Intriguingly, ZP, an intrinsic electrokinetic property, illuminates the electrostatic interplay among particles, playing a pivotal role in discerning cancer from non-cancer cells and their EVs. The modified surface charge in cancer cells, shaped by molecular dynamics indicative of malignancy, is unveiled through ZP analysis. This nuanced electrokinetic differentiation stands as a precious biomarker for cancer detection, enabling timely liquid biopsy-driven diagnoses [6,20]. In addition, the ZP of EVs derived from cancer cells, have been found to correlate with malignant characteristics of their parent cells, suggesting the utility of ZP as a surrogate biomarker of cancer [5,6,67,68]. The narrow range of negative surface charge in most EVs [20], reflects their consistent biophysical properties due to shared lipid and protein compositions. Despite this limited range, advanced ML models can extract realistic information by integrating electrokinetic data with other EV characteristics such as size, cargo, and origin to capture subtle variations and functional implications. Importantly, by leveraging multi-dimensional data and pattern recognition, ML models identify subtle correlations and variations, enhancing predictive accuracy despite limited surface charge range [28,69,70]. 
EVs from a single cell type exhibit heterogeneity in size, shape, and composition [66]. These factors influence electrokinetic properties like zeta potential by affecting surface charge distribution and hydrodynamic behavior. Electrokinetic measurements thus provide a comprehensive assessment of EV diversity [20].
EVs and NPs can interact when co-secreted, with NPs potentially adhering to EV surfaces or being encapsulated [71–73]. This interaction creates complex nanostructures with distinct electrokinetic properties [74]. ML models can differentiate silver NPs from EVs by analyzing subtle variations in electrokinetic parameters which also consider size, shape, surface charge, and biomolecular composition [75,76]. ML algorithms can recognize intricate patterns in these electrokinetic parameters, enabling accurate classification. This approach leverages the heterogeneity of EV populations and the unique properties of NPs to achieve precise separation and characterization [77,78], advancing our understanding of EV-NP interactions and their potential applications in nanomedicine.
Conventional techniques for the identification and characterization of NPs and EVs include TEM, SEM, AFM, XRD and DLS. TEM and SEM provide insights into the shape, size distribution and properties of the NPs. AFM provides information on the surface morphology of the NPs. XRD provides structural information on the crystalline structure of the NPs. DLS is used to characterize the size and polydispersity of the NPs. The drawbacks of these techniques include the inability to identify and characterize nanomaterials in complex matrices. The techniques are also expensive and require specialized equipment and personnel. Furthermore, these techniques are limited by imaging resolution, which is often too low to identify nanomaterials. Finally, some of these techniques require a sample to be exposed to radiation of high energy, which can alter the properties of the sample [11–16,79–82].
The application of ML models for the identification and characterization of NPs and EVs is critical for understanding their physicochemical properties and functionalities. ML models extensively utilize large-scale data mining approaches to capture and analyze physicochemical characteristics, like size, shape, and composition. ML models are also able to characterize and identify NPs through features like surface reactivity and safety. The advantages of using ML models for the identification and characterization of NPs include the high speed and accuracy with which the data can be visualized and characterized. Additionally, ML models can help identify properties of NPs that are difficult to detect, allowing a better understanding of their structure and behavior in different contexts. Ultimately, ML models applied to electrokinetic features offer an efficient and cost-effective way to identify and characterize NPs, which would otherwise be difficult to achieve through conventional methods [26,27,34,83,84].
Electrokinetic analysis and data mining with the application of ML models provide a powerful way to identify and characterize NPs and EVs. This approach is useful in industries related to nanotechnology and biomedicine for assessing the particle size distribution, surface charge density, and concentration of nanomaterials. Additionally, ML models allow us to process large sets of data and identify and characterize nanomaterial characteristics quickly and accurately. This approach offers faster and more accurate outcomes at a remarkably lower cost, enabling better decision-making and efficient NP characterization [85–88].
The No-Free-Lunch Theorem implies that, when averaged over all possible objective functions, no optimization algorithm outperforms every other. Because there are many algorithms and hyperparameters, it is impractical to test all possibilities. We investigated several algorithms, including LR, SVM, KNN, DT, and XGB. These five classical ML models were chosen for electrokinetic data mining due to their effectiveness with numerical data, interpretability, and efficiency. While the deep learning approach excels in complex analysis, these models offer transparency and establish a performance baseline [89]. The data were randomly split into 70 % training and 30 % testing sets. Each model was trained and tested by 10-, 100-, and 200-fold cross-validation. In 10-fold cross-validation, the data are split into 10 subsets, with 9 subsets used for training and 1 subset for testing; this is repeated 10 times so that each subset serves once as the test set, and the results are averaged. The dataset was shuffled before splitting to avoid ordering bias (Supplementary Fig. S4, Supplementary Table S1).
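The cross-validation procedure described above can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the seed, and the `fit_predict` wrapper are hypothetical, and only the Python standard library is used so that the splitting logic is visible.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so folds do not follow sample order
    # Spread samples as evenly as possible: the first n_samples % k folds get one extra.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

def cross_val_accuracy(fit_predict, X, y, k):
    """Average test accuracy over k folds.

    fit_predict(X_train, y_train, X_test) is a hypothetical wrapper around any classifier.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Example: 10 folds over a dataset the size of the NP dataset (1016 points).
folds = list(k_fold_indices(1016, 10))
```

Each sample appears in exactly one test fold, so every observation contributes once to the averaged score.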
The application of ML algorithms to electrokinetic analysis has been successful in yielding meaningful results. It could detect patterns in large datasets and make predictions based on the electrokinetic parameters of NPs, such as ZP, Mob, Cond, and MCR. The electrokinetic properties of NPs change when diluted in plasma due to the formation of a protein corona, altering their surface charge and mobility [90]. This affects their interactions with cells and tissues, impacting drug delivery efficacy and potential toxicity in clinical applications.
Using five supervised classical ML models, namely LR, KNN, SVM, DT, and XGB, accuracies of 74 %, 89 %, 97 %, 100 %, and 100 %, respectively, were achieved for the identification of NPs (Fig. 3A–H). Similarly, classical ML models were found to be effective in identifying the types of EVs (Fig. 3I–P). This illustrates the effectiveness of classical ML-driven mining of electrokinetic parameters in identifying the NPs or EVs. As a limitation, the identical performance of DT and XGB could potentially be due to overfitting. Typically, XGB outperforms single DTs on complex tasks due to its ensemble nature and boosting technique [91]. The lack of improvement observed here could suggest that either the dataset is relatively simple, allowing even a single DT to capture most patterns, or that both models might be overfitting to noise in the training data. This emphasizes the need for further investigation into model generalizability and dataset complexity in future studies. In addition, this could potentially be avoided by implementing a batch-based data split in future.
QML is a promising new field that uses quantum mechanics to solve ML problems. Unlike classical ML, which uses bits that can be in one state or the other, QML uses qubits, which can be in a superposition of both states at the same time. This may allow QML algorithms to address problems that are intractable for classical computers [39,92–95]. QML through VQC offers distinct advantages over traditional algorithms. Quantum systems excel in representing complex probability distributions, enhancing accuracy. Moreover, QML demonstrates superior generalization from limited datasets, a crucial benefit in data-scarce scenarios. Research by Caro et al. highlights QML's efficacy in learning from minimal training data, potentially tackling both scalability challenges and predictive power limitations, particularly when working with constrained datasets, positioning QML as a promising frontier in computational learning [40,41]. We demonstrated the first use of QML alongside classical ML to identify NPs or EVs via electrokinetic mining. More specifically, green synthesis combined with DLS allowed characterization of 504 and 512 samples of Fe- and Co- NPs, respectively. According to train-test split and k-fold cross-validation techniques, VQC was found to predict NP types with accuracies of 94 %, 94 %, and 91 %, when trained on datasets of 70 %, 50 %, and 10 % of the initial size, respectively. Notably, the QML showed the highest prediction performance compared to the classical ML models at the 1 % training set (Supplementary Figs. S12 and S13). Similarly, the electrokinetic data of EVs were analyzed using QML, and notably, with a mere 0.5 % training set, QML could identify the type of EVs with a prediction performance of 71 % (Supplementary Figs. S14A–K).
These findings indicate that QML based electrokinetic mining shows promising potential for accurately identifying the cellular origins of EVs, a pivotal step in analyzing the diverse populations of EVs present in biofluids such as plasma and CSF. Our research underscores the remarkable potential of QML in discerning EVs originating from cancerous and non-cancerous cells. This discovery opens exciting avenues for leveraging this technology to differentiate EVs present in clinically relevant biofluids. Our current focus lies in investigating the feasibility of utilizing QML to distinguish between EVs extracted from the plasma of cancer patients and those from individuals without cancer. By doing so, we aim to enhance the diagnostic capabilities of EV analysis, paving the way for more precise and effective cancer detection methods. This endeavor has led us to harness EVs sourced from both healthy subjects and individuals diagnosed with cancer, marking a crucial step towards harnessing the full potential of QML in advancing cancer diagnostics.
In expanding the scope of our approach, we analyzed the electrokinetic properties of pre-clinical and clinical plasma samples. The clinical samples included plasma from 10 healthy subjects and 10 CRC patients, from which EVs were isolated. Notably, the QML based electrokinetic mining was found to be effective in distinguishing healthy EVs and cancer EVs (Fig. 6A–C, E). Additionally, to assess the potential of employing QML to distinguish plasma-derived EVs mixed with NPs, NPs were spiked into the EV samples from the plasma of 10 CRC patients in a separate group. Further, we used pre-clinical samples to assess the potential of employing QML to distinguish EVs in plasma containing NPs: 10 mice received 100 μl of AgNP solution intravenously. Subsequently, EVs were isolated from their plasma, with EVs from the plasma of 10 untreated mice serving as the control group. Notably, the QML based electrokinetic mining could identify NPs with EVs from the plasma of CRC patients (Fig. 6A–D, E), and mice (Figs. S5A–F). Compared with the benchmarks for LR, KNN, SVM, DT, and XGB, the VQC outperformed the classical ML models, making it especially useful for predictive purposes with a small training set. Interestingly, the QML based analysis using only a 0.5 % training set demonstrated that even a single feature would be enough to learn and predict, contributing towards the identification of NPs or EVs. This is a substantial finding that could lead to new applications in areas such as drug discovery and environmental remediation. Some of the challenges that need to be addressed before VQCs can be widely used are noise, scalability, and software. In the context of noise, quantum computers are susceptible to noise, which can cause errors in the computation. In terms of scalability, quantum computers are currently very small, and it is not yet clear how to scale them up to the size that would be needed for practical applications.
When it comes to the software, there is a lack of software development tools for VQCs [96]. Therefore, these aspects may require future studies to make progress in the application of QML. Additionally, incorporating ML models through application programming interfaces (APIs) into Internet of Things (IoT) applications, especially by utilizing sophisticated QML algorithms, holds the potential to substantially elevate the precision in predicting cancer susceptibility, recurrence, and survival rates. This integration would serve as a crucial asset for the early diagnosis and prognosis evaluation of cancer, marking a substantial advancement in the medical diagnostics and patient care [97,98].
In future, the scope of ML algorithms in nano-biomedicine for designing targeted therapy for various diseases is exciting and varied. ML algorithms can be used to develop advanced nanomaterials that can recognize and bind specifically to diseased proteins and cellular components, for imaging and tracking of NPs and EVs in vivo, to analyze large datasets to identify biomarkers, to identify optimal dosing, and to predict the response of a patient to a particular drug regimen. Ultimately, ML algorithms can be used to help optimize targeted therapies based on individual patient data, with the goal of achieving improved outcomes and improved quality of care. Furthermore, with the tremendous advances in ML technology, there is an opportunity for researchers to develop more sophisticated algorithms that can predict the response of a patient to a particular therapeutic intervention and guide dosing regimen. This could be particularly useful for personalized medicine approaches.
4. Materials and methods
4.1. Chemicals
Cobalt nitrate hexahydrate (Co(NO3)2·6H2O) (Product no.: 1.02536), and iron(III) chloride hexahydrate (FeCl3·6H2O) (Product no.: 1.03943) were obtained from Millipore Sigma. The chemicals were obtained at 99.9 % purity in analytical reagent (AR) grade.
4.2. Green synthesis of cobalt- and iron- NPs
The Curcuma longa plant's rhizome extract was obtained from Sigma-Aldrich (Product no.: NIST3300). For the green synthesis of NPs, 0.1 M cobalt nitrate hexahydrate or iron(III) chloride was used along with 10 mL of Curcuma longa plant extract. The two components were mixed using a magnetic stirrer at a constant speed (750 rpm) for 30 min. The color change from light brown to dark brown indicates the nucleation and growth of Co3O4 NPs. The color-changed NP solutions were centrifuged at 10,000 rpm for 10 min (repeated three times), and the resulting pellet was washed with double distilled water. Finally, the collected NP precipitate was filtered (with Whatman No. 1 filter paper) and dried in the oven at 100 °C for 60 min. The dried NP samples were processed for further characterization and electrokinetic analysis.
4.3. TEM analysis
The morphology of the Fe- or Co- NPs was visualized by TEM at a 200 kV accelerating voltage, an ultra-high resolution of 0.2 nm, and a magnification of 2,000,000×. A 3 mm diameter TEM grid was prepared by placing 5 μL of NP solution onto carbon-coated copper grids. The grids were then dried with a mercury lamp before analysis. Image magnifying software was used to determine the size of the NPs to obtain morphological data.
4.4. Cell culture
All embryonic stem cell studies were approved by the Institutional Review Board (IRB) at the University of Chicago. All cells were purchased from ATCC or WiCell in the past 3 years and were negative for mycoplasma. The hESC lines were regularly checked for chromosome abnormalities and maintained with normal chromosome numbers.
hESC was cultured as described previously [61,99]. The hESC line-RUES2 or H1 was cultured on irradiated mouse embryonic fibroblasts (Global Stem) at a density of 20,000–25,000 cells/cm2 in a medium of DMEM/F12, 20 % knockout serum replacement (Life Technologies), 0.1 mM β-mercaptoethanol (Sigma Aldrich) and 20 ng/ml bFGF (R&D Systems), and medium was changed daily. hESC cultures were maintained in an undifferentiated state at 37 °C in a 5 % CO2/air environment until stem cells reached about 90 % confluence.
Human lung cancer (A549) and colorectal adenocarcinoma (Caco-2) cells were cultured in DMEM supplemented with 10 % FBS and 1 % Penicillin-Streptomycin. Cell cultures were maintained in an undifferentiated state at 37 °C in a 5 % CO2/air environment.
4.5. Isolation of EVs from cell culture conditioned medium
EVs from the culture conditioned medium (CCM) of hESC, A549, or Caco-2 were isolated using Total Exosome Isolation reagent (Invitrogen, catalog no. 4478359) as per the manufacturer's protocol. The CCM from cultured hESC or Caco-2 was first centrifuged at 2000×g for 30 min. Then, the resultant supernatant was mixed with 0.5 volumes of the total exosome isolation reagent. The mixture was incubated at 4 °C overnight, followed by centrifugation at 10,000×g for 1 h at 4 °C. The resultant supernatant was aspirated and discarded, and the remaining exosome pellet was diluted with 1 × PBS for NTA analysis.
4.6. Clinical samples of plasma
Plasma samples were obtained from ten healthy subjects and ten colorectal cancer patients (background information is listed in Supplementary Table S2), and patients consented under IRB-10-209-A approved under the University of Chicago Institutional Review Board. For the group consisting of NPs with EVs, the plasma-derived EV samples were spiked with NPs (at the dilution of 1:100).
4.7. Pre-clinical samples of plasma
All procedures in mice were in accordance with the animal protocol approved by the Institutional Animal Care and Use Committee at the University of Chicago, Illinois, USA.
Blood samples were withdrawn from control and experimental group of C57BL/6 mice (10 mice/group) as described previously [100,101]. In experimental group of mice, 100 μl volume of AgNPs suspension (Sigma-Aldrich, Cat# 730785) in PBS (stock concentration = 0.5 mg/ml) was injected via tail vein for 2 consecutive days.
4.8. Isolation of EVs from plasma
EVs from plasma were isolated using the Total Exosome Isolation kit (from plasma) (Invitrogen, catalog no. 4484450) as per the manufacturer's protocol. The reagent works by sequestering water molecules, which causes less soluble components like vesicles to precipitate out of the solution. These can then be collected through a brief, low-speed centrifugation process. Initially, the plasma was treated with Proteinase K for 10 min. Next, the reagent was added, and the mixture was incubated for half an hour at 4 °C, followed by a standard centrifugation at room temperature (10,000×g for 5 min) to recover the precipitated exosomes. Finally, the resulting pellet was resuspended in PBS for downstream analysis.
4.9. NTA analysis
The number and size distribution of EVs were characterized using NTA with a Malvern NanoSight NS300 instrument. A monochromatic laser beam at 405 nm was applied to 500 μL of EV solutions loaded into the sample chamber. Three video captures of exosome movements were recorded over a 30-s period and analyzed using NTA software. The software was optimized to identify and track exosomes on a frame-by-frame basis. The number of EVs released from hESC or Caco-2 cells was calculated using NTA analysis.
4.10. Electrokinetic analysis by DLS
The colloidal suspension of Fe- or Co- NPs was diluted 10-fold with sterile water, followed by homogenization with an ultrasonic homogenizer for 15 min at 20 Hz to break up aggregations of NPs. The NP solution was passed through a Millipore filter (0.22 μm) and analyzed with the Malvern Zetasizer Nano. EVs were analyzed with the Malvern Zetasizer Nano using samples diluted to an equal concentration (50 μg/ml) in PBS for each group. Similarly, the electrokinetic analysis of plasma EVs from control and NP-injected C57BL/6 mice (10 mice/group), and of EVs from the plasma of 10 healthy subjects and 10 colorectal cancer patients (without and with NP spiking), was performed with the Malvern Zetasizer Nano.
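For context, ZP is not measured directly: instruments of this type derive it from the electrophoretic mobility through Henry's equation, μ = 2εζf(κa)/(3η). The sketch below shows that conversion under stated assumptions: the Smoluchowski limit f(κa) = 1.5 and the properties of water at 25 °C. The instrument settings actually used are not given here, so the default values and the function name are illustrative only.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m

def zeta_from_mobility(mobility_m2_per_Vs,
                       viscosity_Pa_s=0.8872e-3,   # assumed: water at 25 C
                       rel_permittivity=78.5,      # assumed: water at 25 C
                       henry_f=1.5):               # assumed: Smoluchowski limit
    """Solve Henry's equation mu = 2*eps*zeta*f(ka) / (3*eta) for zeta (volts)."""
    eps = rel_permittivity * EPSILON_0
    return 3.0 * viscosity_Pa_s * mobility_m2_per_Vs / (2.0 * eps * henry_f)

# A mobility of -1 um*cm/(V*s) equals -1e-8 m^2/(V*s).
zeta_mV = 1e3 * zeta_from_mobility(-1e-8)
```

Because the conversion is a fixed linear scaling for a given medium, ZP and Mob carry the same information, which is consistent with the linear correlation between the two features noted in Section 4.12.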
4.11. Data collection
A large dataset of electrokinetic features (ZP, Mob, Cond, and MCR) was collected via DLS for the following samples: two green-synthesized NP types, Fe NPs (N = 504) and Co NPs (N = 512); cell-derived EVs, namely hESC-derived EVs (N = 500), Caco-2-derived EVs (N = 500), and DLD-1-derived EVs (N = 500); clinical samples, namely control and NP-spiked plasma samples from 10 healthy subjects (N = 200) and 10 colorectal cancer patients (N = 200), EVs from the plasma of 10 healthy subjects (N = 100) and 10 colorectal cancer patients (N = 100), and NPs along with EVs from the plasma of 10 colorectal cancer patients (N = 100); and pre-clinical samples, namely EVs from the plasma of control C57BL/6 mice (N = 100) and NP-injected C57BL/6 mice. The data were subsequently used for the downstream ML analysis.
4.12. Classical machine learning analysis
The NP dataset included 1016 data points in two types (Fe and Co) with four given parameters, namely ZP, Mob, Cond, and MCR. The EV dataset included 1533 data points in three types (hESC-, A549-, or Caco-2- EVs). Among the parameters, only ZP, Cond, and MCR were used due to the linear correlation between ZP and Mob. The structure of the dataset was first analyzed using scatter plot analysis and principal component analysis (PCA). Then we used a train-test split and k-fold cross-validation to evaluate the performance of the models. For the train-test split, the dataset was shuffled, then divided into two sets with sizes of 70 % and 30 % for training and testing, respectively. For the k-fold cross-validation, we used 10-, 100-, and 200-fold splits. In this study, we used LR, SVM, KNN, DT, and XGB to classify the types of NPs and EVs based on three parameters: ZP, Cond, and MCR. All the ML models were first trained using the training set and then tested on the test set. The performance of these models was evaluated using various metrics such as cross-validation accuracy, confusion matrix, precision, recall, F1-score, and AUC-ROC.
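The confusion-matrix-based metrics listed above follow directly from the four cell counts of a binary confusion matrix. The following sketch, with a hypothetical function name and invented labels, shows how accuracy, precision, recall, and F1-score relate to one another; it is an illustration rather than the evaluation code of this study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and derived metrics for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Invented labels for illustration: 1 = Fe NP, 0 = Co NP.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

For the three-class EV problem, the same counts can be computed per class in a one-vs-rest fashion and then averaged.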
Optimization: The parameters of the classical ML models were optimized via hyperparameter tuning using GridSearchCV.
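A minimal sketch of such a grid search with scikit-learn's GridSearchCV is given below. The data are synthetic stand-ins with invented values for three features (ZP, Cond, MCR), and the choice of kNN, the parameter grid, the stratified split, and the scaling step are assumptions made for illustration; they are not the settings used in this study.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic two-class data; columns mimic (ZP, Cond, MCR) with invented means and spreads.
X = np.vstack([rng.normal([-30.0, 0.02, 200.0], [3.0, 0.005, 20.0], size=(100, 3)),
               rng.normal([-15.0, 0.03, 260.0], [3.0, 0.005, 20.0], size=(100, 3))])
y = np.array([0] * 100 + [1] * 100)

# 70:30 shuffled split; stratification is an added assumption to keep classes balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, shuffle=True, stratify=y, random_state=0)

# Scaling sits inside the pipeline so it is refit on each cross-validation fold.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
param_grid = {"kneighborsclassifier__n_neighbors": [1, 3, 5, 7]}

search = GridSearchCV(pipe, param_grid, cv=10, scoring="accuracy")
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model
```

The held-out 30 % is touched only once, after the grid search has selected the hyperparameters on the training portion.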
4.13. Quantum machine learning analysis via variational quantum classifier
The VQC is a key supervised QML algorithm (Supplementary Fig. S9) that is widely used for classification problems in noisy intermediate-scale quantum (NISQ) devices [93–95]. It allows us to obtain exploratory results on NISQ devices without the need for additional error-correction techniques. The cost function is calculated using iterative device measurements, which helps to mitigate errors by incorporating noisy data into the optimization computations. This quantum method uses the mapping of classical input data to an increasingly large quantum feature space, which is based on quantum circuits that are difficult to replicate classically.
The QML model was implemented in three steps: state preparation, model circuit, and measurement. VQC started with the initial state preparation of data. Then, a quantum feature map, which is a mathematical mapping that embedded the data into higher dimensional spaces was applied to convert classical data into quantum states. The variational circuit was then constructed, and its parameters were optimized using a classical optimizer. The number of measurements and dimensions determined the size of the variational circuit. The measured value was transmitted back to the circuit as feedback, which was used to improve the variational circuit's parameters. Finally, the process was repeated until the desired prediction performance was achieved.
Feature Map: A feature map was used as the first step. This is a type of mathematical function that maps data from one space to another, and in the context of QML, it was used to map classical data into quantum states. This allowed quantum computers to process and learn from classical data in a new way. A quantum feature map was used, which is a function that maps a classical data point xᵢ to a quantum state |ϕ(xᵢ)⟩. The quantum state |ϕ(xᵢ)⟩ is a superposition of states that represent the different possible values of the classical data point xᵢ. Once the classical data was mapped to quantum states, the quantum computer was used to process and learn from the data, such as using a classifier to find a hyperplane that separated the data into two classes. In this research paper, a Pauli feature map with a linear entanglement strategy was used, which is depicted in Supplementary Fig. S19 as a quantum circuit for a Pauli feature map with two data encoding repetitions over four features.
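To make the idea of a quantum feature map concrete, the sketch below computes the state produced by one repetition of a two-qubit ZZ-type Pauli feature map in plain Python. It assumes a common convention (Hadamards, single-qubit phases 2xᵢ, and a two-qubit phase 2(π − x₀)(π − x₁) applied when the qubits differ). The map used in this work has four features and two repetitions (Supplementary Fig. S19), so this is a simplified illustration and not a reproduction of that circuit.

```python
import cmath
import math

def zz_feature_state(x0, x1):
    """Amplitudes of the encoded state for one repetition on two qubits.

    After the Hadamards every basis state has amplitude 1/2; the data-dependent
    gates are diagonal, so they only attach a phase to each basis state.
    """
    amplitudes = []
    for basis in range(4):
        b0, b1 = basis & 1, (basis >> 1) & 1
        phase = 2.0 * x0 * b0 + 2.0 * x1 * b1                       # single-qubit phases
        phase += 2.0 * (math.pi - x0) * (math.pi - x1) * (b0 ^ b1)   # entangling phase
        amplitudes.append(0.5 * cmath.exp(1j * phase))
    return amplitudes

def overlap(state_a, state_b):
    """Squared overlap |<a|b>|^2, a similarity measure between two encoded points."""
    inner = sum(a.conjugate() * b for a, b in zip(state_a, state_b))
    return abs(inner) ** 2

state_1 = zz_feature_state(0.1, 0.2)
state_2 = zz_feature_state(1.0, 2.0)
similarity = overlap(state_1, state_2)
```

With a single repetition the data sit only in the relative phases, so a computational-basis measurement alone would not reveal them. Further repetitions and the trainable circuit that follows convert those phases into measurable differences.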
Model Circuit: The second step is the model circuit, or, strictly speaking, the classifier. A parameterized unitary operator U(w) is created such that |ψ(x; w)⟩ = U(w)|ψ(x)⟩. The model circuit is constructed from gates that evolve the input state. The circuit is based on unitary operations and depends on external, adjustable parameters. Given a prepared state |ψᵢ⟩, the model circuit U(w) maps |ψᵢ⟩ to another vector |ψᵢ′⟩ = U(w)|ψᵢ⟩. In turn, U(w) consists of a series of unitary gates. The weights of the model circuit can be optimized using a classical optimization algorithm. The optimization algorithm tries different values for the weights and measures the performance of the model for each value. The weights that produce the best performance are then used for the final model. We used the State vector simulator to test the circuit on 32 qubits, which can be represented as.
Supplementary Fig. S20 depicts the variational circuit used in this work.
Measurement: This is the final step in a QML algorithm. It estimates the probability of a data point belonging to a particular class by measuring the quantum state of the data point. This is equivalent to sampling multiple times from the set of all possible values that the data point could take on and averaging the results.
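The sampling view of measurement can be sketched as follows. The mapping from measured bitstrings to class labels by parity is an assumption made for illustration (it is one common choice for two-class variational classifiers), and the outcome distribution below is invented rather than taken from the circuit used in this work.

```python
import random

def estimate_class_probs(outcome_probs, shots=1024, seed=0):
    """Estimate class probabilities by sampling measurement outcomes.

    outcome_probs maps each bitstring to its probability; the class label is
    taken as the parity of the bitstring (assumed mapping).
    """
    rng = random.Random(seed)
    bitstrings = list(outcome_probs)
    weights = [outcome_probs[b] for b in bitstrings]
    samples = rng.choices(bitstrings, weights=weights, k=shots)
    counts = {0: 0, 1: 0}
    for bits in samples:
        counts[bits.count("1") % 2] += 1
    return {label: n / shots for label, n in counts.items()}

# Invented two-qubit outcome distribution; the exact class-0 probability is 0.7 + 0.1 = 0.8.
estimated = estimate_class_probs({"00": 0.7, "01": 0.1, "10": 0.1, "11": 0.1}, shots=4096)
```

The estimate fluctuates around the exact value with a standard error that shrinks as the inverse square root of the number of shots.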
Optimization: The parameters of the VQC are updated using an optimization routine. This is done in a classical loop that updates the parameters to decrease the cost function's value. Here, we used gradient-free optimization algorithms, including the Constrained Optimization by Linear Approximation (COBYLA), Powell's Method or Powell's Conjugate Direction Method (POWELL), Simultaneous Perturbation Stochastic Approximation (SPSA), or Univariate Marginal Distribution Algorithm (UMDA) optimizer. By varying the maximum number of iterations of these optimizers, we ensured the maximum prediction performance of the given model.
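As an illustration of one of the optimizers named above, a minimal SPSA loop is sketched below. The gain constants, iteration count, and toy quadratic cost are hypothetical choices for demonstration; in a VQC the cost would instead be evaluated from circuit measurements on the training set.

```python
import random

def spsa_minimize(cost, theta, iterations=200, a=0.2, c=0.1,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize cost(theta) with simultaneous perturbation stochastic approximation."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma   # decaying perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]  # random +/-1 direction
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        # Two cost evaluations give a gradient estimate for all parameters at once.
        diff = (cost(plus) - cost(minus)) / (2.0 * c_k)
        theta = [t - a_k * diff * d for t, d in zip(theta, delta)]  # 1/d equals d for +/-1
    return theta

def toy_cost(theta):
    """Stand-in cost with its minimum at theta = (1, 1, 1)."""
    return sum((t - 1.0) ** 2 for t in theta)

start = [0.0, 0.0, 0.0]
optimized = spsa_minimize(toy_cost, start)
```

Each iteration needs only two cost evaluations regardless of the number of parameters, which is why SPSA is attractive when every evaluation requires repeated circuit measurements.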
The data was prepared for training and testing by encoding the data into qubits and by shuffling and splitting it into two sets: a training set and a testing set. Next, the feature map was defined, which is a circuit that maps the input data to a set of qubits that can be used by the VQC algorithm. Next, the ansatz was defined, which is a circuit that is used to approximate the target function. Then, the VQC algorithm was trained by iteratively varying the parameters of the ansatz and measuring the prediction performance of the classifier on the training set (70 % initial training set, 50 % of initial, and 10 % of initial). Eventually, the VQC algorithm was tested to evaluate the performance of the classifier via confusion matrix and ROC analysis.
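The reduced-training-set protocol can be sketched as follows: the test set is fixed once, and progressively smaller training subsets are drawn from the remaining pool. Interpreting each percentage as a fraction of the whole dataset, and nesting the smaller subsets inside the larger ones, are assumptions made for this illustration; the function name and seed are hypothetical.

```python
import random

def split_then_subsample(n_samples, test_fraction=0.30,
                         train_fractions=(0.70, 0.50, 0.10), seed=0):
    """Hold out a fixed test set, then build shrinking training subsets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(test_fraction * n_samples)
    test_idx = indices[:n_test]   # never used for training
    pool = indices[n_test:]       # candidate training samples
    subsets = {}
    for fraction in train_fractions:
        n_train = min(len(pool), max(1, round(fraction * n_samples)))
        subsets[fraction] = pool[:n_train]  # nested: smaller sets sit inside larger ones
    return test_idx, subsets

# Example with the size of the NP dataset (1016 points).
test_idx, train_subsets = split_then_subsample(1016)
```

Keeping the test set identical across all training sizes ensures that differences in prediction performance reflect the amount of training data rather than a change in the evaluation samples.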
Similarly, data points of three types of EVs (hESC-, A549-, or Caco-2- EVs) with their given parameters, namely ZP, Cond, and MCR, were used for the implementation of VQC. In addition, the same strategy was employed for mouse- and human-plasma or EVs, with or without NPs. Augmentation techniques were employed for distinguishing cancer plasma vs cancer plasma with NPs, followed by training the QML model.
4.14. SHAP value analysis
SHAP value analysis was conducted to explain the output of the ML-based electrokinetic mining for the identification of NPs or EVs. SHAP values were calculated by measuring the contribution of each attribute to the prediction of the ML model.
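The idea behind SHAP values can be illustrated with a brute-force computation of exact Shapley values for three attributes. The source does not state which SHAP implementation or explainer was used, so this is only a sketch: the scoring function `toy_model`, its coefficients, the sample, and the baseline are hypothetical, and absent attributes are replaced by baseline values.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values: weighted average of each feature's marginal contribution."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                # Features in the coalition take the sample's values, the rest the baseline's.
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

def toy_model(features):
    """Hypothetical scoring function with one interaction term (not the study's model)."""
    zp, cond, mrc = features
    return 0.05 * zp + 2.0 * cond + 0.001 * mrc + 0.01 * zp * cond

feature_names = ["ZP", "Cond", "MRC"]
x = [-25.0, 0.8, 250.0]          # hypothetical sample
baseline = [-15.0, 0.5, 200.0]   # hypothetical background (reference) values

phi = shapley_values(toy_model, x, baseline)
for name, value in zip(feature_names, phi):
    print(name, round(value, 4))
```

The attributions sum to the difference between the model output for the sample and for the baseline, which is the property that makes the per-attribute contributions directly comparable.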
4.15. Statistical analysis
For all statistical tests, no adjustments were made for multiple comparisons unless stated otherwise. For all parametric statistical analyses, the data were determined to be normally distributed. In Fig. 2E–G, the data are presented as mean ± standard error of the mean (S.E.M.) of 3 data points. Student's t-test was applied for statistical analysis, with significance levels of ∗P < 0.05 and ∗∗P < 0.01. Sample uncertainty is represented in the ROC plots by 95 % confidence intervals (CIs), shown as shaded confidence bands. In Fig. 6E, statistical significance was determined via multiple t-tests with the Holm-Sidak correction, at a significance level of P = 0.05.
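Two of the quantities named above can be written out as a short sketch: the mean ± S.E.M. of a small sample and the step-down Holm-Sidak adjustment of a set of P values. The input numbers are made up for illustration, and the statistics software actually used for the analysis is not stated in this section.

```python
import math
import statistics

def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))

def holm_sidak(p_values):
    """Step-down Holm-Sidak adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest P value is corrected for m tests, the next for m - 1, and so on.
        adj = 1.0 - (1.0 - p_values[i]) ** (m - rank)
        running_max = max(running_max, adj)   # keep adjusted values monotone
        adjusted[i] = min(1.0, running_max)
    return adjusted

mean, sem = mean_sem([1.0, 2.0, 3.0])        # hypothetical triplicate measurement
adjusted = holm_sidak([0.01, 0.04, 0.03])    # hypothetical raw P values
print(mean, round(sem, 4))
print([round(p, 4) for p in adjusted])
```

A comparison is declared significant when its adjusted P value falls below the chosen threshold (0.05 in the text).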
CRediT authorship contribution statement
Abhimanyu Thakur: Writing – original draft, Visualization, Project administration, Methodology, Conceptualization. Pedro Correia Santos Bezerra: Writing – review & editing, Visualization, Methodology. Abhishek: Visualization, Methodology. Shihao Zeng: Visualization, Methodology. Kui Zhang: Investigation. Werner Treptow: Investigation. Alexander Luna: Methodology. Urszula Dougherty: Methodology. Akushika Kwesi: Methodology. Isabella R. Huang: Resources. Christine Bestvina: Writing – review & editing. Marina Chiara Garassino: Writing – review & editing. Fuyu Duan: Visualization. Yash Gokhale: Writing – review & editing, FNU. Bin Duan: Writing – review & editing. Yin Chen: Writing – review & editing. Qizhou Lian: Resources. Marc Bissonnette: Methodology. Jianpan Huang: Writing – review & editing, Visualization, Supervision, Methodology. Huanhuan Joyce Chen: Writing – review & editing, Supervision, Funding acquisition, Conceptualization.
Data and materials availability
All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper can be requested from the corresponding authors. The code used to generate the results in this paper is available at https://github.com/Abhimanyu2023/nanoQML.
Ethics approval and consent to participate
All embryonic stem cell studies were approved by the Institutional Review Board (IRB) at the University of Chicago. Plasma samples were obtained from ten healthy subjects and ten colorectal cancer patients, and patients consented under protocol IRB-10-209-A, approved by the University of Chicago Institutional Review Board.
Funding
This work was supported by the National Cancer Institute (NCI) grant R00 CA226353-01A1, a Cancer Research Foundation Young Investigator Award, and a Lung Cancer Research Foundation (LCRF) Pilot Project Award to HJC.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We are thankful to Madeleine S. Durkee (Department of Radiology, University of Chicago, IL, USA) for her expert comments and revision of the manuscript. We appreciate the technical support from Jace Chen (University of Chicago, IL, USA).
Footnotes
Peer review under the responsibility of editorial board of Bioactive Materials.
Supplementary data to this article can be found online at https://doi.org/10.1016/j.bioactmat.2025.03.023.
Contributor Information
Abhimanyu Thakur, Email: abhimanyu@uchicago.edu.
Jianpan Huang, Email: jphuang@hku.hk.
Huanhuan Joyce Chen, Email: joycechen@uchicago.edu.