DFT-ML-Based Property Prediction of Transition Metal Complex Photosensitizers for Photodynamic Therapy

Jingxing Gao; Yachao Dong; Tian Qiu; Wen Sun; Jian Du

2025 Oct 31;10(44):53447–53459. doi: 10.1021/acsomega.5c08727

PMCID: PMC12613122 PMID: 41244434

Abstract

Photodynamic therapy (PDT) is a noninvasive clinical treatment for cancers using photosensitizers and light. While most research has focused on organic molecules such as porphyrins as photosensitizers, there is emerging interest in the use of transition metal complexes (TMCs). Photosensitizer synthesis and the subsequent performance testing are time- and resource-consuming, so presynthetic screening of photosensitizer properties would be critical. In this work, a hybrid mechanistic and data-driven model is proposed for the quantitative structure–property relationship (QSPR) of photosensitizers. Important excited-state quantum chemistry descriptors (e.g., excitation energy) are first calculated based on density functional theory (DFT); these descriptors, together with other molecular descriptors, are then used to build single and hybrid machine learning (ML) models for predicting the singlet oxygen quantum yield of hexacoordinate TMC photosensitizers (Ru-, Ir-, and Re-complexes). Among the single-ML models, the support vector regression and kernel ridge regression models provide good predictions on the test set (R² > 0.9) and the external test set (R² > 0.7), while the delta-learning model and the Mixture-of-Experts model further improve the generalization ability (R² up to 0.87 on the external test set) and show strong universality. SHAP analysis further confirms that the choice of mechanistic descriptors in the QSPR model is reasonable. To our knowledge, this constitutes the first integrated DFT-ML framework specifically designed for the unique challenges of small data sets in TMC photosensitizer research.

1. Introduction

Cancer is one of the major threats to human health. Global cancer cases are expected to grow to 28.4 million in 2040, a 47% increase over 2020. At present, the main therapies for cancer include surgical therapy, chemotherapy, radiotherapy, gene therapy, photodynamic therapy (PDT), and photothermal therapy (PTT). Among these therapies, PDT is considered effective for superficial cancerous tissues because of its low toxicity, lack of drug resistance, and mild adverse reactions.

The main process of PDT is to use light sources to activate nontoxic or minimally toxic photosensitizers to produce cytotoxic reactive oxygen species (ROS), thereby inducing apoptosis and necrosis of cells at the tumor site. As shown in Figure , under irradiation with light of an appropriate wavelength, the photosensitizer (PS) is excited to a singlet state and then quickly converted to a triplet state through intersystem crossing (ISC). The triplet-state PS then reacts photodynamically with the substrate to produce ROS. This photodynamic process is divided into two types: type I and type II. In a type I photochemical reaction, the triplet-state PS reacts with nearby substrates through electron transfer to form radical cations or radical anions, which further react with oxygen-containing substrates (such as water and oxygen) to produce ROS (such as superoxide anions and hydroxyl radicals). In a type II photochemical reaction, the triplet-state PS directly transfers energy to oxygen to form highly reactive singlet oxygen (¹O₂), as shown in Figure . Therefore, the PS is the core element of PDT, and its photophysical and chemical properties determine the therapeutic effect. There is now emerging interest in extending the use of PSs to transition metal complexes (TMCs), which can display intense absorption in the visible region; many also possess high two-photon absorption cross sections, enabling two-photon excitation with NIR light. Transition metal complexes have therefore become promising PS candidates.

Schematic illustration of the process of the Ru complex in type II PDT.

The most studied transition metal in PDT has been Ru, whose complexes usually have high water solubility compared with porphyrins or phthalocyanines as well as a high singlet oxygen quantum yield Φ_Δ, which is essential for a type II PDT PS. For example, a Ru(II) complex named “TLD1433” entered clinical trials in early 2017. A series of water-soluble Ru(II) phthalocyanines with large and stable conjugated π systems have been developed that enable efficient energy and electron transfer processes; a new generation of cyclometalated Ru(II) polypyridyl complexes has been designed and synthesized, with absorption maxima around 560 nm and absorption extending up to 700 nm. Ir-complexes are also widely used as photosensitizers. Organic-modified mesoporous silica nanoparticles containing iridium complexes have been synthesized, which exhibit photophysical properties such as high photoreaction yield and high singlet oxygen quantum yield. Two novel cyclometalated Ir-complexes have also been developed, which have strong emission peaks, long excited-state lifetimes, and high singlet oxygen yields. Studies on rhenium complex photosensitizers are currently very limited, but these complexes also have development potential; for example, a tricarbonyl Re-complex with endoplasmic reticulum-targeting activity has been designed and synthesized, which has strong absorption and a high singlet oxygen quantum yield.

However, photosensitizer synthesis and the subsequent determination of photosensitizing properties are time- and resource-consuming processes. Therefore, preliminary, presynthetic screening of sensitizers for their ability to generate ¹O₂ would be of great value. Recently, mathematical modeling and machine learning (ML) methods have gained popularity and proved to be powerful tools in various areas; they use algorithms to learn from data, detect patterns, and make fast and accurate predictions. ML has already been used for property prediction of organic molecule photosensitizers, including properties related to type I and type II PDT. A quantitative structure–property relationship (QSPR) model has been established for a data set containing 32 porphyrins and metalloporphyrins. A machine learning method has been developed to efficiently and accurately predict the emission energy and photoluminescent quantum yield. Fifteen single models and three hybrid models have been proposed and evaluated on a data set of 3,066 organic materials to predict photophysical properties (absorption wavelength, emission wavelength, and quantum yield). However, these models are not suitable for TMC photosensitizers because traditional structure descriptors such as SMILES can hardly capture all the information on such PSs, and the lack of data makes it difficult to use deep learning models such as graph neural networks. To the best of our knowledge, no research has reported a property prediction method for TMC photosensitizers for PDT. Thus, it is necessary to develop machine learning models that work with a small data set to predict the properties of TMC photosensitizers, such as the triplet state lifetime τ_T, the triplet quantum yield Φ_T, and the singlet oxygen quantum yield Φ_Δ. DFT can elucidate intrinsic mechanisms that cannot be observed by experimental techniques and is widely used in theoretical chemistry. A combined DFT and ML method could therefore be well suited to property prediction of TMC photosensitizers.

In this work, we introduce a systematic DFT-ML framework explicitly developed for the small-data regime prevalent in TMC photosensitizer development, which provides a tailored solution for accelerating the discovery of TMC photosensitizers. We study various single- and hybrid-ML models to predict the singlet oxygen quantum yield Φ_Δ of TMCs, which is an evaluation index of type II photosensitizers for PDT; additionally, Φ_Δ is more important and easier to collect from the literature than the triplet state lifetime and the triplet quantum yield. To characterize the structure and charge transition of TMC photosensitizers under light irradiation during the PDT process, density functional theory (DFT) is used to calculate excited-state properties as quantum chemistry descriptors, which enables low-dimensional ML models suitable for the small TMC data sets available in the literature. These models, based on quantum chemistry descriptors and other descriptors, are trained and tested on TMC data sets including Ru-, Ir-, and Re-complexes, because these make up the majority of reported TMCs, are all hexacoordinate, and have similar structural characteristics. The results are compared with the performance of two hybrid-ML models, the delta-learning model (DLM) and the Mixture-of-Experts model (MoE), trained on a data set for a specific metal, to test whether the generalized metal model trained on the three hexacoordinate TMC types can replace a specialized metal model trained on TMCs of a given metal center. The subsequent SHAP analysis shows the contribution of each descriptor, such as the excitation energies of the S1 and T1 states, to the predicted Φ_Δ, which provides strong interpretability of the proposed model. Based on the modeling results, the proposed DFT-MoE model is found to be the most accurate and could provide theoretical support for experimental synthesis and screening.

The article is structured as follows. In Section 2, we introduce the calculation method of the four types of descriptors and the details of the six single-ML models and two hybrid-ML models. In Section 3, we analyze and compare the training results of these models and test the performance of the best models on a separate Ru-complex data set and Ir-complex data set. Finally, we make the concluding remarks.

2. Method

In this section, we first present the TMC photosensitizers used in the data set of the machine learning model training process; then, we introduce the calculation method of descriptors; finally, we give the details of the single and hybrid-ML models proposed in this work.

2.1. Data Set Construction and Preprocessing

Our data set consists of 136 data points for TMC photosensitizers collected from different references. From these references, we collect the structures, the solvents and irradiation wavelengths used in the Φ_Δ measurements, and the corresponding Φ_Δ values (data with Φ_Δ less than 0.01, and data whose Φ_Δ differed excessively when only the irradiation wavelength changed, were removed during preprocessing). The property distribution of the data set is shown in Figure . Additionally, an external test set of 11 further data points from the references is used to test the generalization ability of the proposed models. Detailed information on the data set and the external test set is given in Tables S1 and S2.

Distribution of the data set on (A) solvent, (B) metal center, (C) singlet oxygen quantum yield, and (D) irradiation wavelength.

2.2. Descriptor Acquisition

The descriptors used in this work fall into four kinds. Quantum chemistry descriptors are the most important; they reflect the electron transfer information on the S1 and T1 excited states and their differences, calculated by time-dependent density functional theory (TD-DFT). Molecule structure descriptors describe the impact of molecule size, charge, and structure on the photodynamic property. Metal-centered descriptors depict the impact of the metal center and distinguish the different kinds of TMC. External condition descriptors describe the impact of external conditions on the PDT process. The influence of the solvent is described both implicitly and explicitly. The implicit method incorporates the solvent’s effect into the quantum chemistry descriptors through the CPCM and SMD solvent models used in the DFT calculations. The explicit method directly uses the static dielectric constant and the dielectric constant at infinite frequency of the solvent as external condition descriptors.

2.2.1. Quantum Chemistry Descriptors

In the TD-DFT calculation, the excited-state wave function is described by a linear combination of singly excited configuration functions. Each configuration function has a coefficient, w for an excited configuration or w′ for a deexcited configuration. First, the hole distribution ρ^hole and the electron distribution ρ^ele are defined as follows:

ρ^{hole} (r) = ρ_{local}^{hole} (r) + ρ_{cross}^{hole} (r)

ρ_{local}^{hole} (r) = \sum_{i \to a} (w_{i}^{a})^{2} φ_{i} φ_{i} - \sum_{i \leftarrow a} (w_{i}^{' a})^{2} φ_{i} φ_{i}

ρ_{cross}^{hole} (r) = \sum_{i \to a} \sum_{j \neq i \to a} w_{i}^{a} w_{j}^{a} φ_{i} φ_{j} - \sum_{i \leftarrow a} \sum_{j \neq i \leftarrow a} w_{i}^{' a} w_{j}^{' a} φ_{i} φ_{j}

ρ^{ele} (r) = ρ_{local}^{ele} (r) + ρ_{cross}^{ele} (r)

ρ_{local}^{ele} (r) = \sum_{i \to a} (w_{i}^{a})^{2} φ_{a} φ_{a} - \sum_{i \leftarrow a} (w_{i}^{' a})^{2} φ_{a} φ_{a}

ρ_{cross}^{ele} (r) = \sum_{i \to a} \sum_{i \to b \neq a} w_{i}^{a} w_{i}^{b} φ_{a} φ_{b} - \sum_{i \leftarrow a} \sum_{i \leftarrow b \neq a} w_{i}^{' a} w_{i}^{' b} φ_{a} φ_{b}

In eqs –, r is the coordinate vector; φ is the orbital wave function; i or j is an occupied orbital, and a or b is an empty orbital. $\sum_{i \to a}$ denotes summation over the excited configurations and $\sum_{i \leftarrow a}$ denotes summation over the deexcited configurations, while $w_{i}^{a}$ is the coefficient of the excited configuration from occupied orbital i to empty orbital a and $w_{i}^{' a}$ is the coefficient of the deexcited configuration from empty orbital a to occupied orbital i. The hole distribution and electron distribution are each divided into two parts: a local term and a cross term. The local term is generally dominant, reflecting the contribution of the configuration function itself, while the cross term reflects the impact of the coupling between the configuration functions on the hole and electron distributions. Then we adopt the following excitation descriptors:

Sr index = \int \sqrt{ρ^{hole} (r) ρ^{ele} (r)} d r

D index = \sqrt{| X^{ele} - X^{hole} |^{2} + | Y^{ele} - Y^{hole} |^{2} + | Z^{ele} - Z^{hole} |^{2}}

Δ σ index = | σ^{ele} | - | σ^{hole} |

H_{CT} = | H \times u_{CT} |

H index = (| σ^{ele} | + | σ^{hole} |) / 2

t index = D index - H_{CT}

HDI = 100 \times \sqrt{\int [ρ^{hole} (r)]^{2} d r}

EDI = 100 \times \sqrt{\int [ρ^{ele} (r)]^{2} d r}

In these descriptors, the Sr index describes the degree of overlap between the electron and hole distributions. The D index describes the distance between the mass centers of the electron and hole distributions, where X^ele/hole is the X coordinate of the mass center of the electron/hole and Y^ele/hole and Z^ele/hole are the other two coordinates. The Δσ index describes the difference in the overall spatial distribution width of the electron and hole. σ^ele and σ^hole are the distribution breadths (degrees of dispersion) of the electron and hole; their x, y, and z components are the root-mean-square deviations of the electron and hole distributions from the X, Y, and Z coordinates of the respective mass centers, calculated by eq . H_CT describes the average spread of the electron and hole in the charge-transfer (CT) direction, where H collects the average spreads of the electron and hole in the x, y, and z directions, calculated by eq , and u_CT is a unit vector in the CT direction. The H index describes the overall average width of the distribution of electrons and holes. The t index describes the degree of separation between electrons and holes. The hole delocalization index (HDI) and electron delocalization index (EDI) describe the uniformity of the distribution of holes and electrons.

σ_{λ}^{s} = \sqrt{\int (λ - Γ^{s})^{2} ρ^{s} (r) d r}, λ = {x, y, z}, Γ = {X, Y, Z}, s = {hole, ele}

H_{λ} = \frac{σ_{λ}^{ele} + σ_{λ}^{hole}}{2}, λ = {x, y, z}
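
To make the index definitions above concrete, the following is a minimal NumPy sketch that evaluates them by simple quadrature on a uniform Cartesian grid. It is an illustration under stated assumptions, not the procedure used in this work (where the descriptors come from Multiwfn): the function name `hole_electron_indices`, the grid, and the two Gaussian test densities are hypothetical; each density is assumed to integrate to 1; and the product in H_CT is read as a dot product between the vector of H_λ values and u_CT.

```python
import numpy as np

def hole_electron_indices(rho_hole, rho_ele, coords, dv):
    """Evaluate the hole-electron indices by quadrature on a uniform grid.

    rho_hole, rho_ele: (N,) densities at the grid points (each integrating to ~1)
    coords: (N, 3) Cartesian coordinates of the grid points
    dv: volume of one grid cell
    """
    # Mass centers (X, Y, Z) of the hole and electron distributions.
    c_hole = (coords * rho_hole[:, None]).sum(axis=0) / rho_hole.sum()
    c_ele = (coords * rho_ele[:, None]).sum(axis=0) / rho_ele.sum()

    # sigma_lambda: RMS deviation of each distribution along x, y, z.
    sig_hole = np.sqrt((((coords - c_hole) ** 2) * rho_hole[:, None]).sum(axis=0) * dv)
    sig_ele = np.sqrt((((coords - c_ele) ** 2) * rho_ele[:, None]).sum(axis=0) * dv)

    d_vec = c_ele - c_hole
    d_index = np.linalg.norm(d_vec)
    h_vec = 0.5 * (sig_ele + sig_hole)                    # H_lambda for x, y, z
    u_ct = d_vec / d_index if d_index > 0 else np.zeros(3)
    h_ct = abs(h_vec @ u_ct)                              # assumption: dot product

    # Cross terms can make densities slightly negative; clip before the sqrt.
    overlap = np.sqrt(np.clip(rho_hole * rho_ele, 0.0, None))
    return {
        "Sr": overlap.sum() * dv,
        "D": d_index,
        "dsigma": np.linalg.norm(sig_ele) - np.linalg.norm(sig_hole),
        "H_CT": h_ct,
        "H": 0.5 * (np.linalg.norm(sig_ele) + np.linalg.norm(sig_hole)),
        "t": d_index - h_ct,
        "HDI": 100.0 * np.sqrt((rho_hole ** 2).sum() * dv),
        "EDI": 100.0 * np.sqrt((rho_ele ** 2).sum() * dv),
    }

# Toy check: two normalized Gaussians (width 0.5) with centers 2 apart along x.
axis = np.linspace(-5.0, 5.0, 41)
dv = (axis[1] - axis[0]) ** 3
coords = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

def gaussian(center, width=0.5):
    rho = np.exp(-((coords - np.asarray(center)) ** 2).sum(axis=1) / (2 * width ** 2))
    return rho / (rho.sum() * dv)  # normalize numerically on the grid

idx = hole_electron_indices(gaussian([-1.0, 0.0, 0.0]), gaussian([1.0, 0.0, 0.0]), coords, dv)
print({k: round(float(v), 3) for k, v in idx.items()})
```

For this toy case the analytical values are D = 2, H_CT = 0.5, and therefore t = 1.5: the hole and electron are clearly separated along the CT direction, which is the situation a positive t index is meant to flag.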

Additionally, vertical ionization energy (VIE) and vertical electron affinity (VEA) are also candidate quantum chemistry descriptors (QCD); they are critically important for characterizing electron-transfer processes, which are fundamental to Type I PDT mechanisms. They are not included as QCD in our model because they are not directly correlated with the singlet oxygen quantum yield for typical Type II systems.

First, DFT geometry optimizations were carried out using ORCA 5.0.4 with the PBE0-D3 method and the def2-SVP basis set, with the solvation effect considered using the CPCM solvent model. Then, TD-DFT was used for S1 and T1 state geometry optimization and calculation with the PBE0-D3 method and the def2-TZVP basis set with the SMD solvent model, to obtain the excited-state wave function information for the subsequent descriptor calculations. Finally, the S1-T1 spin–orbit coupling (SOC) was calculated at the optimized S1 state structure at the same level of theory as the excited-state calculation. All the quantum chemistry descriptors of the S1 (S-) and T1 (T-) states shown in Table 1 were calculated with Multiwfn 3.8 dev. The property differences (D-) between the S1 state and T1 state were also calculated as QCD. The PBE0 hybrid functional was selected because it has demonstrated accuracy similar to that of the B3LYP functional, one of the most widely used functionals. Moreover, the PBE0 functional is particularly well-suited for transition metal complexes, as it often provides better results in geometry optimization and energy calculation.

Table 1. Meanings of Quantum Chemistry Descriptors

descriptors	meaning
S/T/D-sr	degree of overlap between electron and hole distributions of the S1 state/of the T1 state/of their difference
S/T/D-d	distance between the mass centers of the hole and electron distributions of the S1 state/of the T1 state/of their difference
S/T/D-sigma	difference in the overall spatial distribution width of the electron and hole of the S1 state/of the T1 state/of their difference
S/T/D-hct	average spread of the electron and hole in the CT direction of the S1 state/of the T1 state/of their difference
S/T/D-h	overall average width of the distribution of the electron and hole of the S1 state/of the T1 state/of their difference
S/T/D-t	degree of separation between the electron and hole of the S1 state/of the T1 state/of their difference
S/T/D-hdi	uniformity of the distribution of the hole of the S1 state/of the T1 state/of their difference
S/T/D-edi	uniformity of the distribution of electrons of the S1 state/of the T1 state/of their difference
S/T/D-ee	excitation energy of the S1 state/of the T1 state/difference between S1 and T1 state
S/T/D-mlct	metal to ligand charge transfer proportion of the S1 state/of the T1 state/of their difference
S/T/D-tedm	transition electric dipole moment of the S1 state/of the T1 state/of their difference
S/T/D-tmdm	transition magnetic dipole moment of the S1 state/of the T1 state/of their difference
soc	spin–orbit coupling (SOC) matrix element between S1 and T1 states
S1	absorption wavelength of the S1 state calculated by DFT
fosc1	oscillator strengths of the S1 state via transition electric dipole moments
fosc2	oscillator strengths of the S1 state via transition velocity dipole moments
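
As an illustration of how the D- descriptors in Table 1 relate to the S- and T- descriptors, the short sketch below derives each difference from a dictionary of state-specific values. The helper name `add_difference_descriptors`, the sign convention (S1 value minus T1 value), and the numerical values are assumptions made for this example only.

```python
def add_difference_descriptors(desc, names):
    """Return a copy of `desc` extended with D-<name> = S-<name> - T-<name>.

    The S-minus-T sign convention is an assumption of this sketch.
    """
    out = dict(desc)
    for name in names:
        out[f"D-{name}"] = desc[f"S-{name}"] - desc[f"T-{name}"]
    return out

# Hypothetical values for one complex (excitation energies in eV).
complex_desc = {"S-ee": 2.75, "T-ee": 2.30, "S-sr": 0.62, "T-sr": 0.71}
complex_desc = add_difference_descriptors(complex_desc, ["ee", "sr"])
print(complex_desc["D-ee"], complex_desc["D-sr"])
```

With this convention, D-ee is the singlet–triplet gap between the S1 and T1 excitation energies.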

2.2.2. Molecule Structure Descriptors

Molecule structure descriptors describe the impact of molecule size, charge, and structure on the photodynamic property. They include the charge of the entire complex cation (nc), the total charge of all ligands (lc), the charge of the atoms connected to the metal center (cc), the number of atoms, the relative molecular mass, and the numbers of important functional groups (if a functional group contains several smaller functional groups, only the largest functional group is counted). Take two complexes shown in Table S1 as an example. For complex 1, the entire complex cation has a charge of +2, so nc = 2; all three ligands are electrically neutral, so lc = 0; among the six atoms connected to ruthenium, the oxygen carries one unit of negative charge, so cc = −1; and the Ru in the complex has a valence of +2, so mc = 2. By the same reasoning, for complex 17, nc = 2, lc = 0, cc = 0, and mc = 2. The detailed meanings are shown in Table 2.

Table 2. Meanings of Molecular Structure Descriptors

descriptors	meaning
nc	net charge of the entire complex cation
lc	charge of all ligands
cc	charge of the atoms connected to the metal center
an	number of atoms
mw	relative molecular mass
n-X	number of halogens
n-COO	number of ester groups
n-CO	number of carbonyl groups
n-CHO	number of aldehyde groups
n-CONH	number of peptide bonds
n-OH	number of hydroxyl groups
n-NH2	number of amino groups
n-S	number of sulfur atoms
n-O	number of oxygen atoms
n-CN	number of cyano groups
n-bodipy	number of dipyrromethene boron difluorides
n-py	number of pyridines
n-ph	number of phenyl groups
n-pyra	number of pyrazines
n-pyrr	number of pyrroles
n-r6	number of six-membered rings
n-r5	number of five-membered rings
n-db	number of double bonds
n-tb	number of triple bonds

2.2.3. Metal-Centered Descriptors

Metal-centered descriptors distinguish the impact of different transition metal centers on TMC properties by describing the number of metal centers (cn), the charge of the metal center (mc), the period in which the metal element is located (cp), and the outer electron configuration of the metal atom (cs, cd, and cf for the s-, d-, and f-electron counts). The outer electron configuration is based solely on the intrinsic properties of the free metal atom (prior to coordination). For example, Ru is in period 5, so cp = 5; Ru has the configuration [Kr]4d⁷5s¹, so cs = 1, cd = 7, and cf = 0. The detailed meanings are shown in Table 3.
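
The Ru example above can be reproduced with a small lookup, sketched below. The function `metal_descriptors` and the `FREE_ATOM` table are hypothetical names; the configurations are the standard ground-state configurations of the free atoms. Whether the filled 4f shell of Ir and Re is counted in cf is not stated in the text, so the value cf = 14 that this sketch would return for those metals is an assumption.

```python
import re

# Ground-state configurations of the free atoms and their periods.
FREE_ATOM = {
    "Ru": ("[Kr] 4d7 5s1", 5),
    "Ir": ("[Xe] 4f14 5d7 6s2", 6),
    "Re": ("[Xe] 4f14 5d5 6s2", 6),
}

def metal_descriptors(symbol, charge, n_centers=1):
    """Build the cn, cp, mc, cs, cd, cf descriptors for one complex."""
    config, period = FREE_ATOM[symbol]
    counts = {"s": 0, "d": 0, "f": 0}
    # Each subshell token looks like "4d7": shell number, type, electron count.
    for _shell, subshell, electrons in re.findall(r"(\d)([spdf])(\d+)", config):
        if subshell in counts:
            counts[subshell] = int(electrons)
    return {
        "cn": n_centers,
        "cp": period,
        "mc": charge,
        "cs": counts["s"],
        "cd": counts["d"],
        "cf": counts["f"],  # assumption: a filled 4f shell counts as 14
    }

ru = metal_descriptors("Ru", charge=2)
print(ru)
```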

Table 3. Meanings of Metal-Centered Descriptors

descriptors	meaning
cn	number of metal centers
cp	period in which the central metal element is located
mc	charge of the metal center
cf	number of electrons in the outermost f orbital of the central metal
cd	number of electrons in the outermost d orbital of the central metal
cs	number of electrons in the outermost s orbital of the central metal

2.2.4. External Condition Descriptors

External condition descriptors describe the impact of external conditions on the PDT process, including the static dielectric constant of the solvent, the dielectric constant of the solvent at infinite frequency, and the irradiation wavelength. The detailed meanings are shown in Table 4.

Table 4. Meanings of External Condition Descriptors

descriptors	meaning
eps	static dielectric constant of the solvent
epsinf	dielectric constant at the infinite frequency of the solvent
wl	irradiation wavelength

2.3. Machine Learning Models

The data set was randomly divided into a training set (90%, 122 data points) and a test set (10%, 14 data points). Leave-one-out (LOO) cross-validation was used on the training set to test the stability of the model. The trained models are also tested on the external test set to assess their generalization ability. All the input descriptors are normalized by eq (soc is normalized after the logarithm is taken).

X_{normalized} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}
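
The split and the normalization above can be sketched in a few lines of plain Python. Several details are assumptions of this sketch rather than statements from the text: the base of the logarithm applied to soc (base 10 here), the reuse of the training-set minimum and maximum for other data, the random seed, and the helper names `fit_min_max` and `min_max`. The soc values are made up.

```python
import math
import random

def fit_min_max(train_values):
    """Return the (min, max) pair used to scale a descriptor."""
    return min(train_values), max(train_values)

def min_max(values, lo, hi):
    """Apply X_normalized = (X - X_min) / (X_max - X_min)."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

# Random 90/10 split of the 136 data points (indices only).
indices = list(range(136))
random.Random(0).shuffle(indices)
n_test = round(0.10 * len(indices))
test_idx, train_idx = indices[:n_test], indices[n_test:]

# soc is log-transformed before scaling (hypothetical values, assumed log10).
soc_train = [1.0, 10.0, 100.0]
log_soc = [math.log10(v) for v in soc_train]
lo, hi = fit_min_max(log_soc)
soc_norm = min_max(log_soc, lo, hi)
print(len(train_idx), len(test_idx), soc_norm)
```

Taking the logarithm first keeps a few very large SOC matrix elements from compressing all other values toward zero after min-max scaling.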

Six candidate single-ML models are first applied to the small data set of TMC photosensitizers: support vector regression (SVR), kernel ridge regression (KRR), Gaussian process regression (GPR), eXtreme Gradient Boosting regression (XGBoost), random forest regression (RFR), and k-nearest neighbors regression (KNR). These models are first trained on all descriptors to obtain the descriptor importance ranking through SHAP analysis; then the top 30–50 descriptors (at intervals of 5) are tested as the model input, and the models are retrained to find the best descriptor group. To further improve the prediction accuracy and generalization ability of the single-ML models, two hybrid models are proposed: the delta-learning model (DLM) and the Mixture-of-Experts model (MoE). The process of the delta-learning model is shown in Figure . The first model is used to predict the target value, and the next model is used to predict the error (delta) between the real value and the value predicted by the previous model as a correction term, and so on. Thus, the final predicted value is the value predicted by the first model plus the errors predicted by all subsequent models. MoE uses multiple single-ML models as expert models to predict the target value simultaneously, as shown in Figure . The final predicted value of the MoE model is the weighted average of the predicted values of the expert models. All the model parameters above are optimized with the Optuna library in Python 3.11, and the hyperparameters optimized for these models are shown in Table 5.
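
The descriptor filter described above amounts to a loop over candidate subset sizes. The sketch below shows one way to write it, assuming an importance score per descriptor (for example a mean absolute SHAP value) is already available. The function name `filter_descriptors`, the `evaluate` callback, and the rule of keeping the subset with the highest score are assumptions; the text does not specify the selection criterion.

```python
def filter_descriptors(importance, evaluate, sizes=range(30, 51, 5)):
    """Pick the best top-k descriptor subset.

    importance: dict mapping descriptor name -> importance score
    evaluate:   hypothetical callback that retrains a model on the given
                descriptor names and returns a score (higher is better)
    sizes:      subset sizes to test (30, 35, 40, 45, 50)
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    best_score, best_subset = None, None
    for k in sizes:
        subset = ranked[:k]
        score = evaluate(subset)
        if best_score is None or score > best_score:
            best_score, best_subset = score, subset
    return best_score, best_subset

# Toy usage: 60 dummy descriptors and a stand-in score that peaks at 45 inputs.
toy_importance = {f"d{i}": 60 - i for i in range(60)}
score, subset = filter_descriptors(toy_importance, lambda s: -abs(len(s) - 45))
print(len(subset), subset[:3])
```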

Process of the Mixture-of-Experts model.

Table 5. Hyperparameters Optimized for the Single-ML Models

models	parameters optimized
SVR	penalty coefficient, tolerance, kernel type
KRR	regularization parameter, hyperparameter of Gaussian kernel
GPR	noise variance, kernel length scale, number of optimizers
XGBoost	max depth of the tree, learning rate, number of estimators, sample ratio and feature ratio of each tree
RFR	number of decision trees, number of features considered per split, max depth of the decision tree
KNR	number of neighbors, distance measurement parameters, weight allocation strategies, nearest neighbor search algorithms

The target variable for all machine learning models developed in this work is the experimentally measured Φ_Δ. Because the machine learning models are inherently nonlinear and very complex in explicit form, SHAP analysis is subsequently employed to show the contribution of each descriptor to the predicted Φ_Δ, which provides strong interpretability of the proposed model.

3. Results and Discussion

In this section, we present the result of the descriptor filter and the training result with the best descriptor group of single-ML models. We use SHAP analysis to determine the importance of descriptors and choose the most important ones. Then, we show the performance of two hybrid models to find the best model to predict the Φ_Δ of TMC photosensitizers. Finally, we compare the generalized metal model trained on three six-coordination TMCs with the specialized metal model trained on a specific TMC.

3.1. Single-ML Models

To verify the importance of the different kinds of descriptors, we first compared models using all descriptors against models in which one kind of descriptor was removed at a time. The key finding is that removing the QCD led to the most significant drop in model performance (the R² on the external test set decreased from 0.830 to 0.051 for SVR, from 0.747 to 0.214 for KRR, and from 0.451 to −0.142 for GPR), as shown in Table S3. This demonstrates that the QCD provide unique and critical information that cannot be compensated for by the other descriptors. After the descriptor filter, the best models are the SVR model and the KRR model, followed by GPR, as shown in Tables 6 and S4. With the best descriptor groups, these models show good regression performance and generalization ability, as shown in Table 7 and Figure . The stability of these three models is limited (LOO Q² between 0.62 and 0.65) because of the small data set, the complexity of the mechanism of the photodynamic therapy process, and the different environmental influences during the Φ_Δ measurements. The other three single-ML models do not satisfy the conditions of the QSPR model (Q² ≥ 0.6 in cross-validation and R² ≥ 0.6 on the external test set). The superior performance of the kernel-based models (SVR, KRR, and GPR) over the tree-based models (RFR and XGBoost) can be attributed to the nature of our feature space. Our descriptor set primarily consists of continuous, normalized quantum-chemical and structural properties. Kernel methods are particularly adept at modeling complex, nonlinear relationships in such continuous feature spaces by implicitly mapping them into higher-dimensional spaces where linear relationships may be found. In contrast, tree-based models, which rely on axis-aligned splits, often perform exceptionally well with high-dimensional, sparse data such as molecular fingerprints. The descriptor importance rankings from the SHAP analysis of the six single-ML models with all descriptors are shown in Figures S1–S6.
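
For reference, the metrics reported in the tables and the QSPR acceptance conditions quoted above can be written out as follows. This is a plain-Python sketch: `regression_metrics`, `loo_q2`, and `passes_qspr` are hypothetical helper names, `fit_predict` stands for any routine that trains on the reduced set and predicts the held-out point, and computing Q² with the R² formula on the leave-one-out predictions is a common convention assumed here rather than a definition given in the text.

```python
def regression_metrics(y_true, y_pred):
    """Return R2, MaxAE, MAE, and MSE for paired true and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MaxAE": max(abs(e) for e in errors),
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": ss_res / n,
    }

def loo_q2(fit_predict, X, y):
    """Leave-one-out Q2: predict each point from a model trained on the rest."""
    preds = []
    for i in range(len(y)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        preds.append(fit_predict(X_train, y_train, X[i]))
    return regression_metrics(y, preds)["R2"]

def passes_qspr(q2_loo, r2_external, threshold=0.6):
    """QSPR conditions quoted in the text: Q2 >= 0.6 and external R2 >= 0.6."""
    return q2_loo >= threshold and r2_external >= threshold

metrics = regression_metrics([0.2, 0.4, 0.6], [0.25, 0.4, 0.55])
print(metrics)
# SVR with 45 filtered descriptors versus SVR with all descriptors (Table 6).
print(passes_qspr(0.622, 0.815), passes_qspr(0.579, 0.830))
```

The second call shows why the descriptor filter matters: with all descriptors, SVR has the highest external R² but its LOO Q² falls below the 0.6 threshold.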

Table 6. R² (Q² for LOO Cross-Validation) of the SVR and KRR Models in the Descriptor Filter

model	data set	30 descriptors	35 descriptors	40 descriptors	45 descriptors	50 descriptors	all descriptors
SVR	training set	0.970	0.957	0.992	0.992	0.990	0.990
	test set	0.948	0.947	0.940	0.927	0.940	0.935
	external test set	0.681	0.603	0.741	0.815	0.772	0.830
	LOO cross-validation	0.568	0.618	0.619	0.622	0.626	0.579
KRR	training set	0.991	0.998	0.996	0.997	0.996	0.993
	test set	0.942	0.914	0.884	0.903	0.924	0.944
	external test set	0.713	0.663	0.700	0.753	0.729	0.747
	LOO cross-validation	0.594	0.608	0.616	0.635	0.534	0.593

Table 7. Performance of the Three Best Models with Filtered Descriptors

model	data set	R² (Q²)	MaxAE	MAE	MSE
SVR (45 filtered descriptors)	training set	0.992	0.091	0.024	0.001
	test set	0.927	0.130	0.049	0.004
	external test set	0.815	0.254	0.091	0.016
	LOO cross-validation	0.622	0.631	0.122	0.033
KRR (45 filtered descriptors)	training set	0.997	0.055	0.011	0.0003
	test set	0.903	0.157	0.061	0.005
	external test set	0.753	0.296	0.115	0.021
	LOO cross-validation	0.635	0.610	0.119	0.032
GPR (40 filtered descriptors)	training set	0.989	0.117	0.019	0.001
	test set	0.866	0.261	0.056	0.007
	external test set	0.701	0.303	0.125	0.025
	LOO cross-validation	0.647	0.556	0.122	0.031

Performance of (A) SVR model, (B) KRR model, and (C) GPR model on the training set, test set, and external test set.

The contributions of the filtered descriptors are ranked by SHAP analysis, as shown in Figures S7–S9. Among the top 15 descriptors of the three models, 12 are the same, indicating that these descriptors describe the factors influencing the photodynamic therapy process well and that the models recognize these factors, as shown in Figure A. QCD dominate among these descriptors, which indicates that the models have strong interpretability with respect to the PDT mechanism, as shown in Figure B. The excitation energies of the S1 state (S-ee) and the T1 state (T-ee) are the two most important descriptors; they influence Φ_Δ through the singlet–triplet energy gap ΔE_ST in the ISC process. The notable importance of molecule structure descriptors such as relative molecular mass (mw) and number of atoms (an), which are not directly involved in the photophysical process, can be interpreted as a consequence of the data set’s composition. We posit that the relative molecular mass acts not as a direct causal factor but rather as a proxy variable for molecular complexity, which is a key outcome of successful ligand engineering. To achieve high singlet oxygen quantum yields, sophisticated ligand modifications, such as extending π-conjugation, are typically employed. These modifications inevitably increase the molecular mass and atom number. Consequently, the model identifies a correlation wherein higher-performing molecules tend to possess more complex, and therefore heavier, architectures. This insight underscores that the model learns from the patterns of successful design presented in the literature.

Modeling descriptor results: (A) distribution of the top 15 descriptors of the SVR, KRR, and GPR models from separate SHAP analyses and (B) the descriptor types they belong to.

3.2. Hybrid-ML Models

3.2.1. Delta-Learning Model

In this section, we use a two-layer delta-learning model. We selected the top-performing single model, SVR, as the base model to provide a strong initial estimate. For the delta model, which must capture the complex pattern of the base model's errors, we chose the second-best performer, KRR. Compared with its constituent SVR and KRR models, the delta-learning model not only fits the training set and test set better but also generalizes better to the external test set, as shown in Table 8 and panel A of the corresponding figure. This indicates that the delta-learning model corrects the output of the single-ML model by additionally predicting its residual, thereby improving generalization.
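
The two-layer structure described above can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions, not the authors' implementation: a strongly regularized kernel ridge model stands in for the SVR base model, the delta model is a small hand-written RBF kernel ridge regression, and the data, class names, and hyperparameters are hypothetical.

```python
import math


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


class KernelRidge:
    """Minimal kernel ridge regression with an RBF kernel."""

    def __init__(self, alpha, gamma):
        self.alpha, self.gamma = alpha, gamma

    def fit(self, X, y):
        self.X = [list(row) for row in X]
        K = [[rbf(a, b, self.gamma) + (self.alpha if i == j else 0.0)
              for j, b in enumerate(self.X)] for i, a in enumerate(self.X)]
        self.coef = solve(K, list(y))
        return self

    def predict(self, X):
        return [sum(c * rbf(x, xi, self.gamma) for c, xi in zip(self.coef, self.X))
                for x in X]


class DeltaLearner:
    """Base model plus a second model trained on the base model's residuals."""

    def __init__(self, base, delta):
        self.base, self.delta = base, delta

    def fit(self, X, y):
        self.base.fit(X, y)
        residuals = [t - p for t, p in zip(y, self.base.predict(X))]
        self.delta.fit(X, residuals)  # second layer learns the remaining error
        return self

    def predict(self, X):
        return [b + d for b, d in zip(self.base.predict(X), self.delta.predict(X))]


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic one-descriptor data, for illustration only.
X = [[0.1 * i] for i in range(11)]
y = [math.sin(3.0 * row[0]) + 0.3 * row[0] for row in X]

base_only = KernelRidge(alpha=1.0, gamma=1.0).fit(X, y)  # stand-in for SVR
dlm = DeltaLearner(KernelRidge(alpha=1.0, gamma=1.0),
                   KernelRidge(alpha=1e-3, gamma=20.0)).fit(X, y)

mse_base = mse(y, base_only.predict(X))
mse_dlm = mse(y, dlm.predict(X))
```

In this sketch the training error of the combined model drops by construction, because the second layer is fitted to the training residuals. Whether the correction generalizes has to be judged on held-out data, as the paper does with the test set and external test set.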

Table 8. Performance of DLM and MoE.

model	data set	R² (Q²)	MaxAE	MAE	MSE
DLM	training set	1.000	0.031	0.004	10^–5
	test set	0.928	0.132	0.051	0.004
	external test set	0.820	0.256	0.091	0.015
	LOO cross-validation	0.625	0.628	0.120	0.032
mixture-of-SVR-KRR-GPR model (MoE1)	training set	0.990	0.067	0.025	0.001
	test set	0.922	0.132	0.055	0.004
	external test set	0.870	0.211	0.083	0.011
	LOO cross-validation	0.657	0.637	0.117	0.030
mixture-of-SVR-KRR model (MoE2)	training set	0.992	0.092	0.024	0.001
	test set	0.927	0.142	0.051	0.004
	external test set	0.840	0.234	0.085	0.014
	LOO cross-validation	0.621	0.660	0.120	0.033
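
The LOO cross-validation rows report Q² rather than R². A minimal sketch of the procedure is given below, assuming the common definition Q² = 1 − PRESS/SS_tot with the total sum of squares taken about the mean of all observations; the authors may use a different variant. The two placeholder models and the data are hypothetical.

```python
def loo_q2(X, y, fit_predict):
    """Leave-one-out Q^2 = 1 - PRESS / SS_tot.

    fit_predict(X_train, y_train, x_query) must train a fresh model on the
    training fold and return its prediction for the single held-out sample.
    """
    n = len(y)
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        prediction = fit_predict(X_train, y_train, X[i])
        press += (y[i] - prediction) ** 2
    mean_y = sum(y) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - press / ss_tot


def mean_predictor(X_train, y_train, x_query):
    """Trivial baseline model: always predict the training-fold mean."""
    return sum(y_train) / len(y_train)


def nearest_neighbour(X_train, y_train, x_query):
    """1-nearest-neighbour regressor on squared Euclidean distance."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, x_query))
    best = min(range(len(X_train)), key=lambda k: dist(X_train[k]))
    return y_train[best]


# Synthetic data, for illustration only.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 1.0, 2.0, 3.0, 4.0]
q2_mean = loo_q2(X, y, mean_predictor)
q2_nn = loo_q2(X, y, nearest_neighbour)
```

Because every prediction is made for a sample the model has not seen, Q² can be far below the training R² and can even be negative, as it is for the mean predictor here.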


Performance of (A) DLM, (B) mixture-of-SVR-KRR-GPR model, and (C) mixture-of-SVR-KRR model on training set, test set, and external test set.

3.2.2. Mixture-of-Experts Model

In this section, we designed two variants to systematically evaluate the impact of expert composition: MoE1 (three experts), which incorporates all three kernel-based models (SVR, KRR, and GPR), and MoE2 (two experts), which incorporates only the top two models (SVR and KRR). This comparative design allowed us to test whether the MoE model performs best with a focused set of the best experts or with a broader ensemble. The results show that both MoE variants fit the training set and test set better than the single-ML models and generalize better to the external test set, as shown in Table 8 and panels B and C of the corresponding figure. This indicates that the MoE model balances the errors of the individual single-ML models and thereby improves prediction ability.
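
How a mixture of experts can balance the errors of its members is illustrated by the sketch below. The gating scheme used by the authors is not described in this section, so the gate here is purely an assumption: each expert is scored by its similarity-weighted local error on validation points near the query, and a softmax turns those scores into mixing weights. The experts, data, and parameter values are hypothetical.

```python
import math


def gate_weights(x, X_val, abs_errors, gamma=10.0, temperature=0.05):
    """Input-dependent mixing weights for the experts at query point x.

    abs_errors[k][i] is the absolute error of expert k on validation sample i.
    Each expert is scored by its similarity-weighted local error around x, and
    the scores are turned into weights with a softmax.
    """
    sim = [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))
           for xi in X_val]
    total = sum(sim)
    if total == 0.0:  # query far from all validation points: weight them equally
        sim, total = [1.0] * len(X_val), float(len(X_val))
    local = [sum(s * e for s, e in zip(sim, errs)) / total for errs in abs_errors]
    best = min(local)  # shift scores for numerical stability
    scores = [math.exp(-(err - best) / temperature) for err in local]
    norm = sum(scores)
    return [s / norm for s in scores]


def moe_predict(x, experts, X_val, abs_errors, **gate_options):
    """Weighted sum of the expert predictions at x."""
    weights = gate_weights(x, X_val, abs_errors, **gate_options)
    return sum(w * expert(x) for w, expert in zip(weights, experts))


# Two hypothetical experts, each accurate in a different region.
experts = [lambda x: x[0], lambda x: 0.5]
X_val = [[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]]
y_val = [0.0, 0.2, 0.4, 0.5, 0.5, 0.5]
abs_errors = [[abs(t - f(x)) for x, t in zip(X_val, y_val)] for f in experts]

w_low = gate_weights([0.1], X_val, abs_errors)
w_high = gate_weights([0.9], X_val, abs_errors)
y_low = moe_predict([0.1], experts, X_val, abs_errors)
```

With this gate, the first expert dominates near x = 0.1 and the second near x = 0.9, so the mixture follows whichever expert has been more reliable in the neighbourhood of the query.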

On the training set and the test set, all three hybrid models show improved fits. On the external test set, the best-performing hybrid model is MoE1 (R² = 0.870), followed by MoE2 (R² = 0.840) and the delta-learning model (R² = 0.820). Hybrid models can therefore improve the prediction accuracy and generalization ability of single-ML models for the photosensitizing properties of TMCs.

3.3. Comparison with the Specialized Metal Model

In this section, we retrain the two kinds of hybrid models, the delta-learning model (DLM) and the MoE1 model, from scratch on Ru-complex and Ir-complex photosensitizers separately (first training the single-ML models to obtain the filtered descriptor groups, then optimizing and retraining the two hybrid models). The results are compared with the performance of the same two models trained in Section 3.2 on the full TMC photosensitizer data set, to test whether the generalized metal model trained on three kinds of hexacoordinate TMCs can replace a specialized metal model trained on TMCs of a single metal center. The Ru-complex data set consists of 77 data points in the internal data set (split randomly 9:1 into training and test sets) and 5 data points in the external test set. The Ir-complex data set consists of 51 data points in the internal data set and 5 data points in the external test set.

On the Ru-complex data set, the generalized metal DLM (trained on all metal complexes) performed slightly better than the specialized DLM trained only on Ru-complexes (its upper and lower error quartiles are closer to 0) but slightly worse on the external test set, although the difference was minor, as shown in the corresponding figure. On the Ir-complex data set, it also performed marginally better (with a smaller maximum negative error) and essentially the same on the external test set. This indicates that the generalized metal DLM has strong universality and can replace the specialized metal DLMs trained on a single metal center. The detailed results of the DLM comparison are given in Table S5.

Violin plots of the DLM error distribution for (A) Ru-complexes and (B) Ir-complexes, and comparison of the external test set results for (C) Ru-complexes and (D) Ir-complexes.
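
The quantities read off such violin plots (error quartiles and maximum positive and negative errors) can also be compared numerically. The sketch below summarizes the signed errors of two models; it is an illustration only, the numbers are hypothetical rather than results from the paper, and the "inclusive" quartile convention is an assumption.

```python
from statistics import quantiles


def error_summary(y_true, y_pred):
    """Five-number summary of signed errors (prediction minus measurement)."""
    errors = sorted(p - t for t, p in zip(y_true, y_pred))
    q1, median, q3 = quantiles(errors, n=4, method="inclusive")
    return {"min": errors[0], "q1": q1, "median": median,
            "q3": q3, "max": errors[-1]}


# Hypothetical numbers, for illustration only (not results from the paper).
measured = [0.30, 0.45, 0.60, 0.75, 0.90]
pred_model_a = [0.28, 0.46, 0.60, 0.78, 0.85]
pred_model_b = [0.22, 0.50, 0.58, 0.84, 0.80]

summary_a = error_summary(measured, pred_model_a)
summary_b = error_summary(measured, pred_model_b)
iqr_a = summary_a["q3"] - summary_a["q1"]  # interquartile range of the errors
iqr_b = summary_b["q3"] - summary_b["q1"]
```

A narrower interquartile range and smaller extreme errors correspond to a tighter violin centred closer to zero.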

Both the DLM and the MoE model discussed below perform worse on the external test set of the Ir-complex data set than on that of the Ru-complex data set. There are two possible reasons. (I) The smaller data set for Ir-complexes might contribute to the observed difference: the model may have learned more robust patterns for Ru-complexes because of their larger representation, leading to more confident and accurate predictions for this class. (II) For many Ru-complexes, the dominant route for singlet oxygen generation involves intersystem crossing (ISC) between the S1 and T1 states. In contrast, Ir-complexes, with their stronger spin–orbit coupling, may exhibit more complex excited-state dynamics, and singlet oxygen generation may involve energy transfer from other triplet states (e.g., T2) or proceed through states of mixed triplet character. Our current QCDs, which focus on the S1 and T1 states, may not fully capture the energetics and dynamics of these alternative pathways, leading to reduced predictive accuracy for Ir-complexes.

For Ru-complexes, the generalized metal MoE1 model performs slightly worse than the specialized metal MoE1 model (with a wider distribution of prediction errors), and its performance on the external test set is also slightly lower, as shown in the corresponding figure. For Ir-complexes, the generalized metal MoE1 model performs slightly better than the specialized metal MoE1 model (with lower maximum positive and negative errors), and the two perform essentially the same on the external test set. This indicates that the MoE model also has strong universality and can be used to predict the properties of various TMC photosensitizers. Compared with the DLM, the MoE model achieved better performance on the external test sets of both Ru-complexes and Ir-complexes. This might be because the MoE model, which combines multiple single-layer models, is more likely to learn patterns from small data sets than the DLM, which uses a two-layer model. The detailed results of the MoE1 model comparison are given in Table S6.

Violin plots of the MoE1 model error distribution for (A) Ru-complexes and (B) Ir-complexes, and comparison of the external test set results for (C) Ru-complexes and (D) Ir-complexes.

4. Conclusions

Transition metal complexes are promising photosensitizer candidates for PDT because of their high singlet oxygen quantum yield (Φ_Δ) and water solubility, but both the synthesis of photosensitizers and the experimental determination of Φ_Δ are time-consuming and laborious. Traditional structural representations used in machine learning models, such as SMILES, can hardly capture all the information about a TMC, and the scarcity of data makes deep learning models difficult to apply. In this work, we propose a DFT-ML modeling approach to predict the photosensitizing properties of TMCs. Excited-state descriptors are calculated by DFT and combined with other descriptors to characterize the structure of TMC photosensitizers and their charge transition process under light irradiation. Six single-ML models and two kinds of hybrid-ML models are built on these descriptors, and their performance in predicting Φ_Δ is tested.

The best descriptor group is filtered for each single-ML model to optimize it, and the best single models are then used to build the hybrid models. Among the single-ML models, SVR and KRR provide good predictions on the test set (R² > 0.9) and the external test set (R² > 0.7), while the DLM and MoE models further improve prediction (R² up to 0.87 on the external test set). Comparison with the same hybrid-ML models trained on a single metal center indicates that the proposed models also have strong universality (ΔR² < 0.1 on the external test set between the generalized and specialized metal models) and can be used to predict the properties of various TMC photosensitizers. The subsequent SHAP analysis makes the models interpretable in terms of the PDT mechanism: the excitation energies of the S1 and T1 states are the two most important descriptors, while relative molecular mass and the dielectric constant of the solvent at infinite frequency also have a marked impact. These results demonstrate that excited-state descriptors are effective for predicting properties of the PDT process, and that hybrid-ML models built on them can provide accurate predictions of photosensitizing properties from a small TMC data set. The proposed approach could therefore serve as a useful screening step and source of theoretical guidance before synthesis and photosensitivity testing.

This approach can filter out a large proportion of seemingly promising but low-performing candidates at the computational stage. By reducing the number of compounds that require synthesis, the proposed model can (I) significantly decrease the consumption of valuable and expensive metal precursors, ligands, and other chemicals, (II) save weeks or months of synthetic and characterization labor, and (III) allow researchers to focus their experimental efforts on the most promising leads, thereby increasing the efficiency and success rate of the discovery pipeline. However, the scarcity of experimentally measured Φ_Δ values for TMCs is a fundamental constraint in the field and is the direct cause of our small data set. In the future, we will work collaboratively to expand the data set by incorporating photosensitizers based on other metals (such as Pd, Pt, and Zn), leading to a more robust and universally applicable model.

Supplementary Material

ao5c08727_si_001.xlsx (89.8 KB, xlsx)

ao5c08727_si_002.pdf (1.5 MB, pdf)

Acknowledgments

The authors appreciate funding from the National Natural Science Foundation of China (22308044).

Glossary

Abbreviations

TMC: transition metal complex
PS: photosensitizer
DFT: density functional theory
TD-DFT: time-dependent density functional theory
ML: machine learning
QSPR: quantitative structure–property relationship
PDT: photodynamic therapy
ROS: reactive oxygen species
ISC: intersystem crossing
CT: charge-transfer
QCD: quantum chemistry descriptor
MSD: molecule structure descriptor
MCD: metal-centered descriptor
ECD: external condition descriptor
LOO: leave-one-out
SVR: support vector regression
KRR: kernel ridge regression
GPR: Gaussian process regression
XGBoost: extreme gradient boosting regression
RFR: random forest regression
KNR: k-neighbor regression
DLM: delta-learning model
MoE: Mixture-of-Experts

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.5c08727.

Detailed calculated descriptors input used for model training (XLSX)
Detailed information on TMC photosensitizers used to construct the data set and external test set, result of the GPR model, XGBoost model, RFR model, and KNR model in a descriptor filter, result of hybrid model comparison on specialized TMC photosensitizers, result of SHAP analysis, and optimized hyperparameters of the machine learning models (PDF)

The authors declare no competing financial interest.


Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ao5c08727_si_001.xlsx^{(89.8KB, xlsx)}

ao5c08727_si_002.pdf^{(1.5MB, pdf)}

[ref1] Sung H., Ferlay J., Siegel R. L., Laversanne M., Soerjomataram I., Jemal A., Bray F.. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA. Cancer. J. Clin. 2021;71(3):209–249. doi: 10.3322/caac.21660. [DOI] [PubMed] [Google Scholar]

[ref2] Mohamedahmed A., Zaman S., Wuheb A. A., Ismail A., Nnaji M., Alyamani A. A., Eltyeb H. A., Yassin N. A.. Peri-operative, oncological and functional outcomes of robotic versus transanal total mesorectal excision in patients with rectal cancer: a systematic review and meta-analysis. Technol. Coloproctology. 2024;28(1):75. doi: 10.1007/s10151-024-02947-x. [DOI] [PubMed] [Google Scholar]

[ref3] de Castilhos J., Tillmanns K., Blessing J., Larano A., Borisov V., Stein-Thoeringer C. K.. Microbiome and pancreatic cancer: time to think about chemotherapy. Gut Microbes. 2024;16(1):2374596. doi: 10.1080/19490976.2024.2374596. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref4] Jain S. M., Nagainallur R. S., Murali K. M., Banerjee A., Sun-Zhang A., Zhang H., Pathak R., Sun X. F., Pathak S.. Understanding the molecular mechanism responsible for developing therapeutic radiation-induced radioresistance of rectal cancer and improving the clinical outcomes of radiotherapy - a review. Cancer Biol. Ther. 2024;25(1):2317999. doi: 10.1080/15384047.2024.2317999. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref5] Malakar P., Shukla S., Mondal M., Kar R. K., Siddiqui J. A.. The nexus of long noncoding RNAs, splicing factors, alternative splicing and their modulations. RNA Biol. 2024;21(1):1–20. doi: 10.1080/15476286.2023.2286099. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref6] Karimi M., Homayoonfal M., Zahedifar M., Ostadian A., Adibi R., Mohammadzadeh B., Raisi A., Ravaei F., Rashki S., Khakbraghi M.. et al. Development of a novel nanoformulation based on aloe vera-derived carbon quantum dot and chromium-doped alumina nanoparticle (al2o3:cr@cdot NPs): evaluating the anticancer and antimicrobial activities of nanoparticles in photodynamic therapy. Cancer Nanotechnol. 2024;15(1):26. doi: 10.1186/s12645-024-00260-8. [DOI] [Google Scholar]

[ref7] Li X., Li X., Park S., Wu S., Guo Y., Nam K. T., Kwon N., Yoon J., Hu Q.. Photodynamic and photothermal therapy via human serum albumin delivery. Coord. Chem. Rev. 2024;520:216142. doi: 10.1016/j.ccr.2024.216142. [DOI] [Google Scholar]

[ref8] Lin J., Wan M. T.. Current evidence and applications of photodynamic therapy in dermatology. Clin. Cosmet. Investig. Dermatol. 2014;7:145. doi: 10.2147/CCID.S35334. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref9] Hu J., Lei Q., Zhang X.. Recent advances in photonanomedicines for enhanced cancer photodynamic therapy. Prog. Mater. Sci. 2020;114:100685. doi: 10.1016/j.pmatsci.2020.100685. [DOI] [Google Scholar]

[ref10] Ni K., Luo T., Nash G. T., Lin W.. Nanoscale metal–organic frameworks for cancer immunotherapy. Acc. Chem. Res. 2020;53(9):1739–1748. doi: 10.1021/acs.accounts.0c00313. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref11] Baptista M. S., Cadet J., Di Mascio P., Ghogare A. A., Greer A., Hamblin M. R., Lorente C., Nunez S. C., Ribeiro M. S., Thomas A. H.. et al. Type i and type II photosensitized oxidation reactions: guidelines and mechanistic pathways. Photochem. Photobiol. 2017;93(4):912–919. doi: 10.1111/php.12716. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref12] Baptista M. S., Cadet J., Greer A., Thomas A. H.. Photosensitization reactions of biomolecules: definition, targets and mechanisms. Photochem. Photobiol. 2021;97(6):1456–1483. doi: 10.1111/php.13470. [DOI] [PubMed] [Google Scholar]

[ref13] Liu Y., Meng X., Bu W.. Upconversion-based photodynamic cancer therapy. Coord. Chem. Rev. 2019;379:82–98. doi: 10.1016/j.ccr.2017.09.006. [DOI] [Google Scholar]

[ref14] Montaseri H., Kruger C. A., Abrahamse H.. Review: organic nanoparticle based active targeting for photodynamic therapy treatment of breast cancer cells. Oncotarget. 2020;11(22):2120–2136. doi: 10.18632/oncotarget.27596. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref15] Erb J., Setter D., Swavey J., Willits F., Swavey S.. BODIPY-ruthenium(II) polypyridyl complexes: synthesis, computational, spectroscopic, electrochemical, and singlet oxygen studies. Inorg. Chim. Acta. 2024;560:121831. doi: 10.1016/j.ica.2023.121831. [DOI] [Google Scholar]

[ref16] McKenzie L. K., Bryant H. E., Weinstein J. A.. Transition metal complexes as photosensitisers in one- and two-photon photodynamic therapy. Coord. Chem. Rev. 2019;379:2–29. doi: 10.1016/j.ccr.2018.03.020. [DOI] [Google Scholar]

[ref17] Zhang Z., He M., Wang R., Fan J., Peng X., Sun W.. Development of ruthenium nanophotocages with red or near-infrared light-responsiveness. ChemBioChem. 2023;24(24):e202300606. doi: 10.1002/cbic.202300606. [DOI] [PubMed] [Google Scholar]

[ref18] Zhang L., Wang P., Zhou X., Bretin L., Zeng X., Husiev Y., Polanco E. A., Zhao G., Wijaya L. S., Biver T.. et al. Cyclic ruthenium-peptide conjugates as integrin-targeting phototherapeutic prodrugs for the treatment of brain tumors. J. Am. Chem. Soc. 2023;145(27):14963–14980. doi: 10.1021/jacs.3c04855. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref19] Heinemann F., Karges J., Gasser G.. Critical overview of the use of Ru(II) polypyridyl complexes as photosensitizers in one-photon and two-photon photodynamic therapy. Acc. Chem. Res. 2017;50(11):2727–2736. doi: 10.1021/acs.accounts.7b00180. [DOI] [PubMed] [Google Scholar]

[ref20] Fong J., Kasimova K., Arenas Y., Kaspler P., Lazic S., Mandel A., Lilge L.. A novel class of ruthenium-based photosensitizers effectively kills in vitro cancer cells and in vivo tumors. Photochem. Photobiol. Sci. 2015;14(11):2014–2023. doi: 10.1039/c4pp00438h. [DOI] [PubMed] [Google Scholar]

[ref21] Ferreira J. T., Pina J., Ribeiro C. A. F., Fernandes R., Tomé J. P. C., Rodríguez Morgade M. S., Torres T.. Highly efficient singlet oxygen generators based on ruthenium phthalocyanines: synthesis, characterization and in vitro evaluation for photodynamic therapy. Chem. – Eur. J. 2020;26(8):1789–1799. doi: 10.1002/chem.201903546. [DOI] [PubMed] [Google Scholar]

[ref22] Cervinka J., Hernández-García A., Bautista D., Markova L., Kostrhunova H., Malina J., Kasparkova J., Santana M. D., Brabec V., Ruiz J.. New cyclometalated Ru(II) polypyridyl photosensitizers trigger oncosis in cancer cells by inducing damage to cellular membranes. Inorg. Chem. Front. 2024;11(13):3855–3876. doi: 10.1039/D4QI00732H. [DOI] [Google Scholar]

[ref23] Estevão B. M., Vilela R. R. C., Geremias I. P., Zanoni K. P. S., de Camargo A. S. S., Zucolotto V.. Mesoporous silica nanoparticles incorporated with Ir(III) complexes: from photophysics to photodynamic therapy. Photodiagnosis Photodyn. Ther. 2022;40:103052. doi: 10.1016/j.pdpdt.2022.103052. [DOI] [PubMed] [Google Scholar]

[ref24] Martínez-Alonso M., Jones C. G., Shipp J. D., Chekulaev D., Bryant H. E., Weinstein J. A.. Phototoxicity of cyclometallated Ir(III) complexes bearing a thio-bis-benzimidazole ligand, and its monodentate analogue, as potential PDT photosensitisers in cancer cell killing. J. Biol. Inorg. Chem. 2024;29(1):113–125. doi: 10.1007/s00775-023-02031-z. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref25] Feng W., Liang B., Chen B., Liu Q., Pan Z., Liu Y., He L.. A tricarbonyl rhenium(I) complex decorated with boron dipyrromethene for endoplasmic reticulum-targeted photodynamic therapy. Dyes Pigment. 2023;211:111077. doi: 10.1016/j.dyepig.2023.111077. [DOI] [Google Scholar]

[ref26] Paragian K., Li B., Massino M., Rangarajan S.. A computational workflow to discover novel liquid organic hydrogen carriers and their dehydrogenation routes. Mol. Syst. Des. Eng. 2020;5(1):1167–1658. doi: 10.1039/D0ME00105H. [DOI] [Google Scholar]

[ref27] Thomas H. Y., Ford Versypt A. N.. A mathematical model of glomerular fibrosis in diabetic kidney disease to predict therapeutic efficacy. Front. Pharmacol. 2024;15:1481768. doi: 10.3389/fphar.2024.1481768. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref28] McDonagh J. L., Palmer D. S., van Mourik T., Mitchell J. B. O.. Are the sublimation thermodynamics of organic molecules predictable? J. Chem. Inf. Model. 2016;56(11):2162–2179. doi: 10.1021/acs.jcim.6b00033. [DOI] [PubMed] [Google Scholar]

[ref29] Fan J., Shi S., Xiang H., Fu L., Duan Y., Cao D., Lu H.. Predicting elimination of small-molecule drug half-life in pharmacokinetics using ensemble and consensus machine learning methods. J. Chem. Inf. Model. 2024;64(8):3080–3092. doi: 10.1021/acs.jcim.3c02030. [DOI] [PubMed] [Google Scholar]

[ref30] Haciefendioglu T., Yildirim E.. Band gap and reorganization energy prediction of conducting polymers by the integration of machine learning and density functional theory. J. Chem. Inf. Model. 2025;65(11):5360–5369. doi: 10.1021/acs.jcim.5c00345. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref31] He T., Xiao P., Li D., Qu B., Zhou R.. Accelerating the discovery of efficient phosphorescent iridium(III) complex emitters with targeted color gamuts through interpretable machine learning models and virtual screening. J. Phys. Chem. C. 2025;129(23):10696–10708. doi: 10.1021/acs.jpcc.5c01750. [DOI] [Google Scholar]

[ref32] Izquierdo R., Zadorosny R., Rosales M., Marrero-Ponce Y., Cubillan N.. Molecular and descriptor spaces for predicting initial rate of catalytic homogeneous quinoline hydrogenation with Ru, Rh, Os, and Ir catalysts. ACS Omega. 2025. doi: 10.1021/acsomega.4c09503. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref33] Horvitz E., Mulligan D.. Data, privacy, and the greater good. Science. 2015;349(6245):253–255. doi: 10.1126/science.aac4520. [DOI] [PubMed] [Google Scholar]

[ref34] Santana V. V., Rebello C. M., Queiroz L. P., Ribeiro A. M., Shardt N., Nogueira I. B. R.. PUFFIN: a path-unifying feed-forward interfaced network for vapor pressure prediction. Chem. Eng. Sci. 2024;286:119623. doi: 10.1016/j.ces.2023.119623. [DOI] [Google Scholar]

[ref35] Wu G., Zhao Y., Zhang L., Du J., Meng Q., Liu Q.. Machine learning potential model for accelerating quantum chemistry-driven property prediction and molecular design. AIChE J. 2025. doi: 10.1002/aic.18741. [DOI] [Google Scholar]

[ref36] Zhu J., Hao L., Zhang H., Wei H.. Development of convolutional neural network-based models for efficient and reliable flashpoint prediction. Ind. Eng. Chem. Res. 2025;64(15):7803–7809. doi: 10.1021/acs.iecr.4c04373. [DOI] [Google Scholar]

[ref37] Liao Z., Lu J., Xie K., Wang Y., Yuan Y.. Prediction of photochemical properties of dissolved organic matter using machine learning. Environ. Sci. Technol. 2023;57(46):17971–17980. doi: 10.1021/acs.est.2c07545. [DOI] [PubMed] [Google Scholar]

[ref38] Chebotaev P. P., Buglak A. A., Sheehan A., Filatov M. A.. Predicting fluorescence to singlet oxygen generation quantum yield ratio for BODIPY dyes using QSPR and machine learning. Phys. Chem. Chem. Phys. 2024;26(38):25131–25142. doi: 10.1039/D4CP02471K. [DOI] [PubMed] [Google Scholar]

[ref39] He L., Dong J., Yang Y., Huang Z., Ye S., Ke X., Zhou Y., Li A., Zhang Z., Wu S.. et al. Accelerating the discovery of type II photosensitizer: experimentally validated machine learning models for predicting the singlet oxygen quantum yield of photosensitive molecule. J. Mol. Struct. 2025;1321:139850. doi: 10.1016/j.molstruc.2024.139850. [DOI] [Google Scholar]

[ref40] Buglak A. A., Filatov M. A., Hussain M. A., Sugimoto M.. Singlet oxygen generation by porphyrins and metalloporphyrins revisited: a quantitative structure-property relationship (QSPR) study. J. Photochem. Photobiol. A. 2020;403:112833. doi: 10.1016/j.jphotochem.2020.112833. [DOI] [Google Scholar]

[ref41] Ju C. W., Bai H., Li B., Liu R.. Machine learning enables highly accurate predictions of photophysical properties of organic fluorescent materials: emission wavelengths and quantum yields. J. Chem. Inf. Model. 2021;61(3):1053–1065. doi: 10.1021/acs.jcim.0c01203. [DOI] [PubMed] [Google Scholar]

[ref42] Mahato K. D., Kumar Das S. S. G., Azad C., Kumar U.. Machine learning based hybrid ensemble models for prediction of organic dyes photophysical properties: absorption wavelengths, emission wavelengths, and quantum yields. APL Mach. Learn. 2024;2(1):016101. doi: 10.1063/5.0181294. [DOI] [Google Scholar]

[ref43] Khaheshi S., Riahi S., Mohammadi-Khanaposhtani M., Shokrollahzadeh H.. Prediction of amines capacity for carbon dioxide absorption based on structural characteristics. Ind. Eng. Chem. Res. 2019;58(20):8763–8771. doi: 10.1021/acs.iecr.9b00567. [DOI] [Google Scholar]

[ref44] Wang X., Zhang T., Zhang H., Wang X., Xie B., Fan W.. Combined DFT and machine learning study of the dissociation and migration of H in pyrrole derivatives. J. Phys. Chem. A. 2023;127(35):7383–7399. doi: 10.1021/acs.jpca.3c03192. [DOI] [PubMed] [Google Scholar]

[ref45] Mohamed A., Visco D. P., Breimaier K., Bastidas D. M.. Effect of molecular structure on the B3LYP-computed HOMO–LUMO gap: a structure–property relationship using atomic signatures. ACS Omega. 2025;10(3):2799–2808. doi: 10.1021/acsomega.4c08626. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref46] Liu Z., Lu T., Chen Q.. An sp-hybridized all-carboatomic ring, cyclo[18]carbon: electronic structure, electronic spectrum, and optical nonlinearity. Carbon. 2020;165:461–467. doi: 10.1016/j.carbon.2020.05.023. [DOI] [Google Scholar]

[ref47] Buglak A. A., Telegina T. A., Vorotelyak E. A., Kononov A. I.. Theoretical study of photoreactions between oxidized pterins and molecular oxygen. J. Photochem. Photobiol. A. 2019;372:254–259. doi: 10.1016/j.jphotochem.2018.12.002. [DOI] [Google Scholar]

[ref48] Ouattara W. P., Bamba K., Thomas A. S., Diarrassouba F., Ouattara L., Ouattara M. P., N’Guessan K. N., Kone M. G. R., Kodjo C. G., Ziao N.. Theoretical studies of photodynamic therapy properties of azopyridine δ-OsCl2(azpy)2 complex as a photosensitizer by a TDDFT method. Computational Chemistry. 2021;9(1):64–84. doi: 10.4236/cc.2021.91004. [DOI] [Google Scholar]

[ref49] Neese F.. The ORCA program system. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2012;2(1):73–78. doi: 10.1002/wcms.81. [DOI] [Google Scholar]

[ref50] Neese F.. Software update: The ORCA program system – Version 5.0. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2022;12(5):e1606. doi: 10.1002/wcms.1606. [DOI] [Google Scholar]

[ref51] Adamo C., Barone V.. Toward reliable density functional methods without adjustable parameters: the PBE0 model. J. Chem. Phys. 1999;110(13):6158–6170. doi: 10.1063/1.478522. [DOI] [Google Scholar]

[ref52] Weigend F., Ahlrichs R.. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: design and assessment of accuracy. Phys. Chem. Chem. Phys. 2005;7(18):3297–3305. doi: 10.1039/b508541a. [DOI] [PubMed] [Google Scholar]

[ref53] Weigend F.. Accurate Coulomb-fitting basis sets for H to Rn. Phys. Chem. Chem. Phys. 2006;8(9):1057–1065. doi: 10.1039/b515623h. [DOI] [PubMed] [Google Scholar]

[ref54] Barone V., Cossi M.. Quantum calculation of molecular energies and energy gradients in solution by a conductor solvent model. J. Phys. Chem. A. 1998;102(11):1995–2001. doi: 10.1021/jp9716997. [DOI] [Google Scholar]

[ref55] Garcia-Rates M., Neese F.. Effect of the solute cavity on the solvation energy and its derivatives within the framework of the Gaussian charge scheme. J. Comput. Chem. 2020;41(9):922–939. doi: 10.1002/jcc.26139. [DOI] [PubMed] [Google Scholar]

[ref56] Hellweg A., Hättig C., Höfener S., Klopper W.. Optimized accurate auxiliary basis sets for RI-MP2 and RI-CC2 calculations for the atoms Rb to Rn. Theor. Chem. Acc. 2007;117(4):587–597. doi: 10.1007/s00214-007-0250-5. [DOI] [Google Scholar]

[ref57] Marenich A. V., Cramer C. J., Truhlar D. G.. Universal solvation model based on solute electron density and on a continuum model of the solvent defined by the bulk dielectric constant and atomic surface tensions. J. Phys. Chem. B. 2009;113(18):6378–6396. doi: 10.1021/jp810292n. [DOI] [PubMed] [Google Scholar]

[ref58] Lu T., Chen F.. Multiwfn: a multifunctional wavefunction analyzer. J. Comput. Chem. 2012;33(5):580–592. doi: 10.1002/jcc.22885. [DOI] [PubMed] [Google Scholar]

[ref59] Vetere V., Adamo C., Maldivi P.. Performance of the ‘parameter free’ PBE0 functional for the modeling of molecular properties of heavy metals. Chem. Phys. Lett. 2000;325(1):99–105. doi: 10.1016/S0009-2614(00)00657-6. [DOI] [Google Scholar]

[ref60] Waller M. P., Braun H., Hojdis N., Bühl M.. Geometries of second-row transition-metal complexes from density-functional theory. J. Chem. Theory Comput. 2007;3(6):2234–2242. doi: 10.1021/ct700178y. [DOI] [PubMed] [Google Scholar]

[ref61] Golbraikh A., Tropsha A.. Beware of q2! J. Mol. Graph. Model. 2002;20(4):269–276. doi: 10.1016/S1093-3263(01)00123-1. [DOI] [PubMed] [Google Scholar]

[ref62] Shapley, L. S. A value for n-person games. In Contributions to the Theory of Games II; Princeton University Press: 1953; pp 307–318. [Google Scholar]
