Scientific Reports. 2025 Aug 17;15:30118. doi: 10.1038/s41598-025-15996-5

Determination of disintegration time using formulation data for solid dosage oral formulations via advanced machine learning integrated optimizer models

Mohammed Ghazwani 1, Umme Hani 1
PMCID: PMC12358554  PMID: 40820039

Abstract

We predicted tablet disintegration time from formulation data using machine learning (ML) models whose hyperparameters were tuned by a metaheuristic optimizer. Utilizing a dataset of approximately 2,000 entries encompassing molecular properties, physical properties, excipient composition, and formulation characteristics, three ML models were evaluated: TabNet, Radial Basis Function Support Vector Regression (RBF-SVR), and Neural Oblivious Decision Ensembles (NODE). Data preprocessing involved Min-Max normalization, outlier detection via Elliptic Envelope, and feature selection using Conditional Mutual Information, with hyperparameters optimized through the Water Cycle Algorithm. Performance was assessed using R², RMSE, and MAE across train, validation, and test sets, with 95% confidence intervals reported for the test-set metrics. NODE fitted the data most accurately, with the highest test R² (0.9805) and the lowest RMSE (7.078) and MAE (5.913), outperforming TabNet (R²: 0.9657, RMSE: 9.382, MAE: 7.299) and RBF-SVR (R²: 0.9652, RMSE: 9.452, MAE: 7.127). These findings highlight NODE’s efficacy in modeling complex data relationships and its potential for optimizing tablet formulations in pharmaceutical research, in particular for designing fast-disintegrating tablets.

Keywords: Tablet disintegration, Pharmaceutics, Machine learning, Optimization, Modeling

Subject terms: Biomedical engineering, Chemical engineering

Introduction

Developing efficient solid-dosage pharmaceutical formulations is challenging because many formulation properties can affect the release and dissolution of the active pharmaceutical ingredient (API). For efficient treatment, the release of APIs from solid-dosage formulations such as tablets should be precisely controlled; however, the release depends on many parameters1,2. The relationship between drug release and formulation properties is difficult to determine by experimental methods alone, and computational techniques can be applied to uncover it. Modeling the process requires that the underlying phenomena be well understood and described, which in turn gives insight into tablet release and supports the design of advanced drug delivery formulations.

Among the steps of drug delivery by oral administration, disintegration is the first: the tablet breaks apart in the surrounding liquid so that the API can be released. Disintegration is characterized by the time it takes, known as the disintegration time (DT)3,4. The physical properties of tablet components, interactions between API and excipient, particle size of excipient, and interaction with the solvent are among the most important parameters affecting DT5. For computational analysis of tablet disintegration, a multiscale model is required to relate formulation to DT under different conditions. Kalný et al.6 developed numerical simulations based on the Discrete Element Method (DEM) to predict DT in immediate-release tablets. They used a combined methodology in which tablet fragmentation was modeled using DEM, while dissolution was estimated using the finite volume method. Such a model can significantly reduce the amount of lab work needed to optimize formulations. So et al.7 used a numerical scheme based on finite differences to model tablet disintegration time via analysis of mass transfer in the solution phase.

Models based on DEM, finite volume, mass transfer, and finite element methods are mechanistic: they provide physical insight into disintegration, but they are computationally expensive, and running them for a large number of formulations is impractical. Data-driven techniques such as machine learning (ML) are better suited to this purpose. ML is increasingly used across various fields to detect complex patterns and relationships in data that traditional methods often miss. By learning from data, ML models can make accurate predictions and adapt to changing conditions, making them especially useful in scientific applications involving nonlinear, multivariable systems. As data become more complex, as in tablet formulations, ML enhances decision-making, process optimization, and insight discovery. In pharmaceutical research, ML can aid in designing controlled release systems and predicting tablet properties, such as estimating disintegration time based on formulation parameters.

Recently, several studies have correlated tablet DT with formulation properties using ML regression models, which showed high accuracy4,5,8. Building on these works, more algorithms need to be applied and optimized for correlating DT with formulation properties. In this work, three ML models were evaluated: TabNet, Radial Basis Function Support Vector Regression (RBF-SVR), and Neural Oblivious Decision Ensembles (NODE). The models are applied for the first time to correlate DT with formulation properties.

Materials and methods

Data analytics

The dataset for this work was taken from a publication that reported data collection for fast-disintegrating tablets9. This dataset contains around 2,000 entries, with disintegration time as the output and a wide range of input variables related to the formulation properties. Input features for building the models in this study include molecular properties of APIs (e.g., hydrogen bonds, molecular weight), excipient composition of the tablets (e.g., magnesium stearate, microcrystalline cellulose, chitosan), mechanical characteristics of the tablets (e.g., friability, hardness), and formulation characteristics (e.g., density, flowability, wetting time), which have also been used in other studies as inputs for predicting tablet disintegration times4,5,8. This study adopts the same methodology but applies different ML models to estimate DT, and compares them with previously reported ML models with the aim of improving accuracy.

An initial evaluation of the dataset showed that it contains no missing values; therefore, imputation is not required. The remainder of this subsection outlines the preprocessing steps applied prior to ML modeling.

Normalization using min-max scaler

In machine learning, data normalization is essential for standardizing feature scales across a dataset. It improves training efficiency and stability, especially for models sensitive to input scale. Without normalization, some features may dominate others, leading to skewed results and reduced model performance.

By employing the Min-Max scaling technique, all values are uniformly scaled in the range 0 to 1. A linear transformation is performed by first subtracting the minimum value of each input and then dividing by its range. The normalization formula using this technique, as shown in Eq. 1, represents this process10.

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

This transformation ensures that the values are proportional, making it easier for the model to interpret the data11.
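The Min-Max transformation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `min_max_scale` and the example feature values are hypothetical, and reusing the training-set minimum and maximum for later data is a common convention assumed here rather than something stated in the text.

```python
import numpy as np

def min_max_scale(X, X_min=None, X_max=None):
    """Scale each column of X to [0, 1] using x' = (x - min) / (max - min)."""
    X = np.asarray(X, dtype=float)
    # Reuse training-set statistics when supplied, so that validation and
    # test data go through the same transformation (assumed convention).
    X_min = X.min(axis=0) if X_min is None else X_min
    X_max = X.max(axis=0) if X_max is None else X_max
    span = X_max - X_min
    span = np.where(span == 0, 1.0, span)  # constant columns map to 0
    return (X - X_min) / span, X_min, X_max

# Hypothetical feature matrix: columns = hardness, wetting time (made-up values)
X_train = np.array([[30.0, 12.0],
                    [45.0, 20.0],
                    [60.0, 52.0]])
X_scaled, col_min, col_max = min_max_scale(X_train)
print(X_scaled)
```

Each column now spans exactly 0 to 1, so no single feature dominates the others purely because of its units.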

Outlier detection using elliptic envelope

In statistics, the Elliptic Envelope is used to identify anomalies, assuming multivariate normal distributions. In this way, the central distribution and variability of the data points are effectively modeled by defining an elliptical region that captures the majority of samples. Outliers are observations that fall outside of this range12.

The procedure starts by calculating the mean vector and covariance matrix. Using these statistical estimates, the algorithm constructs an ellipse intended to enclose the core distribution of the data. The ellipsoid is sized so that it contains a chosen fraction of the data, typically 95% or 99%. The anomalous points are identified as those that lie outside the defined elliptical boundary.
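The idea behind the elliptical boundary can be illustrated with squared Mahalanobis distances. The sketch below is a simplified stand-in under stated assumptions: it uses the plain sample mean and covariance, whereas library implementations of the Elliptic Envelope typically rely on a robust covariance estimate. The function name `mahalanobis_outliers`, the 95% retention level, and the synthetic data are all hypothetical.

```python
import numpy as np

def mahalanobis_outliers(X, keep_fraction=0.95):
    """Flag points whose squared Mahalanobis distance exceeds the
    keep_fraction quantile, i.e. points outside the fitted ellipsoid."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                      # mean vector
    cov = np.cov(X, rowvar=False)              # sample covariance (non-robust)
    cov_inv = np.linalg.pinv(cov)
    diff = X - mean
    # d2[i] = diff[i]^T * cov_inv * diff[i]
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    threshold = np.quantile(d2, keep_fraction)
    return d2 > threshold, d2

# Synthetic data with one planted anomaly (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [8.0, -8.0, 8.0]
is_outlier, d2 = mahalanobis_outliers(X, keep_fraction=0.95)
print(int(is_outlier.sum()), bool(is_outlier[0]))
```

Rows flagged in `is_outlier` would be dropped before model fitting; the retained fraction is a tuning choice.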

Conditional mutual information (CMI)

CMI was used in this study to identify the most informative features for the regression task and for fitting the dataset. CMI quantifies the information that one variable conveys about the output, conditioned on the presence of a second variable13,14.

In regression analysis, for $X_i, X_j$ as inputs, and $Y$ as output, CMI is utilized to quantify the dependency between features as follows14,15:

$$I(X_i; Y \mid X_j) = \sum_{x_i,\, y,\, x_j} p(x_i, y, x_j)\, \log \frac{p(x_i, y \mid x_j)}{p(x_i \mid x_j)\, p(y \mid x_j)}$$

CMI, denoted as $I(X_i; Y \mid X_j)$, measures the distinct contribution of each feature while considering the effects of other inputs8.
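For discrete (or binned) variables, conditional mutual information can be estimated directly from co-occurrence counts. The following is a minimal plug-in estimator, offered as a sketch: the function name is hypothetical, continuous formulation features would first need to be discretized (an assumption, since the binning scheme is not described in the text), and the toy sequences are made up.

```python
from collections import Counter
from math import log

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats for discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written with counts
        ratio = (c_z[zi] * n_xyz) / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        cmi += (n_xyz / n) * log(ratio)
    return cmi

# Toy check: Y copies X, and Z is unrelated to X, so I(X; Y | Z) = H(X) = log 2
x = [0, 0, 1, 1]
z = [0, 1, 0, 1]
cmi_copy = conditional_mutual_information(x, x, z)
# If Y is fully determined by Z, X adds nothing once Z is known
cmi_redundant = conditional_mutual_information(x, z, z)
print(cmi_copy, cmi_redundant)
```

A feature with a high score carries information about the target that the conditioning feature does not already provide, which is what makes the criterion useful for removing redundant inputs.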

TabNet regression model

TabNet is a neural network model tailored for structured datasets, blending strong prediction accuracy with clear model transparency. It utilizes a stepwise attention framework to prioritize important features at each stage, mimicking the logic of decision trees while retaining the adaptability of deep learning methods. This makes TabNet particularly well-suited for predicting disintegration time based on the diverse and complex set of input features in our dataset, including drugs and excipients properties, mechanical properties, excipients composition, and formulation characteristics4.

TabNet architecture includes both encoding and decoding modules. The encoder handles input variables by passing them through multiple sequential stages, each containing two main elements: an attention-based selector and a feature transformation unit16. The feature transformer applies a series of integrated layers and ReLU activations to transform the input features. The attentive transformer generates a sparse mask that determines which features to select for the next decision step17. This mask is computed using a sparsemax function, which allows for the selection of only the most relevant features while setting others to zero, thereby enhancing interpretability.

Mathematically, the feature selection mask at step $i$ is given by:

$$M[i] = \mathrm{sparsemax}\big(P[i-1] \cdot h_i(a[i-1])\big)$$

where $P[i-1]$ is the prior scale from the previous step, and $h_i(a[i-1])$ is the output of the attentive transformer. The selected features are then used to compute the output of the current step, which contributes to the final prediction.
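To make the sparse mask concrete, here is a small NumPy sketch of the sparsemax projection applied to a single feature-score vector. It is illustrative only: the scores in `h` are made up, and the prior update `prior * (relaxation - mask)` follows the original TabNet formulation as an assumption, using the relaxation factor 1.45 reported in Table 1.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of a 1-D score vector onto the probability simplex.
    Unlike softmax, it can return exact zeros."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]                 # scores in descending order
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1 + k * z_sorted > cumsum         # which sorted entries stay non-zero
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1) / k_max       # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

prior = np.ones(3)                  # prior scale before the first decision step
h = np.array([1.0, 0.8, 0.1])       # hypothetical attentive-transformer output
mask = sparsemax(prior * h)         # feature selection mask for this step
relaxation = 1.45                   # relaxation factor from Table 1
next_prior = prior * (relaxation - mask)   # discourages reusing selected features
print(mask, next_prior)
```

The third feature receives a weight of exactly zero, which is the property that makes the per-step masks readable as feature selections.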

For the regression task of predicting disintegration time, the outputs from all decision steps are aggregated and passed through a linear layer to produce the final continuous prediction:

$$\hat{y} = W \sum_{i=1}^{N_{\text{steps}}} d[i] + b$$

where, $d[i]$ is the output from the i-th decision step, and $W$ and $b$ are learnable parameters.

The TabNet model is trained to minimize a combination of the mean squared error (MSE) loss for the regression task and a sparsity loss that encourages the model to select fewer features at each step. The sparsity loss is based on the entropy of the feature selection masks and is controlled by a hyperparameter $\lambda_{\text{sparse}}$. The total loss function is:

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{MSE}} + \lambda_{\text{sparse}}\, \mathcal{L}_{\text{sparse}}$$

RBF-SVR

The Radial Basis Function Support Vector Regression (RBF-SVR) is a robust ML approach used for regression tasks, particularly effective for modeling nonlinear relationships in tabular data. Support Vector Regression (SVR) adapts the core ideas of Support Vector Machines for use in predicting continuous outcomes, aiming to fit a function within a defined epsilon margin around actual values. The Radial Basis Function (RBF) kernel maps input features into a more complex feature space, allowing the model to learn intricate and nonlinear relationships18.

In this study, the RBF-SVR model is implemented to predict disintegration time based on the input features described above. The model seeks to minimize the error within an epsilon-insensitive tube, where errors smaller than epsilon are ignored, and larger errors are penalized linearly. The RBF kernel is expressed as $K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right)$, where $\gamma$ is a kernel parameter controlling the width of the Gaussian function, and $\lVert x_i - x_j \rVert^2$ is the squared Euclidean distance between two feature vectors $x_i$ and $x_j$. The SVR optimization problem is formulated as:

$$\min_{w,\, b,\, \xi,\, \xi^*} \; \frac{1}{2} \lVert w \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right)$$

subject to:

$$y_i - \left( w^\top \phi(x_i) + b \right) \le \varepsilon + \xi_i$$
$$\left( w^\top \phi(x_i) + b \right) - y_i \le \varepsilon + \xi_i^*$$
$$\xi_i,\ \xi_i^* \ge 0$$

where $w$ is the weight vector, $b$ is the bias, $\phi(\cdot)$ maps the input to the higher-dimensional space, $C$ is the regularization factor balancing margin maximization and error tolerance, $\varepsilon$ is the margin of tolerance, and $\xi_i$, $\xi_i^*$ are slack variables for handling errors outside the epsilon tube.

The hyperparameters $C$, $\varepsilon$, and $\gamma$ are critical to the model’s performance and are tuned using cross-validation to optimize predictive accuracy. The training process employs a grid search to identify the optimal combination of these parameters, minimizing the mean squared error (MSE) on the validation set. The final prediction for an input $x$ is given by:

$$f(x) = \sum_{i=1}^{n} \left( \alpha_i - \alpha_i^* \right) K(x_i, x) + b$$

where $\alpha_i$, $\alpha_i^*$ are the Lagrange multipliers obtained during optimization. This formulation allows RBF-SVR to effectively model the complex relationships between formulation parameters and disintegration time, providing high predictive accuracy and robustness to outliers, as demonstrated in prior studies.
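The three ingredients of RBF-SVR described in this subsection (the kernel, the epsilon-insensitive loss, and the kernel expansion used for prediction) can be written out in NumPy as follows. This is a sketch for intuition, not a solver: it does not fit the model, the support vectors and dual coefficients in the example are made up, and only the values of γ and ε are taken from Table 1.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    sq_dist = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear outside it."""
    return np.maximum(np.abs(y_true - y_pred) - epsilon, 0.0)

def svr_predict(X_new, support_vectors, dual_coef, bias, gamma):
    """f(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b, with
    dual_coef[i] standing for (alpha_i - alpha_i^*)."""
    return rbf_kernel(X_new, support_vectors, gamma) @ dual_coef + bias

GAMMA, EPSILON = 0.053, 0.14        # values reported in Table 1

X = np.array([[0.0, 0.0], [3.0, 4.0]])          # squared distance = 25
K = rbf_kernel(X, X, GAMMA)
losses = epsilon_insensitive_loss(np.array([10.0, 10.0, 10.0]),
                                  np.array([9.9, 10.5, 9.0]), EPSILON)
# Hypothetical fitted quantities, for illustration only
pred = svr_predict(np.array([[0.0, 0.0]]), X, np.array([2.0, -1.0]), 50.0, GAMMA)
print(K[0, 1], losses, pred)
```

The first residual (0.1) falls inside the tube and incurs no loss, which is why SVR solutions tend to depend on only a subset of the training points.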

Neural oblivious decision ensembles (NODE)

The NODE model merges the structure of decision trees with the flexibility of neural networks to model tabular data in regression tasks. NODEs are built from a collection of oblivious decision trees, where all data points follow the same split rules at each depth level within a tree, which helps lower variance and improve model stability.

The implementation of the model uses an ensemble of $T$ trees, each with a fixed depth $L$ (e.g., $L = 3$) and $K$ splits at each level, designed to effectively capture interactions between features. Each node in a tree applies a splitting function $f(x; \theta)$, parameterized by neural network weights $\theta$, which maps input features $x$ to a binary decision based on a threshold. The splitting function can be expressed as $f(x; \theta) = \sigma(w^\top x + b)$, where $\sigma$ is a sigmoid activation, $w$ are weights, and $b$ is a bias, determining whether data points move left or right in the tree. The final prediction for an input $x$ is the ensemble average: $\hat{y}(x) = \frac{1}{T} \sum_{t=1}^{T} h_t(x)$, where $h_t(x)$ is the output of the $t$-th tree. Training minimizes an MSE loss: $\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 + \lambda \lVert \theta \rVert_2^2$, using stochastic gradient descent with the Adam optimizer, where $\lambda$ stands for a regularization parameter. Hyperparameters $T$, $L$, and $K$ are tuned via cross-validation. This mathematical framework ensures NODEs balance interpretability and predictive power, outperforming traditional trees, as shown in prior work19. The procedure is schematically illustrated in Fig. 1.

Fig. 1.

Flowchart representation of the NODE model architecture, highlighting tree-based decision paths and ensemble averaging.
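The forward pass of a soft oblivious tree ensemble, as described in this subsection, can be sketched in NumPy. This follows the sigmoid-split description given in the text and is a simplification: the published NODE architecture uses a different differentiable choice function, training is omitted, and all weights, leaf values, and sizes below are random or hypothetical (a depth of 3 and 4 trees are used only to keep the example small).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def oblivious_tree_forward(X, W, b, leaf_values):
    """Soft oblivious tree. X: (n, d); W: (depth, d); b: (depth,);
    leaf_values: (2**depth,). One split (w, b) is shared by every node
    at a given level, which is what makes the tree oblivious."""
    p_right = sigmoid(X @ W.T + b)              # (n, depth) soft decisions
    leaf_prob = np.ones((X.shape[0], 1))
    for level in range(W.shape[0]):
        p = p_right[:, [level]]
        # Every current leaf splits into a left and a right child
        leaf_prob = np.concatenate([leaf_prob * (1.0 - p), leaf_prob * p], axis=1)
    return leaf_prob @ leaf_values              # expected leaf value per sample

def node_predict(X, trees):
    """Ensemble average over all trees."""
    return np.mean([oblivious_tree_forward(X, *tree) for tree in trees], axis=0)

rng = np.random.default_rng(0)
n_features, depth, n_trees = 8, 3, 4            # illustrative sizes
trees = [(rng.normal(size=(depth, n_features)),
          rng.normal(size=depth),
          rng.uniform(5.0, 120.0, size=2 ** depth)) for _ in range(n_trees)]
X = rng.uniform(0.0, 1.0, size=(5, n_features))
y_hat = node_predict(X, trees)
print(y_hat)
```

Because each prediction is a probability-weighted average of leaf values, the whole model is differentiable and can be trained with gradient-based optimizers such as Adam.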

Hyper-parameter optimization using WCA

The Water Cycle Algorithm (WCA) is a metaheuristic optimizer that can be employed for tuning ML hyperparameters. It simulates stages of the water cycle, such as evaporation, river formation, and precipitation, to search for optimal solutions20,21.

In the first phase, a random set of hyper-parameter combinations is generated as potential solutions. In the evaporation phase, solutions are assessed by fitness, with higher values reflecting better performance. These fitness values are crucial in determining the rate of water evaporation22,23. Water vapor condenses into clouds during the precipitation phase, randomly scattered over the solution set. Every cloud stands for a possible improvement upon a solution. Every cloud’s degree of fitness is assessed; the one with the highest quality is selected24. The identified cloud acts as a reference point for creating a river, directing the existing solution in its direction. This river symbolizes iterative modifications to parameter values. The contrast between the selected cloud and the present solution indicates the necessary updates.

Figure 2 depicts the fundamental sequence of operations in the WCA algorithm. An advantage of this method is its ability to handle multiple objectives. Multi-objective optimization identifies optimal trade-offs among goals that conflict with each other. To extend the WCA optimizer to several objectives, the concept of dominance is applied: one solution dominates another if it is no worse in every objective and strictly better in at least one. Incorporating this idea allows the WCA to find a set of mutually non-dominated solutions21,25.

Fig. 2.

Flowchart for WCA.
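A compact, single-objective sketch of a WCA-style search is given below. It is a simplified variant written for illustration and is not the configuration used in this study: streams are assigned to rivers by a fixed round-robin rule rather than by flow intensity, the step factor is drawn per dimension from [0, 2], and the objective is a toy quadratic standing in for a validation error. The function name `wca_minimize` and all parameter defaults are assumptions.

```python
import numpy as np

def wca_minimize(objective, bounds, n_pop=20, n_rivers=3, n_iter=100,
                 d_max=1e-2, seed=0):
    """Simplified Water Cycle Algorithm: index 0 is the sea (best solution),
    indices 1..n_rivers are rivers, the remaining individuals are streams."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(bounds, dtype=float).T
    pop = rng.uniform(lower, upper, size=(n_pop, lower.size))
    cost = np.array([objective(p) for p in pop])
    order = np.argsort(cost)
    pop, cost = pop[order], cost[order]
    sea_history = [float(cost[0])]
    for _ in range(n_iter):
        for i in range(1, n_pop):
            # Rivers flow to the sea; streams flow to a river or the sea
            target = 0 if i <= n_rivers else i % (n_rivers + 1)
            step = rng.uniform(0.0, 2.0, size=lower.size)
            pop[i] = np.clip(pop[i] + step * (pop[target] - pop[i]), lower, upper)
            cost[i] = objective(pop[i])
            if cost[i] < cost[target]:
                # The better solution takes over the role of river or sea
                pop[[i, target]] = pop[[target, i]]
                cost[[i, target]] = cost[[target, i]]
        # Evaporation and raining: a river too close to the sea is re-initialized
        for r in range(1, n_rivers + 1):
            if np.linalg.norm(pop[0] - pop[r]) < d_max:
                pop[r] = rng.uniform(lower, upper)
                cost[r] = objective(pop[r])
        d_max -= d_max / n_iter
        sea_history.append(float(cost[0]))
    best = int(np.argmin(cost))
    return pop[best].copy(), float(cost[best]), sea_history

# Toy objective standing in for a validation error over two hyperparameters
def toy_error(p):
    return float((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2)

search_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_x, best_cost, sea_history = wca_minimize(toy_error, search_bounds)
print(best_x, best_cost)
```

In a hyperparameter-tuning setting, `toy_error` would be replaced by a function that trains a model with the candidate settings and returns its validation error, which makes each evaluation far more expensive than in this toy case.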

Results and discussion

This section presents the outcomes of hyperparameter optimization and feature selection, which form the basis for the subsequent modeling phase. The WCA was used to determine the optimal hyperparameters for each regression method applied in this study to predict DT. These values, listed in Table 1, reflect the configuration that best balances model complexity and learning efficiency. Hyperparameter tuning was carried out to increase the accuracy of the ML models for DT prediction.

Table 1.

Selected hyperparameters optimized by WCA.

| Model | Hyperparameter | Selected value |
|---|---|---|
| TabNet | Number of decision steps | 6 |
| | Feature dimension | 64 |
| | Relaxation factor | 1.45 |
| | Learning rate | 0.021 |
| | Sparsity regularization (λ) | 1.04e-4 |
| RBF-SVR | C (penalty parameter) | 90 |
| | ε (epsilon in loss function) | 0.14 |
| | γ (kernel coefficient) | 0.053 |
| NODE | Number of trees (T) | 128 |
| | Tree depth (L) | 4 |
| | Learning rate | 0.0121 |
| | Regularization weight (λ) | 1.01e-3 |

In parallel, feature selection was performed using CMI to identify the most informative and non-redundant features with respect to the target variable. The selected features, listed in Table 2, represent distinct and relevant attributes from the original dataset, ensuring minimal multicollinearity and maximum predictive value.

Table 2.

Features selected using conditional mutual information (CMI).

| Feature name | Feature type | CMI score (with Y) | Description |
|---|---|---|---|
| Molecular Weight | Molecular Property | 0.236 | Total mass of a molecule |
| Hydrogen Bond Donors | Molecular Property | 0.198 | Number of hydrogen-donating atoms |
| Hardness | Physical Property | 0.215 | Mechanical strength of the tablet |
| Friability | Physical Property | 0.187 | Tendency to crumble under stress |
| Microcrystalline Cellulose | Excipient Composition | 0.204 | Filler and binder used in formulation |
| Magnesium Stearate | Excipient Composition | 0.192 | Lubricant affecting cohesion and flow |
| Wetting Time | Formulation Characteristic | 0.221 | Time for surface of tablet to become wet |
| Bulk Density | Formulation Characteristic | 0.208 | Mass per unit volume of the powder blend |

As shown in Table 3, all models demonstrate high predictive accuracy for DT, with test R² values exceeding 0.96. The NODE model has the highest test R² of 0.980472, followed closely by TabNet (0.965692) and RBF-SVR (0.965172). The mean MCCV R² scores (Table 3) indicate robust generalization, with NODE again leading at 0.974393, compared to TabNet (0.963033) and RBF-SVR (0.959633). Table 4 further highlights NODE’s superior performance, with the lowest test RMSE (7.077951) and MAE (5.913171), compared to TabNet (RMSE: 9.381520, MAE: 7.299484) and RBF-SVR (RMSE: 9.452304, MAE: 7.127279). The lower error rates for NODE suggest it captures the complex relationships between formulation parameters and disintegration time more effectively.

Table 3.

R² values of models.

| Model | Train R² | Mean MCCV R² | Test R² |
|---|---|---|---|
| TabNet | 0.975908 | 0.963033 | 0.965692 |
| RBF-SVR | 0.977693 | 0.959633 | 0.965172 |
| NODE | 0.985493 | 0.974393 | 0.980472 |

Table 4.

Error rates of final models.

| Model | Train RMSE | Train MAE | Test RMSE | Test MAE |
|---|---|---|---|---|
| TabNet | 6.919607 | 5.886754 | 9.381520 | 7.299484 |
| RBF-SVR | 6.658407 | 5.564464 | 9.452304 | 7.127279 |
| NODE | 5.369581 | 4.628027 | 7.077951 | 5.913171 |
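For reference, the three metrics reported in Tables 3 and 4 can be computed as shown below. The function name and the four example values are hypothetical and chosen so that the arithmetic is easy to follow by hand; they are not taken from the dataset.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                      # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    return float(r2), float(rmse), float(mae)

# Made-up disintegration times, for illustration only
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(y_true, y_pred)
print(r2, rmse, mae)
```

Here the residuals are -2, 2, -3 and 1, giving a residual sum of squares of 18 against a total sum of squares of 500, so R² = 1 - 18/500 = 0.964. RMSE and MAE are expressed in the same units as the target.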

The 95% confidence intervals in Table 5 provide insight into the variability of the performance metrics. NODE’s R² CI ([0.9668, 0.9942]) has the same width (0.0274) as those of TabNet ([0.9520, 0.9794]) and RBF-SVR ([0.9515, 0.9789]) but sits higher, reflecting its higher R². NODE’s RMSE and MAE CIs ([6.1063, 8.0496] and [5.1019, 6.7244], respectively) are narrower than those of TabNet ([8.0943, 10.6687] and [6.2979, 8.3011]) and RBF-SVR ([8.1552, 10.7494] and [6.1496, 8.1050]), indicating greater precision and stability in its error metrics. This suggests that NODE’s predictions are not only more accurate but also more consistent across the test set.

Table 5.

95% confidence intervals for test set metrics.

| Model | R² CI | RMSE CI | MAE CI |
|---|---|---|---|
| TabNet | [0.9520, 0.9794] | [8.0943, 10.6687] | [6.2979, 8.3011] |
| RBF-SVR | [0.9515, 0.9789] | [8.1552, 10.7494] | [6.1496, 8.1050] |
| NODE | [0.9668, 0.9942] | [6.1063, 8.0496] | [5.1019, 6.7244] |
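The text does not state how the 95% confidence intervals were obtained. One common option, shown here purely as an assumption, is a percentile bootstrap over the test set: resample the (true, predicted) pairs with replacement, recompute the metric each time, and take the 2.5th and 97.5th percentiles. The synthetic data and function names below are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a test-set metric."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample pairs with replacement
        stats[b] = metric(y_true[idx], y_pred[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)

# Synthetic test set: predictions equal the truth plus noise with SD 7
rng = np.random.default_rng(1)
y_true = rng.uniform(10.0, 150.0, size=300)
y_pred = y_true + rng.normal(0.0, 7.0, size=300)
ci_low, ci_high = bootstrap_ci(y_true, y_pred, rmse)
print(ci_low, ci_high)
```

The same helper works for MAE or R² by passing a different `metric` function; interval width shrinks as the test set grows.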

NODE consistently outperforms TabNet and RBF-SVR across all metrics, with the highest test R² (0.980472), lowest RMSE (7.077951), and lowest MAE (5.913171), as shown in Tables 3 and 4. Its narrower error CIs (Table 5) further confirm its robustness for DT estimations. While TabNet and RBF-SVR exhibit comparable performance (R² ≈ 0.965, RMSE ≈ 9.4, MAE ≈ 7.2), NODE’s superior accuracy and stability make it the preferred choice. Additionally, NODE’s interpretability, enhanced by SHAP analysis, aligns well with the study’s goal of understanding formulation impacts on disintegration time, as its tree-based structure facilitates clear feature interaction insights.

Figures 3, 4 and 5 compare actual and predicted values and show residual plots for each model; they visually corroborate the performance metrics detailed in Tables 3, 4 and 5. Furthermore, the DT predictions show no visible bias: the data points are distributed evenly above and below the Ideal Fit line (see Figs. 3, 4 and 5). This behavior indicates that model selection, feature selection, and hyper-parameter optimization were carried out appropriately for estimating DT. Based on the evaluation of performance metrics (Tables 3, 4 and 5) and visual inspection of the fits (Figs. 3, 4 and 5), the NODE model is identified as the best model for predicting disintegration time in this study. The learning curves in Fig. 6 show how the fitted R² varies with training set size.

Fig. 3.

TabNet model: comparison of actual and predicted values, with residual plot.

Fig. 4.

RBF-SVR model: comparison of actual and predicted values, with residual plot.

Fig. 5.

NODE model: comparison of actual and predicted values, with residual plot.

Fig. 6.

Learning Curves for TabNet, RBF-SVR, and NODE.

The accuracy of NODE, the best model in this study, is compared in Table 6 with other fitted ML models on the test dataset. The test set is used for comparison because it reflects the model’s robustness and reliability in predicting tablet disintegration time and reveals any overfitting. The models used for comparison include Local Polynomial Regression (LPR), Gaussian Process Regression (GPR), and Deep Gaussian Process Regression (DGPR)8. The NODE model performed better than all other models in terms of R², RMSE, and MAE, which supports its validity and generality in estimating disintegration time.

Table 6.

Comparisons between NODE model performance and three ML regression models.

| Model | Test R² | Test RMSE | Test MAE |
|---|---|---|---|
| NODE | 0.98 | 7.07 | 5.91 |
| LPR | 0.78 | 20.91 | 16.85 |
| GPR | 0.90 | 13.71 | 11.29 |
| DGPR | 0.97 | 7.27 | 5.98 |

The SHAP analysis (Figs. 7, 8 and 9) enhances the interpretability of the models, revealing key drivers of disintegration time predictions. The SHAP summary plot (Fig. 7) identifies features such as microcrystalline cellulose content, hardness, and wetting time as the top contributors across all models, consistent with pharmaceutical domain knowledge. The SHAP waterfall plot (Fig. 8) and force plot (Fig. 9) for a representative test sample illustrate how these features interact to produce individual predictions. For instance, higher microcrystalline cellulose content typically reduces disintegration time, as shown by negative SHAP values, while increased hardness increases it. TabNet’s attention mechanism complements SHAP by inherently selecting relevant features, making it particularly interpretable, while NODE’s tree-based structure and RBF-SVR’s kernel-based approach provide additional perspectives on feature importance. Wetting time was found to be the most important factor, with the greatest contribution to DT variations as captured by SHAP analysis. This can be attributed to the diffusion of solvent molecules into the tablet pores, which facilitates disintegration in the solvent phase. This has also been reported in previous ML studies on DT prediction4,5,8.

Fig. 7.

SHAP analysis results.

Fig. 8.

SHAP Waterfall plot.

Fig. 9.

A sample SHAP force plot.
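SHAP attributions are based on Shapley values, which can be computed exactly by brute force when the number of features is small. The sketch below does this for a toy linear model and is not a substitute for the SHAP library: missing features are replaced by a single baseline value (a simplification of averaging over background data), and the model coefficients, feature order (microcrystalline cellulose content, hardness, wetting time), and input values are all made up.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.
    Features outside the coalition are set to their baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical linear model of DT: MCC lowers DT, hardness and wetting time raise it
def toy_model(z):
    return 40.0 - 0.5 * z[0] + 0.3 * z[1] + 0.8 * z[2]

x = [30.0, 50.0, 20.0]          # sample to explain (made-up values)
baseline = [20.0, 40.0, 10.0]   # reference formulation (made-up values)
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that waterfall and force plots visualize. The cost grows exponentially with the number of features, which is why practical tools rely on approximations.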

Conclusion

This study advances the application of ML in pharmaceutical research by developing robust predictive models for disintegration time, a pivotal property for designing controlled release tablet formulations. Utilizing a dataset of approximately 2,000 entries encompassing molecular properties, physical properties, excipient composition, and formulation characteristics, three ML models—TabNet, Radial Basis Function Support Vector Regression (RBF-SVR), and NODE—were rigorously evaluated. Preprocessing steps, including Min-Max normalization, Elliptic Envelope outlier detection, and Conditional Mutual Information feature selection, combined with hyperparameter optimization via the WCA, ensured optimal model performance. NODE outperformed TabNet and RBF-SVR, achieving the highest test R² (0.9805) and the lowest RMSE (7.078) and MAE (5.913), demonstrating superior accuracy and robustness in capturing complex relationships within the data. SHAP analysis further enhanced interpretability, identifying microcrystalline cellulose content, hardness, and wetting time as key predictors, aligning with pharmaceutical domain knowledge and providing actionable insights for formulation optimization. These results highlight NODE’s potential as a powerful tool for predictive modeling in pharmaceutical sciences, facilitating precise and interpretable predictions to improve tablet design. The successful integration of ML in this context underscores its transformative potential for drug development. Future research could explore ensemble approaches, incorporate additional tablet properties, or leverage larger datasets to further enhance predictive accuracy and broaden the applicability of these models in pharmaceutical innovation.

Acknowledgements

The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through Large Groups Project under grant number (RGP.2/559/45).

Author contributions

M.G.: Conceptualization, Writing, Methodology, Investigation, Resources, Supervision. U.H.: Writing, Investigation, Validation, Analysis, Software. All authors reviewed the manuscript.

Data availability

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Hein, L. et al. Controlled drug release from a polymer-free multi-walled carbon nanotube-based coating. J. Drug Deliv. Sci. Technol. 112, 107231 (2025).
  2. Zhai, G. et al. Microneedle drug delivery carriers capable of achieving sustained and controlled release function. Colloids Surf., B. 253, 114767 (2025).
  3. Diószegi, A. et al. Automated tablet defect detection and the prediction of disintegration time and crushing strength with deep learning based on tablet surface images. Int. J. Pharm. 667, 124896 (2024).
  4. Moin, A. et al. Development of machine learning models for estimation of disintegration time on fast-disintegrating tablets. Eur. J. Pharm. Sci. 211, 107141 (2025).
  5. Ghazwani, M. & Hani, U. Prediction of tablet disintegration time based on formulations properties via artificial intelligence by comparing machine learning models and validation. Sci. Rep. 15 (1), 13789 (2025).
  6. Kalný, M., Grof, Z. & Štěpánek, F. Microstructure based simulation of the disintegration and dissolution of immediate release pharmaceutical tablets. Powder Technol. 377, 257–268 (2021).
  7. So, C., Narang, A. S. & Mao, C. Modeling the tablet disintegration process using the finite difference method. J. Pharm. Sci. 110 (11), 3614–3622 (2021).
  8. Ghazwani, M. & Hani, U. Data driven analysis of tablet design via machine learning for evaluation of impact of formulations properties on the disintegration time. Ain Shams Eng. J. 16 (9), 103512 (2025).
  9. Momeni, M. et al. Dataset development of pre-formulation tests on fast disintegrating tablets (FDT): data aggregation. BMC Res. Notes. 16 (1), 131 (2023).
  10. Henderi, H., Wahyuningsih, T. & Rahwanto, E. Comparison of Min-Max normalization and Z-Score normalization in the K-nearest neighbor (kNN) algorithm to test the accuracy of types of breast cancer. Int. J. Inf. Inform. Syst. 4 (1), 13–20 (2021).
  11. Alqarni, M. & Alqarni, A. Computational intelligence investigations on the correlation of pharmaceutical solubility in mixtures of binary solvents: effect of composition and temperature. Chin. J. Phys. 93, 503–514 (2025).
  12. Usman, N., Utami, E. & Hartanto, A. D. Comparative Analysis of Elliptic Envelope, Isolation Forest, One-Class SVM, and Local Outlier Factor in Detecting Earthquakes with Status Anomaly using Outlier. in International Conference on Computer Science, Information Technology and Engineering (ICCoSITE). IEEE. (2023).
  13. Liang, J. et al. Feature selection with conditional mutual information considering feature interaction. Symmetry 11 (7), 858 (2019).
  14. Mahdi, W. A., Alhowyan, A. & Obaidullah, A. J. Utilization of artificial intelligence for evaluation of targeted cancer therapy via drug nanoparticles to estimate delivery efficiency to various sites. Chemometr. Intell. Lab. Syst. 257, 105309 (2025).
  15. Latorre Carmona, P. et al. Feature selection in regression tasks using conditional mutual information. in Iberian Conference on Pattern Recognition and Image Analysis. Springer. (2011).
  16. Li, W. TabNet for high-dimensional tabular data: advancing interpretability and performance with feature fusion. in IET Conference Proceedings CP915. IET. (2025).
  17. Khazael, S. M. et al. Enhancing solar PV suitability mapping in the middle East using an optimized deep learning framework. Alexandria Eng. J. 129, 553–571 (2025).
  18. Zhang, F. & O’Donnell, L. J. Support vector regression, in Machine Learning. Editors: Andrea Mechelli and Sandra Vieira. Academic Press. 123–140. (2020).
  19. Popov, S., Morozov, S. & Babenko, A. Neural oblivious decision ensembles for deep learning on tabular data. arXiv preprint arXiv:1909.06312, (2019).
  20. Abou El-Ela, A. A., El-Sehiemy, R. A. & Abbas, A. S. Optimal placement and sizing of distributed generation and capacitor banks in distribution systems using water cycle algorithm. IEEE Syst. J. 12 (4), 3629–3636 (2018).
  21. Wang, Y. et al. A hybrid CFD and machine learning study of energy performance of photovoltaic systems with a porous collector: model development and validation. Case Stud. Therm. Eng. 69, 105998 (2025).
  22. Eskandar, H. et al. Water cycle algorithm–A novel metaheuristic optimization method for solving constrained engineering optimization problems. Comput. Struct. 110, 151–166 (2012).
  23. Sadollah, A., Eskandar, H. & Kim, J. H. Water cycle algorithm for solving constrained multi-objective optimization problems. Appl. Soft Comput. 27, 279–298 (2015).
  24. Razmjooy, N., Khalilpour, M. & Ramezani, M. A new meta-heuristic optimization algorithm inspired by FIFA world cup competitions: theory and its application in PID designing for AVR system. J. Control Autom. Electr. Syst. 27, 419–440 (2016).
  25. Jafar, R. M. S. et al. A comprehensive evaluation: water cycle algorithm and its applications. in Bio-inspired Computing: Theories and Applications: 13th International Conference, BIC-TA 2018, Beijing, China, November 2–4, 2018, Proceedings, Part II 13. Springer. (2018).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
