Predicting Cycle Life for Lithium-Ion Batteries with Ternary Cathode Materials Using Data-Driven Machine Learning

Long Li; Pengfei Yue; Chongnian Tang; Xun Qi; Lumeng Chao; Yuanyuan Gao; Xiaokui Wang; Yue Zhang; Chaohui Wu; Feng Liu; Guanshihan Du; Yongjun Wu; Zijian Hong

doi:10.1021/acsomega.5c09364

ACS Omega. 2025 Oct 26;10(43):52001–52009. doi: 10.1021/acsomega.5c09364

PMCID: PMC12593148 PMID: 41210813

Abstract

Lithium-ion batteries with ternary cathode materials offer several advantages, including high energy density, relatively low cost, and high power density, making them suitable for applications in electric vehicles and large-scale grid storage systems. However, a significant challenge is the rapid and nonlinear capacity fade during cycling, which necessitates accurate predictions of battery cycle performance. In this study, we developed three machine learning models, namely Elastic Net, Random Forest, and XGBoost, to predict the remaining useful life (RUL) of batteries with ternary cathodes using data from a public database. XGBoost demonstrated the highest prediction accuracy when given data from the first 100 cycles, achieving a prediction error of 11.8%. The prediction error increased only moderately, to 17.0%, when given only the first 30 charge/discharge cycles. This study exemplifies the potential of machine learning models for predicting battery cycle life, with important implications for the operation and maintenance of electric vehicles and large-scale grid storage systems.

Introduction

Lithium-ion batteries are widely used in electric vehicles, portable electronics, electric vertical takeoff and landing aircraft, and large-scale grid storage, owing to their high energy density, high power density, light weight, long lifetime, and low cost. They have become a key part of modern civilization and contribute substantially to the decarbonization of transportation. Meanwhile, the rapid development of electric vehicles calls for large improvements in battery capacity and energy density to overcome “mileage anxiety”. Ternary cathode materials, e.g., Li(Ni, Co, Mn)O₂ (NCM for short) and derivatives such as Li(Ni, Co, Al)O₂ (NCA for short), are considered promising candidates for powering next-generation electric vehicles with high energy density. However, an electrochemical system with an NCM-based cathode gradually loses its initial capacity during operation due to side reactions with the electrolyte, transition-metal dissolution, materials degradation and decomposition under high voltage, and cracking of the secondary particles. This degradation is often nonlinear, with capacity decay accelerating during prolonged cycling. Moreover, compared with the LiFePO₄ (LFP) cathode material, NCM-based cathodes typically exhibit lower stability and shorter cycle life, making accurate cycle life prediction both more critical and more challenging.

Many data-driven machine learning models have been built to predict the cycle life of lithium-ion batteries, particularly those with LFP cathode materials. For instance, Severson et al. created elastic-net-based machine learning models to predict the cycle life of commercial LFP/graphite cells from early-cycle data, achieving a prediction error of 9.1% with data from the first 100 cycles. Ma et al. developed an accurate and efficient machine learning model called Broad Learning-Extreme Learning Machine to predict the cycle life of different lithium-ion batteries. Several machine learning models have also been developed to predict cycle life for batteries with NCM-based cathodes, including long short-term memory networks, linear regression, and XGBoost. Recently, Vijayaraghavan et al. conducted a comparative study of K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), and Light Gradient Boosting Machine (LGBM) for predicting the remaining useful life of NCM batteries, further demonstrating the applicability of diverse algorithms in this domain. In parallel, Yao et al. used physical mechanisms inspired by Li plating and SEI growth to guide feature selection and used a data-driven method to describe the relationship between physics-based inputs for NCM-based batteries. Nevertheless, very few reports focus on predicting the cycle life of NCM-based batteries from early cycling data, which is essential for battery management systems in electric vehicles. There is also a noticeable lack of systematic comparisons of machine learning models for predicting the cycle life of NCM-based batteries. In this study, we developed three machine learning models, Elastic Net, XGBoost, and Random Forest, to predict the cycle life of NCM-based batteries from early cycling data, benchmarked against data from a public battery testing database.

Main

In this study, we adopt battery testing data from public data sets, which include three types of batteries with ternary cathodes: LiNi_0.86Co_0.11Al_0.03O₂ (denoted as the NCA battery), LiNi_0.83Co_0.11Mn_0.07O₂ (denoted as the NCM battery), and a mixture of NCM and NCA (denoted as the NCM_NCA battery). These batteries were tested at different temperatures (25 °C, 35 °C, and 45 °C) and different C-rates (from 0.25 to 4 C); the specific testing conditions and battery numbers are listed in Table S1. In this work, 70%, 10%, and 20% of the total data are used for training, validation, and prediction, respectively. Figure 1 illustrates the cycling performance of the batteries in the data set. A typical charge–discharge protocol and the corresponding current–voltage curve are given in Figure 1a, with five stages: (I) constant current charging (∼2000 mA) until the voltage reaches 4.2 V; (II) constant voltage charging until the current reaches 0 mA; (III) relaxation for a given period; (IV) constant current discharging until the voltage decreases to 2.6 V; (V) relaxation for a given period. The discharge capacity for each cycle is obtained by multiplying the discharge current by the discharge time. The cycle life is defined as the number of cycles at which the discharge capacity decreases to 70% of the initial discharge capacity. The discharge capacity versus cycle number for the NCM, NCA, and NCM_NCA batteries is plotted in Figure 1b–d, respectively. For NCA and NCM batteries, the cycling performance depends strongly on the testing temperature, with capacity decaying much faster at 25 °C than at 35 °C and 45 °C. In contrast, the NCM_NCA battery has a longer cycle life at 25 °C. Interestingly, for high-temperature cycling, the discharge capacity decreases faster over the initial 50 cycles, followed by a slower, linear decay in subsequent cycles. When cycled at 25 °C, the discharge capacity decays abruptly after 100 cycles. The cycle life spans from 30 to 1200 cycles across batteries, showing the diversity of the batteries in the data sets.
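The capacity and cycle-life definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the `threshold` parameter, and the capacity trace are hypothetical, and the discharge is assumed to be at constant current so that capacity is simply current times time.

```python
def discharge_capacity_mah(current_ma, discharge_time_h):
    """Capacity of a constant-current discharge: current times time (mAh)."""
    return current_ma * discharge_time_h


def cycle_life(capacities, threshold=0.70):
    """Return the first cycle number (1-indexed) whose discharge capacity is
    at or below `threshold` times the initial capacity, or None if the cell
    never reaches that level within the recorded data."""
    if not capacities:
        return None
    limit = threshold * capacities[0]
    for cycle_number, capacity in enumerate(capacities, start=1):
        if capacity <= limit:
            return cycle_number
    return None


# Hypothetical capacity trace (mAh), for illustration only.
trace = [2000.0, 1980.0, 1900.0, 1700.0, 1390.0, 1300.0]
print(cycle_life(trace))
```

Returning `None` for cells that never cross the threshold keeps censored cells distinguishable from cells with a genuinely long cycle life.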

Figure 1. Cycling performance for batteries with high-capacity cathodes. (a) A typical charge–discharge protocol and the corresponding current–voltage curve. The specific discharge capacity versus cycle number for (b) NCM, (c) NCA, and (d) NCM_NCA cathodes under different temperatures. CY 25, CY 35, and CY 45 denote the cycling conditions under 25 °C, 35 °C, and 45 °C, respectively. The experimental data is taken from ref , reproduced under the terms of the Creative Commons CC BY 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Results and Discussion

An Elastic Net based machine learning model was developed to predict the cycle life of the batteries (details in Methods). This is a linear regression model that combines L1 regularization (Lasso regression) and L2 regularization (Ridge regression), allowing for feature selection (with sparsity induced by L1) while controlling model complexity (via weight decay from L2), which makes it particularly suitable for high-dimensional data. The model has previously been used to predict the cycle life of LFP batteries. The performance of the Elastic Net model is illustrated in Figure 2. As indicated in Figure 2a, the predicted and actual cycle lives show an overall linear correlation, demonstrating the model’s predictive power for the remaining useful life (RUL) of the batteries. However, the prediction error is substantial (∼21.9%), with a high root-mean-square error (RMSE) of around 163.9 cycles. Some data points deviate noticeably from the ideal reference line, particularly for batteries with either very low or very high cycle lives. This highlights the model’s limited generalization capability under extreme testing conditions. The error distribution presented in Figure 2b is not strictly normal, although its mean is close to zero, indicating no systematic bias. It does, however, exhibit a marked long tail, in which a small number of samples have very large absolute errors. These outliers significantly impair model performance, likely due to insufficient generalization under complex testing conditions.
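For reference, the combined L1/L2 penalty described above can be written compactly. The form below is a sketch that assumes the parametrization suggested by the (alpha, lambda) names used in Methods, where λ sets the overall regularization strength and α the weight of the L1 term. The symbols N, x_i, y_i, β₀, and β are notation introduced here, and the exact scaling used by the authors' software is not stated in the text.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\left(y_i-\beta_0-x_i^{\top}\beta\right)^2
+\lambda\left[\alpha\,\lVert\beta\rVert_1+\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right]
```

With α = 1 the penalty reduces to Lasso, and as α approaches 0 it approaches Ridge.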

Figure 2. Observed and predicted cycle lives for the ternary cathodes with the Elastic Net model: (a) battery cycle life prediction, (b) error distributions, (c) feature importance analysis, (d) the change in the prediction error of the Elastic Net model with the number of cycles used.

To analyze the key factors influencing the accuracy of the model, we conducted a feature importance analysis, as illustrated in Figure 2c. Features such as “DeltaQMin_log”, “Is_NCM_NCA”, “Is_NCA”, “Is_NCM”, “CapacityFadeSlope”, and “temperature” are more important than the others. This is understandable, as material type, capacity fade slope, and temperature can greatly influence cycling performance. The high RMSE and mean absolute percentage error (MAPE) can be attributed to two main reasons: (1) As a linear model, Elastic Net represents the target variable through linear combinations of features. This may lead to an overreliance on a small subset of high-weight features. If these features fail to capture the complex influence of battery operating conditions, the model may overlook underrepresented factors. (2) As a linear regression model, Elastic Net may have difficulty modeling nonlinear interactions among features. This is particularly important for battery aging, which often involves multivariate coupling between key features. Additionally, we examined the effect of reducing the number of training cycles on prediction errors. Figure 2d illustrates how the prediction error of the Elastic Net model fluctuates with the number of training cycles, with the corresponding values listed in Table 1. With the first 40 cycles as input data, the prediction error is as high as 31.2%. Incorporating more historical data (for instance, increasing from 40 to 80 cycles) reduces the prediction error to 21.9%, after which it remains at about 21–22% from 80 to 100 cycles.

Table 1. Prediction Error of the Elastic Net Model vs the Number of Charge/Discharge Cycles Used as the Input Historical Data.

cycles	30	40	50	60	70	80	90	100
prediction error (%)	26.0	31.2	21.2	24.3	23.5	21.9	20.8	21.9
RMSE	157.8	175.6	164.2	174.2	171.2	161.9	159.2	163.9
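The two error metrics reported in the table above can be computed as follows. This is a minimal sketch that assumes the "prediction error (%)" is the mean absolute percentage error (MAPE) mentioned in the text; the function names and the sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error, in the same units as the target (cycles)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent of the actual value."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical cycle lives, for illustration only.
actual_life = [100.0, 200.0, 400.0]
predicted_life = [110.0, 180.0, 400.0]
print(rmse(actual_life, predicted_life), mape(actual_life, predicted_life))
```

Because MAPE divides by the actual cycle life, short-lived cells weigh more heavily in the percentage error than in the RMSE, which is one reason the two metrics need not move together across the table.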

Similarly, a Random Forest based machine learning model was built to predict the battery cycle life, as shown in Figure 3 (details in Methods). Random Forest is an ensemble learning method that constructs multiple decision trees and aggregates their predictions. This approach demonstrates high generalization capability and strong resistance to noise. It has been widely employed to model materials problems such as spectrum–property relationships, mechanical properties of alloys, and the stability of AB₂C Heusler compounds. Figure 3a presents the parity plot of the battery cycle life predicted by the Random Forest model, using the historical capacity data from the first 100 charge/discharge cycles. The data points are clustered more closely around the ideal reference line (the diagonal) than for the Elastic Net model, with the prediction error reduced to 14.9%. This indicates that the Random Forest model is better at capturing nonlinear degradation patterns in battery aging, especially under diverse operating conditions. The error distribution in Figure 3b is approximately normal, with most errors concentrated within a narrower range, signifying an overall improvement in prediction accuracy and stability. However, some samples still exhibit high prediction errors (absolute errors greater than 200 cycles).

Figure 3. Observed and predicted cycle lives for the ternary cathodes with the Random Forest model: (a) battery cycle life prediction, (b) error distributions, (c) feature importance analysis, (d) the change in the prediction error of the Random Forest model with the number of cycles used.

The feature importance analysis in Figure 3c reveals that the importance distribution is more varied than in the Elastic Net model. The analysis identifies several key features, including “CapacityFadeIntercept”, “temperature”, “AvgChargeTimeFirst5Cycles”, “DischargeCapacityCycle2”, “CapacityFadeSlope”, “DeltaQMin_log”, “Is_NCA”, and “Is_NCM_NCA”, among others. This suggests that the Random Forest model predicts battery aging more accurately by integrating multiple features in a balanced manner, handling categorical information efficiently, and suppressing noise through its ensemble mechanism.

Additionally, we examined how reducing the number of training cycles affects prediction errors in the Random Forest model (Figure 3d and Table 2). When the number of cycles used increased from the first 30 to the first 90, the prediction error decreased from 19.2% to 13.7%, and the RMSE dropped from 134.4 to 127.5. These improvements indicate that moderately expanding the cycle data enhances the model’s performance and reduces prediction deviations through enriched feature learning. However, further extending the range to 100 cycles resulted in a slight rebound of the prediction error (to 14.9%) and an increase in RMSE (to 137.2). This indicates that adding more cycles can yield diminishing returns, possibly due to added noise or a risk of overfitting caused by redundant data.

Table 2. Prediction Error of the Random Forest Model vs the Number of Charge/Discharge Cycles Used as the Input Historical Data.

cycles	30	40	50	60	70	80	90	100
prediction error (%)	19.2	16.4	16.6	15.4	15.7	14.3	13.7	14.9
RMSE	134.4	141.5	150.9	145.0	145.3	136.7	127.5	137.2

Finally, we developed an XGBoost model to estimate battery RUL, as shown in Figure 4. XGBoost is an efficient ensemble framework based on gradient boosting that iteratively trains weak learners, specifically regression trees, while optimizing a loss function to achieve both high accuracy and robustness in complex regression tasks. Several core features of XGBoost align well with the requirements of battery RUL prediction, including enhanced regularization, sparsity awareness, and effective handling of missing values. For feature extraction, the first 100 cycles of aging data were selected as the input range. Figure 4a presents the scatter plot of XGBoost’s battery lifespan predictions, while Figure 4b illustrates the distribution of prediction errors. The scatter plot in Figure 4a indicates a significant improvement in the agreement between the predicted and actual RUL values. The deviation of XGBoost’s predictions from the test values is considerably lower than that of the Elastic Net model and is also lower than that of the Random Forest model on most samples. This demonstrates XGBoost’s superior ability to capture complex nonlinear degradation patterns in battery aging. Overall, the XGBoost model achieves the highest precision, reducing the prediction error to 11.8%, a substantial improvement over the Elastic Net (21.9%) and Random Forest (14.9%) models.

Figure 4. Observed and predicted cycle lives for the ternary cathodes with the XGBoost model: (a) battery cycle life prediction, (b) error distributions, (c) feature importance analysis, (d) the change in the prediction error of the XGBoost model with the number of cycles used.

The feature importance analysis is presented in Figure 4c, highlighting key features including “AvgChargeTimeFirst5Cycles”, “DeltaQMin_log”, “CapacityFadeIntercept”, “DischargeCapacityCycle2”, “DeltaQVar_log”, “temperature”, “Is_NCM”, and “Is_NCA”. All three machine learning methods use similar features, yet they prioritize different factors. Notably, features such as “DeltaQMin_log” and “CapacityFadeIntercept” emerge as particularly significant across most methods. Interestingly, the material type is not as critical for the XGBoost model, suggesting that the variations between materials can be effectively captured by other features derived from battery tests.

We reduced the number of training cycles in the XGBoost model and recorded the prediction errors (see Figure 4d and Table 3). As the range of training cycles increased from the initial 30 to 100 cycles, we observed a pattern of initial fluctuation followed by an overall decline in the prediction error. This pattern reflects the interaction between battery aging dynamics and model optimization. During the initial fluctuation stage, XGBoost’s gradient-boosting mechanism faced difficulties due to instability in the feature space, such as mixed aging modes, mixed materials, and sparse indicators from early cycles. This led to temporary fitting discrepancies and degraded performance. With extended cycle data, sufficient nonlinear aging information becomes available, enabling XGBoost to improve its tree splits and regularization. The iterative correction process of gradient boosting uses Hessian-optimized splits, which minimize residual errors in complex aging trajectories.

Table 3. Prediction Error of the XGBoost Model vs the Number of Charge/Discharge Cycles Used as the Input Historical Data.

cycles	30	40	50	60	70	80	90	100
prediction error (%)	17.0	16.3	19.8	17.2	15.6	15.3	16.1	11.8
RMSE	157.3	131.1	147.5	150.8	116.6	130.8	127.5	105.6

A direct comparison of the optimized performance of the three machine learning models, based on data from the first 100 cycles, is presented in Figure 5. The Elastic Net model has the highest prediction error of 21.9%, with an RMSE of 163.9 cycles. It is followed by the Random Forest model, which has a prediction error of 14.9% and an RMSE of 137.2 cycles. The XGBoost model demonstrates the lowest prediction error of 11.8% and the lowest RMSE of 105.6 cycles. To further improve prediction accuracy, a larger database with more training data is necessary. Furthermore, since the C-rate is an important factor that can significantly affect battery cycle life, we included both charging and discharging C-rates as additional input features in all three machine learning models (see Figures S1–S3). However, including the C-rate does not significantly enhance model performance. This can be attributed to two main factors: (1) The feature “AvgChargeTimeFirst5Cycles” is already included and serves as an effective proxy for the charging C-rate, since a higher C-rate leads to a shorter charging time. (2) The data set contains only 16 batteries with varying charging C-rates and 6 batteries with varying discharging C-rates out of a total of 130 samples. This limited diversity reduces the statistical power of the explicit C-rate features, particularly for linear models like Elastic Net, which require a broader data distribution to estimate parameters reliably.
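To put the comparison in relative terms, the reported 100-cycle figures can be turned into fractional reductions against the Elastic Net baseline. The numbers below are taken from the text; the helper `relative_reduction` and the choice of Elastic Net as the baseline are illustrative assumptions.

```python
# Values reported in the text for the first 100 cycles:
# (prediction error in %, RMSE in cycles).
reported = {
    "Elastic Net": (21.9, 163.9),
    "Random Forest": (14.9, 137.2),
    "XGBoost": (11.8, 105.6),
}


def relative_reduction(baseline, value):
    """Fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


baseline_error, baseline_rmse = reported["Elastic Net"]
for name, (error, rmse_cycles) in reported.items():
    error_cut = 100 * relative_reduction(baseline_error, error)
    rmse_cut = 100 * relative_reduction(baseline_rmse, rmse_cycles)
    print(f"{name}: error reduced by {error_cut:.1f}%, RMSE reduced by {rmse_cut:.1f}%")
```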

Figure 5. Comparison of the performance of the different machine learning models.

Conclusions

In conclusion, we have developed three machine learning models, Elastic Net, Random Forest, and XGBoost, to predict the cycle life of batteries with high-energy-density ternary cathodes, specifically NCM/NCA-based cathodes, using an open-source experimental battery testing database. Among these models, XGBoost demonstrates the lowest prediction error of 11.8% and a root-mean-square error (RMSE) of 105.6 cycles, based on data from the first 100 testing cycles. The model’s performance degrades moderately with less data; when using only the first 30 testing cycles, the prediction error increases to 17.0%. We further analyzed the feature importance and found that while all three machine learning methods use similar features, they prioritize different factors. Key features such as “DeltaQMin_log” and “CapacityFadeIntercept” were identified as particularly significant across most methods. Interestingly, the material type is not as critical for some models, suggesting that variations in electrochemical performance between materials are effectively captured by other features derived from battery tests. This study demonstrates that machine learning models are effective in predicting the remaining useful life of batteries, even in cases of highly nonlinear capacity decay. Models such as XGBoost exhibit strong generalization capabilities, making them applicable across various battery types, temperatures, and C-rates. This study paves the way for using machine learning to predict the cycle life of next-generation high-energy-density battery systems, such as those based on NCM and NCA. These advancements have significant implications for battery management systems in electric vehicles and large-scale grid storage systems.

Methods

Features

Feature engineering is a core component in the construction of machine learning models. It extracts key features that characterize the degradation patterns of batteries from raw data, thereby enhancing prediction accuracy and generalization capability. The features used in this study include: (1) DeltaQVar_log (log of the variation of the discharge capacity); (2) DeltaQMin_log (log of the minimum of the differences in discharge capacity); (3) CapacityFadeSlope (slope of the capacity fade); (4) CapacityFadeIntercept (intercept for the linear plot of the capacity curve); (5) AvgChargeTimeFirst5Cycles (average charge time for the first five charge/discharge cycles); (6) Temperature (temperature of the battery testing environment); (7) Is_NCA, Is_NCM, Is_NCM_NCA (one-hot coding to distinguish the different cathode materials); (8) DischargeCapacityCycle2 (the specific discharge capacity for the second cycle).
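Several of the features listed above follow directly from their descriptions and can be sketched as below. This is an illustrative Python sketch, not the authors' feature code: it assumes the capacity fade slope and intercept come from an ordinary least-squares line fitted to discharge capacity versus cycle number over the input cycles, the function names and sample values are hypothetical, and the two DeltaQ features are left out because the text does not define them precisely enough to reproduce.

```python
CATHODES = ("NCA", "NCM", "NCM_NCA")


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def early_cycle_features(capacities, charge_times, temperature_c, cathode):
    """Build a feature dictionary from early-cycle data of one cell."""
    cycles = list(range(1, len(capacities) + 1))
    slope, intercept = linear_fit(cycles, capacities)
    first_five = charge_times[:5]
    features = {
        "CapacityFadeSlope": slope,
        "CapacityFadeIntercept": intercept,
        "AvgChargeTimeFirst5Cycles": sum(first_five) / len(first_five),
        "DischargeCapacityCycle2": capacities[1],
        "temperature": temperature_c,
    }
    # One-hot encoding of the cathode material.
    for name in CATHODES:
        features["Is_" + name] = 1 if cathode == name else 0
    return features


# Hypothetical cell, for illustration only.
example = early_cycle_features(
    capacities=[10.0, 9.0, 8.0, 7.0, 6.0],
    charge_times=[1.0, 1.0, 1.0, 1.0, 1.0, 2.0],
    temperature_c=25,
    cathode="NCM",
)
print(example)
```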

Elastic Net

In this study, Elastic Net is used as a baseline method for comparison with other algorithms. It is a linear regression model that combines L1 regularization (Lasso regression) and L2 regularization (Ridge regression), which achieves feature selection (sparsity from L1) while controlling model complexity (weight decay from L2). The optimization method used in this study is grid search with cross-validation:

(a) Predefined hyperparameter grid: the outer-layer parameter is alphaVec = 0.01:0.05:1 (L1 regularization weight for Elastic Net, 20 values in total), and the inner-layer parameter is lambdaVec = 0:0.01:1 (regularization strength, 101 values in total). A nested loop traverses all (alpha, lambda) combinations (20 × 101 = 2020 in total).
(b) Cross-validation for optimal parameter selection: for each alpha, the Lasso function is configured with 3-fold cross-validation (CV = 3) and 5 Monte Carlo repetitions (MCReps = 5). The optimal lambda for each alpha is the one yielding the minimum cross-validation MSE, which is stored as minLambdaMSE.
(c) Validation set secondary screening: for each alpha, identify the optimal lambda with the minimum cross-validation MSE. Compute the training set RMSE (rmseList) for each (alpha, lambda) combination and select the top 5 alpha values with the lowest training RMSE. For these 5 alpha candidates, compute the validation set RMSE and choose the combination with the lowest validation RMSE as the final model.
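The grid and the three-stage selection in steps (a) to (c) can be sketched as follows. The parameter names alphaVec and lambdaVec suggest the original was written in MATLAB; the Python below is only an illustration of the selection logic. The function `select_final`, the record layout, and the scores in `records` are hypothetical, and the sketch reads step (c) as shortlisting one (alpha, best lambda) pair per alpha, which is an interpretation of the text.

```python
# Grids as given in the text (MATLAB-style ranges start:step:stop).
alpha_vec = [round(0.01 + 0.05 * i, 2) for i in range(20)]   # 0.01, 0.06, ..., 0.96
lambda_vec = [round(0.01 * i, 2) for i in range(101)]        # 0.00, 0.01, ..., 1.00
grid = [(a, lam) for a in alpha_vec for lam in lambda_vec]   # all combinations


def select_final(records, top_k=5):
    """Pick the final (alpha, lambda) record from scored candidates.

    Each record is a dict with keys alpha, lambda, cv_mse, train_rmse,
    and val_rmse, filled in by whatever fitting routine is used."""
    # Step (b): for each alpha keep the lambda with the lowest CV MSE.
    best_per_alpha = {}
    for rec in records:
        current = best_per_alpha.get(rec["alpha"])
        if current is None or rec["cv_mse"] < current["cv_mse"]:
            best_per_alpha[rec["alpha"]] = rec
    # Step (c): shortlist by training RMSE, then decide on validation RMSE.
    shortlist = sorted(best_per_alpha.values(), key=lambda r: r["train_rmse"])[:top_k]
    return min(shortlist, key=lambda r: r["val_rmse"])


# Hypothetical scores, for illustration only.
records = [
    {"alpha": 0.01, "lambda": 0.1, "cv_mse": 5.0, "train_rmse": 10.0, "val_rmse": 30.0},
    {"alpha": 0.01, "lambda": 0.2, "cv_mse": 4.0, "train_rmse": 11.0, "val_rmse": 25.0},
    {"alpha": 0.06, "lambda": 0.1, "cv_mse": 3.0, "train_rmse": 12.0, "val_rmse": 20.0},
]
print(len(grid), select_final(records, top_k=2))
```

Deciding on the validation set only among a short list limits how often the validation data are consulted, at the cost of possibly discarding a candidate with a higher training RMSE that would have generalized better.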

Random Forest

Random Forest is an ensemble learning method based on the Bagging strategy, which enhances model generalization by constructing multiple decision trees and integrating their predictions. The core mechanisms include bootstrap sampling, which randomly samples subsets with replacement from the original data to train each tree, and random feature selection during node splitting, which randomly selects subsets of features to reduce intertree correlation. Random Forest parameter optimization is realized by grid search with holdout validation, based on the following strategy:

(a) Explicit hyperparameter grid: the outer parameter is numTreesRange = [10, 200, 300] (number of trees, 3 values) and the inner parameter is minLeafRange = [1, 5, 10] (minimum leaf size, 3 values). This forms 9 hyperparameter combinations, a typical grid search.
(b) Holdout validation: an independent validation set (val_Data) is used directly to calculate the RMSE for each parameter combination, i.e., valRMSE = sqrt(mean((yPredVal - yVal)^2)).
(c) Full traversal and optimal selection: the full model (TreeBagger) is trained for each parameter combination and the validation set RMSE is recorded. The combination with the smallest RMSE ([~, bestIdx] = min(resultTable.ValRMSE)) is selected to complete the hyperparameter optimization.
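The full-traversal strategy in steps (a) to (c) can be sketched as below. This is a pure-Python illustration rather than the authors' TreeBagger code. The model `fit_bagged_bins` is a deliberately simple stand-in (bootstrap-averaged piecewise-constant fits on a single feature) that only mimics the roles of the two hyperparameters, and the data are synthetic. Only the grid values are taken from the text.

```python
import itertools
import math
import random


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_bagged_bins(x, y, num_trees, min_leaf, seed=0):
    """Stand-in for a random forest on one feature: each 'tree' is a
    piecewise-constant fit on a bootstrap sample, using bins of `min_leaf`
    consecutive points (the last bin may be smaller)."""
    rng = random.Random(seed)
    n = len(x)
    trees = []
    for _ in range(num_trees):
        idx = sorted((rng.randrange(n) for _ in range(n)), key=lambda i: x[i])
        bins = []
        for start in range(0, n, min_leaf):
            chunk = idx[start:start + min_leaf]
            bins.append((x[chunk[-1]], sum(y[i] for i in chunk) / len(chunk)))
        trees.append(bins)

    def predict_one(value):
        total = 0.0
        for bins in trees:
            # First bin whose upper edge covers the value, else the last bin.
            total += next((m for upper, m in bins if value <= upper), bins[-1][1])
        return total / len(trees)

    return lambda values: [predict_one(v) for v in values]


def grid_search_holdout(x_train, y_train, x_val, y_val, num_trees_range, min_leaf_range):
    """Train one model per combination and keep the lowest validation RMSE."""
    table = []
    for num_trees, min_leaf in itertools.product(num_trees_range, min_leaf_range):
        model = fit_bagged_bins(x_train, y_train, num_trees, min_leaf)
        table.append({"num_trees": num_trees, "min_leaf": min_leaf,
                      "val_rmse": rmse(y_val, model(x_val))})
    return min(table, key=lambda row: row["val_rmse"]), table


# Synthetic data, for illustration only.
rng = random.Random(1)
x_all = [float(i) for i in range(50)]
y_all = [1000.0 - 10.0 * v + rng.gauss(0.0, 20.0) for v in x_all]
x_val, y_val = x_all[::5], y_all[::5]
x_train = [v for i, v in enumerate(x_all) if i % 5]
y_train = [v for i, v in enumerate(y_all) if i % 5]

best, table = grid_search_holdout(x_train, y_train, x_val, y_val,
                                  [10, 200, 300], [1, 5, 10])
print(best)
```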

XGBoost

XGBoost is an advanced gradient boosting framework that iteratively trains decision trees to optimize the loss function. Its core features include both regularization enhancement and sparsity awareness. It automatically learns split directions for sparse features, adapting to missing values or invalid features. Grid search with holdout validation is employed for parameter optimization, implemented as follows:

(a) Explicit hyperparameter grid: the outer-layer parameter is learningRates = [0.05, 0.1, 0.15] (3 values) and the inner-layer parameter is numTrees = [100, 200, 500] (3 values). The 9 combinations form a standard grid search.
(b) Holdout validation: each combination is evaluated directly on a fixed validation set (XVal, yVal), i.e., valLoss = loss(model, XVal, yVal). The combination with the smallest loss is selected.
(c) Dynamic update: if valLoss < bestLoss, bestModel is updated and the optimal hyperparameters are retained.
(d) Final retraining: after optimization, the final model (finalModel) is retrained on the combined training and validation set (fullData) to enhance generalization.
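The running-best update in step (c) and the final retraining in step (d) can be sketched as below. This is an assumption-laden illustration: the model is plain gradient boosting with one-split regression stumps under squared loss, written in pure Python, and it omits the second-order terms and regularization that distinguish XGBoost. The function names and the synthetic data are hypothetical. Only the grid values come from the text.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(x, residuals):
    """Return (threshold, left_value, right_value) for the single split on a
    1-D feature that minimizes squared error (needs two distinct x values)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        if x[left[-1]] == x[right[0]]:
            continue  # cannot split between equal feature values
        lm = sum(residuals[i] for i in left) / len(left)
        rm = sum(residuals[i] for i in right) / len(right)
        sse = (sum((residuals[i] - lm) ** 2 for i in left)
               + sum((residuals[i] - rm) ** 2 for i in right))
        if best is None or sse < best[0]:
            best = (sse, (x[left[-1]] + x[right[0]]) / 2, lm, rm)
    return best[1:]


def fit_boosted_stumps(x, y, learning_rate, num_trees):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(num_trees):
        residuals = [t - p for t, p in zip(y, pred)]
        thr, lv, rv = fit_stump(x, residuals)
        stumps.append((thr, lv, rv))
        pred = [p + learning_rate * (lv if xi <= thr else rv)
                for p, xi in zip(pred, x)]
    return lambda values: [
        base + learning_rate * sum(lv if v <= thr else rv for thr, lv, rv in stumps)
        for v in values]


def tune_and_refit(x_train, y_train, x_val, y_val, learning_rates, num_trees_options):
    best_loss, best_params = float("inf"), None
    for lr in learning_rates:
        for n_trees in num_trees_options:
            model = fit_boosted_stumps(x_train, y_train, lr, n_trees)
            val_loss = mse(y_val, model(x_val))
            if val_loss < best_loss:  # step (c): keep the running best
                best_loss, best_params = val_loss, (lr, n_trees)
    # Step (d): retrain on training plus validation data with the best setting.
    final_model = fit_boosted_stumps(x_train + x_val, y_train + y_val, *best_params)
    return final_model, best_params, best_loss


# Synthetic data, for illustration only.
rng = random.Random(0)
x_all = [float(i) for i in range(20)]
y_all = [1000.0 - 20.0 * v + rng.gauss(0.0, 15.0) for v in x_all]
x_val, y_val = x_all[::5], y_all[::5]
x_train = [v for i, v in enumerate(x_all) if i % 5]
y_train = [v for i, v in enumerate(y_all) if i % 5]

final_model, best_params, best_loss = tune_and_refit(
    x_train, y_train, x_val, y_val, [0.05, 0.1, 0.15], [100, 200, 500])
print(best_params, best_loss)
```

After the final retraining, the validation loss is no longer an unbiased estimate for the refitted model, so the held-out test split remains the only clean measure of accuracy.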

Feature Importance Calculation

For the Elastic Net model, feature importance was calculated as the absolute value of the final optimized model coefficients. This approach is standard for linear models, where the magnitude of the coefficient reflects the strength of the relationship between each feature and the target variable (cycle life). Features were then ranked by these absolute values to determine their relative importance.

For the Random Forest model, feature importance was computed using out-of-bag (OOB) permutation importance. This method evaluates the importance of each feature by randomly permuting its values in the OOB samples and measuring the resulting increase in prediction error. A larger increase in error indicates a more important feature, as the model’s performance is more dependent on that feature’s original values.
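The permutation idea described above can be sketched as follows. This is a simplified Python illustration: it permutes features on a supplied evaluation set rather than on the per-tree out-of-bag samples, and the `predict` callable, the function names, and the toy data are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Mean increase in RMSE when one feature column is shuffled.

    `predict` maps a list of feature rows to a list of predictions."""
    rng = random.Random(seed)
    baseline = rmse(targets, predict(rows))
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            increases.append(rmse(targets, predict(permuted)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy model that uses only the first feature, for illustration only.
def toy_predict(rows):
    return [3.0 * row[0] for row in rows]


rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = toy_predict(rows)
importances = permutation_importance(toy_predict, rows, targets)
print(importances)
```

In this toy case the second feature gets an importance of exactly zero because the model never reads it, while shuffling the first feature degrades the predictions.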

For the XGBoost model, feature importance was derived from the total gain generated by each feature across all trees in the ensemble. The gain represents the improvement in accuracy (reduction in impurity, measured by mean squared error for regression tasks) brought by each feature when used as a split point. Features that consistently provide higher gains when used for splitting are considered more important.
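The gain of a single split, as described above in terms of mean-squared-error reduction, can be sketched as below. This follows the description in the text only. XGBoost's own gain additionally includes regularization terms, which are omitted here, and the function names and numbers are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_gain(feature, target, threshold):
    """Reduction in squared error from splitting at `feature <= threshold`."""
    left = [t for f, t in zip(feature, target) if f <= threshold]
    right = [t for f, t in zip(feature, target) if f > threshold]
    return sse(target) - sse(left) - sse(right)


def total_gain(splits):
    """Sum the gain of every split, grouped by the feature it used."""
    totals = {}
    for name, gain in splits:
        totals[name] = totals.get(name, 0.0) + gain
    return totals


# Hypothetical split and per-split gains, for illustration only.
print(split_gain([1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 20.0, 20.0], 2.5))
print(total_gain([("temperature", 100.0), ("DeltaQMin_log", 40.0), ("temperature", 10.0)]))
```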

While these methods are model-specific and not directly comparable across different algorithms, they represent standard and well-established approaches for evaluating feature importance within each modeling framework.

Supplementary Material

ao5c09364_si_001.pdf (347 KB, PDF)

Acknowledgments

A start-up grant from Zhejiang University is acknowledged (ZH).

The data are available from the corresponding author upon reasonable request (ZH: hongzijian100@zju.edu.cn).

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.5c09364.

Table S1: testing conditions for the batteries. Figures S1–S3: performance of the machine learning models with the C-rate included as a key feature (PDF)

The authors declare no competing financial interest.

Published as part of ACS Omega special issue “Energy Storage across Scales”.

References

Tarascon J.-M., Armand M. Issues and challenges facing rechargeable lithium batteries. Nature. 2001;414:359–367. doi: 10.1038/35104644.
Goodenough J.-B., Kim Y. Challenges for Rechargeable Li Batteries. Chem. Mater. 2010;22(3):587–603. doi: 10.1021/cm901452z.
Armand M., Tarascon J.-M. Building better batteries. Nature. 2008;451:652–657. doi: 10.1038/451652a.
Nasajpour-Esfahani N., Garmestani H., Bagheritabar M., et al. Comprehensive review of lithium-ion battery materials and development challenges. Renewable Sustainable Energy Rev. 2024;203:114783. doi: 10.1016/j.rser.2024.114783.
Degen F., Winter M., Bendig D., et al. Energy consumption of current and future production of lithium-ion and post lithium-ion battery cells. Nat. Energy. 2023;8:1284–1295. doi: 10.1038/s41560-023-01355-z.
Fredericks W. L., Sripad S., Bower G. C., Viswanathan V. Performance Metrics Required of Next-Generation Batteries to Electrify Vertical Takeoff and Landing (VTOL) Aircraft. ACS Energy Lett. 2018;3(12):2989–2994. doi: 10.1021/acsenergylett.8b02195.
Yang T., Zhang K., Zuo Y., et al. Ultrahigh-nickel layered cathode with cycling stability for sustainable lithium-ion batteries. Nat. Sustain. 2024;7:1204–1214. doi: 10.1038/s41893-024-01402-x.
Hou D., Zhou X., Li T., Xu W., Li T., Liu Y. Texture Evolution in Polycrystal Layered Oxide Cathode Particles for Lithium-Ion Batteries. Energy Mater. Adv. 2025;6:0176. doi: 10.34133/energymatadv.0176.
Liu T., Yu L., Lu J., Zhou T., Huang X., Cai Z., Dai A., Gim J., Ren Y., Xiao X., et al. Rational design of mechanically robust Ni-rich cathode materials via concentration gradient strategy. Nat. Commun. 2021;12:6024. doi: 10.1038/s41467-021-26290-z.
Li J., He Y., Nazari S., et al. Revealing the degradation behaviors and mechanisms of NCM cathode in scrapped lithium-ion batteries. J. Power Sources. 2023;582:233563. doi: 10.1016/j.jpowsour.2023.233563.
Wu F., Dong J., Chen L., Chen G., Shi Q., Nie Y., Lu Y., Bao L., Li N., Song T., et al. Removing the Intrinsic NiO Phase and Residual Lithium for High-Performance Nickel-Rich Materials. Energy Mater. Adv. 2023;4:0007. doi: 10.34133/energymatadv.0007.
Britala L., Marinaro M., Kucinskis G. A review of the degradation mechanisms of NCM cathodes and corresponding mitigation strategies. J. Energy Storage. 2023;73A:108875. doi: 10.1016/j.est.2023.108875.
Evro S., Ajumobi A., Mayon D., Tomomewo O.-S. Navigating battery choices: A comparative study of lithium iron phosphate and nickel manganese cobalt battery technologies. Future Batteries. 2024;4:100007. doi: 10.1016/j.fub.2024.100007.
Su L., Wu M., Li Z., Zhang J. Cycle life prediction of lithium-ion batteries based on data-driven methods. eTransportation. 2021;10:100137. doi: 10.1016/j.etran.2021.100137.
Severson K.-A., Attia P.-M., Jin N., et al. Data-driven prediction of battery cycle life before capacity degradation. Nat. Energy. 2019;4:383–391. doi: 10.1038/s41560-019-0356-8.
Ma Y., Wu L., Guan Y., Peng Z. The capacity estimation and cycle life prediction of lithium-ion batteries using a new broad extreme learning machine approach. J. Power Sources. 2020;476:228581. doi: 10.1016/j.jpowsour.2020.228581.
Wang Y., Zhu J., Cao L., et al. Long Short-Term Memory Network with Transfer Learning for Lithium-ion Battery Capacity Fade and Cycle Life Prediction. Appl. Energy. 2023;350:121660. doi: 10.1016/j.apenergy.2023.121660.
Rhyu J., Schaeffer J., Li M.-L., et al. Systematic feature design for cycle life prediction of lithium-ion batteries during formation. Joule. 2025;9:101884. doi: 10.1016/j.joule.2025.101884.
Si Q., Matsuda S., Yamaji Y., Momma T., Tateyama Y. Data-Driven Cycle Life Prediction of Lithium Metal-Based Rechargeable Battery Based on Discharge/Charge Capacity and Relaxation Features. Adv. Sci. 2024;11:2402608. doi: 10.1002/advs.202402608.
Vijayaraghavan V., Garg A., Gao L.. Predicting the remaining useful life of nickel-manganese-cobalt batteries using ensemble gradient boosting with probabilistic estimation. J. Energy Storage. 2025;135:118374. doi: 10.1016/j.est.2025.118374. [DOI] [Google Scholar]
Yao J., Gao Q., Gao T.. et al. A Physics–Guided Machine Learning Approach for Capacity Fading Mechanism Detection and Fading Rate Prediction Using Early Cycle Data. Batteries. 2024;10:283. doi: 10.3390/batteries10080283. [DOI] [Google Scholar]
Zhu J., Wang Y., Huang Y., Bhushan Gopaluni R., Cao Y., Heere M., Mühlbauer M. J., Mereacre L., Dai H., Liu X.. et al. Data-driven capacity estimation of commercial lithium-ion batteries from voltage relaxation. Nat. Commun. 2022;13:2261. doi: 10.1038/s41467-022-29837-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zou H., Hastie T.. Regularization and variable selection via the elastic net. J. Roy. Stat. Soc. B. 2005;67(2):301–320. doi: 10.1111/j.1467-9868.2005.00503.x. [DOI] [Google Scholar]
Fawagreh K., Gaber M. M., Elyan E.. Random forests: from early developments to recent advancements. Syst. Sci. Control Eng. 2014;2(1):602–609. doi: 10.1080/21642583.2014.956265. [DOI] [Google Scholar]
Xu P., Ji X., Li M., Lu W.. Small data machine learning in materials science. npj Comput. Mater. 2023;9:42. doi: 10.1038/s41524-023-01000-z. [DOI] [Google Scholar]
Torrisi S. B., Carbone M. R., Rohr B. A., Montoya J. H., Ha Y., Yano J., Suram S. K., Hung L.. Random forest machine learning models for interpretable X-ray absorption near-edge structure spectrum-property relationships. npj Comput. Mater. 2020;6:109. doi: 10.1038/s41524-020-00376-6. [DOI] [Google Scholar]
Kwak S., Kim J., Ding H.. et al. Machine learning prediction of the mechanical properties of γ-TiAl alloys produced using random forest regression model. J. Mater. Res. Technol. 2022;18:520–530. doi: 10.1016/j.jmrt.2022.02.108. [DOI] [Google Scholar]
Oliynyk A. O., Antono E., Sparks T. D.. et al. High-Throughput Machine-Learning-Driven Synthesis of Full-Heusler Compounds. Chem. Mater. 2016;28(20):7324–7331. doi: 10.1021/acs.chemmater.6b02724. [DOI] [Google Scholar]
Chen, T. ; Guestrin, C. . XGBoost: A scalable tree boosting system. In Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2016, pp 785–794. [Google Scholar]

Associated Data

Supplementary Materials

ao5c09364_si_001.pdf (347 KB, PDF)

Data Availability Statement

The data are available from the corresponding author upon reasonable request (ZH: hongzijian100@zju.edu.cn).
