Scientific Reports. 2025 Nov 13;15:39873. doi: 10.1038/s41598-025-23748-8

AI-driven multi-objective optimization of FCHEV sizing and energy management considering degradation and vehicle dynamics under realistic machine learning-based traffic conditions

Morteza Montazeri-Gh 1,2, Afshin Mostashiri 1
PMCID: PMC12615721  PMID: 41233539

Abstract

The performance of Fuel Cell Hybrid Electric Vehicles (FCHEVs) is critically dependent on the optimization of energy management strategies (EMS), powertrain component sizing, and associated cost factors. Achieving the full potential of FCHEVs necessitates sophisticated optimization techniques that balance competing objectives of fuel efficiency, durability, and performance. This paper introduces a novel AI-driven multi-objective optimization framework that simultaneously optimizes both the EMS and powertrain component sizing, incorporating real-world driving conditions, degradation models, and vehicle dynamic constraints such as acceleration, top speed, and gradeability. The methodology begins by employing a machine learning approach using a Random Forest classifier to construct a representative driving cycle, categorizing traffic conditions into four distinct operational scenarios: congested, urban, extra-urban, and highway. An advanced hybrid optimization approach is then developed by combining Deep Q-Networks (DQN) with the NSGA-II evolutionary algorithm. This framework dynamically selects genetic operators (crossover and mutation) based on population performance, enhancing convergence and Pareto front quality. Both Type-2 Fuzzy Logic Controller parameters for EMS and powertrain component sizes are co-optimized to improve efficiency and durability. The proposed co-optimization framework improves both efficiency and durability, reducing fuel consumption by 21% compared to sizing-only optimization, while increasing battery durability by 7% and fuel cell durability by 30% compared to EMS-only approaches. Finally, the practical feasibility of the approach is demonstrated through hardware-in-the-loop (HIL) testing, where the optimized Type-2 fuzzy logic controller is executed in real-time by an Electronic Control Unit (ECU) via a data acquisition interface, confirming the system’s applicability.

Keywords: FCHEV, EMS, Machine learning, Sizing, Aging, HIL

Subject terms: Energy science and technology, Engineering, Mathematics and computing

Introduction

FCHEVs represent one of the most promising solutions for sustainable transportation, offering substantial reductions in greenhouse gas emissions compared to traditional internal combustion engine vehicles1,2. By emitting only water vapor and warm air, FCHEVs directly contribute to mitigating environmental pollution and are increasingly viewed as a cornerstone of future green mobility. At the heart of FCHEV performance lies the EMS, which dynamically regulates the distribution of power between the fuel cell and the battery, ensuring optimal operation across diverse driving conditions3–5.

Recent studies have made significant strides in applying genetic optimization methods to enhance energy management strategies (EMS) for hybrid vehicles. For example, paper6 utilizes the Non-dominated Sorting Genetic Algorithm III (NSGA-III) to co-optimize component sizing and EMS for an ammonia-hydrogen hybrid powertrain, improving system efficiency and reducing ammonia consumption in heavy-duty applications. Similarly, article7 combines NSGA-III with Bayesian optimization, achieving a 31.24% efficiency improvement in hydrogen production for carbon-free heavy-duty vehicles. Additionally, research8 demonstrates that adaptive EMS, based on genetic optimization, enhances energy efficiency and reduces carbon emissions in ammonia-hydrogen propulsion systems.

In recent years, artificial intelligence (AI), and reinforcement learning (RL) in particular, has played an increasingly influential role in advancing energy management system (EMS) design. By enabling adaptive, data-driven, and real-time decision-making, these approaches have delivered measurable gains in fuel economy, reductions in component degradation, and enhanced vehicle performance under dynamic operating conditions9–11. For instance, a hierarchical Deep Deterministic Policy Gradient (DDPG)-based EMS improved fuel-cell efficiency by up to 56%, reduced battery degradation by 0.28%, and lowered operating costs by 9.24%12. Similarly, a hybrid Deep Dyna-Q method, which integrates model-free and model-based RL, demonstrated superior EMS optimization compared to conventional strategies, while also reducing training costs and improving policy stability under WLTC conditions13. Hybrid EMS structures have also been investigated, such as14, which combined fuzzy logic for high-level power allocation with RL-based low-level converter control, thereby reducing stress on critical components. More recently, degradation-aware models of fuel cells and batteries have been incorporated into RL-based EMS designs, further improving robustness, adaptability, and interpretability in long-term operation15. Paper16 emphasizes the pivotal role of RL in developing effective energy management strategies for fuel cell electric vehicles, demonstrating, through a novel sim-to-real framework, that integrating advanced RL algorithms with high-fidelity vehicle models results in significant reductions in hydrogen consumption, ranging from 4.35% to 5.73% across various testing stages.

While AI-driven optimization of EMS has progressed rapidly, the integration of AI in powertrain component sizing remains relatively underexplored. Although several studies have proposed co-optimization frameworks that address both EMS and component sizing, many have yet to fully exploit the potential of AI-based methods17,18. In contrast, other optimization techniques, such as Dynamic Programming (DP), Pontryagin’s Maximum Principle (PMP), Equivalent Consumption Minimization Strategy (ECMS), and Particle Swarm Optimization (PSO), have been widely used. For instance, Article19 presents a co-optimization approach for hybrid electric vehicles that simultaneously optimizes battery size and EMS, considering factors such as energy consumption, battery degradation, and depth of discharge (DOD). Using convex programming, this approach aims to minimize total costs and enhance vehicle efficiency. Similarly, Article20 introduces a real-time, multi-layer co-optimization strategy for hybrid vehicles, improving powertrain configuration, parameters, and control. By integrating a multi-mode, multi-gear system with fast real-time control (AFRCS), this method significantly enhances fuel economy, acceleration, and battery life, with real-world tests showing a 50.45% improvement in acceleration and a 22.67% increase in battery life. In the case of fuel cell vehicles, convex programming has also been applied to optimize EMS and component sizing for hybrid buses, focusing on driving patterns and cost sensitivity21. Additionally, other studies have examined the optimization of fuel cell power ratings, battery capacities, and control strategies to understand how these sizing decisions affect overall system efficiency22. A crucial, yet often overlooked, aspect of simultaneously optimizing EMS and component sizing is the impact of component aging and vehicle dynamics. 
Recent research highlights the importance of performance-related constraints—such as acceleration, top speed, and gradeability—in shaping optimization outcomes23,24. For example, one study25 showed that explicitly considering vehicle dynamics under WLTC conditions improved gradeability by 10.5% and extended fuel cell shutdown time by 18.5%, all while maintaining drivability.

Parallel research has also begun addressing aging-aware co-optimization by embedding degradation models for both fuel cells and batteries into the optimization framework26–29. These studies demonstrate tangible benefits: a co-simulation method30 that simultaneously optimized component sizing and EMS achieved a 29% extension in fuel cell lifetime and a 15% gain in fuel efficiency, albeit with a moderate increase in battery aging. Similarly, a fuzzy multi-objective framework applied to a battery–ultracapacitor hybrid system helped balance trade-offs among range, efficiency, and longevity31. Additionally, paper32 presents a co-optimization framework for hybrid powertrains, combining high-fidelity engine and motor maps with an adaptive NSGA-III algorithm enhanced by a chaos sequence (NSGA-CS) to improve diversity and prevent premature convergence. This method minimizes fuel consumption, battery degradation, and manufacturing cost, outperforming traditional approaches and validated through real driving cycles and hardware-in-loop experiments. Another study33 optimized component sizing and EMS policies for a Toyota Mirai platform, achieving up to a 21% increase in fuel economy and meaningful reductions in both cost and degradation. Despite these advances, there remains no comprehensive framework that fully integrates AI-powered optimization of EMS and powertrain sizing simultaneously, while also incorporating real-world traffic patterns, component aging, and vehicle dynamics in a unified process. This gap underscores the need for advanced strategies that not only rely on co-simulation but also improve the underlying optimization process itself.

One important development in this regard is the use of reinforcement learning for adaptive operator selection in evolutionary algorithms, a concept already applied in diverse optimization contexts34–36. A Dueling Deep Q-Network has been used to dynamically select crossover and mutation operators, improving multi-objective optimization on benchmark problems with applications in engineering and resource allocation37. Deep RL–based operator selection has also optimized energy and travel time for unmanned electric sweepers in urban networks38. More recently, its integration into NSGA-II demonstrated clear advantages, yielding higher-quality Pareto fronts in mixed-flow assembly scheduling39.

Building on these advances, this paper proposes a novel hybrid AI-based multi-objective optimization framework tailored for FCHEVs. The framework embeds a DQN within NSGA-II, enabling adaptive selection of genetic operators to jointly optimize both Type-2 Fuzzy Logic Controller parameters and powertrain component sizing. To ensure realistic operating conditions, a machine-learning-based traffic classification model is incorporated, using a Random Forest classifier to generate representative driving cycles spanning congested, urban, extra-urban, and highway scenarios. Degradation models for the fuel cell and battery are explicitly included, along with essential vehicle performance constraints such as acceleration, top speed, and gradeability. The proposed system is evaluated through simulation, with key metrics including fuel consumption, battery degradation, and fuel cell aging. Finally, HIL testing is conducted to validate the practical robustness and applicability of the approach under real-world driving conditions.

Vehicle description

An FCHEV typically consists of a fuel cell system, an electric motor, a battery, a power electronics unit, a hydrogen storage tank, and a control system that manages the energy flow between the fuel cell, battery, and electric motor, as illustrated in Fig. 1(a).

Fig. 1. Structure of an FCHEV: (a) main components of the vehicle, (b) running resistance of the vehicle.

To better understand the vehicle's performance, Fig. 1(b) illustrates its longitudinal dynamics, which are primarily influenced by four forces: the traction force ($F_t$), rolling resistance ($F_r$), aerodynamic drag ($F_a$), and the gravitational component along the road slope ($F_g$). The governing dynamic equation determining the required traction demand is expressed in Eq. (1).

$$ P_e = \frac{u}{\eta_T \eta_m}\left( M g f \cos\theta + \frac{1}{2}\rho C_d A u^2 + M g \sin\theta + \delta M \frac{du}{dt} \right) \tag{1} $$

Here, $M$ represents the total mass of the vehicle, encompassing the weights of the fuel cells, battery, electric motor, and other essential components. $P_e$ denotes the input electric power supplied to the DC/AC inverter, which is required to drive the electric motor. The parameters $\eta_T$ and $\eta_m$ refer to the transmission efficiency and motor drive efficiency, respectively. Additional key variables include $u$ for vehicle speed, $\rho$ for air density, $\delta$ as the rotational mass correction factor, $A$ as the vehicle's frontal area, $C_d$ for the aerodynamic drag coefficient, $f$ for the rolling resistance coefficient, $g$ for gravitational acceleration, and $\theta$ for the road slope angle. The complete set of vehicle specifications used in this study is provided in Table 1.
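As a rough illustration of the road-load balance described above, the following Python sketch computes the electric power requested at the inverter for a given speed, acceleration, and grade. It assumes the standard longitudinal-dynamics form (rolling, aerodynamic, grade, and inertial forces divided by the drivetrain efficiencies). The air density, both efficiency values, and the function name are illustrative assumptions; the vehicle parameters are taken from Table 1.

```python
import math

# Vehicle parameters from Table 1
MASS_KG = 1531.0         # vehicle mass M
C_D = 0.318              # aerodynamic drag coefficient
F_ROLL = 0.0102          # rolling resistance coefficient
FRONTAL_AREA_M2 = 2.1    # frontal area
DELTA = 1.078            # rotational mass correction factor

# Assumed values (not given in the text)
RHO_AIR = 1.2            # air density, kg/m^3
G = 9.81                 # gravitational acceleration, m/s^2
ETA_T = 0.95             # transmission efficiency (hypothetical)
ETA_M = 0.90             # motor drive efficiency (hypothetical)


def inverter_power_w(speed_ms, accel_ms2, grade_rad=0.0):
    """Electric power demand (W) at the inverter input while in traction."""
    f_roll = MASS_KG * G * F_ROLL * math.cos(grade_rad)
    f_aero = 0.5 * RHO_AIR * C_D * FRONTAL_AREA_M2 * speed_ms ** 2
    f_grade = MASS_KG * G * math.sin(grade_rad)
    f_inertia = DELTA * MASS_KG * accel_ms2
    traction_force = f_roll + f_aero + f_grade + f_inertia
    # Dividing by the efficiencies is only valid for positive (traction) power.
    return traction_force * speed_ms / (ETA_T * ETA_M)


# Steady cruise at 71.4 km/h on a flat road
print(round(inverter_power_w(71.4 / 3.6, 0.0) / 1000.0, 2), "kW")
```

During regenerative braking the power flow reverses, so the efficiencies would multiply rather than divide; that case is left out of this sketch.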

Table 1. Specifications of the studied FCHEV.

Component   Parameter                               Value    Unit   Source
Vehicle     Drag coefficient (C_d)                  0.318    -      40
Vehicle     Coefficient of rolling friction (f)     0.0102   -      40
Vehicle     Frontal area (A)                        2.1      m²     40
Vehicle     Vehicle mass (M)                        1531     kg     40
Vehicle     Rotating-mass correction factor (δ)     1.078    -      40
Battery     Nominal capacity                        5        Ah     48
Battery     Voltage                                 413      V      48
Fuel cell   Maximum power                           75       kW     41
Fuel cell   Stack number                            430      -      41
Fuel cell   Active area                             320      cm²    41
Motor       Maximum power                           75       kW     ADVISOR
Motor       Maximum torque                          280      Nm     ADVISOR
Motor       Maximum speed                           6000     rpm    ADVISOR

Fuel cell modeling

Proton Exchange Membrane Fuel Cells (PEMFCs) generate electrical power by facilitating an electrochemical reaction between hydrogen and oxygen. In this process, hydrogen serves as the primary fuel, while oxygen is sourced from the surrounding air. The byproducts of the reaction are electricity, heat, and water, making PEMFCs an environmentally friendly option. A distinctive feature of these fuel cells is the solid polymer electrolyte membrane, which selectively allows protons to migrate through while forcing electrons to travel via an external circuit (refer to Fig. 2(a)).

Fig. 2. Fuel cell modeling plots: (a) basic structure of a PEMFC, (b) efficiency plot of the fuel cell system41.

This separation creates the electric current necessary for power generation. The fundamental reaction governing this process is summarized in Eq. (2). Additional specifications related to the fuel cell stack are listed in Table 1.

$$ \mathrm{H_2} + \tfrac{1}{2}\,\mathrm{O_2} \rightarrow \mathrm{H_2O} + \text{electricity} + \text{heat} \tag{2} $$

The hydrogen consumption of a PEMFC can be determined using:

graphic file with name d33e1079.gif 3

In this context, N represents the number of cells, F denotes the Faraday constant, and Inline graphic refers to the net current extracted from the PEMFC. This hydrogen consumption is then utilized to determine both the cost of hydrogen and the energy input to the PEMFC, as outlined below:

graphic file with name d33e1099.gif 4

Here, HHV denotes the higher heating value of the hydrogen supplied to the PEMFC. The relationship between fuel cell power (FC Power) and efficiency used in this study is depicted in Fig. 2(b)41.
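The hydrogen bookkeeping described above can be sketched as follows. The sketch assumes Faraday's law with two electrons per hydrogen molecule and converts moles to mass with the molar mass of hydrogen; whether the original equations include the molar-mass factor is not visible in the text. The heating value, the interpretation of the "stack number" in Table 1 as the cell count N, and the function names are assumptions.

```python
FARADAY_C_PER_MOL = 96485.0   # Faraday constant F
M_H2_KG_PER_MOL = 2.016e-3    # molar mass of hydrogen (assumed to be part of the mass form)
HHV_H2_J_PER_KG = 141.8e6     # approximate higher heating value of hydrogen


def h2_mass_flow_kg_s(n_cells, net_current_a):
    """Hydrogen mass flow from Faraday's law (two electrons per H2 molecule)."""
    return n_cells * net_current_a * M_H2_KG_PER_MOL / (2.0 * FARADAY_C_PER_MOL)


def h2_energy_input_w(n_cells, net_current_a):
    """Chemical power supplied to the stack, based on the HHV."""
    return h2_mass_flow_kg_s(n_cells, net_current_a) * HHV_H2_J_PER_KG


# Example with N = 430 (Table 1) and a hypothetical net current of 100 A
flow = h2_mass_flow_kg_s(430, 100.0)
print(f"{flow * 1000.0:.3f} g/s, {h2_energy_input_w(430, 100.0) / 1000.0:.1f} kW (HHV)")
```

Multiplying the integrated mass flow by a hydrogen price would give the fuel cost; no price is stated in the text, so it is omitted here.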

To enhance the accuracy and interpretability of fuel cell vehicle models, advanced AI methods leveraging experimental data are highly effective. For example, paper42 presents a theory-constrained neural network (TCNN) that combines theoretical models with data-driven techniques, using experimental data to improve fuel cell temperature and voltage predictions while maintaining physical significance, resulting in better hydrogen consumption estimation and system optimization.

Fuel cell aging

This study estimates fuel cell power degradation (Q-FC) by incorporating the cumulative effects of various operational conditions using empirically derived coefficients that reflect different degradation mechanisms43,44. The deterioration in maximum power output, denoted as Q-FC, is calculated using the equation:

graphic file with name d33e1139.gif 5

Where Inline graphic indicates the peak power capacity of the fuel cell, serving as a reference for quantifying losses. The degradation coefficients are associated with distinct stress factors: α₁ (0.00126% per hour) accounts for the prolonged operation at very low power levels (below 5% of Inline graphic), which can impair electrochemical efficiency; α₂ (0.00196% per cycle) reflects the impact of frequent startups and shutdowns; α₃ (5.93 × 10⁻⁵% per load change) measures degradation from transient fluctuations in load demand; α₄ corresponds to the high-power operation stress, typically encountered above 90% of rated output; and α₅ (0.002% per hour) represents the baseline performance decay under regular usage, attributable to gradual material wear such as membrane thinning and catalyst aging. The time durations t₁, t₂, and t₃ represent cumulative periods of low-power use, high-power operation, and total runtime, respectively, while n₁ and n₂ count the number of on/off cycles and transient load events44. It is noteworthy that, according to45, fuel cell degradation is predominantly attributed to load changing, which accounts for 56.5% of the total degradation. In alignment with standards from the U.S. Department of Energy (DOE), a fuel cell is considered to have reached its end of life (EOL) when its peak output drops by 10%, with a targeted lifespan of 5000 operational hours46.
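A minimal sketch of this degradation bookkeeping is given below. It assumes the five stress contributions simply add up linearly, which matches the description of cumulative effects but is not confirmed by the text. Because no value is stated for α₄, it is passed in as an argument, and the function name is hypothetical.

```python
# Coefficients stated in the text (percent of peak power)
ALPHA_LOW_POWER = 0.00126     # % per hour below 5 % of peak power
ALPHA_START_STOP = 0.00196    # % per start/stop cycle
ALPHA_LOAD_CHANGE = 5.93e-5   # % per load change
ALPHA_BASELINE = 0.002        # % per hour of operation
EOL_LOSS_PCT = 10.0           # DOE end-of-life threshold


def fc_power_loss_pct(t_low_h, n_start_stop, n_load_changes,
                      t_high_h, t_total_h, alpha_high_power):
    """Cumulative loss of peak fuel cell power in percent (linear sum, assumed)."""
    return (ALPHA_LOW_POWER * t_low_h
            + ALPHA_START_STOP * n_start_stop
            + ALPHA_LOAD_CHANGE * n_load_changes
            + alpha_high_power * t_high_h   # alpha_4 is not given in the text
            + ALPHA_BASELINE * t_total_h)


# Baseline decay alone over the 5000 h target lifetime
loss = fc_power_loss_pct(0.0, 0, 0, 0.0, 5000.0, alpha_high_power=0.0)
print(round(loss, 6), loss >= EOL_LOSS_PCT - 1e-9)
```

With these coefficients the baseline term alone reaches the 10% threshold at 5000 h, so any start/stop cycling or load transients shorten the lifetime below the target.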

Battery modeling

To accurately simulate the internal resistance and open-circuit voltage behavior, lookup tables were applied based on discharge properties, charge resistance, and voltage–SOC profiles, as detailed in Fig. 3 (a)47. In this model, the battery’s output current Inline graphic, terminal voltage Inline graphic, and output power Inline graphic​ are computed using the following relationships:

graphic file with name d33e1202.gif 6
graphic file with name d33e1208.gif 7
graphic file with name d33e1214.gif 8
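For orientation, the sketch below implements a common internal-resistance (Rint) battery model, in which the terminal power, open-circuit voltage, and internal resistance determine the current. It is an assumed form rather than a transcription of Eqs. (6) to (8); in particular, the constant resistance used in the example is hypothetical, whereas the study reads resistance and open-circuit voltage from SOC-dependent lookup tables.

```python
import math


def battery_current_a(power_w, v_oc, r_int):
    """Current that delivers `power_w` at the terminals (discharge positive)."""
    discriminant = v_oc ** 2 - 4.0 * r_int * power_w
    if discriminant < 0.0:
        raise ValueError("requested power exceeds what the battery can deliver")
    # Smaller root of R*I^2 - V_oc*I + P = 0 (the physically meaningful one)
    return (v_oc - math.sqrt(discriminant)) / (2.0 * r_int)


def terminal_voltage_v(current_a, v_oc, r_int):
    """Terminal voltage after the ohmic drop across the internal resistance."""
    return v_oc - current_a * r_int


# 10 kW discharge from a 413 V pack (Table 1) with a hypothetical 0.5 ohm resistance
i_bat = battery_current_a(10_000.0, 413.0, 0.5)
v_bat = terminal_voltage_v(i_bat, 413.0, 0.5)
print(round(i_bat, 2), round(v_bat, 2), round(i_bat * v_bat, 1))
```

A negative power request (charging) yields a negative current and a terminal voltage above the open-circuit value, as expected from the same quadratic.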

Fig. 3. Characteristic maps for battery modeling: (a) relationship between internal resistance, open-circuit voltage, and SOC47, (b) capacity depletion at C/2 discharge rate as a function of cycle number for various DoD51.

To extend battery service life and avoid damage from deep discharges, the SOC should not fall below 40%48,49, which corresponds to a maximum depth of discharge (DoD) of 60%, as defined by:

$$ \mathrm{DoD} = 1 - \mathrm{SOC} \tag{9} $$

Battery aging

The battery degradation model presented in this work adopts a semi-empirical approach, incorporating the combined influence of temperature, SOC, and C-rate to evaluate battery aging. It estimates performance decline through the total ampere-hour (Ah) throughput, offering a holistic assessment of usage impact over time. Battery capacity loss is expressed as a percentage of the initial capacity, making the degradation trends easily quantifiable. To account for varying operational conditions, the average stress factors are calculated over the full driving cycle, thereby improving the model’s representation of real-world use. The primary degradation is:

$$ Q_{loss}(\%) = \frac{Q_0 - Q(Ah)}{Q_0} \times 100 \tag{10} $$

In this formulation, $Q_0$ is the initial nominal battery capacity, and $Q(Ah)$ is the remaining capacity after experiencing a cumulative charge/discharge throughput $Ah$. The total charge throughput, including both charging and discharging phases, is calculated by:

$$ Ah = \frac{1}{3600}\int_0^{t} \left| I(\tau) \right| \, d\tau \tag{11} $$

where $I(\tau)$ represents the battery current (in amperes) over time and $t$ is the elapsed time in seconds. Capacity fade at the cell level is further characterized by50:

graphic file with name d33e1307.gif 12

Here, Inline graphic represents the capacity severity factor function, Inline graphic denotes the gas constant, and Inline graphic​ is the cell activation energy associated with capacity fade, and Inline graphic refers to the battery temperature.

To generalize the model for various C-rates, Wang et al.51 introduced a modified expression:

graphic file with name d33e1345.gif 13

Table 2 provides the corresponding values of the coefficient B for different C-rates.

Table 2. Values of B with respect to C-rate51.

C-rate     C/2      2C       6C       10C
B value    31,630   21,681   12,934   15,512
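The following sketch shows how the B values of Table 2 could be used in a throughput-based capacity-fade estimate. The functional form and the constants 31700, 370.3, and 0.55 are taken from the commonly cited form of the Wang et al. cycle-life model and are an assumption here, since the equation itself is not legible in this text. The function name is hypothetical.

```python
import math

R_GAS = 8.314  # J/(mol*K)

# Pre-exponential factor B per C-rate (Table 2)
B_BY_C_RATE = {0.5: 31630.0, 2.0: 21681.0, 6.0: 12934.0, 10.0: 15512.0}


def capacity_loss_pct(c_rate, temp_k, ah_throughput):
    """Capacity loss in percent after `ah_throughput` Ah at a tabulated C-rate."""
    b = B_BY_C_RATE[c_rate]                        # only tabulated C-rates are supported
    activation_j_per_mol = -31700.0 + 370.3 * c_rate   # assumed constants
    return b * math.exp(activation_j_per_mol / (R_GAS * temp_k)) * ah_throughput ** 0.55


for ah in (100.0, 1000.0, 3000.0):
    print(ah, round(capacity_loss_pct(0.5, 298.15, ah), 2))
```

Intermediate C-rates would need interpolation of B, which this sketch deliberately leaves out because the text does not say how the authors handle it.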

Figure 3(b) presents the capacity loss trend for a LiFePO₄ cell operating at C/2 across different DoD. The plotted curves exhibit a mild S-curve pattern, highlighting the complex nature of cycle-induced degradation. Nevertheless, a generally linear trend between capacity retention and cycle count is observed within individual DoD levels.

Electric motor modeling

The core performance parameters of an electric motor include rotational speed, output torque, and conversion efficiency. In this study, motor behavior is modeled using an efficiency map, depicted in Fig. 4. This map delineates operating zones, with quadrant I representing propulsion (driving) and quadrant IV indicating regenerative braking (generator operation). Furthermore, Fig. 4 outlines the motor’s peak torque boundaries under varying conditions. The power output or consumption of the motor, Inline graphic, in relation to torque Inline graphic and angular velocity Inline graphic, is defined by the following expressions:

graphic file with name d33e1439.gif 14

Fig. 4. Efficiency map and torque-speed profile.

Development of a real-world traffic driving cycle

A realistic, scenario-specific drive cycle largely determines the Pareto front—that is, the observed trade-off between fuel use and component aging. The cycle fixes the distribution of transients (stops, bursts, grades, cruising) that drive power demand, SOC excursions/Ah-throughput, thermal loads, and engine on/off events—exactly the mechanisms that consume fuel and accumulate degradation52. This research focused on formulating a representative drive cycle tailored to Tehran’s traffic conditions, aimed at capturing the city’s distinctive driving patterns. Given Tehran’s dense population—approximately 10 million residents53—and its well-known issues with traffic congestion, creating an accurate drive cycle was essential for reliable modeling and simulation. To achieve this, driving data were collected from various urban routes throughout the city, providing the foundation for constructing a drive cycle suitable for evaluating and optimizing vehicle performance under real-world traffic conditions.

Data collection setup

A GPS-enabled logger was developed using an Arduino Uno microcontroller integrated with a NEO-6M GPS module to record real-world traffic data. As depicted in Fig. 5(a), this setup was employed to log vehicle speed as the car traveled through various areas. The GPS device sampled data at consistent time intervals, enabling the construction of detailed speed-time profiles. The entire system was enclosed in a durable casing for protection during on-road testing, as shown in Fig. 5(b). The data collection routes covered several key streets and highways, with Fig. 5(c) highlighting the primary paths. Routes marked in red represent areas with higher traffic congestion, while those in yellow indicate regions with relatively smoother traffic flow. Figure 5(d) displays the vehicles outfitted with the GPS logger during data gathering.

Fig. 5. Data collection setup and route for the real-world traffic cycle: (a) components of the GPS logger, (b) GPS box, (c) route characteristics of the city of Tehran, (d) setup used to develop the drive cycle.

Noise filtering

Traffic data was collected over six months to develop the real-world driving cycle. Aggregating all recorded speed-time samples resulted in a comprehensive drive cycle encompassing more than 300,000 s of data, as illustrated in Fig. 6(a). An advanced smoothing algorithm was applied to address high-frequency fluctuations in the velocity data, providing superior performance compared to conventional averaging techniques. This method follows the filtering approach proposed in54, expressed by the following equation:

graphic file with name d33e1517.gif 15

Fig. 6. Development of the real-world traffic drive cycle: (a) driving data collected over a six-month period across various routes in the city; (b) drive cycle generation using a machine learning-based clustering approach.

In this formulation, the kernel function Inline graphic assigns appropriate weights to the velocity samples within a specified temporal window centered around each time point (t), thereby enabling effective noise reduction. A smoothing window of 4 s (h = 4) was adopted, utilizing a weighted kernel to enhance filtering precision.

The employed kernel function is defined as:

graphic file with name d33e1539.gif 16

This kernel effectively reduces noise while preserving the key dynamics of the driving cycle. The filtering process results are depicted in Fig. 6(a), demonstrating the method’s efficacy in smoothing raw velocity data.
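The filtering step can be illustrated with a kernel-weighted moving average (a Nadaraya-Watson style smoother). The kernel actually used in the study is not legible in this text, so the Epanechnikov kernel below is a stand-in, and treating h = 4 s as the kernel half-width is likewise an assumption.

```python
def epanechnikov(x):
    """Stand-in kernel (assumed): parabolic weights inside |x| <= 1, zero outside."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def kernel_smooth(times, speeds, h=4.0, kernel=epanechnikov):
    """Weighted average of the speed samples around each time point."""
    smoothed = []
    for t in times:
        weights = [kernel((t - ti) / h) for ti in times]
        total = sum(weights)  # always > 0 because the sample at t itself has weight 0.75
        smoothed.append(sum(w * v for w, v in zip(weights, speeds)) / total)
    return smoothed


times = list(range(11))                       # 1 Hz samples
raw = [10.0] * 5 + [20.0] + [10.0] * 5        # a single-sample spike
print([round(v, 2) for v in kernel_smooth(times, raw)])
```

This direct form costs O(n²) operations; for a record of several hundred thousand samples, restricting the inner loop to the samples inside the window would be the obvious refinement.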

Machine learning approach for classifying and constructing traffic drive cycle

The speed data used in this study encompassed a wide range of routes, traffic conditions, and time-of-day variations to ensure a comprehensive representation of driving environments. The dataset was segmented into microtrips, defined as continuous driving sequences separated by idle periods. A microtrip was considered active while the vehicle's speed was greater than zero, and it ended when the speed dropped to zero and remained there for a period of time. This segmentation approach effectively captures both dynamic driving phases and stationary intervals.
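The segmentation rule can be sketched as below. The minimum idle length that closes a microtrip is not stated in the text, so `min_idle_samples` is a hypothetical parameter, and this version excludes the trailing idle samples from the microtrip, which is also an assumption.

```python
def split_microtrips(speeds, min_idle_samples=3):
    """Return (start, end) index pairs, end exclusive, one per microtrip."""
    trips = []
    start = None      # index where the current microtrip began
    idle_run = 0      # consecutive zero-speed samples seen so far
    for i, v in enumerate(speeds):
        if v > 0:
            if start is None:
                start = i
            idle_run = 0
        elif start is not None:
            idle_run += 1
            if idle_run >= min_idle_samples:
                # Close the microtrip at the last moving sample
                trips.append((start, i - idle_run + 1))
                start = None
                idle_run = 0
    if start is not None:
        trips.append((start, len(speeds) - idle_run))
    return trips


speeds = [0, 0, 5, 8, 0, 6, 0, 0, 0, 4, 3, 0]
print(split_microtrips(speeds))
```

Short stops below the threshold (the single zero at index 4 in the example) stay inside the microtrip, so they still contribute to its idle-time percentage.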

For each microtrip, a set of descriptive features was extracted to characterize the driving conditions. These features included average speed, idle time percentage, speed standard deviation, peak speed, acceleration events, and deceleration events. The extracted microtrips were then classified into four driving categories: Congested, Urban, Extra-Urban, and Highway. The classification criteria and feature thresholds for each category are presented in Table 3.

Table 3. Feature thresholds for congested, urban, extra-urban, and highway conditions.

Class         Average velocity (km/h)   Idle percentage (%)   Std. dev. of velocity   Max velocity (km/h)   Acceleration events   Deceleration events
Congested     0–5                       0–100                 < 2                     < 10                  0–5                   0–5
Urban         5–15                      0–75                  < 5                     10–30                 2–10                  5–20
Extra Urban   15–30                     0–53                  < 10                    30–50                 5–15                  10–30
Highway       > 30                      0–30                  > 10                    > 40                  10–20                 20–50
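To make the feature definitions concrete, the sketch below extracts a few of the listed features from one microtrip and assigns a provisional label from the average-velocity column of Table 3 alone. How the authors combine all six features into a label is not described, so the single-feature rule, the handling of the interval boundaries, and the function names are assumptions.

```python
from statistics import mean, pstdev


def microtrip_features(speeds_kmh):
    """Average speed, idle share, speed standard deviation and peak speed."""
    idle_samples = sum(1 for v in speeds_kmh if v == 0)
    return {
        "avg": mean(speeds_kmh),
        "idle_pct": 100.0 * idle_samples / len(speeds_kmh),
        "std": pstdev(speeds_kmh),
        "max": max(speeds_kmh),
    }


def label_from_average_speed(avg_kmh):
    """Provisional class from the average-velocity ranges of Table 3."""
    if avg_kmh <= 5.0:
        return "Congested"
    if avg_kmh <= 15.0:
        return "Urban"
    if avg_kmh <= 30.0:
        return "Extra Urban"
    return "Highway"


features = microtrip_features([0, 10, 20, 10])
print(features, label_from_average_speed(features["avg"]))
```

Labels produced this way could serve as training targets for a classifier, with the remaining features helping to resolve cases near the thresholds.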

The extracted features from each microtrip were employed to train a Random Forest classifier to identify the driving condition associated with each segment. The classifier was trained on 300,000 s of labeled microtrip data, with a cost matrix to penalize misclassifications. Boundary adjustment resolved ambiguous cases near class thresholds. The Random Forest algorithm operates as an ensemble of multiple decision trees, each trained on a random subset of features and bootstrapped data samples (see Fig. 7). For instance, one tree in the forest might split first on Average Velocity (> 15 km/h), then on Max Velocity (> 40 km/h) to classify Highway traffic, while another tree could prioritize Idle Percentage (> 50%) to detect Congested conditions. During inference, each tree votes independently, and the final prediction is determined by majority voting (for classification) or averaging (for regression). This approach reduces overfitting by decorrelating individual trees, leveraging the strength of collective decision-making while mitigating biases from any single tree’s structure.
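The voting mechanism can be illustrated with a toy ensemble. The three hand-written rules below stand in for trained trees and loosely follow the split examples mentioned above; they are not the fitted model, and a real Random Forest would learn its splits from bootstrapped data and random feature subsets.

```python
from collections import Counter


def tree_speed(f):
    # Splits on average velocity first, then on maximum velocity
    if f["avg"] > 15:
        return "Highway" if f["max"] > 40 else "Extra Urban"
    return "Urban" if f["avg"] > 5 else "Congested"


def tree_idle(f):
    # Prioritizes idle percentage to detect congestion
    if f["idle_pct"] > 50:
        return "Congested"
    if f["avg"] > 30:
        return "Highway"
    return "Extra Urban" if f["avg"] > 15 else "Urban"


def tree_max(f):
    # Uses maximum velocity only
    if f["max"] > 40:
        return "Highway"
    if f["max"] > 30:
        return "Extra Urban"
    return "Urban" if f["max"] > 10 else "Congested"


def forest_predict(features, trees=(tree_speed, tree_idle, tree_max)):
    """Majority vote over the individual tree predictions."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


print(forest_predict({"avg": 16.0, "idle_pct": 10.0, "max": 45.0}))
```

In the printed example the trees disagree (two vote Highway, one Extra Urban) and the majority decides, which is the behavior that makes the ensemble less sensitive to any single tree.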

Fig. 7. Decision tree for driving condition classification based on velocity, acceleration, and traffic metrics.

The classification model’s performance is visually demonstrated in a series of plots. Figure 8(a) shows a 3D scatter plot of Average Velocity, Idle Percentage, and Standard Deviation of Velocity, with microtrips color-coded by condition. Congested microtrips show lower velocities and higher idle percentages, while Highway microtrips exhibit the opposite characteristics. Figure 8(b) is a bar chart showing the number of microtrips in each condition, with Congested being the most common. Figure 8(c) displays a 2D plot of Idle Percentage vs. Average Velocity, clearly distinguishing Congested and Highway conditions. Figure 8(d) shows Max Velocity vs. Standard Deviation, where Highway microtrips have the highest values for both features.

Fig. 8. Random Forest classification of microtrip driving conditions: (a) three-dimensional scatter plot of average velocity, idle time percentage, and speed standard deviation; (b) two-dimensional plot of average velocity and idle time percentage; (c) bar chart of microtrip distribution; (d) plot of maximum velocity and speed standard deviation.

The confusion matrix (Fig. 9(a)) illustrates the model's performance, showing high classification accuracy: 99.8% for Congested, 93.1% for Urban, 99.6% for Extra Urban, and 100% for Highway. Most misclassifications occur between Urban and Extra Urban because their feature ranges overlap. In Fig. 9(b), correctly classified microtrips are shown in green and misclassified ones in red, with errors mainly appearing at the boundary between the Urban and Extra Urban conditions.

Fig. 9. Random Forest classification: (a) confusion matrix, (b) misclassification analysis based on average velocity and idle time percentage.

Finally, the traffic drive cycle was generated by stitching together selected microtrips from each classified driving condition, prioritizing those closest to their respective cluster centers (see Fig. 6(b)). This approach ensured that the resulting drive cycle accurately represented the typical driving patterns. The cycle captures diverse traffic conditions—Congested, Urban, Extra Urban, and Highway—each with distinct durations, speed profiles, and behavioral characteristics, as summarized in Table 4.
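The selection of representative microtrips can be sketched as a nearest-to-center ranking within one class. The text does not say which features or which distance are used, so plain Euclidean distance on raw feature tuples is an assumption here, and the function name is hypothetical.

```python
import math


def closest_to_center(feature_rows, k=1):
    """Indices of the k microtrips nearest to the class mean in feature space."""
    dims = len(feature_rows[0])
    center = [sum(row[d] for row in feature_rows) / len(feature_rows)
              for d in range(dims)]
    ranked = sorted(range(len(feature_rows)),
                    key=lambda i: math.dist(feature_rows[i], center))
    return ranked[:k]


# (average speed in km/h, idle percentage) for three microtrips of one class
rows = [(2.0, 50.0), (3.0, 40.0), (10.0, 0.0)]
print(closest_to_center(rows, k=2))
```

In practice the features would likely need scaling to comparable ranges first, since an unscaled idle percentage dominates the distance in this example.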

Table 4. Characteristics of congested, urban, extra-urban, and highway conditions.

Condition             Congested   Urban    Extra Urban   Highway   Total Cycle
Duration (s)          465         500      499           565       2029
MaxSpeed (km/h)       12.5        17.1     35.7          71.4      71.4
AverageSpeed (km/h)   2.7         7.0      19.2          39.2      18.0
StdVel (km/h)         3.2         4.8      8.4           19.6      18.5
AccelEvts             90          168      211           224       693
DecelEvts             123         174      160           216       674
MaxAccel (m/s²)       0.83        0.87     0.965         1.40      1.40
MinAccel (m/s²)       −0.77       −0.9     −1.2          −1.13     −1.2
AvgAccel (m/s²)       0.263       0.264    0.27          0.28      0.27
AvgDecel (m/s²)       −0.18       −0.25    −0.33         −0.31     −0.27
IdleTime (s)          229         83       28            13        353
NumberStops           14          12       3             1         31
Distance (km)         0.35        0.98     2.67          6.16      10.15

Hybrid multi-objective deep reinforcement learning optimization for FCHEV

This section presents a hybrid multi-objective optimization framework designed to simultaneously optimize powertrain component sizing and the EMS of an FCHEV. The proposed method combines NSGA-II with a DQN to enable the adaptive selection of genetic operators. Integrating these techniques balances multiple conflicting objectives, including maximizing energy efficiency, minimizing operational costs, and prolonging system durability. Moreover, the framework simultaneously fine-tunes both the physical configuration of the powertrain and the parameters of the Type-2 fuzzy logic controller, ensuring intelligent energy distribution under real-world driving conditions.

NSGA-II multi-objective optimization

Parts (a) and (b) of Fig. 10 together illustrate the working principles of the NSGA-II algorithm for solving multi-objective optimization problems. In part (a), the process begins with a randomly generated initial population of solutions, each representing a possible trade-off between conflicting objectives. These solutions are combined with newly generated offspring and then ranked using non-dominated sorting, which organizes them into layers (F1, F2, F3, etc.) based on Pareto dominance. The first front (F1) contains the best solutions, those not dominated by any other, while lower-ranked fronts (F2, F3, F4) represent progressively weaker alternatives. Since the population size must remain limited, NSGA-II applies an additional step known as crowding distance sorting to select which solutions survive to the next generation. This mechanism ensures that chosen solutions are not only of high quality but also well spread across the objective space, preventing the algorithm from converging to a narrow cluster of points55. Part (b) shows this concept visually: the horizontal axis represents one cost (fuel cell and battery aging) and the vertical axis represents another (fuel consumption). The scattered black dots indicate dominated solutions, while the connected blue points represent the non-dominated fronts. On the first front (F1), consecutive solutions such as i−1, i, and i+1 are used to calculate crowding distance. If solution i is far from its neighbors, it receives a larger crowding distance value and is more likely to be selected, as it contributes to maintaining diversity. Conversely, solutions packed too closely together may be rejected. Through this combination of non-dominated sorting and crowding distance sorting, NSGA-II ensures that the final Pareto front is both close to the true trade-off boundary and evenly distributed, offering decision-makers a wide set of balanced alternatives.
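The two ranking steps can be sketched for a minimization problem as follows. This is a simple front-peeling version written for readability, not the faster bookkeeping of the original NSGA-II sort, and the example points are made-up (aging, fuel) pairs.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fronts(points):
    """Peel off successive Pareto fronts F1, F2, ... as lists of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Sum of normalized neighbor gaps per objective; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[0])):
        ordered = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (points[nxt][m] - points[prev][m]) / (hi - lo)
    return dist


points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]   # hypothetical (aging, fuel) costs
fronts = non_dominated_fronts(points)
print(fronts, crowding_distance(points, fronts[0]))
```

When a front does not fit entirely into the next generation, its members would be sorted by descending crowding distance and the most isolated ones kept.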

Fig. 10. NSGA-II evolutionary process: (a) non-dominated sorting and crowding distance, (b) Pareto front representation with crowding distance.
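The two ranking steps described for Fig. 10 can be illustrated with a minimal pure-Python sketch. It assumes both objectives are minimized; the five (aging, fuel) pairs are hypothetical values chosen only to produce three fronts, and the function names are illustrative rather than taken from the paper.

```python
import math

def dominates(a, b):
    """True if a Pareto-dominates b (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Return fronts as lists of indices: F1 first, then F2, and so on."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(points, front):
    """Crowding distance per index of one front; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    n_obj = len(points[front[0]])
    for k in range(n_obj):
        ordered = sorted(front, key=lambda i: points[i][k])
        lo, hi = points[ordered[0]][k], points[ordered[-1]][k]
        dist[ordered[0]] = dist[ordered[-1]] = math.inf
        if hi == lo:
            continue
        # Each interior point accumulates the normalized gap between its neighbors.
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (points[nxt][k] - points[prev][k]) / (hi - lo)
    return dist

# Hypothetical (aging, fuel) pairs, both to be minimized.
pop = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
fronts = non_dominated_sort(pop)
cd = crowding_distance(pop, fronts[0])
print(fronts)
print(cd)
```

In this toy population the first front holds three solutions; the two end points receive an infinite crowding distance, so they are always preferred when the front has to be truncated.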

In evolutionary algorithms such as NSGA-II, new solutions are generated using genetic operators, which recombine or perturb existing solutions to explore the search space. For instance, consider two parent solutions representing vehicle configurations: (40 kWh battery, 80 kW fuel cell) and (60 kWh battery, 100 kW fuel cell). A simple crossover could exchange their traits to produce (40, 100) and (60, 80), while a mutation step might slightly adjust one offspring to (42, 97), introducing diversity. In the proposed framework, more advanced operators are also considered. The Simulated Binary Crossover (SBX) blends parent values by a scaling factor, potentially producing an offspring such as (38, 90), whose first coordinate lies beyond the parents and whose second lies between them. The DE/rand/1 operator generates a new solution by adding a weighted difference between two parents to a third, for example (40, 80) + 0.8 × ((60, 100) − (30, 70)) = (64, 104), effectively exploring along directional vectors. The DE/rand/2 operator extends this idea by combining two such difference vectors, allowing the offspring to explore more aggressively, e.g., (40, 80) + 0.5 × ((60, 100) − (30, 70)) + 0.5 × ((55, 85) − (25, 60)) = (70, 107.5). Traditionally, such operators are applied using fixed probabilities, but this is inflexible because their usefulness changes across search stages. To address this, adaptive operator selection driven by a deep reinforcement learning agent dynamically chooses which operator to apply based on their recent performance, ensuring that the most effective operators are emphasized while weaker ones are used less often.
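The worked operator examples in the paragraph above can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, solutions are (battery kWh, fuel cell kW) tuples as in the text, and the same scaling factor F is assumed for both difference vectors in DE/rand/2. Evaluated exactly, the second coordinate of the DE/rand/2 example is 107.5.

```python
def one_point_swap(p1, p2):
    """Exchange the second trait of two 2-variable parents."""
    return (p1[0], p2[1]), (p2[0], p1[1])

def de_rand_1(base, a, b, F):
    """base + F * (a - b), applied per decision variable."""
    return tuple(x + F * (u - v) for x, u, v in zip(base, a, b))

def de_rand_2(base, a, b, c, d, F):
    """base + F * (a - b) + F * (c - d), applied per decision variable."""
    return tuple(x + F * (u - v) + F * (s - t)
                 for x, u, v, s, t in zip(base, a, b, c, d))

swap = one_point_swap((40, 80), (60, 100))
v1 = de_rand_1((40, 80), (60, 100), (30, 70), F=0.8)
v2 = de_rand_2((40, 80), (60, 100), (30, 70), (55, 85), (25, 60), F=0.5)
print(swap, v1, v2)
```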

Hybrid NSGA-II and DQN framework for multi-objective optimization

This methodology enhances the NSGA-II algorithm with a DQN to enable the adaptive selection of genetic operators throughout the multi-objective optimization process. This hybrid NSGA-II–DQN framework effectively addresses the challenges of complex multi-objective problems by integrating evolutionary strategies with reinforcement learning. The DQN agent interacts dynamically with the evolving population—comprising decision variables and their corresponding objective function values—and selects the most appropriate genetic operators based on learned Q-values, which capture each operator’s historical performance.

Figure 11 shows, step by step, how the proposed NSGA-II + Deep Q-Learning framework works.

Fig. 11. Illustration of the proposed NSGA-II-DQN model.

  • Evolution (top-left): The process begins with a group of candidate solutions (red and blue circles). These represent different possible designs or strategies. Special tools called operators (OP1, OP2, OP3) are used to generate new solutions by mixing and modifying the existing ones. The number of solutions first increases because many offspring are created, but later it is reduced again after only the best ones are selected.

  • Interaction (bottom-left): Here, an agent (the learning controller) communicates with the NSGA-II algorithm. The agent chooses which operator to apply (action), and NSGA-II evaluates the result, sending back a reward that reflects how good the new solutions are. This loop helps the system learn from trial and error.

  • Learning (bottom-right): All the information about the state of the population, the chosen operator, and the reward is stored in a memory called experience replay. From this memory, a Q-network (a neural network) learns to predict the usefulness of each operator in different situations. To make learning more stable, another copy of the network (target network) is also used. Together, they improve the accuracy of the Q-values, which measure how good each action is expected to be.

  • Decision (top-right): Finally, the system uses the learned Q-values to decide which operator (OP1, OP2, or OP3) should be applied next. This decision balances trying new options (exploration) with choosing the currently best operator (exploitation). The selected operator is then sent back to the Evolution stage, and the cycle repeats.
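The Learning stage described above relies on an experience replay memory and a target network. A minimal sketch of the memory and of the one-step target value is given below; the class and function names are hypothetical, the default capacity, batch size, and discount factor are the values listed in Table 6, and the Q-values passed to `td_target` stand in for the output of a target network that is not implemented here.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])

class ReplayBuffer:
    """Fixed-capacity experience replay; the oldest transitions are dropped first."""

    def __init__(self, capacity=10_000):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.memory.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size=64):
        # Never request more transitions than are stored.
        k = min(batch_size, len(self.memory))
        return random.sample(list(self.memory), k)

    def __len__(self):
        return len(self.memory)

def td_target(reward, next_q_values, gamma=0.99):
    """One-step target r + gamma * max over actions of the target-network Q-values."""
    return reward + gamma * max(next_q_values)

# Tiny demonstration with a capacity of 3 so the eviction is visible.
buf = ReplayBuffer(capacity=3)
for step in range(5):
    buf.push(state=(step,), action=step % 3, reward=0.1 * step, next_state=(step + 1,))
print(len(buf), buf.memory[0])
```

Sampling uniformly from the buffer decorrelates consecutive generations, which is the usual motivation for experience replay in DQN-style training.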

Figure 12 illustrates the proposed hybrid optimization framework designed for tuning the multi-objective fuzzy EMS and component sizes. It begins with the initialization of a diverse population within defined bounds, followed by an evaluation of objective functions to assess solution quality. The algorithm employs NSGA-II’s non-dominated sorting and crowding distance calculations to organize the population based on dominance. A pivotal aspect of the flowchart is the DQN’s role in selecting genetic operators; instead of following traditional methods, the algorithm dynamically chooses operators based on learned Q-values that reflect historical performance. The flow also incorporates a feedback loop, where the DQN assesses whether improvements have been made and adjusts its selection strategy accordingly. By utilizing the Chebyshev aggregation method for credit assignment, the algorithm converts multiple objective values into a single scalar reward, ensuring effective exploration of the solution space. This adaptive mechanism allows the algorithm to refine its approach over iterations, progressively identifying and prioritizing the most effective genetic operators to optimize the overall fitness of the population.

Fig. 12. NSGA-II-DQN flowchart.

Credit assignment strategy

In multi-objective optimization, assigning credit (reward) to offspring solutions requires aggregating multiple objective values into a single scalar. To address this, the Chebyshev aggregation method is employed. This method calculates the maximum weighted distance between the objective function values and a reference point, which is defined as the minimum value for each objective observed in the current population. Formally, the aggregation function for an individual $x$ is given by56,57:

$$g(x \mid w, z^{*}) = \max_{1 \le j \le m} \; w_j \left| f_j(x) - z_j^{*} \right| \qquad (17)$$

Where $w_j$ is the weight assigned to the $j$-th objective, $f_j(x)$ is the value of the $j$-th objective function, $z_j^{*}$ is the reference point (minimum value) for the $j$-th objective, and $m$ is the number of objectives.
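The aggregation described above (a maximum weighted distance to a reference point built from the per-objective minima of the current population) can be sketched as follows. The equal weights of 0.5 are an assumption made for illustration, and the three objective pairs are the (Obj1, Obj2) values reported later in Table 11, used here only as sample inputs.

```python
def reference_point(population_objs):
    """Per-objective minimum over the current population."""
    return tuple(min(col) for col in zip(*population_objs))

def chebyshev(objs, weights, ref):
    """Maximum weighted distance between an objective vector and the reference point."""
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ref))

# (Obj1, Obj2) pairs for the sizing, EMS, and sizing + EMS cases in Table 11.
population = [(1.86, 0.69), (1.87, 0.68), (1.70, 0.61)]
weights = (0.5, 0.5)  # assumed equal weights, not a value from the paper

z_star = reference_point(population)
g_values = [chebyshev(p, weights, z_star) for p in population]
print(z_star, g_values)
```

A smaller aggregated value means the individual sits closer to the reference point; the individual that defines the reference point in every objective scores exactly zero.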

The reward for each offspring Inline graphic is calculated as follows:

graphic file with name d33e2168.gif 18

This formulation ensures that the reward is bounded between 0 and 1, where the highest value corresponds to the best-performing individual in the population. After credit assignment, the reward for each offspring is computed by comparing operator rewards and individual rewards, using:

graphic file with name d33e2176.gif 19

This design emphasizes larger improvements over frequent small changes. Operator rewards are maintained in a sequence R, which stores recent operator selections paired with their corresponding rewards.

Operator candidate set

The hybrid NSGA-II-DQN framework employs multiple genetic operators selected from a candidate set to generate offspring:

  1. Simulated Binary Crossover (SBX): This operator is efficient for handling multi-modal landscapes and is described by:

$$c_i = 0.5\left[(1+\beta)\,p_{1,i} + (1-\beta)\,p_{2,i}\right] \qquad (20)$$

Where $\beta$ is a scaling factor, $c_i$ is the $i$-th decision variable of the offspring, and $p_{1,i}$, $p_{2,i}$ are the $i$-th decision variables of the two parents.

  2. Differential evolution operators: These operators are effective for dealing with complex variable associations.

  • DE/rand/1 is defined by:

$$v = x_{r1} + F\,(x_{r2} - x_{r3}) \qquad (21)$$

Where $F$ is a scaling factor, and $x_{r1}$, $x_{r2}$, $x_{r3}$ are the parent solutions.

  • DE/rand/2 is defined by:

$$v = x_{r1} + F\,(x_{r2} - x_{r3}) + F\,(x_{r4} - x_{r5}) \qquad (22)$$
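For the SBX operator in the candidate set, a common textbook formulation draws the scaling factor from a polynomial distribution controlled by a distribution index. The sketch below follows that standard formulation as an assumption; the distribution index `eta = 15` and the function names are illustrative and are not taken from the paper, and no bound clipping is applied.

```python
import random

def sbx_beta(u, eta=15.0):
    """Spread factor for a uniform sample u in [0, 1)."""
    if u <= 0.5:
        return (2.0 * u) ** (1.0 / (eta + 1.0))
    return (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))

def sbx(p1, p2, eta=15.0, rng=random):
    """Return two children; each variable pair keeps the parents' mean."""
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        beta = sbx_beta(rng.random(), eta)
        c1.append(0.5 * ((1 + beta) * a + (1 - beta) * b))
        c2.append(0.5 * ((1 - beta) * a + (1 + beta) * b))
    return tuple(c1), tuple(c2)

rng = random.Random(0)  # fixed seed so the run is repeatable
parents = ((40.0, 80.0), (60.0, 100.0))
child1, child2 = sbx(parents[0], parents[1], rng=rng)
print(child1, child2)
```

A spread factor below 1 places the children between the parents and a factor above 1 places them outside, which is how SBX can produce offspring both between and beyond the parent values.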

Adaptive operator selection strategy

Following the credit assignment, the DQN adaptively selects the most appropriate genetic operator based on the current population state (including decision variables and objective values). This state is input into the trained Q-network, which outputs Q-values corresponding to each genetic operator. Operator selection employs the epsilon-greedy policy:

graphic file with name d33e2314.gif 23

Where Inline graphic is the starting exploration rate, Inline graphic​ is the minimum exploration rate, Inline graphic is the decay rate. The agent selects a random operator from the set with probability Inline graphic, or chooses the operator with the highest Q-value with probability Inline graphic. The operator selection based on the Q-values is given by:

$$a = \arg\max_{op \in O} Q(s, op) \qquad (24)$$

Where $O$ is the candidate set of genetic operators and $Q(s, op)$ is the Q-value for operator $op$ in the given state $s$.
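The epsilon-greedy selection described above can be sketched in a few lines. The multiplicative decay schedule and its default values (`eps_start`, `eps_min`, `decay`) are assumptions made for illustration, since the exact schedule is not reproduced here, and the Q-values are hypothetical placeholders for the output of the trained Q-network.

```python
import random

OPERATORS = ("SBX", "DE/rand/1", "DE/rand/2")

def epsilon_at(step, eps_start=1.0, eps_min=0.05, decay=0.995):
    """Assumed schedule: multiplicative decay, floored at eps_min."""
    return max(eps_min, eps_start * decay ** step)

def select_operator(q_values, epsilon, rng=random):
    """Random operator index with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

q_values = [0.1, 0.9, 0.3]  # hypothetical Q-network output for one state
greedy = select_operator(q_values, epsilon=0.0)
print(OPERATORS[greedy], epsilon_at(0), epsilon_at(500))
```

Early in the run the agent samples operators almost uniformly; as epsilon approaches its floor, the operator with the highest learned Q-value is chosen most of the time.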

Overall process

The NSGA-II-DQN algorithm follows the standard NSGA-II process with the key modification of DQN-based operator selection. The process begins by initializing the population and performing the standard NSGA-II ranking and selection steps. However, instead of applying a fixed crossover operator, the DQN selects the genetic operator used to generate each offspring, guided by the learned Q-values; mutation is then applied within the defined bounds. The detailed procedure of the NSGA-II-DQN algorithm with Chebyshev credit assignment is outlined in Algorithm 1.

In the proposed NSGA-II–DQN framework, the optimization process is initiated by generating a random population of candidate solutions and initializing a Q-network to guide operator selection. At each generation, the solutions are evaluated with respect to the defined objectives, and the current reference point is updated accordingly. The population is then ranked using non-dominated sorting and crowding distance, after which a compact state representation is constructed to characterize the distribution of solutions. Based on this state, an evolutionary operator is selected from a candidate set through an ε-greedy policy applied to the Q-network. Offspring are subsequently generated by applying the chosen operator, followed by mutation within the defined bounds. For each offspring, a reward is calculated through Chebyshev aggregation of the objective values, and the corresponding experience tuple (state, operator, reward, next state) is stored in the replay memory. The Q-network is iteratively updated by sampling mini-batches from the replay buffer, computing target values, and minimizing the temporal-difference error. Over successive generations, the exploration rate ε is decayed to encourage exploitation of the learned policy. Finally, environmental selection is performed to retain the top N solutions, and the Pareto archive is updated. This iterative process continues until the maximum number of generations is reached, at which point the final Pareto front and the trained Q-network are obtained.

graphic file with name 41598_2025_23748_Figa_HTML.jpg

Algorithm 1. NSGA-II-DQN

Optimization constraints and objective function formulation

The performance of an FCHEV is strongly dependent on the optimal design of three critical factors: component sizing, power management strategies, and driving conditions. These factors are interdependent, and their interactions must be carefully considered to achieve optimal efficiency. Consequently, a holistic optimization approach is required, where each component—such as the fuel cell, motor, and battery—along with the energy management strategy, is treated as a separate optimization problem. This approach focuses on two main objectives: reducing fuel consumption and minimizing the degradation of both the battery and fuel cell. Additionally, the required longitudinal performance constraints are incorporated. While this method enhances the overall efficiency of the FCHEV, it also increases the complexity of the simulation, as it intensifies the interactions between the different variables and subsystems within the optimization process.

This method establishes two objective functions along with the longitudinal performance constraints. The first objective function (obj1) quantifies the total aging of both the fuel cell and battery, whereas the second objective function (obj2) reflects the cost associated with fuel consumption. The optimization aims to minimize both component aging and fuel consumption, even though these objectives may conflict with each other and with dynamic performance. The problem is formulated as a constrained, non-linear multi-objective optimization problem, as outlined below.

The degradation objective obj1​ is formulated as a weighted average of the normalized battery and fuel-cell state-of-health terms, where Inline graphic​ and Inline graphic scale each component by its reference value to ensure commensurate contributions. The weight factors for the fuel cell and battery, denoted as w1 and w2, are set to 1 and 2, respectively.

graphic file with name d33e2417.gif 25

The second objective function (obj2) represents the fuel cell vehicle’s fuel consumption, as outlined in Eq. (2). Similar to obj1, this objective function is normalized by a reference value of fuel consumption (Inline graphic).

graphic file with name d33e2434.gif 26

It is crucial to understand that in multi-objective optimization problems, obtaining a single solution that optimally balances all conflicting objectives simultaneously is generally unattainable. In the current study, efforts to improve fuel economy or reduce costs may increase battery and fuel cell aging. As a result, finding a balance among these competing objectives is essential. The set of solutions that achieve this balance is known as the Pareto front (see Fig. 13); for every solution on this front, no objective can be improved without worsening another. Within the Pareto front, the “knee point” is taken as the preferred solution: it is the point whose distance to the extreme line, the line joining the two end points of the front, is maximized. Moving away from the knee point in either direction yields only a small gain in one objective at the cost of a comparatively large loss in the other, which makes it the most practical trade-off for decision-making.

Fig. 13. Pareto optimal solution of a two-objective FCHEV optimization.
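The knee-point rule described above (maximum distance to the extreme line) can be sketched for a two-objective front as follows. The front values are hypothetical, and the sketch assumes the objectives are already on comparable scales; with differently scaled objectives, a normalization step would be needed first.

```python
import math

def knee_point(front):
    """Point of a 2-objective front farthest from the line joining its two end points."""
    pts = sorted(front)
    (x1, y1), (x2, y2) = pts[0], pts[-1]
    norm = math.hypot(x2 - x1, y2 - y1)

    def distance(p):
        # Perpendicular distance from p to the line through the two extremes.
        return abs((y2 - y1) * p[0] - (x2 - x1) * p[1] + x2 * y1 - y2 * x1) / norm

    return max(pts, key=distance)

# Hypothetical normalized (aging, fuel) Pareto front.
front = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0)]
knee = knee_point(front)
print(knee)
```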

The powertrain components of the FCHEV were dimensioned based on performance benchmarks aligned with the industry-standard PNGV specifications33, with key requirements summarized in Table 5.

Table 5. FCHEV performance for optimal solution33.

| Constraint | Description | Value |
| --- | --- | --- |
| Acceleration time (s) | 0 to 97 km/h | Acc1 ≤ 12 s |
| | 64 to 97 km/h | Acc2 ≤ 5.3 s |
| | 0 to 137 km/h | Acc3 ≤ 23.4 s |
| Maximum speed (km/h) | 0% road grade | Dis ≥ 136 |
| Gradeability (%) | 55 mph (88.5 km/h) at 6.5% grade | grd ≥ 6.5% |

A fitness function is required to assess each candidate solution, enabling the NSGA-II-DQN framework to concurrently optimize component sizing, energy management strategy, and operational expenses in FCHEVs. This study defines the fitness function as the inverse of the objective functions. To ensure the solutions meet the required constraints, penalty functions are incorporated to penalize undesirable outcomes. Specifically, acceleration-related penalty functions are used to guide the optimization process toward feasible solutions, as outlined below.

graphic file with name d33e2520.gif 27

Additionally, the penalty functions for gradeability and maximum speed are defined as follows:

graphic file with name d33e2528.gif 28
graphic file with name d33e2534.gif 29

In this study, the fitness function is formulated by incorporating the penalty functions into the objective function, as follows:

graphic file with name d33e2542.gif 30

In this framework, Inline graphic represents the fitness function, while Inline graphic denotes the penalty function associated with the i-th constraint. The constant Inline graphic​ is a positive penalty coefficient that determines the magnitude of the penalty for each limit.
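The idea of folding constraint violations into the fitness can be illustrated with a simple additive penalty. This is a sketch under stated assumptions, not the paper's exact formulation: the limits are taken from Table 5, while the relative-violation penalty form, the single coefficient `alpha`, the dictionary keys, and the candidate values are all hypothetical.

```python
# Limits from Table 5; the penalty form and coefficient below are assumptions.
UPPER_LIMITS = {"acc1_s": 12.0, "acc2_s": 5.3, "acc3_s": 23.4}  # must not be exceeded
LOWER_LIMITS = {"vmax_kmh": 136.0, "grade_pct": 6.5}            # must be reached

def penalties(perf):
    """Relative violation per constraint; 0.0 when the constraint is satisfied."""
    p = {k: max(0.0, (perf[k] - lim) / lim) for k, lim in UPPER_LIMITS.items()}
    p.update({k: max(0.0, (lim - perf[k]) / lim) for k, lim in LOWER_LIMITS.items()})
    return p

def penalized_objective(obj, perf, alpha=10.0):
    """Objective to minimize, inflated by alpha times the summed violations."""
    return obj + alpha * sum(penalties(perf).values())

def fitness(obj, perf, alpha=10.0):
    """Fitness as the inverse of the penalized objective (higher is better)."""
    return 1.0 / penalized_objective(obj, perf, alpha)

# Hypothetical candidates: one feasible, one missing the 0 to 97 km/h limit by 25%.
feasible = {"acc1_s": 10.0, "acc2_s": 5.0, "acc3_s": 22.0, "vmax_kmh": 140.0, "grade_pct": 7.0}
too_slow = dict(feasible, acc1_s=15.0)
print(fitness(0.6, feasible), fitness(0.6, too_slow))
```

With this form a feasible candidate is scored on its objective alone, while an infeasible one is pushed down the ranking in proportion to how far it misses each limit.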

To evaluate and compare the performance of NSGA-II and NSGA-II-DQN, three performance indicators were employed, each designed to capture both convergence and diversity of the Pareto front. Convergence measures how closely the obtained solutions approach the true Pareto optimal set, while diversity assesses how uniformly these solutions are distributed across the objective space.

The first indicator is the Coverage Metric (CM), which quantifies the dominance relationship between two sets of non-dominated solutions. For two sets $A$ and $B$, $C(A, B)$ is defined as31:

$$C(A, B) = \frac{\left|\{\, b \in B \mid \exists\, a \in A : a \preceq b \,\}\right|}{|B|} \qquad (31)$$

Here $a \preceq b$ means that $a$ dominates or equals $b$. A higher value of $C(A, B)$ indicates that set $A$ dominates a greater portion of set $B$. If $C(A, B) > C(B, A)$, set $A$ is considered superior.
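The coverage comparison between two fronts can be computed directly from its definition. The sketch below assumes minimization and uses weak dominance (a solution covers another if it is no worse in every objective), which is the usual convention for this metric; the two fronts are hypothetical.

```python
def covers(a, b):
    """True if a is no worse than b in every (minimized) objective."""
    return all(x <= y for x, y in zip(a, b))

def coverage(A, B):
    """Fraction of solutions in B that are covered by at least one solution in A."""
    return sum(any(covers(a, b) for a in A) for b in B) / len(B)

# Hypothetical fronts from two algorithms.
front_a = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
front_b = [(2.0, 5.0), (3.0, 3.0), (5.0, 0.5)]
c_ab = coverage(front_a, front_b)
c_ba = coverage(front_b, front_a)
print(c_ab, c_ba)
```

The metric is not symmetric, so both directions have to be evaluated before concluding that one front is superior.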

The second indicator is the Hypervolume (HV), which jointly measures convergence and diversity. HV corresponds to the Lebesgue measure of the portion of the objective space dominated by the Pareto front and bounded by a reference point:

$$HV(A) = \Lambda\left(\bigcup_{i \in A} v_i\right) \qquad (32)$$

Where $\Lambda$ denotes the Lebesgue measure and $v_i$ represents the hypervolume of the region dominated by solution $i \in A$. Larger HV values indicate better convergence and wider coverage of the objective space.
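For two minimized objectives, the hypervolume reduces to the area between the front and the reference point, which can be accumulated as a sum of rectangles. The sketch below is a minimal illustration; the front and the reference point (5, 5) are hypothetical.

```python
def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective minimization front, bounded by ref."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, best_y = 0.0, ref[1]
    for x, y in pts:
        if y < best_y:
            # Add the new rectangle that this point contributes below the previous best.
            hv += (ref[0] - x) * (best_y - y)
            best_y = y
    return hv

front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
hv = hypervolume_2d(front, ref=(5.0, 5.0))
print(hv)
```

Because the value depends on the chosen reference point, hypervolumes are only comparable between algorithms when the same reference point is used for both.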

The third indicator is the Spacing Metric (SM), which evaluates the uniformity of solution distribution. It is defined as:

graphic file with name d33e2657.gif 33

Where Inline graphic​ is the minimum distance between solution i and any other solution in A:

graphic file with name d33e2671.gif 34

Here d is the mean of all Inline graphic​. Smaller SM values indicate a more uniform distribution of solutions along the Pareto front.
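A spacing value of this kind can be computed as the spread of nearest-neighbor distances along the front. The sketch below assumes Schott's common formulation (Manhattan distance to the nearest neighbor and a sample standard deviation with n − 1 in the denominator); the paper's exact normalization may differ, and both fronts are hypothetical.

```python
import math

def spacing(front):
    """Standard deviation of nearest-neighbor (Manhattan) distances on a front."""
    d = [min(sum(abs(a - b) for a, b in zip(p, q))
             for j, q in enumerate(front) if j != i)
         for i, p in enumerate(front)]
    mean = sum(d) / len(d)
    return math.sqrt(sum((mean - di) ** 2 for di in d) / (len(d) - 1))

even = [(0.0, 3.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
uneven = [(0.0, 3.0), (0.1, 2.9), (2.0, 1.0), (3.0, 0.0)]
print(spacing(even), spacing(uneven))
```

A perfectly evenly spaced front scores zero, and clustering of solutions raises the value.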

Optimization of powertrain components in FCHEV

To optimize the FCHEV’s powertrain components, three critical design variables are considered: the size coefficients of the fuel cell stacks, the electric motor, and the battery modules. The optimization process systematically adjusts these variables to minimize fuel consumption and reduce degradation of both the battery and fuel cell while ensuring compliance with the longitudinal performance constraints specified in Table 5. The values and parameter ranges used in this optimization are detailed in Table 6.

Table 6. Values and ranges of parameters and variables.

| Group | Parameter or variable | Value |
| --- | --- | --- |
| Sizing parameters | Initial coefficient of the number of fuel cell stacks (K-FC) | 1 |
| | Initial coefficient for the number of battery modules (K-BA) | 1 |
| | Initial coefficient for the electric motor (K-EM) | 1 |
| | Fuel cell stacks coefficient range | [0.46, 1.4] |
| | Electric motor torque coefficient range | [0.5, 1.5] |
| | Battery capacity coefficient range | [0.7, 1.5] |
| NSGA-II parameters | Population size | 10 |
| | Generations | 5 |
| | Mutation rate | 0.1 |
| | Crossover rate | 0.9 |
| RL parameters | Training episodes | 15 |
| | Steps per episode | 5 |
| | Network hidden layer size | 128 |
| | Batch size | 64 |
| | Buffer size | 10,000 |
| | Discount factor | 0.99 |
| | Learning rate | 1e-3 |

The process begins with formulating cost functions that guide the NSGA-II-DQN algorithm in iteratively refining the sizing coefficients of the powertrain components. By repeatedly simulating the driving cycle and minimizing the defined cost function, the algorithm progressively optimizes the parameters, resulting in improved EMS, enhanced overall efficiency, reduced operational costs, and extended battery and fuel cell system durability.

Simultaneous optimization of sizing and EMS

Since this study aims to simultaneously optimize both the control strategy and the powertrain component sizing, the next phase focuses on analyzing the interaction between these two systems through concurrent optimization. This approach seeks to identify optimal component configurations alongside the most effective control strategy variables.

In this context, the paper defines five membership functions for each input variable—namely, power demand (P-req) and SOC—and seven membership functions for the output variable (FC-POWER). The fuzzy membership functions for both inputs and outputs, constructed using Fuzzy Type-2 logic, are illustrated in Fig. 14. The detailed fuzzy rules governing the system are presented in Table 7.

Fig. 14. Inputs and output non-optimized membership functions of Type-2 fuzzy logic for FCHEV.

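Table 7. Fuzzy rules for EMS of FCHEV (rows: SOC; columns: required power, P-req).

| SOC \ P-req | S | RS | M | RB | B |
| --- | --- | --- | --- | --- | --- |
| S | S | RS | M | RB | B |
| RS | VS | S | RS | M | RB |
| M | VS | VS | S | RS | M |
| RB | VS | VS | VS | S | RS |
| B | VS | VS | VS | VS | S |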
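The rule base in Table 7 is a two-input lookup from linguistic labels to an output label, which can be represented as a small table in code. The sketch below only encodes the crisp rule lookup; the Type-2 membership evaluation, inference, and defuzzification are not modeled, and the names `RULES` and `rule_output` are illustrative.

```python
LABELS = ("S", "RS", "M", "RB", "B")  # shared label order for SOC and P-req

# Rows: SOC label; entries ordered by P-req label (S, RS, M, RB, B), as in Table 7.
RULES = {
    "S":  ("S",  "RS", "M",  "RB", "B"),
    "RS": ("VS", "S",  "RS", "M",  "RB"),
    "M":  ("VS", "VS", "S",  "RS", "M"),
    "RB": ("VS", "VS", "VS", "S",  "RS"),
    "B":  ("VS", "VS", "VS", "VS", "S"),
}

def rule_output(soc_label, preq_label):
    """Fuel cell power label fired by one (SOC, P-req) label pair."""
    return RULES[soc_label][LABELS.index(preq_label)]

print(rule_output("S", "B"), rule_output("B", "S"), rule_output("M", "RB"))
```

The table shows the intended behavior at a glance: a low SOC with high demand requests large fuel cell power, while a high SOC keeps the fuel cell output small across most of the demand range.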

The objective remains unchanged, as specified in Eqs. (25) and (26), and the sizing configurations are consistent with those presented in Table 6. Table 8 details the initial optimization values for the inputs and outputs, along with their respective optimization ranges.

Table 8. Initial optimization values and ranges for inputs and outputs.

| Variable | Index | Original Value | Fuzzy Value (Min) | Lower Lag | Lower Scale |
| --- | --- | --- | --- | --- | --- |
| Input 1 (P-req) | MF1 | 2500 | [0, 7500] | [0.2, 0.5] | [0.7, 1] |
| | MF2 | 7500 | [7500, 12500] | [0.2, 0.5] | [0.7, 1] |
| | MF3 | 12,500 | [12500, 17500] | [0.2, 0.5] | [0.7, 1] |
| | MF4 | 17,500 | [17500, 22500] | [0.2, 0.5] | [0.7, 1] |
| | MF5 | 22,500 | [22500, 80000] | [0.2, 0.5] | [0.7, 1] |
| Input 2 (SOC) | MF1 | 0.6 | [0.4, 0.65] | [0.2, 0.5] | [0.7, 1] |
| | MF2 | 0.65 | [0.65, 0.7] | [0.2, 0.5] | [0.7, 1] |
| | MF3 | 0.7 | [0.7, 0.75] | [0.2, 0.5] | [0.7, 1] |
| | MF4 | 0.75 | [0.75, 0.8] | [0.2, 0.5] | [0.7, 1] |
| | MF5 | 0.8 | [0.8, 1] | [0.2, 0.5] | [0.7, 1] |
| Output (P-FC) | MF1 | 2500 | [0, 7500] | [0.2, 0.5] | [0.7, 1] |
| | MF2 | 7500 | [7500, 12500] | [0.2, 0.5] | [0.7, 1] |
| | MF3 | 12,500 | [12500, 17500] | [0.2, 0.5] | [0.7, 1] |
| | MF4 | 17,500 | [17500, 22500] | [0.2, 0.5] | [0.7, 1] |
| | MF5 | 22,500 | [22500, 27500] | [0.2, 0.5] | [0.7, 1] |
| | MF6 | 27,500 | [27500, 32500] | [0.2, 0.5] | [0.7, 1] |
| | MF7 | 32,500 | [32500, 70000] | [0.2, 0.5] | [0.7, 1] |

To perform optimization using NSGA-II-DQN, 62 variables must be adjusted (see Fig. 12). Among these variables, the X variables correspond to the required power input, the Y variables represent the battery SOC, and the Z variables indicate the output of the Type-2 fuzzy controller, specifically the required power for the fuel cell.

Variables such as X10, X20, …, and X50 ​are assigned as optimization variables for the membership functions numbered one through five. Similarly, variables X11, X21, …, and X51​ represent the lower scale values for these membership functions, while X12, X22, …, and X52​ correspond to the lower lag values. This numbering convention is applied consistently to the second input (Y variables) and the output (Z variables).

The optimization process begins by defining a cost function that directs the NSGA-II-DQN algorithm to fine-tune the fuzzy controller’s coefficients iteratively. Through repeated simulations of the driving cycle aimed at minimizing this cost function, the 62 fuzzy logic parameters previously described are systematically refined. Simultaneously, the sizing parameters for the fuel cell stacks (S-FC), battery modules (S-BA), and electric motor (S-EM) are optimized in parallel. Figure 15 illustrates the optimization process for the FCHEV, combining Type-2 fuzzy logic with NSGA-II-DQN algorithms. This figure offers a detailed overview of the calculations involved in fuel consumption and battery management.

Fig. 15. Schematic of simultaneous optimization of EMS strategy and sizing for FCHEV.

HIL configuration for FCHEV

To validate the proposed control strategy and powertrain optimization, Fig. 16 presents a comprehensive schematic of the HIL testing setup described in this article. The fuel cell vehicle model runs in real-time on a host computer, enabling dynamic simulation. Digital data, specifically the battery SOC and the vehicle’s power demand (P-req), are transmitted from Simulink to the Advantech PCI-1711 data acquisition card, which interfaces with the computer’s motherboard via PCI. These data are then converted into analog signals and sent to the STM32F7 hardware for further processing.

Fig. 16. HIL configuration for FCHEV.

Several key factors motivate the selection of an analog communication protocol between the host computer and the STM32F7 hardware: compatibility with existing sensor technologies, cost-effectiveness, real-time data processing capabilities, robustness against interference, and seamless integration with digital systems.

Furthermore, the Type-2 fuzzy logic algorithm, optimized using the NSGA-II-DQN approach, has been implemented in C++ on the STM32F7 hardware. The output of this fuzzy controller, representing the fuel cell’s required power (P-FC), is converted into a Pulse Width Modulation (PWM) signal and passed through a first-order low-pass filter to generate a stable analog signal. This analog output is then fed into the analog input channel of the Advantech data acquisition card, enabling real-time execution of the simulation model in conjunction with the hardware. Detailed specifications for the Advantech PCI-1711 data acquisition card and the STM32F7 hardware are provided in Table 9.
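The PWM-plus-filter conversion described above can be illustrated with a small offline simulation. The 3.3 V logic level follows the operating voltage listed for the microcontroller; the full-scale power of 70 kW, the 10 kHz PWM frequency, the 100 Hz filter cutoff, and all function names are assumptions chosen for illustration and do not describe the actual firmware.

```python
import math

def power_to_duty(p_fc_w, p_max_w=70_000.0):
    """Map a requested fuel cell power to a duty cycle in [0, 1] (assumed full scale)."""
    return min(1.0, max(0.0, p_fc_w / p_max_w))

def filtered_pwm_voltage(duty, v_high=3.3, pwm_hz=10_000.0, cutoff_hz=100.0,
                         periods=200, steps=100):
    """Mean output of a first-order low-pass filter over the last simulated PWM period."""
    dt = 1.0 / (pwm_hz * steps)
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (tau + dt)  # backward-Euler discretization of the RC filter
    v, last_period = 0.0, []
    for n in range(periods * steps):
        v_in = v_high if (n % steps) < duty * steps else 0.0
        v += alpha * (v_in - v)
        if n >= (periods - 1) * steps:
            last_period.append(v)
    return sum(last_period) / len(last_period)

duty = power_to_duty(35_000.0)
v_out = filtered_pwm_voltage(duty)
print(duty, round(v_out, 3))
```

Once the filter has settled, the averaged output is proportional to the duty cycle, which is what allows the data acquisition card to read the controller output as an analog level. A lower cutoff reduces ripple but slows the response.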

Table 9. Technical specifications of the STM32F746ZG microcontroller and Advantech PCI-1711/PCI-1723 data acquisition cards.

| STM32F746ZG component | Specification | Data acquisition card component | Advantech PCI-1711 | Advantech PCI-1723 |
| --- | --- | --- | --- | --- |
| System on a chip | ARM® 32-bit Cortex®-M7 | I/O connector | 1 × 68-pin SCSI female connector | 1 × 68-pin SCSI female connector |
| Clock speed | 216 MHz max CPU frequency | Analog inputs | 16 single-ended, 12 bits | None |
| SRAM | 320 KB | Analog outputs | 2-channel, 12 bits | 8-channel, 16 bits |
| GPIOs | (168) with external interrupt capability | Analog output range | 0 ~ 5 V, 0 ~ 10 V | −10 ~ 10 V |
| ADC (numbers) | 12-bit ADCs with 24 channels (3) | Digital input | 16-channel | 16-channel |
| DAC (numbers) | 12-bit DAC channels (2) | Digital output | 16-channel | 16-channel |
| Operating voltage | 3.3 V | | | |
| Other transport protocols (numbers) | USART/UART (8), I2C (4), SPI (6) | | | |

Results and discussion

This section will investigate the impact of simultaneous Sizing and EMS optimization. The analysis begins with evaluating co-simulation results based on the real-world traffic cycle. Subsequently, the influence of varying traffic conditions, such as highway and congested scenarios, on the optimization process will be explored. The next step involves discussing the optimization outcomes for different driving cycles, including UDDS and WLTP Class 3. Following that, the performance of the optimized controller will be assessed through grade testing. Finally, HIL simulations will be conducted to examine system behavior under diverse traffic conditions.

Simultaneous optimization for real-world traffic driving cycle

Figure 17 and Table 10 compare NSGA-II and NSGA-II-DQN across the three cases. In the Sizing case, both methods achieve very similar Pareto fronts, but NSGA-II-DQN demonstrates slightly better convergence, more uniform spacing, and marginally higher hypervolume. In the EMS case, the advantage of NSGA-II-DQN becomes more evident, as it achieves broader coverage of solutions, larger hypervolume, and improved spacing. The superiority of NSGA-II-DQN is most pronounced in the combined Sizing + EMS case, where it dominates nearly all NSGA-II solutions and simultaneously delivers higher diversity and a more uniform distribution. This significant improvement arises because simultaneous optimization greatly increases problem complexity, introducing stronger trade-offs between design and operational objectives; while NSGA-II relies on fixed operators that may stagnate, NSGA-II-DQN dynamically adapts operator selection through reinforcement learning, enabling it to explore effectively in early stages and converge more efficiently in later stages.

Fig. 17. Pareto front comparison between NSGA-II and NSGA-II-DQN: (a) sizing optimization, (b) EMS optimization, and (c) simultaneous sizing + EMS optimization.

Table 10. Comparison of NSGA-II and NSGA-II-DQN using coverage metric, hypervolume, and spacing metric across sizing, EMS, and sizing + EMS optimization cases.

| Metric | Sizing: NSGA-II | Sizing: NSGA-II-DQN | EMS: NSGA-II | EMS: NSGA-II-DQN | Sizing + EMS: NSGA-II | Sizing + EMS: NSGA-II-DQN |
| --- | --- | --- | --- | --- | --- | --- |
| CM | 0.33 | 0.4 | 0.12 | 0.63 | 0 | 0.92 |
| HV | 0.000165 | 0.000170 | 0.0102 | 0.0117 | 0.0056 | 0.0114 |
| SM | 0.000636 | 0.000229 | 0.00579 | 0.00549 | 0.00767 | 0.00381 |

To evaluate the performance of NSGA-II against NSGA-II–DQN in terms of convergence speed and solution coverage, Fig. 18 presents the convergence behavior of both algorithms during simultaneous EMS and sizing optimization. The x-axis denotes the convergence thresholds (50%, 75%, 90%, 95%, and 99%), while the y-axis represents the number of episodes required to achieve each threshold. The results demonstrate that NSGA-II–DQN consistently requires fewer episodes, particularly at higher thresholds, underscoring its superior efficiency and faster convergence compared to the conventional NSGA-II.

Fig. 18. Convergence comparison of NSGA-II and NSGA-II–DQN in simultaneous EMS and sizing optimization.

By incorporating reinforcement learning into the optimization process, the quality of results can be further enhanced; however, it is equally important to recognize that minimizing fuel cell aging, battery degradation, and fuel consumption requires the simultaneous optimization of component sizing and EMS. Although EMS optimization alone can reduce aging and improve efficiency, treating it in isolation neglects the strong interdependence between system design and operational strategy. Figure 19(a) presents the Pareto fronts for sizing, EMS, and combined sizing + EMS optimization using NSGA-II-DQN, with the corresponding knee points highlighted. The knee point solutions indicate balanced trade-offs, where further improvement in one objective leads to a significant compromise in the other. Importantly, the simultaneous EMS + sizing optimization achieves a knee point that clearly outperforms the individual cases, demonstrating a superior balance between objectives. This improvement arises because component sizing and EMS are inherently coupled: the effectiveness of an EMS strategy depends on the available system capacities, while the optimal sizing configuration is strongly influenced by the way energy is managed during operation.

Fig. 19. Optimization plots for real-world traffic cycle: (a) Pareto solutions for sizing optimization, (b) Pareto solutions for EMS optimization, (c) Pareto solutions for co-simulation sizing and EMS optimization, (d) battery SOC, (e) fuel cell power demand, (f) FFT magnitude of fuel cell power.

The integrated sizing and EMS optimization approach delivers substantial gains in both energy efficiency and component durability, as summarized in Table 11. In particular, the combined strategy achieves a reduction in equivalent fuel consumption (Eq-Fuel) of about 21% relative to the non-optimized baseline, surpassing the reductions obtained from sizing-only (14%) and EMS-only (10%) optimizations. These results highlight that AI-based co-optimization not only enhances EMS performance but also generates significant additional benefits when component sizing and EMS are addressed simultaneously.
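The percentage reductions quoted above follow directly from the Eq-Fuel values reported in Table 11. The short check below reproduces that arithmetic; the dictionary and function names are illustrative.

```python
# Equivalent fuel consumption per 100 km, as reported in Table 11.
EQ_FUEL = {"Non-optimized": 818, "Sizing": 705, "EMS": 734, "Sizing + EMS": 648}

def reduction_pct(baseline, value):
    """Relative reduction with respect to the baseline, in percent."""
    return 100.0 * (baseline - value) / baseline

base = EQ_FUEL["Non-optimized"]
reductions = {case: reduction_pct(base, value)
              for case, value in EQ_FUEL.items() if case != "Non-optimized"}
for case, pct in reductions.items():
    print(f"{case}: {pct:.1f}% lower Eq-Fuel than the non-optimized baseline")
```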

Table 11. FCHEV optimization results for real-world traffic cycle.

| Category | Quantity | Non-optimized | Sizing | EMS | Sizing + EMS |
| --- | --- | --- | --- | --- | --- |
| Sizing scales | S-FC | 1 | 0.77 | 1 | 0.67 |
| | S-BA | 1 | 1.42 | 1 | 1.39 |
| | S-EM | 1 | 0.75 | 1 | 0.74 |
| Fuel consumption | FC-Fuel (g/100 km) | 788 | 680 | 670 | 601 |
| | Eq-Fuel (g/100 km) | 818 | 705 | 734 | 648 |
| Degradation | Q-FC (W) | 3.0 | 2.3 | 1.9 | 1.6 |
| | Q-BA | 0.0149 | 0.0139 | 0.0152 | 0.0143 |
| Vehicle objectives and constraints | Obj1 | - | 1.86 | 1.87 | 1.70 |
| | Obj2 | - | 0.69 | 0.68 | 0.61 |
| | Acc1 (s) | 8.1 | 10.4 | 8.1 | 10.3 |
| | Acc2 (s) | 4.1 | 5.3 | 4.1 | 5.2 |
| | Acc3 (s) | 16.6 | 22.6 | 16.6 | 22.5 |
| | Spd_max (km/h) | 119.0 | 107.6 | 119.0 | 107.1 |
| | Grd (%) | 21.8 | 16.3 | 21.8 | 16.5 |

Analysis of the battery degradation index (Q-BA) highlights the trade-offs between different optimization strategies. Compared to the non-optimized baseline (0.0149), the EMS-only strategy increases battery degradation by about 2% (0.0152), since the oscillatory power demand is shifted from the fuel cell to the battery. This transfer is confirmed by Fig. 19(b), where the SOC trajectory under EMS exhibits sharper declines, indicating that the battery is more frequently tasked with compensating for load variations. The reason for this behavior is that EMS effectively reduces fuel cell degradation (Q-FC decreases from 3.0 to 1.9 W), thereby extending fuel cell lifespan at the expense of accelerated battery aging. When EMS is combined with sizing, however, battery degradation is alleviated (0.0143), as the enlarged battery capacity allows the EMS to distribute power more evenly and limit deep SOC fluctuations, as also visible in the smoother SOC profile of Fig. 19(b). Interestingly, the sizing-only case achieves the lowest battery degradation (0.0139), suggesting that although EMS + sizing provides a more balanced compromise between fuel cell and battery health, the stronger emphasis on protecting the fuel cell in the combined strategy leads to a partial transfer of power oscillations back to the battery compared with the sizing-only solution.

The combined AI optimization markedly improves fuel cell lifespan: Q-FC falls from 2.3 W in the sizing-only case to 1.6 W, a reduction of nearly 30% (Table 11). These findings underscore the advantage of integrating physical component resizing with intelligent control strategies, demonstrating that such a holistic approach yields more comprehensive benefits than either strategy in isolation. As illustrated in Fig. 19(c), simultaneous AI optimization minimizes power fluctuations, a primary driver of component degradation, more effectively than the other methods.

The FFT magnitude plot in Fig. 19(d) provides further insights, presenting the frequency content of power signals on a logarithmic scale. The combined optimization reduces power spectral density at low frequencies, indicating smoother power demand and diminished transient stress on powertrain components. The EMS optimization successfully reduces fuel cell power fluctuations, shifting this variability to the battery. While this trade-off is evident in the EMS-only optimization (Table 11), the combined EMS + sizing approach introduces a critical design adaptation: the battery size is increased specifically to accommodate these redirected power fluctuations. This strategic sizing compensation maintains system stability while achieving smoother fuel cell operation, demonstrating how component sizing and energy management must be co-optimized to handle power distribution challenges effectively.
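To make the frequency-domain comparison concrete, the following sketch computes a one-sided FFT magnitude spectrum for two synthetic power traces. The signals, sampling rate, and frequencies are assumptions chosen for illustration only; they are not the paper's data, and the function name `fft_magnitude` is hypothetical.

```python
import numpy as np


def fft_magnitude(signal, fs):
    """One-sided FFT magnitude spectrum of a real signal sampled at fs Hz."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Remove the mean so the DC bin does not dominate a log-scale plot.
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    magnitude = np.abs(spectrum) / n
    return freqs, magnitude


# Hypothetical fuel cell power traces (W): 1 Hz sampling, 600 s long.
fs = 1.0
t = np.arange(0, 600, 1.0 / fs)
slow = 20e3 + 5e3 * np.sin(2 * np.pi * 0.005 * t)  # slow load-following trend
ripple = 3e3 * np.sin(2 * np.pi * 0.2 * t)         # faster fluctuation
p_fluctuating = slow + ripple
p_smoothed = slow + 0.2 * ripple                   # same trend, less ripple

freqs, mag_fluct = fft_magnitude(p_fluctuating, fs)
_, mag_smooth = fft_magnitude(p_smoothed, fs)

# Compare the spectral magnitude at the ripple frequency (0.2 Hz).
k = np.argmin(np.abs(freqs - 0.2))
print(f"|P(0.2 Hz)| fluctuating: {mag_fluct[k]:.0f} W, smoothed: {mag_smooth[k]:.0f} W")
```

Plotting `magnitude` against `freqs` on a logarithmic axis gives a view comparable in spirit to Fig. 19(d): a strategy that smooths the power request shows lower magnitude in the frequency band where the fluctuations sit.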

However, these gains in fuel efficiency and durability come with trade-offs in dynamic vehicle performance. Acceleration times across the tested speed intervals increase significantly under sizing optimization: with the combined AI approach, Acc1 and Acc2 increase by about 27% and Acc3 by about 35% relative to the non-optimized vehicle. Additionally, maximum vehicle speed declines by approximately 10% and gradeability by approximately 24%, highlighting that downsizing powertrain components limits peak performance. In contrast, EMS-only optimization leaves acceleration, top speed, and gradeability at their baseline values, demonstrating that an intelligent EMS can achieve efficiency gains without compromising drivability.

The optimization of the Type-2 fuzzy EMS leads to distinct modifications in the membership functions, depending on the optimization strategy employed. When only the control strategy is optimized, with the sizing parameters left unchanged, the membership functions undergo targeted adjustments that refine rule activation to prolong the fuel cell's lifespan, as illustrated in Fig. 20(a). In contrast, simultaneous AI optimization of the control strategy and component sizing results in more substantial modifications to the membership functions. These changes reflect the enhanced operational flexibility enabled by the resized components, such as the battery and fuel cell, as shown in Fig. 20(b).

Fig. 20.

Results of EMS and sizing optimization: (a) optimized Type-2 fuzzy membership functions under EMS optimization, (b) optimized Type-2 fuzzy membership functions under co-simulation sizing and EMS, (c) electric motor operating points for sizing optimization, (d) electric motor operating points for EMS optimization, (e) electric motor operating points for co-simulation sizing and EMS optimization.

The analysis of electric motor efficiency maps provides further evidence of these improvements. The sizing-only and combined sizing and EMS optimization strategies shift the motor's operating points closer to regions of maximum torque and peak efficiency. This is demonstrated by the dense clustering of operating points within the high-efficiency zones, as illustrated in Figs. 20(c), 20(d), and 20(e). These shifts indicate a more efficient utilization of the electric motor, contributing to overall system performance enhancements.

Traffic condition effects

Fuel consumption varies significantly across driving scenarios under the AI framework combining EMS and sizing optimization. In the highway condition, fuel cell fuel consumption (FC-Fuel) is approximately 36% lower than over the combined cycle (383 versus 601 in Table 12), reflecting the more efficient, steady-state driving typical of highway environments, as shown in Figs. 21(a) and 21(b). In contrast, the congested scenario raises FC-Fuel by about 49% relative to the combined cycle (897 versus 601), highlighting the elevated energy demands of the frequent stops and accelerations characteristic of congested traffic, as depicted in Fig. 21(c).

Fig. 21.

Spider plots for real-world traffic cycle: (a) total cycle results, (b) highway cycle results, (c) congested cycle results.

A detailed comparative analysis of fuel consumption and equivalent fuel metrics under the combined AI sizing and EMS approach reveals substantial improvements, particularly in the Highway scenario (see Table 12). Specifically, fuel cell fuel consumption (FC-Fuel) is reduced by approximately 16.4% and 19.5% relative to the EMS-only and Sizing-only strategies, respectively, while equivalent fuel consumption decreases by 15.2% and 11.1%. In contrast, the Congested scenario yields more modest gains, with equivalent fuel consumption decreasing by at most 3.1% (821 versus 847 for Sizing-only and 827 for EMS-only).

Table 12.

FCHEV optimization results for highway and congested driving cycles.

Optimization
Sizing EMS Sizing + EMS
Combined Highway Congested Combined Highway Congested Combined Highway Congested
S-FC 0.77 0.59 1.01 1 1 1 0.67 0.96 0.98
S-BA 1.42 1.42 1.42 1 1 1 1.39 1.39 1.4
S-EM 0.75 0.72 0.98 1 1 1 0.74 0.8 0.89
FC-Fuel 680 475 932 670 458 892 601 383 897
Eq-Fuel 705 522 847 734 547 827 648 464 821
Q-FC 2.3 1.2 0.3 1.9 1 0.31 1.6 0.55 0.31
Q-BA 0.014 0.0125 0.0118 0.015 0.0133 0.0118 0.014 0.0128 0.0117
Obj1 1.86 1.49 1.2471 1.87 1.539 1.2477 1.70 1.3938 1.2355
Obj2 0.69 0.29 0.12472 0.68 0.2821 0.1186 0.61 0.2358 0.1193
Acc1 10.4 10.4 8.45 8.1 8.1 8.1 10.3 10.3 9.3
Acc2 5.3 5.3 4.25 4.1 4.1 4.1 5.2 5.25 4.7
Acc3 22.6 22.8 17.4 16.6 16.6 16.6 22.5 22.1 19.45
Spd_Max 107.6 106.2 117.6 119 119 119 107.1 109.8 113.8
Grd 16.3 16.4 20.67 21.8 21.8 21.8 16.5 16.6 18.7

Additionally, the acceleration metrics (Acc1, Acc2, and Acc3) in the congested scenario change by 10–17% under the Sizing + EMS strategy relative to the individual optimization methods (for example, Acc1 is 9.3 versus 8.45 for sizing-only and 8.1 for EMS-only in Table 12), a difference that bears directly on vehicle responsiveness in stop-and-go traffic. Component sizing analysis reveals that the electric motor scale (S-EM) is approximately 11% larger for congested conditions than for highway driving (0.89 versus 0.80). This adjustment reflects the higher power requirements for the frequent acceleration and deceleration typical of congested traffic.

Driving cycles effects

The AI multi-objective optimization results for the FCHEV powertrain components show distinct performance improvements when applying sizing-only, EMS-only, and combined AI sizing and EMS optimization strategies under two driving cycles: UDDS and WLTP Class 3. The Pareto front plots in Figs. 22(a)-(c) for UDDS and Figs. 22(d)-(f) for WLTP Class 3 demonstrate that simultaneous AI sizing and EMS optimization outperforms the individual approaches by achieving a more favorable trade-off between the two cost objectives, as evidenced by the lower and more clustered knee points in the combined optimization cases.
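For readers unfamiliar with how a Pareto front and its knee point are extracted from a set of candidate solutions, a minimal sketch follows. It assumes both objectives are minimized and defines the knee as the front point farthest from the straight line joining the two extreme solutions; this is one common convention, and the source does not state which definition the authors used. The candidate values are hypothetical, and the knee computation assumes the front contains at least two distinct points.

```python
import numpy as np


def pareto_front(points):
    """Return the non-dominated rows of `points` (both objectives minimized)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # p is dominated if some point is no worse in every objective
        # and strictly better in at least one.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    front = pts[keep]
    return front[np.argsort(front[:, 0])]  # sort along the first objective


def knee_point(front):
    """Front point with the largest distance from the chord between extremes."""
    a, b = front[0], front[-1]
    d = b - a
    # Perpendicular distance from the chord a-b (2-D cross product form).
    cross = d[0] * (front[:, 1] - a[1]) - d[1] * (front[:, 0] - a[0])
    dist = np.abs(cross) / np.linalg.norm(d)
    return front[np.argmax(dist)]


# Hypothetical (Obj1, Obj2) pairs, not values from the paper.
candidates = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (3.0, 3.0), (4.0, 4.0), (1.5, 4.5)]
front = pareto_front(candidates)
knee = knee_point(front)
print("front:", front.tolist())
print("knee:", knee.tolist())
```

In this toy set, the points (3, 3) and (4, 4) are dominated by (2, 2) and drop out, and (2, 2) is selected as the knee because it lies farthest below the chord between the extremes.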

Fig. 22.

Pareto solutions for UDDS and WLTP Class 3 driving cycles: (a) UDDS sizing optimization, (b) UDDS EMS optimization, (c) UDDS co-simulation sizing and EMS optimization, (d) WLTP Class 3 sizing optimization, (e) WLTP Class 3 EMS optimization, (f) WLTP Class 3 co-simulation sizing and EMS optimization.

Figures 23(a) and 23(b) show the SOC profiles, which support the improved battery management achieved by the combined AI Sizing + EMS strategy. This approach maintains higher and more stable SOC levels than the other methods, indicating enhanced energy efficiency and improved battery longevity, consistent with the findings for the real-world driving cycle.

Fig. 23.

SOC analysis for different driving cycles: (a) UDDS, (b) WLTP Class 3.

The AI EMS and sizing optimization consistently delivers the largest reductions in fuel consumption for both the UDDS and WLTP Class 3 cycles, as summarized in Table 13. Relative to the non-optimized baseline, it reduces fuel cell fuel consumption (FC-Fuel) by approximately 22% for UDDS (755 to 588) and 24% for WLTP Class 3 (992 to 755). Compared to sizing-only optimization, the tabulated values correspond to a further reduction of about 8% for UDDS (638 to 588) and 14% for WLTP Class 3 (873 to 755). Compared to EMS-only optimization, the further reduction is about 9% for both cycles (646 to 588 and 829 to 755).

Table 13.

FCHEV optimization results for UDDS and WLTP Class3 driving cycles.

UDDS WLTP Class3
Non-optimized Sizing EMS Sizing + EMS Non-optimized Sizing EMS Sizing + EMS
S-FC 1 0.57 1 1.08 1 0.98 1 1.06
S-BA 1 1.38 1 1.40 1 1.40 1 1.40
S-EM 1 0.71 1 0.82 1 0.79 1 0.80
FC-Fuel 755 638 646 588 992 873 829 755
Eq-Fuel 777 661 696 631 994 874 862 779
Q-FC 3.2 2.0 1.5 1.3 5.2 4.1 1.9 1.4
Q-BA 0.0157 0.0144 0.0154 0.0144 0.0177 0.016 0.0165 0.0163
Obj1 - 1.83 1.84 1.70 - 2.41 2.03 1.91
Obj2 - 0.76 0.77 0.71 - 2.03 1.92 1.75
Acc1 8.1 10.4 8.1 10.4 8.1 10.5 8.1 10.5
Acc2 4.1 5.3 4.1 5.2 4.1 5.3 4.1 5.3
Acc3 16.6 22.9 16.6 22.0 16.6 22.4 16.6 22.3
Spd_max 119.0 106.0 119.0 111.0 119.0 109.3 119.0 110.0
Grd 21.8 16.3 21.8 16.7 21.8 16.4 21.8 16.4

The 3D scatter plots shown in Fig. 24, depicting the sizing of the electric motor (S-EM), fuel cell (S-FC), and battery (S-BA) components, reveal a marked difference in consistency between the sizing-only and combined sizing + EMS optimization strategies. Specifically, the AI EMS + Sizing approach produces a more tightly clustered distribution of component sizes, indicating a more stable and convergent solution space. In contrast, the sizing-only optimization results in a broader scatter, reflecting greater variability in the selected component sizes.
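One way to quantify how tightly a set of sizing solutions clusters is to measure its spread around the centroid. The sketch below does this for two synthetic point clouds; the data are randomly generated for illustration (centred near the Table 11 scale factors, with spreads that are pure assumptions), and the metric itself is a suggestion rather than the measure used in the paper.

```python
import numpy as np


def dispersion(solutions):
    """Per-axis standard deviation and mean distance to the centroid."""
    pts = np.asarray(solutions, dtype=float)
    centroid = pts.mean(axis=0)
    per_axis_std = pts.std(axis=0)
    mean_radius = np.linalg.norm(pts - centroid, axis=1).mean()
    return per_axis_std, mean_radius


# Hypothetical Pareto-set scale factors (S-EM, S-FC, S-BA); illustrative only.
rng = np.random.default_rng(0)
sizing_only = rng.normal(loc=[0.75, 0.77, 1.42], scale=0.10, size=(50, 3))
combined = rng.normal(loc=[0.74, 0.67, 1.39], scale=0.03, size=(50, 3))

for name, pts in (("sizing-only", sizing_only), ("sizing + EMS", combined)):
    std, radius = dispersion(pts)
    print(f"{name:>13}: std per axis = {np.round(std, 3)}, mean radius = {radius:.3f}")
```

A smaller mean radius corresponds to the tighter cluster described for the combined strategy. Applied to the actual Pareto sets, the same function would give a numerical counterpart to the visual impression in Fig. 24.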

Fig. 24.

Sizing optimization scale points for different driving cycles: (a) UDDS sizing, (b) UDDS co-simulation sizing + EMS, (c) WLTP Class 3 sizing, (d) WLTP Class 3 co-simulation sizing + EMS.

Road grade

To investigate the impact of each optimization method, the FCHEV was tested under varying road grades. As the road grade increases from 0% to 6%, there is a notable rise in fuel consumption and component degradation across all driving cycles, as shown in Table 14. Specifically, fuel consumption increases by approximately 30–35% in the real-world traffic cycle and 25–30% in both the WLTP Class 3 and UDDS cycles, reflecting the higher energy demand on steeper inclines. Fuel cell degradation (Q-FC) increases even more sharply, with values rising by around 40–60% in the real-world traffic cycle and exceeding 70% in the WLTP Class 3 cycle, highlighting the strong sensitivity of fuel cells to grade-induced stress. Battery degradation (Q-BA) increases gradually but still rises approximately 15–20% across all cycles.

Table 14.

FCHEV grade results for real-world traffic cycle, UDDS, and WLTP class 3 driving cycles.

Grade 4% Grade 6%
Non-optimized Sizing EMS Sizing + EMS Non-optimized Sizing EMS Sizing + EMS
Real-world traffic cycle (Tehran)
FC-Fuel 1271 1202 946 936 1606 1605 1192 1399
Eq-Fuel 1298 1234 1046 1010 1630 1634 1305 1489
Q-FC 4.2 3.9 2.5 2.6 4.7 4.6 3.4 4.2
Q-BA 0.0155 0.0144 0.0159 0.0145 0.0159 0.0148 0.0162 0.0149
WLTP Class3
FC-Fuel 1550 1496 1412 1395 1910 1849 1749 1833
Eq-Fuel 1560 1509 1453 1425 1926 1871 1803 1874
Q-FC 5.4 5.3 2.4 2.3 5.8 5.6 3.0 2.8
Q-BA 0.0185 0.0166 0.0169 0.0165 0.0193 0.0172 0.0174 0.0168
UDDS
FC-Fuel 1187 1212 1015 1022 1521 1625 1310 1355
Eq-Fuel 1216 1246 1078 1079 1549 1660 1375 1407
Q-FC 3.6 3.5 1.9 2.3 4.8 4.8 2.5 2.5
Q-BA 0.0158 0.0146 0.0155 0.0145 0.0161 0.0149 0.0157 0.0148
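The rise in energy demand with road grade follows from the gravitational force along the slope, listed among the symbols as FSt with vehicle mass M, gravity g, and slope α. The sketch below evaluates the standard relation F_St = M·g·sin(α) for the grades considered here. The vehicle mass and cruising speed are placeholder assumptions, since this section does not restate the vehicle parameters, and the function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def grade_force(mass_kg, grade_percent):
    """Gravitational force along the slope, F_St = M * g * sin(alpha), in N."""
    alpha = math.atan(grade_percent / 100.0)  # grade (%) = 100 * tan(alpha)
    return mass_kg * G * math.sin(alpha)


def grade_power(mass_kg, grade_percent, speed_mps):
    """Additional traction power (W) needed to climb at constant speed."""
    return grade_force(mass_kg, grade_percent) * speed_mps


# Hypothetical vehicle: mass and speed are assumptions, not paper values.
mass = 1500.0        # kg
speed = 50.0 / 3.6   # 50 km/h expressed in m/s
for grade in (0, 4, 6):
    force = grade_force(mass, grade)
    power_kw = grade_power(mass, grade, speed) / 1000.0
    print(f"grade {grade}%: F_St = {force:7.1f} N, extra power = {power_kw:5.2f} kW")
```

Because the grade force is added to the rolling and aerodynamic resistances at every instant of the cycle, even a few percent of slope shifts the whole power request upward, which is consistent with the higher fuel consumption and fuel cell degradation reported in Table 14.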

Figure 25 further corroborates these trends, illustrating a consistent increase in fuel cell degradation (Q-FC) with rising grade under all strategies. The WLTP Class 3 cycle exhibits the highest degradation for the non-optimized vehicle, emphasizing its role as a particularly demanding driving profile. Notably, at a 6% grade the combined AI sizing and EMS optimization strategy reduces fuel cell degradation relative to the non-optimized case by about 11% in the real-world cycle (4.7 to 4.2 W), 48% in UDDS (4.8 to 2.5 W), and 52% in WLTP Class 3 (5.8 to 2.8 W), according to Table 14. Additionally, the figure highlights that, as the grade increases, fuel cell degradation under the EMS-only strategy approaches or matches that of the combined EMS and sizing strategy. This convergence underscores the critical role of EMS optimization in protecting fuel cell lifespan, particularly on steeper grades where operational stresses are most pronounced. It suggests that, even without vehicle downsizing, an effective EMS can significantly reduce degradation, emphasizing its importance in extending fuel cell durability under challenging driving conditions.

Fig. 25.

Fuel cell degradation bar chart for different road grades and driving cycles: (a) Experiment, (b) UDDS, (c) WLTP Class 3.

HIL simulation

This section evaluates the HIL implementation by analyzing the real-world, UDDS, and WLTP Class 3 drive cycles, as shown in Figs. 26(a), (b), and (c). The fuel cell power requests during the HIL tests show that the Type-2 fuzzy controllers, optimized using the hybrid reinforcement learning algorithm NSGA-II-DQN, closely replicate the MIL simulation results. However, some fluctuations appear in the HIL data compared to MIL, mainly due to inherent time delays from hardware analog characteristics, low-pass filtering, and sampling times. Additionally, hardware-specific factors such as ADC quantization errors and PWM-to-analog conversion introduce variability, causing oscillations in the fuel cell power output.
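The hardware effects named above can be illustrated with a toy signal chain: a first-order low-pass filter standing in for the PWM-to-analog stage, followed by uniform ADC quantization. Every parameter below (sample time, filter time constant, ADC resolution, test signal) is an assumption made for illustration; the actual HIL hardware characteristics are not given in this section.

```python
import numpy as np


def quantize(signal, n_bits, v_min, v_max):
    """Uniform ADC quantization to 2**n_bits levels over [v_min, v_max]."""
    levels = 2 ** n_bits - 1
    clipped = np.clip(signal, v_min, v_max)
    codes = np.round((clipped - v_min) / (v_max - v_min) * levels)
    return v_min + codes / levels * (v_max - v_min)


def low_pass(signal, dt, tau):
    """First-order low-pass filter (backward-Euler discretization)."""
    alpha = dt / (tau + dt)
    out = np.empty_like(signal, dtype=float)
    out[0] = signal[0]
    for k in range(1, len(signal)):
        out[k] = out[k - 1] + alpha * (signal[k] - out[k - 1])
    return out


# Hypothetical normalized power request (0..1), as produced in a MIL run.
dt = 0.01  # assumed 10 ms sample time
t = np.arange(0.0, 10.0, dt)
mil = 0.5 + 0.3 * np.sin(2 * np.pi * 0.2 * t)

# Assumed hardware chain: RC-filtered PWM output read back by a 10-bit ADC.
hil = quantize(low_pass(mil, dt, tau=0.05), n_bits=10, v_min=0.0, v_max=1.0)

rms_error = np.sqrt(np.mean((hil - mil) ** 2))
print(f"RMS deviation between HIL-like and MIL signals: {rms_error:.4f}")
```

In this sketch the filter lag accounts for most of the deviation, while quantization adds a small staircase error. This mirrors the qualitative explanation given for the MIL and HIL differences without claiming to reproduce their magnitude.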

A comparison of the combined AI EMS and sizing results between HIL and MIL simulations, detailed in Table 15, reveals minimal deviation, highlighting the robustness and reliability of this optimization strategy. For the UDDS cycle, fuel cell degradation remains virtually identical in both HIL and MIL—1.3 in each case—demonstrating the controller’s ability to effectively manage real-world hardware noise, time delays, and system imperfections. Similarly, battery degradation shows only slight differences, confirming the consistency of AI EMS + sizing performance across both simulation environments.

Table 15.

Optimized and non-optimized HIL results for real-world, UDDS, and WLTP Class3 driving cycles.

Experiment UDDS WLTP Class3
Non-optimized Sizing EMS Sizing + EMS Non-optimized Sizing EMS Sizing + EMS Non-optimized Sizing EMS Sizing + EMS
FC-Fuel 782 665 660 590 756 637 637 579 1009 887 826 755
Eq-Fuel 815 695 729 645 779 662 690 626 1010 890 861 780
Q-FC 3.3 2.4 2.0 1.6 3.3 2.2 1.5 1.3 6.3 5.4 2.0 1.5
Q-BA 0.0153 0.0143 0.0153 0.0145 0.0169 0.0150 0.0158 0.0147 0.0196 0.0171 0.0172 0.0169

In contrast, the deviation between HIL and MIL is more pronounced for the non-optimized and sizing-only strategies (compare Table 15 with Tables 11 and 13). For instance, sizing-only optimization in the UDDS cycle results in a fuel cell degradation of 2.2 W in HIL versus 2.0 W in MIL, and in the WLTP Class 3 cycle the gap widens to 5.4 W versus 4.1 W for sizing-only and 6.3 W versus 5.2 W for the non-optimized case, underscoring the sensitivity of these configurations to hardware-induced fluctuations and delays. These larger discrepancies emphasize the stability of the AI-optimized EMS and sizing approach, which delivers consistent results in both simulation and real-time hardware testing.
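The MIL and HIL fuel cell degradation values can be compared strategy by strategy using the tabulated data. The sketch below transcribes Q-FC from Tables 11 and 13 (MIL) and Table 15 (HIL) and prints the HIL minus MIL gap; the data structure and the function name `hil_mil_gap` are illustrative choices.

```python
# Fuel cell degradation Q-FC (W) transcribed from Tables 11 and 13 (MIL)
# and Table 15 (HIL). Order: non-optimized, sizing, EMS, sizing + EMS.
STRATEGIES = ("non-optimized", "sizing", "EMS", "sizing + EMS")
Q_FC_MIL = {
    "real-world": (3.0, 2.3, 1.9, 1.6),
    "UDDS": (3.2, 2.0, 1.5, 1.3),
    "WLTP Class 3": (5.2, 4.1, 1.9, 1.4),
}
Q_FC_HIL = {
    "real-world": (3.3, 2.4, 2.0, 1.6),
    "UDDS": (3.3, 2.2, 1.5, 1.3),
    "WLTP Class 3": (6.3, 5.4, 2.0, 1.5),
}


def hil_mil_gap(cycle):
    """HIL minus MIL difference in Q-FC for each strategy of one cycle."""
    return {
        name: round(hil - mil, 2)
        for name, mil, hil in zip(STRATEGIES, Q_FC_MIL[cycle], Q_FC_HIL[cycle])
    }


for cycle in Q_FC_MIL:
    print(f"{cycle}: {hil_mil_gap(cycle)}")
```

For the combined strategy the gap is at most 0.1 W across the three cycles, whereas the non-optimized and sizing-only cases reach gaps above 1 W on WLTP Class 3.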

Power fluctuations significantly impact battery power demand, resulting in overshoots and undershoots that are evident in the SOC trends during HIL simulations, as shown in Fig. 27 for the WLTP Class 3 driving cycle. Despite these variations, the overall SOC behavior remains closely aligned between MIL and HIL, demonstrating the effectiveness of the hybrid deep reinforcement learning-based Sizing + EMS optimization under real-time hardware constraints. Notably, the non-optimized HIL tests (Fig. 27(a)) exhibit pronounced fluctuations in fuel cell power and SOC, revealing the system's susceptibility to hardware-induced noise and delays. Although the sizing-only strategy (Fig. 27(b)) improves stability, it still presents noticeable oscillations. The HIL plot for EMS-only optimization (Fig. 27(c)) shows the SOC falling to its lowest level, which can greatly impact durability. In contrast, the AI-integrated EMS + Sizing approach (Fig. 27(d)) achieves the best alignment between MIL and HIL results, with significantly reduced SOC deviations.

Fig. 26.

HIL analysis using co-simulation Sizing + EMS optimization for different driving cycles: (a) fuel cell power demand for real-world, (b) UDDS, (c) WLTP Class 3.

Fig. 27.

SOC HIL analysis for WLTP Class 3 driving cycle: (a) Non-optimized, (b) sizing optimization, (c) EMS optimization, (d) Sizing + EMS optimization.

Conclusion

This paper successfully addressed a critical research gap by introducing a novel AI-driven framework that employs NSGA-II-DQN for the simultaneous optimization of EMS and powertrain sizing in FCHEVs. The core innovation—using a Deep Q-Network to adaptively select genetic operators—overcame the limitations of conventional NSGA-II, which relies on fixed probabilities and often struggles with complex, multi-objective problems. This resulted in Pareto fronts with demonstrably superior convergence, diversity, and solution quality.

The framework was evaluated under realistic driving conditions, synthesized with a Random Forest-based traffic classifier. The results demonstrate that while NSGA-II-DQN consistently outperforms its traditional counterpart, the most significant benefits are realized through the co-optimization of EMS and sizing. This integrated approach reduced equivalent fuel consumption by 21% relative to the non-optimized baseline (about 8% relative to sizing-only optimization) and fuel cell degradation by 30% relative to sizing-only optimization, while keeping battery degradation below the non-optimized level. These improvements were consistently validated across standard driving cycles (UDDS and WLTP Class 3), with fuel consumption reduced by 22–24% from the non-optimized baseline.

However, this pursuit of optimal efficiency and durability necessitated a trade-off in dynamic performance. The co-optimization strategy, which involved downsizing key components, incurred a 35% penalty in acceleration (0–97 km/h) and reductions of approximately 10% in maximum speed and 24% in gradeability. This quantifiable compromise underscores the inherent challenge of balancing competing objectives in holistic vehicle design.

Finally, the practical robustness of the proposed system was confirmed through HIL simulations. Despite the inevitable noise, time delays, and uncertainties introduced by real hardware, the optimized controller maintained consistent and reliable performance across diverse driving conditions. This validation underscores the framework’s readiness for real-world application and marks a significant step forward in the AI-powered co-design of next-generation FCHEVs.

Abbreviations

ADC

Analog-to-Digital Converter

DoD

Depth of Discharge

DQN

Deep Q-Network

ECU

Electronic Control Unit

EMS

Energy Management Strategy

FFT

Fast Fourier Transform

FCHEV

Fuel Cell Hybrid Electric Vehicle

GA

Genetic Algorithm

HIL

Hardware-In-The-Loop

MIL

Model-in-the-Loop

Non-Op

Non-optimized

NSGA-II

Non-dominated Sorting Genetic Algorithm II

OP

Optimized

PMP

Pontryagin’s Minimum Principle

PEMFC

Proton Exchange Membrane Fuel Cell

PWM

Pulse-Width Modulation

SOC

State-of-Charge

SOH

State of Health

TDP

Time Domain Parametrization

T1FS

Type-1 Fuzzy Set

T2FLS

Type-2 Fuzzy Logic System

Symbols

A

Frontal area (m²)

CD

Air resistance coefficient (-)

EQ-Fuel

Equivalent fuel consumption

F

Faraday constant (C/mol)

FC-Fuel

Fuel cell hydrogen consumption

FC-Power

Fuel cell power (W)

Fi

Traction force (N)

FRO

Rolling resistance force (N)

FL

Aerodynamic drag force (N)

FSt

Gravitational force along slope (N)

g

Gravitational acceleration (m/s²)

H2Cons

Hydrogen consumption

I

Electric current

IFCnet

Net current extracted from fuel cell stack (A)

M

Vehicle mass (kg)

N

Stack number of fuel cell

Pbat

Battery power (W)

P-req

Required power for FCHEV (W)

Q-BA

Battery capacity loss

Q-FC

Fuel cell power degradation (W)

Rfd

Gear ratio (-)

Rint

Internal resistance (ohm)

Rohm

Ohmic resistance (ohm.cm²)

f

Dynamic rolling radius coefficient (-)

S-BA

Scale factor for battery sizing

S-FC

Scale factor for fuel cell sizing

S-EM

Scale factor for electric motor sizing

T

Temperature (K)

Tm

Motor torque (Nm)

Tm max

Maximum torque of electric motor (Nm)

u

Vehicle speed (m/s)

Vcell

Cell voltage (V)

Vocv

Open circuit voltage (V)

Greek letters

α

Slope (-)

αk

Reaction transfer coefficient (-)

β

Transfer coefficients (-)

δ

Rotating mass correction factor (-)

Inline graphic

Efficiency of the transmission (-)

Inline graphic

Motor efficiency (-)

Inline graphic

Motor angular velocity (rad/s or rpm)

Author contributions

M.M.: Conceptualization, Writing – original draft, Supervision. A.M.: Conceptualization, Methodology, Software, Writing and editing, Data Curation.

Data availability

Correspondence and requests for materials should be addressed to M.M.-G.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
