AI-driven multi-objective optimization of FCHEV sizing and energy management considering degradation and vehicle dynamics under realistic machine learning-based traffic conditions

Morteza Montazeri-Gh; Afshin Mostashiri

doi:10.1038/s41598-025-23748-8

2025 Nov 13;15:39873. doi: 10.1038/s41598-025-23748-8

PMCID: PMC12615721 PMID: 41233539

Abstract

The performance of Fuel Cell Hybrid Electric Vehicles (FCHEVs) is critically dependent on the optimization of energy management strategies (EMS), powertrain component sizing, and associated cost factors. Achieving the full potential of FCHEVs necessitates sophisticated optimization techniques that balance competing objectives of fuel efficiency, durability, and performance. This paper introduces a novel AI-driven multi-objective optimization framework that simultaneously optimizes both the EMS and powertrain component sizing, incorporating real-world driving conditions, degradation models, and vehicle dynamic constraints such as acceleration, top speed, and gradeability. The methodology begins by employing a machine learning approach using a Random Forest classifier to construct a representative driving cycle, categorizing traffic conditions into four distinct operational scenarios: congested, urban, extra-urban, and highway. An advanced hybrid optimization approach is then developed by combining Deep Q-Networks (DQN) with the NSGA-II evolutionary algorithm. This framework dynamically selects genetic operators (crossover and mutation) based on population performance, enhancing convergence and Pareto front quality. Both Type-2 Fuzzy Logic Controller parameters for EMS and powertrain component sizes are co-optimized to improve efficiency and durability. The proposed co-optimization framework improves both efficiency and durability, reducing fuel consumption by 21% compared to sizing-only optimization, while increasing battery durability by 7% and fuel cell durability by 30% compared to EMS-only approaches. Finally, the practical feasibility of the approach is demonstrated through hardware-in-the-loop (HIL) testing, where the optimized Type-2 fuzzy logic controller is executed in real-time by an Electronic Control Unit (ECU) via a data acquisition interface, confirming the system’s applicability.

Keywords: FCHEV, EMS, Machine learning, Sizing, Aging, HIL

Subject terms: Energy science and technology, Engineering, Mathematics and computing

Introduction

FCHEVs represent one of the most promising solutions for sustainable transportation, offering substantial reductions in greenhouse gas emissions compared to traditional internal combustion engine vehicles^1,2. By emitting only water vapor and warm air, FCHEVs directly contribute to mitigating environmental pollution and are increasingly viewed as a cornerstone of future green mobility. At the heart of FCHEV performance lies the EMS, which dynamically regulates the distribution of power between the fuel cell and the battery, ensuring optimal operation across diverse driving conditions^3–5.

Recent studies have made significant strides in applying genetic optimization methods to enhance energy management strategies (EMS) for hybrid vehicles. For example, paper⁶ utilizes the Non-dominated Sorting Genetic Algorithm III (NSGA-III) to co-optimize component sizing and EMS for an ammonia-hydrogen hybrid powertrain, improving system efficiency and reducing ammonia consumption in heavy-duty applications. Similarly, article⁷ combines NSGA-III with Bayesian optimization, achieving a 31.24% efficiency improvement in hydrogen production for carbon-free heavy-duty vehicles. Additionally, research⁸ demonstrates that adaptive EMS, based on genetic optimization, enhances energy efficiency and reduces carbon emissions in ammonia-hydrogen propulsion systems.

In recent years, artificial intelligence (AI)—and reinforcement learning (RL) in particular—has played an increasingly influential role in advancing energy management system (EMS) design. By enabling adaptive, data-driven, and real-time decision-making, these approaches have delivered measurable gains in fuel economy, reductions in component degradation, and enhanced vehicle performance under dynamic operating conditions^9–11. For instance, a hierarchical Deep Deterministic Policy Gradient (DDPG)-based EMS improved fuel-cell efficiency by up to 56%, reduced battery degradation by 0.28%, and lowered operating costs by 9.24%¹². Similarly, a hybrid Deep Dyna-Q method, which integrates model-free and model-based RL, demonstrated superior EMS optimization compared to conventional strategies, while also reducing training costs and improving policy stability under WLTC conditions¹³. Hybrid EMS structures have also been investigated, such as¹⁴, which combined fuzzy logic for high-level power allocation with RL-based low-level converter control, thereby reducing stress on critical components. More recently, degradation-aware models of fuel cells and batteries have been incorporated into RL-based EMS designs, further improving robustness, adaptability, and interpretability in long-term operation¹⁵. Paper¹⁶ emphasizes the pivotal role of RL in developing effective energy management strategies for fuel cell electric vehicles, demonstrating, through a novel sim-to-real framework, that integrating advanced RL algorithms with high-fidelity vehicle models results in significant reductions in hydrogen consumption—ranging from 4.35% to 5.73% across various testing stages.

While AI-driven optimization of EMS has progressed rapidly, the integration of AI in powertrain component sizing remains relatively underexplored. Although several studies have proposed co-optimization frameworks that address both EMS and component sizing, many have yet to fully exploit the potential of AI-based methods^17,18. In contrast, other optimization techniques, such as Dynamic Programming (DP), Pontryagin’s Maximum Principle (PMP), Equivalent Consumption Minimization Strategy (ECMS), and Particle Swarm Optimization (PSO), have been widely used. For instance, Article¹⁹ presents a co-optimization approach for hybrid electric vehicles that simultaneously optimizes battery size and EMS, considering factors such as energy consumption, battery degradation, and depth of discharge (DOD). Using convex programming, this approach aims to minimize total costs and enhance vehicle efficiency. Similarly, Article²⁰ introduces a real-time, multi-layer co-optimization strategy for hybrid vehicles, improving powertrain configuration, parameters, and control. By integrating a multi-mode, multi-gear system with fast real-time control (AFRCS), this method significantly enhances fuel economy, acceleration, and battery life, with real-world tests showing a 50.45% improvement in acceleration and a 22.67% increase in battery life. In the case of fuel cell vehicles, convex programming has also been applied to optimize EMS and component sizing for hybrid buses, focusing on driving patterns and cost sensitivity²¹. Additionally, other studies have examined the optimization of fuel cell power ratings, battery capacities, and control strategies to understand how these sizing decisions affect overall system efficiency²². A crucial, yet often overlooked, aspect of simultaneously optimizing EMS and component sizing is the impact of component aging and vehicle dynamics. 
Recent research highlights the importance of performance-related constraints—such as acceleration, top speed, and gradeability—in shaping optimization outcomes^23,24. For example, one study²⁵ showed that explicitly considering vehicle dynamics under WLTC conditions improved gradeability by 10.5% and extended fuel cell shutdown time by 18.5%, all while maintaining drivability.

Parallel research has also begun addressing aging-aware co-optimization by embedding degradation models for both fuel cells and batteries into the optimization framework^26–29. These studies demonstrate tangible benefits: a co-simulation method³⁰ that simultaneously optimized component sizing and EMS achieved a 29% extension in fuel cell lifetime and a 15% gain in fuel efficiency, albeit with a moderate increase in battery aging. Similarly, a fuzzy multi-objective framework applied to a battery–ultracapacitor hybrid system helped balance trade-offs among range, efficiency, and longevity³¹. Additionally, paper³² presents a co-optimization framework for hybrid powertrains, combining high-fidelity engine and motor maps with an adaptive NSGA-III algorithm enhanced by a chaos sequence (NSGA-CS) to improve diversity and prevent premature convergence. This method minimizes fuel consumption, battery degradation, and manufacturing cost, outperforming traditional approaches and validated through real driving cycles and hardware-in-loop experiments. Another study³³ optimized component sizing and EMS policies for a Toyota Mirai platform, achieving up to a 21% increase in fuel economy and meaningful reductions in both cost and degradation. Despite these advances, there remains no comprehensive framework that fully integrates AI-powered optimization of EMS and powertrain sizing simultaneously, while also incorporating real-world traffic patterns, component aging, and vehicle dynamics in a unified process. This gap underscores the need for advanced strategies that not only rely on co-simulation but also improve the underlying optimization process itself.

One important development in this regard is the use of reinforcement learning for adaptive operator selection in evolutionary algorithms, a concept already applied in diverse optimization contexts^34–36. A Dueling Deep Q-Network has been used to dynamically select crossover and mutation operators, improving multi-objective optimization on benchmark problems with applications in engineering and resource allocation³⁷. Deep RL–based operator selection has also optimized energy and travel time for unmanned electric sweepers in urban networks³⁸. More recently, its integration into NSGA-II demonstrated clear advantages, yielding higher-quality Pareto fronts in mixed-flow assembly scheduling³⁹.

Building on these advances, this paper proposes a novel hybrid AI-based multi-objective optimization framework tailored for FCHEVs. The framework embeds a DQN within NSGA-II, enabling adaptive selection of genetic operators to jointly optimize both Type-2 Fuzzy Logic Controller parameters and powertrain component sizing. To ensure realistic operating conditions, a machine-learning-based traffic classification model is incorporated, using a Random Forest classifier to generate representative driving cycles spanning congested, urban, extra-urban, and highway scenarios. Degradation models for the fuel cell and battery are explicitly included, along with essential vehicle performance constraints such as acceleration, top speed, and gradeability. The proposed system is evaluated through simulation, with key metrics including fuel consumption, battery degradation, and fuel cell aging. Finally, HIL testing is conducted to validate the practical robustness and applicability of the approach under real-world driving conditions.

Vehicle description

An FCHEV typically consists of a fuel cell system, an electric motor, a battery, a power electronics unit, a hydrogen storage tank, and a control system that manages energy flow between the fuel cell, battery, and electric motor, as illustrated in Fig. 1(a).

To better understand the vehicle’s performance, Fig. 1(b) illustrates its longitudinal dynamics, which are primarily influenced by four forces: traction force ($F_t$), rolling resistance ($F_r$), aerodynamic drag ($F_a$), and the gravitational component along the road slope ($F_g$). The governing dynamic equation, which determines the required traction demand and the corresponding electric power, is expressed in Eq. (1).

$$P_e = \frac{u}{\eta_T\,\eta_m}\left(M g f_r \cos\theta + \tfrac{1}{2}\rho\, C_d A_f u^2 + M g \sin\theta + \delta M \frac{du}{dt}\right) \qquad (1)$$

Here, $M$ represents the total mass of the vehicle, encompassing the weights of the fuel cells, battery, electric motor, and other essential components. $P_e$ denotes the input electric power supplied to the DC/AC inverter, which is required to drive the electric motor. The parameters $\eta_T$ and $\eta_m$ refer to the transmission efficiency and motor drive efficiency, respectively. Additional key variables include $u$ for vehicle speed, $\rho$ for air density, $\delta$ as the rotational mass correction factor, $A_f$ as the vehicle’s frontal area, $C_d$ for the aerodynamic drag coefficient, $f_r$ for the rolling resistance coefficient, $g$ for gravitational acceleration, and $\theta$ for the road slope angle. The complete set of vehicle specifications used in this study is provided in Table 1.
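
The force balance described above can be sketched numerically. The snippet below is a minimal illustration, not the authors' model: it uses the Table 1 vehicle values, while the air density, the two efficiencies, and the function names are assumptions made only for this example, and it covers traction (positive power) only.

```python
import math

# Vehicle parameters from Table 1; RHO and the efficiencies are assumed values.
M = 1531.0      # vehicle mass, kg
C_D = 0.318     # aerodynamic drag coefficient
F_R = 0.0102    # rolling resistance coefficient
A_F = 2.1       # frontal area, m^2
DELTA = 1.078   # rotational mass correction factor
RHO = 1.2       # air density, kg/m^3 (assumption)
G = 9.81        # gravitational acceleration, m/s^2


def traction_force(u, accel, slope_rad=0.0):
    """Traction force (N) at speed u (m/s) and acceleration accel (m/s^2)."""
    rolling = M * G * F_R * math.cos(slope_rad)
    aero = 0.5 * RHO * C_D * A_F * u ** 2
    grade = M * G * math.sin(slope_rad)
    inertia = DELTA * M * accel
    return rolling + aero + grade + inertia


def electric_power(u, accel, slope_rad=0.0, eta_t=0.95, eta_m=0.90):
    """Electric power (W) at the inverter input while driving (assumed efficiencies)."""
    return traction_force(u, accel, slope_rad) * u / (eta_t * eta_m)


if __name__ == "__main__":
    # Steady cruise at 20 m/s (72 km/h) on a flat road.
    print(round(electric_power(20.0, 0.0)), "W")
```

At standstill on a flat road only the rolling term remains, which gives a quick sanity check on the parameter values.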

Table 1.

Specifications of the studied FCHEV.

Component	Parameter	Value	Unit	Source
Vehicle	Drag coefficient ($C_d$)	0.318	-	⁴⁰
Vehicle	Coefficient of rolling friction ($f_r$)	0.0102	-	⁴⁰
Vehicle	Frontal area ($A_f$)	2.1	m²	⁴⁰
Vehicle	Vehicle mass ($M$)	1531	kg	⁴⁰
Vehicle	Rotational mass correction factor ($\delta$)	1.078	-	⁴⁰
Battery	Nominal capacity	5	Ah	⁴⁸
Battery	Voltage	413	V	⁴⁸
Fuel cell	Maximum power	75	kW	⁴¹
Fuel cell	Stack number	430	-	⁴¹
Fuel cell	Active area	320	cm²	⁴¹
Motor	Maximum power	75	kW	ADVISOR
Motor	Maximum torque	280	Nm	ADVISOR
Motor	Maximum speed	6000	rpm	ADVISOR

Fuel cell modeling

Proton Exchange Membrane Fuel Cells (PEMFCs) generate electrical power by facilitating an electrochemical reaction between hydrogen and oxygen. In this process, hydrogen serves as the primary fuel, while oxygen is sourced from the surrounding air. The byproducts of the reaction are electricity, heat, and water, making PEMFCs an environmentally friendly option. A distinctive feature of these fuel cells is the solid polymer electrolyte membrane, which selectively allows protons to migrate through while forcing electrons to travel via an external circuit (refer to Fig. 2(a)).

Fig. 2 — Fuel cell modeling plots: **(a)** Basic structure of PEMFC, **(b)** efficiency plot of the fuel cell system⁴¹.

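This separation creates the electric current necessary for power generation. The fundamental reaction governing this process is summarized in Eq. (2). Additional specifications related to the fuel cell stack are listed in Table 1.

$$2\mathrm{H}_2 + \mathrm{O}_2 \rightarrow 2\mathrm{H}_2\mathrm{O} + \text{electricity} + \text{heat} \qquad (2)$$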

The hydrogen consumption of a PEMFC ($\dot{m}_{H_2}$) can be determined using:

In this context, $N$ represents the number of cells, $F$ denotes the Faraday constant, and $I_{fc}$ refers to the net current extracted from the PEMFC. This hydrogen consumption is then utilized to determine both the cost of hydrogen and the energy input to the PEMFC, as outlined below:

Where $HHV_{H_2}$ represents the higher heating value of the hydrogen supplied to the PEMFC. The specific relationships between the fuel cell power (FC Power) and efficiency examined in this study are depicted in Fig. 2(b)⁴¹.
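
As a rough illustration of how hydrogen consumption follows from the stack current, the sketch below applies Faraday's law (two electrons per hydrogen molecule) and multiplies by the higher heating value to obtain the chemical energy rate. It assumes a mass-based form with the molar mass of hydrogen; the function names, the approximate HHV constant, and the hypothetical `price_per_kg` argument are illustrative choices and are not taken from the paper.

```python
FARADAY = 96485.0   # Faraday constant, C/mol
M_H2 = 2.016e-3     # molar mass of hydrogen, kg/mol
HHV_H2 = 141.8e6    # higher heating value of hydrogen, J/kg (approximate)


def hydrogen_mass_flow(n_cells, current_a):
    """Hydrogen mass flow (kg/s) for a stack of n_cells carrying current_a amperes."""
    return n_cells * current_a * M_H2 / (2.0 * FARADAY)


def hydrogen_energy_rate(n_cells, current_a):
    """Chemical energy input rate (W) based on the higher heating value."""
    return hydrogen_mass_flow(n_cells, current_a) * HHV_H2


def hydrogen_cost(n_cells, current_a, duration_s, price_per_kg):
    """Fuel cost for a constant current held for duration_s (price is hypothetical)."""
    return hydrogen_mass_flow(n_cells, current_a) * duration_s * price_per_kg


if __name__ == "__main__":
    # 430 cells (Table 1) at an example current of 100 A.
    print(hydrogen_mass_flow(430, 100.0), "kg/s")
```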

To enhance the accuracy and interpretability of fuel cell vehicle models, advanced AI methods leveraging experimental data are highly effective. For example, paper⁴² presents a theory-constrained neural network (TCNN) that combines theoretical models with data-driven techniques, using experimental data to improve fuel cell temperature and voltage predictions while maintaining physical significance, resulting in better hydrogen consumption estimation and system optimization.

Fuel cell aging

This study estimates fuel cell power degradation (Q-FC) by incorporating the cumulative effects of various operational conditions using empirically derived coefficients that reflect different degradation mechanisms^43,44. The deterioration in maximum power output, denoted as Q-FC, is calculated using the equation:

Where $P_{FC,max}$ indicates the peak power capacity of the fuel cell, serving as a reference for quantifying losses. The degradation coefficients are associated with distinct stress factors: α₁ (0.00126% per hour) accounts for prolonged operation at very low power levels (below 5% of $P_{FC,max}$), which can impair electrochemical efficiency; α₂ (0.00196% per cycle) reflects the impact of frequent startups and shutdowns; α₃ (5.93 × 10⁻⁵% per load change) measures degradation from transient fluctuations in load demand; α₄ corresponds to the high-power operation stress, typically encountered above 90% of rated output; and α₅ (0.002% per hour) represents the baseline performance decay under regular usage, attributable to gradual material wear such as membrane thinning and catalyst aging. The time durations t₁, t₂, and t₃ represent cumulative periods of low-power use, high-power operation, and total runtime, respectively, while n₁ and n₂ count the number of on/off cycles and transient load events⁴⁴. It is noteworthy that, according to⁴⁵, fuel cell degradation is predominantly attributed to load changing, which accounts for 56.5% of the total degradation. In alignment with standards from the U.S. Department of Energy (DOE), a fuel cell is considered to have reached its end of life (EOL) when its peak output drops by 10%, with a targeted lifespan of 5000 operational hours⁴⁶.
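
One way to read the coefficient description above is as a weighted sum of operating times and event counts. The following sketch assumes that additive form and reports the loss as a percentage of peak power; it is an interpretation, not the paper's exact equation. The text gives no value for α₄, so it is left as a required argument (`alpha_high_power`) rather than filled in.

```python
ALPHA_LOW_POWER = 0.00126    # % per hour below 5% of peak power
ALPHA_START_STOP = 0.00196   # % per start/stop cycle
ALPHA_LOAD_CHANGE = 5.93e-5  # % per load change
ALPHA_BASELINE = 0.002       # % per hour of total runtime
DOE_EOL_PERCENT = 10.0       # end of life: 10% drop in peak output


def fuel_cell_degradation_percent(t_low_h, n_start_stop, n_load_changes,
                                  t_high_h, t_total_h, alpha_high_power):
    """Peak-power loss (%) as a weighted sum of stress factors (assumed form).

    alpha_high_power is the high-power coefficient in % per hour; it must be
    supplied by the caller because the text does not state its value.
    """
    return (ALPHA_LOW_POWER * t_low_h
            + ALPHA_START_STOP * n_start_stop
            + ALPHA_LOAD_CHANGE * n_load_changes
            + alpha_high_power * t_high_h
            + ALPHA_BASELINE * t_total_h)


def reached_end_of_life(loss_percent):
    """True once the loss reaches the DOE end-of-life threshold (with tolerance)."""
    return loss_percent >= DOE_EOL_PERCENT - 1e-9


if __name__ == "__main__":
    # Baseline decay alone over the 5000 h target lifetime.
    print(fuel_cell_degradation_percent(0, 0, 0, 0, 5000, alpha_high_power=0.0))
```

Under this reading, the baseline term alone consumes the full 10% budget over 5000 hours, so any start/stop, idle, or transient loading shortens the lifetime below the DOE target.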

Battery modeling

To accurately simulate the internal resistance and open-circuit voltage behavior, lookup tables were applied based on discharge properties, charge resistance, and voltage–SOC profiles, as detailed in Fig. 3(a)⁴⁷. In this model, the battery’s output current $I_b$, terminal voltage $V_b$, and output power $P_b$ are computed using the following relationships:

Fig. 3 — Characteristic maps for battery modeling: **(a)** Relationship between internal resistance, open circuit voltage, and SOC⁴⁷, **(b)** capacity depletion at C/2 discharge rate as a function of cycle number for various DoD⁵¹.

To extend battery service life and avoid damage from deep discharges, the SOC should not fall below 40%^48,49, which corresponds to a maximum depth of discharge (DoD) of 60%, as defined by:

$$DoD = 1 - SOC$$
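
A common way to compute these battery quantities is the internal-resistance (Rint) equivalent circuit, in which the terminal power satisfies P = (V_oc − R·I)·I. The sketch below assumes that model; the paper's exact relationships are not reproduced here, and the 0.5 Ω resistance in the example is an arbitrary placeholder rather than a value from the lookup tables.

```python
import math


def battery_current(p_batt_w, v_oc, r_int):
    """Current (A) that delivers p_batt_w at the terminals (negative when charging).

    Solves R*I^2 - Voc*I + P = 0 and keeps the smaller, physically meaningful root.
    """
    disc = v_oc ** 2 - 4.0 * r_int * p_batt_w
    if disc < 0:
        raise ValueError("requested power exceeds what the battery can deliver")
    return (v_oc - math.sqrt(disc)) / (2.0 * r_int)


def terminal_voltage(current_a, v_oc, r_int):
    """Terminal voltage (V) under load."""
    return v_oc - r_int * current_a


def depth_of_discharge(soc):
    """Depth of discharge as the complement of state of charge (both 0..1)."""
    return 1.0 - soc


if __name__ == "__main__":
    i = battery_current(10_000.0, v_oc=413.0, r_int=0.5)  # r_int is a placeholder
    print(i, terminal_voltage(i, 413.0, 0.5), depth_of_discharge(0.4))
```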

Battery aging

The battery degradation model presented in this work adopts a semi-empirical approach, incorporating the combined influence of temperature, SOC, and C-rate to evaluate battery aging. It estimates performance decline through the total ampere-hour (Ah) throughput, offering a holistic assessment of usage impact over time. Battery capacity loss is expressed as a percentage of the initial capacity, making the degradation trends easily quantifiable. To account for varying operational conditions, the average stress factors are calculated over the full driving cycle, thereby improving the model’s representation of real-world use. The primary degradation metric is:

$$Q_{loss}(\%) = \frac{Q_0 - Q(Ah)}{Q_0} \times 100$$

In this formulation, $Q_0$ is the initial nominal battery capacity, and $Q(Ah)$ is the remaining capacity after experiencing a cumulative charge/discharge throughput $Ah$. The total charge throughput, including both charging and discharging phases, is calculated by:

$$Ah = \frac{1}{3600}\int_0^{t} \left| I_b(\tau) \right| d\tau$$

Where $I_b$ represents the battery current over time. Capacity fade at the cell level is further characterized by⁵⁰:

$$Q_{loss} = \sigma \exp\left(\frac{-E_a}{R\,T}\right) Ah^{z}$$

Here, $\sigma$ represents the capacity severity factor function, $R$ denotes the gas constant, $E_a$ is the cell activation energy associated with capacity fade, $T$ refers to the battery temperature, and $z$ is the power-law exponent.

To generalize the model for various C-rates, Wang et al.⁵¹ introduced a modified expression:

$$Q_{loss} = B \exp\left(\frac{-31700 + 370.3\, C_{rate}}{R\,T}\right) Ah^{0.55}$$

Table 2 provides the corresponding values of the coefficient $B$ for different C-rates.

Table 2.

Values of B with respect to C-rate⁵¹.

C-rate	C/2	2C	6C	10C
B values	31,630	21,681	12,934	15,512
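
A minimal sketch of the C-rate-dependent capacity-fade expression attributed to Wang et al.⁵¹ is given below, using the B values from Table 2. The activation-energy constants (−31700 and 370.3) and the 0.55 exponent are the values commonly quoted for that model and should be checked against⁵¹; the function name and the example operating point are assumptions.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol K)

# Pre-exponential factor B for each tabulated C-rate (Table 2).
B_BY_C_RATE = {0.5: 31630.0, 2.0: 21681.0, 6.0: 12934.0, 10.0: 15512.0}


def capacity_loss_percent(ah_throughput, c_rate, temp_k):
    """Capacity loss (%) after ah_throughput ampere-hours at a tabulated C-rate."""
    b = B_BY_C_RATE[c_rate]  # only tabulated C-rates are supported in this sketch
    activation = -31700.0 + 370.3 * c_rate  # J/mol, assumed constants
    return b * math.exp(activation / (GAS_CONSTANT * temp_k)) * ah_throughput ** 0.55


if __name__ == "__main__":
    # 1000 Ah of throughput at C/2 and 25 degrees Celsius.
    print(capacity_loss_percent(1000.0, 0.5, 298.15))
```

Interpolating B between tabulated C-rates would be needed for a real drive cycle; that step is deliberately left out here.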

Figure 3(b) presents the capacity loss trend for a LiFePO₄ cell operating at C/2 across different DoD. The plotted curves exhibit a mild S-curve pattern, highlighting the complex nature of cycle-induced degradation. Nevertheless, a generally linear trend between capacity retention and cycle count is observed within individual DoD levels.

Electric motor modeling

The core performance parameters of an electric motor include rotational speed, output torque, and conversion efficiency. In this study, motor behavior is modeled using an efficiency map, depicted in Fig. 4. This map delineates operating zones, with quadrant I representing propulsion (driving) and quadrant IV indicating regenerative braking (generator operation). Furthermore, Fig. 4 outlines the motor’s peak torque boundaries under varying conditions. The power output or consumption of the motor, $P_m$, in relation to torque $T_m$ and angular velocity $\omega_m$, is defined by the following expressions:

Fig. 4 — Efficiency map and torque-speed profile.

Development of a real-world traffic driving cycle

A realistic, scenario-specific drive cycle largely determines the Pareto front, that is, the observed trade-off between fuel use and component aging. The cycle fixes the distribution of transients (stops, bursts, grades, cruising) that drive power demand, SOC excursions and Ah-throughput, thermal loads, and fuel cell on/off events, which are exactly the mechanisms that consume fuel and accumulate degradation⁵². This research focused on formulating a representative drive cycle tailored to Tehran’s traffic conditions, aimed at capturing the city’s distinctive driving patterns. Given Tehran’s dense population (approximately 10 million residents⁵³) and its well-known issues with traffic congestion, creating an accurate drive cycle was essential for reliable modeling and simulation. To achieve this, driving data were collected from various urban routes throughout the city, providing the foundation for constructing a drive cycle suitable for evaluating and optimizing vehicle performance under real-world traffic conditions.

Data collection setup

A GPS-enabled system was developed using an Arduino Uno microcontroller integrated with a NEO-6M GPS module to record real-world traffic data. As depicted in Fig. 5(a), this setup was employed to log vehicle speed as the car traveled through various areas. The GPS device sampled data at consistent time intervals, enabling the construction of detailed speed-time profiles. The entire system was enclosed in a durable casing for protection during on-road testing, as shown in Fig. 5(b). The data collection routes covered several key streets and highways, with Fig. 5(c) highlighting the primary paths. Routes marked in red represent areas with higher traffic congestion, while those in yellow indicate regions with relatively smoother traffic flow. Figure 5(d) displays the vehicles outfitted with the GPS unit during data gathering.

Fig. 5 — Data collection setup and route for real-world traffic cycle: **(a)** The components of GPS, **(b)** GPS box, **(c)** route characteristics of the Tehran city, **(d)** setup to develop drive cycle.

Noise filtering

Traffic data was collected over six months to develop the real-world driving cycle. Aggregating all recorded speed-time samples resulted in a comprehensive drive cycle encompassing more than 300,000 s of data, as illustrated in Fig. 6(a). An advanced smoothing algorithm was applied to address high-frequency fluctuations in the velocity data, providing superior performance compared to conventional averaging techniques. This method follows the filtering approach proposed in⁵⁴, expressed by the following equation:

Fig. 6 — Development of the real-world traffic drive cycle: **(a)** Driving data collected over a six-month period across various routes in the city; **(b)** drive cycle generation using a machine learning-based clustering approach.

In this formulation, the kernel function $K(\cdot)$ assigns appropriate weights to the velocity samples within a specified temporal window centered around each time point (t), thereby enabling effective noise reduction. A smoothing window of 4 s (h = 4) was adopted, utilizing a weighted kernel to enhance filtering precision.

The employed kernel function $K(\cdot)$ is defined as:

This kernel effectively reduces noise while preserving the key dynamics of the driving cycle. The filtering process results are depicted in Fig. 6(a), demonstrating the method’s efficacy in smoothing raw velocity data.
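
The kind of kernel-weighted smoothing described here can be sketched as a weighted moving average over a ±h window. The kernel used in the paper is not recoverable from the text, so the example below assumes a biweight kernel, K(x) = 15/16·(1 − x²)² for |x| ≤ 1, purely for illustration; a 1 Hz sampling rate is also assumed so that h = 4 corresponds to 4 s.

```python
def biweight_kernel(x):
    """Assumed kernel: 15/16 * (1 - x^2)^2 inside [-1, 1], zero outside."""
    return 15.0 / 16.0 * (1.0 - x * x) ** 2 if abs(x) <= 1.0 else 0.0


def kernel_smooth(speeds, h=4, kernel=biweight_kernel):
    """Smooth a 1 Hz speed trace with a normalized kernel-weighted average."""
    n = len(speeds)
    smoothed = []
    for t in range(n):
        num = 0.0
        den = 0.0
        # Window is truncated at the ends of the trace.
        for s in range(max(0, t - h), min(n, t + h + 1)):
            w = kernel((s - t) / h)
            num += w * speeds[s]
            den += w
        smoothed.append(num / den)  # den > 0 because kernel(0) > 0
    return smoothed


if __name__ == "__main__":
    noisy = [20.0, 21.0, 35.0, 20.5, 19.5, 20.0, 20.2]  # one GPS spike at index 2
    print([round(v, 2) for v in kernel_smooth(noisy)])
```

Because the weights are normalized, a constant speed trace passes through unchanged, while isolated spikes are pulled toward their neighbors.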

Machine learning approach for classifying and constructing traffic drive cycle

The speed data used in this study encompassed a wide range of routes, traffic conditions, and time-of-day variations to ensure a comprehensive representation of driving environment. The dataset was segmented into microtrips, defined as continuous driving sequences separated by idle periods. A microtrip was considered active when the vehicle’s speed was greater than zero, and it ended when the speed dropped to zero and remained there for a period of time. This segmentation approach effectively captures both dynamic driving phases and stationary intervals.

For each microtrip, a set of descriptive features was extracted to characterize the driving conditions. These features included average speed, idle time percentage, speed standard deviation, peak speed, acceleration events, and deceleration events. The extracted microtrips were then classified into four driving categories: Congested, Urban, Extra-Urban, and Highway. The classification criteria and feature thresholds for each category are presented in Table 3.
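
The segmentation and feature extraction steps can be illustrated as follows. This is a sketch under stated assumptions: the trace is sampled at 1 Hz in km/h, a microtrip is closed once the vehicle has been stationary for at least `min_idle` samples and starts moving again, and acceleration or deceleration "events" are counted as samples whose speed change exceeds a threshold. The paper does not state its idle duration or event definition, so both are hypothetical parameters.

```python
from statistics import mean, pstdev


def split_microtrips(speeds, min_idle=3):
    """Split a speed trace into microtrips separated by sustained idle periods."""
    trips, current, zero_run = [], [], 0
    for v in speeds:
        if v > 0:
            # Movement after a long enough stop starts a new microtrip.
            if zero_run >= min_idle and any(x > 0 for x in current):
                trips.append(current)
                current = []
            zero_run = 0
        else:
            zero_run += 1
        current.append(v)
    if any(x > 0 for x in current):
        trips.append(current)
    return trips


def microtrip_features(trip, threshold=0.1):
    """Descriptive features of one microtrip (speeds in km/h, 1 Hz sampling)."""
    deltas = [b - a for a, b in zip(trip, trip[1:])]
    return {
        "avg_speed": mean(trip),
        "idle_pct": 100.0 * sum(1 for v in trip if v == 0) / len(trip),
        "std_speed": pstdev(trip),
        "max_speed": max(trip),
        "accel_events": sum(1 for d in deltas if d > threshold),
        "decel_events": sum(1 for d in deltas if d < -threshold),
    }


if __name__ == "__main__":
    trace = [0, 0, 5, 10, 0, 0, 0, 0, 7, 8, 0]
    for trip in split_microtrips(trace):
        print(trip, microtrip_features(trip))
```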

Table 3.

Feature thresholds for congested, urban, extra-urban, and highway conditions.

Class	Average Velocity (km/h)	Idle Percentage (%)	Standard Deviation of Velocity (km/h)	Max Velocity (km/h)	Acceleration Events	Deceleration Events
Congested	0–5	0–100	< 2	< 10	0–5	0–5
Urban	5–15	0–75	< 5	10–30	2–10	5–20
Extra Urban	15–30	0–53	< 10	30–50	5–15	10–30
Highway	> 30	0–30	> 10	> 40	10–20	20–50

The extracted features from each microtrip were employed to train a Random Forest classifier to identify the driving condition associated with each segment. The classifier was trained on 300,000 s of labeled microtrip data, with a cost matrix to penalize misclassifications. Boundary adjustment resolved ambiguous cases near class thresholds. The Random Forest algorithm operates as an ensemble of multiple decision trees, each trained on a random subset of features and bootstrapped data samples (see Fig. 7). For instance, one tree in the forest might split first on Average Velocity (> 15 km/h), then on Max Velocity (> 40 km/h) to classify Highway traffic, while another tree could prioritize Idle Percentage (> 50%) to detect Congested conditions. During inference, each tree votes independently, and the final prediction is determined by majority voting (for classification) or averaging (for regression). This approach reduces overfitting by decorrelating individual trees, leveraging the strength of collective decision-making while mitigating biases from any single tree’s structure.
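
The voting step can be illustrated with a toy ensemble. The three "trees" below are hand-written stand-ins loosely modeled on the example splits in the text; they are not trained, and a real Random Forest would learn its splits from bootstrapped data and random feature subsets. Only the majority-vote aggregation is the point of the sketch, and all function names and thresholds beyond those quoted in the text are assumptions.

```python
from collections import Counter


def tree_speed_first(f):
    """Splits on average velocity first, then on maximum velocity."""
    if f["avg_speed"] > 15:
        return "Highway" if f["max_speed"] > 40 else "Extra Urban"
    return "Urban" if f["avg_speed"] > 5 else "Congested"


def tree_idle_first(f):
    """Prioritizes idle percentage to detect congestion."""
    if f["idle_pct"] > 50:
        return "Congested"
    if f["avg_speed"] > 30:
        return "Highway"
    return "Extra Urban" if f["avg_speed"] > 15 else "Urban"


def tree_variability_first(f):
    """Splits on speed standard deviation first."""
    if f["std_speed"] > 10:
        return "Highway"
    if f["std_speed"] < 2:
        return "Congested"
    return "Extra Urban" if f["max_speed"] > 30 else "Urban"


def forest_predict(features, trees):
    """Majority vote over the trees; ties go to the label voted first."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    forest = [tree_speed_first, tree_idle_first, tree_variability_first]
    sample = {"avg_speed": 16.0, "idle_pct": 10.0, "std_speed": 6.0, "max_speed": 28.0}
    print(forest_predict(sample, forest))  # two of three trees vote "Extra Urban"
```

The sample in the usage example sits near the Urban and Extra Urban boundary, where one tree disagrees with the other two; the vote still yields a single label.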

Fig. 7 — Decision tree for driving condition classification based on velocity, acceleration, and traffic metrics.

The classification model’s performance is visually demonstrated in a series of plots. Figure 8(a) shows a 3D scatter plot of Average Velocity, Idle Percentage, and Standard Deviation of Velocity, with microtrips color-coded by condition. Congested microtrips show lower velocities and higher idle percentages, while Highway microtrips exhibit the opposite characteristics. Figure 8(b) is a bar chart showing the number of microtrips in each condition, with Congested being the most common. Figure 8(c) displays a 2D plot of Idle Percentage vs. Average Velocity, clearly distinguishing Congested and Highway conditions. Figure 8(d) shows Max Velocity vs. Standard Deviation, where Highway microtrips have the highest values for both features.

Fig. 8 — Random Forest classification of microtrip driving conditions: **(a)** three-dimensional scatter plot of average velocity, idle time percentage, and speed standard deviation; **(b)** two-dimensional plot of average velocity and idle time percentage; **(c)** bar chart of microtrip distribution; **(d)** plot of maximum velocity and speed standard deviation.

The confusion matrix (Fig. 9(a)) illustrates the model’s performance, showing high classification accuracy: 99.8% for Congested, 93.1% for Urban, 99.6% for Extra Urban, and 100% for Highway. Most misclassifications occur between Urban and Extra Urban due to overlapping features. In Fig. 9(b), correctly classified microtrips are shown in green and misclassified ones in red, with errors mainly appearing at the boundaries between Urban and Extra Urban conditions.

Finally, the traffic drive cycle was generated by stitching together selected microtrips from each classified driving condition, prioritizing those closest to their respective cluster centers (see Fig. 6(b)). This approach ensured that the resulting drive cycle accurately represented the typical driving patterns. The cycle captures diverse traffic conditions—Congested, Urban, Extra Urban, and Highway—each with distinct durations, speed profiles, and behavioral characteristics, as summarized in Table 4.
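
Selecting the microtrips nearest to a class center can be sketched as a nearest-to-centroid ranking in feature space. The example assumes the features have already been scaled to comparable ranges and uses plain Euclidean distance; the paper does not specify its distance measure or how many microtrips are kept per class, so `k` and the helper names are hypothetical.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def closest_to_center(feature_rows, k):
    """Indices of the k feature vectors nearest to the class centroid."""
    center = centroid(feature_rows)
    ranked = sorted(range(len(feature_rows)),
                    key=lambda i: math.dist(feature_rows[i], center))
    return ranked[:k]


if __name__ == "__main__":
    # Toy (avg_speed, idle_pct) vectors for one class, already scaled.
    urban = [[0.0, 0.0], [10.0, 10.0], [4.0, 5.0], [5.0, 5.0], [6.0, 4.0]]
    print(closest_to_center(urban, k=2))
```

The selected microtrips from each class would then be concatenated in the desired order to form the composite cycle.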

Table 4.

Characteristics of congested, urban, extra-urban, and highway conditions.

Condition	Congested	Urban	Extra Urban	Highway	Total Cycle
Duration (sec)	465	500	499	565	2029
MaxSpeed (km/h)	12.5	17.1	35.7	71.4	71.4
AverageSpeed (km/h)	2.7	7.0	19.2	39.2	18.0
StdVel (km/h)	3.2	4.8	8.4	19.6	18.5
AccelEvts	90	168	211	224	693
DecelEvts	123	174	160	216	674
MaxAccel (m/s²)	0.83	0.87	0.965	1.40	1.40
MinAccel (m/s²)	−0.77	−0.9	−1.2	−1.13	−1.2
AvgAccel (m/s²)	0.263	0.264	0.27	0.28	0.27
AvgDecel (m/s²)	−0.18	−0.25	−0.33	−0.31	−0.27
IdleTime (sec)	229	83	28	13	353
NumberStops	14	12	3	1	31
Distance (km)	0.35	0.98	2.67	6.16	10.15

Hybrid multi-objective deep reinforcement learning optimization for FCHEV

This section presents a hybrid multi-objective optimization framework designed to simultaneously optimize powertrain component sizing and the EMS of an FCHEV. The proposed method combines NSGA-II with a DQN to enable the adaptive selection of genetic operators. Integrating these techniques balances multiple conflicting objectives, including maximizing energy efficiency, minimizing operational costs, and prolonging system durability. Moreover, the framework simultaneously fine-tunes both the physical configuration of the powertrain and the parameters of the Type-2 fuzzy logic controller, ensuring intelligent energy distribution under real-world driving conditions.

NSGA-II multi-objective optimization

Figures 10(a) and 10(b) together illustrate the working principles of the NSGA-II algorithm for solving multi-objective optimization problems. In part (a), the process begins with a randomly generated initial population of solutions, each representing a possible trade-off between conflicting objectives. These solutions are combined with newly generated offspring and then ranked using non-dominated sorting, which organizes them into layers (F1, F2, F3, etc.) based on Pareto dominance. The first front (F1) contains the best solutions, those not dominated by any other, while lower-ranked fronts (F2, F3, F4) represent progressively weaker alternatives. Since the population size must remain limited, NSGA-II applies an additional step known as crowding distance sorting to select which solutions survive to the next generation. This mechanism ensures that chosen solutions are not only of high quality but also well spread across the objective space, preventing the algorithm from converging to a narrow cluster of points⁵⁵. Part (b) shows this concept visually: the horizontal axis represents one cost (fuel cell and battery aging) and the vertical axis represents another (fuel consumption). The scattered black dots indicate dominated solutions, while the connected blue points represent the non-dominated fronts. On the first front (F1), consecutive solutions such as $i-1$, $i$, and $i+1$ are used to calculate crowding distance. If solution $i$ is far from its neighbors, it receives a larger crowding distance value and is more likely to be selected, as it contributes to maintaining diversity. Conversely, solutions packed too closely together may be rejected. Through this combination of non-dominated sorting and crowding distance sorting, NSGA-II ensures that the final Pareto front is both close to the true trade-off boundary and evenly distributed, offering decision-makers a wide set of balanced alternatives.
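
The two ranking steps described above can be sketched compactly. The code below is a simple quadratic-time illustration for minimization problems, not the bookkeeping-optimized "fast" sort of the original NSGA-II, and the sample objective values are invented for the example.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Group point indices into successive Pareto fronts (F1 first)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(points[j], points[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in one front; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[0])):
        order = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[order[0]][m], points[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        # Interior points: normalized gap between the two neighbors.
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (points[nxt][m] - points[prev][m]) / (hi - lo)
    return dist


if __name__ == "__main__":
    # (aging cost, fuel consumption) pairs, both minimized; values are made up.
    pts = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
    fronts = non_dominated_sort(pts)
    print(fronts, crowding_distance(pts, fronts[0]))
```

When the last admitted front does not fit entirely into the next generation, its members would be sorted by descending crowding distance and truncated.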

Fig. 10 — NSGA-II evolutionary process: **(a)** Non-dominated sorting and crowding distance, **(b)** pareto front representation with crowding distance.
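The two ranking steps described above can be sketched in a few lines of Python. This is a minimal illustration for minimization problems, not the authors' implementation; the function names and the five sample points are hypothetical.

```python
import math

def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(objs):
    """Peel off successive fronts F1, F2, ... as lists of indices."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

def crowding_distance(objs, front):
    """Sum of normalized gaps between each solution's neighbors, per objective."""
    dist = {i: 0.0 for i in front}
    for k in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][k])
        lo, hi = objs[ordered[0]][k], objs[ordered[-1]][k]
        # Boundary solutions are always kept.
        dist[ordered[0]] = dist[ordered[-1]] = math.inf
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][k] - objs[prev][k]) / (hi - lo)
    return dist

# Hypothetical (aging, fuel) pairs, both to be minimized.
population = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
fronts = non_dominated_sort(population)
distances = crowding_distance(population, fronts[0])
```

When a front does not fit entirely into the next generation, its members would be kept in order of decreasing crowding distance.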

In evolutionary algorithms such as NSGA-II, new solutions are generated using genetic operators, which recombine or perturb existing solutions to explore the search space. For instance, consider two parent solutions representing vehicle configurations: (40 kWh battery, 80 kW fuel cell) and (60 kWh battery, 100 kW fuel cell). A simple crossover could exchange their traits to produce (40, 100) and (60, 80), while a mutation step might slightly adjust one offspring to (42, 97), introducing diversity. In the proposed framework, more advanced operators are also considered. The Simulated Binary Crossover (SBX) blends parent values by a scaling factor, potentially producing an offspring such as (38, 90), whose components can lie between the parent values (90 kW) or slightly beyond them (38 kWh). The DE/rand/1 operator generates a new solution by adding a weighted difference between two parents to a third; for example, with a third parent (30, 70), (40, 80) + 0.8 × ((60, 100) − (30, 70)) = (64, 104), effectively exploring along directional vectors. The DE/rand/2 operator extends this idea by combining two such difference vectors, allowing the offspring to explore more aggressively, e.g., (40, 80) + 0.5 × ((60, 100) − (30, 70)) + 0.5 × ((55, 85) − (25, 60)) = (70, 107.5). Traditionally, such operators are applied using fixed probabilities, but this is inflexible because their usefulness changes across search stages. To address this, adaptive operator selection driven by a deep reinforcement learning agent dynamically chooses which operator to apply based on recent performance, ensuring that the most effective operators are emphasized while weaker ones are used less often.
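The two differential-evolution examples in this paragraph can be checked numerically. The sketch below uses NumPy and the (battery kWh, fuel cell kW) vectors quoted in the text; the function names are illustrative only.

```python
import numpy as np

def de_rand_1(x1, x2, x3, F):
    """Base vector plus one scaled difference vector."""
    return x1 + F * (x2 - x3)

def de_rand_2(x1, x2, x3, x4, x5, F):
    """Base vector plus two scaled difference vectors."""
    return x1 + F * (x2 - x3) + F * (x4 - x5)

base = np.array([40.0, 80.0])
child_1 = de_rand_1(base, np.array([60.0, 100.0]), np.array([30.0, 70.0]), F=0.8)
child_2 = de_rand_2(
    base,
    np.array([60.0, 100.0]), np.array([30.0, 70.0]),
    np.array([55.0, 85.0]), np.array([25.0, 60.0]),
    F=0.5,
)
# child_1 is approximately (64, 104); child_2 is approximately (70, 107.5).
```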

Hybrid NSGA-II and DQN framework for multi-objective optimization

This methodology enhances the NSGA-II algorithm with a DQN to enable the adaptive selection of genetic operators throughout the multi-objective optimization process. This hybrid NSGA-II–DQN framework effectively addresses the challenges of complex multi-objective problems by integrating evolutionary strategies with reinforcement learning. The DQN agent interacts dynamically with the evolving population—comprising decision variables and their corresponding objective function values—and selects the most appropriate genetic operators based on learned Q-values, which capture each operator’s historical performance.

Figure 11 shows how the proposed NSGA-II + Deep Q-Learning framework works step by step.

Evolution (top-left): The process begins with a group of candidate solutions (red and blue circles). These represent different possible designs or strategies. Special tools called operators (OP1, OP2, OP3) are used to generate new solutions by mixing and modifying the existing ones. The number of solutions first increases because many offspring are created, but later it is reduced again after only the best ones are selected.
Interaction (bottom-left): Here, an agent (the learning controller) communicates with the NSGA-II algorithm. The agent chooses which operator to apply (action), and NSGA-II evaluates the result, sending back a reward that reflects how good the new solutions are. This loop helps the system learn from trial and error.
Learning (bottom-right): All the information about the state of the population, the chosen operator, and the reward is stored in a memory called experience replay. From this memory, a Q-network (a neural network) learns to predict the usefulness of each operator in different situations. To make learning more stable, another copy of the network (target network) is also used. Together, they improve the accuracy of the Q-values, which measure how good each action is expected to be.
Decision (top-right): Finally, the system uses the learned Q-values to decide which operator (OP1, OP2, or OP3) should be applied next. This decision balances trying new options (exploration) with choosing the currently best operator (exploitation). The selected operator is then sent back to the Evolution stage, and the cycle repeats.

Figure 12 illustrates the proposed hybrid optimization framework designed for multi-objective tuning of the fuzzy EMS and component sizes. It begins with the initialization of a diverse population within defined bounds, followed by an evaluation of objective functions to assess solution quality. The algorithm employs NSGA-II's non-dominated sorting and crowding distance calculations to organize the population based on dominance. A pivotal aspect of the flowchart is the DQN's role in selecting genetic operators; instead of following traditional methods, the algorithm dynamically chooses operators based on learned Q-values that reflect historical performance. The flow also incorporates a feedback loop, where the DQN assesses whether improvements have been made and adjusts its selection strategy accordingly. By utilizing the Chebyshev aggregation method for credit assignment, the algorithm converts multiple objective values into a single scalar reward, ensuring effective exploration of the solution space. This adaptive mechanism allows the algorithm to refine its approach over iterations, progressively identifying and prioritizing the most effective genetic operators to optimize the overall fitness of the population.

Credit assignment strategy

In multi-objective optimization, assigning credit (reward) to offspring solutions requires aggregating multiple objective values into a single scalar. To address this, the Chebyshev aggregation method is employed. This method calculates the maximum weighted distance between the objective function values and a reference point, which is defined as the minimum value for each objective observed in the current population. Formally, the aggregation function for an individual $x$ is given by^56,57:

$$g(x \mid w, z^*) = \max_{1 \le j \le m} \; w_j \,\lvert f_j(x) - z_j^* \rvert$$

where $w_j$ is the weight assigned to the $j$-th objective, $f_j(x)$ is the value of the $j$-th objective function, $z_j^*$ is the reference point (minimum value) for the $j$-th objective, and $m$ is the number of objectives.
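The Chebyshev aggregation described above (maximum weighted distance to a reference point taken as the per-objective minimum of the current population) can be sketched as follows. The equal weights and the three sample objective vectors are assumptions for illustration; the subsequent mapping to a bounded reward is not reproduced here.

```python
import numpy as np

def chebyshev(objs, weights, ref=None):
    """Max weighted distance of each objective vector to the reference point."""
    objs = np.asarray(objs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if ref is None:
        # Reference point: best (minimum) value seen for each objective.
        ref = objs.min(axis=0)
    return np.max(weights * np.abs(objs - ref), axis=1)

# Hypothetical (obj1, obj2) values for three individuals.
population_objs = [[1.0, 4.0], [2.0, 2.0], [4.0, 1.0]]
g = chebyshev(population_objs, weights=[0.5, 0.5])
# The balanced individual (2, 2) gets the smallest aggregated value.
```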

The reward for each offspring $x_i$ is calculated as follows:

This formulation ensures that the reward is bounded between 0 and 1, where the highest value corresponds to the best-performing individual in the population. After credit assignment, the reward for each offspring is computed by comparing operator rewards and individual rewards, using:

This design emphasizes larger improvements over frequent small changes. Operator rewards are maintained in a sequence R, which stores recent operator selections paired with their corresponding rewards.

Operator candidate set

The hybrid NSGA-II-DQN framework employs multiple genetic operators selected from a candidate set to generate offspring:

1. Simulated Binary Crossover (SBX): This operator is efficient for handling multi-modal landscapes and is described by:

$$c_j = 0.5\left[(1+\beta)\,p_{1,j} + (1-\beta)\,p_{2,j}\right]$$

where $\beta$ is a scaling factor, $c_j$ is the $j$-th decision variable of the offspring, and $p_{1,j}$, $p_{2,j}$ are the $j$-th decision variables of the two parents.

2. Differential evolution operators: These operators are effective for dealing with complex variable associations.

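DE/rand/1 is defined by:

$$v = x_{r1} + F\,(x_{r2} - x_{r3})$$

where $F$ is a scaling factor, and $x_{r1}$, $x_{r2}$, $x_{r3}$ are the parent solutions.

DE/rand/2 is defined by:

$$v = x_{r1} + F\,(x_{r2} - x_{r3}) + F\,(x_{r4} - x_{r5})$$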
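For the SBX operator in the candidate set above, a common textbook formulation draws the scaling factor from a distribution controlled by an index η and produces two symmetric children. The sketch below follows that standard form; the distribution index `eta=15` and the sample parents are assumptions, not values reported in the paper.

```python
import numpy as np

def sbx(p1, p2, eta=15.0, rng=None):
    """Simulated binary crossover (standard form, no bound handling)."""
    rng = np.random.default_rng() if rng is None else rng
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    u = rng.random(p1.shape)
    # Spread factor: < 1 contracts toward the parents' mean, > 1 expands beyond them.
    beta = np.where(
        u <= 0.5,
        (2.0 * u) ** (1.0 / (eta + 1.0)),
        (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0)),
    )
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2

parent_a = [40.0, 80.0]   # (battery kWh, fuel cell kW), illustrative
parent_b = [60.0, 100.0]
child_a, child_b = sbx(parent_a, parent_b, rng=np.random.default_rng(0))
```

A useful property of this form is that the two children always share the parents' mean, so the operator redistributes rather than shifts the population.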

Adaptive operator selection strategy

Following the credit assignment, the DQN adaptively selects the most appropriate genetic operator based on the current population state (including decision variables and objective values). This state is input into the trained Q-network, which outputs Q-values corresponding to each genetic operator. Operator selection employs the epsilon-greedy policy:

where $\varepsilon_0$ is the starting exploration rate, $\varepsilon_{\min}$ is the minimum exploration rate, and $\lambda$ is the decay rate. The agent selects a random operator from the set with probability $\varepsilon$, or chooses the operator with the highest Q-value with probability $1-\varepsilon$. The operator selection based on the Q-values is given by:

$$op^* = \arg\max_{op \in O} Q(s, op)$$

where $O$ is the candidate set of genetic operators and $Q(s, op)$ is the Q-value for operator $op$ in the given state $s$.
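The selection rule above can be sketched as an epsilon-greedy choice over three operators. The exponential decay schedule and all numeric defaults below are assumptions chosen for illustration; the paper's exact schedule and settings are not reproduced here.

```python
import math
import random

OPERATORS = ["SBX", "DE/rand/1", "DE/rand/2"]

def epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.01):
    """Exploration rate after `step` selections (assumed exponential decay)."""
    return eps_min + (eps_start - eps_min) * math.exp(-decay * step)

def select_operator(q_values, eps, rng):
    """Random operator with probability eps, otherwise the arg-max of Q."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

rng = random.Random(0)
q_values = [0.1, 0.7, 0.2]            # hypothetical Q-network outputs
greedy_choice = select_operator(q_values, eps=0.0, rng=rng)
chosen_name = OPERATORS[greedy_choice]
```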

Overall process

The NSGA-II-DQN algorithm follows the standard NSGA-II process with the key modification of DQN-based operator selection. The process begins by initializing the population and performing the standard NSGA-II operations, such as selection, crossover, and mutation. However, instead of applying traditional crossover and mutation operations, the DQN selects the genetic operator for each offspring, guided by the learned Q-values. The detailed procedure of the NSGA-II-DQN algorithm with Chebyshev credit assignment is outlined in Algorithm 1.

In the proposed NSGA-II–DQN framework, the optimization process is initiated by generating a random population of candidate solutions and initializing a Q-network to guide operator selection. At each generation, the solutions are evaluated with respect to the defined objectives, and the current reference point is updated accordingly. The population is then ranked using non-dominated sorting and crowding distance, after which a compact state representation is constructed to characterize the distribution of solutions. Based on this state, an evolutionary operator is selected from a candidate set through an ε-greedy policy applied to the Q-network. Offspring are subsequently generated by applying the chosen operator, followed by mutation within the defined bounds. For each offspring, a reward is calculated through Chebyshev aggregation of the objective values, and the corresponding experience tuple (state, operator, reward, next state) is stored in the replay memory. The Q-network is iteratively updated by sampling mini-batches from the replay buffer, computing target values, and minimizing the temporal-difference error. Over successive generations, the exploration rate ε is decayed to encourage exploitation of the learned policy. Finally, environmental selection is performed to retain the top N solutions, and the Pareto archive is updated. This iterative process continues until the maximum number of generations is reached, at which point the final Pareto front and the trained Q-network are obtained.

Algorithm 1. NSGA-II-DQN
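The learning step of the procedure (store experience tuples, sample a mini-batch, compute targets from the target network) can be sketched without a deep-learning library. The class and function names are hypothetical; the buffer size and discount factor default to the values listed in Table 6, and the target-network outputs are passed in as a plain array.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-capacity memory of (state, operator, reward, next_state) tuples."""

    def __init__(self, capacity=10_000):
        self.memory = deque(maxlen=capacity)  # oldest entries are dropped first

    def push(self, state, action, reward, next_state):
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size, rng):
        return rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)

def td_targets(rewards, next_q_target, gamma=0.99):
    """y = r + gamma * max_a Q_target(s', a) for each sampled transition."""
    return np.asarray(rewards) + gamma * np.asarray(next_q_target).max(axis=1)

buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=step % 3, reward=0.1 * step, next_state=step + 1)
batch = buffer.sample(2, random.Random(0))

targets = td_targets([1.0, 0.0], [[0.0, 2.0, 1.0], [0.5, 0.1, 0.2]], gamma=0.5)
```

The Q-network would then be trained to reduce the squared difference between its prediction for the chosen operator and these targets.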

Optimization constraints and objective function formulation

The performance of an FCHEV is strongly dependent on the optimal design of three critical factors: component sizing, power management strategies, and driving conditions. These factors are interdependent, and their interactions must be carefully considered to achieve optimal efficiency. Consequently, a holistic optimization approach is required, where each component—such as the fuel cell, motor, and battery—along with the energy management strategy, is treated as a separate optimization problem. This approach focuses on two main objectives: reducing fuel consumption and minimizing the degradation of both the battery and fuel cell. Additionally, the required longitudinal performance constraints are incorporated. While this method enhances the overall efficiency of the FCHEV, it also increases the complexity of the simulation, as it intensifies the interactions between the different variables and subsystems within the optimization process.

This method establishes two objective functions along with two constraints. The first objective function (obj1) quantifies the total aging of both the fuel cell and battery, whereas the second objective function (obj2) reflects the cost associated with fuel consumption. The optimization aims to minimize both component aging and fuel consumption, even though these objectives may conflict with each other and with the dynamic performance requirements. The problem is formulated as a constrained, non-linear multi-objective optimization problem, as outlined below.

The degradation objective obj1 is formulated as a weighted average of the normalized battery and fuel-cell state-of-health terms, where each component is scaled by its reference value to ensure commensurate contributions. The weight factors for the fuel cell and battery, denoted as w1 and w2, are set to 1 and 2, respectively.

The second objective function (obj2) represents the fuel cell vehicle's fuel consumption, as outlined in Eq. (2). Similar to obj1, this objective function is normalized by the reference value of fuel consumption.

In multi-objective optimization problems, a single solution that optimizes all conflicting objectives simultaneously is generally unattainable. In the current study, efforts to improve fuel economy or reduce costs may increase battery and fuel cell aging. As a result, finding a balance among these competing objectives is essential. The set of solutions that achieve this balance is known as the Pareto front (see Fig. 13). Within the Pareto front, the "knee point" is taken as the preferred solution: it is the point whose distance to the line connecting the two extreme solutions is maximal. Every point on the Pareto front is non-dominated, meaning that no objective can be improved without worsening another; the knee point is distinguished by offering the most balanced trade-off, since moving away from it yields a small gain in one objective at the cost of a large loss in the other. This makes it a natural choice for decision-making in practical applications.

Fig. 13 — Pareto optimal solution of a two-objective FCHEV optimization.
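The knee-point rule (largest distance to the line joining the two extreme solutions) can be sketched for a two-objective front as below. The sample front is hypothetical, and in practice both objectives would typically be normalized first so that the distance is not dominated by one axis.

```python
import numpy as np

def knee_point(front):
    """Return the front member farthest from the line through the extremes."""
    pts = np.asarray(front, dtype=float)
    pts = pts[np.argsort(pts[:, 0])]          # order along the first objective
    a, b = pts[0], pts[-1]                    # the two extreme solutions
    ab = b - a
    # Perpendicular distance of every point to the line a-b (2-D cross product).
    cross = ab[0] * (pts[:, 1] - a[1]) - ab[1] * (pts[:, 0] - a[0])
    dist = np.abs(cross) / np.hypot(ab[0], ab[1])
    return tuple(float(v) for v in pts[np.argmax(dist)])

# Hypothetical (obj1, obj2) Pareto front.
pareto_front = [(1.0, 5.0), (2.0, 2.5), (3.0, 2.0), (5.0, 1.0)]
knee = knee_point(pareto_front)
```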

The powertrain components of the FCHEV were dimensioned based on performance benchmarks aligned with the industry-standard PNGV specifications³³, with key requirements summarized in Table 5.

Table 5.

FCHEV performance for optimal solution³³.

Constraints	Description	Value
Acceleration time (s)	0 to 97 km/h	Acc1 ≤ 12 s
	64 to 97 km/h	Acc2 ≤ 5.3 s
	0 to 137 km/h	Acc3 ≤ 23.4 s
Maximum speed (km/h)	0% road grade	Dis ≥ 136
Gradeability (%)	55 mph (88.5 km/h) at 6.5% grade	grd ≥ 6.5%


A fitness function is required to assess each candidate solution, enabling the NSGA-II-DQN framework to concurrently optimize component sizing, energy management strategy, and operational expenses in FCHEVs. This study defines the fitness function as the inverse of the objective functions. To ensure the solutions meet the required constraints, penalty functions are incorporated to penalize undesirable outcomes. Specifically, acceleration-related penalty functions are used to guide the optimization process toward feasible solutions, as outlined below.

Additionally, the penalty functions for gradeability and maximum speed are defined as follows:

In this study, the fitness function is formulated by incorporating the penalty functions into the objective function, as follows:

In this framework, the fitness function combines the objective function with the penalty function associated with each i-th constraint. A positive penalty coefficient determines the magnitude of the penalty for each limit.
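A penalty-based constraint handling scheme of this kind can be sketched as follows. The limits are taken from Table 5, but the relative-violation form of the penalty, the additive combination, the coefficient value, and the candidate performance figures are all assumptions made for illustration; they are not the paper's exact penalty functions.

```python
# (direction, limit) for each constraint in Table 5.
LIMITS = {
    "acc1_s": ("max", 12.0),     # 0 to 97 km/h time must not exceed 12 s
    "acc2_s": ("max", 5.3),      # 64 to 97 km/h
    "acc3_s": ("max", 23.4),     # 0 to 137 km/h
    "vmax_kmh": ("min", 136.0),  # top speed must reach the limit
    "grade_pct": ("min", 6.5),   # gradeability must reach the limit
}

def penalties(perf):
    """Relative violation of each constraint (0 when satisfied)."""
    out = {}
    for name, (kind, limit) in LIMITS.items():
        value = perf[name]
        excess = value - limit if kind == "max" else limit - value
        out[name] = max(0.0, excess / limit)
    return out

def penalized_objective(obj, perf, coeff=10.0):
    """Objective to minimize, inflated by the weighted sum of violations."""
    return obj + coeff * sum(penalties(perf).values())

# Hypothetical candidates: one feasible, one too slow from 0 to 97 km/h.
feasible = {"acc1_s": 10.0, "acc2_s": 5.0, "acc3_s": 22.0,
            "vmax_kmh": 140.0, "grade_pct": 8.0}
too_slow = dict(feasible, acc1_s=13.2)

cost_feasible = penalized_objective(0.6, feasible)
cost_too_slow = penalized_objective(0.6, too_slow)
```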

To evaluate and compare the performance of NSGA-II and NSGA-II-DQN, three performance indicators were employed, each designed to capture both convergence and diversity of the Pareto front. Convergence measures how closely the obtained solutions approach the true Pareto optimal set, while diversity assesses how uniformly these solutions are distributed across the objective space.

The first indicator is the Coverage Metric (CM), which quantifies the dominance relationship between two sets of non-dominated solutions. For two sets $A$ and $B$, $C(A, B)$ is defined as³¹:

$$C(A, B) = \frac{\lvert \{\, b \in B \mid \exists\, a \in A : a \preceq b \,\} \rvert}{\lvert B \rvert}$$

A higher value of $C(A, B)$ indicates that set $A$ dominates a greater portion of set $B$. If $C(A, B) > C(B, A)$, set $A$ is considered superior.
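The coverage metric can be computed directly from its definition: the fraction of one set that is weakly dominated by at least one member of the other. The sketch assumes minimization and uses two small hypothetical fronts.

```python
def weakly_dominates(a, b):
    """a is no worse than b in every objective (minimization)."""
    return all(x <= y for x, y in zip(a, b))

def coverage(A, B):
    """Fraction of B covered by at least one solution in A."""
    covered = sum(1 for b in B if any(weakly_dominates(a, b) for a in A))
    return covered / len(B)

front_a = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
front_b = [(2.0, 5.0), (3.0, 3.0), (5.0, 0.5)]
c_ab = coverage(front_a, front_b)
c_ba = coverage(front_b, front_a)
```

Because the metric is not symmetric, both directions are usually reported, as in the paired NSGA-II and NSGA-II-DQN columns of a comparison table.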

The second indicator is the Hypervolume (HV), which jointly measures convergence and diversity. HV corresponds to the Lebesgue measure of the portion of the objective space dominated by the Pareto front and bounded by a reference point:

$$HV(A) = \Lambda\!\left( \bigcup_{i \in A} v_i \right)$$

where $\Lambda$ denotes the Lebesgue measure and $v_i$ represents the hypervolume of the region dominated by solution $i \in A$. Larger HV values indicate better convergence and wider coverage of the objective space.
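For two objectives the hypervolume reduces to the area of a staircase region, which can be accumulated in a single sweep. This is a sketch for minimization problems; the front and the reference point (5, 5) are hypothetical.

```python
def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded by `ref` (two objectives, minimization)."""
    # Keep only points that are strictly better than the reference point.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:                      # skip dominated points
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

sample_front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
hv = hypervolume_2d(sample_front, ref=(5.0, 5.0))
hv_with_dominated = hypervolume_2d(sample_front + [(3.0, 3.0)], ref=(5.0, 5.0))
```

Adding a dominated point leaves the value unchanged, which is why HV rewards only genuine improvements of the front.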

The third indicator is the Spacing Metric (SM), which evaluates the uniformity of solution distribution. It is defined as:

where $d_i$ is the minimum distance between solution $i$ and any other solution in $A$:

Here $\bar{d}$ is the mean of all $d_i$. Smaller SM values indicate a more uniform distribution of solutions along the Pareto front.
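A widely used form of the spacing metric (Schott's spacing) takes the nearest-neighbor distance of each solution and reports the sample standard deviation of those distances. The sketch below assumes that form with Manhattan distance in objective space; the paper's exact normalization may differ.

```python
import numpy as np

def spacing(front):
    """Standard deviation of nearest-neighbor distances along the front."""
    pts = np.asarray(front, dtype=float)
    # Pairwise Manhattan distances in objective space.
    pairwise = np.abs(pts[:, None, :] - pts[None, :, :]).sum(axis=2)
    np.fill_diagonal(pairwise, np.inf)        # ignore distance to itself
    d = pairwise.min(axis=1)
    return float(np.sqrt(((d.mean() - d) ** 2).sum() / (len(d) - 1)))

even_front = [(0.0, 3.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
uneven_front = [(0.0, 3.0), (0.2, 2.8), (2.0, 1.0), (3.0, 0.0)]
sm_even = spacing(even_front)
sm_uneven = spacing(uneven_front)
```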

Optimization of powertrain components in FCHEV

To optimize the FCHEV’s powertrain components, three critical design variables are considered: the size coefficients of the fuel cell stacks, the electric motor, and the battery modules. The optimization process systematically adjusts these variables to minimize fuel consumption and reduce degradation of both the battery and fuel cell while ensuring compliance with the longitudinal performance constraints specified in Table 5. The values and parameter ranges used in this optimization are detailed in Table 6.

Table 6.

Values and ranges of parameters and variables.

	Parameter or variable	Value
Sizing Parameters	Initial coefficient of the number of fuel cell stacks (K-FC)	1
	Initial Coefficient for the Number of Battery Modules (K-BA)	1
	Initial coefficient for the electric motor (K-EM)	1
	fuel cell stacks coefficient range	[0.46, 1.4]
	Electric motor torque coefficient range	[0.5, 1.5]
	Battery capacity coefficient range	[0.7, 1.5]
NSGA-II Parameters	Population size	10
	Generation	5
	Mutation Rate	0.1
	Crossover Rate	0.9
RL Parameters	Training Episodes	15
	Steps per Episode	5
	Network hidden layer size	128
	Batch Size	64
	Buffer Size	10,000
	Discount factor	0.99
	Learning rate	1e-3


The process begins with formulating cost functions that guide the NSGA-II-DQN algorithm in iteratively refining the sizing coefficients of the powertrain components. By repeatedly simulating the driving cycle and minimizing the defined cost function, the algorithm progressively optimizes the parameters, resulting in improved EMS, enhanced overall efficiency, reduced operational costs, and extended battery and fuel cell system durability.

Simultaneous optimization of sizing and EMS

Since this study aims to simultaneously optimize both the control strategy and the powertrain component sizing, the next phase focuses on analyzing the interaction between these two systems through concurrent optimization. This approach seeks to identify optimal component configurations alongside the most effective control strategy variables.

In this context, the paper defines five membership functions for each input variable—namely, power demand (P-req) and SOC—and seven membership functions for the output variable (FC-POWER). The fuzzy membership functions for both inputs and outputs, constructed using Fuzzy Type-2 logic, are illustrated in Fig. 14. The detailed fuzzy rules governing the system are presented in Table 7.

Fig. 14 — Inputs and output non-optimized membership functions of Type-2 fuzzy logic for FCHEV.

Table 7.

Fuzzy rules for EMS of FCHEV.

SOC (rows) / Required Power P-req (columns)	S	RS	M	RB	B
S	S	RS	M	RB	B
RS	VS	S	RS	M	RB
M	VS	VS	S	RS	M
RB	VS	VS	VS	S	RS
B	VS	VS	VS	VS	S
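The rule base of Table 7 can be held as a simple lookup from an (SOC label, P-req label) pair to the fuel cell power label. The sketch below only encodes the table; the helper name is hypothetical and no membership-function evaluation or Type-2 inference is attempted.

```python
PREQ_LABELS = ["S", "RS", "M", "RB", "B"]

# Rows: SOC label; columns: P-req label, in the order of PREQ_LABELS.
RULES = {
    "S":  ["S",  "RS", "M",  "RB", "B"],
    "RS": ["VS", "S",  "RS", "M",  "RB"],
    "M":  ["VS", "VS", "S",  "RS", "M"],
    "RB": ["VS", "VS", "VS", "S",  "RS"],
    "B":  ["VS", "VS", "VS", "VS", "S"],
}

def rule_output(soc_label, preq_label):
    """Fuel cell power label fired by one (SOC, P-req) rule."""
    return RULES[soc_label][PREQ_LABELS.index(preq_label)]

low_soc_high_demand = rule_output("S", "B")
high_soc_low_demand = rule_output("B", "S")
```

Read this way, the table shifts the commanded fuel cell power down by one label for each step up in SOC, bottoming out at VS.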


The objective remains unchanged, as specified in Eqs. (25) and (26), and the sizing configurations are consistent with those presented in Table 6. Table 8 details the initial optimization values for the inputs and outputs, along with their respective optimization ranges.

Table 8.

Initial optimization values and ranges for inputs and outputs.

	Index	Original Value	Fuzzy Value (Min)	Lower Lag	Lower Scale
Input 1 (P-req)	MF1	2500	[0, 7500]	[0.2, 0.5]	[0.7, 1]
	MF2	7500	[7500, 12500]	[0.2, 0.5]	[0.7, 1]
	MF3	12,500	[12500, 17500]	[0.2, 0.5]	[0.7, 1]
	MF4	17,500	[17500, 22500]	[0.2, 0.5]	[0.7, 1]
	MF5	22,500	[22500, 80000]	[0.2, 0.5]	[0.7, 1]
Input 2 (SOC)	MF1	0.6	[0.4, 0.65]	[0.2, 0.5]	[0.7, 1]
	MF2	0.65	[0.65, 0.7]	[0.2, 0.5]	[0.7, 1]
	MF3	0.7	[0.7, 0.75]	[0.2, 0.5]	[0.7, 1]
	MF4	0.75	[0.75, 0.8]	[0.2, 0.5]	[0.7, 1]
	MF5	0.8	[0.8, 1]	[0.2, 0.5]	[0.7, 1]
Output (P-FC)	MF1	2500	[0, 7500]	[0.2, 0.5]	[0.7, 1]
	MF2	7500	[7500, 12500]	[0.2, 0.5]	[0.7, 1]
	MF3	12,500	[12500, 17500]	[0.2, 0.5]	[0.7, 1]
	MF4	17,500	[17500, 22500]	[0.2, 0.5]	[0.7, 1]
	MF5	22,500	[22500, 27500]	[0.2, 0.5]	[0.7, 1]
	MF6	27,500	[27500, 32500]	[0.2, 0.5]	[0.7, 1]
	MF7	32,500	[32500, 70000]	[0.2, 0.5]	[0.7, 1]


To perform optimization using NSGA-II-DQN, 62 variables must be adjusted (see Fig. 12). Among these variables, the X variables correspond to the required power input, the Y variables represent the battery SOC, and the Z variables indicate the output of the Type-2 fuzzy controller, specifically the required power for the fuel cell.

Variables such as X10, X20, …, and X50 are assigned as optimization variables for the membership functions numbered one through five. Similarly, variables X11, X21, …, and X51 represent the lower scale values for these membership functions, while X12, X22, …, and X52 correspond to the lower lag values. This numbering convention is applied consistently to the second input (Y variables) and the output (Z variables).

The optimization process begins by defining a cost function that directs the NSGA-II-DQN algorithm to fine-tune the fuzzy controller’s coefficients iteratively. Through repeated simulations of the driving cycle aimed at minimizing this cost function, the 62 fuzzy logic parameters previously described are systematically refined. Simultaneously, the sizing parameters for the fuel cell stacks (S-FC), battery modules (S-BA), and electric motor (S-EM) are optimized in parallel. Figure 15 illustrates the optimization process for the FCHEV, combining Type-2 fuzzy logic with NSGA-II-DQN algorithms. This figure offers a detailed overview of the calculations involved in fuel consumption and battery management.

Fig. 15 — Schematic of simultaneous optimization of EMS strategy and sizing for FCHEV.

HIL configuration for FCHEV

To validate the proposed control strategy and powertrain optimization, Fig. 16 presents a comprehensive schematic of the HIL testing setup described in this article. The fuel cell vehicle model runs in real-time on a host computer, enabling dynamic simulation. Digital data, specifically the battery SOC and the vehicle's power demand (P-req), are transmitted from Simulink to the Advantech PCI-1711 data acquisition card, which interfaces with the computer's motherboard via PCI. This data is then converted into analog signals and sent to the STM32F7 hardware for further processing.

Several key factors motivate the selection of an analog communication protocol between the host computer and the STM32F7 hardware: compatibility with existing sensor technologies, cost-effectiveness, real-time data processing capabilities, robustness against interference, and seamless integration with digital systems.

Furthermore, the Type-2 fuzzy logic algorithm, optimized using the NSGA-II-DQN approach, has been implemented in C++ on the STM32F7 hardware. The output of this fuzzy controller, representing the fuel cell's required power (P-FC), is converted into a Pulse Width Modulation (PWM) signal and passed through a first-order low-pass filter to generate a stable analog signal. This analog output is then fed into the analog input channel of the Advantech data acquisition card, enabling real-time execution of the simulation model in conjunction with the hardware. Detailed specifications for the Advantech PCI-1711 data acquisition card and the STM32F7 hardware are provided in Table 9.
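The PWM-to-analog step can be illustrated with a short simulation of a first-order low-pass filter driven by a PWM waveform. The 3.3 V logic level matches the operating voltage in Table 9; the PWM frequency, filter time constant, and simulation step are hypothetical values, not those of the test bench.

```python
def filtered_pwm(duty, v_high=3.3, f_pwm=10_000.0, tau=5e-3, dt=1e-6, t_end=0.05):
    """Simulate y' = (u - y) / tau with forward Euler, u being the PWM signal."""
    steps_per_period = round(1.0 / (f_pwm * dt))
    on_steps = round(duty * steps_per_period)
    y, trace = 0.0, []
    for k in range(round(t_end / dt)):
        # PWM output is high for the first `on_steps` samples of each period.
        u = v_high if (k % steps_per_period) < on_steps else 0.0
        y += dt / tau * (u - y)
        trace.append(y)
    return trace, steps_per_period

trace, period = filtered_pwm(duty=0.5)
# Average over the final PWM period, after the filter has settled.
mean_output = sum(trace[-period:]) / period
ripple = max(trace[-period:]) - min(trace[-period:])
```

With these assumed values the filtered output settles near the duty cycle times the logic level, and the residual ripple shrinks as the time constant grows relative to the PWM period, at the cost of a slower response.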

Table 9.

Technical specifications of the STM32F746ZG microcontroller and Advantech PCI-1711/PCI-1723 data acquisition cards.

STM32F746ZG		Advantech PCI-1711		Advantech PCI-1723
Component	Specification	Component	Specification	Specification
System on a Chip	ARM® 32-bit Cortex®-M7	I/O Connector	1 × 68-pin SCSI female connector	1 × 68-pin SCSI female connector
Clock speed	216 MHz max CPU frequency	Analog inputs	16 single-ended, 12 bits	None
SRAM	320 KB	Analog outputs	2-channel, 12 bits	8-channel, 16 bits
GPIOs	(168) with external interrupt capability	Analog output range	0 ~ 5 V, 0 ~ 10 V	−10 ~ 10 V
ADC (numbers)	12-bit ADCs with 24 channels (3)	Digital input	16-channel	16-channel
DAC (numbers)	12-bit DAC channels (2)	Digital output	16-channel	16-channel
Operating voltage	3.3 V
Other Transport Protocols (numbers)	USART/UART (8), I2C (4), SPI (6)


Results and discussion

This section will investigate the impact of simultaneous Sizing and EMS optimization. The analysis begins with evaluating co-simulation results based on the real-world traffic cycle. Subsequently, the influence of varying traffic conditions, such as highway and congested scenarios, on the optimization process will be explored. The next step involves discussing the optimization outcomes for different driving cycles, including UDDS and WLTP Class 3. Following that, the performance of the optimized controller will be assessed through grade testing. Finally, HIL simulations will be conducted to examine system behavior under diverse traffic conditions.

Simultaneous optimization for real-world traffic driving cycle

Figure 17 and Table 10 compare NSGA-II and NSGA-II-DQN across the three cases. In the Sizing case, both methods achieve very similar Pareto fronts, but NSGA-II-DQN demonstrates slightly better convergence, more uniform spacing, and marginally higher hypervolume. In the EMS case, the advantage of NSGA-II-DQN becomes more evident, as it achieves broader coverage of solutions, larger hypervolume, and improved spacing. The superiority of NSGA-II-DQN is most pronounced in the combined Sizing + EMS case, where it dominates nearly all NSGA-II solutions and simultaneously delivers higher diversity and a more uniform distribution. This significant improvement arises because simultaneous optimization greatly increases problem complexity, introducing stronger trade-offs between design and operational objectives; while NSGA-II relies on fixed operators that may stagnate, NSGA-II-DQN dynamically adapts operator selection through reinforcement learning, enabling it to explore effectively in early stages and converge more efficiently in later stages.

Fig. 17 — Pareto front comparison between NSGA-II and NSGA-II-DQN: **(a)** sizing optimization, **(b)** EMS optimization, and **(c)** simultaneous sizing + EMS optimization.

Table 10.

Comparison of NSGA-II and NSGA-II-DQN using coverage metric, hypervolume, and spacing metric across sizing, EMS, and sizing + EMS optimization cases.

	Sizing		EMS		Sizing + EMS
	NSGA-II	NSGA-II-DQN	NSGA-II	NSGA-II-DQN	NSGA-II	NSGA-II-DQN
CM	0.33	0.4	0.12	0.63	0	0.92
HV	0.000165	0.000170	0.0102	0.0117	0.0056	0.0114
SM	0.000636	0.000229	0.00579	0.00549	0.00767	0.00381


To evaluate the performance of NSGA-II against NSGA-II–DQN in terms of convergence speed and solution coverage, Fig. 18 presents the convergence behavior of both algorithms during simultaneous EMS and sizing optimization. The x-axis denotes the convergence thresholds (50%, 75%, 90%, 95%, and 99%), while the y-axis represents the number of episodes required to achieve each threshold. The results demonstrate that NSGA-II–DQN consistently requires fewer episodes, particularly at higher thresholds, underscoring its superior efficiency and faster convergence compared to the conventional NSGA-II.

By incorporating reinforcement learning into the optimization process, the quality of results can be further enhanced; however, it is equally important to recognize that minimizing fuel cell aging, battery degradation, and fuel consumption requires the simultaneous optimization of component sizing and EMS. Although EMS optimization alone can reduce aging and improve efficiency, treating it in isolation neglects the strong interdependence between system design and operational strategy. Figures 19(a) to 19(c) present the Pareto fronts for sizing, EMS, and combined sizing + EMS optimization using NSGA-II-DQN, with the corresponding knee points highlighted. The knee point solutions indicate balanced trade-offs, where further improvement in one objective leads to a significant compromise in the other. Importantly, the simultaneous EMS + sizing optimization achieves a knee point that clearly outperforms the individual cases, demonstrating a superior balance between objectives. This improvement arises because component sizing and EMS are inherently coupled: the effectiveness of an EMS strategy depends on the available system capacities, while the optimal sizing configuration is strongly influenced by the way energy is managed during operation.

Fig. 19 — Optimization plots for real-world traffic cycle: **(a)** Pareto solutions for sizing optimization, **(b)** Pareto solutions for EMS optimization, **(c)** Pareto solutions for co-simulation sizing and EMS optimization, **(d)** battery SOC, **(e)** fuel cell power demand, **(f)** FFT magnitude of fuel cell power.

The integrated sizing and EMS optimization approach delivers substantial gains in both energy efficiency and component durability, as summarized in Table 11. In particular, the combined strategy achieves a reduction in equivalent fuel consumption (Eq-Fuel) of about 21% relative to the non-optimized baseline, surpassing the reductions obtained from sizing-only (14%) and EMS-only (10%) optimizations. These results highlight that AI-based co-optimization not only enhances EMS performance but also generates significant additional benefits when component sizing and EMS are addressed simultaneously.

Table 11.

FCHEV optimization results for real-world traffic cycle.

		Non-optimized	Sizing	EMS	Sizing + EMS
Sizing Scales	S-FC	1	0.77	1	0.67
	S-BA	1	1.42	1	1.39
	S-EM	1	0.75	1	0.74
Fuel Consumption	FC-Fuel (gr/100 km)	788	680	670	601
Fuel Consumption	Eq-Fuel (gr/100 km)	818	705	734	648
Degradation	Q-FC (W)	3.0	2.3	1.9	1.6
Degradation	Q-BA	0.0149	0.0139	0.0152	0.0143
Vehicle Objective and Constraints	Obj1	-	1.86	1.87	1.70
	Obj2	-	0.69	0.68	0.61
	Acc1 (s)	8.1	10.4	8.1	10.3
	Acc2 (s)	4.1	5.3	4.1	5.2
	Acc3 (s)	16.6	22.6	16.6	22.5
	Spd_max (km/h)	119.0	107.6	119.0	107.1
	Grd (%)	21.8	16.3	21.8	16.5


Analysis of the battery degradation index (Q-BA) highlights the trade-offs between different optimization strategies. Compared to the non-optimized baseline (0.0149), the EMS-only strategy increases battery degradation by about 2% (0.0152), since the oscillatory power demand is shifted from the fuel cell to the battery. This transfer is confirmed by Fig. 19(d), where the SOC trajectory under EMS exhibits sharper declines, indicating that the battery is more frequently tasked with compensating for load variations. The reason for this behavior is that EMS effectively reduces fuel cell degradation (Q-FC decreases from 3.0 to 1.9 W), thereby extending fuel cell lifespan at the expense of accelerated battery aging. When EMS is combined with sizing, however, battery degradation is alleviated (0.0143), as the enlarged battery capacity allows the EMS to distribute power more evenly and limit deep SOC fluctuations, as also visible in the smoother SOC profile of Fig. 19(d). Interestingly, the sizing-only case achieves the lowest battery degradation (0.0139), suggesting that although EMS + sizing provides a more balanced compromise between fuel cell and battery health, the stronger emphasis on protecting the fuel cell in the combined strategy leads to a partial transfer of power oscillations back to the battery compared with the sizing-only solution.

The combined AI optimization markedly improved the fuel cell lifespan (Q-FC), with degradation reduced by about 30% compared to the sizing-only case (from 2.3 to 1.6 W). These findings underscore the advantage of integrating physical component resizing with intelligent control strategies, demonstrating that such a holistic approach yields more comprehensive benefits than either strategy in isolation. As illustrated in Fig. 19(e), the AI-based simultaneous optimization minimizes power fluctuations, a primary driver of component degradation, more effectively than the other methods.

The FFT magnitude plot in Fig. 19(d) provides further insights, presenting the frequency content of power signals on a logarithmic scale. The combined optimization reduces power spectral density at low frequencies, indicating smoother power demand and diminished transient stress on powertrain components. The EMS optimization successfully reduces fuel cell power fluctuations, shifting this variability to the battery. While this trade-off is evident in the EMS-only optimization (Table 11), the combined EMS + sizing approach introduces a critical design adaptation: the battery size is increased specifically to accommodate these redirected power fluctuations. This strategic sizing compensation maintains system stability while achieving smoother fuel cell operation, demonstrating how component sizing and energy management must be co-optimized to handle power distribution challenges effectively.
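As a rough illustration of the kind of spectrum shown in Fig. 19(d), the sketch below computes a single-sided magnitude spectrum of a sampled power trace with a plain discrete Fourier transform. The two traces, the sampling rate, and the helper name `dft_magnitude` are hypothetical placeholders rather than the authors' data or code, and a practical analysis would normally use an FFT library instead of this O(n²) loop. It only shows that a trace with a smaller ripple produces a smaller magnitude at the ripple frequency.

```python
import cmath
import math

def dft_magnitude(signal, fs):
    """Single-sided DFT magnitude of a real signal sampled at fs (Hz)."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        # DC and Nyquist bins appear once; all other bins are doubled
        # to fold in their negative-frequency counterparts.
        is_edge = k == 0 or (n % 2 == 0 and k == n // 2)
        mags.append(abs(acc) * (1.0 if is_edge else 2.0) / n)
        freqs.append(k * fs / n)
    return freqs, mags

FS = 1.0    # Hz, assumed sampling rate
N = 64      # number of samples (hypothetical)
F0 = 0.125  # Hz, frequency of the synthetic power ripple

# Synthetic fuel cell power traces in W: same mean, different ripple amplitude.
rough = [10_000 + 2_000 * math.sin(2 * math.pi * F0 * i / FS) for i in range(N)]
smooth = [10_000 + 500 * math.sin(2 * math.pi * F0 * i / FS) for i in range(N)]

freqs, rough_mag = dft_magnitude(rough, FS)
_, smooth_mag = dft_magnitude(smooth, FS)

k = round(F0 * N / FS)  # bin index of the ripple frequency
for label, mag in (("rough", rough_mag), ("smooth", smooth_mag)):
    print(f"{label}: |P| at {freqs[k]:.3f} Hz = {mag[k]:.1f} W "
          f"({20 * math.log10(mag[k]):.1f} dB)")
```

Plotting the magnitudes in decibels, as in the last line, corresponds to the logarithmic scale used for the figure.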

However, these gains in fuel efficiency and durability come with trade-offs in dynamic vehicle performance. Acceleration times across various speed intervals increase significantly under sizing optimization, with the combined AI approach showing more than a 35% increase. Additionally, maximum vehicle speed declines by approximately 10% and gradeability by approximately 24% (from 21.8% to 16.5%), highlighting that downsizing powertrain components limits peak performance. In contrast, EMS-only optimization preserves acceleration and top-speed performance close to their initial values, demonstrating that intelligent EMS can achieve efficiency gains without compromising drivability.

The optimization of the Type-2 fuzzy EMS system leads to distinct modifications in the membership functions, depending on the optimization strategy employed. When only the control strategy is optimized, the membership functions undergo targeted adjustments to refine rule activation to prolong the fuel cell’s lifespan—without altering system sizing parameters—illustrated in Fig. 20(a). In contrast, AI simultaneous optimization of the control strategy and component sizing results in more substantial modifications to the membership functions. These changes reflect the enhanced operational flexibility enabled by the resized components, such as the battery and fuel cell, as shown in Fig. 20(b).

Fig. 20 — Results of EMS and sizing optimization: **(a)** optimized Type-2 fuzzy membership functions under EMS optimization, **(b)** optimized Type-2 fuzzy membership functions under co-simulation sizing and EMS, **(c)** electric motor operating points for sizing optimization, **(d)** electric motor operating points for EMS optimization, **(e)** electric motor operating points for co-simulation sizing and EMS optimization.

The analysis of electric motor efficiency maps provides further evidence of these improvements. The sizing-only and combined sizing and EMS optimization strategies shift the motor’s operating points closer to regions of maximum torque and peak efficiency. This is demonstrated by the dense clustering of operating points within the high-efficiency zones, as illustrated in Figs. 20(c), 20(d), and 20(e). These shifts indicate a more efficient utilization of the electric motor, contributing to overall system performance enhancements.

Traffic condition effects

Fuel consumption varies significantly across driving scenarios under the AI co-simulation framework combining EMS and sizing optimization. In the Highway condition, fuel consumption decreases by approximately 36% relative to the combined scenario, reflecting the more efficient and steady-state driving typical of highway environments, as shown in Figs. 21(a) and 21(b). In contrast, the Congested scenario results in a fuel consumption increase of about 48% compared to the combined case, highlighting the elevated energy demands of the frequent stops and accelerations characteristic of congested traffic, as depicted in Fig. 21(c).

Fig. 21 — Spider plots for real-world traffic cycle: **(a)** total cycle results, **(b)** highway cycle results, **(c)** congested cycle results.

A detailed comparative analysis of fuel consumption and equivalent fuel metrics under the combined AI sizing and EMS approach reveals substantial improvements, particularly in the Highway scenario (see Table 12). Specifically, fuel cell consumption is reduced by approximately 16.4% and 19.5% relative to the EMS-only and Sizing-only strategies, respectively, while equivalent fuel consumption decreases by 15.2% and 11.1%. In contrast, the Congested scenario yields more modest gains, with equivalent fuel consumption decreasing by up to 3.1%.
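The Highway percentages quoted above can be reproduced from the Highway columns of Table 12. The following is a minimal sketch of that arithmetic; the helper name `pct_reduction` and the dictionary layout are chosen here purely for illustration.

```python
def pct_reduction(reference, value):
    """Percentage by which `value` is lower than `reference`."""
    return 100.0 * (reference - value) / reference

# Highway values taken from Table 12 (Sizing, EMS, Sizing + EMS).
fc_fuel = {"sizing": 475, "ems": 458, "combined": 383}
eq_fuel = {"sizing": 522, "ems": 547, "combined": 464}

for name, table in (("FC-Fuel", fc_fuel), ("Eq-Fuel", eq_fuel)):
    for ref in ("ems", "sizing"):
        change = pct_reduction(table[ref], table["combined"])
        print(f"{name}, Sizing + EMS vs {ref}: {change:.1f}% lower")
```

Because the table entries are rounded, the recomputed values can differ from the quoted ones in the last digit.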

Table 12.

FCHEV optimization results for highway and congested driving cycles.

Optimization
	Sizing			EMS			Sizing + EMS
	Combined	Highway	Congested	Combined	Highway	Congested	Combined	Highway	Congested
S-FC	0.77	0.59	1.01	1	1	1	0.67	0.96	0.98
S-BA	1.42	1.42	1.42	1	1	1	1.39	1.39	1.4
S-EM	0.75	0.72	0.98	1	1	1	0.74	0.8	0.89
FC-Fuel	680	475	932	670	458	892	601	383	897
Eq-Fuel	705	522	847	734	547	827	648	464	821
Q-FC	2.3	1.2	0.3	1.9	1	0.31	1.6	0.55	0.31
Q-BA	0.014	0.0125	0.0118	0.015	0.0133	0.0118	0.014	0.0128	0.0117
Obj1	1.86	1.49	1.2471	1.87	1.539	1.2477	1.70	1.3938	1.2355
Obj2	0.69	0.29	0.12472	0.68	0.2821	0.1186	0.61	0.2358	0.1193
Acc1	10.4	10.4	8.45	8.1	8.1	8.1	10.3	10.3	9.3
Acc2	5.3	5.3	4.25	4.1	4.1	4.1	5.2	5.25	4.7
Acc3	22.6	22.8	17.4	16.6	16.6	16.6	22.5	22.1	19.45
Spd_Max	107.6	106.2	117.6	119	119	119	107.1	109.8	113.8
Grd	16.3	16.4	20.67	21.8	21.8	21.8	16.5	16.6	18.7


Additionally, acceleration metrics (Acc1, Acc2, and Acc3) in the congested scenario improve by 10–17% under the Sizing and EMS strategy compared to individual optimization methods. This indicates a notable enhancement in vehicle responsiveness, which is crucial for navigating stop-and-go traffic conditions. Component sizing analysis reveals that the electric motor size (S-EM) increases by approximately 11% in congested conditions compared to highway driving. This adjustment reflects the higher power requirements for frequent acceleration and deceleration typical in congested traffic.

Driving cycles effects

The AI multi-objective optimization results for the FCHEV powertrain components show distinct performance improvements when applying sizing-only, EMS-only, and combined AI sizing and EMS optimization strategies under two driving cycles: UDDS and WLTP Class 3. The Pareto front plots in Figs. 22(a)-(c) for UDDS and Figs. 22(d)-(f) for WLTP Class 3 demonstrate that AI simultaneous sizing and EMS optimization outperforms the individual approaches by achieving a more favorable trade-off between the two cost objectives. This is evidenced by the lower and more tightly clustered knee points in the combined optimization cases.
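To make the notions of Pareto front and knee point concrete, the sketch below filters a set of two-objective candidates down to the non-dominated ones and then picks a knee as the front point farthest from the straight line joining the two extreme solutions. This is only one common knee definition, assumed here for illustration; the candidate values and the function names `pareto_front` and `knee_point` are hypothetical, and the objectives are left unnormalized for simplicity.

```python
import math

def pareto_front(points):
    """Non-dominated points when both objectives are minimized, sorted by objective 1."""
    front = []
    for p in points:
        # q dominates p if it is no worse in both objectives and differs from p.
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return sorted(set(front))

def knee_point(front):
    """Front point with the largest distance to the line through the two extremes."""
    if len(front) == 1:
        return front[0]
    (x1, y1), (x2, y2) = front[0], front[-1]
    norm = math.hypot(x2 - x1, y2 - y1)

    def dist(p):
        return abs((y2 - y1) * p[0] - (x2 - x1) * p[1] + x2 * y1 - y2 * x1) / norm

    return max(front, key=dist)

# Hypothetical (Obj1, Obj2) pairs, not taken from the paper.
candidates = [(1.70, 0.90), (1.75, 0.70), (1.85, 0.62),
              (2.00, 0.60), (1.90, 0.80), (1.80, 0.95)]

front = pareto_front(candidates)
knee = knee_point(front)
print("front:", front)
print("knee:", knee)
```

In this toy set, the last two candidates are dominated and drop out of the front, and the knee is the solution that gives up little in the first objective for a large gain in the second.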

Fig. 22 — Pareto solutions for UDDS and WLTP Class 3 driving cycles: **(a)** UDDS sizing optimization, **(b)** UDDS EMS optimization, **(c)** UDDS co-simulation sizing and EMS optimization, **(d)** WLTP Class 3 sizing optimization, **(e)** WLTP EMS optimization, **(f)** WLTP Class 3 co-simulation sizing and EMS optimization.

Figures 23(a) and 23(b) show the SOC profiles, which support the improved battery management achieved by the combined AI Sizing + EMS strategy. This approach maintains higher and more stable SOC levels than the other methods, indicating enhanced energy efficiency and improved battery longevity, consistent with the findings for the experimental driving cycle.

Fig. 23 — SOC analysis for different driving cycles: **(a)** UDDS, **(b)** WLTP Class 3.

The AI EMS and sizing optimization consistently delivers the most significant reductions in fuel consumption for both the UDDS and WLTP Class 3 cycles, as summarized in Table 13. Relative to the non-optimized baseline, the AI EMS and Sizing approach reduces fuel consumption by approximately 22% for UDDS and 24% for WLTP Class 3. Compared to sizing-only optimization, it provides an additional reduction of about 18% for UDDS and 15% for WLTP Class 3. Moreover, compared to EMS-only optimization, the EMS + Sizing strategy yields further improvements of roughly 10% for UDDS and 14% for WLTP Class 3.

Table 13.

FCHEV optimization results for UDDS and WLTP Class3 driving cycles.

	UDDS				WLTP Class3
	Non-optimized	Sizing	EMS	Sizing + EMS	Non-optimized	Sizing	EMS	Sizing + EMS
S-FC	1	0.57	1	1.08	1	0.98	1	1.06
S-BA	1	1.38	1	1.40	1	1.40	1	1.40
S-EM	1	0.71	1	0.82	1	0.79	1	0.80
FC-Fuel	755	638	646	588	992	873	829	755
Eq-Fuel	777	661	696	631	994	874	862	779
Q-FC	3.2	2.0	1.5	1.3	5.2	4.1	1.9	1.4
Q-BA	0.0157	0.0144	0.0154	0.0144	0.0177	0.016	0.0165	0.0163
Obj1	-	1.83	1.84	1.70	-	2.41	2.03	1.91
Obj2	-	0.76	0.77	0.71	-	2.03	1.92	1.75
Acc1	8.1	10.4	8.1	10.4	8.1	10.5	8.1	10.5
Acc2	4.1	5.3	4.1	5.2	4.1	5.3	4.1	5.3
Acc3	16.6	22.9	16.6	22.0	16.6	22.4	16.6	22.3
Spd_max	119.0	106.0	119.0	111.0	119.0	109.3	119.0	110.0
Grd	21.8	16.3	21.8	16.7	21.8	16.4	21.8	16.4


The 3D scatter plots shown in Fig. 24, depicting the sizing of the electric motor (S-EM), fuel cell (S-FC), and battery (S-BA) components, reveal a marked difference in consistency between the sizing-only and combined sizing + EMS optimization strategies. Specifically, the AI EMS + Sizing approach produces a more tightly clustered distribution of component sizes, indicating a more stable and convergent solution space. In contrast, the sizing-only optimization results in a broader scatter, reflecting greater variability in the selected component sizes.

Road grade

To investigate the impact of each optimization method, the FCHEV was tested under varying road grades. As the road grade increases from 0% to 6%, there is a notable rise in fuel consumption and component degradation across all driving cycles, as shown in Table 14. Specifically, fuel consumption increases by approximately 30–35% in the real-world traffic cycle and 25–30% in both the WLTP Class 3 and UDDS cycles, reflecting the higher energy demand on steeper inclines. Fuel cell degradation (Q-FC) increases even more sharply, with values rising by around 40–60% in the real-world traffic cycle and exceeding 70% in the WLTP Class 3 cycle, highlighting the strong sensitivity of fuel cells to grade-induced stress. Battery degradation (Q-BA) increases gradually but still rises approximately 15–20% across all cycles.
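The extra demand on an incline comes from the gravitational force along the slope, listed in the Symbols section as F_St. The sketch below assumes the usual form F_St = M g sin(α) with α = arctan(grade/100), and multiplies by the vehicle speed u to get the added traction power. The vehicle mass and speed are hypothetical values chosen for illustration, not parameters from the paper.

```python
import math

G = 9.81  # m/s^2, gravitational acceleration

def grade_force(mass_kg, grade_percent):
    """Gravitational force along the slope, F_St = M * g * sin(alpha), in N."""
    alpha = math.atan(grade_percent / 100.0)  # slope angle in rad
    return mass_kg * G * math.sin(alpha)

def grade_power(mass_kg, grade_percent, speed_mps):
    """Additional traction power needed to climb the grade at constant speed, in W."""
    return grade_force(mass_kg, grade_percent) * speed_mps

MASS = 1500.0  # kg, assumed vehicle mass (hypothetical)
SPEED = 15.0   # m/s, assumed cruising speed (hypothetical)

for grade in (0, 4, 6):
    print(f"grade {grade}%: F_St = {grade_force(MASS, grade):.0f} N, "
          f"extra power = {grade_power(MASS, grade, SPEED) / 1000:.1f} kW")
```

Since this term scales almost linearly with grade for small angles, it adds a nearly constant load on top of rolling and aerodynamic resistance, which is consistent with the rise in fuel consumption reported for the 4% and 6% cases.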

Table 14.

FCHEV grade results for real-world traffic cycle, UDDS, and WLTP class 3 driving cycles.

	Grade 4%				Grade 6%
	Non-optimized	Sizing	EMS	Sizing + EMS	Non-optimized	Sizing	EMS	Sizing + EMS
Tehran
Fuel-FC	1271	1202	946	936	1606	1605	1192	1399
Eq-Fuel	1298	1234	1046	1010	1630	1634	1305	1489
Q-FC	4.2	3.9	2.5	2.6	4.7	4.6	3.4	4.2
Q-BA	0.0155	0.0144	0.0159	0.0145	0.0159	0.0148	0.0162	0.0149
WLTP Class3
FC-Fuel	1550	1496	1412	1395	1910	1849	1749	1833
Eq-Fuel	1560	1509	1453	1425	1926	1871	1803	1874
Q-FC	5.4	5.3	2.4	2.3	5.8	5.6	3.0	2.8
Q-BA	0.0185	0.0166	0.0169	0.0165	0.0193	0.0172	0.0174	0.0168
UDDS
Fuel-FC	1187	1212	1015	1022	1521	1625	1310	1355
Eq-Fuel	1216	1246	1078	1079	1549	1660	1375	1407
Q-FC	3.6	3.5	1.9	2.3	4.8	4.8	2.5	2.5
Q-BA	0.0158	0.0146	0.0155	0.0145	0.0161	0.0149	0.0157	0.0148


Figure 25 further corroborates these trends, illustrating a consistent increase in fuel cell degradation (Q-FC) with rising grade under all strategies. The WLTP Class 3 cycle exhibits the highest degradation, emphasizing its role as a particularly demanding driving profile. Significantly, the combined AI sizing and EMS optimization strategy reduces fuel cell degradation by roughly 30–40% compared to the non-optimized case at a 6% grade, demonstrating its effectiveness in mitigating degradation. Additionally, the figure highlights that, as the grade increases, fuel cell degradation under the EMS-only strategy approaches that of the combined EMS and sizing strategy. This convergence underscores the critical role of EMS optimization in protecting fuel cell lifespan, particularly on steeper grades where operational stresses are most pronounced. It suggests that, even without vehicle downsizing, an effective EMS can significantly reduce degradation, emphasizing its importance in extending fuel cell durability under challenging driving conditions.

Fig. 25 — Fuel cell degradation bar chart for different road grades and driving cycles: **(a)** Experiment, **(b)** UDDS, **(c)** WLTP Class 3.

HIL simulation

This section evaluates the impact of HIL simulation by analyzing the real-world, UDDS, and WLTP Class 3 drive cycles, as shown in Figs. 26(a), (b), and (c). The fuel cell power requests during HIL tests demonstrate that the Type-2 fuzzy controllers, optimized using the hybrid reinforcement learning algorithm NSGA-II-DQN, effectively replicate the MIL simulation results. However, some fluctuations appear in the HIL data compared to MIL, mainly due to inherent time delays from hardware analog characteristics, low-pass filtering, and sampling times. Additionally, hardware-specific factors such as ADC quantization errors and PWM-to-analog conversion introduce variability, causing oscillations in the fuel cell power output.
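The hardware effects named above can be mimicked in a few lines. The sketch below passes an ideal (MIL-like) power command through a fixed sample delay, a first-order low-pass filter, and an ADC quantizer. It is a simplified stand-in for the real signal path, and every parameter (delay, filter coefficient, ADC resolution, full-scale range) as well as the function name `hil_chain` is an assumption made for illustration.

```python
def hil_chain(signal, delay_samples=2, alpha=0.5, adc_bits=10, full_scale=50_000.0):
    """Pass a power trace (W) through delay, low-pass filtering, and ADC quantization."""
    step = full_scale / (2 ** adc_bits - 1)  # width of one ADC code in W
    # Hold the first value during the delay, then replay the shifted signal.
    delayed = [signal[0]] * delay_samples + list(signal[:len(signal) - delay_samples])
    out, y = [], delayed[0]
    for x in delayed:
        y += alpha * (x - y)                     # first-order low-pass filter
        clipped = min(max(y, 0.0), full_scale)   # ADC input range
        out.append(round(clipped / step) * step) # snap to the nearest ADC code
    return out

ADC_STEP = 50_000.0 / (2 ** 10 - 1)

# Ideal command: step from 10 kW to 30 kW at sample 5 (hypothetical trace).
mil = [10_000.0] * 5 + [30_000.0] * 15
hil = hil_chain(mil)

max_dev = max(abs(h - m) for h, m in zip(hil, mil))
print(f"ADC step: {ADC_STEP:.1f} W")
print(f"largest MIL/HIL deviation during the transient: {max_dev:.0f} W")
print(f"steady-state deviation: {abs(hil[-1] - mil[-1]):.1f} W")
```

The transient deviation is dominated by the delay and the filter lag, while the residual steady-state error is bounded by the quantization step, which mirrors the distinction between delay-induced and quantization-induced discrepancies.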

A comparison of the combined AI EMS and sizing results between HIL and MIL simulations, detailed in Table 15, reveals minimal deviation, highlighting the robustness and reliability of this optimization strategy. For the UDDS cycle, fuel cell degradation remains virtually identical in both HIL and MIL—1.3 in each case—demonstrating the controller’s ability to effectively manage real-world hardware noise, time delays, and system imperfections. Similarly, battery degradation shows only slight differences, confirming the consistency of AI EMS + sizing performance across both simulation environments.

Table 15.

Optimized and non-optimized HIL results for real-world, UDDS, and WLTP Class3 driving cycles.

	Experiment				UDDS				WLTP Class3
	Non-optimized	Sizing	EMS	Sizing + EMS	Non-optimized	Sizing	EMS	Sizing + EMS	Non-optimized	Sizing	EMS	Sizing + EMS
Fuel-FC	782	665	660	590	756	637	637	579	1009	887	826	755
Eq-Fuel	815	695	729	645	779	662	690	626	1010	890	861	780
Q-FC	3.3	2.4	2.0	1.6	3.3	2.2	1.5	1.3	6.3	5.4	2.0	1.5
Q-BA	0.0153	0.0143	0.0153	0.0145	0.0169	0.0150	0.0158	0.0147	0.0196	0.0171	0.0172	0.0169


In contrast, the deviation between HIL and MIL is more pronounced for the non-optimized, sizing-only, and EMS-only strategies. For instance, sizing-only optimization in the UDDS cycle results in a fuel cell degradation of 1.9 in HIL versus 1.5 in MIL, underscoring the sensitivity of sizing alone to hardware-induced fluctuations and delays. The discrepancies are even larger in the non-optimized and EMS-only cases, further emphasizing the superior stability and performance of the AI co-simulation EMS and sizing approach. This demonstrates its ability to deliver reliable, high-performance results in both simulation and real-time hardware testing.

Power fluctuations significantly impact battery power demand, resulting in overshoots and undershoots, which are evident in the SOC trends during HIL simulations, as shown in Fig. 27 for the WLTP Class 3 driving cycle. Despite these variations, the overall SOC behavior remains closely aligned between MIL and HIL, demonstrating the effectiveness of the hybrid deep reinforcement learning-based Sizing + EMS optimization under real-time hardware constraints. Notably, non-optimized HIL tests (Fig. 27(a)) exhibit pronounced fluctuations in fuel cell power and SOC, revealing the system’s susceptibility to hardware-induced noise and delays. Although the sizing-only strategy (Fig. 27(b)) improves stability, it still presents noticeable oscillations. The HIL plot for EMS-only optimization (Fig. 27(c)) shows a significant reduction in SOC to its lowest level, which can greatly impact durability. In contrast, the AI-integrated EMS + Sizing approach (Fig. 27(d)) achieves the best alignment between MIL and HIL results, with significantly reduced SOC deviations.

Fig. 26 — HIL analysis using co-simulation Sizing + EMS optimization for different driving cycles: **(a)** fuel cell power demand for real-world, **(b)** UDDS, **(c)** WLTP Class 3.

Fig. 27 — SOC HIL analysis for WLTP Class 3 driving cycle: **(a)** Non-optimized, **(b)** sizing optimization, **(c)** EMS optimization, **(d)** Sizing + EMS optimization.

Conclusion

This paper successfully addressed a critical research gap by introducing a novel AI-driven framework that employs NSGA-II-DQN for the simultaneous optimization of EMS and powertrain sizing in FCHEVs. The core innovation—using a Deep Q-Network to adaptively select genetic operators—overcame the limitations of conventional NSGA-II, which relies on fixed probabilities and often struggles with complex, multi-objective problems. This resulted in Pareto fronts with demonstrably superior convergence, diversity, and solution quality.

The framework was rigorously evaluated under realistic driving conditions, synthesized by a Random Forest-based traffic classifier. The results unequivocally demonstrate that while NSGA-II-DQN consistently outperforms its traditional counterpart, the most significant benefits are realized through the co-optimization of EMS and sizing. This integrated approach delivered substantial quantitative gains, reducing equivalent fuel consumption by 21% compared to a sizing-only approach and achieving a 30% reduction in fuel cell degradation, all while effectively managing battery aging. These improvements were consistently validated across standard driving cycles (UDDS and WLTP Class 3), with fuel consumption reduced by 22–24% from the non-optimized baseline.

However, this pursuit of optimal efficiency and durability necessitated a trade-off in dynamic performance. The co-optimization strategy, which involved downsizing key components, incurred a 35% penalty in acceleration (0–97 km/h) and reductions of approximately 10% in maximum speed and 24% in gradeability. This quantifiable compromise underscores the inherent challenge of balancing competing objectives in holistic vehicle design.

Finally, the practical robustness of the proposed system was confirmed through HIL simulations. Despite the inevitable noise, time delays, and uncertainties introduced by real hardware, the optimized controller maintained consistent and reliable performance across diverse driving conditions. This validation underscores the framework’s readiness for real-world application and marks a significant step forward in the AI-powered co-design of next-generation FCHEVs.

Abbreviations

ADC: Analog-to-Digital Converter
DoD: Depth of Discharge
DQN: Deep Q-Network
ECU: Electronic Control Unit
EMS: Energy Management Strategy
FFT: Fast Fourier Transform
FCHEV: Fuel Cell Hybrid Electric Vehicle
GA: Genetic Algorithm
HIL: Hardware-In-The-Loop
MIL: Model-in-the-Loop
Non-Op: Non-optimized
NSGA-II: Non-dominated Sorting Genetic Algorithm II
OP: Optimized
PMP: Pontryagin’s Minimum Principle
PEMFC: Proton Exchange Membrane Fuel Cell
PWM: Pulse-Width Modulation
SOC: State-of-Charge
SOH: State of Health
TDP: Time Domain Parametrization
T1FS: Type-1 Fuzzy Set
T2FLS: Type-2 Fuzzy Set

Symbols

A: Frontal area (m²)
C_D: Air resistance coefficient (-)
Eq-Fuel: Equivalent fuel consumption
F: Faraday constant (C/mol)
FC-Fuel: Fuel cell hydrogen consumption
FC-Power: Fuel cell power (W)
F_i: Traction force (N)
F_RO: Rolling resistance force (N)
F_L: Aerodynamic drag force (N)
F_St: Gravitational force along slope (N)
g: Gravitational acceleration (m/s²)
H_2Cons: Hydrogen consumption
I: Electric current
I_FCnet: Net current extracted from fuel cell stack (A)
M: Vehicle mass (kg)
N: Stack number of fuel cell
P_bat: Battery power (W)
P-req: Required power for FCHEV (W)
Q-BA: Battery capacity loss
Q-FC: Fuel cell power degradation (W)
R_fd: Gear ratio (-)
R_int: Internal resistance (ohm)
R_ohm: Ohmic resistance (ohm.cm²)
f: FDynamic rolling radius coefficient (-)
S-BA: Scale factor for battery sizing
S-FC: Scale factor for fuel cell sizing
S-EM: Scale factor for electric motor sizing
T: Temperature (K)
T_m: Motor torque (Nm)
T_{m max}: Maximum torque of electric motor (Nm)
u: Vehicle speed (m/s)
V_cell: Cell voltage (V)
V_ocv: Open circuit voltage (V)

Greek letters

α: Slope (-)
α_k: Reaction transfer coefficient (-)
β: Transfer coefficients (-)
δ: Rotating mass correction factor (-)
: Efficiency of the transmission (-)
: Motor efficiency (-)
: Motor angular velocity (rad/s or rpm)

Author contributions

M.M.: Conceptualization, Writing – original draft, Supervision. A.M.: Conceptualization, Methodology, Software, Writing and editing, Data curation.

Data availability

Correspondence and requests for materials should be addressed to M.M.-G.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1.Han, J., Yi, S. & Yu, S. Assessment of hydrogen vehicle fuel economy using MRAC based on deep learning. Sci. Rep.15(1), 13085 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Wang, H. et al. Optimization of energy management strategies for multi-mode hybrid electric vehicles driven by travelling road condition data. Sci. Rep.15(1), 12684 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Feng, R. et al. Performance and energy-consumption evaluation of fuel-cell hybrid heavy-duty truck based on energy flow and thermal-management characteristics experiment under different driving conditions. Energy Convers. Manag.321, 119084 (2024). [Google Scholar]
4.Zhang, M., Li, X., Han, D., Shang, L. & Xu, L. Energy management strategy for fuel cell hybrid tractor considering demand power frequency characteristic compensation. Sci. Rep.14(1), 27844 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Valizadeh, M., Shiri, M., Sarvenoee, A. K., Gowtham, N. & AboRas, K. M. A comprehensive scheme for power management of FC/SC/battery, and solar-roof PV source in electric vehicle systems. Sci. Rep.14(1), 27621 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Lei, N., Zhang, H., Chen, H. & Wang, Z. A comprehensive study of various carbon-free vehicle propulsion systems utilizing ammonia-hydrogen synergy fuel. ETransportation20, 100332 (2024). [Google Scholar]
7.Zhang, H. et al. Surrogate-enhanced multi-objective optimization of on-board hydrogen production device for carbon-free heavy-duty vehicles. Energy333, 137369. (2025). [Google Scholar]
8.Zhang, H., Lei, N. & Wang, Z. Ammonia-hydrogen propulsion system for carbon-free heavy-duty vehicles. Appl. Energy369, 123505 (2024). [Google Scholar]
9.Jia, C., Liu, W., He, H. & Chau, K. T. Deep reinforcement learning-based energy management strategy for fuel cell buses integrating future road information and cabin comfort control. Energy Conv. Manag.321, 119032 (2024). [Google Scholar]
10.Li, F., Gao, L., Zhang, Y. & Liu, Y. Integrated energy management for hybrid electric vehicles: A bellman neural network approach. Eng. Appl. Artif. Intell.145, 110166 (2025). [Google Scholar]
11.Sun, Y. et al. Energy management strategy for FCEV considering degradation of fuel cell. Int. J. Green Energy. 20 (1), 28–39 (2023). [Google Scholar]
12.Yang, H., Sun, Y., Xia, C. & Zhang, H. Research on energy management strategy of fuel cell electric tractor based on multi-algorithm fusion and optimization. Energies15(17), 6389 (2022). [Google Scholar]
13.Sun, H. et al. Health-and behavior-aware energy management strategy for fuel cell hybrid electric vehicles based on parallel deep deterministic policy gradient learning. Eng. Appl. Artif. Intell.158, 111311 (2025). [Google Scholar]
14.Rostami, S. M. R., Al-Shibaany, Z., Kay, P. & Karimi, H. R. Deep reinforcement learning and fuzzy logic controller codesign for energy management of hydrogen fuel cell powered electric vehicles. Sci. Rep.14(1), 30917 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Zhao, Z., Wang, T., Li, M., Wang, H. & Wang, Y. Optimization of fuzzy control energy management strategy for fuel cell vehicle power system using a multi-islandgenetic algorithm. Energy Sci. Eng.9 (4), 548–564 (2021). [Google Scholar]
16.Lei, N., Zhang, H., Hu, J., Hu, Z. & Wang, Z. Sim-to-real design and development of reinforcement learning-based energy management strategies for fuel cell electric vehicles. Applied Energy, 393, p.126030. (2025).
17.Jain, M., Desai, C. & Williamson, S. S. September. Genetic algorithm based optimal powertrain component sizing and control strategy design for a fuel cell hybrid electric bus. In 2009 IEEE vehicle power and propulsion conference (pp. 980–985). IEEE. (2009).
18.Hu, X., Murgovski, N., Johannesson, L. M. & Egardt, B. Optimal dimensioning and power management of a fuel cell/battery hybrid bus via convex programming. IEEE/ASME Transactions on Mechatronics20(1), 457–468 (2014). [Google Scholar]
19.Xie, S. et al. Aging-aware co-optimization of battery size, depth of discharge, and energy management for plug-in hybrid electric vehicles. J. Power Sources450, 227638 (2020). [Google Scholar]
20.Zou, Y., Yang, Y., Zhang, Y. & Tang, X. Aging-aware real-time multi-layer co-optimization approach for hybrid vehicles: across configuration, parameters, and control. Energy Conv. Manag.332, 119748 (2025). [Google Scholar]
21.Liu, C. & Liu, L. Optimal power source sizing of fuel cell hybrid vehicles based on Pontryagin’s minimum principle. Int. J. Hydrogen Energy40(26), 8454–8464 (2015). [Google Scholar]
22.Sorrentino, M., Cirillo, V. & Nappi, L. Development of flexible procedures for co-optimizing design and control of fuel cell hybrid vehicles. Energy Convers. Manag.185, 537–551 (2019). [Google Scholar]
23.Xu, L. et al. Optimal sizing of plug-in fuel cell electric vehicles using models of vehicle performance and system cost. Appl. Energy103, 477–487 (2013). [Google Scholar]
24.Sadek, H., Chedid, R. & Fares, D. Power sources sizing for a fuel cell hybrid vehicle. Energy Storage2(2), e124 (2020). [Google Scholar]
25.KoteswaraRao, K. V., Srinivasulu, G. N., Rahul, J. R. & Velisala, V. Optimal component sizing and performance of Fuel Cell–Battery powered vehicle over world harmonized and new european driving cycles. Energy Conv. Manag.300, 117992 (2024). [Google Scholar]
26.Hu, Z. et al. Multi-objective energy management optimization and parameter sizing for proton exchange membrane hybrid fuel cell vehicles. Energy Conv. Manag.129, 108–121 (2016). [Google Scholar]
27.Wang, Y., Moura, S. J., Advani, S. G. & Prasad, A. K. Optimization of powerplant component size on board a fuel cell/battery hybrid bus for fuel economy and system durability. Int. J. Hydrogen Energy44(33), 18283–18292 (2019). [Google Scholar]
28.Ceschia, A., Azib, T., Bethoux, O. & Alves, F. Optimal sizing of fuel cell hybrid power sources with reliability consideration. Energies13, 3510 (2020). [Google Scholar]
29.Iqbal, M., Becherif, M., Ramadan, H. S. & Badji, A. Dual-layer approach for systematic sizing and online energy management of fuel cell hybrid vehicles. Appl. Energy300, 117345 (2021). [Google Scholar]
30.Li, J. et al. Battery optimal sizing under a synergistic framework with DQN-based power managements for the fuel cell hybrid powertrain. IEEE Transactions on Transportation Electrification8(1), 36–47 (2021). [Google Scholar]
31.da Silva, S. F. et al. Aging-aware optimal power management control and component sizing of a fuel cell hybrid electric vehicle powertrain. Energy Conv. Manag.292, 117330 (2023). [Google Scholar]
32.Lei, N., Zhang, H., Wang, H. & Wang, Z. An improved co-optimization of component sizing and energy management for hybrid powertrains interacting with high-fidelity model. IEEE Trans. Veh. Technol.72 (12), 15585–15596 (2023). [Google Scholar]
33.Madadi, M. H. & Chitsaz, I. Improving fuel efficiency and durability in fuel cell vehicles through component sizing and power distribution management. Int. J. Hydrogen Energy71, 661–673 (2024). [Google Scholar]
34.Ming, F., Gong, W., Wang, L. & Jin, Y. Constrained multi-objective optimization with deep reinforcement learning assisted operator selection. IEEE/CAA JAS11(4), 919–931 (2024). [Google Scholar]
35.Song, Y. et al. Reinforcement learning-assisted evolutionary algorithm: A survey and research opportunities. Swarm Evol. Comput.86, 101517 (2024). [Google Scholar]
36.Zou, S., Shi, X. & Song, S. MOEA with adaptive operator based on reinforcement learning for weapon target assignment. Electron. Res. Arch31(3), 1498–1532 (2024). [Google Scholar]
37.Yin, S. & Xiang, Z. Adaptive operator selection with dueling deep Q-network for evolutionary multi-objective optimization. Neurocomputing581, 127491 (2024). [Google Scholar]
38.Huang, Y. et al. Multi-Objective Path Planning for Unmanned Sweepers Considering Traffic Signals: A Reinforcement Learning-Enhanced NSGA-II Approach. Sustainability16(24), 11297 (2024). [Google Scholar]
39.Yang, B., Chen, J., Xiao, X., Li, S. & Ren, T. An Enhanced NSGA-II Driven by Deep Reinforcement Learning to Mixed Flow Assembly Workshop Scheduling System with Constraints of Continuous Processing and Mold Changing. Systems13(8), 659 (2025). [Google Scholar]
40.Montazeri-Gh, M. & Alimohammadi, E. Integrated energy, environmental, and economic optimization for energy management systems in PHEVs considering traffic conditions. Sci. Rep. 15(1), 25927 (2025). [DOI] [PMC free article] [PubMed] [Google Scholar]
41.Zheng, C., Zhang, D., Xiao, Y. & Li, W. Reinforcement learning-based energy management strategies of fuel cell hybrid vehicles with multi-objective control. J. Power Sources543, 231841 (2022). [Google Scholar]
42.Lei, N. et al. Theory-constrained neural network with modular interpretability for fuel cell vehicle modeling. IEEE Trans. Veh. Technology 74(6), 8907–8920 (2025). [Google Scholar]
43.Chen, H., Pei, P. & Song, M. Lifetime prediction and the economic lifetime of proton exchange membrane fuel cells. Appl. Energy142, 154–163 (2015). [Google Scholar]
44.Song, K. et al. A comprehensive evaluation framework to evaluate energy management strategies of fuel cell electric vehicles. Electrochimica Acta292, 960–973 (2018). [Google Scholar]
45.Pei, P., Chang, Q. & Tang, T. A quick evaluating method for automotive fuel cell lifetime. Int. J. Hydrogen Energy33(14), 3829–3836 (2008). [Google Scholar]
46.U. S. D. o. Energy (ed), Fuel Cells, vol. Multi-Year Research, Development, and Demonstration Plan, (2017).
47.Esfahanian, M. et al. Large lithium polymer battery modeling for the simulation of hybrid electric vehicles using the equivalent circuit method. Int. J. Automot. Eng.3(4), 564–576 (2013). [Google Scholar]
48.Huang, Y. et al. Fuel consumption and emissions performance under real driving: Comparison between hybrid and conventional vehicles. Sci. Total Environ.659, 275–282 (2019). [DOI] [PubMed] [Google Scholar]
49.Eckert, J. J., Silva, L. C. D. A. E., Santiciolli, F. M., Correa, F. C. & Dedini, F. G. Optimization of electric propulsion system for a hybridized vehicle. Mech. Based Des. Struc.47(2), 175–200 (2019). [Google Scholar]
50.Cordoba-Arenas, A., Onori, S. & Rizzoni, G. A control-oriented lithium-ion battery pack model for plug-in hybrid electric vehicle cycle-life studies and system design with consideration of health management. J. Power Sources279, 791–808 (2015). [Google Scholar]
51.Wang, J. et al. Cycle-life model for graphite-LiFePO4 cells. J. Power Sources196(8), 3942–3948 (2011). [Google Scholar]
52.Lei, N. et al. Physics-informed data-driven modeling approach for commuting-oriented hybrid powertrain optimization. Energy Conversion and Management, 299, p.117814. (2024).
53.https://worldpopulationreview.com/cities/iran/tehran
54.Montazeri-Gh, M. & Naghizadeh, M. Development of the Tehran car driving cycle. Int. J. Environ. Pollut.30 (1), 106–118 (2007). [Google Scholar]
55.Deb, K., Pratap, A., Agarwal, S. & Meyarivan, T. A. M. T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput.6 (2), 182–197 (2002). [Google Scholar]
56. Pei, C. H. I., Chen, L. I. U., Jiang, Z. H. A. O., Kun, W. U. & Yingxun, W. Dynamic effect web generation for heterogeneous UAV cluster using DQN-based NSGA-II: Methods and applications. Chinese J. Aeronautics 4(4), 103351 (2024).
57. Tian, Y. et al. Deep reinforcement learning based adaptive operator selection for evolutionary multi-objective optimization. IEEE Trans. Emerg. Top. Comput. Intell. 7(4), 1051–1064 (2022).

Data Availability Statement

Correspondence and requests for materials should be addressed to M.M.-G.

