MethodsX. 2024 Nov 6;13:103037. doi: 10.1016/j.mex.2024.103037

Smart charge-optimizer: Intelligent electric vehicle charging and discharging

Archana Y Chaudhari a, Prashant B Koli b, Surbhi D Pagar c, Reena S Sahane c, Kalyani D Kute c, Priyanka M Abhale c, Akanksha J Kulkarni c, Abhilasha K Bhagat c
PMCID: PMC12370149  PMID: 40851757

Abstract

Switching to Electric Vehicles (EVs) is an important step toward a low-carbon economy and a sustainable energy future. However, the rapid growth of EVs poses a risk to the reliability of the electrical system: the high electricity consumption of EVs can overload power grid transformers. Effective charging and discharging scheduling strategies are therefore essential to reduce the negative grid effects of EVs. To reduce transformer overload, this paper explores two strategies for intelligent charging and discharging scheduling. The first is Long Short-Term Memory coupled with Integer Linear Programming (LSTM-ILP), and the second is Q-learning. LSTM-ILP aims to minimize delays in the charging and discharging schedules. The Q-learning method uses reinforcement learning to determine the best course of action for EVs given their state-of-charge and the demand on the grid. The results show that both strategies lower the peak-to-average ratio of the grid and lessen the impact of EV charging demands.

  • This research couples Long Short-Term Memory with Integer Linear Programming

  • Q-learning is applied to minimize the peak-to-average ratio of grid load through effective peak shaving and valley filling

  • EV charging costs are minimized for users while respecting their mobility needs

Keywords: Electric vehicles, Q-learning, LSTM-ILP, Smart grid

Method name: 1. Long Short-Term Memory with Integer Linear Programming (LSTM-ILP), 2. Q-learning

Graphical abstract



Specifications table

Subject area: Engineering
More specific subject area: Energy with AI
Name of your method: 1. Long Short-Term Memory with Integer Linear Programming (LSTM-ILP)
2. Q-learning
Name and reference of original method: Ren, Lina, Mingming Yuan, and Xiaohong Jiao. “Electric vehicle charging and discharging scheduling strategy based on dynamic electricity price.” Engineering Applications of Artificial Intelligence 123 (2023): 106320.
Resource availability: Data: https://www.kaggle.com/datasets/officialdatasets/battery-soc-dataset
Software: Google Colab

Background

In recent years, many researchers have actively worked on optimizing EV charging scheduling strategies. In [1], the authors proposed a double deep Q-network-based cooperative EV charging scheduling strategy. The paper [2] uses deep reinforcement learning to address the EV charging scheduling problem and achieve collaborative scheduling among EVs, minimizing charging costs and promoting PV energy consumption. However, it falls short in considering the coordination of renewable energy systems and transformer loads [2].

Another noteworthy contribution comes from Shuai Li and co-authors, who proposed a distributed transformer joint optimization method using multi-agent deep reinforcement learning for EV charging. This approach coordinates EV charging while safeguarding user privacy and reducing communication-equipment deployment costs [3].

In contrast, Lina Ren and Mingming Yuan's 2023 work neglects renewable energy integration and the hurdle of reward sparsity in learning scenarios [4,5]. The paper [6] introduced CDDPG, a deep-reinforcement-learning-based approach for electric vehicle charging control that considers the coordination of renewable energy and transformer load, supported by simulation results.

F. Tuchnitz et al. presented a reinforcement-learning-based smart charging strategy for an EV fleet; it offers a flexible and scalable approach that significantly reduces the variance of the total load [7].

In [8], a multi-agent reinforcement learning approach is used to coordinate electric vehicle charging. The proposed method handles various tariff types and consumer preferences while minimizing assumptions about the distribution grid [8].

Xu Hao et al. [9] presented a V2G-oriented reinforcement learning framework for heterogeneous electric vehicle charging management. It uses deep Q-network reinforcement learning, accounting for uncertainties and the diversity of EVs, and optimizes EV charging within vehicle-to-grid (V2G) systems [9].

While these papers provide valuable insights into EV charging scheduling strategies, it is essential to consider their specific limitations and address them in future research to develop comprehensive and adaptable solutions for efficient EV charging management.

Methods details

Introduction

The transition towards Electric Vehicles (EVs) is a critical part of the shift to a low-carbon economy and a sustainable energy future. Governments and international bodies like the International Energy Agency (IEA) have set ambitious targets of EVs reaching at least 30 percent of new vehicle sales by 2030 [10]. All these new electric vehicles will need to be charged. Right now, most electric vehicles begin charging around 5–6 PM, which creates a high load spike on the grid. At 10 % EV market penetration, the peak load demand is estimated to increase by around 18 %, and much more at higher penetration levels. On one hand, the ability of EV batteries to discharge energy back to the grid provides substantial benefits like peak load shaving, frequency regulation, spinning reserves, and improved grid stability and efficiency [6,11]. However, if uncontrolled, the high coincidence of EV charging times with existing residential peak loads can exacerbate grid stress, voltage fluctuations, transformer overloads, and a dramatic increase in peak demand [[12], [13], [14]]. The concept of vehicle-to-grid (V2G) aims to harness the bidirectional energy flow capabilities of EVs to improve grid operations through smart charging and discharging control strategies. Currently, price-based demand response schemes like real-time pricing, time-of-use tariffs, and dynamic pricing are the main methods explored to incentivize EVs to participate in V2G [2].

This work aims to develop a Smart Charge-Optimizer for the charging and discharging of EVs that participate in V2G. The objectives of the proposed system are as follows:

  • Couple Long Short-Term Memory with Integer Linear Programming (LSTM-ILP)

  • Apply Q-learning to minimize the peak-to-average ratio (PAR) of grid load through effective peak shaving and valley filling

  • Minimize EV charging costs for users while respecting their mobility needs

  • Ensure V2G coordination scalability while maintaining system robustness to uncertainties and increasing EV penetration levels.
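For reference, the peak-to-average ratio targeted by these objectives is the peak hourly grid load divided by the mean hourly load over the scheduling horizon; the notation below (f_h for the total load in hour h over an H-hour horizon) is ours, chosen to be consistent with the load symbols used later:

```latex
\mathrm{PAR} = \frac{\max_{h \in \{1,\dots,H\}} f_h}{\frac{1}{H}\sum_{h=1}^{H} f_h}
```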

Motivation

Intelligent charging scheduling strategies are urgently needed given the explosive growth of electric vehicles (EVs). Several strong reasons highlight this need.

First, the rapid expansion of EVs generates heavy stress on city transformers and grids, increasing the chances of service interruptions and grid inefficiency. These stresses can be minimized with a well-thought-out charging plan, ensuring a steady and dependable power supply for everyone.

Second, the expense of EV charging can be a major concern for both private consumers and utility companies. A well-thought-out scheduling strategy can optimize charging times, taking advantage of low-cost electricity rates during off-peak hours, reducing user costs while improving grid efficiency.

The remainder of the article is structured as follows: the Background section reviews related work and research gaps; Methods details presents the specifics of the proposed methodologies; Methods results and Method validation present the results obtained, followed by a comprehensive analysis; and the Conclusive summary offers concise closing remarks on the study.

Proposed system model

The proposed scheme assumes that Alberta has a constant number of EVs; that the base energy demand profile changes hour to hour but is the same for every day of the year [15]; that there is always enough capacity to supply any amount of power for V2G service; that all EVs are identical and their presence does not alter the Alberta base load; and that every EV is enrolled in the V2G scheme but only some participate in any given hour. The Alberta demand load is shown in Fig. 1. Either 25 %, 50 %, or 75 % of EVs participate during each hour, and the participation level is known at the start of every hour. V2G service starts at 5 PM, when the SOC of the EVs is 30 % on average, with a 10 % standard deviation. Between 5 PM and 3 AM, participating EVs can charge, discharge, or do nothing. An EV that does not participate drives 5 km on average, with a 5 km standard deviation [7,16]. Starting at 3 AM, 75 % of EVs participate in V2G service, which ends at 8 AM.

Fig. 1. Visualization of the data set used [17].

Fig. 1 illustrates the load profile, or demand pattern, employed in this study. It shows how the total EV demand load (the total energy that EVs may request for charging), the total EV discharge capacity (the total energy that EVs can potentially discharge into the grid), and the grid foundation load (the baseline load without EVs) change over time. This load profile is the input data for the proposed scheduling technique.
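For concreteness, the scenario assumptions above can be collected into a small configuration object. The sketch below is a hypothetical encoding (the names are ours, not the authors' code); the values mirror the text:

```python
# Hypothetical encoding of the simulation assumptions described above;
# names and values mirror the text, not the authors' implementation.
from dataclasses import dataclass

@dataclass
class V2GScenario:
    n_households: int = 600                    # residential setting used in the results
    n_evs: int = 400                           # fleet size used in the results
    participation_levels: tuple = (0.25, 0.50, 0.75)  # fraction of EVs active each hour
    service_start_hour: int = 17               # V2G window opens at 5 PM
    service_end_hour: int = 8                  # V2G window closes at 8 AM
    soc_mean_at_start: float = 0.30            # mean SOC at 5 PM
    soc_std_at_start: float = 0.10             # SOC standard deviation at 5 PM
    nonparticipant_trip_km_mean: float = 5.0   # average driving if not participating
    nonparticipant_trip_km_std: float = 5.0    # trip-length standard deviation

print(V2GScenario())
```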

Fig. 2 depicts the system architecture of the Long Short-Term Memory coupled with Integer Linear Programming model (LSTM-ILP). The LSTM neural network, its main component, predicts dynamic energy prices from input parameters such as EV demand, discharge capacity, and grid foundation load. The improved linear programming optimization model takes the predicted prices and uses them to determine the best EV charging and discharging schedules, considering load-difference minimization, maintaining a sufficient EV battery state-of-charge (SOC), and subsidies for EV owners taking part in the vehicle-to-grid (V2G) program.

  • (1) LSTM-ILP: LSTM is a type of recurrent neural network (RNN) designed to address the gradient explosion and vanishing-gradient problems common in traditional RNNs. It features interconnected neuron layers with memory cells that retain information from previous time steps and transmit it forward, making it well suited to time-dependent tasks.

  • (2) LSTM design: two hidden layers with 36 nodes each. Inputs: grid base load (fbase), collective EV charging demand load (fcharging), discharge available load (fdischarge), and large-grid electricity price (R). Output: 24-hour electricity price (r). A sketch of this network follows the list.

  • (3) Linear programming (LP): LP optimizes decisions subject to linear constraints, typically in three stages: problem analysis, objective-function formulation, and constraint specification. Optimization process:
    • Decision variables: charging and discharging power.
    • Objective function: minimize the peak-to-valley grid load difference and the EV charging and discharging costs.
    • Constraints: ensure safety and technical feasibility.

  • (4) Improved linear programming (ILP): ILP enhances LP by subsidizing electricity prices for EVs participating in V2G, addressing rapid grid-load changes more effectively. Improvement process (see the optimization sketch after the loss function below):
    • Redistribute chargeable and dischargeable loads whenever the absolute difference between charging and discharging electricity prices is below a threshold.
    • Incorporate a new constraint to enforce the load redistribution.
    • Return the loss incurred by EV owners as a subsidy, which is also fed back to the LSTM as part of its loss function.

Fig. 2. Proposed LSTM-ILP model.

Loss function:

```latex
L = \frac{\Delta f}{f_{\mathrm{avg}}} - \eta R_1 \tag{8}
```

```latex
R_1 = \sum_{i=1}^{n} R_{1i}
```

where Δf is the peak-to-valley load difference, favg the average load, η the subsidy parameter, and R1 the total subsidy paid to the n EV owners participating in V2G.

These techniques optimize EV scheduling in V2G systems, balancing grid load effectively while minimizing costs for operators and EV owners.
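To make the LP/ILP stage concrete, the sketch below schedules hourly fleet charging Pc and discharging Pd with scipy.optimize.linprog, minimizing cost at predicted prices subject to power limits, an energy (SOC) target, and an ILP-style cap keeping total load near the average. All numbers are illustrative assumptions, not the paper's parameters:

```python
# Simplified stand-in for the LP/ILP stage: hourly charging Pc[h] and
# discharging Pd[h] minimize cost at LSTM-predicted prices r[h], subject to
# power limits, an energy target, and a cap on total load (ILP constraint).
import numpy as np
from scipy.optimize import linprog

H = 24
r = 0.5 + 0.4 * np.sin(np.linspace(0, 2 * np.pi, H))            # stand-in predicted prices
f_base = 800 + 200 * np.sin(np.linspace(0, 2 * np.pi, H) - 1.0) # base load (kW)
P_max, E_target = 100.0, 400.0            # per-hour power limit (kW), fleet energy need (kWh)
P_avg = f_base.mean() + 150.0             # load cap, in the spirit of sum(P) <= Pavg

# Variables x = [Pc_0..Pc_23, Pd_0..Pd_23]; cost = sum_h r[h] * (Pc[h] - Pd[h])
c = np.concatenate([r, -r])

# Inequality: f_base[h] + Pc[h] - Pd[h] <= P_avg  (peak-capping constraint)
A_ub = np.hstack([np.eye(H), -np.eye(H)])
b_ub = P_avg - f_base

# Equality: net energy delivered to the batteries meets the target (1-hour slots)
A_eq = np.concatenate([np.ones(H), -np.ones(H)])[None, :]
b_eq = np.array([E_target])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, P_max)] * (2 * H))
Pc, Pd = res.x[:H], res.x[H:]
print("feasible:", res.success, "| peak load:", (f_base + Pc - Pd).max().round(1))
```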

  • (5) Q-learning: the charging and discharging scheduling problem is formulated as a Markov Decision Process (MDP), as described below.

The temporal-difference learning problem is derived from a multi-objective, multi-agent cooperative game minimization problem. Traditionally, reinforcement learning involves an agent interacting with its environment and choosing actions in states to collect rewards, with the goal of maximizing the sum of all future rewards.

Fig. 3 illustrates the classic Q-learning process, in which the agent engages with the environment by acting in various states and gaining rewards. By updating the Q-values, or projected future rewards, for each state-action pair, the objective is to develop an optimal policy that maximizes the cumulative reward over time.

Fig. 3. Traditional Q-learning.

Agents in multi-agent cooperative reinforcement learning share the same goal and are rewarded jointly for each transition. Goals O1 and O2 are linearly weighted with weights w1 and w2, creating a single objective.

Each state is established by combining a time slot (h) with a participation percentage (Y_h). Actions A_h are selections of charging, discharging, or doing nothing at each time interval.

Agents are incentivized by the penalty P_h to charge EVs such that, by the time the V2G service ends, the mean SOC meets a minimum required level. Each time slot's reward, r_h, is a linear combination of the negative peak-to-average ratio (PAR) and the penalty for SOC deviations (Algorithm 1).
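A sketch of this per-slot reward is shown below; the weights w1 and w2 and the minimum-SOC threshold are assumptions, not the paper's values:

```python
# Sketch of the reward r_h = -w1 * PAR - w2 * P_h, where P_h penalizes a
# fleet mean SOC below the required level at the end of V2G service.
import numpy as np

def reward(loads_so_far, mean_soc, hour, end_hour=8,
           soc_min=0.5, w1=1.0, w2=2.0):
    """Linear combination of negative PAR and the SOC shortfall penalty."""
    par = np.max(loads_so_far) / np.mean(loads_so_far)  # peak-to-average ratio
    shortfall = max(0.0, soc_min - mean_soc)            # unmet SOC requirement
    penalty = shortfall if hour == end_hour else 0.0    # applied at service end
    return -w1 * par - w2 * penalty

print(reward(loads_so_far=np.array([800., 950., 700.]), mean_soc=0.42, hour=8))
```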

Algorithm 1: Proposed algorithm for Long Short-Term Memory – Improved Linear Programming (LSTM-ILP).

• Initialize parameters: grid base load (fbase), collective EV charging demand load (fcharging), discharge available load (fdischarge), large-grid electricity price (R), threshold for the absolute difference between charging and discharging electricity prices (δr), and subsidy parameter (η).
• Define the LSTM neural network:
  a. Input layer: fbase, fcharging, fdischarge, R
  b. Two hidden layers with 36 nodes each
  c. Output layer: 24-hour electricity price (r)
 Train the LSTM network using historical data.
• Linear programming (LP):
  a. Define decision variables: charging power (Pc) and discharging power (Pd).
  b. Establish the objective function: minimize the peak-to-valley grid load difference (δf) and the EV charging and discharging costs (Ri): min δf, Ri.
  c. Set constraints: safety and technical limits on charging and discharging power (Pc, Pd); battery state-of-charge (SOC) constraints; charge-balance constraints; grid load constraints.
• Improved linear programming (ILP):
  a. Determine whether the absolute difference between charging and discharging electricity prices is less than δr.
  b. If the difference is less than δr, redistribute chargeable and dischargeable loads at those times.
  c. Add a new constraint to the LP: the sum of charging and discharging power (Px, Py, …) must not exceed the average load value (Pavg).
  d. Calculate the loss function: Loss = Δf / favg − η·R1, where R1 is the sum of subsidies for EV owners participating in load balancing.
• Return the optimized scheduling strategy for each EV, balancing grid load effectively while minimizing costs.


To solve this problem, the Q-learning algorithm is utilized. Q-learning is an off-policy, model-free learning algorithm that updates Q-values for state-action pairs based on observed rewards and next states. The algorithm iteratively learns the optimal policy by updating Q-values towards maximizing future expected rewards.

The proposed algorithm initializes Q-values for all state-action pairs and iteratively updates them for each episode. Actions are chosen using an ε-greedy strategy to balance exploration and exploitation. Rewards and next states are observed, and Q-values are updated accordingly. The algorithm continues until the end of the V2G service.

In summary, the proposed algorithm aims to optimize EV charging and discharging schedules by transforming the original problem into an MDP temporal-difference learning problem and utilizing Q-Learning to learn the optimal policy for each state. This approach facilitates efficient and adaptive decision-making to minimize the peak-to-average ratio while maintaining EV state-of-charge within specified constraints.

Fig. 4 illustrates the multi-agent cooperative reinforcement learning configuration used in the proposed Q-learning method. In this scenario, a number of agents (EVs) work toward the same goal (maintaining SOC and minimizing the peak-to-average ratio) and are rewarded jointly. By taking into account participation levels and the effects of their actions on the system as a whole, the agents collaborate to develop a common strategy that maximizes the global reward.

Fig. 4. Multi-agent Q-learning.

The Q-learning algorithm (see Algorithm 2) is used to tackle this problem. Q-learning is a model-free, off-policy learning technique that modifies Q-values for state-action pairs according to the observed rewards and subsequent states. By adjusting Q-values iteratively to maximize future predicted rewards, the algorithm discovers the best course of action.

Algorithm 2: Proposed algorithm for Q-learning model.

  • Input: learning rate α ∈ [0,1], exploration parameter ε ∈ (0,1)
  • Initialization:
   ο Initialize Q(s,a) for all s, a, except Q(terminal, ·) = 0
   ο For each episode:
   ο Get the initial state S (EVs charging during the evening and early-morning hours)
   ο For each step: choose action A from S using the behavioral (ε-greedy) policy
    • Take action A; observe reward R and next state S′
    • Q(S,A) ← Q(S,A) + α[R + γ max_a′ Q(S′,a′) − Q(S,A)]
    • S ← S′
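A runnable toy version of Algorithm 2's tabular update is sketched below; the environment (states are hours, three actions, a placeholder reward favoring off-peak charging) and the hyperparameters are assumptions, not the paper's setup:

```python
# Minimal tabular Q-learning matching the update rule in Algorithm 2,
# exercised on a toy hourly V2G environment.
import random
from collections import defaultdict

ACTIONS = ["charge", "discharge", "idle"]
alpha, gamma, eps, N_H = 0.1, 0.95, 0.1, 15   # assumed hyperparameters

Q = defaultdict(float)                         # Q[(state, action)], zero-initialized

def step(h, a):
    """Toy transition: reward favors charging in later (off-peak) hours."""
    r = {"charge": 1.0 if h > 8 else -1.0,
         "discharge": 0.5 if h <= 8 else -0.5,
         "idle": 0.0}[a]
    return r, h + 1                            # next state is the next hour

for episode in range(2000):
    s = 0
    while s < N_H:                             # run until V2G service ends
        if random.random() < eps:              # ε-greedy behavioral policy
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        r, s_next = step(s, a)
        target = r + gamma * max(Q[(s_next, x)] for x in ACTIONS) if s_next < N_H else r
        Q[(s, a)] += alpha * (target - Q[(s, a)])   # Q(S,A) <- Q(S,A) + alpha[...]
        s = s_next

# Greedy policy learned per hour
print({h: max(ACTIONS, key=lambda x: Q[(h, x)]) for h in range(N_H)})
```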

All state-action pairs have their Q-values initialized by the proposed algorithm (Algorithm 3), which then updates them iteratively for every episode. To balance exploration and exploitation, actions are selected using an ϵ-greedy method. As rewards and subsequent states are observed, Q-values are modified appropriately. The algorithm keeps running until the V2G service expires.

Algorithm 3: Proposed algorithm for Q-learning model.

 • Input: learning rate α ∈ [0,1], exploration parameter ϵ ∈ (0,1)
 • Initialization:
  ο Initialize Q[(h,Yh),a] for all h, Yh, a, except Q[(terminal, ·)] = 0
  ο For each episode:
   • for h in [1,2,3,...,NH] do
   • Choose Yh+1 at random
   • for each SOC bin in [1,2,3,...,NSOC bins] do: choose action A using ϵ-greedy
   • Take the chosen action A; calculate eh,total
   • Observe reward rh and next state (h + 1, Yh+1)
   • for each SOC bin in [1,2,3,...,NSOC bins] do
   • Q[(h,Yh),A] ← Q[(h,Yh),A] + α(rh + γ max_a′ Q[(h + 1,Yh+1),a′] − Q[(h,Yh),A])
   • end for
   • (h,Yh) ← (h + 1,Yh+1)
  ο end
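The distinguishing details of Algorithm 3, a state coupling the hour with the participation level and one update per SOC bin, can be sketched as follows; the participation levels come from the scenario description, while the bin count and placeholder reward are assumptions:

```python
# Sketch of Algorithm 3's state handling: state = (hour h, participation Y_h),
# with the Q-update applied once per SOC bin.
import random
from collections import defaultdict

LEVELS = (0.25, 0.50, 0.75)                    # hourly participation fractions from the text
N_SOC_BINS, ACTIONS = 4, ("charge", "discharge", "idle")
alpha, gamma, eps, N_H = 0.1, 0.95, 0.1, 15    # assumed hyperparameters

Q = defaultdict(float)                          # Q[((h, Y_h), action)]

def choose(state):
    """ϵ-greedy action selection over the current state's Q-values."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for episode in range(500):
    Y = random.choice(LEVELS)
    for h in range(N_H):
        Y_next = random.choice(LEVELS)          # participation known at hour start
        s, s_next = (h, Y), (h + 1, Y_next)
        a = choose(s)
        r_h = -abs(0.5 - Y)                     # placeholder; the paper uses -PAR - SOC penalty
        best_next = max(Q[(s_next, x)] for x in ACTIONS)
        for _bin in range(N_SOC_BINS):          # one update per SOC bin, as in Algorithm 3
            Q[(s, a)] += alpha * (r_h + gamma * best_next - Q[(s, a)])
        Y = Y_next
```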

To put it briefly, the suggested method converts the original issue into an MDP temporal-difference learning problem and uses Q-Learning to determine the best course of action for each state in order to optimize EV charging and discharging schedules [18]. This strategy makes it easier to make effective, flexible decisions that decrease the peak-to-average ratio while keeping the EV state-of-charge within predetermined bounds.

Fig. 5 shows the average grid demand per episode during the training process of the Q-learning algorithm. It demonstrates that the grid demand converges as the algorithm learns the optimal policy for EV charging and discharging schedules.

Fig. 5. Average grid demand per episode for Q-learning.

Methods results

In the endeavor to optimize vehicle-to-grid (V2G) [19] operations in a residential setting encompassing 600 households and 400 electric vehicles (EVs), two distinct models were utilized: Long Short-Term Memory (LSTM) and Deep Q-Network (DQN). Both models generated charging and discharging schedules intended to enhance grid stability, reduce peak load, and manage charging costs for EV customers. The LSTM model's charging schedule deliberately abstained from EV charging during the evening and early morning hours, from 17:00 to 08:00, while its discharging schedule indicated a consistent discharge of 10 EVs throughout the same period. This approach leverages V2G capabilities to alleviate strain on the grid during peak demand periods, optimizing energy utilization and cost-effectiveness. Conversely, the DQN model exhibited a contrasting schedule, with 10 EVs actively charging during the same time frame and no discharging activity observed. This divergence underscores the nuanced optimization strategies employed by each model, with the DQN model prioritizing off-peak charging to capitalize on lower electricity rates and increased grid-support capacity.

Such meticulous scheduling underscores the pivotal role of advanced machine learning techniques in orchestrating efficient V2G operations within residential communities. By harnessing the predictive capabilities of LSTM models and the adaptive decision-making of DQN frameworks, aggregators can fine-tune charging and discharging strategies to match dynamic grid conditions and user preferences. The overarching objective remains twofold: to minimize the burden on the grid during peak hours, averting potential strain and enhancing overall reliability, while concurrently optimizing charging patterns to align with cost-effective electricity tariffs and user convenience.


We proposed a Q-learning-based algorithm for EV demand response in a cooperative multi-agent multi-objective game. While it didn't reduce PAR, it achieved a 2.7 % reduction in average standard deviation of demand load for the full day and a 16.2 % reduction for the first 23 h. After 100,000 epochs, convergence was observed across 30 runs. The method's advantages include reducing load demand standard deviation and enabling EVs to act independently based only on current participation levels. However, it may discharge EVs when their state of charge is low and doesn't guarantee a minimum charge in the morning.

In LSTM-ILP, the uncertainty in EV driving behavior leads to daily fluctuations in total load demand and charging costs. Through V2G dispatching, we identified the variation range of grid load differences, charging costs, and aggregator revenue. After 50 simulations, we found the average grid load difference to be 425.1 kW, with a 95 % confidence interval of [395.7, 454.6]. The method effectively reduces load differences even at the upper limit of this interval. The average EV charging cost is 1980.1 yuan, with a 95 % confidence interval of [1696.3, 2263.9], indicating effective cost reduction. The aggregators' average income level was characterized in the same way.
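Confidence intervals like those quoted above can be obtained with a standard t-interval over the per-run results; a brief sketch with illustrative synthetic data (not the paper's 50 runs):

```python
# How a 95% confidence interval is computed from repeated simulation runs;
# the synthetic per-run values here are placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
load_diff = rng.normal(425.1, 105.0, size=50)   # per-run grid load differences (kW)

mean = load_diff.mean()
half = stats.t.ppf(0.975, df=len(load_diff) - 1) * stats.sem(load_diff)
print(f"mean = {mean:.1f} kW, 95% CI = [{mean - half:.1f}, {mean + half:.1f}]")
```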

Fig. 6 illustrates the peak-to-average ratio (PAR) per episode during training of the Q-learning algorithm. It demonstrates how the algorithm minimizes the PAR, one of the objectives of the scheduling strategy, by learning the optimal actions for EVs based on their state and participation level.

Fig. 6. Peak-to-average ratio per episode for Q-learning.

The suggested Q-learning approach has two benefits: the load standard deviation for the entire day was generally reduced, and EVs can operate independently with only the participation level known. The method's drawbacks are that it applies only to constant load demand data, that the SOC of EVs at 8 AM is a mean value that can vary on any given day, that it is unclear whether the resulting policy is optimal, and that EVs with a charge level below 25 % occasionally discharge in consecutive hours.

Method validation

Fig. 7 illustrates the demand prediction capability of the LSTM-ILP model. It compares the actual demand profile (likely a combination of grid foundation load, EV demand, and discharge capacity) with the demand predicted by the LSTM-ILP model. This comparison helps evaluate the accuracy of the demand forecasting, which is crucial for the subsequent optimization step.

Fig. 7. Demand prediction using LSTM-ILP.

Fig. 8 illustrates the training loss of the LSTM-ILP model over time. The loss function likely combines factors such as the load difference (peak-to-average ratio), EV charging costs, and subsidies provided to EV owners participating in the V2G scheme. Monitoring the loss during training helps assess the model's performance and convergence.

Fig. 8. Model loss in LSTM-ILP.

Moreover, the implementation of these optimized schedules not only fosters grid stability and efficiency but also lays the foundation for a more sustainable energy ecosystem. By strategically orchestrating the charging and discharging behavior of EV fleets, communities can unlock the full potential of V2G technology to support renewable energy integration, reduce greenhouse gas emissions, and foster a more resilient energy infrastructure.

As such, the utilization of LSTM and DQN models in devising tailored V2G strategies represents a pivotal step towards realizing the vision of a smarter, more adaptive energy landscape, where EVs serve as dynamic assets in the transition towards a sustainable energy future.

Conclusive summary

The research presented in this paper represents a significant step towards realizing the vision of a sustainable and resilient energy future, where electric vehicles play a pivotal role as dynamic assets in the transition towards a low-carbon economy. Through continued advancement of machine learning techniques, optimization algorithms, and interdisciplinary collaboration, the challenges posed by widespread EV adoption can be effectively mitigated, paving the way for a more efficient, cost-effective, and environmentally conscious energy landscape. The proposed LSTM-ILP focuses more on discharging EVs to support the grid in times of peak demand than on charging EVs in the evening and early morning; this leads to cost-effectiveness and optimized energy usage. Using Q-learning models to create customized V2G strategies is thus a critical step towards a smarter, more flexible energy landscape in which EVs play a dynamic role in the move towards a sustainable energy future. Moreover, the LSTM-ILP and Q-learning models may perform better when sophisticated forecasting methods for renewable energy generation and grid demand patterns are included, which could yield more precise forecasts and better scheduling choices.

Future topics for study include investigating hybrid techniques that combine the best features of both approaches, extending the optimization framework to include energy storage systems and renewable energy sources, and considering more complex scenarios with a variety of EV types and user preferences.

Limitations

Based on our findings, Q-learning in the power grid requires a balance between exploring new actions and exploiting known ones to maximize rewards. The cost of implementing and maintaining these advanced scheduling systems can be high, potentially limiting their adoption. The effectiveness of these strategies can also be influenced by unpredictable user behavior, such as unexpected changes in charging patterns.

CRediT author contribution statement

Dr. Archana Y. Chaudhari: Methodology, Data curation, Visualization, Writing – original draft. Mr. Prashant B. Koli: Writing – review & editing. Ms. Surbhi D. Pagar: Project administration, Funding acquisition. Mrs. Reena S. Sahane: Project administration, Funding acquisition. Ms. Kalyani D. Kute: Project administration, Funding acquisition. Ms. Priyanka M. Abhale: Project administration, Funding acquisition. Ms. Akanksha J. Kulkarni: Project administration, Funding acquisition. Ms. Abhilasha K. Bhagat: Project administration, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

I would like to express my deepest gratitude to Prof. Pallavi Thakare, Devansh Kariya, Shivam Pawar, Ved Inamdar, and Rajat Parate, who have contributed to the completion of this research.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Ethics statements

This work did not involve human subjects, animal experiments, or data collected from social media platforms.

Footnotes

Related research article: None.

Data availability

Data will be made available on request.

References

  1. Zhang Y., et al. A cooperative EV charging scheduling strategy based on double deep Q-network and prioritized experience replay. Eng. Appl. Artif. Intell. 2023;118.
  2. Zhang Y., et al. A cooperative EV charging scheduling strategy based on double deep Q-network and prioritized experience replay. Eng. Appl. Artif. Intell. 2023;118.
  3. Hua M., et al. Recent progress in energy management of connected hybrid electric vehicles using reinforcement learning. arXiv preprint arXiv:2308.14602 (2023).
  4. Ren L., Yuan M., Jiao X. Electric vehicle charging and discharging scheduling strategy based on dynamic electricity price. Eng. Appl. Artif. Intell. 2023;123.
  5. Yan L., et al. A cooperative charging control strategy for electric vehicles based on multiagent deep reinforcement learning. IEEE Trans. Ind. Inform. 2022;18(12):8765–8775.
  6. Zhang F., Yang Q., An D. CDDPG: a deep-reinforcement-learning-based approach for electric vehicle charging control. IEEE Internet Things J. 2020;8(5):3075–3087.
  7. Tuchnitz F., et al. Development and evaluation of a smart charging strategy for an electric vehicle fleet based on reinforcement learning. Appl. Energy. 2021;285.
  8. Silva D., Leno F., et al. Coordination of electric vehicle charging through multiagent reinforcement learning. IEEE Trans. Smart Grid. 2019;11(3):2347–2356.
  9. Hao X., et al. A V2G-oriented reinforcement learning framework and empirical study for heterogeneous electric vehicle charging management. Sustain. Cities Soc. 2023;89.
  10. International Energy Agency. "Global EV Outlook 2024". http://www.iea.org.
  11. Abdullah H.M., Gastli A., Ben-Brahim L. Reinforcement learning based EV charging management systems – a review. IEEE Access. 2021;9:41506–41531.
  12. Li H., Qian X., Song W. Prioritized experience replay based on dynamics priority. Sci. Rep. 2024;14(1):6014. doi: 10.1038/s41598-024-56673-3.
  13. Mnih V., Kavukcuoglu K., Silver D., Graves A., Antonoglou I., Wierstra D., Riedmiller M. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013).
  14. Dabbaghjamanesh M., Moeini A., Kavousi-Fard A. Reinforcement learning-based load forecasting of electric vehicle charging station using Q-learning technique. IEEE Trans. Ind. Inform. 2020;17(6):4229–4237.
  15. Park K., Moon I. Multi-agent deep reinforcement learning approach for EV charging scheduling in a smart grid. Appl. Energy. 2022;328.
  16. Chaudhari A., Mulay P. Algorithmic analysis of intelligent electricity meter data for reduction of energy consumption and carbon emission. Electr. J. 2019;32(10).
  17. "Battery SoC Dataset". https://www.kaggle.com/datasets/officialdatasets/battery-soc-dataset.
  18. Chaudhari A., Mulay P., Agarwal A., Iyer K., Sarbhai S. DIC2FBA: distributed incremental clustering with closeness factor based algorithm for analysis of smart meter data. Int. J. Comput. Digital Syst. 2024;161:29–38.
  19. Ravi S.S., Aziz M. Utilization of electric vehicles for vehicle-to-grid services: progress and perspectives. Energies. 2022;15(2):589.
