Task Offloading and Resource Allocation Strategy Based on Deep Learning for Mobile Edge Computing

Zijia Yu; Xu Xu; Wei Zhou

doi:10.1155/2022/1427219

2022 Aug 31;2022:1427219. doi: 10.1155/2022/1427219

PMCID: PMC9452000 PMID: 36093499

Abstract

To address unreasonable computation offloading and uneven resource allocation in Mobile Edge Computing (MEC), this paper proposes a task offloading and resource allocation strategy based on deep learning for MEC. First, for a multiuser multiserver MEC environment, a new objective function is designed by combining the calculation model and communication model of the system; it shortens the completion time of all computing tasks and minimizes the energy consumption of all terminal devices under delay constraints. Then, within a multiagent reinforcement learning system, system benefits and resource consumption are designed as rewards and losses in deep reinforcement learning, and the Dueling-DQN algorithm is used to solve the problem model and obtain the resource allocation method with the highest reward. Finally, the experimental results show that the proposed strategy performs best when the learning rate is 0.001 and the discount factor is 0.90. Furthermore, it reduces energy consumption by 52.18% and shortens completion time by 34.72%, outperforming the comparison strategies in terms of calculation amount and energy saving.

1. Introduction

With the rise of computing-intensive applications and explosive growth of data traffic, users' requirements for the computing power and service quality of mobile devices are also increasing [1]. At present, cloud computing also faces many problems and challenges. Due to its resource-intensive architecture, mobile cloud computing imposes a huge additional load on the backhaul link of mobile networks [2, 3]. Thus, Mobile Edge Computing (MEC) technology is proposed, which physically integrates computing and storage resources into the edge of mobile network architecture [4, 5]. This not only effectively reduces the transmission delay but also solves the problems of high load and high delay caused by mobile cloud computing [6]. At the same time, MEC has the characteristics of distributed architecture, being at the edge of network, low latency, user location awareness, and network status awareness [7, 8]. However, deploying a large number of computing and storage devices at the edge of network for users to choose and accessing neighboring service providers for edge computing will bring a series of complexities such as access and resource allocation strategy selection, user mobility management, and computing task migration problems [9].

To achieve short completion times and low terminal energy consumption under delay constraints, this paper proposes a task offloading and resource allocation strategy based on deep learning for MEC. The strategy is designed for a multiuser multiserver MEC environment and combines the computing model and communication model of the system. A new objective function is designed to jointly reduce energy consumption and time delay, and the Dueling-DQN algorithm is used to solve the resulting optimization model, so that completion time is shortened and the energy consumption of all terminal devices is minimized while the delay constraints are met.

The remainder of this paper is organized as follows: Section 2 reviews related work on mobile task offloading. Section 3 introduces the system model. Section 4 presents the new computation offloading method based on improved DQN. Section 5 describes the simulation experiments designed to verify the performance of the proposed model. Section 6 concludes the paper.

2. Related Work

In an MEC network, computation offloading takes three forms: full offloading, partial offloading, and local processing [10]. Offloading decisions are an important research topic in this field. Generally speaking, the offloading goal is to minimize the overall delay or to minimize the energy consumption of user devices while meeting the minimum delay requirements [11]. Reference [12] proposed a distributed task offloading strategy for low-load base station groups in an MEC environment. It selects the best offloading amount for each MEC node with a game equation, on the basis of quantified offloading cost and delay, but it is not suitable for high-load application environments. To address unbalanced computing resources on edge servers in vehicular edge computing networks, [13] proposed a load-balancing task offloading scheme based on software-defined networking. This solution can effectively reduce delay and improve the efficiency of task offloading. However, the processing method used performs poorly, which affects the distribution efficiency. Reference [14] used greedy selection to design a maximum energy-saving priority algorithm that achieves optimal offloading of computing tasks on mobile devices, but it does not consider the delay constraints of task offloading. Reference [15] combined long short-term memory (LSTM) and a candidate network set to improve the deep reinforcement learning algorithm and used it to solve the offloading dependency problem of multinode and mobile tasks in large-scale heterogeneous MEC. However, it ignores the optimal allocation of computing resources.

Similar to computation offloading, resource allocation is also one of the core issues in MEC [16]. In MEC network, technologies such as content caching and ultradense deployment are introduced, and multiple resources are deployed to mobile network edge according to the specific needs of users. This can further ensure the quality of service and greatly increase system capacity [17]. However, due to objective reasons such as physical volume and power consumption, the mobile network edge has limited computing resources, storage cache capacity, and spectrum resources. How to allocate multiple resources and improve the system service efficiency has a huge effect on the improvement of MEC network system performance [18]. Reference [19] proposed a time average computation rate maximization (TACRM) algorithm, which allows joint allocation of radio resources and calculation resources. However, the overall performance and task requirements of devices were not considered comprehensively in the allocation process, and the allocation efficiency still needs to be improved. Reference [20] comprehensively considered factors such as CPU, hard disk space, and required time and distance and proposed a comprehensive utility function for MEC resource allocation to achieve the optimal allocation of resources in MEC and cloud computing. However, this function considers many factors, which will seriously affect the efficiency of allocation in real applications. Reference [21] designed a two-layer optimization method for MEC, which uses pruning candidate modes to reduce the number of unfeasible offloading decisions. Through ant colony algorithm to achieve the upper-level optimization, the resource allocation effect is better. However, the server computing resource constraints and task delay constraints are not considered, and the overall timeliness is not good. 
Reference [22] constructed a low-complexity advanced branch model, which can be used for resource scheduling in large-scale MEC scenarios. However, because it lacks a powerful processing algorithm, its overall efficiency and performance are not ideal.

To this end, this paper considers task offloading and resource allocation together and proposes a deep learning-based MEC task offloading and resource allocation strategy that coordinates and optimizes the allocation and offloading between computing resources and computing tasks, which improves the comprehensive computing efficiency of MEC.

3. System Model

The system model is a multiuser multiserver application scenario with N terminal devices and M MEC servers. The base stations provide communication resources for user equipment. Each base station is connected to an edge computing server through optical fiber, and terminal devices connect to an MEC server over a wireless communication link to offload task data for computation, as shown in Figure 1. It is assumed that each terminal device can execute its own task either by offloading or locally. When offloading, the task can be offloaded to only one MEC server, and each terminal device is within wireless range. However, the computing power of each MEC server is limited, so it cannot accept the offloading requests of all terminals at the same time.

The set of terminal devices is U={1,2,…, n,…, N}, the set of MEC servers is H={1,2,…, m,…, M}, and the set of all tasks is G. Each terminal device n has a computation-intensive task G_n to be processed, which comprises the data D_n (code and parameters) required for computing task G_n, the CPU workload ϕ_n required for computing task G_n, and the completion-time (delay) constraint τ_n of task G_n; namely, G_n≅(D_n, ϕ_n, τ_n). The set of offloading decisions over all G_n is X=[x₁, x₂,…, x_n,…, x_N].

Here x_n ∈ {0,1,…, m,…, M}: x_n=0 means that G_n is executed locally, and x_n=m (m ≥ 1) means that G_n is offloaded to MEC server m.

3.1. Communication Model

In the computation offloading problem, two links are mainly studied: wireless link from terminal devices to MEC and the wired link from MEC to cloud in the core network. In the wireless link, Finite-State Markov Channel (FSMC) model based on fading characteristics is used. FSMC model has a wide range of applications in wireless networks [23, 24].

The range of a channel-related parameter is partitioned into nonoverlapping intervals, and each interval of the selected parameter represents a state in the FSMC model. The parameter used in FSMC may be the Signal-to-Noise Ratio (SNR) amplitude of the received signal at the receiving end or the collected energy. Here the SNR is selected as the parameter of the model [25]. The SNR at the receiving end is divided into K levels, and each level is associated with a state of a Markov chain. Under the block fading assumption, the SNR at the receiving end is constant within a time period but changes between periods according to the Markov transition probabilities. Let the random variable γ_n be the SNR at the receiving end of terminal device n. Then γ_n evolves according to a finite-state Markov chain whose states are κ={1,2,…, K}. The realization of γ_n in the time period t is denoted Γ_n(t), specifically

\begin{matrix} Γ_{n} (t) = k, \text{ if } γ_{n} (t) \in [h^{k - 1}, h^{k}), \end{matrix}

(1)

where k ∈ κ={1,2,…, K} and h⁰=0 < h¹ < h² < …. Let ρ_{s_n′s_n^″}(t) denote the probability that Γ_n(t) transitions from state s_n′ to state s_n^″ in the time period t. The K × K channel state transition probability matrix of terminal device n is denoted as Φ_n(t)=[ρ_{s_n′s_n^″}(t)]_{K×K}, where ρ_{s_n′s_n^″}(t)=Pr (Γ_n(t+1)=s_n^″|Γ_n(t)=s_n′) and s_n′, s_n^″ ∈ κ.

In practical applications, the transition matrix can be estimated from past observations and measurements of the wireless environment. In addition, {Γ_n(t), 1 ≤ t ≤ T} is assumed to be independent for each terminal device n. Based on the FSMC channel model, Γ_n,m denotes the SNR between terminal device n and MEC server m. Since there is no interference between terminal devices, the channel efficiency can be expressed as ϑ_n,m=log₂ (1+Γ_n,m). The bandwidth W_m of MEC server m is divided into W_m/B_m subchannels, each of bandwidth B_m. Assuming that each user is allocated at most one channel, the transmission rate from terminal device n to MEC server m can be expressed as

\begin{matrix} v_{n, m} (t) = B_{m} ϑ_{n, m} (t) . \end{matrix}

(2)
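As a minimal sketch of equations (1) and (2), the following Python code maps an SNR value to its FSMC state and computes the resulting transmission rate. The threshold values, the linear SNR scale, the open-ended last interval, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import bisect
import math

# Hypothetical SNR thresholds h^0 < h^1 < ... (linear scale); illustrative only.
THRESHOLDS = [0.0, 1.0, 4.0, 16.0]


def fsmc_state(snr, thresholds=THRESHOLDS):
    """Equation (1): return k such that snr lies in [h^(k-1), h^k).

    Assumption: the last state is open-ended, i.e. [h^(K-1), infinity).
    """
    if snr < thresholds[0]:
        raise ValueError("SNR must not be below h^0")
    return bisect.bisect_right(thresholds, snr)


def transmission_rate(bandwidth_hz, snr):
    """Equation (2): v = B_m * log2(1 + SNR), in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr)
```

For example, with a 1 MHz subchannel and a linear SNR of 3, `transmission_rate(1e6, 3.0)` evaluates to 2 Mb/s.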

MEC server m owns only W_m/B_m subchannels; that is, the bandwidth allocated by MEC server m to all connected users cannot exceed the total bandwidth of MEC server m. Besides, each MEC server is limited by its cache and computing capacity. On the one hand, an MEC server can handle only a limited number of tasks; on the other hand, the load that an MEC server can handle is also limited (such as the number of computing tasks). Therefore, some tasks are further offloaded to the core network for processing. Let g_n(t) ∈ {0,1} denote the computation offloading decision indicator, which indicates how the server provides service: g_n(t)=0 means that the computing task of terminal device n is processed by the connected MEC server, and g_n(t)=1 means that the connected MEC server further offloads the task to the core network for processing.

In order to further offload tasks to cloud, the wired backhaul link from MEC server to core network is considered. Assuming that the backhaul link capacity of network is Z (in bits per second), the backhaul link capacity allocated by MEC server m is Z_m. Then, the following restrictions must be met:

\begin{matrix} \sum_{n = 1}^{N} g_{n} (t) θ_{n, m} (t) ϖ_{n, m} (t) \leq Z_{m}, \\ \sum_{m = 1}^{M} \sum_{n = 1}^{N} g_{n} (t) θ_{n, m} (t) ϖ_{n, m} (t) \leq Z, \end{matrix}

(3)

where θ_n,m is the connection between terminal device n and MEC server m and ϖ_n,m is the transmission rate between terminal device n and MEC server m.

That is, the sum of the rates of the terminal devices whose tasks MEC server m offloads to the core network cannot exceed the backhaul capacity of MEC server m, and the sum of the rates of all terminal devices whose computing tasks are processed in the cloud cannot exceed the total backhaul capacity of the system.

3.2. Calculation Model

If G_n is processed locally, let T_n^L denote the local execution time of G_n, which is defined as

\begin{matrix} T_{n}^{L} = \frac{ϕ_{n}}{f_{n}^{L}}, \end{matrix}

(4)

where workload ϕ_n is the total number of CPU cycles required to complete G_n and f_n^L is the local computing power of terminal device n (i.e., the number of CPU cycles executed per second).

Let E_n^L denote the energy consumed by the device when G_n is executed locally, which is defined as follows:

\begin{matrix} E_{n}^{L} = ϕ_{n} \times e_{n}, \end{matrix}

(5)

where e_n is the energy consumed by terminal device n per CPU cycle, e_n=(f_n^L)² × 10⁻²⁷.
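The local cost in equations (4) and (5) can be checked with a short sketch. The function name and the sample workload of 1200 Megacycles are illustrative assumptions; the 1.5 GHz device frequency matches the simulation setting reported in Section 5.

```python
def local_cost(workload_cycles, f_local_hz):
    """Return (T_n^L, E_n^L) for local execution.

    T_n^L = phi_n / f_n^L                   (equation (4))
    E_n^L = phi_n * e_n, e_n = 1e-27 * f^2  (equation (5))
    """
    t_local = workload_cycles / f_local_hz
    energy_per_cycle = 1e-27 * f_local_hz ** 2
    e_local = workload_cycles * energy_per_cycle
    return t_local, e_local


# Illustrative task: 1200 Megacycles on a 1.5 GHz device.
t_local, e_local = local_cost(1.2e9, 1.5e9)
```

Under these assumed values the task takes about 0.8 s and consumes about 2.7 J.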

If G_n is processed at the edge, delay T_n^O and device energy consumption E_n^O under G_n edge execution should be calculated from three parts: data upload, data processing, and data return [26]. The specific calculation is as follows.

First, terminal device n uploads the data of G_n to the corresponding MEC server over the wireless channel. Let T_n′ be the time taken by device n to upload the data of G_n, which is defined as

\begin{matrix} T_{n}^{'} = \frac{D_{n}}{v^{'}}, \end{matrix}

(6)

where D_n is the data size of G_n and v′ is data upload rate in the system model (i.e., the amount of data uploaded per second).

Then, the energy consumption E_n′ of terminal device n uploading data is

\begin{matrix} E_{n}^{'} = T_{n}^{'} \times P^{'}, \end{matrix}

(7)

where P′ is the uplink transmission power of terminal device n.

Then, the MEC server allocates computing resources for the calculation after receiving the uploaded data. Let T_n^″ denote the time taken to compute the offloaded data on the MEC server, which is defined as

\begin{matrix} T_{n}^{″} = \frac{ϕ_{n}}{f_{n m}^{O}}, \end{matrix}

(8)

where f_nm^O is the computing resource allocated by MEC server m to the offloaded execution of G_n (i.e., the number of CPU cycles executed per second). When G_n is executed locally or offloaded to another MEC server, f_nm^O is zero, which serves as a constraint in the model, namely,

\begin{matrix} f_{n m}^{O} = 0, x_{n} \neq m . \end{matrix}

(9)

At this time, terminal device n has no computing task and is in a waiting state and generates idle energy consumption. Suppose P_n^I is the idle power of terminal device n, then the idle energy consumption E_n^″ of terminal device n under offloading computation is

\begin{matrix} E_{n}^{″} = T_{n}^{″} \times P_{n}^{I} . \end{matrix}

(10)

Finally, the MEC server returns the calculation result to terminal device n. Because the returned result is small and the downlink rate is high, the time delay and energy consumption of receiving the result at the terminal device are ignored. Therefore, the delay T_n^O under G_n edge execution is the sum of the transmission delay T_n′ and the calculation delay T_n^″ of the MEC server, namely,

\begin{matrix} T_{n}^{O} = T_{n}^{″} + T_{n}^{'} . \end{matrix}

(11)

The device energy consumption E_n^O under G_n edge execution is the sum of the upload energy consumption E_n′ of device n and the idle energy consumption E_n^″ of device n while it waits for G_n to complete calculation on the MEC server, namely,

\begin{matrix} E_{n}^{O} = E_{n}^{'} + E_{n}^{″} . \end{matrix}

(12)

In summary, the time delay T_n and energy consumption E_n of the entire calculation process of task G_n in terminal device n are

\begin{matrix} T_{n} = \begin{cases} T_{n}^{L}, & x_{n} = 0, \\ T_{n}^{O}, & x_{n} \neq 0, \end{cases} \\ E_{n} = \begin{cases} E_{n}^{L}, & x_{n} = 0, \\ E_{n}^{O}, & x_{n} \neq 0 . \end{cases} \end{matrix}

(13)
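Equations (4) to (13) can be combined into one cost function, sketched here in Python. The function and parameter names are illustrative assumptions; the upload rate (2.5 Mb/s), uplink power (800 mW), and idle power (100 mW) in the example follow the simulation settings in Section 5, while the task size and the allocated edge frequency are made up for the example.

```python
def task_cost(x_n, data_bits, workload_cycles, f_local_hz, f_edge_hz,
              upload_rate_bps, p_upload_w, p_idle_w):
    """Return (T_n, E_n) according to equation (13).

    x_n == 0 means local execution; any other value means edge execution
    with f_edge_hz cycles per second allocated by the chosen MEC server.
    """
    if x_n == 0:
        t_local = workload_cycles / f_local_hz                  # (4)
        e_local = workload_cycles * 1e-27 * f_local_hz ** 2     # (5)
        return t_local, e_local
    t_upload = data_bits / upload_rate_bps                      # (6)
    e_upload = t_upload * p_upload_w                            # (7)
    t_edge = workload_cycles / f_edge_hz                        # (8)
    e_idle = t_edge * p_idle_w                                  # (10)
    return t_upload + t_edge, e_upload + e_idle                 # (11), (12)


# Illustrative task: 1000 kbit of data, 1200 Megacycles, 3 GHz allocated at the edge.
t_edge_total, e_edge_total = task_cost(1, 1e6, 1.2e9, 1.5e9, 3e9, 2.5e6, 0.8, 0.1)
```

With these assumed numbers, offloading takes about 0.8 s and costs the device about 0.36 J, compared with about 2.7 J for local execution of the same workload at 1.5 GHz.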

Note that T_n and f_nm^O should meet the following restrictions:

\begin{matrix} T_{n} \leq η_{n}, \\ \sum_{n = 1}^{N} f_{n m}^{O} \leq F_{m} . \end{matrix}

(14)

The time delay constraint η_n of G_n is that computing power is twice 1.4 GHz. F_m is the total computing resource of MEC server m; that is, the sum of the computing resources allocated to all tasks G_n that are offloaded to MEC server m should not exceed F_m.

3.3. Problem Model

The purpose of this paper is to jointly optimize offloading decision-making and resource allocation scheme in the multiuser multi-MEC server scenario, considering the limited computing resources and time delay constraint of computing tasks. This allows all computing tasks to shorten the completion time and minimize energy consumption of all terminal devices while meeting the delay constraints and extend the use time of terminal devices [27, 28]. Thus, the system objective function Ψ is defined as

\begin{matrix} Ψ = \sum_{n = 1}^{N} E_{n} + 10 \times \sum_{n = 1}^{N} \frac{T_{n}}{η_{n}} . \end{matrix}

(15)

Here T_n/η_n is the ratio of the completion time of G_n to its delay constraint. According to the simulation results, ∑_n=1^N E_n and ∑_n=1^N (T_n/η_n) differ by one decimal order of magnitude. Therefore, to bring the two terms to the same order of magnitude so that they are optimized together, ∑_n=1^N (T_n/η_n) is multiplied by a factor of 10. Minimizing the objective function Ψ over the offloading decision and the resource allocation plan reduces both the overall energy consumption of the terminal devices and the ratio of task execution time to the delay constraint, which achieves the research purpose. The overall problem model is as follows:

\begin{matrix} \min_{X, Y} (Ψ), \\ X = [x_{1}, x_{2}, \dots, x_{n}, \dots, x_{N}], \\ Y = [y_{1}, y_{2}, \dots, y_{n}, \dots, y_{N}], \\ y_{n} = \begin{cases} f_{n}^{L}, & x_{n} = 0, \\ f_{n m}^{O}, & x_{n} = m, \end{cases} \\ \text{s.t.} \\ C_{1} : x_{n} \in \{0,1, \dots, m, \dots, M\}, \forall n \in U, \\ C_{2} : y_{n} > 0, \forall n \in U, \\ C_{3} : f_{n m}^{O} = 0, x_{n} \neq m, \\ C_{4} : T_{n} \leq η_{n}, \forall n \in U, \\ C_{5} : \sum_{n = 1}^{N} f_{n m}^{O} \leq F_{m}, \forall m \in H, \\ C_{6} : \begin{cases} \sum_{n = 1}^{N} g_{n} (t) θ_{n, m} (t) ϖ_{n, m} (t) \leq Z_{m}, \\ \sum_{m = 1}^{M} \sum_{n = 1}^{N} g_{n} (t) θ_{n, m} (t) ϖ_{n, m} (t) \leq Z, \end{cases} \end{matrix}

(16)

where X is the task offloading decision vector and Y is the computing resource allocation vector. Constraints C₁, C₂, and C₃ indicate that each task G_n can be executed only locally or on exactly one MEC server. C₄ is the task completion delay constraint, C₅ is the constraint that the allocated computing resources must satisfy, and C₆ is the backhaul capacity constraint from (3).
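To make the problem model concrete, the sketch below evaluates the objective Ψ of equation (15) and checks constraints C₁ to C₅ for a candidate solution. It is only an illustration under stated assumptions: the function names and data layout are hypothetical, the backhaul constraint C₆ is omitted, and C₂ is checked only for offloaded tasks.

```python
def objective(energies, times, deadlines, scale=10.0):
    """Equation (15): total energy plus scaled sum of T_n / eta_n."""
    return sum(energies) + scale * sum(t / d for t, d in zip(times, deadlines))


def is_feasible(x, f_edge, times, deadlines, capacities):
    """Check C1-C5 for a candidate offloading and allocation.

    x[n] is 0 (local) or a server index in 1..M.
    f_edge[n][m - 1] is the number of cycles/s server m allocates to task n.
    capacities[m - 1] is F_m. The backhaul constraint C6 is not checked.
    """
    num_servers = len(capacities)
    for n, x_n in enumerate(x):
        if not 0 <= x_n <= num_servers:                  # C1
            return False
        for m in range(1, num_servers + 1):
            if x_n != m and f_edge[n][m - 1] != 0:       # C3
                return False
        if x_n != 0 and f_edge[n][x_n - 1] <= 0:         # C2 (offloaded tasks)
            return False
        if times[n] > deadlines[n]:                      # C4
            return False
    for m in range(num_servers):
        if sum(row[m] for row in f_edge) > capacities[m]:  # C5
            return False
    return True
```

A search or learning procedure would call `is_feasible` to discard invalid candidates and `objective` to rank the remaining ones; how infeasible actions are handled inside the Dueling-DQN agent is not specified here.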

4. New Computation Offloading Method Based on Improved DQN

4.1. Multiagent Reinforcement Learning Algorithm

The multiagent reinforcement learning system is shown in Figure 2, where multiple agents act at the same time. Under the joint action, the entire system will be transferred, and each agent will be rewarded immediately [29, 30].

For multiagent reinforcement learning, it is first necessary to establish a Markov game model. A Markov game can be described by a tuple (n, S, A₁,…, A_n, R₁,…, R_n), where

(1) n is the number of agents, which here equals the number N of terminal devices. S is the system state, which generally refers to the joint state of all agents. The terminal devices share the current load status of the edge computing servers, which can be expressed as
$\begin{matrix} L D (t) = [L D_{1} (t), L D_{2} (t), \dots, L D_{M} (t)], \end{matrix}$ (17)
where LD_m is the load of MEC server m.
(2) R_i is the instant reward function of agent i. That is, in the current state s, after the agents take the joint action (A₁,…, A_n), the reward is obtained in the next system state $\hat{s}$ .

The reward function completely describes the relationship between the agents. When the reward function of every agent is the same, that is, R₁=R₂=⋯=R_n, the agents are fully cooperative. When there are only two agents and their reward functions are opposite, that is, R₁=−R₂, the agents are in perfect competition. When the reward functions lie between these two cases, the relationship is a mixture of competition and cooperation.

4.2. Problem Description and Modeling

4.2.1. Network Status

S={s(t)} represents the network state space, where s(t) is the network state in time period t, which evolves over the entire horizon T. The network state consists of the SNR of each terminal device and the cache status of each MEC server. s(t) can be defined as

\begin{matrix} s (t) = (Γ_{1} (t), \dots, Γ_{n} (t), \dots, Γ_{N} (t), \\ ψ_{1} (t), \dots, ψ_{m} (t), \dots, ψ_{M} (t)), \end{matrix}

(18)

where Γ_n={Γ_n,m, m ∈ M} represents SNR between user terminal device n and all MECs. ψ_m(t)={φ_k,m, k ∈ K} represents the cache status of MEC servers.

4.2.2. Network Behavior

The intelligent agent needs to determine the attachment relationship between the terminal device and MEC server in each time period. That is the determination of the terminal device's computing offload, the allocation of computing resources, and the service cache policy of each MEC server. Thus, each executable action of terminal devices in the time period t can be defined as follows:

\begin{matrix} a (t) = (A_{1} (t), \dots, A_{n} (t), \dots, A_{N} (t), \\ G_{1} (t), \dots, G_{n} (t), \dots, G_{N} (t), \\ ψ_{1} (t), \dots, ψ_{m} (t), \dots, ψ_{M} (t)), \end{matrix}

(19)

where A_n(t) = {a_n,m(t), m ∈ M} represents the attachment indicator of terminal device n and G_n(t) represents the calculation and offloading decision of terminal device n.

4.2.3. Reward Function

The goal is to maximize the total benefit of the system, while the instant reward function is set to the current benefit of the system. First, calculate the part of the revenue that comes from leasing spectrum and backhaul resources and allocating them to terminal devices. The unit price of spectrum leased from MEC server m is set to δ_m per Hz, and the unit price of the backhaul link from MEC server m to the core network is set to σ_m per bps. Correspondingly, terminal device n is charged for transmitting its calculation data to its MEC server and for using the backhaul link from the MEC server to the core network, at unit prices of α_n per Hz and β_n per bps, respectively. Summing this income and expenditure gives the revenue from leased spectrum and backhaul resources for terminal device n:

\begin{matrix} R_{n}^{'} (t) = α_{n} \sum_{m = 1}^{M} a_{n, m} (t) B_{m} + β_{n} g_{n} (t) \sum_{m = 1}^{M} a_{n, m} (t) R_{n, m} (t) - \sum_{m = 1}^{M} δ_{m} a_{n, m} (t) B_{m} - g_{n} (t) \sum_{m = 1}^{M} σ_{m} a_{n, m} (t) R_{n, m} (t) . \end{matrix}

(20)

Then, calculate the profit obtained from allocating computing resources to terminal devices. On the one hand, when the MEC side performs computing tasks, it must pay the communication company for the energy consumed in processing them; the unit price of the energy consumption of MEC server m is defined as χ_m. On the other hand, the terminal device pays the MEC-side server for computing, and the unit price of the computing resources allocated to each unit computing task is set to ζ_n. Therefore, the benefit obtained by allocating computing resources to terminal device n can be calculated as

\begin{matrix} R_{n}^{″} (t) = (1 - g_{n} (t)) \sum_{m = 1}^{M} a_{n, m} (t) \left( \frac{ζ_{n} F_{n, m} (t)}{L_{u_{n}}} - χ_{m} E_{n, m}^{M E C, e} (t) \right) . \end{matrix}

(21)

The amount of computing resources allocated to each unit computing task in the above formula strongly affects the completion time of the computing task. The service cache cost mainly includes two parts: the cost of replacing the type of cache supported on the MEC side and the cost of caching specific services on the MEC server. Define the unit price of replacing the cache type on MEC server m as ξ_m for each service type and the unit price of caching services on MEC server m as ς_m per unit of storage space. In order to increase the benefits of cache, the business type is quantified by weak backhaul from the MEC server to the core network, which will be used to measure the cost of users. The benefits obtained by executing the cache service on MEC server m can be expressed as

\begin{matrix} R_{m}^{″} (t) = \sum_{n = 1}^{N} β_{n} (1 - g_{n} (t)) R_{n, m} (t) - ξ_{m} |I [ψ_{m} (t) - ψ_{m} (t - 1)]| - ς_{m} κ |ψ_{m} (t)|, \end{matrix}

(22)

where |ψ_m(t)| represents the number of nonzero elements of ψ_m(t) and I[·] is an indicator function: I(x)=1 when x > 0, and I(x)=0 otherwise. The instant reward is designed as the total income of the mobile virtual network operator (MVNO) from all current users of the system during the time period t, namely,

\begin{matrix} r (t) = \sum_{n = 1}^{N} (R_{n}^{'} (t) + R_{n}^{″} (t)) + \sum_{m = 1}^{M} R_{m}^{″} (t) . \end{matrix}

(23)

Here the long-term return ℜ is expressed as

\begin{matrix} ℜ = \sum_{t = 1}^{T} ϵ^{t - 1} r (t), \end{matrix}

(24)

where ϵ ∈ [0,1) is the discount rate of future earnings weights. When ϵ approaches 1, the system will pay more attention to long-term benefits, and when ϵ approaches 0, the system will pay more attention to short-term benefits.
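The effect of the discount rate on the long-term return can be illustrated with a short sketch. It assumes the common convention that the reward of period t is weighted by ϵ^(t−1), so that the first reward is undiscounted; the function name and the reward sequence are illustrative.

```python
def discounted_return(rewards, discount):
    """Sum of discount**(t-1) * r(t) for t = 1..T (first reward undiscounted)."""
    return sum(discount ** t * r for t, r in enumerate(rewards))


# The same reward stream is valued very differently under different discounts.
rewards = [1.0] * 50
short_sighted = discounted_return(rewards, 0.1)
far_sighted = discounted_return(rewards, 0.9)
```

With a discount of 0.1 almost all of the return comes from the first one or two periods, whereas with 0.9 later periods still contribute substantially, which matches the qualitative statement above.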

4.3. Dueling-DQN

DQN is an effective reinforcement learning algorithm that lets the agent learn from its interaction with the environment [31–33]. Several improvements to DQN target different aspects of its learning mechanism. In DQN, the estimated Q values themselves contain errors, and the update takes max_a Q over these estimates. This is equivalent to selecting the largest error, which leads to the problem of overestimation. Double-DQN is an effective improvement for this problem. In the Double-DQN algorithm, the update form of $\hat{Q} (s)$ is changed to

\begin{matrix} \hat{Q} (s) = R (s) + λ \cdot \hat{Q} (\hat{s}, \arg\max_{a} Q_{eval} (\hat{s}, a; α); α^{-}), \end{matrix}

(25)

where λ is the discount factor.

The Double-DQN algorithm takes advantage of double neural network and uses two neural networks to learn at the same time, effectively avoiding the overestimation problem caused by error amplification.
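The decoupling used by Double-DQN can be sketched as follows: the evaluation network selects the next action, and the target network supplies the value of that action. The function name, the plain-list representation of Q values, and the `done` flag are illustrative assumptions rather than details from the paper.

```python
def double_dqn_target(reward, q_eval_next, q_target_next, discount, done=False):
    """Double-DQN target for one transition.

    q_eval_next:   Q values of the next state from the evaluation network.
    q_target_next: Q values of the next state from the target network.
    """
    if done:
        return reward
    # Action selection uses the evaluation network ...
    best_action = max(range(len(q_eval_next)), key=q_eval_next.__getitem__)
    # ... while action evaluation uses the target network.
    return reward + discount * q_target_next[best_action]


target = double_dqn_target(1.0, [0.2, 0.9, 0.5], [1.0, 2.0, 3.0], 0.9)
```

In this example the evaluation network prefers action 1, so the target is 1.0 + 0.9 × 2.0 = 2.8, whereas a plain DQN target taking the maximum of the target network would give 1.0 + 0.9 × 3.0 = 3.7.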

Dueling-DQN is also an improvement to DQN algorithm. Compared with previous algorithms, Dueling-DQN algorithm learns faster and has better results. Compared with DQN algorithm, Dueling-DQN retains most of the learning mechanism, and the only difference is the improvement of neural network, as shown in Figure 3.

Figure 3. Comparison between DQN algorithm and Dueling-DQN algorithm. (a) DQN. (b) Dueling-DQN.

In the traditional DQN algorithm, the output result is Q value corresponding to each action. In Dueling-DQN algorithm, the output is expressed as a combination of two parts: the value function and advantage function [34]. Among them, value function refers to the value of a certain state, and advantage function refers to the advantage obtained by each action on the state. Therefore, in Dueling-DQN algorithm, Q value problem in DQN can be reexpressed as the following form:

\begin{matrix} Q (s, a; ω, ω_{1}, ω_{2}) = V (s; ω, ω_{2}) + ℓ (s, a; ω, ω_{1}) - \frac{1}{|ℓ|} \sum_{a^{'}} ℓ (s, a^{'}; ω, ω_{1}), \end{matrix}

(26)

where V(·) and ℓ(·) are the value function and advantage function, respectively, ω is the parameter of the shared convolutional layers of the neural network, and ω₁ and ω₂ are the parameters of the advantage stream and the value stream, respectively. The last term, which is subtracted, centers the advantage function around its mean over the |ℓ| actions, so that V and ℓ are uniquely determined for a given Q value.
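The aggregation step in equation (26) can be sketched without any neural network by treating the outputs of the two streams as plain numbers. The function name and sample values are illustrative assumptions.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and the advantages into Q(s, a) with mean-centering."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


q_a = dueling_q_values(2.0, [1.0, 0.0, -1.0])
# Shifting every advantage by the same constant leaves Q unchanged,
# which is why the centered form makes V and the advantages identifiable.
q_b = dueling_q_values(2.0, [2.0, 1.0, 0.0])
```

Both calls give the same Q values, and the mean of the Q values equals the state value V(s).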

5. Experimental Results and Analysis

The specific simulation parameters are as follows.

Assume that the computing power of each device n is 1.5 GHz, the uplink transmission power is 800 mW, the idle power is 100 mW, and the upload rate is 2.5 Mb/s. M = 4, and the overall computing capacities of the MEC servers are 6 GHz, 5 GHz, 3 GHz, and 1 GHz, respectively. The data D_n in task G_n follows a uniform distribution on (600, 1200), in kbits. The workload ϕ_n follows a uniform distribution on (1000, 1500), in Megacycles.

For the parameters of Dueling-DQN algorithm, set the learning rate ϵ to 0.001 and discount coefficient λ to 0.90. The size of experience replay set is 3000, and the number of randomly sampled samples is 40.
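The experience replay settings above (a replay set of 3000 transitions and 40 randomly sampled transitions per update) can be sketched with a fixed-size buffer. The class name and the tuple layout of a transition are illustrative assumptions; the paper does not describe its implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience replay; the oldest transitions are dropped first."""

    def __init__(self, capacity=3000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        # A transition is assumed to be (state, action, reward, next_state).
        self.buffer.append(transition)

    def sample(self, batch_size=40):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer()
for step in range(3500):
    buffer.push((step, 0, 0.0, step + 1))
batch = buffer.sample()
```

After 3500 insertions the buffer holds only the most recent 3000 transitions, and each call to `sample` returns 40 of them.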

5.1. Parameter Analysis

5.1.1. Learning Rate Analysis

The learning rate of the algorithm will have a great impact on the performance of the proposed strategy. Therefore, three different learning rates ϵ of 0.01, 0.001, and 0.0001 are selected to compare the convergence of improved DQN algorithm, as shown in Figure 4.

Figure 4. Convergence of the proposed algorithm under different learning rates.

5.1.2. Discount Factors Analysis

Similarly, the influence of discount factor on improved DQN algorithm is shown in Figure 5, where the discount factor takes values 0.8, 0.9, and 0.95.

Figure 5. System long-term rewards with different discount rates.

It can be seen from Figure 5 that the long-term reward increases as the discount factor increases. When λ is 0.95, the long-term reward stabilizes at 3700. The discount factor affects the behavior selection strategy: a larger discount factor makes the system pay more attention to long-term benefits, and a smaller one makes it pay more attention to current benefits, so a higher discount factor often leads to greater long-term benefits. However, in actual use, an overly high discount factor does not bring corresponding benefits. Real systems are more changeable, and too much emphasis on future benefits leads to excessive calculations and excessive losses in the system, so a trade-off is often required.

5.2. Optimization Comparison under Different Objective Functions

For multiobjective optimization problems that reduce time delay and energy consumption, the weighted sum of task execution time delay and terminal execution energy consumption is usually used as the objective function to solve problem, and the calculation is as follows:

\Psi' = \frac{1}{N} \sum_{n=1}^{N} \left( \omega_t T_n + (1 - \omega_t) E_n \right),

(27)

where ω_t is the weight coefficient of execution delay and 1 − ω_t is the weight coefficient of execution energy.
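The weighted-sum objective in (27) can be sketched as a short function. The function name and the sample delays and energies are hypothetical, and the sketch assumes T_n and E_n are supplied as per-task delay and energy values.

```python
def weighted_sum_objective(delays, energies, w_t):
    """Equation (27): mean over tasks of w_t * T_n + (1 - w_t) * E_n."""
    if len(delays) != len(energies) or not delays:
        raise ValueError("delays and energies must be non-empty and equal length")
    if not 0.0 <= w_t <= 1.0:
        raise ValueError("w_t must lie in [0, 1]")
    total = sum(w_t * t + (1.0 - w_t) * e for t, e in zip(delays, energies))
    return total / len(delays)


# Hypothetical per-task delays and energies; not data from the paper.
delays = [0.5, 1.0, 1.5]
energies = [2.0, 4.0, 6.0]
for w_t in (0.4, 0.6, 0.8):
    print(w_t, weighted_sum_objective(delays, energies, w_t))
```

With ω_t = 1 the objective reduces to the mean delay, and with ω_t = 0 it reduces to the mean energy. Because delay and energy are measured in different units, the two terms may need to be normalized before they are combined; whether the paper does so is not stated in this section.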

The weighted-sum objective (27) is compared with the proposed objective function (13) in optimizing delay and energy consumption, with the number of terminal devices set to 12. Because the goal of the proposed strategy is to shorten time delay and reduce energy consumption while satisfying the delay constraints, ω_t is set to 0.8, 0.6, and 0.4 in turn, and joint experiments on the Energy Reduced Scale (ERS) and Time Reduced Scale (TRS) are carried out, as shown in Table 1.

Table 1.

Comparison results of optimization for different objective functions.

Objective function	ERS (%)	TRS (%)
ω_t = 0.4	48.71	27.96
ω_t = 0.6	41.85	31.38
ω_t = 0.8	31.03	32.56
Formula (13)	52.18	34.72

Table 1 shows that when ω_t is 0.8 or 0.6, the control strategy pays more attention to delay optimization, and when ω_t is 0.4, it pays more attention to energy consumption. The proposed objective function gives the best result on both measures, with ERS and TRS of 52.18% and 34.72%, respectively, so it shortens time delay and reduces energy consumption under the delay constraints.

When the number of computing tasks is 150, the control strategies under the four objective functions are compared with the random offloading strategy. The resulting delay and energy consumption reduction ratios are shown in Table 2.

Table 2.

Comparison results of optimization for four objective functions.

Objective function	Delay reduction ratio (%)	Energy consumption reduction ratio (%)
ω_t = 0.4	1.25	23.29
ω_t = 0.6	1.97	21.16
ω_t = 0.8	2.03	19.98
Formula (13)	2.58	30.67

Table 2 shows that the proposed optimization objective achieves the best delay and energy consumption results, with reduction ratios of 2.58% and 30.67%, respectively. This is because the proposed objective jointly optimizes the offloading decision and the resource allocation plan when computing resources are limited and computing tasks have delay constraints. As a result, all computing tasks can shorten their completion time and the energy consumption of all terminal devices is minimized while the delay constraints are met, which demonstrates the effectiveness of the proposed objective function.
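The reduction ratios in Table 2 are stated relative to the random offloading strategy. A minimal sketch of such a ratio is given here, under the assumption that it is the relative decrease with respect to the baseline, expressed as a percentage. The exact definition used in the paper is not given in this section, and the sample values are hypothetical.

```python
def reduction_ratio(baseline, optimized):
    """Percentage by which `optimized` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - optimized) / baseline


# Hypothetical energies in joules: random offloading versus an optimized strategy.
baseline_energy = 100.0
optimized_energy = 69.33
ratio = reduction_ratio(baseline_energy, optimized_energy)
```

A positive ratio means the optimized strategy consumes less than the baseline, and a negative ratio means it consumes more.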

5.3. Performance Comparison with Other Algorithms

To demonstrate the performance of the proposed strategy, it is compared with [12], [19], and [14] in terms of objective function value, calculation amount, and energy saving. Li and Jiang [12] proposed a distributed task offloading strategy, which selects the best MEC node offloading amount by a game equation on the basis of quantifying offloading cost and delay. Reference [14] used the greedy selection algorithm to design the maximum energy-saving priority algorithm and energy priority strategy to achieve optimal offloading of computing tasks on mobile devices. Reference [19] used the time average calculation rate maximization algorithm to jointly and efficiently allocate radio resources and computing resources.

5.3.1. Algorithm Comparison under Different Cumulative Tasks

Figure 6 shows the objective function values of the four strategies for different cumulative numbers of computing tasks.

Figure 6. Comparison results of offloading strategies under different cumulative number of tasks.

Figure 6 shows that the objective function value of all four offloading strategies increases with the cumulative number of tasks. The proposed strategy, however, has a lower objective function value than the other strategies, meaning that its energy consumption and delay are smaller. For example, when the number of tasks is 180, its objective function value is only 298. This is because the proposed strategy considers computation offloading and resource allocation jointly and uses an improved deep learning algorithm to minimize delay and energy consumption. Reference [19] only matched computing resources and did not rationally optimize the task offloading scheme and the computing resource allocation scheme, resulting in high task execution delay and energy consumption. References [12] and [14] both used optimization algorithms to achieve better resource allocation and task offloading, but they give less consideration to time delay, so their performance needs to be strengthened.

5.3.2. Computation Number Comparison of Offloading Tasks under Different Offloading Strategies

Figure 7 compares the computation number of offloaded tasks on the terminal device side under the four computation offloading strategies. The vertical axis represents the total calculation number of tasks that all terminal devices handle through computation offloading. This number represents the amount of calculation service provided by the MEC server, so the indicator in the figure also reflects the benefit that terminal devices obtain in offloading mode.

Figure 7. Comparison results of the computation number of offloading tasks under different offloading strategies.

Figure 7 shows that as time increases, computing tasks continue to accumulate and the amount of task calculation also increases. The calculation amount of the proposed strategy is significantly higher than that of the comparison strategies. At a simulation time of 140 s, for example, it is 11.54%, 20.83%, and 152.72% higher than that of [12], [19], and [14], respectively. The proposed strategy therefore performs best in task offloading. It uses the Dueling-DQN algorithm to handle the task offloading and resource allocation models, and its optimization performance is better than the greedy selection algorithm in [14] and the game equation model in [12].

5.3.3. Energy-Saving Comparison of per Unit Terminal Devices

Figure 8 compares the average energy saved by each terminal device through computation offloading under the four strategies. In the local calculation mode, all energy consumption is generated by local calculation. In the computation offloading mode, the energy consumption is the communication energy spent uploading the task. For an offloaded task, the difference between the two is the energy saving.
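The energy-saving measure described above can be sketched as follows. The sketch assumes that the per-device local and upload energies are already known, that a device computing locally saves nothing, and that the average is taken over all devices. These choices and the sample values are assumptions for illustration, not the paper's exact procedure.

```python
def average_energy_saving(local_energy, upload_energy, offloaded):
    """Mean saving per device; devices that compute locally save nothing."""
    if not (len(local_energy) == len(upload_energy) == len(offloaded)):
        raise ValueError("all inputs must have the same length")
    if not local_energy:
        raise ValueError("at least one device is required")
    savings = [
        (e_local - e_upload) if is_offloaded else 0.0
        for e_local, e_upload, is_offloaded
        in zip(local_energy, upload_energy, offloaded)
    ]
    return sum(savings) / len(savings)


# Hypothetical per-device energies in joules; not data from the paper.
local_energy = [10.0, 8.0, 12.0]
upload_energy = [2.0, 3.0, 4.0]
offloaded = [True, False, True]
saving = average_energy_saving(local_energy, upload_energy, offloaded)
```

In this example two of the three devices offload and each saves 8 J, so the average over all three devices is 16/3 J.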

Figure 8. Comparison results of energy saving per unit terminal device under different offloading strategies.

Figure 8 shows that the proposed strategy achieves the largest energy saving of all the strategies, close to 10 × 10^4 J, which also means the lowest energy consumption. To address the overestimation problem in DQN, the proposed strategy uses the Dueling-DQN algorithm for optimization. It also designs the system benefits and resource consumption as rewards and losses, which improves the efficiency and rationality of task offloading and resource allocation. Reference [19] only used the time average calculation rate maximization algorithm to allocate computing resources. This more traditional optimization algorithm performs poorly, so the overall energy saving is not high. Reference [12] used the game equation model to optimize the task offloading strategy but did not rationalize resource allocation, so its maximum energy saving is 710 × 10^4 J. Reference [14] used the greedy selection algorithm to design an optimal energy-saving strategy but did not consider server computing resource constraints and task delay constraints, so its overall performance is not as good as that of the proposed strategy.
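For readers unfamiliar with the dueling architecture mentioned above, the following sketch shows the standard mean-subtracted aggregation of a state value V(s) and per-action advantages A(s, a) into Q-values. It is a generic illustration, not the partially improved variant used in the paper, which may differ. The function name and the sample head outputs are hypothetical.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical head outputs for one state with three offloading actions.
q_values = dueling_q_values(state_value=2.0, advantages=[0.5, -0.5, 1.5])
# Greedy action: index of the largest Q-value.
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, so the network can learn how good a state is separately from how much each action matters in it.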

6. Conclusion

MEC servers have limited computing resources, and computing tasks have delay constraints. How to shorten completion time and reduce terminal energy consumption under the delay constraints is therefore an important research issue. To solve this problem, this paper proposes a task offloading and resource allocation strategy based on deep learning for MEC. In the multiuser multiserver MEC environment, a new objective function is designed to construct the mathematical model. In combination with deep reinforcement learning, the partially improved Dueling-DQN algorithm is used to solve the optimization problem model, which can reduce the completion time of computing tasks and minimize the energy consumption of all terminal devices under the delay constraints. The proposed strategy is demonstrated by experiments on the Python platform. The experimental results show that when the learning rate is 0.001 and the discount factor is 0.90, the energy saving is close to 10 × 10^4 J, which is better than the comparison strategies. In terms of calculation amount, the proposed strategy is 11.54%, 20.83%, and 152.72% higher than [12], [19], and [14], respectively.

In practice, different users have different concerns about service quality. Future research can therefore take the different needs of users into account when making computation offloading decisions, for example by assigning weights to the factors that affect quality of service and combining them with task priority for scheduling.

Acknowledgments

This work was supported by Key Disciplines of Computer Science and Technology (2019xjzdxk1) and New Engineering Pilot Project (szxy2018xgk05).

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

1. Peng W. P., Su Z., Song C., Zongpu J. Research on adaptive dual task offloading decision algorithm for parking space recommendation service. The Journal of China Universities of Posts and Telecommunications. 2019;26(06):33–45.
2. Wang K., Wang X. F., Liu X., Jolfaei A. Task offloading strategy based on reinforcement learning computing in edge computing architecture of internet of vehicles. IEEE Access. 2020;8(6):173779–173789. doi: 10.1109/access.2020.3023939.
3. Wang J., Hu J., Min G., Zomaya A. Y., Georgalas N. Fast adaptive task offloading in edge computing based on meta reinforcement learning. IEEE Transactions on Parallel and Distributed Systems. 2021;32(1):242–253. doi: 10.1109/tpds.2020.3014896.
4. Sun Y., Guo X., Song J., et al. Adaptive learning-based task offloading for vehicular edge computing systems. IEEE Transactions on Vehicular Technology. 2019;68(4):3061–3074. doi: 10.1109/tvt.2019.2895593.
5. Liu X., Yu J., Wang J., Gao Y. Resource allocation with edge computing in IoT networks via machine learning. IEEE Internet of Things Journal. 2020;7(4):3415–3426. doi: 10.1109/jiot.2020.2970110.
6. Wang R., Cao Y., Noor A., Alamoudi T. A., Nour R. Agent-enabled task offloading in UAV-aided mobile edge computing. Computer Communications. 2020;149(5):324–331. doi: 10.1016/j.comcom.2019.10.021.
7. Zhan W., Luo C., Min G., Wang C., Zhu Q., Duan H. Mobility-aware multi-user offloading optimization for mobile edge computing. IEEE Transactions on Vehicular Technology. 2020;69(3):3341–3356. doi: 10.1109/tvt.2020.2966500.
8. Wei Z., Pan J., Lyu Z., Xu J., Shi L., Xu J. An offloading strategy with soft time windows in mobile edge computing. Computer Communications. 2020;164(8):42–49. doi: 10.1016/j.comcom.2020.09.011.
9. Zhang R., Cheng P., Chen Z., Liu S., Li Y., Vucetic B. Online learning enabled task offloading for vehicular edge computing. IEEE Wireless Communications Letters. 2020;9(7):1–932. doi: 10.1109/lwc.2020.2973985.
10. Zhang Q., Gui L., Hou F., Chen J., Zhu S., Tian F. Dynamic task offloading and resource allocation for mobile edge computing in dense cloud RAN. IEEE Internet of Things Journal. 2020;7(4):3282–3299. doi: 10.1109/jiot.2020.2967502.
11. Zhang J., Guo H., Liu J. Adaptive task offloading in vehicular edge computing networks: a reinforcement learning based scheme. Mobile Networks and Applications. 2020;25(5):1736–1745. doi: 10.1007/s11036-020-01584-6.
12. Li Y., Jiang C. Distributed task offloading strategy to low load base stations in mobile edge computing environment. Computer Communications. 2020;164(2):240–248. doi: 10.1016/j.comcom.2020.10.021.
13. Zhang J., Guo H., Liu J., Zhang Y. Task offloading in vehicular edge computing networks: a load-balancing solution. IEEE Transactions on Vehicular Technology. 2020;69(2):2092–2104. doi: 10.1109/tvt.2019.2959410.
14. Wei F., Chen S., Zou W. A greedy algorithm for task offloading in mobile edge computing system. China Communications. 2018;15(11):149–157. doi: 10.1109/cc.2018.8543056.
15. Lu H., Gu C., Luo F., Ding W., Liu X. Optimization of lightweight task offloading strategy for mobile edge computing based on deep reinforcement learning. Future Generation Computer Systems. 2020;102(3):847–861. doi: 10.1016/j.future.2019.07.019.
16. Li L., Zhang H. Delay optimization strategy for service cache and task offloading in three-tier architecture mobile edge computing system. IEEE Access. 2020;8(9):170211–170224. doi: 10.1109/access.2020.3023771.
17. Zhang X., Zhang J., Liu Z., Cui Q., Tao X., Wang S. MDP-based task offloading for vehicular edge computing under certain and uncertain transition probabilities. IEEE Transactions on Vehicular Technology. 2020;69(3):3296–3309. doi: 10.1109/tvt.2020.2965159.
18. Wang F., Xu J., Cui S. Optimal energy allocation and task offloading policy for wireless powered mobile edge computing systems. IEEE Transactions on Wireless Communications. 2020;19(4):2443–2459. doi: 10.1109/twc.2020.2964765.
19. Li C., Chen W., Tang J., Luo Y. Radio and computing resource allocation with energy harvesting devices in mobile edge computing environment. Computer Communications. 2019;145(09):193–202. doi: 10.1016/j.comcom.2019.06.001.
20. Ali Z., Khaf S., Abba Z. H., Abbas G., Jiao L. A Comprehensive Utility Function for Resource Allocation in Mobile Edge Computing. arXiv preprint arXiv:2012.10468. 2020;66(2):1461–1477. doi: 10.32604/cmc.2020.013743.
21. Huang P. Q., Wang Y., Wang K., Zhi-Zhong L. A bilevel optimization approach for joint offloading decision and resource allocation in cooperative mobile edge computing. IEEE Transactions on Cybernetics. 2019;50(10):1–14. doi: 10.1109/TCYB.2019.2916728.
22. Liu Y., Li Y., Niu Y., Jin D. Joint optimization of path planning and resource allocation in mobile edge computing. IEEE Transactions on Mobile Computing. 2020;19(9):2129–2144. doi: 10.1109/tmc.2019.2922316.
23. Lei Y. A., Cz A., Qy B., Zou W., Fathalla A. Task offloading for directed acyclic graph applications based on edge computing in Industrial Internet-ScienceDirect. Information Sciences. 2020;540(7):51–68.
24. He X. F., Jin R. C., Dai H. Y. Peace: privacy-preserving and cost-efficient task offloading for mobile-edge computing. IEEE Transactions on Wireless Communications. 2020;19(3):1814–1824. doi: 10.1109/twc.2019.2958091.
25. Gu B., Zhou Z. Task offloading in vehicular mobile edge computing: a matching-theoretic framework. IEEE Vehicular Technology Magazine. 2019;14(3):100–106. doi: 10.1109/mvt.2019.2902637.
26. Alfakih T., Hassan M. M., Gumaei A., Savaglio C., Fortino G. Task offloading and resource allocation for mobile edge computing by deep reinforcement learning based on SARSA. IEEE Access. 2020;8(5):54074–54084. doi: 10.1109/access.2020.2981434.
27. Pan Y., Chen M., Yang Z., Huang N., Shikh-Bahaei M. Energy-efficient NOMA-based mobile edge computing offloading. IEEE Communications Letters. 2019;23(2):310–313. doi: 10.1109/lcomm.2018.2882846.
28. Jiang Y. L., Chen Y. S., Yang S. W., Wu C. H. Energy-efficient task offloading for time-sensitive applications in fog computing. IEEE Systems Journal. 2019;13(3):2930–2941. doi: 10.1109/jsyst.2018.2877850.
29. Xu X., Liu Q., Luo Y., et al. A computation offloading method over big data for IoT-enabled cloud-edge computing. Future Generation Computer Systems. 2019;95(06):522–533. doi: 10.1016/j.future.2018.12.055.
30. Shu C., Zhao Z., Han Y., Min G., Duan H. Multi-user offloading for edge computing networks: a dependency-aware and latency-optimal approach. IEEE Internet of Things Journal. 2020;7(3):1678–1689. doi: 10.1109/jiot.2019.2943373.
31. Hu S., Li G. Dynamic request scheduling optimization in mobile edge computing for IoT applications. IEEE Internet of Things Journal. 2020;7(2):1426–1437. doi: 10.1109/jiot.2019.2955311.
32. Zeng J., Sun J., Wu B., Su X. Mobile edge communications, computing, and caching (MEC3) technology in the maritime communication network. China Communications. 2020;17(5):223–234. doi: 10.23919/jcc.2020.05.017.
33. Lin Q., Wang F., Xu J. Optimal task offloading scheduling for energy efficient D2D cooperative computing. IEEE Communications Letters. 2019;23(10):1816–1820. doi: 10.1109/lcomm.2019.2931719.
34. Xu X., He C., Xu Z., Qi L., Wan S., Bhuiyan M. Z. A. Joint optimization of offloading utility and privacy for edge computing enabled IoT. IEEE Internet of Things Journal. 2020;7(4):2622–2629. doi: 10.1109/jiot.2019.2944007.
