National Science Review. 2020 Mar 19; 7(7): 1120–1122. doi: 10.1093/nsr/nwaa019

Optimization via game theoretic control

Daizhan Cheng, Zequn Liu
PMCID: PMC8289022  PMID: 34692133

Summary

Using game theoretic control to solve optimization problems is a promising, recently developed method. The key technique is to convert a networked system into a potential game, with a preassigned criterion as the potential function. An algorithm is then designed to update strategies so as to reach a Nash equilibrium (i.e. an optimal solution).

INTRODUCTION

Game-based control is a cross-discipline of control theory and game theory. In a certain sense, control theory and game theory are twin brothers, both born in the 1940s. They share a common task: to manipulate certain objects so as to reach preassigned goals. A major difference, however, is that the objects of control are not intelligent, whereas the objects of games are intelligent and therefore capable of 'anti-control'. Some authors claim that a control system is a particular kind of 'game', but this demarcation line is murky. It was noted in Ref. [1] that when facing intelligent objects, such as intelligent machines and intelligent networks, the existing control theory may not apply directly, because the controllers and the controlled objects may have game-like interactions. Incorporating game factors into the control framework is thus an important research topic, and is inevitable in dealing with certain social and economic problems. Meanwhile, combining game factors can tremendously broaden the development of control theory and its applications.

There are many successful applications of game theory to control. For instance, the distributed coverage of graphs [2], which is important in sensor allocation, formation control of multi-agent systems, etc.; congestion-game-based control, which is useful for transportation control, optimization of facility-based networks, etc.; and Nash equilibrium oriented control, which is applied to scheduling and resource allocation of networked processes [3,4].

This perspective will focus on game theoretic control (GTC), which is particularly useful for optimization of networked systems. Gopalakrishnan et al. [5] describe a framework for optimization of multi-agent control problems using game theory. They propose an hourglass architecture to illustrate the GTC using potential games as the interface (Fig. 1). This approach is mainly based on potential game theory. There are some other useful game theory-based control design techniques. For example, Stackelberg game theory has been used to control agent behaviors [6,7].

Figure 1. Game theoretic control.

Roughly speaking, optimization via GTC can be described as follows: consider a networked multi-agent system with n agents. Assume there is a global objective function J(x_1, x_2, …, x_n), where x_i is the action (or strategy) of the i-th agent, and J could be a total cost, to be minimized, or a total payment, to be maximized. As shown in Fig. 1, the GTC approach consists of two major steps: (i) design a utility (or payoff) function for each agent such that the overall system becomes a potential game with J as its potential function; (ii) design a strategy updating rule (more precisely, a learning algorithm) such that, as each agent optimizes its own utility function, the system converges to a Nash equilibrium, which is an optimizer of J (possibly only a local one). Note that, since in general each agent can only obtain its neighbors' information, the learning algorithm must be based on local information. In the following sections we describe this in detail.
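The two steps above can be sketched end to end on a toy system. Everything below is invented for illustration (the objective J, the agents' strategy sets and the helper gtc_optimize are ours, not from the text): utilities are designed in the simplest possible way, by giving every agent u_i = −J (an identical-interest game, trivially potential with potential −J), and the strategy updating rule is a myopic best response sweep.

```python
# A minimal end-to-end GTC sketch on an invented example: 3 agents each
# choose a level in {0, 1, 2}; J is a global cost to be minimized.
def J(x):
    coupling = sum(a * b for a, b in zip(x, x[1:]))  # neighbor interaction
    return sum((a - 1) ** 2 for a in x) + coupling

def gtc_optimize(n=3, levels=(0, 1, 2), max_rounds=100):
    """Step (i): utilities u_i = -J (identical-interest potential game).
    Step (ii): myopic best response until no agent can improve, i.e.
    until a Nash equilibrium (a local minimizer of J) is reached."""
    x = [levels[-1]] * n
    for _ in range(max_rounds):
        changed = False
        for i in range(n):
            best = min(levels, key=lambda s: J(x[:i] + [s] + x[i + 1:]))
            if J(x[:i] + [best] + x[i + 1:]) < J(x):
                x[i], changed = best, True
        if not changed:
            break
    return x
```

At the returned profile no agent can lower J by a unilateral move, which is exactly the Nash condition for this identical-interest game.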

POTENTIAL GAME

Definition 2.1

  1. A finite game is denoted by G = {N, S, u}, where (1) N = {1, …, n} is the set of players; (2) S = ∏_{i=1}^n S_i is the set of strategy (or action) profiles, where S_i = {1, …, k_i} is the set of strategies of player i; (3) u = (u_1, …, u_n), where each u_i : S → ℝ is the utility (or payoff) function of player i.

  2. A finite game G = {N, S, u} is called a potential game if there exists a function F : S → ℝ, called the potential function, such that for every i ∈ N, every s_{−i} ∈ ∏_{j≠i} S_j and all α, β ∈ S_i,

  F(α, s_{−i}) − F(β, s_{−i}) = u_i(α, s_{−i}) − u_i(β, s_{−i}).   (1)
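Condition (1) can be checked numerically by enumerating all unilateral deviations. A hedged sketch (the two-player game and the candidate potential below are invented for the example, not taken from the text):

```python
import itertools

# Invented 2-player game: each player chooses a strategy in {0, 1};
# u[i][(s1, s2)] is player i's utility at profile (s1, s2).
u = [
    {(0, 0): 4, (0, 1): 2, (1, 0): 3, (1, 1): 3},  # player 1
    {(0, 0): 5, (0, 1): 2, (1, 0): 3, (1, 1): 2},  # player 2
]
# Candidate potential function F over the same profiles.
F = {(0, 0): 4, (0, 1): 1, (1, 0): 3, (1, 1): 2}

def is_potential(u, F):
    """Condition (1): for each player i and each unilateral deviation,
    u_i and F must change by exactly the same amount."""
    for s in itertools.product((0, 1), repeat=2):
        for i in (0, 1):
            for beta in (0, 1):
                t = list(s)
                t[i] = beta
                t = tuple(t)
                if (u[i][s] - u[i][t]) != (F[s] - F[t]):
                    return False
    return True
```

Here the game was constructed so that u_i differs from F only by a term depending on the opponent's strategy, which is precisely what makes F a potential.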

In a game-based optimization problem, the potential function plays a fundamental role, similar to that of a Lyapunov function in the stabilization of dynamic systems. The importance of the potential function in GTC can also be seen from Fig. 1. Unfortunately, verifying whether a finite game is potential is not an easy job. This long-standing problem was solved in Ref. [8] using the semi-tensor product (STP), which is defined as follows.

Definition 2.2.

Let A ∈ M_{m×n}, B ∈ M_{p×q}, and let t = lcm(n, p) be the least common multiple of n and p. Then the (left) STP of A and B, denoted by A ⋉ B, is defined as

  A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}),   (2)

where ⊗ is the Kronecker product.

As a convention, hereafter the default matrix product is the STP, that is, AB := A ⋉ B.
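Definition 2.2 translates directly into code; a minimal sketch (the function name stp is ours):

```python
import numpy as np
from math import lcm

def stp(A, B):
    """Left semi-tensor product of Eq. (2): for A (m x n) and B (p x q),
    with t = lcm(n, p), return (A kron I_{t/n}) @ (B kron I_{t/p})."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    n, p = A.shape[1], B.shape[0]
    t = lcm(n, p)
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))
```

When n = p, both identity factors are trivial and the STP reduces to the ordinary matrix product, which is why the convention AB := A ⋉ B is consistent.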

To use the STP, the action a_i ∈ S_i is expressed in vector form: j ∈ S_i is identified with δ_{k_i}^j, the j-th column of the identity matrix I_{k_i}. Then each utility function u_i can be expressed as

  u_i(a_1, …, a_n) = V_i^u ⋉_{j=1}^n a_j,   (3)

where V_i^u ∈ ℝ^k, k = ∏_{i=1}^n k_i, is called the structure vector of u_i.

Construct the potential equation

  E ξ = b,   (4)

where

  ξ = (ξ_1^T, ξ_2^T, …, ξ_n^T)^T,
  b = ((V_2^u − V_1^u)^T, (V_3^u − V_1^u)^T, …, (V_n^u − V_1^u)^T)^T,

and

  E = [ −E_1  E_2  0    ⋯  0
        −E_1  0    E_3  ⋯  0
          ⋮               ⋱
        −E_1  0    0    ⋯  E_n ],

with E_i = I_{k_1⋯k_{i−1}} ⊗ 1_{k_i} ⊗ I_{k_{i+1}⋯k_n}, where 1_{k_i} is the column vector of k_i ones.

Then we have the following result.

Theorem 2.3.

A finite game G is potential if and only if its potential equation (4) has a solution [8]. Moreover, if a solution ξ exists, then the potential function has the structure vector

  V_P = V_1^u − ξ_1^T E_1^T.   (5)

Using the potential equation, Liu and Zhu [9] designed an efficient algorithm. The potential equation (4) can easily be extended to weighted potential games, and properly chosen weights extend the application of potential-game-based optimization to a considerably larger set of systems. As the potential equation is a linear algebraic system, it follows that the set of potential games is a vector subspace of the space of finite games [10]. This vector space structure is very useful when near-potential games are also used for optimization, because 'near' can be described by the topology of Euclidean vector spaces.
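The linear-algebraic nature of the potentiality test can be illustrated without reproducing the exact matrices of Eq. (4). The sketch below is our own construction, following the idea rather than the notation of Ref. [8]: it builds one linear equation per unilateral deviation directly from condition (1) and solves for the potential's value table by least squares; the system is consistent exactly when the game is potential.

```python
import itertools
import numpy as np

def potential_structure_vector(V, dims):
    """Given flattened utility tables V[i] (one per player, indexed
    lexicographically over strategy profiles with sizes dims), decide
    whether the game is potential by solving a linear system built
    from condition (1); return the potential's table, or None."""
    profiles = list(itertools.product(*[range(k) for k in dims]))
    index = {s: j for j, s in enumerate(profiles)}
    k = len(profiles)
    rows, rhs = [], []
    for s in profiles:
        for i in range(len(dims)):
            for beta in range(dims[i]):
                if beta == s[i]:
                    continue
                t = list(s)
                t[i] = beta
                t = tuple(t)
                # One equation: F(s) - F(t) = u_i(s) - u_i(t).
                row = np.zeros(k)
                row[index[s]], row[index[t]] = 1.0, -1.0
                rows.append(row)
                rhs.append(V[i][index[s]] - V[i][index[t]])
    A, b = np.array(rows), np.array(rhs)
    F, *_ = np.linalg.lstsq(A, b, rcond=None)
    return F if np.allclose(A @ F, b) else None
```

The potential is determined only up to an additive constant, which is why the function is tested through value differences rather than absolute values.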

UTILITY DESIGN

From Fig. 1, it can be seen that utility design is one of the key issues in GTC. It faces two major problems. First, given the global performance criterion J, one must know whether it is possible to design utility functions that turn the overall system into a potential game with J as its potential function. Second, if this is possible, how can they be designed?

Take a facility-based system as an example; a congestion game is a proper tool to model such a system.

Definition 3.1.

A facility-based system is described by (M, N, {S_i}_{i=1}^n, C), where M = {1, 2, …, m} is the set of facilities, N = {1, 2, …, n} is the set of users, and S_i ⊂ 2^M is the set of strategies of user i, each strategy being a subset of M. S = ∏_{i=1}^n S_i is the set of profiles, and C : S → ℝ is the total cost. The facility optimization problem is to find the best profile a* ∈ S, which minimizes the cost. That is,

  a* = argmin_{a ∈ S} C(a).   (6)

It is well known that a congestion game is a potential game. Using the STP technique, Hao et al. [11] obtained necessary and sufficient conditions for the solvability of this problem; moreover, a design method was also proposed.
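As a concrete instance of why congestion games are potential, a hedged sketch (the facility costs and strategy sets below are invented for illustration) computing Rosenthal's classical potential, whose change under any unilateral deviation equals the deviating user's cost change:

```python
# Invented facility-based system: 2 users, facilities M = {0, 1};
# cost[f][j] is the per-user cost of facility f when j + 1 users use it.
cost = {0: [2, 3], 1: [1, 4]}
strategies = [[{0}, {1}], [{0}, {1}, {0, 1}]]  # S_i: subsets of M

def loads(profile):
    """Number of users on each facility under a profile."""
    return {f: sum(f in a for a in profile) for f in cost}

def user_cost(i, profile):
    """User i pays the congested cost of each facility it uses."""
    n = loads(profile)
    return sum(cost[f][n[f] - 1] for f in profile[i])

def rosenthal_potential(profile):
    """Rosenthal's potential: for each facility, sum the cost sequence
    up to the current load.  Unilateral deviations change it by exactly
    the deviating user's cost change, so it is a potential function."""
    n = loads(profile)
    return sum(sum(cost[f][:n[f]]) for f in cost)
```

Minimizing this potential over all profiles therefore locates a Nash equilibrium of the congestion game.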

Let V_P be the structure vector of the total cost P, expressed via the STP as in Eq. (3). Construct the linear system

  Ξ V_c = V_P,   (7)

where Ξ is the itemized cost matrix, depending on the number of users, and V_c is the unknown cost vector.

Theorem 3.2

Given a facility-based system (M, N, {S_i}, C), there exist cost functions such that the overall cost P(a) becomes a potential function if and only if Eq. (7) has a solution [11].

The detailed design technique for the cost functions can be found in Ref. [11].

LEARNING ALGORITHM

When a game is played repeatedly, each agent is able to improve its strategy through learning; we then have a dynamic game. The learning algorithm is also called the strategy updating rule: it specifies how each agent updates its strategy to maximize (or minimize) its own utility function. Once a strategy updating rule is decided, the dynamics of the dynamic game are also determined. For example, a Markov-type dynamic game is described as

  x_i(t + 1) = f_i(x_1(t), …, x_n(t); u_1(t), …, u_m(t)), i = 1, …, n,   (8)

where x_i(t), i = 1, …, n, are the strategies of the agents at time t, and u_j(t), j = 1, …, m, are extra controls. A learning algorithm has to be designed such that, as agents optimize their own utility functions, the system converges to a Nash equilibrium, or even to a global optimum of the performance criterion J.

Using the STP, Eq. (8) can be converted into its algebraic state space representation

  x(t + 1) = L u(t) x(t),   (9)

where x(t) = ⋉_{i=1}^n x_i(t), u(t) = ⋉_{j=1}^m u_j(t), and L is a logical matrix. Cheng et al. [12] provide a detailed method for formulating dynamic games. In addition, using the algebraic form (9), many useful properties can be obtained [12].

For a potential evolutionary game, some algorithms, for example the myopic best response adjustment, can drive the profile to a Nash equilibrium. For potential games, a Nash equilibrium is a locally optimal profile; unless the Nash equilibrium is unique, reaching one is not enough to ensure global optimization.

To ensure a globally optimal solution, more powerful algorithms need to be developed. Note that when global optimization is investigated, mixed strategies are usually unavoidable; the algorithm then becomes a state-dependent Markov chain [13].
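For intuition, one classical mixed-strategy updating rule of this kind is log-linear learning (sketched below on an invented two-agent identical-interest game; this is a standard construction, not necessarily the algorithm of Ref. [13]): each revising agent randomizes with probabilities proportional to exp(u_i/τ), and as the temperature τ → 0 the stochastically stable profiles are exactly the potential maximizers.

```python
import math
import random

def utility(i, x):
    # Invented identical-interest game: both agents get 2 at (1, 1),
    # else 0, so the potential is maximized at (1, 1).
    return 2 if x == [1, 1] else 0

def log_linear_learning(steps=500, tau=0.1, seed=1):
    """Log-linear learning: at each step a uniformly chosen agent
    resamples its strategy with probability proportional to
    exp(u_i / tau); small tau concentrates play on potential maxima."""
    rng = random.Random(seed)
    x = [0, 0]
    for _ in range(steps):
        i = rng.randrange(2)  # the revising agent
        weights = [math.exp(utility(i, x[:i] + [s] + x[i + 1:]) / tau)
                   for s in (0, 1)]
        x[i] = rng.choices((0, 1), weights=weights)[0]
    return x
```

Unlike deterministic best response, this rule occasionally accepts worse strategies, which is what lets it escape merely local optima of the potential.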

CONCLUSION

Game-based control is a promising new technique in control theory. In particular, when the system has certain intelligent properties, or is a complicated system with uncertainties, game-like interactions exist between the controllers and the controlled objects. As a successful application of game-based control, GTC becomes a powerful tool when the optimization of multi-agent systems is considered. This perspective describes the framework of GTC. It consists mainly of two steps: (1) design utility functions that turn the overall system into a potential game with the preassigned performance criterion as the potential function; (2) design a learning algorithm based on local information, which ensures that, as each agent optimizes its own utility function, the overall optimum is reached. Compared with distributed optimization, this approach is more convenient and subject to fewer restrictions.

Contributor Information

Daizhan Cheng, Key Laboratory of Systems and Control, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, China.

Zequn Liu, Key Laboratory of Systems and Control, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, China.

FUNDING

This work was supported in part by the National Natural Science Foundation of China (61773371 and 61733018).

Conflict of interest statement. None declared.

REFERENCES

  1. Guo L. J Syst Sci Math Sci (in Chinese) 2012; 31: 1014–8.
  2. Yazicioglu AY, Egerstedt M, Shamma JS. Est Contr Netw Syst 2013; 4: 309–15.
  3. Bhakar R, Sriram VS et al. IEEE Trans Power Syst 2010; 25: 51–8.
  4. Lu J, Li H, Liu Y et al. IET Contr Theor Appl 2017; 11: 2040–7.
  5. Gopalakrishnan R, Marden JR, Wierman A. Perform Eval Rev 2011; 38: 31–6.
  6. Başar T, Srikant R. J Opt Theory Appl 2002; 115: 479–90.
  7. Maharjan S, Zhu Q, Zhang Y et al. IEEE Trans Smart Grid 2013; 4: 120–32.
  8. Cheng D. Automatica 2014; 50: 1793–801.
  9. Liu X, Zhu J. Automatica 2016; 68: 245–53.
  10. Cheng D, Liu T, Zhang K. IEEE Trans Aut Contr 2016; 61: 3651–6.
  11. Hao Y, Pan S, Qiao Y et al. IEEE Trans Aut Contr 2018; 63: 4361–6.
  12. Cheng D, He F, Qi H. IEEE Trans Aut Contr 2015; 61: 2402–15.
  13. Li C, Xing Y, He F et al. Automatica 2020; 113: 108615.
