National Science Review. 2019 Nov 4; 7(7): 1118–1119. doi: 10.1093/nsr/nwz163

Game theory, learning, and control systems

Jeff S. Shamma

Summary

Game theory is the study of interacting decision makers, whereas control systems involve the design of intelligent decision-making devices. When many control systems are interconnected, the result can be viewed through the lens of game theory. This article discusses both long-standing connections between these fields and new connections stemming from emerging applications.


Game theory is the study of interacting decision makers [1], i.e. settings in which the quality of an actor's decision depends on the decisions of others. In commuting, the congestion a driver experiences depends on the chosen route as well as the routes taken by other vehicles. In auctions, the outcome depends on one's own bid as well as the bids of others. In competitive markets, market share depends on both a firm's pricing and the pricing of its competitors.

While game theory has traditionally been studied within the mathematical social sciences, it also has strong ties to control systems [2,3].

A longstanding connection is the setting of zero-sum, or minimax, games. In zero-sum games, there are two players, and a benefit to one player is a detriment to the other. A classical example is pursuit-evasion games [4]. A common perspective in control systems is that one player is the controller and the opposing player is an adversarial environment, e.g. exogenous disturbances [5] or model misspecification [6]. The controller seeks to optimize a specified performance objective, whereas the adversarial environment seeks to reduce achieved performance. There has been renewed interest in zero-sum games in the area of security [7,8], where security measures are to be taken against a variety of adversarial attacks ranging from intrusion to data corruption to privacy violation.
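To make the minimax viewpoint concrete, here is a minimal Python sketch, with a purely hypothetical payoff matrix, of computing the controller's security level over pure actions: the payoff the controller can guarantee no matter what the adversary chooses. The stronger mixed-strategy (randomized) security level would require solving a linear program and is omitted here.

```python
# Zero-sum (minimax) viewpoint: the row player ("controller") maximizes its
# guaranteed payoff against a worst-case column player ("adversary").
# The payoff matrix below is hypothetical, chosen purely for illustration.

payoff = [  # payoff[i][j] = controller's payoff for row action i vs. column action j
    [3, -1, 2],
    [1,  4, 0],
    [0,  2, 1],
]

# Pure security level: max over rows of the worst-case (min) payoff in that row.
security_values = [min(row) for row in payoff]
best_row = max(range(len(payoff)), key=lambda i: security_values[i])

print(f"controller's pure security level: {security_values[best_row]} (row {best_row})")
```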

Another area of interest is distributed or networked control systems [9], motivated by applications such as power networks, transportation networks, and multi-robot systems. In such settings, illustrated in Fig. 1, there is a large number of decision-making components, and no single actor has full information about the state of the environment or full authority over decisions across the network. A representative application is the smart grid [10], where a distributed network of prosumers makes decisions on the production, consumption, and storage of energy in response to evolving demand and environmental conditions.

Figure 1. (a) A traditional control system architecture with centralized information and authority. (b) A distributed or networked control system architecture with multiple interacting decision makers.

A more recent connection between game theory and control systems is in the area of game-theoretic learning [11,12].

To set up the discussion, we first define a (non-cooperative) game by (i) a set of players; (ii) for each player, a set of actions; and (iii) for each player, a utility function that quantifies the player's satisfaction with the collective actions of all players. More formally, we can write the utility function of the $i$th player as $U_i(a_1, a_2, \dots, a_i, \dots, a_n)$, where $(a_1, \dots, a_n)$ is the action profile of the $n$ players, $a_i$ is the action of the $i$th player, and $U_i(\cdot)$ is a real-valued function such that the $i$th player prefers the action profile $a = (a_1, \dots, a_n)$ over $a'$ whenever $U_i(a) > U_i(a')$ (i.e. larger utility is better).
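As a concrete illustration of this formalism, the following Python sketch encodes a small two-player game; the action sets and congestion-style payoff numbers are invented purely for illustration.

```python
from itertools import product

# Hypothetical two-player game: each player chooses a route, and sharing
# a route is worse for both players than splitting up.
actions = {
    0: ("route_A", "route_B"),   # action set of player 0
    1: ("route_A", "route_B"),   # action set of player 1
}

def utility(i, profile):
    """U_i(a_1, a_2): here, symmetrically, both players dislike sharing a route."""
    shared = profile[0] == profile[1]
    return -2 if shared else 1

# Enumerate every action profile and each player's utility.
for profile in product(actions[0], actions[1]):
    print(profile, [utility(i, profile) for i in (0, 1)])
```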

An important concept in game theory is the Nash equilibrium, which is an action profile $a^* = (a_1^*, \dots, a_n^*)$ such that, for any player $i$ and every alternative action $a_i'$,

$$U_i(a_i^*, a_{-i}^*) \ge U_i(a_i', a_{-i}^*),$$

where $a_{-i}^*$ denotes the actions of all players other than player $i$; i.e. each player's action is optimal with respect to the actions of the other players.
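For small finite games, pure Nash equilibria can be found by exhaustively checking the inequality above. The Python sketch below uses a hypothetical 2×2 coordination game, chosen because it has two pure equilibria, which also previews the non-uniqueness issue discussed next.

```python
from itertools import product

# Brute-force search for pure Nash equilibria: check, for every player i and
# every deviation a_i', that U_i(a*) >= U_i(a_i', a*_{-i}).
actions = [("A", "B"), ("A", "B")]    # action sets for players 0 and 1
U = {                                 # profile -> (U_0, U_1); hypothetical numbers
    ("A", "A"): (2, 2), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1),
}

def is_nash(profile):
    for i in range(2):
        for dev in actions[i]:
            deviated = list(profile)
            deviated[i] = dev
            if U[tuple(deviated)][i] > U[profile][i]:
                return False          # player i has a profitable deviation
    return True

equilibria = [p for p in product(*actions) if is_nash(p)]
print(equilibria)                     # [('A', 'A'), ('B', 'B')]
```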

Nash equilibrium is an example of a solution concept for a game, which is a proposed outcome given the specification of the elements of a game. Such an interpretation can be problematic. A game may have multiple Nash equilibria, raising the issue of non-uniqueness. Another lingering question is how agents might reach a Nash equilibrium in the first place, especially since agents may have only limited knowledge of other agents' utility functions, or may not even observe other agents' actions. Indeed, merely computing a Nash equilibrium can be computationally intractable [13]. Nonetheless, Nash equilibrium is widely used as a representative outcome of a game-theoretic model.

The study of game-theoretic learning partially addresses these issues by shifting the discussion away from Nash equilibrium itself and towards how players might reach a Nash equilibrium through some sort of online or adaptive learning process. Such learning processes evolve over stages, e.g. $t = 0, 1, 2, \dots$, and can be represented as

$$a_i(t) = F_i\big(\mathcal{I}_i(t);\, U_i\big),$$

where the action of the $i$th player at stage $t$ is determined by the learning rule, $F_i$, that acts on $\mathcal{I}_i(t)$, the information available to player $i$ up to stage $t$, as well as on $U_i(\cdot)$, the specific utility function of player $i$. A learning rule may be stochastic, wherein the action is a randomized outcome according to a probability distribution generated by the learning rule. For example, in reinforcement learning, an action is selected with a probability that is proportional to the cumulative utility that it has garnered in the past.
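As a concrete, hypothetical instance of such a rule, the Python sketch below implements proportional reinforcement selection under the assumption of nonnegative stage utilities: each action is chosen with probability proportional to the cumulative utility it has accumulated so far.

```python
import random

def reinforcement_action(cumulative, rng=random):
    """Pick an action with probability proportional to its cumulative utility.

    cumulative: dict mapping action -> cumulative utility (assumed >= 0).
    """
    total = sum(cumulative.values())
    if total == 0:                      # no experience yet: play uniformly
        return rng.choice(list(cumulative))
    r = rng.uniform(0, total)
    for action, score in cumulative.items():
        r -= score
        if r <= 0:
            return action
    return action                       # guard against floating-point slack

# Example: an action that has earned 3 units of utility is chosen three
# times as often as one that has earned 1 unit.
print(reinforcement_action({"route_A": 3.0, "route_B": 1.0}))
```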

An interesting complicating factor distinguishes game-theoretic learning from other learning formulations such as single-agent reinforcement learning. An implicit assumption in learning is that the environment is stationary, so that, over time, one can determine which actions are more effective. In game-theoretic learning, however, the environment comprises other learning agents, and learning in the presence of other learners results in a non-stationary environment from the perspective of any individual agent. Indeed, depending on both the structure of the underlying game and the specific learning rule, outcomes can range from convergence to a Nash equilibrium (or other solution concepts, most notably correlated equilibrium [12]), to preferential selection of some Nash equilibria over others, to non-convergence and even chaotic behavior. This notion of embedding learning agents in a common environment has recently gained significant attention in the context of training neural networks through so-called generative adversarial networks [14].
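The non-stationarity is easy to observe in simulation. The toy Python experiment below (all parameters hypothetical) pits two such reinforcement learners against each other in matching pennies, a zero-sum game with no pure equilibrium; each agent's "environment" is the other learner, so neither faces a fixed distribution of opponent play.

```python
import random

rng = random.Random(0)
actions = ("H", "T")
# Cumulative utilities, optimistically initialized so every action gets tried.
cum = [{a: 1.0 for a in actions} for _ in range(2)]

def pick(c):
    """Proportional (reinforcement) action selection."""
    r = rng.uniform(0, sum(c.values()))
    for a, s in c.items():
        r -= s
        if r <= 0:
            return a
    return a

for t in range(10000):
    a0, a1 = pick(cum[0]), pick(cum[1])
    match = a0 == a1
    cum[0][a0] += 1.0 if match else 0.0   # player 0 wins on a match
    cum[1][a1] += 0.0 if match else 1.0   # player 1 wins on a mismatch

# Empirical mixtures after learning: neither player settles on a single
# action, reflecting the absence of a pure equilibrium in this game.
for i in (0, 1):
    total = sum(cum[i].values())
    print(i, {a: round(s / total, 3) for a, s in cum[i].items()})
```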

The game-theoretic learning framework leads to two significant connections to control systems. First, game-theoretic learning offers an approach to designing distributed control systems [9], as illustrated in Fig. 1, where the components are programmable engineered devices. An example is in area coverage problems [15], where mobile sensors are to explore an unknown environment. The main idea is to view each component as a player in a game. The system designer must endow these artificial players with both incentives (i.e. utility functions) and adaptive control laws (i.e. learning rules) that induce a desirable collective behavior through local interactions that respect the underlying distributed decision architecture.
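A minimal sketch of this design recipe, with invented problem data and not the exact scheme of [15]: each sensor's utility is its marginal contribution to the total value of the cells covered (a standard utility-design choice), and the sensors adapt through asynchronous best responses.

```python
import random

rng = random.Random(1)
cells = {"c1": 5.0, "c2": 3.0, "c3": 1.0}   # hypothetical value of covering each cell
n_sensors = 2

def covered_value(profile):
    """Total value of the distinct cells covered by the sensors in `profile`."""
    return sum(cells[c] for c in set(profile))

def marginal_utility(i, profile):
    """Sensor i's marginal contribution: value with i minus value without i."""
    others = profile[:i] + profile[i + 1:]
    return covered_value(profile) - covered_value(others)

profile = [rng.choice(list(cells)) for _ in range(n_sensors)]
for _ in range(20):                          # asynchronous best-response dynamics
    i = rng.randrange(n_sensors)
    candidates = {c: marginal_utility(i, profile[:i] + [c] + profile[i + 1:])
                  for c in cells}
    profile[i] = max(candidates, key=candidates.get)

print(profile, covered_value(profile))       # sensors spread over high-value cells
```

With these utilities, no sensor gains by crowding onto an already covered cell, so local adaptation drives the sensors apart and increases the total covered value.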

A second connection is that control theory offers new insights into the analysis of game-theoretic learning. A learning rule is a type of dynamical system, and so interacting learning agents constitute a feedback interconnection of dynamical systems with special structures emerging from game-theoretic learning. Recent work includes exploiting underlying passivity properties associated with game-theoretic learning rules (see the tutorial paper [16] and references therein).

These research directions represent complementary paradigms for game-theoretic learning. In the first, game-theoretic learning is used as a prescriptive approach to programming engineered devices. In the second, game-theoretic learning is a descriptive approach to modeling evolving human decision making. Going forward, there is a significant opportunity for work in game theory and control systems that blends these two perspectives.

In the emerging area of cyber–physical–social systems, the distributed decision architecture in Fig. 1 may comprise a mix of programmable devices and human decision makers. An application area is smart cities [17], where (i) human drivers may share the road with autonomous vehicles; (ii) human users must be incentivized to participate in energy demand response while being monitored by IoT devices; and (iii) humans and robots interact in unstructured environments, such as in assistive robotics. In such applications, the perspectives of game theory and control theory come together: game theory models interactive decision making, while control systems methods address the evolving dynamics and mitigate the uncertainty inherent in human decision making.

FUNDING

This work was supported by funding from King Abdullah University of Science and Technology (KAUST).

Conflict of interest statement. None declared.

REFERENCES

1. Osborne M, Rubinstein A. A Course in Game Theory. Cambridge: MIT Press, 1994.
2. Başar T, Olsder G. Dynamic Noncooperative Game Theory. Classics in Applied Mathematics. Philadelphia: Society for Industrial and Applied Mathematics, 1999.
3. How JP. IEEE Contr Syst Mag 2017; 37: 5–8.
4. Isaacs R. Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York: Wiley, 1965.
5. Başar T, Bernhard P. H∞-Optimal Control and Related Minimax Design Problems. Systems and Control: Foundations and Applications. Basel: Birkhäuser, 1991.
6. Bernhard P. Robust control and dynamic games. In: Başar T, Zaccour G (eds). Handbook of Dynamic Game Theory. Cham: Springer, 2016, 1–30.
7. Tambe M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge: Cambridge University Press, 2011.
8. Manshaei MH, Zhu Q, Alpcan T et al. ACM Comput Surv 2013; 45: 25.
9. Marden JR, Shamma JS. Game theory and distributed control. In: Young H, Zamir S (eds). Handbook of Game Theory with Economic Applications, Vol. 4. Amsterdam: Elsevier, 2015, 861–99.
10. Saad W, Han Z, Poor HV et al. IEEE Signal Process Mag 2012; 29: 86–105.
11. Young H. Strategic Learning and Its Limits. Arne Ryde Memorial Lectures. Oxford: Oxford University Press, 2004.
12. Hart S. Econometrica 2005; 73: 1401–30.
13. Daskalakis C, Goldberg PW, Papadimitriou CH. SIAM J Comput 2009; 39: 195–259.
14. Salimans T, Goodfellow I, Zaremba W et al. Improved techniques for training GANs. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Red Hook: Neural Information Processing Systems Foundation, 2016, 2234–42.
15. Cortes J, Martinez S, Karatas T et al. IEEE Trans Robot Autom 2004; 20: 243–55.
16. Park S, Martins N, Shamma J. From population games to payoff dynamics models: a passivity-based approach. In: 2019 IEEE 58th Conference on Decision and Control (CDC). Piscataway: IEEE, 2019, 6584–601.
17. Cassandras CG. Engineering 2016; 2: 156–8.
