STOCHASTIC DYNAMIC GAMES
Onésimo Hernández-Lerma
Mathematics Department, CINVESTAV-IPN, México City
Abstract: This talk is an introduction to stochastic dynamic games and some of their applications. It includes cooperative and noncooperative games, and some important special cases such as compromise solutions, zero–sum games, and games against nature (also known as minimax or worst–case control problems). It also includes recent results on the existence and characterizations of dynamic potential games. One of these characterizations is particularly interesting because it identifies a class of dynamic potential games in which Nash (or noncooperative) equilibria coincide with Pareto (or cooperative) equilibria. This latter fact is not very common.
Contents
• What is a “game”?
• Some classes of games
• Historical remarks
• Examples
• Cooperative games; compromise solutions
• Noncooperative games: zero-sum games, minimax control, dynamic potential games, …
WHAT IS A GAME? A game is a mathematical model of conflict or bargaining between decision-makers (e.g. individuals, firms, governments, …) called players, agents, controllers, …
There are static (or one-shot) games, repeated games, and dynamic games, in which the state of the game evolves as a dynamical system.
Remark. A one-player dynamic game is called a control problem.
HISTORICAL REMARKS The study of strategic games can be traced back to (at least) the 18th century. For instance, the earliest minimax solution of a game was proposed by James Waldegrave (1684-1741) about 1713. However, strategic games received widespread attention only with the publication of J. von Neumann and O. Morgenstern (1944), Theory of Games and Economic Behavior.
Historical Remarks This book concerned two-person zero-sum static games exclusively, but it spawned many fundamental ideas and concepts, in particular dynamic games. A. Wald (1945): minimax approach (“games against nature”) to statistical decision theory. J. Nash (1950): noncooperative games, bargaining games, … Nobel Prize 1994.
Historical Remarks P.M. Morse (1948): antisubmarine warfare, pursuit-evasion games, randomized (also known as “mixed”) strategies. E. Paxson (1946), R. Isaacs (1950): differential games.
Historical Remarks L.S. Shapley (1953): discrete-time Markov games. Nobel Prize 2012. W.H. Fleming (1960): stochastic differential games. Jump Markov games, hybrid games, …
NOBEL PRIZES IN ECONOMICS 2012: A.E. Roth, L.S. Shapley. 2011: T.J. Sargent, C.A. Sims. 2007: L. Hurwicz, E.S. Maskin, R.B. Myerson. 2005: R.J. Aumann, T.C. Schelling. 2002: Vernon L. Smith. 1994: J.C. Harsanyi, J.F. Nash, R. Selten.
OTHER APPLICATIONS:
• OR games (1920 matches in MathSciNet, July 1, 2013)
• Environmental games, including exploitation of resources
• Congestion games
• Telecommunication networks
• Portfolio games
• Advertising
• Money laundering
• E-commerce
• Epidemics
⁞
Example: The U.S.–Canada fish wars The general migration pattern of Pacific salmon raises an obvious question: Whose fish are they? — Department of Fisheries and Oceans, Canada, 1997.
Example: The U.S.–Canada fish wars Discrete-time model: the stock xt ≥ 0 for all t evolves under the population’s natural growth function G(x) (e.g. logistic growth). There is also a stochastic differential version of the model.
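The slide’s formulas did not survive transcription; a common specification for fishery models of this kind (our assumption, not necessarily the one used in the talk, with ati denoting the catch of player i) is

```latex
x_{t+1} = G\Bigl(x_t - \sum_{i=1}^{N} a_t^i\Bigr), \qquad x_t \ge 0 \ \forall\, t,
\qquad \text{with, e.g., logistic growth } G(x) = x + r x\Bigl(1 - \frac{x}{K}\Bigr),
```

and, for the stochastic differential version,

```latex
dx_t = \Bigl[\, G(x_t) - \sum_{i=1}^{N} a_t^i \,\Bigr]\, dt + \sigma\, x_t\, dW_t .
```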
Each player i (i = 1,…, N) wishes to “optimize” a given objective function Vi(π, x), defined for each initial state x0 = x and each multistrategy π = (π1,…, πN), where πi = {ati} is a strategy for player i.
Example: oligopolies Oligopoly = market with few sellers (or firms, or producers) and a large number of buyers (or consumers, or customers). Examples of oligopolies: transportation services (airlines, trains, buses, …), energy markets (electricity, gas, oil, …).
Typical model: actions ati = (pti, ρti) = (production, price). Cournot duopoly (1838): ati = pti (“quantity game”). Edgeworth (1881): ati = ρti (“price game”).
“Oligopoly theory began with Cournot, more than 140 years ago. Judging from many intermediate textbooks on price theory, one might think it ended with him too.” — J.W. Friedman, Oligopoly and the Theory of Games, 1982.
CLASSES OF DYNAMIC GAMES Cooperative games: The players act as a group and coordinate their actions for their mutual benefit.
CLASSES OF DYNAMIC GAMES Noncooperative games: The players are rivals, and each of them acts in his own best interest, paying no attention whatsoever to the fortunes of the other players.
COOPERATIVE GAMES Notation. If u = (u1,…, uN) and v = (v1,…, vN) are in RN: u ≤ v iff ui ≤ vi ∀ i; u < v iff u ≤ v and u ≠ v.
Fix an arbitrary initial state x0 (possibly random), and let Vi(π) ≡ Vi(π, x0), i = 1,…, N, be the cost function of player i when the players use the multistrategy π = (π1,…, πN) ∈ Π. Let V(π) := (V1(π),…, VN(π)) and Γ := {V(π) | π ∈ Π} ⊂ RN be the game’s objective set.
Definition: A multistrategy π* ∈ Π is called a Pareto equilibrium (also known as a cooperative equilibrium) if there is no π ∈ Π for which V(π) < V(π*). The set Γ* := {V(π*) | π* ∈ Π is a Pareto equil.} is called the game’s Pareto front. (Figure: the objective set Γ and its Pareto front Γ* in the (V1, V2)-plane.)
Theorem: Under some hypotheses, π* is a Pareto equilibrium iff there is a vector λ = (λ1,…, λN), with λi > 0 ∀ i, such that π* minimizes the weighted cost λ·V(π) = λ1V1(π) + ··· + λNVN(π). Why is π* a “cooperative equilibrium”? It is cooperative because no other joint decision of the players can improve the performance of at least one of them without degrading the performance of the others. This leads to the following question:
When is a cooperative equilibrium “fair” to all the players? A possible answer. For each i ∈ {1,…, N}, let Vi* := inf{Vi(π) | π ∈ Π}, and consider the game’s utopic (or virtual, or ideal) minimum V* := (V1*,…, VN*). Then a multistrategy π* is a compromise solution with respect to a norm ||·|| on RN if it minimizes ||V(π) − V*|| over π ∈ Π.
The term Vi(π) − Vi* is called player i’s regret when π is used, and ||V(π) − V*|| is the group regret w.r.t. ||·||. (Figure: the objective set Γ and the utopic minimum (V1*, V2*) in the (V1, V2)-plane.)
What norm should we use? For instance, Lp norms ||u||p := (Σi |ui|^p)^(1/p), with 1 ≤ p < ∞, and ||u||∞ := maxi |ui|. With p = 2, π* is called Salukvadze’s solution to the cooperative game. With p = ∞, π* is called a minimax equilibrium: it minimizes the largest individual regret maxi (Vi(π) − Vi*).
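As a concrete illustration of compromise solutions, the sketch below computes the Salukvadze (p = 2) and minimax (p = ∞) solutions over a finite set of candidate multistrategies; the cost vectors are hypothetical numbers, not from the talk.

```python
import math

# Hypothetical cost vectors V(pi) = (V1, V2) for four candidate multistrategies.
costs = {"pi_a": (4.0, 1.0), "pi_b": (1.0, 4.0),
         "pi_c": (2.0, 2.0), "pi_d": (3.0, 3.0)}

# Utopic minimum V*: componentwise infimum over the objective set.
v_star = tuple(min(v[i] for v in costs.values()) for i in range(2))

def group_regret(v, p):
    """||V(pi) - V*||_p : the group regret w.r.t. the L^p norm."""
    regrets = [vi - si for vi, si in zip(v, v_star)]  # player regrets, all >= 0
    if p == math.inf:
        return max(regrets)
    return sum(r ** p for r in regrets) ** (1.0 / p)

# Salukvadze solution (p = 2) and minimax equilibrium (p = infinity).
salukvadze = min(costs, key=lambda k: group_regret(costs[k], 2))
minimax = min(costs, key=lambda k: group_regret(costs[k], math.inf))
print(salukvadze, minimax)  # both pick pi_c here
```

Note that both norms select the "balanced" multistrategy pi_c, whose regret vector (1, 1) dominates neither extreme.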
Important remark. A solution or equilibrium of a dynamic game is said to be time-consistent or dynamically stable if its “optimality” is preserved throughout the game. We shall assume that our solution concepts are dynamically stable. This is typically ensured using dynamic programming techniques. See e.g.
• Yeung and Petrosyan (2006) for cooperative stochastic differential games. • Yeung and Petrosyan (2012) for cooperative discrete-time stochastic games. For noncooperative games, time-consistency is obtained by Josa-Fombellida and Rincón-Zapatero (2013) for a class of stochastic differential games. See also the references therein. Remark: See International Game Theory Review 15 (2013), no. 2 (June 2013): Special issue on open problems in the theory of cooperative games.
NONCOOPERATIVE GAMES In a noncooperative game each player pursues his/her own interests… In a Nash (or noncooperative) equilibrium one player cannot improve his/her outcome by altering his/her decision unilaterally. Case N = 2 players. A multistrategy π* = (π1*, π2*) is a Nash equilibrium (a.k.a. noncooperative equilibrium) if V1(π1*, π2*) ≤ V1(π1, π2*) for every π1, and V2(π1*, π2*) ≤ V2(π1*, π2) for every π2.
N > 1 players. A multistrategy π* = (π1*,…, πN*) is a Nash equilibrium if, for every i ∈ {1,…, N}, Vi(π*) ≤ Vi(π1*,…, πi−1*, πi, πi+1*,…, πN*) for every strategy πi of player i. Remark. Note that finding Nash equilibria involves solving N optimal control problems simultaneously. Remark: The existence of Nash equilibria for a general dynamic game with uncountable state space is an open problem!
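In the static, finite case the unilateral-deviation condition can be checked by brute force. A minimal sketch (the cost matrices are hypothetical, prisoner's-dilemma-style numbers; each player minimizes his/her own cost):

```python
# Cost matrices for a 2-player static game.
# Rows index player 1's action, columns index player 2's action.
V1 = [[2, 5], [1, 4]]  # cost to player 1
V2 = [[2, 1], [5, 4]]  # cost to player 2

def is_nash(a1, a2):
    """(a1, a2) is a Nash equilibrium iff no player can lower
    his/her own cost by a unilateral deviation."""
    no_dev_1 = all(V1[a1][a2] <= V1[b][a2] for b in range(2))
    no_dev_2 = all(V2[a1][a2] <= V2[a1][b] for b in range(2))
    return no_dev_1 and no_dev_2

equilibria = [(a1, a2) for a1 in range(2) for a2 in range(2) if is_nash(a1, a2)]
print(equilibria)  # the unique equilibrium (1, 1): both "defect"
```

With these numbers each player has a dominant action (row 1, column 1), so the unique Nash equilibrium is (1, 1), even though (0, 0) gives both players a lower cost — the usual gap between Nash and Pareto outcomes.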
Examples of incorrect results:
1. Lai, H.-C. and Tanaka, K. (1984). On an N-person noncooperative Markov game with a metric state space. J. Math. Anal. Appl. 101, pp. 78-96.
2. Borkar, V.S. and Ghosh, M.K. (1992). Stochastic differential games: an occupation measure based approach. J. Optim. Theory Appl. 73, pp. 359-385; correction: ibid. 88 (1996), pp. 251-252.
The incorrect result in ref. 2 was reproduced in: Ramachandran, K.M. (2002). Stochastic differential games and applications. Chapter 8 in Handbook of Stochastic Analysis and Applications, edited by D. Kannan and V. Lakshmikantham, Marcel Dekker, New York.
Zero-sum games: V1(π) + V2(π) = 0 for all π = (π1, π2). Let V := V1 = −V2, so that player 1 minimizes V and player 2 maximizes it. Then π* = (π1*, π2*) is a Nash equilibrium iff π* is a saddle point of V: V(π1*, π2) ≤ V(π1*, π2*) ≤ V(π1, π2*) for all π1, π2.
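For a finite (matrix) zero-sum game the saddle-point condition is easy to check directly: a pure saddle point is a minimum of its column and a maximum of its row. A sketch with a hypothetical payoff matrix:

```python
# Zero-sum game with V = V1 = -V2: player 1 (rows) minimizes V,
# player 2 (columns) maximizes V. Hypothetical payoffs.
V = [[4, 2, 3],
     [5, 1, 6],
     [7, 2, 4]]

n, m = len(V), len(V[0])
# (i, j) is a saddle point iff V[i][j] <= V[b][j] for every row b
# (player 1 cannot lower the cost) and V[i][j] >= V[i][c] for every
# column c (player 2 cannot raise it).
saddles = [(i, j) for i in range(n) for j in range(m)
           if all(V[i][j] <= V[b][j] for b in range(n))
           and all(V[i][j] >= V[i][c] for c in range(m))]
print(saddles)  # [(0, 0)], with game value V[0][0] = 4
```

Here the lower value max_j min_i V[i][j] and the upper value min_i max_j V[i][j] both equal 4, attained at the saddle point (0, 0).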
Minimax control. Consider a control (or one-player) problem that depends on unknown parameters. For instance, the state equation is xt+1 = F(xt, at, θ), with θ an unknown parameter, or xt+1 = F(xt, at, ξt), with random disturbances ξt.
To fix ideas, consider the latter case: the disturbances ξt are i.i.d. with unknown distribution μ. There are two cases, depending on whether the ξt are “observable” or not: Yes → adaptive control; No → minimax control.
Adaptive control scheme: use observations of ξt to estimate μ. (Diagram: the controller observes the system’s state x and uses the estimate of μ to select the control action a.)
Minimax control ≡ worst-case control ≡ robust control ≡ game against nature (main ideas go back to A. Wald, c. 1939): the control problem is posed as a two-player game where player 1 is the controller, and player 2 is “Nature”, which at each time t chooses the distribution, say μt ∈ M, of ξt, where M is the family of “feasible” distributions.
Therefore, instead of the controller’s objective, say the “cost” function V(π), we now have a cost V(π, γ), where π = {at} is a controller’s strategy, and γ = {μt} ⊂ M is a strategy for Nature. Let V̄(π) := supγ V(π, γ). Then π* is called a minimax strategy if V̄(π*) = infπ V̄(π). A minimax strategy π* minimizes the “worst” that Nature can do.
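A one-step toy version of a game against Nature (all numbers hypothetical): Nature picks a disturbance distribution from a finite feasible family M, the controller picks an action, and the minimax action minimizes the worst-case expected cost.

```python
# One-step game against Nature. The controller picks an action a; Nature
# picks a distribution mu (= P(xi = 1)) for a Bernoulli disturbance xi;
# the cost incurred is cost(a, xi). All numbers are illustrative.
cost = {("low", 0): 1.0, ("low", 1): 4.0,
        ("high", 0): 3.0, ("high", 1): 3.2}

# Feasible family M of disturbance distributions.
M = [0.2, 0.5, 0.8]

def expected_cost(a, mu):
    return (1 - mu) * cost[(a, 0)] + mu * cost[(a, 1)]

def worst_case(a):
    """V-bar(a) = sup over Nature's feasible choices."""
    return max(expected_cost(a, mu) for mu in M)

# The minimax strategy minimizes the worst that Nature can do.
minimax_action = min(["low", "high"], key=worst_case)
print(minimax_action, worst_case(minimax_action))
```

Here "low" is cheaper when disturbances are mild but its worst case (3.4 at mu = 0.8) exceeds that of "high" (3.16), so the minimax controller plays the robust action "high".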
POTENTIAL GAMES The static case. Consider a static game in normal form G = {I, (Ai)i∈I, (ui)i∈I}, where I = {1,…, N} is the set of players, Ai is the action set of player i, and ui : A → R is the payoff function for player i, with A := A1 × ··· × AN.
Potential Games Definition: We say that a* ∈ A is a Nash equilibrium for G if, for each player i ∈ I, ui(a*) ≥ ui(a1*,…, ai−1*, ai, ai+1*,…, aN*) for every ai ∈ Ai. Definition [Slade 1994, Monderer & Shapley 1996, …] A differentiable function P : A → R is called a potential for the game G if ∂P/∂ai = ∂ui/∂ai for every i ∈ I. In this case G is called a potential game.
Potential Games Beckman et al. (1956) and Rosenthal (1973) are early papers on potential games. Remark: The definition of a potential game does not need differentiability.
Potential Games Proposition. Suppose that G is a potential game with potential function P. Then a* ∈ A is a NE for G iff, for every i ∈ I, P(a*) ≥ P(a1*,…, ai−1*, ai, ai+1*,…, aN*) for every ai ∈ Ai. The dynamic case. A noncooperative dynamic game is said to be a dynamic potential game if it can be reduced to the study of a single optimal control problem. The following example is a special case of the results in Section 4.3.2 in [González-Sánchez & Hernández-Lerma 2013].
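The classic static example is Cournot oligopoly with linear inverse demand, which Monderer & Shapley (1996) show to be a potential game. The sketch below verifies numerically, for illustrative parameter values, that unilateral deviations change a player's payoff and the potential by exactly the same amount:

```python
import random

# Cournot oligopoly with linear inverse demand p(Q) = a - b*Q and constant
# marginal costs c_i. Parameter values are illustrative assumptions.
a, b = 10.0, 1.0
c = [1.0, 2.0, 1.5]
N = len(c)

def payoff(i, q):
    """Profit of firm i at quantity profile q."""
    Q = sum(q)
    return q[i] * (a - b * Q) - c[i] * q[i]

def potential(q):
    """The potential for this game, following Monderer & Shapley (1996)."""
    squares = sum(qi * qi for qi in q)
    cross = sum(q[i] * q[j] for i in range(N) for j in range(i + 1, N))
    return a * sum(q) - b * squares - b * cross - sum(c[i] * q[i] for i in range(N))

# Exact-potential property: for any unilateral deviation of player i,
# u_i changes by the same amount as P.
random.seed(0)
for _ in range(100):
    q = [random.uniform(0, 3) for _ in range(N)]
    i = random.randrange(N)
    q2 = q[:]
    q2[i] = random.uniform(0, 3)
    assert abs((payoff(i, q2) - payoff(i, q)) -
               (potential(q2) - potential(q))) < 1e-9
print("exact potential verified")
```

By the Proposition above, maximizing this single function P over profiles therefore yields a Nash equilibrium of the N-player game.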
Example: A stochastic lake game (SLG) [Dechert and O’Donnell 2006]. Consider the dynamic game (1), subject to the state dynamics (2). Let P be the function defined in (3). This function is a potential for the SLG (1)-(2) in the sense that a unilateral change in any player’s strategy changes that player’s objective and P by the same amount.
Moreover, the solution to the control problem (4), subject to (5), is a NE for the SLG. Indeed, suppose that a multistrategy is an optimal solution to the OCP (4)-(5). Then, for any deviation by player 1 and the corresponding state sequence given by (2), we have (by definition (3) of P) that the change in player 1’s objective equals the change in P, which is ≤ 0 by optimality. Therefore the optimal strategy of player 1 is the best reply to the strategies of the other players. A similar argument applies to players 2, 3, …, N. □
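Since the slides’ equations (1)-(5) did not transcribe, here is the shape of the argument in our own static-style notation, under the assumption that deviations by player i change ui and the potential P equally:

```latex
u_i(a^i, a^{*,-i}) - u_i(a^{*}) \;=\; P(a^i, a^{*,-i}) - P(a^{*}) \;\le\; 0
\qquad \text{whenever } a^{*} \text{ maximizes } P,
```

so no unilateral deviation ai can improve player i’s payoff, i.e. each a*,i is a best reply and a* is a Nash equilibrium; in the dynamic case the same inequality is applied along trajectories of the state equation.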
PARETO vs NASH EQUILIBRIA Section 4.3.3 in González-Sánchez and Hernández-Lerma (2013) identifies a subclass of (discrete-time stochastic) dynamic potential games for which a Pareto (or cooperative) solution is also an open-loop Nash equilibrium. Martín-Herrán and Rincón-Zapatero (2005) obtain similar results for a class of (deterministic) differential games using Markov strategies.
Example: The great fish war [Levhari and Mirman 1980]. Let xt (t = 0, 1, …) be the stock of fish at time t in a specific fishing zone. Assume there are k countries deriving utility from fish consumption. Country i wants to maximize the total discounted utility Σt βi^t log cti, where βi ∈ (0, 1) is a discount factor and cti is the consumption corresponding to country i. The fish population follows the dynamics xt+1 = (xt − Σi cti)^α, with 0 < α < 1. (6)
To find a Pareto (or cooperative) equilibrium we want to maximize the convex combination Σi λi [Σt βi^t log cti] subject to (6), where Σi λi = 1 and each λi > 0. Using the Euler equation approach it can be proved that, for each i = 1, …, k, the nonstationary Markov strategy for consumption is given by (7),
where the coefficients in (7) can be computed explicitly [González-Sánchez, Hernández-Lerma 2013, Section 2.3.4]. Theorem 4.6 in the latter reference also yields that the Pareto solution in (7) is a Nash equilibrium.
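For the one-player benchmark of this model the dynamic-programming calculation is a standard textbook exercise (our computation, not taken from the cited reference) and shows where linear-in-the-stock consumption rules come from:

```latex
v(x) = \max_{0 \le c \le x} \bigl\{ \log c + \beta\, v\bigl((x - c)^{\alpha}\bigr) \bigr\}.
\quad \text{Guess } v(x) = A + B \log x. \text{ The FOC gives}
\quad \frac{1}{c} = \frac{\alpha \beta B}{x - c} \;\Rightarrow\; c = \frac{x}{1 + \alpha\beta B},
\quad \text{and matching the } \log x \text{ coefficients, } B = 1 + \alpha\beta B
\;\Rightarrow\; B = \frac{1}{1 - \alpha\beta}, \quad \text{so } c(x) = (1 - \alpha\beta)\, x .
```

The multi-player (Pareto and Nash) strategies in (7) are nonstationary refinements of this kind of rule; their coefficients are given explicitly in the cited reference.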
REFERENCES
• E. Altman (2014). “Bio-inspired paradigms in network engineering games”. Journal of Dynamics and Games 1, 1-15. To appear in January 2014.
• M. Beckman, C.B. McGuire, C.B. Winsten (1956). Studies in the Economics of Transportation. Yale University Press, New Haven.
• W.D. Dechert, S.I. O’Donnell (2006). “The stochastic lake game: a numerical solution”. J. Econ. Dyn. Control 30, 1569-1587.
• D. González-Sánchez, O. Hernández-Lerma (2013). Discrete-Time Stochastic Control and Dynamic Potential Games: The Euler-Equation Approach. SpringerBriefs, to appear.
• R. Josa-Fombellida, J.P. Rincón-Zapatero (2013). “An Euler-Lagrange equation approach for solving stochastic differential games”. Submitted.
• D. Levhari, L.D. Mirman (1980). “The great fish war: an example using a dynamic Cournot-Nash solution”. Bell J. Econom. 11, 322-334.
• G. Martín-Herrán, J.P. Rincón-Zapatero (2005). “Efficient Markov perfect Nash equilibria: theory and application to dynamic fishery games”. J. Econ. Dyn. Control 29, 1073-1096.
References
• D. Monderer, L.S. Shapley (1996). “Potential games”. Games Econ. Behav. 14, 124-143.
• J.F. Reinganum (1982). “A class of differential games for which the closed-loop and open-loop Nash equilibria coincide”. J. Optim. Theory Appl. 36, 253-262.
• R.W. Rosenthal (1973). “A class of games possessing pure-strategy Nash equilibria”. Int. J. Game Theory 2, 65-67.
• M.E. Slade (1994). “What does an oligopoly maximize?”. J. Ind. Econ. 42, 45-61.
• E.R. Weintraub, editor (1992). Toward a History of Game Theory. Duke University Press, Durham.
• D.W.K. Yeung, L.A. Petrosyan (2006). Cooperative Stochastic Differential Games. Springer, New York.
• D.W.K. Yeung, L.A. Petrosyan (2012). “Subgame consistent solution for cooperative stochastic dynamic games with random horizon”. Int. Game Theory Rev. 14, no. 2.
THANK YOU FOR YOUR ATTENTION