STOCHASTIC DYNAMIC GAMES

Onésimo Hernández-Lerma
Mathematics Department
CINVESTAV – IPN
México City
Abstract:
This talk is an introduction to stochastic dynamic games and
some of their applications. It includes cooperative and
noncooperative games, and some important special cases
such as compromise solutions, zero–sum games, and games
against nature (also known as minimax or worst–case
control problems). It also includes recent results on the
existence and characterizations of dynamic potential games.
One of these characterizations is particularly interesting
because it identifies a class of dynamic potential games in
which Nash (or noncooperative) equilibria coincide with
Pareto (or cooperative) equilibria. This latter fact is not very
common.
Contents
  What is a “game”?
   Some classes of games
   Historical remarks
   Examples

  Cooperative games
   Compromise solutions

  Noncooperative games
   Zero-sum games, Minimax control, Dynamic potential
   games, …
WHAT IS A GAME?

A game is a mathematical model of
 conflict or bargaining between
 decision-makers (e.g. individuals,
 firms, governments,…) called
 players, agents, controllers,…

There are

 static (or one-shot) games

 repeated games, and

 dynamic games, in which the state
 of the game evolves as a dynamical
 system.
Remark. A one-player
dynamic game is called a
control problem.

HISTORICAL REMARKS
The study of strategic games can be traced back to (at
least) the 18th century. For instance, the earliest
minimax solution of a game was proposed by James
Waldegrave (1684-1741) about 1713. However,
strategic games received widespread attention only
with the publication of

    J. von Neumann and O. Morgenstern (1944).
    Theory of Games and Economic Behavior.

Historical Remarks

This book concerned two-person zero-sum static
games exclusively, but it spawned many fundamental
ideas and concepts, in particular dynamic games.

 A. Wald (1945): minimax approach
  (“games against nature”) to statistical
   decision theory.

 J. Nash (1950): noncooperative games,
  bargaining games, …
        Nobel Prize 1994.
Historical Remarks

 P. M. Morse (1948): antisubmarine warfare,
   pursuit-evasion games, randomized (also
   known as “mixed”) strategies.

 E. Paxson (1946), R. Isaacs (1950): differential
   games, in which the state evolves according to an
   ordinary differential equation

      dx(t)/dt = F( x(t), a1(t), a2(t) ),  t ≥ 0.
Historical Remarks

 L.S. Shapley (1953): discrete-time Markov games,
  e.g. with state dynamics of the form

      xt+1 = F( xt, at1, …, atN, ξt ),  t = 0, 1, … .

           Nobel Prize 2012.

 W. H. Fleming (1960): stochastic differential
  games

 Jump Markov games, hybrid games, …
NOBEL PRIZES IN ECONOMICS
2012: A.E. Roth, L.S. Shapley.

2011: T.J. Sargent, C.A. Sims.

2007: L.Hurwicz, E.S. Maskin, R. B. Myerson.

2005: R.J. Aumann, T.C. Schelling.

2002: Vernon L. Smith.

1994: J.C. Harsanyi, J.F. Nash, R. Selten.
OTHER APPLICATIONS:
• OR games (1920 matches in MathSciNet, July 1, 2013)
• Environmental    games,   including   exploitation   of
    resources
•   Congestion games
•   Telecommunication networks
•   Portfolio games
•   Advertising
•   Money laundering
•   E-commerce
•   Epidemics
    ⁞
Example: The U.S. – Canada fish wars

 The general migration pattern of Pacific salmon raises an obvious question:
 Whose fish are they? Department of Fisheries and Oceans, Canada, 1997.
Example: The U.S. – Canada fish wars

 Discrete-time model:

     xt+1 = G(xt) − ( at1 + ⋯ + atN ),   t = 0, 1, … ,

with xt ≥ 0 ∀ t , where G(x) is the population’s natural growth
function, e.g. logistic growth G(x) = x + r x (1 − x/K).
  Stochastic differential model:

Each player i (i=1,…,N) wishes to “optimize” a given
  objective function, e.g.

defined for each initial state x0=x and each multistrategy

  π=(π1, …, πN )
 where πi ={ati} is a strategy for player i

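As a rough numerical sketch of a harvesting model of this kind (the growth function, harvest rule, and all parameter values below are illustrative assumptions, not taken from the talk), one can iterate the population dynamics under fixed harvest fractions for each player:

```python
# Sketch of discrete-time fish-war dynamics: x_{t+1} = G(x_t) - total catch.
# Assumptions (hypothetical): logistic growth G and constant harvest
# fractions; parameter values are purely illustrative.

def G(x, r=1.5, K=1.0):
    """Logistic natural-growth function: G(x) = x + r*x*(1 - x/K)."""
    return x + r * x * (1.0 - x / K)

def simulate(x0, harvest_fractions, T=50):
    """Iterate x_{t+1} = max(G(x_t) - sum of catches, 0), where player i
    catches a_t^i = harvest_fractions[i] * x_t at each stage."""
    x = x0
    path = [x]
    for _ in range(T):
        catch = sum(f * x for f in harvest_fractions)
        x = max(G(x) - catch, 0.0)
        path.append(x)
    return path

# Two players (say, the U.S. and Canada) harvesting 20% and 30% of the stock:
path = simulate(x0=0.5, harvest_fractions=[0.2, 0.3])
```

Varying the harvest fractions shows how each player's strategy affects the common stock, which is what makes the model a dynamic game rather than N separate control problems.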
Example: oligopolies

 Oligopoly = market with few sellers (or
firms or producers) and a large number of
buyers (or consumers or customers).

Examples of oligopolies:

  Transportation services (airlines, trains, buses,
   …)

  Energy markets (electricity, gas, oil, …)
Typical model:

with ati = ( pti, ρti ) = (production, price)

   Cournot duopoly (1838): ati = pti (“quantity game”)

   Edgeworth (1881): ati = ρti (“price game”)

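For the static Cournot “quantity game” with linear demand and constant marginal cost (a textbook specification; the parameter values here are illustrative), the equilibrium can be computed by iterating best responses:

```python
# Cournot duopoly sketch: inverse demand p = a - b*(q1 + q2), constant
# marginal cost c (illustrative values, not from the talk). Firm i's best
# response to the rival quantity q_j is q_i = (a - c - b*q_j) / (2b), and
# iterating best responses converges to the Cournot-Nash equilibrium
# q* = (a - c) / (3b).

def best_response(q_other, a=10.0, b=1.0, c=1.0):
    return max((a - c - b * q_other) / (2 * b), 0.0)

def cournot_equilibrium(a=10.0, b=1.0, c=1.0, iters=100):
    q1 = q2 = 0.0
    for _ in range(iters):
        # simultaneous best-response update (a contraction here)
        q1, q2 = best_response(q2, a, b, c), best_response(q1, a, b, c)
    return q1, q2

q1, q2 = cournot_equilibrium()   # both quantities approach (10 - 1) / 3 = 3
```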
Oligopoly theory began with Cournot,
more than 140 years ago. Judging from
many intermediate textbooks on price
theory, one might think it ended with him
too.

J . W. Friedman, Oligopoly and the Theory of
Games, 1982.
CLASSES OF DYNAMIC GAMES

   Cooperative games: The players
    act as a group and coordinate their
    actions for their mutual benefit.

CLASSES OF DYNAMIC GAMES

    Noncooperative games: The players
     are rivals, and each of them acts in his
     own best interest, paying no attention
     whatsoever to the fortunes of the other
     players.
COOPERATIVE GAMES

Notation. If u=(u1,…, uN) and v=(v1,…, vN) are in
 RN:

 u ≤ v iff ui ≤ vi ∀ i

 u < v iff u ≤ v and u ≠ v
Fix an arbitrary initial state x0 (possibly
random), and let
        Vi(π) ≡ Vi(π, x0 ) , i=1,…, N

be the cost function of player i, when the players use the
multistrategy π= ( π1,…, πN ) ∈ Π. Let

                 V(π) := ( V1 (π),…, VN (π) )

and
                     Γ := {V(π) | π ∈ Π} ⊂ RN

be the game’s objective set.
Definition: A multistrategy π* ∈ Π is called a Pareto
equilibrium (also known as a cooperative equilibrium) if
there is no π ∈ Π for which
                      V(π) < V(π*).
The set
  Γ * := { V(π*) | π* ∈ Π is a Pareto equil.}
is called the game’s Pareto front.

[Figure: the objective set Γ and its Pareto front Γ* in the (V1, V2) plane.]
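On a toy finite objective set (the cost vectors below are illustrative), the Pareto condition can be checked directly, and minimizing a positively weighted sum of the players' costs — the standard scalarization for cooperative solutions — recovers the Pareto points:

```python
# Toy illustration of Pareto equilibria: each candidate multistrategy is
# identified with its cost vector (V1, V2), and minimizing the weighted
# cost lam*V1 + (1-lam)*V2 with positive weights yields Pareto points.
# (Toy data; the talk's setting is dynamic games.)

costs = [(4.0, 1.0), (2.0, 2.0), (1.0, 4.0), (3.0, 3.0)]

def is_pareto(v, all_costs):
    """v is Pareto iff no w satisfies w <= v componentwise with w != v."""
    return not any(w[0] <= v[0] and w[1] <= v[1] and w != v for w in all_costs)

def weighted_minimizers(costs, n_weights=99):
    """Collect minimizers of lam*V1 + (1-lam)*V2 over a grid of weights."""
    found = set()
    for k in range(1, n_weights + 1):
        lam = k / (n_weights + 1)        # lam in (0, 1), both weights positive
        found.add(min(costs, key=lambda v: lam * v[0] + (1 - lam) * v[1]))
    return found

pareto_front = {v for v in costs if is_pareto(v, costs)}
# (3, 3) is dominated by (2, 2), so it is excluded from the front; here
# the weighted minimizers coincide with the front.
```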
Theorem: Under some hypotheses, π* is a Pareto
 equilibrium iff there is a vector λ = ( λ1,…, λN ),
 with λi > 0 ∀ i , and such that π* minimizes the
 weighted cost

     λ1 V1(π) + ⋯ + λN VN(π)   over Π.

 Why is π* a “cooperative equilibrium”? It is
cooperative because no other joint decision of the
players can improve the performance of at least
one of them without degrading the performance of
the others. This leads to the following question:
When is a cooperative equilibrium “fair” to all
 the players?

A possible answer. For each i ∈ { 1,…, N}, let

     Vi* := inf { Vi(π) | π ∈ Π },

 and consider the game’s utopic (or virtual, or ideal)
 minimum
                   V* := ( V1*,…, VN* ).

 Then a multistrategy π* is a compromise solution with
 respect to a norm || · || on RN if

     || V(π*) − V* || ≤ || V(π) − V* ||   ∀ π ∈ Π.
 The term Vi(π) − Vi* is called
player i’s regret when π is
used, and
           || V(π) − V* ||
is the group regret w.r.t. || ⋅ || .

[Figure: the objective set Γ and the utopia point (V1*, V2*) in the (V1, V2) plane.]
What norm should we use? For instance, Lp norms:

     || u ||p := ( |u1|^p + ⋯ + |uN|^p )^(1/p) ,  with 1 ≤ p < ∞.

   With p = 2, π* is called Salukvadze’s solution to the
  cooperative game.

     With p = ∞, π* is called a minimax equilibrium:

     π* minimizes  max i ( Vi(π) − Vi* )  over Π.
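On the same kind of toy objective set (illustrative data again), a compromise solution is obtained by minimizing the group regret from the utopia point in a chosen norm:

```python
# Compromise solutions on a toy objective set: the utopia point V* takes
# the componentwise minimum of the cost vectors, and a compromise solution
# minimizes the group regret ||V(pi) - V*|| in the chosen Lp norm.
# (Toy data for illustration.)

costs = [(4.0, 1.0), (2.0, 2.0), (1.0, 4.0), (3.0, 3.0)]
utopia = (min(v[0] for v in costs), min(v[1] for v in costs))   # (1.0, 1.0)

def compromise(costs, p):
    def group_regret(v):
        r = (v[0] - utopia[0], v[1] - utopia[1])   # individual regrets
        if p == float("inf"):
            return max(r)                          # minimax equilibrium
        return (r[0] ** p + r[1] ** p) ** (1.0 / p)
    return min(costs, key=group_regret)

sol2 = compromise(costs, 2)             # Salukvadze's solution
solinf = compromise(costs, float("inf"))  # minimax equilibrium
```

Here the regret vectors are (3,0), (1,1), (0,3), (2,2), so for p = 1, 2 and ∞ the compromise solution is the “balanced” point (2, 2).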
Important remark. A solution or
equilibrium of a dynamic game is said
to be time-consistent or dynamically
stable if its “optimality” is preserved
throughout the game. We shall assume
that our solution concepts are
dynamically stable. This is typically
ensured using dynamic programming
techniques. See e.g.
• Yeung and Petrosyan (2006) for  cooperative
  stochastic differential games.
• Yeung and Petrosyan (2012) for cooperative
  discrete-time stochastic games.
For noncooperative games, time-consistency is
obtained by Josa-Fombellida and Rincón-Zapatero
(2013) for a class of stochastic differential games. See
also the references therein.
Remark: See International Game Theory Review 15
(2013), no. 2: special issue on open problems in the
theory of cooperative games.
NONCOOPERATIVE GAMES

In a noncooperative game each player pursues
  his/her own interests …

In a Nash (or noncooperative) equilibrium one
   player cannot improve his/her outcome by
   altering his/her decision unilaterally.

Case N=2 players. A multistrategy π*= (π1* , π2*)
 is a Nash equilibrium (a.k.a. noncooperative
 equilibrium) if

    V1(π1*, π2*) ≤ V1(π1, π2*)   ∀ π1 , and
    V2(π1*, π2*) ≤ V2(π1*, π2)   ∀ π2 .
N > 1 players. A multistrategy π*= (π1* ,…, πN*) is a
Nash equilibrium if, for every i ∈ { 1,…, N},

    Vi(π*) ≤ Vi( π1*,…, π(i−1)*, πi, π(i+1)*,…, πN* )   ∀ πi .

Remark. Note that finding Nash equilibria involves
solving simultaneously N optimal control problems.

Remark: The existence of Nash equilibria for a
general dynamic game with uncountable state space is
an open problem!
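In a finite static game the Nash condition can be checked by brute force: no player may lower her own cost by a unilateral deviation. A toy two-player example with cost matrices (illustrative numbers, in the spirit of a prisoner's dilemma written in costs):

```python
# Direct check of the Nash condition in a finite two-player game where
# players MINIMIZE costs (toy matrices, not from the talk).
C1 = [[1.0, 3.0],   # C1[i][j]: cost to player 1 at action pair (i, j)
      [0.0, 2.0]]
C2 = [[1.0, 0.0],   # C2[i][j]: cost to player 2 at action pair (i, j)
      [3.0, 2.0]]

def is_nash(i, j):
    """(i, j) is Nash iff neither player gains by deviating unilaterally."""
    ok1 = all(C1[i][j] <= C1[k][j] for k in range(len(C1)))
    ok2 = all(C2[i][j] <= C2[i][l] for l in range(len(C2[0])))
    return ok1 and ok2

equilibria = [(i, j) for i in range(2) for j in range(2) if is_nash(i, j)]
```

In this example the unique Nash equilibrium (1, 1) has costs (2, 2), while the joint choice (0, 0) would give costs (1, 1): the noncooperative equilibrium is Pareto-dominated, illustrating why Nash and Pareto solutions rarely coincide.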
Examples of incorrect results:

  1. Lai, H.-C. and Tanaka, K. (1984). On an N-person
  noncooperative Markov game with a metric state space. J. Math.
  Anal. Appl. 101, pp. 78-96.
  2. Borkar, V.S. and Ghosh, M. K. (1992). Stochastic differential
  games: an occupation measure based approach. J. Optim. Theory
  Appl. 73, pp. 359-385; correction: ibid, 88 (1996), pp. 251-252.

The incorrect result in ref. 2 was reproduced in:
   Ramachandran, K.M. (2002). Stochastic differential games and
   applications. Chapter 8 in Handbook of Stochastic Analysis and
   Applications, edited by D. Kannan and V. Lakshmikantham,
   Marcel Dekker, New York.
Zero-sum games: V1 (π) + V2 (π) =0 for all
π = ( π1, π2 ) . Let V := V1 = −V2 .

Then π*= (π1* , π2*) is a Nash equilibrium iff
π*= (π1* , π2*) is a saddle point:

    V(π1*, π2) ≤ V(π1*, π2*) ≤ V(π1, π2*)   ∀ π1, π2 .
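A toy finite instance of the saddle-point condition (the matrix below is illustrative): with a single cost matrix A for player 1, a pure saddle point is an entry that is minimal in its column and maximal in its row.

```python
# Zero-sum sketch: since V1 = -V2, an equilibrium is a saddle point of the
# single matrix A, where A[i][j] is player 1's cost (toy matrix).
A = [[6.0, 8.0, 7.0],
     [5.0, 4.0, 3.0],
     [7.0, 6.0, 9.0]]

def saddle_points(A):
    """Pure saddle points: A[i][j] <= A[k][j] for all k (the minimizing
    row player cannot improve) and A[i][j] >= A[i][l] for all l (the
    maximizing column player cannot improve)."""
    pts = []
    for i, row in enumerate(A):
        for j, v in enumerate(row):
            if all(v <= A[k][j] for k in range(len(A))) and all(v >= w for w in row):
                pts.append((i, j))
    return pts

# Here the unique saddle point is (1, 0) with game value 5.
```

When no pure saddle point exists, the value is attained in randomized (“mixed”) strategies, which is von Neumann's minimax theorem for matrix games.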
Minimax control. Consider a control (or one-
 player) problem that depends on unknown
 parameters. For instance, the state equation is

     xt+1 = F( xt, at, θ ),  with θ an unknown parameter,

or

     xt+1 = F( xt, at, ξt ),  with random disturbances ξt .
To fix ideas, consider the latter case: the disturbances ξt are
  i.i.d. with unknown distribution. There are two cases,
  depending on whether the ξt are “observable” or not

Are the ξt “observable”?

    Yes → Adaptive control.

    No  → Minimax control.
Adaptive control scheme: use observations of ξt to
 estimate μ.

[Diagram: feedback loop — the controller applies an action a to the system, observes the state x, and an estimator of μ feeds its current estimate back to the controller.]
Minimax control ≡ worst-case control ≡ robust control ≡
  game against nature (main ideas go back to A. Wald, c.
  1939): the control problem is posed as a two-player game
  where
 player 1 is the controller, and
 player 2 is the “Nature” that each time t chooses the
  distribution, say μt ∈ M, of ξt , where M is the family of
   “feasible” distributions.
Therefore, instead of the controller’s objective, say the
“cost” function V(π), we now have V(π, γ),

where π={at} is a controller’s strategy, and γ ={μt} ⊂ M is a
Nature’s strategy. Let

    V#(π) := sup { V(π, γ) | γ is a Nature’s strategy }.

Then π* is called a minimax strategy if

    V#(π*) = inf π V#(π).

A minimax strategy π* minimizes the “worst” that the Nature
can do.
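A one-step toy version of the game against Nature (the cost function, disturbance values, and family M below are illustrative assumptions): the controller picks an action, Nature picks a distribution from M, and the controller minimizes the worst-case expected cost.

```python
# Game-against-Nature sketch in a one-step toy problem. The controller
# minimizes V#(a) = sup over mu in M of E_mu[ cost(a, xi) ].
# All numerical data here are illustrative.

actions = [0.0, 0.5, 1.0]
xi_values = [-1.0, 0.0, 1.0]
M = [                       # feasible distributions over xi_values
    (0.2, 0.6, 0.2),
    (0.5, 0.0, 0.5),
    (0.1, 0.1, 0.8),
]

def cost(a, xi):
    return (a - xi) ** 2    # quadratic tracking cost (illustrative)

def worst_case(a):
    """V#(a): the largest expected cost Nature can inflict using M."""
    return max(sum(p * cost(a, xi) for p, xi in zip(mu, xi_values))
               for mu in M)

minimax_action = min(actions, key=worst_case)   # the minimax strategy
```

Here the worst-case costs are 1.0, 1.25 and 2.0 for the three actions, so the minimax strategy is the cautious action a = 0, even though some fixed distribution in M might favor another action.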
POTENTIAL GAMES

The static case. Consider a static game in normal form

    G = ( N, {Ai}, {ui} ),

where
    N = {1,…, N} is the set of players;
    Ai is the action set of player i;
    ui : A → R is the payoff function for player i, with
    A := A1 × ⋯ × AN .
Potential Games

 Definition: We say that a* = (a1*,…, aN*) ∈ A is a Nash
   equilibrium for G if, for each player i ∈ N,

     ui(a*) ≥ ui( a1*,…, a(i−1)*, ai, a(i+1)*,…, aN* )   ∀ ai ∈ Ai .

 Definition [Slade 1994, Monderer & Shapley 1996, ...]
   A differentiable function P : A → R is called a
   potential for the game G if

     ∂ui/∂ai = ∂P/∂ai   ∀ i ∈ N.

 In this case G is called a potential game.
Potential Games

 Beckmann et al. (1956) and Rosenthal (1973) are
 early papers on potential games.

 Remark: The definition of potential game does not
 need differentiability.

Potential Games

Proposition. Suppose that G is a potential game with
  potential function P. Then a* ∈ A is a NE
  for G iff, for every i ∈ N,

     P(a*) ≥ P( a1*,…, a(i−1)*, ai, a(i+1)*,…, aN* )   ∀ ai ∈ Ai .
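The defining condition — each player's marginal payoff in her own action agrees with that of the potential — can be verified numerically for the linear-cost Cournot oligopoly, a classical potential game (the potential below is the standard one for this game; parameter values are illustrative):

```python
# Numerical check of the potential condition du_i/dq_i = dP/dq_i for the
# linear Cournot oligopoly: u_i(q) = q_i*(A - B*sum(q)) - C*q_i with
# potential P(q) = A*sum(q) - B*sum(q_i^2) - B*sum_{i<j} q_i*q_j - C*sum(q).
# Parameter values are illustrative.

A, B, C = 10.0, 1.0, 1.0

def u(i, q):
    return q[i] * (A - B * sum(q)) - C * q[i]

def P(q):
    cross = sum(q[i] * q[j] for i in range(len(q))
                for j in range(i + 1, len(q)))
    return A * sum(q) - B * sum(x * x for x in q) - B * cross - C * sum(q)

def partial(f, q, i, h=1e-6):
    """Central-difference partial derivative of f at q in coordinate i."""
    qp = list(q); qp[i] += h
    qm = list(q); qm[i] -= h
    return (f(qp) - f(qm)) / (2 * h)

q = [1.0, 2.0, 0.5]
diffs = [abs(partial(lambda x: u(i, x), q, i) - partial(P, q, i))
         for i in range(3)]   # all (numerically) zero
```

By the proposition, maximizing P in each player's own coordinate therefore reproduces the Nash equilibria of the oligopoly.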

               The dynamic case
A noncooperative dynamic game is said to be a
dynamic potential game if it can be reduced to the
study of a single optimal control problem. The following
example is a special case of the results in Section 4.3.2
in [González-Sánchez & Hernández-Lerma 2013].
Example: A stochastic lake game [Dechert and
O’Donnell 2006]. Consider the dynamic game:

s.t.
with             Let

This function is a potential for the SLG (1)-(2) in the
sense that its partial derivative with respect to each
player’s own action coincides with that of the player’s utility.
Moreover, the solution to the control problem

    maximize  E Σt βt P(xt, at)              (4)

s.t.

    the state equation (2),                  (5)

is a NE for the SLG. Indeed, suppose that
is an optimal solution to the OCP (4)-(5). Then for any
sequence       and the corresponding state sequence
given by (2)

we have (by definition (3) of P)

Therefore

that is,     is the best reply of player 1 to the
  strategies of the other players. A similar argument applies to
  players 2, 3, …, N. □
PARETO vs NASH EQUILIBRIA

Section 4.3.3 in González-Sánchez and Hernández-
Lerma (2013) identifies a subclass of (discrete-time
stochastic) dynamic potential games for which a Pareto
(or cooperative) solution is also an open-loop Nash
equilibrium. Martín-Herrán and Rincón-Zapatero
(2005) obtain similar results for a class of
(deterministic) differential games using Markov
strategies.
Example: The great fish war [Levhari and Mirman
1980]. Let xt (t=0, 1, …) be the stock of fish at time t
in a specific fishing zone. Assume there are k countries
deriving utility from fish consumption.

Country i wants to maximize

    Vi := Σt βit ln cti ,

where βi ∈ (0,1) is a discount factor and cti is the consumption
corresponding to country i. The fish population follows
the dynamics

    xt+1 = ( xt − Σi cti )α ,  0 < α < 1.        (6)
To find a Pareto (or cooperative) equilibrium we want
to maximize the convex combination

    λ1 V1 + ⋯ + λk Vk

subject to (6), where λ1 + ⋯ + λk = 1 and each λi > 0.
Using the Euler equation approach it can be proved
that, for each i= 1, …, k, the nonstationary Markov
strategy for consumption is

where the coefficients can be computed explicitly [González-
Sánchez, Hernández-Lerma 2013, Section 2.3.4].
Theorem 4.6 in the latter reference also yields that the
Pareto solution in (7) is a Nash equilibrium.
REFERENCES
 E. Altman (2014). “Bio-inspired paradigms in network engineering games”.
  Journal of Dynamics and Games 1, 1-15. To appear in January 2014.
 M. Beckmann, C.B. McGuire, C.B. Winsten (1956). Studies in the Economics of
  Transportation. Yale University Press, New Haven.
 W.D. Dechert, S.I. O’Donnell (2006). “The stochastic lake game: a numerical
  solution”. J. Econ. Dyn. Control 30, 1569-1587.
 D. González-Sánchez, O. Hernández-Lerma (2013). Discrete-Time Stochastic
  Control and Dynamic Potential Games: The Euler-Equation Approach. Springer
  Briefs, to appear.
 R. Josa-Fombellida, J.P. Rincón-Zapatero (2013). “An Euler-Lagrange equation
  approach for solving stochastic differential games”. Submitted.
 D. Levhari, L.D. Mirman (1980). “The great fish war: an example using dynamic
  Cournot-Nash solution”. Bell J. Econom. 11, 322-334.
 G. Martín-Herrán, J.P. Rincón-Zapatero (2005). “Efficient Markov perfect Nash
  equilibria: theory and application to dynamic fishery games”. J. Econ. Dyn.
  Control 29, 1073-1096.
References
 D. Monderer, L.S. Shapley (1996). “Potential games”. Games Econ. Behav. 14,
  124-143.
 J.F. Reinganum (1982). “A class of differential games for which the closed-loop
  and open-loop Nash equilibria coincide”. J. Optim. Theory Appl. 36, 253-262.
 R.W. Rosenthal (1973). “A class of games possessing pure-strategy Nash
  equilibrium”. Int. J. Game Theory 2, 65-67.
 M.E. Slade (1994). “What does an oligopoly maximize?”. J. Ind. Econ. 42, 45-61.
 E.R. Weintraub, editor (1992). Toward a History of Game Theory. Duke
  University Press, Durham.
 D.W.K. Yeung, L.A. Petrosyan (2006). Cooperative Stochastic Differential
  Games. Springer, New York.
 D.W.K. Yeung, L.A. Petrosyan (2012). “Subgame consistent solution for
  cooperative stochastic dynamic games with random horizon”. Int. Game Theory
  Rev. 14, no. 2.

THANK YOU FOR
YOUR ATTENTION
