paint-brush
Fictitious Play for Mixed Strategy Equilibria in Mean Field Games: Abstract and Introductionby@gamifications
New Story

Fictitious Play for Mixed Strategy Equilibria in Mean Field Games: Abstract and Introduction

by GamificationsSeptember 25th, 2024
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

The theory of mean field games (MFGs) provides a framework for modeling games with a large number of players. In an MFG model, each player controls their own trajectory based on the expected distribution of the states of all players. This paper proposes a generalized fictitious play algorithm that computes OSMFG mixed equilibria.
featured image - Fictitious Play for Mixed Strategy Equilibria in Mean Field Games: Abstract and Introduction
Gamifications HackerNoon profile picture

Authors:

(1) Chengfeng Shen, School of Mathematical Sciences, Peking University, Beijing;

(2) Yifan Luo, School of Mathematical Sciences, Peking University, Beijing;

(3) Zhennan Zhou, Beijing International Center for Mathematical Research, Peking University.

Abstract and 1. Introduction

2 Model and 2.1 Optimal Stopping and Obstacle Problem

2.2 Mean Field Games with Optimal Stopping

2.3 Pure Strategy Equilibrium for OSMFG

2.4 Mixed Strategy Equilibrium for OSMFG

3 Algorithm Construction and 3.1 Fictitious Play

3.2 Convergence of Fictitious Play to Mixed Strategy Equilibrium

3.3 Algorithm Based on Fictitious Play

3.4 Numerical Analysis

4 Numerical Experiments and 4.1 A Non-local OSMFG Example

4.2 A Local OSMFG Example

5 Conclusion, Acknowledgement, and References

Abstract

This paper considers mean field games with optimal stopping time (OSMFGs) where agents make optimal exit decisions, the coupled obstacle and Fokker-Planck equations in such models pose challenges versus classic MFGs. This paper proposes a generalized fictitious play algorithm that computes OSMFG mixed equilibria by iteratively solving pure strategy systems, i.e. approximating mixed strategies through averaging pure strategies according to a certain updating rule. The generalized fictitious play allows for a broad family of learning rates and the convergence to the mixed strategy equilibrium can be rigorously justified. The algorithm also incorporates efficient finite difference schemes of the pure strategy system, and numerical experiments demonstrate the effectiveness of the proposed method in robustly and efficiently computing mixed equilibria for OSMFGs.


Mathematics Subject Classification 2020: 91A13, 60G40, 65M06.


Keywords: mean field games, optimal stopping, obstacle problem, fictitious play, finite difference method.

1 Introduction

Recently, large-scale strategic interactions involving many agents, such as online auctions and online voting platforms, have become more prevalent compared to small-scale game scenarios with few players. Introduced in the seminal works by Lasry and Lions [14], [15], [16], the theory of mean field games (MFGs) provides a framework for modeling games with a large number of players. In an MFG model, each player controls their own trajectory based on the expected distribution of the states of all players. A key challenge in this model is determining the optimal control strategy and resulting crowd propagation when the strategies of all players reach a Nash equilibrium. The Nash equilibrium for the game can be characterized by a coupled system of partial differential equations (PDEs) consisting of a backward Hamilton-Jacobi-Bellman (HJB) equation and a forward Fokker-Planck equation. A variety of numerical methods have been proposed for solving MFG models. Early work by Achdou and Capuzzo-Dolcetta introduced finite difference techniques and analyzed their convergence properties [3], [2], [1]. Since then, additional numerical approaches have been developed. A comprehensive summary of numerical methods for mean field games is provided by Lauri`ere [17].


In this paper, we consider mean field games with optimal stopping (OSMFGs). Instead of controlling the velocity term of a stochastic differential equation (SDE) as in a standard MFG, in an OSMFG, however, agents make optimal stopping time decisions by choosing whether to exit the game or remain active at each instant in time and state. Optimal stopping problems arise naturally in economics, such as when firms optimally decide to exit markets if costs become prohibitively high [13], or when modeling industry turnover [4]. Extending MFGs to incorporate optimal stopping thus allows realistic modeling of economic scenarios where agents make optimal exit decisions over time. Mathematically, incorporating optimal stopping control leads the PDE satisfied by the value function to become an obstacle equation rather than the standard HJB equation. Furthermore, the singular nature of the stopping control precludes the existence of pure strategy equilibria, necessitating the determination of mixed strategy equilibria which lie in the probability space of the original control strategies. The obstacle equations and mixed strategy equilibria that characterize OSMFGs present greater analytical and computational challenges compared to classic MFGs.


Analogous to classic MFG models, we can formally derive a system of coupled forward-backward PDEs to characterize the Nash equilibrium for OSMFGs:




System (1.1) represents the pure strategy Nash equilibrium for the proposed OSMFG model. However, due to the limited regularity of the optimal stopping time control compared to controls on the velocity term, the existence of solutions to this PDE system is not guaranteed. To resolve this issue, we introduce the notion of a mixed strategy Nash equilibrium, which leads to a more complex system of PDEs given by:



The condition m = 0 in the region {u = ψ} ×(0, T) is relaxed to the conditions ∂tm − L∗m ≤ 0 and m ≥ 0, along with the complementary condition:



The wellposedness of system (1.2) has been proved in [5]. However, the complex mathematical structures of this system, including the forward-backward structure and the relaxed exiting conditions, pose significant challenges for computing mixed strategy equilibria. Currently, few algorithms have been proposed for OSMFG. There have been two main types of numerical algorithms developed for this problem. One is Uzawa’s algorithm, proposed in [6], which operates under the assumption of monotone running costs. This optimization-based method directly handles mixed strategy systems but is limited to the stationary case. However, in many practical applications, there is a need to solve time-dependent problems, which cannot be directly tackled by the Uzawa method. The other method utilizes a large-scale linear programming approach with fictitious play, as introduced in [10]. This algorithm relies on the measure flow formulation of OSMFG presented in [7] and has been applied to practical problems such as modeling games in electricity markets [4]. However, a major drawback of this approach is the need to solve a large-scale linear program at each iteration, which reduces computational efficiency and robustness. Additionally, while this method can provide information on crowd propagation, it does not reveal the individual stopping strategies employed by each player. In summary, efficient numerical methods for solving general OSMFG problems remain underdeveloped due to the complex mathematical structure of such games. Further algorithm development is needed to handle the obstacle equations and mixed strategy equilibria inherent to OSMFGs in a computationally practical manner.


The goal of this paper is to propose an iterative algorithm that computes the mixed strategy equilibria for OSMFGs using a generalized fictitious play. The core idea of this algorithm is to find the mixed strategy equilibria by repeatedly solving pure strategy equilibrium systems, which calibrate the mixed strategy equilibrium according to a certain updating rule, and the construction is based on the economic intuition that a mixed strategy can be regarded as a limit of linear combinations of pure strategies. Thus, the accuracy of the proposed algorithm hinges on achieving convergence of the iterative process and consistently approximating the pure strategy system (1.1).


Fictitious play is a fixed point iteration with a learning rate δn, mostly taken to be 1/n, meaning that at each iterative round, players optimize their controls based on the averaged crowd propagation across all previous rounds of plays. In this paper, we generalize the notion of fictitious play by requiring the learning rate δn to satisfy a broad criterion



and prove that for potential games, the generalized fictitious plays applied to the pure strategy system (1.1) leads to a mixed strategy equilibrium. This result constructs a bridge between pure and mixed strategies, enabling us to find mixed strategy equilibria by only solving pure strategy systems at each step, which readily accommodate finite difference schemes. Fictitious play has been applied to find equilibria for classical MFG models [9] and compute relaxed equilibria with linear programming for OSMFG [10]. However, the proposed generalized fictitious play framework uniquely contributes two advances for computing OSMFG mixed equilibria. First, it provides a new perspective of iteratively combining pure strategy solutions to approximate mixed equilibria. Second, it boosts convergence stability and customization by expanding the choice of the learning rate. Together, these innovations of an iterative pure strategy view and increased δn flexibility offer key improvements over the limitations of previous OSMFG methods for efficient mixed equilibrium calculation.



The rest of this paper is organized as follows. In Section 2, we introduce the OSMFG model and present the PDE systems characterizing the pure and mixed strategy equilibria. In Section 3, we prove the convergence of generalized fictitious play in the continuous setting and construct the fully-implicit and semiimplicit schemes for the obstacle problem and the Fokker Planck equation. We also prove the convergence of generalized fictitious play combined with the fully-implicit obstacle scheme. In Section 4, we present numerical experiments validating our algorithms. Finally, in Section 5, we summarize the paper’s main results and discuss potential extensions.


This paper is available on arxiv under CC 4.0 license.