
How Deep RL Enhances Hedging Strategies for American Put Options in Volatile Markets

by Economic Hedging Technology, October 29th, 2024

Too Long; Didn't Read

This article applies deep reinforcement learning (DRL) to hedging American put options, addressing the limitations of the Black-Scholes model in real-world settings with transaction costs and stochastic volatility through a deep deterministic policy gradient (DDPG) framework.
  1. Abstract

  2. Introduction

    Background

    Reinforcement Learning

    Similar Work

  3. Methodology

    DRL Agent Design

  4. Training Procedures

  5. Testing Procedures

  6. Results

  7. SABR Experiments

  8. Conclusions

  9. Appendix A

  10. References

Introduction

BACKGROUND

Following the sale of a financial option, traders seek to mitigate the associated risk through hedging strategies. For example, a trader who has just sold a European put option may hedge against adverse price drops by selling shares of the underlying asset. Traditionally, option hedgers attempt to maintain a Delta-neutral portfolio, where Delta is the first partial derivative of the option price with respect to the underlying asset price (Hull 2012). The Delta-neutral hedge for the European put option therefore requires the sale of Delta shares of the underlying. For European options, analytic solutions for the option price and Delta exist via the Black and Scholes (BS) option pricing model (Black and Scholes 1973). While the BS model shows that the underlying risk of an option position is eliminated by a continuously rebalanced Delta-neutral portfolio, financial markets operate discretely in practice. Further, the BS model assumes constant volatility and no trading costs, which is not reflective of reality. As such, dynamic option hedging under market frictions is a sequential decision-making process under uncertainty. One field that has garnered significant attention for dynamic decision-making problems is reinforcement learning (RL), a subfield of artificial intelligence (AI). In complex environments, RL is aided by neural network (NN) function approximation. The combination of NNs and RL is known as deep RL (DRL), and DRL has been used to achieve super-human performance in video games (Mnih et al. 2013), board games (Silver et al. 2016), and robot control (Lillicrap et al. 2015).
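
For concreteness, the Delta-neutral hedge described above can be made explicit with the BS put Delta, N(d1) - 1. The minimal sketch below computes it in Python; all numerical inputs are illustrative placeholders rather than parameters from this study.

```python
# Minimal sketch of the Black-Scholes put Delta; inputs are illustrative only.
from math import log, sqrt
from statistics import NormalDist

def bs_put_delta(S, K, T, r, sigma):
    """First partial derivative of the BS put price with respect to the underlying."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return NormalDist().cdf(d1) - 1.0  # put Delta lies in (-1, 0)

# A trader who has just sold a put offsets the exposure by holding Delta
# shares of the underlying, i.e. a short stock position.
delta = bs_put_delta(S=100.0, K=95.0, T=0.5, r=0.02, sigma=0.25)
print(f"put Delta: {delta:.3f}")
```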


As the dynamic hedging problem requires decision-making under uncertainty, several recent studies have used DRL to effectively hedge option positions. A review of 17 studies that use DRL for dynamic stock option hedging is given by Pickard and Lawryshyn (2023), who note that while many studies show DRL outperforming a Delta-neutral strategy when hedging European options under transaction costs and stochastic volatility, no current study considers the hedging of American options. For American put options specifically, no analytical pricing or hedging formula exists because of the potential for early exercise, and numerical methods are required for option pricing and hedging (Hull 2012). To address this gap, this article details the design of DRL agents trained to hedge American put options under transaction costs. The DRL agents in this study are designed with the deep deterministic policy gradient (DDPG) method. In addition to training an American put DRL hedger when the underlying asset price follows a geometric Brownian motion (GBM), stochastic volatility is considered by calibrating stochastic volatility models to empirical option data on several stock symbols. After these DRL agents are tested on simulated paths generated by the calibrated models, the hedging performance of each agent is evaluated on the empirical asset price path of the respective symbol between the sale and maturity dates.
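
As a rough illustration of the GBM setting in which the first DRL hedger is trained, the sketch below generates discretized GBM price paths of the kind that could serve as training episodes. The drift, volatility, and grid parameters are assumptions made for this example, not the calibrated settings used in the paper.

```python
# Illustrative GBM path generator for constructing training episodes.
import numpy as np

def simulate_gbm_paths(s0, mu, sigma, T, n_steps, n_paths, seed=0):
    """Exact discretization: S_{t+dt} = S_t * exp((mu - sigma^2/2) dt + sigma sqrt(dt) Z)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal((n_paths, n_steps))
    increments = (mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.cumsum(increments, axis=1)
    paths = s0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))
    return paths  # shape (n_paths, n_steps + 1); column 0 is s0

# e.g. 10,000 quarterly paths rebalanced on 63 trading days
paths = simulate_gbm_paths(s0=100.0, mu=0.05, sigma=0.20, T=0.25, n_steps=63, n_paths=10_000)
```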


Finally, note that the DRL agent's reward function requires the option price at each time step. While an interpolation of a binomial American option tree is used in the GBM cases, this study employs a Chebyshev interpolation method, first proposed by Glau, Mahlstedt, and Pötz (2018), to determine the option price in the stochastic volatility experiments. The Chebyshev method is model-agnostic, so this work provides a framework that extends seamlessly to more intricate processes. Moreover, the Chebyshev method allows the American option price to be computed more efficiently in stochastic volatility settings, as it eliminates the need to average the payoff of several thousand Monte Carlo (MC) simulations from the current price level to expiry or exercise. This Chebyshev pricing method is described in detail in the methodology section of this work.
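
For reference, the binomial tree pricing mentioned above can be illustrated with a standard Cox-Ross-Rubinstein (CRR) backward induction that checks for early exercise at every node. The sketch below is a generic textbook implementation and does not reproduce the authors' interpolation over the tree or the Chebyshev method.

```python
# Generic CRR binomial pricer for an American put (illustrative sketch only).
import numpy as np

def american_put_crr(S0, K, T, r, sigma, n_steps=500):
    dt = T / n_steps
    u = np.exp(sigma * np.sqrt(dt))          # up factor
    d = 1.0 / u                              # down factor
    p = (np.exp(r * dt) - d) / (u - d)       # risk-neutral up probability
    disc = np.exp(-r * dt)

    # option values at maturity
    j = np.arange(n_steps + 1)
    V = np.maximum(K - S0 * u ** j * d ** (n_steps - j), 0.0)

    # backward induction with an early-exercise check at every node
    for i in range(n_steps - 1, -1, -1):
        j = np.arange(i + 1)
        exercise = np.maximum(K - S0 * u ** j * d ** (i - j), 0.0)
        continuation = disc * (p * V[1:i + 2] + (1.0 - p) * V[:i + 1])
        V = np.maximum(exercise, continuation)
    return V[0]

print(american_put_crr(S0=100.0, K=95.0, T=0.5, r=0.02, sigma=0.25))
```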


The rest of this section is dedicated to the introduction of DRL and a detailed account of similar work in the DRL hedging space. This article will then detail the methodology used to train DRL agents, before presenting and discussing the results of all numerical experiments.


Authors:

(1) Reilly Pickard, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada ([email protected]);

(2) Finn Wredenhagen, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;

(3) Julio DeJesus, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;

(4) Mario Schlener, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;

(5) Yuri Lawryshyn, Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada.


This paper is available on arXiv under the CC BY-NC-SA 4.0 Deed (Attribution-NonCommercial-ShareAlike 4.0 International) license.