Table of Links

Abstract
Introduction
Background
Reinforcement Learning
Similar Work
Methodology
DRL Agent Design
Training Procedures
Testing Procedures
Results
SABR Experiments
Conclusions
Appendix A
References

Methodology

Successfully implementing a DRL agent for the American put option hedging task requires three main steps: the design of the DRL agent, the setup of the training procedure, and the setup of a testing procedure. The design of the DRL agent covers the construction of the neural network, the choice of hyperparameters, the formulation of the state- and action-spaces, and the definition of a reward. The training procedure involves the data generation processes required to provide the agent with adequate state and reward information. Finally, forming testing scenarios requires the acquisition of data and the development of a benchmark comparator for the DRL agent, such as the Delta hedging method. This section first details the DRL agent design and then the training and testing procedures for all experiments.

DRL AGENT DESIGN

As policy-based DRL methods allow for continuous action-spaces, the DRL method employed in this study is DDPG. The actor and critic networks are both fully connected NNs with two hidden layers of 64 nodes each. In both networks, the hidden layers use the rectified linear unit (ReLU) as the non-linear activation function. The actor network uses a sigmoidal output function to map the actions to the range [0, 1], while the critic network uses a linear output. Note that the actor output is multiplied by −1, so that the action lies in [−1, 0], as this study hedges American put options, which require shares to be shorted.

The state-space for the DRL agent in this study includes the current asset price, the time-to-maturity, and the current holding (previous action). This is aligned with the state-spaces used in Kolm and Ritter (2019), Cao et al. (2021), Xiao, Yao, and Zhou (2021), and Assa, Kenyon, and Zhang (2021). This study does not include the BS Delta in the state-space, agreeing with Cao et al. (2021), Kolm and Ritter (2019), and Du et al. (2020) that the addition of the BS Delta is an unnecessary augmentation of the state. Moreover, the inclusion of the BS Delta in the state may hinder the effectiveness of an American DRL hedger, as the BS model is derived using European options.

The reward formulation used in this study is given by

Authors:
(1) Reilly Pickard, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, ON M5S 3G8, Canada (reilly.pickard@mail.utoronto.ca);
(2) Finn Wredenhagen, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;
(3) Julio DeJesus, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;
(4) Mario Schlener, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;
(5) Yuri Lawryshyn, Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON M5S 3E5, Canada.

This paper is available on arXiv under the CC BY-NC-SA 4.0 Deed (Attribution-Noncommercial-Sharelike 4.0 International) license.
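
The network layout described under DRL AGENT DESIGN can be illustrated with a small forward-pass sketch. It is not the authors' implementation: it uses plain Python with no dependencies so that the layer shapes are explicit, whereas a real DDPG agent would typically be built in a deep learning framework and trained. The weight initialisation, the way the action is fed to the critic (concatenated with the state), the helper names, and the example state values are all assumptions made for illustration.

```python
import math
import random

HIDDEN = 64  # nodes per hidden layer, as stated in the text


def init_layer(n_in, n_out, rng):
    """Create (weights, biases) for one dense layer. The uniform init is an assumption."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Affine map W x + b for a single input vector x."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def make_mlp(n_in, n_out, rng):
    """Two hidden layers of HIDDEN nodes followed by an output layer."""
    return [
        init_layer(n_in, HIDDEN, rng),
        init_layer(HIDDEN, HIDDEN, rng),
        init_layer(HIDDEN, n_out, rng),
    ]


def actor_forward(actor, state):
    """Map a state to a hedge position: sigmoid output in [0, 1], multiplied by -1."""
    h = relu(dense(actor[0], state))
    h = relu(dense(actor[1], h))
    raw = dense(actor[2], h)[0]
    return -sigmoid(raw)  # negative value = short position in the underlying


def critic_forward(critic, state, action):
    """Estimate Q(state, action) with a linear output.

    Concatenating the action with the state is an assumption of this sketch.
    """
    h = relu(dense(critic[0], state + [action]))
    h = relu(dense(critic[1], h))
    return dense(critic[2], h)[0]


rng = random.Random(0)
actor = make_mlp(3, 1, rng)   # state: asset price, time-to-maturity, current holding
critic = make_mlp(4, 1, rng)  # state (3 values) plus action (1 value)

# Hypothetical, already-normalised state values chosen only for illustration.
state = [1.0, 0.5, -0.5]
action = actor_forward(actor, state)
q_value = critic_forward(critic, state, action)
print(f"action={action:.4f}, q_value={q_value:.4f}")
```

Because the networks here are untrained, the printed numbers carry no meaning; the sketch only shows that the actor's output always falls in [−1, 0] and that the critic returns an unbounded scalar.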

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.
