Optimizing Deep Reinforcement Learning for American Put Option Hedging: Hyperparameter Analysis

Table of Links

Abstract and 1. Introduction

  2. Deep Reinforcement Learning

  3. Similar Work

    3.1 Option Hedging with Deep Reinforcement Learning

    3.2 Hyperparameter Analysis

  4. Methodology

    4.1 General DRL Agent Setup

    4.2 Hyperparameter Experiments

    4.3 Optimization of Market Calibrated DRL Agents

  5. Results

    5.1 Hyperparameter Analysis

    5.2 Market Calibrated DRL with Weekly Re-Training

  6. Conclusions

Appendix

References

3.2 Hyperparameter Analysis



Islam et al. (2017) follow up on the work of Henderson et al. (2017) by examining the performance of DDPG and TRPO methods in the HalfCheetah and Hopper gym environments. They first note that the results of Henderson et al. (2017) are difficult to reproduce, even with similar hyperparameter configurations. Islam et al. (2017) then add to the literature an analysis of DDPG actor and critic learning rates. They conclude that the optimal learning rates differ between the Hopper and HalfCheetah environments, and note that it is difficult to gain a true understanding of optimal learning rate choices while all other parameters are held fixed. Andrychowicz et al. (2020) perform a thorough hyperparameter sensitivity analysis for multiple on-policy DRL methods, but do not consider off-policy methods such as DDPG. Overall, Andrychowicz et al. (2020) conclude that performance is highly dependent on hyperparameter tuning, and that this dependence limits the pace of research advances. Several other studies report similar findings when testing various DRL methods in different environments, whether through a manual hyperparameter search or a predefined hyperparameter optimization algorithm (Ashraf et al., 2021; Kiran and Ozyildirim, 2022; Eimer et al., 2022).
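The point about learning rates being hard to judge while other parameters are held fixed can be illustrated with a small sketch. The example below is not taken from any of the cited studies: `toy_score` is a hypothetical stand-in for "train a DDPG agent and report its final reward", and the interaction term inside it is an assumption chosen purely to show how the best actor learning rate can shift with the critic learning rate.

```python
import itertools
import math

# Hypothetical candidate grids; the values are illustrative assumptions.
ACTOR_LRS = [1e-5, 1e-4, 1e-3]
CRITIC_LRS = [1e-4, 1e-3, 1e-2]


def toy_score(actor_lr: float, critic_lr: float) -> float:
    """Synthetic score standing in for a full training run (assumption).

    The first term is an interaction: the score is highest when the critic
    learning rate is about 10x the actor learning rate. The second term
    mildly prefers a critic learning rate near 1e-3.
    """
    a = math.log10(actor_lr)
    c = math.log10(critic_lr)
    return -((a - (c - 1.0)) ** 2) - 0.1 * (c + 3.0) ** 2


def best_actor_lr(critic_lr: float) -> float:
    """One-at-a-time sweep: vary the actor rate with the critic rate fixed."""
    return max(ACTOR_LRS, key=lambda a: toy_score(a, critic_lr))


def joint_best() -> tuple:
    """Joint grid search over every (actor, critic) pair."""
    grid = itertools.product(ACTOR_LRS, CRITIC_LRS)
    return max(grid, key=lambda pair: toy_score(*pair))


if __name__ == "__main__":
    for critic_lr in CRITIC_LRS:
        print(f"critic_lr={critic_lr:g} -> best actor_lr={best_actor_lr(critic_lr):g}")
    print("joint best (actor_lr, critic_lr):", joint_best())
```

In this toy setting, each fixed critic learning rate yields a different "optimal" actor learning rate, so a sweep of one parameter alone depends on where the other was pinned. A real study would replace `toy_score` with repeated training runs averaged over random seeds.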


As such, it is evident that hyperparameter configurations can drastically impact DRL agent results. However, much of the literature on DRL hyperparameter choices relies on generic, pre-constructed environments for analysis rather than addressing problems with real-life applications. Therefore, this study aims to contribute to the DRL hedging literature, and to the DRL space as a whole, by conducting a thorough investigation of how hyperparameter choices impact the realistic problem of option hedging in a highly uncertain financial landscape.


Authors:

(1) Reilly Pickard, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada;

(2) F. Wredenhagen, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;

(3) Y. Lawryshyn, Department of Chemical Engineering, University of Toronto, Toronto, Canada.


This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed (Attribution-NonCommercial-NoDerivs 4.0 International) license.

