
Market Calibrated DRL with Weekly Re-Training


Table of Links

Abstract and 1. Introduction

2. Deep Reinforcement Learning

3. Similar Work

  3.1 Option Hedging with Deep Reinforcement Learning

  3.2 Hyperparameter Analysis

4. Methodology

  4.1 General DRL Agent Setup

  4.2 Hyperparameter Experiments

  4.3 Optimization of Market Calibrated DRL Agents

5. Results

  5.1 Hyperparameter Analysis

  5.2 Market Calibrated DRL with Weekly Re-Training

6. Conclusions

Appendix

References

5.2 Market Calibrated DRL with Weekly Re-Training

The final objective of this study is to improve upon the key result of Pickard et al. (2024), wherein market calibrated DRL agents outperform the BS Delta method, by examining the impact of re-training the DRL agents each week with re-calibrated stochastic volatility models. Recall that options on 8 symbols, with 5 strikes each, are used. Hedging results along the empirical asset path of each symbol between the sale date (October 16th, 2023) and the maturity date (November 17th, 2023) are presented in Table 7, with the optimal agent for each case italicized and underlined. For ease of presentation, the final P&L is averaged across the five strikes; the results for all 40 options, along with the asset paths for all 8 symbols over this period, are provided in the Appendix.


Table 7: Mean Final P&L across 5 Strikes when Hedging Empirical Asset Paths
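To make the quantity reported in Table 7 concrete, the following is a minimal sketch (not the authors' code) of how the final hedging P&L along one empirical asset path could be computed and then averaged across the five strikes. The policy callable `agent_delta` is a hypothetical stand-in for the trained DRL agent, and transaction costs are omitted for brevity.

```python
import numpy as np

def final_hedge_pnl(path, strike, premium, agent_delta, r=0.0, dt=1/252):
    """Terminal P&L of a short call hedged by re-balancing each step.

    `agent_delta(s, t)` is a hypothetical hook returning the hedge
    ratio chosen by the trained DRL agent at price `s` and step `t`.
    """
    cash = premium                      # option premium received at sale
    shares = 0.0                        # current hedge position
    for t, s in enumerate(path[:-1]):
        target = agent_delta(s, t)      # hedge ratio chosen by the agent
        cash -= (target - shares) * s   # buy/sell underlying at current price
        shares = target
        cash *= np.exp(r * dt)          # accrue interest on the cash account
    s_T = path[-1]
    payoff = max(s_T - strike, 0.0)     # short call settles at maturity
    return cash + shares * s_T - payoff

def mean_pnl_across_strikes(path, strikes, premiums, agent_delta):
    """Mean final P&L across the five strikes for one symbol, as in Table 7."""
    return np.mean([final_hedge_pnl(path, k, p, agent_delta)
                    for k, p in zip(strikes, premiums)])
```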


The results confirm the hypothesis that DRL agents re-trained each week with re-calibrated models outperform DRL agents trained only on the sale date. Specifically, weekly re-trained agents outperform single-trained agents in 12 of the 16 cases listed above. Moreover, the DRL agent with weekly re-training outperforms the BS Delta method, which uses the updated volatility from the re-calibrated model, in 14 of 16 cases. This result is encouraging: not only does it reiterate the finding of Pickard et al. (2024) that DRL agents trained with market calibrated models may be used to hedge real asset paths, but it also shows that DRL performance improves when the models are re-calibrated more frequently. The use case for practitioners is evident: given a basket of options and the underlying option data, i.e., the bid-ask spread and the implied volatility surface, DRL agents may be trained each week with the aid of newly re-calibrated Chebyshev models to effectively hedge each option in the basket, as sketched below. Moreover, recall that another key feature of the proposed method is that re-calibration does not impose longer training times, as each DRL agent is only trained to hedge between successive re-calibration points. As such, this work extends trivially to more frequent re-calibration.
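The following is a hedged sketch of that weekly workflow; it is not the authors' implementation. The three callables are hypothetical hooks: `calibrate` fits the stochastic volatility model to one week's market quotes, `train` returns a DRL agent trained on paths simulated from that model, and `hedge_week` runs the agent on the live asset path until the next re-calibration point and returns its P&L increment.

```python
from typing import Any, Callable, Sequence

def weekly_retrain_hedge(
    weekly_quotes: Sequence[Any],
    calibrate: Callable[[Any], Any],
    train: Callable[[Any], Any],
    hedge_week: Callable[[Any, int], float],
) -> float:
    """Hedge one option with weekly re-calibration and re-training."""
    total_pnl = 0.0
    for week, quotes in enumerate(weekly_quotes):
        model = calibrate(quotes)   # re-fit to this week's vol surface
        agent = train(model)        # agent only hedges until the next
                                    # re-calibration, so training time
                                    # stays flat as maturity grows
        total_pnl += hedge_week(agent, week)
    return total_pnl
```

Because each agent's training horizon is one re-calibration interval regardless of the option's maturity, the same loop applies unchanged to daily or intraday re-calibration schedules.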


Authors:

(1) Reilly Pickard, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada ([email protected]);

(2) F. Wredenhagen, Ernst & Young LLP, Toronto, ON, M5H 0B3, Canada;

(3) Y. Lawryshyn, Department of Chemical Engineering, University of Toronto, Toronto, Canada.


This paper is available on arxiv under CC BY-NC-ND 4.0 Deed (Attribution-NonCommercial-NoDerivs 4.0 International) license.

