
Exploring Cutting-Edge DRL Algorithms for Quantitative Finance


Too Long; Didn't Read

Dive into the world of deep reinforcement learning algorithms for quantitative finance, exploring value-based, policy-based, and actor-critic algorithms like DQN, DDPG, PPO, A3C, A2C, SAC, and TD3, and their applications in trading strategies.

Authors:

(1) Xiao-Yang Liu, Hongyang Yang, Columbia University (xl2427,[email protected]);

(2) Jiechao Gao, University of Virginia ([email protected]);

(3) Christina Dan Wang (Corresponding Author), New York University Shanghai ([email protected]).

Abstract and 1 Introduction

2 Related Works and 2.1 Deep Reinforcement Learning Algorithms

2.2 Deep Reinforcement Learning Libraries and 2.3 Deep Reinforcement Learning in Finance

3 The Proposed FinRL Framework and 3.1 Overview of FinRL Framework

3.2 Application Layer

3.3 Agent Layer

3.4 Environment Layer

3.5 Training-Testing-Trading Pipeline

4 Hands-on Tutorials and Benchmark Performance and 4.1 Backtesting Module

4.2 Baseline Strategies and Trading Metrics

4.3 Hands-on Tutorials

4.4 Use Case I: Stock Trading

4.5 Use Case II: Portfolio Allocation and 4.6 Use Case III: Cryptocurrencies Trading

5 Ecosystem of FinRL and Conclusions, and References

We review the state-of-the-art DRL algorithms, relevant open-source libraries, and applications of DRL in quantitative finance.

2.1 Deep Reinforcement Learning Algorithms

Many DRL algorithms have been developed. They fall into three categories: value-based, policy-based, and actor-critic.


A value-based algorithm estimates a state-action value function that is used to derive the optimal policy. Q-learning [49] approximates the Q-value (expected return) by iteratively updating a Q-table, which works for problems with small, discrete state and action spaces. Researchers have proposed using deep neural networks to approximate the Q-value function, e.g., the deep Q-network (DQN) and its variants double DQN and dueling DQN [1].

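To make the Q-table update concrete, here is a minimal sketch of the tabular Q-learning rule described above; the state/action sizes, hyperparameters, and function names are illustrative assumptions for the example, not taken from the paper.

```python
import numpy as np

# Illustrative sizes and hyperparameters (assumptions for this sketch)
n_states, n_actions = 16, 4
alpha, gamma, epsilon = 0.1, 0.99, 0.1

Q = np.zeros((n_states, n_actions))  # the Q-table

def q_learning_step(s, a, r, s_next):
    """One tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

def act(s):
    """Epsilon-greedy action selection from the current Q-table."""
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(Q[s].argmax())
```

DQN keeps the same temporal-difference target but replaces the table with a neural network (together with techniques such as experience replay and a target network), which is what makes large or continuous state spaces tractable.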

A policy-based algorithm directly updates the parameters of a policy through the policy gradient [45]. Instead of estimating values, it uses a neural network to model the policy directly: the input is a state and the output is a probability distribution over actions, from which the agent samples an action at that state.

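As a rough illustration of this idea (not the paper's implementation), the sketch below models a discrete-action policy with a small PyTorch network and applies a REINFORCE-style policy-gradient step; the dimensions, learning rate, and function names are assumptions for the example.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions): a small state vector, a discrete action set
state_dim, n_actions = 8, 3

# The policy network maps a state to a probability distribution over actions.
policy = nn.Sequential(
    nn.Linear(state_dim, 64), nn.Tanh(),
    nn.Linear(64, n_actions), nn.Softmax(dim=-1),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def policy_gradient_update(states, actions, returns):
    """One REINFORCE-style policy-gradient step.
    states: (T, state_dim) float, actions: (T,) long, returns: (T,) discounted returns."""
    probs = policy(states)                                   # action distributions per state
    log_probs = torch.log(probs.gather(1, actions.unsqueeze(1)).squeeze(1))
    loss = -(log_probs * returns).mean()                     # gradient ascent on expected return
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Algorithms such as PPO and A3C build on this gradient but constrain or parallelize the updates to make training more stable.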

An actor-critic algorithm combines the advantages of value-based and policy-based algorithms. It updates two neural networks: an actor network that updates the policy (a probability distribution over actions) and a critic network that estimates the state-action value function. During training, the actor network takes actions and the critic network evaluates those actions. State-of-the-art actor-critic algorithms include deep deterministic policy gradient (DDPG), proximal policy optimization (PPO), asynchronous advantage actor-critic (A3C), advantage actor-critic (A2C), soft actor-critic (SAC), multi-agent DDPG, and twin-delayed DDPG (TD3) [1].

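The following is a minimal actor-critic sketch in the same spirit, again with illustrative dimensions and hyperparameters (assumptions for the example): the actor outputs action probabilities, the critic outputs a value estimate, and the advantage couples the two losses, roughly as in A2C; the full algorithms add refinements such as entropy bonuses and parallel workers.

```python
import torch
import torch.nn as nn

state_dim, n_actions = 8, 3  # illustrative dimensions (assumptions)

# Actor: state -> probability distribution over actions.
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                      nn.Linear(64, n_actions), nn.Softmax(dim=-1))
# Critic: state -> scalar value estimate.
critic = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                       nn.Linear(64, 1))
optimizer = torch.optim.Adam(
    list(actor.parameters()) + list(critic.parameters()), lr=7e-4)

def actor_critic_update(states, actions, returns):
    """One advantage actor-critic (A2C-style) update.
    states: (T, state_dim) float, actions: (T,) long, returns: (T,) discounted returns."""
    values = critic(states).squeeze(-1)
    advantages = returns - values.detach()          # how much better the action did than expected

    probs = actor(states)
    log_probs = torch.log(probs.gather(1, actions.unsqueeze(1)).squeeze(1))

    actor_loss = -(log_probs * advantages).mean()   # policy gradient weighted by the advantage
    critic_loss = (returns - values).pow(2).mean()  # regress the value estimate toward the returns

    optimizer.zero_grad()
    (actor_loss + 0.5 * critic_loss).backward()
    optimizer.step()
```

DDPG, TD3, and SAC follow the same actor-critic structure but use continuous-action actors with Q-function critics, which is why they are natural fits for continuous trading actions such as portfolio weights.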

This paper is available on arxiv under CC BY 4.0 DEED license.