
Algorithmic Trading Made Easy: Understanding State, Action, and Reward in the Application Layer


Too Long; Didn't Read

The application layer maps an algorithmic trading strategy into DRL terms: users specify, and can fully customize, the state space, action space, and reward function that drive decision-making in market environments.

Authors:

(1) Xiao-Yang Liu, Hongyang Yang, Columbia University (xl2427,[email protected]);

(2) Jiechao Gao, University of Virginia ([email protected]);

(3) Christina Dan Wang (Corresponding Author), New York University Shanghai ([email protected]).

Abstract and 1 Introduction

2 Related Works and 2.1 Deep Reinforcement Learning Algorithms

2.2 Deep Reinforcement Learning Libraries and 2.3 Deep Reinforcement Learning in Finance

3 The Proposed FinRL Framework and 3.1 Overview of FinRL Framework

3.2 Application Layer

3.3 Agent Layer

3.4 Environment Layer

3.5 Training-Testing-Trading Pipeline

4 Hands-on Tutorials and Benchmark Performance and 4.1 Backtesting Module

4.2 Baseline Strategies and Trading Metrics

4.3 Hands-on Tutorials

4.4 Use Case I: Stock Trading

4.5 Use Case II: Portfolio Allocation and 4.6 Use Case III: Cryptocurrencies Trading

5 Ecosystem of FinRL and Conclusions, and References

3.2 Application Layer

In the application layer, users map an algorithmic trading strategy into the DRL language by specifying the state space, action space, and reward function. Table 1 gives the state, action, and reward for several use cases, and users can customize each component to fit their own trading strategies, as sketched below.
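As a concrete illustration, the following sketch frames a stock-trading strategy as a Gym-style environment, the interface style FinRL's environment layer builds on. The class name, the `price_array`/`tech_array` inputs, the 100-share trade cap, and the reward choice are illustrative assumptions, not FinRL's exact API:

```python
import numpy as np
import gym
from gym import spaces


class StockTradingEnvSketch(gym.Env):
    """Minimal sketch showing where state, action, and reward plug in."""

    def __init__(self, price_array, tech_array, initial_cash=1_000_000):
        super().__init__()
        self.price_array = np.asarray(price_array, dtype=np.float32)  # (T, n_stocks)
        self.tech_array = np.asarray(tech_array, dtype=np.float32)    # (T, n_features)
        self.n_stocks = self.price_array.shape[1]
        self.initial_cash = float(initial_cash)

        # Action: one value per stock in [-1, 1] (sell ... buy).
        self.action_space = spaces.Box(-1.0, 1.0, shape=(self.n_stocks,), dtype=np.float32)
        # State: [cash, holdings, prices, technical indicators].
        state_dim = 1 + 2 * self.n_stocks + self.tech_array.shape[1]
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(state_dim,), dtype=np.float32)

    def reset(self):
        self.t = 0
        self.cash = self.initial_cash
        self.holdings = np.zeros(self.n_stocks, dtype=np.float32)
        return self._state()

    def step(self, action):
        prices = self.price_array[self.t]
        value_before = self.cash + self.holdings @ prices

        # Scale each action to at most 100 shares; execute sells first
        # (most negative actions) so they free cash for the buys.
        # Prices are assumed strictly positive.
        for i in np.argsort(action):
            shares = int(round(float(action[i]) * 100))
            if shares < 0:
                shares = max(shares, -int(self.holdings[i]))       # no shorting here
            else:
                shares = min(shares, int(self.cash // prices[i]))  # no overspending
            self.holdings[i] += shares
            self.cash -= shares * prices[i]

        self.t += 1
        prices = self.price_array[self.t]
        value_after = self.cash + self.holdings @ prices
        reward = value_after - value_before  # illustrative: change in asset value
        done = self.t >= len(self.price_array) - 1
        return self._state(), reward, done, {}

    def _state(self):
        return np.concatenate(
            [[self.cash], self.holdings, self.price_array[self.t], self.tech_array[self.t]]
        ).astype(np.float32)
```

Customizing a strategy then amounts to editing `_state` (what the agent sees), the action interpretation in `step` (how orders are placed), and the reward line (what the agent optimizes).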


State space S. The state space describes how the agent perceives the environment. A trading agent observes many features in order to make sequential decisions in an interactive market environment. We allow the time step 𝑡 to have multiple levels of granularity, e.g., a daily, hourly, or minute basis. At each time step 𝑡, we provide various features for users to select and update; a sketch of assembling such a state follows.
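To make this concrete, here is a hedged sketch of assembling a per-step state vector from the account balance, current holdings, OHLCV bars, and technical indicators; the function name, argument shapes, and the MACD/RSI values are illustrative assumptions:

```python
import numpy as np

def build_state(cash, holdings, ohlcv, indicators):
    """Assemble one observation for time step t.

    cash       : float         account balance
    holdings   : (n,) array    shares held per stock
    ohlcv      : (n, 5) array  Open, High, Low, Close, Volume at t
    indicators : (k,) array    user-selected technical features
    """
    return np.concatenate(
        [[cash], holdings, ohlcv.ravel(), indicators]
    ).astype(np.float32)

# Two stocks, two indicator values (e.g., a MACD and an RSI reading).
state = build_state(
    cash=1_000_000.0,
    holdings=np.zeros(2),
    ohlcv=np.array([[10.0, 10.5, 9.8, 10.2, 1e6],
                    [55.0, 56.0, 54.5, 55.3, 2e5]]),
    indicators=np.array([0.12, 48.0]),
)
```

The chosen granularity only changes which bars feed this function: resampling the raw data to daily, hourly, or minute intervals yields the `ohlcv` row for each step.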



Table 1: Key components and attributes. OHLCV stands for Open, High, Low, Close and Volume.
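Table 1 pairs each use case with its own state, action, and reward definitions. In the portfolio-allocation use case (Section 4.5), for instance, the action is a weight vector over assets; one common, illustrative way to turn a raw agent output into valid long-only weights is a softmax, sketched here (the function name is hypothetical):

```python
import numpy as np

def to_portfolio_weights(raw_action):
    """Map an unconstrained agent output to portfolio weights.

    Softmax keeps every weight positive and makes them sum to 1,
    matching a long-only, fully invested allocation action.
    """
    z = np.exp(raw_action - np.max(raw_action))  # shift for numerical stability
    return z / z.sum()

weights = to_portfolio_weights(np.array([0.3, -1.2, 0.8]))
# -> three positive weights that sum to 1
```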


This paper is available on arXiv under a CC BY 4.0 DEED license.