Authors:
(1) Xiao-Yang Liu, Hongyang Yang, Columbia University (xl2427,[email protected]);
(2) Jiechao Gao, University of Virginia ([email protected]);
(3) Christina Dan Wang (Corresponding Author), New York University Shanghai ([email protected]).
4.2 Baseline Strategies and Trading Metrics
Baseline trading strategies are provided for comparison with DRL strategies. Investors usually have two mutually conflicting objectives: the highest possible profit and the lowest possible risk [43]. We include three conventional strategies as baselines.
Passive trading strategy [31] is a simple and popular strategy that involves minimal trading activity. Investors simply buy and hold index ETFs [46] to replicate a broad market index or indices, such as the Dow Jones Industrial Average (DJIA) and the Standard & Poor’s 500 (S&P 500).
Mean-variance and min-variance strategies [2] both aim to achieve an optimal balance between risk and profit. Each selects a diversified portfolio of risky assets, so that the risk is spread out when the assets are traded together.
Equally weighted strategy is a type of portfolio allocation method. It assigns the same weight to each asset in a portfolio.
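As an illustration of the two allocation baselines above, the following is a minimal sketch, not FinRL's own implementation. It assumes the closed-form global minimum-variance solution with short selling allowed. The function names `equal_weights` and `min_variance_weights`, and the synthetic return data, are hypothetical and introduced only for this example.

```python
import numpy as np


def equal_weights(n_assets):
    """Equally weighted allocation: every asset receives 1/n of the capital."""
    return np.full(n_assets, 1.0 / n_assets)


def min_variance_weights(cov):
    """Global minimum-variance weights (assumption: short selling allowed).

    Minimizes w' C w subject to sum(w) = 1, whose solution is
    w = C^{-1} 1 / (1' C^{-1} 1).
    """
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # C^{-1} 1 without forming the inverse
    return raw / raw.sum()


# Hypothetical daily returns for three assets with increasing volatility.
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=(250, 3)) * np.array([1.0, 2.0, 3.0])
cov = np.cov(returns, rowvar=False)  # 3 x 3 sample covariance matrix

w_eq = equal_weights(3)
w_mv = min_variance_weights(cov)

# Portfolio variance under each allocation.
var_eq = w_eq @ cov @ w_eq
var_mv = w_mv @ cov @ w_mv
print("equal weights:", w_eq, "variance:", var_eq)
print("min-variance weights:", w_mv, "variance:", var_mv)
```

By construction, the min-variance weights give a portfolio variance no larger than that of the equal weights on the same covariance matrix. A long-only variant would need a constrained optimizer instead of this closed form.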
FinRL includes common metrics to evaluate trading performance:
Final portfolio value: the amount of money at the end of the trading period.
Cumulative return: the final portfolio value minus the initial value, divided by the initial value.
Annualized return and standard deviation: the geometric average return per year, and the corresponding annualized standard deviation of returns.
Maximum drawdown ratio: the maximum observed loss from a historical peak to a trough of a portfolio, before a new peak is achieved. Maximum drawdown is an indicator of downside risk over a time period.
Sharpe ratio: defined in (1), the average return earned in excess of the risk-free rate per unit of volatility.
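The metrics listed above can be computed from a series of portfolio values. The following is a minimal sketch under stated assumptions, not FinRL's backtesting code: it assumes daily values with 252 trading periods per year, annualizes volatility and the Sharpe ratio by the square root of the number of periods, and reports maximum drawdown as a negative fraction. The function name `performance_metrics` and the sample values are hypothetical.

```python
import numpy as np


def performance_metrics(values, periods_per_year=252, risk_free_rate=0.0):
    """Compute common trading metrics from a series of portfolio values.

    Needs at least three values so that the sample standard deviation
    of the period returns is defined.
    """
    values = np.asarray(values, dtype=float)
    rets = values[1:] / values[:-1] - 1.0  # simple per-period returns

    # Cumulative return: (final - initial) / initial.
    cumulative = values[-1] / values[0] - 1.0

    # Annualized (geometric) return over the observed horizon.
    years = len(rets) / periods_per_year
    annualized = (1.0 + cumulative) ** (1.0 / years) - 1.0

    # Annualized standard deviation of per-period returns.
    vol = rets.std(ddof=1)
    annualized_std = vol * np.sqrt(periods_per_year)

    # Maximum drawdown: worst decline from a running peak (negative number).
    running_peak = np.maximum.accumulate(values)
    max_drawdown = ((values - running_peak) / running_peak).min()

    # Sharpe ratio: mean excess return per unit of volatility, annualized.
    excess = rets - risk_free_rate / periods_per_year
    if vol > 0:
        sharpe = excess.mean() / vol * np.sqrt(periods_per_year)
    else:
        sharpe = float("nan")

    return {
        "final_value": float(values[-1]),
        "cumulative_return": float(cumulative),
        "annualized_return": float(annualized),
        "annualized_std": float(annualized_std),
        "max_drawdown": float(max_drawdown),
        "sharpe_ratio": float(sharpe),
    }


# Hypothetical portfolio values over four consecutive trading days.
metrics = performance_metrics([100.0, 110.0, 99.0, 120.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With such a short sample the annualized figures are not meaningful; the example only shows how each quantity is derived from the value series.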
This paper is available on arXiv under the CC BY 4.0 DEED license.