Optimizing a Portfolio of Cryptocurrencies with Deep Reinforcement Learning

By Sonam Srivastava (@sonaam1234)

Portfolio optimization, the process of assigning optimal weights to the assets in a financial portfolio, is a fundamental problem in financial engineering. There are many approaches one can follow. For passive investments, the most common are liquidity-based weighting and market-capitalization weighting. An investor with no view on investment performance can use equal weighting. Within Modern Portfolio Theory, the framework that the Capital Asset Pricing Model builds on, the most elegant solution is the Markowitz optimal portfolio, in which risk-averse investors maximize expected return for a given level of risk.
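To make the two simplest schemes concrete, here is a minimal sketch of equal weighting and market-capitalization weighting. The function names and the market-cap figures are made up for illustration and do not come from the original experiments.

```python
def equal_weights(n_assets):
    """Give every asset the same weight (no view on performance)."""
    if n_assets <= 0:
        raise ValueError("need at least one asset")
    return [1.0 / n_assets] * n_assets


def market_cap_weights(market_caps):
    """Weight each asset by its share of total market capitalization."""
    total = sum(market_caps)
    if total <= 0:
        raise ValueError("total market capitalization must be positive")
    return [cap / total for cap in market_caps]


# Hypothetical market caps (in billions) for three made-up coins.
caps = [300.0, 100.0, 100.0]
print(equal_weights(len(caps)))
print(market_cap_weights(caps))
```

Both schemes return weights that sum to one. Neither looks at returns or risk, which is what separates them from a Markowitz-style optimization.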

There is no single solution to this problem. It is essentially a problem in which the agent that best learns and adapts to the market environment will deliver the best results, which is the essence of any reinforcement learning problem. Reinforcement learning has delivered excellent results on problems with a similar premise, such as video games and board games, where agents have far outperformed humans.

Problem Framework

We used the reinforcement learning framework proposed by Z. Jiang, D. Xu and J. Liang in "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem".

In the proposed framework, a neural network is trained to inspect the history of an asset as well as the previous portfolio weights and evaluate its potential growth for the immediate future. The evaluation score of each asset is discounted by the size of its intentional weight change and is presented to a softmax layer, whose outcome will be the new portfolio weights for the coming trading period. The reward function of the RL framework is the explicit average of the periodic logarithmic returns.
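The last two steps of that description, turning per-asset scores into weights and scoring a run by its average log return, can be sketched in a few lines. This is a simplified illustration and not the paper's implementation: it ignores transaction costs and the cash asset, and the function names are my own.

```python
import math


def softmax(scores):
    """Map arbitrary per-asset scores to non-negative weights summing to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_log_return(weights_per_period, price_relatives_per_period):
    """Average of per-period log returns, ignoring transaction costs.

    A price relative is closing price / opening price for one period,
    so 1.02 means the asset gained 2% in that period.
    """
    log_returns = []
    for weights, relatives in zip(weights_per_period, price_relatives_per_period):
        growth = sum(w * y for w, y in zip(weights, relatives))
        log_returns.append(math.log(growth))
    return sum(log_returns) / len(log_returns)


# Hypothetical network scores for three assets in two trading periods.
weights = [softmax([0.2, 1.5, -0.3]), softmax([1.0, 0.1, 0.1])]
relatives = [[1.01, 0.99, 1.00], [0.98, 1.03, 1.00]]
print(average_log_return(weights, relatives))
```

Because the softmax output is always non-negative and sums to one, the network can only produce long-only, fully invested portfolios in this sketch.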

Three different species of networks are tested in this work: a Convolutional Neural Network (CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM) network.

The model can be trained on any set of assets, and here we test it on the cryptocurrency exchange market.

Using the RL framework, the portfolio is rebalanced every 30 minutes over a universe consisting of the top 11 coins in the cryptocurrency market.


We compare the performance of the RL framework with the following frameworks (also detailed in the paper by Z. Jiang et al.):

For the period 2015/07/01 to 2017/07/01, the results in the test period (last 3 months) are:

For the period 2016/07/01 to 2018/07/01, the results in the test period (last 3 months) are:


The deep reinforcement learning framework performed far better than every other optimization framework in the 2017 test period, but it was inferior to a few frameworks in the 2018 test period.

It is evident that the returns of any optimization framework depend heavily on the market environment. Because the RL framework we used also tries to limit turnover, it may have performed worse in 2018 than a few of the frameworks that have no such constraint. The top-performing framework in 2018 was one that invests only in the best coin, which means the portfolio would be churned very often, that is, it would have a large turnover.
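To illustrate what large turnover means here, the sketch below measures turnover as the sum of absolute weight changes between consecutive rebalances. That is one common definition (some sources halve it) and is an assumption on my part, not the measure used in the paper. The two weight paths are invented examples.

```python
def turnover(old_weights, new_weights):
    """Sum of absolute weight changes in a single rebalance."""
    return sum(abs(new - old) for old, new in zip(old_weights, new_weights))


def total_turnover(weight_path):
    """Add up the turnover over every consecutive pair of rebalances."""
    return sum(turnover(a, b) for a, b in zip(weight_path, weight_path[1:]))


# Hypothetical "best coin only" strategy: all capital jumps between coins.
best_coin_path = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical turnover-limited strategy: weights drift gradually.
steady_path = [[0.4, 0.3, 0.3], [0.5, 0.25, 0.25], [0.5, 0.3, 0.2]]

print(total_turnover(best_coin_path))
print(total_turnover(steady_path))
```

Under this definition, a strategy that moves everything from one coin to another incurs the maximum possible turnover of 2 on every switch, so its returns are the most exposed to trading costs.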

PS: the code, forked from ZhengyaoJiang, is available at https://github.com/sonaam1234/PGPortfolio

