My first experience with crypto wasn’t under very good circumstances: a friend who takes care of several servers at his job was infected with ransomware – and this malware was demanding he pay the ransom amount in a cryptocurrency called Monero (XMR). After this not-so-friendly introduction, I started to study how the technology behind cryptocurrencies works, and I fell in love with it.

I was already interested in the stock market, so I joined the familiar (the stock market) with the novel (crypto). To test the knowledge I’d learned from my stock market books, I started creating a simple Moving Average Convergence Divergence (MACD) crossover bot. This worked for a while, but I quickly realized that I should – and could – make the bot a lot better.

Now, the project that I started as a hobby has a capital management system, a combination of technical indicators, and sentiment analysis powered by machine learning. Between 10 March 2020 and 10 August 2020, my bot achieved a success rate of 63.1%, a profit factor of 1.74, and a cumulative gross result of approximately 588% (you can see a copy of all of my trades during this period in this Google Sheet report).

About my project

I needed a bot that gave me a high-performance, scalable way to calculate technical indicators and process sentiment data in real time. To do everything I need in terms of technical indicator calculation, I collect candlestick chart data and market depth via an always-up websocket connection that tracks every Bitcoin market on the Binance exchange (~215 in total, 182 being tradeable, at this moment).

The machine learning sentiment analysis started as a simple experiment to see if external news affected the market. For example, if a famous person in the crypto ecosystem tweets that a big exchange was hacked, the price will probably fall and affect the whole market. Likewise, very good news should impact the price in a positive way. I calculate sentiment analysis scores in real time, as soon as new data is ingested from sources like Twitter, Reddit, and RSS feeds. Then, using these scores, I can determine market conditions at the moment.

Now, I combine these two components with a weighted average: 60% technical indicators and 40% sentiment analysis (a rough sketch of this combination follows the screenshots below).

Current trading bot dashboard where I track all my ongoing trades and results; in this particular case, filtered for the period of 10-March-2020 to 10-August-2020.

Quick breakdown of my results and success rates week-over-week; in this particular case, filtered for the period of 10-March-2020 to 10-August-2020.
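To make that combination step a bit more concrete, here is a minimal sketch of the idea in SQL. The table and column names (technical_score, sentiment_score) are hypothetical, invented only for illustration – the post doesn’t show how the bot actually stores or combines these scores.

-- Hypothetical tables holding the latest technical and sentiment scores per
-- market, both normalized to the same scale. These names are assumptions for
-- illustration; the real bot logic is not shown in this post.
SELECT
    t.symbol,
    ROUND((0.6 * t.score + 0.4 * s.score)::NUMERIC, 4) AS combined_score
FROM technical_score t
JOIN sentiment_score s ON s.symbol = t.symbol
ORDER BY combined_score DESC;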
Using TimescaleDB

At the beginning, I tried to save the collected data in simple files, but I quickly realized that wasn’t a good way to store and process this data, so I started looking for an alternative: a performant database.

I went through several databases, and all of them lacked something I wound up needing to continue my project. I tried MongoDB, InfluxDB, and Druid, but none of them 100% met my needs. Of the databases I tried, InfluxDB was a good option; however, every query that I tried to run was painful, due to its own query language (InfluxQL). And, as soon as my series started to grow exponentially, the server didn’t have enough memory to handle them all in real time. This is because the current InfluxDB TSM storage engine requires more and more allocated memory for each series – I have a large number of unique metrics, so the process ran out of available memory quickly.

I handle somewhat large amounts of data every day, especially on days with many market movements. On average, I’m ingesting around 20k records per market, or roughly 3.6 million total records, per day (20k * 182 markets) – and even with this huge amount of data, my query response time is in the milliseconds.

This is where TimescaleDB started to shine for me. It gave me fast real-time aggregations, built-in time-series functions, and high ingestion rates – and it didn’t require elevated memory usage to do all of this.

In addition to this raw market data, a common use case for me is to analyze the data in different time frames (e.g., 1 min, 5 min, 1 hr, etc.). I maintain these records in a pre-computed aggregate to increase my query performance and allow me to make faster decisions about whether or not to enter a position (see the sketch at the end of this section).

For example, here’s a simple query that I use a lot to follow the performance of my trades on a daily or weekly basis (daily in this case):

SELECT
    time_group,
    total_trades,
    positive_trades,
    negative_trades,
    ROUND(100 * (positive_trades / total_trades), 2) AS success_rate,
    profit AS gross_profit,
    ROUND((profit - (total_trades * 0.15)), 2) AS net_profit
FROM (
    SELECT
        time_bucket('1 day', buy_at::TIMESTAMP)::DATE AS time_group,
        COUNT(*) AS total_trades,
        SUM(CASE WHEN profit > 0 THEN 1 ELSE 0 END)::NUMERIC AS positive_trades,
        SUM(CASE WHEN profit <= 0 THEN 1 ELSE 0 END)::NUMERIC AS negative_trades,
        ROUND(SUM(profit), 2) AS profit
    FROM trade
    GROUP BY time_group
    ORDER BY time_group
) T
ORDER BY time_group;

And, I often use these functions to measure market volatility, decomposing the range of a market pair in a period:

CREATE OR REPLACE FUNCTION tr(_symbol TEXT, _till INTERVAL)
RETURNS TABLE (date TIMESTAMP WITHOUT TIME ZONE, result NUMERIC(9, 8), percent NUMERIC(9, 8))
LANGUAGE plpgsql
AS $$
DECLARE
BEGIN
    RETURN QUERY
    WITH candlestick AS (
        SELECT *
        FROM candlestick c
        WHERE c.symbol = _symbol
          AND c.time > NOW() - _till
    )
    SELECT
        d.time,
        (GREATEST(a, b, c))::NUMERIC(9, 8) AS result,
        (GREATEST(a, b, c) / d.close)::NUMERIC(9, 8) AS percent
    FROM (
        SELECT
            today.time,
            today.close,
            today.high - today.low AS a,
            COALESCE(ABS(today.high - yesterday.close), 0) b,
            COALESCE(ABS(today.low - yesterday.close), 0) c
        FROM candlestick today
        LEFT JOIN LATERAL (
            SELECT yesterday.close
            FROM candlestick yesterday
            WHERE yesterday.time < today.time
            ORDER BY yesterday.time DESC
            LIMIT 1
        ) yesterday ON TRUE
        WHERE today.time > NOW() - _till
    ) d;
END;
$$;

CREATE OR REPLACE FUNCTION atr(_interval INT, _symbol TEXT, _till INTERVAL)
RETURNS TABLE (date TIMESTAMP WITHOUT TIME ZONE, result NUMERIC(9, 8), percent NUMERIC(9, 8))
LANGUAGE plpgsql
AS $$
DECLARE
BEGIN
    RETURN QUERY
    WITH true_range AS (
        SELECT * FROM tr(_symbol, _till)
    )
    SELECT
        tr.date,
        avg.sma AS result,
        avg.sma_percent AS percent
    FROM true_range tr
    INNER JOIN LATERAL (
        SELECT avg(lat.result) sma, avg(lat.percent) sma_percent
        FROM (
            SELECT *
            FROM true_range inr
            WHERE inr.date <= tr.date
            ORDER BY inr.date DESC
            LIMIT _interval
        ) lat
    ) avg ON TRUE
    WHERE tr.date > NOW() - _till
    ORDER BY tr.date;
END;
$$;

SELECT * FROM atr(14, 'BNBBTC', '4 HOURS') ORDER BY date;
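For reference, here is a minimal sketch of what the pre-computed aggregate setup mentioned above could look like with a TimescaleDB hypertable and a continuous aggregate. The post doesn’t show the real candlestick schema, so the columns below are assumptions, and only one time frame (1 hour) is shown as an example; the syntax is for TimescaleDB 2.x.

-- Assumed raw candlestick table (the actual schema isn't shown in the post).
CREATE TABLE IF NOT EXISTS candlestick (
    time   TIMESTAMPTZ NOT NULL,
    symbol TEXT        NOT NULL,
    open   NUMERIC     NOT NULL,
    high   NUMERIC     NOT NULL,
    low    NUMERIC     NOT NULL,
    close  NUMERIC     NOT NULL,
    volume NUMERIC     NOT NULL
);

-- Turn it into a hypertable, partitioned on time.
SELECT create_hypertable('candlestick', 'time', if_not_exists => TRUE);

-- Pre-computed 1-hour candles as a continuous aggregate.
CREATE MATERIALIZED VIEW candlestick_1h
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    symbol,
    first(open, time) AS open,
    max(high)         AS high,
    min(low)          AS low,
    last(close, time) AS close,
    sum(volume)       AS volume
FROM candlestick
GROUP BY bucket, symbol;

-- Keep the aggregate up to date automatically as new ticks arrive.
SELECT add_continuous_aggregate_policy('candlestick_1h',
    start_offset      => INTERVAL '3 hours',
    end_offset        => INTERVAL '1 minute',
    schedule_interval => INTERVAL '1 minute');

With something like this in place, queries over 1-hour candles read from the pre-aggregated view instead of scanning the raw rows on every request.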
My current deployment & future plans

My current architecture.

To develop my bot and all its capabilities, I used Node.js as my main programming language and various libraries: cote to communicate between all my modules without overengineering, TensorFlow to train and deploy all my machine learning models, and tulind for technical indicator calculation, as well as various others.

I modified some indicators to meet my needs and created some from scratch, including candlestick pattern recognition, a level calculator for support/resistance, and Fibonacci retracement.

Today, I have a total of 55 markets (which are re-evaluated every month, based on trade simulation performance) that trade simultaneously 24/7; when all my strategy conditions are met, a trade is automatically opened. The bot respects my capital management system, which is basically to limit myself to 10 open positions and to only use 10% of the available capital at a given time. To keep track of the results of an open trade, I use dynamic Trailing Stop Loss and Trailing Take Profit.

The process of re-evaluating a market requires a second instance of my bot that runs in the background and uses my main strategy to simulate trades in all Bitcoin markets. When it detects that a market is doing well, based on the metrics I track, that market enters the main bot instance and starts live trading. The same applies to those that are performing poorly: as soon as the main instance of my bot detects things are going badly, the market is removed from the main instance and the second instance begins tracking it. If it improves, it’s added back in.

As every developer likely knows all too well, building software is a process of constant improvement. Right now, I’m trying to improve my capital management system using the Kelly Criterion.

Why TimescaleDB?

For my use case, I’ve found TimescaleDB is a powerful and solid choice: it’s fast, with reliable ingest rates, it efficiently stores and compresses a huge dataset in a way that’s manageable and cost-effective, and it gives me real-time aggregation functionality.

The TimescaleDB website, core documentation, and this blog post about managing and processing huge time-series datasets are all pretty easy to understand and follow – and the TimescaleDB team is responsive and helpful (and they always show up in community discussions, like my Reddit AMA).

It’s been easy and straightforward to scale, without adding any new technologies to the stack. And, as an SQL user, TimescaleDB adds very little maintenance overhead, especially compared to learning or maintaining a new database or language.

This post is slightly modified from the original post on the TimescaleDB blog (read it here).