We are happy to announce that MarketStore is now open source! MarketStore is a database server optimized for financial timeseries data, written in pure Go, and designed and developed by Alpaca. You can think of it as an extensible DataFrame service that is accessible from anywhere in your system, with higher scalability.
It is designed from the ground up to address scalability issues around handling the large amounts of financial market data used in algorithmic trading backtesting, charting, and price history analysis, with data spanning many years, including tick-level data for all US equities or the exploding cryptocurrency space. If you are struggling with managing lots of HDF5 files, this is a perfect solution to your problem.
A few years ago, Alpaca started developing AlpacaAlgo, which helps retail traders turn their own ideas into trading algos without writing a single line of code. The platform trains a deep learning model to capture each trading idea using historical price data and runs backtesting quickly. You can then switch the algo to live mode, which should perform the same calculation as the backtest.
Then followed AlpacaScan, which offers pre-built algorithm results for all US equities along with quick backtesting results. Running complicated logic against eight thousand symbols instantly in a timeseries manner is not an easy task.
In doing so, we found that access to more than ten years of tick-level historical price data across thousands of symbols was the performance bottleneck of the system and a pain point for the user experience. Most of our application layer is written in Python using PyData libraries, including pandas DataFrames, but these focus on calculation and analysis, and it was clear that we needed something to provide fast access to this huge amount of data from different parts of the system.
Back then, the easiest way of handling financial market timeseries data was to store DataFrames in HDF5. It is the de facto storage format for small use cases, but it did not scale to ours.
Then we sat down together and explored the possibilities. The idea quickly emerged that we needed some kind of database that serves DataFrames on demand. It should be an HTTP-based API service for easy access, store terabytes of price data on disk, provide simple data management on the filesystem, be able to update 10k+ symbol prices every second, and respond to 10k+ clients with sub-second latency.
Looking around GitHub for possible solutions, we found many timeseries databases, but most were built for general-purpose timeseries data such as IoT sensor readings or system monitoring metrics, and were designed around JSON. Financial timeseries data has different requirements: the data is dense and more structured, and a long history is needed. We couldn’t find a good fit for this particular use case. It was obvious that not just we but anyone working with financial timeseries data would need such a solution. In the coming age where more and more people write automated trading systems or analyze this kind of data, handling it efficiently becomes essential. We also decided that the database server should be written cleanly and be reusable, so that it could one day be open sourced for everyone to use.
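To make the "dense and more structured" point concrete, here is a minimal sketch in Python of why such data suits a fixed-width binary layout better than JSON. The record layout (an int64 epoch followed by five float64 fields) and the names `BAR`, `pack_bars`, and `read_bar` are assumptions chosen for illustration, and the bar values are made up; this is not a description of MarketStore's actual on-disk format.

```python
import json
import struct

# Hypothetical fixed-width layout for one OHLCV bar:
# epoch seconds (int64) followed by open, high, low, close, volume (float64 each).
BAR = struct.Struct("<q5d")  # 8 + 5 * 8 = 48 bytes per record, no padding

FIELDS = ("epoch", "open", "high", "low", "close", "volume")


def pack_bars(bars):
    """Serialize a list of (epoch, open, high, low, close, volume) tuples."""
    return b"".join(BAR.pack(*bar) for bar in bars)


def read_bar(buf, index):
    """Read the index-th bar directly by offset, without scanning earlier records."""
    return BAR.unpack_from(buf, index * BAR.size)


# Three made-up one-minute bars, used only to exercise the functions.
bars = [
    (1514764800 + 60 * i, 100.0 + i, 101.0 + i, 99.0 + i, 100.5 + i, 1000.0 * (i + 1))
    for i in range(3)
]

buf = pack_bars(bars)
as_json = json.dumps([dict(zip(FIELDS, bar)) for bar in bars])

print(len(buf), len(as_json))  # binary size vs. JSON size in characters
print(read_bar(buf, 2))        # random access to the third bar
```

Because every record has the same width, the position of any bar is a simple multiplication, which is the kind of property a general-purpose, JSON-oriented store does not give you for free.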
That’s how we started developing MarketStore, a pure Go-based database designed for financial timeseries data. We designed it with a true database architecture. Luke, one of the co-founders and CTO of the successful MPP database Greenplum, joined the conversation and designed the storage layer. Hitoshi, who used to work for Greenplum and Pivotal as the architect of Greenplum and is a major contributor to PostgreSQL, developed the code base around the query and plugin architecture.
There are native Go and Python clients, and both perform very well. The Python client easily converts the server response into a DataFrame, and you will notice almost no difference from reading the data from local disk, with the bonus of higher scalability.
One of the common challenges in database software is the data import layer, and our financial data system is no exception. Different asset classes have different characteristics, and each upstream data provider offers a different data format. To address this issue, MarketStore has a plugin system for the data ingestion layer. It comes with default plugins for ingesting data from both the GDAX API and Slait, which is another open source product of ours. We will discuss Slait in another post. With the GDAX plugin, you can start consuming and storing Bitcoin, Ethereum, Bitcoin Cash, and Litecoin data from the moment you start MarketStore.
Since it is a plugin architecture, you can write your own data ingestion plugin for your own needs. The Go and Python clients also support writing data remotely.
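As a rough illustration of the plugin idea, the following Python sketch shows ingestion sources hidden behind one small interface so that the write path does not care which provider the data came from. Everything here is hypothetical: `IngestPlugin`, `StaticFeed`, `Store`, and `run_plugins` are names invented for this example, the prices are made-up values, and none of it reflects MarketStore's actual plugin interface.

```python
from typing import Dict, Iterable, List, Protocol, Tuple

Record = Tuple[str, int, float]  # (symbol, epoch seconds, price)


class IngestPlugin(Protocol):
    """Hypothetical interface: any object that can produce records."""

    def fetch(self) -> Iterable[Record]: ...


class StaticFeed:
    """Toy plugin that replays a fixed list, standing in for a real provider API."""

    def __init__(self, records: List[Record]) -> None:
        self._records = records

    def fetch(self) -> Iterable[Record]:
        return iter(self._records)


class Store:
    """In-memory stand-in for the database's write path."""

    def __init__(self) -> None:
        self.series: Dict[str, List[Tuple[int, float]]] = {}

    def write(self, record: Record) -> None:
        symbol, epoch, price = record
        self.series.setdefault(symbol, []).append((epoch, price))


def run_plugins(store: Store, plugins: Iterable[IngestPlugin]) -> int:
    """Drain every plugin into the store and return the number of records written."""
    count = 0
    for plugin in plugins:
        for record in plugin.fetch():
            store.write(record)
            count += 1
    return count


store = Store()
written = run_plugins(
    store,
    [
        StaticFeed([("BTC", 1514764800, 13800.0), ("ETH", 1514764800, 750.0)]),
        StaticFeed([("BTC", 1514764860, 13810.0)]),
    ],
)
print(written, sorted(store.series))
```

Adding a new data provider in this sketch means writing one more class with a `fetch` method; the store and the loop that drives it stay unchanged.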
MarketStore is available today as an open source project on GitHub and is production ready. Thanks to Go's simple build system, it is straightforward to build on your own. A Docker container is also built for every release for easy access.
Inside Alpaca, from our trading management system to deep learning modeling to charting, almost all applications use MarketStore as the backend both in development and production.
We hope that open sourcing MarketStore can help more people working in a similar domain and contribute to the community. Alpaca’s mission has always been to give individuals more technological power in the financial markets, and that doesn’t mean only our end products. By providing technology this way, we wish to help everyone.
Try it today and let us know what works for you and what does not. We are also more than happy to receive help with feature development and documentation. There will be more posts here with details on how to use MarketStore.
If you’re a hacker and can create something cool that works in the financial markets, please check out our project “Commission Free Stock Trading API”, where we provide a simple REST trading API and real-time market data for free.
Brokerage services are provided by Alpaca Securities LLC (alpaca.markets), member FINRA/SIPC. Alpaca Securities LLC is a wholly-owned subsidiary of AlpacaDB, Inc.