Hi, everyone! Today I'm going to discuss the engineering of a scalable price pipeline. Every trading system consists of two major subsystems:
- Order Management - responsible for routing orders from clients to exchanges and managing order lifecycle.
- Price Distribution - collects prices from exchanges, aggregates them, and distributes them to clients.
Both are equally important: prices are needed before orders can be placed, and after execution, actual market data is needed to calculate risk metrics. In this article, I will focus on the price distribution subsystem.
High Level Architecture
Let's first decide which functionality is expected from the price distribution subsystem:
- Streaming prices to clients. These could be human clients or algorithms that are making trading decisions.
- Storing historical prices for later analysis. That will be required for example for end of day risk calculations or model backtesting.
Historical Data Storage
To store data in general we have two options:
- row-based storage
- column-oriented storage
The choice of storage must be based on the queries that will be executed against the data. If we need strong transactional guarantees and frequent updates, row-based storage is better. This type of storage is also good for querying by ID: the whole row is stored sequentially, so it is efficient to retrieve when you need all columns.
But for analytical purposes we will not fetch data by ID. Instead, we will query by time range. Our data will also be multidimensional: for each bar we will have at least open, high, low, close, and volume fields. Different types of analyses might need only some of these fields. For example, to calculate a moving average we need only close prices. A columnar DB allows reading a whole column efficiently, because all values of that column are stored contiguously on disk, so the reads are sequential and fast.
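To make the contrast concrete, here is a minimal Python sketch of the two layouts (illustrative in-memory structures, not a real database). A moving average over close prices only needs to touch a single column array:

```python
# Row layout: every query touches full records, even if one field is needed.
rows = [
    {"time": 1, "open": 10.0, "high": 11.0, "low": 9.5,  "close": 10.5, "volume": 100},
    {"time": 2, "open": 10.5, "high": 12.0, "low": 10.0, "close": 11.5, "volume": 150},
    {"time": 3, "open": 11.5, "high": 12.5, "low": 11.0, "close": 12.0, "volume": 120},
    {"time": 4, "open": 12.0, "high": 12.2, "low": 10.75, "close": 11.0, "volume": 90},
]

# Column layout: each field is a contiguous array; a moving average over
# close prices reads a single array and nothing else.
columns = {key: [row[key] for row in rows] for key in rows[0]}

def moving_average(values, window):
    """Simple moving average over a list of prices."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

print(moving_average(columns["close"], 2))  # → [11.0, 11.75, 11.5]
```

In a real columnar store the same effect shows up as reading one column file instead of scanning entire rows.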
There are multiple open-source columnar databases and storage formats available. Here are some:
- ClickHouse
- TimescaleDB
- Parquet (a columnar file format rather than a full database)
- DuckDB
- etc.
Overview
Here are the key components our system will consist of:
- Streaming service - connects to exchanges, collects prices, and aggregates them into OHLCV bars. This data must be stored and can also be streamed to clients live.
- Storage - some columnar DB.
- Query services - any service that requires historical data. This could be a backend-for-frontend, risk calculation scripts, model training pipelines, etc.
It's very important to segregate query functionality from the streaming service, for several reasons:
- Development of these components could be done independently by different teams with different release cycles.
- Any bugs and defects in query service should not affect streaming.
- Query load can spike, which might require scaling the query services independently.
At a high level, the architecture looks like:
Dual-buffer Design
The streaming service is the core of the system. Let's dive deeper into its design. The main problem is that we have a many-to-many relationship between data producers and consumers.
- Multiple sources: each source could be stored in one or more data sinks, and multiple streaming clients might subscribe to the same source.
- Multiple consumers: data stores will receive information from different sources, and each WebSocket client might subscribe to many sources.
The service must be designed so that all these relationships are handled efficiently and don't impact each other. The most straightforward approach is to use standard queues, but they have several drawbacks:
- Lock contention when multiple producers and consumers access the same queue.
- Backpressure is binary - either data is consumed or not.
- Only one consumer can read from the queue.
- Slower, due to per-event allocation overhead.
A better approach is a dual ring-buffer design. Each producer and consumer has its own buffers. At any point in time one buffer is being filled while the other is being drained. This design has the following advantages:
- Input buffer handles burst absorption from exchange. Output buffer decouples downstream consumers.
- No allocation during steady state means consistent sub-microsecond latencies.
- Addition of new consumer does not impact existing ones.
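The fill/drain idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual implementation: a real low-latency version would use preallocated lock-free ring buffers, while this sketch uses plain lists and a lock only to make the swap atomic.

```python
import threading

class DualBuffer:
    """Minimal double-buffer sketch: producers append to the active
    buffer while the consumer drains the inactive one; drain() swaps
    the roles so producers are never blocked for long."""

    def __init__(self):
        self._active = []    # currently being filled by producers
        self._inactive = []  # currently being drained by the consumer
        self._lock = threading.Lock()

    def publish(self, event):
        # Producer side: append to whichever buffer is active.
        with self._lock:
            self._active.append(event)

    def drain(self):
        # Consumer side: atomically swap buffers, then process the
        # filled one outside the lock.
        with self._lock:
            self._active, self._inactive = self._inactive, self._active
        drained, self._inactive = self._inactive, []
        return drained

# Usage: producers call publish(); the consumer periodically drains.
buf = DualBuffer()
buf.publish({"price": 100.0})
buf.publish({"price": 100.5})
batch = buf.drain()  # both events, in publish order
```

The key property is that producers only ever contend on the brief append/swap, never on the consumer's processing of a full batch.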
Here is a diagram illustrating such a design:
Tick Aggregation
The key piece of the streaming service is tick aggregation. Raw ticks arrive from data sources and are aggregated into bars. For each bar, we need:
- Open price - first tick price in the timeframe
- High price - maximum price
- Low price - minimum price
- Close price - last tick price
- Volume - sum of all tick volumes
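The aggregation rules above can be sketched as follows. The names here are hypothetical and simplified relative to the actual CandleAggregator:

```python
from dataclasses import dataclass

@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float

class BarAggregator:
    """Aggregates raw ticks into a single OHLCV bar for one timeframe."""

    def __init__(self):
        self.bar = None  # no bar until the first tick arrives

    def on_tick(self, price, volume):
        if self.bar is None:
            # First tick in the timeframe sets the open price.
            self.bar = Bar(price, price, price, price, volume)
        else:
            self.bar.high = max(self.bar.high, price)
            self.bar.low = min(self.bar.low, price)
            self.bar.close = price     # last tick seen so far
            self.bar.volume += volume  # volumes accumulate

    def flush(self):
        # Called at the end of the timeframe: emit the bar and reset.
        bar, self.bar = self.bar, None
        return bar
```

One such aggregator exists per instrument and timeframe; flush() is triggered by the timer event described below.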
One problem is how to handle the end of a period.
- A straightforward way is to set a timer on startup with a fixed interval, but such a timer drifts over time. If the application runs for many days, this drift can become significant.
- Another option is to calculate timeframe boundaries based on tick timestamps. But if there are no ticks for a long time, no bars will be generated.
- The best option is to calculate the time remaining until the end of the current timeframe and set a one-shot timer for that duration. After it fires, the duration until the next boundary is calculated again. This prevents drift from accumulating.
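The boundary calculation behind such a non-drifting timer fits in one function. This is a simplified sketch; the actual NonDriftingTimer may differ in details:

```python
def time_to_next_boundary(now_ms, timeframe_ms):
    """Milliseconds remaining until the current timeframe ends.

    Boundaries are aligned to epoch-based multiples of the timeframe,
    and the delay is recomputed from the current clock before every
    one-shot timer, so no drift can accumulate across restarts of the
    timer."""
    return timeframe_ms - (now_ms % timeframe_ms)

# Example: 59 s into a 1-minute bar, the next flush is due in 1 s.
# After that timer fires, the remaining time is computed again from
# the clock, rather than re-arming a fixed 60 s interval.
```

Because each delay is derived from the wall clock rather than from the previous firing, any lateness in one firing is absorbed by the next calculation instead of compounding.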
The timer publishes events to the instrument buffers. Each aggregator listens for these events and flushes the current bar to the output buffers when its timeframe has ended.
Implementation
I implemented this architecture; the code is available in the Price Server repository.
Key components are:
- InstrumentDataProcessor - input buffer, contains a list of aggregators.
- CandleAggregator - created for each timeframe, aggregates ticks into OHLCV bars.
- CandlePersistenceProcessor - output DB buffer.
- ClientSubscriptionProcessor - WebSocket output buffer.
- NonDriftingTimer - timer which prevents drift over time.
Modules
The solution is built around pluggable modules:
- Source connectors
- Repositories
Implementations for Binance and ClickHouse are in the modules/ folder. New ones can be added without changing the core logic by implementing the appropriate interfaces.
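As an illustration of the idea, such interfaces might look like the following Python sketch. The names and signatures here are hypothetical; the real interfaces live in the repository:

```python
from abc import ABC, abstractmethod

class SourceConnector(ABC):
    """Hypothetical exchange-connector interface."""

    @abstractmethod
    def subscribe(self, instrument, on_tick):
        """Start streaming ticks, invoking on_tick(price, volume)."""

class Repository(ABC):
    """Hypothetical storage-backend interface."""

    @abstractmethod
    def save_bar(self, instrument, bar):
        """Persist one OHLCV bar."""

# A new exchange or storage backend plugs in by implementing the
# abstraction; the core logic never depends on a concrete module.
class InMemoryRepository(Repository):
    def __init__(self):
        self.bars = []

    def save_bar(self, instrument, bar):
        self.bars.append((instrument, bar))
```

The core pipeline is written against SourceConnector and Repository only, so adding, say, another exchange is a matter of dropping a new module into place.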
Starting the server
All runnable components are packed into Docker images and can be started easily:
docker compose up --build
This command will start:
- ClickHouse instance
- Streaming service
- Nginx with UI on port 80
- BFF which returns configuration and provides batch API for historical data
Then just open http://localhost in a browser.
Conclusion
In this article, we explored the architecture of a scalable price aggregation pipeline. Key takeaways:
- Separation of concerns is critical. Streaming and query services should be independent to allow different release cycles and prevent query load spikes from affecting real-time data flow.
- Columnar storage (ClickHouse, TimescaleDB, Parquet) is the right choice for analytical workloads where time-range queries and partial column reads are common.
- Dual-buffer design solves the many-to-many problem between producers and consumers. It eliminates lock contention, provides burst absorption, and achieves sub-microsecond latencies without allocation overhead.
- Modular architecture allows adding new exchange connectors and storage backends without touching core logic.
Price distribution is a foundational component of any trading system. The patterns discussed here (separation of concerns, dual-buffer design, and modular architecture) apply broadly to building market data infrastructure.
Source code: https://github.com/zharkomi/price_server
