Too Long; Didn't Read
The company embarked on a significant dogfooding effort to build "Insights," a tool that lets users analyze query performance. They collected query statistics from all customer databases, totaling more than 1 trillion records about individual queries. With over 10 billion new records ingested daily, the dataset served by a single Timescale service exceeds 350 TB. At its core sits TimescaleDB, a PostgreSQL extension, which handles this massive volume of data.
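As a rough illustration of how TimescaleDB handles ingest at this scale, a hypertable automatically partitions an ordinary PostgreSQL table into time-based chunks. The sketch below uses a hypothetical `query_stats` table and columns, not the actual Insights schema:

```sql
-- Hypothetical per-query statistics table; not the actual Insights schema.
CREATE TABLE query_stats (
    ts          TIMESTAMPTZ NOT NULL,
    database_id BIGINT      NOT NULL,
    query_hash  BIGINT      NOT NULL,
    duration_ms DOUBLE PRECISION,
    rows_read   BIGINT
);

-- Convert it into a TimescaleDB hypertable, chunked by time.
SELECT create_hypertable('query_stats', 'ts');
```

Once converted, inserts and queries use plain SQL; TimescaleDB routes rows to the right chunk behind the scenes.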
The database observability tool, Insights, correlates various metrics so users can find underperforming queries. To scale PostgreSQL to this dataset, they used continuous aggregates, hypertables, and data tiering, and Timescale's columnar compression achieved compression rates of up to 20x. They also leveraged features like approximate schema mutability for flexibility.
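The scaling features named above map to concrete TimescaleDB statements. A sketch, again against a hypothetical `query_stats` hypertable (the column names and the 7-day/`query_hash` policy choices are illustrative assumptions, not the settings Insights actually uses):

```sql
-- Continuous aggregate: an incrementally maintained hourly rollup.
CREATE MATERIALIZED VIEW query_stats_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', ts) AS bucket,
       query_hash,
       avg(duration_ms) AS avg_duration_ms,
       count(*)         AS executions
FROM query_stats
GROUP BY bucket, query_hash;

-- Columnar compression: group compressed rows by query_hash and
-- compress chunks once they are older than 7 days.
ALTER TABLE query_stats SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'query_hash'
);
SELECT add_compression_policy('query_stats', INTERVAL '7 days');
```

Queries against `query_stats_hourly` then read the precomputed rollup instead of scanning raw rows, which is what makes dashboards over a 350 TB dataset practical.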
While the effort was successful, they identified areas for improvement, including better database observability and better handling of large data updates. They highlighted the challenges of altering schemas, adding new indexes, and updating continuous aggregates at scale. Nonetheless, the experience gave them invaluable insight into, and empathy for, customers running Timescale at scale.
Overall, the article details the company's journey scaling PostgreSQL with TimescaleDB to build "Insights," delivering real-time database observability for customers while confronting technical challenges and identifying opportunities for future improvement.