One of the biggest trends in software development today is the emergence of PostgreSQL as the de facto database standard. There have been a few blog posts on how to use PostgreSQL for Everything, but none yet on why this is happening (and more importantly, why this matters).
Until now!
01 PostgreSQL Is Becoming the De Facto Database Standard
02 Everything Is Becoming a Computer
03 The Return of PostgreSQL
04 Free Yourself, Build the Future, Embrace PostgreSQL
05 Timescale Started as “PostgreSQL for Time Series”
06 Timescale Expanded Beyond Time Series
07 Timescale is now “PostgreSQL made Powerful”
08 Coda: Yoda?
Over the past several months, “PostgreSQL for Everything” has become a growing war cry among developers:
“PostgreSQL isn’t just a simple relational database; it’s a data management framework with the potential to engulf the entire database realm. The trend of ‘Using Postgres for Everything’ is no longer limited to a few elite teams but is becoming a mainstream best practice.”
(Source)
“One way to simplify your stack and reduce the moving parts, speed up development, lower the risk and deliver more features in your startup is “Use Postgres for everything.” Postgres can replace—up to millions of users—many backend technologies, Kafka, RabbitMQ, Mongo and Redis among them.”
(Source)
“When I first heard about Postgres (at a time when MySQL absolutely dominated), it was described to me as "that database made by those math nerds," and then it occurred to me: yeah, those are exactly the people you want making your database.”
(Source)
“It has made a remarkable comeback. Now that NoSQL is dead and Oracle owns MySQL, what else is there?”
(Source)
“Postgres is not just a relational DB. It's a way of life.”
(Source)
Thanks to its rock-solid foundation, plus its versatility through native features and extensions, developers can now use PostgreSQL for Everything, replacing complex, brittle data architectures with straightforward simplicity:
This might help explain why PostgreSQL last year took the top spot from MySQL in the rankings for most popular database among professional developers (60,369 respondents):
Which database environments have you done extensive development work in over the past year, and which do you want to work in over the next year? More than 49% of respondents answered PostgreSQL. (Source)
Those results are from the 2023 Stack Overflow Developer Survey. If you look across time, you can see the steady increase in PostgreSQL adoption over the past few years:
While PostgreSQL was the second most popular database among Stack Overflow Developer Survey respondents from 2020 to 2022, its usage consistently increased. (Source)
This is not just a trend among small startups and hobbyists. In fact, PostgreSQL usage is increasing across organizations of all sizes:
The percentage of PostgreSQL usage by company size. (Source)
At Timescale, this trend is not new to us. We have been PostgreSQL believers for nearly a decade. That’s why we built our business on PostgreSQL, why we are one of the
But to understand why this is happening, we have to understand an even more foundational trend and how that trend is changing the fundamental nature of human reality.
Everything—our cars, our homes, our cities, our farms, our factories, our currencies, our things—is becoming a computer. We, too, are becoming digital. Every year, we digitize more of our own identity and actions: how we buy things, how we entertain ourselves, how we collect art, how we find answers to our questions, how we communicate and connect, how we express who we are.
Twenty-two years ago, this idea of “ubiquitous computing” seemed audacious. Back then, I was a graduate student at the MIT AI Lab, working on my
A lot has changed since then. Computing is now ubiquitous: on our desks, in our pockets, in our things, and in our “cloud.” That much we predicted.
But the second-order effects of those changes were not what most of us expected:
Ubiquitous computing has led to ubiquitous data. With each new computing device, we collect more information about our reality: human data, machine data, business data, environmental data, and synthetic data. This data is flooding our world.
The data flood has led to a Cambrian explosion of databases. All these new sources of data have required new places to store them. Twenty years ago, there were maybe five viable database options. Today there are several hundred, most of them specialized for specific use cases or data, with new ones emerging each month.
More data and more databases have led to more software complexity. Choosing the right database for your software workload is no longer easy. Instead, developers are forced to cobble together complex architectures that might include a relational database (for its reliability), a non-relational database (for its scalability), a data warehouse (for analytics), and an object store (for cheaply archiving old data). That architecture might even include more specialized components, like a time-series or vector database.
More complexity means less time to build. Complex architectures are more brittle, demand more intricate application logic, and slow down development. Complexity is not a feature but a real cost.
As computing has become more ubiquitous, our reality has become more entwined with computing. We have brought computing into our world and ourselves into its world. We are no longer just our offline identities but a hybrid of what we do offline and online.
Software developers are humanity’s vanguard in this new reality. We are the ones building the software that shapes this new reality.
But developers are now flooded with data and drowning in database complexity.
This means that developers—instead of shaping the future—are spending more and more of their time managing the plumbing.
How did we get here?
Ubiquitous computing has led to ubiquitous data. This did not happen overnight but in cascading waves over several decades:
With each wave, computers have become smaller, more powerful, and more ubiquitous. Each wave also built on the previous one: personal computers are smaller mainframes; the Internet is a network of connected computers; smartphones are even smaller computers connected to the Internet; cloud computing democratized access to computing resources; the Internet of Things is smartphone components reconstructed as part of other physical things connected to the Cloud.
But in the past two decades, computing advances have not just occurred in the physical world but also in the digital one, reflecting our hybrid reality:
With each new wave of computing, we get new sources of information about our hybrid reality: human digital exhaust, machine data, business data, and synthetic data. Future waves will create even more data. All this data fuels new waves, the latest of which is Generative AI, which in turn further shapes our reality.
Computing waves are not siloed but cascade like dominoes. What started as a data trickle soon became a data flood. And the data flood, in turn, led to the creation of more and more databases.
All these new sources of data have required new places to store them—or databases.
Mainframes started with the
The collaborative power of the Internet enabled the rise of open-source software, including the first open-source databases:
The Internet also created a massive amount of data, which led to the first non-relational, or NoSQL, databases:
Around 2010, we started to hit a breaking point. Up until that point, software applications would primarily rely on a single database—e.g., Oracle, MySQL, PostgreSQL—and the choice was relatively easy.
But “Big Data” kept getting bigger: the Internet of Things led to the rise of machine data; smartphone usage started growing exponentially thanks to the iPhone and Android, leading to even more human digital exhaust; cloud computing democratized access to compute and storage, amplifying these trends. Generative AI very recently made this problem worse with the creation of vector data.
As the volume of data collected grew, we saw the rise of specialized databases:
Twenty years ago, there were maybe five viable database options. Today, there are several hundred.
Faced with this flood, and with specialized databases that each carry their own trade-offs, developers had no choice but to cobble together complex architectures.
These architectures typically include a relational database (for reliability), a non-relational database (for scalability), a data warehouse (for data analysis), an object store (for cheap archiving), and even more specialized components like a time-series or vector database for those use cases.
But more complexity means less time to build. Complex architectures are more brittle, demand more intricate application logic, and slow down development.
This means that instead of building the future, software developers find themselves spending far too much time maintaining the plumbing. This is where we are today.
There is a better way.
This is where our story takes a twist. Our hero, instead of being a shiny new database, is an old stalwart, with a name only a mother core developer could love: PostgreSQL.
At first, PostgreSQL was a distant number two behind MySQL. MySQL was easier to use, had a company behind it, and a name that anyone could easily pronounce. But then MySQL was acquired by Sun Microsystems (2008), which was then acquired by Oracle (2009). And software developers, who saw MySQL as the free savior from the expensive Oracle dictatorship, started to reconsider what to use.
At that same time, a distributed community of developers, sponsored by a handful of small independent companies, was slowly making PostgreSQL better and better. They quietly added powerful features, like full-text search (2008), window functions (2009), and JSON support (2012). They also made the database more rock-solid, through capabilities like streaming replication, hot standby, in-place upgrade (2010), logical replication (2017), and by diligently fixing bugs and smoothing rough edges.
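To make those additions concrete, here is a minimal SQL sketch touching all three (the `posts` table and its columns are hypothetical, purely for illustration; the `jsonb` type shown is the later binary variant of the original JSON support):

```sql
-- Hypothetical table for illustration
CREATE TABLE posts (
    id      serial PRIMARY KEY,
    author  text,
    body    text,
    meta    jsonb,                      -- JSON support
    created timestamptz DEFAULT now()
);

-- Full-text search (2008): match documents against a text query
SELECT id
FROM posts
WHERE to_tsvector('english', body) @@ to_tsquery('postgres & extension');

-- Window function (2009): rank each author's posts by recency
SELECT id, author,
       row_number() OVER (PARTITION BY author ORDER BY created DESC) AS rn
FROM posts;

-- JSON: pull a key out of the metadata column as text
SELECT meta->>'tags' FROM posts WHERE meta ? 'tags';
```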
One of the most impactful capabilities added to PostgreSQL during this time was the ability to support extensions: software modules that add functionality to PostgreSQL (2011).
Thanks to extensions, PostgreSQL started to become more than just a great relational database. Thanks to PostGIS, it became a great geospatial database; thanks to TimescaleDB, it became a great time-series database; hstore, a key-value store; AGE, a graph database; pgvector, a vector database. PostgreSQL became a platform.
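Enabling any of these is a one-line DDL statement, assuming the extension's packages are installed on the server (TimescaleDB additionally needs to be preloaded via `shared_preload_libraries`):

```sql
CREATE EXTENSION IF NOT EXISTS postgis;      -- geospatial types and indexes
CREATE EXTENSION IF NOT EXISTS timescaledb;  -- time-series hypertables
CREATE EXTENSION IF NOT EXISTS hstore;       -- key-value column type
CREATE EXTENSION IF NOT EXISTS age;          -- graph queries (openCypher)
CREATE EXTENSION IF NOT EXISTS vector;       -- pgvector: vector similarity search
```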
Now, developers can use PostgreSQL for its reliability, scalability (replacing non-relational databases), data analysis (replacing data warehouses), and more.
At this point, the smart reader should ask, “What about big data?” That’s a fair question. Historically, “big data” (e.g., hundreds of terabytes or even petabytes)—and the analytics queries that go with it—has been a bad fit for a database like PostgreSQL, which doesn’t scale horizontally on its own.
That, too, is changing. Last November, we launched “
So while “Big Data” has historically been an area of weakness for PostgreSQL, soon, no workload will be too big.
PostgreSQL is the answer. PostgreSQL is how we free ourselves and build the future.
Instead of futzing with several different database systems, each with its own quirks and query languages, we can rely on the world’s most versatile and, possibly, most reliable database: PostgreSQL. We can spend less time on the plumbing and more time on building the future.
And PostgreSQL keeps getting better. The PostgreSQL community continues to make the core better. There are many more companies contributing to PostgreSQL today, including the hyperscalers.
Today's PostgreSQL ecosystem (Source)
There are also innovative, independent companies building around the core to make the PostgreSQL experience better:
And, of course, there’s us, Timescale.
The Timescale story will probably sound a little familiar: we were solving some hard sensor data problems for IoT customers, and we were drowning in data. To keep up, we built a complex stack that included at least two different database systems (one of which was a time-series database).
One day, we reached our breaking point. In our UI, we wanted to filter devices by both device_type and uptime. This should have been a simple SQL join. But because we were using two different databases, it instead required writing glue code in our application between our two databases. It was going to take us weeks and an entire engineering sprint to make the change.
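With everything in one database, the query we wanted is a few lines of ordinary SQL. The table and column names below are reconstructed for illustration, so treat them as hypothetical:

```sql
-- One database, one query: filter devices by type and uptime
SELECT d.device_id, d.device_type, m.uptime
FROM devices d
JOIN device_metrics m ON m.device_id = d.device_id
WHERE d.device_type = 'thermostat'
  AND m.uptime > interval '30 days';
```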
Then, one of our engineers had a crazy idea: Why don’t we just build a time-series database right in PostgreSQL? That way, we would just have one database for all our data and would be free to ship software faster. Then we built it, and it made our lives so much easier. Then we told our friends about it, and they wanted to try it. And we realized that this was something that we needed to share with the world.
So, we open-sourced our time-series extension, TimescaleDB, and
In the seven years since, we’ve heavily invested in both the extension and our PostgreSQL cloud service, offering a better and better PostgreSQL developer experience for time series and analytics: 350x faster queries and 44% higher insert rates via hypertables (auto-partitioning tables); millisecond response times for common queries via continuous aggregates (real-time materialized views); 90%+ storage cost savings via native columnar compression; infinite, low-cost object storage via tiered storage; and more.
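Adopting those features is just more SQL. A minimal sketch, with a hypothetical `conditions` table (see the TimescaleDB docs for the full API):

```sql
-- Turn an ordinary table into a hypertable (auto-partitioned by time)
CREATE TABLE conditions (
    time    timestamptz NOT NULL,
    device  text,
    temp    double precision
);
SELECT create_hypertable('conditions', 'time');

-- A continuous aggregate: an incrementally maintained materialized view
CREATE MATERIALIZED VIEW conditions_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       device,
       avg(temp) AS avg_temp
FROM conditions
GROUP BY bucket, device;

-- Enable native columnar compression and compress chunks older than a week
ALTER TABLE conditions SET (timescaledb.compress);
SELECT add_compression_policy('conditions', INTERVAL '7 days');
```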
That’s where we started, in time-series data, and also what we are most known for.
But last year we started to expand.
We launched
Recently,
PopSQL is the SQL editor for team collaboration
We also launched “
Today, Timescale is PostgreSQL made Powerful—at any scale. We now solve hard data problems—that no one else does—not just in time series but in AI, energy, gaming, machine data, electric vehicles, space, finance, video, audio, web3, and much more.
We believe that developers should be using PostgreSQL for everything, and we are improving PostgreSQL so that they can.
Customers use Timescale not just for their time-series data but also for their vector data and general relational data. They use Timescale so that they can use PostgreSQL for Everything. You can too:
Our human reality, both physical and virtual, offline and online, is filled with data. As Yoda might say, data surrounds us, binds us. This reality is increasingly governed by software, written by software developers, by us.
It’s worth appreciating how remarkable that is. Not that long ago, in 2002, when I was an MIT grad student, the world had lost faith in software. We were recovering from the dotcom bubble collapse. Leading business publications proclaimed that “
But today, especially now in this world of generative AI, we are the ones shaping the future. We are the future builders. We should be pinching ourselves.
Everything is becoming a computer. This has largely been a good thing: our cars are safer, our homes are more comfortable, and our factories and farms are more productive. We have instant access to more information than ever before. We are more connected with each other. At times, it has made us healthier and happier.
But not always. Like the Force, computing has both a light and dark side. There has been growing evidence that mobile phones and social media are directly contributing to a
We have become the stewards of two valuable resources that affect how the future is built: our time and our energy.
We can either choose to spend those resources on managing the plumbing or embrace PostgreSQL for Everything and build the right future.
I think you know where we stand.
Thanks for reading. #Postgres4Life
This post was written by Ajay Kulkarni.