Do you have an app on Heroku using Postgres? Are you running out of database connections? Maybe you've already tried setting CONN_MAX_AGE, but that only helped for a while. It’s time for PgBouncer, the de facto standard for Postgres connection pooling.
In this article, we'll look at how to use PgBouncer to scale your app on Heroku. We'll use Django, a popular python web framework, as an example for the settings. The installation steps for PgBouncer and the concepts, however, should work for any app on Heroku.
Although setting up PgBouncer is straight forward, finding the right settings for your particular use case can be a challenge. There is no one-size-fits-all solution for a PgBouncer configuration.
This article will show you:
You can find a sample project on GitHub at https://github.com/steuke/heroku-pgbouncer-django-demo.
Note: to use this guide, your app must use the `DATABASE_URL` environment variable to configure its database-connection. This is important. If you configure your database manually, you will first need to switch to using `DATABASE_URL`. For Django, have a look at dj-database-url and the sample project, which makes this very easy. You can check out the sample project to better understand how this can be done.
Choosing the right PgBouncer deployment type
PgBouncer can be deployed in different ways. Heroku supports both server-side and client-side deployments. The server-side implementation on Heroku is still in beta, is somewhat restricted on configuration options, and only supports transaction-mode (see next section). Therefore, this guide uses the client-side approach, which deploys PgBouncer on the same dyno as your Django app and gives you more control over each setting.
If you are unsure which deployment type to use, the official Heroku documentation will help you decide if client-side deployment is right for you.
Choosing the right PgBouncer pool mode and Django settings
PgBouncer has three types of connection pool modes, listed here from most polite to most aggressive connection sharing: session, transaction, and statement mode.
Here is what you need to know when considering different pool modes for use with your Django app:
As a best practice, start by using transaction-mode and verify that your app still works. After that, feel free to experiment with the other modes and their impact on performance and resource usage. For example, if you experience errors because your app uses features that are not supported, try switching to session-mode. If you don't experience errors, and you want to push your app's performance further, you could try to use statement-mode. More details can be found in the Heroku documentation.
Installing PgBouncer on Heroku Using a Buildpack
A word of advice: Use a staging environment to follow this guide. Roll this change out to production only after testing your app in a staging environment. Incorrect steps might prevent your app from working until you fix the problem. (You have been warned!)
OK, let's get started. Follow these five steps to add PgBouncer to your app (in this article, we'll use the Heroku CLI for commands):
1. Disable use of server-side cursors
In a Django app, simply add this line to your settings.py:
DISABLE_SERVER_SIDE_CURSORS = True # required when using pgbouncer's pool_mode=transaction
2. Add the PgBouncer buildpack
$ heroku buildpacks:add heroku/pgbouncer
3. Add the python buildpack
In some cases, you have to manually add the python buildpack (or the buildpack for the language you are using). Since there is no harm in doing this, let's do it:
$ heroku buildpacks:add heroku/python
4. Change your Procfile so that PgBouncer is launched at startup
If you are using python/Django, then before this change, your Procfile will look something like this:
web: gunicorn your_django_project.wsgi
Change it to something like this:
web: bin/start-pgbouncer gunicorn your_django_project.wsgi
This change will pass your original startup command to the bin/start-pgbouncer script. The script is part of the PgBouncer buildpack. Its main purpose is to change the DATABASE_URL environment variable to point to the local PgBouncer connection pool instead of the original Postgres database.
This is the reason your Django app must use the DATABASE_URL environment variable to configure the database connection. If you do not use DATABASE_URL, your app will still work, but it won’t be connecting via PgBouncer to your Postgres.
5. Commit the changes and deploy them to Heroku
Use your favorite git client to commit the changes, and push them to Heroku. Then start looking at the logs with `heroku logs -t --source app | grep pgbouncer` while running tests on your app. You should see a series of log-entries similar to these:
2020-04-30T18:59:48.918478+00:00 app[web.1]: 2020-04-30 18:59:48.918 53 LOG C-0x56211ba01900: pgbouncer/pgbouncer@unix(63):6000 login attempt: db=pgbouncer user=pgbouncer tls=no
2020-04-30T18:59:48.919044+00:00 app[web.1]: 2020-04-30 18:59:48.919 53 LOG C-0x56211ba01900: pgbouncer/pgbouncer@unix(63):6000 closing because: client close request (age=0)
Congratulations! This indicates that your app is successfully configured and using PgBouncer to connect to Postgres.
What did you just achieve?
Your web dyno is now running a PgBouncer instance, which is providing a connection pool to your Django app (running on the same dyno). Your Django app is connecting to this local PgBouncer connection pool instead of directly to the Postgres database.
In the next section, you'll learn how to monitor PgBouncer.
There are several ways to monitor PgBouncer. One common way is to connect to the PgBouncer admin-console using psql. This section demonstrates two other ways: using Heroku logs and querying the stats from within your Django app.
Using Heroku logs
The PgBouncer buildpack is configured to log the PgBouncer stats in one minute intervals. Running the following command shows the most recent stats:
$ heroku logs --source app | grep "LOG Stats"
The output will look similar to this:
2020-04-30T19:11:25.720304+00:00 app[web.1]: 2020-04-30 19:11:25.720 54 LOG Stats: 5 xacts/s, 46 queries/s, in 5401 B/s, out 2756 B/s, xact 45640 us, query 992 us wait time 1396 us
```
Querying the PgBouncer database from your Django app using SHOW STATS
PgBouncer provides a virtual database containing its stats. This is also referred to as the Admin Console. The PgBouncer buildpack provides access to this virtual database via a Unix socket named /tmp, using the username "PgBouncer" and no password.
Using these credentials, you can retrieve the PgBouncer stats from within your Django app by using the low-level psycopg2 database API. Here is an example function pgbouncer_stats() that retrieves the stats using that API:
def pgbouncer_stats() -> List[Dict]:
try:
connection = psycopg2.connect(host="/tmp/", port=6000, dbname="pgbouncer", user="pgbouncer",
cursor_factory=extras.DictCursor)
connection.autocommit = True # must enable autocommit when querying pgbouncer stats
cursor = connection.cursor()
cursor.execute('SHOW STATS')
pgbouncer_stats = [dict(row) for row in cursor]
return pgbouncer_stats
except Exception as e:
return [{'error': True, 'description': 'Could not query pgbouncer stats.', 'reason': f'{e}'}]
The return value of pgbouncer_stats() is a list of dictionaries, with one dictionary for each database that PgBouncer is pooling:
"pgbouncer_stats":[
{
"database":"db1",
"total_xact_count":220,
"total_query_count":2750,
"total_received":330110,
"total_sent":168517,
"total_xact_time":15062895,
"total_query_time":2837299,
"total_wait_time":84837,
"avg_xact_count":2,
"avg_query_count":29,
"avg_recv":3519,
"avg_sent":1796,
"avg_xact_time":68467,
"avg_query_time":1031,
"avg_wait_time":904
},
{
"database":"pgbouncer",
"total_xact_count":110,
"total_query_count":110,
"total_received":0,
"total_sent":0,
"total_xact_time":0,
"total_query_time":0,
"total_wait_time":0,
"avg_xact_count":1,
"avg_query_count":1,
"avg_recv":0,
"avg_sent":0,
"avg_xact_time":0,
"avg_query_time":0,
"avg_wait_time":0
}
]
You can see that the “db1” database has a total_xact_count of 220. This indicates that a total of 220 transactions have been processed. The PgBouncer documentation has the full list of stats and their meaning.
How to fix some common problems when starting to use PgBouncer:
Next Steps: Monitor, Benchmark, and Fine-Tune PgBouncer
At this point, PgBouncer is up and running on your dyno, providing connection pooling to your Django app.
What should you do next?
PgBouncer is the right tool to help scale your Postgres app when the number of database connections becomes an issue. Keep in mind that while adding PgBouncer to your setup will result in additional complexity for you and your team, taking the time to carefully setup, monitor, and tune your configuration should improve your app's performance.