Finding High-impact Performance Bottlenecks — Django Tips

Written by chillaranand | Published 2019/02/24
How to find the bottlenecks in a Django app that have the highest impact on application performance.

Originally published at https://avilpage.com/2018/12/django-bottleneck-performance-scaling.html

Introduction

When optimizing the performance of a web application, a common mistake is to start with the slowest page (or API). Response time alone is not enough: we should also consider how much traffic each page receives when deciding what to optimize first.

In this article, we will profile a Django web app and find the high-impact performance bottlenecks that are worth optimizing first.

Profiling

django-silk is an open-source profiling tool that intercepts HTTP requests and stores data about them. Install it with pip.

pip install django-silk

Add silk to the installed apps and include the Silk middleware in the Django settings.

MIDDLEWARE = [
    ...
    'silk.middleware.SilkyMiddleware',
    ...
]

INSTALLED_APPS = (
    ...
    'silk',
)

Run migrations so that Silk can create the database tables it needs to store profile data, and collect its static files.

$ python manage.py makemigrations
$ python manage.py migrate
$ python manage.py collectstatic

Include the Silk URLs in the root URLconf to view the profile data.

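# url() was removed in Django 4.0; on newer versions use django.urls.re_path instead
from django.conf.urls import include, url

urlpatterns += [url(r'^silk/', include('silk.urls', namespace='silk'))]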

On the Silk requests page (http://localhost:8000/silk/requests/), we can see all requests and sort them by overall time or by time spent in the database.

High Impact Bottlenecks

Silk creates a silk_request table, which contains information about the requests processed by Django.

$ pgcli

library> \d silk_request;

+--------------------+--------------------------+-------------+
| Column             | Type                     | Modifiers   |
|--------------------+--------------------------+-------------|
| id                 | character varying(36)    | not null    |
| path               | character varying(190)   | not null    |
| time_taken         | double precision         | not null    |
...

We can group this request data by path and calculate the number of requests, the average time taken and the impact factor for each path. Since we are considering both response time and traffic, the impact factor is the product of the average response time and the number of requests for that path. In the query below it is also divided by the largest such product, so the highest-impact path scores 1.00.
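To make the definition concrete, here is a minimal Python sketch of the same calculation, independent of Silk and the database. The function name impact_by_path and the sample paths and timings are hypothetical, made up only for illustration. The score is normalized by the largest product, so the top path gets 1.0.

```python
from collections import defaultdict


def impact_by_path(samples):
    """Rank paths by impact.

    samples: iterable of (path, time_taken) pairs, one pair per request.
    Returns a list of dicts sorted by impact, highest first.
    """
    totals = defaultdict(lambda: [0.0, 0])  # path -> [total time, request count]
    for path, time_taken in samples:
        totals[path][0] += time_taken
        totals[path][1] += 1

    rows = []
    for path, (total, count) in totals.items():
        avg_time = total / count
        # Impact factor: average response time multiplied by number of requests.
        rows.append({"path": path, "avg_time": avg_time, "count": count,
                     "raw_impact": avg_time * count})
    if not rows:
        return []

    top = max(r["raw_impact"] for r in rows)
    for r in rows:
        # Normalize so the highest-impact path scores 1.0.
        r["impact"] = round(r["raw_impact"] / top, 2) if top else 0.0
    return sorted(rows, key=lambda r: r["raw_impact"], reverse=True)


# Hypothetical traffic: a rarely hit slow page, a moderately slow page,
# and a busy page that is fast per request.
samples = ([("/rare/", 900.0)]
           + [("/slow/", 500.0)] * 2
           + [("/busy/", 150.0)] * 10)

for row in impact_by_path(samples):
    print(row["path"], row["avg_time"], row["count"], row["impact"])
```

With this made-up data, /busy/ ranks first even though /rare/ has by far the worst average response time, which is the effect the impact factor is meant to expose.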

library> SELECT
    s.*,
    round((s.avg_time * s.count) / max(s.avg_time * s.count) OVER ()::NUMERIC, 2) AS impact
FROM (
    SELECT path, round(avg(time_taken)::NUMERIC, 2) AS avg_time, count(path) AS count
    FROM silk_request
    GROUP BY path
) s
ORDER BY impact DESC;

+-------------------------+------------+---------+----------+
| path                    | avg_time   | count   | impact   |
|-------------------------+------------+---------+----------|
| /point/book/book/       | 239.90     | 1400    | 1.00     |
| /point/book/data/       | 94.81      | 1900    | 0.54     |
| /point/                 | 152.49     | 900     | 0.41     |
| /point/login/           | 307.03     | 400     | 0.37     |
| /                       | 106.51     | 1000    | 0.32     |
| /point/auth/user/       | 494.11     | 200     | 0.29     |
...

We can see that /point/book/book/ has the highest impact even though it is neither the most visited nor the slowest view. Optimizing this view first yields the biggest overall improvement in the web app's performance.
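The same ranking can also be produced from Python instead of pgcli. The following is a rough sketch, not part of the original workflow: the helper names add_impact and silk_impact are hypothetical, and it assumes that django-silk's silk.models.Request model is the one backing the silk_request table, with the path and time_taken fields shown earlier. Unlike the SQL query, it does not round avg_time before multiplying, so scores may differ slightly in the last digit.

```python
def add_impact(rows):
    """Add a normalized 'impact' key to rows and sort by it, highest first.

    rows: iterable of dicts with 'path', 'avg_time' and 'count' keys.
    """
    rows = [dict(r) for r in rows]
    top = max((r["avg_time"] * r["count"] for r in rows), default=0)
    for r in rows:
        # Impact: avg_time * count, scaled so the top path scores 1.0.
        r["impact"] = round(r["avg_time"] * r["count"] / top, 2) if top else 0.0
    return sorted(rows, key=lambda r: r["avg_time"] * r["count"], reverse=True)


def silk_impact():
    """Group Silk's stored requests by path and rank them by impact."""
    # Imported lazily so add_impact() can be used without Django configured.
    from django.db.models import Avg, Count
    from silk.models import Request  # assumed model behind silk_request

    grouped = (
        Request.objects
        .order_by()          # clear any default ordering so GROUP BY uses path only
        .values("path")
        .annotate(avg_time=Avg("time_taken"), count=Count("id"))
    )
    return add_impact(grouped)


# Checking the helper against three rows taken from the table above.
table_rows = [
    {"path": "/point/", "avg_time": 152.49, "count": 900},
    {"path": "/point/book/book/", "avg_time": 239.90, "count": 1400},
    {"path": "/point/book/data/", "avg_time": 94.81, "count": 1900},
]
for row in add_impact(table_rows):
    print(row["path"], row["impact"])
```

In a project with Silk installed, silk_impact() could be called from python manage.py shell. Keeping the arithmetic in add_impact makes it easy to check against known figures without a database.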

Conclusion

In this article, we learned how to profile a Django web app and identify the bottlenecks whose optimization improves performance the most. In the next article, we will learn how to optimize these bottlenecks by taking an in-depth look at them.

More tips and tricks about Django are available at https://avilpage.com/tags/django-tips-tricks.html


Published by HackerNoon on 2019/02/24