4,848 reads

Python Code Optimization Tips For Developers

by Ria Jaya SharmaAugust 13th, 2019

Too Long; Didn't Read

Python is the most popular, dynamic, versatile, and one of the most sought after languages for web and AI development. Python is still the best and most relevant language for application developers. In this post, I will discuss some of the optimization procedures and patterns in Python coding in an effort to improve the overall coding knowledge of developers. To make your execution faster on CPython, get PyPy v7.1 that include two different interpreters and the feature of its predecessor PyPy3.6-beta and Python 2.7.7. Compared to CPython it is 7.6X faster.

Company Mentioned

featured image - Python Code Optimization Tips For Developers

Optimization of Python codes deals with selecting the best option among a number of possible options that are feasible to use for developers. Python is the most popular, dynamic, versatile, and one of the most sought after languages for web and AI development. Right from the programming projects like machine learning and data mining, Python is still the best and most relevant language for application developers.

In this post, I will discuss some of the optimization procedures and patterns in Python coding in an effort to improve the overall coding knowledge of developers.

Optimizing Slow Code First

When you run your Python program, the source code .py is compiled using CPython into bytecode (.pyc file) saved in a folder (_pycache_) and at the end interpreted by Python virtual M2M code (machine to machine). Since Python uses an interpreter to execute the bytecode at runtime which makes the process a bit slower.

To make your execution faster on CPython, get PyPy v7.1 that include two different interpreters and the feature of its predecessor PyPy3.6-beta and Python 2.7. Compared to CPython it is 7.6X faster which is critical for coding.

Programs in CPython takes a lot of space, but PyPy takes up less space in programs. Have issues in Python coding?

Spotting Performing Issues in Python?

Choosing data structures in Python can affect the performance of codes. Profiling of codes can be of great help in analyzing how codes perform in different situations and enable to identify the time that the program takes in its operations.

Tools such as PyCallGraph, cProfile, and gProf2dot can be useful in faster and more efficient profiling of Python codes:

cProfile is to identify how long the various parts of Python codes are executed.

PyCallGraph creates visual representations of graphs to represent the relationship between Python codes.

This example demonstrates a simple use of pycallgraph.

From pycallgraph import PyCallGraph

From pycallgraph.output import GraphvizOutput

class Juice:

    def drink(self):
        pass


class Person:

    def __init__(self):
        self.no_juices()

    def no_juices(self):
        self.juices = []

    def add_juice(self, juice):
        self.juices.append(juice)

    def drink_juices(self):
        [juice.drink() for juice in self.juices]
        self.no_juices()


def main():
    graphviz = GraphvizOutput()
    graphviz.output_file = 'basic.png'

    with PyCallGraph(output=graphviz):
        person = Person()
        for a in xrange(10):
            person.add_juice(Banana())
        person.drink_juices()


if __name__ == '__main__':
    main()

Concatenation of Strings

Python strings are generally immutable and subsequently perform quite slow. There are several ways to optimize Python strings in different situations. For example, the + (plus) operator is the ideal option for the few Python string objects.

If you use (+) operator to link multiple strings in Python codes, each time it will create a new object and creation of so many string objects will hamper the utilization of memory.

Iterator such as a List has multiple Strings, the best way to concatenate them is by utilizing the

.join() method

Let’s see an example of how the .join() and the += operator works:

import timeit

# create a list of 1000 words
list_of_words = ["foo "] * 1000

def using_join(list_of_words):  
    return "".join(list_of_words)

def using_concat_operator(list_of_words):  
    final_string = ""
    for i in list_of_words:
        final_string += i
    return final_string

print("Using join() takes {} s".format(timeit.timeit('using_join(list_of_words)', 'from __main__ import using_join, list_of_words')))

print("Using += takes {} s".format(timeit.timeit('using_concat_operator(list_of_words)', 'from __main__ import using_concat_operator, list_of_words')))

Output

$ python join-vs-concat.py 
Using join() takes 14.0949640274 s  
Using += takes 79.5631570816 s  
$ 
$ python join-vs-concat.py 
Using join() takes 13.3542580605 s  
Using += takes 76.3233859539 s

List Comprehension vs Loops in Python Coding

Loops are common in Python coding that provides a concise way to create new lists that also support various conditions.

For example, if you want to get the list of the cube root of all the odd numbers in a defined range using the

for loop

;

new_list = []  
for n in range(1, 9):  
    if n % 3 == 0:
        new_list.append(n**3)

The list comprehension version of the loop would be like:

new_list = [ n**3 for n in range(1,9) if n%3 == 0]

So, it’s clear that list comprehension is more precise, shorter, and notably faster in execution time than

for loops

Optimizing Memory Footprint in Python

Dealing with loops in Python coding, you will need to create a complete list of integers for help in executing the for-loops. Here, in this case, functions such as

range

and

xrange

will be useful in your coding. The functionality of both is the same, but they are different in that the range returns a list object but the

xrange

returns an

xrange

object.

If you need to generate a large number of integers, then

xrange

function is an ideal choice for this purpose. Why? It is because

xrange

works as a generator in that doesn’t provide the final result, but it gives you the ability to generate the values in the final list.

Let’s see the difference between these two functions in terms of memory consumption -

xrange

range

$ python
Python 2.7.10 (default, Oct 23 2015, 19:19:21)  
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.  
>>> import sys
>>> 
>>> r = range(1000000)
>>> x = xrange(1000000)
>>> 
>>> print(sys.getsizeof(r))
8000072  
>>> 
>>> print(sys.getsizeof(x))
40  
>>> 
>>> print(type(r))
<type 'list'>  
>>> print(type(x))
<type 'xrange'>

Hence, it’s clear that

xrange

function is a much better option in saving memory than the range function. The range function consumed 8000072 bytes, on the other hand,

xrange

function consumed only 40 bytes of memory.

Summary

Hope these optimization tips enables robust and reliable execution of Python codes at multiple levels of granularity from profiling to data structures to string concatenation and memory optimization with