In this article, I will first walk you through the distinction between concurrent programming and parallel execution, discuss about Python built-ins concurrent programming mechanisms and the pitfalls of multi-threading in Python. Understanding Concurrent Programming vs Parallel Execution Concurrent programming is not equivalent to parallel execution, despite the fact that these two terms are often being used interchangeably. Illustration of concurrency without parallelism Concurrency is a property which more than one operation can be run simultaneously but it doesn’t mean it will be. (Imagine if your processor is single-threaded. ) Illustration of parallelism Parallel is a property which operations are actually being run simultaneously. It is usually determined by the hardware constraints. Think of your program as a fast food chain, concurrency is incorporated when two separate counters for order and collection are built. However, it doesn’t ensure parallelism as it depends on the number of employees available. If there is only one employee to handle both order and collection requests, the operations can’t be running in parallel. Parallelism is only present when there are two employees to serve order and collection simultaneously. Python Built-ins Now, what do we have in Python? Does Python have built-ins that facilitate us to build concurrent programs and enable them to run in parallel? In the following discussion, we assume our programs are all written and run in a multi-threaded or multi-core processor. The answer is (Yes and No in German). Why yes? Python does have built-in libraries for the most common concurrent programming constructs — multiprocessing and multithreading. You may think, since Python supports both, why Jein? The reason is, multithreading in Python is not really multithreading, due to the GIL in Python. Jein Multi-threading — Thread-based Parallelism is the package that provides API to create and manage threads. Threads in Python are always non-deterministic and their scheduling is performed by the operating system. However, multi-threading might not be doing what you expect to be. threading Why multi-threading in Python might not be what you want? Other than the common pitfalls such as deadlock, starvation in multithreading in general. Python is notorious for its poor performance in multithreading. Let us look at the following snippet: import threading def countdown():x = 1000000000while x > 0:x -= 1 # Implementation 1: Multi-threadingdef implementation_1():thread_1 = threading.Thread(target=countdown)thread_2 = threading.Thread(target=countdown)thread_1.start()thread_2.start()thread_1.join()thread_2.join() # Implementation 2: Run in serialdef implementation_2():countdown()countdown() Which implementation would be faster? Let us perform a timing. Timing results of both implementations Surprisingly, running 2 serially outperformed multi-threading? How could this happen? Thanks to the notorious Global Interpreter Lock (GIL). countdown() What is Global Interpreter Lock (GIL)? Depends on the distribution of your Python, which most of the case, is an implementation of CPython. CPython is the original implementation of Python, you can read more about it in this . StackOverflow thread In CPython, multi-threading is supported by introducing a known as Global Interpreter Lock (aka GIL). It is to prevent multiple threads from accessing the same Python object simultaneously. This make sense, you wouldn’t want someone else to mutate your object while you are processing it. Mutex Illustration of Implementation_1 So, from our code snippet above, creates 2 threads and are supposed to run in parallel on a multi-threaded system. However, only one thread can hold the GIL at a time, one thread must wait for another thread to release the GIL before running. Meanwhile, scheduling and switching done by the OS introduce overhead that make even slower. implementation_1 implementation_1 How to bypass GIL? How could we bypass GIL, while maintaining the use of multi-threading? There is no universal good answer for this question, as this varies from the purpose of your code. Using a different implementation of Python such as Jython, PyPy or IronPython is an option. I personally do not advocate using a different implementation of Python since most libraries written are not tested against different implementations of Python. Another potential workaround is to use C-extenstion, or better known as . Note that Cython and CPython are not the same. You can read more about Cython . Cython here Use multiprocessing instead. Since in multiprocessing, an interpreter is created for every child process. The situation where threads fighting for GIL simple doesn’t exist as there is always only a main thread in every process. Despite all the pitfalls, should we still use multi-threading? If your task is I/O bound, meaning that the thread spends most of its time handling I/O such as performing network requests. It is still perfectly fine to use multi-threading as the thread is, most of the time, being blocked and put into blocked queue by the OS. Thread is also always less resource-hungry than process. Multiprocessing — Process-based Parallelism Let us implement our previous code snippet using multiprocessing. import multiprocessing # countdown() is defined in the previous snippet. def implementation_3():process_1 = multiprocessing.Process(target=countdown)process_2 = multiprocessing.Process(target=countdown)process_1.start()process_2.start()process_1.join()process_2.join() The result itself is self-explanatory. Timing results of multiprocessing vs multithreading Conclusion The constraint of GIL was something that caught me in the very beginning of time as a Python developer. I wasn’t aware of my decision of using threads is totally worthless until I did the timing. I hope this article helps. Please click the 👏 button if you find this useful.

Chain

Target

Concurrent Programming in Python is not what you think it is.

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

Asynchronous Tasks with Celery + Redis in Django

The Essential Guide to Relearning Java Thread Primitives

A Guide to Improving Your Python Performance Speed

A simple introduction to Python’s asyncio

Assume It Worked and Fix Later

Async/Await in Golang: An Introductory Guide

Asynchronous Tasks with Celery + Redis in Django

The Essential Guide to Relearning Java Thread Primitives

A Guide to Improving Your Python Performance Speed

A simple introduction to Python’s asyncio

Assume It Worked and Fix Later

Async/Await in Golang: An Introductory Guide

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps