Parallelism and Concurrency in Python (Concepts + Code) by @devonroll
by Karan, October 3rd, 2019
Too Long; Didn't Read

In this post, we discuss concurrency and parallelism in Python. We look at multithreading, multiprocessing and asynchronous programming. In Python, concurrency is achieved with the threading and asyncio modules. A program that runs in parallel is also concurrent, but the reverse is not necessarily true. The output of threaded Python code can vary from run to run. For parallelism, we create a Pool object from the multiprocessing module, which offers a convenient means to parallelize the execution of a function across multiple input values.

Hi folks! Hope all you programming geeks are doing well. In this post, we will discuss concurrency and parallelism in Python. We will look at multithreading, multiprocessing and asynchronous programming, and at how we can use these concepts to speed up computation tasks in Python. So, without wasting time, let's get started.

[ I am assuming that you already have a fair knowledge of Python. If not, I recommend reading this post before moving forward:

https://hackernoon.com/pythonic-way-of-doing-things-code-kq2b430vv ]

Parallelism

Parallelism means performing multiple tasks at literally the same time, for example on separate CPU cores.

1. Multiprocessing: this means distributing your tasks over CPU cores [ on Linux, type lscpu in a terminal to check the number of cores in your computer ]. For CPU-bound tasks (such as numerical computations), we can use Python's multiprocessing module. We simply create a Pool object from multiprocessing, which offers a convenient means to parallelize the execution of a function across multiple input values. Let's look at it with the help of an example:

import multiprocessing
import time
import numpy as np

def DotProduct(A):
    # A is a pair of matrices. Multiply them and return only the shape of
    # the product, so that large arrays are not sent back from the workers.
    dot_product = np.dot(A[0], A[1])
    return dot_product.shape

# Three pairs of matrices to multiply.
List = [[np.arange(1000000).reshape(5000,200),np.arange(1000000).reshape(200,5000)],
        [np.arange(1000000).reshape(500,2000),np.arange(1000000).reshape(2000,500)],
        [np.arange(1000000).reshape(5000,200),np.arange(1000000).reshape(200,5000)]]

if __name__ == "__main__":
    # Run the code without multiprocessing, i.e. on a single core.
    start = time.time()
    B = list(map(DotProduct,List))
    end = time.time() - start
    print("Full time taken : " , end , "seconds")

    # Run the same code with the multiprocessing module on multiple cores.
    start = time.time()
    cores = multiprocessing.cpu_count()
    with multiprocessing.Pool(cores) as p:
        C = p.map(DotProduct,List)
    end = time.time() - start
    print("Full time taken : " , end , "seconds")

## output //
Full time taken : 23.593358993530273 seconds
Full time taken : 14.405884027481079 seconds
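The same Pool pattern can be sketched on a smaller problem. This is only an illustration, not part of the original example: the helper names split_range and sum_of_squares are made up for this sketch, and the chunk size simply follows the core count. It splits a range into one chunk per core, lets each worker sum the squares of its chunk, and adds up the partial results. Pool.map returns the results in the same order as the inputs.

```python
import multiprocessing


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks (hypothetical helper)."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` chunks get one additional element each.
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def sum_of_squares(bounds):
    """CPU-bound work on one chunk: sum i*i for start <= i < stop."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


if __name__ == "__main__":
    n = 2_000_000
    workers = multiprocessing.cpu_count()
    chunks = split_range(n, workers)
    # Each chunk is handed to a separate worker process.
    with multiprocessing.Pool(workers) as pool:
        partial_sums = pool.map(sum_of_squares, chunks)
    print("total:", sum(partial_sums))
```

Whether this is faster than a plain loop depends on the machine and on how much work each chunk holds, since starting processes and passing data between them has a cost of its own.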

Concurrency

Concurrency means making progress on multiple tasks during overlapping time periods; the tasks may be interleaved in any order and need not run at the same instant. Python is not great at handling concurrency, but it does a pretty decent job.

1. Multithreading: running multiple threads to perform tasks within a single process. Multithreading is really good for I/O-bound tasks (such as sending multiple requests to servers concurrently). Threads share the PID (process ID) of the process that created them; each thread has its own thread identifier and a start() method that begins its execution. The thread's join() method can be used if you want to run lines of code only after the thread finishes its job. In CPython, the GIL (global interpreter lock) lets only one thread execute Python bytecode at a time, so threads rarely speed up CPU-bound pure-Python code. Because threads are scheduled unpredictably, the output of threaded code can vary a lot from run to run.

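2. Async IO: in Python, Async IO (the asyncio module) is a single-threaded, single-process design that achieves concurrency through cooperative multitasking: a coroutine hands control back to an event loop whenever it awaits, so other coroutines can run in the meantime.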
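To make the I/O-bound case for threads concrete, here is a minimal sketch. It assumes that time.sleep is a fair stand-in for waiting on a server, and the names fake_download and run_downloads are hypothetical. Each thread records the process ID and its own thread identifier, which shows that threads live inside one process.

```python
import os
import threading
import time


def fake_download(name, delay, results):
    """Simulate an I/O-bound task; sleeping stands in for waiting on a server."""
    time.sleep(delay)
    # Record which process and which thread did the work.
    results[name] = (os.getpid(), threading.get_ident())


def run_downloads(names, delay=0.2):
    """Start one thread per name, wait for all of them, return (results, seconds)."""
    results = {}
    threads = [
        threading.Thread(target=fake_download, args=(name, delay, results), name=name)
        for name in names
    ]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread has finished
    return results, time.time() - start


if __name__ == "__main__":
    results, elapsed = run_downloads(["a", "b", "c"])
    for name, (pid, tid) in results.items():
        print(name, "pid:", pid, "thread id:", tid)
    print("elapsed:", round(elapsed, 2), "seconds")
```

Because the three waits overlap, the elapsed time should be close to a single delay rather than three, although the exact figure depends on the machine.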

Let's look at multithreading with the help of an example.

import threading
import time
import numpy as np

def BasicOperation():
    # square of a number
    def square(number):
        return number*number
    # cube of a number
    def cube(number):
        return number**3
    # nth power of a number
    def nth_power(number,power):
        return number**power
    # sum of the first n numbers (true division, so the result is a float)
    def sum_of_n_numbers(number):
        return number*(number+1)/2
    # using the functions to drive the program
    print("square of 5 is " , square(5))
    print("cube of 5 is " , cube(5))
    print("5 raise to power 2 is " , nth_power(5,2))
    print("sum of first 5 numbers is" , sum_of_n_numbers(5))

def DotProduct():
    A = np.arange(1000000).reshape(5000,200)
    B = np.arange(1000000).reshape(200,5000)
    Dot = np.dot(A,B)

if __name__ == "__main__":
    # without threading: run the two functions one after the other
    start = time.time()
    BasicOperation()
    Mid = time.time() - start
    print("Mid time taken : " , Mid , "seconds")
    DotProduct()
    end = time.time() - start
    print("Full time taken : " , end , "seconds")
    # with threading: run the two functions in separate threads
    start = time.time()
    Thread_1 = threading.Thread(target = BasicOperation, name = ' Basic Operation Thread ')
    Thread_2 = threading.Thread(target = DotProduct , name=' Dot Product Thread ')
    Thread_1.start()
    Thread_2.start()
    # wait for the first thread before taking the mid time
    Thread_1.join()
    Mid = time.time() - start
    print("Mid time taken : " , Mid , "seconds")
    # wait for the second thread before taking the full time
    Thread_2.join()
    end = time.time() - start
    print("Full time taken : " , end , "seconds")

## output //
square of 5 is 25
cube of 5 is 125
5 raise to power 2 is 25
sum of first 5 numbers is 15.0
Mid time taken : 0.0006113052368164062 seconds
Full time taken : square of 5 is 10.373110294342041 25seconds

cube of 5 is Mid time taken : 1250.0015938282012939453
5 raise to power 2 is seconds
25
sum of first 5 numbers is 15.0
Full time taken : 12.598262786865234 seconds
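The example above covers threading only. For Async IO, a minimal sketch could look like the following. It assumes asyncio.sleep is an acceptable stand-in for a real network call, and the coroutine names fake_request and main are hypothetical. Everything runs on one thread: each await gives the event loop a chance to switch to another coroutine.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Simulate a network call; `await` hands control back to the event loop."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main(delay=0.2):
    # gather() runs the coroutines concurrently on one thread and
    # returns their results in the order they were passed in.
    return await asyncio.gather(
        fake_request("first", delay),
        fake_request("second", delay),
        fake_request("third", delay),
    )


if __name__ == "__main__":
    start = time.time()
    print(asyncio.run(main()))
    print("elapsed:", round(time.time() - start, 2), "seconds")
```

As with threads, the three waits overlap, so the total time should be near one delay rather than the sum of all three.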

Summary

We use Python's multiprocessing module to achieve parallelism, whereas concurrency in Python is achieved with the threading and asyncio modules. A program running in parallel is also concurrent, but the reverse is not necessarily true.
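One way to see the difference for yourself is to run the same CPU-bound function through a thread pool and a process pool. The sketch below is an illustration under assumptions: count_primes and timed_map are hypothetical helpers, the worker count of 4 is arbitrary, and it uses ThreadPool from multiprocessing.pool so that both pools offer the same map interface. No timings are claimed here, since they depend on the machine; on CPython the thread version is not expected to beat a plain loop for this kind of work, while the process version can use several cores.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def count_primes(limit):
    """CPU-bound pure-Python work: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(pool_cls, func, inputs, workers=4):
    """Map func over inputs with the given pool class; return (results, seconds)."""
    start = time.time()
    with pool_cls(workers) as pool:
        results = pool.map(func, inputs)
    return results, time.time() - start


if __name__ == "__main__":
    inputs = [50_000] * 4
    for label, pool_cls in [("threads", ThreadPool), ("processes", Pool)]:
        results, seconds = timed_map(pool_cls, count_primes, inputs)
        print(label, results, round(seconds, 2), "seconds")
```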

That's it. Thank you for taking the time to read my post. I hope you liked it.