Parallelism and Concurrency in Python (Concepts + Code)

Date: 2019-10-03

Hi folks! I hope all you programming geeks are doing well. In this post, we will discuss concurrency and parallelism in Python. We will look at multithreading, multiprocessing, and asynchronous programming, and at how these concepts can speed up computation in Python. So, without wasting time, let's get started.

Parallelism

It means executing multiple tasks at literally the same instant, typically on separate CPU cores.

1. Multiprocessing: It means distributing your tasks over multiple CPU cores, with each task running in its own process (on Linux, type lscpu in a terminal to check the number of cores in your computer). For CPU-bound tasks (such as numerical computations), we can use Python's multiprocessing module. We simply create a Pool object, which offers a convenient way to parallelize the execution of a function across multiple input values. Let's look at an example:

    import multiprocessing
    import time

    import numpy as np


    def DotProduct(A):
        # A is a pair [left, right] of matrices to multiply.
        dot_product = np.dot(A[0], A[1])
        # Return only the shape: a returned value is pickled and sent back to
        # the parent process, which would be slow for a 5000 x 5000 result.
        return dot_product.shape


    List = [
        [np.arange(1000000).reshape(5000, 200), np.arange(1000000).reshape(200, 5000)],
        [np.arange(1000000).reshape(500, 2000), np.arange(1000000).reshape(2000, 500)],
        [np.arange(1000000).reshape(5000, 200), np.arange(1000000).reshape(200, 5000)],
    ]

    if __name__ == "__main__":
        # Executing the code without multiprocessing, i.e. on a single core.
        start = time.time()
        B = list(map(DotProduct, List))
        end = time.time() - start
        print("Full time taken : ", end, "seconds")

        # Executing the same code with the multiprocessing module on multiple cores.
        start = time.time()
        n_workers = multiprocessing.cpu_count()
        with multiprocessing.Pool(n_workers) as p:
            print(p.map(DotProduct, List))
        end = time.time() - start
        print("Full time taken : ", end, "seconds")

Full time taken : 23.593358993530273 seconds

Full time taken : 14.405884027481079 seconds
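If you want to try the Pool idea on something smaller first, here is a minimal sketch. It assumes a toy CPU-bound task; the names sum_of_squares, run_serial and run_parallel are made up for this illustration and are not part of the multiprocessing module. It also shows os.cpu_count(), which reports the core count from inside Python on systems where lscpu is not available.

```python
import multiprocessing
import os


def sum_of_squares(n):
    # Hypothetical CPU-bound toy task: 0*0 + 1*1 + ... + (n-1)*(n-1).
    return sum(i * i for i in range(n))


def run_serial(inputs):
    # Plain loop in a single process.
    return [sum_of_squares(n) for n in inputs]


def run_parallel(inputs, workers=None):
    # workers=None lets Pool choose a worker count based on the CPUs it detects.
    with multiprocessing.Pool(workers) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    print("Cores reported by the OS:", os.cpu_count())
    inputs = [200_000, 300_000, 400_000]
    # Both approaches must produce identical results; only the timing differs.
    assert run_serial(inputs) == run_parallel(inputs)
    print("Serial and parallel results match")
```

For tasks this small the pool may well be slower than the plain loop, because starting processes and passing data between them has its own cost. The pool tends to pay off only when each task does a substantial amount of work.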

Concurrency

It means making progress on multiple tasks over overlapping time periods; the tasks may be interleaved in any order and need not run at the same instant. Python is not the strongest language for concurrency, but it does a pretty decent job.

1. Multithreading: running multiple threads within a single process to perform tasks. Multithreading is really good for I/O-bound tasks (such as sending multiple requests to servers concurrently). Threads share the PID (process ID) of the process that created them; each thread has its own thread identifier and is launched with its start() method. Calling a thread's join() method blocks the caller until that thread finishes, which is useful when later lines of code depend on its work. In CPython, the GIL (Global Interpreter Lock) lets only one thread execute Python bytecode at a time, so threads rarely speed up CPU-bound pure-Python code, and the order in which their output appears can vary from run to run.

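2. Async IO: In Python, Async IO is a single-threaded, single-process design paradigm that achieves concurrency through cooperative multitasking: an event loop runs coroutines one at a time, and each coroutine hands control back whenever it awaits an I/O operation.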
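To make the I/O-bound case from item 1 concrete, here is a small sketch using a thread pool. It is only an illustration under some assumptions: fake_request and fetch_all are made-up names, the "URLs" are just labels, and time.sleep stands in for waiting on a real server. Each task also reports the process ID and thread identifier it ran under, so you can see that all threads live in the same process.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(url):
    # Hypothetical stand-in for a network call: sleeping mimics waiting on a server.
    time.sleep(0.2)
    return url, os.getpid(), threading.get_ident()


def fetch_all(urls):
    # One worker thread per URL so that all the waits can overlap.
    with ThreadPoolExecutor(max_workers=len(urls)) as executor:
        return list(executor.map(fake_request, urls))


if __name__ == "__main__":
    urls = ["site-a", "site-b", "site-c", "site-d"]  # made-up labels, not real URLs
    start = time.perf_counter()
    results = fetch_all(urls)
    elapsed = time.perf_counter() - start
    for url, pid, thread_id in results:
        print(url, "pid:", pid, "thread id:", thread_id)
    print("main pid:", os.getpid(), "elapsed:", round(elapsed, 2), "seconds")
```

Run one after another, the four simulated requests would need about 0.8 seconds. With threads the waits overlap, so the total should be much closer to the duration of a single request, and every task reports the same PID as the main program.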

Let's look at multithreading with the help of an example.

    import threading
    import time

    import numpy as np


    def BasicOperation():
        # square of a number
        def square(number):
            return number * number

        # cube of a number
        def cube(number):
            return number ** 3

        # nth power of a number
        def nth_power(number, power):
            return number ** power

        # sum of the first n natural numbers
        def sum_of_n_numbers(number):
            return number * (number + 1) / 2

        # using the functions to drive a program ...
        print("square of 5 is ", square(5))
        print("cube of 5 is ", cube(5))
        print("5 raise to power 2 is ", nth_power(5, 2))
        print("sum of first 5 numbers is", sum_of_n_numbers(5))


    def DotProduct():
        A = np.arange(1000000).reshape(5000, 200)
        B = np.arange(1000000).reshape(200, 5000)
        Dot = np.dot(A, B)


    if __name__ == "__main__":
        # without threading ...
        start = time.time()
        BasicOperation()
        Mid = time.time() - start
        print("Mid time taken : ", Mid, "seconds")
        DotProduct()
        end = time.time() - start
        print("Full time taken : ", end, "seconds")

        # with threading ...
        start = time.time()
        Thread_1 = threading.Thread(target=BasicOperation, name=' Basic Operation Thread ')
        Thread_2 = threading.Thread(target=DotProduct, name=' Dot Product Thread ')
        Thread_1.start()
        Thread_2.start()
        # join() blocks until the given thread has finished.
        Thread_1.join()
        Mid = time.time() - start
        print("Mid time taken : ", Mid, "seconds")
        Thread_2.join()
        end = time.time() - start
        print("Full time taken : ", end, "seconds")
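The Async IO approach described earlier can be sketched in a similar way. This is a minimal illustration, not a recipe: fake_io is a made-up coroutine, and asyncio.sleep stands in for a real non-blocking I/O call. Everything runs in one thread, and the event loop switches between coroutines whenever one of them reaches an await.

```python
import asyncio
import time


async def fake_io(name, delay):
    # Hypothetical I/O task: awaiting the sleep hands control back to the event loop.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather() schedules the three coroutines concurrently and returns
    # their results in the order they were passed in.
    results = await asyncio.gather(
        fake_io("A", 0.2),
        fake_io("B", 0.2),
        fake_io("C", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results)
    print("elapsed:", round(elapsed, 2), "seconds")
```

Since the three waits overlap, the total time should be close to a single 0.2 second delay rather than the sum of all three, even though no extra threads or processes are created.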

Summary

We use Python's multiprocessing module to achieve parallelism, whereas concurrency in Python is achieved with the threading and asyncio modules. A program that runs in parallel is also concurrent, but the reverse is not necessarily true.

That's it. Thank you for taking the time to read my post. I hope you liked it.

