1. Benchmark, benchmark, benchmark

What gets measured gets managed. Benchmarking sounds like a tedious process, but if you already have working code separated into functions, it can be as easy as adding a decorator to the function you are trying to profile. First off, let's install line_profiler so that we can measure the time spent on each line of code in our function:

    pip3 install line_profiler

This provides a decorator (@profile) that you can use to benchmark any function in your code line by line. As an example, let's say that we have the following code:

    #filename: test.py
    @profile
    def sum_of_lists(ls):
        '''Calculates the sum of an input list of lists'''
        s = 0
        for l in ls:
            for val in l:
                s += val
        return s

    #create a list of lists
    smallrange = list(range(10000))
    inlist = [smallrange, smallrange, smallrange, smallrange]

    #now sum them
    list_sum = sum_of_lists(inlist)
    print(list_sum)

This will profile the sum_of_lists function when called - notice the @profile decorator above the function definition.

Now we can profile our code by doing:

    kernprof -l -v test.py

The -l flag turns on line-by-line profiling and -v prints the results as soon as the script finishes. The results are also saved to test.py.lprof, which you can view again later with python3 -m line_profiler test.py.lprof.

Which gives us:

[Figure: line_profiler output for sum_of_lists]

The 5th column (% Time) shows the percentage of the runtime spent on each line. This points you to the section of your code that needs optimization the most, as this is where most of the runtime is spent.

Keep in mind that this benchmarking library has significant overhead, but it's perfect for finding weak points in your code and replacing them with something more efficient.

For running line_profiler inside Jupyter notebooks, check out the %lprun magic command.

2. Avoid loops when possible

In many cases, using operations like map, list comprehensions or NumPy's vectorized array operations (usually the fastest) in Python instead of loops can give you a significant performance boost without much work, as these operations are heavily optimized internally. Note that numpy.vectorize itself is a convenience wrapper around a loop, so the speed-up comes from NumPy's built-in array operations rather than from that function.
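The list-comprehension route mentioned above is never shown in this post, so here is a minimal sketch of what it could look like. The function names sum_of_lists_loop, sum_of_lists_comprehension and sum_of_lists_flat are made up for illustration, and the nested-loop baseline is repeated so the block runs on its own:

```python
# Hypothetical function names, used only for this illustration.

def sum_of_lists_loop(ls):
    '''Baseline: explicit nested loops'''
    s = 0
    for l in ls:
        for val in l:
            s += val
    return s

def sum_of_lists_comprehension(ls):
    '''Sums each inner list in a list comprehension, then sums the partial sums'''
    return sum([sum(l) for l in ls])

def sum_of_lists_flat(ls):
    '''Flattens the nested lists with a generator expression and sums the values'''
    return sum(val for l in ls for val in l)

#create a list of lists
smallrange = list(range(10000))
inlist = [smallrange, smallrange, smallrange, smallrange]

#all three versions should agree on the result
assert sum_of_lists_loop(inlist) == sum_of_lists_comprehension(inlist)
assert sum_of_lists_loop(inlist) == sum_of_lists_flat(inlist)
print(sum_of_lists_comprehension(inlist))
```

Which variant is fastest depends on your data and Python version, so it is worth timing them on your own inputs rather than assuming one always wins.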
Let's modify our previous example a bit by replacing the nested loops with map and sum:

    #filename: test_map.py
    def sum_of_lists_map(ls):
        '''Calculates the sum of an input list of lists'''
        return sum(map(sum, ls))

    #create a list of lists
    smallrange = list(range(10000))
    inlist = [smallrange, smallrange, smallrange, smallrange]

    #now sum them
    list_sum = sum_of_lists_map(inlist)
    print(list_sum)

Let's see how the new map version does compared to the original by timing them 1000 times:

[Figure: timing results for the loop and map versions]

The map version is over 6X faster than the original!

3. Compile your Python modules using Cython

If you don't want to modify your project at all but still want some performance gains for free, Cython is your friend. Although Cython is not a general-purpose Python-to-C compiler, it lets you compile your Python modules into shared object files (.so), which can be loaded by your main Python script.

For this, you will need to have Cython, as well as a C compiler, installed on your machine:

    pip3 install cython

If you are on Debian, you can install GCC by doing:

    sudo apt install gcc

Let's separate the starting example code into 2 files, named test_cython.py and test_module.pyx:

    #filename: test_module.pyx
    def sum_of_lists(ls):
        '''Calculates the sum of an input list of lists'''
        s = 0
        for l in ls:
            for val in l:
                s += val
        return s

Our main file has to import this function from the test_module.pyx file:

    #filename: test_cython.py
    from test_module import *

    #create a list of lists
    smallrange = list(range(10000))
    inlist = [smallrange, smallrange, smallrange, smallrange]

    #now sum them
    list_sum = sum_of_lists(inlist)
    print(list_sum)

Now let's define a setup.py file for compiling our module using Cython:

    #filename: setup.py
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        ext_modules = cythonize("test_module.pyx")
    )

Finally, it's time to compile our module:

    python3 setup.py build_ext --inplace

Now let's see how much better this version does compared to the original by timing them 1000 times:

[Figure: timing results for the original and Cython versions]

In this case Cython nets us an almost 2X speed-up compared to the original, but this will vary depending on the type of code you are trying to optimize.

If you are looking to take advantage of Cython inside Jupyter notebooks, there is a %%cython magic available which lets you compile your functions with minimal hassle.

Conclusion

These were 3 easy-to-implement tips to net you some extra performance. For more information about line_profiler and Cython in Jupyter, you can check out the %lprun and %%cython magics.
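If you want to repeat the 1000-run comparisons from this post yourself, the standard-library timeit module is one way to do it. The sketch below is only an assumption about how such a comparison could be set up, not necessarily how the numbers above were measured. The helper name time_versions is made up for illustration, and both functions are repeated so the block runs on its own:

```python
import timeit

def sum_of_lists(ls):
    '''Original version: explicit nested loops'''
    s = 0
    for l in ls:
        for val in l:
            s += val
    return s

def sum_of_lists_map(ls):
    '''Loop-free version using map and sum'''
    return sum(map(sum, ls))

def time_versions(ls, number=1000):
    '''Returns the total seconds taken by `number` calls of each version (hypothetical helper)'''
    loop_time = timeit.timeit(lambda: sum_of_lists(ls), number=number)
    map_time = timeit.timeit(lambda: sum_of_lists_map(ls), number=number)
    return loop_time, map_time

if __name__ == "__main__":
    #create a list of lists
    smallrange = list(range(10000))
    inlist = [smallrange, smallrange, smallrange, smallrange]

    #time both versions 1000 times
    loop_time, map_time = time_versions(inlist)
    print(f"loop: {loop_time:.3f}s  map: {map_time:.3f}s  ratio: {loop_time / map_time:.1f}x")
```

To time the Cython build the same way, import sum_of_lists from the compiled test_module instead of defining it in the script. Expect the exact ratios to differ between machines and Python versions.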