Profilers are tools designed to check how programs use various computing resources. They measure program characteristics (memory usage, duration of function calls, etc.) while the program is running and aggregate the collected data into a report that can aid program optimization. In this post we will build two simple profilers that measure function call time (just "profilers" below), each solving the problem in a slightly different manner. This post emerged from various notes I made while working on vprof, and I hope it will be useful for you.

Python (more precisely, CPython) is used for all code in this post, since it provides great instrumentation for writing profilers out of the box, but the concepts discussed here can easily be applied to other mainstream high-level programming languages.

We will consider two types of profilers: deterministic and statistical.

Deterministic profilers record the exact time of all function calls and returns, then process the collected data to determine the duration of function calls and other important parameters. While this method has greater accuracy, it adds significant overhead (especially to small functions), and programs run slower under profiling.

Statistical profilers periodically take snapshots of the running program and deduce where time is being spent. This method has less overhead than deterministic profiling and programs run at almost full speed, but accuracy is typically lower, since the collected profile is just a statistical approximation of where time was spent.

Here is the code we will be inspecting with our profilers. It generates N prime numbers whose sum of digits is even. While the code is far from algorithmically perfect, it is simple enough and fits our needs well.

## Deterministic profiler

Perhaps the simplest way to measure the execution time of our code is to use Python's built-in time.time() function. If we run it, we'll get something like this:

```
Total time is 2.436735 s
```

Looks good.
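A minimal sketch of the program under test and the time.time() harness. Only the names is_prime, sum_of_digits and get_n_primes come from the post's output; the function bodies and N = 5000 are assumptions:

```python
import time

def is_prime(n):
    """Check whether n is prime by trial division."""
    if n < 2:
        return False
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            return False
    return True

def sum_of_digits(n):
    """Sum the decimal digits of n."""
    return sum(int(digit) for digit in str(n))

def get_n_primes(n):
    """Return the first n primes whose sum of digits is even."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if is_prime(candidate) and sum_of_digits(candidate) % 2 == 0:
            primes.append(candidate)
        candidate += 1
    return primes

if __name__ == '__main__':
    start_time = time.time()
    get_n_primes(5000)  # N = 5000 is an assumption; pick any N you like
    print('Total time is %f s' % (time.time() - start_time))
```

The measurement simply brackets the whole run between two time.time() calls, which is why it can only report the total time.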
Now we know how long a call to get_n_primes takes. There is one problem, though: we'd like to see how long the sum_of_digits and is_prime calls take in order to determine the bottlenecks.

Let's improve our approach. Here we use a well-known Python technique: profiling decorators. For those who aren't familiar with Python decorators, @decorator is just a shortcut for func = decorator(func), which allows altering Python functions (here you can find a nice introduction to Python decorators).

Then we wrap our functions with the new profile decorator and run our code again. The output is:

```
Function is_prime run time is 0.000003 s
Function sum_of_digits run time is 0.000002 s
Function is_prime run time is 0.000001 s
Function sum_of_digits run time is 0.000001 s
Function is_prime run time is 0.000001 s
Function is_prime run time is 0.000001 s
...(many lines here)...
Total time is 2.567238 s
```

Not quite what was expected. We can see individual call times, but since there are so many function calls, we can't see the big picture. Also, you might notice that the total time is greater now, because we introduced profiler overhead: each wrapped function calls time.time() twice, does a subtraction and string formatting, and prints the resulting string to stdout.

Our profiling decorator can be improved. Printing the call time on each function call is not necessary if we want an overview. Instead of printing call times, let's store the function name, its caller's name and the cumulative call time after each function call. We replace the print call with accumulation of function call time inside a defaultdict, which lets us store cumulative call time per caller.

Wrapping the functions with the improved decorator and running the code again:

```
Total time is 2.637101 s
```

The stats are stored in the stats variable.
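Both decorators can be sketched as follows. This is a reconstruction in the spirit of the post: the exact format strings and the use of sys._getframe to discover the caller's name are assumptions.

```python
import functools
import sys
import time
from collections import defaultdict

def profile_print(func):
    """First attempt: print the run time of every single call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        print('Function %s run time is %f s'
              % (func.__name__, time.time() - start))
        return result
    return wrapper

# Improved version: function name -> {caller name -> cumulative run time}.
stats = defaultdict(lambda: defaultdict(float))

def profile(func):
    """Accumulate cumulative run time per (function, caller) pair."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # The name of the calling function is one frame up the stack.
        caller = sys._getframe(1).f_code.co_name
        start = time.time()
        result = func(*args, **kwargs)
        stats[func.__name__][caller] += time.time() - start
        return result
    return wrapper
```

With @profile applied to is_prime, sum_of_digits and get_n_primes, stats accumulates entries such as stats['is_prime']['get_n_primes'] instead of printing on every call.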
Let's print them:

```
{'get_n_primes': defaultdict(<type 'float'>, {'<module>': 2.7848520278930664}),
 'is_prime': defaultdict(<type 'float'>, {'get_n_primes': 2.7565882205963135}),
 'sum_of_digits': defaultdict(<type 'float'>, {'get_n_primes': 0.005716800689697266})}
```

Even from the raw stats we can see that calls to is_prime take most of the time.

Finally, let's summarize the collected stats and print a nice-looking report:

```
Total time: 2.637101 s
Function: is_prime, caller: get_n_primes, function run time: 2.611894 s, percentage: 99.044156 %
Function: get_n_primes, caller: <module>, function run time: 2.637097 s, percentage: 99.999855 %
Function: sum_of_digits, caller: get_n_primes, function run time: 0.005741 s, percentage: 0.217715 %
```

Now we see that most of the time is spent inside the is_prime function, and if we want to improve overall performance, is_prime is the first candidate for optimization.

The accuracy of our deterministic profiler is determined by two factors:

1. Timer accuracy. No measurement can be more accurate than the underlying timer (~1 msec).
2. Measuring time is not instant: there is a delay between the moment user code exits, the moment time is measured, and the moment user code resumes. Small functions that do not perform many calculations are especially prone to this kind of error, since it tends to accumulate. The error can be reduced by calibrating the profiler.

NOTE: Python provides the sys.settrace() hook for writing profilers, and I recommend using it for all serious use cases.

## Statistical profiler

As mentioned above, statistical profilers operate by sampling the call stack at regular intervals. To do this, we need to define a function that performs the call stack sampling and then schedule it to execute periodically. Python's signal module will help us with that.

Here is the code for this approach. The sample_stack callback is the signal handler.
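A sketch of the signal-based sampler. The busy_work workload and the 1 ms sampling interval are assumptions (the post profiles its prime-generating code instead), and this requires a Unix-like OS, since ITIMER_PROF and SIGPROF are not available on Windows:

```python
import signal
import time
from collections import defaultdict

# Call-stack signature, e.g. '<module>-get_primes-is_prime' -> sample count.
stats = defaultdict(int)

SAMPLE_INTERVAL = 0.001  # seconds of CPU time between samples (an assumption)

def sample_stack(signum, frame):
    """SIGPROF handler: record the current call stack, re-arm the timer."""
    stack = []
    while frame is not None:
        stack.append(frame.f_code.co_name)
        frame = frame.f_back
    # Frames are innermost first; reverse to get '<module>-...-leaf'.
    stats['-'.join(reversed(stack))] += 1
    # Schedule the next sample with a one-shot profiling timer.
    signal.setitimer(signal.ITIMER_PROF, SAMPLE_INTERVAL)

def busy_work(n):
    """A stand-in CPU-bound workload (not from the post)."""
    total = 0
    for i in range(n):
        total += i * i
    return total

signal.signal(signal.SIGPROF, sample_stack)
signal.setitimer(signal.ITIMER_PROF, SAMPLE_INTERVAL)
start_time = time.time()
busy_work(2000000)
signal.setitimer(signal.ITIMER_PROF, 0)       # stop sampling
signal.signal(signal.SIGPROF, signal.SIG_IGN)  # drop any pending SIGPROF
print('Total time is %f s' % (time.time() - start_time))
```

Because the handler re-arms a one-shot timer on every invocation, sampling continues only while the process is burning CPU time, which is exactly what ITIMER_PROF measures.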
It stores the current call stack signature along with the number of times that call stack was sampled, and it schedules a timer that emits a new signal. We use ITIMER_PROF because it decrements the underlying timer both when the process executes and when the system is executing on behalf of the process; ITIMER_PROF emits SIGPROF on expiration. The rest of the code just sets up our simple statistical profiler and measures the total run time.

After we run the code and print the stats variable, we'll see this:

```
{'<module>-get_primes': 312,
 '<module>-get_primes-is_prime': 876,
 '<module>-get_primes-sum_of_digits': 1}
```

We can see that get_primes caught 312 samples, is_prime 876 samples and sum_of_digits just 1 sample. Since the majority of samples belong to the is_prime function, we can assume that's where most of the time is spent. We'll use the total number of samples (312 + 876 + 1 = 1189) for the calculations below.

Let's summarize the statistical profiler stats and compare them to the deterministic profiler stats:

```
Total time: 2.493037 s
Total samples: 1189
Function: <module>-get_primes-sum_of_digits, sample count: 1, percentage: 0.084104 %
Function: <module>-get_primes-is_prime, sample count: 876, percentage: 73.675357 %
Function: <module>-get_primes, sample count: 1189, percentage: 100.000000 %
```

First of all, we can observe that the total run time is less than with the deterministic profiler, which can be attributed to the fact that statistical profilers have less overhead. Also, the percentage for is_prime (~73.7%) is lower than it was with the deterministic profiler (~99%). This happens because the data produced by a statistical profiler is an approximation, and if we run the profiler several times we can get quite diverse results (70-99%). This lower accuracy should be taken into account when making optimization decisions. Nevertheless, these results can still help us find the bottlenecks in the code.

## Summary

We have built simple deterministic and statistical profilers for Python.
Our profilers measure function call time, but the approaches shown here can be applied to measuring other program characteristics, such as memory usage. The concepts discussed are also applicable to other mainstream high-level programming languages. I hope this post helps you interpret profiling results thoughtfully and build your own profilers if necessary. The code from this post is here and here.