Programs may freeze for many reasons, such as software and hardware problems, limitations of particular technologies, concurrent programming failures, resource exhaustion, software bugs, network issues, and inefficient algorithm implementations, among others. The last one is often the root of the problem and may persist in the code for a long time. It may not be noticeable when the input data is small, but it becomes a defining problem with larger data.

We'll examine a case of such a freeze caused by the quadratic time complexity of some logic in Python. Let's look at code I've encountered in my work that caused a long-lasting freeze every time it was run. The inefficient implementation could have been avoided by taking into consideration the cost of basic operations in CPython.

Assume we've got a list of files we'd like to process, and a list of ignored files that should be excluded from the processing.

```python
files = [f'/tmp/temp{i}.txt' for i in range(29_000, 69_000)]
ignored_files = [f'/tmp/temp{i}.txt' for i in range(30_000)]
```

To illustrate the problem, I'm using list comprehensions to generate the lists. In practice, they were read from a file and passed to a method as parameters.

The following code was supposed to get us the files, excluding the ignored ones.

```python
files_to_be_processed = []

for file in files:
    for ignored in ignored_files:
        if file == ignored:
            break
    else:
        # no break occurred: the file is not in ignored_files
        files_to_be_processed.append(file)
```

The code was iterating over the list of `files`. For each file in `files`, it was iterating over `ignored_files` to find out whether the file should be ignored. Assuming that:

- n is the length of `files`
- m is the length of `ignored_files`

the code had O(m*n) time complexity for getting `files_to_be_processed`.

When we ran the code, it was freezing for about 40 seconds to finish processing, so we had to wait until the program came to life again. Because the freeze repeated on every run, development became tedious and slow; with frequent runs, the development process slowed down radically.

The running time is negligible when m << n. But when m is comparable to n, the running time becomes noticeable: the current implementation gives roughly O(n²) time complexity while it could be linear, O(n). Given that we had dozens of thousands of files, processing them took up to a minute in some cases. This happened when the overlap of `files` and `ignored_files` was small: because most items in `files` weren't in the ignored list, each one triggered an iteration over the whole `ignored_files`.

Such a simple task as excluding files from processing should not be so time-consuming, even for dozens of thousands of files. It turned out we had the freeze because of the costly implementation of excluding files in Python.

## Improving the time complexity

We can reduce the execution time and transform O(n²) into O(n) time complexity by taking into account the cost of various operations in CPython. According to the docs, checking whether an item is a member of a set takes O(1) in the average case.
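To get a feel for that difference, here is a minimal, self-contained sketch (my own illustration, not part of the original code; the names `items_list`, `items_set`, and `missing` are made up for the example) that times membership checks against a list and against a set built from the same items:

```python
import timeit

# Hypothetical illustration (not from the original code): compare the cost of
# membership checks on a list vs. a set of the same 30,000 file names.
setup = """
items_list = [f'/tmp/temp{i}.txt' for i in range(30_000)]
items_set = set(items_list)
missing = '/tmp/does-not-exist.txt'
"""

# Looking up an absent item is the worst case for the list:
# it has to compare against every element.
list_time = timeit.timeit('missing in items_list', setup=setup, number=1_000)
set_time = timeit.timeit('missing in items_set', setup=setup, number=1_000)

print(f'1,000 list lookups: {list_time:.4f} s')  # O(n) scan per lookup
print(f'1,000 set lookups:  {set_time:.4f} s')   # O(1) hash lookup on average
```

On a typical machine the set lookups come out orders of magnitude faster, because a list membership test scans the elements one by one while a set uses a hash table.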
So, we can use the `set` data structure for `ignored_files` to improve the time complexity:

```python
files_to_be_processed = []
unique_ignored_files = set(ignored_files)

for file in files:
    if file not in unique_ignored_files:
        files_to_be_processed.append(file)

# or using a list comprehension
unique_ignored_files = set(ignored_files)
files_to_be_processed = [f for f in files if f not in unique_ignored_files]
```

Here we're checking whether a file is in `unique_ignored_files` in O(1). Because we still have to iterate over all the files, we get O(n) overall time complexity. It's highly advantageous over the previous O(n²) solution, especially when the input data is large. As a result, it takes less than a second to prepare `files_to_be_processed` for several dozens of thousands of files. You may run the example yourself to see how the running time changes.

Alternatively, Python has the `difference` operator for sets, which could further simplify the code. It returns a new set with the elements of the left set that are not in the right set.

```python
files_to_be_processed = set(files) - set(ignored_files)
```

According to the documented time complexity of `difference`, it yields the same O(n). The resulting `files_to_be_processed` set can be converted to any particular data structure if needed.

It makes sense to pay attention to long-lasting freezes in the code, as they may originate from a sub-optimal algorithm implementation. The problem often reveals itself for large inputs, when the time complexity begins to matter. Having a grasp of the cost of common operations may help to detect and fix such issues.
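To see the running time changes mentioned above, here is a rough end-to-end benchmark I put together (a sketch, not code from the original project; the function names are mine and absolute timings will vary by machine) comparing the quadratic and the linear implementations on the same input:

```python
import time

# Same synthetic data as in the article: ~40,000 files, 30,000 ignored ones,
# with only a small overlap between them.
files = [f'/tmp/temp{i}.txt' for i in range(29_000, 69_000)]
ignored_files = [f'/tmp/temp{i}.txt' for i in range(30_000)]

def exclude_quadratic(files, ignored_files):
    # Original approach: scan ignored_files for every file -> O(m*n).
    result = []
    for file in files:
        for ignored in ignored_files:
            if file == ignored:
                break
        else:
            result.append(file)
    return result

def exclude_linear(files, ignored_files):
    # Improved approach: O(1) average membership checks on a set -> O(n + m).
    ignored = set(ignored_files)
    return [f for f in files if f not in ignored]

for implementation in (exclude_quadratic, exclude_linear):
    start = time.perf_counter()
    implementation(files, ignored_files)
    print(f'{implementation.__name__}: {time.perf_counter() - start:.2f} s')
```

With data sizes like these, the quadratic version is the one responsible for the roughly 40-second freeze described above, while the set-based version finishes in well under a second.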