Welcome back to Essential Algorithms, where I go over the many, many different algorithms every programmer should know and understand. Today's algorithm is the dead-simple, yet terribly inefficient, Bubble Sort.
Once again, we'll begin with an unsorted array that needs to be put into order:
arr = [5, 4, 3, 2, 1]
To put it very simply, the Bubble Sort simply passes over the array, comparing one number to the one next to it. If the bigger number is on the left, then it swaps their places. It goes through each element individually until it's done with the array, and repeats that as many times as it needs to in order to sort the array. While at first glance this seems very simple and like there isn't more to it, looking deeper reveals more about how we can implement and optimize this algorithm.
Let's start with the name, Bubble Sort. The reason it's called Bubble Sort is because it causes the largest unsorted number to "bubble" up to the top of the array on the first pass. Since it will always swap 2 numbers' places if the bigger one is on the left, and the biggest one will always be on the left until it's in place at the end, then logically the biggest number will be swapped with every number ahead of it during the first pass over and placed correctly. If we understand this, we can understand that after one pass over, the largest element is for sure in place; after the second pass over, the second largest element is in place, and so on.
This means the array with length n will be sorted in n passes over the array. This also means that we can determine how many numbers are accurately placed in the array by keeping track of how many times the array has been sorted. Knowing this, we also don't need to go over the entire array each time; each pass doesn't need to go over all n elements, but n-(the number of times we've gone over the array), because all of the elements after that are already sorted. So, the algorithm can be broken down into the following two steps:
Now, let's step through the algorithm with an example. Here's the array we start with:
[5, 4, 3, 2, 1]
We have 5 numbers, so we do the above first loop 5 times.
Python: arr = [5, 4, 3, 2, 1] def bubble_sort(unsorted): for i in range(len(unsorted)): # Do the second loop
Notice that in the first loop, we don't actually need to access the array itself, and the index of that loop won't point to anywhere in the array, but is one we use to keep track of how many times we've gone over the array, and thus, how many items are already sorted.
Setting up the next loop requires us to think for a moment about where we need to stop in the array before completing. If we have 5 items, the last item is located at
The second loop has us comparing the number at the current index to the one at the next index, so we have to stop before we get to the end (as trying to access the element at
that doesn't exist will cause problems).
In order to compare all of the items, we need to stop running the loop at the index before the last item we're going to touch. We also know the first time we run the loop (when i=0), none of the items will be definitely sorted, meaning we have to sweep through them all. The second time we run the loop, i=1, the very last item will be sorted, so we can ignore the very last item. Therefore, the inner loop needs to be run n-(1+i) times.
Python: arr = [5, 4, 3, 2, 1] def bubble_sort(unsorted): for i in range(len(unsorted)): for j in (range(len(unsorted) - (i+1))): # Do comparisons
Bubble Sort can be optimized further from here. I remember the first time I tried writing a sorting algorithm, without knowing any of them, and I wrote something kind of like a recursive pseudo Bubble Sort, that looked a lot like this:
def bubble_sort(unsorted): swapped = False for i in range(len(unsorted)-1): if unsorted[i] > unsorted[i+1]: unsorted[i],unsorted[i+1] = unsorted[i+1],unsorted[i] swapped = True if swapped == True: return bubble_sort(unsorted) else: return unsorted
This worked in a very similar way, just without the nested loops. It iterated over the entire array, until it was sorted. The advantage to doing this is you utilize the passes over the array to verify if the array is already sorted, and can stop while it's sorted, cutting down on the number of steps being taken. One of the disadvantage is that it doesn't keep track of how many times you've iterated through the array, thus it doesn't know how many items are in the right place and how many it can skip, which adds up and compounds over time to a large number of unnecessary executions. (Another disadvantage we won't dwell on too much is the recursive nature of an O(n^2) function, which will quickly grow the stack size too large and cause the program to hang or crash).
Optimizing it (although almost certainly an exercise in futility, as we'll observe below) can then be a simple question of integrating a missing advantage from one of the above functions into the other one. If we start with the function with the nested loops, we can keep track of each pass through, and if we stopped swapping numbers at any point, we'd know we have sorted the entire array and we can return what we have right now. Thus, our first completed (python) function can be optimized with this method to look like this:
arr = [5, 4, 3, 2, 1] def bubble_sort(unsorted): for i in range(len(unsorted)): swapped = False for j in (range(len(unsorted) - (i+1))): if unsorted[j] > unsorted[j+1]: unsorted[j],unsorted[j+1] = unsorted[j+1],unsorted[j] swapped = True if swapped == False: return unsorted return unsorted
Now, optimizing the recursive bubble-sort function merely requires us to keep track of how many times it's been called, which can be easily done by adding a variable that gets passed to the function, and incremented by
each time the function gets called.
Then, rather than using
, we can turn the subtracted
(1+the number of sorted items)
The second function, optimized with the advantages of the first, looks like this:
def bubble_sort(unsorted, last_sorted): swapped = False for i in range(len(unsorted)-(1+last_sorted)): if unsorted[i] > unsorted[i+1]: unsorted[i],unsorted[i+1] = unsorted[i+1],unsorted[i] swapped = True if swapped == True: return bubble_sort(unsorted, last_sorted+1) else: return unsorted bubble_sort(arr, 0)
So what is the Time Complexity of the Bubble Sort algorithm, and why have I been referencing how bad it is? Since it's usually described as being O(n^2), that means for an array of 8, we have to perform 64 steps, and for an array of 9, we'd have to perform 81 steps, right? Well, actually no. So if not, how many steps does it take?
Stack Overflow seems to have multiple similar-looking yet different equations as answers, why do some say n*(n-1), and others n*(n-1)/2 ? And, knowing this, why do we say it takes O(n^2) time?
Let's write out the most inefficient way of writing this algorithm, in order to start exploring the answers these questions.
def bad_bubble_sort(unsorted): for i in range(len(unsorted)): for j in range(len(unsorted)-1): if unsorted[j] > unsorted[j+1]: unsorted[j],unsorted[j+1] = unsorted[j+1],unsorted[j] return unsorted
This implementation does not check to see if the array is sorted, nor does this one know how many numbers have been sorted into place. This means it runs the maximum number of times, that it can, by design. The outer loop needs to sort n items, so it runs n times. In the inner loop, called by the outer one, each step of the loop takes the currently pointed to item in the array, and compares it to the next one.
This means all of them except for the last element come together to make a pair with the next item, which gives us n-1 pairs to compare in any array of length n. Thus, we can calculate the worst-implemented performance is actually accurately calculated as O(n*(n-1)). And this is close to n^2, but not quite exactly it.
So what about the first version that we made, that keeps track of the number of sorted items at the end? After all, the outer loop stays the same, but the inner loop gets smaller! To calculate that, we need to do a bit more math. For a number n, that is the length of the array, we know the first time we pass through the entire array we perform n-1 comparisons. The next time, we do n-2, then n-3...until we get down to 1.
So if we have 10 items, we do 9 comparisons, then 8, so that 9+8+7+6+5+4+3+2+1=45. This is a series sum can be described mathematically as (n*(n-1))/2. Plugging 10 into there, we get 10*(10-1)/2 = 10*9/2 = 90/2 = 45. (Here you can find a better explanation of why this is).
This means the implementation that keeps track of the number of sorted items will do half as many steps as the dumb one that doesn't, and therefore be twice as fast.
With these two functions, we can now compare the results they yield as the size of n grows larger, which is what Big-O notation is all about. In the terribly-implemented
function, with it's actual complexity calculated as n*(n-1), would need to do 90 comparisons if passed a 10-item array. If we passed it 20 items, it would need to do 380 comparisons. If we passed it 5 items, it would need to do 20 comparisons. In fact, it's easy to see that going from 5 items to 10, doubling the input, makes 4.5 times the output.
And when we go from 10 items to 20, we make 4.22 times the output. As we keep doubling the array size, we keep approaching creating 4 times the output, while always staying slightly above it.
function works the same way. If we passed that function, with it's (n*(n-1))/2 calculated complexity, we can plug in 5 and get the result 10. We can plug in 10 items, and get the result 45. Once again, jumping from 5 items to 10 increased the output 4.5 times. This means we can observe both functions increasing their output in the exact same way, proportionally.
So despite being different functions, their complexity grows at the exact same rate as the size of the input is increased.
We can also observe that as the size of the input for O(n*(n-1)) keeps being doubled, the size of the output tends to move closer towards quadrupling, or we can say as n increases by a factor of m, the output tends towards being m^2. We can easily verify this with large values: plugging in 1,000 gives us an output of 999,000. Tripling the input should create a nearly 9-fold increase in output, and plugging in 3,000 gives us 8,997,000, an increase of just over 9 times.
It is because of this, that we say Bubble Sort's complexity is O(n^2), when discussing it's performance. We aren't usually interested in actually calculating the number of steps it needs to take in order to perform the algorithm, but we want to be able to explain how the runtime grows, as the size of the input grows, and that is O(n^2).
While we've talked about average and worst-case complexities already, I'd like to briefly touch on the best-case complexity. The absolute best case for this is O(n), with an exact complexity of O(n-1). The reason for this lies in the optimization tactic that keeps track of whether or not we've performed any swaps. If we can get through the entire unsorted array without making a single swap, we know that we have a sorted array.
If passed an already-sorted array, and we've implemented this optimization, then only one pass has to be made over the array (making n-1 comparisons). Of course, how long that takes is determined entirely by the size of n, and thus we say it's best case performance is O(n). However, observing average and worst cases above, we can see that a bubble sort is not a good idea for lists of any appreciable size, and work best with lists expected to be already mostly sorted.
Bubble Sort is one of those basic fundamentals of algorithms that every programmer should know. While basic and inefficient, it's intuitive, and one of the basic methods of sorting most people will think of first when attempting to write a sorting algorithm without any prior study or experience. This simplicity and lack of real-world application in productions shouldn't be looked at as a reason not to learn it or study it.
In fact, that very simplicity makes it a perfect algorithm to demonstrate the basics of optimization, reasons to pick nested loops over recursion (especially when you can dynamically decrease the size of one loop), and is a great way to demonstrate time complexity in general and specific time complexities of individual algorithms.
Hopefully, if you've read this far, you've gained some insight and learned a thing or two. If you liked this, go ahead and share it! If you have any topics you want me to cover and explore like this, feel free to reach out and email me! And most importantly, keep on coding and practicing, until next time!