One of the cornerstones of the software engineering role is your understanding of and competency in using data structures and algorithms.
During software engineering interviews, recruiters and hiring managers will ask you to complete several coding questions to evaluate your knowledge of various data structures and algorithms.
One of the most fundamental algorithms software engineers should be experts in is a binary search.
Binary search is a versatile search algorithm that allows you to find an element within a sorted list.
The key is to take advantage of sorted data to divide the search space in half at every step.
Often, you can speed up naive algorithms from O(n) to O(log n), which is a considerable improvement.
The binary search implementation we'll review below might differ slightly from what's readily found online. Still, we think it's easier to remember and less prone to off-by-one errors.
Remember, however, that memorizing won't serve you when prepping for technical interviews. On the other hand, if you understand why binary search works, you'll be able to write good code that applies to various problems.
Let's take a look at a simple example in Python.
If you’re interested in seeing some examples of Binary Search in Java, check out this Hackernoon article here.
Imagine an interviewer gives you an array of 0
s and 1
s. They guaranteed that all the 0
s would appear before all 1
s.
So, arrays like [0, 0, 1, 1, 1]
, [1, 1, 1]
, and [0, 0, 0, 0]
are valid, while [1, 0], [0, 1, 0, 1]
are not.
They ask you to find the first index of the array that is a 1
. If no elements in the array are 1
, then return -1
.
In Python, the signature for this function looks like this:
def find_first_one(array: List[int]) -> int:
pass
A naive solution for this problem would look like this:
def find_first_one(array: List[int]) -> int:
for i in range(len(array)):
if array[i] == 1:
return i
return -1
The worst case runtime of the naive algorithm is O(n) since, in the worst case, we'll have n-1 0s and one 1, and we'll need to scan the entire array.
Nevertheless, we can speed this up with a binary search.
Tip: Take a moment to review "loop invariants," or properties of a given loop that are true before and after each loop iteration.*
We start with two-pointers, called lo
and hi
, which represents the lowest and highest number that a possible answer could be. Since the answer is always an index, we can initially set lo = 0
and hi = len(array)-1
.
For this problem, we'll let the loop invariant be array[lo] = 0
and array[hi] = 1
.
This may not be the case initially, but we explicitly check those cases first. For example, if array[0] = 1
, the answer is just 0
since 0
is the first index where 1
appears. If array[n-1] == 0
then the answer is -1
since there are no 1
s in the array.
In each iteration of binary search, we'll try to halve our search space. We take the midpoint between lo
and hi
(which can be computed as (lo+hi)/2)
, and depending on whether the value of the array at that index is 0
or 1
, we can update either lo
or hi
to the midpoint.
def find_first_one(array: List[int]) -> int:
n = len(array)
if array[0] == 1:
return 1 # first element is 1
if array[n-1] == 0:
return -1 # no 1s in this array
lo = 0
hi = n-1
while lo + 1 < hi: # loop invariant: array[lo] = 0, array[hi] = 1
mid = (lo + hi) // 2
if array[mid] == 0:
lo = mid
else:
hi = mid
return hi
it's easy to see why we set lo = mid
when array[mid] == 0
; that is, we want to ensure that that loop invariant is true.
Now for our stopping condition. Notice lo+1 < hi
. We did this because when lo+1 = hi
, lo
and hi
are adjacent indices in the array. Once we have that, we immediately see that hi
is the first index that is a 1
because it has a 0
immediately before it.
Another way to think of this is to find the place where the array switches from 0
to 1
(in contrast to finding the first 1
).
To compute the time complexity, we can note that we halve the distance between lo
and hi
at each iteration, which is O(log n).
Binary search can also be applied if there is only one point in time where a condition switches from false to true (or vice versa).
In the example above, the condition is "array is equal to one." In other problems, this condition can be different, but the binary search can be applied if the underlying array has some structure where some prefix is all False, and the suffix is all True (or vice versa).
Let's try it out with a few more examples.
Given a sorted array, find the index of the first element ≥ X (return -1 if no element ≥ X).
def lower_bound(array: List[int], X: int) -> int:
pass
Hint: consider loop invariants array[lo] < X
and array[hi] >= X
Here's what a full solution could look like:
def lower_bound(array: List[int], X: int) -> int:
n = len(array)
if array[n-1] < X:
return -1
if array[0] >= X:
return 0
lo = 0
hi = n-1
while lo+1 < hi: # array[lo] < X and array[hi] >= X.
mid = (lo + hi) / 2
if array[mid] < X:
lo = mid
else:
hi = mid
return hi
Note that this code is similar to find_first_one
from Example #1 except for the different boundary checks.
The time complexity of this solution is O(\log_2(n))
.
Binary search works on data structures beyond arrays. To demonstrate, let's compute the square root of a non-negative number (to a certain level of precision.) We'll use loop invariants similarly to what we did above.
PRECISION = 1e-6
def square_root(X: float) -> float:
lo = 0
hi = X
while (hi - lo) > PRECISION: # lo^2 < X, hi^2 >= X
mid = (lo + hi) / 2
if mid*mid < X:
lo = mid
else:
hi = mid
return (lo + hi) / 2
Here, we've changed the termination condition to take into account precision. We also know the answer will be between lo and hi, so in the end, we can approximate it as (lo+hi)/2
.
The runtime of this is O(log (X / PRECISION))
.
And that's that!
A binary search algorithm is a must-know for software engineering candidates. Therefore, it's best to count on these coding questions during your interviews.
Of course, it's possible to find elements within sorted arrays using naive algorithms, but that's not necessarily going to impress a hiring manager.
But you know what will? Efficiency and optimization are the names of the game for binary search.
Nevertheless, there will be plenty of other data structures and algorithms that you'll be tested on during a software engineer interview. To learn more, check out the Exponent Software Engineering Course."
Also published here.