If you prefer to follow along via my YouTube video, you can watch it here!
Data structures are used to store information in our code. We use different data structures to allow us to add, access, and delete data in the most efficient and applicable ways to solve our problems. In interviews, choosing the right data structure could allow you to reduce the time complexity of your solution, something that is greatly valued by companies. I know I have been stuck unable to optimize some of my interview questions just because I used the wrong data structure. Therefore, today we will be going over the must-know data structures for coding interviews.
If this post is helpful, please consider subscribing to my YouTube channel for more content like this! Also if you haven’t already, check out my last article on the 4 Most Commonly Asked Algorithms in Coding Interviews!
Disclaimer: This post is created based off my experience looking for internships and entry level (new grad) roles. At any point if I state you need to know a data structure, this means you should understand reasons to use it, time complexity to insert, lookup, and delete elements, tradeoffs between it and others, and know how to implement it in the language of your choice. If I mention Big-O at any point in the article, I am referring to the industry usage of the term (it is equivalent to Big-Theta in academia). Now that we have that out of the way, let’s get to the list!
Arrays are a basic data structure that is usually built into most languages. They are important for two main reasons. First, arrays are often the parameters to your coding questions in interviews. Therefore, having a strong understanding of them will ensure you can be successful in interviews. Second, you can implement most other data structures with an array, so you cannot understand how many other data structures are implemented without having a fundamental understanding of arrays.
Linked Lists are similar to arrays in that they are common in interviews and can be used to implement some other data structures. For this one, I highly recommend understanding how to implement this as well as adding methods to add, lookup, and delete values from the list. I have had interviews in which I have been asked to implement a Linked List from scratch, so I know it has a chance of coming up in your interviews!
Stack and queues are opposite data structures. Each can be implemented with an array, but are most commonly implemented with a Linked List. The stack is a Last In, First Out (LIFO) data structure which means that the last element you put in is the first one to come out. Think of this like you are building a tower of legos. You build the tower up with individual blocks and when you want to take it down, you take blocks off from the top until you remove all of them.
On the other hand, the queue is a First In, First Out data (FIFO) structure which means the first element you put in is the first one to come out. You can think of these as a line at a grocery store where the first person in line is the first person that makes it to the checkout counter. Knowing these differences allows you to choose which one to use in an interview.
Hash tables are implemented with an array and a hash function. The hash function takes in the element that you want to put in the hash table and spits out an index in the array to put that element. The great part about hash tables is that they are amortized time complexity (don’t know what this means? I have a definition below!) is O(1) for insertion, lookup, and deletion. So you might be thinking to yourself that we found the perfect data structure, but there are some downsides to using a hash table.
First, they lack any kind of structure or organization so if you want to do something like sort your elements or know the order in which they were inserted, hash tables do not offer this. Second, you need a bigger array than the number of elements in the hash table. This is because if you put a new element through your hash function and it spits out an index in your array where you already have another element, you get a collision. Collisions can reduce your time complexity, so we aim to minimize the number of collisions by having a bigger array than the number of elements in the table. This means that if you are constrained on space, a hash table might not be your best option.
Amortized Time Complexity is the idea that in most cases, we achieve a certain time complexity, but on rare occasions, we have to do a more expensive operation. Fortunately, this operation occurs so infrequently that our average time complexity is lower. So for hash tables, we say this is an amortized O(1) time complexity because we know that hash collisions are very infrequent and most of the time it will just take us O(1) time to perform all 3 operations.
Trees and graphs are built on the idea of having nodes with values and each node having references to its adjacent nodes. I recommend starting with an understanding of trees by familiarizing yourself with Binary Search Trees, n-ary trees, and Tries. With Binary Search Trees or BSTs, know a self-balancing BST. Some common ones include AVL Trees and Red-Black Trees. Once you’ve mastered trees, take that knowledge and learn more about graphs because they are very common in more complicated interview questions.
I hope you found this story informative! If you liked the post/video, feel free to leave a like on my video and subscribe to my YouTube account for more content like this!