It is surprising how algorithms are everywhere and how rarely we modern developers care about them. I was convinced this knowledge was only for the teams who develop language and framework cores. Who cares how Array.sort() works if it does its job and does it fast enough, right? I came with this mindset to Facebook Hacker Cup last year, hoping to win a t-shirt, and failed in the first round. I was disappointed, started thinking about how I could improve, and slowly began diving into the world of algorithms.

Long story short, let’s consider a simple example: we have an array of integers and the task is to find all non-unique elements in it.

Naive Solution

Just think for a second about how you would do it, then let me show you my first working solution:

    // Body of a function that receives the input array as `data`
    var length = data.length;
    var unique = [];   // values seen exactly once so far
    var distinct = []; // every value seen at least once

    for (var i = 0; i < length; i++) {
        var elem = data[i];
        var uniqueInd = unique.indexOf(elem);

        // A repeated value is no longer unique
        if (uniqueInd > -1) {
            unique.splice(uniqueInd, 1);
        }

        // First time we meet this value
        if (distinct.indexOf(elem) === -1) {
            distinct.push(elem);
            unique.push(elem);
        }
    }

    // Walk backwards so that splicing does not shift the indexes still to visit
    for (var i = length - 1; i > -1; i--) {
        if (unique.indexOf(data[i]) > -1) {
            data.splice(i, 1);
        }
    }

    return data;

At the time I wrote that code I didn’t know any functional Javascript methods like map or reduce, so it looked long. But it worked, and it made me super proud of myself. I spent less than a day on it. I posted it to https://js.checkio.org/mission/non-unique-elements/ to compare with other solutions.

Nice Solution

Once you submit your own code there, you can check out how other people approached the problem. And this beauty was at the top of the list:

    return data.filter(function(a){return data.indexOf(a) !== data.lastIndexOf(a)});

It’s short and readable. The idea behind it is brilliant: if the first and the last occurrence of an element are the same, then this element occurs only once, so it is unique. Just look at this and compare it to what I had. Do you see the gap? I was so impressed that I started doing other competitive programming quizzes, which at one point brought me to big O notation.

Fast Solution

To quickly understand why big O makes sense, look at this chart taken from Steven Skiena’s book:

[Chart: growth rates of common functions, from Steven Skiena’s book]

Imagine our algorithm has n squared complexity. According to the chart, it then takes ~16.7 min to process an array of 1 million numbers. Isn’t that too much? I would say it is, considering that at Facebook Hacker Cup you have only 5 minutes to upload results once you get the input data.

And is our nice solution still that nice when we look at it from the algorithm complexity point of view? Unfortunately not, and the same goes for my first solution. Let’s see why: indexOf and lastIndexOf take n operations in the worst case (say we have an array of all unique numbers) because they basically iterate through the whole data. And we execute them inside the filter method, which is a loop of n, so we do n operations n times, which is O(n * n). Oops. That is unacceptably slow.

One way to fix this is to use sorting.
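Before moving on to sorting, the quadratic growth described above can be made visible by counting comparisons directly. The sketch below is only an illustration: countingIndexOf, countingLastIndexOf and countComparisons are hypothetical helpers written for this example. They stand in for the built-in methods, whose real engine implementations may differ in detail.

```javascript
// Hypothetical helpers, not part of the original solutions: hand-written
// versions of indexOf and lastIndexOf that count every element they inspect.
function countingIndexOf(array, elem, counter) {
    for (var i = 0; i < array.length; i++) {
        counter.comparisons++;
        if (array[i] === elem) return i;
    }
    return -1;
}

function countingLastIndexOf(array, elem, counter) {
    for (var i = array.length - 1; i >= 0; i--) {
        counter.comparisons++;
        if (array[i] === elem) return i;
    }
    return -1;
}

// Builds an array of n unique numbers (the worst case), runs the
// filter-based check on it, and returns the total number of comparisons.
function countComparisons(n) {
    var data = [];
    for (var i = 0; i < n; i++) data.push(i);

    var counter = { comparisons: 0 };
    data.filter(function (a) {
        return countingIndexOf(data, a, counter) !== countingLastIndexOf(data, a, counter);
    });
    return counter.comparisons;
}

console.log(countComparisons(1000)); // 1001000
console.log(countComparisons(2000)); // 4002000
```

Doubling the input roughly quadruples the work, which is exactly what O(n * n) predicts.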
The basics of algorithm science tell us that we can sort with O(n * log(n)) complexity, which, if you check the chart above, is much faster than O(n * n) and takes reasonable time even for 1 billion records.

Once we sort the list, we can use another basic algorithmic trick and replace the indexOf and lastIndexOf methods with binary search. The idea behind binary search is that on each step we split the searchable array into 2 pieces and continue searching in only one of them. We simply check whether the number in the middle of the array is bigger or smaller than the one we are searching for, and then choose the left or the right part of the data accordingly, knowing that it is sorted. We repeat this until we have an array of only one element, which either is the element we are searching for or is not, and the latter means our element is not in the array at all.

So to complete this we need x steps, where 2 to the power of x is n (because on each iteration we divide the amount of data by 2), which gives us O(log(n)) complexity for each of indexOf and lastIndexOf. And because O(log(n)) + O(log(n)) = O(log(n)), we get O(log(n)) complexity altogether for the indexOf and lastIndexOf calls.

Finally, we have a loop of n (because we iterate through the whole array with filter) with O(log(n)) complexity inside each iteration, and that gives us O(n * log(n)) overall complexity. It is the same as the sorting complexity, so in the end we have O(n * log(n)) + O(n * log(n)) = O(n * log(n)) for the whole algorithm. Cool! It’s much faster. Now we can handle even 1 billion records in less than a minute.

And this is how we can do indexOf with O(log(n)) complexity on a sorted array:

    // Call as newIndexOf(sorted, elem, 0, sorted.length - 1)
    function newIndexOf(array, elem, startIndex, endIndex) {
        // Empty range: nothing to search
        if (startIndex > endIndex) {
            return -1;
        }

        // One element left: it is either the match or the element is absent
        if (startIndex === endIndex) {
            return array[startIndex] === elem ? startIndex : -1;
        }

        // Middle of the current range, not of the whole array
        var median = Math.floor((startIndex + endIndex) / 2);

        if (elem <= array[median]) {
            return newIndexOf(array, elem, startIndex, median);
        } else {
            return newIndexOf(array, elem, median + 1, endIndex);
        }
    }

The lastIndexOf implementation is almost the same, but instead of the ≤ condition we use the < condition. Two details change with it: the median has to be rounded up, and the right part has to start at the median itself (with the left part ending at median - 1), so that the last occurrence is never skipped.

An even faster solution

But if we are talking about a web server under high load, even ~30 seconds seems to be too much. Can we make it better? Every time I think about advanced array utilities, I think of the lodash.js library. And this is how they propose to do it:

    _.transform(_.countBy(array), function(result, count, value) {
        if (count > 1) result.push(value);
    }, []);

First, we use the countBy method to calculate the number of occurrences of each element. It gives us a JS object which has the elements as keys and their occurrence counts as values. Note that object keys are strings, so numeric elements come back as strings, and each repeated value is reported once. The countBy method works much like the JS reduce method inside and takes O(n) time because it basically iterates through the whole array once.

Then, with transform, we go through all the keys in the object and take those whose values (occurrences) are greater than 1, which again gives us O(n) time complexity because we cannot have more than n keys in this object.

And because O(n) + O(n) = O(n), we have O(n) complexity as the final result. And if you check the chart above, you’ll see that we can now process 1 billion records 30 times faster. Wow!

Conclusion

In a way, my short analysis confirms that frameworks and utility libraries like lodash.js usually take care of algorithmic time complexity, and we can trust them in that sense. At the same time, it is easy to fall into the trap of nice-looking, clever code which in the end significantly decreases our app’s performance.

So even though we may not create complex algorithms ourselves, it is good to know how to calculate the complexity of a given one, because then we can be confident we won’t get surprises when at some point we decide to scale our application to larger volumes.

If you’re really interested, I highly recommend Steven Skiena’s classes and book for learning, as well as the https://js.checkio.org/ and https://www.codewars.com/ platforms to get some practice.

And maybe see you at the next programming competition? ;)
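Coming back to the sorting approach from the Fast Solution section, here is a minimal sketch of how the pieces could fit together. It is one possible reading of the idea, not a reference implementation: newLastIndexOf and nonUniqueSorted are hypothetical names introduced for this example, and the sketch assumes the input is an array of plain numbers.

```javascript
// First occurrence of elem in the sorted array within [startIndex, endIndex], or -1.
function newIndexOf(array, elem, startIndex, endIndex) {
    if (startIndex > endIndex) return -1;
    if (startIndex === endIndex) {
        return array[startIndex] === elem ? startIndex : -1;
    }
    var median = Math.floor((startIndex + endIndex) / 2);
    if (elem <= array[median]) {
        return newIndexOf(array, elem, startIndex, median);
    }
    return newIndexOf(array, elem, median + 1, endIndex);
}

// Last occurrence (hypothetical counterpart): the median is rounded up and
// the right part keeps the median, so the last match is never skipped.
function newLastIndexOf(array, elem, startIndex, endIndex) {
    if (startIndex > endIndex) return -1;
    if (startIndex === endIndex) {
        return array[startIndex] === elem ? startIndex : -1;
    }
    var median = Math.ceil((startIndex + endIndex) / 2);
    if (elem < array[median]) {
        return newLastIndexOf(array, elem, startIndex, median - 1);
    }
    return newLastIndexOf(array, elem, median, endIndex);
}

// Sort a copy once, then answer every "first vs last index" question
// with two binary searches instead of two linear scans.
function nonUniqueSorted(data) {
    // The comparator is needed because the default sort compares values as strings.
    var sorted = data.slice().sort(function (a, b) { return a - b; });
    var last = sorted.length - 1;
    return data.filter(function (a) {
        return newIndexOf(sorted, a, 0, last) !== newLastIndexOf(sorted, a, 0, last);
    });
}

console.log(nonUniqueSorted([5, 1, 3, 1, 5, 2])); // [ 5, 1, 1, 5 ]
```

Filtering the original array while searching in the sorted copy keeps the result in the input order, the same as the indexOf and lastIndexOf one-liner does.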


Finding Non-Unique Elements in Javascript
