paint-brush
Counting lines 60% faster than wc, with Clojure + core.asyncby@atroche
2,380 reads
2,380 reads

Counting lines 60% faster than wc, with Clojure + core.async

by Alistair RocheMarch 2nd, 2017
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

Yesterday I noticed that <a href="https://medium.com/@atroche/file-processing-in-clojure-can-easily-become-cpu-bound-3c1c38669daf" target="_blank">my laptop’s SSD is so fast, any single core can’t keep up </a>— even with as simple a task as counting lines in a <a href="https://hackernoon.com/tagged/file" target="_blank">file</a>, with as battle-tested a tool as <a href="https://en.wikipedia.org/wiki/Wc_%28Unix%29" target="_blank">wc</a>.

Company Mentioned

Mention Thumbnail
featured image - Counting lines 60% faster than wc, with Clojure + core.async
Alistair Roche HackerNoon profile picture

Yesterday I noticed that my laptop’s SSD is so fast, any single core can’t keep up — even with as simple a task as counting lines in a file, with as battle-tested a tool as wc.

This evening, after taking on lots of advice from the lovely commenters on the reddit thread, I found a better way.

The key idea is to have one thread doing absolutely nothing except:

  • reading in data to a byte array, and
  • passing the array to another thread to do the “processing” (if you can glorify newline-counting with that word).

wc, the venerable unix tool, counts lines in the same thread (check out this hyper-stripped down implementation I sketched up yesterday from the BSD wc source).

Clojure’s default way of reading in lines (line-seq, which uses BufferedReader under the hood) performs UTF decoding on the same thread, which (as I noticed yesterday) is a real bottleneck when your SSD is pumping through well over a gigabyte of data per second.

And the one in my mid-2015 MBP is nowhere near top-of-the-line anymore:

The parallel version I coded up tonight takes an average of 6.9 seconds to finish, which is around a 60% improvement. Not only that, but it’s not using the ugly low-level threading and concurrency primitives of the JVM, but the fairly high-level abstraction of core.async.

(I also tested it against much smaller files and found a similar speedup).

Here’s the code:

It’s getting late here, so I’m not going to dive into it, but it just implements that core idea I mentioned above of keeping one thread dedicated to doing the system call and little else. Here’s the full namespace, requires and all. If you have questions or comments, jump over to the /r/clojure reddit thread.

It seems like this issue of CPU-bound IO is going to become more important as we move towards faster SSDs, more cores, but static clock speeds. I wonder how standard libraries (like Clojure’s and Java’s) will adapt to keep up with the pace?

[edit: check out the awesome unix-y ways of dealing with multi-core IO people came up with in the /r/programming thread]

[edit2: here’s the whole Clojure file, if you want to try running or modifying it. and if you want to generate a test file, try: “head -c 10G /dev/random > random.txt”]