paint-brush
Drawing 2.7 billion Points in 10sby@sdanisch
2,112 reads
2,112 reads

Drawing 2.7 billion Points in 10s

by Simon DanischOctober 31st, 2017
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

When I was reading <a href="https://www.slideshare.net/continuumio/visualizing-a-billion-points-w-bokeh-datashader" target="_blank">this blog post</a> about visualizing 2.7 billion points in a minute on a <a href="https://hackernoon.com/tagged/macbook" target="_blank">macbook</a>, I got curious how fast this would be in <a href="https://hackernoon.com/tagged/julia" target="_blank">Julia</a>.

Company Mentioned

Mention Thumbnail
featured image - Drawing 2.7 billion Points in 10s
Simon Danisch HackerNoon profile picture

When I was reading this blog post about visualizing 2.7 billion points in a minute on a macbook, I got curious how fast this would be in Julia.

Not only for fun, but also to integrate it into my plotting library (MakiE). Since I’ve been very happy at how quickly I was able to create a very fast solution, I decided to share my experience!

To get started, one can download the data from http://planet.osm.org/gps/

Parsing the CSV

I'm assuming that one wants to do lots of operations on the data, so the most sensible thing to do is to convert the 55gb CSV to a 22gb binary blob, which can get memory mapped and doesn't need parsing!

There wasn't any tool around that could do this out of the box, but with the help of TextParser.jl I was able to whip up a fairly fast custom solution - which also demonstrates how nicely one can extend existing Julia libraries.

The first problem is, that TextParser expects a string to parse the CSV. Since I can't just load the whole 55gb dataset into RAM, our first trick is to create a memory mapped string type in Julia:





struct MMapString{T}# using Type parameter to not write out the result type of mmap# but I also don't want to loose performancex::Tend








# Constructor from a pathMMapString(path::String) = MMapString(Mmap.mmap(open(path), Vector{UInt8}))# we only need to overload the iteration protocol for TextParse to work!Base.getindex(x::MMapString, i) = Char(x.x[i])Base.next(x::MMapString, i) = Char(x.x[i]), i + 1Base.done(x::MMapString, i) = i > length(x.x)Base.start(x::MMapString, i) = 1Base.length(x::MMapString) = length(x.x)

This is all one needs to satisfy the basic interface of most libraries expecting a string like type. Now we can write a function to read the 55gb dataset, parse it line by line and write it into a binary blob.





One of the problems that occurred was, that the function to write to an IO file stream was allocating quite a bit. Base.write creates a mutable container (Ref) for its argument to safely pass it to C. Even Base.unsafe_write that directly takes a pointer still has a check that allocates 128 bytes. Be ready for unsafe_unsafe_write, short uuwrite. This saves us 40 seconds and allocates 3.5 gb less memory (which ends up as ~1.2x faster)!This is actually a great example of how Julia currently fails, and how one can still claw back the performance.We just call the C library directly and use the almost deprecated syntax (`&`) to take a pointer from the stack allocated argument.in Julia 0.7 this is likely no issue and we should be able to just use `write`, since the compiler got a lot better at stack allocating and eliminating mutable containers.

By the way I was figuring this out with @time write(io, 1f0) which shows the allocations, and then using @which write(io, 1f0) (or alternatively @edit to jump directly into the editor) to figure out where the function is defined in Julia and where the allocations come from.


using TextParseusing TextParse: Record, Field, Numeric, tryparsenext







function uuwrite(io, ptr::T) where TInt(ccall(:ios_write,Csize_t, (Ptr{Void}, Ptr{T}, Csize_t),io.ios, &ptr, sizeof(T)))end




























function save2bin(path, n = typemax(Int))str = MMapString(path)# Descriptor of 'num,num\n' which is the format in the csvrec = Record((Field(Numeric(Float32), delim = ','),Field(Numeric(Float32), eoldelim = true)))# skip the header! Nice is, that Julia's findfirst works with# any iteratorpos = findfirst(str, '\n') + 1io = open(homedir()*"/gpspoints.bin", "w")i = 0while !done(str, pos) && i <= n# tryparsenext takes roughly 35ns so it's fairly optimalp_or_null, pos = tryparsenext(rec, str, pos, length(str))isnull(p_or_null) && continue # continue when parsing failsp = get(p_or_null)uuwrite(io, p)i += 1endclose(io)iendtic()@time save2bin(homedir() * "/Downloads/gps-points.csv");toc()

This rewrites the file and saves it as a binary in 190 seconds, which isn't too bad for such a simple implementation and an operation that only needs to be done one time.

Drawing the Image

Now we can memory map the file as a Vector{NTuple{2, Float32}} which holds the raw gps coordinates. We can then directly loop over the points and accumulate each position in a 2D array.










"""Transforms from longitude/latitude to pixel on screen, with `dims` refering tothe dimensions of the screen in pixel"""@inline function gps2pixel(point, dims)lon, lat = point[1], point[2]x = ((dims[1] / 180.0) * (90.0 + (lon / 10^7)))y = ((dims[2] / 360.0) * (180.0 - (lat / 10^7)))(x, y)end






































function to_image_inner!(img, points, start, stop)dims = size(img)for i = start:stop# don't check bounds when indexing into an array@inbounds beginp0 = gps2pixel(points[i], dims)idx = Int.(round.(p0))xidx, yidx = dims[1] - idx[1], dims[2] - idx[2]if checkbounds(Bool, img, xidx, yidx)# we should give each point a radius and then add# the coverage to each pixel for a smoother image# But this does well enough for this short example:img[xidx, yidx] += 0.001f0 # accumulateendendendendfunction to_image!(img, points, range)N = length(range)# Number of threads Julia has available# Use the environment variable JULIA_NUM_THREADS=8 to start# Julia with 8 threadsNT = Threads.nthreads()slices = floor(Int, N / NT)offset = minimum(range)Threads.@threads for i = 1:NT# @threads creates a closure, which sometimes# confuses type inference leading to slow code.# this is why it's a good practice to move the loop body# behind a function barrier# (https://docs.julialang.org/en/latest/manual/performance-tips/#kernal-functions-1)to_image_inner!(img, points,offset + ((i-1) * slices + 1), offset + (i * slices))endreturn imgend

We can use the above function in the following way to get a simple gray scale image:








img = zeros(Float32, 600, 960)io = open(homedir() * "/gpspoints.bin")points = Mmap.mmap(io, Vector{NTuple{2, Float32}})tic()to_image!(img, points, 1:length(points))toc()using FileIO, Colors # save it with FileIO as a Gray imageFileIO.save("gps.png", Gray.(clamp.(1f0 .- img, 0, 1)))

Et voila!

This took 10 seconds to draw 22gb of points on a normal desktop computer.

One thing to point out is the great dot call syntax you can see in the example:

Gray.(clamp.(1f0 .- img, 0, 1)))


This is actually a wonderful mechanism to apply a function per element on an array while fusing all operations. So the above will turn into a single for loopover img, clamping, inverting and converting it to a color in one go!

We can also use my new interactive plotting library to create a nice animation of the progress:

using MakiE



















resolution = (600, 960)scene = Scene(resolution = reverse(resolution))img = zeros(Float32, resolution)imviz = heatmap(img, colornorm = (0, 1))center!(scene)# slice `points` into multiple batches which we will updateslice = 10^7range = slice:slice:length(points)stop = 1vio = VideoStream(scene, homedir()*"/Desktop/", "gps")while truestart = stopstop = min(stop + slice, length(points))to_image!(img, points, start:stop)imviz[:heatmap] = img # update heatmap plotrecordframe!(vio) # add frame to streamstop == length(points) && breakendfinish(vio, "gif") # finish streaming and export as gif!

There are of course various improvements we could apply, but I’m very happy with this as a first result. It is also not clear how comparable this solution is to the initially mentioned blog post, since I haven’t run that on my computer yet. If you want to try this on your machine, check out the full code!