paint-brush
Speed Up Allocating ActiveRecord Objectsby@pavelevstigneev
1,634 reads
1,634 reads

Speed Up Allocating ActiveRecord Objects

by Pavel EvstigneevSeptember 9th, 2016
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

In one of projects at work we needed to generate big report and ActiveRecord was too slow and too memory hungry for this. In some edge cases it took more then an hour to generate report with 100,000–200,000 rows. Memory was more then 2GB, VM start swapping and all instances running on that VM become slow.
featured image - Speed Up Allocating ActiveRecord Objects
Pavel Evstigneev HackerNoon profile picture

In one of projects at work we needed to generate big report and ActiveRecord was too slow and too memory hungry for this. In some edge cases it took more then an hour to generate report with 100,000–200,000 rows. Memory was more then 2GB, VM start swapping and all instances running on that VM become slow.

First I try to implement find_in_batches that will use limit() and offset(), but it didn’t work well. Every next batch become slower because DB needed to find first N batches and give me only N-th, on next batch it will N+1 batches, and so on. Better for memory and worse for speed.

Next I start to think why exactly it takes so much time? Most of the time it spent creating ActiveRecord objects and accessing attributes. That was biggest memory consumer too. I come up with idea to make light version of ActiveRecord object, with as less allocations and better performance. I call it LightRecord.

LightRecord is extension for ActiveRecord, that can create read-only objects with as less allocations as possible. It simply use hash from database driver (mysql2) and define accessor methods for fields. No type casting, no dirty attributes (ActiveModel::Dirty), no copying data.

Benchmark for generating CSV report with 36634 rows and 18 columns. OS X 10.11.6, ruby 3.0.1, rails 5.0.0.1

  • ActiveRecord: 5.2 sec, 270 MB, 2.65 millions allocated objects
  • LightRecord: 1.41 sec, 120 MB, 0.93 millions allocated objects
  • LightRecord Streaming: 1.3 seconds, 108 MB, 0.93 millions allocated objects

As you can see there is also LightRecord With Streaming. Uses streaming feature from mysql2 client, which loads records one by one and can iterate them. The memory on a chart is growing straight, because it doesn’t load all the data at beginning.

You can see code example on LightRecord github page. We use it at one project in production and I’m happy with performance it gain