Squeezing Performance from SQLite: Insertions (with Room)

Written by JasonWyatt | Published 2017/05/21
Tech Story Tags: database | sqlite | androiddev | room | google-io-2017

TLDRvia the TL;DR App

At I/O last week, Google announced a new tool called Room which makes working with SQLite databases a lot easier, but it begs the question: How well does it perform?

This is a followup to my earlier insertions experimentation results. First we’ll look at some results from a couple of experiments comparing the relative performance of Room against the practices discussed in the previous post. Then, I’ll provide some of my own suggestions to the Googlers working on it.

Methodology

As with the original post, each experiment involved comparing two or more ways of inserting 1000, 10,000, and 100,000 randomly generated records into two different types of tables:

  • an extremely simple table (simple) consisting of a single integer column, and
  • a more real-world table (tracks) which describes a collection of music tracks where each contains an id, title, duration, lyrics, etc. The idcolumn is this table’s PRIMARY KEY.

I also performed every test on my 16GB Nexus 5X, running API level 25 (Nougat).

The results were calculated by tracking the time that had elapsed while all of the inserts for the current size iteration were running. The time taken to connect to the database before and to wipe the table after each iteration was excluded from the results.

I’ve updated the GitHub project from the original post with the new test scenarios using Room.

Important Notes

The version of Room used in these experiments was 1.0.0-alpha1.

Room requires all @Entity classes to define a @PrimaryKey value, and doesn’t let you fall back to SQLite’s default rowid primary key. This means that the Room-based simple table actually had two columns: an auto-generated id, and the actual value.

Caveat: Memory Issues

The documentation for Room suggests the following approach for batch inserts using the @Dao pattern:

@Daopublic interface MyDao {@Insertvoid insertAll(MyItem... items);}

This is a nice bit of sugar to make inserting a bunch of items into your database easy, but it does require that you instantiate objects for each item you’re going to insert.

When I tried to create 100,000 instances of my TrackInfo entity class, I quickly ran out memory. In order to make my performance tests at least run, I had to do some hackery by calling insertAll multiple times with smaller collections of TrackInfo objects so there was an opportunity for the garbage collector to tidy up between runs.

With this caveat in mind, let’s get to the numbers.

Experiments

After inspecting the code generated by Room, it was clear that the Googlers working on it know about the massive performance gain provided by wrapping multiple inserts/updates/deletes within transactions, so we’ll omit comparing Room against the naive case discussed in my original post.

Additionally: in the original post we found out that using SQLiteStatement directly was slightly faster than using db.execSQL(), and since Room’s generated code for insertions uses SQLiteStatement, let’s just compare apples to apples.

SQLiteStatement vs DAO insertAll

Room seems a little slower than manually using SQLiteStatement for the simple case. However, that could be chalked up to the fact that the Room experiment’s simple table needed to have a @PrimaryKey field in addition to the value.

Additionally, it looks like Room is smart and re-uses the SQLiteStatement object for multiple insertions. I found this bit of code in the source in EntityInsertionAdapter.java:

/*** Inserts the given entities into the database.** @param _entities Entities to insert*/_public final void insert(T[] entities) {final SupportSQLiteStatement stmt = acquire();try {for (T entity : entities) {bind(stmt, entity);stmt.executeInsert();}} finally {release(stmt);}}

That little bit of re-use makes a nice difference during insertions, and my original implementation of SQLiteStatement insertions didn’t do it. This made Room’s insertions of tracks records a bit faster.

Finally, I think the caveat above about requiring multiple calls to insertAll is what hurt Room on the large 100,000-item case.

Batched SQLiteStatement vs DAO insertAll

What happens when you compare a batched-insert SQLiteStatement against Room’s insertAll pattern?

Conclusion: Room for Improvement

Pun intended. #sorrynotsorry

I like Room, I think it has the potential to be a very powerful and useful library. It’ll take a lot of the nasty boilerplate out of using SQLite in Android applications, and speed up development greatly.

That said, I think Google still has a bit of work to do if they want to provide good support for situations where developers need to insert a lot of data into their SQLite databases using Room. Here are a few suggestions:

Let @Insert DAO Methods Accept a Generator

The varargs approach to batch inserts with @Dao is nice, but insufficient for large numbers of objects. I would propose allowing @Insert annotated methods accept an iterator or iterable which is called before each insertion to provide the next object. Something like the following:

@Daopublic interface MyDao {@Insertvoid insertAll(Iterator<MyItem> itemIterator);}

If you were inserting items from a JSON endpoint on your server, with this approach you could use an Iterator with your streaming JSON parser of choice and not have to overload the JVM heap with more objects than necessary.

Implement Batch Insertions

Google could also greatly speed up bulk inserts by smartly implementing batch insertions where they create an SQLiteStatement. Something like this:

INSERT INTO my_items (...) VALUES (...), (...), ...;

It might take a bit of clever code generation to determine how to avoid the situation where you can run into the max number of variables allowed by SQLite, as well as support earlier versions of SQLite so the app doesn’t crash because you can’t insert more than 500 values at a time.

I think it’s totally doable though.

Allow Using SQLite’s ROWID for Tables

When an explicit primary key is not defined for a table, SQLite uses ROWID internally. Currently, Room doesn’t allow for you to have a table without an explicit primary key. For most situations this is probably a good thing, but sometimes it can make sense to just use ROWID.

It would be nice if the @Entity annotation had a flag you could set to not require specifying a @PrimaryKey.

Other Posts in this Series


Published by HackerNoon on 2017/05/21