Welcome back! In the first part of this series, we looked at a very "low-level" way to interact with a relational database by sending it raw SQL strings and retrieving the results. We created a very simple Express application that we can use as an example and deployed it on Heroku with a Postgres database.
In this part, we're going to examine a few libraries which build on top of that foundation, adding layers of abstraction that let you read and manipulate database data in a more "JavaScript-like" way.
So far, we've seen how to send raw SQL to the database; statements like:
SELECT * FROM users
If we wanted to get comments by a specific user, say the user whose id is 1, we might use something like this:
SELECT * FROM comments WHERE user_id = 1
There's nothing wrong with interacting with your database in this way, but it can feel a bit cumbersome, and it requires you to keep "shifting gears" mentally. You think about your JavaScript code in one way, but when you need to think about the data in your database, you have to start thinking in SQL.
The purpose of the rest of the database libraries we're going to consider is to let you treat the data in your database more like the JavaScript objects and code in your application. "Under the hood" it's all SQL, but you won't need to care much about that unless you want to.
The first library we'll talk about is Knex. The documentation page describes Knex as a "query builder", and its purpose is to provide a layer of abstraction on top of raw SQL.
Installing Knex
Knex requires pg (or MySQL if you're using a MySQL database). We already have pg installed, so we just add knex like this:
npm install knex
git add package.json package-lock.json
git commit -m "Install the knex library"
Using Knex
The NPM page for knex describes it as a "query builder." Knex abstracts SQL to a certain extent, but not very far. We still need to understand the underlying SQL, but we can write it in a more JavaScript-like syntax, rather than having to slice and dice SQL strings. More importantly, we can use composition to chain knex terms in a way that is much more comfortable for JavaScript programmers.
So, when we used pg, we had this statement:
const result = await db.query('SELECT * FROM users');
When we use knex, we can write this:
const result = await db.select().from('users');
That might not look like much of a difference, but because of the way we can compose knex function calls, we can also do things like this:
const result = await db.select().from('users').limit(5).offset(8);
Here, we're getting 5 user records, starting at position 8 in the total set of all possible user records that match our query. You can see the full set of options available in the knex documentation.
Let's change our Express app to use knex to display some records from our database. First, in index.js replace these two lines:
const { Pool } = require('pg');
const conn = new Pool({ connectionString: process.env.DATABASE_URL });
...with this:
const db = require('knex')({
client: 'pg',
connection: process.env.DATABASE_URL
});
Then, change the implementation of
listUsers
to this:async function listUsers(req, res) {
try {
const result = await db.select().from('users').limit(5).offset(5);
const results = { 'users': (result) ? result : null};
res.render('pages/index', results );
} catch (err) {
console.error(err);
res.send("Error " + err);
}
}
Our views/pages/index.ejs file can stay exactly the same as before.
Commit, push, and deploy:
git add index.js
git commit -m "Use knex to display user data"
git push heroku master
When you refresh your browser, you should see user records 6 to 10 on the page.
You can see the code with these changes here.
Knex gives us a way of interacting with our database, which is much more like JavaScript, but we are still required to think in a database-centric way when we need to manipulate data.
The next three libraries we're going to talk about are all built on top of knex (which is built on top of pg or MySQL), and are examples of "object relational mapping" or ORM libraries. As the name implies, the purpose of an ORM library is to translate between data in a relational database and JavaScript objects in your application. What that means is, instead of thinking about records in the users table when you write your JavaScript code, you can think about user objects.
Objection
The first library we'll look at is objection, which is built on top of knex:
npm install objection
git add package.json package-lock.json
git commit -m "Install the objection library"
To highlight some of the utility of ORM libraries, we're going to alter our application to display users and their comments. Objection is built on top of knex, so in our index.js file, we have to leave the knex block in place, and add a bit more code (I'm putting everything in the index.js file, to keep things simple. In a real application you would break the code into separate files):
const { Model } = require('objection');
Model.knex(db);
This gives us a Model class from which we can inherit to define two classes User and Comment. We'll define Comment first:
class Comment extends Model {
static get tableName() {
return 'comments';
}
}
Our class needs to extend
Model
, and must implement a tableName
function to tell Objection which database table contains the underlying records.The
User
class is similar, but we're going to add some behavior to our class; a fullName
function, which we can use in our view template. We're also going to tell Objection that Users
have Comments
(i.e. a user owns zero or more comments). In ORM-speak, this is usually described as a "has many relationship" - i.e. a user has many comments. Here's what the code for this looks like:class User extends Model {
static get tableName() {
return 'users';
}
fullName() {
return `${this.first_name} ${this.last_name}`;
}
static get relationMappings() {
return {
comments: {
relation: Model.HasManyRelation,
modelClass: Comment,
join: {
from: 'users.id',
to: 'comments.user_id'
}
}
};
}
}
We define a relationMappings object inside our User class, with a single key of comments and a value telling Objection that this is a HasManyRelation on the Comment class, where the value of the id column of the users table matches the value of the user_id column of the comments table.
Now that we've defined our classes, let's use them in our code. Here's the new implementation of listUsers:
async function listUsers(req, res) {
try {
const users = await User.query().limit(5);
for (i in users) {
const user = users[i];
user.comments = await User.relatedQuery('comments').for(user.id);
}
const results = { 'users': users };
res.render('pages/index', results );
} catch (err) {
console.error(err);
res.send("Error " + err);
}
}
Here, we fetch 5 users, then for each of those users, we fetch their comments and assign those to the comments property of our user object. In views/pages/index.ejs we can display our users and their comments like this:
<h1>Users</h1>
<ul>
<% users.map((user) => { %>
<li><%= user.id %> - <%= user.fullName() %></li>
<ul>
<% user.comments.map((comment) => { %>
<li><%= comment.body %></li>
<% }); %>
</ul>
<% }); %>
</ul>
You can see the code with these changes here. As usual, commit, and push to deploy:
git add index.js views/pages/index.ejs
git commit -m "Show users and comments using Objection"
git push heroku master
Now, when you reload the page, you should see users and comments.
This code highlights a common problem that people run into when using ORM libraries, known as the "N+1 selects" problem.
This is the block of code we used to fetch users and their comments:
const users = await User.query().limit(5);
for (i in users) {
const user = users[i];
user.comments = await User.relatedQuery('comments').for(user.id);
}
This works, but is very inefficient. First, we fetch 5 users, then for each of those 5 users, we fetch their comments by making another call to the database. So, we've made 1 call for the users, then another 5 calls to get the comments. That's 5 calls plus the first 1, i.e. 5+1 or N+1 where N == 5. Hence the "N+1 selects" problem.
Unless your database queries are very complex, the time it takes to make a round-trip call to the database is a lot longer than the time it takes the database to calculate and transmit the results of your queries. So, to keep our applications fast, we need to minimize the number of calls to the database, as much as we can. The code above is the exact opposite of this.
For this trivial example, you won't notice any difference, but for real-world applications, the performance hit can be very serious, and cause lots of problems.
Fortunately, every ORM library has features that make it easy to avoid this problem (provided you know it's there). Here's how Objection does it; in index.js, replace the code block above with this:
const users = await User.query().limit(5).withGraphFetched('comments');
This one line does the same as the code block above, but in a much more database-efficient way. Objection will use the relationship information we provided to figure out how to fetch the user data and the comments data in a single query, and unpack and stitch together the results into the same object structure we built before using our for loop.
You can see the code with these changes here.
The next ORM library we'll look at is Bookshelf.
A lot of the differences between ORM libraries depend on what use-case the library is optimized for. In the case of Bookshelf, it's clearly designed to make it as easy as possible to render paginated lists of data, which is a very common use case in web applications.
Let's replace Objection with Bookshelf in our application:
npm uninstall objection
npm install bookshelf
git add package.jsonpackage-lock.json
git commit -m "Replace Objection with Bookshelf"
In index.js, replace these lines:
const { Model } = require('objection');
Model.knex(db);
...with this:
const bookshelf = require('bookshelf')(db);
Replace our class definitions with these:
const Comment = bookshelf.model('Comment', {
tableName: 'comments'
});
const User = bookshelf.model('User', {
tableName: 'users',
comments() {
// by default, bookshelf infers that the foreign key is 'user_id'
return this.hasMany('Comment');
}
});
Our
listUsers
function now looks like this:async function listUsers(req, res) {
try {
const models = await new User()
.fetchPage({
pageSize: 5,
page: 1,
withRelated: ['comments']
});
users = [];
models.map(m => {
const user = m.attributes;
const comments = m.related('comments');
user.comments = comments.map(c => c.attributes);
users.push(user);
});
const results = { 'users': users };
res.render('pages/index', results );
} catch (err) {
console.error(err);
res.send("Error " + err);
}
}
As you can see, the definition of the classes is a bit more concise, but Bookshelf needs a more verbose definition of how to unpack our data to build the users/comments structure. Notice also how the concept of pages of data is built directly into the library's API.
The code in views/pages/index.ejs is almost identical (I've removed the fullName function from the User class):
<h1>Users</h1>
<ul>
<% users.map((user) => { %>
<li><%= user.id %> - <%= user.first_name %> <%= user.last_name %></li>
<ul>
<% user.comments.map((comment) => { %>
<li><%= comment.body %></li>
<% }); %>
</ul>
<% }); %>
</ul
You can see the code with these changes here. And of course, once again commit and deploy.
git add index.js views/pages/index.ejs
git commit -m "Show users and comments using Bookshelf"
git push heroku master
The last library we're going to look at is Sequelize.
Sequelize is quite opinionated in the way it expects your data to be structured. If you follow its conventions, you can write less code and allow Sequelize to do a lot of the work for you. In particular, Sequelize has a lot of features to help create tables for you, and by default, it will create them following its own structure and naming conventions.
The database we've been using isn't structured in exactly the way Sequelize expects, so we need to add a bit of extra configuration to allow Sequelize to work with it.
To remove bookshelf and install sequelize, run these commands:
npm uninstall bookshelf
npm install sequelize
git add package.json package-lock.json
git commit -m "Replace Bookshelf with Sequelize"
In index.js, replace these lines:
const db = require('knex')({
client: 'pg',
connection: process.env.DATABASE_URL
});
const bookshelf = require('bookshelf')(db)
...with these:
const { Sequelize, DataTypes } = require('sequelize');
const sequelize = new Sequelize(process.env.DATABASE_URL);
Then, replace the class definitions for User and Comment with this code:
const User = sequelize.define('User', {
first_name: { type: DataTypes.STRING },
last_name: { type: DataTypes.STRING },
email: { type: DataTypes.STRING }
},
{
tableName: 'users',
timestamps: false
}
);
const Comment = sequelize.define('Comment', {
body: { type: DataTypes.STRING }
}, {
tableName: 'comments',
timestamps: false
}
);
User.hasMany(Comment, { foreignKey: 'user_id' });
Note that we passed two objects to
sequelize.define
. The first object defined our object's properties, and the second contained some metadata.In this case, we told Sequelize that the database table that underpins the User class is called 'users' (by default, Sequelize would infer that the table was called 'Users'), and the
timestamps: false
tells Sequelize that our table does not have timestamp columns, called createdAt and updatedAt. Sequelize makes it very easy to write code that will create tables for you, and when it does so, it adds these timestamp columns and sets their values accordingly when you write to the database. The sequelize documentation is excellent, and has more about this.
The
foreignKey: 'user_id'
that we pass to hasMany is another of the places where we have to tell Sequelize we're not following its conventions. It expects (and would create for us) a column called UserId to link comments to users.Inside our
listUsers
function, we can replace all of this code:const models = await new User()
.fetchPage({
pageSize: 5,
page: 1,
withRelated: ['comments']
});
users = [];
models.map(m => {
const user = m.attributes;
const comments = m.related('comments');
user.comments = comments.map(c => c.attributes);
users.push(user);
});
...with this single line:
const users = await User.findAll({ include: Comment });
We also have to make one tiny change in views/pages/index.ejs. Replace this line:
<% user.comments.map((comment) => { %>
...with this (the difference is user.Comments instead of user.comments):
<% user.Comments.map((comment) => { %>
You can see the code with these changes here.
git add index.js views/pages/index.ejs
git commit -m "Show users and comments using Sequelize"
git push heroku master
So there you have it — 5 ways you can query a relational database from your JavaScript application. We started with raw SQL via the pg/mysql library, then looked at the knex query builder, before moving on to three ORM libraries; objection, bookshelf, and sequelize.
So, which is the right choice for your application?
As always, it depends. There's nothing you can do with an ORM library that you can't do using a query builder or even raw SQL. Since everything works using SQL "under the hood". That's not surprising. Also, even if you decide to use an ORM, most libraries still give you a way to send raw SQL to your database. So what level of abstraction you use depends on the problem you're trying to solve, and what kind of code you want to focus your attention on.
If you're making heavy use of the features of your database, maybe with complex views or stored procedures, you might find it easier to use knex or raw SQL. But, for most web applications it's quite likely that an ORM library will make your life easier by abstracting away the table structure and allowing you to think about your application data as JavaScript objects.
If you've decided on ORM, the choice of which ORM library to use isn't always clear-cut. The landscape of JavaScript libraries is very dynamic. New libraries are created quite often, and older ones fall out of favor. Here are a few things to think about when making your choice:
Happy querying!