Before you go, check out these stories!

0
Hackernoon logoDatabase Schema Versioning and Migrations Made Simpler For High Speed CI/CD by@csouchet

Database Schema Versioning and Migrations Made Simpler For High Speed CI/CD

Author profile picture

@csouchetSouchet Céline

R&D Engineer at Bonitasoft

If you are a back-end developer, you are often faced with having to migrate your database schema with each new release.

The framework called Liquibase can make it easier for you when you need to upgrade your database schema.

In this article, I’ll explain how Liquibase can be used in a Java project, with Spring/Hibernate, to version the database schema.

How does Liquibase work?

Changelog

Liquibase works with Changelog files (the list of all the changes, in order, that need to execute to update the database).

There are 4 supported formats for these Changelogs: SQL, XML, YAML, and JSON. 

Changeset

A Changeset represents a single change to your database.

Each changeset is identified by “id” and “author” attributes, and by the directory and file name of the changelog file - which makes it possible to uniquely identify it and to be applied only once.

When the changelog is executed, the changesets defined in it will be executed one by one, in the order of definition.

Thus, if there is an error on a changeset, all the precedents will have already been applied. Liquibase will end its execution on an error. So, you will only have to correct the incorrect changeset in error, then relaunch Liquibase and continue your migration. (I will explain later how Liquibase finds from which changeset it should resume).

Change types

Each changeset contains one or more Change Types that describe a type of operation to apply to the database. 

Liquibase supports both raw SQL and Change Types (that generate SQL for supported databases). But, if you want to execute the same changelog in different database vendors, it’s better to use Change Types. Change Types also allow automatic rollback if there is an error.

Generally, there should only be one Change Type per changeset to avoid failed auto-commit statements that can leave the database in an unexpected state.

Rollback

Liquibase allows you to undo changes you have made to your database, either automatically (generated from a Change Type) or via custom rollback SQL.

You need to include a <rollback> clause, in each changeset, whenever a change doesn’t support out of box rollback (e.g. <sql>, <insert>, <update>, and other destructive modifications).

For example, the dropTable tag of Liquibase is a destructive modification, as it is not possible to add a table by retrieving deleted data. In these cases, the rollback tag cannot be automatic. You must therefore manage the rollback yourself in order to return to the previous version of your database.

Tracking tables

If your database does not already contain tracking tables, Liquibase will create 2 tables in your database when it executes: DATABASECHANGELOG and DATABASECHANGELOGLOCK.

1. DATABASECHANGELOG 

When Liquibase runs, it queries the DATABASECHANGELOG table for the changesets that are marked as executed, and then executes all changesets that have not yet been executed.

After each execution of a changeset, Liquibase logs it in the DATABASECHANGELOG table. A row corresponds to a changeset, identified by a unique combination of the “id,” “author,” and “filename” columns.

a. filename

The filename column can be:

  • an absolute path
  • a relative path depending on how the changelog was passed to Liquibase
  • the value of the logicalFilePath attribute of the changelog
  • the value of the logicalFilePath attribute of the changeset

b. md5sum

There is another important column: md5sum. As its name suggests, it checks that the content of a changeset has not been modified since its first execution.

You MUST NEVER modify an applied changeset !!!

2. DATABASECHANGELOGLOCK

To prevent conflicts from different Liquibase instances, there is another table: DATABASECHANGELOGLOCK.

Sometimes,the lock is not released, if a Liquibase instance doesn’t exit cleanly.  After you are sure that all Liquibase instances are stopped, you can clear out the current lock by executing the following SQL command:

UPDATE DATABASECHANGELOGLOCK SET LOCKED=0

Set up

Gradle dependency

Let's start with the Gradle dependency that we need to add into our build.gradle:

compile group: 'org.liquibase', name: 'liquibase-core', version: '3.9.0'

Changelogs tree

Now that you know the basics of Liquibase, I'm going to explain how to organize your changelogs/changesets.

If you want, you can use only one changelog, but after several versions, your file will become unreadable. It’s better practice to split your changesets into several files.

The best practice is to have 1 file per feature + 1 folder per version + 1 master changelog which aggregates all changelogs with an include or includeAll tag.

First, create the master changelog file in the folder src/main/resources.

To include the sub-changelogs, there are 2 ways: using  include or includeAll tags.

1. include

You must specify changelogs one by one, which are run in the order they are found.

So, if you use this option, you need to pay attention to the order of the changelog declaration, and not create a loop.

If you create a changelog loop (root.changelog.xml includes sub.changelog.xml which includes root.changelog.xml) you will get an infinite loop.

You can also create an intermediate master changelog for all versions, to avoid having to list all the changelogs in the root master.

  • master -> 0.0.0/master, 1.0.0/master, 1.0.1/master
  • 0.0.0/master -> 0.0.0/file1, 0.0.0/file2…
  • 1.0.0/master -> 1.0.0/…
  • ...

2. includeAll

This is similar to the Include tag, but instead of passing a specific changelog file to include, you specify a directory which includes all *.xml changelog files, and all *.sql files, as individual changes. 

All files that are found are run in alphabetical order. So you must use a naming convention for files such that they run in the correct order.

3. relativeToChangelogFile

This attribute of include and includeAll calculates the path of the included file (for the DATABASECHANGELOG table) relative to the changelog file containing the included file rather than to the classpath.

4. logicalFilePath

This attribute of changelog and changeset overrides the file name and path when creating the unique identifier of changesets. 

It is required when moving or renaming changelogs.

Also, it can be useful when many Liquibase instances use the same database, but the classpath of the changelog files they use are not the same.

For example, you can split your changelogs in different modules of your project to test your repositories/services independently. But you can also have final tests when all your modules/changelogs are packaged. If the classpath of your changelogs in the packaging are not the same as in the modules, Liquibase doesn’t recognize the changelogs as the same.

The best solution is to have different databases for the different tests, but sometimes, the company you work for will refuse to make another available because of cost.

  • Without logicalFilePath
  • With logicalFilePath

Advanced

You can find some interesting advanced features here, like changelog parameters, column tag, and preconditions tag.

1. Preconditions

If you want to apply a changelog/changeset only under certain conditions, you need to use the precondition tag. There are many kinds of preconditions. You can find them here.

If the precondition is not valid, you can either halt the update, skip a changeset, mark a changeset as run, or show a warning.

2. Liquibase Extension

If you don’t want to use the SQL, XML, YAML, and JSON formats for your changelog, you can use the Liquibase extension to create it in whatever format you like.

Integration

Liquibase easily integrates into your Java application, if you use Spring or Spring Boot.

Spring

If you want to execute the changes on the Spring application startup, use the following Spring Bean:

@Bean
public SpringLiquibase liquibase() {
    SpringLiquibase liquibase = new SpringLiquibase();
    liquibase.setchangelog("classpath:db/changelog/db.changelog-master.xml");
    liquibase.setDataSource(dataSource());
    return liquibase;
}
Note: db/changelog/db.changelog-master.xml must already exist in the folder src/main/resources.

Spring Boot 2

If you are using Spring Boot, there is no need to define a Bean for Liquibase.

All you need to do is change the value of spring.liquibase.change-log in the file src/main/resources/application.properties of Spring, and the Liquibase migrations will run automatically on the application startup:

spring.liquibase.change-log=classpath:/db/changelog/db.changelog-master.xml
Note: The default value is classpath:/db/changelog/db.changelog-master.yaml. You can find all the properties of Spring Boot for Liquibase, here.

Test

Unit test

For the unit tests, you don’t need to use Liquibase, as you don’t test the interaction with the database.

If you’re using Spring Boot, disable Liquibase & Spring Data Source, in this way: 

spring.liquibase.enabled=false
spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration, \
org.springframework.boot.autoconfigure.jdbc.DataSourceTransactionManagerAutoConfiguration, \
org.springframework.boot.autoconfigure.orm.jpa.HibernateJpaAutoConfiguration, \
org.springframework.boot.autoconfigure.data.jpa.JpaRepositoriesAutoConfiguration

Integration test

There are many ways to store your test data. You can:

  • Create a changelog for the test (stored in src/test/resources), and point it in the Spring configuration for testing, to store the data at Liquibase initialization. But at each modification of the database, you will have to modify the test changelogs.
  • Use Spring annotations & SQL scripts to store the data at Spring initialization. But as in the first solution, you need to modify the SQL scripts at each time the data model is modified.
  • Define all operations in each test, using the repositories (Liquibase loads the database schema at Spring initialization)
  • @ExtendWith(SpringExtension.class)
    public class AnimalRepositoryTest {
        @Autowired
        private AnimalRepository animalRepository;
        @Autowired
        private OwnerRepository ownerRepository;
        @BeforeEach
        public void setUp() {
            animalRepository.deleteAll();
            ownerRepository.deleteAll();
        }
        @Test
        public void findByName_should_return_the_right_animal() throws Exception {
            // Given
            final Owner owner = ownerRepository.save(new Owner("Céline"));
            final Animal animal = animalRepository.save(new Animal("Peluche", owner));
            // When
            final Animal result = animalRepository.findByName("Peluche");
            // Then
            assertEquals(animal, result);
        }
    }

This solution is more verbose, but it’s easier to read (and does not need to find all the test data in SQL/changelog files). It’s also easier to maintain, because if the data model changes, you’ll see that in your IDE or during the compilation phase of the tests.

Tools

Gradle Plugin

Instead of writing the changelog file manually, use the Liquibase Gradle plugin to generate one. Then you have just to split it.

1. Plugin Configuration

Add the Gradle plugin into the build.gradle:

id 'org.liquibase.gradle' version '2.0.2'

2. Generate a changelog from 1 existing database

Add the plugin configuration, for the generateChangeLog task, into the build.gradle:

liquibase {
    activities {
        main {
            changeLogFile changeLog
            url 'jdbc:postgresql://localhost:5432/bonita'
            username '%USER_NAME%'
            password '%PASSWORD%'
        }
    }
}

Then, execute the task:

./gradlew generateChangeLog

This can be useful during the development, if you want to let Hibernate generate the database schema, and then generate the changelog from the generated database.

3. Generate a changelog from the differences between 2 existing databases

Add the plugin configuration, for the diff task, into the build.gradle:

liquibase {
    activities {
        main {
            changeLogFile changeLog
            url 'jdbc:postgresql://localhost:5432/bonita'
            username '%USER_NAME%'
            password '%PASSWORD%'
            referenceUrl 'jdbc:postgresql://localhost:5432/bonita_7.5'
            referenceUsername '%USER_NAME%'
            referencePassword '%PASSWORD%'
        }
    }
}

Then, execute the task:

./gradlew diff

So you can compare the schema of a production database with the schema of a freshly installed blank database, if there is a problem during the migration of the production database.

4. Community commands

There are other Gradle tasks for this plugin that you can find here.

Liquibase Hibernate Plugin

If the application uses Hibernate, you can use this Gradle plugin to generate the changelog from your entities during the development.

1. Plugin Configuration

Add the Gradle plugin into the build.gradle:

compile group: 'org.liquibase.ext', name: 'liquibase-hibernate5', version: '3.10.0'

2. Generate a changelog from the differences between a database and persistence entities

Add the plugin configuration, for the diffChangeLog task, into the build.gradle:

liquibase {
    activities {
        main {
            changeLogFile changeLog
            url 'jdbc:postgresql://localhost:5432/bonita'
            username '%USER_NAME%'
            password '%PASSWORD%'
            referenceUrl 'hibernate:spring:com.example?dialect=org.hibernate.dialect.MySQL5Dialect&hibernate.physical_naming_strategy=org.springframework.boot.orm.jpa.hibernate.SpringPhysicalNamingStrategy&hibernate.implicit_naming_strategy=org.springframework.boot.orm.jpa.hibernate.SpringImplicitNamingStrategy'
        }
    }
}

Then, execute the task:

./gradlew diffChangeLog

Conclusion

In this tutorial, we’ve seen  several very useful features of Liquibase which enable you to evolve your database schema easily in a Java application.

Versioning your database with Liquibase means you can:

  • Refactor your code easily while having simple production releases (no need to manually execute a SQL statement suite or develop your own migration tool, as the updating of the database is done when launching the new version of your application)
  • Easily test the state of the database (locally or in a continuous integration)
  • Improve teamwork, because the changes are visible and easily applied by everyone

The implementation of all these examples can be found in my github project.

Links

Tags

The Noonification banner

Subscribe to get your daily round-up of top tech stories!