Think Before You Hibernate

I have often seen people add Hibernate to a project without really thinking about whether they need it. Some time later, when the service has grown, they begin to wonder whether it was a mistake.

Let’s think through the pros and cons of Hibernate up front, so that next time we can decide whether a new microservice really needs this dependency. Perhaps it makes sense to get by with plain Spring Data JDBC, without all that JPA complexity?

Bunch of non-obvious things

Hibernate may look like a pile of annotations that magically load everything from the database and save it back. It seems that all we need to do is place @ManyToMany, @OneToMany, and the other annotations correctly.

But dig a little deeper, and it turns out that many programmers cannot explain in a job interview exactly how their application works. For example, the code below never explicitly calls the save() method, yet the changes are saved to the database.

@Transactional
public void processSomething(long accId) {
    Account acc = accRepo.findById(accId).orElseThrow(RuntimeException::new);

    acc.setLastName("new name");
}

To me, this kind of code is dramatically harder to read. Code reviewers spend a lot of time figuring out what is going on, and the next time you have to fix a bug here, that will take a lot of time too.
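To make the hidden step concrete: Hibernate keeps a snapshot of every entity it loads and, when the session is flushed at commit, compares the current state with that snapshot and issues an UPDATE for whatever changed. The toy model below imitates only that idea in plain Java. It is not Hibernate's real implementation, and ToySession, its find() signature, and the single-field Account are hypothetical names invented for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model of dirty checking; NOT Hibernate's real implementation.
class DirtyCheckingSketch {

    // Hypothetical entity with a single mutable field.
    static class Account {
        final long id;
        String lastName;

        Account(long id, String lastName) {
            this.id = id;
            this.lastName = lastName;
        }
    }

    // Hypothetical stand-in for a persistence context.
    static class ToySession {
        private final Map<Long, Account> managed = new HashMap<>();
        private final Map<Long, String> snapshots = new HashMap<>();
        final List<String> executedSql = new ArrayList<>();

        // "Loads" an entity and remembers the state it was loaded with.
        Account find(long id, String lastNameInDb) {
            Account acc = new Account(id, lastNameInDb);
            managed.put(id, acc);
            snapshots.put(id, lastNameInDb);
            return acc;
        }

        // Called at commit: compare every managed entity with its snapshot.
        void flush() {
            for (Account acc : managed.values()) {
                if (!Objects.equals(acc.lastName, snapshots.get(acc.id))) {
                    executedSql.add("UPDATE account SET last_name = '"
                            + acc.lastName + "' WHERE id = " + acc.id);
                    snapshots.put(acc.id, acc.lastName);
                }
            }
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        Account acc = session.find(1L, "old name");

        acc.lastName = "new name"; // no save() call anywhere
        session.flush();           // what a transaction commit would trigger

        System.out.println(session.executedSql);
    }
}
```

The caller never asks for the update; it falls out of the comparison at flush time, which is exactly why the original method is hard to review.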

Lazy fetching issues

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "company_id")
private Company company;

Another common pattern is relying on lazy fetching inside transactions. It is tempting to add a lazy field to the entity and simply dereference it within a transaction to get the data.

Then one more field is added, and then another. Most likely we need only one or two fields from each of several objects, yet the application ends up sending 5–10 queries to the database and loading every object in full. A single SELECT that requests only the necessary data would do instead.
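As a rough illustration of the single-select alternative, here is a sketch that uses plain JDBC from the JDK. The schema is an assumption made for the example: only the company_id column comes from the mapping above, while the table names, the last_name and name columns, and the AccountSummary class are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

class AccountSummaryQuery {

    // Assumed schema: account(id, last_name, company_id), company(id, name).
    // An inner join is used, so an account without a company yields no row.
    static final String SQL =
        "SELECT a.last_name, c.name AS company_name "
      + "FROM account a JOIN company c ON c.id = a.company_id "
      + "WHERE a.id = ?";

    // Hypothetical read model holding only the fields the caller needs.
    static final class AccountSummary {
        final String lastName;
        final String companyName;

        AccountSummary(String lastName, String companyName) {
            this.lastName = lastName;
            this.companyName = companyName;
        }
    }

    // One round trip to the database instead of one per lazy association.
    static Optional<AccountSummary> find(Connection conn, long accountId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setLong(1, accountId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return Optional.empty();
                }
                return Optional.of(new AccountSummary(
                    rs.getString("last_name"),
                    rs.getString("company_name")));
            }
        }
    }
}
```

The query is visible in one place, and a reviewer can see both what is read and how many statements are executed.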

If the fields are accessed outside the service (and therefore outside the transaction), you end up with ridiculous constructions like this one, which say hello to code reviewers:

@Transactional(readOnly = true)
public Account getAccountWithCompany(long id) {
    Account acc = accRepo.findById(id).orElseThrow(RuntimeException::new);

    // The result is discarded; the call only forces the lazy proxy to load inside the transaction.
    acc.getCompany().getName();

    return acc;
}

It would seem that eager fetching or @EntityGraph could be used instead. But eager fetching affects every other code path that loads the entity, and @EntityGraph requires a separate query method (and if we are writing one anyway, do we need Hibernate at all?).

Batch inserts and the N+1 selects problem

Writing code that does batch inserts with Hibernate is not an easy task. Don’t forget to:

1. Add a property to application.yml

spring.jpa.properties.hibernate.jdbc.batch_size: 50

2. Create a sequence increasing by the batch size

CREATE SEQUENCE account_id_seq START 1 INCREMENT BY 50;

3. Set up the sequence generator to use several sequence values at once. Otherwise, Hibernate will call the sequence N times.

@Entity
public class Account {
    @Id
    @GenericGenerator(
        name = "account_id_generator",
        strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
        parameters = {
            @Parameter(name = "sequence_name", value = "account_id_seq"),
            @Parameter(name = "increment_size", value = "50"),
            @Parameter(name = "optimizer", value = "pooled-lo")
        })
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "account_id_generator")
    private Long id;
    
    ...

People often forget about the sequence generator, and this completely negates the batch insert optimization.
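To see why the generator settings matter, consider how a pooled-lo style allocation behaves: each value fetched from the sequence is treated as the low end of a block, and the following IDs are handed out in memory. The sketch below is a simplified model in plain Java under that assumption, not Hibernate's optimizer code. ToySequence and PooledLoAllocator are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of pooled-lo allocation; NOT Hibernate's optimizer code.
class PooledLoSketch {

    // Stand-in for a database sequence created with START 1 INCREMENT BY 50.
    static class ToySequence {
        private long next = 1;
        private final int increment;
        int calls = 0; // how many times the "database" was asked

        ToySequence(int increment) {
            this.increment = increment;
        }

        long nextVal() {
            calls++;
            long value = next;
            next += increment;
            return value;
        }
    }

    // Hands out IDs from the block [lo, lo + incrementSize) fetched in one call.
    static class PooledLoAllocator {
        private final ToySequence sequence;
        private final int incrementSize;
        private long lo;
        private int used;

        PooledLoAllocator(ToySequence sequence, int incrementSize) {
            this.sequence = sequence;
            this.incrementSize = incrementSize;
            this.used = incrementSize; // forces a sequence call on first use
        }

        long nextId() {
            if (used == incrementSize) {
                lo = sequence.nextVal(); // one round trip per block
                used = 0;
            }
            return lo + used++;
        }
    }

    public static void main(String[] args) {
        ToySequence seq = new ToySequence(50);
        PooledLoAllocator ids = new PooledLoAllocator(seq, 50);

        List<Long> generated = new ArrayList<>();
        for (int i = 0; i < 120; i++) {
            generated.add(ids.nextId());
        }

        System.out.println("first id: " + generated.get(0));
        System.out.println("last id: " + generated.get(119));
        System.out.println("sequence calls: " + seq.calls);
    }
}
```

In this model, 120 IDs cost three sequence calls instead of 120. Without a block allocator, every inserted row needs its own round trip just to obtain an ID.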

What’s really handy with Hibernate is the cascading batch insert, provided everything is configured well. Since Hibernate requests the IDs first, it can set the foreign keys when cascading the save to related entities. You can do the same without Hibernate by following the same scenario.

Don’t forget to use @BatchSize to batch the selects for entities associated through @ManyToMany or @OneToMany. Otherwise, N+1 queries will be executed.

If you nevertheless decide to use Hibernate, I recommend using the QuickPerf library in your tests to verify exactly how many queries are sent to the database.

@Test
@ExpectSelect(2)
public void shouldSelectTwoTimes() {
    ...
}

2nd level cache

Finally, it is worth mentioning Hibernate’s second-level cache. If you use the standard Ehcache implementation, then once the application is scaled out, different instances will hold different data in their caches.

As a result, the response of the service may depend on which instance the request arrived at.
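The effect is easy to reproduce with a toy model. The sketch below has nothing to do with Ehcache's actual API: ServiceInstance is a hypothetical class whose in-process HashMap stands in for a local cache, and a shared map plays the role of the database.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of per-instance caches; not related to Ehcache's real API.
class LocalCacheSketch {

    // Shared "database" that both instances talk to.
    static final Map<Long, String> database = new HashMap<>();

    // Hypothetical service instance with its own in-process cache.
    static class ServiceInstance {
        private final Map<Long, String> localCache = new HashMap<>();

        String getLastName(long id) {
            // Read through the local cache, falling back to the database.
            return localCache.computeIfAbsent(id, database::get);
        }

        void updateLastName(long id, String lastName) {
            database.put(id, lastName);
            localCache.put(id, lastName); // only this instance's cache is refreshed
        }
    }

    public static void main(String[] args) {
        database.put(1L, "old name");
        ServiceInstance a = new ServiceInstance();
        ServiceInstance b = new ServiceInstance();

        b.getLastName(1L);                // instance B caches "old name"
        a.updateLastName(1L, "new name"); // the write goes through instance A

        System.out.println("A: " + a.getLastName(1L)); // prints "A: new name"
        System.out.println("B: " + b.getLastName(1L)); // prints "B: old name"
    }
}
```

Until instance B's entry expires or is evicted, a load balancer decides which of the two answers the client sees.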

To avoid this, it’s probably best to start using Spring Cache right away. Then, when scaling, it will be enough to connect a distributed implementation (Redis, Hazelcast, Apache Ignite, etc.).

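Let me also remind you that even with Hibernate’s L2 cache, a database connection is still acquired for each request. The session metrics below show one JDBC connection acquired for a single L2 cache hit, with no statements executed.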

2021-07-26 20:20:44.479  INFO 55184 --- [           main] i.StatisticalLoggingSessionEventListener : Session Metrics {
    4125 nanoseconds spent acquiring 1 JDBC connections;
    0 nanoseconds spent releasing 0 JDBC connections;
    0 nanoseconds spent preparing 0 JDBC statements;
    0 nanoseconds spent executing 0 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    12333 nanoseconds spent performing 1 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Conclusion

I’m not trying to say that you should never use Hibernate. It can be useful in certain scenarios. But you should always think this through in the early stages of a project so that you don’t regret it later.
