When writing complex data processes, we often find hash tables very useful. But when you decide to use your object as a key, it’s easy to implement it the wrong way.
A little disclaimer: this article requires some specific knowledge of Java development, and I can’t cover all this stuff in this article. Being familiar with these is highly appreciated:
HashMap.java
equals()
/ hashCode()
best practicesIt consists of key->val entries
→ that are in buckets
→ that are in array
. One bucket includes many entries. Here’s an example of what it looks like:
When the algorithm searches for a suitable key:
hashCode()
. Roughly speaking, it takes hash, simplifies it in order not to go out of bounds of the array of buckets, and then tries to take the bucket by this index - bucket = table[hash & length]
,key.equals()
If you have a Number
or a String
as a key, you have nothing to worry about. However, sometimes you need your object as a key, for example, for future processing or simplifying the logic, or to name it.
With complex keys, when it comes to code, the first thing we do is override Equals and HashCode
because it’s necessary to make this data structure work.
But are your keys immutable?
What will happen to this search algorithm if the key's hash is changed sometime after it has been used in a map? We can say goodbye to an entry like that. Most likely, we’ll never see it again.
Once, I spent hours trying to fix a bug like this until I debugged it and found out that my key’s field was modified deep inside a business logic by another developer.
It seems like a common problem: don’t modify it, and it will be okay. But it’s not. It’s about the way we write our objects.
Throughout my experience, most objects were made using Lombok
or IDE boilerplate. It’s easy to use and saves plenty of time. One simple action generates Constructors
, Getters / Setters
, overridesequals
, and hashCode
. But those objects are mutable.
Let’s have a look at the example below. The user has userName
and a bunch of roles
. Each role has a name and some permissions
, which this role can do.
User.java
@Data
@AllArgsConstructor
public class User {
private String userName;
private Set<Role> roles;
}
Role.java
@Data
@AllArgsConstructor
public class Role {
private String name;
private Set<String> permissions;
}
It seems impossible to make this mistake - modifying a key of HashMap. However, one day it could happen. For example, a user’s role gets new permission throughout a business logic. After adding it, hashCode
will change, and the user’s data is gone.
Here’s a demonstration of that:
@Test
public void cantFindExistingDataByUser() {
// given
final var map = new HashMap<User, String>();
final var role = new Role("admin", new HashSet<>());
final var key = new User("name", Set.of(role));
map.put(key, "some data");
// when
final var data = map.get(key);
// Here we modify role's permissions, it can happen somewhere far away where we don't remember that User is a key in HashMap
role.getPermissions().add("new permission");
final var dataAfterSomeTime = map.get(key);
// then
assertThat(data).isEqualTo("some data");
assertThat(dataAfterSomeTime).isNull();
}
As you can see, it’s effortless to break our map. The good news - we can avoid this forever if we use immutable objects. They don’t change with time, are side-effects free, and we can use them in a multi-threading environment.
There are plenty of ways to write immutable objects. I will show arguably a good one I prefer. We’ll use a Builder pattern and, of course, the final
keyword on our fields. Remember that all fields classes must be immutable too. If we have a collection, it must be unmodifiable, and every field’s class must be immutable.
User.java
public class User {
private final String userName;
private final Set<Role> roles;
public User(Builder builder) {
this.userName = builder.userName;
this.roles = builder.roles;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
User user = (User) o;
return Objects.equals(userName, user.userName) && Objects.equals(roles, user.roles);
}
@Override
public int hashCode() {
return Objects.hash(userName, roles);
}
public static class Builder {
private String userName;
private Set<Role> roles;
public static Builder user() {
return new Builder();
}
public User build() {
return new User(this);
}
public Builder userName(String userName) {
this.userName = userName;
return this;
}
public Builder roles(Set<Role> roles) {
this.roles = Collections.unmodifiableSet(roles);
return this;
}
}
}
Role.java
public class Role {
private final String name;
private final Set<String> permissions;
public Role(Builder builder) {
this.name = builder.name;
this.permissions = builder.permissions;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Role role = (Role) o;
return Objects.equals(name, role.name) && Objects.equals(permissions, role.permissions);
}
@Override
public int hashCode() {
return Objects.hash(name, permissions);
}
public Set<String> getPermissions() {
return permissions;
}
public static class Builder {
private String name;
private Set<String> permissions;
public static Builder role() {
return new Builder();
}
public Role build() {
return new Role(this);
}
public Builder name(String name) {
this.name = name;
return this;
}
public Builder permissions(Set<String> permissions) {
this.permissions = Collections.unmodifiableSet(permissions);
return this;
}
}
}
And here’s an attempt to break this version of HashMap, as you can see, we cannot do that because it’s immutable and will never be changed.
@Test
public void alwaysCanFindExistingDataByUser() {
// given
final var map = new HashMap<User, String>();
final var role = role().name("admin").permissions(new HashSet<>()).build();
final var key = user().userName("name").roles(Set.of(role)).build();
map.put(key, "some data");
// when
final var data = map.get(key);
// Never can we modify an immutable field
assertThatThrownBy(() -> role.getPermissions().add("new permission")).isInstanceOf(UnsupportedOperationException.class);
final var dataAfterSomeTime = map.get(key);
// then
assertThat(data).isEqualTo("some data");
assertThat(dataAfterSomeTime).isEqualTo("some data");
}
If you use HashMap, don’t forget to override equals()
and hashCode()
and do not use mutable objects as a key. Instead, make it immutable, because immutable objects: don’t change with time, are side-effects free, and are good in a multi-threading environment.
You can find a full working example here, on GitHub.