Date: 2020-12-18
Hacking things here and there
I was writing some Java test code when I came up against the pitfalls of the equals method. Despite its ostensible simplicity, it presents a tricky problem.
I'd like to emphasize that this is not a Java-specific issue. For example, C# has an analogous mechanism.
The Object class is the root of every class. It defines various methods, and equals is one of them. By default this method has a simple behavior: an object x is equal only to itself. Any other object is different.
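As a minimal sketch of that default behavior, consider a class that does not override equals. The Label class and the DefaultEqualsDemo helper are hypothetical names introduced only for illustration.

```java
// Hypothetical class that does not override equals.
final class Label {
    private final String text;

    Label(final String text) {
        this.text = text;
    }

    String text() {
        return text;
    }
}

final class DefaultEqualsDemo {
    // An object is equal to itself.
    static boolean sameInstance() {
        final Label label = new Label("x");
        return label.equals(label);
    }

    // Two distinct instances are different, even with identical content.
    static boolean distinctInstances() {
        return new Label("x").equals(new Label("x"));
    }
}
```

Here sameInstance() returns true and distinctInstances() returns false, because the inherited equals compares object identity only.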
Obviously the equals method has the common properties of an equivalence relation. It is reflexive: x is equal to x. It is symmetric: if x is equal to y, then y is equal to x. It is transitive: if x is equal to y and y is equal to z, then x is equal to z. And so on...
Furthermore, a logical relationship links equals and the hashCode method. The latter returns the object's hash. In this context it's an integer representation of the object. So if two objects are equal, then their hashes should be equal too.
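The link between equals and hashCode matters in practice because hash-based collections look up the hash first. The sketch below contrasts two hypothetical classes (BrokenTitle, ConsistentTitle) that are not part of the original example: the first overrides only equals, the second overrides both methods on the same field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class: overrides equals but keeps the default hashCode.
final class BrokenTitle {
    private final String value;

    BrokenTitle(final String value) {
        this.value = value;
    }

    @Override
    public boolean equals(final Object o) {
        return o instanceof BrokenTitle && value.equals(((BrokenTitle) o).value);
    }
    // hashCode is NOT overridden: equal objects may report different hashes.
}

// Hypothetical class: equals and hashCode are based on the same field.
final class ConsistentTitle {
    private final String value;

    ConsistentTitle(final String value) {
        this.value = value;
    }

    @Override
    public boolean equals(final Object o) {
        return o instanceof ConsistentTitle && value.equals(((ConsistentTitle) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(value);
    }
}

final class HashContractDemo {
    // With a consistent hashCode, an equal object is found in a HashSet.
    static boolean consistentIsFound() {
        final Set<ConsistentTitle> titles = new HashSet<>();
        titles.add(new ConsistentTitle("Dune"));
        return titles.contains(new ConsistentTitle("Dune"));
    }
}
```

Two BrokenTitle instances with the same value are equal according to equals, yet a HashSet will usually fail to find one through the other, since their default hashes almost always differ.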
I'll use a simple code example to highlight the main issue. Here is the starting point:
interface Book {
    String title();
    String author();
}

final class DbBook implements Book {
    ...
}
We can ask a Book instance for its title and its author. A DbBook instance represents a book stored in a database.
As I said, the Object class is the root of every class. This means that DbBook also inherits the equals method with its default behavior. So we should override equals to implement a custom equivalence relation. To respect the aforesaid logical implication, we should also override the hashCode method.
Now suppose that in our context two books are equal if they have the same title. This seems to be the goal of equals, and here is a common implementation (I generated it with the IDE):
final class DbBook implements Book {
    ...
    @Override
    public boolean equals(final Object o) {
        if (this == o) return true;
        if (!(o instanceof DbBook)) return false;
        final DbBook dbBook = (DbBook) o;
        return title().equals(dbBook.title());
    }

    @Override
    public int hashCode() {
        return Objects.hash(title());
    }
    ...
}
However, because of the instanceof check, aDbBook is only comparable to anotherDbBook. This means that an AnotherBookImplementation instance is always different from aDbBook, even when it has the same title.
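The limitation can be reproduced with a small runnable sketch. The DbBook below is a stand-in: its constructor and fields are assumptions that replace the elided database access, and AnotherBookImplementation is a hypothetical second implementation that keeps the default equals.

```java
import java.util.Objects;

interface Book {
    String title();
    String author();
}

// Stand-in for the real DbBook: plain fields replace the database access.
final class DbBook implements Book {
    private final String title;
    private final String author;

    DbBook(final String title, final String author) {
        this.title = title;
        this.author = author;
    }

    @Override
    public String title() {
        return title;
    }

    @Override
    public String author() {
        return author;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) return true;
        // Only another DbBook can pass this check.
        if (!(o instanceof DbBook)) return false;
        final DbBook dbBook = (DbBook) o;
        return title().equals(dbBook.title());
    }

    @Override
    public int hashCode() {
        return Objects.hash(title());
    }
}

// Hypothetical second implementation of Book.
final class AnotherBookImplementation implements Book {
    private final String title;
    private final String author;

    AnotherBookImplementation(final String title, final String author) {
        this.title = title;
        this.author = author;
    }

    @Override
    public String title() {
        return title;
    }

    @Override
    public String author() {
        return author;
    }
}
```

Two DbBook instances with the same title are equal, while a DbBook and an AnotherBookImplementation with the same title are not.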
The problem seems to be the instanceof check. So we can weaken it a bit:
final class DbBook implements Book {
    ...
    @Override
    public boolean equals(final Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        final Book book = (Book) o;
        return title().equals(book.title());
    }
    ...
}
In this way we only require o to be a Book instance.
The effect of this change is the destruction of our software. As I anticipated, the equals method should be symmetric: if aDbBook is equal to anotherBook, then anotherBook must be equal to aDbBook. This means that every Book implementation must exhibit this same equals behavior. In other words, every Book implementation is tightly coupled with all the others. It's definitely a really bad approach.
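A short sketch makes the breakage visible. As before, the DbBook constructor and fields are assumptions standing in for the database access, and AnotherBookImplementation is a hypothetical class that was never updated to follow the same equals rule.

```java
import java.util.Objects;

interface Book {
    String title();
    String author();
}

// Stand-in DbBook with the weakened check: any Book is accepted.
final class DbBook implements Book {
    private final String title;
    private final String author;

    DbBook(final String title, final String author) {
        this.title = title;
        this.author = author;
    }

    @Override
    public String title() {
        return title;
    }

    @Override
    public String author() {
        return author;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        final Book book = (Book) o;
        return title().equals(book.title());
    }

    @Override
    public int hashCode() {
        return Objects.hash(title());
    }
}

// Hypothetical implementation that still uses the default, identity-based equals.
final class AnotherBookImplementation implements Book {
    private final String title;
    private final String author;

    AnotherBookImplementation(final String title, final String author) {
        this.title = title;
        this.author = author;
    }

    @Override
    public String title() {
        return title;
    }

    @Override
    public String author() {
        return author;
    }
}
```

With these stand-ins, aDbBook.equals(another) is true while another.equals(aDbBook) is false. The only way to restore the expected behavior is to make every Book implementation agree on the same equals rule.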
The main issue of the equals approach is that responsibilities aren't decoupled correctly. There should be another object responsible for the comparison. There should be another object that represents the comparison.
There may be various implementations of this approach. I'll suggest two: one representation-based and another behavior-based.
The first one derives from this article. Basically an object (like aDbBook) can give us a representation of itself. Then a Comparison<R> object represents a comparison between two R representations. In this way a representation is similar to the hash returned by hashCode. But it's more generic because it could be based on bytes, strings and so on...
However, this means that aCat could be equal to aDog if they have the same representation. I consider this the main drawback of this approach.
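To make the idea and its drawback concrete, here is one possible shape of the representation-based approach. It is only a sketch: the Representable interface, the representation() and equal() method names, and the Cat and Dog classes are all assumptions, not the design of the referenced article.

```java
// Hypothetical: an object able to give a representation of itself.
interface Representable<R> {
    R representation();
}

// Hypothetical: a comparison between two representations of type R.
final class Comparison<R> {
    private final R first;
    private final R second;

    Comparison(final R first, final R second) {
        this.first = first;
        this.second = second;
    }

    // Delegates to the equals of R, which here is a plain String.
    boolean equal() {
        return first.equals(second);
    }
}

final class Cat implements Representable<String> {
    private final String name;

    Cat(final String name) {
        this.name = name;
    }

    @Override
    public String representation() {
        return name;
    }
}

final class Dog implements Representable<String> {
    private final String name;

    Dog(final String name) {
        this.name = name;
    }

    @Override
    public String representation() {
        return name;
    }
}
```

Since the comparison only sees the representations, a Cat and a Dog that both return "Rex" are reported as equal.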
The behavior-based approach is born from an observation. I think that the only valid discriminating factor between objects is their behavior. It's exposed through the methods. Or, more formally, through the messages the object supports. The protocol or interface is the collection of the supported messages.
For this reason the first step to define equality should be based on interfaces. Then an Equality object will represent the equality between two objects with the same interface.
In this way aCat will always be different from aDog because of their different interfaces. Presumably the former implements a Cat interface, the latter a Dog interface. Nonetheless, thanks to polymorphism, if they both implement a Pet interface, then they could be equal. This could be possible with anEquality limited to the Pet interface.
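A possible sketch of this scenario follows. The methods on Cat, Dog and Pet, the HouseCat and HouseDog classes, and NameBasedPetEquality are hypothetical names chosen for illustration. The Equality interface uses a no-argument equals() method.

```java
interface Equality {
    Boolean equals();
}

interface Pet {
    String name();
}

interface Cat {
    String meow();
}

interface Dog {
    String bark();
}

final class HouseCat implements Cat, Pet {
    private final String name;

    HouseCat(final String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public String meow() {
        return "meow";
    }
}

final class HouseDog implements Dog, Pet {
    private final String name;

    HouseDog(final String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public String bark() {
        return "woof";
    }
}

// Equality limited to the Pet interface: it can only use Pet messages.
// An equality typed on Cat would not even accept a Dog at compile time.
final class NameBasedPetEquality implements Equality {
    private final Pet pet;
    private final Pet anotherPet;

    NameBasedPetEquality(final Pet pet, final Pet anotherPet) {
        this.pet = pet;
        this.anotherPet = anotherPet;
    }

    @Override
    public Boolean equals() {
        return pet.name().equals(anotherPet.name());
    }
}
```

A HouseCat and a HouseDog with the same name are equal as pets, while nothing lets us compare them as a Cat and a Dog.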
Here is an example related to the initial Book example. I defined two Equality classes to stress that equality is not a Book responsibility.
interface Equality {
    Boolean equals();
}

// Two books are equal when their full titles match.
final class TitleBasedEquality implements Equality {
    TitleBasedEquality(final Book book, final Book anotherBook) {
        this.book = book;
        this.anotherBook = anotherBook;
    }

    @Override
    public Boolean equals() {
        return book.title().equals(anotherBook.title());
    }

    private final Book book;
    private final Book anotherBook;
}

// Two books are equal when the first `length` characters of their titles match.
final class PrefixBasedEquality implements Equality {
    PrefixBasedEquality(final Book book, final Book anotherBook, final Integer length) {
        this.book = book;
        this.anotherBook = anotherBook;
        this.length = length;
    }

    @Override
    public Boolean equals() {
        // substring(0, length) takes the prefix; it assumes both titles
        // have at least `length` characters.
        var first = book.title().substring(0, length);
        var second = anotherBook.title().substring(0, length);
        return first.equals(second);
    }

    private final Book book;
    private final Book anotherBook;
    private final Integer length;
}
So a TitleBasedEquality object compares the full title. A PrefixBasedEquality compares only a prefix.
We gain a lot of flexibility. And we can choose the correct equality comparison based on the context. This is possible thanks to the decoupling of responsibilities.
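The following self-contained sketch shows the same pair of books judged by two different equalities. SimpleBook is a hypothetical in-memory Book used only to make the example runnable, the titles are made up, and the prefix is assumed to be the first length characters of the title.

```java
interface Book {
    String title();
    String author();
}

interface Equality {
    Boolean equals();
}

// Hypothetical in-memory Book, used only to make the sketch runnable.
final class SimpleBook implements Book {
    private final String title;
    private final String author;

    SimpleBook(final String title, final String author) {
        this.title = title;
        this.author = author;
    }

    @Override
    public String title() {
        return title;
    }

    @Override
    public String author() {
        return author;
    }
}

final class TitleBasedEquality implements Equality {
    private final Book book;
    private final Book anotherBook;

    TitleBasedEquality(final Book book, final Book anotherBook) {
        this.book = book;
        this.anotherBook = anotherBook;
    }

    @Override
    public Boolean equals() {
        return book.title().equals(anotherBook.title());
    }
}

final class PrefixBasedEquality implements Equality {
    private final Book book;
    private final Book anotherBook;
    private final Integer length;

    PrefixBasedEquality(final Book book, final Book anotherBook, final Integer length) {
        this.book = book;
        this.anotherBook = anotherBook;
        this.length = length;
    }

    @Override
    public Boolean equals() {
        // Assumes both titles have at least `length` characters.
        final String first = book.title().substring(0, length);
        final String second = anotherBook.title().substring(0, length);
        return first.equals(second);
    }
}
```

Given the titles "Java Concurrency" and "Java Generics", a TitleBasedEquality answers false, while a PrefixBasedEquality with a length of 4 answers true. Neither Book class had to change to support the second rule.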
However, as you can see, I'm using String.equals. I could replace it with StringEquality. But I consider this case a reasonable compromise forced by the programming language.
A possible drawback of this approach concerns an interface with only void methods. In this case each pair of instances is always equal. But this means that these types of objects are only and always compared on their interface. I find it coherent and I find it respectful towards the objects.
Object equality is definitely a tough problem. I think that the major issue is that we think of equality in terms of data. But objects are not data. This is the reason why I support the idea of some sort of behavior-based comparison. After all, the exhibited behavior is what distinguishes one object from another. Nothing more.
