On Type Class Instance Selection

Written by jonathangfischoff | Published 2017/05/14
Tech Story Tags: haskell | functional-programming | programming | learning-to-code

TLDRvia the TL;DR App

It was well into my Haskell career before I was able to create type classes to solve problems regularly.

In retrospect, I did not have a good grasp of how instance selection worked. I didn’t understand when I was going down a path that would result in overlapping instances or when I was in the clear.

Simple Overlap

I’ll often see Haskellers write code like I used to:

class Combine a wherecombine :: a -> a -> a

instance Combine String wherecombine = (++)

instance Num a => Combine a wherecombine = (+)

and get confused when it fails to compile.

So why does the following code overlap?

String is not an instance of Num so it would seem the compiler could see the instances as non-overlapping.

It turns out type checking is split into passes. Instance selection happens first and later the type checker ensures constraints like Num a are satisfied.

To tell if two instances overlap one, has to ignore the context portions of the instance declaration.

Let’s line up the two instances and see if they overlap.

Combine StringCombine a

So do they? Well, I haven’t really defined what it means for instances to overlap.

Instance selection is like a form of pattern matching. Each instance is analogous to a different equation that performs a match and instead of matching data constructors, the matches use type constructors and type variables.

Pattern Matching Review

Here is a function with some normal pattern matching.

asInt True = 1asInt a = 0

Do the matches overlap? Yes. The second pattern is a wildcard and matches everything. There is nothing ambiguous here because pattern matching is able to use the order of the matches to determine how to handle the overlap.

The compiler will complain if you switch the order because the more specific match will never get hit.

asInt a = 0asInt True = 1

Instance selection works like pattern matching except on the type level. However, there is no way to specify an order, so an overlap causes a compiler error unconditionally.

This leads to a heuristic for using type classes. If you ever write an instance like instance Combine a you are saying that there can only be one instance for the type class. This begs the question of why use a type class at all? The answer is probably “don’t use a type class.”

Everything Works But You Still Fail

Let’s keep using our mistaken instance to see how instance selection can succeed but compilation will fail.

Let’s write out the instance signatures and remove the overlapping instance.

class Combine a wherecombine :: a -> a -> a

instance Num a => Combine a wherecombine :: Num a => a -> a -> acombine = (+)

We can now use combine

main = putStrLn $ combine “hello “ “world”

Behind the scenes the compiler has made a version of combine based on our instance declaration and, let’s call it combine_num (for a more detailed explanation of what a compiler does look at this post).

The compiler sees a use of combine in main and must satisfy the constraint Combine a. It infers that a in this case is of type String. It must now satisfy the specialized constraint Combine String. The compiler looks at its instances and tries to find something that will match.

The compiler doesn’t have a Combine String but it does have a Combine a. a will match String because it is wildcard that matches everything.

The compiler substitutes in combine_num.

main = putStrLn $ combine_num “hello “ “world”

combine_num brings with it the constraint Num a, which is specialized here to Num String. This constraint leads to another search for an instance for Num String, but there is no match this time and compilation fails.

Using a type class method adds a class constraint for which the compiler must solve. Solving the constraint requires finding a suitable instance, which can result in the compiler having to solve more constraints. All the constraints must be satisfied for compilation to succeed but they are not solved at the same time.

Trickier Overlap

Let’s say you want to write a type class that counts how many arguments a function has:

class ArgCount a whereargCount :: a -> Int

instance ArgCount (a -> b) whereargCount _ = 1

instance ArgCount (x -> y -> z) whereargCount _ = 2

Let’s line up our declarations like before:

ArgCount (a -> b)ArgCount (x -> y -> z)

Do they overlap? They do (otherwise I would not have asked the question :p). Adding parentheses will make the reason clearer.

ArgCount (a -> b)ArgCount (x -> (y -> z))

The b in the first instance can be anything, even y -> z so it will match the second instance and we do have overlap.

If you are still confused by this example, it might be because of the infix notation. Pattern matching with constructors typically is in prefix notation (but it doesn’t have to be). Here are the same examples written out in prefix form:

ArgCount (-> a b )ArgCount (-> x (y -> z))

Let’s try to fix this situation with recursion.

-- I upgraded to using proxies so I would not have to use undefinedclass ArgCount a whereargCount :: p a -> Int

instance ArgCount b => ArgCount (a -> b) whereargCount _ = 1 + argCount (Proxy :: Proxy b)

instance ArgCount (m a) whereargCount _ = 0

The idea is that I will be able to count functions that are monadic where the final result is something like State Int or IO a. Unfortunately, this will also overlap.

ArgCount (m a)ArgCount (a -> b)

Again this is clearer when written in prefix form.

ArgCount (m a)ArgCount ((-> a) b)

We can see that m could be -> a so we have overlap.

The code below will work for counting the number of args for IO functions.

class ArgCount a whereargCount :: p a -> Int

instance ArgCount b => ArgCount (a -> b) whereargCount _ = 1 + argCount (Proxy :: Proxy b)

instance ArgCount (IO a) whereargCount _ = 0

A similar pattern is used by [Printf](http://hackage.haskell.org/package/base-4.9.1.0/docs/Text-Printf.html#t:PrintfType).

So a few things to notice.

The less polymorphic an instance is, the less likely you are to get overlap. Sometimes you might think that your instance is relatively specific ( a -> b is only for functions of one argument) but because of possible substitutions, it is actually is much more general ( a -> b is really an instance for all functions).

When More Polymorphism Means Better Inference

Let’s work through one final instance selection example: a trick for getting better type inference. This code is take from mono-traversable [Data.Sequences](https://hackage.haskell.org/package/mono-traversable-1.0.2/docs/src/Data-Sequences.html#line-1488) but was also present in [Printf](http://hackage.haskell.org/package/base-4.9.1.0/docs/Text-Printf.html#t:PrintfType)

instance (c ~ Char) => Textual [c] wherewords = List.words

If we ignore the context, we see that this is an instance of Textual for all types of lists regardless of the element type. However, the code will actually only compile if the type of list is [Char] or String.

There are cases where the type checker might be able to infer that it needs an instance for a [c] and, after the substitution of words, determine at some later stage that the list is really a String. If we had written the instance just for String, instance selection could have failed without a type annotation.

Although it is not the typical situation, if you only need one specialized instance of a polymorphic type, this is a better way to go than using FlexibleInstance … at least from a type inference perspective.

Type Classes Are Awesome

Many Haskellers have bad memories of failed attempts to use type classes successfully. My hope is by demystifying how type class instance selection works, we can avoid undue anguish.

I only covered simple vanilla type classes without any specific type class extensions. It is tempting to think overlap issues can me remedied through enabling OverlappingInstances (and the more modern Overlapping and Overlaps pragmas) but it is by no means a panacea. Covering what can be accomplished with the useful type class extensions is at least a blog post on it’s own.

In my next blog post I’ll talk about some simple patterns for writing type classes that you can use for fun and profit.

PS. Thanks to Sukant Hajra, Alan Zimmerman and Sudhir Kumar for early review and initial feedback.

Hacker Noon is how hackers start their afternoons. We’re a part of the @AMIfamily. We are now accepting submissions and happy to discuss advertising & sponsorship opportunities.

To learn more, read our about page, like/message us on Facebook, or simply, tweet/DM @HackerNoon.

If you enjoyed this story, we recommend reading our latest tech stories and trending tech stories. Until next time, don’t take the realities of the world for granted!


Published by HackerNoon on 2017/05/14