Chapter Two: Type Classes

Type classes are an indefensible part of how we write Haskell. A lot of the rest of this material won't make a lot of sense without them🤪

What's in a Name?

Calling them "type classes" is both a stroke of genius and a terrible mistake. There are absolutely strong parallels with classes from object-oriented programming. They also mostly work the exact opposite way that you'd expect coming from OOP.

Abstraction

Type classes let you define a common interface across many types. The concrete function that gets called gets determined by how you define an instance of that class.

For example, let's say that you have these functions:

add :: Int -> Int -> Int
add a b = a + b
(++) :: [a] -> [a] -> [a]
[] ++ bs = bs
(a : as) ++ bs = a : as ++ bs
append :: Text -> Text -> Text
append a b = -- Long complex function
-- Function composition
(.) :: (b -> c) -> (a -> b) -> (a -> c)
f . g = \x -> f (g x)

In a sense, all of these do the same thing: they combine values of a common type together (i.e. they all have the form a -> a -> a) While we may expect them to behave in some common ways, the exact meaning of that depends on the specific type. Why not write functions that could work on any of these types with respect to this combining behaviour?

We can define a common interface for this "combinable" behaviour!

class Combinable a where
combine :: a -> a -> a

This is called a semigroup in maths, so the version in the library is actually called Semigroup. Don't let the math-y name scare you! It's actually very straightforward for a "combinable" thing.

class Semigroup a where
(<>) :: a -> a -> a

Let's write some instances!

instance Semigroup Int where
a <> b = a + b
instance Semigroup [a] where
as <> bs = as ++ bs
instance Semigroup Text where
a <> b = append a b

Finally, let's use it in a function. We place "class constraints" before the arguments in the type signature, separated by a =>.

triple :: Semigroup a => a -> a
triple x = x <> x <> x
triple 10
-- 30
triple [1, 2, 3]
-- [1, 2, 3, 1, 2, 3, 1, 2, 3]
triple "hi"
-- "hihihi"
(triple (*2)) 5
-- 40

Libraries tend to give you a lot of "functions for free" to go with your instance. It's typically worth it to define an instance for a new data type.

Recursive Abstraction

It's Roughly Like Inheritance™

Unlike OO where inheritance happens on concrete data, Haskell's "inheritance" on interfaces is more like saying "assuming that you have an x, then you can also do y!" It lets you extend abstract interfaces with more functions. This usually makes for a smaller number of types that have the new interface, because they need the old type class, plus an instance for the new one, which is not always even possible!

Let's extend that Semigroup to have an element that when used with <> is a no op. We'll call this mempty, because of how it works on lists ([] ++ xs = xs) and because the name id was already taken.

class Semigroup a => Monoid a where
mempty :: a

By definition, a Monoid is assumed to have a Semigroup instance kicking around. It has both mempty and <> automatically available. Let's look at a few of these:

intance Monoid Int where
mempty = 0
instance Monoid [a] where
mempty = []
instance Monoid Text where
mempty = ""
instance Monoid (-> a) where
mempty = id

Functor Hierarchy

The most widely-used type class heirarchy is the Functor tree, which includes the infamour Monad class. These also have special rules that you need to obey when writing instances for, but you'll mostly be using someone else's so don't worry about them too much for now.

There's also highly-recommended visual guide for the following material at Functors, Applicatives, And Monads In Pictures.

Functor

You may not know it, but you're already familair with functors!

class Functor where
fmap :: (a -> b) -> f a -> f b

You'll see a lot of fmap's operator variants <$> and <&> floating around

fmap show [1,2,3] -- ["1", "2", "3"]
show <$> [1,2,3] -- ["1", "2", "3"]

The pun here is that it's the same as $ (application) but working on something that's been put into a container (practically any sum or product type), hence the visual pun of <$>. Same goes for & (pipe) and <&>.

show $ 1 -- "1"
show <$> [1, 2, 3] -- ["1", "2", "3"]
1 & show -- "1"
[1, 2, 3] <&> show -- ["1", "2", "3"]

Applicative Functor

Applicatives add a lot of power to Functors by defining two functions:

  1. pure (also known as the somewhat confusingly named return) takes a plain value and wraps it in a container.

  2. <*> does function application when both functions and arguments are wrapped

class Functor f => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b

The <*> is often referred to as the Tie Fighter operator 🤣 It's also called ap, short for "applicative apply".

For example, on Maybe:

instance Applicative (Maybe a) where
pure x = Just x
Just f <*> Just x = Just $ f x
_ <*> _ = Nothing
pure 4 :: Just Int -- Just 4
pure "hi" :: Just Text -- Just "hi"
Just (*10) <*> Just 6 -- Just 60
Nothing <*> Just 6 -- Nothing
Just (*10) <*> Nothing -- Nothing

Monad

class Applicative m => Monad m where
(>>=) :: m a -> (a -> m b) -> m b
join :: m (m a) -> m a

>>= is pronounced "bind". This is because when you use it to link into the next function, it "binds" to the first argument, like a supercharged pipe. Further, because of how scoping rules work, argument name bindings last until the end of the chain:

[1,2,3] >>= \a ->
[a + 1] >>= \b ->
[(show b, a * 10)]
-- [("2", 10), ("3", 20), ("4", 30)]

There's also an inverted bind operator, =<< with arguments reversed

Let's look at a couple instances:

instance Monad [a] where
join xs = foldr (++) [] xs
xs >>= f = join (f <$> xs)
instance Monad (Maybe a) where
join Nothing = Nothing
join (Just inner) = inner
Nothing >>= f = Nothing
Just x >>= f = Just (f x)

>>= is special compared to <$> and <*>. It doesn't run out of arguments, and can be chained forever.

[1,2,3] >>= \a ->
pure (a + 1) >>= \b ->
pure (b * 10) >>= \c ->
pure (c - 5)
-- [15, 25, 35]

It can also be used to change behaviour, changing its structure based on arguments:

[1,2,3] >>= \a ->
if rem a 2 == 0 then [a + 1] else [a, a * 2, a * 3] >>= \b ->
if b > 10 then [a] else [b]
-- [1, 2, 3, 3, 3, 6, 9]
Just 42 >>= \a ->
if a > 10 then Nothing else Just a >>= \b ->
Just (b * 10) >>= \c ->
Just (show c)
-- Nothing

Do Notation

Above there's a lot of >>= and nesting to keep it straight. Haskell has "do notation" to clean this up, make it feel like an linear set of instructions, and also pun on a lot of imperative programming.

When using do notion, we often replace pure with return, which is the same function but renamed for extra imperative punning.

return does not exit out of the function chain. It is the same function as pure and only wraps values in the correct container.

Bind
Do Notation
Bind
[1,2,3] >>= \a ->
pure (a + 1) >>= \b ->
pure (b * 10) >>= \c ->
pure (c - 5)
Do Notation
do
a <- [1, 2, 3]
b <- return $ a + 1
c <- return $ b * 10
return $ c - 5

There's also a special variant of let that lets you skip the <- for simple values.

Without Let
With Let
Without Let
do
a <- [1, 2, 3]
b <- return (a * 10)
return (b + 1)
With Let
do
let as = [1, 2, 3]
let b = fmap (*10) as
return (b + 1)

Cheat Sheet

How these all relate to each other can be hard to keep straight at first. Here's a handy guide in pseudocode:

-- WARNING: Abuse of notation!
(a -> b) $ a == b -- Plain function
(a -> b) <$> m a == m b -- Functor
m (a -> b) <*> m a == m b -- Applicative
(a -> m b) =<< m a == m b -- Monad
{-| Legend
a is a value
b is a value
m is a data wrapper / container
-}

The Orphanage

For the compiler to know that there are not duplicate, conflicting instances for type classes, instances need to be written in the same module as the class declaration, or in the same module as a datatype definition. But what happens when you need to add an instance from a library to a datatype that's also from a library?

These are called type class "orphans," and we shove isolate them in a module (or submodules) called Fission.Internal.Orphanage. Importing this module doesn't have any actual exports, but you do need to signal to the compiler that it's a dependency for the module that you're writing. Importing it normally will lead to the linter complaining about an unused import. You can get around this warning by explicitely showiung that are no imports:

import Fission.Internal.Orphanage ()

The downside to having all of your orphan instances in a single module is that it progressivbley slows your compile times down after touching the orphanage because it need to check every module that imports it for new instances. As this file grows, it is typically a good idea to break it into submodules, and only import the orphan modules that you actually need in each scenario.