## Friday, February 22, 2008

### Name that type! (nested types edition)

If we want to write a type for binary trees in Haskell, it is easy to do so:

```
> data BinaryTree a = Leaf
>                   | InternalNode (BinaryTree a) a (BinaryTree a)

```
The structure of the type guarantees certain things about its inhabitants. For instance, every node has exactly two children. If we want to guarantee stronger invariants, we can change the shape of the type. We can write a a type for complete binary tree (that is, binary trees in which every leaf is at the same depth), as
```
> data CompleteBinaryTree a = Leaves
>                           | NonLeaves a (CompleteBinaryTree (a,a))

```
The argument in the recursive part of this datatype is not the same as it is in the declaration. Datatypes with this property are called nested or non-regular or heterogeneous or non-uniform. We can express all sorts of invariants with these datatypes. Here's another example:
```
> data AList a b = Nil
>                | Cons a (AList b a)

```
If we want to manipulate lists that alternate between two types `a` and `b`, we could use the type `[Either a b]`, but this doesn't ensure that we don't have two `a`s or two `b`s in a row. We could, instead, use the type `[(a,b)]`, but this would only allow us to create lists of even length. By switching the position of the parameters in the recursion in `AList`, we can represent alternating lists of any length.

Here are some other examples of data structure invariants we can ensure with nested types:

Finally, here are 3.5 examples of nested types I haven't seen elsewhere. See if you can figure out what they do.
```
> data A x y = B (x y)
>            | C (A (D x) (x y))
> newtype D x y = D (y,x y)
> newtype K x = K x
> type QuestionOne = A K
>
> data E x = F x x
>          | G x (E x)
> data QuestionTwo x = Q2 (E (E x))
>
> data H x y = I
>            | J y x x (H (y,x) y)
> type L = H ()
> data QuestionThreePartA x = Q3a (L x) (L x) (L x)
> data QuestionThreePartB x = Q3b (L x) (QuestionThreePartA x)

```

Max Bolingbroke said...

Interesting post Jim, thanks!

I don't have answers to all your questions, but I've had a bit of a go at the first one:

data A x y = B (x y)
| C (A (D x) (x y))

data A K y = B (K y)
| C (A (D K) (K y))

data A (D K) (K y) = B (D K (K y))
| C (A (D (D K)) (D K (K y)))

D K (K y) = D (K y, K (K y)), so:
data A (D K) (K y) = B [[(K y, K (K y))]]
| C (A (D (D K)) [[(K y, K (K y))]])

This means that for instance A K Integer = QuestionOne Integer contains the values:
B (K x)
C (B (D (K x, K (K x))))
C (C (B (D (D (K x, K (K x)), D (D (K x, K (K x)), K (D (K x, K (K x))))))))

This is a bit of a mess :-). It looks a bit like a list containing either 1 or an even number of integers to me though. But I'm not really sure!

I don't have any time left to investigate the other problems, so I hope some other more knowledgeable commentators can come along and fill in the gaps..

Erling said...

Q1 x is a factorial tree of x (N1 = leaf, N(n) = n instances of N(n-1))

Q2 is a flist of flists, where a flist is a list of at least two elements.

Not really sure where these are useful :-)

Unknown said...

It seems like Haskell still allows me to create this CompleteBinaryTree structure:

NonLeaves (NonLeaves Leaves Leaves) Leaves

Which does not have all the leaves at the same depth. Any ideas? I used your data definition verbatim.

Daniel Wagner said...

Psyonic: Actually, that structure does have all of its leaves at the same depth -- since it only has a single leaf! It just so happens that the leaf node contains another tree, but don't let that confuse you. =)

~d

Jim Apple said...

Erling: Dig deeper on Q2. What does it mean to be a list of size at least two of lists of size at least two?

standardcrypto said...

Are you ever going to explain what these types do?

Also, there is an interesting discussion of your little puzzler at