### Name that type! (nested types edition)

If we want to write a type for binary trees in Haskell, it is easy to do so:

> data BinaryTree a = Leaf > | InternalNode (BinaryTree a) a (BinaryTree a)The structure of the type guarantees certain things about its inhabitants. For instance, every node has exactly two children. If we want to guarantee stronger invariants, we can change the shape of the type. We can write a a type for complete binary tree (that is, binary trees in which every leaf is at the same depth), as

> data CompleteBinaryTree a = Leaves > | NonLeaves a (CompleteBinaryTree (a,a))The argument in the recursive part of this datatype is not the same as it is in the declaration. Datatypes with this property are called nested or non-regular or heterogeneous or non-uniform. We can express all sorts of invariants with these datatypes. Here's another example:

> data AList a b = Nil > | Cons a (AList b a)If we want to manipulate lists that alternate between two types

`a`

and `b`

, we could use the
type `[Either a b]`

, but this doesn't ensure that we don't
have two `a`

s or two `b`

s in a row. We could,
instead, use the type `[(a,b)]`

, but this would only allow
us to create lists of even length. By switching the position of the
parameters in the recursion in `AList`

, we can represent
alternating lists of any length.
Here are some other examples of data structure invariants we can ensure with nested types:

- Cyclic data structures (mirror)
- Many different types of lists with
fast
`cons`

,`snoc`

, concatenation, and indexing, including - Red-black trees (a paper, code for that paper), AVL trees (mirror, mirror 2)
- Square matrices, triangular matrices (in various works by Ralph Matthes), Toeplitz matrices, rectangular matrices (see "Manufacturing Datatypes", above)
- Efficient-merge heaps (see Okasaki's thesis & book, above), Braun trees, left-complete trees (see "Manufacturing Datatypes", above)
- Binomial queues (mirror, mirror 2)
- Tries (mirror, mirror 2, code, code mirror, old version, old version mirror) (See also Okasaki's thesis & book)
- Well-formed terms in the locally-nameless lambda calculus (mirror)

> data A x y = B (x y) > | C (A (D x) (x y)) > newtype D x y = D (y,x y) > newtype K x = K x > type QuestionOne = A K > > data E x = F x x > | G x (E x) > data QuestionTwo x = Q2 (E (E x)) > > data H x y = I > | J y x x (H (y,x) y) > type L = H () > data QuestionThreePartA x = Q3a (L x) (L x) (L x) > data QuestionThreePartB x = Q3b (L x) (QuestionThreePartA x)

## 7 comments:

Interesting post Jim, thanks!

I don't have answers to all your questions, but I've had a bit of a go at the first one:

data A x y = B (x y)

| C (A (D x) (x y))

data A K y = B (K y)

| C (A (D K) (K y))

data A (D K) (K y) = B (D K (K y))

| C (A (D (D K)) (D K (K y)))

D K (K y) = D (K y, K (K y)), so:

data A (D K) (K y) = B [[(K y, K (K y))]]

| C (A (D (D K)) [[(K y, K (K y))]])

This means that for instance A K Integer = QuestionOne Integer contains the values:

B (K x)

C (B (D (K x, K (K x))))

C (C (B (D (D (K x, K (K x)), D (D (K x, K (K x)), K (D (K x, K (K x))))))))

This is a bit of a mess :-). It looks a bit like a list containing either 1 or an even number of integers to me though. But I'm not really sure!

I don't have any time left to investigate the other problems, so I hope some other more knowledgeable commentators can come along and fill in the gaps..

Q1 x is a factorial tree of x (N1 = leaf, N(n) = n instances of N(n-1))

Q2 is a flist of flists, where a flist is a list of at least two elements.

Not really sure where these are useful :-)

It seems like Haskell still allows me to create this CompleteBinaryTree structure:

NonLeaves (NonLeaves Leaves Leaves) Leaves

Which does not have all the leaves at the same depth. Any ideas? I used your data definition verbatim.

Psyonic: Actually, that structure does have all of its leaves at the same depth -- since it only has a single leaf! It just so happens that the leaf node contains another tree, but don't let that confuse you. =)

~d

Erling: Dig deeper on Q2. What does it mean to be a list of size at least two of lists of size at least two?

Are you ever going to explain what these types do?

Also, there is an interesting discussion of your little puzzler at

http://reddit.com/r/programming/info/69lxp/comments/

Among other things it was mentioned that you need to use something called a generalized fold to iterate through something like the CompleteBinaryTree data structure suggested by psyonic.

thomashartman1-googlegroups: I've posted the answers (and two new questions).

Post a Comment