Monday, January 29, 2007

Quotient Types for Information Hiding

Imagine you're writing a datatype for representing sets. You want to hide the internal representation of sets from clients, but you also want to allow clients to perform any operations as efficiently as if the internal representation were exposed. This is not easy, and it's why Data.Set has 36 functions, when all that are really needed to ensure the meaning of a set ("These elements are distinct") are toList and fromList. Of course, writing member :: Ord a => a -> Set a -> Bool as

member x y = Data.List.elem x (toList y)
is just awful, but the price we pay for efficient member is having to open up the module and break the abstraction if one of the 34 other functions doesn't do exactly what we want. In addition to this pain, there's some danger: functions like mapMonotonic are not really safe, and cannot be made so.
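The unsafety of mapMonotonic is easy to demonstrate. Here is a small sketch (my example, not from the Data.Set documentation): mapMonotonic applies its function to the tree in place, trusting the caller that the function preserves ordering, so passing the non-monotonic negate produces a value that claims to be a Set but violates the ordering invariant that member relies on.

```haskell
import qualified Data.Set as Set

-- mapMonotonic applies its function without re-sorting, trusting the
-- caller that the function is monotonic. negate is antitonic, so the
-- resulting tree is ordered backwards.
bad :: Set.Set Int
bad = Set.mapMonotonic negate (Set.fromList [1, 2, 3])

-- toList still walks the whole tree and finds every element:
--   Set.toList bad == [-1, -2, -3]
-- but member searches by comparison and gets lost:
--   Set.member (-3) bad == False, even though -3 is "in" the set
```

The elements are all still present, but member navigates by the ordering and walks right past them: the abstraction is broken with no type error in sight.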


We could alleviate all of this mess with quotient types. Quotient types allow modules to expose their internals with no danger of breaking abstractions. The quotients are the same as the mathematical quotients in abstract algebra, where they are used frequently. Back on the type theory side, the elimination rule for typing quotients depends not just on types, but on terms, and so requires a type system with dependent types. That's a shame, since dependent types are so tricky to typecheck. It would be great if there were some form of lightweight quotient types that didn't require the full power of dependent types.
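One common formulation of the quotient elimination rule shows exactly where the term-level obligation comes in: to lift a function f out of a quotient A/~, you must supply a proof p that f respects the relation, so the typing of the eliminator depends on a term:

```latex
\frac{\Gamma \vdash f : A \to B
      \qquad
      \Gamma \vdash p : \forall x\, y.\; x \sim y \to f\,x = f\,y}
     {\Gamma \vdash \mathsf{lift}\;f\;p : A/{\sim} \;\to\; B}
```

The premise about p is a proposition about the behavior of the term f, which is what pushes this beyond the reach of a simply kinded type system.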

The break-the-abstraction vs. lose-efficiency-&-safety issue reminds me of the difference between GADTs and lightweight static capabilities: GADTs are verbose, but they allow the client to safely and efficiently operate on datatypes in ways that aren't covered by the library.

How Many Functions are There of Type Bool -> Bool?

This post is incorrect. See the corrections.

Or, rather, how many continuous functions are there of type Bool_|_ -> Bool_|_? I count 11, which seems like a strange number. The trick is to remember that if f is continuous, then f _|_ =/= _|_ implies that for all x, f x = f _|_. This gives us only two functions which take _|_ to non-_|_:

one = const True
two = const False

The rest of the continuous functions fill the space Bool -> Bool_|_. Since Bool_|_ is of size 3, there are 3^2 = 9 remaining functions, which gives us 9+2 = 11 functions.

We can generalize to say that A_|_ -> B_|_ has |B| + (|B|+1)^|A| inhabitants.
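On a flat domain like Bool_|_, continuity coincides with monotonicity, so the count can be checked by brute force. A quick sketch (Bot, T, and F are my names for the three elements):

```haskell
-- The three-point flat domain Bool_|_ and its approximation order.
data B = Bot | T | F deriving (Eq, Show)

leq :: B -> B -> Bool
leq Bot _ = True          -- Bot is below everything
leq x   y = x == y        -- T and F are only below themselves

-- Represent a function by its table of values on (Bot, T, F).
-- The only non-reflexive order relations are Bot <= T and Bot <= F,
-- so monotonicity reduces to two checks.
monotone :: (B, B, B) -> Bool
monotone (fBot, fT, fF) = leq fBot fT && leq fBot fF

count :: Int
count = length [ f | f <- [ (a, b, c) | a <- vals, b <- vals, c <- vals ]
                   , monotone f ]
  where vals = [Bot, T, F]
-- count == 11
```

Enumerating all 27 tables and filtering leaves exactly the 11 functions counted above.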

Saturday, January 20, 2007

Bug Reporting

A few months ago, I posted a quandary to haskell-cafe about seemingly ineffable types in GHC: for a certain term, the type inferred by GHCi could not simply be listed in the source code, as this would cause the type checker to complain. I didn't get any bites on my post; I chalked it up to some complication that I didn't understand and that no one had time to explain, and went on to thinking about other things.

Last year, I graded a C++ assignment for a class at the local university (and my alma mater). We required the use of std::set, and a student noticed that he could modify the value of a member of a std::set in place, and the compiler wouldn't complain.

Both of these turned out to be bugs: one in GHC and one in the C++ standard. The lessons I've learned about programming languages:

  1. If something doesn't make sense, investigate. If it turns out you just don't understand something that is the way it is for complicated reasons, you will still have learned something useful.
  2. The squeaky wheel gets the grease. I asked the Haskell Cafe about the GHC infelicity, but didn't inquire at the GHC-specific mailing lists, and I didn't keep asking when nobody responded.
  3. When you see something funny looking, don't just ask people why it's funny looking. Instead, push things to their logical conclusion, and find a test case that's obviously wrong. This will give people a reason to listen to you.

Friday, January 19, 2007

Foralls, Redexes, and Type Lambdas

In my last post about universally quantified types, I was asking why GHC disallows types like Evil while allowing types like Fine:

data OK (x :: * -> *) where
   OK :: OK x
type Fine = OK Maybe
type Evil = OK (forall (f :: * -> *) . f)

This seems more confusing when Evil' is just fine:

data OK' (x :: *) where
   OK' :: OK' x
type Fine' = OK' Int
type Evil' = OK' (forall (f :: *) . f)

Ken Shan answered my question: in the static semantics of the typed lambda calculus, only forms like (Λt:k.r) @ s are redexes. This makes sense to me, though it seems to be so mainly for simplicity: I expect it keeps the metatheory simpler.

Unfortunately, this makes some things in GHC a pain, since GHC doesn't have type lambdas. SPJ suggests that this is because type lambdas make the unification used in type inference undecidable, but I don't yet understand why. The feature list for EHC indicates that it does support type lambdas, though I haven't tested this.

Update: (29 Jan 2007)

A History of Haskell points to A system of constructor classes (mirror) regarding unification and type lambdas.

Monday, January 15, 2007

Foralls, Kinds with Arrows, and Impredicativity (?)

I can't figure out why the kinding rule for universal quantification in Fω is:

           C, x : k1 |- y : *
(1)    -----------------------------
       C |- (\forall x : k1 . y) : *

That is, why must the type inside the universal quantification be of kind *?

I would expect the rule to be

            C, x : k1 |- y : k2
(1')    ------------------------------
        C |- (\forall x : k1 . y) : k2
Using (1), I can still get
              C, x : k1 |- y : * -> *
(2)    ------------------------------------------------------
       C |- (\Lambda z : * . \forall x : k1 . y @ z) : * -> *

by eta-expansion, so using (1) instead of (1') doesn't seem to reduce the available power; it just makes the notation more annoying.

I suspect that the reason for (1) instead of (1') has something to do with impredicativity, since (1) is only valid in impredicative systems anyway.

Non-constructive Proofs and Programming

Two excellent examples of programming with non-constructive proofs are given in Simplifying Programs Extracted from Classical Proofs by Yevgeniy Makarov. The examples given there are better than the immediate example I think of: the proof that there exist irrational numbers x and y such that x^y is rational. This proof is simple to do, but requires some heavy background machinery for rational and irrational numbers. Perhaps I could code this proof by using _|_ for the theorems about rational numbers that I know to be true but don't wish to prove.
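For concreteness, here is the case split in that proof; the non-constructive step is deciding which branch actually holds:

```latex
\text{Let } a = \sqrt{2}^{\sqrt{2}}.\quad
\begin{cases}
a \in \mathbb{Q}: & \text{take } x = y = \sqrt{2},
  \text{ so } x^y = a \in \mathbb{Q},\\[2pt]
a \notin \mathbb{Q}: & \text{take } x = a,\; y = \sqrt{2},
  \text{ so } x^y = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^{2} = 2 \in \mathbb{Q}.
\end{cases}
```

Either branch yields a witness, but the proof never tells us which pair (x, y) actually works.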

The other two examples of simple non-constructive proofs that I think of are the irrationality of the square root of 2 and the infinitude of the primes. Of course, those are usually stated in non-constructive ways, but they are actually constructive with only a slight bit of tweaking. I haven't read enough of the Makarov paper to see if the simplifications proposed would turn these non-constructive proofs into constructive ones.

Saturday, January 13, 2007

Non-constructive Proofs and Productivity

Languages in the Dependent ML family allow users to ensure termination by using a termination metric, a number that is strictly decreasing on recursive calls in the definition of the function.

There is a counterpart to termination for co-recursive structures: productivity, where we need to show that a function keeps producing values indefinitely. So, for the Collatz conjecture, we want to show termination for:

collatz n = collatz' n []
collatz' 1 r = 1:r
collatz' n r =
    if n `mod` 2 == 0
    then collatz' (n `div` 2) (n:r)
    else collatz' (3*n + 1) (n:r)
But for the twin prime conjecture we want to show the productivity of:

data Stream a = Stream a (Stream a)

twins = twins' 3
twins' n =
    if primep n && primep (n+2)
    then Stream (n, n+2) (twins' (n+2))
    else twins' (n+2)
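The post leaves primep undefined; for completeness, here is a self-contained runnable sketch with a naive trial-division primep and a stream-sampling helper (both my additions, not part of the original definition):

```haskell
data Stream a = Stream a (Stream a)

-- naive primality test by trial division (not in the original post)
primep :: Int -> Bool
primep n = n > 1 && all (\d -> n `mod` d /= 0) [2 .. isqrt n]
  where isqrt = floor . sqrt . (fromIntegral :: Int -> Double)

-- the twin-prime stream, stepping through the odd numbers from 3
twins :: Stream (Int, Int)
twins = twins' 3
  where
    twins' n
      | primep n && primep (n + 2) = Stream (n, n + 2) (twins' (n + 2))
      | otherwise                  = twins' (n + 2)

-- take the first k elements of a stream
takeStream :: Int -> Stream a -> [a]
takeStream 0 _             = []
takeStream k (Stream x xs) = x : takeStream (k - 1) xs
-- takeStream 4 twins == [(3,5), (5,7), (11,13), (17,19)]
```

Sampling the stream terminates only as long as twin primes keep appearing, which is exactly why its productivity is equivalent to the twin prime conjecture.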

I don't know if any of the members of the DML family can check productivity. Imagine they could, and imagine we had a non-constructive proof of the twin primes conjecture. Could we use that proof to allow a definition of twins to be marked productive?

Most of the time, we talk about the Curry-Howard correspondence in reference to intuitionistic logic and functional programming, but non-constructive proofs of productivity seem like a good way to bring more powerful logics into programming, since they don't have to change the implementation of the function they are attached to.

Friday, January 12, 2007

Static Security Assurance From Afar

I work at an e-commerce company. I was thinking today of how typed programming would make our code more reliable, and I began to wonder about using types (or other static assurance methods) to ensure that we don't accidentally reveal information to parties that shouldn't see it. I'm thinking here of the Apollo project (specifically, Translating dependency into parametricity) or Flow Caml. I don't see a way for us to use technology like this when working with other companies.

As an example, we might pass a customer's info to a third-party site along with some identifying information about us, to verify to the third party that the customer is legitimate. How can I trust the third-party site not to reveal our shared secret to the customer?

I suspect this is a problem for cryptographers, not type theorists. Since much of the information that we deliver to customers is tainted by using secrets to obtain it, static analysis wouldn't help maintain security.

Thursday, January 11, 2007

Type (Dis)equality

Type equality is the essential ingredient in guarded algebraic datatypes:

data Z
data S n

data ListGADT a n where
    NilListGADT :: ListGADT a Z
    ConsListGADT :: a -> ListGADT a n -> ListGADT a (S n)

data TypeEqual a b = ...
data ListEq a n = NilListEq (TypeEqual n Z)
                | forall m . ConsListEq (TypeEqual n (S m)) a (ListEq a m)

What new power would type disequality bring? It might prevent infinite types.

data WrapList a = forall n . WrapList (ListGADT a n)

cons :: a -> WrapList a -> WrapList a
cons x (WrapList y) = WrapList (ConsListGADT x y)

ones = cons 1 ones

Here, the size type inside ones is S (S (S ... That's bad. What if we wrote:

data TypeDisEq a b where
    DisEqLess :: TypeDisEq Z (S a)
    DisEqMore :: TypeDisEq (S a) Z
    DisEqInduct :: TypeDisEq a b -> TypeDisEq (S a) (S b)

data ListGADTDisEq a n where
    NilListGADTDisEq :: ListGADTDisEq a Z
    ConsListGADTDisEq :: a -> ListGADTDisEq a n -> (TypeDisEq n (S n)) -> ListGADTDisEq a (S n)

Now, since the size type inside a ListGADTDisEq can't be one more than itself, it's finite, and we might not be able to form ones. Sadly, that's not true:

data WrapListDisEq a = forall n . WrapListDisEq (ListGADTDisEq a n)

cons' :: a -> WrapListDisEq a -> WrapListDisEq a
cons' x (WrapListDisEq NilListGADTDisEq) = WrapListDisEq (ConsListGADTDisEq x NilListGADTDisEq DisEqLess)
cons' x (WrapListDisEq p@(ConsListGADTDisEq q r s)) = WrapListDisEq (ConsListGADTDisEq x p (DisEqInduct s))

ones' = cons' 1 ones'