Sunday, April 10, 2016

Interfaces - Part 2 - Grammar

The first step to getting interfaces working is to tweak the parser to actually recognize the damned things. Well, that's not quite right. The first step is to figure out what I want, and what parts make that up. Recapping Part 1, here is what I want from interfaces:

  • I want interfaces to be some contract that types can supply. "I don't care what type is passed in here, as long as it can be enumerated." 
  • I need interfaces to be able to be generic - I am not writing IntEnumerable, StringEnumerable, BoolEnumerable, and so on. 
  • I want interfaces to be multiply-implemented. Just because something is enumerable should not prevent it from being comparable.
  • I would like interfaces to be open. That is, if you wrote AwesomeParser and I have IParser, I should be able to adapt your implementation to my interface without changing your code.
  • It'd be nice if interfaces could support operators better. .NET interfaces suck at things like T+T => T.

Some of you out there will notice that these requirements are eerily close to what Haskell type classes do. That is no mistake. Haskell's type system is widely regarded as one of the most awesome made by man, and type classes in particular are one of the unique parts that set it apart. Type classes, though, work a lot more like .NET generic constraints than .NET interfaces. So how do I design the language to support these things in a way that is easy and familiar, without breaking the formalisms necessary to keep everything from falling apart?
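Rust traits sit in roughly the same family as type classes, so they make a convenient way to sketch the "open" and "operator" wishes from the list above. This is only an illustration in another language, not how Tangent does it: `AwesomeParser`, `Parser`, `parse`, and `double` are all made-up names, and the "third-party" type is simply declared inline to keep the sketch self-contained.

```rust
use std::ops::Add;

// Hypothetical third-party type: pretend we cannot edit this definition.
pub struct AwesomeParser {
    pub prefix: String,
}

// Our own contract, declared separately from the type above.
pub trait Parser {
    fn parse(&self, input: &str) -> Option<String>;
}

// "Open" implementation: the adapter is written alongside the contract,
// and the original type definition is left untouched.
impl Parser for AwesomeParser {
    fn parse(&self, input: &str) -> Option<String> {
        // Succeeds only when the input starts with the expected prefix.
        input
            .strip_prefix(self.prefix.as_str())
            .map(|rest| rest.to_string())
    }
}

// Operator contract: works for any T where T + T => T.
pub fn double<T: Add<Output = T> + Copy>(value: T) -> T {
    value + value
}
```

The point of the sketch is that the contract and the adapter both live outside the adapted type, and that an operator like `+` can itself be the contract.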

For the grammar, there are three main things I need to change to support interfaces:


  • I need to be able to declare an interface - "Here is a contract, and here is how you access it."
  • I need to be able to say that a type implements that interface - "Type X implements that contract."
  • I need to be able to use that interface in functions - "This function needs something that is enumerable as its first argument."

To declare an interface in Tangent, you use the standard type declaration syntax, similar to .NET interfaces. Say we wanted an interface to allow classes to provide a human-readable view:

    human readable :> interface {
        human readable (this) => string;
    }

This defines an interface called "human readable" that has a single required function: the phrase "human readable" followed by a value of the implementing type, returning a string. In the grammar, this means defining new rules for the interface declaration and adding them to the type declaration rule.

To declare a class that uses the interface, it is included in the declaration, akin to .NET style inheritance:

    point :> (x: int), (y: int) :< human readable {
        human readable (this) => string {
            x + ", " + y;
        }
    }

The only difference is that you use that funky duck face :< rather than a plain colon. It visually separates the interface implementations from the parameters, and it will matter again when non-inline interface binding comes into play (we'll get to that in later parts). Again, this is declaring a type named "point", whose constructor is two ints with a comma in between, and which implements the human readable interface. It then goes on to implement the required function.

In the grammar, this means adding the inline interface binding part as optional. The challenge here is that the sum-type extension (x|y|z) is in the same area and also optional. That broke some tests and will likely complicate the parser.
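For comparison, here is roughly how the same declaration and inline binding would look as a Rust trait. It is a loose analogue only: the names `HumanReadable`, `human_readable`, and `Point` are my own transliterations, and Rust writes the implementation as a separate `impl` block rather than inline in the type declaration.

```rust
// Rough analogue of `human readable :> interface { ... }`.
pub trait HumanReadable {
    fn human_readable(&self) -> String;
}

// Rough analogue of `point :> (x: int), (y: int)`.
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Rough analogue of the `:< human readable { ... }` part.
impl HumanReadable for Point {
    fn human_readable(&self) -> String {
        // Same shape as the Tangent body: x, a comma, then y.
        format!("{}, {}", self.x, self.y)
    }
}
```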

And to use the interface in a function, you can just specify it as the type:

    display (value: human readable) => void {
        print human readable value;
    }

Which works as you would expect from .NET land. You get a function that takes anything that satisfies the interface, and you can use any function guaranteed to be there. In the grammar... that isn't any change at all. But remember when I said that type classes are more like generic constraints than real interfaces? Yeah, here is where that comes into play. When that gets compiled, it will look more like this C# declaration:

    void display<T>(T value) where T: IHumanReadable { ... }

and Tangent does the magic to include the generic parameter that you don't really care about. But what if you do care about it? That's where the grammar change comes into play. If you care about the actual type being passed in, then you can declare parameters like this:

    display (value: T: human readable) => void { ... }

which will bind the actual concrete type to the phrase T so you can work with it. That isn't important here, but does come into play in other scenarios where the type-classiness of Tangent interfaces is important. Consider this interface to do a C-style compare (0 is equal, >0 is greater than, <0 is less than):

    comparable :> interface {
        compare (this) to (this) => int;
    }

Great, now we have a nice interface that says you can take two things and compare them. Let's use this in a trivial function:

    smaller (a: comparable) or (b: comparable) => comparable {
        if compare a to b < 0
          then a
          else b;
    }

That does not work.

What the interface says is that the compare function takes two arguments of the same type. The smaller function takes two comparables, but they might not be the same type. So that's where the three-part parameter declaration comes into play:

    smaller (a: T: comparable) or (b: T) => T {
        if compare a to b < 0
          then a
          else b;
    }

In this function, we're specifically saying that the second argument (and the return value) must be the same type as the first argument. Since that meets the interface's constraints (and because Tangent doesn't have subtyping), that works. 
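The same distinction shows up in Rust, which makes for a handy sanity check. The sketch below is an analogue, not Tangent's implementation: `Comparable`, `compare_to`, and the `i32` instance are assumptions made up for the example. A single named type parameter plays the role of `T`, while two independent `impl Comparable` parameters reproduce the version that does not work.

```rust
use std::cmp::Ordering;

pub trait Comparable {
    // C-style compare: 0 is equal, > 0 is greater than, < 0 is less than.
    // `Self` means both sides must be the same implementing type.
    fn compare_to(&self, other: &Self) -> i32;
}

// Illustrative instance so the function below can be exercised.
impl Comparable for i32 {
    fn compare_to(&self, other: &Self) -> i32 {
        match self.cmp(other) {
            Ordering::Less => -1,
            Ordering::Equal => 0,
            Ordering::Greater => 1,
        }
    }
}

// One named type parameter shared by both arguments and the result,
// like `(a: T: comparable) or (b: T) => T`.
pub fn smaller<T: Comparable>(a: T, b: T) -> T {
    if a.compare_to(&b) < 0 { a } else { b }
}

// Two independent hidden type parameters, like the first attempt.
// Left commented out because it is rejected: `b` may be a different type.
// pub fn smaller_loose(a: impl Comparable, b: impl Comparable) -> i32 {
//     a.compare_to(&b)
// }
```

Naming the type once and reusing it is what lets the compiler prove that both arguments satisfy the "same type" requirement baked into the interface.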

So those are the three grammar changes going into the language for basic interface support. Now that the parser can recognize them, I just need to figure out how to use them. Part 3 and more to come!