Saturday, July 23, 2011

I suck (and other musings)

Yes, it has been a long time. Yes, I've been doing very little on the project. Yes, I decided to restart again. Yes, I suck.

The last iteration, I managed to get completely turned around by kinds (that is, variables that store types). There are actually 3 sorts of types in the current iteration of Tangent:

- Types: Your (mostly) run-of-the-mill types. Each one represents some set of possible values: 1, 3.14, "foo", etc.

- Type Literals: These are internal to the compiler. When you declare a new type, it's a type literal. The compiler knows exactly what it is, and marks it so that you can do operations on it (at compile time) and get a known answer out. So when you resolve the identifier 'string' you're accessing a variable (treated like any other static constant under the hood) that holds the type. This is what the last iteration neglected to understand.

- Kinds: These are type variables. Think C# generics. Tangent's Kinds require a type constraint (even if that constraint is 'anything'), but at that point they're a variable like any other. You can save them, modify them, use them as method parameters. The type checker uses the type constraint (just like C# does) to enforce that you only call methods that the constraint provides.
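To make the distinction a little more concrete, here is a rough sketch in Python (not Tangent, and not how the Tangent compiler does it). All the names here (Shape, Square, make_and_measure) are made up for illustration. The one big difference: Python only checks the constraint when the code runs, whereas Tangent and C# check it at compile time.

```python
from typing import Type, TypeVar


class Shape:
    """Plays the role of the type constraint."""

    def area(self) -> float:
        raise NotImplementedError


class Square(Shape):
    """An ordinary type: it describes a set of possible values."""

    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


# The name `Square` resolves to a known, constant type object, which is
# loosely analogous to a type literal.

# A type variable with a constraint, loosely analogous to a kind.
T = TypeVar("T", bound=Shape)


def make_and_measure(kind: Type[T], *args: float) -> float:
    """Take a type as an ordinary parameter and use only what Shape offers."""
    # Runtime stand-in for the compile-time constraint check.
    if not (isinstance(kind, type) and issubclass(kind, Shape)):
        raise TypeError("kind must satisfy the Shape constraint")
    return kind(*args).area()
```

Calling make_and_measure(Square, 3.0) builds a Square and calls area() on it; passing str instead fails the constraint check.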

Now that I'm dealing with that properly, I have the compiler kinda sorta rebuilt. The one good thing about this is that I've decided to try and learn from the previous half dozen iterations, so the compiler actually compiles into an exe. Every other iteration either didn't get to runnable code, or interpreted the intermediate form right in C#. The interpretedness made the code dog slow (and I had to implement all the interpreting, which made development slow), which was the death of that iteration.

Granted, the crime against humanity that I'm actually outputting is currently unoptimized and likely just as dog slow. For example:

  stuff foo.a; 

where stuff is a simple empty method and foo.a represents an enum, gets compiled out to...


return TangentObject.Global.Invoke(__tangentTypeCollection.__type0.ReductionRules[2], __tangentTypeCollection.__literal_'stuff').Invoke(__tangentTypeCollection.__type9.ReductionRules[0], TangentObject.Global.Invoke(__tangentTypeCollection.__type0.ReductionRules[6], __tangentTypeCollection.__literal_'.').Invoke(__tangentTypeCollection.__type25.ReductionRules[0], TangentObject.Global.Invoke(__tangentTypeCollection.__type0.ReductionRules[1], __tangentTypeCollection.__literal_'foo').Invoke(__tangentTypeCollection.__type6.ReductionRules[0], __tangentTypeCollection.__literal_)).Invoke(__tangentTypeCollection.__type26.ReductionRules[0], __tangentTypeCollection.__literal_'a').Invoke(__tangentTypeCollection.__type28.ReductionRules[0], __tangentTypeCollection.__literal_)).Invoke(__tangentTypeCollection.__type11.ReductionRules[0], __tangentTypeCollection.__literal_);


BLEARGH!

But it works, and it's a nice happy exe; and realistically optimization will (eventually) reduce that entirely to one direct method call.

Two other things of note since the last post, both syntax changes. First, I've reversed the order of identifiers and types in variable declarations. It's now:


[variable]:[type]


Like pretty much every other language that includes a colon in the variable declaration. I imagine once I use it (or F#) the order will be less of an issue.

The other change involves type declarations. The old syntax used to be pretty much C# syntax:


[modifiers] class [identifier] [: inheritance-list] (; | {[class-declaration-elements]*})


Since I expect anonymous types and type aliases to be more common in the language (and because I want to save myself work), it's been changed to the slightly simpler:


Goose? Type [method-declaration-element]+ => [type-expr];


So the actual declaration will look very similar to method declaration syntax. All of the rules for method declaration apply here. You must have 1 identifier or symbol, and then any number of other identifiers, symbols, or parameters. This, I think, gives the language a much nicer way to provide constructors without running into multiple inheritance issues; the constructor is for the type, not the instance. If two constructors with the same name happen to lead to compatible types, all the better:


Type foo( initial x : int ) => class {
  x : int = initial x;
};

Type foo => foo(42);

The simple class{} syntax is the anonymous type syntax, which can then be re-used within expressions. Inheritance will be done after the => like any other type operations. And since the language provides kinding support with user-defined operators, I expect those to be kinda awesome (and evil).
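For anyone who doesn't read Tangent (so, everyone), the "constructor is for the type" idea is roughly what you get in Python when a function returns a class. This is only an analogy under my own assumptions: foo_of and Foo are hypothetical names, Python has no overloading so the two declarations need two different names, and the class body here is a loose translation of the example above.

```python
def foo_of(initial_x: int) -> type:
    """Rough analogue of `Type foo( initial x : int ) => class { ... }`."""

    # The class has no meaningful name of its own; it stands in for the
    # anonymous class{} on the right of the =>.
    class Anonymous:
        def __init__(self) -> None:
            # Each instance starts with x set to the value the type was built with.
            self.x: int = initial_x

    return Anonymous


# Rough analogue of `Type foo => foo(42);`: a name bound to one built type.
Foo = foo_of(42)
```

Here Foo().x is 42, and foo_of(7)().x is 7. The parameter belongs to building the type, and instances just pick up whatever the type was built with.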