Sunday, October 23, 2011

Code generation rework and cheap parlor tricks.

Far too long since my last post. I'd like to say I'd been hard at work making cool new features, but if you've been reading at all, you'll know how much of a lie that is. Plus now I'm sleepy, so instead of doing compiler debugging while sleepy I'll give the handful of non-robotic journal readers some amusement.

The big change going on recently is re-working how I was doing the dynamic dispatch in Tangent. The previous way involved defining a method on my runtime object, sending in a parameter, doing some inspection and then calling the right method out of a dictionary. Quick, easy, functional. But it rubbed me the wrong way (not that way), and was increasingly causing me grief when trying to plan out next steps. It performs poorly and it's difficult to optimize. It's easier to debug, but it makes native .NET calls harder to incorporate. And it makes parameter aggregation painful.
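For a rough picture of that old dictionary approach, here's a minimal Python sketch. The class and method names (RuntimeObject, register, invoke) are made up for illustration and are not Tangent's actual runtime; the only thing taken from the description above is the shape: inspect the parameter, then pull the target out of a dictionary.

```python
class RuntimeObject:
    """Hypothetical stand-in for the old runtime object (not Tangent's real code)."""

    def __init__(self):
        # Maps a parameter type to the method that handles it.
        self._methods = {}

    def register(self, param_type, method):
        self._methods[param_type] = method

    def invoke(self, argument):
        # Inspect the argument at call time, then look the target up in the dictionary.
        method = self._methods.get(type(argument))
        if method is None:
            raise LookupError(f"no method accepts {type(argument).__name__}")
        return method(argument)


obj = RuntimeObject()
obj.register(int, lambda n: f"int:{n}")
obj.register(str, lambda s: f"str:{s}")
```

Every call pays for the inspection and the lookup, and nothing is resolved ahead of time, which is roughly why this style is hard to optimize.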

So that is out, now replaced with actual bona fide compiled CIL. The compiler makes a few quick optimizations based on types, but mostly just brute-force dispatches based on what the method overloads are for that instance. And on the upside, it now nicely throws exceptions if we get into a weird case where a call is ambiguous or unknown. And (the core reason behind the work since last post), methods can actually accept and curry parameters now. All of the examples before were simple methods or only required the type of the parameter, not the value. Alas, there are still some bugs there, so even the previous code examples don't work at the moment.
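The brute-force dispatch idea can be sketched in Python like this. It is only an analogy under my own assumptions: the real thing is emitted CIL, the selection rule here (most specific matching type wins) is a guess at the general idea, and the exception and class names are hypothetical.

```python
class AmbiguousInvocationError(Exception):
    """Raised when more than one overload is an equally good match."""


class UnknownInvocationError(Exception):
    """Raised when no overload accepts the argument."""


def dispatch(overloads, argument):
    """overloads is a list of (parameter_type, function) pairs."""
    # Brute force: test every overload against the argument's runtime type.
    matches = [(t, f) for t, f in overloads if isinstance(argument, t)]
    if not matches:
        raise UnknownInvocationError(type(argument).__name__)
    # Keep the most specific matches: drop any whose type has a matching subclass.
    best = [(t, f) for t, f in matches
            if not any(o is not t and issubclass(o, t) for o, _ in matches)]
    if len(best) > 1:
        raise AmbiguousInvocationError(type(argument).__name__)
    return best[0][1](argument)


# Hypothetical types used only to exercise the sketch.
class Animal: pass
class Dog(Animal): pass
class Swimmer: pass
class Duck(Animal, Swimmer): pass

overloads = [
    (Animal, lambda a: "animal"),
    (Dog, lambda d: "dog"),
    (Swimmer, lambda s: "swimmer"),
]
```

Here a Dog picks the Dog overload, a Duck matches both Animal and Swimmer with no winner and so raises the ambiguity error, and an int matches nothing.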

But I'm going to talk a little bit about what's in the works since it's both nifty, and a horrible idea.

One of the main goals of Tangent is to allow unparalleled ease and flexibility in creating domain specific languages, or even more general purpose dialects that can interoperate and evolve as programmers need them to. The core language defines some syntax just to make the basics familiar to programmers: function declaration, class declaration, type operations, assignment, return... But I'd like to keep the set of keywords (and reserved symbols) relatively small.

The recent experimentation has been about just how small that set can reasonably be, with everything else supplied by a core library that defines whichever dialect of Tangent is in use. Once these little bugs are taken care of, Tangent should be able to support in-code definition of your standard if/else block.

For those of you who have neither fainted nor set off to put a stop to this insanity, here are some of the details. The first core bit I need is the dynamic dispatch supported by methods already. And I need to be able to define specialized symbols/identifiers in source. You've seen these before.

There are two new things that you've not seen. The first is the Lazy operator (which might get a better name later). It is a built-in type operator that asks for a fully bound method resulting in the specified type. Since every access in Tangent (currently) is done via curried method, this works pretty well. So in Tangent you can define a type like ~> void, which is simply 'some expression that results in void'. Once a variable of that type is put into an expression where it needs to provide void, the variable is executed. This sort of thing will be used in many places where you'd normally pass a lambda, but will (ideally) lead to cleaner, more readable code at the call site.

The second concept is simply that the language will treat your standard block as a nullary closure, which is just fancy terminology for a method of type ~> void. It looks like a block you've used, it behaves like a block you've used, but it can be passed into methods and it can be stored in variables.
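If the nullary closure idea is unfamiliar, here is a small Python analogy. The names Block, repeat and greet are mine, and the comparison is loose: Python needs an explicit call to run the closure, whereas the Lazy type described above runs it when a void is required.

```python
from typing import Callable

# Rough analogue of the `~> void` type: something that yields nothing when run.
Block = Callable[[], None]


def repeat(times: int, block: Block) -> None:
    # The block was passed in as a value; it only runs when called here.
    for _ in range(times):
        block()


output = []

# Stored in a variable, not yet executed.
greet: Block = lambda: output.append("hello")
before = list(output)  # snapshot showing that nothing has run so far

repeat(2, greet)
```

The lambda and the explicit call are exactly the call-site noise that the block syntax is meant to hide.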

Using these pieces, we can define the standard if/else block in code (using an enum, since .NET bools aren't accessible yet):


Type bool => enum{ values{ True, False } };
if (condition: bool) (block: ~> void) => void { block; };
if (condition: bool) (block: ~> void) 
  else (else block: ~> void) => void { block; };
if (condition: bool.False.Type) (block: ~> void) => void { };
if (condition: bool.False.Type) (block: ~> void) 
  else (else block: ~> void) => void { else block; };
entrypoint => void {
  if bool.True {
    debug "True!";
  };
};

Untested of course, since the code gen is still buggy. Hopefully there's not some trivial bug, but I'm not sure anyone reads this anyways.

One interesting trick here is specializing the method on False. Why false? Because remember that null in Tangent is essentially the top type. If I specified True as the specialization, null would trigger that. If I specified both True and False, then sending in null would cause an AmbiguousInvocationException. This way, if you send in null, it behaves as False. And the scary/nifty thing? If your dialect wants different behavior, you can go in and define that.
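To make the null behavior concrete, here's a toy Python model of that dispatch rule. It is a sketch under assumptions, not how Tangent is implemented: ANY marks the unspecialized overload, None stands in for null, and the rule that null matches every value specialization is taken from the description above. All names are hypothetical.

```python
class AmbiguousInvocationError(Exception):
    pass


ANY = object()                   # marks the general (unspecialized) overload
TRUE, FALSE = "True", "False"    # stand-ins for the enum values


def call_if(overloads, condition, block, else_block):
    """overloads is a list of (specialization, handler) pairs."""
    # A null condition (None) matches every value specialization.
    specialized = [h for value, h in overloads
                   if value is not ANY and (condition is None or condition == value)]
    if len(specialized) > 1:
        raise AmbiguousInvocationError("null matches more than one specialization")
    if specialized:
        return specialized[0](block, else_block)
    general = [h for value, h in overloads if value is ANY]
    if not general:
        raise LookupError("no overload accepts this condition")
    return general[0](block, else_block)


def run_block(block, else_block):
    return block()


def run_else(block, else_block):
    return else_block()


false_only = [(ANY, run_block), (FALSE, run_else)]   # the layout used in the post
true_only = [(ANY, run_else), (TRUE, run_block)]     # specializing on True instead
both = [(TRUE, run_block), (FALSE, run_else)]        # specializing on both values
```

With false_only, a null condition lands on the else branch; with true_only it would run the main block; with both it raises the ambiguity error.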