Saturday, September 10, 2016

The Cost of Pragmatism

Sorry for the delay in posting. As promised, I moved towards a bit more pragmatic focus with the language. Towards that goal, I made a tiny program:

a random number => int {
  new .NET System.Random.Next;
}

a random number less than (exclusive maximum : int) => int {
  new .NET System.Random.Next exclusive maximum;
}

a random number between (inclusive minimum : int) and (exclusive maximum : int) => int {
  new .NET System.Random.Next inclusive minimum exclusive maximum;
}

get a number from console => int {
  .NET System.Console.Write "Enter a number between 0 and 100: ";
  : input : string := .NET System.Console.ReadLine;
  int.Parse input;
}

provide user feedback based on (guess : int) and (target : int) => void {
  if guess = target 
    .NET System.Console.WriteLine "Correct!"
  else
    if guess < target
      .NET System.Console.Write "Sorry, your guess was too low. "
    else 
      .NET System.Console.Write "Sorry, your guess was too high. "
}

entrypoint => void {
  : target : int := a random number less than 100;
  : guess : int := 101;

  while not guess = target {
    guess = get a number from console;
    provide user feedback based on guess and target;
  }
}

A tiny console guessing game. Yes, that's what I've been working on for 2 months... At the end of the last post I hinted at performance problems with simple code with .NET imports. Turns out, that was a pretty big deal. The program above took more than 14 hours (I got tired of waiting, and killed it) to compile 2 months ago. 

The problem was in the core of the language. The way Tangent works is to parse all of the declarations, and then use them to infer meaning. That inference engine was pretty straightforward. Assuming a list of tokens coming in, a target type, and a set of rules:

  if tokens.count == 1 && tokens[0].effective_type == target_type
    return tokens;
 
  for index = 0; index < tokens.count; ++index
    foreach rule in ruleset

      if rule matches tokens at index
        replace tokens with rule result, and recurse

There's some extra-cruft in there for handling ambiguity, but the core of it is a shift-reduce parser in spirit. And that is great when you have a few dozen rules and 5-10 inputs. Now with tens of thousands of rules and larger inputs, big-O reminds you who is boss.

What I've done so far is to remove the linear search from the ruleset. It now is a bit smarter about limiting the legal rules based on the first token it sees, and then the second, and so on. I've also removed sum-types from the language. As soon as things could be A or B, it started exploding the search combinations again due to how the language is setup. Since I only really wanted them for nullable types, and I realized I couldn't really use them for nullable types (since Optional<A> and Optional<B> aren't related, even if A and B are) they got dropped. Pragmatism first.

Problem 2 was with my handling of closures. If you remember, while loops are not part of the language - they're defined in code as:

  while (condition: ~>bool) (body: ~>void) => void {
      if condition { body; while condition body; };
  }

Where ~> is the type operator to make things lazily evaluated. This means that not guess = target and the braces are their own separate delegates. And when I made them, I was closing around the locals twice, copying them in each time. So when I got the input from the console, it went into a different closure than the while conditional. So I got to take the slapdash closure code I hacked up ages ago and make it work properly. Oh, and I needed to figure out what "properly" was, since there aren't exactly a whole lot of good tutorials on the mechanics of implementing closures.

Problem 3 was smaller thankfully. Since numbers can't be part of identifiers in Tangent, I couldn't call Int32.Parse. Oops. I just hacked that in manually for now.

Problem 4 was a standard sort of bug. When running the game, it would tell you when you guessed the number, but wouldn't exit. I figured it was another closure issue, or perhaps something with tail calls in delegates. No, I had just implemented not as the CIL opcode not, which is bitwise not, which happily worked on (bool)1, yielding true again.

Next is working on interface interop, performance improvements, or some other pragmatic step forward.