Saturday, May 9, 2015

Order of Operations

So I've finally gotten this iteration of Tangent to the point where I can start to see if it actually does what I think it should do. As you might expect, it does not.

I've gotten basic classes working enough that I can start making actual arithmetic with type shenanigans to coerce things to take the right order of operations. Here is the test code for the day:

term :> term (x: int){
  (this) => int { x; }
}

factor :> factor (x: int){
  (this) => term { term x; }
}

(lhs: int) + (rhs: term) => int { asm add lhs rhs; }
(lhs: term) * (rhs: factor) => term { term asm mul lhs rhs; }
(x: int) => factor { factor x; }

entrypoint => void {
  print 2 + 3 * 4;
}


Now for a slight aside. The first time I ran this, I got 86. How did I get 86? Well, read on! The answer is at the bottom.

Anyways, this now produces 20, due to a subtlety in parsers that I didn't really fully grok until today when it bit me full on the ass. When you look at a formal grammar for arithmetic, there are always conversions to take function invocations, constants, paren expressions... bunches of stuff into a factor. What normal parsers do though is they determine the order of operations there, before ever knowing anything about the function invocation, constants, paren expressions, etc. They basically form the structure of the code ignoring any sort of conversions or trickery that needs to be done to turn that token into an int.

Since I'm doing the order of operations after knowing what trickery is necessary, the parser has forgotten if the int it sees is a constant, or the result of a function, or all of the limitations that formal grammars put on conversions to a factor. So Tangent happily converts 3 to a factor, then to a term to do the addition, and then circles the int result back around to a factor to multiply afterwards. To get this working, I'll need to break the cycle which will make normal use of integer parameters broken and/or weird.

So, 86. The root cause of the bug was in how the compiler generated its CIL. If you remember back to my post about loops, Tangent uses tail call optimization to keep function invocation workable in those sort of scenarios. The issue was that I always added the tail call to the last statement in a function. And I didn't change that when I added built-in opcodes like add and multiply (and constructor calls). So in the CIL, you saw arg1; arg2; tailcall; add;. .NET was happy to let that run, but I expect that the tailcall fubar'd my stack so that when the add actually ran, the arguments had been replaced by whatever random stuff was on the stack when that method started. Bad times.

No comments:

Post a Comment