Sunday, October 23, 2011

Code generation rework and cheap parlor tricks.

Far too long since my last post. I'd like to say I'd been hard at work making cool new features, but if you've been reading at all, you'll know how much of a lie that is. Plus now I'm sleepy, so instead of doing compiler debugging while sleepy I'll get the handful of non-robotic journal readers some amusement.

The big change going on recently is re-working how I was doing the dynamic dispatch in Tangent. The previous way involved defining a method on my runtime object, sending in a parameter, doing some inspection and then calling the right method out of a dictionary. Quick, easy, functional. But it rubbed me the wrong way (not that way), and was increasingly causing me grief when trying to plan out next steps. It's poor performance, it's difficult to optimize, it's easier to debug, but harder to incorporate native .NET calls. And it makes parameter aggregation painful.

So that is out, now replaced with actual bonafide compiled CIL. The compiler makes a few quick optimizations based on types, but mostly just brute force dispatches based on what the method overloads are for that instance. And on the upside, it now nicely throws exceptions if we get into a weird case where a call is ambiguous or unknown. And (the core reason behind the work since last post), methods actually can accept and curry parameters now. All of the examples before were simple methods or only required the type of the parameter, not the value. Alas, there's still some bugs there so even the previous code examples don't work at the moment.

But I'm going to talk a little bit about what's in the works since it's both nifty, and a horrible idea.

One of the main goals of Tangent is to allow unparalleled ease and flexibility in creating domain specific languages, or even more general purpose dialects that can interoperate and evolve as programmers need them to. The core syntax defines some syntax just to make the basics familiar to programmers. Function declaration, class declaration, type operations, assignment, return... But I'd like to keep the set of keywords (and reserved symbols) relatively small.

The experimentation recently has involved just how small can be reasonably provided to the programmer and still be able to provide via some core library that provides the dialect of Tangent that is used. Once these little bugs are taken care of, Tangent should be able to support in-code definition of your standard if/else block.

For those of you who have not fainted or are on your way to stop more of this insanity, here's some of the details. The first core bit I need is the dynamic dispatch supported by methods already. And I need to be able to define specialized symbols/identifiers in source. You've seen these before.

There are two new things that you've not seen. The first is the Lazy operator (which might get a better name later). It is a built-in type operator that asks for a fully bound method resulting in the specified type. Since every access in Tangent (currently) is done via curried method, this works pretty well. So in Tangent you can define a type like ~> void which is simply 'some expression that results in void'. Once a variable of that type is put into an expression where it needs to provide void the variable is executed. This sort of thing will be used many places where you'd normally pass a lambda, but will (ideally) lead to cleaner, more readable code at the call site.

The second concept is simply that the language will treat your standard block as a nullary closure. Which is just fancy terminology for method of type ~> void. It looks like a block you've used, it behaves like a block you've used, but it can be passed into methods and it can be stored in variables.

Using these pieces, we can define the standard if/else block in code (using an enum, since .NET bools aren't accessible yet):


Type bool => enum{ values{ True, False } };
if (condition: bool) (block: ~> void) => void { block; };
if (condition: bool) (block: ~> void) 
  else (else block: ~> void) => void { block; };
if (condition: bool.False.Type) (block: ~> void) => void { };
if (condition: bool.False.Type) (block: ~> void) 
  else (else block: ~> void) => void { else block; };
entrypoint => void {
  if bool.True {
    debug "True!";
  };
};

Untested of course, since the code gen is still buggy. Hopefully there's not some trivial bug, but I'm not sure anyone reads this anyways.

One interesting trick here is specializing the method on False. Why false? Because remember that null in Tangent is essentially the top type. If I specified True as the specialization, null would trigger that. If I specified both True and False, then sending in null would cause an AmbiguousInvocationException. This way, if you send in null, it behaves as False. And the scary/nifty thing? If your dialect wants different behavior, you can go in and define that.

Thursday, September 15, 2011

The dark depths of this beast.



First a warning for those with heart conditions, epilepsy, or are pregnant; this post contains code snippets that have maimed the hale and healthy. YOU'VE BEEN WARNED!

One of the things I noticed while doing the last bit of work is how often the core part of the compiler iterates over the same damned thing. It needs to take a sequence of tokens and figure out what order of execution is necessary so that they 'make sense'. It's the core idea of the language, and making it work well, reliably, and fast/scalably is going to be key to the thing being viable. And frankly, its the sort of thing that the experimental language is there to experiment with.

What I was doing was simply iterating through the collection of tokens, if I found one to reduce, I did the reduction and then via recursion worked on the now smaller collection. Eventually I'd try everything and it'd fail, or I'd find what I was looking for and exit out early.

That though starts at the beginning each time. And it doesn't take any sort of type info into account when reducing things to see if we're "closer" to our answer. And it provides poor support for partial consumption, like the implicit feature I discussed earlier (this was actually the downfall of the previous iteration).

So I spend the past few days thinking and tonight redoing that core bit of code. And I now have the most horrible C# I've ever seen:

 using System;  
 using System.Collections.Generic;  
 using System.Linq;  
 using System.Text;  
 using Tangent.Lang.Compilation.IntermediateForm;  
 using System.CodeDom.Compiler;  
 using System.Threading.Tasks;  
 using System.Collections.Concurrent;  
   
 namespace Tangent.Lang.Compilation {  
   public partial class Compiler {  
     private IEnumerable<List<List<Element>>> UnflattenElements(List<Element> flatElements, Type targetType, List<Type> scope, CompilerErrorCollection errors, ConsumeDirection consumeDirection) {  
   
       System.Diagnostics.Debug.WriteLine("[{0}] <- {1}", targetType, string.Join(":", flatElements));  
   
       if (targetType == Type.ExecutionToken) {  
         yield return new List<List<Element>>() { flatElements };  
         yield break;  
       }  
   
       if (flatElements.Count == 0) {  
         yield break;  
       }  
   
       Element test = null;  
       if (consumeDirection == ConsumeDirection.Right) {  
         test = flatElements.First();  
       } else {  
         test = flatElements.Last();  
       }  
   
       if (targetType.IsExplicitIdentifier) {  
         if (!string.IsNullOrWhiteSpace(test.Identifier) && targetType == LiteralTypeFor(test.Identifier)) {  
           List<Element> rework = new List<Element>(flatElements);  
           rework[consumeDirection == ConsumeDirection.Right ? 0 : flatElements.Count - 1] = new IdentifierLiteral(test.Identifier, LiteralTypeFor(test.Identifier));  
           yield return new List<List<Element>>() { rework };  
           yield break;  
         }  
       }  
   
       if (flatElements.Count == 1) {  
         if (flatElements[0].EffectiveType != null && flatElements[0].EffectiveType.IsSubTypeOf(targetType)) {  
           yield return new List<List<Element>>() { flatElements };  
           yield break;  
         }  
       }  
   
       if (test.EffectiveType != null && test.EffectiveType.IsSubTypeOf(targetType)) {  
         yield return new List<List<Element>>() { flatElements };  
       }  
   
       int start = consumeDirection == ConsumeDirection.Right ? 0 : flatElements.Count - 1;  
       int end = consumeDirection == ConsumeDirection.Right ? flatElements.Count : -1;  
       int step = consumeDirection == ConsumeDirection.Right ? 1 : -1;  
       for (int i = start; i != end; i += step) {  
         IEnumerable<List<MethodDefinition>> possibilityStack = null;  
         List<Element> currentResolutionState = null;  
         if (flatElements[i] is ReResolveElement) {  
           possibilityStack = ResolveIdentifier(flatElements[i].Identifier, scope, errors)  
             .Signal(elements => currentResolutionState=elements)  
             .Select(elements => elements.Select(element => element.EffectiveType.ReductionRules[0]).ToList());  
   
         } else if (flatElements[i].EffectiveType.ReductionRules.Any()) {  
           possibilityStack = new List<List<MethodDefinition>>() { new List<MethodDefinition>(flatElements[i].EffectiveType.ReductionRules) };  
         }  
   
         if (possibilityStack != null) {  
           foreach (var possibles in possibilityStack) {  
   
             while (possibles.Any()) {  
               HashSet<MethodDefinition> best = HighestPriorityMethods(possibles, consumeDirection == ConsumeDirection.Right ? ConsumeDirection.Left : ConsumeDirection.Right);  
               possibles.RemoveAll(def => best.Contains(def));  
               List<Element> paramElements;  
   
               if (best.First().Direction == ConsumeDirection.Right) {  
                 if (i == flatElements.Count - 1) {  
                   paramElements = new List<Element>();  
                 } else {  
                   paramElements = new List<Element>(flatElements.GetRange(i + 1, flatElements.Count - i - 1));  
                 }  
               } else {  
                 if (i == 0) {  
                   paramElements = new List<Element>();  
                 } else {  
                   paramElements = new List<Element>(flatElements.GetRange(0, i));  
                 }  
               }  
   
               Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>> possibleParamPermutations = new Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>>();  
               foreach (var entry in best) {  
                 possibleParamPermutations[entry] = UnflattenElements(paramElements, entry.Consumes, scope, errors, entry.Direction).GetEnumerator();  
               }  
   
               while (possibleParamPermutations.Any()) {  
                 bool permutationsLeft = false;  
                 do {  
                   permutationsLeft = false;  
                   List<List<Element>> passback = new List<List<Element>>();  
                   Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>> possibleSolutions = new Dictionary<MethodDefinition, IEnumerator<List<List<Element>>>>();  
   
                   foreach (var entry in best.Where(def => possibleParamPermutations.ContainsKey(def))) {  
                       
                     if (possibleParamPermutations[entry].MoveNext()) {  
                       permutationsLeft = true;  
                       foreach (var paramPermutation in possibleParamPermutations[entry].Current) {  
                         possibleSolutions[entry] = UnflattenElements(Reduce(flatElements, i, paramPermutation, entry, currentResolutionState), targetType, scope, errors, consumeDirection).GetEnumerator();  
                         if (!possibleSolutions[entry].MoveNext()) {  
                           possibleSolutions.Remove(entry);  
                         }  
                       }  
                     } else {  
                       // no op; eventually permutationsLeft will remain false.  
                       possibleParamPermutations.Remove(entry);  
                     }  
                   }  
   
                   if (possibleSolutions.Any()) {  
                     List<MethodDefinition> solutionDefs = new List<MethodDefinition>(possibleSolutions.Keys);  
                     while (solutionDefs.Any()) {  
                       HashSet<MethodDefinition> bestSolutions = HighestPriorityMethods(solutionDefs, consumeDirection == ConsumeDirection.Right ? ConsumeDirection.Left : ConsumeDirection.Right);  
                       solutionDefs.RemoveAll(def => bestSolutions.Contains(def));  
   
                       List<MethodDefinition> cleanup = new List<MethodDefinition>();  
                       while (bestSolutions.Any()) {  
                           
                         foreach (var bestSol in bestSolutions) {  
                           System.Diagnostics.Debug.WriteLine(string.Format("({2}) {0} : [{1}]", bestSol, string.Join(" =||= ", possibleSolutions[bestSol].Current.Select(line => string.Join(":", line))), flatElements.Count));  
                           passback.AddRange(possibleSolutions[bestSol].Current);  
                           if (!possibleSolutions[bestSol].MoveNext()) {  
                             cleanup.Add(bestSol);    
                           }  
                         }  
   
                         bestSolutions.RemoveWhere(key=>cleanup.Contains(key));  
                         cleanup.Clear();  
                         yield return passback;  
                       }  
                     }  
                   }  
                 } while (permutationsLeft == true);  
               }  
             }  
           }  
         }  
       }  
     }  
   
     private List<Element> Reduce(List<Element> flatElements, int ix, List<Element> paramPermutation, MethodDefinition signature, List<Element> currentResolutionState) {  
       List<Element> result = new List<Element>();  
       Func<int, Element> getReductionElement = (i) => {  
         if (flatElements[i] is ReResolveElement) {  
           if (currentResolutionState == null) {  
             throw new ArgumentNullException();  
           }  
   
           return currentResolutionState.Where(e => e.EffectiveType.ReductionRules[0] == signature).First();  
         } else {  
           return flatElements[i];  
         }  
       };  
         
       if (signature.Consumes == Type.ExecutionToken) {  
   
         if (ix != 0) {  
           result.AddRange(flatElements.GetRange(0, ix));  
         }  
   
         result.Add(new ReductionElement(getReductionElement(ix), new Literal("", Type.ExecutionToken, null), signature));  
   
         if (ix + 1 != flatElements.Count) {  
           result.AddRange(flatElements.GetRange(ix + 1, flatElements.Count - ix - 1));  
         }  
   
         return result;  
       }  
   
       if (signature.Direction == ConsumeDirection.Right) {  
         if (ix != 0) {  
           result.AddRange(flatElements.GetRange(0, ix));  
         }  
   
         if (signature.Consumes.IsExplicitIdentifier) {  
           result.Add(new ReductionElement(getReductionElement(ix), new IdentifierLiteral(paramPermutation[0].Identifier, LiteralTypeFor(paramPermutation[0].Identifier)), signature));  
         } else {  
           result.Add(new ReductionElement(getReductionElement(ix), paramPermutation[0], signature));  
         }  
   
         if (paramPermutation.Count != 1) {  
           result.AddRange(paramPermutation.GetRange(1, paramPermutation.Count - 1));  
         }  
   
         return result;  
       }  
   
       if (signature.Direction == ConsumeDirection.Left) {  
         if (paramPermutation.Count != 1) {  
           result.AddRange(paramPermutation.GetRange(0, paramPermutation.Count - 1));  
         }  
   
         if (signature.Consumes.IsExplicitIdentifier) {  
           result.Add(new ReductionElement(getReductionElement(ix), new IdentifierLiteral(paramPermutation.Last().Identifier, LiteralTypeFor(paramPermutation.Last().Identifier)), signature));  
         } else {  
           result.Add(new ReductionElement(getReductionElement(ix), paramPermutation.Last(), signature));  
         }  
   
         if (ix != flatElements.Count - 1) {  
           result.AddRange(flatElements.GetRange(ix + 1, flatElements.Count - ix - 1));  
         }  
   
         return result;  
       }  
   
       throw new ApplicationException("Got to unreachable code somehow...");  
     }  
   
     private static HashSet<MethodDefinition> HighestPriorityMethods(List<MethodDefinition> candidates, ConsumeDirection preferredDirection) {  
       if (candidates.Count <= 1) {  
         return new HashSet<MethodDefinition>(candidates);  
       }  
   
       HashSet<MethodDefinition> best = new HashSet<MethodDefinition>();  
       best.UnionWith(candidates.Where(def => def.Consumes == Type.ExecutionToken));  
       if (best.Any()) {  
         return best;  
       }  
   
       best.UnionWith(candidates.Where(def => def.Consumes.IsExplicitIdentifier && def.Direction == preferredDirection));  
       if (best.Any()) {  
         return best;  
       }  
   
       best.UnionWith(candidates.Where(def => def.Consumes.IsExplicitIdentifier));  
       if (best.Any()) {  
         return best;  
       }  
   
       var workset = candidates.Where(def => def.Direction == preferredDirection).ToList();  
       if (workset.Count == 0) {  
         workset = candidates;  
       }  
   
       if (workset.Count == 1) {  
         return new HashSet<MethodDefinition>(workset);  
       } else if (workset.Count > 1) {  
         var superBest = workset.Where(candidate => workset.TrueForAll(other => candidate == other || candidate.IsMoreSpecificThan(other)));  
         if (superBest.Any()) {  
           return new HashSet<MethodDefinition>(superBest);  
         } else {  
           return new HashSet<MethodDefinition>(workset);  
         }  
       }  
   
       return new HashSet<MethodDefinition>(candidates);  
     }  
   }  
   
   public static class IEnumerableSignalExt {  
     public static IEnumerable<T> Signal<T>(this IEnumerable<T> collection, Action<T> signal){  
       foreach (var entry in collection) {  
         signal(entry);  
         yield return entry;  
       }  
     }  
   }  
 }  

The yielding allows me to try the best paths first. If we find a solution, we can exist out early. Unfortunately for maintainers (me), there's like 3 different places that can happen. And then we usually end up about 10 steps deep in recursion within this construct.

And the saddest thing of all is that this is actually easier to debug than what was there. It's harder to maintain, and certainly harder to understand. But performance is exponentially better even though this nests loops like it's going out of style. And it will support features in the future like partial consumption of a sub-expression based on user-defined methods.

Monday, September 5, 2011

Long weekend - Part 2: Dynamic Dispatch

Alas, Sunday was a wasted day due to a bad (legit) windows 7 ISO burn during my computer upgrades. I got to reinstall the OS just so I could re-download and burn the iso (this time checking md5's).

But now I have a beefy new I7 sitting under my desk. Woot.

Today I went through and did some more compiler and code-gen bugfixes so that I can tell you things without being a damned dirty liar. Tangent now properly supports dynamic dispatch for simple enums. Adding to our last example:

Type foo => enum{ values{ a, b+c } };
stuff (x: foo) => void { debug "in stuff."; }
stuff (x: foo.a.Type) => void { debug "in a stuff."; } 
entrypoint => void {
  debug " + "\"entrypoint\"" + @";
  stuff foo.b+c;
  stuff foo.a;
}


Here we overload stuff so that it can take any foo, but if it gets foo.a then that more specific implementation should be used. So you get the nice output:

entrypoint
in stuff.
in a stuff.


Yes yes, this isn't actually dynamic dispatch; just overloading methods on parameters. Once I can actually store variables, it'll be dynamic dispatch (another post for another time). C# supports it if the method takes a dynamic, but that isn't statically checked, few people know about it, and it's not super important in C# (plus dynamic methods are limited in where they can be used). In Tangent though using type context for terms that are ambiguous in other languages (but clear to humans) is the core of what it's trying to experiment with. This sort of method refinement on types/capabilities is a big feature that is likely to be a large part of the language idioms as it grows.

The other thing shown here is the Type member. This is autogenerated onto every type at the moment. For constant/sealed things like enums, it is typed as simply another reference to the constant (anonymous) type that enum values have. For classes, it is typed to return a kind of T. This makes sense since we're guaranteed that any subtypes will fit into that kind. It allows us to get static typing when working with types of variables for cloning, serialization, and factory sort of things.

2 days left in the long weekend. I hope to get another little bit of work and a post done, but I suspect work and (moreso) Disgaea 4 will derail that plan.

Saturday, September 3, 2011

Long weekend - Part 1: A Tiny Taste of Evil

Alas long hours of work, piles of social obligations and a superhuman aptitude for laziness have kept me from working much on the language; let alone posting to a blog that gets a whole half-dozen hits per week. But I have a 5 day weekend, and while work will still draw me in for some of the time (and laziness most of the time) I do hope to get some Tangent work done. Hell, some of that might even be feature work instead of bug fixing.

Today for example I got the last of the compiler bugs dealt with such that I can compile and run a simple Tangent program:

Type foo => enum{ values{ a } };

stuff( x:foo ) => void{
  debug "in stuff.";
}

entrypoint => void{
  debug "in entrypoint.";
  stuff foo.a;
}


Here debug is a hacked in keyword to Console.WriteLine string constants. Hurrah laziness. This runs nicely and prints the two lines as you would expect. One thing that is nifty/evil about this incarnation of Tangent is the ability to use phrases in more places than method declarations. So even though we can only do basic things at the moment, all of the syntax inference exists to do evil things like:

Type foo bar => enum{ values{ a, a+b } };

stuff( x:foo bar ) => void{
  debug "in stuff.";
}

entrypoint => void{
  debug "in entrypoint.";
  stuff  foo bar.a + b;
}


This too prints the same two lines even in this elemental stage. Note that the added spaces in stuff foo bar.a + b; are not really necessary for the compiler. The inference of operations is done on Type, not spacing. You'll still need spaces to distinguish foo bar and foobar (2 identifiers vs 1) but symbols do not need spacing to distinguish them.

In this example, it's just mostly evil; dot as member access doesn't work fantastic for the literate syntax. The idea though is that it will provide better, cleaner code to read when programmers create phrases to describe their problem domain.

Hopefully the extra-long weekend will allow more work to show more stuff being more cool.

Sunday, August 7, 2011

Compiler bug, an example why Tangent is non-trivial

Especially in the bootstrapping stage, programming languages aren't fun to work with. Tangent takes that usual un-fun and makes it a pile more un-fun because of features other developers had the good sense to avoid.

Take an exception generated by the runtime today:


Unable to invoke: No method exists on the type "enum? -> 'a'" of the signature "enum? -> 'a'".


Which is... awesome. At least there's something approximating a decent error message, even if it's still unacceptably bad. And since I managed to get this thing actually compiling (horrifically, see last post) debugging it is a pain. Actually it's not that bad. The runtime is in nice, un-code generated C# so I could debug it, but that's the thing; Tangent has enough runtime code (in this case the dynamic dispatch) that other languages don't have that I spend a bit of time writing and later debugging happy error messages like this.

So I manage to cut down the test program so there's only one enum and only one value. Still doesn't work so it's not just 2 different enums not comparing right due to my lack of name capture. Step into code to see that the types have the same name, the same goose names (see 2nd post for goose high level info), the same reduction rules... everything that should determine type equivalence... but they're not the same type.

Which is odd. There's only one type with this name in the table... Then I see that one is the literal type (a kind) and the other is a type value. A kind is not a type, so that's why the mismatch happens. Fair enough, I screwed up the level when doing code generation. So I hunt through the compiled bits to find that there really is only one type that matches this, and it's the kind (as it should be).

So I eventually found it out. The issue is with gooses and how I track type literals. Gooses (as talked about earlier) are flags to say 'you must inherit this class to be a subtype of this class'. How does that work? By keeping references to types that are inherited. A goose type has a reference to itself (to differentiate itself from a type with the same members, but is supposed to be ducktyped). This works awesome. I use it (and I suspect users of the language) will use empty goose types to annotate types. Like... for indicating that a type instance is to be treated as a literal T or Kind of T instead of the value T.

See where this is headed? Yeah, for something to be considered a literal T it needs the Type reference in its goose list indicating that it is. The reference it needs is just a static instance on Type, so matching is easy and straightforward. Except that when I was doing code generation, I added the goose representation for it to be created new. Had the same fields, same properties, same data... but wasn't the same reference as the static Type.TypeLiteralGoose. So it wasn't being treated as a Type Literal by the compiled runtime, and caused type matching errors.

Easy enough to fix; reference the static field rather than create a new instance with the same bits. Finding it is the hard part.

Granted, there were more bugs under that one, but that's another story for another time.

Saturday, July 23, 2011

I suck (and other musings)

Yes, it has been a long time. Yes, I've been doing very little on the project. Yes, I decided to restart again. Yes, I suck.

The last iteration, I managed to get completely turned around by kinds (that is, variables that store types). There are actually 3 sorts of types in the current iteration of Tangent:

- Types: Your (mostly) run of the mill types. They represent some possible set of values. 1, 3.14, "foo", etc.

- Type Literals: These are internal to the compiler. When you declare a new type, it's a type literal. The compiler knows exactly what it is, and marks it such that you can do operations on that (at compile time) and get a known answer out. So when you resolve the identifier 'string' you're accessing a variable (treated like any other static constant under the hood) for the type. This is what the last iteration neglected to understand.

- Kinds: These are type variables. Think C# generics. Tangent's Kinds require a type constraint (even if that constraint is 'anything'), but at that point they're a variable like any other. You can save them, modify them, use them as method parameters. The type checker uses the type constraint (just like C# does) to enforce that you only call methods on the type that the constraint has.

Now that I'm dealing with that properly, I have the compiler kinda sorta rebuilt. The only good thing about this is that I've decided to try and learn from the previous half dozen iterations; so the compiler actually compiles into an exe. Every other iteration either didn't get to runable code, or interpreted the intermediary form right in C#. The interpetedness made the code dog slow (and I had to implement all the interpreting making development slow) which was the death of that iteration.

Granted, the crime against humanity that I'm actually outputting is currently unoptimized and likely just as dog slow. For example:

  stuff foo.a; 

where stuff is a simple empty method call and foo.a represents an enum gets compiled out to...


return TangentObject.Global.Invoke(__tangentTypeCollection.__type0.ReductionRules[2], __tangentTypeCollection.__literal_'stuff').Invoke(__tangentTypeCollection.__type9.ReductionRules[0], TangentObject.Global.Invoke(__tangentTypeCollection.__type0.ReductionRules[6], __tangentTypeCollection.__literal_'.').Invoke(__tangentTypeCollection.__type25.ReductionRules[0], TangentObject.Global.Invoke(__tangentTypeCollection.__type0.ReductionRules[1], __tangentTypeCollection.__literal_'foo').Invoke(__tangentTypeCollection.__type6.ReductionRules[0], __tangentTypeCollection.__literal_)).Invoke(__tangentTypeCollection.__type26.ReductionRules[0], __tangentTypeCollection.__literal_'a').Invoke(__tangentTypeCollection.__type28.ReductionRules[0], __tangentTypeCollection.__literal_)).Invoke(__tangentTypeCollection.__type11.ReductionRules[0], __tangentTypeCollection.__literal_);


BLEARGH!

But it works, and it's a nice happy exe; and realistically optimization will (eventually) reduce that entirely to one direct method call.

Two other things of note since the last post involving syntax changes. I've reversed the order of identifier/types in variable declarations. It's now:


[variable]:[type]


Like pretty much every other language that includes a colon in the variable declaration. I imagine once I use it (or F#) the order will be less of an issue.

The other change involves type declarations. The old syntax used to be pretty much C# syntax:


[modifiers] class [: inheritance-list] (; | {[class-declaration-elements]*})


Since I expect anonymous types and type aliases to be more common in the language (and because I want to save myself work) it's been changed to the slightly more simple:


Goose? Type [method-declaration-element]+ => [type-expr];


So the actual declaration will look very similar to method declaration syntax. All of the rules for method declaration apply here. You must have 1 identifier or symbol, and then any number of other identifiers, symbols, or parameters. This I think provides a much nicer way to provide constructors to the language without running into multiple inheritance issues; the constructor is for the type, not the instance. If two constructors with the same name happen to lead to compatible types all the better:


Type foo( initial x : int ) => class {
  x : int = initial x;
};

Type foo => foo(42);

The simple class{} syntax is the anonymous type syntax, which can then be re-used within expressions. Inheritance will be done after the => like any other type operations. And since the language provides kinding support with user-defined operators, I expect those to be kinda awesome (and evil).

Saturday, January 22, 2011

New things

Sadly Tangent development often falls behind bills and food. Now that some time has opened up, we'll talk about actually doing some implementation. This blog was designed to step through those processes a little bit. Provide a mechanism to think about the feature, as well as provide some insight into implementation.

The feature that is the focus today is tentatively called Implicit Lambdas. C# 3 added a variety of extension methods to support operations over arbitrary collections. By default, they could look a bit... chunky:


    list.Where(x=>x.Color == Color.Blue).Select(x=>x.Position);

So the language designers introduced specialized syntax to deal with it:

    from x in list where x.Color == Color.Blue select x.Position;

This certainly looks a little better. Unfortunately, that's the only place that has special syntax. Any other methods that take another method as an argument need to use the full lambda syntax. Tangent has two problems with this. The first is that the parsing mechanism does not particularly lend itself to specialized keywords. The second is that Tangent's lambda syntax requires you specify the return type (so that the order of operation inference can work):


    (type:var,...)=>return-type{exprs}

Which means that we can't just ignore some sort of cleaner syntax to handle cases where lambdas make sense.


Enter Implicit Lambdas. The idea here is that since Tangent knows what method is taking the other method as an argument, we can let it specify what types it wants. Further, since almost every C# lambda ends up with something repetitive like:


    x => x.Color == Color.Blue

We can add some implicit scoping (and variable naming) to the mix. No need to specify x. It essentially (and under the covers) acts as though a new member method is being made on the fly for the type we're making a lambda for. So if we were to define our own where function in Tangent, it'd probably go like this:

    generic(any:T) (IEnumerable<T>: collection) where 
                   (implicit T -> bool: predicate) => yields<T> {
        foreach T:entry in collection {
            if predicate(entry) {
                yield entry;
            }
        }
    }


The modifier implicit here tells the compiler that we can use the automatic scope for that parameter. It's not automatic, because it doesn't make sense for everything. Sometimes it's useful to have the callsite explicitly show that there's a lambda there. So in Tangent, the linq query can be reduced to:


    list where Color == Blue select Position;

(assuming the Tangent convention of having Blue be a global/overload for the color, and select has had a similar treatment).

This feature does lack a little bit where the LINQ syntax does not. It doesn't deal well with multiple variables really; you'll need to have the implicit type be a key, which will probably be awkward to deal with at the callsite. It doesn't deal well when the implicit type you're acting on is actually the value you want to work with. Something like 'double every odd number in the list' is really awkward as it's currently designed.

On the upside this provides an arbitrary, user-definable way to capture expressions without nasty lambda syntax everywhere.