Sunday, December 11, 2016

Generics and you.

Sorry for the delay in writing. I've taken to writing a bit in the various Slacks I'm in, which lend themselves a bit better to the day-to-day work of making things function. But they're not public, so here we are.

Speaking of which, the Slack denizens complained about the lack of documentation, so the GitHub readme is now updated to cover the basics of the language. It's mostly simple snippets to help define syntax and provide a gentle-ish introduction to phrases and the idioms of the language. And of course, writing them up gave me ample opportunity to find bugs.

Most of the bugs over the last three months have been around generics. There are three major places where these bugs came up: in my type system, in codegen to .NET, and in how generics interoperate with closures.

In Tangent, generics are represented by a ParameterDeclaration. It's a simple class that represents the name : type declarations used throughout the language. For generics, the type is the constraint, which for now is almost always any. Parameter declarations aren't types, though. In the type system, you refer to a generic with one of two types: a GenericArgumentReferenceType or a GenericInferencePlaceholder. These are meant to line up with the .NET implementation, where generic parameters in function parameters can be inferred from their arguments, and generic parameters in classes or return values just reference some explicit specification. The core parameter declaration in a type declaration is that explicit specification.
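To make the split between declarations and references concrete, here is a minimal C# sketch. Only the three class names come from the description above; the members, the TangentType base class, the AnyType class, and the GenericIdentityDemo helper are assumptions for illustration, and the real compiler classes may look quite different. The point it shows is that two declarations can both be named T and still be different generics, so references should be compared by declaration rather than by name.

```csharp
using System;

// Hypothetical base class for anything usable as a type in the type system.
public abstract class TangentType { }

// Hypothetical stand-in for the "any" constraint.
public sealed class AnyType : TangentType { }

// A "name : type" declaration. For a generic, the type is its constraint.
public sealed class ParameterDeclaration
{
    public string Name { get; }
    public TangentType Constraint { get; }

    public ParameterDeclaration(string name, TangentType constraint)
    {
        Name = name;
        Constraint = constraint;
    }
}

// Refers to a generic that is specified explicitly elsewhere
// (class generics, return values).
public sealed class GenericArgumentReferenceType : TangentType
{
    public ParameterDeclaration GenericParameter { get; }

    public GenericArgumentReferenceType(ParameterDeclaration genericParameter)
    {
        GenericParameter = genericParameter;
    }
}

// Marks a generic in a function parameter that is inferred from the argument.
public sealed class GenericInferencePlaceholder : TangentType
{
    public ParameterDeclaration GenericParameter { get; }

    public GenericInferencePlaceholder(ParameterDeclaration genericParameter)
    {
        GenericParameter = genericParameter;
    }
}

public static class GenericIdentityDemo
{
    // Two references mean the same generic only if they point at the
    // same declaration object; the name alone is not enough.
    public static bool SameGeneric(GenericArgumentReferenceType a, GenericArgumentReferenceType b)
    {
        return ReferenceEquals(a.GenericParameter, b.GenericParameter);
    }

    public static void Main()
    {
        var any = new AnyType();
        var classT = new ParameterDeclaration("T", any);
        var methodT = new ParameterDeclaration("T", any);

        var first = new GenericArgumentReferenceType(classT);
        var second = new GenericArgumentReferenceType(classT);
        var other = new GenericArgumentReferenceType(methodT);

        Console.WriteLine(SameGeneric(first, second)); // True
        Console.WriteLine(SameGeneric(first, other));  // False
    }
}
```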

And that all fell apart when doing the actual implementation. Consider a simple class declaration.

pair (T) :> (x : T), (y : T) {   
  (this).x : T := x;       
  (this).y : T := y;       
}

T on the left side is the parameter declaration, and all of the other Ts are generic references, yes? Not so much. That compiles into C# code that looks something like this:

public class pair<T> {
  public T x;
  public T y;
}

public static pair<T> ctor<T>(T x, T y) {
  pair<T> result = new pair<T>();
  result.x = x;
  result.y = y;
  return result;
}

public static T getx<T>(pair<T> owner) {
  return owner.x;
}

public static void setx<T>(pair<T> owner, T value) {
  owner.x = value;
}

public static T gety<T>(pair<T> owner) {
  return owner.y;
}

public static void sety<T>(pair<T> owner, T value) {
  owner.y = value;
}

So while the T for the fields is an actual reference to the class's generic, the rest aren't. They get their generic info from the method they're tied to. And of course, since everything is named T, there was ample opportunity for me to get them mixed up.
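One way to see the difference is to give the two kinds of T different names. This is a sketch of the same shape as the generated code, not the actual compiler output: TClass, TMethod, and the PairFunctions class are names made up for illustration.

```csharp
using System;

public class Pair<TClass>
{
    // These refer to the class-level generic.
    public TClass x;
    public TClass y;
}

public static class PairFunctions
{
    // TMethod belongs to each method separately. It only lines up with
    // TClass because Pair<TMethod> in the signature binds the two together.
    public static Pair<TMethod> Ctor<TMethod>(TMethod x, TMethod y)
    {
        var result = new Pair<TMethod>();
        result.x = x;
        result.y = y;
        return result;
    }

    public static TMethod GetX<TMethod>(Pair<TMethod> owner)
    {
        return owner.x;
    }

    public static void SetX<TMethod>(Pair<TMethod> owner, TMethod value)
    {
        owner.x = value;
    }

    public static void Main()
    {
        var p = Ctor(1, 2);         // TMethod is inferred as int
        SetX(p, 10);
        Console.WriteLine(GetX(p)); // prints 10
    }
}
```

With distinct names, mixing up a class-level generic with a method-level one becomes a compile error instead of a silent mistake.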

That is all terrible, and it gets worse. What happens when you toss closures into the picture?

Let's take a look at an implementation of the ternary operator (x?a:b) in Tangent:

(conditional : bool) ? (a : ~>(T)) | (b : ~>T) => T {
  : result : ~> T := b;
  if conditional result = a;
  result;
}

A little weird, but nothing outlandish. We have a phrase that takes a bool, a question mark, something that yields T (think Func<T> in C#), a pipe, and something that yields T again. That phrase will return whatever T is. The gotcha here is line 2 of the function. Because if statements are defined in the language, result = a is inferred to mean () => { result = a; }, meaning I need to close over result and a to make that work. For those of you who are lucky enough to not know how closures are implemented, it compiles into something like this:

public class Closure0<T> {
  public bool conditional;
  public Func<T> a;
  public Func<T> b;
  public Func<T> result;

  public void Implementation0() {
    result = a;
  }
}

public static T ternary<T>(bool conditional, Func<T> a, Func<T> b) {
  Closure0<T> scope = new Closure0<T>();
  scope.conditional = conditional;
  scope.a = a;
  scope.b = b;
  
  scope.result = b;
  // "if" here is the Tangent-defined if, which takes the block as a delegate.
  if (scope.conditional, scope.Implementation0);
  return scope.result();
} 

Which is gross on the best of days. But you'll also note that now we've got class-level generics again. And again some Ts are function-level generics and some Ts are class-level generics. Worse yet, they change depending on whether you're accessing the closure variable from the closure scope or from the function scope. Bug city.
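For anyone who wants to poke at it, here is a compilable C# sketch of that lowering. It rests on a few assumptions: the If method is a made-up stand-in for Tangent's library-defined if, the generics are renamed TClosure and TMethod to show which level each one lives at, and the final result is invoked explicitly, since C# won't evaluate a Func<T> on its own.

```csharp
using System;

public class Closure0<TClosure>
{
    public bool conditional;
    public Func<TClosure> a;
    public Func<TClosure> b;
    public Func<TClosure> result;

    // Body of the lifted "result = a" block. Inside the closure class,
    // the fields are typed by the class-level generic.
    public void Implementation0()
    {
        result = a;
    }
}

public static class Lowered
{
    // Hypothetical stand-in for an if that takes its block as a delegate.
    public static void If(bool condition, Action body)
    {
        if (condition) body();
    }

    public static TMethod Ternary<TMethod>(bool conditional, Func<TMethod> a, Func<TMethod> b)
    {
        // From the function's side, the same fields are typed by the
        // method-level generic, passed in as the closure's type argument.
        var scope = new Closure0<TMethod>();
        scope.conditional = conditional;
        scope.a = a;
        scope.b = b;

        scope.result = b;
        If(scope.conditional, scope.Implementation0);
        return scope.result();
    }

    public static void Main()
    {
        Console.WriteLine(Ternary(true, () => "yes", () => "no")); // prints yes
        Console.WriteLine(Ternary(false, () => 1, () => 2));       // prints 2
    }
}
```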

The good news is that the simple examples above now work. Also, I've gotten enough interop working that I can actually make .NET Lists and add things to them. That required a small change to the language to support explicit generic arguments in functions. That allows stupid code like this:

foo (T) => void {
  print "foo"
}

entrypoint => void {
  foo int

}

But also allows new List<int> to be a phrase that the language can interpret. Next on the docket is interfaces. The interface code I tried to do for the readme doesn't work, I can't use IEnumerable, and I certainly can't make Tangent types work with .NET interfaces or .NET types with Tangent interfaces. 
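For comparison, this is roughly what explicit generic arguments look like on the C# side. It's a sketch rather than the compiler's actual output, and the ExplicitGenerics class and the string returned from Foo are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ExplicitGenerics
{
    // T appears in no parameter, so it can't be inferred from arguments.
    // The caller has to spell it out.
    public static string Foo<T>()
    {
        return "foo for " + typeof(T).Name;
    }

    public static void Main()
    {
        Console.WriteLine(Foo<int>()); // prints foo for Int32

        // Constructing a generic .NET type needs the same explicit argument.
        var list = new List<int>();
        list.Add(42);
        Console.WriteLine(list.Count); // prints 1
    }
}
```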

1 comment:

  1. Speaking of generics, you may find this interesting: https://github.com/ysharplanguage/System.Language
