Sunday, December 12, 2010

Shiny things.

Now that some of the smaller core pieces are out of the way, we'll move on to the core feature that Tangent provides: the one feature that has consistently generated positive feedback, and which grows out of the concepts we've discussed earlier. But first, a little historical background about how this feature developed.

One of the key things we've talked about already is the desire to provide arbitrarily named infix functions. Operator overloading on symbols leads to weird redefinition of behavior, yet entities (and games in general) want operations that work between two types without necessarily belonging to either. Infix notation makes that a lot nicer. So we'd end up with something like:

(Ogre)smash(Knight);

Which is nice, readable and concise. Its definition was a little awkward, but doable:

public operator void smash(Actor subject, Actor target){ ... }

But the parens would quickly get out of hand once we start nesting these. Is it possible to just have:

Ogre smash Knight;

Since the order of operation inference doesn't really care about the parens, it's a simple matter to make them not required in the syntax.

But then the programmer would want to do something like this:

Ogre smash Knight with rock;

The only way to really make this work is to have the smash function return something that takes whatever type some global 'with' happens to have, which in turn returns something that takes a Weapon, and then do all the work to curry things together so all the parameters can be used at once.

That sucks.

So the thought went towards two elements to make that better. The first is kind of simple. 'with' in the above example would need to be some global with a special type (or something similar) to get picked up properly (and/or prevent another identifier with the same signature being valid). But I'm writing the compiler. Why not have something 'just work', or even generate those types myself?

No reason at all. Tangent thus provides the concept of explicit identifier parameters. A function can then be specified to take the literal identifier 'with', which takes priority over variables in overload resolution. For example (using current syntax):

public   foo('bar': bar) => void { ... }
public   foo(int: bar) => void { ... }

// later in code
local int: bar = 42;
foo bar;   // calls the 'bar' -> void version of foo!
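
As a rough illustration of that priority rule (in Python rather than Tangent, and not a description of how the compiler actually resolves overloads), the lookup might be pictured like this. The helper name `call_foo` and the `scope` dictionary are made up for the example:

```python
def call_foo(token: str, scope: dict) -> str:
    """Resolve `foo <token>` against two overloads, literal identifier first."""
    # Overload 1: foo('bar') matches the identifier itself, before any lookup.
    if token == "bar":
        return "foo('bar')"
    # Overload 2: foo(int) matches a variable in scope that holds an int.
    value = scope.get(token)
    if isinstance(value, int):
        return f"foo(int) with {value}"
    raise TypeError(f"no overload of foo accepts {token!r}")


# A local named 'bar' exists, but the literal-identifier overload still wins.
picked = call_foo("bar", {"bar": 42})
other = call_foo("baz", {"baz": 7})
```

Here `picked` ends up naming the identifier overload even though an int named bar is in scope, while any other int variable falls through to the int overload.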

But that still requires you to define the methods yourself and do all the currying. A whole lot of mechanical work, which of course promptly got pushed to computers. The big thing here is allowing definition of phrases:

public   (Actor: subject) smash (Actor: target) 
  with (Weapon: weapon) => void { ... }

So that the method definition looks like it would be called. The compiler does all of the work of making curried sub-functions, and of making 'with' here an explicit identifier parameter.
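
To give a feel for the mechanical work being pushed onto the compiler, here is a hand-written Python sketch of the kind of curried chain a phrase like this could expand to. It is only an analogy: the `WITH` marker, the class names, and the string return value (the Tangent version returns void) are assumptions for the example, not actual compiler output.

```python
from dataclasses import dataclass


@dataclass
class Actor:
    name: str


@dataclass
class Weapon:
    name: str


class _With:
    """Stand-in for the literal identifier 'with' (hypothetical marker type)."""


WITH = _With()


def smash(subject: Actor):
    # Each step consumes one piece of the phrase and returns the next step.
    def take_target(target: Actor):
        def take_with(marker: _With):
            if marker is not WITH:
                raise TypeError("expected the identifier 'with'")

            def take_weapon(weapon: Weapon) -> str:
                # Returns a string only so the result is visible in the sketch.
                return f"{subject.name} smashes {target.name} with {weapon.name}"

            return take_weapon

        return take_with

    return take_target


# 'Ogre smash Knight with rock' spelled out as explicit applications:
result = smash(Actor("Ogre"))(Actor("Knight"))(WITH)(Weapon("rock"))
```

Writing four nested functions by hand for every phrase is exactly the chore that phrase definitions are meant to remove.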

In the end, we have something that looks like natural language. More importantly, we have something that behaves more like natural language. Tangent has the concept of 'makes sense'. If a statement doesn't shake out to void, it gets tossed at compile time. And the compiler will shake a statement until it does end up void, properly using the terms in their correct context, and handling the meaning overloads that are inherent to natural language. But nobody will use it if it is a pain. Phrase definition should provide a simple, intuitive mechanism for programmers to say what they want.

Having things closer to natural language should make the language very adaptable when it comes to building domain specific languages in Tangent. A smaller gap between the domain and the code should mean fewer translation errors, as well as a shorter spin-up time for new programmers. That holds as long as the drive towards natural language doesn't make the design forget that it is still a programming language.

Assuming that the natural language assumptions are correct, the question then becomes whether order of operation inference makes code too hard for programmers to read, debug, or even write in the first place. Unfortunately, that is something that I think can only be determined by writing code in the thing.

Tuesday, November 30, 2010

Type System basics: Multiple Inheritance and you.

Sadly, I've not been writing as much as I should. More sadly, I've not been working on Tangent as much as I should.

Last time I went over some of the simple tidbits of the type system. Now I'd like to go over some of the... more controversial features; almost all surround Tangent's ability to do multiple inheritance. But first, let's quickly calm those of you out there questioning my sanity with torches and pitchforks.

The original motivation for the language was component based entities in games. One of the great problems there is how to get the components together into stuff that does more than its parts, and how to get those parts working together. Modern languages make it difficult to do simple type composition that re-exposes the behaviors while sharing certain logical concepts. Multiple inheritance is always tried, but quickly dispensed with because... well, it sucks. It seemed as if multiple inheritance would be a fine solution if it didn't suck, so Tangent aims for that.

One of the most common threads about component based entities is how to share/reuse some components that seem dreadfully common. Position, for example. If you have a component that handles movement, it needs to adjust some position. If you have a physics handler too, then it needs that Position; so do a renderer, AI... Oftentimes the position just gets dumped into the entity itself. Not a huge deal for Position, but rather self-defeating for less ubiquitous traits.

Tangent addresses this problem by providing property access to fields, and supporting some syntactic sugar:


public class Moveable{
    public abstract Point  Position;
    public Move(Direction to) => void{ ... }
}

Here, Position looks like an abstract field (and in earlier models, actually was). It is actually sugar to require a read/write property. Since fields expose the property, it allows the programmer to implement the requirement however they like. Better yet it makes things consistent within the language itself.

Moveable then has its dependent component encoded into the type system. As long as it is aggregated with something that implements the abstract property, it will work happily.
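
For readers who want something runnable, here is a loose Python analogy using the standard `abc` module. It is not Tangent, and the `Ogre` and `move` names are invented for the example; the point is only that the component declares a property requirement and any aggregate that supplies it, even as a plain field, satisfies it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


class Moveable(ABC):
    # The component requires a readable and writable position,
    # but says nothing about how it is stored.
    @property
    @abstractmethod
    def position(self) -> Point: ...

    @position.setter
    @abstractmethod
    def position(self, value: Point) -> None: ...

    def move(self, dx: int, dy: int) -> None:
        current = self.position
        self.position = Point(current.x + dx, current.y + dy)


class Ogre(Moveable):
    # A plain field is enough to satisfy the requirement.
    position = Point(0, 0)


ogre = Ogre()
ogre.move(1, 2)
```

Trying to instantiate `Moveable` on its own raises a `TypeError`, which is the Python counterpart of the requirement being encoded in the type.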

Which leads us to some of the common multiple inheritance problems. Tangent has two cases of the Diamond Problem. The first:

public goose class A{}
public goose class B: A{}
public goose class C: A{}
public goose class D: C with B{}

This isn't really multiple inheritance. The 'with' operation creates an anonymous type where C inherits B. The dispatch then proceeds like single inheritance would. To use B's method in certain cases, you need to explicitly override it in D and then invoke B's version (syntax TBD).
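
Python's method resolution order happens to behave in a comparable way for this shape of hierarchy, so it makes a handy (if imperfect) analogy. The `who` method and class `E` below are invented for the illustration; nothing here claims that Tangent uses the same algorithm.

```python
class A:
    def who(self) -> str:
        return "A"


class B(A):
    def who(self) -> str:
        return "B"


class C(A):
    def who(self) -> str:
        return "C"


class D(C, B):
    # Roughly 'D: C with B': lookup walks a single line D, C, B, A.
    pass


class E(C, B):
    # Getting B's behavior takes an explicit override that names B.
    def who(self) -> str:
        return B.who(self)


order = [cls.__name__ for cls in D.__mro__]
```

The diamond gets flattened into one chain, so the left-hand type wins unless the inheritor says otherwise.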

The second:

public class A{}
public class B{} // implements A
public class C{} // implements A
public class D{} // implements B and C

public  foo(A:arg)=>void{}
public  foo(B:arg)=>void{}
public  foo(C:arg)=>void{}

//somewhere in code
local A: d = new D;
foo(d);


Since Tangent supports dynamic dispatch on parameters, the runtime type of d is used to determine which overload to use. D implements both B and C, and neither foo(B) nor foo(C) is more specific than the other, so the call is ambiguous. Sadly, there's no great solution here. If the local was declared as D you would get a compile time error. With this code, you will get an exception. Static analysis can identify methods that aren't 'closed' as far as the type system is concerned. It'll likely be a compiler flag.
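
The failure mode can be reproduced with a small Python sketch of runtime overload selection. This is a simplified model under some loud assumptions: nominal inheritance stands in for 'implements', and the `dispatch` helper and `AmbiguousDispatch` exception are made up for the example.

```python
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass


class AmbiguousDispatch(Exception):
    pass


def dispatch(overloads, arg):
    """Pick the most specific overload for the runtime type of arg."""
    applicable = [t for t in overloads if isinstance(arg, t)]
    # Keep candidates that no other applicable candidate strictly refines.
    best = [t for t in applicable
            if not any(u is not t and issubclass(u, t) for u in applicable)]
    if len(best) != 1:
        raise AmbiguousDispatch(
            f"{len(best)} candidate overloads for {type(arg).__name__}")
    return overloads[best[0]](arg)


# One entry per overload of foo, keyed by parameter type.
foo = {
    A: lambda arg: "foo(A)",
    B: lambda arg: "foo(B)",
    C: lambda arg: "foo(C)",
}

picked_for_b = dispatch(foo, B())

try:
    dispatch(foo, D())
    outcome = "resolved"
except AmbiguousDispatch:
    outcome = "ambiguous"
```

Adding a foo(D) overload would 'close' the method for this hierarchy, which is the sort of gap the static analysis mentioned above would report.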

The other common issue is what order to run constructors/destructors/etc. Here is where Tangent takes a little deviation. There are no constructors. Tangent allows only field specific initializers. The inheritor (or the left side of the 'with' operation) wins if there's a collision on non-private fields.

Combined with some of the other language features to accommodate the 'I need some value to initialize the invariant!' use of constructors, this should provide a good mechanism for type composition without many of the headaches found in other implementations (but surely a few of its own).

Here's hoping for more frequent work/posts!

Saturday, November 6, 2010

Get your ducks in a row: Type System basics

When starting out thinking through what Tangent needed to be, I started with the type system. When I went to implement the thing and ran into issues, they got worked out by starting with the type system.

This isn't exactly a huge surprise. The type system is what prevents a lot of good options when doing component based architecture in gamedev (the original language motivator). There's a not insignificant push to dynamic languages because of problems with static typing. In my experience though, dynamic languages tend to fall down in practice with larger systems and/or more programmers.

That isn't to say there aren't improvements to be had. That is what Tangent aims for: a flexible version of static typing that gets in your way less.

The most straightforward change is to use a Structural Type System. Simply put, if you have the C# types:


public interface SomethingWithName{
    string Name{get;}
}

public class Pirate{
   public string Name{get; set;}
}


In a structural type system, Pirate is an acceptable subtype of SomethingWithName despite not explicitly inheriting from it. This is pretty similar to duck typing found in dynamic languages, but is statically checked. It allows more generic code, since methods can focus on the parts they need, however the object supplies them. The writer doesn't need to know what that is, and the consumer of the method doesn't need to inherit the right interface just to use something that would work fine.
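
Python's `typing.Protocol` gives a runnable taste of the same idea. This is an analogy rather than Tangent: the static check comes from an external type checker such as mypy, and the `greet` function is invented for the example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SomethingWithName(Protocol):
    @property
    def name(self) -> str: ...


class Pirate:
    # Note: Pirate never mentions SomethingWithName.
    def __init__(self, name: str) -> None:
        self.name = name


def greet(thing: SomethingWithName) -> str:
    # Only the part of the object that is needed is named in the signature.
    return f"Ahoy, {thing.name}!"


greeting = greet(Pirate("Anne"))
matches = isinstance(Pirate("Anne"), SomethingWithName)
```

The `isinstance` check only confirms that a `name` attribute is present, so it shares the weakness of any structural match: having the right shape does not guarantee the right meaning.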

Unless it doesn't. There is certainly the case where a type might meet the criteria for an interface, but not implement the 'spirit' of the interface. You'll then get some happy runtime errors when the generic method does the entirely wrong thing. To avoid some of that (and to help inter-op with other languages) Tangent also provides a mechanism to disable that behavior. In version 1, the keyword goose (read: not duck) was used to tag a type declaration. A type with the goose tag requires that a subtype explicitly inherit from it to be considered a subtype, just like C# and its nominally typed kin. Unless I hear a better option, goose is likely to remain the keyword.


For next time, we'll go over some of the other features of the type system, as well as our first unpopular compromise.

Friday, November 5, 2010

Introductions

Since you're here, I presume you're curious about what Tangent is, and if you should spend your very valuable web browsing time to care. The original motivation for the language was the difficulty in making component based designs for game development. That led to a few experiments, which led to wanting to try out some features, which led to Version 1. Version 1 involved quite a bit of fumbling around, bad ideas, and the prototyping you'd expect from a version 1. This journal will focus on the design and development of version 2.


Tangent is designed to be a general purpose programming language. It will end up being 'higher level' than C# and Java. It is statically typed. Beyond that, it is fairly peculiar and hard to categorize. The core concept that many of the features build off of is Order of Operation Inference.

Type Inference is a well known feature of programming languages. You have a set of known operations on known types in a known order, and the compiler figures out what the resultant type of those operations is. Order of Operation Inference turns that on its head. Tangent forces each statement to result in void. It also compiles all of the Type information before compiling the Executable information. Now, with a known resultant type and known operations on known types, the compiler figures out what order of those operations 'works' (with a few constraints/preferences to cut down on ambiguity and the search pool).
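
A toy version of that search fits in a few lines of Python. This is only a sketch under simplifying assumptions, not Tangent's actual algorithm: types are plain strings, a function type is a made-up tuple `(side, param, result)` saying which neighbour it consumes, and the first order that reduces to void is returned with no preference rules.

```python
def infer(terms):
    """Return a parenthesised string for the first application order that
    reduces the whole statement to void, or None if no order works."""
    if len(terms) == 1:
        text, typ = terms[0]
        return text if typ == "void" else None
    for i in range(len(terms) - 1):
        (ltext, ltype), (rtext, rtype) = terms[i], terms[i + 1]
        merged = None
        # Left term is a function that consumes its right neighbour.
        if isinstance(ltype, tuple) and ltype[0] == "right" and ltype[1] == rtype:
            merged = (f"({ltext} {rtext})", ltype[2])
        # Right term is a function that consumes its left neighbour.
        elif isinstance(rtype, tuple) and rtype[0] == "left" and rtype[1] == ltype:
            merged = (f"({ltext} {rtext})", rtype[2])
        if merged is not None:
            found = infer(terms[:i] + [merged] + terms[i + 2:])
            if found is not None:
                return found
    return None


# 'plus' takes an int on its left, then yields a function wanting an int on its right.
INT_TO_INT = ("right", "int", "int")
statement = [
    ("print", ("right", "int", "void")),
    ("2", "int"),
    ("plus", ("left", "int", INT_TO_INT)),
    ("3", "int"),
]

order = infer(statement)       # "(print ((2 plus) 3))"
nonsense = infer([("2", "int"), ("3", "int")])  # None: nothing reduces to void
```

Note how the tempting first step, `print 2`, is tried and abandoned because the leftover `plus 3` can never reach void; the search backs up and finds the grouping a reader would expect.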

The original motivation for this was to allow user defined free functions that could act as infix operators and have arbitrary names. Unfortunately, conventional order of operations is pretty much unusable with arbitrary operators. You either just read things from left to right (which sucks when infix operators were the main goal, and usually leads to Lisp-like paren overload anyway) or you have the programmer specify some priority (which never works, since the priority has to line up correctly with random priorities set by other programmers). Inferring the order of operations provides a mechanism to solve that problem without forcing the programmer to do more than they would normally do. Further, it provides interesting behavior that can be utilized for other features.


Next time, we'll go into some of the type system basics, as well as some of the practical implications of Order of Operation Inference.