Tangent Developer Journal: December 2014

Monday, December 15, 2014

Progress - Expression Parsing

Work continues. After an aborted first attempt that did not deal with ambiguity well, and a bout of coder's block, the first swipe at the second stage of the parser is dev-done. The code can (read: should) now parse statements when given an arbitrary identifier stream and a scope. Scopes are simple now. No nesting. Just type constants (which aren't used yet), parameter declarations and functions in separate bundles to keep things simple.

Tests exist for:

step from identifiers to parameter reference
step from identifiers to function binding
step from function binding to function invocation
function consumption of non-terminals (values of a given type)
priority given to parameters when ambiguous
use of lower priority when it can successfully parse

Not that it actually does anything yet. All the code does is take identifiers and give you a parse tree for that statement. More than one parse tree actually if the statement is ambiguous in the current scope. It also likely behaves poorly when there are cycles in the grammar/function declarations.

Still, I am happy to have the core that makes Tangent different there and working. And happy that progress continues

Saturday, December 6, 2014

Progress - Initial Iteration

The first iteration is progressing. As much as I would like type declarations and functions be very similar, they are currently very distinct. This is for simplicity and the whole "don't get fancy" focus. The current language is dead simple (not lisp dead simple, but dead simple compared to what I'd like eventually):

program ::= (type-decl | function-decl)^*
type-decl ::= id⁺ :> enum { comma-delimited-id-list }
function-decl ::= phrase-part⁺ => id⁺ { statement^* }
phrase-part ::= id | parameter-decl
parameter-decl ::= '(' id⁺ : id⁺ ')'
statement ::= id⁺ ;

The rest of the work is done in the second step of the parser, the part that we're focusing on this time around. That part takes the various statements and performs bottom-up parsing on them using the type declarations from the first (conventional) step of the parser. The void type is used to define the start rule for statements, and other rules will need to be used to help deal with potential ambiguities (like preferring an id literal over a parameter reference).

Otherwise, it should proceed pretty much like any other bottom-up parsing approach, albeit providing an extensible syntax while demanding nothing more from programmers than they're used to - declaring and annotating types. Currently, the first step of the parser is dev-done.