git @ Cat's Eye Technologies Tamsin / master doc / Advanced_Features.markdown
master

Tree @master (Download .tar.gz)

Advanced_Features.markdown @masterview markup · raw · history · blame

Advanced Features of the Tamsin Language

This document is a work in progress.

Note that none of these features are in Tamsin version 0.1 (although the reference implementation might support them or at least the syntax for them — they should be regarded as undefined in 0.1. They may appear in 0.2.)

-> Tests for functionality "Intepret Tamsin program"

Three good ways to shoot yourself in the foot

1, forget that Tamsin is still basically a programming language, or at best an LL(n) grammar, and try to write a left-recursive rule:

expr = expr & "+" & expr | expr & "*" & expr | "0" | "1".

2, base a {} loop around something that always succeeds, like return or eof at the end of the input.

expr = {"k" | return l}.

3, base a loop around something that doesn't consume any input, like !.

expr = !"\n" & expr

Advanced Assignment

The right-hand side of can actually be more than a variable name; it can be a pattern term, just like is used in the arguments, above. This can be useful for "deconstructing" a compound return value from a production to extract the parts you want.

| main = foo → pair(A,B) & return A.
| foo = return pair(wellington, trainer).
= wellington

| main = foo → pair(A,B) & return B.
| foo = return pair(wellington, trainer).
= trainer

Even without variables, this can also be useful simply to assert something returns some value.

| main = foo → b & print 'yes' | print 'no'.
| foo = return a.
= no
= no

| main = foo → b & print 'yes' | print 'no'.
| foo = return b.
= yes
= yes

Advanced Programming

Before the first production in a program, any number of pragmas may be given. Pragmas may affect how the program following them is parsed. Each pragma begins with a @ followed by a bareword indicating the kind of pragma, followed by a number of arguments specific to that kind of pragma, followed by a ..

| @alias zrrk 2 = jersey.
| @unalias zrrk.
| main = foo.
| foo = "b".
+ b
= b

@alias

The pragma @alias introduces an alias. Its syntax consists of the name of the alias (a bareword), followed by an integer which indicates the arity, followed by =, followed by the contents of the alias (i.e., what is being aliased; presently, this must be a non-terminal.)

This sets up a syntax rule, in the rule context, that, when the alias name is encountered, parses as a call to the aliased non-terminal; in addition, this syntax rule is special in that it looks for exactly arity number of terms following the alias name. Parentheses are not required to delimit these terms.

| @alias foo 2 = jersey.
| main = jersey(a,b) & foo c d.
| jersey(A,B) = «A» & «B».
+ abcd
= d

The pragma @unalias removes a previously-introduced alias.

| @alias foo 2 = jersey.
| @unalias foo.
| main = jersey(a,b) & foo c d.
| jersey(A,B) = «A» & «B».
+ abcd
? Expected '.' at ' c d

It is an error to attempt to unalias an alias that hasn't been established.

| @alias foo 2 = jersey.
| @unalias bar.
| main = return ok.
? KeyError

Note that various of Tamin's "keywords" are actually built-in aliases for productions in the $ module, and they may be unaliased.

| @unalias return.
| main = return ok.
? Expected '.' at ' ok.'

| @unalias return.
| main = $.return(ok).
= ok

Rule Formals

Then we no longer pattern-match terms. They're just strings. So we... we parse them. Here's a preview, and we'll get more serious about this further below.

Now that you can create scanners and parsers to your heart's desire, we return to the reason you would even need to: terms vs. rules in the "formal arguments" part of a production definition.

| main = ("a" | "b" | "c") → C & donkey('f' + C) → D & return D.
| donkey["f" & ("a" | "c")] = return yes.
| donkey["f" & "b"] = return no.
+ a
= yes

| main = ("a" | "b" | "c") → C & donkey('f' + C) → D & return D.
| donkey["f" & ("a" | "c")] = return yes.
| donkey["f" & "b"] = return no.
+ b
= no

| main = ("a" | "b" | "c") → C & donkey('f' + C) → D & return D.
| donkey["f" & ("a" | "c")] = return yes.
| donkey["f" & "b"] = return no.
+ c
= yes

Variables that are set in a parse-pattern formals are available to the production's rule.

| main = donkey(world).
| donkey[any → E] = return hello(E).
= hello(w)

| main = donkey(world).
| donkey[any → E using word] = return hello(E).
| word = (T ← '' & {$.alnum → S & T ← T + S} & T) using $.char.
= hello(world)

No variables from the caller leak into the called production.

| main = set F = whatever & donkey(world).
| donkey[any → E] = return hello(F).
? KeyError

Terms are stringified before being matched.

| main = donkey(a(b(c))).
| donkey["a" & "(" & "b" & "(" & "c" & ")" & ")"] = return yes.
= yes

Thus, in this sense at least, terms are sugar for strings.

| main = donkey('a(b(c))').
| donkey["a" & "(" & "b" & "(" & "c" & ")" & ")"] = return yes.
= yes

The rule formals may call on other rules in the program.

| main = donkey('pair(pair(0,1),1)').
| donkey[pair → T using mini] = return its_a_pair(T).
| donkey[bit → T using mini] = return its_a_bit(T).
| thing = pair | bit.
| pair = "pair" & "(" & thing → A & "," & thing → B & ")" & return pair(A,B).
| bit = "0" | "1".
| mini = (bit | "(" | ")" | "," | word) using $.char.
| word = (T ← '' & {$.alnum → S & T ← T + S} & T).
= its_a_pair(pair(pair(0, 1), 1))

Auto-term creation from productions

An experimental feature. But Rooibos does it, and it could help make parser development faster/shorter. Note that feature is not fully implemented. Therefore test disabled.

    | main = expr0.
    | expr0! = expr1 & {"+" & expr1}.
    | expr1! = term & {"*" & term}.
    | term = "x" | "y" | "z" | "(" & expr0 & ")".
    + x+y*(z+x+y)
    = expr0(expr1, +, expr1)