Full support (or virtually full) for modules.
Cat's Eye Technologies
10 years ago
0 | 0 | TODO |
1 | 1 | ---- |
2 | 2 | |
3 | * analyzer needs to resolve module '' → current module | |
4 | 3 | * including files, library files should be **handled by the implementation** |
5 | 4 | * document, too, the implementation-dependent nature of input and output |
6 | 5 | * define a stringify-repr operation on terms |
9 | 8 | * `emit` must be 8-bit clean, i.e. can emit `\x00` |
10 | 9 | * tests for `emit` |
11 | 10 | * option for ref interp to not output result (or by default, don't) |
12 | * "fakie" interpreter | |
11 | * "mini" interpreter that handles variables (ouch) | |
13 | 12 | |
14 | ### lower-priority/experimental ### | |
13 | ### lower-priority ### | |
15 | 14 | |
15 | * $:reverse as a builtin | |
16 | * $:equal should be proper equality of terms | |
16 | 17 | * error reporting: line number |
17 | 18 | * error handling: skip to next sentinel and report more errors |
18 | * regex-like shortcuts: `\w` for "word", `\s` for "whitespace", etc. | |
19 | * EOF and nil are the same? it would make sense... call it `end`? | |
20 | 19 | * module-level updatable variables. |
21 | 20 | * tests for failing when utf8 scanner hits badly-encoded utf8 |
22 | 21 | * numeric values... somehow. number('65') = #65. decode(ascii, 'A') = #65. |
24 | 23 | * `using` production x: x's scanner defaults to utf8, not x |
25 | 24 | * figure out good way to do aliases with the Tamsin-parser-in-Tamsin |
26 | 25 | (dynamic grammar is really more of a Zz thing...) |
26 | * should be able to import ("open") other modules into your own namespace. | |
27 | * meta-circular implementation of compiler! | |
28 | * pattern match in send: | |
29 | * `fields → F@fields(H,T) & H` | |
30 | * maps, implemented as hash tables. | |
31 | * `Table ← {} & fields → F@fields(H,T) & Table[H] ← T` | |
32 | * on that topic — production values and/or lambda productions... | |
33 | * pretty-print AST for error messages | |
34 | * `$.alpha` | |
35 | * `$.digit` | |
36 | * don't consume stdin until asked to scan. | |
37 | * full term expressions -- maybe | |
38 | * non-backtracking versions of `|` and `{}`? `|!` and `{}!` | |
39 | ||
40 | ### wild ideas ### | |
41 | ||
42 | * regex-like shortcuts: `\w` for "word", `\s` for "whitespace", etc. | |
43 | * EOF and nil are the same? it would make sense... call it `end`? | |
27 | 44 | * productions with names with arbitrary characters in them. |
28 | 45 | * something like «foo» but foo is the name of a *non*terminal — symbolic |
29 | 46 | production references (like Perl's horrible globs as a cheap substitute |
30 | 47 | for actual function references or lambdas.) |
31 | 48 | * turn system library back into built-in keywords (esp. if : can be used) |
32 | * should be able to import ("open") other modules into your own namespace. | |
33 | * meta-circular implementation of compiler! | |
34 | 49 | * Tamsin scanner: more liberal (every non-alphanum+_ symbol scans as itself, |
35 | 50 | incl. ones that have no meaning currently like `*` and `?`) |
36 | 51 | * use `←` instead of `@`, why not? |
37 | * pattern match in send: | |
38 | * `fields → F@fields(H,T) & H` | |
39 | * maps, implemented as hash tables. | |
40 | * `Table ← {} & fields → F@fields(H,T) & Table[H] ← T` | |
41 | * on that topic — production values and/or lambda productions... | |
42 | 52 | * auto-generate terms from productions, like Rooibos does |
43 | 53 | * `;` = `&`? |
44 | * pretty-print AST for error messages | |
45 | * `$.alpha` | |
46 | * `$.digit` | |
47 | * don't consume stdin until asked to scan. | |
48 | 54 | * token classes... somehow. (then numeric is just a special token class?) |
49 | * term expressions -- harder than it sounds | |
55 | a token class is just the "call stack" of productions at the time it | |
56 | was scanned | |
50 | 57 | * be generous and allow "xyz" in term context position? |
51 | * non-backtracking versions of `|` and `{}`? (very advanced) | |
52 | 58 | * «» could be an alias w/right sym (`,,`, `„`) |
53 | 59 | (still need to scan it specially though) |
54 | * special form that consumes rest of input from the Tamsin source -- gimmick | |
60 | * special form that consumes rest of input from the Tamsin source -- | |
61 | maybe not such a gimmick since micro-tamsin does this | |
55 | 62 | * feature-testing: `$.exists('$.blargh') | do_without_blargh` |
56 | 63 | * ternary: `foo ? bar : baz` -- if foo succeeded, do bar, else do baz. |
57 | * a second implementation, in C -- with compiler to C and meta-circular | |
58 | implementation, this can be generated! |
557 | 557 | |
558 | 558 | | main = S ← blerf & "x" & frelb. |
559 | 559 | + x |
560 | ? no 'frelb' production defined | |
560 | ? no 'main:frelb' production defined | |
561 | 561 | |
562 | 562 | ### Aside: ← vs. → ### |
563 | 563 | |
1046 | 1046 | + yy@ |
1047 | 1047 | = @ |
1048 | 1048 | |
1049 | `:foo` (and indeed `foo`) should refer to the production `foo` in the | |
1050 | same module as the production where it's called from, but this doesn't work yet. | |
1051 | ||
1052 | | blah { | |
1053 | | expr = :goo. | |
1054 | | goo = "y". | |
1055 | | } | |
1056 | | main = blah:expr. | |
1057 | | goo = "x". | |
1058 | + y | |
1059 | = y | |
1060 | ||
1061 | | blah { | |
1062 | | expr = goo. | |
1063 | | goo = "y". | |
1064 | | } | |
1065 | | main = blah:expr. | |
1066 | | goo = "x". | |
1067 | + y | |
1068 | = y | |
1049 | `:foo` (and indeed `foo`) refers to the production `foo` in the | |
1050 | same module as the production where it's called from. | |
1051 | ||
1052 | | blah { | |
1053 | | expr = :goo. | |
1054 | | goo = "y". | |
1055 | | } | |
1056 | | main = blah:expr. | |
1057 | | goo = "x". | |
1058 | + y | |
1059 | = y | |
1060 | ||
1061 | | foo { | |
1062 | | expr = goo. | |
1063 | | goo = "6". | |
1064 | | } | |
1065 | | bar { | |
1066 | | expr = goo. | |
1067 | | goo = "4". | |
1068 | | } | |
1069 | | main = foo:goo & bar:goo. | |
1070 | + 64 | |
1071 | = 4 | |
1072 | ||
1073 | Can't call a production or a module that doesn't exist. | |
1074 | ||
1075 | | foo { | |
1076 | | expr = goo. | |
1077 | | goo = "6". | |
1078 | | } | |
1079 | | main = foo:zoo. | |
1080 | ? no 'foo:zoo' production defined | |
1081 | ||
1082 | | foo { | |
1083 | | expr = goo. | |
1084 | | goo = "6". | |
1085 | | } | |
1086 | | main = zoo. | |
1087 | ? no 'main:zoo' production defined | |
1088 | ||
1089 | | foo { | |
1090 | | expr = goo. | |
1091 | | goo = "6". | |
1092 | | } | |
1093 | | main = boo:zoo. | |
1094 | ? no 'boo' module defined | |
1095 | ||
1096 | You can have a Tamsin program that is all modules and no productions, but | |
1097 | you can't run it. | |
1098 | ||
1099 | | foo { | |
1100 | | main = "6". | |
1101 | | } | |
1102 | ? no 'main:main' production defined | |
1069 | 1103 | |
1070 | 1104 | Evaluation |
1071 | 1105 | ---------- |
14 | 14 | """The Analyzer takes a desugared AST, walks it, and returns a new AST. |
15 | 15 | It is responsible for: |
16 | 16 | |
17 | * Looking for undefined nonterminals and raising an error if such found. | |
18 | (this includes 'main') | |
19 | 17 | * Finding the set of local variable names used in each production and |
20 | 18 | sticking that in the locals_ field of the new Production node. |
19 | * Creating a map from module name -> Module and | |
20 | sticking that in the modmap field of the Program node. | |
21 | 21 | * Creating a map from production name -> list of productions and |
22 | sticking that in the prodmap field of the new Program node. | |
22 | sticking that in the prodmap field of each Module node. | |
23 | * Resolving any '' modules in Prodrefs to the name of the current | |
24 | module. | |
25 | ||
26 | * Looking for undefined nonterminals and raising an error if such found. | |
27 | (this includes 'main') (this is done at the end by analyze_prodrefs) | |
23 | 28 | |
24 | 29 | TODO: it should also find any locals that are accessed before being set |
25 | 30 | """ |
28 | 33 | self.program = program |
29 | 34 | self.prodnames = set() |
30 | 35 | self.modnames = set() |
36 | self.current_module = None | |
31 | 37 | |
32 | 38 | def analyze(self, ast): |
33 | 39 | if isinstance(ast, Program): |
34 | 40 | for mod in ast.modlist: |
35 | 41 | self.modnames.add(mod.name) |
36 | 42 | modmap = {} |
43 | modlist = [] | |
37 | 44 | for mod in ast.modlist: |
38 | 45 | mod = self.analyze(mod) |
46 | modlist.append(mod) | |
39 | 47 | modmap[mod.name] = mod |
48 | if 'main' not in modmap: | |
49 | raise ValueError("no 'main' module defined") | |
40 | 50 | if 'main' not in modmap['main'].prodmap: |
41 | raise ValueError("no 'main' production defined") | |
42 | return Program(modmap, ast.modlist) | |
51 | raise ValueError("no 'main:main' production defined") | |
52 | self.program = Program(modmap, modlist) | |
53 | self.analyze_prodrefs(self.program) | |
54 | return self.program | |
43 | 55 | elif isinstance(ast, Module): |
56 | self.current_module = ast | |
44 | 57 | for prod in ast.prodlist: |
45 | 58 | self.prodnames.add(prod.name) |
46 | 59 | prodmap = {} |
60 | prodlist = [] | |
47 | 61 | for prod in ast.prodlist: |
48 | 62 | prod = self.analyze(prod) |
49 | 63 | prod.rank = len(prodmap.setdefault(prod.name, [])) |
50 | 64 | prodmap[prod.name].append(prod) |
51 | return Module(ast.name, prodmap, ast.prodlist) | |
65 | prodlist.append(prod) | |
66 | self.current_module = None | |
67 | return Module(ast.name, prodmap, prodlist) | |
52 | 68 | elif isinstance(ast, Production): |
53 | 69 | locals_ = set() |
54 | 70 | body = self.analyze(ast.body) |
59 | 75 | elif isinstance(ast, And): |
60 | 76 | return And(self.analyze(ast.lhs), self.analyze(ast.rhs)) |
61 | 77 | elif isinstance(ast, Using): |
62 | return Using(self.analyze(ast.rule), ast.prodref) | |
78 | return Using(self.analyze(ast.rule), self.analyze(ast.prodref)) | |
63 | 79 | elif isinstance(ast, Call): |
64 | prodref = ast.prodref | |
65 | if prodref.module == '' and prodref.name not in self.prodnames: | |
66 | raise ValueError("no '%s' production defined" % prodref.name) | |
67 | # TODO: also check builtins? | |
68 | return ast | |
80 | return Call(self.analyze(ast.prodref), ast.args, ast.ibuf) | |
69 | 81 | elif isinstance(ast, Send): |
70 | 82 | assert isinstance(ast.variable, Variable), ast |
71 | 83 | return Send(self.analyze(ast.rule), ast.variable) |
80 | 92 | return Concat(self.analyze(ast.lhs), self.analyze(ast.rhs)) |
81 | 93 | elif isinstance(ast, Term): |
82 | 94 | return ast |
95 | elif isinstance(ast, Prodref): | |
96 | module = ast.module | |
97 | if module == '': | |
98 | module = self.current_module.name | |
99 | new = Prodref(module, ast.name) | |
100 | return new | |
83 | 101 | else: |
84 | 102 | raise NotImplementedError(repr(ast)) |
85 | 103 | |
102 | 120 | locals_.add(ast.variable.name) |
103 | 121 | elif isinstance(ast, Not) or isinstance(ast, While): |
104 | 122 | self.collect_locals(ast.rule, locals_) |
123 | ||
124 | def analyze_prodrefs(self, ast): | |
125 | """does not return anything""" | |
126 | if isinstance(ast, Program): | |
127 | for mod in ast.modlist: | |
128 | self.analyze_prodrefs(mod) | |
129 | elif isinstance(ast, Module): | |
130 | for prod in ast.prodlist: | |
131 | self.analyze_prodrefs(prod) | |
132 | elif isinstance(ast, Production): | |
133 | self.analyze_prodrefs(ast.body) | |
134 | elif isinstance(ast, Or) or isinstance(ast, And): | |
135 | self.analyze_prodrefs(ast.lhs) | |
136 | self.analyze_prodrefs(ast.rhs) | |
137 | elif isinstance(ast, Using): | |
138 | self.analyze_prodrefs(ast.rule) | |
139 | self.analyze_prodrefs(ast.prodref) | |
140 | elif isinstance(ast, Call): | |
141 | self.analyze_prodrefs(ast.prodref) | |
142 | elif isinstance(ast, Send): | |
143 | self.analyze_prodrefs(ast.rule) | |
144 | elif isinstance(ast, Set): | |
145 | pass | |
146 | elif isinstance(ast, Not): | |
147 | self.analyze_prodrefs(ast.rule) | |
148 | elif isinstance(ast, While): | |
149 | self.analyze_prodrefs(ast.rule) | |
150 | elif isinstance(ast, Concat): | |
151 | pass | |
152 | elif isinstance(ast, Term): | |
153 | pass | |
154 | elif isinstance(ast, Prodref): | |
155 | assert ast.module != '', repr(ast) | |
156 | if ast.module == '$': | |
157 | return # TODO: also check builtins? | |
158 | if ast.module not in self.program.modmap: | |
159 | raise KeyError("no '%s' module defined" % ast.module) | |
160 | module = self.program.modmap[ast.module] | |
161 | if ast.name not in module.prodmap: | |
162 | raise KeyError("no '%s:%s' production defined" % | |
163 | (ast.module, ast.name) | |
164 | ) | |
165 | else: | |
166 | raise NotImplementedError(repr(ast)) |
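The analyzer change above splits the work into two passes: first every `Prodref` with an empty module name is rewritten to name the enclosing module, and only after all modules are known is each reference checked against the module map. The following is a minimal standalone sketch of that idea, not the code from this commit. The `Prodref` namedtuple, the `resolve` and `check` helpers, and the simplified `modmap` (module name to a set of production names) are assumptions made for illustration; the error messages are copied from the diff.

```python
from collections import namedtuple

# Hypothetical stand-in for the Prodref AST node; only the two fields
# used by the analyzer's resolution logic are modelled.
Prodref = namedtuple('Prodref', ['module', 'name'])


def resolve(prodref, current_module):
    """Pass 1: replace an empty module name with the enclosing module's name."""
    if prodref.module == '':
        return Prodref(current_module, prodref.name)
    return prodref


def check(prodref, modmap):
    """Pass 2: raise KeyError unless the resolved reference is defined.

    modmap maps module name -> set of production names. This is a
    simplification of the per-module prodmap dictionaries in the diff.
    """
    assert prodref.module != '', repr(prodref)
    if prodref.module == '$':
        return  # builtins are not checked (left as a TODO in the diff)
    if prodref.module not in modmap:
        raise KeyError("no '%s' module defined" % prodref.module)
    if prodref.name not in modmap[prodref.module]:
        raise KeyError("no '%s:%s' production defined"
                       % (prodref.module, prodref.name))


# A bare `goo` inside module `blah` resolves to `blah:goo`, not `main:goo`.
modmap = {'main': {'main', 'goo'}, 'blah': {'expr', 'goo'}}
ref = resolve(Prodref('', 'goo'), 'blah')
check(ref, modmap)
```

Deferring the check to a second pass means a production may refer to a module or production that is defined later in the source, since the full module map exists before any reference is validated.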
29 | 29 | def find_productions(self, prodref): |
30 | 30 | mod = prodref.module |
31 | 31 | name = prodref.name |
32 | if mod == '': | |
33 | mod = 'main' | |
32 | assert mod != '' | |
34 | 33 | if mod == '$': |
35 | 34 | formals = { |
36 | 35 | 'equal': [Variable('L'), Variable('R')], |
44 | 43 | }.get(name, []) |
45 | 44 | return [Production('$.%s' % name, 0, formals, [], None)] |
46 | 45 | else: |
47 | return self.modmap[mod].prodmap[name] | |
46 | if mod not in self.modmap: | |
47 | raise KeyError("no '%s' module defined" % mod) | |
48 | prodmap = self.modmap[mod].prodmap | |
49 | if name not in prodmap: | |
50 | raise KeyError("no '%s:%s' production defined" % (mod, name)) | |
51 | return prodmap[name] | |
48 | 52 | |
49 | 53 | |
50 | 54 | def __repr__(self): |
51 | return "Program(%r, %r, %r, %r)" % ( | |
55 | return "Program(%r, %r)" % ( | |
52 | 56 | self.modmap, self.modlist |
53 | 57 | ) |
54 | 58 |
233 | 233 | return self.interpret(ast.rhs) |
234 | 234 | elif isinstance(ast, Call): |
235 | 235 | prodref = ast.prodref |
236 | #prodmod = prodref[1] | |
236 | module = prodref.module | |
237 | 237 | name = prodref.name |
238 | 238 | args = ast.args |
239 | 239 | ibuf = ast.ibuf |
6 | 6 | if [ x$1 = x ]; then |
7 | 7 | $0 interpreter && |
8 | 8 | $0 compiler && |
9 | $0 scanner && | |
10 | $0 parser && | |
11 | $0 ast && | |
9 | #$0 scanner && | |
10 | #$0 parser && | |
11 | #$0 ast && | |
12 | 12 | $0 compiledast && |
13 | 13 | $0 compileddesugarer && |
14 | $0 micro && | |
14 | 15 | echo "All tests passed!" |
15 | 16 | exit $? |
16 | 17 | fi |
89 | 90 | echo "Testing Micro-Tamsin interpreter..." |
90 | 91 | FILES="doc/Micro-Tamsin.markdown" |
91 | 92 | falderal $VERBOSE --substring-error fixture/micro-tamsin.markdown $FILES |
92 | elif [ x$1 = xinterpreter ]; then | |
93 | elif [ x$1 = xinterpreter -o x$1 = xi ]; then | |
93 | 94 | echo "Testing Python interpreter..." |
94 | 95 | falderal $VERBOSE --substring-error fixture/tamsin.py.markdown $FILES |
95 | 96 | fi |
96 | ⏎ |