Make source run under both Python 2 and Python 3. Refactor tests.
Chris Pressey
3 years ago
0 | Exanoke | |
1 | ======= | |
2 | ||
3 | _Exanoke_ is a pure functional language which is syntactically restricted to | |
4 | expressing the primitive recursive functions. | |
5 | ||
6 | I'll assume you know what a primitive recursive function is. If not, go look | |
7 | it up, as it's quite interesting, if only for the fact that it demonstrates | |
8 | even a genius like Kurt Gödel can sometimes be mistaken. (He initially | |
9 | thought that all functions could be expressed primitive recursively, until | |
10 | Ackermann came up with a counterexample.) | |
11 | ||
12 | So, you have a program. There are two ways that you can ensure that it | |
13 | implements a primtive recursive function: | |
14 | ||
15 | * You can statically analyze the bastard, and prove that all of its | |
16 | loops eventually terminate, and so forth; or | |
17 | * You can write it in a language which is inherently restricted to | |
18 | expressing only primitive recursive functions. | |
19 | ||
20 | The second option is the route [PL-{GOTO}][] takes. But that's an imperative | |
21 | language, and it's fairly easy to restrict an imperative language in this | |
22 | way. In PL-{GOTO}'s case, they just took PL and removed the `GOTO` command. | |
23 | The rest of the language essentially contains only `for` loops, so what you | |
24 | get is something in which you can only express primitive recursive functions. | |
25 | (That imperative programs consisting of only `for` loops can express only and | |
26 | exactly the primitive recursive functions was established by Meyer and Ritchie | |
27 | in "The complexity of loop programs".) | |
28 | ||
29 | But what about functional languages? | |
30 | ||
31 | The approach I've taken in [TPiS][], and that I wanted to take in [Pixley][] | |
32 | and [Robin][], is to provide an unrestricted functional language to the | |
33 | programmer, and statically analyze it to see if you're going and writing | |
34 | primitive recursive functions in it or not. | |
35 | ||
36 | Thing is, that's kind of difficult. Is it possible to take the same approach | |
37 | PL-{GOTO} takes, and *syntactically* restrict a functional language to the | |
38 | primitive recursive functions? | |
39 | ||
40 | I mean, in a trivial sense, it must be; in the original definition, primitive | |
41 | recursive functions *were* functions. (Duh.) But these have a highly | |
42 | arithmetical flavour, with bounded sums and products and whatnot. What | |
43 | would primitive recursion look like in the setting of general (and symbolic) | |
44 | functional programming? | |
45 | ||
46 | Functional languages don't do the `for` loop thing, they do the recursion | |
47 | thing, and there are no natural bounds on that recursion, so some restriction | |
48 | on recursion would have to be captured by the grammar, and... well, it sounds | |
49 | somewhat interesting, and doable, so let's try it. | |
50 | ||
51 | [Pixley]: https://catseye.tc/projects/pixley/ | |
52 | [PL-{GOTO}]: http://catseye.tc/projects/pl-goto.net/ | |
53 | [Robin]: https://github.com/catseye/Robin | |
54 | [TPiS]: http://catseye.tc/projects/tpis/ | |
55 | ||
56 | Ground Rules | |
57 | ------------ | |
58 | ||
59 | Here are some ground rules about how to tell if a functional program is | |
60 | primitive recursive: | |
61 | ||
62 | * It doesn't perform mutual recursion. | |
63 | * When recursion happens, it's always with arguments that are strictly | |
64 | "smaller" values than the arguments the function received. | |
65 | * There is a "smallest" value that an argument can take on, so that | |
66 | there is always a base case to the recursion, so that it always | |
67 | eventually terminates. | |
68 | * Higher-order functions are not used. | |
69 | ||
70 | The first point can be enforced simply by providing a token that | |
71 | refers to the function currently being defined (`self` is a reasonable | |
72 | choice) to permit recursion, but to disallow calling any function that | |
73 | has not yet occurred, lexically, in the program source. | |
74 | ||
75 | The second point can be enforced by stating syntactic rules for | |
76 | "smallerness". (Gee, typing that made me feel a bit like George W. Bush!) | |
77 | ||
78 | The third point can be enforced by providing some default behaviour when | |
79 | functions are called with the "smallest" kinds of values. This could be | |
80 | as simple as terminating the program if you try to find a value "smaller" | |
81 | than the "smallest" value. | |
82 | ||
83 | The fourth point can be enforced by simply disallowing functions to be | |
84 | passed to, or returned from, functions. | |
85 | ||
86 | ### Note on these Criteria ### | |
87 | ||
88 | In fact, these four criteria taken together do not strictly speaking | |
89 | define primitive recursion. They don't exclude functional programs which | |
90 | always terminate but which aren't primitive recursive (for example, the | |
91 | Ackermann function.) However, determining that such functions terminate | |
92 | requires a more sophisticated notion of "smallerness" — a reduction ordering | |
93 | on their arguments. Our notion of "smallerness" will be simple enough that | |
94 | it will be easy to express syntactically, and will only capture primitive | |
95 | recursion. | |
96 | ||
97 | ### Note on Critical Arguments ### | |
98 | ||
99 | I should note, though, that the second point is an oversimplification. | |
100 | Not *all* arguments need to be strictly "smaller" upon recursion — only | |
101 | those arguments which are used to determine *if* the function recurses. | |
102 | I'll call those the _critical arguments_. Other arguments can take on | |
103 | any value (which is useful for having "accumulator" arguments and such.) | |
104 | ||
105 | When statically analyzing a function for primitive recursive-ness, you | |
106 | need to check how it decides to recurse, to find out which arguments are | |
107 | the critical arguments, so you can check that those ones always get | |
108 | "smaller". | |
109 | ||
110 | But we can proceed in a simpler fashion here — we can simply say that | |
111 | the first argument to every function is the critical argument, and all | |
112 | the rest aren't. This is without loss of generality, as we can always | |
113 | split some functionality which would require more than one critical | |
114 | argument across multiple functions, each of which only has one critical | |
115 | argument. (Much like every `for` loop has only one loop variable.) | |
116 | ||
117 | Data types | |
118 | ---------- | |
119 | ||
120 | Let's just go with pairs and atoms for now, although natural numbers would | |
121 | be easy to add too. Following Ruby, atoms are preceded by a colon; while I | |
122 | find this syntax somewhat obnoxious, it is less obnoxious than requiring that | |
123 | atoms are in ALL CAPS, which is what Exanoke originally had. In truth, there | |
124 | would be no real problem with allowing atoms, arguments, and function names | |
125 | (and even `self`) to all be arbitrarily alphanumeric, but it would require | |
126 | more static context checking to sort them all out, and we're trying to be | |
127 | as syntactic as reasonably possible here. | |
128 | ||
129 | `:true` is the only truthy atom. Lists are by convention only, and, by | |
130 | convention, lists compose via the second element of each pair, and `:nil` is | |
131 | the agreed-upon list-terminating atom, much love to it. | |
132 | ||
133 | Grammar | |
134 | ------- | |
135 | ||
136 | Exanoke ::= {FunDef} Expr. | |
137 | FunDef ::= "def" Ident "(" "#" {"," Ident} ")" Expr. | |
138 | Expr ::= "cons" "(" Expr "," Expr ")" | |
139 | | "head" "(" Expr ")" | |
140 | | "tail" "(" Expr ")" | |
141 | | "if" Expr "then" Expr "else" Expr | |
142 | | "self" "(" Smaller {"," Expr} ")" | |
143 | | "eq?" "(" Expr "," Expr")" | |
144 | | "cons?" "(" Expr ")" | |
145 | | "not" "(" Expr ")" | |
146 | | "#" | |
147 | | ":" Ident | |
148 | | Ident ["(" Expr {"," Expr} ")"] | |
149 | | Smaller. | |
150 | Smaller ::= "<head" SmallerTerm | |
151 | | "<tail" SmallerTerm | |
152 | | "<if" Expr "then" Smaller "else" Smaller. | |
153 | SmallerTerm ::= "#" | |
154 | | Smaller. | |
155 | Ident ::= name. | |
156 | ||
157 | The first argument to a function does not have a user-defined name; it is | |
158 | simply referred to as `#`. Again, there would be no real problem if we were | |
159 | to allow the programmer to give it a better name, but more static context | |
160 | checking would be involved. | |
161 | ||
162 | Note that `<if` is not strictly necessary. Its only use is to embed a | |
163 | conditional into the first argument being passed to a recursive call. You | |
164 | could also use a regular `if` and make the recursive call in both branches, | |
165 | one with `:true` as the first argument and the other with `:false`. | |
166 | ||
167 | Examples | |
168 | -------- | |
169 | ||
170 | -> Tests for functionality "Evaluate Exanoke program" | |
171 | ||
172 | -> Functionality "Evaluate Exanoke program" is implemented by | |
173 | -> shell command "src/exanoke.py %(test-body-file)" | |
174 | ||
175 | `cons` can be used to make lists and trees and things. | |
176 | ||
177 | | cons(:hi, :there) | |
178 | = (:hi :there) | |
179 | ||
180 | | cons(:hi, cons(:there, :nil)) | |
181 | = (:hi (:there :nil)) | |
182 | ||
183 | `head` extracts the first element of a cons cell. | |
184 | ||
185 | | head(cons(:hi, :there)) | |
186 | = :hi | |
187 | ||
188 | | head(:bar) | |
189 | ? head: Not a cons cell | |
190 | ||
191 | `tail` extracts the second element of a cons cell. | |
192 | ||
193 | | tail(cons(:hi, :there)) | |
194 | = :there | |
195 | ||
196 | | tail(tail(cons(:hi, cons(:there, :nil)))) | |
197 | = :nil | |
198 | ||
199 | | tail(:foo) | |
200 | ? tail: Not a cons cell | |
201 | ||
202 | `<head` and `<tail` and syntactic variants of `head` and `tail` which | |
203 | expect their argument to be "smaller than or equal in size to" a critical | |
204 | argument. | |
205 | ||
206 | | <head cons(:hi, :there) | |
207 | ? Expected <smaller>, found "cons" | |
208 | ||
209 | | <tail :hi | |
210 | ? Expected <smaller>, found ":hi" | |
211 | ||
212 | `if` is used for descision-making. | |
213 | ||
214 | | if :true then :hi else :there | |
215 | = :hi | |
216 | ||
217 | | if :hi then :here else :there | |
218 | = :there | |
219 | ||
220 | `eq?` is used to compare atoms. | |
221 | ||
222 | | eq?(:hi, :there) | |
223 | = :false | |
224 | ||
225 | | eq?(:hi, :hi) | |
226 | = :true | |
227 | ||
228 | `eq?` only compares atoms; it can't deal with cons cells. | |
229 | ||
230 | | eq?(cons(:one, :nil), cons(:one, :nil)) | |
231 | = :false | |
232 | ||
233 | `cons?` is used to detect cons cells. | |
234 | ||
235 | | cons?(:hi) | |
236 | = :false | |
237 | ||
238 | | cons?(cons(:wagga, :nil)) | |
239 | = :true | |
240 | ||
241 | `not` does the expected thing when regarding atoms as booleans. | |
242 | ||
243 | | not(:true) | |
244 | = :false | |
245 | ||
246 | | not(:false) | |
247 | = :true | |
248 | ||
249 | Cons cells are falsey. | |
250 | ||
251 | | not(cons(:wanga, :nil)) | |
252 | = :true | |
253 | ||
254 | `self` and `#` can only be used inside function definitions. | |
255 | ||
256 | | # | |
257 | ? Use of "#" outside of a function body | |
258 | ||
259 | | self(:foo) | |
260 | ? Use of "self" outside of a function body | |
261 | ||
262 | We can define functions. Here's the identity function. | |
263 | ||
264 | | def id(#) | |
265 | | # | |
266 | | id(:woo) | |
267 | = :woo | |
268 | ||
269 | Functions must be called with the appropriate arity. | |
270 | ||
271 | | def id(#) | |
272 | | # | |
273 | | id(:foo, :bar) | |
274 | ? Arity mismatch (expected 1, got 2) | |
275 | ||
276 | | def snd(#, another) | |
277 | | another | |
278 | | snd(:foo) | |
279 | ? Arity mismatch (expected 2, got 1) | |
280 | ||
281 | Parameter names must be defined in the function definition. | |
282 | ||
283 | | def id(#) | |
284 | | woo | |
285 | | id(:woo) | |
286 | ? Undefined argument "woo" | |
287 | ||
288 | You can't call a parameter as if it were a function. | |
289 | ||
290 | | def wat(#, woo) | |
291 | | woo(#) | |
292 | | wat(:woo) | |
293 | ? Undefined function "woo" | |
294 | ||
295 | You can't define two functions with the same name. | |
296 | ||
297 | | def wat(#) | |
298 | | :there | |
299 | | def wat(#) | |
300 | | :hi | |
301 | | wat(:woo) | |
302 | ? Function "wat" already defined | |
303 | ||
304 | You can't name a function with an atom. | |
305 | ||
306 | | def :wat(#) | |
307 | | # | |
308 | | :wat(:woo) | |
309 | ? Expected identifier, but found atom (':wat') | |
310 | ||
311 | Every function takes at least one argument. | |
312 | ||
313 | | def wat() | |
314 | | :meow | |
315 | | wat() | |
316 | ? Expected '#', but found ')' | |
317 | ||
318 | The first argument of a function must be `#`. | |
319 | ||
320 | | def wat(meow) | |
321 | | meow | |
322 | | wat(:woo) | |
323 | ? Expected '#', but found 'meow' | |
324 | ||
325 | The subsequent arguments don't have to be called `#`, and in fact, they | |
326 | shouldn't be. | |
327 | ||
328 | | def snd(#, another) | |
329 | | another | |
330 | | snd(:foo, :bar) | |
331 | = :bar | |
332 | ||
333 | | def snd(#, #) | |
334 | | # | |
335 | | snd(:foo, :bar) | |
336 | ? Expected identifier, but found goose egg ('#') | |
337 | ||
338 | A function can call a built-in. | |
339 | ||
340 | | def snoc(#, another) | |
341 | | cons(another, #) | |
342 | | snoc(:there, :hi) | |
343 | = (:hi :there) | |
344 | ||
345 | Functions can call other user-defined functions. | |
346 | ||
347 | | def double(#) | |
348 | | cons(#, #) | |
349 | | def quadruple(#) | |
350 | | double(double(#)) | |
351 | | quadruple(:meow) | |
352 | = ((:meow :meow) (:meow :meow)) | |
353 | ||
354 | Functions must be defined before they are called. | |
355 | ||
356 | | def quadruple(#) | |
357 | | double(double(#)) | |
358 | | def double(#) | |
359 | | cons(#, #) | |
360 | | :meow | |
361 | ? Undefined function "double" | |
362 | ||
363 | Argument names may shadow previously-defined functions, because we | |
364 | can syntactically tell them apart. | |
365 | ||
366 | | def snoc(#, other) | |
367 | | cons(other, #) | |
368 | | def snocsnoc(#, snoc) | |
369 | | snoc(snoc(snoc, #), #) | |
370 | | snocsnoc(:blarch, :glamch) | |
371 | = (:blarch (:blarch :glamch)) | |
372 | ||
373 | A function may recursively call itself, as long as it does so with | |
374 | values which are smaller than or equal in size to the critical argument | |
375 | as the first argument. | |
376 | ||
377 | | def count(#) | |
378 | | self(<tail #) | |
379 | | count(cons(:alpha, cons(:beta, :nil))) | |
380 | ? tail: Not a cons cell | |
381 | ||
382 | | def count(#) | |
383 | | if eq?(#, :nil) then :nil else self(<tail #) | |
384 | | count(cons(:alpha, cons(:beta, :nil))) | |
385 | = :nil | |
386 | ||
387 | | def last(#) | |
388 | | if not(cons?(#)) then # else self(<tail #) | |
389 | | last(cons(:alpha, cons(:beta, :graaap))) | |
390 | = :graaap | |
391 | ||
392 | | def count(#, acc) | |
393 | | if eq?(#, :nil) then acc else self(<tail #, cons(:one, acc)) | |
394 | | count(cons(:A, cons(:B, :nil)), :nil) | |
395 | = (:one (:one :nil)) | |
396 | ||
397 | Arity must match when a function calls itself recursively. | |
398 | ||
399 | | def urff(#) | |
400 | | self(<tail #, <head #) | |
401 | | urff(:woof) | |
402 | ? Arity mismatch on self (expected 1, got 2) | |
403 | ||
404 | | def urff(#, other) | |
405 | | self(<tail #) | |
406 | | urff(:woof, :moo) | |
407 | ? Arity mismatch on self (expected 2, got 1) | |
408 | ||
409 | The remaining tests demonstrate that a function cannot call itself if it | |
410 | does not pass a values which is smaller than or equal in size to the | |
411 | critical argument as the first argument. | |
412 | ||
413 | | def urff(#) | |
414 | | self(cons(#, #)) | |
415 | | urff(:woof) | |
416 | ? Expected <smaller>, found "cons" | |
417 | ||
418 | | def urff(#) | |
419 | | self(#) | |
420 | | urff(:graaap) | |
421 | ? Expected <smaller>, found "#" | |
422 | ||
423 | | def urff(#, boof) | |
424 | | self(boof) | |
425 | | urff(:graaap, :skooorp) | |
426 | ? Expected <smaller>, found "boof" | |
427 | ||
428 | | def urff(#, boof) | |
429 | | self(<tail boof) | |
430 | | urff(:graaap, :skooorp) | |
431 | ? Expected <smaller>, found "boof" | |
432 | ||
433 | | def urff(#) | |
434 | | self(:wanga) | |
435 | | urff(:graaap) | |
436 | ? Expected <smaller>, found ":wanga" | |
437 | ||
438 | | def urff(#) | |
439 | | self(if eq?(:alpha, :alpha) then <head # else <tail #) | |
440 | | urff(:graaap) | |
441 | ? Expected <smaller>, found "if" | |
442 | ||
443 | | def urff(#) | |
444 | | self(<if eq?(:alpha, :alpha) then <head # else <tail #) | |
445 | | urff(:graaap) | |
446 | ? head: Not a cons cell | |
447 | ||
448 | | def urff(#) | |
449 | | self(<if eq?(self(<head #), :alpha) then <head # else <tail #) | |
450 | | urff(:graaap) | |
451 | ? head: Not a cons cell | |
452 | ||
453 | | def urff(#) | |
454 | | self(<if self(<tail #) then <head # else <tail #) | |
455 | | urff(cons(:graaap, :skooorp)) | |
456 | ? tail: Not a cons cell | |
457 | ||
458 | Now, some practical examples, on Peano naturals. Addition: | |
459 | ||
460 | | def inc(#) | |
461 | | cons(:one, #) | |
462 | | def add(#, other) | |
463 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |
464 | | | |
465 | | add(cons(:one, cons(:one, :nil)), cons(:one, :nil)) | |
466 | = (:one (:one (:one :nil))) | |
467 | ||
468 | Multiplication: | |
469 | ||
470 | | def inc(#) | |
471 | | cons(:one, #) | |
472 | | def add(#, other) | |
473 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |
474 | | def mul(#, other) | |
475 | | if eq?(#, :nil) then :nil else | |
476 | | add(other, self(<tail #, other)) | |
477 | | def three(#) | |
478 | | cons(:one, cons(:one, cons(:one, #))) | |
479 | | | |
480 | | mul(three(:nil), three(:nil)) | |
481 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil))))))))) | |
482 | ||
483 | Factorial! There are 24 `:one`'s in this test's expectation. | |
484 | ||
485 | | def inc(#) | |
486 | | cons(:one, #) | |
487 | | def add(#, other) | |
488 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |
489 | | def mul(#, other) | |
490 | | if eq?(#, :nil) then :nil else | |
491 | | add(other, self(<tail #, other)) | |
492 | | def fact(#) | |
493 | | if eq?(#, :nil) then cons(:one, :nil) else | |
494 | | mul(#, self(<tail #)) | |
495 | | def four(#) | |
496 | | cons(:one, cons(:one, cons(:one, cons(:one, #)))) | |
497 | | | |
498 | | fact(four(:nil)) | |
499 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil)))))))))))))))))))))))) | |
500 | ||
501 | Discussion | |
502 | ---------- | |
503 | ||
504 | So, what of it? | |
505 | ||
506 | It was not a particularly challenging design goal to meet; it's one of those | |
507 | things that seems rather obvious after the fact, that you can just dictate | |
508 | that one of the arguments is a critical argument, and only call yourself with | |
509 | some smaller version of your critical argument in that position. Recursive | |
510 | calls map quite straightforwardly to `for` loops, and you end up with what is | |
511 | essentially a functional version of of a `for` program. | |
512 | ||
513 | I guess the question is, is it worth doing this primitive-recursion check as | |
514 | a syntactic, rather than a static semantic, thing? | |
515 | ||
516 | I think it is. If you're concerned at all with writing functions which are | |
517 | guaranteed to terminate, you probably have a plan in mind (however vague) | |
518 | for how they will accomplish this, so it seems reasonable to require that you | |
519 | mark up your function to indicate how it does this. And it's certainly | |
520 | easier to implement than analyzing an arbirarily-written function. | |
521 | ||
522 | Of course, the exact syntactic mechanisms would likely see some improvement | |
523 | in a practical application of this idea. As alluded to in several places | |
524 | in this document, any actually-distinct lexical items (name of the critical | |
525 | argument, and so forth) could be replaced by simple static semantic checks | |
526 | (against a symbol table or whatnot.) Which arguments are the critical | |
527 | arguments for a particular function could be indicated in the source. | |
528 | ||
529 | One criticism (if I can call it that) of primitive recursive functions is | |
530 | that, even though they can express any algorithm which runs in | |
531 | non-deterministic exponential time (which, if you believe "polynomial | |
532 | time = feasible", means, basically, all algorithms you'd ever care about), | |
533 | for any primitive recursively expressed algorithm, theye may be a (much) | |
534 | more efficient algorithm expressed in a general recursive way. | |
535 | ||
536 | However, in my experience, there are many functions, generally non-, or | |
537 | minimally, numerical, which operate on data structures, where the obvious | |
538 | implementation _is_ primitive recursive. In day-to-day database and web | |
539 | programming, there will be operations which are series of replacements, | |
540 | updates, simple transformations, folds, and the like, all of which | |
541 | "obviously" terminate, and which can readily be written primitive recursively. | |
542 | ||
543 | Limited support for higher-order functions could be added, possibly even to | |
544 | Exanoke (as long as the "no mutual recursion" rule is still observed.) | |
545 | After all (and if you'll forgive the anthropomorphizing self-insertion in | |
546 | this sentence), if you pass me a primitive recursive function, and I'm | |
547 | primitive recursive, I'll remain primitive recursive no matter how many times | |
548 | I call your function. | |
549 | ||
550 | Lastly, the requisite etymological denoument: the name "Exanoke" started life | |
551 | as a typo for the word "example". | |
552 | ||
553 | Happy primitive recursing! | |
554 | Chris Pressey | |
555 | Cornwall, UK, WTF | |
556 | Jan 5, 2013 |
0 | Exanoke | |
1 | ======= | |
2 | ||
3 | _Exanoke_ is a pure functional language which is syntactically restricted to | |
4 | expressing the primitive recursive functions. | |
5 | ||
6 | I'll assume you know what a primitive recursive function is. If not, go look | |
7 | it up, as it's quite interesting, if only for the fact that it demonstrates | |
8 | even a genius like Kurt Gödel can sometimes be mistaken. (He initially | |
9 | thought that all functions could be expressed primitive recursively, until | |
10 | Ackermann came up with a counterexample.) | |
11 | ||
12 | So, you have a program. There are two ways that you can ensure that it | |
13 | implements a primtive recursive function: | |
14 | ||
15 | * You can statically analyze the bastard, and prove that all of its | |
16 | loops eventually terminate, and so forth; or | |
17 | * You can write it in a language which is inherently restricted to | |
18 | expressing only primitive recursive functions. | |
19 | ||
20 | The second option is the route [PL-{GOTO}][] takes. But that's an imperative | |
21 | language, and it's fairly easy to restrict an imperative language in this | |
22 | way. In PL-{GOTO}'s case, they just took PL and removed the `GOTO` command. | |
23 | The rest of the language essentially contains only `for` loops, so what you | |
24 | get is something in which you can only express primitive recursive functions. | |
25 | (That imperative programs consisting of only `for` loops can express only and | |
26 | exactly the primitive recursive functions was established by Meyer and Ritchie | |
27 | in "The complexity of loop programs".) | |
28 | ||
29 | But what about functional languages? | |
30 | ||
31 | The approach I've taken in [TPiS][], and that I wanted to take in [Pixley][] | |
32 | and [Robin][], is to provide an unrestricted functional language to the | |
33 | programmer, and statically analyze it to see if you're going and writing | |
34 | primitive recursive functions in it or not. | |
35 | ||
36 | Thing is, that's kind of difficult. Is it possible to take the same approach | |
37 | PL-{GOTO} takes, and *syntactically* restrict a functional language to the | |
38 | primitive recursive functions? | |
39 | ||
40 | I mean, in a trivial sense, it must be; in the original definition, primitive | |
41 | recursive functions *were* functions. (Duh.) But these have a highly | |
42 | arithmetical flavour, with bounded sums and products and whatnot. What | |
43 | would primitive recursion look like in the setting of general (and symbolic) | |
44 | functional programming? | |
45 | ||
46 | Functional languages don't do the `for` loop thing, they do the recursion | |
47 | thing, and there are no natural bounds on that recursion, so some restriction | |
48 | on recursion would have to be captured by the grammar, and... well, it sounds | |
49 | somewhat interesting, and doable, so let's try it. | |
50 | ||
51 | [Pixley]: https://catseye.tc/projects/pixley/ | |
52 | [PL-{GOTO}]: http://catseye.tc/projects/pl-goto.net/ | |
53 | [Robin]: https://github.com/catseye/Robin | |
54 | [TPiS]: http://catseye.tc/projects/tpis/ | |
55 | ||
56 | Ground Rules | |
57 | ------------ | |
58 | ||
59 | Here are some ground rules about how to tell if a functional program is | |
60 | primitive recursive: | |
61 | ||
62 | * It doesn't perform mutual recursion. | |
63 | * When recursion happens, it's always with arguments that are strictly | |
64 | "smaller" values than the arguments the function received. | |
65 | * There is a "smallest" value that an argument can take on, so that | |
66 | there is always a base case to the recursion, so that it always | |
67 | eventually terminates. | |
68 | * Higher-order functions are not used. | |
69 | ||
70 | The first point can be enforced simply by providing a token that | |
71 | refers to the function currently being defined (`self` is a reasonable | |
72 | choice) to permit recursion, but to disallow calling any function that | |
73 | has not yet occurred, lexically, in the program source. | |
74 | ||
75 | The second point can be enforced by stating syntactic rules for | |
76 | "smallerness". (Gee, typing that made me feel a bit like George W. Bush!) | |
77 | ||
78 | The third point can be enforced by providing some default behaviour when | |
79 | functions are called with the "smallest" kinds of values. This could be | |
80 | as simple as terminating the program if you try to find a value "smaller" | |
81 | than the "smallest" value. | |
82 | ||
83 | The fourth point can be enforced by simply disallowing functions to be | |
84 | passed to, or returned from, functions. | |
85 | ||
86 | ### Note on these Criteria ### | |
87 | ||
88 | In fact, these four criteria taken together do not strictly speaking | |
89 | define primitive recursion. They don't exclude functional programs which | |
90 | always terminate but which aren't primitive recursive (for example, the | |
91 | Ackermann function.) However, determining that such functions terminate | |
92 | requires a more sophisticated notion of "smallerness" — a reduction ordering | |
93 | on their arguments. Our notion of "smallerness" will be simple enough that | |
94 | it will be easy to express syntactically, and will only capture primitive | |
95 | recursion. | |
96 | ||
97 | ### Note on Critical Arguments ### | |
98 | ||
99 | I should note, though, that the second point is an oversimplification. | |
100 | Not *all* arguments need to be strictly "smaller" upon recursion — only | |
101 | those arguments which are used to determine *if* the function recurses. | |
102 | I'll call those the _critical arguments_. Other arguments can take on | |
103 | any value (which is useful for having "accumulator" arguments and such.) | |
104 | ||
105 | When statically analyzing a function for primitive recursive-ness, you | |
106 | need to check how it decides to recurse, to find out which arguments are | |
107 | the critical arguments, so you can check that those ones always get | |
108 | "smaller". | |
109 | ||
110 | But we can proceed in a simpler fashion here — we can simply say that | |
111 | the first argument to every function is the critical argument, and all | |
112 | the rest aren't. This is without loss of generality, as we can always | |
113 | split some functionality which would require more than one critical | |
114 | argument across multiple functions, each of which only has one critical | |
115 | argument. (Much like every `for` loop has only one loop variable.) | |
116 | ||
117 | Data types | |
118 | ---------- | |
119 | ||
120 | Let's just go with pairs and atoms for now, although natural numbers would | |
121 | be easy to add too. Following Ruby, atoms are preceded by a colon; while I | |
122 | find this syntax somewhat obnoxious, it is less obnoxious than requiring that | |
123 | atoms are in ALL CAPS, which is what Exanoke originally had. In truth, there | |
124 | would be no real problem with allowing atoms, arguments, and function names | |
125 | (and even `self`) to all be arbitrarily alphanumeric, but it would require | |
126 | more static context checking to sort them all out, and we're trying to be | |
127 | as syntactic as reasonably possible here. | |
128 | ||
129 | `:true` is the only truthy atom. Lists are by convention only, and, by | |
130 | convention, lists compose via the second element of each pair, and `:nil` is | |
131 | the agreed-upon list-terminating atom, much love to it. | |
132 | ||
133 | Grammar | |
134 | ------- | |
135 | ||
136 | Exanoke ::= {FunDef} Expr. | |
137 | FunDef ::= "def" Ident "(" "#" {"," Ident} ")" Expr. | |
138 | Expr ::= "cons" "(" Expr "," Expr ")" | |
139 | | "head" "(" Expr ")" | |
140 | | "tail" "(" Expr ")" | |
141 | | "if" Expr "then" Expr "else" Expr | |
142 | | "self" "(" Smaller {"," Expr} ")" | |
143 | | "eq?" "(" Expr "," Expr")" | |
144 | | "cons?" "(" Expr ")" | |
145 | | "not" "(" Expr ")" | |
146 | | "#" | |
147 | | ":" Ident | |
148 | | Ident ["(" Expr {"," Expr} ")"] | |
149 | | Smaller. | |
150 | Smaller ::= "<head" SmallerTerm | |
151 | | "<tail" SmallerTerm | |
152 | | "<if" Expr "then" Smaller "else" Smaller. | |
153 | SmallerTerm ::= "#" | |
154 | | Smaller. | |
155 | Ident ::= name. | |
156 | ||
157 | The first argument to a function does not have a user-defined name; it is | |
158 | simply referred to as `#`. Again, there would be no real problem if we were | |
159 | to allow the programmer to give it a better name, but more static context | |
160 | checking would be involved. | |
161 | ||
162 | Note that `<if` is not strictly necessary. Its only use is to embed a | |
163 | conditional into the first argument being passed to a recursive call. You | |
164 | could also use a regular `if` and make the recursive call in both branches, | |
165 | one with `:true` as the first argument and the other with `:false`. | |
166 | ||
167 | Examples | |
168 | -------- | |
169 | ||
170 | -> Tests for functionality "Evaluate Exanoke program" | |
171 | ||
172 | `cons` can be used to make lists and trees and things. | |
173 | ||
174 | | cons(:hi, :there) | |
175 | = (:hi :there) | |
176 | ||
177 | | cons(:hi, cons(:there, :nil)) | |
178 | = (:hi (:there :nil)) | |
179 | ||
180 | `head` extracts the first element of a cons cell. | |
181 | ||
182 | | head(cons(:hi, :there)) | |
183 | = :hi | |
184 | ||
185 | | head(:bar) | |
186 | ? head: Not a cons cell | |
187 | ||
188 | `tail` extracts the second element of a cons cell. | |
189 | ||
190 | | tail(cons(:hi, :there)) | |
191 | = :there | |
192 | ||
193 | | tail(tail(cons(:hi, cons(:there, :nil)))) | |
194 | = :nil | |
195 | ||
196 | | tail(:foo) | |
197 | ? tail: Not a cons cell | |
198 | ||
199 | `<head` and `<tail` and syntactic variants of `head` and `tail` which | |
200 | expect their argument to be "smaller than or equal in size to" a critical | |
201 | argument. | |
202 | ||
203 | | <head cons(:hi, :there) | |
204 | ? Expected <smaller>, found "cons" | |
205 | ||
206 | | <tail :hi | |
207 | ? Expected <smaller>, found ":hi" | |
208 | ||
209 | `if` is used for descision-making. | |
210 | ||
211 | | if :true then :hi else :there | |
212 | = :hi | |
213 | ||
214 | | if :hi then :here else :there | |
215 | = :there | |
216 | ||
217 | `eq?` is used to compare atoms. | |
218 | ||
219 | | eq?(:hi, :there) | |
220 | = :false | |
221 | ||
222 | | eq?(:hi, :hi) | |
223 | = :true | |
224 | ||
225 | `eq?` only compares atoms; it can't deal with cons cells. | |
226 | ||
227 | | eq?(cons(:one, :nil), cons(:one, :nil)) | |
228 | = :false | |
229 | ||
230 | `cons?` is used to detect cons cells. | |
231 | ||
232 | | cons?(:hi) | |
233 | = :false | |
234 | ||
235 | | cons?(cons(:wagga, :nil)) | |
236 | = :true | |
237 | ||
238 | `not` does the expected thing when regarding atoms as booleans. | |
239 | ||
240 | | not(:true) | |
241 | = :false | |
242 | ||
243 | | not(:false) | |
244 | = :true | |
245 | ||
246 | Cons cells are falsey. | |
247 | ||
248 | | not(cons(:wanga, :nil)) | |
249 | = :true | |
250 | ||
251 | `self` and `#` can only be used inside function definitions. | |
252 | ||
253 | | # | |
254 | ? Use of "#" outside of a function body | |
255 | ||
256 | | self(:foo) | |
257 | ? Use of "self" outside of a function body | |
258 | ||
259 | We can define functions. Here's the identity function. | |
260 | ||
261 | | def id(#) | |
262 | | # | |
263 | | id(:woo) | |
264 | = :woo | |
265 | ||
266 | Functions must be called with the appropriate arity. | |
267 | ||
268 | | def id(#) | |
269 | | # | |
270 | | id(:foo, :bar) | |
271 | ? Arity mismatch (expected 1, got 2) | |
272 | ||
273 | | def snd(#, another) | |
274 | | another | |
275 | | snd(:foo) | |
276 | ? Arity mismatch (expected 2, got 1) | |
277 | ||
278 | Parameter names must be defined in the function definition. | |
279 | ||
280 | | def id(#) | |
281 | | woo | |
282 | | id(:woo) | |
283 | ? Undefined argument "woo" | |
284 | ||
285 | You can't call a parameter as if it were a function. | |
286 | ||
287 | | def wat(#, woo) | |
288 | | woo(#) | |
289 | | wat(:woo) | |
290 | ? Undefined function "woo" | |
291 | ||
292 | You can't define two functions with the same name. | |
293 | ||
294 | | def wat(#) | |
295 | | :there | |
296 | | def wat(#) | |
297 | | :hi | |
298 | | wat(:woo) | |
299 | ? Function "wat" already defined | |
300 | ||
301 | You can't name a function with an atom. | |
302 | ||
303 | | def :wat(#) | |
304 | | # | |
305 | | :wat(:woo) | |
306 | ? Expected identifier, but found atom (':wat') | |
307 | ||
308 | Every function takes at least one argument. | |
309 | ||
310 | | def wat() | |
311 | | :meow | |
312 | | wat() | |
313 | ? Expected '#', but found ')' | |
314 | ||
315 | The first argument of a function must be `#`. | |
316 | ||
317 | | def wat(meow) | |
318 | | meow | |
319 | | wat(:woo) | |
320 | ? Expected '#', but found 'meow' | |
321 | ||
322 | The subsequent arguments don't have to be called `#`, and in fact, they | |
323 | shouldn't be. | |
324 | ||
325 | | def snd(#, another) | |
326 | | another | |
327 | | snd(:foo, :bar) | |
328 | = :bar | |
329 | ||
330 | | def snd(#, #) | |
331 | | # | |
332 | | snd(:foo, :bar) | |
333 | ? Expected identifier, but found goose egg ('#') | |
334 | ||
335 | A function can call a built-in. | |
336 | ||
337 | | def snoc(#, another) | |
338 | | cons(another, #) | |
339 | | snoc(:there, :hi) | |
340 | = (:hi :there) | |
341 | ||
342 | Functions can call other user-defined functions. | |
343 | ||
344 | | def double(#) | |
345 | | cons(#, #) | |
346 | | def quadruple(#) | |
347 | | double(double(#)) | |
348 | | quadruple(:meow) | |
349 | = ((:meow :meow) (:meow :meow)) | |
350 | ||
351 | Functions must be defined before they are called. | |
352 | ||
353 | | def quadruple(#) | |
354 | | double(double(#)) | |
355 | | def double(#) | |
356 | | cons(#, #) | |
357 | | :meow | |
358 | ? Undefined function "double" | |
359 | ||
360 | Argument names may shadow previously-defined functions, because we | |
361 | can syntactically tell them apart. | |
362 | ||
363 | | def snoc(#, other) | |
364 | | cons(other, #) | |
365 | | def snocsnoc(#, snoc) | |
366 | | snoc(snoc(snoc, #), #) | |
367 | | snocsnoc(:blarch, :glamch) | |
368 | = (:blarch (:blarch :glamch)) | |
369 | ||
370 | A function may recursively call itself, as long as it does so with | |
371 | values which are smaller than or equal in size to the critical argument | |
372 | as the first argument. | |
373 | ||
374 | | def count(#) | |
375 | | self(<tail #) | |
376 | | count(cons(:alpha, cons(:beta, :nil))) | |
377 | ? tail: Not a cons cell | |
378 | ||
379 | | def count(#) | |
380 | | if eq?(#, :nil) then :nil else self(<tail #) | |
381 | | count(cons(:alpha, cons(:beta, :nil))) | |
382 | = :nil | |
383 | ||
384 | | def last(#) | |
385 | | if not(cons?(#)) then # else self(<tail #) | |
386 | | last(cons(:alpha, cons(:beta, :graaap))) | |
387 | = :graaap | |
388 | ||
389 | | def count(#, acc) | |
390 | | if eq?(#, :nil) then acc else self(<tail #, cons(:one, acc)) | |
391 | | count(cons(:A, cons(:B, :nil)), :nil) | |
392 | = (:one (:one :nil)) | |
393 | ||
394 | Arity must match when a function calls itself recursively. | |
395 | ||
396 | | def urff(#) | |
397 | | self(<tail #, <head #) | |
398 | | urff(:woof) | |
399 | ? Arity mismatch on self (expected 1, got 2) | |
400 | ||
401 | | def urff(#, other) | |
402 | | self(<tail #) | |
403 | | urff(:woof, :moo) | |
404 | ? Arity mismatch on self (expected 2, got 1) | |
405 | ||
406 | The remaining tests demonstrate that a function cannot call itself if it | |
407 | does not pass a values which is smaller than or equal in size to the | |
408 | critical argument as the first argument. | |
409 | ||
410 | | def urff(#) | |
411 | | self(cons(#, #)) | |
412 | | urff(:woof) | |
413 | ? Expected <smaller>, found "cons" | |
414 | ||
415 | | def urff(#) | |
416 | | self(#) | |
417 | | urff(:graaap) | |
418 | ? Expected <smaller>, found "#" | |
419 | ||
420 | | def urff(#, boof) | |
421 | | self(boof) | |
422 | | urff(:graaap, :skooorp) | |
423 | ? Expected <smaller>, found "boof" | |
424 | ||
425 | | def urff(#, boof) | |
426 | | self(<tail boof) | |
427 | | urff(:graaap, :skooorp) | |
428 | ? Expected <smaller>, found "boof" | |
429 | ||
430 | | def urff(#) | |
431 | | self(:wanga) | |
432 | | urff(:graaap) | |
433 | ? Expected <smaller>, found ":wanga" | |
434 | ||
435 | | def urff(#) | |
436 | | self(if eq?(:alpha, :alpha) then <head # else <tail #) | |
437 | | urff(:graaap) | |
438 | ? Expected <smaller>, found "if" | |
439 | ||
440 | | def urff(#) | |
441 | | self(<if eq?(:alpha, :alpha) then <head # else <tail #) | |
442 | | urff(:graaap) | |
443 | ? head: Not a cons cell | |
444 | ||
445 | | def urff(#) | |
446 | | self(<if eq?(self(<head #), :alpha) then <head # else <tail #) | |
447 | | urff(:graaap) | |
448 | ? head: Not a cons cell | |
449 | ||
450 | | def urff(#) | |
451 | | self(<if self(<tail #) then <head # else <tail #) | |
452 | | urff(cons(:graaap, :skooorp)) | |
453 | ? tail: Not a cons cell | |
454 | ||
455 | Now, some practical examples, on Peano naturals. Addition: | |
456 | ||
457 | | def inc(#) | |
458 | | cons(:one, #) | |
459 | | def add(#, other) | |
460 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |
461 | | | |
462 | | add(cons(:one, cons(:one, :nil)), cons(:one, :nil)) | |
463 | = (:one (:one (:one :nil))) | |
464 | ||
465 | Multiplication: | |
466 | ||
467 | | def inc(#) | |
468 | | cons(:one, #) | |
469 | | def add(#, other) | |
470 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |
471 | | def mul(#, other) | |
472 | | if eq?(#, :nil) then :nil else | |
473 | | add(other, self(<tail #, other)) | |
474 | | def three(#) | |
475 | | cons(:one, cons(:one, cons(:one, #))) | |
476 | | | |
477 | | mul(three(:nil), three(:nil)) | |
478 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil))))))))) | |
479 | ||
480 | Factorial! There are 24 `:one`'s in this test's expectation. | |
481 | ||
482 | | def inc(#) | |
483 | | cons(:one, #) | |
484 | | def add(#, other) | |
485 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |
486 | | def mul(#, other) | |
487 | | if eq?(#, :nil) then :nil else | |
488 | | add(other, self(<tail #, other)) | |
489 | | def fact(#) | |
490 | | if eq?(#, :nil) then cons(:one, :nil) else | |
491 | | mul(#, self(<tail #)) | |
492 | | def four(#) | |
493 | | cons(:one, cons(:one, cons(:one, cons(:one, #)))) | |
494 | | | |
495 | | fact(four(:nil)) | |
496 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil)))))))))))))))))))))))) | |
497 | ||
498 | Discussion | |
499 | ---------- | |
500 | ||
501 | So, what of it? | |
502 | ||
503 | It was not a particularly challenging design goal to meet; it's one of those | |
504 | things that seems rather obvious after the fact, that you can just dictate | |
505 | that one of the arguments is a critical argument, and only call yourself with | |
506 | some smaller version of your critical argument in that position. Recursive | |
507 | calls map quite straightforwardly to `for` loops, and you end up with what is | |
508 | essentially a functional version of of a `for` program. | |
509 | ||
510 | I guess the question is, is it worth doing this primitive-recursion check as | |
511 | a syntactic, rather than a static semantic, thing? | |
512 | ||
513 | I think it is. If you're concerned at all with writing functions which are | |
514 | guaranteed to terminate, you probably have a plan in mind (however vague) | |
515 | for how they will accomplish this, so it seems reasonable to require that you | |
516 | mark up your function to indicate how it does this. And it's certainly | |
517 | easier to implement than analyzing an arbirarily-written function. | |
518 | ||
519 | Of course, the exact syntactic mechanisms would likely see some improvement | |
520 | in a practical application of this idea. As alluded to in several places | |
521 | in this document, any actually-distinct lexical items (name of the critical | |
522 | argument, and so forth) could be replaced by simple static semantic checks | |
523 | (against a symbol table or whatnot.) Which arguments are the critical | |
524 | arguments for a particular function could be indicated in the source. | |
525 | ||
526 | One criticism (if I can call it that) of primitive recursive functions is | |
527 | that, even though they can express any algorithm which runs in | |
528 | non-deterministic exponential time (which, if you believe "polynomial | |
529 | time = feasible", means, basically, all algorithms you'd ever care about), | |
530 | for any primitive recursively expressed algorithm, theye may be a (much) | |
531 | more efficient algorithm expressed in a general recursive way. | |
532 | ||
533 | However, in my experience, there are many functions, generally non-, or | |
534 | minimally, numerical, which operate on data structures, where the obvious | |
535 | implementation _is_ primitive recursive. In day-to-day database and web | |
536 | programming, there will be operations which are series of replacements, | |
537 | updates, simple transformations, folds, and the like, all of which | |
538 | "obviously" terminate, and which can readily be written primitive recursively. | |
539 | ||
540 | Limited support for higher-order functions could be added, possibly even to | |
541 | Exanoke (as long as the "no mutual recursion" rule is still observed.) | |
542 | After all (and if you'll forgive the anthropomorphizing self-insertion in | |
543 | this sentence), if you pass me a primitive recursive function, and I'm | |
544 | primitive recursive, I'll remain primitive recursive no matter how many times | |
545 | I call your function. | |
546 | ||
547 | Lastly, the requisite etymological denoument: the name "Exanoke" started life | |
548 | as a typo for the word "example". | |
549 | ||
550 | Happy primitive recursing! | |
551 | Chris Pressey | |
552 | Cornwall, UK, WTF | |
553 | Jan 5, 2013 |
461 | 461 | import doctest |
462 | 462 | (fails, something) = doctest.testmod() |
463 | 463 | if fails == 0: |
464 | print "All tests passed." | |
464 | print("All tests passed.") | |
465 | 465 | sys.exit(0) |
466 | 466 | else: |
467 | 467 | sys.exit(1) |
472 | 472 | try: |
473 | 473 | prog = p.program() |
474 | 474 | except SyntaxError as e: |
475 | print >>sys.stderr, str(e) | |
475 | sys.stderr.write(str(e)) | |
476 | sys.stderr.write("\n") | |
476 | 477 | sys.exit(1) |
477 | 478 | if options.show_ast: |
478 | 479 | from pprint import pprint |
480 | 481 | sys.exit(0) |
481 | 482 | try: |
482 | 483 | ev = Evaluator(prog) |
483 | print str(ev.eval(prog)) | |
484 | print(str(ev.eval(prog))) | |
484 | 485 | except TypeError as e: |
485 | print >>sys.stderr, str(e) | |
486 | sys.stderr.write(str(e)) | |
487 | sys.stderr.write("\n") | |
486 | 488 | sys.exit(1) |
487 | 489 | sys.exit(0) |
488 | 490 | |
491 | 493 | import os |
492 | 494 | |
493 | 495 | def rpython_load(filename): |
494 | fd = os.open(filename, os.O_RDONLY, 0644) | |
496 | fd = os.open(filename, os.O_RDONLY, 0o644) | |
495 | 497 | text = '' |
496 | 498 | chunk = os.read(fd, 1024) |
497 | 499 | text += chunk |
507 | 509 | prog = p.program() |
508 | 510 | ev = Evaluator(prog) |
509 | 511 | result = ev.eval(prog) |
510 | print result.__repr__() | |
512 | print(result.__repr__()) | |
511 | 513 | return 0 |
512 | 514 | |
513 | 515 | return rpython_main, None |
0 | 0 | #!/bin/sh |
1 | 1 | |
2 | falderal README.markdown | |
2 | APPLIANCES="tests/appliances/exanoke.py.md" | |
3 | ||
4 | falderal $APPLIANCES README.md |