Make source run under both Python 2 and Python 3. Refactor tests.
Chris Pressey
10 months ago

0 | Exanoke | |

1 | ======= | |

2 | ||

3 | _Exanoke_ is a pure functional language which is syntactically restricted to | |

4 | expressing the primitive recursive functions. | |

5 | ||

6 | I'll assume you know what a primitive recursive function is. If not, go look | |

7 | it up, as it's quite interesting, if only for the fact that it demonstrates | |

8 | even a genius like Kurt Gödel can sometimes be mistaken. (He initially | |

9 | thought that all functions could be expressed primitive recursively, until | |

10 | Ackermann came up with a counterexample.) | |

11 | ||

12 | So, you have a program. There are two ways that you can ensure that it | |

13 | implements a primitive recursive function: | |

14 | ||

15 | * You can statically analyze the bastard, and prove that all of its | |

16 | loops eventually terminate, and so forth; or | |

17 | * You can write it in a language which is inherently restricted to | |

18 | expressing only primitive recursive functions. | |

19 | ||

20 | The second option is the route [PL-{GOTO}][] takes. But that's an imperative | |

21 | language, and it's fairly easy to restrict an imperative language in this | |

22 | way. In PL-{GOTO}'s case, they just took PL and removed the `GOTO` command. | |

23 | The rest of the language essentially contains only `for` loops, so what you | |

24 | get is something in which you can only express primitive recursive functions. | |

25 | (That imperative programs consisting of only `for` loops can express only and | |

26 | exactly the primitive recursive functions was established by Meyer and Ritchie | |

27 | in "The complexity of loop programs".) | |

28 | ||

29 | But what about functional languages? | |

30 | ||

31 | The approach I've taken in [TPiS][], and that I wanted to take in [Pixley][] | |

32 | and [Robin][], is to provide an unrestricted functional language to the | |

33 | programmer, and statically analyze it to see if you're going and writing | |

34 | primitive recursive functions in it or not. | |

35 | ||

36 | Thing is, that's kind of difficult. Is it possible to take the same approach | |

37 | PL-{GOTO} takes, and *syntactically* restrict a functional language to the | |

38 | primitive recursive functions? | |

39 | ||

40 | I mean, in a trivial sense, it must be; in the original definition, primitive | |

41 | recursive functions *were* functions. (Duh.) But these have a highly | |

42 | arithmetical flavour, with bounded sums and products and whatnot. What | |

43 | would primitive recursion look like in the setting of general (and symbolic) | |

44 | functional programming? | |

45 | ||

46 | Functional languages don't do the `for` loop thing, they do the recursion | |

47 | thing, and there are no natural bounds on that recursion, so some restriction | |

48 | on recursion would have to be captured by the grammar, and... well, it sounds | |

49 | somewhat interesting, and doable, so let's try it. | |

50 | ||

51 | [Pixley]: https://catseye.tc/projects/pixley/ | |

52 | [PL-{GOTO}]: http://catseye.tc/projects/pl-goto.net/ | |

53 | [Robin]: https://github.com/catseye/Robin | |

54 | [TPiS]: http://catseye.tc/projects/tpis/ | |

55 | ||

56 | Ground Rules | |

57 | ------------ | |

58 | ||

59 | Here are some ground rules about how to tell if a functional program is | |

60 | primitive recursive: | |

61 | ||

62 | * It doesn't perform mutual recursion. | |

63 | * When recursion happens, it's always with arguments that are strictly | |

64 | "smaller" values than the arguments the function received. | |

65 | * There is a "smallest" value that an argument can take on, so that | |

66 | there is always a base case to the recursion, so that it always | |

67 | eventually terminates. | |

68 | * Higher-order functions are not used. | |

69 | ||

70 | The first point can be enforced simply by providing a token that | |

71 | refers to the function currently being defined (`self` is a reasonable | |

72 | choice) to permit recursion, but to disallow calling any function that | |

73 | has not yet occurred, lexically, in the program source. | |

74 | ||

75 | The second point can be enforced by stating syntactic rules for | |

76 | "smallerness". (Gee, typing that made me feel a bit like George W. Bush!) | |

77 | ||

78 | The third point can be enforced by providing some default behaviour when | |

79 | functions are called with the "smallest" kinds of values. This could be | |

80 | as simple as terminating the program if you try to find a value "smaller" | |

81 | than the "smallest" value. | |

82 | ||

83 | The fourth point can be enforced by simply disallowing functions to be | |

84 | passed to, or returned from, functions. | |

85 | ||
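The first rule lends itself to a very small check. Here is a minimal sketch in Python; it is not taken from the Exanoke implementation, and the `check_call_order` name and the list-of-pairs program model are made up for illustration. Only the wording of the error messages follows the tests in this document.

```python
# Hypothetical model: a program is an ordered list of (name, callees) pairs,
# where `callees` lists the user-defined functions named in that body.
# Recursion is written with `self`, so a body never names its own function.

def check_call_order(definitions):
    """Raise ValueError unless every call targets an earlier definition."""
    defined = set()
    for name, callees in definitions:
        if name in defined:
            raise ValueError('Function "%s" already defined' % name)
        for callee in callees:
            if callee not in defined:
                raise ValueError('Undefined function "%s"' % callee)
        # Only now does `name` become callable, so two functions can never
        # call each other: whichever comes first cannot see the second.
        defined.add(name)
    return defined
```

Because a function only becomes visible after its own body has been checked, mutual recursion is ruled out without any call-graph analysis.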

86 | ### Note on these Criteria ### | |

87 | ||

88 | In fact, these four criteria taken together do not strictly speaking | |

89 | define primitive recursion. They don't exclude functional programs which | |

90 | always terminate but which aren't primitive recursive (for example, the | |

91 | Ackermann function.) However, determining that such functions terminate | |

92 | requires a more sophisticated notion of "smallerness" — a reduction ordering | |

93 | on their arguments. Our notion of "smallerness" will be simple enough that | |

94 | it will be easy to express syntactically, and will only capture primitive | |

95 | recursion. | |

96 | ||

97 | ### Note on Critical Arguments ### | |

98 | ||

99 | I should note, though, that the second point is an oversimplification. | |

100 | Not *all* arguments need to be strictly "smaller" upon recursion — only | |

101 | those arguments which are used to determine *if* the function recurses. | |

102 | I'll call those the _critical arguments_. Other arguments can take on | |

103 | any value (which is useful for having "accumulator" arguments and such.) | |

104 | ||

105 | When statically analyzing a function for primitive recursive-ness, you | |

106 | need to check how it decides to recurse, to find out which arguments are | |

107 | the critical arguments, so you can check that those ones always get | |

108 | "smaller". | |

109 | ||

110 | But we can proceed in a simpler fashion here — we can simply say that | |

111 | the first argument to every function is the critical argument, and all | |

112 | the rest aren't. This is without loss of generality, as we can always | |

113 | split some functionality which would require more than one critical | |

114 | argument across multiple functions, each of which only has one critical | |

115 | argument. (Much like every `for` loop has only one loop variable.) | |

116 | ||
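As a rough illustration of the critical-argument idea, here is a sketch in Python. The encoding is an assumption made for this example only: atoms are strings such as `":nil"` and pairs are 2-tuples. Python does not enforce any of the restrictions, so this shows only the shape of such a function.

```python
# Assumed encoding for illustration: atoms are strings, pairs are 2-tuples.

def count(critical, acc):
    """The first argument alone decides whether we recurse."""
    if critical == ":nil":
        return acc
    # The critical argument shrinks (we pass its tail); the accumulator
    # is free to grow, since it never influences the decision to recurse.
    return count(critical[1], (":one", acc))
```

The accumulator gets bigger on every call, yet termination is not in doubt, because the only argument that is inspected gets strictly smaller.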

117 | Data types | |

118 | ---------- | |

119 | ||

120 | Let's just go with pairs and atoms for now, although natural numbers would | |

121 | be easy to add too. Following Ruby, atoms are preceded by a colon; while I | |

122 | find this syntax somewhat obnoxious, it is less obnoxious than requiring that | |

123 | atoms are in ALL CAPS, which is what Exanoke originally had. In truth, there | |

124 | would be no real problem with allowing atoms, arguments, and function names | |

125 | (and even `self`) to all be arbitrarily alphanumeric, but it would require | |

126 | more static context checking to sort them all out, and we're trying to be | |

127 | as syntactic as reasonably possible here. | |

128 | ||

129 | `:true` is the only truthy atom. Lists are by convention only, and, by | |

130 | convention, lists compose via the second element of each pair, and `:nil` is | |

131 | the agreed-upon list-terminating atom, much love to it. | |

132 | ||

133 | Grammar | |

134 | ------- | |

135 | ||

136 | Exanoke ::= {FunDef} Expr. | |

137 | FunDef ::= "def" Ident "(" "#" {"," Ident} ")" Expr. | |

138 | Expr ::= "cons" "(" Expr "," Expr ")" | |

139 | | "head" "(" Expr ")" | |

140 | | "tail" "(" Expr ")" | |

141 | | "if" Expr "then" Expr "else" Expr | |

142 | | "self" "(" Smaller {"," Expr} ")" | |

143 | | "eq?" "(" Expr "," Expr ")"

144 | | "cons?" "(" Expr ")" | |

145 | | "not" "(" Expr ")" | |

146 | | "#" | |

147 | | ":" Ident | |

148 | | Ident ["(" Expr {"," Expr} ")"] | |

149 | | Smaller. | |

150 | Smaller ::= "<head" SmallerTerm | |

151 | | "<tail" SmallerTerm | |

152 | | "<if" Expr "then" Smaller "else" Smaller. | |

153 | SmallerTerm ::= "#" | |

154 | | Smaller. | |

155 | Ident ::= name. | |

156 | ||

157 | The first argument to a function does not have a user-defined name; it is | |

158 | simply referred to as `#`. Again, there would be no real problem if we were | |

159 | to allow the programmer to give it a better name, but more static context | |

160 | checking would be involved. | |

161 | ||

162 | Note that `<if` is not strictly necessary. Its only use is to embed a | |

163 | conditional into the first argument being passed to a recursive call. You | |

164 | could also use a regular `if` and make the recursive call in both branches, | |

165 | one with `:true` as the first argument and the other with `:false`. | |

166 | ||
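The `Smaller` and `SmallerTerm` productions are small enough to sketch as a recognizer. The following Python is an illustration under assumptions: it works on an already-parsed tree of nested tuples (a made-up representation), whereas a real parser would make this decision while reading tokens.

```python
# Assumed tree shape: ("<head", term), ("<tail", term),
# ("<if", cond, then_branch, else_branch); "#" is the critical argument.

def is_smaller(node):
    """True if `node` matches the Smaller production."""
    if not isinstance(node, tuple) or not node:
        return False
    tag = node[0]
    if tag in ("<head", "<tail") and len(node) == 2:
        return is_smaller_term(node[1])
    if tag == "<if" and len(node) == 4:
        # The condition is an unrestricted Expr; only the branches count.
        return is_smaller(node[2]) and is_smaller(node[3])
    return False

def is_smaller_term(node):
    """True if `node` matches SmallerTerm: `#` itself, or something Smaller."""
    return node == "#" or is_smaller(node)
```

Note that a bare `#` is a `SmallerTerm` but not a `Smaller`, which is exactly why `self(#)` is rejected while `self(<tail #)` is accepted.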

167 | Examples | |

168 | -------- | |

169 | ||

170 | -> Tests for functionality "Evaluate Exanoke program" | |

171 | ||

172 | -> Functionality "Evaluate Exanoke program" is implemented by | |

173 | -> shell command "src/exanoke.py %(test-body-file)" | |

174 | ||

175 | `cons` can be used to make lists and trees and things. | |

176 | ||

177 | | cons(:hi, :there) | |

178 | = (:hi :there) | |

179 | ||

180 | | cons(:hi, cons(:there, :nil)) | |

181 | = (:hi (:there :nil)) | |

182 | ||

183 | `head` extracts the first element of a cons cell. | |

184 | ||

185 | | head(cons(:hi, :there)) | |

186 | = :hi | |

187 | ||

188 | | head(:bar) | |

189 | ? head: Not a cons cell | |

190 | ||

191 | `tail` extracts the second element of a cons cell. | |

192 | ||

193 | | tail(cons(:hi, :there)) | |

194 | = :there | |

195 | ||

196 | | tail(tail(cons(:hi, cons(:there, :nil)))) | |

197 | = :nil | |

198 | ||

199 | | tail(:foo) | |

200 | ? tail: Not a cons cell | |

201 | ||

202 | `<head` and `<tail` are syntactic variants of `head` and `tail` which | |

203 | expect their argument to be "smaller than or equal in size to" a critical | |

204 | argument. | |

205 | ||

206 | | <head cons(:hi, :there) | |

207 | ? Expected <smaller>, found "cons" | |

208 | ||

209 | | <tail :hi | |

210 | ? Expected <smaller>, found ":hi" | |

211 | ||

212 | `if` is used for decision-making. | |

213 | ||

214 | | if :true then :hi else :there | |

215 | = :hi | |

216 | ||

217 | | if :hi then :here else :there | |

218 | = :there | |

219 | ||

220 | `eq?` is used to compare atoms. | |

221 | ||

222 | | eq?(:hi, :there) | |

223 | = :false | |

224 | ||

225 | | eq?(:hi, :hi) | |

226 | = :true | |

227 | ||

228 | `eq?` only compares atoms; it can't deal with cons cells. | |

229 | ||

230 | | eq?(cons(:one, :nil), cons(:one, :nil)) | |

231 | = :false | |

232 | ||

233 | `cons?` is used to detect cons cells. | |

234 | ||

235 | | cons?(:hi) | |

236 | = :false | |

237 | ||

238 | | cons?(cons(:wagga, :nil)) | |

239 | = :true | |

240 | ||

241 | `not` does the expected thing when regarding atoms as booleans. | |

242 | ||

243 | | not(:true) | |

244 | = :false | |

245 | ||

246 | | not(:false) | |

247 | = :true | |

248 | ||

249 | Cons cells are falsey. | |

250 | ||

251 | | not(cons(:wanga, :nil)) | |

252 | = :true | |

253 | ||
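The behaviour of the built-ins tested above can be summarized in a few lines of Python. This is a sketch, not the real interpreter: the `ExanokeError` class, the function names, and the encoding (atoms as strings, pairs as 2-tuples) are assumptions. The treatment of atoms other than `:true` and `:false` under `not` is inferred from the statement that `:true` is the only truthy atom.

```python
class ExanokeError(Exception):
    """Hypothetical error type for this sketch."""

def boolean(flag):
    return ":true" if flag else ":false"

def head(value):
    if not isinstance(value, tuple):
        raise ExanokeError("head: Not a cons cell")
    return value[0]

def tail(value):
    if not isinstance(value, tuple):
        raise ExanokeError("tail: Not a cons cell")
    return value[1]

def eq(a, b):
    # Only atoms can be equal; any cons cell makes the answer :false.
    return boolean(isinstance(a, str) and isinstance(b, str) and a == b)

def is_cons(value):
    return boolean(isinstance(value, tuple))

def not_(value):
    # Everything except :true is falsey, so everything except :true
    # negates to :true (assumed for atoms the tests do not cover).
    return boolean(value != ":true")
```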

254 | `self` and `#` can only be used inside function definitions. | |

255 | ||

256 | | # | |

257 | ? Use of "#" outside of a function body | |

258 | ||

259 | | self(:foo) | |

260 | ? Use of "self" outside of a function body | |

261 | ||

262 | We can define functions. Here's the identity function. | |

263 | ||

264 | | def id(#) | |

265 | | # | |

266 | | id(:woo) | |

267 | = :woo | |

268 | ||

269 | Functions must be called with the appropriate arity. | |

270 | ||

271 | | def id(#) | |

272 | | # | |

273 | | id(:foo, :bar) | |

274 | ? Arity mismatch (expected 1, got 2) | |

275 | ||

276 | | def snd(#, another) | |

277 | | another | |

278 | | snd(:foo) | |

279 | ? Arity mismatch (expected 2, got 1) | |

280 | ||

281 | Parameter names must be defined in the function definition. | |

282 | ||

283 | | def id(#) | |

284 | | woo | |

285 | | id(:woo) | |

286 | ? Undefined argument "woo" | |

287 | ||

288 | You can't call a parameter as if it were a function. | |

289 | ||

290 | | def wat(#, woo) | |

291 | | woo(#) | |

292 | | wat(:woo) | |

293 | ? Undefined function "woo" | |

294 | ||

295 | You can't define two functions with the same name. | |

296 | ||

297 | | def wat(#) | |

298 | | :there | |

299 | | def wat(#) | |

300 | | :hi | |

301 | | wat(:woo) | |

302 | ? Function "wat" already defined | |

303 | ||

304 | You can't name a function with an atom. | |

305 | ||

306 | | def :wat(#) | |

307 | | # | |

308 | | :wat(:woo) | |

309 | ? Expected identifier, but found atom (':wat') | |

310 | ||

311 | Every function takes at least one argument. | |

312 | ||

313 | | def wat() | |

314 | | :meow | |

315 | | wat() | |

316 | ? Expected '#', but found ')' | |

317 | ||

318 | The first argument of a function must be `#`. | |

319 | ||

320 | | def wat(meow) | |

321 | | meow | |

322 | | wat(:woo) | |

323 | ? Expected '#', but found 'meow' | |

324 | ||

325 | The subsequent arguments don't have to be called `#`, and in fact, they | |

326 | shouldn't be. | |

327 | ||

328 | | def snd(#, another) | |

329 | | another | |

330 | | snd(:foo, :bar) | |

331 | = :bar | |

332 | ||

333 | | def snd(#, #) | |

334 | | # | |

335 | | snd(:foo, :bar) | |

336 | ? Expected identifier, but found goose egg ('#') | |

337 | ||

338 | A function can call a built-in. | |

339 | ||

340 | | def snoc(#, another) | |

341 | | cons(another, #) | |

342 | | snoc(:there, :hi) | |

343 | = (:hi :there) | |

344 | ||

345 | Functions can call other user-defined functions. | |

346 | ||

347 | | def double(#) | |

348 | | cons(#, #) | |

349 | | def quadruple(#) | |

350 | | double(double(#)) | |

351 | | quadruple(:meow) | |

352 | = ((:meow :meow) (:meow :meow)) | |

353 | ||

354 | Functions must be defined before they are called. | |

355 | ||

356 | | def quadruple(#) | |

357 | | double(double(#)) | |

358 | | def double(#) | |

359 | | cons(#, #) | |

360 | | :meow | |

361 | ? Undefined function "double" | |

362 | ||

363 | Argument names may shadow previously-defined functions, because we | |

364 | can syntactically tell them apart. | |

365 | ||

366 | | def snoc(#, other) | |

367 | | cons(other, #) | |

368 | | def snocsnoc(#, snoc) | |

369 | | snoc(snoc(snoc, #), #) | |

370 | | snocsnoc(:blarch, :glamch) | |

371 | = (:blarch (:blarch :glamch)) | |

372 | ||

373 | A function may recursively call itself, as long as it does so with | |

374 | values which are smaller than or equal in size to the critical argument | |

375 | as the first argument. | |

376 | ||

377 | | def count(#) | |

378 | | self(<tail #) | |

379 | | count(cons(:alpha, cons(:beta, :nil))) | |

380 | ? tail: Not a cons cell | |

381 | ||

382 | | def count(#) | |

383 | | if eq?(#, :nil) then :nil else self(<tail #) | |

384 | | count(cons(:alpha, cons(:beta, :nil))) | |

385 | = :nil | |

386 | ||

387 | | def last(#) | |

388 | | if not(cons?(#)) then # else self(<tail #) | |

389 | | last(cons(:alpha, cons(:beta, :graaap))) | |

390 | = :graaap | |

391 | ||

392 | | def count(#, acc) | |

393 | | if eq?(#, :nil) then acc else self(<tail #, cons(:one, acc)) | |

394 | | count(cons(:A, cons(:B, :nil)), :nil) | |

395 | = (:one (:one :nil)) | |

396 | ||

397 | Arity must match when a function calls itself recursively. | |

398 | ||

399 | | def urff(#) | |

400 | | self(<tail #, <head #) | |

401 | | urff(:woof) | |

402 | ? Arity mismatch on self (expected 1, got 2) | |

403 | ||

404 | | def urff(#, other) | |

405 | | self(<tail #) | |

406 | | urff(:woof, :moo) | |

407 | ? Arity mismatch on self (expected 2, got 1) | |

408 | ||

409 | The remaining tests demonstrate that a function cannot call itself if it | |

410 | does not pass a value which is smaller than or equal in size to the | |

411 | critical argument as the first argument. | |

412 | ||

413 | | def urff(#) | |

414 | | self(cons(#, #)) | |

415 | | urff(:woof) | |

416 | ? Expected <smaller>, found "cons" | |

417 | ||

418 | | def urff(#) | |

419 | | self(#) | |

420 | | urff(:graaap) | |

421 | ? Expected <smaller>, found "#" | |

422 | ||

423 | | def urff(#, boof) | |

424 | | self(boof) | |

425 | | urff(:graaap, :skooorp) | |

426 | ? Expected <smaller>, found "boof" | |

427 | ||

428 | | def urff(#, boof) | |

429 | | self(<tail boof) | |

430 | | urff(:graaap, :skooorp) | |

431 | ? Expected <smaller>, found "boof" | |

432 | ||

433 | | def urff(#) | |

434 | | self(:wanga) | |

435 | | urff(:graaap) | |

436 | ? Expected <smaller>, found ":wanga" | |

437 | ||

438 | | def urff(#) | |

439 | | self(if eq?(:alpha, :alpha) then <head # else <tail #) | |

440 | | urff(:graaap) | |

441 | ? Expected <smaller>, found "if" | |

442 | ||

443 | | def urff(#) | |

444 | | self(<if eq?(:alpha, :alpha) then <head # else <tail #) | |

445 | | urff(:graaap) | |

446 | ? head: Not a cons cell | |

447 | ||

448 | | def urff(#) | |

449 | | self(<if eq?(self(<head #), :alpha) then <head # else <tail #) | |

450 | | urff(:graaap) | |

451 | ? head: Not a cons cell | |

452 | ||

453 | | def urff(#) | |

454 | | self(<if self(<tail #) then <head # else <tail #) | |

455 | | urff(cons(:graaap, :skooorp)) | |

456 | ? tail: Not a cons cell | |

457 | ||

458 | Now, some practical examples, on Peano naturals. Addition: | |

459 | ||

460 | | def inc(#) | |

461 | | cons(:one, #) | |

462 | | def add(#, other) | |

463 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |

464 | | | |

465 | | add(cons(:one, cons(:one, :nil)), cons(:one, :nil)) | |

466 | = (:one (:one (:one :nil))) | |

467 | ||

468 | Multiplication: | |

469 | ||

470 | | def inc(#) | |

471 | | cons(:one, #) | |

472 | | def add(#, other) | |

473 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |

474 | | def mul(#, other) | |

475 | | if eq?(#, :nil) then :nil else | |

476 | | add(other, self(<tail #, other)) | |

477 | | def three(#) | |

478 | | cons(:one, cons(:one, cons(:one, #))) | |

479 | | | |

480 | | mul(three(:nil), three(:nil)) | |

481 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil))))))))) | |

482 | ||

483 | Factorial! There are 24 `:one`'s in this test's expectation. | |

484 | ||

485 | | def inc(#) | |

486 | | cons(:one, #) | |

487 | | def add(#, other) | |

488 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |

489 | | def mul(#, other) | |

490 | | if eq?(#, :nil) then :nil else | |

491 | | add(other, self(<tail #, other)) | |

492 | | def fact(#) | |

493 | | if eq?(#, :nil) then cons(:one, :nil) else | |

494 | | mul(#, self(<tail #)) | |

495 | | def four(#) | |

496 | | cons(:one, cons(:one, cons(:one, cons(:one, #)))) | |

497 | | | |

498 | | fact(four(:nil)) | |

499 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil)))))))))))))))))))))))) | |

500 | ||
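For readers who would like to poke at these definitions without an Exanoke interpreter to hand, here is a transliteration into Python. It is only a sketch under assumptions: atoms are encoded as strings, pairs as 2-tuples, and `from_int` and `to_int` are helpers invented for this example. In each function the first parameter plays the role of `#`, and every recursive call passes its tail.

```python
NIL = ":nil"

def inc(n):
    return (":one", n)

def add(n, other):
    if n == NIL:
        return other
    return add(n[1], inc(other))        # self(<tail #, inc(other))

def mul(n, other):
    if n == NIL:
        return NIL
    return add(other, mul(n[1], other))  # add(other, self(<tail #, other))

def fact(n):
    if n == NIL:
        return (":one", NIL)
    return mul(n, fact(n[1]))            # mul(#, self(<tail #))

def from_int(k):
    """Hypothetical helper: build the Peano natural for a Python int."""
    n = NIL
    for _ in range(k):
        n = inc(n)
    return n

def to_int(n):
    """Hypothetical helper: count the :one cells in a Peano natural."""
    total = 0
    while n != NIL:
        n = n[1]
        total += 1
    return total

print(to_int(fact(from_int(4))))  # 24
```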

501 | Discussion | |

502 | ---------- | |

503 | ||

504 | So, what of it? | |

505 | ||

506 | It was not a particularly challenging design goal to meet; it's one of those | |

507 | things that seems rather obvious after the fact, that you can just dictate | |

508 | that one of the arguments is a critical argument, and only call yourself with | |

509 | some smaller version of your critical argument in that position. Recursive | |

510 | calls map quite straightforwardly to `for` loops, and you end up with what is | |

511 | essentially a functional version of a `for` program. | |

512 | ||
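To make that correspondence a little more tangible, here is a hedged sketch in Python. It uses plain integers in place of Peano naturals, which is an assumption made to keep `range` usable, and the function names are invented for the example. Each recursive definition sits next to a loop whose trip count is fixed by the critical argument before the loop starts.

```python
def add_recursive(n, other):
    if n == 0:
        return other
    return add_recursive(n - 1, other + 1)

def add_loop(n, other):
    # The bound is evaluated once, up front, so the body cannot extend it.
    for _ in range(n):
        other = other + 1
    return other

def mul_recursive(n, other):
    if n == 0:
        return 0
    return add_recursive(other, mul_recursive(n - 1, other))

def mul_loop(n, other):
    result = 0
    for _ in range(n):
        # One call to another function becomes one nested bounded loop.
        result = add_loop(other, result)
    return result
```

The critical argument becomes the loop bound, and the non-critical arguments become the variables that the loop body updates.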

513 | I guess the question is, is it worth doing this primitive-recursion check as | |

514 | a syntactic, rather than a static semantic, thing? | |

515 | ||

516 | I think it is. If you're concerned at all with writing functions which are | |

517 | guaranteed to terminate, you probably have a plan in mind (however vague) | |

518 | for how they will accomplish this, so it seems reasonable to require that you | |

519 | mark up your function to indicate how it does this. And it's certainly | |

520 | easier to implement than analyzing an arbitrarily-written function. | |

521 | ||

522 | Of course, the exact syntactic mechanisms would likely see some improvement | |

523 | in a practical application of this idea. As alluded to in several places | |

524 | in this document, any actually-distinct lexical items (name of the critical | |

525 | argument, and so forth) could be replaced by simple static semantic checks | |

526 | (against a symbol table or whatnot.) Which arguments are the critical | |

527 | arguments for a particular function could be indicated in the source. | |

528 | ||

529 | One criticism (if I can call it that) of primitive recursive functions is | |

530 | that, even though they can express any algorithm which runs in | |

531 | non-deterministic exponential time (which, if you believe "polynomial | |

532 | time = feasible", means, basically, all algorithms you'd ever care about), | |

533 | for any primitive recursively expressed algorithm, there may be a (much) | |

534 | more efficient algorithm expressed in a general recursive way. | |

535 | ||

536 | However, in my experience, there are many functions, generally non-, or | |

537 | minimally, numerical, which operate on data structures, where the obvious | |

538 | implementation _is_ primitive recursive. In day-to-day database and web | |

539 | programming, there will be operations which are series of replacements, | |

540 | updates, simple transformations, folds, and the like, all of which | |

541 | "obviously" terminate, and which can readily be written primitive recursively. | |

542 | ||

543 | Limited support for higher-order functions could be added, possibly even to | |

544 | Exanoke (as long as the "no mutual recursion" rule is still observed.) | |

545 | After all (and if you'll forgive the anthropomorphizing self-insertion in | |

546 | this sentence), if you pass me a primitive recursive function, and I'm | |

547 | primitive recursive, I'll remain primitive recursive no matter how many times | |

548 | I call your function. | |

549 | ||

550 | Lastly, the requisite etymological denouement: the name "Exanoke" started life | |

551 | as a typo for the word "example". | |

552 | ||

553 | Happy primitive recursing! | |

554 | Chris Pressey | |

555 | Cornwall, UK, WTF | |

556 | Jan 5, 2013 |

0 | Exanoke | |

1 | ======= | |

2 | ||

3 | _Exanoke_ is a pure functional language which is syntactically restricted to | |

4 | expressing the primitive recursive functions. | |

5 | ||

6 | I'll assume you know what a primitive recursive function is. If not, go look | |

7 | it up, as it's quite interesting, if only for the fact that it demonstrates | |

8 | even a genius like Kurt Gödel can sometimes be mistaken. (He initially | |

9 | thought that all functions could be expressed primitive recursively, until | |

10 | Ackermann came up with a counterexample.) | |

11 | ||

12 | So, you have a program. There are two ways that you can ensure that it | |

13 | implements a primtive recursive function: | |

14 | ||

15 | * You can statically analyze the bastard, and prove that all of its | |

16 | loops eventually terminate, and so forth; or | |

17 | * You can write it in a language which is inherently restricted to | |

18 | expressing only primitive recursive functions. | |

19 | ||

20 | The second option is the route [PL-{GOTO}][] takes. But that's an imperative | |

21 | language, and it's fairly easy to restrict an imperative language in this | |

22 | way. In PL-{GOTO}'s case, they just took PL and removed the `GOTO` command. | |

23 | The rest of the language essentially contains only `for` loops, so what you | |

24 | get is something in which you can only express primitive recursive functions. | |

25 | (That imperative programs consisting of only `for` loops can express only and | |

26 | exactly the primitive recursive functions was established by Meyer and Ritchie | |

27 | in "The complexity of loop programs".) | |

28 | ||

29 | But what about functional languages? | |

30 | ||

31 | The approach I've taken in [TPiS][], and that I wanted to take in [Pixley][] | |

32 | and [Robin][], is to provide an unrestricted functional language to the | |

33 | programmer, and statically analyze it to see if you're going and writing | |

34 | primitive recursive functions in it or not. | |

35 | ||

36 | Thing is, that's kind of difficult. Is it possible to take the same approach | |

37 | PL-{GOTO} takes, and *syntactically* restrict a functional language to the | |

38 | primitive recursive functions? | |

39 | ||

40 | I mean, in a trivial sense, it must be; in the original definition, primitive | |

41 | recursive functions *were* functions. (Duh.) But these have a highly | |

42 | arithmetical flavour, with bounded sums and products and whatnot. What | |

43 | would primitive recursion look like in the setting of general (and symbolic) | |

44 | functional programming? | |

45 | ||

46 | Functional languages don't do the `for` loop thing, they do the recursion | |

47 | thing, and there are no natural bounds on that recursion, so some restriction | |

48 | on recursion would have to be captured by the grammar, and... well, it sounds | |

49 | somewhat interesting, and doable, so let's try it. | |

50 | ||

51 | [Pixley]: https://catseye.tc/projects/pixley/ | |

52 | [PL-{GOTO}]: http://catseye.tc/projects/pl-goto.net/ | |

53 | [Robin]: https://github.com/catseye/Robin | |

54 | [TPiS]: http://catseye.tc/projects/tpis/ | |

55 | ||

56 | Ground Rules | |

57 | ------------ | |

58 | ||

59 | Here are some ground rules about how to tell if a functional program is | |

60 | primitive recursive: | |

61 | ||

62 | * It doesn't perform mutual recursion. | |

63 | * When recursion happens, it's always with arguments that are strictly | |

64 | "smaller" values than the arguments the function received. | |

65 | * There is a "smallest" value that an argument can take on, so that | |

66 | there is always a base case to the recursion, so that it always | |

67 | eventually terminates. | |

68 | * Higher-order functions are not used. | |

69 | ||

70 | The first point can be enforced simply by providing a token that | |

71 | refers to the function currently being defined (`self` is a reasonable | |

72 | choice) to permit recursion, but to disallow calling any function that | |

73 | has not yet occurred, lexically, in the program source. | |

74 | ||

75 | The second point can be enforced by stating syntactic rules for | |

76 | "smallerness". (Gee, typing that made me feel a bit like George W. Bush!) | |

77 | ||

78 | The third point can be enforced by providing some default behaviour when | |

79 | functions are called with the "smallest" kinds of values. This could be | |

80 | as simple as terminating the program if you try to find a value "smaller" | |

81 | than the "smallest" value. | |

82 | ||

83 | The fourth point can be enforced by simply disallowing functions to be | |

84 | passed to, or returned from, functions. | |

85 | ||

86 | ### Note on these Criteria ### | |

87 | ||

88 | In fact, these four criteria taken together do not strictly speaking | |

89 | define primitive recursion. They don't exclude functional programs which | |

90 | always terminate but which aren't primitive recursive (for example, the | |

91 | Ackermann function.) However, determining that such functions terminate | |

92 | requires a more sophisticated notion of "smallerness" — a reduction ordering | |

93 | on their arguments. Our notion of "smallerness" will be simple enough that | |

94 | it will be easy to express syntactically, and will only capture primitive | |

95 | recursion. | |

96 | ||

97 | ### Note on Critical Arguments ### | |

98 | ||

99 | I should note, though, that the second point is an oversimplification. | |

100 | Not *all* arguments need to be strictly "smaller" upon recursion — only | |

101 | those arguments which are used to determine *if* the function recurses. | |

102 | I'll call those the _critical arguments_. Other arguments can take on | |

103 | any value (which is useful for having "accumulator" arguments and such.) | |

104 | ||

105 | When statically analyzing a function for primitive recursive-ness, you | |

106 | need to check how it decides to recurse, to find out which arguments are | |

107 | the critical arguments, so you can check that those ones always get | |

108 | "smaller". | |

109 | ||

110 | But we can proceed in a simpler fashion here — we can simply say that | |

111 | the first argument to every function is the critical argument, and all | |

112 | the rest aren't. This is without loss of generality, as we can always | |

113 | split some functionality which would require more than one critical | |

114 | argument across multiple functions, each of which only has one critical | |

115 | argument. (Much like every `for` loop has only one loop variable.) | |

116 | ||

117 | Data types | |

118 | ---------- | |

119 | ||

120 | Let's just go with pairs and atoms for now, although natural numbers would | |

121 | be easy to add too. Following Ruby, atoms are preceded by a colon; while I | |

122 | find this syntax somewhat obnoxious, it is less obnoxious than requiring that | |

123 | atoms are in ALL CAPS, which is what Exanoke originally had. In truth, there | |

124 | would be no real problem with allowing atoms, arguments, and function names | |

125 | (and even `self`) to all be arbitrarily alphanumeric, but it would require | |

126 | more static context checking to sort them all out, and we're trying to be | |

127 | as syntactic as reasonably possible here. | |

128 | ||

129 | `:true` is the only truthy atom. Lists are by convention only, and, by | |

130 | convention, lists compose via the second element of each pair, and `:nil` is | |

131 | the agreed-upon list-terminating atom, much love to it. | |

132 | ||

133 | Grammar | |

134 | ------- | |

135 | ||

136 | Exanoke ::= {FunDef} Expr. | |

137 | FunDef ::= "def" Ident "(" "#" {"," Ident} ")" Expr. | |

138 | Expr ::= "cons" "(" Expr "," Expr ")" | |

139 | | "head" "(" Expr ")" | |

140 | | "tail" "(" Expr ")" | |

141 | | "if" Expr "then" Expr "else" Expr | |

142 | | "self" "(" Smaller {"," Expr} ")" | |

143 | | "eq?" "(" Expr "," Expr")" | |

144 | | "cons?" "(" Expr ")" | |

145 | | "not" "(" Expr ")" | |

146 | | "#" | |

147 | | ":" Ident | |

148 | | Ident ["(" Expr {"," Expr} ")"] | |

149 | | Smaller. | |

150 | Smaller ::= "<head" SmallerTerm | |

151 | | "<tail" SmallerTerm | |

152 | | "<if" Expr "then" Smaller "else" Smaller. | |

153 | SmallerTerm ::= "#" | |

154 | | Smaller. | |

155 | Ident ::= name. | |

156 | ||

157 | The first argument to a function does not have a user-defined name; it is | |

158 | simply referred to as `#`. Again, there would be no real problem if we were | |

159 | to allow the programmer to give it a better name, but more static context | |

160 | checking would be involved. | |

161 | ||

162 | Note that `<if` is not strictly necessary. Its only use is to embed a | |

163 | conditional into the first argument being passed to a recursive call. You | |

164 | could also use a regular `if` and make the recursive call in both branches, | |

165 | one with `:true` as the first argument and the other with `:false`. | |

166 | ||

167 | Examples | |

168 | -------- | |

169 | ||

170 | -> Tests for functionality "Evaluate Exanoke program" | |

171 | ||

172 | `cons` can be used to make lists and trees and things. | |

173 | ||

174 | | cons(:hi, :there) | |

175 | = (:hi :there) | |

176 | ||

177 | | cons(:hi, cons(:there, :nil)) | |

178 | = (:hi (:there :nil)) | |

179 | ||

180 | `head` extracts the first element of a cons cell. | |

181 | ||

182 | | head(cons(:hi, :there)) | |

183 | = :hi | |

184 | ||

185 | | head(:bar) | |

186 | ? head: Not a cons cell | |

187 | ||

188 | `tail` extracts the second element of a cons cell. | |

189 | ||

190 | | tail(cons(:hi, :there)) | |

191 | = :there | |

192 | ||

193 | | tail(tail(cons(:hi, cons(:there, :nil)))) | |

194 | = :nil | |

195 | ||

196 | | tail(:foo) | |

197 | ? tail: Not a cons cell | |

198 | ||

199 | `<head` and `<tail` are syntactic variants of `head` and `tail` which | |

200 | expect their argument to be "smaller than or equal in size to" a critical | |

201 | argument. | |

202 | ||

203 | | <head cons(:hi, :there) | |

204 | ? Expected <smaller>, found "cons" | |

205 | ||

206 | | <tail :hi | |

207 | ? Expected <smaller>, found ":hi" | |

208 | ||

209 | `if` is used for decision-making. | |

210 | ||

211 | | if :true then :hi else :there | |

212 | = :hi | |

213 | ||

214 | | if :hi then :here else :there | |

215 | = :there | |

216 | ||

217 | `eq?` is used to compare atoms. | |

218 | ||

219 | | eq?(:hi, :there) | |

220 | = :false | |

221 | ||

222 | | eq?(:hi, :hi) | |

223 | = :true | |

224 | ||

225 | `eq?` only compares atoms; it can't deal with cons cells. | |

226 | ||

227 | | eq?(cons(:one, :nil), cons(:one, :nil)) | |

228 | = :false | |

229 | ||

230 | `cons?` is used to detect cons cells. | |

231 | ||

232 | | cons?(:hi) | |

233 | = :false | |

234 | ||

235 | | cons?(cons(:wagga, :nil)) | |

236 | = :true | |

237 | ||

238 | `not` does the expected thing when regarding atoms as booleans. | |

239 | ||

240 | | not(:true) | |

241 | = :false | |

242 | ||

243 | | not(:false) | |

244 | = :true | |

245 | ||

246 | Cons cells are falsey. | |

247 | ||

248 | | not(cons(:wanga, :nil)) | |

249 | = :true | |

250 | ||

251 | `self` and `#` can only be used inside function definitions. | |

252 | ||

253 | | # | |

254 | ? Use of "#" outside of a function body | |

255 | ||

256 | | self(:foo) | |

257 | ? Use of "self" outside of a function body | |

258 | ||

259 | We can define functions. Here's the identity function. | |

260 | ||

261 | | def id(#) | |

262 | | # | |

263 | | id(:woo) | |

264 | = :woo | |

265 | ||

266 | Functions must be called with the appropriate arity. | |

267 | ||

268 | | def id(#) | |

269 | | # | |

270 | | id(:foo, :bar) | |

271 | ? Arity mismatch (expected 1, got 2) | |

272 | ||

273 | | def snd(#, another) | |

274 | | another | |

275 | | snd(:foo) | |

276 | ? Arity mismatch (expected 2, got 1) | |

277 | ||

278 | Parameter names must be defined in the function definition. | |

279 | ||

280 | | def id(#) | |

281 | | woo | |

282 | | id(:woo) | |

283 | ? Undefined argument "woo" | |

284 | ||

285 | You can't call a parameter as if it were a function. | |

286 | ||

287 | | def wat(#, woo) | |

288 | | woo(#) | |

289 | | wat(:woo) | |

290 | ? Undefined function "woo" | |

291 | ||

292 | You can't define two functions with the same name. | |

293 | ||

294 | | def wat(#) | |

295 | | :there | |

296 | | def wat(#) | |

297 | | :hi | |

298 | | wat(:woo) | |

299 | ? Function "wat" already defined | |

300 | ||

301 | You can't name a function with an atom. | |

302 | ||

303 | | def :wat(#) | |

304 | | # | |

305 | | :wat(:woo) | |

306 | ? Expected identifier, but found atom (':wat') | |

307 | ||

308 | Every function takes at least one argument. | |

309 | ||

310 | | def wat() | |

311 | | :meow | |

312 | | wat() | |

313 | ? Expected '#', but found ')' | |

314 | ||

315 | The first argument of a function must be `#`. | |

316 | ||

317 | | def wat(meow) | |

318 | | meow | |

319 | | wat(:woo) | |

320 | ? Expected '#', but found 'meow' | |

321 | ||

322 | The subsequent arguments don't have to be called `#`, and in fact, they | |

323 | shouldn't be. | |

324 | ||

325 | | def snd(#, another) | |

326 | | another | |

327 | | snd(:foo, :bar) | |

328 | = :bar | |

329 | ||

330 | | def snd(#, #) | |

331 | | # | |

332 | | snd(:foo, :bar) | |

333 | ? Expected identifier, but found goose egg ('#') | |

334 | ||

335 | A function can call a built-in. | |

336 | ||

337 | | def snoc(#, another) | |

338 | | cons(another, #) | |

339 | | snoc(:there, :hi) | |

340 | = (:hi :there) | |

341 | ||

342 | Functions can call other user-defined functions. | |

343 | ||

344 | | def double(#) | |

345 | | cons(#, #) | |

346 | | def quadruple(#) | |

347 | | double(double(#)) | |

348 | | quadruple(:meow) | |

349 | = ((:meow :meow) (:meow :meow)) | |

350 | ||

351 | Functions must be defined before they are called. | |

352 | ||

353 | | def quadruple(#) | |

354 | | double(double(#)) | |

355 | | def double(#) | |

356 | | cons(#, #) | |

357 | | :meow | |

358 | ? Undefined function "double" | |

359 | ||

360 | Argument names may shadow previously-defined functions, because we | |

361 | can syntactically tell them apart. | |

362 | ||

363 | | def snoc(#, other) | |

364 | | cons(other, #) | |

365 | | def snocsnoc(#, snoc) | |

366 | | snoc(snoc(snoc, #), #) | |

367 | | snocsnoc(:blarch, :glamch) | |

368 | = (:blarch (:blarch :glamch)) | |

369 | ||

370 | A function may recursively call itself, as long as it does so with | |

371 | values which are smaller than or equal in size to the critical argument | |

372 | as the first argument. | |

373 | ||

374 | | def count(#) | |

375 | | self(<tail #) | |

376 | | count(cons(:alpha, cons(:beta, :nil))) | |

377 | ? tail: Not a cons cell | |

378 | ||

379 | | def count(#) | |

380 | | if eq?(#, :nil) then :nil else self(<tail #) | |

381 | | count(cons(:alpha, cons(:beta, :nil))) | |

382 | = :nil | |

383 | ||

384 | | def last(#) | |

385 | | if not(cons?(#)) then # else self(<tail #) | |

386 | | last(cons(:alpha, cons(:beta, :graaap))) | |

387 | = :graaap | |

388 | ||

389 | | def count(#, acc) | |

390 | | if eq?(#, :nil) then acc else self(<tail #, cons(:one, acc)) | |

391 | | count(cons(:A, cons(:B, :nil)), :nil) | |

392 | = (:one (:one :nil)) | |

393 | ||

394 | Arity must match when a function calls itself recursively. | |

395 | ||

396 | | def urff(#) | |

397 | | self(<tail #, <head #) | |

398 | | urff(:woof) | |

399 | ? Arity mismatch on self (expected 1, got 2) | |

400 | ||

401 | | def urff(#, other) | |

402 | | self(<tail #) | |

403 | | urff(:woof, :moo) | |

404 | ? Arity mismatch on self (expected 2, got 1) | |

405 | ||

406 | The remaining tests demonstrate that a function cannot call itself if it | |

407 | does not pass a value which is smaller than or equal in size to the | |

408 | critical argument as the first argument. | |

409 | ||

410 | | def urff(#) | |

411 | | self(cons(#, #)) | |

412 | | urff(:woof) | |

413 | ? Expected <smaller>, found "cons" | |

414 | ||

415 | | def urff(#) | |

416 | | self(#) | |

417 | | urff(:graaap) | |

418 | ? Expected <smaller>, found "#" | |

419 | ||

420 | | def urff(#, boof) | |

421 | | self(boof) | |

422 | | urff(:graaap, :skooorp) | |

423 | ? Expected <smaller>, found "boof" | |

424 | ||

425 | | def urff(#, boof) | |

426 | | self(<tail boof) | |

427 | | urff(:graaap, :skooorp) | |

428 | ? Expected <smaller>, found "boof" | |

429 | ||

430 | | def urff(#) | |

431 | | self(:wanga) | |

432 | | urff(:graaap) | |

433 | ? Expected <smaller>, found ":wanga" | |

434 | ||

435 | | def urff(#) | |

436 | | self(if eq?(:alpha, :alpha) then <head # else <tail #) | |

437 | | urff(:graaap) | |

438 | ? Expected <smaller>, found "if" | |

439 | ||

440 | | def urff(#) | |

441 | | self(<if eq?(:alpha, :alpha) then <head # else <tail #) | |

442 | | urff(:graaap) | |

443 | ? head: Not a cons cell | |

444 | ||

445 | | def urff(#) | |

446 | | self(<if eq?(self(<head #), :alpha) then <head # else <tail #) | |

447 | | urff(:graaap) | |

448 | ? head: Not a cons cell | |

449 | ||

450 | | def urff(#) | |

451 | | self(<if self(<tail #) then <head # else <tail #) | |

452 | | urff(cons(:graaap, :skooorp)) | |

453 | ? tail: Not a cons cell | |

454 | ||

455 | Now, some practical examples, on Peano naturals. Addition: | |

456 | ||

457 | | def inc(#) | |

458 | | cons(:one, #) | |

459 | | def add(#, other) | |

460 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |

461 | | | |

462 | | add(cons(:one, cons(:one, :nil)), cons(:one, :nil)) | |

463 | = (:one (:one (:one :nil))) | |

464 | ||

465 | Multiplication: | |

466 | ||

467 | | def inc(#) | |

468 | | cons(:one, #) | |

469 | | def add(#, other) | |

470 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |

471 | | def mul(#, other) | |

472 | | if eq?(#, :nil) then :nil else | |

473 | | add(other, self(<tail #, other)) | |

474 | | def three(#) | |

475 | | cons(:one, cons(:one, cons(:one, #))) | |

476 | | | |

477 | | mul(three(:nil), three(:nil)) | |

478 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil))))))))) | |

479 | ||

480 | Factorial! There are 24 `:one`'s in this test's expectation. | |

481 | ||

482 | | def inc(#) | |

483 | | cons(:one, #) | |

484 | | def add(#, other) | |

485 | | if eq?(#, :nil) then other else self(<tail #, inc(other)) | |

486 | | def mul(#, other) | |

487 | | if eq?(#, :nil) then :nil else | |

488 | | add(other, self(<tail #, other)) | |

489 | | def fact(#) | |

490 | | if eq?(#, :nil) then cons(:one, :nil) else | |

491 | | mul(#, self(<tail #)) | |

492 | | def four(#) | |

493 | | cons(:one, cons(:one, cons(:one, cons(:one, #)))) | |

494 | | | |

495 | | fact(four(:nil)) | |

496 | = (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one (:one :nil)))))))))))))))))))))))) | |
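
For readers who want to follow the arithmetic outside Exanoke, here is a Python sketch that mirrors the `inc`, `add`, `mul`, and `fact` definitions above. The encoding (strings for atoms, 2-tuples for pairs) and the helpers `to_int` and `from_int` are assumptions added for illustration. Each recursive call passes the tail of its first argument, just as `self(<tail #, ...)` does.

```python
# Sketch only: Peano naturals as nested 2-tuples ending in ":nil".
# The encoding and the to_int/from_int helpers are illustrative assumptions.
NIL = ":nil"
ONE = ":one"


def inc(n):
    return (ONE, n)


def add(n, other):
    # Recurse only on the tail of the critical (first) argument.
    if n == NIL:
        return other
    return add(n[1], inc(other))


def mul(n, other):
    if n == NIL:
        return NIL
    return add(other, mul(n[1], other))


def fact(n):
    if n == NIL:
        return inc(NIL)
    return mul(n, fact(n[1]))


def from_int(k):
    """Build the Peano natural for a non-negative Python int."""
    n = NIL
    for _ in range(k):
        n = inc(n)
    return n


def to_int(n):
    """Count the :one cells in a Peano natural."""
    count = 0
    while n != NIL:
        count += 1
        n = n[1]
    return count
```

Here `to_int(fact(from_int(4)))` counts the same 24 `:one` cells that appear in the factorial test's expectation.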

497 | ||

498 | Discussion | |

499 | ---------- | |

500 | ||

501 | So, what of it? | |

502 | ||

503 | It was not a particularly challenging design goal to meet; it's one of those | |

504 | things that seems rather obvious after the fact, that you can just dictate | |

505 | that one of the arguments is a critical argument, and only call yourself with | |

506 | some smaller version of your critical argument in that position. Recursive | |

507 | calls map quite straightforwardly to `for` loops, and you end up with what is | |

508 | essentially a functional version of a `for` program. | |
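
The correspondence with `for` loops can be sketched in Python, using the two-argument `count` function from the Examples section. The tuple encoding and the `spine` helper are assumptions for this illustration. The loop runs once per cell of the critical argument, so its iteration count is fixed by that argument, just as the depth of the recursion is.

```python
# Sketch only: atoms are strings, pairs are 2-tuples (an assumed encoding).
NIL = ":nil"
ONE = ":one"


def count_recursive(lst, acc):
    """Mirror of the Exanoke count(#, acc): recurse on the tail of #."""
    if lst == NIL:
        return acc
    return count_recursive(lst[1], (ONE, acc))


def spine(lst):
    """Yield each pair along the list's spine (hypothetical helper)."""
    while isinstance(lst, tuple):
        yield lst
        lst = lst[1]


def count_loop(lst, acc):
    """The same computation as a bounded loop over the critical argument."""
    for _cell in spine(lst):
        acc = (ONE, acc)
    return acc
```

For a proper list such as `(":A", (":B", NIL))`, both functions return `(ONE, (ONE, NIL))`.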

509 | ||

510 | I guess the question is, is it worth doing this primitive-recursion check as | |

511 | a syntactic, rather than a static semantic, thing? | |

512 | ||

513 | I think it is. If you're concerned at all with writing functions which are | |

514 | guaranteed to terminate, you probably have a plan in mind (however vague) | |

515 | for how they will accomplish this, so it seems reasonable to require that you | |

516 | mark up your function to indicate how it does this. And it's certainly | |

517 | easier to implement than analyzing an arbitrarily-written function. | |

518 | ||

519 | Of course, the exact syntactic mechanisms would likely see some improvement | |

520 | in a practical application of this idea. As alluded to in several places | |

521 | in this document, any actually-distinct lexical items (name of the critical | |

522 | argument, and so forth) could be replaced by simple static semantic checks | |

523 | (against a symbol table or whatnot.) Which arguments are the critical | |

524 | arguments for a particular function could be indicated in the source. | |

525 | ||

526 | One criticism (if I can call it that) of primitive recursive functions is | |

527 | that, even though they can express any algorithm which runs in | |

528 | non-deterministic exponential time (which, if you believe "polynomial | |

529 | time = feasible", means, basically, all algorithms you'd ever care about), | |

530 | for any primitive recursively expressed algorithm, there may be a (much) | |

531 | more efficient algorithm expressed in a general recursive way. | |

532 | ||

533 | However, in my experience, there are many functions, generally non-, or | |

534 | minimally, numerical, which operate on data structures, where the obvious | |

535 | implementation _is_ primitive recursive. In day-to-day database and web | |

536 | programming, there will be operations which are series of replacements, | |

537 | updates, simple transformations, folds, and the like, all of which | |

538 | "obviously" terminate, and which can readily be written primitive recursively. | |

539 | ||

540 | Limited support for higher-order functions could be added, possibly even to | |

541 | Exanoke (as long as the "no mutual recursion" rule is still observed.) | |

542 | After all (and if you'll forgive the anthropomorphizing self-insertion in | |

543 | this sentence), if you pass me a primitive recursive function, and I'm | |

544 | primitive recursive, I'll remain primitive recursive no matter how many times | |

545 | I call your function. | |

546 | ||

547 | Lastly, the requisite etymological denouement: the name "Exanoke" started life | |

548 | as a typo for the word "example". | |

549 | ||

550 | Happy primitive recursing! | |

551 | Chris Pressey | |

552 | Cornwall, UK, WTF | |

553 | Jan 5, 2013 |

461 | 461 | import doctest |

462 | 462 | (fails, something) = doctest.testmod() |

463 | 463 | if fails == 0: |

464 | print "All tests passed." | |

464 | print("All tests passed.") | |

465 | 465 | sys.exit(0) |

466 | 466 | else: |

467 | 467 | sys.exit(1) |

472 | 472 | try: |

473 | 473 | prog = p.program() |

474 | 474 | except SyntaxError as e: |

475 | print >>sys.stderr, str(e) | |

475 | sys.stderr.write(str(e)) | |

476 | sys.stderr.write("\n") | |

476 | 477 | sys.exit(1) |

477 | 478 | if options.show_ast: |

478 | 479 | from pprint import pprint |

480 | 481 | sys.exit(0) |

481 | 482 | try: |

482 | 483 | ev = Evaluator(prog) |

483 | print str(ev.eval(prog)) | |

484 | print(str(ev.eval(prog))) | |

484 | 485 | except TypeError as e: |

485 | print >>sys.stderr, str(e) | |

486 | sys.stderr.write(str(e)) | |

487 | sys.stderr.write("\n") | |

486 | 488 | sys.exit(1) |

487 | 489 | sys.exit(0) |

488 | 490 | |

491 | 493 | import os |

492 | 494 | |

493 | 495 | def rpython_load(filename): |

494 | fd = os.open(filename, os.O_RDONLY, 0644) | |

496 | fd = os.open(filename, os.O_RDONLY, 0o644) | |

495 | 497 | text = '' |

496 | 498 | chunk = os.read(fd, 1024) |

497 | 499 | text += chunk |

507 | 509 | prog = p.program() |

508 | 510 | ev = Evaluator(prog) |

509 | 511 | result = ev.eval(prog) |

510 | print result.__repr__() | |

512 | print(result.__repr__()) | |

511 | 513 | return 0 |

512 | 514 | |

513 | 515 | return rpython_main, None |

0 | 0 | #!/bin/sh |

1 | 1 | |

2 | falderal README.markdown | |

2 | APPLIANCES="tests/appliances/exanoke.py.md" | |

3 | ||

4 | falderal $APPLIANCES README.md |