Merge pull request #1 from cpressey/modernize
Modernize sources and add JS build and demo.
Chris Pressey authored 5 years ago
GitHub committed 5 years ago
0 | b231a35b4312e97e69604d9a7653053f5ea3d227 rel_1_0_2007_1111 | |
1 | 03a2e99d04ad40cc6cccb6730e063cd49a263866 rel_1_0_2011_1214 | |
2 | 10a9c2ce499315db903a647e6930f94a6ebc7e15 rel_1_0_2014_0819 | |
3 | 982794c8cc0e383108af3378db31279f9c78ea52 rel_1_0_2015_0101 |
0 | The Emmental Programming Language | |
1 | ================================= | |
2 | ||
3 | November 2007, Chris Pressey, Cat's Eye Technologies | |
4 | ||
5 | Introduction | |
6 | ------------ | |
7 | ||
8 | Emmental is a self-modifying programming language. That is not to say | |
9 | that is a language in which programs are self-modifying; rather, it is | |
10 | the language itself, as defined by a meta-circular interpreter, that can | |
11 | be modified during the course of a running program. Indeed, this is how | |
12 | Emmental, without conventional conditional and repetition/recursion | |
13 | constructs, achieves Turing-completeness. | |
14 | ||
15 | Meta-circular Interpreters | |
16 | -------------------------- | |
17 | ||
18 | One way to attempt to define a language is by giving what's called a | |
19 | *meta-circular interpreter* (often shortened to "MCI" in this document.) | |
20 | This is an interpreter for some language which is written in that same | |
21 | language (or at least, a language which is very close to it.) | |
22 | ||
23 | Meta-circular interpreters were a popular way to the describe the | |
24 | semantics of programming languages, especially LISP-like languages, and | |
25 | especially before the advent of denotational semantics. The term | |
26 | "meta-circular" was apparently coined by John C. Reynolds in his paper | |
27 | "Definitional Interpreters for Higher-Order Programming Languages" (1972 | |
28 | Proceedings ACM National Conference.) | |
29 | ||
30 | Of course, in the real world, MCI's are not often used. They certainly | |
31 | *can* be used: if you have a working Scheme interpreter that came with | |
32 | your computer system, there is nothing stopping you from writing another | |
33 | Scheme interpreter in Scheme, and running your programs on your | |
34 | interpreter (which is itself running on your system's interpreter.) | |
35 | However, this is quite a bit less efficient due to the duplication of | |
36 | effort. A somewhat more realistic case might be if your system came | |
37 | with, say, a Scheme compiler. You might then feed your Scheme | |
38 | interpreter (written in Scheme) through that to make a native Scheme | |
39 | interpreter, and use that to interpret your programs. (In this setup, | |
40 | the interpreter is usually described as "self-hosting" rather than | |
41 | "meta-circular".) | |
42 | ||
43 | But, as should be obvious, you already need an implementation of Scheme | |
44 | for your Scheme interpreter written in Scheme to be of much practical | |
45 | use to you. If your meta-circular interpreter is all you have, you won't | |
46 | be able to use it to run or understand Scheme programs. Because the MCI | |
47 | is defined in terms of itself, you'll need some other source of | |
48 | "understanding how it works" to make it complete. This understanding | |
49 | might come from an implementation in some other programming language, or | |
50 | a specification in some formal language, or a description in some | |
51 | natural language, or simply from intuition — but it has to come from | |
52 | somewhere. | |
53 | ||
54 | Assuming that we do have that external source of understanding, the | |
55 | meta-circular interpreter can come in quite handy in codifying the | |
56 | semantics of the language. And, in Emmental's case, *altering* those | |
57 | semantics: Emmental's MCI supports operations which instruct Emmental's | |
58 | MCI to modify its behaviour. | |
59 | ||
60 | Interpreter Structure | |
61 | --------------------- | |
62 | ||
63 | To describe the structure of Emmental's MCI, we first examine the | |
64 | general structure of interpreters. If you've ever written a virtual | |
65 | machine executor in, say, C, you've noticed that it has the general form | |
66 | ||
67 | pc = start; | |
68 | while (!done) { | |
69 | switch (instruction[pc]) { | |
70 | case INSTR_ONE: | |
71 | /* implement semantics of INSTR_ONE */ | |
72 | pc += advance(INSTR_ONE); | |
73 | break; | |
74 | case INSTR_TWO: | |
75 | /* implement semantics of INSTR_TWO */ | |
76 | pc += advance(INSTR_TWO); | |
77 | break; | |
78 | /* ... */ | |
79 | case INSTR_HALT: | |
80 | done = 1; | |
81 | break; | |
82 | default: | |
83 | perror("Invalid opcode"); | |
84 | } | |
85 | } | |
86 | ||
87 | Note that `advance()` is some function that computes how far the program | |
88 | counter is advanced on that instruction. This value is typically +1 for | |
89 | most instructions, but more or less than 1 (and dependent on the state | |
90 | of the program) for a handful of "branch" instructions. Note also that | |
91 | `advance()` would not typically be implemented in C as a function; I'm | |
92 | just showing it like this to emphasize the regular structure. | |
93 | ||
94 | From this we infer that the basic structure of an interpreter is a | |
95 | *dictionary* or *map* that associates program symbols with operations. | |
96 | There is some extra housekeeping like the fetch-execute cycle that | |
97 | surrounds this dictionary, but this can (hopefully) be handled mostly | |
98 | automatically, freeing us to concentrate on *symbols* and *operations*. | |
99 | ||
100 | The symbols could be taken from any finite alphabet, but in Emmental, to | |
101 | keep things relatively simple, we'll just use the ASCII character set. | |
102 | (Actually, to be precise, this is the full 8-bit ASCII character set. | |
103 | Non-printable control characters are allowed, as are characters between | |
104 | 128 and 255, and each has a distinct meaning. But their representations | |
105 | are not defined.) | |
106 | ||
107 | The operations can be thought of, abstractly, as functions which | |
108 | transform program states. Or they can be thought of, concretely, as | |
109 | segments of code — mini-programs which implement these functions. In the | |
110 | case of a meta-circular interpreter, these mini-programs would be | |
111 | written *in the language being interpreted*. | |
112 | ||
113 | To extend this idea to a *self-modifying* meta-circular interpreter, the | |
114 | operations can be thought of as functions which transform both program | |
115 | states *and* interpreter definitions. (Alternatively, the interpreter | |
116 | definition could be thought of as part of the program state, although I | |
117 | feel that's a bit gorier a way to look at it, and I prefer the other | |
118 | view, at least for Emmental.) | |
119 | ||
120 | In Emmental, most operations leave the interpreter definition unchanged. | |
121 | However, there is one operation which alters the interpreter definition, | |
122 | and it is this altered definition that is used to interpret the | |
123 | remainder of the program. | |
124 | ||
125 | Emmental Semantics (in Emmental) | |
126 | -------------------------------- | |
127 | ||
128 | Emmental is essentially a stack-based language. (There's also a queue, | |
129 | but it's not as central.) All operations implicitly get data from, and | |
130 | implicitly deposit results back onto, a single stack. For | |
131 | orthogonality's sake, this stack may contain only ASCII symbols. (And | |
132 | note that trying to pop an empty stack, or dequeue an empty queue, is an | |
133 | error that aborts the program.) | |
134 | ||
135 | Note that because we've established that an interpreter (at least, | |
136 | insofar as Emmental ever needs to know) is simply a map that takes | |
137 | symbols to operations, and that operations in Emmental are defined | |
138 | (meta-circularly) as Emmental programs, we can use the following | |
139 | notation to describe interpreters: | |
140 | ||
141 | % → XYZ+*! | |
142 | & → 'ap'ag'ag | |
143 | ||
144 | That is, the symbol `%`, when encountered in an Emmental program, | |
145 | indicates an operation that is defined by the Emmental program `XYZ+*!`, | |
146 | and so forth. | |
147 | ||
148 | When a main Emmental program begins execution for the first time, it | |
149 | starts with what's called the *initial Emmental interpreter*. (This | |
150 | fact, of course, doesn't apply to any further point of execution inside | |
151 | an Emmental program, or execution of operations defined in Emmental's | |
152 | MCI, since these would be considered subprograms of a sort. These cases | |
153 | use whichever interpreter happens to be in force in that point in time.) | |
154 | ||
155 | The inital Emmental interpreter is defined as follows: | |
156 | ||
157 | a → a | |
158 | b → b | |
159 | c → c | |
160 | ... | |
161 | ||
162 | That is, for every symbol x in the ASCII set, x `→` x. | |
163 | ||
164 | Doesn't tell us a lot about Emmental's semantics, does it? No. Nothing | |
165 | at all, really. But remember what I said about needing an external | |
166 | source of understanding, in order to actually get full value out of an | |
167 | MCI. And remember the purpose of Emmental's MCI: it is not there so much | |
168 | to help us understand Emmental, but to allow us to *change* Emmental, | |
169 | from inside an Emmental program. | |
170 | ||
171 | And, for all that our description of the initial Emmental interpreter is | |
172 | unhelpfully tautological, it is not incorrect: the semantics of `a` can | |
173 | in fact be thought of as being defined by an Emmental program that | |
174 | consists of only one instruction, `a`. This happy state of affairs comes | |
175 | about because Emmental is stack-based; the "signature" (the requirements | |
176 | for the "before" and "after" stacks) of the symbol `a` is the same as | |
177 | the signature of the program containing the single symbol `a`. No extra | |
178 | syntax to specify arity and the like is necessary. | |
179 | ||
180 | Above all, don't panic: we *will* describe what symbols like `a` | |
181 | actually mean in Emmental, we'll just need to do it in something besides | |
182 | Emmental. In fact, let's do that right now. | |
183 | ||
184 | Emmental Semantics (in English) | |
185 | ------------------------------- | |
186 | ||
187 | ### Primitive Arithmetic | |
188 | ||
189 | `#` pushes the symbol NUL (ASCII 0) onto the stack. | |
190 | ||
191 | The symbols `0`, `1`, ... `9` pop a symbol from the stack, multiply its | |
192 | ASCII value by 10 modulo 256, add the value 0, 1, ... 9 (respectively) | |
193 | to that value modulo 256, and push the resulting symbol back onto the | |
194 | stack. | |
195 | ||
196 | The upshot of these 11 operations is that you can push arbitrary symbols | |
197 | onto the stack by spelling out their ASCII values in decimal. For | |
198 | example, `#64` pushes a `@` onto the stack. | |
199 | ||
200 | `+` pops two symbols off the stack, adds together their ASCII values | |
201 | modulo 256, and pushes the symbol with the resultant ASCII value back | |
202 | onto the stack. | |
203 | ||
204 | `-` pops two symbols off the stack, subtracts the ASCII value of the | |
205 | first popped from the ASCII value of the second popped modulo 256, and | |
206 | pushes the symbol with the resultant ASCII value back onto the stack. | |
207 | ||
208 | `~` ("log") pops a symbol from the stack, computes the discrete base-2 | |
209 | logarithm of the ASCII value of that symbol, and pushes the symbol with | |
210 | the resultant ASCII value back onto the stack. The discrete base-2 | |
211 | logarithm of a number is the floor or integer part of the base-2 | |
212 | logarithm of that number. Alternately, it is the number of the highest | |
213 | bit position (starting with the LSB = bit position 0) with a bit set | |
214 | when the number is viewed as binary. Because the base-2 logarithm of the | |
215 | number 0 itself is undefined, the number 0 is treated as 256 for this | |
216 | operation; its discrete base-2 logarithm is 8. | |
217 | ||
218 | ### Stack and Queue Manipulation | |
219 | ||
220 | `^` ("enqueue a copy") pops a symbol off the stack, makes a copy of it, | |
221 | pushes it back onto the stack, and enqueues the copy onto the queue. | |
222 | ||
223 | `v` ("dequeue") dequeues a symbol from the queue and pushes it onto the | |
224 | stack. | |
225 | ||
226 | Using these operations in combination, one can form "discard", | |
227 | "duplicate", "swap", and other more advanced stack manipulation | |
228 | operations. For example, assuming an empty queue and more than two | |
229 | elements on the stack, "swap" can be accomplished with the code | |
230 | `^v^-+^^v^v^v-+^v-+^v-+vv`. | |
231 | ||
232 | Despite this fact, the operation `:` duplicates the top value of the | |
233 | stack. (Emmental is not an absolutely minimal language; note, for | |
234 | instance, that it has all ten decimal digits as operations when these | |
235 | could surely have been defined in terms of only one or two operations. | |
236 | The reasons for a seperate `:` operation are given below in the section | |
237 | on Computational Class.) | |
238 | ||
239 | ### I/O | |
240 | ||
241 | `.` pops a symbol off the stack and sends it to the standard output as | |
242 | an ASCII symbol. | |
243 | ||
244 | `,` waits for an ASCII symbol to arrive on standard input, and pushes it | |
245 | onto the stack. | |
246 | ||
247 | ### Interpreter Modification and Reflection | |
248 | ||
249 | First let's define what it means to *pop a string* off the stack. | |
250 | Symbols are popped off the stack until a `;` symbol is found on the | |
251 | stack. The symbols popped off are considered a string in the reverse | |
252 | order they were popped; i.e. the last symbol popped is the first symbol | |
253 | of the string. The `;` symbol is popped off the stack, but is not made | |
254 | part of the string; it is simply discarded. | |
255 | ||
256 | Since an Emmental program is a string, popping a program is the same as | |
257 | popping a string, just that the string is interpreted as a program. | |
258 | ||
259 | `!` (sometimes called "supplant") pops a symbol, which we call s, off | |
260 | the stack. Then it pops a program t. It then inserts the association s | |
261 | `→` t into the interpreter definition. This overwrites whatever mapping | |
262 | of s might have been in the interpreter definition previously. This new | |
263 | interpreter definition is used for all subsequent execution (until it is | |
264 | changed again, of course.) | |
265 | ||
266 | Note that `!` does *early binding*. That is, the meaning of each symbol | |
267 | in this program t is the meaning of that symbol *at the time `!` is | |
268 | executed*. If some subsequent `!` operation later changes the meaning of | |
269 | one of the symbols that occurs in t, the meaning of t is not changed. | |
270 | The semantics of t are "captured" or "frozen". This implies, among other | |
271 | things, that supplanting some symbol z with itself (a program consisting | |
272 | only of the symbol z) is a no-op, because z's meaning, at the time that | |
273 | `!` is executed, is invariably z. | |
274 | ||
275 | `?` (sometimes called "eval") pops a symbol, which we call s, off the | |
276 | stack. It then executes that symbol (interprets it as an operation) with | |
277 | the interpreter currently in effect. | |
278 | ||
279 | Note that `?` does *late binding*. That is, in contrast with `!`, `?` | |
280 | never "freezes" the semantic definition of the thing that it is | |
281 | executing. This is true even when `?` occurs in a operation redefinition | |
282 | (i.e. the program that supplanted some symbol's semantics when an `!` | |
283 | was executed.) This implies, among other things, that supplanting some | |
284 | symbol z with the program that consists of instructions that push the | |
285 | ASCII value of z onto the stack, followed by a `?` instruction, creates | |
286 | a *cyclic meaning* for z. This is because the z that will be executed by | |
287 | the `?` will always be the present z, that is, the z that is executing | |
288 | the `?`. | |
289 | ||
290 | For convenience's sake, `;` pushes the symbol `;` onto the stack. | |
291 | ||
292 | All other symbols are no-ops. | |
293 | ||
294 | Computational Class | |
295 | ------------------- | |
296 | ||
297 | I believe Emmental is Turing-complete with only the operations that have | |
298 | been given so far, but I haven't proved it yet. All the elements are | |
299 | there, and although some of them are somewhat "cramped", they look | |
300 | viable. | |
301 | ||
302 | If you want to try thinking about how you could write real programs | |
303 | (like a universal Turing-machine simulator) in Emmental, you might want | |
304 | to skip this section, since it contains "spoilers". | |
305 | ||
306 | Repetition can be accomplished by assigning a symbol a cyclic semantics, | |
307 | by use of a `?` within a `!`. For example, we can redefine the semantics | |
308 | of `0` to be `#48?`. This is simply a program that pushes the symbol `0` | |
309 | onto the stack and executes it with the current interpreter, and, since | |
310 | `0` has been redefined to mean `#48?` in the current interpreter, this | |
311 | will loop forever. The entire program to do this to `0` and run the | |
312 | infinite loop is: | |
313 | ||
314 | ;#35#52#56#63#48!0 | |
315 | ||
316 | This technique can also be used to "jump" from one definition to | |
317 | another, by using `?` to execute some *other* symbol at the end of a | |
318 | definition (that is, some symbol other than the symbol being defined.) | |
319 | ||
320 | Conditionals are a little more restrictive. The trick to them is, | |
321 | strangely, the discrete log operator `~` in combination with the eval | |
322 | operator `?`. Since `~` maps all symbols into a set of nine symbols, and | |
323 | `?` executes the symbol on the stack, `~?` will execute one of the | |
324 | symbols from ASCII 0 (NUL) to ASCII 8 (BS). We can then, for instance, | |
325 | define NUL to do one thing, define SOH through BEL to do the same as | |
326 | NUL, and define BS to do some other thing; this essentially | |
327 | distinguishes between 0 (which executes BS) and every other value (which | |
328 | executes NUL). Further, we can use this in conjunction with `-` to | |
329 | compare two values for equality. So, for example, a program which inputs | |
330 | a character, and outputs Y if the character is M and N otherwise, would | |
331 | be: | |
332 | ||
333 | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~? | |
334 | ||
335 | In case NUL through BS are in use for some reason, we can always add 9 | |
336 | to the result of the log (`~#9+?`) to map the answer onto HT through | |
337 | DC1. Or, of course, any of a great number of other arithmetical mappings | |
338 | of our choosing. The most severe constraint is that there be 9 available | |
339 | symbols to act as "destinations" for our "branch". Even if we never | |
340 | overwrite one "branch" with another (and we can do that in Emmental!) | |
341 | and even if we allocate one extra symbol to be the "launch point" of the | |
342 | branch, we still have room for 25 branches in the ASCII character set. | |
343 | ||
344 | So these parts look good. If there's a problem, it's with the queue. | |
345 | Specifically, the problem seems to be the need to know the present size | |
346 | of the queue in order to do stack housework like "duplicate" and the | |
347 | subsequent need for "duplicate" to achieve "discard." (Duplicate can be | |
348 | defined as `^v`, but this only works when the queue is empty. Discard | |
349 | can be defined as duplicate plus `-+`, but this only works when there | |
350 | are other elements below the element being discarded. [This last point | |
351 | is not generally a problem since we can push arbitrary values onto the | |
352 | stack before any program.]) | |
353 | ||
354 | However, if it turns out that we need "duplicate" or "discard" in order | |
355 | to write routines that can handle a variable-sized queue — and that | |
356 | strikes me as likely — then it looks like we have a severe problem. | |
357 | ||
358 | Here's one way I could try to deal with it. I could say that the queue | |
359 | is *local* to the operation being defined (or the main program.) Then | |
360 | you could define `:` to be `^v`, and inside `:`'s definition, the queue | |
361 | would always initially be empty, so the definition would work. | |
362 | ||
363 | But... we need the queue to store our global data. For example, if we | |
364 | are going to simulate a Turing machine, we'd need to use the queue to | |
365 | store the tape (perhaps "doubled up", with one symbol of each pair | |
366 | telling us "the next symbol is a simulated tape symbol" or "the next | |
367 | symbol is some housekeeping value.") We can't store the tape on just one | |
368 | stack. And, once you are looping in Emmental, you've left the "main | |
369 | program" forever; you're jumping from definition to definition, and each | |
370 | has their own queue. At best, you'd need to "dump" the queue onto the | |
371 | stack each time you switched definitions, and even then you'd need a | |
372 | loop to do that, and to loop you need to switch definitions. It's a | |
373 | royal mess. | |
374 | ||
375 | So here's how I will deal with it. I will add a primitive duplicate | |
376 | operation, `:`, to Emmental. Proving that Emmental is Turing-complete is | |
377 | still, then, a challenge, although a doable-seeming challenge. I will | |
378 | then propose a more formidable challenge: prove that the language formed | |
379 | by removing the `:` operation from Emmental is Turing-complete. | |
380 | (Equivalently, prove that the set of Emmental programs that begin with | |
381 | `;#0#58!` is Turing-complete. The nice thing about Emmental is that you | |
382 | can always shoot yourself in the foot — until you erase your pistol, | |
383 | that is.) | |
384 | ||
385 | And if you *really* like a challenge, try proving that Emmental without | |
386 | `~` is Turing-complete. I don't think that it is, although it's possible | |
387 | for it to compute parity, at least (input a symbol, output E if its | |
388 | ASCII value is even, and O if it's odd. To accomplish this, multiply the | |
389 | input's ASCII value by 128 by adding 127 copies of it to it; this is | |
390 | modulo 256, so the only results can be 0 or 128. Define those operations | |
391 | to print out E and O respectively. But that's as far as I've gotten.) | |
392 | ||
393 | Discussion | |
394 | ---------- | |
395 | ||
396 | ### Design Decisions | |
397 | ||
398 | I would've liked to have given Emmental a `'` or `"` instruction similar | |
399 | to Funge's "quote" and "quote-mode" instructions; instructions that | |
400 | treat one or more of the following symbols in the program literally, | |
401 | pushing them, as symbols, onto the stack, instead of executing them. | |
402 | However, such instructions are somewhat problematic, both theoretically | |
403 | and (for the approach I took implementing Emmental) practically. There | |
404 | are two ways of thinking about the problems that arise. | |
405 | ||
406 | One is that the function which implements `'` is given access to the | |
407 | program text itself, and possibly the position within the program, and | |
408 | it uses these to extract the "immediate mode" symbol past the `'`. This | |
409 | information could be available because these pieces of information are | |
410 | considered extra arguments to the function, or because they are (gorily) | |
411 | considered part of the overall program state. Either way, this operation | |
412 | is given a lot of information to work with, and for consistency (since | |
413 | we want to be all nice and neat and say that all operations have the | |
414 | same signature so that our dictionary has a nice uniform type,) *all* | |
415 | operations have access to this information. This is almost too much | |
416 | information; that is, it is so much that operations don't really *need* | |
417 | the dictionary. We could just say there is *one* operation, defined by a | |
418 | function, and that function is given the current symbol and has to | |
419 | decide what it means through whatever means it likes. | |
420 | ||
421 | This approach is very powerful, of course, but it's just not the style | |
422 | that Emmental embodies. (In fact, the idea to view interpreters as | |
423 | dictionaries was one of the foundational design choices for Emmental, to | |
424 | the point where I started constructing a "little theory of interpreters | |
425 | as maps." It really wasn't exploited as much as I think it could have | |
426 | been. If an interpreter is a map of symbols to strings of symbols, it's | |
427 | much more tractable than an opaque function would be; you can define all | |
428 | sorts of operations on them, for example concatenating two interpreters | |
429 | (for all symbols s in interpreter a and interpreter b, c[s] `→` a[s]b[s] | |
430 | — that sort of thing,) computing union or intersection of interpreters, | |
431 | Cartesian product, etc.) | |
432 | ||
433 | The other way of looking at it is to say that there are in fact | |
434 | *multiple* meta-circular interpreters available inside Emmental, and | |
435 | symbols like `'` switch temporarily to an alternate MCI. This alternate | |
436 | MCI interprets every symbol as "push this symbol", then reinstates the | |
437 | previous MCI. I like this explication better than the one above — MCIs | |
438 | begin to look a bit like continuations! — but to do it justice would | |
439 | take some work. I envision a language where the program has fine control | |
440 | over which MCI is in effect, possibly by keeping a map from symbols to | |
441 | MCIs, or maybe even being able to push MCIs onto the stack. This is a | |
442 | wee bit much for Emmental too though, and perhaps I'll explore these | |
443 | possibilities in a future language. | |
444 | ||
445 | ### Turing-completeness | |
446 | ||
447 | You can make the argument that Emmental's way of being Turing-complete | |
448 | is really nothing new: when you redefine some symbol, you're really just | |
449 | defining a new function, and when you use `?` to execute that symbol | |
450 | from within its own definition, you're just making a recursive function | |
451 | call. | |
452 | ||
453 | Well, yes, you can make that argument. But it has to do with how you | |
454 | think about "what is a language", I think. Does a Pascal program | |
455 | fragment which defines a procedure called `PrintFibonacci` represent | |
456 | another programming language, one different from Pascal? You could | |
457 | certainly say that it does — it's the language Pascal where the token | |
458 | `PrintFibonacci` has some meaning that it doesn't have in Pascal. | |
459 | ||
460 | In that view, any language where you can define procedures, or | |
461 | functions, or standard libraries, or the like, is an extensible | |
462 | language. But even languages where you *can't* define new procedures or | |
463 | functions is arguably an extensible language. Take some initial | |
464 | Brainfuck program fragment, for instance. After it executes, it leaves | |
465 | the Brainfuck tape and while-stack in some state that depends on the | |
466 | input. Any Brainfuck fragment that executes after that, will execute in | |
467 | that environment, and that environment is arguably a version of the | |
468 | language Brainfuck, suitably extended. | |
469 | ||
470 | You don't normally think of it that way, I bet, but you *could* — and | |
471 | you would need to, to some degree, to claim that Emmental is "just" | |
472 | defining new functions. The reason you don't typically look at languages | |
473 | like this (unless you are very strange) is because it's much more useful | |
474 | to divide the world into "languages" and "programs." And Emmental *does* | |
475 | make this division, it just makes it in a slightly different place than | |
476 | usual. | |
477 | ||
478 | As far as I'm concerned, if I describe what Emmental does as modifying | |
479 | the Emmental language via its MCI, and what Emmental actually does is | |
480 | consistent with the idea of modifying the Emmental language via its MCI, | |
481 | then what Emmental effectively does is modify the Emmental language via | |
482 | its MCI. And if it needs to do this in a certain way in order to | |
483 | simulate a universal Turing machine, then that difference (however | |
484 | slight) sets it apart from systems where this simulation needs to be | |
485 | done by defining recursive functions. | |
486 | ||
487 | Implementation | |
488 | -------------- | |
489 | ||
490 | `emmental.hs` is a reference interpreter for Emmental written in | |
491 | Haskell. Run the function `emmental` on a string; you can also run | |
492 | `debug` on a string to view the state of the program (stack & queue) | |
493 | during execution. (Note that `debug` is *not* able to show program | |
494 | states that occur internal to an operation.) | |
495 | ||
496 | Happy interpreter-redefining! | |
497 | Chris Pressey | |
498 | Chicago, IL | |
499 | November 11, 2007 |
0 | The Emmental Programming Language | |
1 | ================================= | |
2 | ||
3 | _Try it online_ [@ catseye.tc](https://catseye.tc/installation/Emmental) | |
4 | | _Wiki entry_ [@ esolangs.org](https://esolangs.org/wiki/Emmental) | |
5 | | _See also:_ [Mascarpone](https://github.com/catseye/Mascarpone) | |
6 | ||
7 | - - - - | |
8 | ||
9 | Introduction | |
10 | ------------ | |
11 | ||
12 | Emmental is a self-modifying programming language. That is not to say | |
13 | that it is a language in which programs are self-modifying; rather, it is | |
14 | the language itself, as defined by a meta-circular interpreter, that can | |
15 | be modified during the course of a running program. Indeed, this is how | |
16 | Emmental, without conventional conditional and repetition/recursion | |
17 | constructs, achieves Turing-completeness. | |
18 | ||
19 | Meta-circular Interpreters | |
20 | -------------------------- | |
21 | ||
22 | One way to attempt to define a language is by giving what's called a | |
23 | *meta-circular interpreter* (often shortened to "MCI" in this document.) | |
24 | This is an interpreter for some language which is written in that same | |
25 | language (or at least, a language which is very close to it.) | |
26 | ||
27 | Meta-circular interpreters were a popular way to the describe the | |
28 | semantics of programming languages, especially LISP-like languages, and | |
29 | especially before the advent of denotational semantics. The term | |
30 | "meta-circular" was apparently coined by John C. Reynolds in his paper | |
31 | "Definitional Interpreters for Higher-Order Programming Languages" (1972 | |
32 | Proceedings ACM National Conference.) | |
33 | ||
34 | Of course, in the real world, MCI's are not often used. They certainly | |
35 | *can* be used: if you have a working Scheme interpreter that came with | |
36 | your computer system, there is nothing stopping you from writing another | |
37 | Scheme interpreter in Scheme, and running your programs on your | |
38 | interpreter (which is itself running on your system's interpreter.) | |
39 | However, this is quite a bit less efficient due to the duplication of | |
40 | effort. A somewhat more realistic case might be if your system came | |
41 | with, say, a Scheme compiler. You might then feed your Scheme | |
42 | interpreter (written in Scheme) through that to make a native Scheme | |
43 | interpreter, and use that to interpret your programs. (In this setup, | |
44 | the interpreter is usually described as "self-hosting" rather than | |
45 | "meta-circular".) | |
46 | ||
47 | But, as should be obvious, you already need an implementation of Scheme | |
48 | for your Scheme interpreter written in Scheme to be of much practical | |
49 | use to you. If your meta-circular interpreter is all you have, you won't | |
50 | be able to use it to run or understand Scheme programs. Because the MCI | |
51 | is defined in terms of itself, you'll need some other source of | |
52 | "understanding how it works" to make it complete. This understanding | |
53 | might come from an implementation in some other programming language, or | |
54 | a specification in some formal language, or a description in some | |
55 | natural language, or simply from intuition — but it has to come from | |
56 | somewhere. | |
57 | ||
58 | Assuming that we do have that external source of understanding, the | |
59 | meta-circular interpreter can come in quite handy in codifying the | |
60 | semantics of the language. And, in Emmental's case, *altering* those | |
61 | semantics: Emmental's MCI supports operations which instruct Emmental's | |
62 | MCI to modify its behaviour. | |
63 | ||
64 | Interpreter Structure | |
65 | --------------------- | |
66 | ||
67 | To describe the structure of Emmental's MCI, we first examine the | |
68 | general structure of interpreters. If you've ever written a virtual | |
69 | machine executor in, say, C, you've noticed that it has the general form | |
70 | ||
71 | pc = start; | |
72 | while (!done) { | |
73 | switch (instruction[pc]) { | |
74 | case INSTR_ONE: | |
75 | /* implement semantics of INSTR_ONE */ | |
76 | pc += advance(INSTR_ONE); | |
77 | break; | |
78 | case INSTR_TWO: | |
79 | /* implement semantics of INSTR_TWO */ | |
80 | pc += advance(INSTR_TWO); | |
81 | break; | |
82 | /* ... */ | |
83 | case INSTR_HALT: | |
84 | done = 1; | |
85 | break; | |
86 | default: | |
87 | perror("Invalid opcode"); | |
88 | } | |
89 | } | |
90 | ||
91 | Note that `advance()` is some function that computes how far the program | |
92 | counter is advanced on that instruction. This value is typically +1 for | |
93 | most instructions, but more or less than 1 (and dependent on the state | |
94 | of the program) for a handful of "branch" instructions. Note also that | |
95 | `advance()` would not typically be implemented in C as a function; I'm | |
96 | just showing it like this to emphasize the regular structure. | |
97 | ||
98 | From this we infer that the basic structure of an interpreter is a | |
99 | *dictionary* or *map* that associates program symbols with operations. | |
100 | There is some extra housekeeping like the fetch-execute cycle that | |
101 | surrounds this dictionary, but this can (hopefully) be handled mostly | |
102 | automatically, freeing us to concentrate on *symbols* and *operations*. | |
103 | ||
104 | The symbols could be taken from any finite alphabet, but in Emmental, to | |
105 | keep things relatively simple, we'll just use the ASCII character set. | |
106 | (Actually, to be precise, this is the full 8-bit ASCII character set. | |
107 | Non-printable control characters are allowed, as are characters between | |
108 | 128 and 255, and each has a distinct meaning. But their representations | |
109 | are not defined.) | |
110 | ||
111 | The operations can be thought of, abstractly, as functions which | |
112 | transform program states. Or they can be thought of, concretely, as | |
113 | segments of code — mini-programs which implement these functions. In the | |
114 | case of a meta-circular interpreter, these mini-programs would be | |
115 | written *in the language being interpreted*. | |
116 | ||
117 | To extend this idea to a *self-modifying* meta-circular interpreter, the | |
118 | operations can be thought of as functions which transform both program | |
119 | states *and* interpreter definitions. (Alternatively, the interpreter | |
120 | definition could be thought of as part of the program state, although I | |
121 | feel that's a bit gorier a way to look at it, and I prefer the other | |
122 | view, at least for Emmental.) | |
123 | ||
124 | In Emmental, most operations leave the interpreter definition unchanged. | |
125 | However, there is one operation which alters the interpreter definition, | |
126 | and it is this altered definition that is used to interpret the | |
127 | remainder of the program. | |
128 | ||
129 | Emmental Semantics (in Emmental) | |
130 | -------------------------------- | |
131 | ||
132 | Emmental is essentially a stack-based language. (There's also a queue, | |
133 | but it's not as central.) All operations implicitly get data from, and | |
134 | implicitly deposit results back onto, a single stack. For | |
135 | orthogonality's sake, this stack may contain only ASCII symbols. (And | |
136 | note that trying to pop an empty stack, or dequeue an empty queue, is an | |
137 | error that aborts the program.) | |
138 | ||
139 | Note that because we've established that an interpreter (at least, | |
140 | insofar as Emmental ever needs to know) is simply a map that takes | |
141 | symbols to operations, and that operations in Emmental are defined | |
142 | (meta-circularly) as Emmental programs, we can use the following | |
143 | notation to describe interpreters: | |
144 | ||
145 | % → XYZ+*! | |
146 | & → 'ap'ag'ag | |
147 | ||
148 | That is, the symbol `%`, when encountered in an Emmental program, | |
149 | indicates an operation that is defined by the Emmental program `XYZ+*!`, | |
150 | and so forth. | |
151 | ||
152 | When a main Emmental program begins execution for the first time, it | |
153 | starts with what's called the *initial Emmental interpreter*. (This | |
154 | fact, of course, doesn't apply to any further point of execution inside | |
155 | an Emmental program, or execution of operations defined in Emmental's | |
156 | MCI, since these would be considered subprograms of a sort. These cases | |
157 | use whichever interpreter happens to be in force in that point in time.) | |
158 | ||
159 | The inital Emmental interpreter is defined as follows: | |
160 | ||
161 | a → a | |
162 | b → b | |
163 | c → c | |
164 | ... | |
165 | ||
166 | That is, for every symbol x in the ASCII set, x `→` x. | |
167 | ||
168 | Doesn't tell us a lot about Emmental's semantics, does it? No. Nothing | |
169 | at all, really. But remember what I said about needing an external | |
170 | source of understanding, in order to actually get full value out of an | |
171 | MCI. And remember the purpose of Emmental's MCI: it is not there so much | |
172 | to help us understand Emmental, but to allow us to *change* Emmental, | |
173 | from inside an Emmental program. | |
174 | ||
175 | And, for all that our description of the initial Emmental interpreter is | |
176 | unhelpfully tautological, it is not incorrect: the semantics of `a` can | |
177 | in fact be thought of as being defined by an Emmental program that | |
178 | consists of only one instruction, `a`. This happy state of affairs comes | |
179 | about because Emmental is stack-based; the "signature" (the requirements | |
180 | for the "before" and "after" stacks) of the symbol `a` is the same as | |
181 | the signature of the program containing the single symbol `a`. No extra | |
182 | syntax to specify arity and the like is necessary. | |
183 | ||
184 | Above all, don't panic: we *will* describe what symbols like `a` | |
185 | actually mean in Emmental, we'll just need to do it in something besides | |
186 | Emmental. In fact, let's do that right now. | |
187 | ||
188 | Emmental Semantics (in English) | |
189 | ------------------------------- | |
190 | ||
191 | ### Primitive Arithmetic | |
192 | ||
193 | `#` pushes the symbol NUL (ASCII 0) onto the stack. | |
194 | ||
195 | The symbols `0`, `1`, ... `9` pop a symbol from the stack, multiply its | |
196 | ASCII value by 10 modulo 256, add the value 0, 1, ... 9 (respectively) | |
197 | to that value modulo 256, and push the resulting symbol back onto the | |
198 | stack. | |
199 | ||
200 | The upshot of these 11 operations is that you can push arbitrary symbols | |
201 | onto the stack by spelling out their ASCII values in decimal. For | |
202 | example, `#64` pushes a `@` onto the stack. | |
203 | ||
204 | `+` pops two symbols off the stack, adds together their ASCII values | |
205 | modulo 256, and pushes the symbol with the resultant ASCII value back | |
206 | onto the stack. | |
207 | ||
208 | `-` pops two symbols off the stack, subtracts the ASCII value of the | |
209 | first popped from the ASCII value of the second popped modulo 256, and | |
210 | pushes the symbol with the resultant ASCII value back onto the stack. | |
211 | ||
212 | `~` ("log") pops a symbol from the stack, computes the discrete base-2 | |
213 | logarithm of the ASCII value of that symbol, and pushes the symbol with | |
214 | the resultant ASCII value back onto the stack. The discrete base-2 | |
215 | logarithm of a number is the floor or integer part of the base-2 | |
216 | logarithm of that number. Alternately, it is the number of the highest | |
217 | bit position (starting with the LSB = bit position 0) with a bit set | |
218 | when the number is viewed as binary. Because the base-2 logarithm of the | |
219 | number 0 itself is undefined, the number 0 is treated as 256 for this | |
220 | operation; its discrete base-2 logarithm is 8. | |
221 | ||
222 | ### Stack and Queue Manipulation | |
223 | ||
224 | `^` ("enqueue a copy") pops a symbol off the stack, makes a copy of it, | |
225 | pushes it back onto the stack, and enqueues the copy onto the queue. | |
226 | ||
227 | `v` ("dequeue") dequeues a symbol from the queue and pushes it onto the | |
228 | stack. | |
229 | ||
230 | Using these operations in combination, one can form "discard", | |
231 | "duplicate", "swap", and other more advanced stack manipulation | |
232 | operations. For example, assuming an empty queue and more than two | |
233 | elements on the stack, "swap" can be accomplished with the code | |
234 | `^v^-+^^v^v^v-+^v-+^v-+vv`. | |
235 | ||
236 | Despite this fact, the operation `:` duplicates the top value of the | |
237 | stack. (Emmental is not an absolutely minimal language; note, for | |
238 | instance, that it has all ten decimal digits as operations when these | |
239 | could surely have been defined in terms of only one or two operations. | |
240 | The reasons for a seperate `:` operation are given below in the section | |
241 | on Computational Class.) | |
242 | ||
243 | ### I/O | |
244 | ||
245 | `.` pops a symbol off the stack and sends it to the standard output as | |
246 | an ASCII symbol. | |
247 | ||
248 | `,` waits for an ASCII symbol to arrive on standard input, and pushes it | |
249 | onto the stack. | |
250 | ||
251 | ### Interpreter Modification and Reflection | |
252 | ||
253 | First let's define what it means to *pop a string* off the stack. | |
254 | Symbols are popped off the stack until a `;` symbol is found on the | |
255 | stack. The symbols popped off are considered a string in the reverse | |
256 | order they were popped; i.e. the last symbol popped is the first symbol | |
257 | of the string. The `;` symbol is popped off the stack, but is not made | |
258 | part of the string; it is simply discarded. | |
259 | ||
260 | Since an Emmental program is a string, popping a program is the same as | |
261 | popping a string, just that the string is interpreted as a program. | |
262 | ||
263 | `!` (sometimes called "supplant") pops a symbol, which we call s, off | |
264 | the stack. Then it pops a program t. It then inserts the association s | |
265 | `→` t into the interpreter definition. This overwrites whatever mapping | |
266 | of s might have been in the interpreter definition previously. This new | |
267 | interpreter definition is used for all subsequent execution (until it is | |
268 | changed again, of course.) | |
269 | ||
270 | Note that `!` does *early binding*. That is, the meaning of each symbol | |
271 | in this program t is the meaning of that symbol *at the time `!` is | |
272 | executed*. If some subsequent `!` operation later changes the meaning of | |
273 | one of the symbols that occurs in t, the meaning of t is not changed. | |
274 | The semantics of t are "captured" or "frozen". This implies, among other | |
275 | things, that supplanting some symbol z with itself (a program consisting | |
276 | only of the symbol z) is a no-op, because z's meaning, at the time that | |
277 | `!` is executed, is invariably z. | |
278 | ||
279 | `?` (sometimes called "eval") pops a symbol, which we call s, off the | |
280 | stack. It then executes that symbol (interprets it as an operation) with | |
281 | the interpreter currently in effect. | |
282 | ||
283 | Note that `?` does *late binding*. That is, in contrast with `!`, `?` | |
284 | never "freezes" the semantic definition of the thing that it is | |
285 | executing. This is true even when `?` occurs in a operation redefinition | |
286 | (i.e. the program that supplanted some symbol's semantics when an `!` | |
287 | was executed.) This implies, among other things, that supplanting some | |
288 | symbol z with the program that consists of instructions that push the | |
289 | ASCII value of z onto the stack, followed by a `?` instruction, creates | |
290 | a *cyclic meaning* for z. This is because the z that will be executed by | |
291 | the `?` will always be the present z, that is, the z that is executing | |
292 | the `?`. | |
293 | ||
294 | For convenience's sake, `;` pushes the symbol `;` onto the stack. | |
295 | ||
296 | All other symbols are no-ops. | |
297 | ||
298 | Computational Class | |
299 | ------------------- | |
300 | ||
301 | I believe Emmental is Turing-complete with only the operations that have | |
302 | been given so far, but I haven't proved it yet. All the elements are | |
303 | there, and although some of them are somewhat "cramped", they look | |
304 | viable. | |
305 | ||
306 | If you want to try thinking about how you could write real programs | |
307 | (like a universal Turing-machine simulator) in Emmental, you might want | |
308 | to skip this section, since it contains "spoilers". | |
309 | ||
310 | Repetition can be accomplished by assigning a symbol a cyclic semantics, | |
311 | by use of a `?` within a `!`. For example, we can redefine the semantics | |
312 | of `0` to be `#48?`. This is simply a program that pushes the symbol `0` | |
313 | onto the stack and executes it with the current interpreter, and, since | |
314 | `0` has been redefined to mean `#48?` in the current interpreter, this | |
315 | will loop forever. The entire program to do this to `0` and run the | |
316 | infinite loop is: | |
317 | ||
318 | ;#35#52#56#63#48!0 | |
319 | ||
320 | This technique can also be used to "jump" from one definition to | |
321 | another, by using `?` to execute some *other* symbol at the end of a | |
322 | definition (that is, some symbol other than the symbol being defined.) | |
323 | ||
324 | Conditionals are a little more restrictive. The trick to them is, | |
325 | strangely, the discrete log operator `~` in combination with the eval | |
326 | operator `?`. Since `~` maps all symbols into a set of nine symbols, and | |
327 | `?` executes the symbol on the stack, `~?` will execute one of the | |
328 | symbols from ASCII 0 (NUL) to ASCII 8 (BS). We can then, for instance, | |
329 | define NUL to do one thing, define SOH through BEL to do the same as | |
330 | NUL, and define BS to do some other thing; this essentially | |
331 | distinguishes between 0 (which executes BS) and every other value (which | |
332 | executes NUL). Further, we can use this in conjunction with `-` to | |
333 | compare two values for equality. So, for example, a program which inputs | |
334 | a character, and outputs Y if the character is M and N otherwise, would | |
335 | be: | |
336 | ||
337 | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~? | |
338 | ||
339 | In case NUL through BS are in use for some reason, we can always add 9 | |
340 | to the result of the log (`~#9+?`) to map the answer onto HT through | |
341 | DC1. Or, of course, any of a great number of other arithmetical mappings | |
342 | of our choosing. The most severe constraint is that there be 9 available | |
343 | symbols to act as "destinations" for our "branch". Even if we never | |
344 | overwrite one "branch" with another (and we can do that in Emmental!) | |
345 | and even if we allocate one extra symbol to be the "launch point" of the | |
346 | branch, we still have room for 25 branches in the ASCII character set. | |
347 | ||
348 | So these parts look good. If there's a problem, it's with the queue. | |
349 | Specifically, the problem seems to be the need to know the present size | |
350 | of the queue in order to do stack housework like "duplicate" and the | |
351 | subsequent need for "duplicate" to achieve "discard." (Duplicate can be | |
352 | defined as `^v`, but this only works when the queue is empty. Discard | |
353 | can be defined as duplicate plus `-+`, but this only works when there | |
354 | are other elements below the element being discarded. [This last point | |
355 | is not generally a problem since we can push arbitrary values onto the | |
356 | stack before any program.]) | |
357 | ||
358 | However, if it turns out that we need "duplicate" or "discard" in order | |
359 | to write routines that can handle a variable-sized queue — and that | |
360 | strikes me as likely — then it looks like we have a severe problem. | |
361 | ||
362 | Here's one way I could try to deal with it. I could say that the queue | |
363 | is *local* to the operation being defined (or the main program.) Then | |
364 | you could define `:` to be `^v`, and inside `:`'s definition, the queue | |
365 | would always initially be empty, so the definition would work. | |
366 | ||
367 | But... we need the queue to store our global data. For example, if we | |
368 | are going to simulate a Turing machine, we'd need to use the queue to | |
369 | store the tape (perhaps "doubled up", with one symbol of each pair | |
370 | telling us "the next symbol is a simulated tape symbol" or "the next | |
371 | symbol is some housekeeping value.") We can't store the tape on just one | |
372 | stack. And, once you are looping in Emmental, you've left the "main | |
373 | program" forever; you're jumping from definition to definition, and each | |
374 | has their own queue. At best, you'd need to "dump" the queue onto the | |
375 | stack each time you switched definitions, and even then you'd need a | |
376 | loop to do that, and to loop you need to switch definitions. It's a | |
377 | royal mess. | |
378 | ||
379 | So here's how I will deal with it. I will add a primitive duplicate | |
380 | operation, `:`, to Emmental. Proving that Emmental is Turing-complete is | |
381 | still, then, a challenge, although a doable-seeming challenge. I will | |
382 | then propose a more formidable challenge: prove that the language formed | |
383 | by removing the `:` operation from Emmental is Turing-complete. | |
384 | (Equivalently, prove that the set of Emmental programs that begin with | |
385 | `;#0#58!` is Turing-complete. The nice thing about Emmental is that you | |
386 | can always shoot yourself in the foot — until you erase your pistol, | |
387 | that is.) | |
388 | ||
389 | And if you *really* like a challenge, try proving that Emmental without | |
390 | `~` is Turing-complete. I don't think that it is, although it's possible | |
391 | for it to compute parity, at least (input a symbol, output E if its | |
392 | ASCII value is even, and O if it's odd. To accomplish this, multiply the | |
393 | input's ASCII value by 128 by adding 127 copies of it to it; this is | |
394 | modulo 256, so the only results can be 0 or 128. Define those operations | |
395 | to print out E and O respectively. But that's as far as I've gotten.) | |
396 | ||
397 | Discussion | |
398 | ---------- | |
399 | ||
400 | ### Design Decisions | |
401 | ||
402 | I would've liked to have given Emmental a `'` or `"` instruction similar | |
403 | to Funge's "quote" and "quote-mode" instructions; instructions that | |
404 | treat one or more of the following symbols in the program literally, | |
405 | pushing them, as symbols, onto the stack, instead of executing them. | |
406 | However, such instructions are somewhat problematic, both theoretically | |
407 | and (for the approach I took implementing Emmental) practically. There | |
408 | are two ways of thinking about the problems that arise. | |
409 | ||
410 | One is that the function which implements `'` is given access to the | |
411 | program text itself, and possibly the position within the program, and | |
412 | it uses these to extract the "immediate mode" symbol past the `'`. This | |
413 | information could be available because these pieces of information are | |
414 | considered extra arguments to the function, or because they are (gorily) | |
415 | considered part of the overall program state. Either way, this operation | |
416 | is given a lot of information to work with, and for consistency (since | |
417 | we want to be all nice and neat and say that all operations have the | |
418 | same signature so that our dictionary has a nice uniform type,) *all* | |
419 | operations have access to this information. This is almost too much | |
420 | information; that is, it is so much that operations don't really *need* | |
421 | the dictionary. We could just say there is *one* operation, defined by a | |
422 | function, and that function is given the current symbol and has to | |
423 | decide what it means through whatever means it likes. | |
424 | ||
425 | This approach is very powerful, of course, but it's just not the style | |
426 | that Emmental embodies. (In fact, the idea to view interpreters as | |
427 | dictionaries was one of the foundational design choices for Emmental, to | |
428 | the point where I started constructing a "little theory of interpreters | |
429 | as maps." It really wasn't exploited as much as I think it could have | |
430 | been. If an interpreter is a map of symbols to strings of symbols, it's | |
431 | much more tractable than an opaque function would be; you can define all | |
432 | sorts of operations on them, for example concatenating two interpreters | |
433 | (for all symbols s in interpreter a and interpreter b, c[s] `→` a[s]b[s] | |
434 | — that sort of thing,) computing union or intersection of interpreters, | |
435 | Cartesian product, etc.) | |
436 | ||
437 | The other way of looking at it is to say that there are in fact | |
438 | *multiple* meta-circular interpreters available inside Emmental, and | |
439 | symbols like `'` switch temporarily to an alternate MCI. This alternate | |
440 | MCI interprets every symbol as "push this symbol", then reinstates the | |
441 | previous MCI. I like this explication better than the one above — MCIs | |
442 | begin to look a bit like continuations! — but to do it justice would | |
443 | take some work. I envision a language where the program has fine control | |
444 | over which MCI is in effect, possibly by keeping a map from symbols to | |
445 | MCIs, or maybe even being able to push MCIs onto the stack. This is a | |
446 | wee bit much for Emmental too though, and perhaps I'll explore these | |
447 | possibilities in a future language. | |
448 | ||
449 | ### Turing-completeness | |
450 | ||
451 | You can make the argument that Emmental's way of being Turing-complete | |
452 | is really nothing new: when you redefine some symbol, you're really just | |
453 | defining a new function, and when you use `?` to execute that symbol | |
454 | from within its own definition, you're just making a recursive function | |
455 | call. | |
456 | ||
457 | Well, yes, you can make that argument. But it has to do with how you | |
458 | think about "what is a language", I think. Does a Pascal program | |
459 | fragment which defines a procedure called `PrintFibonacci` represent | |
460 | another programming language, one different from Pascal? You could | |
461 | certainly say that it does — it's the language Pascal where the token | |
462 | `PrintFibonacci` has some meaning that it doesn't have in Pascal. | |
463 | ||
464 | In that view, any language where you can define procedures, or | |
465 | functions, or standard libraries, or the like, is an extensible | |
466 | language. But even languages where you *can't* define new procedures or | |
467 | functions is arguably an extensible language. Take some initial | |
468 | Brainfuck program fragment, for instance. After it executes, it leaves | |
469 | the Brainfuck tape and while-stack in some state that depends on the | |
470 | input. Any Brainfuck fragment that executes after that, will execute in | |
471 | that environment, and that environment is arguably a version of the | |
472 | language Brainfuck, suitably extended. | |
473 | ||
474 | You don't normally think of it that way, I bet, but you *could* — and | |
475 | you would need to, to some degree, to claim that Emmental is "just" | |
476 | defining new functions. The reason you don't typically look at languages | |
477 | like this (unless you are very strange) is because it's much more useful | |
478 | to divide the world into "languages" and "programs." And Emmental *does* | |
479 | make this division, it just makes it in a slightly different place than | |
480 | usual. | |
481 | ||
482 | As far as I'm concerned, if I describe what Emmental does as modifying | |
483 | the Emmental language via its MCI, and what Emmental actually does is | |
484 | consistent with the idea of modifying the Emmental language via its MCI, | |
485 | then what Emmental effectively does is modify the Emmental language via | |
486 | its MCI. And if it needs to do this in a certain way in order to | |
487 | simulate a universal Turing machine, then that difference (however | |
488 | slight) sets it apart from systems where this simulation needs to be | |
489 | done by defining recursive functions. | |
490 | ||
491 | Implementation | |
492 | -------------- | |
493 | ||
494 | `emmental.hs` is a reference interpreter for Emmental written in | |
495 | Haskell. Run the function `emmental` on a string; you can also run | |
496 | `debug` on a string to view the state of the program (stack & queue) | |
497 | during execution. (Note that `debug` is *not* able to show program | |
498 | states that occur internal to an operation.) | |
499 | ||
500 | Happy interpreter-redefining! | |
501 | Chris Pressey | |
502 | Chicago, IL | |
503 | November 11, 2007 |
0 | #!/bin/sh | |
1 | ||
2 | THIS=`realpath $0` | |
3 | DIR=`dirname $THIS` | |
4 | NAME=`basename $THIS` | |
5 | SRC=$DIR/../src | |
6 | if [ "x$FORCE_HUGS" != "x" ] ; then | |
7 | exec runhugs -i$SRC $SRC/Main.hs $* | |
8 | elif [ -x $DIR/$NAME.exe ] ; then | |
9 | exec $DIR/$NAME.exe $* | |
10 | elif command -v runhaskell 2>&1 >/dev/null ; then | |
11 | exec runhaskell -i$SRC $SRC/Main.hs $* | |
12 | elif command -v runhugs 2>&1 >/dev/null ; then | |
13 | exec runhugs -i$SRC $SRC/Main.hs $* | |
14 | else | |
15 | echo "Cannot run $NAME; neither $NAME.exe, nor runhaskell, nor runhugs found." | |
16 | exit 1 | |
17 | fi |
1 | 1 | |
2 | 2 | PROG=emmental |
3 | 3 | |
4 | if [ x`which ghc` = x -a x`which runhugs` = x ]; then | |
5 | echo "Neither ghc nor runhugs found on search path." | |
6 | exit 1 | |
4 | if command -v ghc >/dev/null 2>&1; then | |
5 | echo "building $PROG.exe with ghc" | |
6 | (cd src && ghc --make Main.hs -o ../bin/$PROG.exe) | |
7 | else | |
8 | echo "ghc not found, not building $PROG.exe" | |
7 | 9 | fi |
8 | 10 | |
9 | mkdir -p bin | |
11 | # For this to work, you need hastec installed. | |
10 | 12 | |
11 | if [ x`which ghc` = x -o ! x$USE_HUGS = x ]; then | |
12 | # create script to run with Hugs | |
13 | cat >bin/$PROG <<'EOF' | |
14 | #!/bin/sh | |
15 | THIS=`realpath $0` | |
16 | DIR=`dirname $THIS`/../src | |
17 | runhugs $DIR/Main.hs $* | |
18 | EOF | |
19 | chmod 755 bin/$PROG | |
13 | if command -v hastec >/dev/null 2>&1; then | |
14 | echo "building $PROG.js with hastec" | |
15 | (cd src && hastec --make HasteMain.hs -o ../demo/$PROG.js) | |
20 | 16 | else |
21 | cd src && ghc --make Main.hs -o ../bin/$PROG | |
17 | echo "hastec not found, not building $PROG.js" | |
22 | 18 | fi |
0 | #!/bin/sh | |
1 | ||
2 | find . -name "*.o" -exec rm {} \; | |
3 | find . -name "*.hi" -exec rm {} \; | |
4 | find . -name "*.jsmod" -exec rm {} \; | |
5 | find . -name "*.exe" -exec rm {} \; |
0 | <!DOCTYPE html> | |
1 | <head> | |
2 | <meta charset="utf-8"> | |
3 | <title>Emmental</title> | |
4 | </head> | |
5 | <body> | |
6 | ||
7 | <h1>Emmental</h1> | |
8 | ||
9 | <p>(emmental.hs compiled to .js by <code>hastec</code>, running in HTML5 document)</p> | |
10 | ||
11 | <div id="installation"></div> | |
12 | ||
13 | <script src="../eg/examplePrograms.jsonp.js"></script> | |
14 | <script src="hastec-io-launcher.js"></script> | |
15 | <script src="emmental.js"></script> | |
16 | <script> | |
17 | launch({ | |
18 | container: document.getElementById('installation'), | |
19 | initialOption: "hello.emmental" | |
20 | }); | |
21 | </script> | |
22 | </body> |
0 | function launch(config) { | |
1 | config.container.innerHTML = ` | |
2 | <textarea id="prog" rows="10" cols="80"></textarea> | |
3 | <div id="control-panel"></div> | |
4 | <div>Input: <input id="prog-input"></input></div> | |
5 | <div>Output: <pre id="prog-output"></pre></div> | |
6 | <div><button id="run-button">Run</button></div> | |
7 | <pre id="result"></pre> | |
8 | `; | |
9 | ||
10 | function makeSelect(container, labelText, optionsArray, fun) { | |
11 | var label = document.createElement('label'); | |
12 | label.innerHTML = labelText; | |
13 | container.appendChild(label); | |
14 | var select = document.createElement("select"); | |
15 | for (var i = 0; i < optionsArray.length; i++) { | |
16 | var op = document.createElement("option"); | |
17 | op.text = optionsArray[i].filename; | |
18 | op.value = optionsArray[i].contents; | |
19 | select.options.add(op); | |
20 | } | |
21 | select.onchange = function(e) { | |
22 | fun(optionsArray[select.selectedIndex]); | |
23 | }; | |
24 | select.selectedIndex = 0; | |
25 | label.appendChild(select); | |
26 | return select; | |
27 | }; | |
28 | ||
29 | function selectOptionByText(selectElem, text) { | |
30 | var optElem; | |
31 | for (var i = 0; optElem = selectElem.options[i]; i++) { | |
32 | if (optElem.text === text) { | |
33 | selectElem.selectedIndex = i; | |
34 | selectElem.dispatchEvent(new Event('change')); | |
35 | return; | |
36 | } | |
37 | } | |
38 | } | |
39 | ||
40 | var controlPanel = document.getElementById('control-panel'); | |
41 | var select = makeSelect(controlPanel, "example program:", examplePrograms, function(option) { | |
42 | document.getElementById('prog').value = option.contents; | |
43 | }); | |
44 | selectOptionByText(select, config.initialOption); | |
45 | } |
0 | #1#1+ |
0 | examplePrograms = [ | |
1 | { | |
2 | "contents": "#1#1+\n", | |
3 | "filename": "add-one-and-one.emmental" | |
4 | }, | |
5 | { | |
6 | "contents": ";#58#126#63#36!;#46#36#!;#0#1!;#0#2!;#0#3!;#0#4!;#0#5!;#0#6!;#0#7!\n#0#33#111#108#108#101#72$\n", | |
7 | "filename": "hello.emmental" | |
8 | }, | |
9 | { | |
10 | "contents": "#59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~?\n", | |
11 | "filename": "is-the-input-M.emmental" | |
12 | }, | |
13 | { | |
14 | "contents": ";#43#38!#1#1&\n", | |
15 | "filename": "redefine-ampersand-as-add.emmental" | |
16 | }, | |
17 | { | |
18 | "contents": "#67#66#65^v^-+^^v^v^v-+^v-+^v-+vv\n", | |
19 | "filename": "swap-top-of-stack.emmental" | |
20 | } | |
21 | ]; |
0 | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~? |
0 | ;#43#38!#1#1& |
0 | #67#66#65^v^-+^^v^v^v-+^v-+^v-+vv |
0 | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~? |
0 | -- | |
1 | -- Emmental.hs | |
2 | -- Interpreter for the Emmental Programming Language | |
3 | -- Chris Pressey, Cat's Eye Technologies | |
4 | -- This work is in the public domain. See UNLICENSE for more information. | |
5 | -- | |
6 | ||
7 | module Emmental where | |
8 | ||
9 | import qualified Data.Map as Map | |
10 | import qualified Data.Char as Char | |
11 | ||
12 | ||
13 | ----------------------------------------------------------------------- | |
14 | -- ============================ Symbols ============================ -- | |
15 | ----------------------------------------------------------------------- | |
16 | ||
17 | type Symbol = Char | |
18 | ||
19 | ||
20 | ----------------------------------------------------------------------- | |
21 | -- ======================== Program States ========================= -- | |
22 | ----------------------------------------------------------------------- | |
23 | ||
24 | data State = State [Symbol] [Symbol] | |
25 | deriving (Ord, Eq, Show) | |
26 | ||
27 | pop (State (head:tail) queue) = (head, State tail queue) | |
28 | push (State list queue) sym = (State (sym:list) queue) | |
29 | ||
30 | popString (State (';':tail) queue) = ([], State tail queue) | |
31 | popString (State (head:tail) queue) = | |
32 | let | |
33 | (string, state') = popString (State tail queue) | |
34 | in | |
35 | (string ++ [head], state') | |
36 | ||
37 | enqueue (State stack queue) symbol = State stack (symbol:queue) | |
38 | dequeue (State stack queue) = | |
39 | let | |
40 | symbol = last queue | |
41 | queue' = init queue | |
42 | in | |
43 | (symbol, State stack queue') | |
44 | ||
45 | ||
46 | ----------------------------------------------------------------------- | |
47 | -- ========================= Interpreters ========================== -- | |
48 | ----------------------------------------------------------------------- | |
49 | ||
50 | data Interpreter = Interp (Map.Map Symbol Operation) | |
51 | ||
52 | fetch (Interp map) sym = Map.findWithDefault (fitRegOp opNop) sym map | |
53 | supplant (Interp map) sym op = (Interp (Map.insert sym op map)) | |
54 | ||
55 | ||
56 | ----------------------------------------------------------------------- | |
57 | -- ========================== Operations =========================== -- | |
58 | ----------------------------------------------------------------------- | |
59 | ||
60 | type Operation = State -> Interpreter -> IO (State, Interpreter) | |
61 | ||
62 | composeOps :: Operation -> Operation -> Operation | |
63 | ||
64 | composeOps op1 op2 = f where | |
65 | f state interpreter = do | |
66 | (state', interpreter') <- op1 state interpreter | |
67 | op2 state' interpreter' | |
68 | ||
69 | createOp :: Interpreter -> [Symbol] -> Operation | |
70 | ||
71 | createOp interpreter [] = | |
72 | (fitRegOp opNop) | |
73 | createOp interpreter (head:tail) = | |
74 | composeOps (fetch interpreter head) (createOp interpreter tail) | |
75 | ||
76 | -- | |
77 | -- It's useful for us to express a lot of our operators as non-monadic | |
78 | -- functions that don't affect the interpreter. This is a little "adapter" | |
79 | -- function that lets us create monadic functions with the right signature | |
80 | -- from them. | |
81 | -- | |
82 | ||
83 | fitRegOp :: (State -> State) -> Operation | |
84 | ||
85 | fitRegOp regop = f where | |
86 | f state interpreter = | |
87 | let | |
88 | state' = regop state | |
89 | in | |
90 | do return (state', interpreter) | |
91 | ||
92 | ||
93 | ------------------------------------------------------------ | |
94 | --------------- The operations themselves. ----------------- | |
95 | ------------------------------------------------------------ | |
96 | ||
97 | -- | |
98 | -- Redefine the meaning of the symbol on the stack with | |
99 | -- a mini-program also popped off the stack. | |
100 | -- | |
101 | ||
102 | opSupplant state interpreter = | |
103 | let | |
104 | (opSym, state') = pop state | |
105 | (newOpDefn, state'') = popString state' | |
106 | newOp = createOp interpreter newOpDefn | |
107 | in | |
108 | do return (state'', supplant interpreter opSym newOp) | |
109 | ||
110 | -- | |
111 | -- Execute the symbol on the stack with the current interpreter. | |
112 | -- | |
113 | ||
114 | opEval state interpreter = | |
115 | let | |
116 | (opSym, state') = pop state | |
117 | newOp = createOp interpreter [opSym] | |
118 | in | |
119 | newOp state' interpreter | |
120 | ||
121 | -- | |
122 | -- I/O. | |
123 | -- | |
124 | ||
125 | opInput state interpreter = do | |
126 | symbol <- getChar | |
127 | do return (push state symbol, interpreter) | |
128 | ||
129 | opOutput state interpreter = | |
130 | let | |
131 | (symbol, state') = pop state | |
132 | in do | |
133 | putChar symbol | |
134 | return (state', interpreter) | |
135 | ||
136 | -- | |
137 | -- Primitive arithmetic. | |
138 | -- | |
139 | ||
140 | opAdd state = | |
141 | let | |
142 | (symA, state') = pop state | |
143 | (symB, state'') = pop state' | |
144 | in | |
145 | push state'' (Char.chr (((Char.ord symB) + (Char.ord symA)) `mod` 256)) | |
146 | ||
147 | opSubtract state = | |
148 | let | |
149 | (symA, state') = pop state | |
150 | (symB, state'') = pop state' | |
151 | in | |
152 | push state'' (Char.chr (((Char.ord symB) - (Char.ord symA)) `mod` 256)) | |
153 | ||
154 | discreteLog 0 = 8 | |
155 | discreteLog 1 = 0 | |
156 | discreteLog 2 = 1 | |
157 | discreteLog n = (discreteLog (n `div` 2)) + 1 | |
158 | ||
159 | opDiscreteLog state = | |
160 | let | |
161 | (symbol, state') = pop state | |
162 | in | |
163 | push state' (Char.chr (discreteLog (Char.ord symbol))) | |
164 | ||
165 | -- | |
166 | -- Stack manipulation. | |
167 | -- | |
168 | ||
169 | -- | |
170 | -- Pop the top symbol of the stack, make a copy of it, push it back onto the | |
171 | -- stack, and enqueue the copy onto the queue. | |
172 | -- | |
173 | ||
174 | opEnqueueCopy state = | |
175 | let | |
176 | (sym, _) = pop state | |
177 | in | |
178 | enqueue state sym | |
179 | ||
180 | -- | |
181 | -- Dequeue a symbol from the queue and push it onto the stack. | |
182 | -- | |
183 | ||
184 | opDequeue state = | |
185 | let | |
186 | (sym, state') = dequeue state | |
187 | in | |
188 | push state' sym | |
189 | ||
190 | -- | |
191 | -- Duplicate the top symbol of the stack. | |
192 | -- | |
193 | ||
194 | opDuplicate state = | |
195 | let | |
196 | (symbol, _) = pop state | |
197 | in | |
198 | push state symbol | |
199 | ||
200 | -- | |
201 | -- Miscellaneous operations. | |
202 | -- | |
203 | ||
204 | opNop state = | |
205 | state | |
206 | ||
207 | -- | |
208 | -- Parameterizable operations. | |
209 | -- | |
210 | ||
211 | opPushValue value state = | |
212 | push state (Char.chr value) | |
213 | ||
214 | opAccumValue value state = | |
215 | let | |
216 | (sym, state') = pop state | |
217 | value' = ((Char.ord sym) * 10) + value | |
218 | in | |
219 | push state' (Char.chr (value' `mod` 256)) | |
220 | ||
221 | ||
222 | ----------------------------------------------------------------------- | |
223 | -- ===================== Debugging Functions ======================= -- | |
224 | ----------------------------------------------------------------------- | |
225 | ||
226 | type Debugger = State -> Interpreter -> IO () | |
227 | ||
228 | debugNop s i = do | |
229 | return () | |
230 | ||
231 | debugPrintState s i = do | |
232 | putStr ((show s) ++ "\n") | |
233 | return () | |
234 | ||
235 | ||
236 | ----------------------------------------------------------------------- | |
237 | -- ============================ Executor =========================== -- | |
238 | ----------------------------------------------------------------------- | |
239 | ||
240 | execute :: [Symbol] -> State -> Interpreter -> Debugger -> IO (State, Interpreter) | |
241 | ||
242 | execute [] state interpreter debugger = | |
243 | return (state, interpreter) | |
244 | execute (opSym:program') state interpreter debugger = | |
245 | let | |
246 | operation = fetch interpreter opSym | |
247 | in do | |
248 | (state', interpreter') <- operation state interpreter | |
249 | debugger state' interpreter' | |
250 | execute program' state' interpreter' debugger | |
251 | ||
252 | ||
253 | ----------------------------------------------------------------------- | |
254 | -- ====================== Top-Level Function ======================= -- | |
255 | ----------------------------------------------------------------------- | |
256 | ||
257 | initialInterpreter = Interp | |
258 | (Map.fromList | |
259 | [ | |
260 | ('.', opOutput), | |
261 | (',', opInput), | |
262 | ||
263 | ('#', fitRegOp (opPushValue 0)), | |
264 | ('0', fitRegOp (opAccumValue 0)), | |
265 | ('1', fitRegOp (opAccumValue 1)), | |
266 | ('2', fitRegOp (opAccumValue 2)), | |
267 | ('3', fitRegOp (opAccumValue 3)), | |
268 | ('4', fitRegOp (opAccumValue 4)), | |
269 | ('5', fitRegOp (opAccumValue 5)), | |
270 | ('6', fitRegOp (opAccumValue 6)), | |
271 | ('7', fitRegOp (opAccumValue 7)), | |
272 | ('8', fitRegOp (opAccumValue 8)), | |
273 | ('9', fitRegOp (opAccumValue 9)), | |
274 | ||
275 | ('+', fitRegOp opAdd), | |
276 | ('-', fitRegOp opSubtract), | |
277 | ('~', fitRegOp opDiscreteLog), | |
278 | ||
279 | ('^', fitRegOp opEnqueueCopy), | |
280 | ('v', fitRegOp opDequeue), | |
281 | (':', fitRegOp opDuplicate), | |
282 | ||
283 | ('!', opSupplant), | |
284 | ('?', opEval), | |
285 | ||
286 | (';', fitRegOp (opPushValue (Char.ord ';'))) | |
287 | ] | |
288 | ) | |
289 | ||
290 | initialState = State [] [] | |
291 | ||
292 | emmental string = do | |
293 | (state, interpreter) <- execute string initialState initialInterpreter debugNop | |
294 | return state | |
295 | ||
296 | debug string = do | |
297 | (state, interpreter) <- execute string initialState initialInterpreter debugPrintState | |
298 | return state | |
299 | ||
300 | ||
301 | ----------------------------------------------------------------------- | |
302 | -- ========================== Test Cases =========================== -- | |
303 | ----------------------------------------------------------------------- | |
304 | ||
305 | -- | |
306 | -- Drivers for test cases. 'demo' runs them straight, whereas 'test' | |
307 | -- uses the debugger. | |
308 | -- | |
309 | ||
310 | demo n = emmental (testProg n) | |
311 | ||
312 | test n = debug (testProg n) | |
313 | ||
314 | -- | |
315 | -- Here we introduce a bit of a cheat, in order to make writing | |
316 | -- complex Emmental programs tolerable. You can still see the | |
317 | -- programs in their fully glory by executing "show (testProg n)". | |
318 | -- | |
319 | ||
320 | quote [] = [] | |
321 | quote (symbol:rest) = "#" ++ (show (Char.ord symbol)) ++ (quote rest) | |
322 | ||
323 | -- | |
324 | -- Add one and one. | |
325 | -- | |
326 | ||
327 | testProg 1 = "#1#1+" | |
328 | ||
329 | -- | |
330 | -- Redefine & as "+". | |
331 | -- | |
332 | ||
333 | testProg 2 = ";#43#38!#1#1&" -- 59,43,38 ==> ";+&" | |
334 | ||
335 | -- | |
336 | -- Redefine 0 as "9". | |
337 | -- | |
338 | ||
339 | testProg 3 = ";#57#48!#0" -- 59,57,48 ==> ";90" | |
340 | ||
341 | -- | |
342 | -- Redefine 0 as "#48?". This results in an infinite loop when 0 is executed. | |
343 | -- | |
344 | ||
345 | testProg 4 = ";#35#52#56#63#48!0" -- 59,35,52,56,63,48 ==> ";#48?0" | |
346 | ||
347 | -- | |
348 | -- Redefine $ as ".#36?". This results in a loop that pops symbols and | |
349 | -- and prints them, until the stack underflows, when $ is executed. | |
350 | -- | |
351 | ||
352 | testProg 5 = ";#46#35#51#54#63#36! #65#66#67#68#69$" | |
353 | ||
354 | -- | |
355 | -- Duplicate the top stack element (assuming an empty queue.) | |
356 | -- This shows that the : operation is not strictly necessary | |
357 | -- (when you know the size of the queue.) | |
358 | -- | |
359 | ||
360 | testProg 6 = "#65^v" | |
361 | ||
362 | -- | |
363 | -- Discard the top stack element (assuming more than one element | |
364 | -- on the stack, and an empty queue.) | |
365 | -- | |
366 | ||
367 | testProg 7 = "#33#123^v-+" | |
368 | ||
369 | -- | |
370 | -- Swap the top two elements of the stack (assuming an empty queue.) | |
371 | -- | |
372 | ||
373 | testProg 8 = "#67#66#65^v^-+^^v^v^v-+^v-+^v-+vv" | |
374 | ||
375 | -- | |
376 | -- Input a symbol. Report whether its ASCII value is even or odd. | |
377 | -- | |
378 | ||
379 | testProg 9 = (quote ";^v:") ++ "!" ++ -- : = dup | |
380 | (quote ";#69.") ++ "#!" ++ -- NUL = print "E" | |
381 | (quote ";#79.") ++ "#128!" ++ -- \128 = print "O" | |
382 | (quote (";" ++ (take 127 [':',':'..]) ++ -- m = mul by 128 | |
383 | (take 127 ['+','+'..]) ++ "m")) ++ "!" ++ | |
384 | ",m?" | |
385 | ||
386 | -- | |
387 | -- Input a symbol. Report whether it is M or not. | |
388 | -- | |
389 | ||
390 | testProg 10 = (quote ";#78.") ++ "#!" ++ -- NUL = print "N" | |
391 | ";##1!" ++ -- SOH = same as NUL | |
392 | ";##2!" ++ -- STX = same as NUL | |
393 | ";##3!" ++ -- ETX = same as NUL | |
394 | ";##4!" ++ -- EOT = same as NUL | |
395 | ";##5!" ++ -- ENQ = same as NUL | |
396 | ";##6!" ++ -- ACK = same as NUL | |
397 | ";##7!" ++ -- BEL = same as NUL | |
398 | (quote ";#89.") ++ "#8!" ++ -- BS = print "Y" | |
399 | ",#77-~?" | |
400 | ||
401 | -- | |
402 | -- Same as testProg 5, except stop printing when a NUL is | |
403 | -- encountered, instead of just underflowing the stack. | |
404 | -- | |
405 | ||
406 | testProg 11 = ";" ++ (quote ":~?$") ++ "!" ++ -- $ = dup & test | |
407 | ";" ++ (quote ".$") ++ "#!" ++ -- NUL = print & repeat | |
408 | ";#0#1!" ++ -- SOH = same as NUL | |
409 | ";#0#2!" ++ -- STX = same as NUL | |
410 | ";#0#3!" ++ -- ETX = same as NUL | |
411 | ";#0#4!" ++ -- EOT = same as NUL | |
412 | ";#0#5!" ++ -- ENQ = same as NUL | |
413 | ";#0#6!" ++ -- ACK = same as NUL | |
414 | ";#0#7!" ++ -- BEL = same as NUL | |
415 | -- BS = stop (nop) | |
416 | "#0" ++ (quote (reverse "Hello!")) ++ "$" |
0 | {-# LANGUAGE OverloadedStrings #-} | |
1 | ||
2 | module Main where | |
3 | ||
4 | import Haste.DOM (withElems, getValue, setProp) | |
5 | import Haste.Events (onEvent, MouseEvent(Click)) | |
6 | import Haste.Foreign (ffi) | |
7 | ||
8 | import Language.Emmental (emmentalWithIO) | |
9 | ||
10 | ||
11 | getCh :: IO Char | |
12 | getCh = ffi "(function() {var i=document.getElementById('prog-input'); var s=i.value; i.value=s.substring(1); return s.charCodeAt(0);})" | |
13 | ||
14 | putCh :: Char -> IO () | |
15 | putCh = ffi "(function(c) {var o=document.getElementById('prog-output'); o.textContent += String.fromCharCode(c);})" | |
16 | ||
17 | clearOutput :: IO () | |
18 | clearOutput = ffi "(function(c) {var o=document.getElementById('prog-output'); o.textContent = '';})" | |
19 | ||
20 | main = withElems ["prog", "result", "run-button"] driver | |
21 | ||
22 | driver [progElem, resultElem, runButtonElem] = | |
23 | onEvent runButtonElem Click $ \_ -> do | |
24 | Just prog <- getValue progElem | |
25 | clearOutput | |
26 | r <- emmentalWithIO (getCh) (putCh) prog | |
27 | setProp resultElem "textContent" $ show $ r |
0 | -- | |
1 | -- Emmental.hs | |
2 | -- Interpreter for the Emmental Programming Language | |
3 | -- Chris Pressey, Cat's Eye Technologies | |
4 | -- This work is in the public domain. See UNLICENSE for more information. | |
5 | -- | |
6 | ||
7 | module Language.Emmental where | |
8 | ||
9 | import Prelude ( | |
10 | Show, Char, IO, putStr, getChar, putChar, last, init, | |
11 | (++), (+), (-), (*), mod, div, return, take, reverse, show | |
12 | ) | |
13 | ||
14 | import qualified Data.Map as Map | |
15 | import qualified Data.Char as Char | |
16 | ||
17 | ||
18 | ----------------------------------------------------------------------- | |
19 | -- ============================ Symbols ============================ -- | |
20 | ----------------------------------------------------------------------- | |
21 | ||
22 | type Symbol = Char | |
23 | ||
24 | ||
25 | ----------------------------------------------------------------------- | |
26 | -- ======================== Program States ========================= -- | |
27 | ----------------------------------------------------------------------- | |
28 | ||
29 | data State = State { | |
30 | stack :: [Symbol], | |
31 | queue :: [Symbol], | |
32 | getCh :: IO Char, | |
33 | putCh :: Char -> IO () | |
34 | } | |
35 | ||
36 | instance Show State where | |
37 | show State{ stack=stack, queue=queue } = | |
38 | "State {stack = " ++ (show stack) ++ ", queue = " ++ (show queue) ++ "}" | |
39 | ||
40 | pop st@State{ stack=(head:tail) } = (head, st{ stack=tail }) | |
41 | push st@State{ stack=stack } sym = st{ stack=(sym:stack) } | |
42 | ||
43 | popString st@State{ stack=(';':tail) } = ([], st{ stack=tail }) | |
44 | popString st@State{ stack=(head:tail) } = | |
45 | let | |
46 | (string, state') = popString st{ stack=tail } | |
47 | in | |
48 | (string ++ [head], state') | |
49 | ||
50 | enqueue st@State{ queue=queue } symbol = st{ queue=(symbol:queue) } | |
51 | dequeue st@State{ queue=queue } = | |
52 | let | |
53 | symbol = last queue | |
54 | queue' = init queue | |
55 | in | |
56 | (symbol, st{ queue=queue' }) | |
57 | ||
58 | ||
59 | ----------------------------------------------------------------------- | |
60 | -- ========================= Interpreters ========================== -- | |
61 | ----------------------------------------------------------------------- | |
62 | ||
63 | data Interpreter = Interp (Map.Map Symbol Operation) | |
64 | ||
65 | fetch (Interp map) sym = Map.findWithDefault (fitRegOp opNop) sym map | |
66 | supplant (Interp map) sym op = (Interp (Map.insert sym op map)) | |
67 | ||
68 | ||
69 | ----------------------------------------------------------------------- | |
70 | -- ========================== Operations =========================== -- | |
71 | ----------------------------------------------------------------------- | |
72 | ||
73 | type Operation = State -> Interpreter -> IO (State, Interpreter) | |
74 | ||
75 | composeOps :: Operation -> Operation -> Operation | |
76 | ||
77 | composeOps op1 op2 = f where | |
78 | f state interpreter = do | |
79 | (state', interpreter') <- op1 state interpreter | |
80 | op2 state' interpreter' | |
81 | ||
82 | createOp :: Interpreter -> [Symbol] -> Operation | |
83 | ||
84 | createOp interpreter [] = | |
85 | (fitRegOp opNop) | |
86 | createOp interpreter (head:tail) = | |
87 | composeOps (fetch interpreter head) (createOp interpreter tail) | |
88 | ||
89 | -- | |
90 | -- It's useful for us to express a lot of our operators as non-monadic | |
91 | -- functions that don't affect the interpreter. This is a little "adapter" | |
92 | -- function that lets us create monadic functions with the right signature | |
93 | -- from them. | |
94 | -- | |
95 | ||
96 | fitRegOp :: (State -> State) -> Operation | |
97 | ||
98 | fitRegOp regop = f where | |
99 | f state interpreter = | |
100 | let | |
101 | state' = regop state | |
102 | in | |
103 | do return (state', interpreter) | |
104 | ||
105 | ||
106 | ------------------------------------------------------------ | |
107 | --------------- The operations themselves. ----------------- | |
108 | ------------------------------------------------------------ | |
109 | ||
110 | -- | |
111 | -- Redefine the meaning of the symbol on the stack with | |
112 | -- a mini-program also popped off the stack. | |
113 | -- | |
114 | ||
115 | opSupplant state interpreter = | |
116 | let | |
117 | (opSym, state') = pop state | |
118 | (newOpDefn, state'') = popString state' | |
119 | newOp = createOp interpreter newOpDefn | |
120 | in | |
121 | do return (state'', supplant interpreter opSym newOp) | |
122 | ||
123 | -- | |
124 | -- Execute the symbol on the stack with the current interpreter. | |
125 | -- | |
126 | ||
127 | opEval state interpreter = | |
128 | let | |
129 | (opSym, state') = pop state | |
130 | newOp = createOp interpreter [opSym] | |
131 | in | |
132 | newOp state' interpreter | |
133 | ||
134 | -- | |
135 | -- I/O. | |
136 | -- | |
137 | ||
138 | opInput state@State{ getCh=getCh } interpreter = do | |
139 | symbol <- getCh | |
140 | do return (push state symbol, interpreter) | |
141 | ||
142 | opOutput state@State{ putCh=putCh } interpreter = | |
143 | let | |
144 | (symbol, state') = pop state | |
145 | in do | |
146 | putCh symbol | |
147 | return (state', interpreter) | |
148 | ||
149 | -- | |
150 | -- Primitive arithmetic. | |
151 | -- | |
152 | ||
153 | opAdd state = | |
154 | let | |
155 | (symA, state') = pop state | |
156 | (symB, state'') = pop state' | |
157 | in | |
158 | push state'' (Char.chr (((Char.ord symB) + (Char.ord symA)) `mod` 256)) | |
159 | ||
160 | opSubtract state = | |
161 | let | |
162 | (symA, state') = pop state | |
163 | (symB, state'') = pop state' | |
164 | in | |
165 | push state'' (Char.chr (((Char.ord symB) - (Char.ord symA)) `mod` 256)) | |
166 | ||
167 | discreteLog 0 = 8 | |
168 | discreteLog 1 = 0 | |
169 | discreteLog 2 = 1 | |
170 | discreteLog n = (discreteLog (n `div` 2)) + 1 | |
171 | ||
172 | opDiscreteLog state = | |
173 | let | |
174 | (symbol, state') = pop state | |
175 | in | |
176 | push state' (Char.chr (discreteLog (Char.ord symbol))) | |
177 | ||
178 | -- | |
179 | -- Stack manipulation. | |
180 | -- | |
181 | ||
182 | -- | |
183 | -- Pop the top symbol of the stack, make a copy of it, push it back onto the | |
184 | -- stack, and enqueue the copy onto the queue. | |
185 | -- | |
186 | ||
187 | opEnqueueCopy state = | |
188 | let | |
189 | (sym, _) = pop state | |
190 | in | |
191 | enqueue state sym | |
192 | ||
193 | -- | |
194 | -- Dequeue a symbol from the queue and push it onto the stack. | |
195 | -- | |
196 | ||
197 | opDequeue state = | |
198 | let | |
199 | (sym, state') = dequeue state | |
200 | in | |
201 | push state' sym | |
202 | ||
203 | -- | |
204 | -- Duplicate the top symbol of the stack. | |
205 | -- | |
206 | ||
207 | opDuplicate state = | |
208 | let | |
209 | (symbol, _) = pop state | |
210 | in | |
211 | push state symbol | |
212 | ||
213 | -- | |
214 | -- Miscellaneous operations. | |
215 | -- | |
216 | ||
217 | opNop state = | |
218 | state | |
219 | ||
220 | -- | |
221 | -- Parameterizable operations. | |
222 | -- | |
223 | ||
224 | opPushValue value state = | |
225 | push state (Char.chr value) | |
226 | ||
227 | opAccumValue value state = | |
228 | let | |
229 | (sym, state') = pop state | |
230 | value' = ((Char.ord sym) * 10) + value | |
231 | in | |
232 | push state' (Char.chr (value' `mod` 256)) | |
233 | ||
234 | ||
235 | ----------------------------------------------------------------------- | |
236 | -- ===================== Debugging Functions ======================= -- | |
237 | ----------------------------------------------------------------------- | |
238 | ||
239 | type Debugger = State -> Interpreter -> IO () | |
240 | ||
241 | debugNop s i = do | |
242 | return () | |
243 | ||
244 | debugPrintState s i = do | |
245 | putStr ((show s) ++ "\n") | |
246 | return () | |
247 | ||
248 | ||
249 | ----------------------------------------------------------------------- | |
250 | -- ============================ Executor =========================== -- | |
251 | ----------------------------------------------------------------------- | |
252 | ||
253 | execute :: [Symbol] -> State -> Interpreter -> Debugger -> IO (State, Interpreter) | |
254 | ||
255 | execute [] state interpreter debugger = | |
256 | return (state, interpreter) | |
257 | execute (opSym:program') state interpreter debugger = | |
258 | let | |
259 | operation = fetch interpreter opSym | |
260 | in do | |
261 | (state', interpreter') <- operation state interpreter | |
262 | debugger state' interpreter' | |
263 | execute program' state' interpreter' debugger | |
264 | ||
265 | ||
266 | ----------------------------------------------------------------------- | |
267 | -- ====================== Top-Level Function ======================= -- | |
268 | ----------------------------------------------------------------------- | |
269 | ||
270 | initialInterpreter = Interp | |
271 | (Map.fromList | |
272 | [ | |
273 | ('.', opOutput), | |
274 | (',', opInput), | |
275 | ||
276 | ('#', fitRegOp (opPushValue 0)), | |
277 | ('0', fitRegOp (opAccumValue 0)), | |
278 | ('1', fitRegOp (opAccumValue 1)), | |
279 | ('2', fitRegOp (opAccumValue 2)), | |
280 | ('3', fitRegOp (opAccumValue 3)), | |
281 | ('4', fitRegOp (opAccumValue 4)), | |
282 | ('5', fitRegOp (opAccumValue 5)), | |
283 | ('6', fitRegOp (opAccumValue 6)), | |
284 | ('7', fitRegOp (opAccumValue 7)), | |
285 | ('8', fitRegOp (opAccumValue 8)), | |
286 | ('9', fitRegOp (opAccumValue 9)), | |
287 | ||
288 | ('+', fitRegOp opAdd), | |
289 | ('-', fitRegOp opSubtract), | |
290 | ('~', fitRegOp opDiscreteLog), | |
291 | ||
292 | ('^', fitRegOp opEnqueueCopy), | |
293 | ('v', fitRegOp opDequeue), | |
294 | (':', fitRegOp opDuplicate), | |
295 | ||
296 | ('!', opSupplant), | |
297 | ('?', opEval), | |
298 | ||
299 | (';', fitRegOp (opPushValue (Char.ord ';'))) | |
300 | ] | |
301 | ) | |
302 | ||
303 | initialState = State { stack=[], queue=[], getCh=getChar, putCh=putChar } | |
304 | ||
305 | emmental string = do | |
306 | (state, interpreter) <- execute string initialState initialInterpreter debugNop | |
307 | return state | |
308 | ||
309 | emmentalWithIO getCh putCh string = | |
310 | let | |
311 | i = initialState | |
312 | i' = i{ getCh=getCh, putCh=putCh } | |
313 | in do | |
314 | (state, interpreter) <- execute string i' initialInterpreter debugNop | |
315 | return state | |
316 | ||
317 | debug string = do | |
318 | (state, interpreter) <- execute string initialState initialInterpreter debugPrintState | |
319 | return state | |
320 | ||
321 | ||
322 | ----------------------------------------------------------------------- | |
323 | -- ========================== Test Cases =========================== -- | |
324 | ----------------------------------------------------------------------- | |
325 | ||
326 | -- | |
327 | -- Drivers for test cases. 'demo' runs them straight, whereas 'test' | |
328 | -- uses the debugger. | |
329 | -- | |
330 | ||
331 | demo n = emmental (testProg n) | |
332 | ||
333 | test n = debug (testProg n) | |
334 | ||
335 | -- | |
336 | -- Here we introduce a bit of a cheat, in order to make writing | |
337 | -- complex Emmental programs tolerable. You can still see the | |
338 | -- programs in their full glory by executing "show (testProg n)". | |
339 | -- | |
340 | ||
341 | quote [] = [] | |
342 | quote (symbol:rest) = "#" ++ (show (Char.ord symbol)) ++ (quote rest) | |
343 | ||
344 | -- | |
345 | -- Add one and one. | |
346 | -- | |
347 | ||
348 | testProg 1 = "#1#1+" | |
349 | ||
350 | -- | |
351 | -- Redefine & as "+". | |
352 | -- | |
353 | ||
354 | testProg 2 = ";#43#38!#1#1&" -- 59,43,38 ==> ";+&" | |
355 | ||
356 | -- | |
357 | -- Redefine 0 as "9". | |
358 | -- | |
359 | ||
360 | testProg 3 = ";#57#48!#0" -- 59,57,48 ==> ";90" | |
361 | ||
362 | -- | |
363 | -- Redefine 0 as "#48?". This results in an infinite loop when 0 is executed. | |
364 | -- | |
365 | ||
366 | testProg 4 = ";#35#52#56#63#48!0" -- 59,35,52,56,63,48 ==> ";#48?0" | |
367 | ||
368 | -- | |
369 | -- Redefine $ as ".#36?". This results in a loop that pops symbols and | |
370 | -- and prints them, until the stack underflows, when $ is executed. | |
371 | -- | |
372 | ||
373 | testProg 5 = ";#46#35#51#54#63#36! #65#66#67#68#69$" | |
374 | ||
375 | -- | |
376 | -- Duplicate the top stack element (assuming an empty queue.) | |
377 | -- This shows that the : operation is not strictly necessary | |
378 | -- (when you know the size of the queue.) | |
379 | -- | |
380 | ||
381 | testProg 6 = "#65^v" | |
382 | ||
383 | -- | |
384 | -- Discard the top stack element (assuming more than one element | |
385 | -- on the stack, and an empty queue.) | |
386 | -- | |
387 | ||
388 | testProg 7 = "#33#123^v-+" | |
389 | ||
390 | -- | |
391 | -- Swap the top two elements of the stack (assuming an empty queue.) | |
392 | -- | |
393 | ||
394 | testProg 8 = "#67#66#65^v^-+^^v^v^v-+^v-+^v-+vv" | |
395 | ||
396 | -- | |
397 | -- Input a symbol. Report whether its ASCII value is even or odd. | |
398 | -- | |
399 | ||
400 | testProg 9 = (quote ";^v:") ++ "!" ++ -- : = dup | |
401 | (quote ";#69.") ++ "#!" ++ -- NUL = print "E" | |
402 | (quote ";#79.") ++ "#128!" ++ -- \128 = print "O" | |
403 | (quote (";" ++ (take 127 [':',':'..]) ++ -- m = mul by 128 | |
404 | (take 127 ['+','+'..]) ++ "m")) ++ "!" ++ | |
405 | ",m?" | |
406 | ||
407 | -- | |
408 | -- Input a symbol. Report whether it is M or not. | |
409 | -- | |
410 | ||
411 | testProg 10 = (quote ";#78.") ++ "#!" ++ -- NUL = print "N" | |
412 | ";##1!" ++ -- SOH = same as NUL | |
413 | ";##2!" ++ -- STX = same as NUL | |
414 | ";##3!" ++ -- ETX = same as NUL | |
415 | ";##4!" ++ -- EOT = same as NUL | |
416 | ";##5!" ++ -- ENQ = same as NUL | |
417 | ";##6!" ++ -- ACK = same as NUL | |
418 | ";##7!" ++ -- BEL = same as NUL | |
419 | (quote ";#89.") ++ "#8!" ++ -- BS = print "Y" | |
420 | ",#77-~?" | |
421 | ||
422 | -- | |
423 | -- Same as testProg 5, except stop printing when a NUL is | |
424 | -- encountered, instead of just underflowing the stack. | |
425 | -- | |
426 | ||
427 | testProg 11 = ";" ++ (quote ":~?$") ++ "!" ++ -- $ = dup & test | |
428 | ";" ++ (quote ".$") ++ "#!" ++ -- NUL = print & repeat | |
429 | ";#0#1!" ++ -- SOH = same as NUL | |
430 | ";#0#2!" ++ -- STX = same as NUL | |
431 | ";#0#3!" ++ -- ETX = same as NUL | |
432 | ";#0#4!" ++ -- EOT = same as NUL | |
433 | ";#0#5!" ++ -- ENQ = same as NUL | |
434 | ";#0#6!" ++ -- ACK = same as NUL | |
435 | ";#0#7!" ++ -- BEL = same as NUL | |
436 | -- BS = stop (nop) | |
437 | "#0" ++ (quote (reverse "Hello!")) ++ "$" |
0 | 0 | module Main where |
1 | 1 | |
2 | 2 | import System.Environment |
3 | import Emmental | |
3 | import Language.Emmental | |
4 | 4 | |
5 | 5 | main = do |
6 | 6 | args <- getArgs |
0 | 0 | #!/bin/sh |
1 | 1 | |
2 | ./build.sh || exit 1 | |
3 | ||
4 | falderal --substring-error tests/Emmental.markdown | |
2 | falderal --substring-error tests/Emmental.md |
0 | Test Suite for Emmental | |
1 | ======================= | |
2 | ||
3 | This test suite is written in the format of Falderal 0.9. It is far from | |
4 | exhaustive, but provides a basic sanity check on the language. | |
5 | ||
6 | Emmental Tests | |
7 | -------------- | |
8 | ||
9 | -> Functionality "Interpret Emmental Program" is implemented by | |
10 | -> shell command | |
11 | -> "bin/emmental %(test-body-file)" | |
12 | ||
13 | -> Functionality "Interpret Emmental Program and Show Final State" | |
14 | -> is implemented by shell command | |
15 | -> "bin/emmental -r %(test-body-file)" | |
16 | ||
17 | -> Tests for functionality "Interpret Emmental Program and Show Final State" | |
18 | ||
19 | Add one and one. | |
20 | ||
21 | | #1#1+ | |
22 | = State "\STX" "" | |
23 | ||
24 | Redefine `&` as `+`. (`59,43,38 ==> ";+&"`) | |
25 | ||
26 | | ;#43#38!#1#1& | |
27 | = State "\STX" "" | |
28 | ||
29 | Redefine `0` as `9`. (`59,57,48 ==> ";90"`) | |
30 | ||
31 | | ;#57#48!#0 | |
32 | = State "\t" "" | |
33 | ||
34 | Redefine `$` as `.#36?`. This results in a loop that pops symbols and | |
35 | and prints them, until the stack underflows, when `$` is executed. | |
36 | ||
37 | | ;#46#35#51#54#63#36! #65#66#67#68#69$ | |
38 | ? pop | |
39 | ||
40 | Duplicate the top stack element (assuming an empty queue.) | |
41 | This shows that the `:` operation is not strictly necessary | |
42 | (when you know the size of the queue.) | |
43 | ||
44 | | #65^v | |
45 | = State "AA" "" | |
46 | ||
47 | Discard the top stack element (assuming more than one element | |
48 | on the stack, and an empty queue.) | |
49 | ||
50 | | #33#123^v-+ | |
51 | = State "!" "" | |
52 | ||
53 | Swap the top two elements of the stack (assuming an empty queue.) | |
54 | ||
55 | | #67#66#65^v^-+^^v^v^v-+^v-+^v-+vv | |
56 | = State "BAC" "" | |
57 | ||
58 | -> Tests for functionality "Interpret Emmental Program" | |
59 | ||
60 | Input a symbol. Report whether its ASCII value is even or odd. | |
61 | ||
62 | | #59#94#118#58!#59#35#54#57#46#!#59#35#55#57#46#128!#59#58#58#58#58#58 | |
63 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
64 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
65 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
66 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
67 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
68 | | #58#58#58#58#58#58#58#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
69 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
70 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
71 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
72 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
73 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#109!,m? | |
74 | + @ | |
75 | = E | |
76 | ||
77 | | #59#94#118#58!#59#35#54#57#46#!#59#35#55#57#46#128!#59#58#58#58#58#58 | |
78 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
79 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
80 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
81 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
82 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
83 | | #58#58#58#58#58#58#58#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
84 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
85 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
86 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
87 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
88 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#109!,m? | |
89 | + A | |
90 | = O | |
91 | ||
92 | Input a symbol. Report whether it is M or not. | |
93 | ||
94 | | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8! | |
95 | | ,#77-~? | |
96 | + M | |
97 | = Y | |
98 | ||
99 | | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8! | |
100 | | ,#77-~? | |
101 | + z | |
102 | = N | |
103 | ||
104 | Redefine `$` such that this results in a loop that pops symbols and | |
105 | and prints them, until the stack underflows, when `$` is executed, | |
106 | except stop printing when a NUL is encountered, instead of just | |
107 | underflowing the stack. (See `testProg 5` and `testProg 11` in `Emmental.hs` | |
108 | for a more lucid explanation.) | |
109 | ||
110 | | ;#58#126#63#36!;#46#36#!;#0#1!;#0#2!;#0#3!;#0#4!;#0#5!;#0#6!;#0#7! | |
111 | | #0#33#111#108#108#101#72$ | |
112 | = Hello! |
0 | Test Suite for Emmental | |
1 | ======================= | |
2 | ||
3 | This test suite is written in the format of Falderal 0.9. It is far from | |
4 | exhaustive, but provides a basic sanity check on the language. | |
5 | ||
6 | Emmental Tests | |
7 | -------------- | |
8 | ||
9 | -> Functionality "Interpret Emmental Program" is implemented by | |
10 | -> shell command | |
11 | -> "bin/emmental %(test-body-file)" | |
12 | ||
13 | -> Functionality "Interpret Emmental Program and Show Final State" | |
14 | -> is implemented by shell command | |
15 | -> "bin/emmental -r %(test-body-file)" | |
16 | ||
17 | -> Tests for functionality "Interpret Emmental Program and Show Final State" | |
18 | ||
19 | Add one and one. | |
20 | ||
21 | | #1#1+ | |
22 | = State {stack = "\STX", queue = ""} | |
23 | ||
24 | Redefine `&` as `+`. (`59,43,38 ==> ";+&"`) | |
25 | ||
26 | | ;#43#38!#1#1& | |
27 | = State {stack = "\STX", queue = ""} | |
28 | ||
29 | Redefine `0` as `9`. (`59,57,48 ==> ";90"`) | |
30 | ||
31 | | ;#57#48!#0 | |
32 | = State {stack = "\t", queue = ""} | |
33 | ||
34 | Redefine `$` as `.#36?`. This results in a loop that pops symbols and | |
35 | and prints them, until the stack underflows, when `$` is executed. | |
36 | ||
37 | | ;#46#35#51#54#63#36! #65#66#67#68#69$ | |
38 | ? pop | |
39 | ||
40 | Duplicate the top stack element (assuming an empty queue.) | |
41 | This shows that the `:` operation is not strictly necessary | |
42 | (when you know the size of the queue.) | |
43 | ||
44 | | #65^v | |
45 | = State {stack = "AA", queue = ""} | |
46 | ||
47 | Discard the top stack element (assuming more than one element | |
48 | on the stack, and an empty queue.) | |
49 | ||
50 | | #33#123^v-+ | |
51 | = State {stack = "!", queue = ""} | |
52 | ||
53 | Swap the top two elements of the stack (assuming an empty queue.) | |
54 | ||
55 | | #67#66#65^v^-+^^v^v^v-+^v-+^v-+vv | |
56 | = State {stack = "BAC", queue = ""} | |
57 | ||
58 | -> Tests for functionality "Interpret Emmental Program" | |
59 | ||
60 | Input a symbol. Report whether its ASCII value is even or odd. | |
61 | ||
62 | | #59#94#118#58!#59#35#54#57#46#!#59#35#55#57#46#128!#59#58#58#58#58#58 | |
63 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
64 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
65 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
66 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
67 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
68 | | #58#58#58#58#58#58#58#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
69 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
70 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
71 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
72 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
73 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#109!,m? | |
74 | + @ | |
75 | = E | |
76 | ||
77 | | #59#94#118#58!#59#35#54#57#46#!#59#35#55#57#46#128!#59#58#58#58#58#58 | |
78 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
79 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
80 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
81 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
82 | | #58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58#58 | |
83 | | #58#58#58#58#58#58#58#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
84 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
85 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
86 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
87 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43 | |
88 | | #43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#43#109!,m? | |
89 | + A | |
90 | = O | |
91 | ||
92 | Input a symbol. Report whether it is M or not. | |
93 | ||
94 | | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8! | |
95 | | ,#77-~? | |
96 | + M | |
97 | = Y | |
98 | ||
99 | | #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8! | |
100 | | ,#77-~? | |
101 | + z | |
102 | = N | |
103 | ||
104 | Redefine `$` such that this results in a loop that pops symbols and | |
105 | and prints them, until the stack underflows, when `$` is executed, | |
106 | except stop printing when a NUL is encountered, instead of just | |
107 | underflowing the stack. (See `testProg 5` and `testProg 11` in `Emmental.hs` | |
108 | for a more lucid explanation.) | |
109 | ||
110 | | ;#58#126#63#36!;#46#36#!;#0#1!;#0#2!;#0#3!;#0#4!;#0#5!;#0#6!;#0#7! | |
111 | | #0#33#111#108#108#101#72$ | |
112 | = Hello! |