The Mascarpone Programming Language
===================================

Language version 1.0. Distribution version 2007.1208.\
Chris Pressey, Cat's Eye Technologies

*You are lost in a twisty maze of meta-circular interpreters, all
alike.*

Introduction
------------

Mascarpone is a self-modifying programming language in the style of
[Emmental](/projects/emmental/). In fact it is a rationalization and
further exploration of some of the basic ideas behind Emmental. In
Mascarpone, meta-circular interpreters are "first-class objects": they
can be pushed onto the stack, have operations extracted from and
installed into them, and can themselves be meta-circularly extracted
from the language environment ("reified") or installed into it
("deified"). New operations can be defined as strings of symbols, and
these symbols are given meaning by an interpreter that is "captured" in
the definition, similar to the way that lexical variables are captured
in closures in functional languages. An operation may also access, and
modify, the interpreter that invoked it.

Like Emmental, Mascarpone relies on meta-circular
interpreter-modification to achieve Turing-completeness. Unlike
Emmental, Mascarpone is purely symbolic; there are no arithmetic
instructions.

Stack
-----

Like Emmental, Mascarpone is a stack-based language. Unlike Emmental,
Mascarpone's stack may contain things other than symbols. A stack
element in Mascarpone may be a symbol, an operation, or an interpreter.

Strings are popped off Mascarpone's stack slightly differently than
Emmental's. A string begins with the symbol `]` on the stack; this is
popped and discarded. Symbols are then successively popped and prepended
to a growing string. As further `]`'s are encountered, they too are
prepended to the string, but the nesting level is incremented for each
one as well. Whenever a `[` is encountered, it is prepended to the
string and the nesting level is decremented, unless it is zero, in which
case the `[` is discarded and the string is complete. The net effect of
all this futzing around is that `[]` work as nestable quoting symbols.

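The string-popping procedure described above can be sketched in Python.
This is an illustration only, not the reference implementation: the name
`pop_string` is made up for the example, the stack is assumed to be a
Python list whose last item is the top, and it is assumed to hold only
one-character symbols.

```python
def pop_string(stack):
    """Pop one bracket-quoted string off `stack` (a list; top of stack is the last item)."""
    if not stack or stack.pop() != "]":
        raise ValueError("a string must begin with ']' on top of the stack")
    popped = []  # symbols in the order popped; reversing at the end equals prepending
    depth = 0    # nesting level of inner brackets
    while stack:
        sym = stack.pop()
        if sym == "]":
            depth += 1
        elif sym == "[":
            if depth == 0:
                return "".join(reversed(popped))  # the outermost '[' is discarded
            depth -= 1
        popped.append(sym)
    raise ValueError("ran out of stack before finding the matching '['")


stack = list("x[a[b]c]")
text = pop_string(stack)  # text == "a[b]c"; the "x" underneath is left on the stack
```
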
Also unlike Emmental, Mascarpone does not have a queue.

Meta-circular Interpreters
--------------------------

The idea of an interpreter in Mascarpone is similar to that in Emmental.
In Mascarpone, an interpreter is a map that takes symbols to operations,
and an operation is a sequence of symbols that is given meaning by some
interpreter.

Of course, this is a circular definition, but that doesn't seem
unreasonable, since we're working with meta-circular interpreters. If
you like, you can think of it as forming an "infinite tower of
meta-circular interpreters," but that's never been a really satisfying
explanation for me. As I explained in the Emmental documentation, I
think you need some source of understanding external to the definition
in order to make complete sense of a meta-circular interpreter. (I also
happen to think that humans have some sort of innate understanding of
interpretation, that is, of language, so that this demand for further
understanding doesn't recurse forever.)

There is a special interpreter in Mascarpone called "null". It is an
error to try to interpret anything with this interpreter. Expect that
any program that tries to do this will come crashing to a halt, or will
spin off into space and never be heard from again, or something equally
impressive.

Every interpreter (except for null) is linked to a "parent" interpreter
(which may be null). No interpreter can be its own ancestor; the
parent-child relationships between interpreters form a directed, acyclic
graph (or DAG).

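One way to picture these definitions is as a pair of immutable records.
The Python sketch below is an assumption-laden model, not the reference
implementation: the class names, the use of `None` for the null
interpreter, and the helper names `ancestors` and `with_parent` are all
invented here, and intrinsic operations are not modelled. Because
changing a parent yields a new value rather than altering an existing
one, an interpreter built this way cannot end up among its own
ancestors.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class Operation:
    program: str                          # the sequence of symbols
    interpreter: Optional["Interpreter"]  # interpreter that gives them meaning


@dataclass(frozen=True)
class Interpreter:
    table: Dict[str, Operation]             # symbol -> operation
    parent: Optional["Interpreter"] = None  # None models the null interpreter


def ancestors(interp: Interpreter) -> Iterator[Interpreter]:
    """Yield the parent, grandparent, and so on, stopping at null."""
    node = interp.parent
    while node is not None:
        yield node
        node = node.parent


def with_parent(i: Interpreter, j: Optional[Interpreter]) -> Interpreter:
    """Return a copy of `i` whose parent is `j`; `i` itself is unchanged."""
    return Interpreter(i.table, j)


root = Interpreter({})
child = with_parent(Interpreter({}), root)
looped = with_parent(root, child)  # a new value; root still has no parent
```
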
There is, at any given time in a Mascarpone program, a current
interpreter: this is the interpreter that is in force, that is being
used to interpret symbols. The parent interpreter of the current
interpreter is generally the interpreter that was used to execute the
current operation (that is, the operation currently being interpreted;
it consists of a string of symbols that is interpreted by the current
interpreter).

The current interpreter when any top-level Mascarpone program begins is
the initial Mascarpone interpreter, which is described in English in the
next section.

Initial Mascarpone Interpreter
------------------------------

`v` ("reify") pushes the current interpreter onto the stack.

`^` ("deify") pops an interpreter from the stack and installs it as the
current interpreter.

`>` ("extract") pops a symbol from the stack, then pops an interpreter.
It pushes onto the stack the operation associated with that symbol in
that interpreter.

`<` ("install") pops a symbol from the stack, then an operation, then an
interpreter. It pushes onto the stack a new interpreter which is the
same as the given interpreter, except that in it, the given symbol is
associated with the given operation.

`{` ("get parent") pops an interpreter from the stack and pushes its
parent interpreter onto the stack.

`}` ("set parent") pops an interpreter i from the stack, then pops an
interpreter j. It pushes a new interpreter which is the same as i,
except that its parent interpreter is j.

`*` ("create") pops an interpreter from the stack, then a string. It
creates a new operation defined by how that interpreter would interpret
that string of symbols, and pushes that operation onto the stack.

`@` ("expand") pops an operation from the stack and pushes a program
string, then pushes an interpreter, such that the semantics of running
the program string with the interpreter is identical to the semantics of
executing the operation. (Note that the program need not be the one that
the operation was defined with, only *equivalent* to it, under the given
interpreter; this allows one to sensibly expand "intrinsic" operations
like those in the initial Mascarpone interpreter.)

`!` ("perform") pops an operation from the stack and executes it.

`0` ("null") pushes the null interpreter onto the stack.

`1` ("uniform") pops an operation from the stack and pushes back an
interpreter where all symbols are associated with that operation.

`[` ("deepquote") pushes a `[` symbol onto the stack and enters "nested
quote mode", which is really another interpreter. In nested quote mode,
each symbol is interpreted as an operation which pushes that symbol onto
the stack. In addition, the symbols `[` and `]` have special additional
meaning: they nest. When a `]` matching the first `[` is encountered,
nested quote mode ends, returning to the interpreter previously in
effect.

`'` ("quotesym") switches to "single-symbol quote mode", which is really
yet another interpreter. In single-symbol quote mode, each symbol is
interpreted as an operation which pushes that symbol onto the stack,
then immediately ends single-symbol quote mode, returning to the
interpreter previously in effect.

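The two quote modes can be illustrated with a small Python sketch that
models only `[` and `'` and ignores every other operation. It is a
simplification under stated assumptions: the function name `run_quotes`
is hypothetical, a `mode` flag stands in for what the text describes as
switching to another interpreter, and the closing `]` is assumed to be
pushed like any other symbol, which is what the string-popping rules in
the Stack section expect to find.

```python
def run_quotes(program):
    """Run only the quoting behaviour of `[` and `'`; return the resulting stack."""
    stack = []
    mode = "normal"  # stands in for "which interpreter is current"
    depth = 0        # nesting level inside nested quote mode
    for sym in program:
        if mode == "single":
            stack.append(sym)        # push the symbol, then leave the mode at once
            mode = "normal"
        elif mode == "nested":
            if sym == "[":
                depth += 1
            elif sym == "]":
                if depth == 0:
                    mode = "normal"  # this ']' matches the opening '['
                else:
                    depth -= 1
            stack.append(sym)        # every symbol, brackets included, is pushed
        elif sym == "[":
            stack.append(sym)
            mode, depth = "nested", 0
        elif sym == "'":
            mode = "single"
        # any other symbol is ignored here; the real language would execute it
    return stack


quoted = run_quotes("[a[b]c]'x")  # the stack then holds [ a [ b ] c ] x, bottom to top
```
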
`.` pops a symbol off the stack and sends it to the standard output.

`,` waits for a symbol to arrive on standard input, and pushes it onto
the stack.

`:` duplicates the top element of the stack.

`$` pops the top element of the stack and discards it.

`/` swaps the top two elements of the stack.

Discussion
----------

### Design decisions

As you can see, Mascarpone's semantics and initial operations are a lot
less "fugly" than Emmental's. It's a more expressive language, in that
it's easier to elegantly convey things involving interpreters and
meta-circularity in Mascarpone than it is in Emmental. It explores at
least one idea that I explicitly mentioned in the Emmental documentation
that I'd like to explore, namely, having multiple meta-circular
interpreters and being able to switch between them (and lo and behold,
Mascarpone has very well-developed `[]` and `'` operations). It's also
"prettier" in that there's more attention paid to providing duals of
operations (both `*` and `@`, for example).

Mascarpone also appears to be Turing-complete, despite the lack of
explicit conditional, repetition, and arithmetic operators. A cyclic
meaning can be expressed by an operation which examines its own
definition from the parent interpreter of the current interpreter and
re-uses it. A conditional can be formed by creating a new interpreter in
which one symbol, say `S`, maps to an operation which does something,
and in which all other symbols do something else; executing a symbol in
this interpreter is tantamount to testing if that symbol is `S`.

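The conditional trick can be mimicked outside Mascarpone. In the Python
sketch below, the class `Interp`, the helpers `uniform` and `make_test`,
and the choice to represent operations as zero-argument Python callables
are all assumptions made for illustration. The methods are loosely
modelled on `1` ("uniform"), `<` ("install"), and `>` ("extract"), and
looking a symbol up and calling the result plays the role of executing
that symbol.

```python
class Interp:
    """Toy interpreter: a default operation plus per-symbol overrides."""

    def __init__(self, default, overrides=None):
        self.default = default
        self.overrides = dict(overrides or {})

    def install(self, symbol, op):
        """Like `<`: return a new interpreter with `symbol` bound to `op`."""
        return Interp(self.default, {**self.overrides, symbol: op})

    def extract(self, symbol):
        """Like `>`: return the operation bound to `symbol`."""
        return self.overrides.get(symbol, self.default)


def uniform(op):
    """Like `1`: every symbol is associated with `op`."""
    return Interp(op)


def make_test(target, if_equal, otherwise):
    """Build a function that runs `if_equal` for `target`, `otherwise` for the rest."""
    tester = uniform(otherwise).install(target, if_equal)
    return lambda symbol: tester.extract(symbol)()


is_s = make_test("S", lambda: "matched", lambda: "did not match")
```
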
"But", you point out, "Mascarpone only has one stack! You need at least
two stacks in order to simulate a Turing machine's tape." Actually,
Mascarpone *does* have another, less obvious stack: each interpreter has
a parent interpreter. By getting the current interpreter, modifying it,
setting its parent to be the current interpreter, and setting it as the
current interpreter (in Mascarpone: `v`...`v}^`), we "push" something
onto it; by getting the current interpreter, getting its parent, and
setting that as the current interpreter (`v{^`), we "pop".

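Seen as a data structure, the parent chain is a singly linked list used
as a stack. The Python sketch below follows the prose description of
"push" and "pop" rather than the exact symbol sequences; the names
`Interp`, `push`, and `pop` are hypothetical, `None` stands for the null
interpreter, and a plain `label` field stands in for whatever the
modification to the interpreter encodes.

```python
from typing import NamedTuple, Optional


class Interp(NamedTuple):
    label: str                         # stand-in for the symbol-to-operation table
    parent: Optional["Interp"] = None  # None models the null interpreter


def push(current: Interp, modified: Interp) -> Interp:
    """Make `modified` current, remembering `current` as its parent."""
    return modified._replace(parent=current)


def pop(current: Interp) -> Optional[Interp]:
    """Make the parent of `current` the current interpreter again."""
    return current.parent


base = Interp("base")
top = push(push(base, Interp("one")), Interp("two"))
# Popping twice from `top` leads back to `base`.
```
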
Actually, even if there were no explicit parent-child relationship
between interpreters, we'd still be able to store a stack of
interpreters, because each operation in an interpreter has its own
interpreter that gives meaning to the symbols in that operation, and
*that* interpreter can contain operations that can contain interpreters,
etc., etc., ad infinitum. This isn't a very classy way to do it, but
it's very reminiscent of how structures can be built in the lambda
calculus by trapping abstractions in other abstractions.

It's also worth noting that this is how you'd have to accomplish
arithmetic, with something like Church numerals done with interpreters
and operations, since Mascarpone has nothing but symbols. On the plus
side, this means Mascarpone, unlike Emmental, is highly independent of
character set or encoding: the set of symbols doesn't even have to be
ordered. Any set of symbols that contains the symbols of the initial
Mascarpone interpreter, plus the symbols appearing in the Mascarpone
program being executed, plus the symbols that are desired for input and
output, ought to suffice.

Actually, that's not quite true: it should be a *finite* set. This is
mainly for the sake of the definition of the `'` operator: it switches
to an interpreter where all symbols indicate operations that push that
symbol on the stack. From this we can infer that there should either be
a finite number of such operations (and thus symbols), or somehow these
operations know what symbol they are to push. They take the symbol that
invoked them as an argument, perhaps. But other operations in Mascarpone
do not have such capabilities: an operation need not even be invoked by
a symbol, as it could be invoked by the `!` operation, for instance.
That would make the operations in the `'` interpreter gratuitously
special. And, practically, most character sets, on which sets of symbols
are based, are finite, so I don't suppose this restriction is much of a
problem.

One further, somewhat related design decision deserves mention. Any
symbol which is not defined in the initial interpreter is interpreted as
a no-op. It probably would have been nicer to treat it as an explicit
error-causing operation. This could be extended to looking, inside each
putative definition, for symbols undefined in the desired interpreter
when executing a `*` operation, and causing a (preferably intelligible)
error early in that case. Semantics like this would have helped me save
time in debugging one or two of the test case programs. However, while
Mascarpone is arguably supposed to be less hostile than Emmental when it
comes to being programmed in, it's certainly still not what you'd call a
mainstream programming language, so while I'm somewhat irked by this
deficiency, I hardly consider it a show-stopper.

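A rough idea of what such an early check might look like, in Python.
This is not part of Mascarpone or its reference interpreter: the names
`INITIAL_SYMBOLS`, `undefined_symbols`, and `create_checked` are
invented, and the symbol set is simply the list of operations from the
"Initial Mascarpone Interpreter" section. The check is deliberately
naive: it looks at every symbol of the definition, so quoted symbols
(and `]`, which that section does not list as an operation) are
reported too. A usable version would have to decide how to treat
quoting.

```python
INITIAL_SYMBOLS = frozenset("v^><{}*@!01['.,:$/")


def undefined_symbols(program, defined=INITIAL_SYMBOLS):
    """Return the symbols of `program` that `defined` lacks, in order of first use."""
    seen = []
    for sym in program:
        if sym not in defined and sym not in seen:
            seen.append(sym)
    return seen


def create_checked(program, defined=INITIAL_SYMBOLS):
    """Refuse to build an operation from a definition that uses undefined symbols."""
    missing = undefined_symbols(program, defined)
    if missing:
        raise ValueError("undefined symbols in definition: " + " ".join(missing))
    return program  # a real `*` would pair the program with its interpreter
```
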
### Related Work

There are definitely two related works that are worth mentioning: Brian
Cantwell Smith's Ph.D. thesis "Procedural Reflection in Programming
Languages" (MIT, 1982), and Friedman and Wand's paper "Reification:
Reflection without Metaphysics" (ACM LISP conference, 1984). (Forgive me
for not giving proper, perfectly-formatted, Turabianly-correct
references to these two works, but frankly, this is the age of the
Internet: if you're interested in either of these papers, and you can't
find them, there's something wrong with you! If, on the other hand, you
don't have *access* to them, perhaps there's something wrong with the
institutions whose assumed goal is to increase the amount of human
knowledge but not, it seems, to widen its availability.)

It's hard to say how much influence Smith's 3-LISP language and Friedman
and Wand's Brown language (introduced in the respective papers) have had
on Mascarpone: probably some, since I had read both of them (well, not
*all* of Smith's monster! but enough of it to grasp the main ideas, I
think) and thought about what they were trying to convey. (What Brown
calls "reflection" I've called "deification" to give a sort of
phonological dual to "reification". Also, the term "reflection" seems to
have taken on a more general meaning in computer science since the
'80s, so I wanted to avoid its use here.) But that was a couple of
years previous, and the subject of meta-circular interpreters came up
this time from a different angle; Mascarpone came primarily from trying
to "un-knot" the ideas behind Emmental, which itself came to be, quite
indirectly, from thinking about issues raised by John Reynolds' original
work on meta-circularity.

Certainly a huge difference that sets Mascarpone apart is that 3-LISP
and Brown are caught up in the whole LISP/Scheme thing, so they just use
S-expressions and functions to represent reified interpreter parts,
which include environments and continuations. Mascarpone, on the other
hand, reifies whole interpreters at once, as values which are complete
interpreters. Because interpreters contain operations which contain
interpreters ("ad infinitum", one might think), this approach seems to
highlight the meta-circularity in a way that is particularly striking.
In addition, Mascarpone's "applicative" organization (like XY or Joy;
that is, like an idealized version of FORTH) lets it avoid some of the
referential issues like names and environments, and gives a nice direct
one-symbol-one-operation correspondence.

Because Mascarpone has interpreters as first-class values, it is never
obliged to make the guts of the running interpreter explicit during
reification; it just needs to make that interpreter available as a
value. The contract of the `@` operation (which, by the way, was a
somewhat late addition to the language design, fulfilling the desire for
a dual to `*`) says you get a program and an interpreter with semantics
*equivalent* to the operation you specify, but it doesn't say *how*
they're provided. You could successively perform `@` on an intrinsic
operation (like, say, `@` itself) and get successively more explicit
definitions, written in Mascarpone, of what `@` means. Each one could be
thought of as descending (or ascending? does it matter?) a level in that
infinite tower dealie. Or, you might only get back a single, random
symbol, and an interpreter where all symbols have the semantics of `@`,
with no explanation whatsoever. This inbuilt ambiguity is, I think, the
appropriate level of abstraction for such an operation (in a
meta-circular context, anyway); saying that you always get back the
program you defined the operation with seems overspecified (and unable
to handle the case of intrinsics), and saying that you always get back
something opaque, like a function value, seems quite nonplussing in the
context of an interpreter that can supposedly examine its own structure.
It's not clear to me that either 3-LISP or Brown addresses this point to
this degree.

And of course, neither 3-LISP nor Brown tries to use reification and
deification as a means of achieving Turing-completeness in the absence
of conventional conditional and repetition constructs.

Implementation
--------------

`mascarpone.hs` is a reference interpreter for Mascarpone written in
Haskell. Run the function `mascarpone` on a string, or `demo n` to run
one of the included test cases. `mascarpone.hs` also has a much nicer
debugging facility than `emmental.hs`; you can run `debug` on a string
to view the state of the program (the current instruction, the rest of
the program, the stack, and the current interpreter) at each step of
execution. And you can run `test n` to debug the test cases. Lastly,
there is a `main` function that runs `mascarpone` on a string read from
a file named by the first argument, so a Haskell compiler can be used to
build a stand-alone Mascarpone interpreter from this source code.

Even happier interpreter-redefining! \
Chris Pressey \
Chicago, IL \
December 8, 2007