git @ Cat's Eye Technologies Mascarpone / b00a548
Add README, experimental. catseye 13 years ago
1 changed file(s) with 325 addition(s) and 0 deletion(s).
0 The Mascarpone Programming Language
1 ===================================
2
3 Language version 1.0. Distribution version 2007.1208.\
4 Chris Pressey, Cat's Eye Technologies
5
6 *You are lost in a twisty maze of meta-circular interpreters, all
7 alike.*
8
9 Introduction
10 ------------
11
12 Mascarpone is a self-modifying programming language in the style of
13 [Emmental](/projects/emmental/). In fact it is a rationalization and
14 further exploration of some of the basic ideas behind Emmental. In
15 Mascarpone, meta-circular interpreters are "first-class objects": they
16 can be pushed onto the stack, have operations extracted from and
17 installed into them, and can themselves be meta-circularly extracted
18 from the language environment ("reified") or installed into it
19 ("deified.") New operations can be defined as strings of symbols, and
20 these symbols are given meaning by an interpreter that is "captured" in
21 the definition, similar to the way that lexical variables are captured
22 in closures in functional languages. An operation may also access, and
23 modify, the interpreter that invoked it.
24
25 Like Emmental, Mascarpone relies on meta-circular
26 interpreter-modification to achieve Turing-completeness. Unlike
27 Emmental, Mascarpone is purely symbolic; there are no arithmetic
28 instructions.
29
30 Stack
31 -----
32
33 Like Emmental, Mascarpone is a stack-based language. Unlike Emmental,
34 Mascarpone's stack may contain things other than symbols. A stack
35 element in Mascarpone may be a symbol, an operation, or an interpreter.
36
37 Strings are popped off Mascarpone's stack slightly differently than
38 Emmental's. A string begins with the symbol `]` on the stack; this is
39 popped and discarded. Symbols are then successively popped and prepended
40 to a growing string. As further `]`'s are encountered, they too are
41 prepended to the string, but the nesting level is incremented for each
42 one as well. Whenever a `[` is encountered, it is prepended to the
43 string and the nesting level is decremented, unless it is zero, in which
44 case the `[` is discarded and the string is complete. The net effect of
45 all this futzing around is that `[]` work as nestable quoting symbols.
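
The popping procedure above can be sketched in a few lines of Python. This is only an illustration, not the reference implementation: the name `pop_string` is hypothetical, the stack is modelled as a Python list with its top at the end, and every element is assumed to be a single-character symbol.

```python
def pop_string(stack):
    """Pop one bracket-delimited string off `stack` (top of stack = end of list)."""
    if not stack or stack.pop() != "]":
        raise ValueError("a string must begin with ']' on top of the stack")
    symbols = []  # gathered in pop order, so reversed before returning
    depth = 0     # nesting level of brackets seen inside the string
    while stack:
        sym = stack.pop()
        if sym == "]":
            depth += 1
        elif sym == "[":
            if depth == 0:
                # The outermost '[' is discarded and the string is complete.
                return "".join(reversed(symbols))
            depth -= 1
        symbols.append(sym)
    raise ValueError("ran out of stack before the matching '['")
```

Appending each popped symbol and reversing once at the end has the same effect as prepending to a growing string.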
46
47 Also unlike Emmental, Mascarpone does not have a queue.
48
49 Meta-circular Interpreters
50 --------------------------
51
52 The idea of an interpreter in Mascarpone is similar to that in Emmental.
53 In Mascarpone, an interpreter is a map that takes symbols to operations,
54 and an operation is a sequence of symbols that is given meaning by some
55 interpreter.
56
57 Of course, this is a circular definition, but that doesn't seem
58 unreasonable, since we're working with meta-circular interpreters. If
59 you like, you can think of it as forming an "infinite tower of
60 meta-circular interpreters," but that's never been a really satisfying
61 explanation for me. As I explained in the Emmental documentation, I
62 think you need some source of understanding external to the definition
63 in order to make complete sense of a meta-circular interpreter. (I also
64 happen to think that humans have some sort of innate understanding of
65 interpretation — that is, language — so that this demand for further
66 understanding doesn't recurse forever.)
67
68 There is a special interpreter in Mascarpone called "null". It is an
69 error to try to interpret anything with this interpreter. Expect that
70 any program that tries to do this will come crashing to a halt, or will
71 spin off into space and never be heard from again, or something equally
72 impressive.
73
74 Every interpreter (except for null) is linked to a "parent" interpreter
75 (which may be null.) No interpreter can be its own ancestor; the
76 parent-child relationships between interpreters form a directed, acyclic
77 graph (or DAG.)
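
One way to picture these definitions is as two mutually referring record types. The sketch below is an assumption-laden model, not the reference implementation: the names `Operation`, `Interpreter`, and `ancestors` are hypothetical, `None` stands in for the null interpreter, and intrinsic operations (which have no defining string) are left out.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Operation:
    program: str                     # the string of symbols defining the operation
    interp: Optional["Interpreter"]  # interpreter that gives those symbols meaning


@dataclass(frozen=True)
class Interpreter:
    table: Mapping[str, Operation]           # symbol -> operation
    parent: Optional["Interpreter"] = None   # None models the null interpreter


def ancestors(interp):
    """Return the parents of `interp`, nearest first, stopping at null."""
    chain = []
    current = interp.parent
    while current is not None:
        chain.append(current)
        current = current.parent
    return chain
```

Because the records are immutable, a parent must exist before any child that refers to it, so in this model no interpreter can end up as its own ancestor.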
78
79 There is, at any given time in a Mascarpone program, a current interpreter: this
80 is the interpreter that is in force, that is being used to interpret
81 symbols. The parent interpreter of the current interpreter is generally
82 the interpreter that was used to execute the current operation (that is,
83 the operation currently being interpreted; it consists of a string of
84 symbols that is interpreted by the current interpreter.)
85
86 The current interpreter when any top-level Mascarpone program begins is
87 the initial Mascarpone interpreter, which is described in English in the
88 next section.
89
90 Initial Mascarpone Interpreter
91 ------------------------------
92
93 `v` ("reify") pushes the current interpreter onto the stack.
94
95 `^` ("deify") pops an interpreter from the stack and installs it as the
96 current interpreter.
97
98 `>` ("extract") pops a symbol from the stack, then pops an interpreter.
99 It pushes onto the stack the operation associated with that symbol in
100 that interpreter.
101
102 `<` ("install") pops a symbol from the stack, then an operation, then an
103 interpreter. It pushes onto the stack a new interpreter which is the
104 same as the given interpreter, except that in it, the given symbol is
105 associated with the given operation.
106
107 `{` ("get parent") pops an interpreter from the stack and pushes its
108 parent interpreter onto the stack.
109
110 `}` ("set parent") pops an interpreter i from the stack, then pops an
111 interpreter j. It pushes a new interpreter which is the same as i,
112 except that its parent interpreter is j.
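
The four operations `>`, `<`, `{`, and `}` all treat interpreters as values: none of them changes an interpreter in place. A minimal Python sketch of that reading follows. The names are hypothetical, operations are modelled as opaque values, the stack is a list with its top at the end, and `extract` assumes the symbol is present in the table.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Interp:
    table: dict                         # symbol -> operation (opaque here)
    parent: Optional["Interp"] = None   # None models the null interpreter


def extract(stack):     # `>`: pop a symbol, then an interpreter
    sym = stack.pop()
    interp = stack.pop()
    stack.append(interp.table[sym])


def install(stack):     # `<`: pop a symbol, then an operation, then an interpreter
    sym = stack.pop()
    op = stack.pop()
    interp = stack.pop()
    # Build a new interpreter; the one that was popped is left unchanged.
    stack.append(replace(interp, table={**interp.table, sym: op}))


def get_parent(stack):  # `{`: pop an interpreter, push its parent
    stack.append(stack.pop().parent)


def set_parent(stack):  # `}`: pop i, then j; push a copy of i whose parent is j
    i = stack.pop()
    j = stack.pop()
    stack.append(replace(i, parent=j))
```

The pop orders follow the descriptions above literally, so for `<` the interpreter must be pushed first and the symbol last.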
113
114 `*` ("create") pops an interpreter from the stack, then a string. It
115 creates a new operation defined by how that interpreter would interpret
116 that string of symbols, and pushes that operation onto the stack.
117
118 `@` ("expand") pops an operation from the stack and pushes a program
119 string, then pushes an interpreter, such that the semantics of running
120 the program string with the interpreter is identical to the semantics of
121 executing the operation. (Note that the program need not be the one that
122 the operation was defined with, only *equivalent* to it, under the given
123 interpreter; this allows one to sensibly expand "intrinsic" operations
124 like those in the initial Mascarpone interpreter.)
125
126 `!` ("perform") pops an operation from the stack and executes it.
127
128 `0` ("null") pushes the null interpreter onto the stack.
129
130 `1` ("uniform") pops an operation from the stack and pushes back an
131 interpreter where all symbols are associated with that operation.
132
133 `[` ("deepquote") pushes a `[` symbol onto the stack and enters "nested
134 quote mode", which is really another interpreter. In nested quote mode,
135 each symbol is interpreted as an operation which pushes that symbol onto
136 the stack. In addition, the symbols `[` and `]` have special additional
137 meaning: they nest. When a `]` matching the first `[` is encountered,
138 nested quote mode ends, returning to the interpreter previously in
139 effect.
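
To make the bookkeeping concrete, here is a small Python sketch of nested quote mode. It is an illustration only: the function name `deepquote` is hypothetical, the mode is modelled as a loop rather than as a separate interpreter, and it assumes that the closing `]` is pushed like any other symbol (which is what lets the quoted text later be popped off the stack as a string).

```python
def deepquote(program, start, stack):
    """Quote from the '[' at program[start]; return the index after its matching ']'."""
    if program[start] != "[":
        raise ValueError("nested quote mode starts at a '['")
    stack.append("[")
    depth = 0            # how many inner '[' are currently open
    pos = start + 1
    while pos < len(program):
        sym = program[pos]
        stack.append(sym)  # every symbol, brackets included, pushes itself
        pos += 1
        if sym == "[":
            depth += 1
        elif sym == "]":
            if depth == 0:
                return pos  # matching ']' reached: quote mode ends
            depth -= 1
    raise ValueError("unterminated '['")
```

After the call, the stack holds the whole bracketed text, and execution would resume at the returned index under the interpreter previously in effect.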
140
141 `'` ("quotesym") switches to "single-symbol quote mode", which is really
142 yet another interpreter. In single-symbol quote mode, each symbol is
143 interpreted as an operation which pushes that symbol onto the stack,
144 then immediately ends single-symbol quote mode, returning to the
145 interpreter previously in effect.
146
147 `.` pops a symbol off the stack and sends it to the standard output.
148
149 `,` waits for a symbol to arrive on standard input, and pushes it onto
150 the stack.
151
152 `:` duplicates the top element of the stack.
153
154 `$` pops the top element of the stack and discards it.
155
156 `/` swaps the top two elements of the stack.
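
As a rough illustration of the simpler operations, the Python sketch below runs a small subset of the initial interpreter: `'`, `:`, `$`, `/`, and `.`. It is not the reference implementation. The function name `run` is hypothetical, single-symbol quote mode is modelled as a flag rather than as a separate interpreter, output is collected into a string instead of being written to standard output, and every symbol outside this subset is simply skipped.

```python
def run(program):
    """Run a tiny subset of Mascarpone; return (final stack, output text)."""
    stack, output = [], []
    quoting = False  # True while in single-symbol quote mode
    for sym in program:
        if quoting:
            stack.append(sym)   # the quoted symbol pushes itself...
            quoting = False     # ...and quote mode ends immediately
        elif sym == "'":
            quoting = True
        elif sym == ":":
            stack.append(stack[-1])                       # duplicate
        elif sym == "$":
            stack.pop()                                   # discard
        elif sym == "/":
            stack[-1], stack[-2] = stack[-2], stack[-1]   # swap
        elif sym == ".":
            output.append(stack.pop())                    # output
        # Symbols outside this subset are ignored in this sketch.
    return stack, "".join(output)
```

For instance, `run("'a'b/..")` pushes `a` and `b`, swaps them, and then outputs the top of the stack twice.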
157
158 Discussion
159 ----------
160
161 ### Design decisions
162
163 As you can see, Mascarpone's semantics and initial operations are a lot
164 less "fugly" than Emmental's. It's a more expressive language, in that
165 it's easier to elegantly convey things involving interpreters and
166 meta-circularity in Mascarpone than it is in Emmental. It explores at
167 least one idea that I explicitly mentioned in the Emmental documentation
168 that I'd like to explore, namely, having multiple meta-circular
169 interpreters and being able to switch between them (and lo and behold,
170 Mascarpone has very well-developed `[]` and `'` operations.) It's also
171 "prettier" in that there's more attention paid to providing duals of
172 operations (both `*` and `@`, for example.)
173
174 Mascarpone also appears to be Turing-complete, despite the lack of
175 explicit conditional, repetition, and arithmetic operators. A cyclic
176 meaning can be expressed by an operation which examines its own
177 definition from the parent interpreter of the current interpreter and
178 re-uses it. A conditional can be formed by creating a new interpreter in
179 which one symbol, say `S`, maps to an operation which does something,
180 and in which all other symbols do something else; executing a symbol in
181 this interpreter is tantamount to testing if that symbol is `S`.
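
The conditional described here amounts to a uniform interpreter with one entry overridden. A minimal Python sketch, under stated assumptions: the names `Interp`, `uniform`, `install`, and `make_test` are hypothetical, operations are modelled as plain Python callables, and an interpreter is modelled as a default operation plus a table of overrides.

```python
class Interp:
    """A total symbol -> operation map: explicit overrides plus a default."""

    def __init__(self, default, overrides=None):
        self.default = default
        self.overrides = dict(overrides or {})

    def lookup(self, sym):
        return self.overrides.get(sym, self.default)


def uniform(op):
    """Like `1`: every symbol is associated with `op`."""
    return Interp(default=op)


def install(interp, sym, op):
    """Like `<`: a new interpreter that differs only at `sym`."""
    return Interp(interp.default, {**interp.overrides, sym: op})


def make_test(sym, if_equal, otherwise):
    """Build an interpreter that in effect asks: is this symbol `sym`?"""
    return install(uniform(otherwise), sym, if_equal)


is_s = make_test("S", lambda: "matched", lambda: "did not match")
```

Looking up a symbol in `is_s` and performing the result takes one branch for `S` and the other branch for every other symbol.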
182
183 "But", you point out, "Mascarpone only has one stack! You need at least
184 two stacks in order to simulate a Turing machine's tape." Actually,
185 Mascarpone *does* have another, less obvious stack: each interpreter has
186 a parent interpreter. By getting the current interpreter, modifying it,
187 setting its parent to be the current interpreter, and setting it as the
188 current interpreter (in Mascarpone: `v`...`v}^`), we "push" something
189 onto it; by getting the current interpreter, getting its parent, and
190 setting that as the current interpreter (`v{^`), we "pop".
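
Seen this way, the chain of parent links is a linked list used as a stack. The Python sketch below models only the prose description of "push" and "pop", not the exact symbol sequences; the names `Interp`, `push`, and `pop` are hypothetical, and the `label` field stands in for whatever modification distinguishes one interpreter from another.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Interp:
    label: str                          # stands in for the interpreter's contents
    parent: Optional["Interp"] = None   # None models the null interpreter


def push(current, modified):
    """Make `modified` the current interpreter, with the old one as its parent."""
    return replace(modified, parent=current)


def pop(current):
    """Make the parent of the current interpreter the current interpreter."""
    return current.parent


base = Interp("base")
current = push(base, Interp("frame-1"))
current = push(current, Interp("frame-2"))
```

Popping twice from `current` walks back through `frame-1` to `base`, in last-in, first-out order.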
191
192 Actually, even if there were no explicit parent-child relationship
193 between interpreters, we'd still be able to store a stack of
194 interpreters, because each operation in an interpreter has its own
195 interpreter that gives meaning to the symbols in that operation, and
196 *that* interpreter can contain operations that can contain interpreters,
197 etc., etc., ad infinitum. This isn't a very classy way to do it, but
198 it's very reminiscent of how structures can be built in the lambda
199 calculus by trapping abstractions in other abstractions.
200
201 It's also worth noting that this is how you'd have to accomplish
202 arithmetic, with something like Church numerals done with interpreters
203 and operations, since Mascarpone has nothing but symbols. On the plus
204 side, this means Mascarpone, unlike Emmental, is highly independent of
205 character set or encoding — it doesn't even have to be ordered. Any set
206 of symbols that contains the symbols of the initial Mascarpone
207 interpreter, plus the symbols appearing in the Mascarpone program being
208 executed, plus the symbols that are desired for input and output, ought
209 to suffice.
210
211 Actually, that's not quite true: it should be a *finite* set. This is
212 mainly for the sake of the definition of the `'` operator: it switches
213 to an interpreter where all symbols indicate operations that push that
214 symbol on the stack. From this we can infer that there should either be
215 a finite number of such operations (and thus symbols,) or somehow these
216 operations know what symbol they are to push. They take the symbol that
217 invoked them as an argument, perhaps. But other operations in Mascarpone
218 do not have such capabilities: an operation need not even be invoked by
219 a symbol, as it could be invoked by the `!` operation, for instance.
220 That would make the operations in the `'` interpreter gratuitously
221 special. And, practically, most character sets, on which sets of symbols
222 are based, are finite, so I don't suppose this restriction is much of a
223 problem.
224
225 One further, somewhat related design decision deserves mention. Any
226 symbol which is not defined in the initial interpreter is interpreted as
227 a no-op. It probably would have been nicer to treat it as an explicit
228 error-causing operation. This could be extended to looking, inside each
229 putative definition, for symbols undefined in the desired interpreter
230 when executing a `*` operation, and causing a (preferably intelligible)
231 error early in that case. Semantics like this would have helped me save
232 time in debugging one or two of the test case programs. However, while
233 Mascarpone is arguably supposed to be less hostile than Emmental when it
234 comes to being programmed in, it's certainly still not what you'd call a
235 mainstream programming language, so while I'm somewhat irked by this
236 deficiency, I hardly consider it a show-stopper.
237
238 ### Related Work
239
240 There are definitely two related works that are worth mentioning: Brian
241 Cantwell Smith's Ph.D. thesis "Procedural Reflection in Programming
242 Languages" (MIT, 1982,) and Friedman and Wand's paper "Reification:
243 Reflection without Metaphysics" (ACM LISP conference, 1984.) (Forgive me
244 for not giving proper, perfectly-formatted, Turabianly-correct
245 references to these two works, but frankly, this is the age of the
246 Internet: if you're interested in either of these papers, and you can't
247 find them, there's something wrong with you! If, on the other hand, you
248 don't have *access* to them, perhaps there's something wrong with the
249 institutions whose assumed goal is to increase the amount of human
250 knowledge — but not, it seems, to widen its availability.)
251
252 It's hard to say how much influence Smith's 3-LISP language and Friedman
253 and Wand's Brown language (introduced in the respective papers) have had
254 on Mascarpone: probably some, since I had read both of them (well, not
255 *all* of Smith's monster! but enough of it to grasp the main ideas, I
256 think) and thought about what they were trying to convey. (What Brown
257 calls "reflection" I've called "deification" to give a sort of
258 phonological dual to "reification". Also, the term "reflection" seems to
259 have taken on a more general meaning in computer science since the
260 '80s, so I wanted to avoid its use here.) But that was a couple of
261 years previous, and the subject of meta-circular interpreters came up
262 this time from a different angle; Mascarpone came primarily from trying
263 to "un-knot" the ideas behind Emmental, which itself came to be, quite
264 indirectly, from thinking about issues raised by John Reynolds' original
265 work on meta-circularity.
266
267 Certainly a huge difference that sets Mascarpone apart is that 3-LISP
268 and Brown are caught up in the whole LISP/Scheme thing, so they just use
269 S-expressions and functions to represent reified interpreter parts,
270 which include environments and continuations. Mascarpone, on the other
271 hand, reifies whole interpreters at once, as values which are complete
272 interpreters. Because interpreters contain operations which contain
273 interpreters ("ad infinitum", one might think,) this approach seems to
274 highlight the meta-circularity in a way that is particularly striking.
275 In addition, Mascarpone's "applicative" organization (like XY or Joy;
276 that is, like an idealized version of FORTH) lets it avoid some of the
277 referential issues like names and environments, and gives a nice direct
278 one-symbol-one-operation correspondence.
279
280 Because Mascarpone has interpreters as first-class values, it is never
281 obliged to make the guts of the running interpreter explicit during
282 reification; it just needs to make that interpreter available as a
283 value. The contract of the `@` operation (which, by the way, was a
284 somewhat late addition to the language design, fulfilling the desire for a
285 dual to `*`) says you get a program and an interpreter with semantics
286 *equivalent* to the operation you specify, but it doesn't say *how*
287 they're provided. You could successively perform `@` on an intrinsic
288 operation (like, say, `@` itself) and get successively more explicit
289 definitions, written in Mascarpone, of what `@` means. Each one could be
290 thought of as descending (or ascending? does it matter?) a level in that
291 infinite tower dealie. Or, you might only get back a single, random
292 symbol, and an interpreter where all symbols have the semantics of `@`,
293 with no explanation whatsoever. This inbuilt ambiguity is, I think, the
294 appropriate level of abstraction for such an operation (in a
295 meta-circular context, anyway;) saying that you always get back the
296 program you defined the operation with seems overspecified (and unable
297 to handle the case of intrinsics,) and saying that you always get back
298 something opaque, like a function value, seems quite nonplussing in the
299 context of an interpreter that can supposedly examine its own structure.
300 It's not clear to me that either 3-LISP or Brown addresses this point to
301 this degree.
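
The least explicit answer permitted by this contract, a single symbol plus an interpreter in which every symbol means the operation, can be sketched as follows. This is one possible reading rather than what the reference implementation necessarily does: the names `Uniform`, `Intrinsic`, `Defined`, and `expand` are hypothetical, and `"?"` is an arbitrary choice of symbol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uniform:
    """Interpreter in which every symbol maps to the same operation."""
    op: object

    def lookup(self, sym):
        return self.op


@dataclass(frozen=True)
class Intrinsic:
    name: str        # built in: there is no defining string to hand back


@dataclass(frozen=True)
class Defined:
    program: str     # string of symbols the operation was created from
    interp: object   # interpreter captured when it was created


def expand(op):
    """Return (program, interpreter) with the same meaning as performing `op`."""
    if isinstance(op, Defined):
        return op.program, op.interp
    # Intrinsic: any single symbol will do, since under the uniform
    # interpreter every symbol means exactly `op`.
    return "?", Uniform(op)
```

Both branches satisfy the equivalence requirement; they differ only in how much of the operation's structure they reveal.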
302
303 And of course, neither 3-LISP nor Brown tries to use reification and
304 deification as a means of achieving Turing-completeness in the absence
305 of conventional conditional and repetition constructs.
306
307 Implementation
308 --------------
309
310 `mascarpone.hs` is a reference interpreter for Mascarpone written in
311 Haskell. Run the function `mascarpone` on a string, or `demo n` to run
312 one of the included test cases. `mascarpone.hs` also has a much nicer
313 debugging facility than `emmental.hs`; you can run `debug` on a string
314 to view the state of the program (the current instruction, the rest of
315 the program, the stack, and the current interpreter) at each step of
316 execution. And you can run `test n` to debug the test cases. Lastly,
317 there is a `main` function that runs `mascarpone` on a string read from
318 a file named by the first argument, so a Haskell compiler can be used to
319 build a stand-alone Mascarpone interpreter from this source code.
320
321 Even happier interpreter-redefining! \
322 Chris Pressey \
323 Chicago, IL \
324 December 8, 2007