0 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
1 <!-- encoding: UTF-8 -->
2 <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
3 <head>
4 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
5 <title>The Didigm Reflective Cellular Automaton</title>
6 <!-- begin html doc dynamic markup -->
7 <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
8 <script type="text/javascript" src="/scripts/documentation.js"></script>
9 <!-- end html doc dynamic markup -->
10 </head>
11 <body>
12
13 <h1>The Didigm Reflective Cellular Automaton</h1>
14
15 <p>November 2007, Chris Pressey, Cat's Eye Technologies</p>
16
17 <h2>Introduction</h2>
18
19 <p>Didigm is a <em>reflective cellular automaton</em>. What I mean to
20 impart by this phrase is that it is a cellular automaton where the
21 transition rules are given by the very patterns of cells that exist
22 in the playfield at any given time.</p>
23
24 <p>Perhaps another way to think of Didigm is:
25 Didigm =
26 <a href="/projects/alpaca/">ALPACA</a> +
27 <a href="/projects/ypsilax/">Ypsilax</a>.</p>
28
29 <h2>Didigm as Parameterized Language</h2>
30
31 <p>Didigm is actually a parameterized language. A parameterized
32 language is a schema for specifying a set of languages, where a specific language
33 can be obtained by supplying one or more parameters. For example,
34 <a href="/projects/xigxag/">Xigxag</a> is a parameterized language,
35 where the direction of the scanning and the direction of the building
36 of new states are parameters. Didigm "takes" a single parameter, an integer.
37 This parameter determines how many <em>colours</em> (number of possible
38 states for any given cell) the cellular automaton has. This parameter
39 defaults to 8. So when we say Didigm, we actually mean
40 Didigm(8), but there are an infinite number of other possible
41 languages, such as Didigm(5) and Didigm(70521).</p>
42
43 <p>The languages Didigm(0), Didigm(-1), Didigm(-2)
44 and so forth are probably nonsensical aberrations, but I'll leave that
45 question for the philosophers to ponder.
46 Didigm(1) is at least well-defined, but it's trivial.
47 Didigm(2) and Didigm(3) are semantically problematic for
48 more technically interesting reasons (Didigm(3) might be CA-universal,
49 but Didigm(2) probably isn't.)
50 Didigm(4) and above are easily shown to be CA-universal.</p>
51
52 <p>
53 (I say CA-universal, and not Turing-complete, because technically
54 cellular automata cannot simulate Turing machines without some
55 extra machinery: TMs can halt, but CAs can't. Since I don't want
56 to deal with defining that extra machinery in Didigm, it's simpler to
57 avoid it for now.)
58 </p>
59
60 <p>Colours are typically numbered. However, this is not meant to imply
61 an ordering between colours. The eight colours of Didigm
62 are typically referred to as 0 to 7.</p>
63
64 <h2>Language Description</h2>
65
66 <h3>Playfield</h3>
67
68 <p>The Didigm playfield, called <em>le monde</em>,
69 is considered unbounded, like most cellular automaton
70 playfields, but there is one significant difference.
71 There is a horizontal division in this playfield, splitting
72 it into regions called <em>le ciel</em>, on top,
73 and <em>la terre</em>, below. This division is
74 distinguishable — meaning, it must be possible to tell which region
75 a given cell is in — but it need not have a presence beyond that.
76 Specifically, this division lies on the edges between cells, rather than in
77 the cells themselves. It has no "substance" and need not be visible
78 to the user. (The Didigm Input Format, below, describes how it
79 may be specified in textual input files.)</p>
80
81 <h3>Magic Colours</h3>
82
83 <p>Each region of the division has a distinguished colour which is
84 called the <em>magic colour</em> of that region. The magic colour of le ciel is
85 colour 0. The magic colour of la terre is colour 7.
86 (In Didigm(<var>n</var>),
87 the magic colour of la terre is colour <var>n</var>-1.)</p>
88
89 <h3>Transition Rules</h3>
90
91 <h4>Definition</h4>
92
93 <p>Each transition rule of the cellular automaton is not fixed,
94 rather, it is given by certain forms that are present in the playfield.</p>
95
96 <p>Such a form is called <em>une salle</em> and has the following configuration.
97 Two horizontally-adjacent cells of the magic colour abut a cell of the <em>destination
98 colour</em> to the right. Two cells below the rightmost magic-colour
99 cell is the cell of the <em>source colour</em>; it is surrounded by cells of
100 any colour called the <em>determiners</em>.</p>
101
102 <p>This is perhaps better illustrated than explained. In the following diagram,
103 the magic colour is 0 (this salle is in le ciel,)
104 the source colour is 1, the destination colour is 2, and the determiners are
105 indicated by D's.</p>
106
107 <pre>
108 002
109 DDD
110 D1D
111 DDD
112 </pre>
113
114 <h4>Application</h4>
115
116 <p>Salles are interpreted as transition rules as follows. When the colour
117 of a given cell is the same as the source colour of some salle, and when
118 the colours of all the cells surrounding that cell are the exact same
119 colours (in the exact same pattern) as the determiners of that salle,
120 we say that that salle <em>matches</em> that cell. When any cell is
121 matched by some salle in the other region, we say that that salle
122 <em>applies</em> to that cell, and that cell is replaced
123 by a cell of the destination colour of that salle.</p>
124
125 <p>"The other region" refers, of course, to the region that is not the
126 region in which the cell being transformed resides. Salles in la terre
127 only apply to cells in le ciel and vice-versa. This complementarity
128 serves to limit the amount of chaos: if there was some salle that applied to
129 <em>all</em> cells, it would apply directly to the cells that made up that
130 salle, and that salle would be immediately transformed.</p>
131
132 <p>On each "tick" of the cellular automaton, all cells are checked to
133 find the salle that applies to them, and then all are transformed, simultaneously,
134 resulting in the next configuration of le monde.</p>
135
136 <p>There is a "default" transition rule which also serves to limit
137 the amount of chaos: if no salle applies to a cell, the colour of that
138 cell does not change.</p>
139
140 <p>Salles may overlap. However, no salle may straddle the horizon.
141 (That is, each salle must be either completely in le ciel or completely
142 in la terre.)</p>
143
144 <p>Salles may conflict (i.e. two salles may have the same source colour
145 and determiners, but different destination colours.) The behaviour in
146 this case is defined to be uniformly random: if there are <var>n</var>
147 conflicting salles, each has a 1/<var>n</var> chance of being the
148 one that applies.</p>
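
<p>To make these rules concrete, here is a rough sketch in Python of how a
single tick might be carried out. Since no reference implementation exists
(see History, below), the representation (a dictionary mapping coordinates to
colours, with a background colour and a fixed horizon coordinate) and every
name in this sketch are merely my own guesses at one plausible encoding, not
part of the language definition.</p>

<pre>
import random

# Sketch only: Didigm(8), with la terre being the half-plane y >= HORIZON
# and le ciel being everything above it.  Le monde is a dict mapping (x, y)
# to a colour; cells not stored in the dict read as BACKGROUND.
N_COLOURS, HORIZON, BACKGROUND = 8, 0, 0

def colour(monde, x, y):
    return monde.get((x, y), BACKGROUND)

def region(y):
    return 'terre' if y >= HORIZON else 'ciel'

def magic(reg):
    # The magic colour of le ciel is 0; that of la terre is n-1.
    return 0 if reg == 'ciel' else N_COLOURS - 1

def find_salles(monde, reg):
    # A salle: two magic-colour cells side by side, the destination colour
    # to their right, and the source cell two rows below the rightmost
    # magic cell, surrounded by eight determiners.  The whole salle must
    # lie within the region reg.  (For brevity, only explicitly stored
    # cells are tried as the top-left corner of a salle.)
    salles = []
    for (x, y) in monde:
        span = [(x + dx, y + dy) for dy in range(4) for dx in range(3)]
        if any(region(sy) != reg for _, sy in span):
            continue
        if colour(monde, x, y) == magic(reg) and \
           colour(monde, x + 1, y) == magic(reg):
            dest = colour(monde, x + 2, y)
            source = colour(monde, x + 1, y + 2)
            dets = [colour(monde, x + dx, y + dy)
                    for dy in (1, 2, 3) for dx in (0, 1, 2)
                    if (dx, dy) != (1, 2)]
            salles.append((source, dets, dest))
    return salles

def tick(monde):
    ciel, terre = find_salles(monde, 'ciel'), find_salles(monde, 'terre')
    nouveau = {}
    for (x, y) in monde:
        # Salles in one region apply only to cells in the other region.
        rules = terre if region(y) == 'ciel' else ciel
        around = [colour(monde, x + dx, y + dy)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dx, dy) != (0, 0)]
        applicable = [dest for (source, dets, dest) in rules
                      if source == colour(monde, x, y) and dets == around]
        # Conflicting salles are resolved uniformly at random; if no salle
        # applies, the cell keeps its colour.
        nouveau[(x, y)] = (random.choice(applicable) if applicable
                           else colour(monde, x, y))
    return nouveau
</pre>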
149
150 <h2>Didigm Input Format</h2>
151
152 <p>I'd like to give some examples, but first I need a format to give them in.</p>
153
154 <p>A Didigm Input File is a text file. The textual digit symbols
155 <code>0</code> through <code>9</code> indicate cells of colours 0 through 9.
156 Further colours may be indicated by enclosing a decimal digit string in square brackets,
157 for example <code>[123]</code>. This digit string may contain leading
158 zeros, in order for columns to line up nicely in the file.</p>
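
<p>For illustration only (this sketch and its names are mine, not part of the
format), a single row in this notation could be tokenized like so:</p>

<pre>
import re

# Plain digits are cells of colours 0 to 9; a bracketed decimal string such
# as [010] is a single cell of colour 10 (leading zeros are permitted).
def parse_row(line):
    return [int(token.strip('[]'))
            for token in re.findall(r'\[\d+\]|\d', line)]

assert parse_row('0[010]7') == [0, 10, 7]
</pre>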
159
160 <p>A line containing only a <code>,</code> symbol in the leftmost column
161 indicates the division between le ciel and la terre. This line does not
162 become part of the playfield.</p>
163
164 <p>A line beginning with a <code>=</code> is a directive of some sort.</p>
165
166 <p>A line beginning with <code>=C</code> followed by a colour indicator
167 indicates how many colours (the <var>n</var> in Didigm(<var>n</var>))
168 this playfield contains. This directive may only occur once.</p>
169
170 <p>A line beginning with <code>=F</code> followed by a colour indicator
171 as described above, indicates that the unspecified (and unbounded) remainder
172 of le ciel or la terre (whichever side of <code>,</code> the directive is on)
173 is to be considered filled with cells of the given colour.</p>
174
175 <p>Of course, an application which implements Didigm with some
176 alternate means of specifying le monde, for example a graphical user interface,
177 need not understand the Didigm Input Format.</p>
178
179 <h2>Examples</h2>
180
181 <p>Didigm is immediately seen to be CA-universal, in that you can
182 readily (and stably) express a number of known CA-universal cellular automata in it.
183 For example, to express John Conway's Life, you could say that
184 colour 1 means "alive" and colour 2 means "dead", and compose something like</p>
185
186 <pre>
187 002002001001
188 222122112212 <i>... and so on ...</i>
189 212212212121 <i>... for all 256 ...</i>
190 222222222222 <i>... rules of Life ...</i>
191 =F3
192 ,
193 =F2
194 22222
195 21222
196 21212
197 21122
198 22222
199 </pre>
200
201 <p>Because the magic colour 7 never appears in la terre, there are no salles
202 in la terre, and so all of le ciel always stays the same.</p>
203
204 <p>There are of course simpler CA's that are apparently CA-universal and
205 that could be described more compactly. But more interesting (to me)
206 is the possibility for making reflective CA's.</p>
207
208 <p>To do this in an uncontrolled fashion is easy. We just stick some salles
209 in le ciel, some salles in la terre, and let 'er rip. Unfortunately, in general,
210 les salles in each region will probably quickly damage enough of the salles in
211 the other region that le monde will become sterile soon enough.</p>
212
213 <p>A rudimentary example of something a little more orchestrated follows.</p>
214
215 <pre>
216 3333333333333
217 3002300230073
218 3111311132113
219 3311321131573
220 3111311131333
221 3333333333333
222 =F3
223 ,
224 =F1
225 111111111111111
226 111111131111111
227 111111111111574
228 111111111111333
229 311111111111023
230 111111111111113
231 </pre>
232
233
234 <p>The intent of this is that the 3's in la terre initially grow
235 streams of 2's to their right, due to the leftmost two salles in le ciel.
236 However, when the top stream of 2's reaches the cell just above and to the
237 left of the 5, the third salle in le ciel matches and turns the 5 into a
238 7, forming une salle dans la terre. This salle turns every 2 to the right of
239 a 0 in le ciel into a 4, thus modifying two of les salles in le ciel.
240 The result of these modified salles is to turn the bottom stream of 2's
241 into a stream of 4's halfway along.</p>
242
243 <p>This is at least predictable, but it still becomes uninteresting fairly quickly.
244 Also note that it's not just the isolated 3's in la terre that grow
245 streams of 2's to the right: the 3's on the right side of la salle
246 would, too. This could be rectified by having a wall of some other colour
247 on that side of la salle, and I'm sure you could extend this example
248 by having something else happen when the stream of 4's hit the 0 in that
249 salle, but you get the picture. Creating a neat and tidy and long-lived
250 reflective cellular automaton requires at least as much care as
251 constructing a "normal" cellular automaton, and probably in general more.</p>
252
253 <h2>History</h2>
254
255 <p>I came up with the concept of a reflective cellular automaton
256 (which is, as far as I'm aware, a novel concept) independently on
257 November 1st, 2007, while walking on Felix Avenue, in Windsor, Ontario.</p>
258
259 <p>No reference implementation exists yet. Therefore, all Didigm runs
260 have been thought experiments, and it's entirely possible that I've missed
261 something in its definition that having a working simulator would reveal.</p>
262
263 <p>Happy magic colouring!
264 <br/>Chris Pressey
265 <br/>Chicago, Illinois
266 <br/>November 17, 2007</p>
267
268 </body>
269 </html>
0 Madison
1 =======
2
3 Version 0.1
4 December 2011, Chris Pressey, Cat's Eye Technologies
5
6 Abstract
7 --------
8
9 Madison is a language in which one can state proofs of properties
10 of term-rewriting systems. Classical methods of automated reasoning,
11 such as resolution, are not used; indeed, term-rewriting itself is
12 used to check the proofs. Both direct proof and proof by induction
13 are supported. Induction in a proof must be across a structure which
14 has a well-founded inductive definition. Such structures can be
15 thought of as types, although this is largely nominal; the traditional
16 typelessness of term-rewriting systems is largely retained.
17
18 Term-rewriting
19 --------------
20
21 Madison has at its core a simple term-rewriting language. It is of
22 a common form which should be unsurprising to anyone who has worked
23 at all with term rewriting. A typical simple program contains
24 a set of rules and a term on which to apply those rules. Each
25 rule is a pair of terms; either term of the pair may contain
26 variables, but any variable that appears on the r.h.s. must also
27 appear on the l.h.s. A rule matches a term if its l.h.s. is the same
28 as the term, with the exception of the variables, which are bound
29 to the corresponding subterms; applying a matching rule replaces the
30 term with the r.h.s. of the rule, with the variables expanded
31 appropriately. Rules are applied in an innermost, leftmost fashion to
32 the term, corresponding to eager evaluation. Rewriting terminates
33 when there is no rule whose l.h.s. matches the current incarnation
34 of the term being rewritten.
35
36 A term is either an atom, which is a symbol that stands alone,
37 or a constructor, which is a symbol followed by a comma-separated list
38 of subterms enclosed in parentheses. Symbols may consist of letters,
39 digits, and hyphens, with no intervening whitespace. A symbol is
40 a variable symbol if it begins with a capital letter. Variable
41 symbols may also begin with underscores, but these may only occur
42 in the l.h.s. of a rewrite rule, to indicate that we don't care
43 what value is bound to the variable and we won't be using it on
44 the r.h.s.
45
46 (The way we are using the term "constructor" may be slightly non-
47 standard; in some other sources, this is called a "function symbol",
48 and a "constructor" is a subtly different thing.)
49
50 Because the rewriting language is merely a component (albeit the
51 core component) of a larger system, the aforementioned typical
52 simple program must be cast into some buttressing syntax. A full
53 program consists of a `let` block which contains the rules
54 and a `rewrite` admonition which specifies the term to be re-
55 written. An example follows.
56
57 | let
58 | leftmost(tree(X,Y)) -> leftmost(X)
59 | leftmost(leaf(X)) -> X
60 | in
61 | rewrite leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan)))
62 = alice
63
64 In the above example, there are two rules for the constructor
65 `leftmost/1`. The first is applied to the outer tree to obtain
66 a new leftmost constructor containing the inner tree; the first
67 is applied again to obtain a new leftmost constructor containing
68 the leaf containing `alice`; and the second is applied to that
69 leaf term to obtain just `alice`. At that point, no more rules
70 apply, so rewriting terminates, yielding `alice`.
71
72 Madison is deterministic; if rules overlap, the first one given
73 (syntactically) is used. For this reason, it is a good idea
74 to order rules from most specific to least specific.
75
76 I used the phrase "typical simple program" above because I was
77 trying intentionally to avoid saying "simplest program". In fact,
78 technically no `let` block is required, so you can write some
79 really trivial Madison programs, like the following:
80
81 | rewrite cat
82 = cat
83
84 I think that just about covers the core term-rewriting language.
85 Term-rewriting is Turing-complete, so Madison is too. If you
86 wish to learn more about term rewriting, there are several good
87 books and webpages on the subject; I won't go into it further
88 here.
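
Before moving on, here is a small sketch of the core rewriting semantics
in Python, just to make the description above concrete. Since Madison has
no implementation, the representation (atoms and variables as plain
strings, constructors as ('name', [subterms]) tuples) and all the function
names below are merely my own choices, not part of the language.

    def is_var(t):
        # Variable symbols begin with a capital letter or an underscore.
        return isinstance(t, str) and (t[:1].isupper() or t[:1] == '_')

    def match(pattern, term, env):
        # Try to extend env so that pattern equals term; None on failure.
        if is_var(pattern):
            if pattern.startswith('_'):
                return env                      # don't-care variable
            env.setdefault(pattern, term)
            return env if env[pattern] == term else None
        if isinstance(pattern, str):
            return env if pattern == term else None
        if not (isinstance(term, tuple) and pattern[0] == term[0]
                and len(pattern[1]) == len(term[1])):
            return None
        for p, t in zip(pattern[1], term[1]):
            if match(p, t, env) is None:
                return None
        return env

    def substitute(term, env):
        if is_var(term):
            return env[term]
        if isinstance(term, str):
            return term
        return (term[0], [substitute(t, env) for t in term[1]])

    def rewrite_once(term, rules):
        # Innermost, leftmost: try to rewrite a subterm first, left to
        # right, then try the rules (first match wins) at this position.
        if isinstance(term, tuple):
            for i, sub in enumerate(term[1]):
                new = rewrite_once(sub, rules)
                if new is not None:
                    args = list(term[1])
                    args[i] = new
                    return (term[0], args)
        for lhs, rhs in rules:
            env = match(lhs, term, {})
            if env is not None:
                return substitute(rhs, env)
        return None

    def rewrite(term, rules):
        # Keep rewriting until no rule matches anywhere in the term.
        while True:
            new = rewrite_once(term, rules)
            if new is None:
                return term
            term = new

    # leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan))) gives alice
    rules = [
        (('leftmost', [('tree', ['X', 'Y'])]), ('leftmost', ['X'])),
        (('leftmost', [('leaf', ['X'])]), 'X'),
    ]
    term = ('leftmost', [('tree', [('tree', [('leaf', ['alice']),
                                             ('leaf', ['grace'])]),
                                   ('leaf', ['dan'])])])
    assert rewrite(term, rules) == 'alice'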
89
90 Proof-Checking
91 --------------
92
93 My desire with Madison was to design a language in which you
94 can prove things. Not a full-blown theorem prover -- just a
95 proof checker, where you supply a proof and it confirms either
96 that the proof holds or doesn't hold. (Every theorem prover
97 has at its core a proof checker, but it comes bundled with a lot of
98 extra machinery to search the space of possible proofs cleverly,
99 looking for one which will pass the proof-checking phase.)
100
101 It's no coincidence that Madison is built on top of a term-rewriting
102 language. For starters, a proof is very similar to the execution
103 trace of a term being rewritten. In each of the steps of the proof,
104 the statement to be proved is transformed by replacing some part
105 of it with some equally true thing -- in other words, rewritten.
106 In fact, Post Canonical Systems were an early kind of rewriting
107 system, devised by Emil Post to (as I understand it) illustrate this
108 similarity, and to show that proofs could be mechanically carried out
109 in a rewriting system.
110
111 So: given a term-rewriting language, we can give a trivial kind
112 of proof simply by stating the rewrite steps that *should* occur
113 when a term is rewritten, and check that proof by rewriting the term
114 and confirming that those were in fact the steps that occurred.
115
116 For the purpose of stating these sequences of rewrite steps to be
117 checked, Madison has a `theorem..proof..qed` form. To demonstrate
118 this form, let's use Madison to prove that 2 + 2 = 4, using Peano
119 arithmetic.
120
121 | let
122 | add(s(X),Y) -> add(X,s(Y))
123 | add(z,Y) -> Y
124 | in theorem
125 | add(s(s(z)),s(s(z))) ~> s(s(s(s(z))))
126 | proof
127 | add(s(s(z)),s(s(z)))
128 | -> add(s(z),s(s(s(z)))) [by add.1]
129 | -> add(z,s(s(s(s(z))))) [by add.1]
130 | -> s(s(s(s(z)))) [by add.2]
131 | qed
132 = true
133
134 The basic syntax should be fairly apparent. The `theorem` block
135 contains the statement to be proved. The `~>` means "rewrites
136 in zero or more steps to". So, here, we are saying that 2 + 2
137 (in Peano notation) rewrites, in zero or more steps, to 4.
138
139 The `proof` block contains the actual series of rewrite steps that
140 should be carried out. For elucidation, each step may name the
141 particular rule which is applied to arrive at the transformed term
142 at that step. Rules are named by their outermost constructor,
143 followed by a dot and the ordinal position of the rule in the list
144 of rules. These rule-references are optional, but the fact that
145 the rule so named was actually used to rewrite the term at that step
146 could be checked too, of course. The `qed` keyword ends the proof
147 block.
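
To give a rough idea of what checking such a trace involves, here is a
sketch in Python. Again, Madison has no implementation, so the term
representation (as in the earlier sketch), the function names, and the
simplification of applying each named rule only at the root of the term
(which is all this particular trace needs) are all my own assumptions.

    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def match(pat, term, env):
        if is_var(pat):
            env.setdefault(pat, term)
            return env if env[pat] == term else None
        if isinstance(pat, str):
            return env if pat == term else None
        if not (isinstance(term, tuple) and pat[0] == term[0]
                and len(pat[1]) == len(term[1])):
            return None
        for p, t in zip(pat[1], term[1]):
            if match(p, t, env) is None:
                return None
        return env

    def subst(t, env):
        if is_var(t):
            return env[t]
        if isinstance(t, str):
            return t
        return (t[0], [subst(u, env) for u in t[1]])

    def check_trace(steps, rules):
        # steps: a list of (term, rule_name) pairs, rule_name being None
        # for the first term.  Each term must follow from applying the
        # named rule (at the root) to the previous term.
        for (prev, _), (term, name) in zip(steps, steps[1:]):
            env = match(rules[name][0], prev, {})
            if env is None or subst(rules[name][1], env) != term:
                return False
        return True

    z = 'z'
    def s(x): return ('s', [x])
    def add(x, y): return ('add', [x, y])

    rules = {'add.1': (add(s('X'), 'Y'), add('X', s('Y'))),
             'add.2': (add(z, 'Y'), 'Y')}
    trace = [(add(s(s(z)), s(s(z))), None),
             (add(s(z), s(s(s(z)))), 'add.1'),
             (add(z, s(s(s(s(z))))), 'add.1'),
             (s(s(s(s(z)))), 'add.2')]
    assert check_trace(trace, rules)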
148
149 Naturally, you can also write a proof which does not hold, and
150 Madison should inform you of this fact. 2 + 3, for example,
151 does not equal 4, and it can pinpoint exactly where you went
152 wrong should you come to this conclusion:
153
154 | let
155 | add(s(X),Y) -> add(X,s(Y))
156 | add(z,Y) -> Y
157 | in theorem
158 | add(s(s(z)),s(s(s(z)))) ~> s(s(s(s(z))))
159 | proof
160 | add(s(s(z)),s(s(s(z))))
161 | -> add(s(z),s(s(s(s(z))))) [by add.1]
162 | -> add(z,s(s(s(s(z))))) [by add.1]
163 | -> s(s(s(s(z)))) [by add.2]
164 | qed
165 ? Error in proof [line 6]: step 2 does not follow from applying [add.1] to previous step
166
167 Now, while these *are* proofs, they don't tell us much about the
168 properties of the terms and rules involved, because they are not
169 *generalized*. They say something about a few fixed values, like
170 2 and 4, but they do not say anything about any *infinite*
171 sets of values, like the natural numbers. Now, that would be *really*
172 useful. And, while I could say that what you've seen of Madison so far
173 is a proof checker, it is not a very impressive one. So let's take
174 this further.
175
176 Quantification
177 --------------
178
179 To state a generalized proof, we will need to introduce variables,
180 and to have variables, we will need to be able to say what those
181 variables can range over; in short, we need *quantification*. Since
182 we're particularly interested in making statements about infinite
183 sets of values (like the natural numbers), we specifically want
184 *universal quantification*:
185
186 For all x, ...
187
188 But to have universal quantification, we first need a *universe*
189 over which to quantify. When we say "for all /x/", we generally
190 don't mean "any and all things of any kind which we could
191 possibly name /x/". Rather, we think of /x/ as having a type of
192 some kind:
193
194 For all natural numbers x, ...
195
196 Then, if our proof holds, it holds for all natural numbers.
197 No matter what integer value greater than or equal to zero
198 we choose for /x/, the truism contained in the proof remains true.
199 This is the sort of thing we want in Madison.
200
201 Well, to start, there is one glaringly obvious type in any
202 term-rewriting language, namely, the term. We could say
203
204 For all terms t, ...
205
206 But it would not actually be very interesting, because terms
207 are so general and basic that there's not actually very much you
208 can say about them that you don't already know. You sort of need
209 to know the basic properties of terms just to build a term-rewriting
210 language (like the one at Madison's core) in the first place.
211
212 The most useful property of terms as far as Madison is concerned is
213 that the subterm relationship is _well-founded_. In other words,
214 in the term `c(X)`, `X` is "smaller than" `c(X)`, and since terms are
215 finite, any series of rewrites which always results in "smaller" terms
216 will eventually terminate. For completeness, we should probably prove
217 that rigorously, but for expediency we will simply take it as a given
218 fact for our proofs.
219
220 Anyway, to get at something actually interesting, we must look further
221 than the terms themselves.
222
223 Types
224 -----
225
226 What's actually interesting is when you define a restricted
227 set of forms that terms can take, and you distinguish terms inside
228 this set of forms from the terms outside the set. For example,
229
230 | let
231 | boolean(true) -> true
232 | boolean(false) -> true
233 | boolean(_) -> false
234 | in
235 | rewrite boolean(false)
236 = true
237
238 We call a set of forms like this a _type_. As you can see, we
239 have basically written a predicate that defines our type. If any
240 of the rewrite rules in the definition of this predicate rewrite
241 a given term to `true`, that term is of our type; if it rewrites
242 to `false`, it is not.
243
244 Once we have types, any constructor may be said to have a type.
245 By this we mean that no matter what subterms the constructor has,
246 the predicate of the type of which we speak will always reduce to
247 `true` when that term is inserted in it.
248
249 Note that using predicates like this allows our types to be
250 non-disjoint; the same term may reduce to true in two different
251 predicates. My first sketches for Madison had disjoint types,
252 described by rules which reduced each term to an atom which named
253 the type of that term. (So the above would have been written with
254 rules `true -> boolean` and `false -> boolean` instead.) However,
255 while that method may be, on the surface, more elegant, I believe
256 this approach better reflects how types are actually used in
257 programming. At the end of the day, every type is just a predicate,
258 and there is nothing stopping 2 from being both a natural number and
259 an integer. And, for that matter, a rational number and a real
260 number.
261
262 In theory, every predicate is a type, too, but that's where things
263 get interesting. Is 2 not also an even number, and a prime number?
264 And in an appropriate (albeit contrived) language, is it not a
265 description of a computation which may or may not always halt?
266
267 The Type Syntax
268 ---------------
269
270 The above considerations motivate us to be careful when dealing
271 with types. We should establish some ground rules so that we
272 know that our types are useful to universally quantify over.
273
274 Unfortunately, this introduces something of a chicken-and-egg
275 situation, as our ground rules will be using logical connectives,
276 while at the same time they will be applied to those logical
277 connectives to ensure that they are sound. This is not, actually,
278 a big deal; I mention it here more because it is interesting.
279
280 So, the rules which define our type must conform to certain
281 rules, themselves. While it would be possible to allow the
282 Madison programmer to use any old bunch of rewrite rules as a
283 type, and to check that these rules make for a "good" type when
284 such a usage is seen -- and while this would be somewhat
285 attractive from the standpoint of proving properties of term-
286 rewriting systems using term-rewriting systems -- it's not strictly
287 necessary to use a descriptive approach such as this, and there are
288 certain organizational benefits we can achieve by taking a more
289 prescriptive tack.
290
291 Viz., we introduce a special syntax for defining a type with a
292 set of rules which function collectively as a type predicate.
293 Again, it's not strictly necessary to do this, but it does
294 help organize our code and perhaps our thoughts, and perhaps make
295 an implementation easier to build. It's nice to be able to say,
296 yes, what it means to be a `boolean` is defined right here and
297 nowhere else.
298
299 So, to define a type, we write our type rules in a `type..in`
300 block, like the following.
301
302 | type boolean is
303 | boolean(true) -> true
304 | boolean(false) -> true
305 | in
306 | rewrite boolean(false)
307 = true
308
309 As you can see, the wildcard reduction to false can be omitted for
310 brevity. (In other words, "Nothing else is a boolean" is implied.)
311 And, the `boolean` constructor can be used for rewriting in a term
312 just like any other plain, non-`type`-blessed rewrite rule.
313
314 | type boolean is
315 | boolean(true) -> true
316 | boolean(false) -> true
317 | in
318 | rewrite boolean(tree(leaf(sabrina),leaf(joe)))
319 = false
320
321 Here are the rules that the type-defining rules must conform to.
322 If any of these rules are violated in the `type` block, the Madison
323 implementation must complain, and not proceed to try to prove anything
324 from them.
325
326 Once a type is defined, it cannot be defined further in a regular,
327 non-type-defining rewriting rule.
328
329 | type boolean is
330 | boolean(true) -> true
331 | boolean(false) -> true
332 | in let
333 | boolean(red) -> green
334 | in
335 | rewrite boolean(red)
336 ? Constructor "boolean" used in rule but already defined as a type
337
338 The constructor in the l.h.s. must be the same in all rules.
339
340 | type foo is
341 | foo(bar) -> true
342 | baz(bar) -> true
343 | in
344 | rewrite cat
345 ? In type "foo", constructor "baz" used on l.h.s. of rule
346
347 The constructor used in the rules must be arity 1 (i.e. have exactly
348 one subterm.)
349
350 | type foo is
351 | foo(bar,X) -> true
352 | in
353 | rewrite cat
354 ? In type "foo", constructor has arity greater than one
355
356 It is considered an error if the predicate rules ever rewrite, inside
357 the `type` block, to anything besides the atoms `true` or `false`.
358
359 | type foo is
360 | foo(bar) -> true
361 | foo(tree(X)) -> bar
362 | in
363 | rewrite cat
364 ? In type "foo", rule reduces to "bar" instead of true or false
365
366 The r.h.s.'s of the rules of the type predicate must *always*
367 rewrite to `true` or `false`. That means, if we can't prove that
368 the rules always rewrite to something, we can't use them as type
369 predicate rules. In practice, there are a few properties that
370 we insist that they have.
371
372 They may involve type predicates that have previously been
373 established.
374
375 | type boolean is
376 | boolean(true) -> true
377 | boolean(false) -> true
378 | in type boolbox is
379 | boolbox(box(X)) -> boolean(X)
380 | in
381 | rewrite boolbox(box(true))
382 = true
383
384 They may involve certain, pre-defined rewriting rules which can
385 be thought of as operators on values of boolean type (which, honestly,
386 is probably built-in to the language.) For now there is only one
387 such pre-defined rewriting rule: `and(X,Y)`, where `X` and `Y` are
388 booleans, and which rewrites to a boolean, using the standard truth
389 table rules for boolean conjunction.
390
391 | type boolean is
392 | boolean(true) -> true
393 | boolean(false) -> true
394 | in type boolpair is
395 | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y))
396 | in
397 | rewrite boolpair(pair(true,false))
398 = true
399
400 | type boolean is
401 | boolean(true) -> true
402 | boolean(false) -> true
403 | in type boolpair is
404 | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y))
405 | in
406 | rewrite boolpair(pair(true,cheese))
407 = false
408
409 Lastly, the r.h.s. of a type predicate rule can refer to the self-same
410 type being defined, but *only* under certain conditions. Namely,
411 the rewriting must "shrink" the term being rewritten. This is what
412 lets us inductively define types.
413
414 | type nat is
415 | nat(z) -> true
416 | nat(s(X)) -> nat(X)
417 | in
418 | rewrite nat(s(s(z)))
419 = true
420
421 | type nat is
422 | nat(z) -> true
423 | nat(s(X)) -> nat(s(X))
424 | in
425 | rewrite nat(s(s(z)))
426 ? Type not well-founded: recursive rewrite does not decrease in [nat.2]
427
428 | type nat is
429 | nat(z) -> true
430 | nat(s(X)) -> nat(s(s(X)))
431 | in
432 | rewrite nat(s(s(z)))
433 ? Type not well-founded: recursive rewrite does not decrease in [nat.2]
434
435 | type bad is
436 | bad(leaf(X)) -> true
437 | bad(tree(X,Y)) -> and(bad(X),bad(tree(Y,Y)))
438 | in
439 | rewrite whatever
440 ? Type not well-founded: recursive rewrite does not decrease in [bad.2]
441
442 We can check this by looking at all the rewrite rules in the
443 definition of the type that are recursive, i.e. that contain on
444 their r.h.s. the constructor being defined as a type predicate.
445 For every such occurrence on the r.h.s. of a recursive rewrite,
446 the contents of the constructor must be "smaller" than the contents
447 of the constructor on the l.h.s. What it means to be smaller
448 should be fairly obvious: it just has fewer subterms. If all the
449 rules conform to this pattern, rewriting will eventually terminate,
450 because it will run out of subterms to rewrite.
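
This check is simple enough to sketch in Python, using the same
hypothetical term representation as the earlier sketches; the names, and
the crude "count the subterms" measure of size, are mine.

    def size(term):
        # Total number of subterms, counting the term itself.
        if isinstance(term, str):
            return 1
        return 1 + sum(size(t) for t in term[1])

    def occurrences(term, name):
        # All subterms of term that are constructors named name.
        if isinstance(term, str):
            return []
        found = [term] if term[0] == name else []
        for sub in term[1]:
            found.extend(occurrences(sub, name))
        return found

    def well_founded(type_name, rules):
        # Every occurrence of the type predicate on an r.h.s. must wrap a
        # strictly smaller term than the one matched on the l.h.s.
        for lhs, rhs in rules:
            for occ in occurrences(rhs, type_name):
                if size(occ[1][0]) >= size(lhs[1][0]):
                    return False
        return True

    # nat(z) -> true; nat(s(X)) -> nat(X)        -- well-founded
    nat = [(('nat', ['z']), 'true'),
           (('nat', [('s', ['X'])]), ('nat', ['X']))]
    assert well_founded('nat', nat)

    # nat(z) -> true; nat(s(X)) -> nat(s(s(X)))  -- not well-founded
    bad = [(('nat', ['z']), 'true'),
           (('nat', [('s', ['X'])]), ('nat', [('s', [('s', ['X'])])]))]
    assert not well_founded('nat', bad)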
451
452 Application of Types in Proofs
453 ------------------------------
454
455 Now, aside from these restrictions, type predicates are basically
456 rewrite rules, just like any other. The main difference is that
457 we know they are well-defined enough to be used to scope the
458 universal quantification in a proof.
459
460 Simply having a definition for a `boolean` type allows us to construct
461 a simple proof with variables. Universal quantification over the
462 universe of booleans isn't exactly impressive; we don't cover an infinite
463 range of values, like we would with integers, or lists. But it's
464 a starting point on which we can build. We will give some rewrite rules
465 for a constructor `not`, and prove that this constructor always reduces
466 to a boolean when given a boolean.
467
468 | type boolean is
469 | boolean(true) -> true
470 | boolean(false) -> true
471 | in let
472 | not(true) -> false
473 | not(false) -> true
474 | not(_) -> undefined
475 | in theorem
476 | forall X where boolean(X)
477 | boolean(not(X)) ~> true
478 | proof
479 | case X = true
480 | boolean(not(true))
481 | -> boolean(true) [by not.1]
482 | -> true [by boolean.1]
483 | case X = false
484 | boolean(not(false))
485 | -> boolean(false) [by not.2]
486 | -> true [by boolean.2]
487 | qed
488 = true
489
490 As you can see, proofs using universally quantified variables
491 need to make use of _cases_. We know this proof is sound, because
492 it shows the rewrite steps for all the possible values of the
493 variable -- and we know they are all the possible values, from the
494 definition of the type.
495
496 In this instance, the cases are just the two possible values
497 of the boolean type, but if the type was defined inductively,
498 they would need to cover the base and inductive cases. In either
499 situation, each case in a complete proof maps to exactly one of
500 the possible rewrite rules for the type predicate (and vice versa).
501
502 Let's prove the type of a slightly more complex rewrite rule,
503 one which has multiple subterms which can vary. (This `and`
504 constructor has already been introduced, and we've claimed we
505 can use it in the definition of well-founded inductive types;
506 but this code proves that it is indeed well-founded, and it
507 doesn't rely on it already being defined.)
508
509 | let
510 | and(true,true) -> true
511 | and(_,_) -> false
512 | in theorem
513 | forall X where boolean(X)
514 | forall Y where boolean(Y)
515 | boolean(and(X,Y)) ~> true
516 | proof
517 | case X = true
518 | case Y = true
519 | boolean(and(true,true))
520 | -> boolean(true) [by and.1]
521 | -> true [by boolean.1]
522 | case Y = false
523 | boolean(and(true,false))
524 | -> boolean(false) [by and.2]
525 | -> true [by boolean.2]
526 | case X = false
527 | case Y = true
528 | boolean(and(false,true))
529 | -> boolean(false) [by and.2]
530 | -> true [by boolean.2]
531 | case Y = false
532 | boolean(and(false,false))
533 | -> boolean(false) [by and.2]
534 | -> true [by boolean.2]
535 | qed
536 = true
537
538 Unwieldy, you say! And you are correct. But making something
539 easy to use was never my goal.
540
541 Note that the definition of `and()` is a bit more open-ended than
542 `not()`. `and.2` allows terms like `and(dog,cat)` to rewrite to `false`.
543 But our proof only shows that the result of reducing `and(A,B)` is
544 a boolean *when both A and B are booleans*. So it, in fact,
545 tells us nothing about the type of `and(dog,cat)`, nor in fact anything
546 at all about the properties of `and(A,B)` when one or more of `A` and
547 `B` are not of boolean type. So be it.
548
549 Anyway, since we were speaking of inductively defined types
550 previously, let's do that now. With the help of `and()`, here is
551 a type for binary trees.
552
553 | type tree is
554 | tree(leaf) -> true
555 | tree(branch(X,Y)) -> and(tree(X),tree(Y))
556 | in
557 | rewrite tree(branch(leaf,leaf))
558 = true
559
560 We can define some rewrite rules on trees. To start small,
561 let's define a simple predicate on trees.
562
563 | type tree is
564 | tree(leaf) -> true
565 | tree(branch(X,Y)) -> and(tree(X),tree(Y))
566 | in let
567 | empty(leaf) -> true
568 | empty(branch(_,_)) -> false
569 | in rewrite empty(branch(branch(leaf,leaf),leaf))
570 = false
571
572 | type tree is
573 | tree(leaf) -> true
574 | tree(branch(X,Y)) -> and(tree(X),tree(Y))
575 | in let
576 | empty(leaf) -> true
577 | empty(branch(_,_)) -> false
578 | in rewrite empty(leaf)
579 = true
580
581 Now let's prove that our predicate always rewrites to a boolean
582 (i.e. that it has boolean type) when its argument is a tree.
583
584 | type tree is
585 | tree(leaf) -> true
586 | tree(branch(X,Y)) -> and(tree(X),tree(Y))
587 | in let
588 | empty(leaf) -> true
589 | empty(branch(_,_)) -> false
590 | in theorem
591 | forall X where tree(X)
592 | boolean(empty(X)) ~> true
593 | proof
594 | case X = leaf
595 | boolean(empty(leaf))
596 | -> boolean(true) [by empty.1]
597 | -> true [by boolean.1]
598 | case X = branch(S,T)
599 | boolean(empty(branch(S,T)))
600 | -> boolean(false) [by empty.2]
601 | -> true [by boolean.2]
602 | qed
603 = true
604
605 This isn't really a proof by induction yet, but it's getting closer.
606 This is still really us examining the cases to determine the type.
607 But, we have an extra guarantee here; in `case X = branch(S,T)`, we
608 know `tree(S) -> true`, and `tree(T) -> true`, because `tree(X) -> true`.
609 This is one more reason why `and(X,Y)` is built-in to Madison: Madison
610 needs to know what `and` means in order to make use of this information
611 in a proof. We don't really use that extra information in this proof, but
612 we will later on.
613
614 Structural Induction
615 --------------------
616
617 Let's try something stronger, and get into something that could be
618 described as real structural induction. This time, we won't just prove
619 something's type. We'll prove something that actually walks and talks
620 like a real (albeit simple) theorem: the reflection of the reflection
621 of any binary tree is the same as the original tree.
622
623 | type tree is
624 | tree(leaf) -> true
625 | tree(branch(X,Y)) -> and(tree(X),tree(Y))
626 | in let
627 | reflect(leaf) -> leaf
628 | reflect(branch(A,B)) -> branch(reflect(B),reflect(A))
629 | in theorem
630 | forall X where tree(X)
631 | reflect(reflect(X)) ~> X
632 | proof
633 | case X = leaf
634 | reflect(reflect(leaf))
635 | -> reflect(leaf) [by reflect.1]
636 | -> leaf [by reflect.1]
637 | case X = branch(S, T)
638 | reflect(reflect(branch(S, T)))
639 | -> reflect(branch(reflect(T),reflect(S))) [by reflect.2]
640 | -> branch(reflect(reflect(S)),reflect(reflect(T))) [by reflect.2]
641 | -> branch(S,reflect(reflect(T))) [by IH]
642 | -> branch(S,T) [by IH]
643 | qed
644 = true
645
646 Finally, this is a proof using induction! In the [by IH] clauses,
647 IH stands for "inductive hypothesis", the hypothesis that we may
648 assume in making the proof; namely, that the property holds for
649 "smaller" instances of the type of X -- in this case, the "smaller"
650 trees S and T that are used to construct the tree `branch(S, T)`.
651
652 Relying on the IH is valid only after we have proved the base case.
653 After having proved `reflect(reflect(S)) -> S` for the base cases of
654 the type of S, we are free to assume that `reflect(reflect(S)) -> S`
655 in the induction cases. And we do so, to rewrite the last two steps.
656
657 Like cases, the induction in a proof maps directly to the
658 induction in the definition of the type of the variable being
659 universally quantified upon. If the induction in the type is well-
660 founded, so too will be the induction in the proof. (Indeed, the
661 relationship between induction and cases is implicit in the
662 concepts of the "base case" and "inductive case (or step)".)
663
664 Stepping Back
665 -------------
666
667 So, we have given a simple term-rewriting-based language for proofs,
668 and shown that it can handle a proof of a property over an infinite
669 universe of things (trees.) That was basically my goal in designing
670 this language. Now let's step back and consider some of the
671 implications of this system.
672
673 We have, here, a typed programming language. We can define types
674 that look an awful lot like algebraic data types. But instead of
675 glibly declaring the type of any given term, like we would in most
676 functional languages, we actually have to *prove* that our terms
677 always rewrite to a value of that type. That's more work, of
678 course, but it's also stronger: in proving that the term always
679 rewrites to a value of the type, we have, naturally, proved that
680 it *always* rewrites -- that its rewrite sequence is terminating.
681 There is no possibility that its rewrite sequence will enter an
682 infinite loop. Often, we establish this with the help of the previously
683 established fact that our inductively-defined types are well-founded,
684 which is itself proved on the basis that the subterm relationship is
685 well-founded.
686
687 Much like we can prove termination in the course of proving a type,
688 we can prove a type in the course of proving a property -- such
689 as the type of `reflect(reflect(T))` above. (This does not directly
690 lead to a proof of the type of `reflect`, but whatever.)
691
692 And, of course, we are only proving the type of a term on the
693 assumption that its subterms have specific types. These proofs
694 say nothing about the other cases. This may provide flexibility
695 for extending rewrite systems -- or it might not, I'm not sure.
696 It might be nice to prove that all other types result in some
697 error term. (One of the more annoying things about term-rewriting
698 languages is how an error can result in a half-rewritten program
699 instead of a recognizable error code. There seems to be a tradeoff
700 between extensibility and producing recognizable errors.)
701
702 Grammar so Far
703 --------------
704
705 I think I've described everything I want in the language above, so
706 the grammar should, modulo tweaks, look something like this:
707
708 Madison ::= Block.
709 Block ::= LetBlock | TypeBlock | ProofBlock | RewriteBlock.
710 LetBlock ::= "let" {Rule} "in" Block.
711 TypeBlock ::= "type" Symbol "is" {Rule} "in" Block.
712 RewriteBlock ::= "rewrite" Term.
713 Rule ::= Term "->" Term.
714 Term ::= Atom | Constructor | Variable.
715 Atom ::= Symbol.
716 Constructor ::= Symbol "(" Term {"," Term} ")".
717 ProofBlock ::= "theorem" Statement "proof" Proof "qed".
718 Statement ::= Quantifier Statement | MultiStep.
719 Quantifier ::= "forall" Variable "where" Term.
720 MultiStep ::= Term "~>" Term.
721 Proof ::= Case Proof {Case Proof} | Trace.
    Case ::= "case" Variable "=" Term.
722 Trace ::= Term {"->" Term [RuleRef]}.
723 RuleRef ::= "[" "by" (Symbol "." Integer | "IH") "]".
724
725 Discussion
726 ----------
727
728 I think that basically covers it. This document is still a little
729 rough, but that's what major version zeroes are for, right?
730
731 I have essentially convinced myself that the above-described system
732 is sufficient for simple proof checking. There are three significant
733 things I had to convince myself of to get to this point, which I'll
734 describe here.
735
736 One is that types have to be well-founded in order for them to serve
737 as scopes for universal quantification. This is obvious in
738 retrospect, but getting them into the language in a way where it was
739 clear they could be checked for well-foundedness took a little
740 effort. The demarcation of type-predicate rewrite rules was a big
741 step, and a little disappointing because it introduces the loaded
742 term `type` into Madison's vernacular, which I wanted to avoid.
743 But it made it much easier to think about, and to formulate the
744 rules for checking that a type is well-founded. As I mentioned, it
745 could go away -- Madison could just as easily check that any
746 constructor used to scope a universal quantification is well-founded.
747 But that would probably muddy the presentation of the idea in this
748 document somewhat. It would be something to keep in mind for a
749 subsequent version of Madison that further distances itself from the
750 notion of "types".
751
752 Also, it would probably be possible to extend the notion of well-
753 founded rewriting rules to mutually-recursive rewriting rules.
754 However, this would complicate the procedure for checking that a
755 type predicate is well-founded.
756
757 The second thing I had to accept to get to this conviction is that
758 `and(X,Y)` is built into the language. It can't just be defined
759 in Madison code, because while this would be wonderful from a
760 standpoint of minimalism, Madison has to know what it means to let
761 you write non-trivial inductive proofs. In a nutshell, it has to
762 know that `foo(X) -> and(bar(X),baz(X))` means that if `foo(X)` is
763 true, then `bar(X)` is also true, and `baz(X)` is true as well.
764
765 I considered making `or(X,Y)` a built-in as well, but after some
766 thought, wasn't convinced that it was that valuable in the kinds
767 of proofs I wanted to write.
768
769 Lastly, the third thing I had to come to terms with was, in general,
770 how we know a stated proof is complete. As I've tried to describe
771 above, we know it's complete because each of the cases maps to a
772 possible rewrite rule, and induction maps to the inductive definition
773 of a type predicate, which we know is well-founded because of the
774 checks Madison does on it (ultimately based on the assumption that
775 the subterm relationship is well-founded.) There Is Nothing Else.
776
777 This gets a little more complicated when you get into proofs by
778 induction. The thing there is that we can assume the property
779 we want to prove, in one of the cases (the inductive case) of the
780 proof, so long as we have already proved all the other cases (the
781 base case.) This is perfectly sound in proofs by hand, so it is
782 likewise perfectly sound in a formal proof checker like Madison;
783 the question is how Madison "knows" that it is sound, i.e. how it
784 can be programmed to reject proofs which are not structured this
785 way. Well, if we limit it to what I've just described above --
786 check that the scope of the universal quantification is well-
787 founded, check that there are two cases, and check that we've already
788 proved one case, then allow the inductive hypothesis to be used as a
789 rewrite rule in the other case of the proof -- it is not difficult
790 to see how this could be mechanized.
791
792 However, this is also very limited. Let's talk about limitations.
793
794 For real data structures, you might well have multiple base cases;
795 for example, a tree with two kinds of leaf nodes. Does this start
796 breaking down? Probably. It probably breaks down with multiple
797 inductive cases, as well, although you might be able to get around
798 that by breaking the proof into multiple proofs, and having
799 subsequent proofs rely on properties proved in previous proofs.
800
801 Another limitation I discovered when trying to write a proof that
802 addition in Peano arithmetic is commutative: it seemingly can't
803 be done in Madison as it currently stands, as Madison only knows
804 how to rewrite something into something else, and cannot express
805 the fact that two things (like `add(A,B)` and `add(B,A)`) rewrite
806 to the same thing. Such a facility would be easy enough to add,
807 and may appear in a future version of Madison, possibly with a
808 syntax like:
809
810 theorem
811 forall A where nat(A)
812 forall B where nat(B)
813 add(A,B) ~=~ add(B,A)
814 proof ...
815
816 You would then show that `add(A,B)` reduces to something, and
817 that `add(B,A)` reduces to something, and Madison would check
818 that the two somethings are in fact the same thing. This is, in
819 fact, a fairly standard method in the world of term rewriting.
820
821 As a historical note, Madison is one of the pieces of fallout from
822 the overly-ambitious project I started a year and a half ago called
823 Rho. Rho was a homoiconic rewriting language with several very
824 general capabilities, and it wasn't long before I decided it was
825 possible to write proofs in it, as well as the other things it was
826 designed for. Of course, this stretched it to about the limit of
827 what I could keep track of in a single project, and it was soon
828 afterwards abandoned. Other fallout from Rho made it into other
829 projects of mine, including Pail (having `let` bindings within
830 the names of other `let` bindings), Falderal (the test suite from
831 the Rho implementation), and Q-expressions (a variant of
832 S-expressions, with better quoting capabilities, still forthcoming.)
833
834 Happy proof-checking!
835 Chris Pressey
836 December 2, 2011
837 Evanston, Illinois
0 <html>
1 <head>
2 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
3 <meta name="Description" content="Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation">
4 <title>Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation</title>
5 <!-- begin html doc dynamic markup -->
6 <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
7 <script type="text/javascript" src="/scripts/documentation.js"></script>
8 <!-- end html doc dynamic markup -->
9 </head>
10 <body>
11
12 <h1>Multi-Directional Pattern Notation</h1>
13 <p>Final - Sep 6 1999</p>
14 <hr>
15
16 <h2>Introduction</h2>
17
18 <p>MDPN is an extension of EBNF which augments it with attributes for
19 the purposes of scanning and parsing input files which
20 assume a non-unidirectional form. A familiarity with
21 EBNF is assumed for the remainder of this document.</p>
22
23 <p>MDPN was developed by Chris Pressey in late 1998,
24 built on an earlier, less successful attempt at a "2D EBNF" devised
25 to fill a void that the
26 mainstream literature on parsing seemed to rarely if
27 ever approach, with much help provided by John Colagioia
28 throughout 1998.</p>
29
30 <p>MDPN has possible uses in the construction of
31 parsers and subsequently compilers for multi-directional and
32 multi-dimensional languages such as Orthogonal, Befunge,
33 Wierd, Blank, Plankalk&uuml;l, and even less contrived notations
34 like structured Flowchart and Object models of systems.</p>
35
36 <p>As the name indicates, MDPN provides a notation for
37 describing multidimensional patterns by
38 extending the concept of linear scanning and matching
39 with geometric attributes in a given number of dimensions.</p>
40
41 <h2>Preconditions for Multidirectional Parsing</h2>
42
43 <p>The multidirectional parsing that MDPN concerns itself
44 with assumes that any portion of the input file is accessible
45 at any time. Concepts such as LL(1) are fairly meaningless
46 in a non-unidirectional parsing system of this sort.
47 The unidirectional input devices such as paper tape and
48 punch cards that were the concern of original parsing methods
49 have been superseded
50 by modern devices such as hard disk drives and ample, cheap RAM.</p>
51
52 <p>In addition, MDPN is limited to an orthogonal
53 representation of the input file, and this
54 document is generally less concerned about forms of
55 four or higher dimensions, to reduce unnecessary complexity.</p>
56
57 <h2>Notation from EBNF</h2>
58
59 <p>Syntax is drawn from EBNF. It is slightly modified, but
60 should not surprise anyone who is familiar with EBNF.</p>
61
62 <p>A freely-chosen unadorned ('bareword') alphabetic
63 multicharacter identifier indicates
64 the name of a nonterminal (named pattern) in the
65 grammar. e.g. <code>foo</code>. (Single characters have special
66 meaning as operators.) Case matters: <code>foo</code> is not the
67 same name as <code>Foo</code> or <code>FOO</code>.</p>
68
69 <p>Double quotes begin and end literal terminals (symbols.)
70 e.g. <code>"bar"</code>.</p>
71
72 <p>A double-colon-equals-sign (<code>::=</code>)
73 describes a production (pattern match)
74 by associating a single nonterminal on the left with
75 a pattern on the right, terminated with a period.
76 e.g. <code>foo ::= "bar".</code></p>
77
78 <p>A pattern is a series of terminals, nonterminals,
79 operators, and parenthetics.</p>
80
81 <p>The <code>|</code> operator denotes
82 alternatives. e.g. <code>"foo" | "bar"</code></p>
83
84 <p>The <code>(</code> and <code>)</code> parentheses denote
85 precedence and grouping.</p>
86
87 <p>The <code>[</code> and <code>]</code> brackets denote
88 that the contents may be omitted, that is, they
89 may occur zero or one times. e.g. <code>"bar" ["baz"]</code></p>
90
91 <p>The <code>{</code> and <code>}</code> braces denote
92 that the contents may be omitted or may be repeated any number of times.
93 e.g. <code>"bar" {baz "quuz"}</code></p>
94
95 <h2>Deviations from EBNF</h2>
96
97 <p>The input file is spatially related to a
98 coordinate system and it is useful to
99 think of the input mapped to an orthogonally
100 distributed (Cartesian) form with no arbitrary
101 limit imposed on its size, hereinafter referred to
102 as <i>scan-space</i>.</p>
103
104 <p>The input file is mapped to scan-space. The
105 first printable character in the input file always
106 maps to the <i>origin</i> of scan-space regardless of
107 the number of dimensions. The origin is enumerated
108 with coordinates (0) in one dimension, (0,0) in two
109 dimensions, (0,0,0) in three dimensions, etc.</p>
110
111 <p>Scan-space follows the 'computer storage' co-ordinate
112 system so that <i>x</i> coordinates increase to
113 the 'east' (rightwards), <i>y</i> coordinates increase to the
114 'south' (downwards), and <i>z</i> coordinates increase on each
115 successive 'page'.</p>
116
117 <p>Successive characters in the input file indicate
118 successive coordinate (<i>x</i>) values in scan-space.
119 For two and three dimensions, end-of-line markers are assumed
120 to indicate "reset the <i>x</i> dimension and increment
121 the <i>y</i> dimension", and end-of-page markers
122 indicate "reset the <i>y</i> dimension and increment
123 the <i>z</i> dimension",
124 thus following the
125 commonplace mapping of computer text printouts.</p>
126
127 <p>Whitespace in the input file is <b>not</b> ignored.
128 The terminal <code>" "</code>, however, will match any
129 whitespace (including tabs, which are <b>not</b> expanded.)
130 The pattern <code>{" "}</code> may be used to indicate
131 any number of whitespaces; <code>" " {" "}</code> may be used to
132 indicate one or more whitespaces. Areas of scan-space
133 beyond the bounds of the input file are considered to be filled
134 with whitespaces.</p>
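
<p>As an informal illustration only (this sketch, and its names, are mine
and not part of MDPN), the two-dimensional mapping and the whitespace rule
can be pictured as follows:</p>

<pre>
def map_to_scan_space(text):
    # Characters advance x; an end-of-line marker resets x and advances y.
    cells = {}
    for y, line in enumerate(text.split('\n')):
        for x, ch in enumerate(line):
            cells[(x, y)] = ch
    return cells

def symbol_at(cells, x, y):
    # Beyond the bounds of the input file, scan-space reads as whitespace.
    return cells.get((x, y), ' ')

cells = map_to_scan_space('.\n!')
assert symbol_at(cells, 0, 0) == '.' and symbol_at(cells, 0, 1) == '!'
assert symbol_at(cells, 7, 3) == ' '
</pre>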
135
136 <p>Therefore, <code>"hello"</code> as a terminal is exactly
137 the same as <code>"h" "e" "l" "l" "o"</code> as a pattern of
138 terminals.</p>
139
140 <p>A <code>}</code> closing brace can be followed by a <code>^</code>
141 (<i>constraint</i>)
142 operator, which is followed by an expression in parentheses.</p>
143
144 <ul>
145 <p>This expression is actually in a subnotation which supports
146 a very simple form of algebra. The expression (built with
147 terms connected by infix <code>+-*/%</code> operators with their
148 C language meanings) can either reduce to
149 <ul>
150 <li>a constant value, as in <code>{"X"} ^ (5)</code>, which
151 would match five <code>X</code> terminals in a line; or
152 <li>an unknown value, which can involve any single lowercase
153 letters, which indicate variables local to the production,
154 as in <code>{"+"}^(x) {"-"}^(x*2)</code>, which would match only twice
155 as many minus signs as plus signs.
156 </ul></p>
157
158 <p>Complex algebraic expressions in constraints can and probably
159 should be avoided when constructing a MDPN grammar
160 for a real (non-contrived) compiler. MDPN-based
161 compiler-compilers aren't expected to support more than
162 one or two unknowns per expression, for example.
163 There is no such restriction, of course,
164 when using MDPN as a guide for hand-coding a multidimensional
165 parser, or otherwise using it as a more
166 sophisticated pattern-matching tool.</p>
167 </ul>
168
169 <h2>The Scan Pointer</h2>
170
171 <p>It is useful to imagine a <i>scan pointer</i> (SP,
172 not to be confused with a <i>stack pointer</i>, which is not
173 the concern of this document) which is analogous to the
174 current token in a single-dimensional parser, but exists
175 in MDPN as a free spatial relationship to the input file, and
176 thus also has associated geometric attributes such as
177 direction.</p>
178
179 <p>The SP's <i>location</i> is advanced through scan-space by its
180 <i>heading</i> as terminals in the productions
181 are successfully matched with symbols in the input buffer.</p>
182
183 <p>The following geometric attribution operators modify the
184 properties of the SP. Note that injudicious
185 use of any of these operators <i>can</i> result in an infinite loop
186 during scanning. There is no built-in contingency measure to
187 escape from an infinite parsing loop in MDPN (but see
188 exclusivity, below, for a possible way to overcome this.)</p>
189
190 <p><code>t</code> is the relative translation operator. It is
191 followed by a vector, in parentheses, which is added to the
192 location of the SP. This does not change its heading.</p>
193
194 <p>For example, <code>t (0,-1)</code> moves the
195 SP one symbol above the current symbol (the symbol which was
196 <i>about</i> to be matched.)</p>
197
198 <p>As a more realistic example of how this works, consider that the
199 pattern <code>"." t(-1,1) "!" t(0,-1)</code> will match a
200 period with an exclamation point directly below it, like:
201
202 <pre>.
203 !</pre>
204 </p>
205
206 <p><code>r</code> is the relative rotation operator. It is
207 followed by an axis identifier (optional: see below)
208 and an orthogonal angle (an angle <i>a</i>
209 such that |<i>a</i>| <b>mod</b> 90 degrees = 0) assumed to
210 be measured in degrees, both in parentheses.
211 The angle is added to the SP's heading.
212 Negative angle arguments are allowed.</p>
213
214 <p>Described in two dimensions, the (default) heading 0 denotes 'east,' that
215 is, parsing character by character in a rightward direction, where
216 the SP's <i>x</i> axis coordinate increases and all other axes coordinates
217 stay the same. Increasing angles ascend counterclockwise (90 = 'north',
218 180 = 'west', 270 = 'south'.)</p>
219
220 <p>For example, <code>">" r(-90) "+^"</code> would match
221
222 <pre>&gt;+
223 ^</pre>
224 </p>
225
226 <p>The axis identifier indicates which axis this rotation
227 occurs around. If the axis identifier is omitted,
228 the <i>z</i> axis is to be assumed, since this is certainly the
229 most common axis to rotate about, in two dimensions.</p>
230
231 <p>If the axis identifier is present, it may be a single letter
232 in the set <code>xyz</code> (these unsurprisingly indicate
233 the <i>x</i>, <i>y</i>, and <i>z</i> dimensions respectively),
234 or it may be a non-negative integer value, where 0 corresponds to the
235 <i>x</i> dimension, 1 corresponds to the <i>y</i> dimension, etc.
236 (Implementation note: in more than two dimensions,
237 the SP's heading property should probably be broken up
238 internally into theta, rho, &amp;c components as appropriate.)</p>
239
240 <p>For example, <code>r(z,180)</code> rotates the SP's heading
241 180 degrees about the <i>z</i> (dimension #2) axis, as does <code>r(2,180)</code>
242 or even just <code>r(180)</code>.</p>
243
244 <p><code>&lt;</code> and <code>&gt;</code> are the push
245 and pop state-stack operators, respectively. Alternately,
246 they can be viewed as lookahead-assertion parenthetics, since the stack
247 is generally assumed to be
248 local to the production. (Compiler-compilers should probably
249 notify the user, but not necessarily panic, if they find unbalanced
250 <code>&lt;&gt;</code>'s.)</p>
251
252 <p>All properties of the SP (including location and heading,
253 and scale factor if supported) are pushed as a group
254 onto the stack during <code>&lt;</code> and popped as a group
255 off the stack during <code>&gt;</code>.</p>
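<p>Purely as an illustrative aid (a hypothetical sketch, not part of the
MDPN notation; the class and method names are invented here), the SP's
geometric state and the <code>t</code>, <code>r</code>,
<code>&lt;</code> and <code>&gt;</code> operators might be modelled
like this in two dimensions:</p>

<pre>
import math

class ScanPointer:
    """Hypothetical model of the scan pointer in two dimensions."""
    def __init__(self):
        self.x, self.y = 0, 0        # location in scan-space
        self.heading = 0             # degrees; 0 = east, 90 = north
        self.stack = []              # state-stack for push and pop

    def advance(self):
        """Move one symbol along the current heading (y grows downward)."""
        rad = math.radians(self.heading)
        self.x += round(math.cos(rad))
        self.y -= round(math.sin(rad))

    def t(self, dx, dy):             # relative translation
        self.x += dx
        self.y += dy

    def r(self, angle):              # relative rotation (about the z axis)
        self.heading = (self.heading + angle) % 360

    def push(self):                  # the push state-stack operator
        self.stack.append((self.x, self.y, self.heading))

    def pop(self):                   # the pop state-stack operator
        self.x, self.y, self.heading = self.stack.pop()

sp = ScanPointer()
sp.advance(); sp.r(-90); sp.advance()    # as in matching "&gt;" r(-90) "+"
print(sp.x, sp.y, sp.heading)            # 1 1 270 -- one right, one down
</pre>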
256
257 <h2>Advanced SP Features</h2>
258
259 <p>These features are not absolutely necessary for most
260 non-contrived multi-directional grammars. MDPN compiler-compilers
261 are not expected to support them.</p>
262
263 <p><code>T</code> is the absolute translation operator. It is
264 followed by a vector which is assigned to the location of the SP.
265 e.g. <code>T (0,0)</code> will 'home' the scan.</p>
266
267 <p><code>R</code> is the absolute rotation operator. It is
268 followed by an optional axis identifier, and an
269 orthogonal angle assumed to be measured in degrees.
270 The SP's heading is set to this
271 angle. e.g. <code>R(270)</code> sets the SP's heading to 270 degrees,
272 so that it scans downward through the input text, line after line. See
273 the <code>r</code> operator, above, for how the axis identifier functions.</p>
274
275 <p><code>S</code> is the absolute scale operator. It is
276 followed by an orthogonal <i>scaling factor</i> (a scalar <i>s</i>
277 such that <i>s</i> = <b>int</b>(<i>s</i>) and <i>s</i> >= 1).
278 The SP's scale factor is set to this value. The finest
279 possible scale, 1, indicates a 1:1 map with the input file;
280 for each one input symbol matched, the SP advances one symbol
281 in its path. When the scale factor is two, then for each
282 one input symbol matched, the SP advances two symbols, skipping
283 over an interim symbol. Etc.</p>
284
285 <p><code>s</code> is the relative scale operator. It is
286 followed by a scalar integer which is added to the SP's scaling factor
287 (so long as it does not cause the scaling factor to be zero or negative.)</p>
288
289 <p>Scale operators may also take an optional axis identifier
290 (as in <code>S(y,2)</code>), but when the axis identifier is omitted,
291 all axes are assumed (non-distortional scaling).</p>
292
293 <p><code>!&gt;</code> is a state-assertion alternative to
294 <code>&gt;</code>, for the purpose of determining that
295 the SP successfully and completely
296 reverted to a previous state that
297 was pushed onto the stack ('came full circle'). This
298 operator is something of a luxury; a
299 grammar which uses constraints correctly should
300 never <i>need</i> it, but it can come in handy.</p>
301
302 <h2>Other Advanced Features: Exclusivity</h2>
303
304 <p>Lastly, in the specification of a production,
305 the <i>exclusivity</i> applying to that production can be given
306 after a hyphen following
307 the name of the nonterminal, before the <code>::=</code> operator.</p>
308
309 <p>Exclusivity is a list of productions, named by their
310 nonterminals, and comes into play at any particular <i>instance</i>
311 of the production (i.e. when the production successfully
312 matches specific symbols at specific points in scan-space during a parse,
313 called the <i>domain</i>.) The exclusivity describes how the domain of each
314 instance is protected from being the domain of any further instances.
315 The domain of any subsequent instances of any productions listed
316 in the exclusivity is restricted from sharing points in scan-space
317 with the established domain.</p>
318
319 <p>Exclusivity is a measure to prevent so-called <i>crossword
320 grammars</i> - that is, where instances of productions
321 can <i>overlap</i> and share common symbols - if desired.
322 Internally it's generally considered a list of
323 'used-by-this-production' references
324 associated with each point in scan-space.
325 An example of the syntax to specify exclusivity is
326 <code>bar - bar quuz ::= foo {"a"} baz</code>.
327 Note that the domain of an instance of <code>bar</code> is the
328 sum of the
329 domains <code>foo</code>, <code>baz</code> and the chain of "<code>a</code>" terminals,
330 and that neither a subsequent instance of <code>quuz</code> nor <code>bar</code>
331 again can overlap it.</p>
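<p>As an illustration only (a hypothetical sketch, not part of MDPN; the
names in it are invented), the 'used-by-this-production' bookkeeping
might look like this:</p>

<pre>
# Hypothetical sketch of exclusivity: each point in scan-space remembers
# which productions an earlier instance has barred from re-using it.

excluded_at = {}   # point, e.g. (x, y), mapped to a set of production names

def try_claim(production, exclusivity, domain):
    """Try to establish an instance of `production` over `domain` (a set of
    points).  Fails if any point was already placed off-limits to this
    production by the exclusivity list of an earlier instance."""
    for point in domain:
        if production in excluded_at.get(point, set()):
            return False
    for point in domain:
        excluded_at.setdefault(point, set()).update(exclusivity)
    return True

# bar - bar quuz ::= ... : an instance of bar bars later instances of
# bar and quuz from sharing any point of its domain.
print(try_claim("bar",  ["bar", "quuz"], {(0, 0), (1, 0), (2, 0)}))  # True
print(try_claim("quuz", ["bar", "quuz"], {(2, 0), (3, 0)}))          # False
print(try_claim("foo",  [],              {(2, 0)}))                  # True
</pre>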
332
333 <h2>Examples of MDPN-described Grammars</h2>
334
335 <p><b>Example 1.</b> A grammar for describing boxes.</p>
336
337 <p>The task of writing a translator to recognize a
338 two-dimensional construct such as a box can easily
339 be realized using a tool such as MDPN.</p>
340
341 <p>An input file might contain a box drawn in
342 ASCII characters, such as
343
344 <pre>+------+
345 | |
346 | |
347 +------+</pre>
348 </p>
349
350 <p>Let's also say that boxes have a minimum height of four (they
351 must contain at least two rows), but
352 no minimum width. Also, edge characters must correspond to
353 the edge they are on. So, the following forms are both
354 illegal inputs:
355
356 <pre>+-+
357 +-+</pre>
358
359 <pre>+-|-+
360 | |
361 *
362 | |
363 +-|-+</pre>
364 </p>
365
366 <p>The MDPN production used to describe this box might be
367
368 <pre> Box ::= "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90)
369 "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90).</pre>
370 </p>
371
372 <p><b>Example 2.</b> A simplified grammar for Plankalk&uuml;l's assignments.</p>
373
374 <p>An input file might contain an ASCII approximation of something
375 Zuse might have jotted down on paper:
376
377 <pre> |Z + Z => Z
378 V|1 2 3
379 S|1.n 1.n 1.n</pre>
380 </p>
381
382 <p>Simplified MDPN productions used to describe this might be
383
384 <pre>
385 Staff ::= Spaces "|" TempVar AddOp TempVar Assign TempVar.
386 TempVar ::= "Z" t(-1,1) Index t(-1,1) Structure t(0,-2) Spaces.
387 Index ::= &lt;Digit {Digit}&gt;.
388 Digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9".
389 Structure ::= &lt;Digit ".n"&gt;.
390 AddOp ::= ("+" | "-") Spaces.
391 Assign ::= "=>" Spaces.
392 Spaces ::= {" "}.
393 </pre>
394 </p>
395
396 <hr>
397 <i>Last Updated Mar 2&nbsp;<img src="/images/icons/copyright.gif"
398 align=absmiddle width=12 height=12 alt="(c)" border=0>2004 <a
399 href="/">Cat's Eye Technologies</a>.
400 All rights reserved.
401 This document may be freely redistributed in its unchanged form.</i>
402 </body></html>
0 <?xml version="1.0"?>
1 <!-- encoding: utf-8 -->
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
4 <head>
5 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
6 <title>The Oozlybub and Murphy Programming Language</title>
7 <!-- begin html doc dynamic markup -->
8 <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
9 <script type="text/javascript" src="/scripts/documentation.js"></script>
10 <!-- end html doc dynamic markup -->
11 </head>
12 <body>
13
14 <h1>Oozlybub and Murphy</h1>
15
16 <p>Language version 1.1</p>
17
18 <h2>Overview</h2>
19
20 <p>This document describes a new programming language.
21 The name of this language is <dfn>Oozlybub and Murphy</dfn>.
22 Despite appearances, this name refers to a single language.
23 The majority of the language is
24 named Oozlybub. The fact that the language is not entirely named
25 Oozlybub is named Murphy. Deal with it.</p>
26
27 <p>For the sake of providing an "olde tyme esoterickal de-sign", the
28 language combines several unusual features, including multiple interleaved parse
29 streams, infinitely long variable names, gratuitously strong typing, and
30 only-conjectural Turing completeness.
31 While no implementation of the language exists as of this writing, it is thought to
32 be sufficiently consistent to be implementable, modulo any errors in this document.</p>
33
34 <p>In places the language may resemble <a href="/projects/smith/">SMITH</a>
35 and <a href="/projects/quylthulg/">Quylthulg</a>, but this was not intended,
36 and the similarities are purely emergent.</p>
37
38 <h2>Program Structure</h2>
39
40 <p>An Oozlybub and Murphy program consists of a number of <dfn>variables</dfn> and
41 a number of objects called <dfn>dynasts</dfn>. An Oozlybub and Murphy program text
42 consists of multiple <dfn>parse streams</dfn>. Each parse stream contains
43 zero or more variable declarations, and optionally a single dynast.</p>
44
45 <h3>Parse Streams</h3>
46
47 <p>A parse stream is just a segment, possibly non-contiguous, of the text of
48 an Oozlybub and Murphy program. A program starts out with a single parse stream, but
49 certain <dfn>parse stream manipulation pragmas</dfn> can change this.
50 These pragmas have the form <code>{@<var>x</var>}</code> and
51 have a similar syntactic status as comments; they can appear anywhere except
52 inside a lexeme.</p>
53
54 <p>Parse streams are arranged as a ring (a cyclic doubly linked list.) When parsing
55 of the program text begins initially, there is already a single pre-created parse stream.
56 When the program text ends, all parse streams which may be active are deleted.</p>
57
58 <p>The meanings of the pragmas are:</p>
59
60 <ul>
61 <li><code>{@+}</code> Create a new parse stream to the right of the current one.</li>
62 <li><code>{@&gt;}</code> Switch to the parse stream to the right of the current one.</li>
63 <li><code>{@&lt;}</code> Switch to the parse stream to the left of the current one.</li>
64 <li><code>{@-}</code> Delete the current parse stream. The parse stream to the
65 left of the deleted parse stream will become the new current parse stream.</li>
66 </ul>
67
68 <p>Deleting a parse stream while it contains an unfinished syntactic construct is a syntax
69 error, just as an end-of-file in that circumstance would be in most other languages.</p>
70
71 <p>Providing a concrete example of parse streams in action will be difficult in the
72 absence of defined syntax for the rest of Oozlybub and Murphy, so we will, for the purposes of
73 the following demonstration only, pretend that the contents of a parse stream is a sentence of English.
74 Here is how three parse streams might be managed:</p>
75
76 <p><code>The quick {@+}brown{@&gt;}Now is the time{@&lt;}fox{@&lt;} for all good
77 men to {@+}{@&gt;}Wherefore art thou {@&gt;} jumped over {@&gt;}{@&gt;}Romeo?{@-}
78 come to the aid of {@&gt;}the lazy dog's tail.{@-}their country.{@-}
79 </code></p>
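<p>Purely as an illustration (this Python sketch is not part of the language,
and simply re-uses the English-sentence stand-in above), the pragma
bookkeeping might be modelled with a list standing in for the ring:</p>

<pre>
import re

def split_streams(text):
    """Hypothetical sketch: distribute program text over parse streams
    according to the {@+} {@&gt;} {@&lt;} {@-} pragmas."""
    streams, cur, finished = [""], 0, []
    for piece in re.split(r"(\{@.\})", text):
        if piece == "{@+}":                  # create to the right
            streams.insert(cur + 1, "")
        elif piece == "{@&gt;}":             # switch right
            cur = (cur + 1) % len(streams)
        elif piece == "{@&lt;}":             # switch left
            cur = (cur - 1) % len(streams)
        elif piece == "{@-}":                # delete; new current is to the left
            finished.append(streams.pop(cur))
            cur = (cur - 1) % len(streams) if streams else 0
        elif streams:
            streams[cur] += piece
    return finished + streams

text = ("The quick {@+}brown{@&gt;}Now is the time{@&lt;}fox{@&lt;} for all good "
        "men to {@+}{@&gt;}Wherefore art thou {@&gt;} jumped over {@&gt;}{@&gt;}Romeo?{@-}"
        "come to the aid of {@&gt;}the lazy dog's tail.{@-}their country.{@-}")
for sentence in split_streams(text):
    print(sentence)
</pre>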
80
81 <h3>Variables</h3>
82
83 <p>All variables are declared in a block at the beginning of a parse stream.
84 If there is also a dynast in that stream, the variables are private to that dynast;
85 otherwise they are global and shared by all dynasts. (<em>Defined in 1.1</em>)
86 Any dynamically created dynast gets its own private copies of any private
87 variables the original dynast had; they will initially hold the values they
88 had in the original, but they are not shared.</p>
89
90 <p>The name of a variable in Oozlybub and Murphy is not a fixed, finite-length string of symbols,
91 as you would find in other programming languages. No sir! In Oozlybub and Murphy, each
92 variable is named by a possibly-infinite set of strings
93 (over the alphanumeric-plus-spaces alphabet <code>[a-zA-Z0-9 ]</code>),
94 at least one of which must be infinitely long. (<em>New in 1.1</em>: spaces
95 [but no other kinds of whitespace] are allowed in these strings.)</p>
96
97 <p>To accommodate this method of identifying a variable, in Oozlybub and Murphy programs,
98 which are finite, variables are identified using regular expressions
99 which match their set of names.
100 An equivalence class of regular expressions is a set of all regular expressions which
101 accept exactly the same set of strings; each equivalence class of regular expressions
102 refers to the same, unique Oozlybub and Murphy variable.</p>
103
104 <p>(In case you wonder about the implementability of this:
105 Checking that two regular expressions are
106 equivalent is decidable: we convert them both to NFAs, then
107 to DFAs, then minimize those DFAs, then check if the transition
108 graphs of those DFAs are isomorphic.
109 Checking that the regular expression accepts at least
110 one infinitely-long string is also decidable: just look for
111 a cycle in the DFA's graph.)</p>
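<p>As an illustration of the second check only (a hypothetical sketch, not
part of the language; the function name and DFA encoding are invented
here, and the DFA is taken to be partial, with transitions into a dead
state simply omitted):</p>

<pre>
# Does a (partial) DFA admit an infinitely long name?  Per the fudge
# described below, we just look for a cycle reachable from the start
# state.  delta is {state: {symbol: next_state}}.

def has_infinite_name(delta, start):
    WHITE, GREY, BLACK = 0, 1, 2            # classic DFS cycle detection
    colour = {}
    def dfs(state):
        colour[state] = GREY
        for nxt in delta.get(state, {}).values():
            if colour.get(nxt, WHITE) == GREY:
                return True                 # back edge: a reachable cycle
            if colour.get(nxt, WHITE) == WHITE and dfs(nxt):
                return True
        colour[state] = BLACK
        return False
    return dfs(start)

# /pp*/ : from state 0, a "p" leads to state 1, which loops on "p"
print(has_infinite_name({0: {"p": 1}, 1: {"p": 1}}, 0))   # True
# /pq/  : only the finite name "pq"
print(has_infinite_name({0: {"p": 1}, 1: {"q": 2}}, 0))   # False
</pre>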
112
113 <p>Note that these identifier-sets need not be disjoint. <code>/ma*/</code>
114 and <code>/mb*/</code> are distinct variables, even though both contain the
115 string <code>m</code>. (Note also that we are fudging slightly on how we
116 consider to have described an infinitely long name; technically we would
117 want to have a Büchi automaton that specifies an unending repetition with
118 <sup>ω</sup> instead of *. But the distinction is subtle enough in this
119 context that we're gonna let it slide.)</p>
120
121 <p>Syntax for giving a variable name is fairly straightforward: it is delimited on
122 either side by <code>/</code> symbols; the alphanumeric symbols are literals;
123 textual concatenation is regular expression sequencing,
124 <code>|</code> is alternation, <code>(</code> and <code>)</code> increase precedence,
125 and <code>*</code> is Kleene asteration (zero or more occurrences).
126 Asteration has higher precedence than sequencing, which has higher precedence
127 than alternation. Because none of these operators is alphanumeric nor a space, no
128 escaping scheme needs to be installed.</p>
129
130 <p>Variables are declared with the following syntax (<code>i</code> and <code>a</code>
131 are the types of the variables, described in the next section):</p>
132
133 <pre>
134 VARIABLES ARE i /pp*/, i /qq*/, a /(0|1)*/.
135 </pre>
136
137 <p>This declares an integer variable identified by the names {<code>p</code>, <code>pp</code>, <code>ppp</code>, ...},
138 an integer variable identified by the names {<code>q</code>, <code>qq</code>, <code>qqq</code>, ...},
139 and an array variable identified by the names of all strings of <code>0</code>'s and <code>1</code>'s.</p>
140
141 <p>When not in wimpmode (see below), any regular expression which denotes a
142 variable may not be literally repeated anywhere else in the program. So in the above example,
143 it would not be legal to refer to <code>/pp*/</code> further down in the program; an equivalent regular
144 expression, such as <code>/p|ppp*/</code> or <code>/p*p/</code> or
145 <code>/pp*|pp*|pp*/</code> would have to be used instead.</p>
146
147 <h3>Types</h3>
148
149 <p>Oozlybub and Murphy is a statically-typed language, in that variables
150 as well as values have types, and a value of one type cannot be stored in
151 a variable of another type. The types of values, however, are not entirely
152 disjoint, as we will see, and special considerations may arise for checking
153 and conversion because of this.</p>
154
155 <p>The basic types are:</p>
156
157 <ul>
158 <li><code>i</code>, the type of integers.
159 <p>These are integers of unbounded extent, both positive and negative.
160 Literal constants of type <code>i</code> are given in the usual decimal format.
161 Variables of this type initially contain the value 0.</p>
162 </li>
163
164 <li><code>p</code>, the type of prime numbers.
165 <p>All prime numbers are integers but not all integers are prime numbers.
166 Thus, values of prime number type will automatically be coerced to
167 integers in contexts that require integers; however the reverse is not true,
168 and in the other direction a conversion function (<code>P?</code>) must
169 be used. There are no literal constants of type <code>p</code>. Variables
170 of this type initially contain the value 2.</p>
171 </li>
172
173 <li><code>a</code>, the type of arrays of integers.
174 <p>An integer array has an integer index which is likewise of unbounded
175 extent, both positive and negative. Variables of this type initially
176 contain an empty array value, where all of the entries are 0.</p>
177 </li>
178
179 <li><code>b</code>, the type of booleans.
180 <p>A boolean has two possible values, <code>true</code> and <code>false</code>.
181 Note that there are no literal constants of type <code>b</code>; these must be
182 specified by constructing a tautology or contradiction with boolean (or other) operators.
183 It is illegal to retrieve the value of a variable of this type before first assigning it,
184 except to construct a tautology or contradiction.</p>
185 </li>
186
187 <li><code>t</code>, the type of truth-values.
188 <p>A truth-value has two possible values, <code>yes</code> and <code>no</code>.
189 There are no literal constants of type <code>t</code>.
190 It is illegal to retrieve the value of a variable of this type before first assigning it,
191 except to construct a tautology or contradiction.</p>
192 </li>
193
194 <li><code>z</code>, the type of bits.
195 <p>A bit has two possible values, <code>one</code> and <code>zero</code>.
196 There are no literal constants of type <code>z</code>.
197 It is illegal to retrieve the value of a variable of this type before first assigning it,
198 except to construct a tautology or contradiction.</p>
199 </li>
200
201 <li><code>c</code>, the type of conditions.
202 <p>A condition has two possible values, <code>go</code> and <code>nogo</code>.
203 There are no literal constants of type <code>c</code>.
204 It is illegal to retrieve the value of a variable of this type before first assigning it,
205 except to construct a tautology or contradiction.</p>
206 </li>
207
208 </ul>
209
210 <h3>Wimpmode</h3>
211
212 <p>(<em>New in 1.1</em>) An Oozlybub and Murphy program is in <dfn>wimpmode</dfn> if it
213 declares a global variable of integer type one of whose names is the string <code>am a wimp</code>,
214 for example:</p>
215
216 <pre>
217 VARIABLES ARE i /am *a *wimp/.
218 </pre>
219
220 <p>Certain language constructs, noted in this document as such,
221 are only permissible in wimpmode.
222 If they are used in a program in which wimpmode is not in effect, a
223 compile-time error shall occur and the program shall not be executed.</p>
224
225 <h3>Dynasts</h3>
226
227 <p>Each dynast is labeled with a positive integer and contains an expression.
228 Only one dynast may be denoted in any given parse stream, but dynasts may also
229 be created dynamically during program execution.</p>
230
231 <p>Program execution begins at the lowest-numbered dynast that exists in the initial program.
232 When a dynast is executed, the expression of that dynast is evaluated for its side-effects.
233 If there is a dynast labelled with the next higher integer (i.e. the successor of the label
234 of the current dynast),
235 execution continues with that dynast; otherwise, the program halts.
236 Once a dynast has been executed, it continues to exist until the program
237 halts, but it may never be executed again.</p>
238
239 <p>Evaluation of an expression may have side-effects, including
240 writing characters to an output channel, reading characters from
241 an input channel, altering the value of a variable, and creating
242 a new dynast.</p>
243
244 <p>Dynasts are written with the syntax <code>dynast(<var>label</var>) &lt;-&gt; <var>expr</var></code>. A concrete example follows:</p>
245
246 <pre>
247 dynast(100) &lt;-&gt; for each prime /p*/ below 1000 do write (./p*|p/+1.)
248 </pre>
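<p>Purely as an illustration of the execution order described above (a
hypothetical Python sketch, not part of the language; each dynast's
expression is stood in for by a function that may add new entries to the
dynast table as a side-effect):</p>

<pre>
def run(dynasts):
    label = min(dynasts)          # begin at the lowest-numbered dynast
    while True:
        dynasts[label]()          # evaluate the expression for its side-effects
        label = label + 1         # the successor label
        if label not in dynasts:
            break                 # no such dynast: the program halts

dynasts = {}
dynasts[100] = lambda: print("dynast 100")
dynasts[101] = lambda: dynasts.update({102: lambda: print("dynast 102")})
run(dynasts)   # 100 prints, 101 creates 102, 102 prints, 103 is absent: halt
</pre>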
249
250 <h3>TRIVIA PORTION OF SHOW</h3>
251
252 <p>WHO WAS IT FAMOUS MAN THAT SAID THIS?</p>
253
254 <ul>
255 <li>A) RONALD REAGAN</li>
256 <li>B) RONALD REAGAN</li>
257 <li>B) RONALD STEWART</li>
258 <li>C) RENALDO</li>
259 </ul>
260
261 <p>contestant enters lightning round now</p>
262
263 <h3>Expressions</h3>
264
265 <p>In the following, the letter preceding <var>-expr</var> or <var>-var</var> indicates
266 the expected type, if any, of that expression or variable. Where the expressions listed below are infix
267 expressions, they are listed from highest to lowest precedence. Unless noted otherwise,
268 subexpressions are evaluated left to right.</p>
269
270 <ul>
271
272 <li><code>(.<var>expr</var>.)</code>
273 <p>Surrounding an expression with <dfn>dotted parens</dfn> gives it that precedence boost
274 that's just the thing to have it be evaluated before the expression it's in, but
275 there is a catch. The number of parens in the dotted parens expression must
276 match the nesting depth in the following way: if a set of dotted parens is
277 nested within <var>n</var> dotted parens, it must contain fib(<var>n</var>)
278 parens, where fib(<var>n</var>) is the <var>n</var>th member of the
279 Fibonacci sequence. For example, <code>(.(.0.).)</code> and <code>(.(.((.(((.(((((.0.))))).))).)).).)</code>
280 are syntactically well-formed expressions (when not nested in any other dotted paren
281 expression), but <code>(.(((.0.))).)</code> and
282 <code>(.(.(.0.).).)</code> are not.</p>
283 </li>
284
285 <li><code><var>var</var></code>
286 <p>A variable evaluates to the value it contains at that point in execution.</p>
287 </li>
288
289 <li><code>0</code>, <code>1</code>, <code>2</code>, <code>3</code>, etc.
290 <p>Decimal literals evaluate to the expected value of type <code>i</code>.</p>
291 </li>
292
293 <li><code>#myself#</code>
294 <p>This special nullary token evaluates to the numeric label of the currently
295 executing dynast.</p>
296 </li>
297
298 <li><code><var>var</var> := <var>expr</var></code>
299 <p>Evaluates the <var>expr</var> and stores the result in the specified variable.
300 The variable and the expression must have the same type. Evaluates to whatever
301 <var>expr</var> evaluated to.</p>
302 </li>
303
304 <li><code><var>a-expr</var> [<var>i-expr</var>]</code>
305 <p>Evaluates to the <code>i</code> stored at the location in the
306 array given by <var>i-expr</var>.</p>
307 </li>
308
309 <li><code><var>a-expr</var> [<var>i-expr</var>] := <var>i-expr</var></code>
310 <p>Evaluates the second <var>i-expr</var> and stores the result in the location
311 in the array given by the first <var>i-expr</var>. Evaluates to whatever
312 the second <var>i-expr</var> evaluated to.</p>
313 </li>
314
315 <li><code><var>a-expr</var> ? <var>i-expr</var></code>
316 <p>Evaluates to <code>go</code> if <code><var>a-expr</var> [<var>i-expr</var>]</code>
317 and <code><var>i-expr</var></code> evaluate to the same thing, <code>nogo</code> otherwise.
318 The <var>i-expr</var> is only evaluated once.</p>
319 </li>
320
321 <li><code>minus <var>i-expr</var></code>
322 <p>Evaluate to the integer that is zero minus the result of evaluating <var>i-expr</var>.</p>
323 </li>
324
325 <li><code>write <var>i-expr</var></code>
326 <p>Write the Unicode code point whose number is obtained by evaluating <var>i-expr</var>, to
327 the standard output channel. Writing a negative number shall produce one of a number of
328 amusing and informative messages which are not defined by this document.</p>
329 </li>
330
331 <li><code>#read#</code>
332 <p>Wait for a Unicode character to become available on the standard
333 input channel and evaluate to its integer code point value.</p>
334 </li>
335
336 <li><code>not? <var>z-expr</var></code>
337 <p>Converts a bit value to a boolean value (<code>zero</code> becomes <code>true</code> and <code>one</code> becomes <code>false</code>).</p>
338 </li>
339
340 <li><code>if? <var>b-expr</var></code>
341 <p>Converts a boolean value to a condition value (<code>true</code> becomes <code>go</code> and <code>false</code> becomes <code>nogo</code>).</p>
342 </li>
343
344 <li><code>cvt? <var>c-expr</var></code>
345 <p>Converts a condition value to a truth-value (<code>go</code> becomes <code>yes</code> and <code>nogo</code> becomes <code>no</code>).</p>
346 </li>
347
348 <li><code>to? <var>t-expr</var></code>
349 <p>Converts a truth-value to a bit value (<code>yes</code> becomes <code>one</code> and <code>no</code> becomes <code>zero</code>).</p>
350 </li>
351
352 <li><code>P? <var>i-expr</var> [<var>t-var</var>]</code>
353 <p>If the result of evaluating <var>i-expr</var> is a prime number, evaluates to that
354 prime number (and has the type <code>p</code>). If it is not prime, stores the value <code>no</code>
355 into <var>t-var</var> and evaluates to 2.</p>
356 </li>
357
358 <li><code><var>i-expr</var> * <var>i-expr</var></code>
359 <p>Evaluates to the product of the two <var>i-expr</var>s. The result is never
360 of type <code>p</code>, but the implementation doesn't need to do anything
361 based on that fact.</p>
362 </li>
363
364 <li><code><var>i-expr</var> + <var>i-expr</var></code>
365 <p>Evaluates to the sum of the two <var>i-expr</var>s.</p>
366 </li>
367
368 <li><code>exists/dynast <var>i-expr</var></code>
369 <p>Evaluates to <code>one</code> if a dynast exists with the given label,
370 or <code>zero</code> if one does not.</p>
371 </li>
372
373 <li><code>copy/dynast <var>i-expr</var>, <var>p-expr</var>, <var>p-expr</var></code>
374 <p>Creates a new dynast based on an existing one. The existing one is identified by
375 the label given in the <var>i-expr</var>. The new dynast is a copy of the existing
376 dynast, but with a new label. The new label is the sum of the two <var>p-expr</var>s.
377 If a dynast with that label already exists, the program terminates.
378 (<em>Defined in 1.1</em>) This expression evaluates to the value of the given <var>i-expr</var>.</p>
379 </li>
380
381 <li><code>create/countably/many/dynasts <var>i-expr</var>, <var>i-expr</var></code>
382 <p>Creates a countably infinite number of dynasts based on an existing one. The existing one is identified by
383 the label given in the first <var>i-expr</var>. The new dynasts are copies of the existing
384 dynast, but with new labels. The new labels start at the first odd integer
385 greater than the second <var>i-expr</var>, and consist of every odd integer greater than that.
386 If any dynast with such a label already exists, the program terminates.
387 (<em>Defined in 1.1</em>) This expression evaluates to the value of the first given <var>i-expr</var>.</p>
388 </li>
389
390 <li><code><var>b-expr</var> and <var>b-expr</var></code>
391 <p>Evaluates to <code>one</code> if both <var>b-expr</var>s are <code>true</code>,
392 <code>zero</code> otherwise. Note that this is not short-circuiting; both <var>b-expr</var>s
393 are evaluated.</p>
394 </li>
395
396 <li><code><var>c-expr</var> or <var>c-expr</var></code>
397 <p>Evaluates to <code>yes</code> if either or both <var>c-expr</var>s are <code>go</code>,
398 <code>no</code> otherwise. Note that this is not short-circuiting; both <var>c-expr</var>s
399 are evaluated.</p>
400 </li>
401
402 <li><code>do <var>expr</var></code>
403 <p>Evaluates the <var>expr</var>, throws away the result, and evaluates to <code>go</code>.</p>
404 </li>
405
406 <li><code><var>c-expr</var> then <var>expr</var></code>
407 <p><strong>Wimpmode only.</strong> Evaluates the <var>c-expr</var> on the left-hand side for its side-effects only,
408 throwing away the result, then evaluates to the result of evaluating the right-hand
409 side <var>expr</var>.</p>
410 </li>
411
412 <li><code><var>c-expr</var> ,then <var>i-expr</var></code>
413 <p>(<em>New in 1.1</em>) Evaluates the <var>c-expr</var> on the left-hand side; if it is <code>go</code>,
414 evaluates to the result of evaluating the right-hand side <var>i-expr</var>;
415 if it is <code>nogo</code>, evaluates to an unspecified and quite possibly random
416 integer between 1 and 1000000 inclusive, without evaluating the right-hand side.
417 Note that this operator has the same precedence as <code>then</code>.</p>
418 </li>
419
420 <li><code>for each prime <var>var</var> below <var>i-expr</var> do <var>i-expr</var></code>
421 <p>The <var>var</var> must be a declared variable of type <code>p</code>. The first
422 <var>i-expr</var> must evaluate to an integer, which we will call <var>k</var>.
423 The second <var>i-expr</var> is evaluated once for each prime number between
424 <var>k</var> and 2, inclusive; each time it is evaluated, <var>var</var> is bound to
425 a successively smaller prime number between <var>k</var> and 2.
426 (<em>Defined in 1.1</em>) Evaluates to the result of the final evaluation of the second <var>i-expr</var>.</p>
427 </li>
428
429 </ul>
430
431 <h3>Grammar</h3>
432
433 <p>This section attempts to capture and summarize the syntax rules (for a single
434 parse stream) described above, using an EBNF-like syntax extended with a few
435 ad-hoc annotations that I don't feel like explaining right now.</p>
436
437 <pre>
438 ParseStream ::= VarDeclBlock {DynastLit}.
439 VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".".
440 VarDecl ::= TypeSpec VarName.
441 TypeSpec ::= "i" | "p" | "a" | "b" | "t" | "z" | "c".
442 VarName ::= "/" Pattern "/".
443 Pattern ::= {[a-zA-Z0-9 ]}
444 | Pattern "|" Pattern /* ignoring precedence here */
445 | Pattern "*" /* and here */
446 | "(" Pattern ")".
447 DynastLit ::= "dynast" "(" Gumber ")" "&lt;-&gt;" Expr.
448 Expr ::= Expr1[c] {"then" Expr1 | ",then" Expr1[i]}.
449 Expr1 ::= Expr2[c] {"or" Expr2[c]}.
450 Expr2 ::= Expr3[b] {"and" Expr3[b]}.
451 Expr3 ::= Expr4[i] {"+" Expr4[i]}.
452 Expr4 ::= Expr5[i] {"*" Expr5[i]}.
453 Expr5 ::= Expr6[a] {"?" Expr6[i]}.
454 Expr6 ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]].
455 Prim ::= {"("}* "." Expr "." {")"}* /* remember the Fibonacci rule! */
456 | VarName [":=" Expr]
457 | Gumber
458 | "#myself#"
459 | "minus" Expr[i]
460 | "write" Expr[i]
461 | "#read#"
462 | "not?" Expr[z]
463 | "if?" Expr[b]
464 | "cvt?" Expr[c]
465 | "to?" Expr[t]
466 | "P?" Expr[i]
467 | "exists/dynast" Expr[i]
468 | "copy/dynast" Expr[i] "," Expr[p] "," Expr[p]
469 | "create/countably/many/dynasts"
470 Expr[i] "," Expr[i]
471 | "do" Expr
472 | "for" "each" "prime" VarName "below"
473 Expr[i] "do" Expr[i].
474 Gumber ::= {[0-9]}.
475 </pre>
476
477 <h3>Boolean Idioms</h3>
478
479 <p>Here we show how we can get any value of any of the <code>b</code>, <code>t</code>,
480 <code>z</code>, and <code>c</code> types, without any constants or variables with
481 known values of these types.</p>
482
483 <pre>
484 VARIABLES ARE b /b*/.
485 zero = /b*|b/ and not? to? cvt? if? /b*|b*/
486 true = not? zero
487 go = if? true
488 yes = cvt? go
489 one = to? yes
490 false = not? one
491 nogo = if? false
492 no = cvt? nogo
493 </pre>
494
495 <h3>Computational Class</h3>
496
497 <p>Because the single in-dynast looping construct, <code>for each prime below</code>, is
498 always a finite loop, the execution of any fixed number of dynasts cannot be Turing-complete.
499 We must create new dynasts at runtime, and continue execution in them, if we want
500 any chance at being Turing-complete. We demonstrate this by showing an example of
501 a (conjecturally) infinite loop in Oozlybub and Murphy, an idiom which will doubtless
502 come in handy in real programs.</p>
503
504 <pre>
505 VARIABLES ARE p /p*/, p /q*/.
506 dynast(3) &lt;-&gt;
507 (. do (. if? not? exists/dynast 5 ,then
508 create/countably/many/dynasts #myself#, 5 .) .) ,then
509 (. for each prime /p*|p/ below #myself#+2 do
510 for each prime /q*|q/ below /p*|pp/+1 do
511 if? not? exists/dynast /p*|p|p/+/q*|q|q/ ,then
512 copy/dynast #myself#, /p*|ppp/, /q*|qqq/ .)
513 </pre>
514
515 <p>As you can see, the ability to loop indefinitely in Oozlybub and Murphy hinges on whether
516 Goldbach's Conjecture is true or not. Looping forever requires creating an unbounded number of
517 new dynasts. We can create all the odd-numbered dynasts at once, but that won't be enough
518 to loop forever, as we must proceed to the next highest numbered dynast after executing a
519 dynast. So we must create new dynasts with successively higher even integer labels, and these
520 can only be created by summing two primes. So, if Goldbach's conjecture is false, then there
521 is some even number greater than two which is not the sum of two primes; thus there is some
522 dynast that cannot be created by a running Oozlybub and Murphy program, thus it is not possible
523 to loop forever in Oozlybub and Murphy, thus Oozlybub and Murphy is not Turing-complete
524 (because it cannot simulate any Turing machine that loops forever.)</p>
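<p>To make the dependence concrete (an illustrative Python check only, not
part of the language): the even labels the loop needs are exactly the
Goldbach sums, and while no counterexample to the conjecture is known, a
finite check like the one below proves nothing in general.</p>

<pre>
def is_prime(n):
    if n &lt; 2:
        return False
    d = 2
    while d * d &lt;= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_pair(n):
    """Return some pair of primes summing to the even number n, if any."""
    for p in range(2, n):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None

print(all(goldbach_pair(n) is not None for n in range(4, 1000, 2)))  # True
print(goldbach_pair(10))                                             # (3, 7)
</pre>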
525
526 <p>It should not however be difficult to show that Oozlybub and Murphy is Turing-complete under
527 the assumption that Goldbach's Conjecture is true. If Goldbach's Conjecture is true, then the above program
528 is an infinite loop. We need only add to it appropriate conditional instructions to,
529 say, simulate the execution of an arbitrarily-chosen Turing machine. An array can serve as the
530 tape, and an integer can serve as the head. Another integer can serve as the state of the finite
531 control. The integer can be tested against various fixed integers by establishing an array for each
532 of these fixed integers and using the <code>?</code> operator against each in turn; each branch
533 can mutate the tape, tape head, and finite control as desired. The program can halt by neglecting
534 to create a new even dynast to execute next, or by trying to create a dynast with a label that
535 already exists.</p>
536
537 <p>Happy FLIMPING,
538 <br/>Chris Pressey
539 <br/>December 1, 2010
540 <br/>Evanston, Illinois, USA
541 </p>
542
543 </body>
544 </html>
0 <html><head>
1 <meta http-equiv="Content-Type" content="text/html;CHARSET=iso-8859-1">
2 <title>Cat's Eye Technologies: The Opus-2 Language</title>
3 </head>
4 <body bgcolor="#ffffff" text="#000000" link="#1c3f7f"
5 vlink="#204020" background="/img/sinewhite.gif" >
6
7 <h1>Opus-2</h1>
8
9 <p>Opus-2 is an abstract artlang composed by Chris Pressey at or around
10 March 10, 2001.
11
12 <h3>Design Goals</h3>
13
14 <p>Eliminate word order entirely. Despite the appearance of the
15 resulting language, this was the only real design goal to begin with.
16
17 <h3>Grammatical Overview</h3>
18
19 <p>Verbs in Opus-2 take the form of colours. Nouns take the form of
20 sounds. Adjectives take the form of smells. Adverbs take the form of
21 inner-ear sensations. Certain tenses and phrasings are indicated by
22 tastes.
23
24 <p>To distinguish between the roles of the nouns in a sentence,
25 objects are quieter, <i>sotto voce</i> sounds, and subjects are
26 foreground sounds. It is important to remember that the sensations
27 corresponding to object, subject, and verb all occur at the same time
28 in an event termed a <i>sentence-experience</i>.
29
30 <p>This dominant-recessive relationship is also present in strong
31 and weak scents which indicate whether an adjective describes the
32 subject or the object, and in intense and gentle inner-ear sensations
33 (feelings of sudden or gradual acceleration) to determine the target
34 of an adverb.
35
36 <h3>Vocabulary Overview</h3>
37
38 <p>Sample dictionary:
39
40 <p><center><table border=0>
41
42 <tr><td colspan=2 align=center><b>verbs</b></td></tr>
43 <tr><td>flee</td><td><i>pale green</i></td></tr>
44 <tr><td>approach</td><td><i>deep orange</i></td></tr>
45 <tr><td>examine</td><td><i>medium grey</i></td></tr>
46 <tr><td>glorify</td><td><i>deep red</i></td></tr>
47
48 <tr><td colspan=2 align=center><b>nouns</b></td></tr>
49 <tr><td>man</td><td><i>Eb below middle C, trombone</i></td></tr>
50 <tr><td>woman</td><td><i>F above middle C, french horn</i></td></tr>
51 <tr><td>world</td><td><i>car door slamming</i></td></tr>
52 <tr><td>child</td><td><i>middle C, tubular bells</i></td></tr>
53 <tr><td>building</td><td><i>F, tympani roll</i></td></tr>
54 <tr><td>radio</td><td><i>harp sweep</i></td></tr>
55
56 <tr><td colspan=2 align=center><b>adjectives</b></td></tr>
57 <tr><td>fast</td><td><i>burning rubber</i></td></tr>
58 <tr><td>dangerous</td><td><i>mothballs</i></td></tr>
59
60 <tr><td colspan=2 align=center><b>adverbs</b></td></tr>
61 <tr><td>quickly</td><td><i>leaning 40 degrees left</i></td></tr>
62 <tr><td>dangerously</td><td><i>leaning 25 degrees right</i></td></tr>
63
64 </table></center>
65
66 <h3>Context Overview</h3>
67
68 <p>While each sentence is "instantaneous" in the sense that
69 there is no internal word order, sentence-experiences still
70 follow one another, and each sentence-experience does take a certain
71 amount of time to perceive.
72 Tense is thus implied by context between successive sentences
73 and the duration of each sentence. The shorter a sentence is,
74 the further into the future it is presumed to refer to.
75 (Thanks to Rob Norman and Panu Kalliokosi for suggesting these ideas.)
76
77 <h3>Examples of Usage</h3>
78
79 <p>Example sentence-experience: "The building glorifies the woman":
80
81 <p><center><table border=2>
82 <tr><td><i>deep red</i></td></tr>
83 <tr><td><i>F, tympani roll, forte</i></td></tr>
84 <tr><td><i>F, french horn, piano</i></td></tr>
85 </table></center>
86
87 <p>Example sentence-experience: "The man quickly flees the dangerous child":
88
89 <p><center><table border=2>
90 <tr><td><i>pale green</i></td></tr>
91 <tr><td><i>Eb, trombone, forte</i></td></tr>
92 <tr><td><i>leaning 40 degrees left (sudden)</i></td></tr>
93 <tr><td><i>C, tubular bells, piano</i></td></tr>
94 <tr><td><i>mothballs (gentle whiff)</i></td></tr>
95 </table></center>
96
97 <h3>Who Speaks Opus-2?</h3>
98
99 <p>This language was designed purely as an abstract exercise in language
100 design. Thus it was not designed for any preconceived group of speakers,
101 and little consideration was given to their culture and capabilities.
102 It is neither specifically a conversational language, nor a formalized
103 language (e.g. a programming language.)
104
105 <p>Most of the problem of finding speakers of Opus-2 lies in finding creatures
106 that can create smells and inner-ear sensations as easily as humans can
107 create complex sounds.
108
109 <p>However, some suggestions as to who or what might speak Opus-2 have
110 been put forward since its unveiling:
111
112 <ul>
113 <li>An efficient-yet-entertaining form of future communication using direct neural jacks.
114 <li>A code used by e.g. Neo (from <i>The Matrix</i>) to communicate to
115 subjects unknowingly trapped in a virtual reality.
116 <li>A pidgin spoken between highly-telepathic beings and marginally-telepathic beings.
117 </ul>
118
119 </body></html>
0 <html><head>
1 <meta http-equiv="Content-Type" content="text/html;CHARSET=iso-8859-1">
2 <meta name="Description" content="Cat's Eye Technologies: The Tamerlane Programming Language">
3 <title>Cat's Eye Technologies: The Tamerlane Programming Language</title>
4 </head>
5 <body bgcolor="#ffffff" text="#000000" link="#1c3f7f"
6 vlink="#204020" background="/img/sinewhite.gif" >
7 <center>
8 <h1>Tamerlane</h1>
9 <p>Chris Pressey
10 <br>Created Jan 29 2000
11 </center>
12
13 <h3>Introduction to Tamerlane</h3>
14
15 <p>Tamerlane is a "constraint flow" language. The point of its creation
16 is to attempt to break as many paradigmatic stereotypes and idioms as
17 I'm aware of; at least, to make it tricky and confounding
18 to pigeonhole easily.
19
20 <p>It has some concepts in it from imperative languages,
21 some from functional languages, some from dataflow and object oriented
22 languages, and some from graph rewriting
23 and other constraint-based languages, and they're all muddled together
24 into a ridiculous <i>potpourri</i>.
25
26 <p>Despite being such a mutt, there might actually be some algorithms that
27 would be dreadfully easy to write in Tamerlane.
28
29 <h3>Overview of Tamerlane</h3>
30
31 <p>A Tamerlane program consists of a mutably weighted directed graph.
32
33 <p>Each node is considered to be an independent updatable store.
34 The data held in each store is represented by the weights of the
35 arcs exiting the node.
36
37 <p>An arc of weight zero is functionally equivalent to the absence
38 of an arc.
39
40 <h3>Description and Example</h3>
41
42 <p>At this point we may introduce the syntax in a sample ASCII
43 notation for a simple, almost pathological Tamerlane program:
44
45 <pre>
46 Point-A: 1 Point-B,
47 Point-B: 1 Point-C,
48 Point-C: 1 Point-A.
49 </pre>
50
51 <p>The user of a Tamerlane program may submit <i>messages</i>
52 to the program at runtime. In this sense the user and the program are
53 both objects which share the symmetrical relationship
54 <b>user-of/used-by</b>. The program object's interface exposes a
55 <tt>query</tt> method to the user, which is to be considered
56 runtime-polymorphic.
57
58 <p>Using this message-passing mechanism, queries are submitted by the user to a
59 running Tamerlane program, much as queries would be submitted
60 to a running Prolog program.
61
62 <p>Queries have their own syntax and semantics.
63 Unlike Prolog, the user's queries are interpreted as <i>rules</i>,
64 perhaps accompanied by information about
65 where and when the rules are "introduced" into the graph.
66
67 <p>As an example of a query that could be submitted to the above
68 Tamerlane program:
69
70 <pre>
71 1 Point-A -> 0 Point-A @ Point-A
72 </pre>
73
74 This would introduce the rewriting rule
75
76 <ul>"Replace occurances of <tt>1 Point-A</tt> with <tt>0 Point-A</tt>"
77 </ul>
78
79 <p>This rule is applied to the weights of the nodes in the graph
80 starting with the node specified after the <tt>@</tt> symbol. In this
81 instance it would start by trying to apply the rewrite to
82 the node <tt>Point-A</tt>, but finding <tt>Point-A</tt> to contain <tt>1 Point-B</tt>,
83 nothing would change.
84
85 <p>Each time a further query is submitted, each rule which has
86 been introduced into the graph disappears from the node it
87 was working on, and is transmitted to the adjacent node with
88 the lowest positive weight value. If there is a tie for
89 lowest weight, the rule is transmitted to all of the
90 adjacent nodes with the same lowest weight.
91
92 <p>For efficacy we can consider the user able to submit a <tt>nop</tt>
93 query. This would not introduce any new rules into the graph,
94 but it would cause all active rules to propagate to new nodes
95 on the graph.
96
97 <p>So assume the user submits <tt>nop</tt>. The rule that was
98 introduced by the last query 'moves' from the
99 node labelled <tt>Point-A</tt> to the node labelled <tt>Point-B</tt> (since it's
100 the only positive route out of <tt>Point-A</tt>.) It tries to rewrite
101 <tt>Point-B</tt>, but finding only <tt>1 Point-C</tt> in <tt>Point-B</tt>, nothing happens.
102
103 <p>Assume the user <tt>nop</tt>s again. The rule is propagated to <tt>Point-C</tt>.
104 Finally the pattern match succeeds, and <tt>Point-C</tt> is rewritten to
105 a new <tt>Point-C</tt>:
106
107 <pre>
108 Point-C: 0 Point-A.
109 </pre>
110
111 <p>After one more <tt>nop</tt>, the engine will generate a
112
113 <pre>
114 Rule stopped at Point-C (no adjacent nodes)
115 </pre>
116
117 <p>message back to the user. This uses the operation that the user object's
118 interface supplies to the running program, called <tt>messageback</tt>,
119 which the user must supply, but is, like <tt>query</tt>, considered
120 runtime-polymorphic.
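<p>Since no implementation of Tamerlane exists, the following is only a
hypothetical sketch (in Python, with invented names) that replays the
walkthrough above: the graph is a dictionary of exit-arc weights, and each
<tt>nop</tt> moves the rule along the lowest positively-weighted arc and
then attempts the rewrite at the new node.

<pre>
graph = {"Point-A": {"Point-B": 1},
         "Point-B": {"Point-C": 1},
         "Point-C": {"Point-A": 1}}

def apply_here(rule):
    node = graph[rule["at"]]
    target, weight = rule["match"]
    if node.get(target, 0) == weight and weight != 0:
        node[target] = rule["rewrite"][1]        # the rewrite fires

def nop(rule):
    """Move the rule, then try the rewrite (ties ignored in this sketch)."""
    exits = [(w, n) for n, w in graph[rule["at"]].items() if w > 0]
    if not exits:
        print("Rule stopped at", rule["at"], "(no adjacent nodes)")
        return False
    rule["at"] = min(exits)[1]                   # lowest positive weight
    apply_here(rule)
    return True

# query:  1 Point-A -> 0 Point-A @ Point-A
rule = {"match": ("Point-A", 1), "rewrite": ("Point-A", 0), "at": "Point-A"}
apply_here(rule)            # introduced at Point-A: nothing changes there
while nop(rule):            # Point-B (no match), Point-C (rewrites), stopped
    pass
print(graph["Point-C"])     # {'Point-A': 0}
</pre>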
121
122 <h2>Advanced Topics</h2>
123
124 <h3>Rule Priority</h3>
125
126 <p>Obviously, the user can enter more than one query in
127 succession; s/he doesn't need to explicitly <tt>nop</tt> to
128 wait for rules to resolve. Since the graph can contain cycles, this would lead
129 to a form of inherent, synchronous concurrency: each time the <tt>query</tt> or <tt>nop</tt>
130 methods are invoked, more than one rule may be applied to the same
131 node.
132
133 <p>The order in which these competing rules are applied is based
134 on the rule's <i>priority</i>. Rules can be submitted with a specific
135 priority in the following manner:
136
137 <pre>
138 1 Point-A -> 0 Point-A @ Point-A ! 10
139 </pre>
140
141 <p>If there is a tie in priority when two rules are competing to
142 rewrite the same node, the outcome is guaranteed to be
143 not simply undefined, but rather, non-deterministic, or at least
144 probabilistic.
145
146 <p>The user can also specify a delay, measured in number of method calls
147 (<tt>query</tt>s or <tt>nop</tt>s)
148 from the present query, at which the rule will be introduced into
149 the graph, like so:
150
151 <pre>
152 1 Point-A -> 0 Point-A @ Point-A in 10
153 </pre>
154
155 <p>And, for the sake of efficacy, the <tt>nop</tt> method on the
156 program object has overloaded syntaxes whose semantics are
157 'keep <tt>nop</tt>ing until some or all rules have stopped.'
158
159 <h3>Negative Weights</h3>
160
161 <p>A negative weight from one node to another is interpreted in a
162 rather negative fashion with respect to the running program.
163
164 <p>A negative weight is selected for propagation when it is closer
165 to zero (while still being non-zero) than any other weight (absolute
166 value.) When a negative weight is selected, however, the
167 semantics of propagation are different.
168
169 <p>When a rule encounters an arc of the form:
170
171 <pre>
172 X: -1 Y.
173 </pre>
174
175 <p>The rule is not "just" copied to Y like "usual". Instead, it
176 "rolls back" the rule to Y in the fashion of a functional
177 continuation.
178
179 <p>All of the nodes that have been "touched" by this rule, since it
180 was last at node Y, are reset to their condition when the rule
181 was last at node Y. The rule itself is deleted from X and
182 propagated to Y, and rewriting continues from there.
183
184 <p>If the rule has never visited Y, however, then such an exit arc is
185 not considered a valid candidate. It has an effective weight of
186 0 (non-existent.)
187
188 <h3>Lambda Graphs</h3>
189
190 <p>Lambda abstraction is a powerful form of referring to non-numeric
191 calculatory items such as functions and, in the case of Tamerlane, graphs.
192
193 <p>A user may submit a rule in the form
194
195 <pre>
196 1 X -> 1 Y ? Y: 1 Z, Z: 1 Y
197 </pre>
198
199 <p>Y is like a 'local variable' in this respect. When the rule is
200 applied to the node, and the pattern match succeeds, the nodes
201 named Y and Z need not actually exist, as they are simply
202 created dynamically and 'attached' to the program graph.
203
204 <p>Lambda graphs are subject to garbage collection, since they can
205 only be used by the rule that created them.
206
207 <h3>Pigeonholes (Updatable)</h3>
208
209 <p>Variables may be supplied which are explicitly updatable,
210 and these are termed pigeonholes. Pigeonholes acknowledge events.
211 Their assignment is associated with both nodes and arcs. They
212 can occur when:
213
214 <ul>
215 <li>a rule enters a node
216 <li>a rule leaves a node
217 <li>a rule chooses an arc
218 <li>a rule passes through an arc
219 <li>a rule rewrites an arc
220 </ul>
221
222 <p>Updatable variables are always
223 named after the node (or node.arc) they're associated with, preceded by
224 a <tt>$</tt> symbol. The data in the variable is, of course, the arc weight
225 (or in the case of nodes, the node's internal key.)
226
227 <p>If the pigeonhole is assigned the special value <tt>%Cancel</tt>,
228 the rule is cancelled from moving to the node/through the arc/rewriting the
229 arc.
230
231 <h3>Placeholders (Unification)</h3>
232
233 <p>Variables may also be supplied as "placeholders". These are the
234 same as bindable (unifiable) variables in logical languages like
235 Prolog. An example of such might be:
236
237 <pre>
238 1 X 7 ^Y -> 7 X 1 ^Y
239 </pre>
240
241 <p>Which would replace "1 X 7 Foo" with "7 X 1 Foo", "1 X 7 Bar" with
242 "7 X 1 Bar", etc.
243
244 <h3>Horn Rules</h3>
245
246 <p>Horn rules only succeed if all of their constituent rules succeed. A Horn rule
247 is specified like:
248
249 <pre>
250 1 X -> 1 Y + 1 Z 0 G -> 1 G
251 </pre>
252
253 <p>(Remember that 0 G is equivalent to "the absence of an arc to G." If you
254 have a pattern like
255
256 <pre>
257 ^a G ^b G 0 G -> ^b Q
258 </pre>
259
260 <p>It will only match when there are exactly two different exit arcs to
261 the same node G, which is not disallowed (nor is a node with an exit
262 arc which points to itself.))
263
264 <p>Note also that arcs are unordered amongst themselves. The rule
265
266 <pre>
267 1 A 1 B 1 C -> 1 E 3 D
268 </pre>
269
270 <p>is the same as
271
272 <pre>
273 1 B 1 A 1 C -> 3 D 1 E
274 </pre>
275
276 <hr>
277 <center><font size=-1>Last Updated Jan 27&nbsp;<img src="/img/copyright.gif"
278 align=absmiddle width=12 height=12 alt="(c)" border=0>2001 <a href="/">Cat's Eye Technologies</a>.</font></center>
279 </body></html>
0 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
1 <!-- encoding: UTF-8 -->
2 <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
3 <head>
4 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
5 <title>You are Reading the Name of this Esolang</title>
6 <!-- begin html doc dynamic markup -->
7 <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
8 <script type="text/javascript" src="/scripts/documentation.js"></script>
9 <!-- end html doc dynamic markup -->
10 </head>
11 <body>
12
13 <h1>You are Reading the Name of this Esolang</h1>
14
15 <p>November 2007, Chris Pressey, Cat's Eye Technologies</p>
16
17 <h2>Introduction</h2>
18
19 <p>This programming language, called <b>You are Reading the Name of this Esolang</b>,
20 is my first foray into the design space of programming languages whose programs contain
21 undecidable elements. In the case of You are Reading the Name of this Esolang,
22 these elements are the instructions themselves — or rather, the symbols that
23 the instructions are composed of.</p>
24
25 <p>Before we begin, some lexical notes. The name of this language is not
26 pronounced exactly how it looks; rather, it is pronounced as an English speaker
27 would pronounce the phrase
28 "you are hearing the name of this esolang." In addition, it is strongly discouraged
29 to refer to this language by the name "Yartnote", whether spoken or in writing, and
30 in any capitalization scheme. After all, there may actually be a completely
31 unrelated esolang called Yartnote one day, Zeus willing.
32 A similar logic applies to the taboo on calling it "YRNE".</p>
33
34 <h2>Program structure</h2>
35
36 <p>A You are Reading the Name of this Esolang program is a string of
37 symbols drawn from the alphabet <code>0</code>, <code>1</code>, <code>[</code>, and
38 <code>]</code>.</p>
39
40 <p><code>0</code> and <code>1</code> are interpreted as they are in the programming
41 language Spoon, or rather, the slightly more clearly-specified version of Spoon that
42 follows. The Spoon tape is considered unbounded in both directions. Each cell of
43 the Spoon tape may contain any non-negative integer (again, unbounded.) In addition,
44 attempting to decrement a cell below 0 results in an immediate termination of the
45 program. Oh, and there's no way to change what the symbols are.
46 But that's the extent of the difference, I think.</p>
47
48 <p>For your convenience, the Spoon instructions are repeated here
49 (taken from the public-domain <a href="http://esolangs.org/wiki/Spoon">Spoon</a>
50 entry on the <a href="http://esolangs.org/wiki/">Esolang wiki</a>):</p>
51
52 <pre>1 Increment the memory cell under the pointer
53 000 Decrement the memory cell under the pointer
54 010 Move the pointer to the right
55 011 Move the pointer to the left
56 0011 Jump back to the matching 00100
57 00100 Jump past the matching 0011 if the cell under the pointer is zero
58 001010 Output the character signified by the cell at the pointer
59 0010110 Input a character and store it at the cell in the pointer
60 00101110 Output the entire memory array
61 00101111 Immediately terminate program execution</pre>
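<p>Since none of these codes is a prefix of another, a decoder can simply
read symbols until its buffer matches one of them. The following sketch
(not part of the language; the glosses on the right are just mnemonic
labels invented here) illustrates this:</p>

<pre>
SPOON = {"1": "INC", "000": "DEC", "010": "RIGHT", "011": "LEFT",
         "0011": "LOOP-END", "00100": "LOOP-START", "001010": "OUT",
         "0010110": "IN", "00101110": "DUMP", "00101111": "HALT"}

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in SPOON:          # prefix-free, so the first match is right
            out.append(SPOON[buf])
            buf = ""
    if buf:
        raise ValueError("dangling bits: " + buf)
    return out

print(decode("10101001010"))      # ['INC', 'RIGHT', 'INC', 'OUT']
</pre>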
62
63 <p>Each <code>[</code> must be matched with a <code>]</code>; between them lies a
64 subprogram with the same structure as a general You are Reading the Name of this Esolang
65 program. The meaning of this subprogram is determined from its structure as follows.
66 The subprogram is considered to be given the same input as the entire program. If the
67 subprogram halts on this input, it is reduced to a <code>1</code>, and if it loops
68 forever on this input, it is reduced to a <code>0</code>. These reduced instructions
69 are interpreted as they are in Spoon, as described above. Any output produced by the
70 subprogram is simply discarded.</p>
71
72 <p>Subprograms may themselves contain subprograms, nested to arbitrary depth, in which case the
73 reduction above is recursively applied, from the inside out, until a string of only
74 <code>0</code> and <code>1</code> symbols remains. This is then executed as if it
75 were a Spoon program. Any syntactically ill-formed program or subprogram is
76 considered to halt immediately, producing no output. Note however that, as a consequence
77 of this, a subprogram can be syntactically ill-formed (for example consisting of a
78 single <code>0</code>) while the parent program can still be syntactically OK
79 (the subprogram just reduces to a <code>1</code> in the parent program.)</p>
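<p>Purely as an illustration (a hypothetical sketch, not part of this
specification): the reduction can be performed from the inside out by
repeatedly rewriting the innermost bracketed subprogram, given some
procedure that tries to decide halting. The <code>decide_halts</code>
stand-in below meets only the minimum requirement discussed in the next
section: it recognizes one always-halting and one always-looping
subprogram, and gives up otherwise.</p>

<pre>
KNOWN_LOOPER = "0010000000111001000011"    # loops forever on any input

def decide_halts(subprogram, input_text):
    if subprogram == KNOWN_LOOPER:
        return False                       # provably loops forever
    if subprogram == "1":                  # a single increment: halts
        return True
    return None                            # this implementation cannot tell

def reduce_program(program, input_text):
    """Replace each bracketed subprogram with 1 (halts) or 0 (loops),
    innermost first, until only 0s and 1s remain."""
    while "[" in program:
        close = program.index("]")
        open_ = program.rindex("[", 0, close)
        inner = program[open_ + 1:close]   # innermost: contains no brackets
        verdict = decide_halts(inner, input_text)
        if verdict is None:
            raise NotImplementedError("cannot decide: " + inner)
        program = (program[:open_] + ("1" if verdict else "0")
                   + program[close + 1:])
    return program

print(reduce_program("010[1]1[" + KNOWN_LOOPER + "]", ""))    # 010110
</pre>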
80
81 <h2>Implementation notes</h2>
82
83 <p>An implementation will determine if a particular subprogram halts, or not, if it can.
84 Implementations may vary in the power of their proof methods used for this, but at a
85 minimum must be able to recognize at least one subprogram that halts on any input, and
86 one subprogram that loops forever on any input. This implementation-dependence should
87 not strike anyone as too bizarre, I don't think — it is quite similar to how
88 different implementations of a traditional systems-programming language can, for example,
89 provide different levels of support for sizes of numerical data types like integers.</p>
90
91 <p>Recall that the problem of telling if an arbitrary program in some given Turing-complete language
92 halts on some input is <em>undecidable</em>; put another way, the set of all programs
93 that halt on some input is <em>recursively enumerable</em> but not recursive. The set of all programs that loop
94 forever on some given input is the complement of this set, and it is called <em>co-recursively
95 enumerable</em> (<em>co-r.e.</em> for short.)</p>
96
97 <p>Despite this, there are many methods, ranging from simplistic to sophisticated,
98 that can be used to prove in <em>specific</em> circumstances that a given program,
99 on a given input, will either halt or fail to halt. These methods can be used in
100 a You are Reading the Name of this Esolang implementation.</p>
101
102 <p>The simplest method for proving that a subprogram halts is probably just to
103 simulate it on the given input indefinitely, returning <code>1</code> if it halts.
104 If the subprogram does indeed halt, this technique will (eventually) reveal
105 that fact. The simulation can be performed concurrently with other subprograms,
106 so that if no proof of halting for one subprogram is ever found, this will not
107 prevent other subprograms from being checked.</p>
108
109 <p>The simplest method for proving that a subprogram loops forever is probably to
110 check it against a library of subprograms known to loop forever. For example, it
111 can check if the program is <code>0010000000111001000011</code>
112 (in Brainfuck: <code>[-]+[]</code>.
113 You can readily assure yourself that this program loops forever, on any input.)
114 This technique of course
115 limits the number of recognizably looping subprograms that the implementation can
116 handle to a finite number. Checking the general case, that is, recognizing an infinite number
117 of forever-looping programs, is more difficult, however. Such an implementation
118 will require techniques such as automatically finding a proof by induction,
119 or abstract interpretation. However, ultimately, we know from the Halting Problem that there
120 is no perfectly general technique that will recognize <em>every</em> program that
121 loops forever.</p>
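
<p>(For a concrete illustration of the gap between a finite library and the general
case: the example above already generalizes to an infinite family. Prefixing
<code>[-]+[]</code> with any number of <code>&gt;</code> instructions, giving
<code>&gt;[-]+[]</code>, <code>&gt;&gt;[-]+[]</code>, and so on, still yields a
program that loops forever on any input, since the final, empty loop is always
entered with a nonzero cell. No finite library contains every member of this family,
but a simple structural argument recognizes all of them.)</p>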
122
123 <h2>Computability class</h2>
124
125 <p>You are Reading the Name of this Esolang can be trivially shown to be as powerful
126 as Spoon, since valid Spoon programs are valid You are Reading the Name of this Esolang
127 programs. (Modulo the absence of negative-valued tape cells and the other little variations
128 mentioned above. Since these issues have been dealt with extensively in the
129 Brainfuck "literature", such as it is, I'm not going to worry about them.)</p>
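
<p>For instance, the bracket-free program <code>111</code> is simply the Spoon
program <code>+++</code>; it means exactly the same thing in both languages.</p>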
130
131 <p>What about the other way around?</p>
132
133 <p>Well, take as a starting point the fact (in classical logic, at least)
134 that every Spoon program either halts at some point or loops forever. (Whether we can
135 <em>discover</em> which of these is the case is a different story.)
136 This means that every You are Reading the Name of this Esolang program has a
137 "canonical" form consisting of just <code>1</code>'s and <code>0</code>'s —
138 again, whether we have an interpreter powerful enough to discover it or not.
139 At this level, Spoon is as powerful as "canonical" You are Reading the Name of this Esolang,
140 because "canonical" You are Reading the Name of this Esolang programs are valid
141 Spoon programs.</p>
142
143 <p>Now we can go in the other direction. We can start with any "canonical"
144 You are Reading the Name of this Esolang program, and replace each <code>1</code>
145 with any You are Reading the Name of this Esolang subprogram that always halts.
146 Since even the simple method, described above, of proving that a subprogram halts
147 will always resolve it to <code>1</code> if the subprogram does indeed halt, this subset
148 of You are Reading the Name of this Esolang programs is still executable
149 in Spoon (or any other Turing-complete language). We only need to add the halting
150 proof mechanism to rewrite the program into "canonical" form, before executing
151 or simulating it.</p>
152
153 <p>This extends recursively to any arbitrary level of nesting, too: we can
154 replace each <code>1</code> in each subprogram with subprograms that always
155 halt, with no bound. We only have to test these subprograms recursively,
156 from the inside out, to eventually recover a "canonical" program.</p>
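
<p>For instance, each <code>1</code> in the canonical program <code>111</code> can
be replaced by <code>[11]</code>, whose subprogram <code>11</code> always halts,
giving <code>[11][11][11]</code>; or, nesting one level deeper,
<code>[[11]1][11][11]</code>. Both of these reduce back to <code>111</code>.</p>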
157
158 <p>However, something strange happens when we turn our attention to <code>0</code>'s.
159 If we replace even one <code>0</code> with a subprogram that loops forever on
160 some input, there is always a possibility that: a) the You are Reading the Name of this Esolang
161 program will be run with that input, and that b) the interpreter cannot prove that
162 the subprogram loops forever with that input. Because the set of programs that loop forever
163 is co-r.e., there is no Turing machine (or Spoon program) that can look at any
164 one Spoon program and say, yep, I'm certain that this here program loops forever on this
165 input, so it should darn well be rewritten into a <code>0</code> symbol.</p>
166
167 <p>Thus it seems that there are You are Reading the Name of this Esolang programs
168 which no Spoon interpreter — indeed, not any interpreter for any Turing-complete
169 language — is able to interpret.</p>
170
171 <p>Since the case of <code>0</code>'s seems to mirror the case of <code>1</code>'s
172 when it comes to expanding them into subprograms, we may conjecture that, instead
173 of remaining basically the same as we consider more and more deeply nested subprograms
174 (as it did with expanding <code>1</code>'s,) the problem
175 becomes more intractable the deeper we go in expanding <code>0</code>'s.
176 Perhaps we are climbing up the arithmetic hierarchy?</p>
177
178 <p>At any rate, <em>I</em> certainly initially conjectured that that was the case,
179 but it appears to be off the mark. Say you have a Spoon interpreter
180 that's equipped with an oracle. You can feed an input string and a Spoon program into
181 the oracle, and the oracle tells you whether or not that program halts on that input.
182 You could then use that oracle to resolve a given You are Reading the Name of this Esolang
183 subprogram into a <code>0</code> or <code>1</code>. But, you could also do this
184 recursively, resolving them from the inside outward. A Spoon interpreter with such
185 an oracle would be able to simulate any You are Reading the Name of this Esolang program
186 — no more powerful oracle is needed, no matter how deep the subprograms are
187 nested.</p>
188
189 <h2>Discussion</h2>
190
191 <p>I started designing You are Reading the Name of this Esolang shortly after
192 reading about the programming language Gravity, while trying to determine the
193 sense in which it is "non-computable." In particular, I noticed that these two statements
194 (taken from the <a href="http://esolangs.org/wiki/Gravity">Gravity</a>
195 entry on the <a href="http://esolangs.org/wiki/">Esolang wiki</a>,)
196 on which the claim of Gravity's "non-computability" apparently rests,
197 have ready analogies to problems in the world of Turing machines:</p>
198
199 <ul>
200 <li>"Although [Gravity's] behavior is well-defined and deterministic, the evolution of its
201 space is in general non-computable [...]"
202 <p>The evolution of the state-space (set of successive configurations) of a universal
203 Turing machine is also in general non-computable (there's no Turing machine that can tell
204 you that some given configuration will never be reached.)
205 </p></li>
206 <li>"It can be shown that a Turing machine cannot compute, in the general case, whether
207 even a single collision [in a given Gravity program] will ever happen."
208 <p>It can also be shown that a Turing machine cannot compute, in the general case,
209 whether or not even a single given state of another Turing machine's finite control will
210 ever be reached. (Just make that state a halt state, and you have the Halting Problem
211 right there.)</p></li>
212 </ul>
213
214 <p>Because of this, I am skeptical that Gravity is any more "non-computable" than
215 a universal Turing machine. (I am, however, far from an expert on the computability
216 of differential equations; it could be that the rather nonspecific term "non-computable",
217 as used in that subfield, means something stronger than simply "undecidable".)</p>
218
219 <p>At any rate, the idea interested me enough to spur me into designing a language
220 that I <em>could</em> be reasonably certain was non-computable, in
221 some sense that I could explain. The name You are Reading the Name of this Esolang
222 was drifting around nearby
223 in the æther at that moment, and seemed fitting enough for this monstrosity.</p>
224
225 <p>The general approach was to simply force the
226 language interpreter to decide — that is, to reduce to either a <code>0</code> or a
227 <code>1</code> — some problem that is undecidable. This led to looking for
228 something where specifically a <code>0</code> or a <code>1</code> was needed
229 to specify something essential, and that in turn
230 led to the choice of Spoon as a base language. (Of course, I could have picked
231 just about any language which is "its own binary Gödel numbering"; there are
232 plenty to choose from there, but Spoon had a cool name. What can I say — I
233 like The Tick.)</p>
234
235 <p>The obvious choice of undecidable
236 problem was whether another program halts or not. Making the subject of
237 this problem a <em>subprogram</em> with
238 the same structure as the general program let me examine the case of unbounded
239 recursive descent. This turned out to be not quite as interesting as I hoped, but perhaps
240 still somewhat illuminating. (Just what <em>would</em> it take, to require that
241 a Spoon interpreter have a more powerful oracle than HP, to run every You are Reading
242 the Name of this Esolang program? Perhaps
243 <a href="http://esolangs.org/wiki/Banana_Scheme">Banana Scheme</a> could provide
244 some inspiration, here. <em>It</em> certainly seems to be climbing the arithmetic
245 hierarchy, although I can't quite say how far. Possibly "damned far.")</p>
246
247 <p>I suppose one or two other things can be said about
248 You are Reading the Name of this Esolang.</p>
249
250 <p>Unlike both Gravity and Banana Scheme,
251 You are Reading the Name of this Esolang does not have a decidable <em>syntax</em>: the problem of
252 whether or not a given string over the alphabet <code>0</code>, <code>1</code>,
253 <code>[</code>, and <code>]</code> is even a well-formed You are Reading the Name of
254 this Esolang program is undecidable!</p>
255
256 <p>It's not entirely clear how to interpret the instruction
257 <code>00101110</code>, "Output the entire memory array," in the context of
258 having a tape unbounded in both directions. I suppose I ought to stipulate that
259 we are to just output the portion of the tape that has been "touched", i.e. that
260 the tape head has moved over. But really, it's not so important for the
261 goals of You are Reading the Name of this Esolang, so maybe I should just leave
262 it undefined, for kicks.</p>
263
264 <p>The fact that every subprogram takes the <em>same</em> input (same as the
265 main program) might lead to some interesting programs — programs which
266 are unduly sensitive to changes in the input, I imagine. Of course,
267 this doesn't affect the undecidability of subprograms, since they are always
268 free to ignore input completely.</p>
269
270 <p>There is no implementation, yet, but constructing an efficient one would be
271 a good exercise in static program analysis.</p>
272
273 <p>Happy undeciding!</p>
274
275 <p>-Chris Pressey
276 <br/>November 5, 2007
277 <br/>Chicago, Illinois, USA</p>
278
279 </body>
280 </html>
0 -> encoding: UTF-8
1
2 The Xoomonk Programming Language
3 ================================
4
5 Language version 0.1 Pretty Much Complete But Maybe Not Totally Finalized
6
7 Abstract
8 --------
9
10 _Xoomonk_ is a programming language in which _malingering updatable stores_
11 are first-class objects. Malingering updatable stores unify several language
12 constructs, including procedure activations, named parameters, and object-like
13 data structures.
14
15 Description
16 -----------
17
18 Overall, Xoomonk looks like a "typical" imperative language. The result
19 of evaluating an expression can be assigned to a variable, and the contents
20 of a variable can be used in a further expression.
21
22 | a := 1
23 | b := a
24 | print b
25 = 1
26
27 However, blocks of these imperative statements can appear as terms in
28 expressions; such blocks evaluate to entire updatable stores.
29
30 | a := {
31 | c := 5
32 | d := c
33 | }
34 | print a
35 = [c=5,d=5]
36
37 Once created, a store can be updated and accessed.
38
39 | a := {
40 | c := 5
41 | d := c
42 | }
43 | print a
44 | a.d := 7
45 | print a
46 | print a.c
47 = [c=5,d=5]
48 = [c=5,d=7]
49 = 5
50
51 A store can also be assigned to a variable after creation. Stores are
52 accessed by reference, so this creates two aliases to the same store.
53
54 | a := {
55 | c := 5
56 | d := c
57 | }
58 | b := a
59 | b.c := 17
60 | print a
61 | print b
62 = [c=17,d=5]
63 = [c=17,d=5]
64
65 To create an independent copy of the store, the postfix `*` operator is
66 used.
67
68 | a := {
69 | c := 5
70 | d := c
71 | }
72 | b := a*
73 | b.c := 17
74 | print a
75 | print b
76 = [c=5,d=5]
77 = [c=17,d=5]
78
79 Empty blocks are permissible.
80
81 | a := {}
82 | print a
83 = []
84
85 Once a store has been created, only those variables defined in the store
86 can be updated and accessed -- new variables cannot be added.
87
88 | a := { b := 6 }
89 | print a.c
90 ? Attempt to access an undefined variable
91
92 | a := { b := 6 }
93 | a.c := 12
94 ? Attempt to assign an undefined variable
95
96 Stores and integers are the only two data types in Xoomonk. However, there
97 are some special forms of the print statement, demonstrated here, which
98 allow for textual output.
99
100 | a := 65
101 | print char a
102 | print string "Hello, world!"
103 | print string "The value of a is ";
104 | print a;
105 | print string "!"
106 = A
107 = Hello, world!
108 = The value of a is 65!
109
110 Xoomonk enforces strict block scope. Variables can be shadowed in an
111 inner block.
112
113 | a := 14
114 | b := {
115 | a := 12
116 | print a
117 | }
118 | print a
119 = 12
120 = 14
121
122 The special identifier `^` refers to the lexically enclosing store, that is,
123 the store which was in effect when this store was defined.
124
125 | a := 14
126 | b := {
127 | a := 12
128 | print ^.a
129 | }
130 | print b.a
131 = 14
132 = 12
133
134 As long as a variable is assigned somewhere in a block, it can be accessed
135 before it is first assigned. In this case, it will hold the integer value 0.
136 It need not stay an integer, for typing is dynamic.
137
138 | b := {
139 | print a
140 | a := 12
141 | print a
142 | a := { c := 13 }
143 | print a
144 | }
145 | print b
146 = 0
147 = 12
148 = [c=13]
149 = [a=[c=13]]
150
151 We now present today's main feature.
152
153 It's important to understand that a block need not define all the
154 variables used in it. Such blocks do not immediately evaluate to a store.
155 Instead, they evaluate to an object called an _unsaturated store_.
156
157 Or, to put it another way:
158
159 If, in a block, you refer to a variable which has not yet been given
160 a value in that updatable store, the computations within the block are not
161 performed until that variable is given a value. Such a store is called an
162 _unsaturated store_.
163
164 | a := {
165 | d := c
166 | }
167 | print a
168 = [c=?,d=*]
169
170 An unsaturated store behaves similarly to a saturated store in certain
171 respects. In particular, unsaturated stores can be updated. If doing
172 so means that all of the undefined variables in the store are now defined,
173 the block associated with that store is evaluated, and the store becomes
174 saturated. In this sense, an unsaturated store is like a promise, and
175 this bears some resemblance to lazy evaluation (thus the term _malingering_).
176
177 | a := {
178 | print string "executing block"
179 | d := c
180 | }
181 | print a
182 | a.c := 7
183 | print a
184 = [c=?,d=*]
185 = executing block
186 = [c=7,d=7]
187
188 Once a store has become saturated, the block associated with it is not
189 executed again.
190
191 | a := {
192 | d := c
193 | }
194 | a.c := 7
195 | print a
196 | a.c := 4
197 | print a
198 = [c=7,d=7]
199 = [c=4,d=7]
200
201 The main program is a block. If it is unsaturated, no execution occurs
202 at all.
203
204 | print string "hello"
205 | a := 14
206 | b := c
207 =
208
209 Variables cannot generally be accessed from an unsaturated store.
210
211 | a := {
212 | d := c
213 | }
214 | x := a.c
215 ? Attempt to access an unassigned variable
216
217 | a := {
218 | d := c
219 | }
220 | x := a.d
221 ? Attempt to access an unresolved variable
222
223 This is true, even if the variable is assigned a constant expression
224 inside the block.
225
226 | a := {
227 | b := 7
228 | d := c
229 | }
230 | x := a.b
231 ? Attempt to access an unresolved variable
232
233 If, however, the unsaturated store contains some variables that have
234 been updated since the store was created, those variables may be accessed.
235
236 | a := {
237 |     print string "executing block"
238 | p := q
239 | d := c
240 | }
241 | a.q := 7
242 | print a.q
243 = 7
244
245 Nor is it possible to assign a variable of an unsaturated store, if that
246 variable is defined somewhere in the block.
247
248 | a := {
249 | b := 7
250 | d := c
251 | }
252 | a.b := 4
253 ? Attempt to assign an unresolved variable
254
255 A variable is considered unresolved even if it is assigned within the block,
256 if that assignment takes place during or after its first use in an expression.
257
258 | a := {
259 | b := b
260 | }
261 | a.b := 5
262 | print a
263 = [b=5]
264
265 | a := {
266 | print string "executing block"
267 | l := b
268 | b := 3
269 | l := 3
270 | }
271 | print string "saturating store"
272 | a.b := 5
273 | print a
274 = saturating store
275 = executing block
276 = [b=3,l=3]
277
278 We now describe how this language is (we reasonably assume) Turing-complete.
279
280 Operations are accomplished with certain built-in unsaturated stores. For
281 example, there is a store called `add`, which can be used for addition. These
282 built-in stores are globally available; they do not exist in any particular
283 store themselves. One uses the `$` prefix operator to access this global
284 namespace.
285
286 | print $add
287 | $add.x := 3
288 | $add.y := 5
289 | print $add.result
290 | print $add
291 = [x=?,y=?,result=*]
292 = 8
293 = [x=3,y=5,result=8]
294
295 Because using a built-in operation store in this way saturates it, it cannot
296 be used again. Typically you want to make a copy of the store first, and use
297 that, leaving the built-in store unmodified.
298
299 | o1 := $add*
300 | o1.x := 4
301 | o1.y := 7
302 | o2 := $add*
303 | o2.x := o1.result
304 | o2.y := 9
305 | print o2.result
306 = 20
307
308 Since Xoomonk is not a strictly minimalist language, there is a selection
309 of built-in stores which provide useful operations: `$add`, `$sub`, `$mul`,
310 `$div`, `$gt`, and `$not`.
311
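The other built-in stores presumably follow the same convention as `$add`:
operands are given in `x` and `y` (presumably just `x` for `$not`), the result
appears in `result`, and `$gt` and `$not` presumably yield 1 for true and 0 for
false. Under that assumption:

| g := $gt*
| g.x := 5
| g.y := 3
| print g.result
= 1
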
312 Decision-making is also accomplished with a built-in store, `if`. This store
313 contains variables called `cond`, `then`, and `else`. `cond` should
314 be an integer, and `then` and `else` should be unsaturated stores where `x` is
315 unassigned. When all three are assigned values, if `cond` is nonzero,
316 it is assigned to `x` in the `then` store; otherwise, if it is zero, it is
317 assigned to `x` in the `else` store.
318
319 | o1 := $if*
320 | o1.then := {
321 | y := x
322 |     print string "condition is true"
323 | }
324 | o1.else := {
325 | y := x
326 |     print string "condition is false"
327 | }
328 | o1.cond := 0
329 = condition is false
330
331 | o1 := $if*
332 | o1.then := {
333 | y := x
334 |     print string "condition is true"
335 | }
336 | o1.else := {
337 | y := x
338 |     print string "condition is false"
339 | }
340 | o1.cond := 1
341 = condition is true
342
343 Repetition is also accomplished with a built-in store, `loop`. This store
344 contains an unassigned variable called `do`. When it is assigned a value,
345 assumed to be an unsaturated store, a copy of it is made. The variable
346 `x` inside that copy is assigned the value 0. This is supposed to saturate
347 the store. The variable `continue` is then accessed from the store. If
348 it is nonzero, the process repeats, with another copy of the `do` store
349 getting 0 assigned to its `x`, and so forth.
350
351 | l := $loop*
352 | counter := 5
353 | l.do := {
354 | y := x
355 | print ^.counter
356 | o := $sub*
357 | o.x := ^.counter
358 | o.y := 1
359 | ^.counter := o.result
360 | continue := o.result
361 | }
362 | print string "done!"
363 = 5
364 = 4
365 = 3
366 = 2
367 = 1
368 = done!
369
370 Because the `loop` construct will always execute the `do` store at least once
371 (even assuming its only unassigned variable is `x`), it acts like a so-called
372 `repeat` loop. It can be used in conjunction with `if` to simulate a
373 so-called `while` loop. With this loop, the built-in operations provided,
374 and variables which may contain unbounded integer values, Xoomonk should
375 be uncontroversially Turing-complete.
376
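Here is one way such a `while` loop might look (a sketch; it assumes `$gt`
behaves as described above, yielding 1 or 0 in `result`). The guard is false
from the start, so the guarded body never produces any output, even though the
`do` store itself is evaluated once:

| counter := 0
| l := $loop*
| l.do := {
|     y := x
|     g := $gt*
|     g.x := ^.counter
|     g.y := 0
|     c := $if*
|     c.then := {
|         y := x
|         print string "in loop body"
|     }
|     c.else := {
|         y := x
|     }
|     c.cond := g.result
|     continue := g.result
| }
| print string "done!"
= done!
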
377 Finally, there is no provision for defining functions or procedures, because
378 malingering stores can act as these constructs.
379
380 | perimeter := {
381 | o1 := $mul*
382 | o1.x := x
383 | o1.y := 2
384 | o2 := $mul*
385 | o2.x := y
386 | o2.y := 2
387 | o3 := $add*
388 | o3.x := o1.result
389 | o3.y := o2.result
390 | result := o3.result
391 | }
392 | p1 := perimeter*
393 | p1.x := 13
394 | p1.y := 6
395 | print p1.result
396 | p2 := perimeter*
397 | p2.x := 4
398 | p2.y := 1
399 | print p2.result
400 = 38
401 = 10
402
403 Grammar
404 -------
405
406 Xoomonk ::= { Stmt }.
407 Stmt ::= Assign | Print.
408 Assign ::= Ref ":=" Expr.
409 Print ::= "print" ("string" <string> | "char" Expr | Expr) [";"].
410 Expr ::= (Block | Ref | Const) ["*"].
411 Block ::= "{" { Stmt } "}".
412 Ref ::= Name {"." Name}.
413 Name ::= "^" | "$" <alphanumeric> | <alphanumeric>.
414 Const ::= <integer-literal>.
415
416 Discussion
417 ----------
418
419 There is some similarity with Wouter van Oortmerssen's language Bla, in that
420 function environments are very close cousins of updatable stores. But
421 Xoomonk, quite unlike Bla, is an imperative language; once created, a store
422 may be updated at any point. And, of course, this property is exploited in
423 the introduction of malingering stores.
424
425 Xoomonk originally had an infix operator `&`, which takes two stores as its
426 arguments, at least one of which must be saturated, and evaluates to a third
427 store which is the result of merging the two argument stores. This result store
428 may be saturated even if only one of the argument stores was saturated, if
429 the saturated store gave all the values that the unsaturated store needed.
430 This operator was dropped because it is mostly syntactic sugar for assigning
431 each of the desired variables from one store to the other. However, it does
432 admittedly provide a very natural syntax, which orthogonally includes "named
433 arguments", when using unsaturated stores as procedures:
434
435     perimeter := {
436 # see example above
437 }
438 g := perimeter* & { x := 13 y := 6 }
439 print g.result
440
441 Xoomonk, as a project, is also an experiment in _test-driven language
442 design_. As you can see, I've described the language with a series of
443 examples, written in (something close to) Falderal format, each of which
444 could be used as a test case. This should make implementing an interpreter
445 or compiler for the language much easier, when I get around to that.
446
447 Happy malingering!
448
449 -Chris Pressey
450 Cat's Eye Technologies
451 August 7, 2011
452 Evanston, IL