Initial import of all my implementation-less language specs.
catseye
12 years ago
0 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | |
1 | <!-- encoding: UTF-8 --> | |
2 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en"> | |
3 | <head> | |
4 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> | |
5 | <title>The Didigm Reflective Cellular Automaton</title> | |
6 | <!-- begin html doc dynamic markup --> | |
7 | <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script> | |
8 | <script type="text/javascript" src="/scripts/documentation.js"></script> | |
9 | <!-- end html doc dynamic markup --> | |
10 | </head> | |
11 | <body> | |
12 | ||
13 | <h1>The Didigm Reflective Cellular Automaton</h1> | |
14 | ||
15 | <p>November 2007, Chris Pressey, Cat's Eye Technologies</p> | |
16 | ||
17 | <h2>Introduction</h2> | |
18 | ||
19 | <p>Didigm is a <em>reflective cellular automaton</em>. What I mean to | |
20 | impart by this phrase is that it is a cellular automaton where the | |
21 | transition rules are given by the very patterns of cells that exist | |
22 | in the playfield at any given time.</p> | |
23 | ||
24 | <p>Perhaps another way to think of Didigm is: | |
25 | Didigm = | |
26 | <a href="/projects/alpaca/">ALPACA</a> + | |
27 | <a href="/projects/ypsilax/">Ypsilax</a>.</p> | |
28 | ||
29 | <h2>Didigm as Parameterized Language</h2> | |
30 | ||
31 | <p>Didigm is actually a parameterized language. A parameterized | |
32 | language is a schema for specifying a set of languages, where a specific language | |
33 | can be obtained by supplying one or more parameters. For example, | |
34 | <a href="/projects/xigxag/">Xigxag</a> is a parameterized language, | |
35 | where the direction of the scanning and the direction of the building | |
36 | of new states are parameters. Didigm "takes" a single parameter, an integer. | |
37 | This parameter determines how many <em>colours</em> (number of possible | |
38 | states for any given cell) the cellular automaton has. This parameter | |
39 | defaults to 8. So when we say Didigm, we actually mean | |
40 | Didigm(8), but there are an infinite number of other possible | |
41 | languages, such as Didigm(5) and Didigm(70521).</p> | |
42 | ||
43 | <p>The languages Didigm(0), Didigm(-1), Didigm(-2) | |
44 | and so forth are probably nonsensical abberations, but I'll leave that | |
45 | question for the philosophers to ponder. | |
46 | Didigm(1) is at least well-defined, but it's trivial. | |
47 | Didigm(2) and Didigm(3) are semantically problematic for | |
48 | more technically interesting reasons (Didigm(3) might be CA-universal, | |
49 | but Didigm(2) probably isn't.) | |
50 | Didigm(4) and above are easily shown to be CA-universal.</p> | |
51 | ||
52 | <p> | |
53 | (I say CA-universal, and not Turing-complete, because technically | |
54 | cellular automata cannot simulate Turing machines without some | |
55 | extra machinery: TMs can halt, but CAs can't. Since I don't want | |
56 | to deal with defining that extra machinery in Didigm, it's simpler to | |
57 | avoid it for now.) | |
58 | </p> | |
59 | ||
60 | <p>Colours are typically numbered. However, this is not meant to imply | |
61 | an ordering between colours. The eight colours of Didigm | |
62 | are typically referred to as 0 to 7.</p> | |
63 | ||
64 | <h2>Language Description</h2> | |
65 | ||
66 | <h3>Playfield</h3> | |
67 | ||
68 | <p>The Didigm playfield, called <em>le monde</em>, | |
69 | is considered unbounded, like most cellular automaton | |
70 | playfields, but there is one significant difference. | |
71 | There is a horizontal division in this playfield, splitting | |
72 | it into regions called <em>le ciel</em>, on top, | |
73 | and <em>la terre</em>, below. This division is | |
74 | distinguishable — meaning, it must be possible to tell which region | |
75 | a given cell is in — but it need not have a presence beyond that. | |
76 | Specifically, this division lies on the edges between cells, rather than in | |
77 | the cells themselves. It has no "substance" and need not be visible | |
78 | to the user. (The Didigm Input Format, below, describes how it | |
79 | may be specified in textual input files.)</p> | |
80 | ||
81 | <h3>Magic Colours</h3> | |
82 | ||
83 | <p>Each region of the division has a distinguished colour which is | |
84 | called the <em>magic colour</em> of that region. The magic colour of le ciel is | |
85 | colour 0. The magic colour of la terre is colour 7. | |
86 | (In Didigm(<var>n</var>), | |
87 | the magic colour of la terre is colour <var>n</var>-1.)</p> | |
88 | ||
89 | <h3>Transition Rules</h3> | |
90 | ||
91 | <h4>Definition</h4> | |
92 | ||
93 | <p>Each transition rule of the cellular automaton is not fixed, | |
94 | rather, it is given by certain forms that are present in the playfield.</p> | |
95 | ||
96 | <p>Such a form is called <em>une salle</em> and has the following configuration. | |
97 | Two horizontally-adjacent cells of the magic colour abut a cell of the <em>destination | |
98 | colour</em> to the right. Two cells below the rightmost magic-colour | |
99 | cell is the cell of the <em>source colour</em>; it is surrounded by cells of | |
100 | any colour called the <em>determiners</em>.</p> | |
101 | ||
102 | <p>This is perhaps better illustrated than explained. In the following diagram, | |
103 | the magic colour is 0 (this salle is in le ciel,) | |
104 | the source colour is 1, the destination colour is 2, and the determiners are | |
105 | indicated by D's.</p> | |
106 | ||
107 | <pre> | |
108 | 002 | |
109 | DDD | |
110 | D1D | |
111 | DDD | |
112 | </pre> | |
113 | ||
114 | <h4>Application</h4> | |
115 | ||
116 | <p>Salles are interpreted as transition rules as follows. When the colour | |
117 | of a given cell is the same as the source colour of some salle, and when | |
118 | the colours of all the cells surrounding that cell are the exact same | |
119 | colours (in the exact same pattern) as the determininers of that salle, | |
120 | we say that that salle <em>matches</em> that cell. When any cell is | |
121 | matched by some salle in the other region, we say that that salle | |
122 | <em>applies</em> to that cell, and that cell is replaced | |
123 | by a cell of the destination colour of that salle.</p> | |
124 | ||
125 | <p>"The other region" refers, of course, to the region that is not the | |
126 | region in which the cell being transformed resides. Salles in la terre | |
127 | only apply to cells in le ciel and vice-versa. This complementarity | |
128 | serves to limit the amount of chaos: if there was some salle that applied to | |
129 | <em>all</em> cells, it would apply directly to the cells that made up that | |
130 | salle, and that salle would be immediately transformed.</p> | |
131 | ||
132 | <p>On each "tick" of the cellular automaton, all cells are checked to | |
133 | find the salle that applies to them, and then all are transformed, simultaneously, | |
134 | resulting in the next configuration of le monde.</p> | |
135 | ||
136 | <p>There is a "default" transition rule which also serves to limit | |
137 | the amount of chaos: if no salle applies to a cell, the colour of that | |
138 | cell does not change.</p> | |
139 | ||
140 | <p>Salles may overlap. However, no salle may straddle the horizon. | |
141 | (That is, each salle must be either completely in le ciel or completely | |
142 | in la terre.)</p> | |
143 | ||
144 | <p>Salles may conflict (i.e. two salles may have the same source colour | |
145 | and determiners, but different destination colours.) The behaviour in | |
146 | this case is defined to be uniformly random: if there are <var>n</var> | |
147 | conflicting salles, each has a 1/<var>n</var> chance of being the | |
148 | one that applies.</p> | |
149 | ||
150 | <h2>Didigm Input Format</h2> | |
151 | ||
152 | <p>I'd like to give some examples, but first I need a format to given them in.</p> | |
153 | ||
154 | <p>A Didigm Input File is a text file. The textual digit symbols | |
155 | <code>0</code> through <code>9</code> indicate cells of colours 0 through 9. | |
156 | Further colours may be indicated by enclosing a decimal digit string in square brackets, | |
157 | for example <code>[123]</code>. This digit string may contain leading | |
158 | zeros, in order for columns to line up nicely in the file.</p> | |
159 | ||
160 | <p>A line containing only a <code>,</code> symbol in the leftmost column | |
161 | indicates the division between le ciel and la terre. This line does not | |
162 | become part of the playfield.</p> | |
163 | ||
164 | <p>A line beginning with a <code>=</code> is a directive of some sort.</p> | |
165 | ||
166 | <p>A line beginning with <code>=C</code> followed by a colour indicator | |
167 | indicates how many colours (the <var>n</var> in Didigm(<var>n</var>)) | |
168 | this playfield contains. This directive may only occur once.</p> | |
169 | ||
170 | <p>A line beginning with <code>=F</code> followed by a colour indicator | |
171 | as described above, indicates that the unspecified (and unbounded) remainder | |
172 | of le ciel or la terre (whichever side of <code>,</code> the directive is on) | |
173 | is to be considered filled with cells of the given colour.</p> | |
174 | ||
175 | <p>Of course, an application which implements Didigm with some | |
176 | alternate means of specifying le monde, for example a graphical user interface, | |
177 | need not understand the Didigm Input Format.</p> | |
178 | ||
179 | <h2>Examples</h2> | |
180 | ||
181 | <p>Didigm is immediately seen to be CA-universal, in that you can | |
182 | readily (and stably) express a number of known CA-universal cellular automata in it. | |
183 | For example, to express John Conway's Life, you could say that | |
184 | colour 1 means "alive" and colour 2 means "dead", and compose something like</p> | |
185 | ||
186 | <pre> | |
187 | 002002001001 | |
188 | 222122112212 <i>... and so on ...</i> | |
189 | 212212212121 <i>... for all 256 ...</i> | |
190 | 222222222222 <i>... rules of Life ...</i> | |
191 | =F3 | |
192 | , | |
193 | =F2 | |
194 | 22222 | |
195 | 21222 | |
196 | 21212 | |
197 | 21122 | |
198 | 22222 | |
199 | </pre> | |
200 | ||
201 | <p>Because the magic colour 7 never appears in la terre, il n'y a aucune salle | |
202 | dans la terre et donc tout le ciel est toujours la meme chose.</p> | |
203 | ||
204 | <p>There are of course simpler CA's that are apparently CA-universal that | |
205 | would be possible to describe more compactly. But more interesting (to me) | |
206 | is the possibility for making reflective CA's.</p> | |
207 | ||
208 | <p>To do this in an uncontrolled fashion is easy. We just stick some salles | |
209 | in le ciel, some salles in la terre, and let 'er rip. Unfortunately, in general, | |
210 | les salles in each region will probably quickly damage enough of the salles in | |
211 | the other region that le monde will become sterile soon enough.</p> | |
212 | ||
213 | <p>A rudimentary example of something a little more orchestrated follows.</p> | |
214 | ||
215 | <pre> | |
216 | 3333333333333 | |
217 | 3002300230073 | |
218 | 3111311132113 | |
219 | 3311321131573 | |
220 | 3111311131333 | |
221 | 3333333333333 | |
222 | =F3 | |
223 | , | |
224 | =F1 | |
225 | 111111111111111 | |
226 | 111111131111111 | |
227 | 111111111111574 | |
228 | 111111111111333 | |
229 | 311111111111023 | |
230 | 111111111111113 | |
231 | </pre> | |
232 | ||
233 | ||
234 | <p>The intent of this is that the 3's in la terre initially grow | |
235 | streams of 2's to their right, due to the leftmost two salles in le ciel. | |
236 | However, when the top stream of 2's reaches the cell just above and to the | |
237 | left of the 5, the third salle in le ciel matches and turns the 5 into a | |
238 | 7, forming une salle dans la terre. This salle turns every 2 to the right of | |
239 | a 0 in le ciel into a 4, thus modifying two of les salles in le ciel. | |
240 | The result of these modified salles is to turn the bottom stream of 2's | |
241 | into a stream of 4's halfway along.</p> | |
242 | ||
243 | <p>This is at least predictable, but it still becomes uninteresting fairly quickly. | |
244 | Also note that it's not just the isolated 3's in la terre that grow | |
245 | streams of 2's to the right: the 3's on the right side of la salle | |
246 | would, too. This could be rectified by having a wall of some other colour | |
247 | on that side of la salle, and I'm sure you could extend this example | |
248 | by having something else happen when the stream of 4's hit the 0 in that | |
249 | salle, but you get the picture. Creating a neat and tidy and long-lived | |
250 | reflective cellular automaton requires at least as much care as | |
251 | constructing a "normal" cellular automaton, and probably in general more.</p> | |
252 | ||
253 | <h2>History</h2> | |
254 | ||
255 | <p>I came up with the concept of a reflective cellular automaton | |
256 | (which is, as far as I'm aware, a novel concept) independently on | |
257 | November 1st, 2007, while walking on Felix Avenue, in Windsor, Ontario.</p> | |
258 | ||
259 | <p>No reference implementation exists yet. Therefore, all Didigm runs | |
260 | have been thought experiments, and it's entirely possible that I've missed | |
261 | something in its definition that having a working simulator would reveal.</p> | |
262 | ||
263 | <p>Happy magic colouring! | |
264 | <br/>Chris Pressey | |
265 | <br/>Chicago, Illinois | |
266 | <br/>November 17, 2007</p> | |
267 | ||
268 | </body> | |
269 | </html> |
0 | Madison | |
1 | ======= | |
2 | ||
3 | Version 0.1 | |
4 | December 2011, Chris Pressey, Cat's Eye Technologies | |
5 | ||
6 | Abstract | |
7 | -------- | |
8 | ||
9 | Madison is a language in which one can state proofs of properties | |
10 | of term-rewriting systems. Classical methods of automated reasoning, | |
11 | such as resolution, are not used; indeed, term-rewriting itself is | |
12 | used to check the proofs. Both direct proof and proof by induction | |
13 | are supported. Induction in a proof must be across a structure which | |
14 | has a well-founded inductive definition. Such structures can be | |
15 | thought of as types, although this is largely nominal; the traditional | |
16 | typelessness of term-rewiting systems is largely retained. | |
17 | ||
18 | Term-rewriting | |
19 | -------------- | |
20 | ||
21 | Madison has at its core a simple term-rewriting language. It is of | |
22 | a common form which should be unsurprising to anyone who has worked | |
23 | at all with term rewriting. A typical simple program contains | |
24 | a set of rules and a term on which to apply those rules. Each | |
25 | rule is a pair of terms; either term of the pair may contain | |
26 | variables, but any variable that appears on the r.h.s must also | |
27 | appear on the l.h.s. A rule matches a term if is the same as | |
28 | the term with the exception of the variables, which are bound | |
29 | to its subterms; applying a matching rule replaces the term | |
30 | with the r.h.s. of the rule, with the variables expanded approp- | |
31 | riately. Rules are applied in an innermost, leftmost fashion to | |
32 | the term, corresponding to eager evaluation. Rewriting terminates | |
33 | when there is no rule whose l.h.s. matches the current incarnation | |
34 | of the term being rewritten. | |
35 | ||
36 | A term is either an atom, which is a symbol that stands alone, | |
37 | or a constructor, which is a symbol followed by a comma-separated list | |
38 | of subterms enclosed in parentheses. Symbols may consist of letters, | |
39 | digits, and hyphens, with no intervening whitespace. A symbol is | |
40 | a variable symbol if it begins with a capital letter. Variable | |
41 | symbols may also begin with underscores, but these may only occur | |
42 | in the l.h.s. of a rewrite rule, to indicate that we don't care | |
43 | what value is bound to the variable and we won't be using it on | |
44 | the r.h.s. | |
45 | ||
46 | (The way we are using the term "constructor" may be slightly non- | |
47 | standard; in some other sources, this is called a "function symbol", | |
48 | and a "constructor" is a subtly different thing.) | |
49 | ||
50 | Because the rewriting language is merely a component (albeit the | |
51 | core component) of a larger system, the aforementioned typical | |
52 | simple program must be cast into some buttressing syntax. A full | |
53 | program consists of a `let` block which contains the rules | |
54 | and a `rewrite` admonition which specifies the term to be re- | |
55 | written. An example follows. | |
56 | ||
57 | | let | |
58 | | leftmost(tree(X,Y)) -> leftmost(X) | |
59 | | leftmost(leaf(X)) -> X | |
60 | | in | |
61 | | rewrite leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan))) | |
62 | = alice | |
63 | ||
64 | In the above example, there are two rules for the constructor | |
65 | `leftmost/1`. The first is applied to the outer tree to obtain | |
66 | a new leftmost constructor containing the inner tree; the first | |
67 | is applied again to obtain a new leftmost constructor containing | |
68 | the leaf containing `alice`; and the second is applied to that | |
69 | leaf term to obtain just `alice`. At that point, no more rules | |
70 | apply, so rewriting terminates, yielding `alice`. | |
71 | ||
72 | Madison is deterministic; if rules overlap, the first one given | |
73 | (syntactically) is used. For this reason, it is a good idea | |
74 | to order rules from most specific to least specific. | |
75 | ||
76 | I used the phrase "typical simple program" above because I was | |
77 | trying intentionally to avoid saying "simplest program". In fact, | |
78 | technically no `let` block is required, so you can write some | |
79 | really trivial Madison programs, like the following: | |
80 | ||
81 | | rewrite cat | |
82 | = cat | |
83 | ||
84 | I think that just about covers the core term-rewriting language. | |
85 | Term-rewriting is Turing-complete, so Madison is too. If you | |
86 | wish to learn more about term rewriting, there are several good | |
87 | books and webpages on the subject; I won't go into it further | |
88 | here. | |
89 | ||
90 | Proof-Checking | |
91 | -------------- | |
92 | ||
93 | My desire with Madison was to design a language in which you | |
94 | can prove things. Not a full-blown theorem prover -- just a | |
95 | proof checker, where you supply a proof and it confirms either | |
96 | that the proof holds or doesn't hold. (Every theorem prover | |
97 | has at its core a proof checker, but it comes bundled with a lot of | |
98 | extra machinery to search the space of possible proofs cleverly, | |
99 | looking for one which will pass the proof-checking phase.) | |
100 | ||
101 | It's no coincidence that Madison is built on top of a term-rewriting | |
102 | language. For starters, a proof is very similar to the execution | |
103 | trace of a term being rewritten. In each of the steps of the proof, | |
104 | the statement to be proved is transformed by replacing some part | |
105 | of it with some equally true thing -- in other words, rewritten. | |
106 | In fact, Post Canonical Systems were an early kind of rewriting | |
107 | system, devised by Emil Post to (as I understand it) illustrate this | |
108 | similarity, and to show that proofs could be mechanically carried out | |
109 | in a rewriting system. | |
110 | ||
111 | So: given a term-rewriting language, we can give a trivial kind | |
112 | of proof simply by stating the rewrite steps that *should* occur | |
113 | when a term is rewritten, and check that proof by rewriting the term | |
114 | and confirming that those were in fact the steps that occurred. | |
115 | ||
116 | For the purpose of stating these sequences of rewrite steps to be | |
117 | checked, Madison has a `theorem..proof..qed` form. To demonstrate | |
118 | this form, let's use Madison to prove that 2 + 2 = 4, using Peano | |
119 | arithmetic. | |
120 | ||
121 | | let | |
122 | | add(s(X),Y) -> add(X,s(Y)) | |
123 | | add(z,Y) -> Y | |
124 | | in theorem | |
125 | | add(s(s(z)),s(s(z))) ~> s(s(s(s(z)))) | |
126 | | proof | |
127 | | add(s(s(z)),s(s(z))) | |
128 | | -> add(s(z),s(s(s(z)))) [by add.1] | |
129 | | -> add(z,s(s(s(s(z))))) [by add.1] | |
130 | | -> s(s(s(s(z)))) [by add.2] | |
131 | | qed | |
132 | = true | |
133 | ||
134 | The basic syntax should be fairly apparent. The `theorem` block | |
135 | contains the statement to be proved. The `~>` means "rewrites | |
136 | in zero or more steps to". So, here, we are saying that 2 + 2 | |
137 | (in Peano notation) rewrites, in zero or more steps, to 4. | |
138 | ||
139 | The `proof` block contains the actual series of rewrite steps that | |
140 | should be carried out. For elucidation, each step may name the | |
141 | particular rule which is applied to arrive at the transformed term | |
142 | at that step. Rules are named by their outermost constructor, | |
143 | followed by a dot and the ordinal position of the rule in the list | |
144 | of rules. These rule-references are optional, but the fact that | |
145 | the rule so named was actually used to rewrite the term at that step | |
146 | could be checked too, of course. The `qed` keyword ends the proof | |
147 | block. | |
148 | ||
149 | Naturally, you can also write a proof which does not hold, and | |
150 | Madison should inform you of this fact. 2 + 3, for example, | |
151 | does not equal 4, and it can pinpoint exactly where you went | |
152 | wrong should you come to this conclusion: | |
153 | ||
154 | | let | |
155 | | add(s(X),Y) -> add(X,s(Y)) | |
156 | | add(z,Y) -> Y | |
157 | | in theorem | |
158 | | add(s(s(z)),s(s(s(z)))) ~> s(s(s(s(z)))) | |
159 | | proof | |
160 | | add(s(s(z)),s(s(s(z)))) | |
161 | | -> add(s(z),s(s(s(s(z))))) [by add.1] | |
162 | | -> add(z,s(s(s(s(z))))) [by add.1] | |
163 | | -> s(s(s(s(z)))) [by add.2] | |
164 | | qed | |
165 | ? Error in proof [line 6]: step 2 does not follow from applying [add.1] to previous step | |
166 | ||
167 | Now, while these *are* proofs, they don't tell us much about the | |
168 | properties of the terms and rules involved, because they are not | |
169 | *generalized*. They say something about a few fixed values, like | |
170 | 2 and 4, but they do not say anything about any *infinite* | |
171 | sets of values, like the natural numbers. Now, that would be *really* | |
172 | useful. And, while I could say that what you've seen of Madison so far | |
173 | is a proof checker, it is not a very impressive one. So let's take | |
174 | this further. | |
175 | ||
176 | Quantification | |
177 | -------------- | |
178 | ||
179 | To state a generalized proof, we will need to introduce variables, | |
180 | and to have variables, we will need to be able to say what those | |
181 | variables can range over; in short, we need *quantification*. Since | |
182 | we're particularly interested in making statements about infinite | |
183 | sets of values (like the natural numbers), we specifically want | |
184 | *universal quantification*: | |
185 | ||
186 | For all x, ... | |
187 | ||
188 | But to have universal quantification, we first need a *universe* | |
189 | over which to quantify. When we say "for all /x/", we generally | |
190 | don't mean "any and all things of any kind which we could | |
191 | possibly name /x/". Rather, we think of /x/ as having a type of | |
192 | some kind: | |
193 | ||
194 | For all natural numbers x, ... | |
195 | ||
196 | Then, if our proof holds, it holds for all natural numbers. | |
197 | No matter what integer value greater than or equal to zero | |
198 | we choose for /x/, the truism contained in the proof remains true. | |
199 | This is the sort of thing we want in Madison. | |
200 | ||
201 | Well, to start, there is one glaringly obvious type in any | |
202 | term-rewriting language, namely, the term. We could say | |
203 | ||
204 | For all terms t, ... | |
205 | ||
206 | But it would not actually be very interesting, because terms | |
207 | are so general and basic that there's not actually very much you | |
208 | can say about them that you don't already know. You sort of need | |
209 | to know the basic properties of terms just to build a term-rewriting | |
210 | language (like the one at Madison's core) in the first place. | |
211 | ||
212 | The most useful property of terms as far as Madison is concerned is | |
213 | that the subterm relationship is _well-founded_. In other words, | |
214 | in the term `c(X)`, `X` is "smaller than" `c(X)`, and since terms are | |
215 | finite, any series of rewrites which always results in "smaller" terms | |
216 | will eventually terminate. For completeness, we should probably prove | |
217 | that rigorously, but for expediency we will simply take it as a given | |
218 | fact for our proofs. | |
219 | ||
220 | Anyway, to get at something actually interesting, we must look further | |
221 | than the terms themselves. | |
222 | ||
223 | Types | |
224 | ----- | |
225 | ||
226 | What's actually interesting is when you define a restricted | |
227 | set of forms that terms can take, and you distinguish terms inside | |
228 | this set of forms from the terms outside the set. For example, | |
229 | ||
230 | | let | |
231 | | boolean(true) -> true | |
232 | | boolean(false) -> true | |
233 | | boolean(_) -> false | |
234 | | in | |
235 | | rewrite boolean(false) | |
236 | = true | |
237 | ||
238 | We call a set of forms like this a _type_. As you can see, we | |
239 | have basically written a predicate that defines our type. If any | |
240 | of the rewrite rules in the definition of this predicate rewrite | |
241 | a given term to `true`, that term is of our type; if it rewrites | |
242 | to `false`, it is not. | |
243 | ||
244 | Once we have types, any constructor may be said to have a type. | |
245 | By this we mean that no matter what subterms the constructor has, | |
246 | the predicate of the type of which we speak will always reduce to | |
247 | `true` when that term is inserted in it. | |
248 | ||
249 | Note that using predicates like this allows our types to be | |
250 | non-disjoint; the same term may reduce to true in two different | |
251 | predicates. My first sketches for Madison had disjoint types, | |
252 | described by rules which reduced each term to an atom which named | |
253 | the type of that term. (So the above would have been written with | |
254 | rules `true -> boolean` and `false -> boolean` instead.) However, | |
255 | while that method may be, on the surface, more elegant, I believe | |
256 | this approach better reflects how types are actually used in | |
257 | programming. At the end of the day, every type is just a predicate, | |
258 | and there is nothing stopping 2 from being both a natural number and | |
259 | an integer. And, for that matter, a rational number and a real | |
260 | number. | |
261 | ||
262 | In theory, every predicate is a type, too, but that's where things | |
263 | get interesting. Is 2 not also an even number, and a prime number? | |
264 | And in an appropriate (albeit contrived) language, is it not a | |
265 | description of a computation which may or may not always halt? | |
266 | ||
267 | The Type Syntax | |
268 | --------------- | |
269 | ||
270 | The above considerations motivate us to be careful when dealing | |
271 | with types. We should establish some ground rules so that we | |
272 | know that our types are useful to universally quantify over. | |
273 | ||
274 | Unfortunately, this introduces something of a chicken-and-egg | |
275 | situation, as our ground rules will be using logical connectives, | |
276 | while at the same time they will be applied to those logical | |
277 | connectives to ensure that they are sound. This is not, actually, | |
278 | a big deal; I mention it here more because it is interesting. | |
279 | ||
280 | So, the rules which define our type must conform to certain | |
281 | rules, themselves. While it would be possible to allow the | |
282 | Madison programmer to use any old bunch of rewrite rules as a | |
283 | type, and to check that these rules make for a "good" type when | |
284 | such a usage is seen -- and while this would be somewhat | |
285 | attractive from the standpoint of proving properties of term- | |
286 | rewriting systems using term-rewriting systems -- it's not strictly | |
287 | necessary to use a descriptive approach such as this, and there are | |
288 | certain organizational benefits we can achieve by taking a more | |
289 | prescriptive tack. | |
290 | ||
291 | Viz., we introduce a special syntax for defining a type with a | |
292 | set of rules which function collectively as a type predicate. | |
293 | Again, it's not strictly necessary to do this, but it does | |
294 | help organize our code and perhaps our thoughts, and perhaps make | |
295 | an implementation easier to build. It's nice to be able to say, | |
296 | yes, what it means to be a `boolean` is defined right here and | |
297 | nowhere else. | |
298 | ||
299 | So, to define a type, we write our type rules in a `type..in` | |
300 | block, like the following. | |
301 | ||
302 | | type boolean is | |
303 | | boolean(true) -> true | |
304 | | boolean(false) -> true | |
305 | | in | |
306 | | rewrite boolean(false) | |
307 | = true | |
308 | ||
309 | As you can see, the wildcard reduction to false can be omitted for | |
310 | brevity. (In other words, "Nothing else is a boolean" is implied.) | |
311 | And, the `boolean` constructor can be used for rewriting in a term | |
312 | just like any other plain, non-`type`-blessed rewrite rule. | |
313 | ||
314 | | type boolean is | |
315 | | boolean(true) -> true | |
316 | | boolean(false) -> true | |
317 | | in | |
318 | | rewrite boolean(tree(leaf(sabrina),leaf(joe))) | |
319 | = false | |
320 | ||
321 | Here are the rules that the type-defining rules must conform to. | |
322 | If any of these rules are violated in the `type` block, the Madison | |
323 | implementation must complain, and not proceed to try to prove anything | |
324 | from them. | |
325 | ||
326 | Once a type is defined, it cannot be defined further in a regular, | |
327 | non-type-defining rewriting rule. | |
328 | ||
329 | | type boolean is | |
330 | | boolean(true) -> true | |
331 | | boolean(false) -> true | |
332 | | in let | |
333 | | boolean(red) -> green | |
334 | | in | |
335 | | rewrite boolean(red) | |
336 | ? Constructor "boolean" used in rule but already defined as a type | |
337 | ||
338 | The constructor in the l.h.s. must be the same in all rules. | |
339 | ||
340 | | type foo is | |
341 | | foo(bar) -> true | |
342 | | baz(bar) -> true | |
343 | | in | |
344 | | rewrite cat | |
345 | ? In type "foo", constructor "bar" used on l.h.s. of rule | |
346 | ||
347 | The constructor used in the rules must be arity 1 (i.e. have exactly | |
348 | one subterm.) | |
349 | ||
350 | | type foo is | |
351 | | foo(bar,X) -> true | |
352 | | in | |
353 | | rewrite cat | |
354 | ? In type "foo", constructor has arity greater than one | |
355 | ||
356 | It is considered an error if the predicate rules ever rewrite, inside | |
357 | the `type` block, to anything besides the atoms `true` or `false`. | |
358 | ||
359 | | type foo is | |
360 | | foo(bar) -> true | |
361 | | foo(tree(X)) -> bar | |
362 | | in | |
363 | | rewrite cat | |
364 | ? In type "foo", rule reduces to "bar" instead of true or false | |
365 | ||
366 | The r.h.s.'s of the rules of the type predicate must *always* | |
367 | rewrite to `true` or `false`. That means, if we can't prove that | |
368 | the rules always rewrite to something, we can't use them as type | |
369 | predicate rules. In practice, there are a few properties that | |
370 | we insist that they have. | |
371 | ||
372 | They may involve type predicates that have previously been | |
373 | established. | |
374 | ||
375 | | type boolean is | |
376 | | boolean(true) -> true | |
377 | | boolean(false) -> true | |
378 | | in type boolbox is | |
379 | | boolbox(box(X)) -> boolean(X) | |
380 | | in | |
381 | | rewrite boolbox(box(true)) | |
382 | = true | |
383 | ||
384 | They may involve certain, pre-defined rewriting rules which can | |
385 | be thought of as operators on values of boolean type (which, honestly, | |
386 | is probably built-in to the language.) For now there is only one | |
387 | such pre-defined rewriting rule: `and(X,Y)`, where `X` and `Y` are | |
388 | booleans, and which rewrites to a boolean, using the standard truth | |
389 | table rules for boolean conjunction. | |
390 | ||
391 | | type boolean is | |
392 | | boolean(true) -> true | |
393 | | boolean(false) -> true | |
394 | | in type boolpair is | |
395 | | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y)) | |
396 | | in | |
397 | | rewrite boolpair(pair(true,false)) | |
398 | = true | |
399 | ||
400 | | type boolean is | |
401 | | boolean(true) -> true | |
402 | | boolean(false) -> true | |
403 | | in type boolpair is | |
404 | | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y)) | |
405 | | in | |
406 | | rewrite boolpair(pair(true,cheese)) | |
407 | = false | |
408 | ||
409 | Lastly, the r.h.s. of a type predicate rule can refer to the self-same | |
410 | type being defined, but *only* under certain conditions. Namely, | |
411 | the rewriting must "shrink" the term being rewritten. This is what | |
412 | lets us inductively define types. | |
413 | ||
414 | | type nat is | |
415 | | nat(z) -> true | |
416 | | nat(s(X)) -> nat(X) | |
417 | | in | |
418 | | rewrite nat(s(s(z))) | |
419 | = true | |
420 | ||
421 | | type nat is | |
422 | | nat(z) -> true | |
423 | | nat(s(X)) -> nat(s(X)) | |
424 | | in | |
425 | | rewrite nat(s(s(z))) | |
426 | ? Type not well-founded: recursive rewrite does not decrease in [foo.2] | |
427 | ||
428 | | type nat is | |
429 | | nat(z) -> true | |
430 | | nat(s(X)) -> nat(s(s(X))) | |
431 | | in | |
432 | | rewrite nat(s(s(z))) | |
433 | ? Type not well-founded: recursive rewrite does not decrease in [foo.2] | |
434 | ||
435 | | type bad | |
436 | | bad(leaf(X)) -> true | |
437 | | bad(tree(X,Y)) -> and(bad(X),bad(tree(Y,Y)) | |
438 | | in | |
439 | | rewrite whatever | |
440 | ? Type not well-founded: recursive rewrite does not decrease in [bad.2] | |
441 | ||
442 | We can check this by looking at all the rewrite rules in the | |
443 | definition of the type that are recursive, i.e. that contain on | |
444 | on their r.h.s. the constructor being defined as a type predicate. | |
445 | For every such occurrence on the r.h.s. of a recursive rewrite, | |
446 | the contents of the constructor must be "smaller" than the contents | |
447 | of the constructor on the l.h.s. What it means to be smaller | |
448 | should be fairly obvious: it just has fewer subterms. If all the | |
449 | rules conform to this pattern, rewriting will eventually terminate, | |
450 | because it will run out of subterms to rewrite. | |
451 | ||
452 | Application of Types in Proofs | |
453 | ------------------------------ | |
454 | ||
455 | Now, aside from these restrictions, type predicates are basically | |
456 | rewrite rules, just like any other. The main difference is that | |
457 | we know they are well-defined enough to be used to scope the | |
458 | universal quantification in a proof. | |
459 | ||
460 | Simply having a definition for a `boolean` type allows us to construct | |
461 | a simple proof with variables. Universal quantification over the | |
462 | universe of booleans isn't exactly impressive; we don't cover an infinite | |
463 | range of values, like we would with integers, or lists. But it's | |
464 | a starting point on which we can build. We will give some rewrite rules | |
465 | for a constructor `not`, and prove that this constructor always reduces | |
466 | to a boolean when given a boolean. | |
467 | ||
468 | | type boolean is | |
469 | | boolean(true) -> true | |
470 | | boolean(false) -> true | |
471 | | in let | |
472 | | not(true) -> false | |
473 | | not(false) -> true | |
474 | | not(_) -> undefined | |
475 | | in theorem | |
476 | | forall X where boolean(X) | |
477 | | boolean(not(X)) ~> true | |
478 | | proof | |
479 | | case X = true | |
480 | | boolean(not(true)) | |
481 | | -> boolean(true) [by not.1] | |
482 | | -> true [by boolean.1] | |
483 | | case X = false | |
484 | | boolean(not(false)) | |
485 | | -> boolean(false) [by not.2] | |
486 | | -> true [by boolean.2] | |
487 | | qed | |
488 | = true | |
489 | ||
490 | As you can see, proofs using universally quantified variables | |
491 | need to make use of _cases_. We know this proof is sound, because | |
492 | it shows the rewrite steps for all the possible values of the | |
493 | variable -- and we know they are all the possible values, from the | |
494 | definition of the type. | |
495 | ||
496 | In this instance, the cases are just the two possible values | |
497 | of the boolean type, but if the type was defined inductively, | |
498 | they would need to cover the base and inductive cases. In both | |
499 | matters, each case in a complete proof maps to exactly one of | |
500 | the possible rewrite rules for the type predicate. (and vice versa) | |
501 | ||
502 | Let's prove the type of a slightly more complex rewrite rule, | |
503 | one which has multiple subterms which can vary. (This `and` | |
504 | constructor has already been introduced, and we've claimed we | |
505 | can use it in the definition of well-founded inductive types; | |
506 | but this code proves that it is indeed well-founded, and it | |
507 | doesn't rely on it already being defined.) | |
508 | ||
509 | | let | |
510 | | and(true,true) -> true | |
511 | | and(_,_) -> false | |
512 | | in theorem | |
513 | | forall X where boolean(X) | |
514 | | forall Y where boolean(Y) | |
515 | | boolean(and(X,Y)) ~> true | |
516 | | proof | |
517 | | case X = true | |
518 | | case Y = true | |
519 | | boolean(and(true,true)) | |
520 | | -> boolean(true) [by and.1] | |
521 | | -> true [by boolean.1] | |
522 | | case Y = false | |
523 | | boolean(and(true,false)) | |
524 | | -> boolean(false) [by and.2] | |
525 | | -> true [by boolean.2] | |
526 | | case X = false | |
527 | | case Y = true | |
528 | | boolean(and(false,true)) | |
529 | | -> boolean(false) [by and.2] | |
530 | | -> true [by boolean.2] | |
531 | | case Y = false | |
532 | | boolean(and(false,false)) | |
533 | | -> boolean(false) [by and.2] | |
534 | | -> true [by boolean.2] | |
535 | | qed | |
536 | = true | |
537 | ||
538 | Unwieldy, you say! And you are correct. But making something | |
539 | easy to use was never my goal. | |
540 | ||
541 | Note that the definition of `and()` is a bit more open-ended than | |
542 | `not()`. `and.2` allows terms like `and(dog,cat)` to rewrite to `false`. | |
543 | But our proof only shows that the result of reducing `and(A,B)` is | |
544 | a boolean *when both A and B are booleans*. So it, in fact, | |
545 | tells us nothing about the type of `and(dog,cat)`, nor in fact anything | |
546 | at all about the properties of `and(A,B)` when one or more of `A` and | |
547 | `B` are not of boolean type. So be it. | |
548 | ||
549 | Anyway, since we were speaking of inductively defined types | |
550 | previously, let's do that now. With the help of `and()`, here is | |
551 | a type for binary trees. | |
552 | ||
553 | | type tree is | |
554 | | tree(leaf) -> true | |
555 | | tree(branch(X,Y)) -> and(tree(X),tree(Y)) | |
556 | | in | |
557 | | rewrite tree(branch(leaf,leaf)) | |
558 | = true | |
559 | ||
560 | We can define some rewrite rules on trees. To start small, | |
561 | let's define a simple predicate on trees. | |
562 | ||
563 | | type tree is | |
564 | | tree(leaf) -> true | |
565 | | tree(branch(X,Y)) -> and(tree(X),tree(Y)) | |
566 | | in let | |
567 | | empty(leaf) -> true | |
568 | | empty(branch(_,_)) -> false | |
569 | | in empty(branch(branch(leaf,leaf),leaf)) | |
570 | = false | |
571 | ||
572 | | type tree is | |
573 | | tree(leaf) -> true | |
574 | | tree(branch(X,Y)) -> and(tree(X),tree(Y)) | |
575 | | in let | |
576 | | empty(leaf) -> true | |
577 | | empty(branch(_,_)) -> false | |
578 | | in empty(leaf) | |
579 | = true | |
580 | ||
581 | Now let's prove that our predicate always rewrites to a boolean | |
582 | (i.e. that it has boolean type) when its argument is a tree. | |
583 | ||
584 | | type tree is | |
585 | | tree(leaf) -> true | |
586 | | tree(branch(X,Y)) -> and(tree(X),tree(Y)) | |
587 | | in let | |
588 | | empty(leaf) -> true | |
589 | | empty(branch(_,_)) -> false | |
590 | | in theorem | |
591 | | forall X where tree(X) | |
592 | | boolean(empty(X)) ~> true | |
593 | | proof | |
594 | | case X = leaf | |
595 | | boolean(empty(leaf)) | |
596 | | -> boolean(true) [by empty.1] | |
597 | | -> true [by boolean.1] | |
598 | | case X = branch(S,T) | |
599 | | boolean(empty(branch(S,T))) | |
600 | | -> boolean(false) [by empty.2] | |
601 | | -> true [by boolean.2] | |
602 | | qed | |
603 | = true | |
604 | ||
605 | This isn't really a proof by induction yet, but it's getting closer. | |
606 | This is still really us examining the cases to determine the type. | |
607 | But, we have an extra guarantee here; in `case X = branch(S,T)`, we | |
608 | know `tree(S) -> true`, and `tree(T) -> true`, because `tree(X) -> true`. | |
609 | This is one more reason why `and(X,Y)` is built-in to Madison; it | |
610 | needs to leverage what it means and make use of this information in a | |
611 | proof. We don't really use that extra information in this proof, but | |
612 | we will later on. | |
613 | ||
614 | Structural Induction | |
615 | -------------------- | |
616 | ||
617 | Let's try something stronger, and get into something that could be | |
618 | described as real structural induction. This time, we won't just prove | |
619 | something's type. We'll prove something that actually walks and talks | |
620 | like a real (albeit simple) theorem: the reflection of the reflection | |
621 | of any binary tree is the same as the original tree. | |
622 | ||
623 | | type tree is | |
624 | | tree(leaf) -> true | |
625 | | tree(branch(X,Y)) -> and(tree(X),tree(Y)) | |
626 | | in let | |
627 | | reflect(leaf) -> leaf | |
628 | | reflect(branch(A,B)) -> branch(reflect(B),reflect(A)) | |
629 | | in theorem | |
630 | | forall X where tree(X) | |
631 | | reflect(reflect(X)) ~> X | |
632 | | proof | |
633 | | case X = leaf | |
634 | | reflect(reflect(leaf)) | |
635 | | -> reflect(leaf) [by reflect.1] | |
636 | | -> leaf [by reflect.1] | |
637 | | case X = branch(S, T) | |
638 | | reflect(reflect(branch(S, T))) | |
639 | | -> reflect(branch(reflect(T),reflect(S))) [by reflect.2] | |
640 | | -> branch(reflect(reflect(S)),reflect(reflect(T))) [by reflect.2] | |
641 | | -> branch(S,reflect(reflect(T))) [by IH] | |
642 | | -> branch(S,T) [by IH] | |
643 | | qed | |
644 | = true | |
645 | ||
646 | Finally, this is a proof using induction! In the [by IH] clauses, | |
647 | IH stands for "inductive hypothesis", the hypothesis that we may | |
648 | assume in making the proof; namely, that the property holds for | |
649 | "smaller" instances of the type of X -- in this case, the "smaller" | |
650 | trees S and T that are used to construct the tree `branch(S, T)`. | |
651 | ||
652 | Relying on the IH is valid only after we have proved the base case. | |
653 | After having proved `reflect(reflect(S)) -> S` for the base cases of | |
654 | the type of S, we are free to assume that `reflect(reflect(S)) -> S` | |
655 | in the induction cases. And we do so, to rewrite the last two steps. | |
656 | ||
657 | Like cases, the induction in a proof maps directly to the | |
658 | induction in the definition of the type of the variable being | |
659 | universally quantified upon. If the induction in the type is well- | |
660 | founded, so too will be the induction in the proof. (Indeed, the | |
661 | relationship between induction and cases is implicit in the | |
662 | concepts of the "base case" and "inductive case (or step)".) | |
663 | ||
664 | Stepping Back | |
665 | ------------- | |
666 | ||
667 | So, we have given a simple term-rewriting-based language for proofs, | |
668 | and shown that it can handle a proof of a property over an infinite | |
669 | universe of things (trees.) That was basically my goal in designing | |
670 | this language. Now let's step back and consider some of the | |
671 | implications of this system. | |
672 | ||
673 | We have, here, a typed programming language. We can define types | |
674 | that look an awful lot like algebraic data types. But instead of | |
675 | glibly declaring the type of any given term, like we would in most | |
676 | functional languages, we actually have to *prove* that our terms | |
677 | always rewrite to a value of that type. That's more work, of | |
678 | course, but it's also stronger: in proving that the term always | |
679 | rewrites to a value of the type, we have, naturally, proved that | |
680 | it *always* rewrites -- that its rewrite sequence is terminating. | |
681 | There is no possibility that its rewrite sequence will enter an | |
682 | infinite loop. Often, we establish this with the help of previously | |
683 | established basis that our inductively-defined types are well-founded, | |
684 | which is itself proved on the basis that the subterm relationship is | |
685 | well-founded. | |
686 | ||
687 | Much like we can prove termination in course of proving a type, | |
688 | we can prove a type in the course of proving a property -- such | |
689 | as the type of `reflect(reflect(T))` above. (This does not directly | |
690 | lead to a proof of the type of `reflect`, but whatever.) | |
691 | ||
692 | And, of course, we are only proving the type of term on the | |
693 | assumption that its subterms have specific types. These proofs | |
694 | say nothing about the other cases. This may provide flexibility | |
695 | for extending rewrite systems -- or it might not, I'm not sure. | |
696 | It might be nice to prove that all other types result in some | |
697 | error term. (One of the more annoying things about term-rewriting | |
698 | languages is how an error can result in a half-rewritten program | |
699 | instead of a recgonizable error code. There seems to be a tradeoff | |
700 | between extensibility and producing recognizable errors.) | |
701 | ||
702 | Grammar so Far | |
703 | -------------- | |
704 | ||
705 | I think I've described everything I want in the language above, so | |
706 | the grammar should, modulo tweaks, look something like this: | |
707 | ||
708 | Madison ::= Block. | |
709 | Block ::= LetBlock | TypeBlock | ProofBlock | RewriteBlock. | |
710 | LetBlock ::= "let" {Rule} "in" Block. | |
711 | TypeBlock ::= "type" Symbol "is" {Rule} "in" Block. | |
712 | RewriteBlock ::= "rewrite" Term. | |
713 | Rule ::= Term "->" Term. | |
714 | Term ::= Atom | Constructor | Variable. | |
715 | Atom ::= Symbol. | |
716 | Constructor ::= Symbol "(" Term {"," Term} ")". | |
717 | ProofBlock ::= "theorem" Statement "proof" Proof "qed". | |
718 | Statement ::= Quantifier Statement | MultiStep. | |
719 | Quantifiers ::= "forall" Variable "where" Term. | |
720 | MultiStep ::= Term "~>" Term. | |
721 | Proof ::= Case Proof {Case Proof} | Trace. | |
722 | Trace ::= Term {"->" Term [RuleRef]}. | |
723 | RuleRef ::= "[" "by" (Symbol "." Integer | "IH") "]". | |
724 | ||
725 | Discussion | |
726 | ---------- | |
727 | ||
728 | I think that basically covers it. This document is still a little | |
729 | rough, but that's what major version zeroes are for, right? | |
730 | ||
731 | I have essentially convinced myself that the above-described system | |
732 | is sufficient for simple proof checking. There are three significant | |
733 | things I had to convince myself of to get to this point, which I'll | |
734 | describe here. | |
735 | ||
736 | One is that types have to be well-founded in order for them to serve | |
737 | as scopes for universal quantification. This is obvious in | |
738 | retrospect, but getting them into the language in a way where it was | |
739 | clear they could be checked for well-foundedness took a little | |
740 | effort. The demarcation of type-predicate rewrite rules was a big | |
741 | step, and a little disappointing because it introduces the loaded | |
742 | term `type` into Madison's vernacular, which I wanted to avoid. | |
743 | But it made it much easier to think about, and to formulate the | |
744 | rules for checking that a type is well-founded. As I mentioned, it | |
745 | could go away -- Madison could just as easily check that any | |
746 | constructor used to scope a universal quantification is well-founded. | |
747 | But that would probably muddy the presentation of the idea in this | |
748 | document somewhat. It would be something to keep in mind for a | |
749 | subsequent version of Madison that further distances itself from the | |
750 | notion of "types". | |
751 | ||
752 | Also, it would probably be possible to extend the notion of well- | |
753 | founded rewriting rules to mutually-recursive rewriting rules. | |
754 | However, this would complicate the procedure for checking that a | |
755 | type predicate is well-founded. | |
756 | ||
757 | The second thing I had to accept to get to this conviction is that | |
758 | `and(X,Y)` is built into the language. It can't just be defined | |
759 | in Madison code, because while this would be wonderful from a | |
760 | standpoint of minimalism, Madison has to know what it means to let | |
761 | you write non-trivial inductive proofs. In a nutshell, it has to | |
762 | know that `foo(X) -> and(bar(X),baz(X))` means that if `foo(X)` is | |
763 | true, then `bar(X)` is also true, and `baz(X)` is true as well. | |
764 | ||
765 | I considered making `or(X,Y)` a built-in as well, but after some | |
766 | thought, wasn't convinced that it was that valuable in the kinds | |
767 | of proofs I wanted to write. | |
768 | ||
769 | Lastly, the third thing I had to come to terms with was, in general, | |
770 | how we know a stated proof is complete. As I've tried to describe | |
771 | above, we know it's complete because each of the cases maps to a | |
772 | possible rewrite rule, and induction maps to the inductive definition | |
773 | of a type predicate, which we know is well-founded because of the | |
774 | checks Madison does on it (ultimately based on the assumption that | |
775 | the subterm property is well-founded.) There Is Nothing Else. | |
776 | ||
777 | This gets a little more complicated when you get into proofs by | |
778 | induction. The thing there is that we can assume the property | |
779 | we want to prove, in one of the cases (the inductive case) of the | |
780 | proof, so long as we have already proved all the other cases (the | |
781 | base case.) This is perfectly sound in proofs by hand, so it is | |
782 | likewise perfectly sound in a formal proof checker like Madison; | |
783 | the question is how Madison "knows" that it is sound, i.e. how it | |
784 | can be programmed to reject proofs which are not structured this | |
785 | way. Well, if we limit it to what I've just described above -- | |
786 | check that the scope of the universal quantification is well- | |
787 | founded, check that there are two cases, and check that we've already | |
788 | proved one case, then allow the inductive hypothesis to be used as a | |
789 | rewrite rule in the other case of the proof -- this is not difficult | |
790 | to see how it could be mechanized. | |
791 | ||
792 | However, this is also very limited. Let's talk about limitations. | |
793 | ||
794 | For real data structures, you might well have multiple base cases; | |
795 | for example, a tree with two kinds of leaf nodes. Does this start | |
796 | breaking down? Probably. It probably breaks down with multiple | |
797 | inductive cases, as well, although you might be able to get around | |
798 | that with breaking the proof into multiple proofs, and having | |
799 | subsequent proofs rely on properties proved in previous proofs. | |
800 | ||
801 | Another limitation I discovered when trying to write a proof that | |
802 | addition in Peano arithmetic is commutative. It seemingly can't | |
803 | be done in Madison as it currently stands, as Madison only knows | |
804 | how to rewrite something into something else, and cannot express | |
805 | the fact that two things (like `add(A,B)` and `add(B,A)`) rewrite | |
806 | to the same thing. Such a facility would be easy enough to add, | |
807 | and may appear in a future version of Madison, possibly with a | |
808 | syntax like: | |
809 | ||
810 | theorem | |
811 | forall A where nat(A) | |
812 | forall B where nat(B) | |
813 | add(A,B) ~=~ add(B,A) | |
814 | proof ... | |
815 | ||
816 | You would then show that `add(A,B)` reduces to something, and | |
817 | that `add(B,A)` reduces to something, and Madison would check | |
818 | that the two somethings are in fact the same thing. This is, in | |
819 | fact, a fairly standard method in the world of term rewriting. | |
820 | ||
821 | As a historical note, Madison is one of the pieces of fallout from | |
822 | the overly-ambitious project I started a year and a half ago called | |
823 | Rho. Rho was a homoiconic rewriting language with several very | |
824 | general capabilities, and it wasn't long before I decided it was | |
825 | possible to write proofs in it, as well as the other things it was | |
826 | designed for. Of course, this stretched it to about the limit of | |
827 | what I could keep track of in a single project, and it was soon | |
828 | afterwards abandoned. Other fallout from Rho made it into other | |
829 | projects of mine, including Pail (having `let` bindings within | |
830 | the names of other `let` bindings), Falderal (the test suite from | |
831 | the Rho implementation), and Q-expressions (a variant of | |
832 | S-expressions, with better quoting capabilities, still forthcoming.) | |
833 | ||
834 | Happy proof-checking! | |
835 | Chris Pressey | |
836 | December 2, 2011 | |
837 | Evanston, Illinois |
0 | <html> | |
1 | <head> | |
2 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> | |
3 | <meta name="Description" content="Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation"> | |
4 | <title>Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation</title> | |
5 | <!-- begin html doc dynamic markup --> | |
6 | <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script> | |
7 | <script type="text/javascript" src="/scripts/documentation.js"></script> | |
8 | <!-- end html doc dynamic markup --> | |
9 | </head> | |
10 | <body> | |
11 | ||
12 | <h1>Multi-Directional Pattern Notation</h1> | |
13 | <p>Final - Sep 6 1999</p> | |
14 | <hr> | |
15 | ||
16 | <h2>Introduction</h2> | |
17 | ||
18 | <p>MDPN is an extension to EBNF, which attributes it for | |
19 | the purposes of scanning and parsing input files which | |
20 | assume a non-unidirectional form. A familiarity with | |
21 | EBNF is assumed for the remainder of this document.</p> | |
22 | ||
23 | <p>MDPN was developed by Chris Pressey in late 1998, | |
24 | built on an earlier, less successful attempt at a "2D EBNF" devised | |
25 | to fill a void that the | |
26 | mainstream literature on parsing seemed to rarely if | |
27 | ever approach, with much help provided by John Colagioia | |
28 | throughout 1998.</p> | |
29 | ||
30 | <p>MDPN has possible uses in the construction of | |
31 | parsers and subsequently compilers for multi-directional and | |
32 | multi-dimensional languages such as Orthogonal, Befunge, | |
33 | Wierd, Blank, Plankalkül, and even less contrived notations | |
34 | like structured Flowchart and Object models of systems.</p> | |
35 | ||
36 | <p>As the name indicates, MDPN provides a notation for | |
37 | describing multidimensional patterns by | |
38 | extending the concept of linear scanning and matching | |
39 | with geometric attributes in a given number of dimensions.</p> | |
40 | ||
41 | <h2>Preconditions for Multidirectional Parsing</h2> | |
42 | ||
43 | <p>The multidirectional parsing that MDPN concerns itself | |
44 | with assumes that any portion of the input file is accessable | |
45 | at any time. Concepts such as LL(1) are fairly meaningless | |
46 | in a non-unidirectional parsing system of this sort. | |
47 | The unidirectional input devices such as paper tape and | |
48 | punch cards that were the concern of original parsing methods | |
49 | have been superceded | |
50 | by modern devices such as hard disk drives and ample, cheap RAM.</p> | |
51 | ||
52 | <p>In addition, MDPN is limited to an orthogonal | |
53 | representation of the input file, and this | |
54 | document is generally less concerned about forms of | |
55 | four or higher dimensions, to reduce unnecessary complexity.</p> | |
56 | ||
57 | <h2>Notation from EBNF</h2> | |
58 | ||
59 | <p>Syntax is drawn from EBNF. It is slightly modified, but | |
60 | should not surprise anyone who is familiar with EBNF.</p> | |
61 | ||
62 | <p>A freely-chosen unadorned ('bareword') alphabetic | |
63 | multicharacter identifier indicates | |
64 | the name of a nonterminal (named pattern) in the | |
65 | grammar. e.g. <code>foo</code>. (Single characters have special | |
66 | meaning as operators.) Case matters: <code>foo</code> is not the | |
67 | same name as <code>Foo</code> or <code>FOO</code>.</p> | |
68 | ||
69 | <p>Double quotes begin and end literal terminals (symbols.) | |
70 | e.g. <code>"bar"</code>.</p> | |
71 | ||
72 | <p>A double-colon-equals-sign (<code>::=</code>) | |
73 | describes a production (pattern match) | |
74 | by associating a single nonterminal on the left with | |
75 | a pattern on the right, terminated with a period. | |
76 | e.g. <code>foo ::= "bar".</code></p> | |
77 | ||
78 | <p>A pattern is a series of terminals, nonterminals, | |
79 | operators, and parenthetics.</p> | |
80 | ||
81 | <p>The <code>|</code> operator denotes | |
82 | alternatives. e.g. <code>"foo" | "bar"</code></p> | |
83 | ||
84 | <p>The <code>(</code> and <code>)</code> parentheses denote | |
85 | precedence and grouping.</p> | |
86 | ||
87 | <p>The <code>[</code> and <code>]</code> brackets denote | |
88 | that the contents may be omitted, that is, they | |
89 | may occur zero or one times. e.g. <code>"bar" ["baz"]</code></p> | |
90 | ||
91 | <p>The <code>{</code> and <code>}</code> braces denote | |
92 | that the contents may be omitted or may be repeated any number of times. | |
93 | e.g. <code>"bar" {baz "quuz"}</code></p> | |
94 | ||
95 | <h2>Deviations from EBNF</h2> | |
96 | ||
97 | <p>The input file is spatially related to a | |
98 | coordinate system and it is useful to | |
99 | think of the input mapped to an orthogonally | |
100 | distributed (Cartesian) form with no arbitrary | |
101 | limit imposed on its size, hereinafter referred to | |
102 | as <i>scan-space</i>.</p> | |
103 | ||
104 | <p>The input file is mapped to scan-space. The | |
105 | first printable character in the input file always | |
106 | maps to the <i>origin</i> of scan-space regardless of | |
107 | the number of dimensions. The origin is enumerated | |
108 | with coordinates (0) in one dimension, (0,0) in two | |
109 | dimensions, (0,0,0) in three dimensions, etc.</p> | |
110 | ||
111 | <p>Scan-space follows the 'computer storage' co-ordinate | |
112 | system so that <i>x</i> coordinates increase to | |
113 | the 'east' (rightwards), <i>y</i> coordinates increase to the | |
114 | 'south' (downwards), and <i>z</i> coordinates increase on each | |
115 | successive 'page'.</p> | |
116 | ||
117 | <p>Successive characters in the input file indicate | |
118 | successive coordinate (<i>x</i>) values in scan-space. | |
119 | For two and three dimensions, end-of-line markers are assumed | |
120 | to indicate "reset the <i>x</i> dimension and increment | |
121 | the <i>y</i> dimension", and end-of-page markers | |
122 | indicate "reset the <i>y</i> dimension and increment | |
123 | the <i>z</i> dimension", | |
124 | thus following the | |
125 | commonplace mapping of computer text printouts.</p> | |
126 | ||
127 | <p>Whitespace in the input file are <b>not</b> ignored. | |
128 | The terminal <code>" "</code>, however, will match any | |
129 | whitespace (including tabs, which are <b>not</b> expanded.) | |
130 | The pattern <code>{" "}</code> may be used to indicate | |
131 | any number of whitespaces; <code>" " {" "}</code> may be used to | |
132 | indicate one or more whitespaces. Areas of scan-space | |
133 | beyond the bounds of the input file are considered to be filled | |
134 | with whitespaces.</p> | |
135 | ||
136 | <p>Therefore, <code>"hello"</code> as a terminal is exactly | |
137 | the same as <code>"h" "e" "l" "l" "o"</code> as an pattern of | |
138 | terminals.</p> | |
139 | ||
140 | <p>A <code>}</code> closing brace can be followed by a <code>^</code> | |
141 | (<i>constraint</i>) | |
142 | operator, which is followed by an expression in parentheses.</p> | |
143 | ||
144 | <ul> | |
145 | <p>This expression is actually in a subnotation which supports | |
146 | a very simple form of algebra. The expression (built with | |
147 | terms connected by infix <code>+-*/%</code> operators with their | |
148 | C language meanings) can either reduce to | |
149 | <ul> | |
150 | <li>a constant value, as in <code>{"X"} ^ (5)</code>, which | |
151 | would match five <code>X</code> terminals in a line; or | |
152 | <li>an unknown value, which can involve any single lowercase | |
153 | letters, which indicate variables local to the production, | |
154 | as in <code>{"+"}^(x) {"-"}^(x*2)</code>, which would match only twice | |
155 | as many minus signs as plus signs. | |
156 | </ul></p> | |
157 | ||
158 | <p>Complex algebraic expressions in constraints can and probably | |
159 | should be avoided when constructing a MDPN grammar | |
160 | for a real (non-contrived) compiler. MDPN-based | |
161 | compiler-compilers aren't expected to support more than | |
162 | one or two unknowns per expression, for example. | |
163 | There is no such restriction, of course, | |
164 | when using MDPN as a guide for hand-coding a multidimensional | |
165 | parser, or otherwise using it as a more | |
166 | sophisticated pattern-matching tool.</p> | |
167 | </ul> | |
168 | ||
169 | <h2>The Scan Pointer</h2> | |
170 | ||
171 | <p>It is useful to imagine a <i>scan pointer</i> (SP, | |
172 | not to be confused with a <i>stack pointer</i>, which is not | |
173 | the concern of this document) which is analogous to the | |
174 | current token in a single-dimensional parser, but exists | |
175 | in MDPN as a free spatial relationship to the input file, and | |
176 | thus also has associated geometric attributes such as | |
177 | direction.</p> | |
178 | ||
179 | <p>The SP's <i>location</i> is advanced through scan-space by its | |
180 | <i>heading</i> as terminals in the productions | |
181 | are successfully matched with symbols in the input buffer.</p> | |
182 | ||
183 | <p>The following geometric attribution operators modify the | |
184 | properties of the SP. Note that injudicious | |
185 | use of any of these operators <i>can</i> result in an infinite loop | |
186 | during scanning. There is no built-in contingency measure to | |
187 | escape from an infinite parsing loop in MDPN (but see | |
188 | exclusivity, below, for a possible way to overcome this.)</p> | |
189 | ||
190 | <p><code>t</code> is the relative translation operator. It is | |
191 | followed by a vector, in parentheses, which is added to the | |
192 | location of the SP. This does not change its heading.</p> | |
193 | ||
194 | <p>For example, <code>t (0,-1)</code> moves the | |
195 | SP one symbol above the current symbol (the symbol which was | |
196 | <i>about</i> to be matched.)</p> | |
197 | ||
198 | <p>As a more realistic example of how this works, consider that the | |
199 | pattern <code>"." t(-1,1) "!" t(0,-1)</code> will match a | |
200 | period with an exclamation point directly below it, like: | |
201 | ||
202 | <pre>. | |
203 | !</pre> | |
204 | </p> | |
205 | ||
206 | <p><code>r</code> is the relative rotation operator. It is | |
207 | followed by an axis identifier (optional: see below) | |
208 | and an orthogonal angle (an angle <i>a</i> | |
209 | such that |<i>a</i>| <b>mod</b> 90 degrees = 0) assumed to | |
210 | be measured in degrees, both in parentheses. | |
211 | The angle is added to the SP's heading. | |
212 | Negative angle arguments are allowed.</p> | |
213 | ||
214 | <p>Described in two dimensions, the (default) heading 0 denotes 'east,' that | |
215 | is, parsing character by character in a rightward direction, where | |
216 | the SP's <i>x</i> axis coordinate increases and all other axes coordinates | |
217 | stay the same. Increasing angles ascend counterclockwise (90 = 'north', | |
218 | 180 = 'west', 270 = 'south'.)</p> | |
219 | ||
220 | <p>For example, <code>">" r(-90) "+^"</code> would match | |
221 | ||
222 | <pre>>+ | |
223 | ^</pre> | |
224 | </p> | |
225 | ||
226 | <p>The axis identifier indicates which axis this rotation | |
227 | occurs around. If the axis identifier is omitted, | |
228 | the <i>z</i> axis is to be assumed, since this is certainly the | |
229 | most common axis to rotate about, in two dimensions.</p> | |
230 | ||
231 | <p>If the axis identifier is present, it may be a single letter | |
232 | in the set <code>xyz</code> (these unsurprisingly indicate | |
233 | the <i>x</i>, <i>y</i>, and <i>z</i> dimensions respectively), | |
234 | or it may be a non-negative integer value, where 0 corresponds to the | |
235 | <i>x</i> dimension, 1 corresponds to the <i>y</i> dimension, etc. | |
236 | (Implementation note: in more than two dimensions, | |
237 | the SP's heading property should probably be broken up | |
238 | internally into theta, rho, &c components as appropriate.)</p> | |
239 | ||
240 | <p>For example, <code>r(z,180)</code> rotates the SP's heading | |
241 | 180 degrees about the <i>z</i> (dimension #2) axis, as does <code>r(2,180)</code> | |
242 | or even just <code>r(180)</code>.</p> | |
243 | ||
244 | <p><code><</code> and <code>></code> are the push | |
245 | and pop state-stack operators, respectively. Alternately, | |
246 | they can be viewed as lookahead-assertion parenthetics, since the stack | |
247 | is generally assumed to be | |
248 | local to the production. (Compiler-compilers should probably | |
249 | notify the user, but not necessarily panic, if they find unbalanced | |
250 | <code><></code>'s.)</p> | |
251 | ||
252 | <p>All properties of the SP (including location and heading, | |
253 | and scale factor if supported) are pushed as a group | |
254 | onto the stack during <code><</code> and popped as a group | |
255 | off the stack during <code>></code>.</p> | |
256 | ||
257 | <h2>Advanced SP Features</h2> | |
258 | ||
259 | <p>These features are not absolutely necessary for most | |
260 | non-contrived multi-directional grammars. MDPN compiler-compilers | |
261 | are not expected to support them.</p> | |
262 | ||
263 | <p><code>T</code> is the absolute translation operator. It is | |
264 | followed by a vector which is assigned to the location of the SP. | |
265 | e.g. <code>T (0,0)</code> will 'home' the scan.</p> | |
266 | ||
267 | <p><code>R</code> is the absolute rotation operator. It is | |
268 | followed by an optional axis identifier, and an | |
269 | orthogonal angle assumed to be measured in degrees. | |
270 | The SP's heading is set to this | |
271 | angle. e.g. <code>R(270)</code> sets the | |
272 | SP scanning line after line down the input text, downwards. See | |
273 | the <code>r</code> operator, above, for how the axis identifier functions.</p> | |
274 | ||
275 | <p><code>S</code> is the absolute scale operator. It is | |
276 | followed by an orthogonal <i>scaling factor</i> (a scalar <i>s</i> | |
277 | such that <i>s</i> = <b>int</b>(<i>s</i>) and <i>s</i> >= 1). | |
278 | The SP's scale factor is set to this value. The finest | |
279 | possible scale, 1, indicates a 1:1 map with the input file; | |
280 | for each one input symbol matched, the SP advances one symbol | |
281 | in its path. When the scale factor is two, then for each | |
282 | one input symbol matched, the SP advances two symbols, skipping | |
283 | over an interim symbol. Etc.</p> | |
284 | ||
285 | <p><code>s</code> is the relative scale operator. It is | |
286 | followed by a scalar integer which is added to the SP's scaling factor | |
287 | (so long as it does not cause the scaling factor to be zero or negative.)</p> | |
288 | ||
289 | <p>Scale operators may also take an optional axis identifier | |
290 | (as in <code>S(y,2)</code>), but when the axis identifier is omitted, | |
291 | all axes are assumed (non-distortional scaling).</p> | |
292 | ||
293 | <p><code>!></code> is a state-assertion alternative to | |
294 | <code>></code>, for the purpose of determining that | |
295 | the SP successfully and completely | |
296 | reverted to a previous state that | |
297 | was pushed onto the stack ('came full circle'). This | |
298 | operator is something of a luxury; a | |
299 | grammar which uses constraints correctly should | |
300 | never <i>need</i> it, but it can come in handy.</p> | |
301 | ||
302 | <h2>Other Advanced Features: Exclusivity</h2> | |
303 | ||
304 | <p>Lastly, in the specification of a production, | |
305 | the <i>exclusivity</i> applying to that production can be given | |
306 | between a hyphen following | |
307 | the name of the nonterminal, and the <code>::=</code> operator.</p> | |
308 | ||
309 | <p>Exclusivity is a list of productions, named by their | |
310 | nonterminals, and comes into play at any particular <i>instance</i> | |
311 | of the production (i.e. when the production successfully | |
312 | matches specific symbols at specific points in scan-space during a parse, | |
313 | called the <i>domain</i>.) The exclusivity describes how the domain of each | |
314 | instance is protected from being the domain of any further instances. | |
315 | The domain of any subsequent instances of any productions listed | |
316 | in the exclusivity is restricted from sharing points in scan-space | |
317 | with the established domain.</p> | |
318 | ||
319 | <p>Exclusivity is a measure to prevent so-called <i>crossword | |
320 | grammars</i> - that is, where instances of productions | |
321 | can <i>overlap</i> and share common symbols - if desired. | |
322 | Internally it's generally considered a list of | |
323 | 'used-by-this-production' references | |
324 | associated with each point in scan-space. | |
325 | An example of the syntax to specify exclusivity is | |
326 | <code>bar - bar quuz ::= foo {"a"} baz</code>. | |
327 | Note that the domain of an instance of <code>bar</code> is the | |
328 | sum of the | |
329 | domains <code>foo</code>, <code>baz</code> and the chain of "<code>a</code>" terminals, | |
330 | and that neither a subsequent instance of <code>quuz</code> nor <code>bar</code> | |
331 | again can overlap it.</p> | |
332 | ||
333 | <h2>Examples of MDPN-described Grammars</h2> | |
334 | ||
335 | <p><b>Example 1.</b> A grammar for describing boxes.</p> | |
336 | ||
337 | <p>The task of writing a translator to recognize a | |
338 | two-dimensional construct such as a box can easily | |
339 | be realized using a tool such as MDPN.</p> | |
340 | ||
341 | <p>An input file might contain a of box drawn in | |
342 | ASCII characters, such as | |
343 | ||
344 | <pre>+------+ | |
345 | | | | |
346 | | | | |
347 | +------+</pre> | |
348 | </p> | |
349 | ||
350 | <p>Let's also say that boxes have a minimum height of four (they | |
351 | must contain at least two rows), but | |
352 | no minimum width. Also, edge characters must match up with | |
353 | which edge they are on. So, the following forms are both | |
354 | illegal inputs: | |
355 | ||
356 | <pre>+-+ | |
357 | +-+</pre> | |
358 | ||
359 | <pre>+-|-+ | |
360 | | | | |
361 | * | |
362 | | | | |
363 | +-|-+</pre> | |
364 | </p> | |
365 | ||
366 | <p>The MDPN production used to describe this box might be | |
367 | ||
368 | <pre> Box ::= "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90) | |
369 | "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90).</pre> | |
370 | </p> | |
371 | ||
372 | <p><b>Example 2.</b> A simplified grammar for Plankalkül's assignments.</p> | |
373 | ||
374 | <p>An input file might contain an ASCII approximation of something | |
375 | Zuse might have jotted down on paper: | |
376 | ||
377 | <pre> |Z + Z => Z | |
378 | V|1 2 3 | |
379 | S|1.n 1.n 1.n</pre> | |
380 | </p> | |
381 | ||
382 | <p>Simplified MDPN productions used to describe this might be | |
383 | ||
384 | <pre> | |
385 | Staff ::= Spaces "|" TempVar AddOp TempVar Assign TempVar.<br> | |
386 | TempVar ::= "Z" t(-1,1) Index t(-1,1) Structure t(0,-2) Spaces. | |
387 | Index ::= <Digit {Digit}>. | |
388 | Digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9". | |
389 | Structure ::= <Digit ".n">. | |
390 | AddOp ::= ("+" | "-") Spaces. | |
391 | Assign ::= "=>" Spaces. | |
392 | Spaces ::= {" "}. | |
393 | </pre> | |
394 | </p> | |
395 | ||
396 | <hr> | |
397 | <i>Last Updated Mar 2 <img src="/images/icons/copyright.gif" | |
398 | align=absmiddle width=12 height=12 alt="(c)" border=0>2004 <a | |
399 | href="/">Cat's Eye Technologies</a>. | |
400 | All rights reserved. | |
401 | This document may be freely redistributed in its unchanged form.</i> | |
402 | </body></html> |
0 | <?xml version="1.0"?> | |
1 | <!-- encoding: utf-8 --> | |
2 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | |
3 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en"> | |
4 | <head> | |
5 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> | |
6 | <title>The Oozlybub and Murphy Programming Language</title> | |
7 | <!-- begin html doc dynamic markup --> | |
8 | <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script> | |
9 | <script type="text/javascript" src="/scripts/documentation.js"></script> | |
10 | <!-- end html doc dynamic markup --> | |
11 | </head> | |
12 | <body> | |
13 | ||
14 | <h1>Oozlybub and Murphy</h1> | |
15 | ||
16 | <p>Language version 1.1</p> | |
17 | ||
18 | <h2>Overview</h2> | |
19 | ||
20 | <p>This document describes a new programming language. | |
21 | The name of this language is <dfn>Oozlybub and Murphy</dfn>. | |
22 | Despite appearances, this name refers to a single language. | |
23 | The majority of the language is | |
24 | named Oozlybub. The fact that the language is not entirely named | |
25 | Oozlybub is named Murphy. Deal with it.</p> | |
26 | ||
27 | <p>For the sake of providing an "olde tyme esoterickal de-sign", the | |
28 | language combines several unusual features, including multiple interleaved parse | |
29 | streams, infinitely long variable names, gratuitously strong typing, and | |
30 | only-conjectural Turing completeness. | |
31 | While no implementation of the language exists as of this writing, it is thought to | |
32 | be sufficiently consistent to be implementable, modulo any errors in this docunemt.</p> | |
33 | ||
34 | <p>In places the language may resemble <a href="/projects/smith/">SMITH</a> | |
35 | and <a href="/projects/quylthulg/">Quylthulg</a>, but this was not intended, | |
36 | and the similarities are purely emergent.</p> | |
37 | ||
38 | <h2>Program Structure</h2> | |
39 | ||
40 | <p>A Oozlybub and Murphy program consists of a number of <dfn>variables</dfn> and | |
41 | a number of objects called <dfn>dynasts</dfn>. A Oozlybub and Murphy program text | |
42 | consists of multiple <dfn>parse streams</dfn>. Each parse stream contains | |
43 | zero or more variable declarations, and optionally a single dynast.</p> | |
44 | ||
45 | <h3>Parse Streams</h3> | |
46 | ||
47 | <p>A parse stream is just a segment, possibly non-contiguous, of the text of | |
48 | a Oozlybub and Murphy program. A program starts out with a single parse stream, but | |
49 | certain <dfn>parse stream manipulation pragmas</dfn> can change this. | |
50 | These pragmas have the form <code>{@<var>x</var>}</code> and | |
51 | have a similar syntactic status as comments; they can appear anywhere except | |
52 | inside a lexeme.</p> | |
53 | ||
54 | <p>Parse streams are arranged as a ring (a cyclic doubly linked list.) When parsing | |
55 | of the program text begins initially, there is already a single pre-created parse stream. | |
56 | When the program text ends, all parse streams which may be active are deleted.</p> | |
57 | ||
58 | <p>The meanings of the pragmas are:</p> | |
59 | ||
60 | <ul> | |
61 | <li><code>{@+}</code> Create a new parse stream to the right of the current one.</li> | |
62 | <li><code>{@>}</code> Switch to the parse stream to the right of the current one.</li> | |
63 | <li><code>{@<}</code> Switch to the parse stream to the left of the current one.</li> | |
64 | <li><code>{@-}</code> Delete the current parse stream. The parse stream to the | |
65 | left of the deleted parse stream will become the new current parse stream.</li> | |
66 | </ul> | |
67 | ||
68 | <p>Deleting a parse stream while it contains an unfinished syntactic construct is a syntax | |
69 | error, just as an end-of-file in that circumstance would be in most other languages.</p> | |
70 | ||
71 | <p>Providing a concrete example of parse streams in action will be difficult in the | |
72 | absence of defined syntax for the rest of Oozlybub and Murphy, so we will, for the purposes of | |
73 | the following demonstration only, pretend that the contents of a parse stream is a sentence of English. | |
74 | Here is how three parse streams might be managed:</p> | |
75 | ||
76 | <p><code>The quick {@+}brown{@>}Now is the time{@<}fox{@<} for all good | |
77 | men to {@+}{@>}Wherefore art thou {@>} jumped over {@>}{@>}Romeo?{@-} | |
78 | come to the aid of {@>}the lazy dog's tail.{@-}their country.{@-} | |
79 | </code></p> | |
80 | ||
81 | <h3>Variables</h3> | |
82 | ||
83 | <p>All variables are declared in a block at the beginning of a parse stream. | |
84 | If there is also a dynast in that stream, the variables are private to that dynast; | |
85 | otherwise they are global and shared by all dynasts. (<em>Defined in 1.1</em>) | |
86 | Any dynamically created dynast gets its own private copies of any private | |
87 | variables the original dynast had; they will initially hold the values they | |
88 | had in the original, but they are not shared.</p> | |
89 | ||
90 | <p>The name of a variable in Oozlybub and Murphy is not a fixed, finite-length string of symbols, | |
91 | as you would find in other programming languages. No sir! In Oozlybub and Murphy, each | |
92 | variable is named by a possibly-infinite set of strings | |
93 | (over the alphanumeric-plus-spaces alphabet <code>[a-zA-Z0-9 ]</code>), | |
94 | at least one of which must be infinitely long. (<em>New in 1.1</em>: spaces | |
95 | [but no other kinds of whitespace] are allowed in these strings.)</p> | |
96 | ||
97 | <p>To accomodate this method of identifying a variable, in Oozlybub and Murphy programs, | |
98 | which are finite, variables are identified using regular expressions | |
99 | which match their set of names. | |
100 | An equivalence class of regular expressions is a set of all regular expressions which | |
101 | accept exactly the same set of strings; each equivalence class of regular expressions | |
102 | refers to the same, unique Oozlybub and Murphy variable.</p> | |
103 | ||
104 | <p>(In case you wonder about the implementability of this: | |
105 | Checking that two regular expressions are | |
106 | equivalent is decidable: we convert them both to NFAs, then | |
107 | to DFAs, then minimize those DFAs, then check if the transition | |
108 | graphs of those DFAs are isomorphic. | |
109 | Checking that the regular expression accepts at least | |
110 | one infinitely-long string is also decidable: just look for | |
111 | a cycle in the DFA's graph.)</p> | |
112 | ||
113 | <p>Note that these identifier-sets need not be disjoint. <code>/ma*/</code> | |
114 | and <code>/mb*/</code> are distinct variables, even though both contain the | |
115 | string <code>m</code>. (Note also that we are fudging slightly on how we | |
116 | consider to have described an infinitely long name; technically we would | |
117 | want to have a Büchi automaton that specifies an unending repetition with | |
118 | <sup>ω</sup> instead of *. But the distinction is subtle enough in this | |
119 | context that we're gonna let it slide.)</p> | |
120 | ||
121 | <p>Syntax for giving a variable name is fairly straightforward: it is delimited on | |
122 | either side by <code>/</code> symbols; the alphanumeric symbols are literals; | |
123 | textual concatenation is regular expression sequencing, | |
124 | <code>|</code> is alteration, <code>(</code> and <code>)</code> increase precedence, | |
125 | and <code>*</code> is Kleene asteration (zero or more occurrences). | |
126 | Asteration has higher precedence than sequencing, which has higher precedence | |
127 | than alteration. Because none of these operators is alphanumeric nor a space, no | |
128 | escaping scheme needs to be installed.</p> | |
129 | ||
130 | <p>Variables are declared with the following syntax (<code>i</code> and <code>a</code> | |
131 | are the types of the variables, described in the next section):</p> | |
132 | ||
133 | <pre> | |
134 | VARIABLES ARE i /pp*/, i /qq*/, a /(0|1)*/. | |
135 | </pre> | |
136 | ||
137 | <p>This declares an integer variable identified by the names {<code>p</code>, <code>pp</code>, <code>ppp</code>, ...}, | |
138 | an integer variable identified by the names {<code>q</code>, <code>qq</code>, <code>qqq</code>, ...}, | |
139 | and an array variable identified by the names of all strings of <code>0</code>'s and <code>1</code>'s.</p> | |
140 | ||
141 | <p>When not in wimpmode (see below), any regular expression which denotes a | |
142 | variable may not be literally repeated anywhere else in the program. So in the above example, | |
143 | it would not be legal to refer to <code>/pp*/</code> further down in the program; an equivalent regular | |
144 | expression, such as <code>/p|ppp*/</code> or <code>/p*p/</code> or | |
145 | <code>/pp*|pp*|pp*/</code> would have to be used instead.</p> | |
146 | ||
147 | <h3>Types</h3> | |
148 | ||
149 | <p>Oozlybub and Murphy is a statically-typed language, in that variables | |
150 | as well as values have types, and a value of one type cannot be stored in | |
151 | a variable of another type. The types of values, however, are not entirely | |
152 | disjoint, as we will see, and special considerations may arise for checking | |
153 | and conversion because of this.</p> | |
154 | ||
155 | <p>The basic types are:</p> | |
156 | ||
157 | <ul> | |
158 | <li><code>i</code>, the type of integers. | |
159 | <p>These are integers of unbounded extent, both positive and negative. | |
160 | Literal constants of type <code>i</code> are given in the usual decimal format. | |
161 | Variables of this type initially contain the value 0.</p> | |
162 | </li> | |
163 | ||
164 | <li><code>p</code>, the type of prime numbers. | |
165 | <p>All prime numbers are integers but not all integers are prime numbers. | |
166 | Thus, values of prime number type will automatically be coerced to | |
167 | integers in contexts that require integers; however the reverse is not true, | |
168 | and in the other direction a conversion function (<code>P?</code>) must | |
169 | be used. There are no literal constants of type <code>p</code>. Variables | |
170 | of this type initially contain the value 2.</p> | |
171 | </li> | |
172 | ||
173 | <li><code>a</code>, the type of arrays of integers. | |
174 | <p>An integer array has an integer index which is likewise of unbounded | |
175 | extent, both positive and negative. Variables of this type initially | |
176 | contain an empty array value, where all of the entries are 0.</p> | |
177 | </li> | |
178 | ||
179 | <li><code>b</code>, the type of booleans. | |
180 | <p>A boolean has two possible values, <code>true</code> and <code>false</code>. | |
181 | Note that there are no literal constants of type <code>b</code>; these must be | |
182 | specified by constructing a tautology or contradiction with boolean (or other) operators. | |
183 | It is illegal to retrieve the value of a variable of this type before first assigning it, | |
184 | except to construct a tautology or contradiction.</p> | |
185 | </li> | |
186 | ||
187 | <li><code>t</code>, the type of truth-values. | |
188 | <p>A truth-value has two possible values, <code>yes</code> and <code>no</code>. | |
189 | There are no literal constants of type <code>t</code>. | |
190 | It is illegal to retrieve the value of a variable of this type before first assigning it, | |
191 | except to construct a tautology or contradiction.</p> | |
192 | </li> | |
193 | ||
194 | <li><code>z</code>, the type of bits. | |
195 | <p>A bit has two possible values, <code>one</code> and <code>zero</code>. | |
196 | There are no literal constants of type <code>z</code>. | |
197 | It is illegal to retrieve the value of a variable of this type before first assigning it, | |
198 | except to construct a tautology or contradiction.</p> | |
199 | </li> | |
200 | ||
201 | <li><code>c</code>, the type of conditions. | |
202 | <p>A condition has two possible values, <code>go</code> and <code>nogo</code>. | |
203 | There are no literal constants of type <code>c</code>. | |
204 | It is illegal to retrieve the value of a variable of this type before first assigning it, | |
205 | except to construct a tautology or contradiction.</p> | |
206 | </li> | |
207 | ||
208 | </ul> | |
209 | ||
210 | <h3>Wimpmode</h3> | |
211 | ||
212 | <p>(<em>New in 1.1</em>) An Oozlybub and Murphy program is in <dfn>wimpmode</dfn> if it | |
213 | declares a global variable of integer type which matches the string <code>am a wimp</code>, | |
214 | for example:</p> | |
215 | ||
216 | <pre> | |
217 | VARIABLES ARE i /am *a *wimp/. | |
218 | </pre> | |
219 | ||
220 | <p>Certain language constructs, noted in this document as such, | |
221 | are only permissible in wimpmode. | |
222 | If they are used in a program in which wimpmode is not in effect, a | |
223 | compile-time error shall occur and the program shall not be executed.</p> | |
224 | ||
225 | <h3>Dynasts</h3> | |
226 | ||
227 | <p>Each dynast is labeled with a positive integer and contains an expression. | |
228 | Only one dynast may be denoted in any given parse stream, but dynasts may also | |
229 | be created dynamically during program execution.</p> | |
230 | ||
231 | <p>Program execution begins at the lowest-numbered dynast that exists in the initial program. | |
232 | When a dynast is executed, the expression of that dynast is evaluated for its side-effects. | |
233 | If there is a dynast labelled with the next higher integer (i.e. the successor of the label | |
234 | of the current dynast), | |
235 | execution continues with that dynast; otherwise, the program halts. | |
236 | Once a dynast has been executed, it continues to exist until the program | |
237 | halts, but it may never be executed again.</p> | |
238 | ||
239 | <p>Evaluation of an expression may have side-effects, including | |
240 | writing characters to an output channel, reading characters from | |
241 | an input channel, altering the value of a variable, and creating | |
242 | a new dynast.</p> | |
243 | ||
244 | <p>Dynasts are written with the syntax <code>dynast(<var>label</var>) <-> <var>expr</var></code>. A concrete example follows:</p> | |
245 | ||
246 | <pre> | |
247 | dynast(100) <-> for each prime /p*/ below 1000 do write (./p*|p/+1.) | |
248 | </pre> | |
249 | ||
250 | <h3>TRIVIA PORTION OF SHOW</h3> | |
251 | ||
252 | <p>WHO WAS IT FAMOUS MAN THAT SAID THIS?</p> | |
253 | ||
254 | <ul> | |
255 | <li>A) RONALD REAGAN</li> | |
256 | <li>B) RONALD REAGAN</li> | |
257 | <li>B) RONALD STEWART</li> | |
258 | <li>C) RENALDO</li> | |
259 | </ul> | |
260 | ||
261 | <p>contestant enters lightning round now</p> | |
262 | ||
263 | <h3>Expressions</h3> | |
264 | ||
265 | <p>In the following, the letter preceding <var>-expr</var> or <var>-var</var> indicates | |
266 | the expected type, if any, of that expression or variable. Where the expressions listed below are infix | |
267 | expressions, they are listed from highest to lowest precedence. Unless noted otherwise, | |
268 | subexpressions are evaluated left to right.</p> | |
269 | ||
270 | <ul> | |
271 | ||
272 | <li><code>(.<var>expr</var>.)</code> | |
273 | <p>Surrounding an expression with <dfn>dotted parens</dfn> gives it that precedence boost | |
274 | that's just the thing to have it be evaluated before the expression it's in, but | |
275 | there is a catch. The number of parens in the dotted parens expression must | |
276 | match the nesting depth in the following way: if a set of dotted parens is | |
277 | nested within <var>n</var> dotted parens, it must contain fib(<var>n</var>) | |
278 | parens, where fib(<var>n</var>) is the <var>n</var>th member of the | |
279 | Fibonacci sequence. For example, <code>(.(.0.).)</code> and <code>(.(.((.(((.(((((.0.))))).))).)).).)</code> | |
280 | are syntactically well-formed expressions (when not nested in any other dotted paren | |
281 | expression), but <code>(.(((.0.))).)</code> and | |
282 | <code>(.(.(.0.).).)</code> are not.</p> | |
283 | </li> | |
284 | ||
285 | <li><code><var>var</var></code> | |
286 | <p>A variable evaluates to the value it contains at that point in execution.</p> | |
287 | </li> | |
288 | ||
289 | <li><code>0</code>, <code>1</code>, <code>2</code>, <code>3</code>, etc. | |
290 | <p>Decimal literals evaluate to the expected value of type <code>i</code>.</p> | |
291 | </li> | |
292 | ||
293 | <li><code>#myself#</code> | |
294 | <p>This special nullary token evaluates to the numeric label of the currently | |
295 | executing dynast.</p> | |
296 | </li> | |
297 | ||
298 | <li><code><var>var</var> := <var>expr</var></code> | |
299 | <p>Evaluates the <var>expr</var> and stores the result in the specified variable. | |
300 | The variable and the expression must have the same type. Evaluates to whatever | |
301 | <var>expr</var> evaluated to.</p> | |
302 | </li> | |
303 | ||
304 | <li><code><var>a-expr</var> [<var>i-expr</var>]</code> | |
305 | <p>Evaluates to the <code>i</code> stored at the location in the | |
306 | array given by <var>i-expr</var>.</p> | |
307 | </li> | |
308 | ||
309 | <li><code><var>a-expr</var> [<var>i-expr</var>] := <var>i-expr</var></code> | |
310 | <p>Evaluates the second <var>i-expr</var> and stores the result in the location | |
311 | in the array given by the first <var>i-expr</var>. Evaluates to whatever | |
312 | the second <var>i-expr</var> evaluated to.</p> | |
313 | </li> | |
314 | ||
315 | <li><code><var>a-expr</var> ? <var>i-expr</var></code> | |
316 | <p>Evaluates to <code>go</code> if <code><var>a-expr</var> [<var>i-expr</var>]</code> | |
317 | and <code><var>i-expr</var></code> evaluate to the same thing, <code>nogo</code> otherwise. | |
318 | The <var>i-expr</var> is only evaluated once.</p> | |
319 | </li> | |
320 | ||
321 | <li><code>minus <var>i-expr</var></code> | |
322 | <p>Evaluate to the integer that is zero minus the result of evaluating <var>i-expr</var>.</p> | |
323 | </li> | |
324 | ||
325 | <li><code>write <var>i-expr</var></code> | |
326 | <p>Write the Unicode code point whose number is obtained by evaluating <var>i-expr</var>, to | |
327 | the standard output channel. Writing a negative number shall produce one of a number of | |
328 | amusing and informative messages which are not defined by this document.</p> | |
329 | </li> | |
330 | ||
331 | <li><code>#read#</code> | |
332 | <p>Wait for a Unicode character to become available on the standard | |
333 | input channel and evaluate to its integer code point value.</p> | |
334 | </li> | |
335 | ||
336 | <li><code>not? <var>z-expr</var></code> | |
337 | <p>Converts a bit value to a boolean value (<code>zero</code> becomes <code>true</code> and <code>one</code> becomes <code>false</code>).</p> | |
338 | </li> | |
339 | ||
340 | <li><code>if? <var>b-expr</var></code> | |
341 | <p>Converts a boolean value to condition value (true becomes go and false becomes nogo).</p> | |
342 | </li> | |
343 | ||
344 | <li><code>cvt? <var>c-expr</var></code> | |
345 | <p>Converts a condition value to a truth-value (<code>go</code> becomes <code>yes</code> and <code>nogo</code> becomes <code>no</code>).</p> | |
346 | </li> | |
347 | ||
348 | <li><code>to? <var>t-expr</var></code> | |
349 | <p>Converts a truth-value to a bit value (<code>yes</code> becomes <code>one</code> and <code>no</code> becomes <code>zero</code>).</p> | |
350 | </li> | |
351 | ||
352 | <li><code>P? <var>i-expr</var> [<var>t-var</var>]</code> | |
353 | <p>If the result of evaluating <var>i-expr</var> is a prime number, evaluates to that | |
354 | prime number (and has the type <code>p</code>). If it is not prime, stores the value <code>no</code> | |
355 | into <var>t-var</var> and evaluates to 2.</p> | |
356 | </li> | |
357 | ||
358 | <li><code><var>i-expr</var> * <var>i-expr</var></code> | |
359 | <p>Evaluates to the product of the two <var>i-expr</var>s. The result is never | |
360 | of type <code>p</code>, but the implementation doesn't need to do anything | |
361 | based on that fact.</p> | |
362 | </li> | |
363 | ||
364 | <li><code><var>i-expr</var> + <var>i-expr</var></code> | |
365 | <p>Evaluates to the sum of the two <var>i-expr</var>s.</p> | |
366 | </li> | |
367 | ||
368 | <li><code>exists/dynast <var>i-expr</var></code> | |
369 | <p>Evaluates to <code>one</code> if a dynast exists with the given label, | |
370 | or <code>zero</code> if one does not.</p> | |
371 | </li> | |
372 | ||
373 | <li><code>copy/dynast <var>i-expr</var>, <var>p-expr</var>, <var>p-expr</var></code> | |
374 | <p>Creates a new dynast based on an existing one. The existing one is identified by | |
375 | the label given in the <var>i-expr</var>. The new dynast is a copy of the existing | |
376 | dynast, but with a new label. The new label is the sum of the two <var>p-expr</var>s. | |
377 | If a dynast with that label already exists, the program terminates. | |
378 | (<em>Defined in 1.1</em>) This expression evaluates to the value of the given <var>i-expr</var>.</p> | |
379 | </li> | |
380 | ||
381 | <li><code>create/countably/many/dynasts <var>i-expr</var>, <var>i-expr</var></code> | |
382 | <p>Creates a countably infinite number of dynasts based on an existing one. The existing one is identified by | |
383 | the label given in the first <var>i-expr</var>. The new dynasts are copies of the existing | |
384 | dynast, but with new labels. The new labels start at the first odd integer | |
385 | greater than the second <var>i-expr</var>, and consist of every odd integer greater than that. | |
386 | If any dynast with such a label already exists, the program terminates. | |
387 | (<em>Defined in 1.1</em>) This expression evaluates to the value of the first given <var>i-expr</var>.</p> | |
388 | </li> | |
389 | ||
390 | <li><code><var>b-expr</var> and <var>b-expr</var></code> | |
391 | <p>Evaluates to <code>one</code> if both <var>b-expr</var>s are <code>true</code>, | |
392 | <code>zero</code> otherwise. Note that this is not short-circuting; both <var>b-expr</var>s | |
393 | are evaluated.</p> | |
394 | </li> | |
395 | ||
396 | <li><code><var>c-expr</var> or <var>c-expr</var></code> | |
397 | <p>Evaluates to <code>yes</code> if either or both <var>c-expr</var>s are <code>go</code>, | |
398 | <code>no</code> otherwise. Note that this is not short-circuting; both <var>c-expr</var>s | |
399 | are evaluated.</p> | |
400 | </li> | |
401 | ||
402 | <li><code>do <var>expr</var></code> | |
403 | <p>Evaluates the <var>expr</var>, throws away the result, and evaluates to <code>go</code>.</p> | |
404 | </li> | |
405 | ||
406 | <li><code><var>c-expr</var> then <var>expr</var></code> | |
407 | <p><strong>Wimpmode only.</strong> Evaluates the <var>c-expr</var> on the left-hand side for its side-effects only, | |
408 | throwing away the result, then evaluates to the result of evaluating the right-hand | |
409 | side <var>expr</var>.</p> | |
410 | </li> | |
411 | ||
412 | <li><code><var>c-expr</var> ,then <var>i-expr</var></code> | |
413 | <p>(<em>New in 1.1</em>) Evaluates the <var>c-expr</var> on the left-hand side; if it is <code>go</code>, | |
414 | evaluates to the result of evaluating the right-hand side <var>i-expr</var>; | |
415 | if it is <code>nogo</code>, evaluates to an unspecified and quite possibly random | |
416 | integer between 1 and 1000000 inclusive, without evaluating the right-hand side. | |
417 | Note that this operator has the same precedence as <code>then</code>.</p> | |
418 | </li> | |
419 | ||
420 | <li><code>for each prime <var>var</var> below <var>i-expr</var> do <var>i-expr</var></code> | |
421 | <p>The <var>var</var> must be a declared variable of type <code>p</code>. The first | |
422 | <var>i-expr</var> must evaluate to an integer, which we will call <var>k</var>. | |
423 | The second <var>i-expr</var> is evaluated once for each prime number between | |
424 | <var>k</var> and 2, inclusive; each time it is evaluated, <var>var</var> is bound to | |
425 | a successively smaller prime number between <var>k</var> and 2. | |
426 | (<em>Defined in 1.1</em>) Evaluates to the result of the final evaluation of the second <var>i-expr</var>.</p> | |
427 | </li> | |
428 | ||
429 | </ul> | |
430 | ||
431 | <h3>Grammar</h3> | |
432 | ||
433 | <p>This section attempts to capture and summarize the syntax rules (for a single | |
434 | parse stream) described above, using an EBNF-like syntax extended with a few | |
435 | ad-hoc annotations that I don't feel like explaining right now.</p> | |
436 | ||
437 | <pre> | |
438 | ParseStream ::= VarDeclBlock {DynastLit}. | |
439 | VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".". | |
440 | VarDecl ::= TypeSpec VarName. | |
441 | TypeSpec ::= "i" | "p" | "a" | "b" | "t" | "z" | "c". | |
442 | VarName ::= "/" Pattern "/". | |
443 | Pattern ::= {[a-zA-Z0-9 ]} | |
444 | | Pattern "|" Pattern /* ignoring precedence here */ | |
445 | | Pattern "*" /* and here */ | |
446 | | "(" Pattern ")". | |
447 | DynastLit ::= "dynast" "(" Gumber ")" "<->" Expr. | |
448 | Expr ::= Expr1[c] {"then" Expr1 | ",then" Expr1[i]}. | |
449 | Expr1 ::= Expr2[c] {"or" Expr2[c]}. | |
450 | Expr2 ::= Expr3[b] {"and" Expr3[b]}. | |
451 | Expr3 ::= Expr4[i] {"+" Expr4[i]}. | |
452 | Expr4 ::= Expr5[i] {"*" Expr5[i]}. | |
453 | Expr5 ::= Expr6[a] {"?" Expr6[i]}. | |
454 | Expr6 ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]]. | |
455 | Prim ::= {"("}* "." Expr "." {")"}* /* remember the Fibonacci rule! */ | |
456 | | VarName [":=" Expr] | |
457 | | Gumber | |
458 | | "#myself#" | |
459 | | "minus" Expr[i] | |
460 | | "write" Expr[i] | |
461 | | "#read#" | |
462 | | "not?" Expr[z] | |
463 | | "if?" Expr[b] | |
464 | | "cvt?" Expr[c] | |
465 | | "to?" Expr[t] | |
466 | | "P?" Expr[i] | |
467 | | "exists/dynast" Expr[i] | |
468 | | "copy/dynast" Expr[i] "," Expr[p] "," Expr[p] | |
469 | | "create/countably/many/dynasts" | |
470 | Expr[i] "," Expr[i] | |
471 | | "do" Expr | |
472 | | "for" "each" "prime" VarName "below" | |
473 | Expr[i] "do" Expr[i]. | |
474 | Gumber ::= {[0-9]}. | |
475 | </pre> | |
476 | ||
477 | <h3>Boolean Idioms</h3> | |
478 | ||
479 | <p>Here we show how we can get any value of any of the <code>b</code>, <code>t</code>, | |
480 | <code>z</code>, and <code>c</code> types, without any constants or variables with | |
481 | known values of these types.</p> | |
482 | ||
483 | <pre> | |
484 | VARIABLES ARE b /b*/. | |
485 | zero = /b*|b/ and not? to? cvt? if? /b*|b*/ | |
486 | true = not? zero | |
487 | go = if? true | |
488 | yes = cvt? go | |
489 | one = to? yes | |
490 | false = not? one | |
491 | nogo = if? false | |
492 | no = cvt? nogo | |
493 | </pre> | |
494 | ||
495 | <h3>Computational Class</h3> | |
496 | ||
497 | <p>Because the single in-dynast looping construct, <code>for each prime below</code>, is | |
498 | always a finite loop, the execution of any fixed number of dynasts cannot be Turing-complete. | |
499 | We must create new dynasts at runtime, and continue execution in them, if we want | |
500 | any chance at being Turing-complete. We demonstrate this by showing an example of | |
501 | a (conjecturally) infinite loop in Oozlybub and Murphy, an idiom which will doubtless | |
502 | come in handy in real programs.</p> | |
503 | ||
504 | <pre> | |
505 | VARIABLES ARE p /p*/, p /q*/. | |
506 | dynast(3) <-> | |
507 | (. do (. if? not? exists/dynast 5 ,then | |
508 | create/countably/many/dynasts #myself#, 5 .) .) ,then | |
509 | (. for each prime /p*|p/ below #myself#+2 do | |
510 | for each prime /q*|q/ below /p*|pp/+1 do | |
511 | if? not? exists/dynast /p*|p|p/+/q*|q|q/ ,then | |
512 | copy/dynast #myself#, /p*|ppp/, /q*|qqq/ .) | |
513 | </pre> | |
514 | ||
515 | <p>As you can see, the ability to loop indefinitely in Oozlybub and Murphy hinges on whether | |
516 | Goldbach's Conjecture is true or not. Looping forever requires creating an unbounded number of | |
517 | new dynasts. We can create all the odd-numbered dynasts at once, but that won't be enough | |
518 | to loop forever, as we must proceed to the next highest numbered dynast after executing a | |
519 | dynast. So we must create new dynasts with successively higher even integer labels, and these | |
520 | can only be created by summing two primes. So, if Goldbach's conjecture is false, then there | |
521 | is some even number greater than two which is not the sum of two primes; thus there is some | |
522 | dynast that cannot be created by a running Oozlybub and Murphy program, thus it is not possible | |
523 | to loop forever in Oozlybub and Murphy, thus Oozlybub and Murphy is not Turing-complete | |
524 | (because it cannot simulate any Turing machine that loops forever.)</p> | |
525 | ||
526 | <p>It should not however be difficult to show that Oozlybub and Murphy is Turing-complete under | |
527 | the assumption that Goldbach's Conjecture is true. If Goldbach's Conjecture is true, then the above program | |
528 | is an infinite loop. We need only add to it appropriate conditional instructions to, | |
529 | say, simulate the execution of an arbitrarily-chosen Turing machine. An array can serve as the | |
530 | tape, and an integer can serve as the head. Another integer can serve as the state of the finite | |
531 | control. The integer can be tested against various fixed integers by establishing an array for each | |
532 | of these fixed integers and using the <code>?</code> operator against each in turn; each branch | |
533 | can mutate the tape, tape head, and finite control as desired. The program can halt by neglecting | |
534 | to create a new even dynast to execute next, or by trying to create a dynast with a label that | |
535 | already exists.</p> | |
536 | ||
537 | <p>Happy FLIMPING, | |
538 | <br/>Chris Pressey | |
539 | <br/>December 1, 2010 | |
540 | <br/>Evanston, Illinois, USA | |
541 | </p> | |
542 | ||
543 | </body> | |
544 | </html> |
0 | <html><head> | |
1 | <meta http-equiv="Content-Type" content="text/html;CHARSET=iso-8859-1"> | |
2 | <title>Cat's Eye Technologies: The Opus-2 Language</title> | |
3 | </head> | |
4 | <body bgcolor="#ffffff" text="#000000" link="#1c3f7f" | |
5 | vlink="#204020" background="/img/sinewhite.gif" > | |
6 | ||
7 | <h1>Opus-2</h1> | |
8 | ||
9 | <p>Opus-2 is an abstract artlang composed by Chris Pressey at or around | |
10 | March 10, 2001. | |
11 | ||
12 | <h3>Design Goals</h3> | |
13 | ||
14 | <p>Eliminate word order entirely. Despite the appearance of the | |
15 | resulting language, this was the only real design goal to begin with. | |
16 | ||
17 | <h3>Grammatical Overview</h3> | |
18 | ||
19 | <p>Verbs in Opus-2 take the form of colours. Nouns take the form of | |
20 | sounds. Adjectives take the form of smells. Adverbs take the form of | |
21 | inner-ear sensations. Certain tenses and phrasings are indicated by | |
22 | tastes. | |
23 | ||
24 | <p>To distinguish between the roles of the nouns in a sentence, | |
25 | objects are quieter, <i>sotto voce</i> sounds, and subjects are | |
26 | foreground sounds. It is important to remember that the sensations | |
27 | corresponding to object, subject, and verb all occur at the same time | |
28 | in an event termed an <i>sentence-experience</i>. | |
29 | ||
30 | <p>This dominant-recessive relationship is also present in strong | |
31 | and weak scents which indicate whether an adjective describes the | |
32 | subject or the object, and in intense and gentle inner-ear sensations | |
33 | (feelings of sudden or gradual acceleration) to determine the target | |
34 | of an adverb. | |
35 | ||
36 | <h3>Vocabulary Overview</h3> | |
37 | ||
38 | <p>Sample dictionary: | |
39 | ||
40 | <p><center><table border=0> | |
41 | ||
42 | <tr><td colspan=2 align=center><b>verbs</b></td></tr> | |
43 | <tr><td>flee</td><td><i>pale green</i></td></tr> | |
44 | <tr><td>approach</td><td><i>deep orange</i></td></tr> | |
45 | <tr><td>examine</td><td><i>medium grey</i></td></tr> | |
46 | <tr><td>glorify</td><td><i>deep red</i></td></tr> | |
47 | ||
48 | <tr><td colspan=2 align=center><b>nouns</b></td></tr> | |
49 | <tr><td>man</td><td><i>Eb below middle C, trombone</i></td></tr> | |
50 | <tr><td>woman</td><td><i>F above middle C, french horn</i></td></tr> | |
51 | <tr><td>world</td><td><i>car door slamming</i></td></tr> | |
52 | <tr><td>child</td><td><i>middle C, tubular bells</i></td></tr> | |
53 | <tr><td>building</td><td><i>F, tympani roll</i></td></tr> | |
54 | <tr><td>radio</td><td><i>harp sweep</i></td></tr> | |
55 | ||
56 | <tr><td colspan=2 align=center><b>adjectives</b></td></tr> | |
57 | <tr><td>fast</td><td><i>burning rubber</i></td></tr> | |
58 | <tr><td>dangerous</td><td><i>mothballs</i></td></tr> | |
59 | ||
60 | <tr><td colspan=2 align=center><b>adverbs</b></td></tr> | |
61 | <tr><td>quickly</td><td><i>leaning 40 degrees left</i></td></tr> | |
62 | <tr><td>dangerously</td><td><i>leaning 25 degrees right</i></td></tr> | |
63 | ||
64 | </table></center> | |
65 | ||
66 | <h3>Context Overview</h3> | |
67 | ||
68 | <p>While each sentence is "instantaneous" in the sense that | |
69 | there is no internal word order, sentence-experiences still | |
70 | follow one another, and each sentence-experience does take a certain | |
71 | amount of time to perceive. | |
72 | Tense is thus implied by context between successive sentences | |
73 | and the duration of each sentence. The shorter a sentence is, | |
74 | the further into the future it is presumed to refer to. | |
75 | (Thanks to Rob Norman and Panu Kalliokosi for suggesting these ideas.) | |
76 | ||
77 | <h3>Examples of Usage</h3> | |
78 | ||
79 | <p>Example sentence-experience: "The building glorifies the woman": | |
80 | ||
81 | <p><center><table border=2> | |
82 | <tr><td><i>deep red</i></td></tr> | |
83 | <tr><td><i>F, tympani roll, forte</i></td></tr> | |
84 | <tr><td><i>F, french horn, piano</i></td></tr> | |
85 | </table></center> | |
86 | ||
87 | <p>Example sentence-experience: "The man quickly flees the dangerous child": | |
88 | ||
89 | <p><center><table border=2> | |
90 | <tr><td><i>pale green</i></td></tr> | |
91 | <tr><td><i>Eb, trombone, forte</i></td></tr> | |
92 | <tr><td><i>leaning 40 degrees left (sudden)</i></td></tr> | |
93 | <tr><td><i>C, tubular bells, piano</i></td></tr> | |
94 | <tr><td><i>mothballs (gentle whiff)</i></td></tr> | |
95 | </table></center> | |
96 | ||
97 | <h3>Who Speaks Opus-2?</h3> | |
98 | ||
99 | <p>This language was designed purely as an abstract exercise in language | |
100 | design. Thus it was not designed for any preconceived group of speakers, | |
101 | and little consideration was given to their culture and capabilities. | |
102 | It is neither specifically a conversation language, nor a formalized | |
103 | language (e.g. a programming language.) | |
104 | ||
105 | <p>Most of the problems of finding speakers of Opus-2 is in finding creatures | |
106 | that can create smells and inner-ear sensations as easily as humans can | |
107 | create complex sounds. | |
108 | ||
109 | However, some popular opinions of who or what might speak Opus-2 have | |
110 | been suggested since its unveiling: | |
111 | ||
112 | <ul> | |
113 | <li>An efficient-yet-entertaining form of future communication using direct neural jacks. | |
114 | <li>A code used by e.g. Neo (from <i>The Matrix</i>) to communicate to | |
115 | subjects unknowingly trapped in a virtual reality. | |
116 | <li>A pidgin spoken between highly-telepathic beings and marginally-telepathic beings. | |
117 | </ul> | |
118 | ||
119 | </body></html> |
0 | <html><head> | |
1 | <meta http-equiv="Content-Type" content="text/html;CHARSET=iso-8859-1"> | |
2 | <meta name="Description" content="Cat's Eye Technologies: The Tamerlane Programming Language"> | |
3 | <title>Cat's Eye Technologies: The Tamerlane Programming Language</title> | |
4 | </head> | |
5 | <body bgcolor="#ffffff" text="#000000" link="#1c3f7f" | |
6 | vlink="#204020" background="/img/sinewhite.gif" > | |
7 | <center> | |
8 | <h1>Tamerlane</h1> | |
9 | <p>Chris Pressey | |
10 | <br>Created Jan 29 2000 | |
11 | </center> | |
12 | ||
13 | <h3>Introduction to Tamerlane</h3> | |
14 | ||
15 | <p>Tamerlane is a "constraint flow" language. The point of its creation | |
16 | is to attempt to break as many paradigmal stereotypes and idioms that | |
17 | I'm aware of; at least, to make it tricky and confounding | |
18 | to pigeonhole easily. | |
19 | ||
20 | <p>It has some concepts in it from imperative languages, | |
21 | some from functional languages, some from dataflow and object oriented | |
22 | languages, and some from graph rewriting | |
23 | and other constraint-based languages, and they're all muddled together | |
24 | into a ridiculous <i>potpourri</i>. | |
25 | ||
26 | <p>Despite being such a mutt, there might actually be some algorithms that | |
27 | would be dreadfully easy to write in Tamerlane. | |
28 | ||
29 | <h3>Overview of Tamerlane</h3> | |
30 | ||
31 | <p>A Tamerlane program consists of a mutably weighted directed graph. | |
32 | ||
33 | <p>Each node is considered to be an independent updatable store. | |
34 | The data held in each store is represented by the weights of the | |
35 | arcs exiting the node. | |
36 | ||
37 | <p>An arc of weight zero is functionally equivalent to the absence | |
38 | of an arc. | |
39 | ||
40 | <h3>Description and Example</h3> | |
41 | ||
42 | <p>At this point we may introduce the syntax in a sample ASCII | |
43 | notation for a simple, almost pathological Tamerlane program: | |
44 | ||
45 | <pre> | |
46 | Point-A: 1 Point-B, | |
47 | Point-B: 1 Point-C, | |
48 | Point-C: 1 Point-A. | |
49 | </pre> | |
50 | ||
51 | <p>The user of a Tamerlane program may submit <i>messages</i> | |
52 | to the program at runtime. In this sense the user and the program are | |
53 | both objects which share the symmetrical relationship | |
54 | <b>user-of/used-by</b>. The program object's interface exposes a | |
55 | <tt>query</tt> method to the user, which is to be considered | |
56 | runtime-polymorphic. | |
57 | ||
58 | <p>Using this message-passing mechanism, queries are submitted by the user to a | |
59 | running Tamerlane program, much as queries would be submitted | |
60 | to a running Prolog program. | |
61 | ||
62 | <p>Queries have their own syntax and semantics. | |
63 | Unlike Prolog, the user's queries are interpreted as <i>rules</i>, | |
64 | perhaps accompanied by information about | |
65 | where and when the rules are "introduced" into the graph. | |
66 | ||
67 | <p>As an example of a query that could be submitted to the above | |
68 | Tamerlane program: | |
69 | ||
70 | <pre> | |
71 | 1 Point-A -> 0 Point-A @ Point-A | |
72 | </pre> | |
73 | ||
74 | This would introduce the rewriting rule | |
75 | ||
76 | <ul>"Replace occurances of <tt>1 Point-A</tt> with <tt>0 Point-A</tt>" | |
77 | </ul> | |
78 | ||
79 | <p>This rule is applied to the weights of the nodes in the graph | |
80 | starting with the node specified after the <tt>@</tt> symbol. In this | |
81 | instance it would start by trying to apply the rewrite to | |
82 | the node <tt>Point-A</tt>, but finding <tt>Point-A</tt> to contain <tt>1 Point-B</tt>, | |
83 | nothing would change. | |
84 | ||
85 | <p>Each time a further query is submitted, each rule which has | |
86 | been introduced into the graph, disappears from the node it | |
87 | was working on, and is transmitted to the adjacent node with | |
88 | the lowest positive weight value. If there is a tie for | |
89 | lowest weight, the rule is transmitted to all of the | |
90 | adjacent nodes with the same lowest weight. | |
91 | ||
92 | <p>For efficacy we can consider the user able to submit a <tt>nop</tt> | |
93 | query. This would not introduce any new rules into the graph, | |
94 | but it would cause all active rules to propogate to new nodes | |
95 | on the graph. | |
96 | ||
97 | <p>So assume the user submits <tt>nop</tt>. The rule that was | |
98 | introduced by the last query 'moves' from the | |
99 | node labelled <tt>Point-A</tt> to the node labelled <tt>Point-B</tt> (since it's | |
100 | the only positive route out of <tt>Point-A</tt>.) It tries to rewrite | |
101 | <tt>Point-B</tt>, but finding only <tt>1 Point-C</tt> in <tt>Point-B</tt>, nothing happens. | |
102 | ||
103 | <p>Assume the user <tt>nop</tt>s again. The rule is propogated to <tt>Point-C</tt>. | |
104 | Finally the pattern match succeeds, and <tt>Point-C</tt> is rewritten to | |
105 | a new <tt>Point-C</tt>: | |
106 | ||
107 | <pre> | |
108 | Point-C: 0 Point-A. | |
109 | </pre> | |
110 | ||
111 | <p>After one more <tt>nop</tt>, the engine will generate a | |
112 | ||
113 | <pre> | |
114 | Rule stopped at Point-C (no adjacent nodes) | |
115 | </pre> | |
116 | ||
117 | <p>message back to the user. This uses the operation that the user object's | |
118 | interface supplies to the running program, called <tt>messageback</tt>, | |
119 | which the user must supply, but is, like <tt>query</tt>, considered | |
120 | runtime-polymorphic. | |
121 | ||
122 | <h2>Advanced Topics</h2> | |
123 | ||
124 | <h3>Rule Priority</h3> | |
125 | ||
126 | <p>Obviously, the user can enter more than one query in | |
127 | succession; s/he doesn't need to explicity <tt>nop</tt> to | |
128 | wait for rules to resolve. Since the graph can contain cycles, this would lead | |
129 | to a form of inherent, synchronous concurrency: each time the <tt>query</tt> or <tt>nop</tt> | |
130 | methods are invoked, more than one rule may be applied to the same | |
131 | node. | |
132 | ||
133 | <p>The order in which these competing rules is applied is based | |
134 | on the rule's <i>priority</i>. Rules can be submitted with a specific | |
135 | priority in the following manner: | |
136 | ||
137 | <pre> | |
138 | 1 Point-A -> 0 Point-A @ Point-A ! 10 | |
139 | </pre> | |
140 | ||
141 | <p>If there is a tie in priority when two rules are competing to | |
142 | rewrite the same node, the outcome is guaranteed to be | |
143 | not simply undefined, but rather, non-deterministic, or at least | |
144 | probabilistic. | |
145 | ||
146 | <p>The user can also specify a delay, measured in number of method calls | |
147 | (<tt>query</tt>s or <tt>nop</tt>s) | |
148 | from the present query, at which the rule will be introduced into | |
149 | the graph, like so: | |
150 | ||
151 | <pre> | |
152 | 1 Point-A -> 0 Point-A @ Point-A in 10 | |
153 | </pre> | |
154 | ||
155 | <p>And, for the sake of efficacy, the <tt>nop</tt> method on the | |
156 | program object has overloaded syntaxes whose semantics are | |
157 | 'keep <tt>nop</tt>ing until some or all rules have stopped.' | |
158 | ||
159 | <h3>Negative Weights</h3> | |
160 | ||
161 | <p>A negative weight from one node to another is interpreted in a | |
162 | rather negative fashion with respect to the running program. | |
163 | ||
164 | <p>A negative weight is selected for propogation when it is closer | |
165 | to zero (while still being non-zero) than any other weight (absolute | |
166 | value.) When a negative weight is selected, however, the | |
167 | semantics of propogation are different. | |
168 | ||
169 | <p>When a rule encounters an arc of the form: | |
170 | ||
171 | <pre> | |
172 | X: -1 Y. | |
173 | </pre> | |
174 | ||
175 | <p>The rule is not "just" copied to Y like "usual". Instead, it | |
176 | "rolls back" the rule to Y in the fashion of a functional | |
177 | continuation. | |
178 | ||
179 | <p>All of the nodes that have been "touched" by this rule, since it | |
180 | was last at node Y, are reset to their condition when the rule | |
181 | was last at node Y. The rule itself is deleted from X and | |
182 | propogated to Y, and rewriting continues from there. | |
183 | ||
184 | <p>If the rule has never visited Y, however, then such an exit arc is | |
185 | not considered a valid candidate. It has an effective weight of | |
186 | 0 (non-existent.) | |
187 | ||
188 | <h3>Lambda Graphs</h3> | |
189 | ||
190 | <p>Lambda abstraction is a powerful form of referring to non-numeric | |
191 | calculatory items such as functions and, in the case of Tamerlane, graphs. | |
192 | ||
193 | <p>A user may submit a rule in the form | |
194 | ||
195 | <pre> | |
196 | 1 X -> 1 Y ? Y: 1 Z, Z: 1 Y | |
197 | </pre> | |
198 | ||
199 | <p>Y is like a 'local variable' in this respect. When the rule is | |
200 | applied to the node, and the pattern match succeeds, the nodes | |
201 | named Y and Z need not actually exist, as they are simply | |
202 | created dynamically and 'attached' to the program graph. | |
203 | ||
204 | <p>Lambda graphs are subject to garbage collection, since they can | |
205 | only be used by the rule that created them. | |
206 | ||
207 | <h3>Pigeonholes (Updatable)</h3> | |
208 | ||
209 | <p>Variables may be supplied which are explicitly updatable, | |
210 | and these are termed pigeonholes. Pigeonholes acknowledge events. | |
211 | Their assignment is associated with both nodes and arcs. They | |
212 | can occur when: | |
213 | ||
214 | <ul> | |
215 | <li>a rule enters a node | |
216 | <li>a rule leaves a node | |
217 | <li>a rule chooses an arc | |
218 | <li>a rule passes through an arc | |
219 | <li>a rule rewrites an arc | |
220 | </ul> | |
221 | ||
222 | <p>Updatable variables are always | |
223 | named after the node (or node.arc) they're associated with, preceded by | |
224 | a <tt>$</tt> symbol. The data in the variable is, of course, the arc weight | |
225 | (or in the case of nodes, the node's internal key.) | |
226 | ||
227 | <p>If the pigeonhole is assigned the special value <tt>%Cancel</tt>, | |
228 | the rule is cancelled from moving to the node/through the arc/rewriting the | |
229 | arc. | |
230 | ||
231 | <h3>Placeholders (Unification)</h3> | |
232 | ||
233 | <p>Variables may also be supplied as "placeholders". These are the | |
234 | same as bindable (unifiable) variables in logical languages like | |
235 | Prolog. An example of such might be: | |
236 | ||
237 | <pre> | |
238 | 1 X 7 ^Y -> 7 X 1 ^Y | |
239 | </pre> | |
240 | ||
241 | <p>Which would replace "1 X 7 Foo" with "7 X 1 Foo", "1 X 7 Bar" with | |
242 | "7 X 1 Bar", etc. | |
243 | ||
244 | <h3>Horn Rules</h3> | |
245 | ||
246 | <p>Horn rules only succeed if all of their rules succeed. A Horn rule | |
247 | is specified like: | |
248 | ||
249 | <pre> | |
250 | 1 X -> 1 Y + 1 Z 0 G -> 1 G | |
251 | </pre> | |
252 | ||
253 | <p>(Remember that 0 G is equivalent to "the absence of an arc to G." If you | |
254 | have a pattern like | |
255 | ||
256 | <pre> | |
257 | ^a G ^b G 0 G -> ^b Q | |
258 | </pre> | |
259 | ||
260 | <p>It will only match when there are exactly two different exit arcs to | |
261 | the same node G, which is not disallowed (nor is a node with an exit | |
262 | arc which points to itself.)) | |
263 | ||
264 | <p>Note also that arcs are unordered amongst themselves. The rule | |
265 | ||
266 | <pre> | |
267 | 1 A 1 B 1 C -> 1 E 3 D | |
268 | </pre> | |
269 | ||
270 | <p>is the same as | |
271 | ||
272 | <pre> | |
273 | 1 B 1 A 1 C -> 3 D 1 E | |
274 | </pre> | |
275 | ||
276 | <hr> | |
277 | <center><font size=-1>Last Updated Jan 27 <img src="/img/copyright.gif" | |
278 | align=absmiddle width=12 height=12 alt="(c)" border=0>2001 <a href="/">Cat's Eye Technologies</a>.</font></center> | |
279 | </body></html> |
0 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | |
1 | <!-- encoding: UTF-8 --> | |
2 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en"> | |
3 | <head> | |
4 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> | |
5 | <title>You are Reading the Name of this Esolang</title> | |
6 | <!-- begin html doc dynamic markup --> | |
7 | <script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script> | |
8 | <script type="text/javascript" src="/scripts/documentation.js"></script> | |
9 | <!-- end html doc dynamic markup --> | |
10 | </head> | |
11 | <body> | |
12 | ||
13 | <h1>You are Reading the Name of this Esolang</h1> | |
14 | ||
15 | <p>November 2007, Chris Pressey, Cat's Eye Technologies</p> | |
16 | ||
17 | <h2>Introduction</h2> | |
18 | ||
19 | <p>This programming language, called <b>You are Reading the Name of this Esolang</b>, | |
20 | is my first foray into the design space of programming languages whose programs contain | |
21 | undecidable elements. In the case of You are Reading the Name of this Esolang, | |
22 | these elements are the instructions themselves — or rather, the symbols that | |
23 | the instructions are composed of.</p> | |
24 | ||
25 | <p>Before we begin, some lexical notes. The name of this language is not | |
26 | pronounced exactly how it looks; rather, it is pronounced as an English speaker | |
27 | would pronounce the phrase | |
28 | "you are hearing the name of this esolang." In addition, it is strongly discouraged | |
29 | to refer to this language by the name "Yartnote", whether spoken or in writing, and | |
30 | in any capitalization scheme. After all, there may actually be a completely | |
31 | unrelated esolang called Yartnote one day, Zeus willing. | |
32 | A similar logic applies to the taboo on calling it "YRNE".</p> | |
33 | ||
34 | <h2>Program structure</h2> | |
35 | ||
36 | <p>A You are Reading the Name of this Esolang program is a string of | |
37 | symbols drawn from the alphabet <code>0</code>, <code>1</code>, <code>[</code>, and | |
38 | <code>]</code>.</p> | |
39 | ||
40 | <p><code>0</code> and <code>1</code> are interpreted as they are in the programming | |
41 | language Spoon, or rather, the slightly more clearly-specified version of Spoon that | |
42 | follows. The Spoon tape is considered unbounded in both directions. Each cell of | |
43 | the Spoon tape may contain any non-negative integer (again, unbounded.) In addition, | |
44 | attempting to decrement a cell below 0 results in an immediate termination of the | |
45 | program. Oh, and there's no way to change what the symbols are. | |
46 | But that's the extent of the difference, I think.</p> | |
47 | ||
48 | <p>For your convenience, the Spoon instructions are repeated here | |
49 | (taken from the public-domain <a href="http://esolangs.org/wiki/Spoon">Spoon</a> | |
50 | entry on the <a href="http://esolangs.org/wiki/">Esolang wiki</a>):</p> | |
51 | ||
52 | <pre>1 Increment the memory cell under the pointer | |
53 | 000 Decrement the memory cell under the pointer | |
54 | 010 Move the pointer to the right | |
55 | 011 Move the pointer to the left | |
56 | 0011 Jump back to the matching 00100 | |
57 | 00100 Jump past the matching 0011 if the cell under the pointer is zero | |
58 | 001010 Output the character signified by the cell at the pointer | |
59 | 0010110 Input a character and store it at the cell in the pointer | |
60 | 00101110 Output the entire memory array | |
61 | 00101111 Immediately terminate program execution</pre> | |
62 | ||
63 | <p>Each <code>[</code> must be matched with a <code>]</code>; between them lies a | |
64 | subprogram with the same structure as a general You are Reading the Name of this Esolang | |
65 | program. The meaning of this subprogram is determined from its structure as follows. | |
66 | The subprogram is considered to be given the same input as the entire program. If the | |
67 | subprogram halts on this input, it is reduced to a <code>1</code>, and if it loops | |
68 | forever on this input, it is reduced to a <code>0</code>. These reduced instructions | |
69 | are interpreted as they are in Spoon, as described above. Any output produced by the | |
70 | subprogram is simply discarded.</p> | |
71 | ||
72 | <p>Subprograms may themselves contain subprograms, nested to arbitrary depth, in which case the | |
73 | reduction above is recursively applied, from the inside out, until a string of only | |
74 | <code>0</code> and <code>1</code> symbols remains. This is then executed as if it | |
75 | were a Spoon program. Any syntactically ill-formed program or subprogram is | |
76 | considered to halt immediately, producing no output. Note however that, as a consequence | |
77 | of this, a subprogram can be syntactically ill-formed (for example consisting of a | |
78 | single <code>0</code>) while the parent program can still be syntactically OK | |
79 | (the subprogram just reduces to a <code>1</code> in the parent program.)</p> | |
80 | ||
81 | <h2>Implementation notes</h2> | |
82 | ||
83 | <p>An implementation will determine if a particular subprogram halts, or not, if it can. | |
84 | Implementations may vary in the power of their proof methods used for this, but at a | |
85 | minimum must be able to recognize at least one subprogram that halts on any input, and | |
86 | one subprogram that loops forever on any input. This implementation-dependence should | |
87 | not strike anyone as too bizarre, I don't think — it is quite similar to how | |
88 | different implementations of a traditional systems-programming language can, for example, | |
89 | provide different levels of support for sizes of numerical data types like integers.</p> | |
90 | ||
91 | <p>Recall that the problem of telling if an arbitrary program in some given Turing-complete language | |
92 | halts on some input is <em>undecidable</em>, or equivalently, that set of all programs | |
93 | that halt on some input is <em>recursively enumerable</em>. The set of all programs that loop | |
94 | forever on some given input is the complement of this set, and it is called <em>co-recursively | |
95 | enumerable</em> (<em>co-r.e.</em> for short.)</p> | |
96 | ||
97 | <p>Despite this, there are many methods, ranging from simplistic to sophisticated, | |
98 | that can be used to prove in <em>specific</em> circumstances that a given program, | |
99 | on a given input, will either halt or fail to halt. These methods can be used in | |
100 | a You are Reading the Name of this Esolang implementation.</p> | |
101 | ||
102 | <p>The simplest method for proving that a subprogram halts is probably just to | |
103 | simulate it on the given input indefinately, returning <code>1</code> if it halts. | |
104 | If the subprogram does indeed halt, this technique will (eventually) reveal | |
105 | that fact. The simulation can be peformed concurrently with other subprograms, | |
106 | so that if no proof of halting for one subprogram is ever found, this will not | |
107 | prevent other subprograms from being checked.</p> | |
108 | ||
109 | <p>The simplest method for proving that a subprogram loops forever is probably to | |
110 | check it against a library of subprograms known to loop forever. For example, it | |
111 | can check if the program is <code>0010000000111001000011</code> | |
112 | (in Brainfuck: <code>[-]+[]</code>. | |
113 | You can readily assure yourself that this program loops forever, on any input.) | |
114 | This technique of course | |
115 | limits the number of recognizably looping subprograms that the implementation can | |
116 | handle to a finite number. Checking the general case, that is, recognizing an infinite number | |
117 | of forever-looping programs, is more difficult, however. Such an implementation | |
118 | will require techniques such as automatically finding a proof by induction, | |
119 | or abstract interpretation. However, ultimately, we know from the Halting Problem that there | |
120 | is no perfectly general technique that will recognize <em>every</em> program that | |
121 | loops forever.</p> | |
122 | ||
123 | <h2>Computability class</h2> | |
124 | ||
125 | <p>You are Reading the Name of this Esolang can be trivially shown to be as powerful | |
126 | as Spoon, since valid Spoon programs are valid You are Reading the Name of this Esolang | |
127 | programs. (Modulo the absence of negative-valued tape cells and the other little variations | |
128 | mentioned above. Since these issues have been dealt with extensively in the | |
129 | Brainfuck "literature", such as it is, I'm not going to worry about them.)</p> | |
130 | ||
131 | <p>What about the other way around?</p> | |
132 | ||
133 | <p>Well, take as a starting point the fact (in classical logic, at least) | |
134 | that every Spoon program either halts at some point or loops forever. (Whether we can | |
135 | <em>discover</em> which of these is the case is a different story.) | |
136 | This means that every You are Reading the Name of this Esolang program has a | |
137 | "canonical" form consisting of just <code>1</code>'s and <code>0</code>'s — | |
138 | again, whether we have an interpreter powerful enough to discover it or not. | |
139 | At this level, Spoon is as powerful as "canonical" You are Reading the Name of this Esolang, | |
140 | because "canonical" You are Reading the Name of this Esolang programs are valid | |
141 | Spoon programs.</p> | |
142 | ||
143 | <p>Now we can go in the other direction. We can start with any "canonical" | |
144 | You are Reading the Name of this Esolang program, and replace each <code>1</code> | |
145 | with any You are Reading the Name of this Esolang subprogram that always halts. | |
146 | Since even the simple method, described above, of proving that a subprogram halts | |
147 | will always resolve to <code>1</code> if the subprogram does indeed halt, this subset | |
148 | of You are Reading the Name of this Esolang programs is still executable | |
149 | in Spoon (or any other Turing-complete language). We only need to add the halting | |
150 | proof mechanism to rewrite the program into "canonical" form, before executing | |
151 | or simulating it.</p> | |
152 | ||
153 | <p>This extends recursively to any arbitrary level of nesting, too: we can | |
154 | replace each <code>1</code> in each subprogram with subprograms that always | |
155 | halt, with no bound. We only have to test these subprograms recursively, | |
156 | from the inside out, to eventually recover a "canonical" program.</p> | |
157 | ||
158 | <p>However, something strange happens when we turn our attention to <code>0</code>'s. | |
159 | If we replace even one <code>0</code> with a subprogram that loops forever on | |
160 | some input, there is always a possibility that: a) the You are Reading the Name of this Esolang | |
161 | program will be run with that input, and that b) the interpreter cannot prove that | |
162 | the subprogram loops forever with that input. Because the set of programs that loop forever | |
163 | is co-r.e., there is no Turing machine (or Spoon program) that can look at any | |
164 | one Spoon program and say, yep, I'm certain that this here program loops forever on this | |
165 | input, so it should darn well be rewritten into a <code>0</code> symbol.</p> | |
166 | ||
167 | <p>Thus it seems that there are You are Reading the Name of this Esolang programs | |
168 | which no Spoon interpreter — indeed, not any interpreter for any Turing-complete | |
169 | language — is able to interpret.</p> | |
170 | ||
171 | <p>Since the case of <code>0</code>'s seems to mirror the case of <code>1</code>'s | |
172 | when it comes to expanding them into subprograms, we may conjecture that, instead | |
173 | of remaining basically the same as we consider more and more deeply nested subprograms | |
174 | (as it was with expanding <code>1</code>'s,) maybe the problem | |
175 | becomes more intractable the deeper we go in expanding <code>0</code>'s. | |
176 | Perhaps we are climbing up the arithmetic hierarchy?</p> | |
177 | ||
178 | <p>At any rate, <em>I</em> certainly initially conjectured that that was the case, | |
179 | but it appears to be off the mark. Say you have a Spoon interpreter | |
180 | that's equipped with an oracle. You can feed an input string and a Spoon program into | |
181 | the oracle, and the oracle tells you whether or not that program halts on that input. | |
182 | You could then use that oracle to resolve a given You are Reading the Name of this Esolang | |
183 | subprogram into a <code>0</code> or <code>1</code>. But, you could also do this | |
184 | recursively, resolving them from the inside outward. A Spoon interpreter with such | |
185 | an oracle would be able to simulate any You are Reading the Name of this Esolang program | |
186 | — no more powerful oracle is needed, no matter how deep the subprograms are | |
187 | nested.</p> | |
188 | ||
189 | <h2>Discussion</h2> | |
190 | ||
191 | <p>I started designing You are Reading the Name of this Esolang shortly after | |
192 | reading about the programming language Gravity, while trying to determine the | |
193 | sense in which it is "non-computable." In particular, I noticed that these two statements | |
194 | (taken from the <a href="http://esolangs.org/wiki/Gravity">Gravity</a> | |
195 | entry on the <a href="http://esolangs.org/wiki/">Esolang wiki</a>,) | |
196 | on which the claim of Gravity's "non-computability" apparently rests, | |
197 | have ready analogies to problems the world of Turing machines:</p> | |
198 | ||
199 | <ul> | |
200 | <li>"Although [Gravity's] behavior is well-defined and deterministic, the evolution of its | |
201 | space is in general non-computable [...]" | |
202 | <p>The evolution of the state-space (set of successive configurations) of a universal | |
203 | Turing machine is also in general non-computable (there's no Turing machine that can tell | |
204 | you that some given configuration will never be reached.) | |
205 | </p></li> | |
206 | <li>"It can be shown that a Turing machine cannot compute, in the general case, whether | |
207 | even a single collision [in a given Gravity program] will ever happen." | |
208 | <p>It can also be shown that a Turing machine cannot compute, in the general case, | |
209 | whether or not even a single given state of another Turing machine's finite control will | |
210 | ever be reached. (Just make that state a halt state, and you have the Halting Problem | |
211 | right there.)</p></li> | |
212 | </ul> | |
213 | ||
214 | <p>Because of this, I am skeptical that Gravity is any more "non-computable" than | |
215 | a universal Turing machine. (I am, however, far from an expert on the computability | |
216 | of differential equations; it could be that the rather nonspecific term "non-computable", | |
217 | as used in that subfield, means something stronger than simply "undecidable".)</p> | |
218 | ||
219 | <p>At any rate, the idea interested me enough to spur me into designing a language | |
220 | that I <em>could</em> be reasonably certain was non-computable, in | |
221 | some sense that I could explain. The name You are Reading the Name of this Esolang | |
222 | was drifting around nearby | |
223 | in the æther at that moment, and seemed fitting enough for this monstrosity.</p> | |
224 | ||
225 | <p>The general approach was to simply force the | |
226 | language interpreter to decide — that is, to reduce to either a <code>0</code> or a | |
227 | <code>1</code> — some problem that is undecidable. This led to looking for | |
228 | something that needed specifically either a <code>0</code> or a <code>1</code> | |
229 | to specify something necessary, and that in turn | |
230 | led to the choice of Spoon as a base language. (Of course, I could have picked | |
231 | just about any language which is "its own binary Gödel numbering"; there are | |
232 | plenty to choose from there, but Spoon had a cool name. What can I say — I | |
233 | like The Tick.)</p> | |
234 | ||
235 | <p>The obvious choice of undecidable | |
236 | problem was whether another program halts or not. Making the subject of | |
237 | this problem a <em>subprogram</em> with | |
238 | the same structure as the general program let me examine the case of unbounded | |
239 | recursive descent. This turned out to be not quite as interesting as I hoped, but perhaps | |
240 | still somewhat illuminating. (Just what <em>would</em> it take, to require that | |
241 | a Spoon interpreter have a more powerful oracle than HP, to run every You are Reading | |
242 | the Name of this Esolang program? Perhaps | |
243 | <a href="http://esolangs.org/wiki/Banana_Scheme">Banana Scheme</a> could provide | |
244 | some inspiration, here. <em>It</em> certainly seems to be climbing the arithmetic | |
245 | hierarchy, although I can't quite say how far. Possibly "damned far.")</p> | |
246 | ||
247 | <p>I suppose one or two other things can be said about | |
248 | You are Reading the Name of this Esolang.</p> | |
249 | ||
250 | <p>Unlike both Gravity and Banana Scheme, | |
251 | You are Reading the Name of this Esolang has a recursively enumerable <em>syntax</em>: the problem of | |
252 | whether or not a given string over the alphabet <code>0</code>, <code>1</code>, | |
253 | <code>[</code>, and <code>]</code> is even a well-formed You are Reading the Name of | |
254 | this Esolang program is undecidable!</p> | |
255 | ||
256 | <p>It's not entirely clear how to interpret the instruction | |
257 | <code>00101110</code>, "Output the entire memory array," in the context of | |
258 | having a tape unbounded in both directions. I suppose I ought to stipulate that | |
259 | we are to just output the portion of the tape that has been "touched", i.e. that | |
260 | the tape head has moved over. But really, it's not so important for the | |
261 | goals of You are Reading the Name of this Esolang, so maybe I should just leave | |
262 | it undefined, for kicks.</p> | |
263 | ||
264 | <p>The fact that every subprogram takes the <em>same</em> input (same as the | |
265 | main program) might lead to some interesting programs — programs which | |
266 | are unduly sensitive to changes in the input, I imagine. Of course, | |
267 | this doesn't affect the undecidability of subprograms, since they are always | |
268 | free to ignore input completely.</p> | |
269 | ||
270 | <p>There is no implementation, yet, but constructing an efficient one would be | |
271 | a good exercise in static program analysis.</p> | |
272 | ||
273 | <p>Happy undeciding!</p> | |
274 | ||
275 | <p>-Chris Pressey | |
276 | <br/>November 5, 2007 | |
277 | <br/>Chicago, Illinois, USA</p> | |
278 | ||
279 | </body> | |
280 | </html> |
0 | -> encoding: UTF-8 | |
1 | ||
2 | The Xoomonk Programming Language | |
3 | ================================ | |
4 | ||
5 | Language version 0.1 Pretty Much Complete But Maybe Not Totally Finalized | |
6 | ||
7 | Abstract | |
8 | -------- | |
9 | ||
10 | _Xoomonk_ is a programming language in which _malingering updatable stores_ | |
11 | are first-class objects. Malingering updatable stores unify several language | |
12 | constructs, including procedure activations, named parameters, and object-like | |
13 | data structures. | |
14 | ||
15 | Description | |
16 | ----------- | |
17 | ||
18 | Overall, Xoomonk looks like a "typical" imperative language. The result | |
19 | of evaluating an expression can be assigned to a variable, and the contents | |
20 | of a variable can be used in a further expression. | |
21 | ||
22 | | a := 1 | |
23 | | b := a | |
24 | | print b | |
25 | = 1 | |
26 | ||
27 | However, blocks of these imperative statements can appear as terms in | |
28 | expressions; such blocks evaluate to entire updatable stores. | |
29 | ||
30 | | a := { | |
31 | | c := 5 | |
32 | | d := c | |
33 | | } | |
34 | | print a | |
35 | = [c=5,d=5] | |
36 | ||
37 | Once created, a store can be updated and accessed. | |
38 | ||
39 | | a := { | |
40 | | c := 5 | |
41 | | d := c | |
42 | | } | |
43 | | print a | |
44 | | a.d := 7 | |
45 | | print a | |
46 | | print a.c | |
47 | = [c=5,d=5] | |
48 | = [c=5,d=7] | |
49 | = 5 | |
50 | ||
51 | A store can also be assigned to a variable after creation. Stores are | |
52 | accessed by reference, so this creates two aliases to the same store. | |
53 | ||
54 | | a := { | |
55 | | c := 5 | |
56 | | d := c | |
57 | | } | |
58 | | b := a | |
59 | | b.c := 17 | |
60 | | print a | |
61 | | print b | |
62 | = [c=17,d=5] | |
63 | = [c=17,d=5] | |
64 | ||
65 | To create an independent copy of the store, the postfix `*` operator is | |
66 | used. | |
67 | ||
68 | | a := { | |
69 | | c := 5 | |
70 | | d := c | |
71 | | } | |
72 | | b := a* | |
73 | | b.c := 17 | |
74 | | print a | |
75 | | print b | |
76 | = [c=5,d=5] | |
77 | = [c=17,d=5] | |
78 | ||
79 | Empty blocks are permissible. | |
80 | ||
81 | | a := {} | |
82 | | print a | |
83 | = [] | |
84 | ||
85 | Once a store has been created, only those variables defined in the store | |
86 | can be updated and accessed -- new variables cannot be added. | |
87 | ||
88 | | a := { b := 6 } | |
89 | | print a.c | |
90 | ? Attempt to access an undefined variable | |
91 | ||
92 | | a := { b := 6 } | |
93 | | a.c := 12 | |
94 | ? Attempt to assign an undefined variable | |
95 | ||
96 | Stores and integers are the only two data types in Xoomonk. However, there | |
97 | are some special forms of the print statement, demonstrated here, which | |
98 | allow for textual output. | |
99 | ||
100 | | a := 65 | |
101 | | print char a | |
102 | | print string "Hello, world!" | |
103 | | print string "The value of a is "; | |
104 | | print a; | |
105 | | print "!" | |
106 | = A | |
107 | = Hello, world! | |
108 | = The value of a is 65! | |
109 | ||
110 | Xoomonk enforces strict block scope. Variables can be shadowed in an | |
111 | inner block. | |
112 | ||
113 | | a := 14 | |
114 | | b := { | |
115 | | a := 12 | |
116 | | print a | |
117 | | } | |
118 | | print a | |
119 | = 12 | |
120 | = 14 | |
121 | ||
122 | The special identifier `^` refers to the lexically enclosing store, that is, | |
123 | the store which was in effect when the this store was defined. | |
124 | ||
125 | | a := 14 | |
126 | | b := { | |
127 | | a := 12 | |
128 | | print ^.a | |
129 | | } | |
130 | | print b.a | |
131 | = 14 | |
132 | = 12 | |
133 | ||
134 | As long as a variable is assigned somewhere in a block, it can be accessed | |
135 | before it is first assigned. In this case, it will hold the integer value 0. | |
136 | It need not stay an integer, for typing is dynamic. | |
137 | ||
138 | | b := { | |
139 | | print a | |
140 | | a := 12 | |
141 | | print a | |
142 | | a := { c := 13 } | |
143 | | print a | |
144 | | } | |
145 | | print b | |
146 | = 0 | |
147 | = 12 | |
148 | = [c=13] | |
149 | = [a=[c=13]] | |
150 | ||
151 | We now present today's main feature. | |
152 | ||
153 | It's important to understand that a block need not define all the | |
154 | variables used in it. Such blocks do not immediately evaluate to a store. | |
155 | Instead, they evaluate to an object called an _unsaturated store_. | |
156 | ||
157 | Or, to put it another way: | |
158 | ||
159 | If, in a block, you refer to a variable which has not yet been given | |
160 | a value in that updatable store, the computations within the block are not | |
161 | performed until that variable is given a value. Such a store is called an | |
162 | _unsaturated store_. | |
163 | ||
164 | | a := { | |
165 | | d := c | |
166 | | } | |
167 | | print a | |
168 | = [c=?,d=*] | |
169 | ||
170 | An unsaturated store behaves similarly to a saturated store in certain | |
171 | respects. In particular, unsaturated stores can be updated. If doing | |
172 | so means that all of the undefined variables in the store are now defined, | |
173 | the block associated with that store is evaluated, and the store becomes | |
174 | saturated. In this sense, an unsaturated store is like a promise, and | |
175 | this bears some resemblance to lazy evaluation (thus the term _malingering_). | |
176 | ||
177 | | a := { | |
178 | | print string "executing block" | |
179 | | d := c | |
180 | | } | |
181 | | print a | |
182 | | a.c := 7 | |
183 | | print a | |
184 | = [c=?,d=*] | |
185 | = executing block | |
186 | = [c=7,d=7] | |
187 | ||
188 | Once a store has become saturated, the block associated with it is not | |
189 | executed again. | |
190 | ||
191 | | a := { | |
192 | | d := c | |
193 | | } | |
194 | | a.c := 7 | |
195 | | print a | |
196 | | a.c := 4 | |
197 | | print a | |
198 | = [c=7,d=7] | |
199 | = [c=4,d=7] | |
200 | ||
201 | The main program is a block. If it is unsaturated, no execution occurs | |
202 | at all. | |
203 | ||
204 | | print "hello" | |
205 | | a := 14 | |
206 | | b := c | |
207 | = | |
208 | ||
209 | Variables cannot generally be accessed from an unsaturated store. | |
210 | ||
211 | | a := { | |
212 | | d := c | |
213 | | } | |
214 | | x := a.c | |
215 | ? Attempt to access an unassigned variable | |
216 | ||
217 | | a := { | |
218 | | d := c | |
219 | | } | |
220 | | x := a.d | |
221 | ? Attempt to access an unresolved variable | |
222 | ||
223 | This is true, even if the variable is assigned a constant expression | |
224 | inside the block. | |
225 | ||
226 | | a := { | |
227 | | b := 7 | |
228 | | d := c | |
229 | | } | |
230 | | x := a.b | |
231 | ? Attempt to access an unresolved variable | |
232 | ||
233 | If, however, the unsaturated store contains some variables that have | |
234 | been updated since the store was created, those variable may be accessed. | |
235 | ||
236 | | a := { | |
237 | | print "executing block" | |
238 | | p := q | |
239 | | d := c | |
240 | | } | |
241 | | a.q = 7 | |
242 | | print a.q | |
243 | = 7 | |
244 | ||
245 | Nor is it possible to assign a variable in an unsaturated store which is | |
246 | defined somewhere in the block. | |
247 | ||
248 | | a := { | |
249 | | b := 7 | |
250 | | d := c | |
251 | | } | |
252 | | a.b := 4 | |
253 | ? Attempt to assign an unresolved variable | |
254 | ||
255 | A variable is considered unresolved even if it is assigned within the block, | |
256 | if that assignment takes place during or after its first use in an expresion. | |
257 | ||
258 | | a := { | |
259 | | b := b | |
260 | | } | |
261 | | a.b := 5 | |
262 | | print a | |
263 | = [b=5] | |
264 | ||
265 | | a := { | |
266 | | print string "executing block" | |
267 | | l := b | |
268 | | b := 3 | |
269 | | l := 3 | |
270 | | } | |
271 | | print string "saturating store" | |
272 | | a.b := 5 | |
273 | | print a | |
274 | = saturating store | |
275 | = executing block | |
276 | = [b=3,l=3] | |
277 | ||
278 | We now describe how this language is (we reasonably assume) Turing-complete. | |
279 | ||
280 | Operations are accomplished with certain built-in unsaturated stores. For | |
281 | example, there is a store called `add`, which can be used for addition. These | |
282 | built-in stores are globally available; they do not exist in any particular | |
283 | store themselves. One uses the `$` prefix operator to access this global | |
284 | namespace. | |
285 | ||
286 | | print $add | |
287 | | $add.x := 3 | |
288 | | $add.y := 5 | |
289 | | print $add.result | |
290 | | print $add | |
291 | = [x=?,y=?,result=*] | |
292 | = 8 | |
293 | = [x=3,y=5,result=8] | |
294 | ||
295 | Because using a built-in operation store in this way saturates it, it cannot | |
296 | be used again. Typically you want to make a copy of the store first, and use | |
297 | that, leaving the built-in store unmodified. | |
298 | ||
299 | | o1 := $add* | |
300 | | o1.x := 4 | |
301 | | o1.y := 7 | |
302 | | o2 := $add* | |
303 | | o2.x := o1.result | |
304 | | o2.y := 9 | |
305 | | print o2.result | |
306 | = 20 | |
307 | ||
308 | Since Xoomonk is not a strictly minimalist language, there is a selection | |
309 | of built-in stores which provide useful operations: `$add`, `$sub`, `$mul`, | |
310 | `$div`, `$gt`, and `$not`. | |
311 | ||
312 | Decision-making is also accomplished with a built-in store, `if`. This store | |
313 | contains variables caled `cond`, `then`, and `else`. `cond` should | |
314 | be an integer, and `then` and `else` should be unsaturated stores where `x` is | |
315 | unassigned. When the first three are assigned values, if `cond` is nonzero, | |
316 | it is assigned to `x` in the `then` store; otherwise, if it is zero, it is | |
317 | assigned to `x` in the `else` store. | |
318 | ||
319 | | o1 := $if* | |
320 | | o1.then := { | |
321 | | y := x | |
322 | | print "condition is true" | |
323 | | } | |
324 | | o1.else := { | |
325 | | y := x | |
326 | | print "condition is false" | |
327 | | } | |
328 | | o1.cond := 0 | |
329 | = condition is false | |
330 | ||
331 | | o1 := $if* | |
332 | | o1.then := { | |
333 | | y := x | |
334 | | print "condition is true" | |
335 | | } | |
336 | | o1.else := { | |
337 | | y := x | |
338 | | print "condition is false" | |
339 | | } | |
340 | | o1.cond := 1 | |
341 | = condition is true | |
342 | ||
343 | Repetition is also accomplished with a built-in store, `loop`. This store | |
344 | contains an unassigned variable called `do`. When it is assigned a value, | |
345 | assumed to be an unsaturated store, a copy of it is made. The variable | |
346 | `x` inside that copy is assigned the value 0. This is supposed to saturate | |
347 | the store. The variable `continue` is then accessed from the store. If | |
348 | it is nonzero, the process repeats, with another copy of the `do` store | |
349 | getting 0 assigned to its `x`, and so forth. | |
350 | ||
351 | | l := $loop* | |
352 | | counter := 5 | |
353 | | l.do := { | |
354 | | y := x | |
355 | | print ^.counter | |
356 | | o := $sub* | |
357 | | o.x := ^.counter | |
358 | | o.y := 1 | |
359 | | ^.counter := o.result | |
360 | | continue := o.result | |
361 | | } | |
362 | | print string "done!" | |
363 | = 5 | |
364 | = 4 | |
365 | = 5 | |
366 | = 2 | |
367 | = 1 | |
368 | = done! | |
369 | ||
370 | Because the `loop` construct will always execute the `do` store at least once | |
371 | (even assuming its only unassigned variable is `x`), it acts like a so-called | |
372 | `repeat` loop. It can be used in conjunction with `if` to simulate a | |
373 | so-called `while` loop. With this loop, the built-in operations provided, | |
374 | and variables which may contain unbounded integer values, Xoomonk should | |
375 | be uncontroversially Turing-complete. | |
376 | ||
377 | Finally, there is no provision for defining functions or procedures, because | |
378 | malingering stores can act as these constructs. | |
379 | ||
380 | | perimeter := { | |
381 | | o1 := $mul* | |
382 | | o1.x := x | |
383 | | o1.y := 2 | |
384 | | o2 := $mul* | |
385 | | o2.x := y | |
386 | | o2.y := 2 | |
387 | | o3 := $add* | |
388 | | o3.x := o1.result | |
389 | | o3.y := o2.result | |
390 | | result := o3.result | |
391 | | } | |
392 | | p1 := perimeter* | |
393 | | p1.x := 13 | |
394 | | p1.y := 6 | |
395 | | print p1.result | |
396 | | p2 := perimeter* | |
397 | | p2.x := 4 | |
398 | | p2.y := 1 | |
399 | | print p2.result | |
400 | = 38 | |
401 | = 10 | |
402 | ||
403 | Grammar | |
404 | ------- | |
405 | ||
406 | Xoomonk ::= { Stmt }. | |
407 | Stmt ::= Assign | Print. | |
408 | Assign ::= Ref ":=" Expr. | |
409 | Print ::= "print" ("string" <string> | "char" Expr | Expr) [";"]. | |
410 | Expr ::= (Block | Ref | Const) ["*"]. | |
411 | Block ::= "{" { Stmt } "}". | |
412 | Ref ::= Name {"." Name}. | |
413 | Name ::= "^" | "$" <alphanumeric> | <alphanumeric>. | |
414 | Const ::= <integer-literal>. | |
415 | ||
416 | Discussion | |
417 | ---------- | |
418 | ||
419 | There is some similarity with Wouter van Oortmerssen's language Bla, in that | |
420 | function environments are very close cousins of updatable stores. But | |
421 | Xoomonk, quite unlike Bla, is an imperative language; once created, a store | |
422 | may be updated at any point. And, of course, this property is exploited in | |
423 | the introduction of malingering stores. | |
424 | ||
425 | Xoomonk originally had an infix operator `&`, which takes two stores as its | |
426 | arguments, at least one of which must be saturated, and evaluates to a third | |
427 | store which is the result of merging the two argument stores. This result store | |
428 | may be saturated even if only one of the argument stores was saturated, if | |
429 | the saturated store gave all the values that the unsaturated store needed. | |
430 | This operator was dropped because it is mostly syntactic sugar for assigning | |
431 | each of the desired variables from one store to the other. However, it does | |
432 | admittedly provide a very natural syntax, which orthogonally includes "named | |
433 | arguments", when using unsaturated stores as procedures: | |
434 | ||
435 | perimeter = { | |
436 | # see example above | |
437 | } | |
438 | g := perimeter* & { x := 13 y := 6 } | |
439 | print g.result | |
440 | ||
441 | Xoomonk, as a project, is also an experiment in _test-driven language | |
442 | design_. As you can see, I've described the language with a series of | |
443 | examples, written in (something close to) Falderal format, each of which | |
444 | could be used as a test case. This should make implementing an interpreter | |
445 | or compiler for the language much easier, when I get around to that. | |
446 | ||
447 | Happy malingering! | |
448 | ||
449 | -Chris Pressey | |
450 | Cat's Eye Technologies | |
451 | August 7, 2011 | |
452 | Evanston, IL |