Commit 24d062a97e07e3d4c565ae759a308a3640eaa10a - Specs-on-Spec

Initial import of all my implementation-less language specs. catseye 12 years ago

8 changed file(s) with 3190 addition(s) and 0 deletion(s). Raw diff Collapse all Expand all

+270

-0

didigm/doc/didigm.html less more

	0	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
	1	<!-- encoding: UTF-8 -->
	2	<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
	3	<head>
	4	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
	5	<title>The Didigm Reflective Cellular Automaton</title>
	6	<!-- begin html doc dynamic markup -->
	7	<script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
	8	<script type="text/javascript" src="/scripts/documentation.js"></script>
	9	<!-- end html doc dynamic markup -->
	10	</head>
	11	<body>
	12
	13	<h1>The Didigm Reflective Cellular Automaton</h1>
	14
	15	<p>November 2007, Chris Pressey, Cat's Eye Technologies</p>
	16
	17	<h2>Introduction</h2>
	18
	19	<p>Didigm is a <em>reflective cellular automaton</em>. What I mean to
	20	impart by this phrase is that it is a cellular automaton where the
	21	transition rules are given by the very patterns of cells that exist
	22	in the playfield at any given time.</p>
	23
	24	<p>Perhaps another way to think of Didigm is:
	25	Didigm =
	26	<a href="/projects/alpaca/">ALPACA</a> +
	27	<a href="/projects/ypsilax/">Ypsilax</a>.</p>
	28
	29	<h2>Didigm as Parameterized Language</h2>
	30
	31	<p>Didigm is actually a parameterized language. A parameterized
	32	language is a schema for specifying a set of languages, where a specific language
	33	can be obtained by supplying one or more parameters. For example,
	34	<a href="/projects/xigxag/">Xigxag</a> is a parameterized language,
	35	where the direction of the scanning and the direction of the building
	36	of new states are parameters. Didigm "takes" a single parameter, an integer.
	37	This parameter determines how many <em>colours</em> (number of possible
	38	states for any given cell) the cellular automaton has. This parameter
	39	defaults to 8. So when we say Didigm, we actually mean
	40	Didigm(8), but there are an infinite number of other possible
	41	languages, such as Didigm(5) and Didigm(70521).</p>
	42
	43	<p>The languages Didigm(0), Didigm(-1), Didigm(-2)
	44	and so forth are probably nonsensical abberations, but I'll leave that
	45	question for the philosophers to ponder.
	46	Didigm(1) is at least well-defined, but it's trivial.
	47	Didigm(2) and Didigm(3) are semantically problematic for
	48	more technically interesting reasons (Didigm(3) might be CA-universal,
	49	but Didigm(2) probably isn't.)
	50	Didigm(4) and above are easily shown to be CA-universal.</p>
	51
	52	<p>
	53	(I say CA-universal, and not Turing-complete, because technically
	54	cellular automata cannot simulate Turing machines without some
	55	extra machinery: TMs can halt, but CAs can't. Since I don't want
	56	to deal with defining that extra machinery in Didigm, it's simpler to
	57	avoid it for now.)
	58	</p>
	59
	60	<p>Colours are typically numbered. However, this is not meant to imply
	61	an ordering between colours. The eight colours of Didigm
	62	are typically referred to as 0 to 7.</p>
	63
	64	<h2>Language Description</h2>
	65
	66	<h3>Playfield</h3>
	67
	68	<p>The Didigm playfield, called <em>le monde</em>,
	69	is considered unbounded, like most cellular automaton
	70	playfields, but there is one significant difference.
	71	There is a horizontal division in this playfield, splitting
	72	it into regions called <em>le ciel</em>, on top,
	73	and <em>la terre</em>, below. This division is
	74	distinguishable — meaning, it must be possible to tell which region
	75	a given cell is in — but it need not have a presence beyond that.
	76	Specifically, this division lies on the edges between cells, rather than in
	77	the cells themselves. It has no "substance" and need not be visible
	78	to the user. (The Didigm Input Format, below, describes how it
	79	may be specified in textual input files.)</p>
	80
	81	<h3>Magic Colours</h3>
	82
	83	<p>Each region of the division has a distinguished colour which is
	84	called the <em>magic colour</em> of that region. The magic colour of le ciel is
	85	colour 0. The magic colour of la terre is colour 7.
	86	(In Didigm(<var>n</var>),
	87	the magic colour of la terre is colour <var>n</var>-1.)</p>
	88
	89	<h3>Transition Rules</h3>
	90
	91	<h4>Definition</h4>
	92
	93	<p>Each transition rule of the cellular automaton is not fixed,
	94	rather, it is given by certain forms that are present in the playfield.</p>
	95
	96	<p>Such a form is called <em>une salle</em> and has the following configuration.
	97	Two horizontally-adjacent cells of the magic colour abut a cell of the <em>destination
	98	colour</em> to the right. Two cells below the rightmost magic-colour
	99	cell is the cell of the <em>source colour</em>; it is surrounded by cells of
	100	any colour called the <em>determiners</em>.</p>
	101
	102	<p>This is perhaps better illustrated than explained. In the following diagram,
	103	the magic colour is 0 (this salle is in le ciel,)
	104	the source colour is 1, the destination colour is 2, and the determiners are
	105	indicated by D's.</p>
	106
	107	<pre>
	108	002
	109	DDD
	110	D1D
	111	DDD
	112	</pre>
	113
	114	<h4>Application</h4>
	115
	116	<p>Salles are interpreted as transition rules as follows. When the colour
	117	of a given cell is the same as the source colour of some salle, and when
	118	the colours of all the cells surrounding that cell are the exact same
	119	colours (in the exact same pattern) as the determininers of that salle,
	120	we say that that salle <em>matches</em> that cell. When any cell is
	121	matched by some salle in the other region, we say that that salle
	122	<em>applies</em> to that cell, and that cell is replaced
	123	by a cell of the destination colour of that salle.</p>
	124
	125	<p>"The other region" refers, of course, to the region that is not the
	126	region in which the cell being transformed resides. Salles in la terre
	127	only apply to cells in le ciel and vice-versa. This complementarity
	128	serves to limit the amount of chaos: if there was some salle that applied to
	129	<em>all</em> cells, it would apply directly to the cells that made up that
	130	salle, and that salle would be immediately transformed.</p>
	131
	132	<p>On each "tick" of the cellular automaton, all cells are checked to
	133	find the salle that applies to them, and then all are transformed, simultaneously,
	134	resulting in the next configuration of le monde.</p>
	135
	136	<p>There is a "default" transition rule which also serves to limit
	137	the amount of chaos: if no salle applies to a cell, the colour of that
	138	cell does not change.</p>
	139
	140	<p>Salles may overlap. However, no salle may straddle the horizon.
	141	(That is, each salle must be either completely in le ciel or completely
	142	in la terre.)</p>
	143
	144	<p>Salles may conflict (i.e. two salles may have the same source colour
	145	and determiners, but different destination colours.) The behaviour in
	146	this case is defined to be uniformly random: if there are <var>n</var>
	147	conflicting salles, each has a 1/<var>n</var> chance of being the
	148	one that applies.</p>
	149
	150	<h2>Didigm Input Format</h2>
	151
	152	<p>I'd like to give some examples, but first I need a format to given them in.</p>
	153
	154	<p>A Didigm Input File is a text file. The textual digit symbols
	155	<code>0</code> through <code>9</code> indicate cells of colours 0 through 9.
	156	Further colours may be indicated by enclosing a decimal digit string in square brackets,
	157	for example <code>[123]</code>. This digit string may contain leading
	158	zeros, in order for columns to line up nicely in the file.</p>
	159
	160	<p>A line containing only a <code>,</code> symbol in the leftmost column
	161	indicates the division between le ciel and la terre. This line does not
	162	become part of the playfield.</p>
	163
	164	<p>A line beginning with a <code>=</code> is a directive of some sort.</p>
	165
	166	<p>A line beginning with <code>=C</code> followed by a colour indicator
	167	indicates how many colours (the <var>n</var> in Didigm(<var>n</var>))
	168	this playfield contains. This directive may only occur once.</p>
	169
	170	<p>A line beginning with <code>=F</code> followed by a colour indicator
	171	as described above, indicates that the unspecified (and unbounded) remainder
	172	of le ciel or la terre (whichever side of <code>,</code> the directive is on)
	173	is to be considered filled with cells of the given colour.</p>
	174
	175	<p>Of course, an application which implements Didigm with some
	176	alternate means of specifying le monde, for example a graphical user interface,
	177	need not understand the Didigm Input Format.</p>
	178
	179	<h2>Examples</h2>
	180
	181	<p>Didigm is immediately seen to be CA-universal, in that you can
	182	readily (and stably) express a number of known CA-universal cellular automata in it.
	183	For example, to express John Conway's Life, you could say that
	184	colour 1 means "alive" and colour 2 means "dead", and compose something like</p>
	185
	186	<pre>
	187	002002001001
	188	222122112212 <i>... and so on ...</i>
	189	212212212121 <i>... for all 256 ...</i>
	190	222222222222 <i>... rules of Life ...</i>
	191	=F3
	192	,
	193	=F2
	194	22222
	195	21222
	196	21212
	197	21122
	198	22222
	199	</pre>
	200
	201	<p>Because the magic colour 7 never appears in la terre, il n'y a aucune salle
	202	dans la terre et donc tout le ciel est toujours la meme chose.</p>
	203
	204	<p>There are of course simpler CA's that are apparently CA-universal that
	205	would be possible to describe more compactly. But more interesting (to me)
	206	is the possibility for making reflective CA's.</p>
	207
	208	<p>To do this in an uncontrolled fashion is easy. We just stick some salles
	209	in le ciel, some salles in la terre, and let 'er rip. Unfortunately, in general,
	210	les salles in each region will probably quickly damage enough of the salles in
	211	the other region that le monde will become sterile soon enough.</p>
	212
	213	<p>A rudimentary example of something a little more orchestrated follows.</p>
	214
	215	<pre>
	216	3333333333333
	217	3002300230073
	218	3111311132113
	219	3311321131573
	220	3111311131333
	221	3333333333333
	222	=F3
	223	,
	224	=F1
	225	111111111111111
	226	111111131111111
	227	111111111111574
	228	111111111111333
	229	311111111111023
	230	111111111111113
	231	</pre>
	232
	233
	234	<p>The intent of this is that the 3's in la terre initially grow
	235	streams of 2's to their right, due to the leftmost two salles in le ciel.
	236	However, when the top stream of 2's reaches the cell just above and to the
	237	left of the 5, the third salle in le ciel matches and turns the 5 into a
	238	7, forming une salle dans la terre. This salle turns every 2 to the right of
	239	a 0 in le ciel into a 4, thus modifying two of les salles in le ciel.
	240	The result of these modified salles is to turn the bottom stream of 2's
	241	into a stream of 4's halfway along.</p>
	242
	243	<p>This is at least predictable, but it still becomes uninteresting fairly quickly.
	244	Also note that it's not just the isolated 3's in la terre that grow
	245	streams of 2's to the right: the 3's on the right side of la salle
	246	would, too. This could be rectified by having a wall of some other colour
	247	on that side of la salle, and I'm sure you could extend this example
	248	by having something else happen when the stream of 4's hit the 0 in that
	249	salle, but you get the picture. Creating a neat and tidy and long-lived
	250	reflective cellular automaton requires at least as much care as
	251	constructing a "normal" cellular automaton, and probably in general more.</p>
	252
	253	<h2>History</h2>
	254
	255	<p>I came up with the concept of a reflective cellular automaton
	256	(which is, as far as I'm aware, a novel concept) independently on
	257	November 1st, 2007, while walking on Felix Avenue, in Windsor, Ontario.</p>
	258
	259	<p>No reference implementation exists yet. Therefore, all Didigm runs
	260	have been thought experiments, and it's entirely possible that I've missed
	261	something in its definition that having a working simulator would reveal.</p>
	262
	263	<p>Happy magic colouring!
	264	<br/>Chris Pressey
	265	<br/>Chicago, Illinois
	266	<br/>November 17, 2007</p>
	267
	268	</body>
	269	</html>

+838

-0

madison/doc/Madison.falderal less more

	0	Madison
	1	=======
	2
	3	Version 0.1
	4	December 2011, Chris Pressey, Cat's Eye Technologies
	5
	6	Abstract
	7	--------
	8
	9	Madison is a language in which one can state proofs of properties
	10	of term-rewriting systems. Classical methods of automated reasoning,
	11	such as resolution, are not used; indeed, term-rewriting itself is
	12	used to check the proofs. Both direct proof and proof by induction
	13	are supported. Induction in a proof must be across a structure which
	14	has a well-founded inductive definition. Such structures can be
	15	thought of as types, although this is largely nominal; the traditional
	16	typelessness of term-rewiting systems is largely retained.
	17
	18	Term-rewriting
	19	--------------
	20
	21	Madison has at its core a simple term-rewriting language. It is of
	22	a common form which should be unsurprising to anyone who has worked
	23	at all with term rewriting. A typical simple program contains
	24	a set of rules and a term on which to apply those rules. Each
	25	rule is a pair of terms; either term of the pair may contain
	26	variables, but any variable that appears on the r.h.s must also
	27	appear on the l.h.s. A rule matches a term if is the same as
	28	the term with the exception of the variables, which are bound
	29	to its subterms; applying a matching rule replaces the term
	30	with the r.h.s. of the rule, with the variables expanded approp-
	31	riately. Rules are applied in an innermost, leftmost fashion to
	32	the term, corresponding to eager evaluation. Rewriting terminates
	33	when there is no rule whose l.h.s. matches the current incarnation
	34	of the term being rewritten.
	35
	36	A term is either an atom, which is a symbol that stands alone,
	37	or a constructor, which is a symbol followed by a comma-separated list
	38	of subterms enclosed in parentheses. Symbols may consist of letters,
	39	digits, and hyphens, with no intervening whitespace. A symbol is
	40	a variable symbol if it begins with a capital letter. Variable
	41	symbols may also begin with underscores, but these may only occur
	42	in the l.h.s. of a rewrite rule, to indicate that we don't care
	43	what value is bound to the variable and we won't be using it on
	44	the r.h.s.
	45
	46	(The way we are using the term "constructor" may be slightly non-
	47	standard; in some other sources, this is called a "function symbol",
	48	and a "constructor" is a subtly different thing.)
	49
	50	Because the rewriting language is merely a component (albeit the
	51	core component) of a larger system, the aforementioned typical
	52	simple program must be cast into some buttressing syntax. A full
	53	program consists of a `let` block which contains the rules
	54	and a `rewrite` admonition which specifies the term to be re-
	55	written. An example follows.
	56
	57	\| let
	58	\| leftmost(tree(X,Y)) -> leftmost(X)
	59	\| leftmost(leaf(X)) -> X
	60	\| in
	61	\| rewrite leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan)))
	62	= alice
	63
	64	In the above example, there are two rules for the constructor
	65	`leftmost/1`. The first is applied to the outer tree to obtain
	66	a new leftmost constructor containing the inner tree; the first
	67	is applied again to obtain a new leftmost constructor containing
	68	the leaf containing `alice`; and the second is applied to that
	69	leaf term to obtain just `alice`. At that point, no more rules
	70	apply, so rewriting terminates, yielding `alice`.
	71
	72	Madison is deterministic; if rules overlap, the first one given
	73	(syntactically) is used. For this reason, it is a good idea
	74	to order rules from most specific to least specific.
	75
	76	I used the phrase "typical simple program" above because I was
	77	trying intentionally to avoid saying "simplest program". In fact,
	78	technically no `let` block is required, so you can write some
	79	really trivial Madison programs, like the following:
	80
	81	\| rewrite cat
	82	= cat
	83
	84	I think that just about covers the core term-rewriting language.
	85	Term-rewriting is Turing-complete, so Madison is too. If you
	86	wish to learn more about term rewriting, there are several good
	87	books and webpages on the subject; I won't go into it further
	88	here.
	89
	90	Proof-Checking
	91	--------------
	92
	93	My desire with Madison was to design a language in which you
	94	can prove things. Not a full-blown theorem prover -- just a
	95	proof checker, where you supply a proof and it confirms either
	96	that the proof holds or doesn't hold. (Every theorem prover
	97	has at its core a proof checker, but it comes bundled with a lot of
	98	extra machinery to search the space of possible proofs cleverly,
	99	looking for one which will pass the proof-checking phase.)
	100
	101	It's no coincidence that Madison is built on top of a term-rewriting
	102	language. For starters, a proof is very similar to the execution
	103	trace of a term being rewritten. In each of the steps of the proof,
	104	the statement to be proved is transformed by replacing some part
	105	of it with some equally true thing -- in other words, rewritten.
	106	In fact, Post Canonical Systems were an early kind of rewriting
	107	system, devised by Emil Post to (as I understand it) illustrate this
	108	similarity, and to show that proofs could be mechanically carried out
	109	in a rewriting system.
	110
	111	So: given a term-rewriting language, we can give a trivial kind
	112	of proof simply by stating the rewrite steps that should occur
	113	when a term is rewritten, and check that proof by rewriting the term
	114	and confirming that those were in fact the steps that occurred.
	115
	116	For the purpose of stating these sequences of rewrite steps to be
	117	checked, Madison has a `theorem..proof..qed` form. To demonstrate
	118	this form, let's use Madison to prove that 2 + 2 = 4, using Peano
	119	arithmetic.
	120
	121	\| let
	122	\| add(s(X),Y) -> add(X,s(Y))
	123	\| add(z,Y) -> Y
	124	\| in theorem
	125	\| add(s(s(z)),s(s(z))) ~> s(s(s(s(z))))
	126	\| proof
	127	\| add(s(s(z)),s(s(z)))
	128	\| -> add(s(z),s(s(s(z)))) [by add.1]
	129	\| -> add(z,s(s(s(s(z))))) [by add.1]
	130	\| -> s(s(s(s(z)))) [by add.2]
	131	\| qed
	132	= true
	133
	134	The basic syntax should be fairly apparent. The `theorem` block
	135	contains the statement to be proved. The `~>` means "rewrites
	136	in zero or more steps to". So, here, we are saying that 2 + 2
	137	(in Peano notation) rewrites, in zero or more steps, to 4.
	138
	139	The `proof` block contains the actual series of rewrite steps that
	140	should be carried out. For elucidation, each step may name the
	141	particular rule which is applied to arrive at the transformed term
	142	at that step. Rules are named by their outermost constructor,
	143	followed by a dot and the ordinal position of the rule in the list
	144	of rules. These rule-references are optional, but the fact that
	145	the rule so named was actually used to rewrite the term at that step
	146	could be checked too, of course. The `qed` keyword ends the proof
	147	block.
	148
	149	Naturally, you can also write a proof which does not hold, and
	150	Madison should inform you of this fact. 2 + 3, for example,
	151	does not equal 4, and it can pinpoint exactly where you went
	152	wrong should you come to this conclusion:
	153
	154	\| let
	155	\| add(s(X),Y) -> add(X,s(Y))
	156	\| add(z,Y) -> Y
	157	\| in theorem
	158	\| add(s(s(z)),s(s(s(z)))) ~> s(s(s(s(z))))
	159	\| proof
	160	\| add(s(s(z)),s(s(s(z))))
	161	\| -> add(s(z),s(s(s(s(z))))) [by add.1]
	162	\| -> add(z,s(s(s(s(z))))) [by add.1]
	163	\| -> s(s(s(s(z)))) [by add.2]
	164	\| qed
	165	? Error in proof [line 6]: step 2 does not follow from applying [add.1] to previous step
	166
	167	Now, while these are proofs, they don't tell us much about the
	168	properties of the terms and rules involved, because they are not
	169	generalized. They say something about a few fixed values, like
	170	2 and 4, but they do not say anything about any infinite
	171	sets of values, like the natural numbers. Now, that would be really
	172	useful. And, while I could say that what you've seen of Madison so far
	173	is a proof checker, it is not a very impressive one. So let's take
	174	this further.
	175
	176	Quantification
	177	--------------
	178
	179	To state a generalized proof, we will need to introduce variables,
	180	and to have variables, we will need to be able to say what those
	181	variables can range over; in short, we need quantification. Since
	182	we're particularly interested in making statements about infinite
	183	sets of values (like the natural numbers), we specifically want
	184	universal quantification:
	185
	186	For all x, ...
	187
	188	But to have universal quantification, we first need a universe
	189	over which to quantify. When we say "for all /x/", we generally
	190	don't mean "any and all things of any kind which we could
	191	possibly name /x/". Rather, we think of /x/ as having a type of
	192	some kind:
	193
	194	For all natural numbers x, ...
	195
	196	Then, if our proof holds, it holds for all natural numbers.
	197	No matter what integer value greater than or equal to zero
	198	we choose for /x/, the truism contained in the proof remains true.
	199	This is the sort of thing we want in Madison.
	200
	201	Well, to start, there is one glaringly obvious type in any
	202	term-rewriting language, namely, the term. We could say
	203
	204	For all terms t, ...
	205
	206	But it would not actually be very interesting, because terms
	207	are so general and basic that there's not actually very much you
	208	can say about them that you don't already know. You sort of need
	209	to know the basic properties of terms just to build a term-rewriting
	210	language (like the one at Madison's core) in the first place.
	211
	212	The most useful property of terms as far as Madison is concerned is
	213	that the subterm relationship is _well-founded_. In other words,
	214	in the term `c(X)`, `X` is "smaller than" `c(X)`, and since terms are
	215	finite, any series of rewrites which always results in "smaller" terms
	216	will eventually terminate. For completeness, we should probably prove
	217	that rigorously, but for expediency we will simply take it as a given
	218	fact for our proofs.
	219
	220	Anyway, to get at something actually interesting, we must look further
	221	than the terms themselves.
	222
	223	Types
	224	-----
	225
	226	What's actually interesting is when you define a restricted
	227	set of forms that terms can take, and you distinguish terms inside
	228	this set of forms from the terms outside the set. For example,
	229
	230	\| let
	231	\| boolean(true) -> true
	232	\| boolean(false) -> true
	233	\| boolean(_) -> false
	234	\| in
	235	\| rewrite boolean(false)
	236	= true
	237
	238	We call a set of forms like this a _type_. As you can see, we
	239	have basically written a predicate that defines our type. If any
	240	of the rewrite rules in the definition of this predicate rewrite
	241	a given term to `true`, that term is of our type; if it rewrites
	242	to `false`, it is not.
	243
	244	Once we have types, any constructor may be said to have a type.
	245	By this we mean that no matter what subterms the constructor has,
	246	the predicate of the type of which we speak will always reduce to
	247	`true` when that term is inserted in it.
	248
	249	Note that using predicates like this allows our types to be
	250	non-disjoint; the same term may reduce to true in two different
	251	predicates. My first sketches for Madison had disjoint types,
	252	described by rules which reduced each term to an atom which named
	253	the type of that term. (So the above would have been written with
	254	rules `true -> boolean` and `false -> boolean` instead.) However,
	255	while that method may be, on the surface, more elegant, I believe
	256	this approach better reflects how types are actually used in
	257	programming. At the end of the day, every type is just a predicate,
	258	and there is nothing stopping 2 from being both a natural number and
	259	an integer. And, for that matter, a rational number and a real
	260	number.
	261
	262	In theory, every predicate is a type, too, but that's where things
	263	get interesting. Is 2 not also an even number, and a prime number?
	264	And in an appropriate (albeit contrived) language, is it not a
	265	description of a computation which may or may not always halt?
	266
	267	The Type Syntax
	268	---------------
	269
	270	The above considerations motivate us to be careful when dealing
	271	with types. We should establish some ground rules so that we
	272	know that our types are useful to universally quantify over.
	273
	274	Unfortunately, this introduces something of a chicken-and-egg
	275	situation, as our ground rules will be using logical connectives,
	276	while at the same time they will be applied to those logical
	277	connectives to ensure that they are sound. This is not, actually,
	278	a big deal; I mention it here more because it is interesting.
	279
	280	So, the rules which define our type must conform to certain
	281	rules, themselves. While it would be possible to allow the
	282	Madison programmer to use any old bunch of rewrite rules as a
	283	type, and to check that these rules make for a "good" type when
	284	such a usage is seen -- and while this would be somewhat
	285	attractive from the standpoint of proving properties of term-
	286	rewriting systems using term-rewriting systems -- it's not strictly
	287	necessary to use a descriptive approach such as this, and there are
	288	certain organizational benefits we can achieve by taking a more
	289	prescriptive tack.
	290
	291	Viz., we introduce a special syntax for defining a type with a
	292	set of rules which function collectively as a type predicate.
	293	Again, it's not strictly necessary to do this, but it does
	294	help organize our code and perhaps our thoughts, and perhaps make
	295	an implementation easier to build. It's nice to be able to say,
	296	yes, what it means to be a `boolean` is defined right here and
	297	nowhere else.
	298
	299	So, to define a type, we write our type rules in a `type..in`
	300	block, like the following.
	301
	302	\| type boolean is
	303	\| boolean(true) -> true
	304	\| boolean(false) -> true
	305	\| in
	306	\| rewrite boolean(false)
	307	= true
	308
	309	As you can see, the wildcard reduction to false can be omitted for
	310	brevity. (In other words, "Nothing else is a boolean" is implied.)
	311	And, the `boolean` constructor can be used for rewriting in a term
	312	just like any other plain, non-`type`-blessed rewrite rule.
	313
	314	\| type boolean is
	315	\| boolean(true) -> true
	316	\| boolean(false) -> true
	317	\| in
	318	\| rewrite boolean(tree(leaf(sabrina),leaf(joe)))
	319	= false
	320
	321	Here are the rules that the type-defining rules must conform to.
	322	If any of these rules are violated in the `type` block, the Madison
	323	implementation must complain, and not proceed to try to prove anything
	324	from them.
	325
	326	Once a type is defined, it cannot be defined further in a regular,
	327	non-type-defining rewriting rule.
	328
	329	\| type boolean is
	330	\| boolean(true) -> true
	331	\| boolean(false) -> true
	332	\| in let
	333	\| boolean(red) -> green
	334	\| in
	335	\| rewrite boolean(red)
	336	? Constructor "boolean" used in rule but already defined as a type
	337
	338	The constructor in the l.h.s. must be the same in all rules.
	339
	340	\| type foo is
	341	\| foo(bar) -> true
	342	\| baz(bar) -> true
	343	\| in
	344	\| rewrite cat
	345	? In type "foo", constructor "bar" used on l.h.s. of rule
	346
	347	The constructor used in the rules must be arity 1 (i.e. have exactly
	348	one subterm.)
	349
	350	\| type foo is
	351	\| foo(bar,X) -> true
	352	\| in
	353	\| rewrite cat
	354	? In type "foo", constructor has arity greater than one
	355
	356	It is considered an error if the predicate rules ever rewrite, inside
	357	the `type` block, to anything besides the atoms `true` or `false`.
	358
	359	\| type foo is
	360	\| foo(bar) -> true
	361	\| foo(tree(X)) -> bar
	362	\| in
	363	\| rewrite cat
	364	? In type "foo", rule reduces to "bar" instead of true or false
	365
	366	The r.h.s.'s of the rules of the type predicate must always
	367	rewrite to `true` or `false`. That means, if we can't prove that
	368	the rules always rewrite to something, we can't use them as type
	369	predicate rules. In practice, there are a few properties that
	370	we insist that they have.
	371
	372	They may involve type predicates that have previously been
	373	established.
	374
	375	\| type boolean is
	376	\| boolean(true) -> true
	377	\| boolean(false) -> true
	378	\| in type boolbox is
	379	\| boolbox(box(X)) -> boolean(X)
	380	\| in
	381	\| rewrite boolbox(box(true))
	382	= true
	383
	384	They may involve certain, pre-defined rewriting rules which can
	385	be thought of as operators on values of boolean type (which, honestly,
	386	is probably built-in to the language.) For now there is only one
	387	such pre-defined rewriting rule: `and(X,Y)`, where `X` and `Y` are
	388	booleans, and which rewrites to a boolean, using the standard truth
	389	table rules for boolean conjunction.
	390
	391	\| type boolean is
	392	\| boolean(true) -> true
	393	\| boolean(false) -> true
	394	\| in type boolpair is
	395	\| boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y))
	396	\| in
	397	\| rewrite boolpair(pair(true,false))
	398	= true
	399
	400	\| type boolean is
	401	\| boolean(true) -> true
	402	\| boolean(false) -> true
	403	\| in type boolpair is
	404	\| boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y))
	405	\| in
	406	\| rewrite boolpair(pair(true,cheese))
	407	= false
	408
	409	Lastly, the r.h.s. of a type predicate rule can refer to the self-same
	410	type being defined, but only under certain conditions. Namely,
	411	the rewriting must "shrink" the term being rewritten. This is what
	412	lets us inductively define types.
	413
	414	\| type nat is
	415	\| nat(z) -> true
	416	\| nat(s(X)) -> nat(X)
	417	\| in
	418	\| rewrite nat(s(s(z)))
	419	= true
	420
	421	\| type nat is
	422	\| nat(z) -> true
	423	\| nat(s(X)) -> nat(s(X))
	424	\| in
	425	\| rewrite nat(s(s(z)))
	426	? Type not well-founded: recursive rewrite does not decrease in [foo.2]
	427
	428	\| type nat is
	429	\| nat(z) -> true
	430	\| nat(s(X)) -> nat(s(s(X)))
	431	\| in
	432	\| rewrite nat(s(s(z)))
	433	? Type not well-founded: recursive rewrite does not decrease in [foo.2]
	434
	435	\| type bad
	436	\| bad(leaf(X)) -> true
	437	\| bad(tree(X,Y)) -> and(bad(X),bad(tree(Y,Y))
	438	\| in
	439	\| rewrite whatever
	440	? Type not well-founded: recursive rewrite does not decrease in [bad.2]
	441
	442	We can check this by looking at all the rewrite rules in the
	443	definition of the type that are recursive, i.e. that contain on
	444	on their r.h.s. the constructor being defined as a type predicate.
	445	For every such occurrence on the r.h.s. of a recursive rewrite,
	446	the contents of the constructor must be "smaller" than the contents
	447	of the constructor on the l.h.s. What it means to be smaller
	448	should be fairly obvious: it just has fewer subterms. If all the
	449	rules conform to this pattern, rewriting will eventually terminate,
	450	because it will run out of subterms to rewrite.
	451
	452	Application of Types in Proofs
	453	------------------------------
	454
	455	Now, aside from these restrictions, type predicates are basically
	456	rewrite rules, just like any other. The main difference is that
	457	we know they are well-defined enough to be used to scope the
	458	universal quantification in a proof.
	459
	460	Simply having a definition for a `boolean` type allows us to construct
	461	a simple proof with variables. Universal quantification over the
	462	universe of booleans isn't exactly impressive; we don't cover an infinite
	463	range of values, like we would with integers, or lists. But it's
	464	a starting point on which we can build. We will give some rewrite rules
	465	for a constructor `not`, and prove that this constructor always reduces
	466	to a boolean when given a boolean.
	467
	468	\| type boolean is
	469	\| boolean(true) -> true
	470	\| boolean(false) -> true
	471	\| in let
	472	\| not(true) -> false
	473	\| not(false) -> true
	474	\| not(_) -> undefined
	475	\| in theorem
	476	\| forall X where boolean(X)
	477	\| boolean(not(X)) ~> true
	478	\| proof
	479	\| case X = true
	480	\| boolean(not(true))
	481	\| -> boolean(true) [by not.1]
	482	\| -> true [by boolean.1]
	483	\| case X = false
	484	\| boolean(not(false))
	485	\| -> boolean(false) [by not.2]
	486	\| -> true [by boolean.2]
	487	\| qed
	488	= true
	489
	490	As you can see, proofs using universally quantified variables
	491	need to make use of _cases_. We know this proof is sound, because
	492	it shows the rewrite steps for all the possible values of the
	493	variable -- and we know they are all the possible values, from the
	494	definition of the type.
	495
	496	In this instance, the cases are just the two possible values
	497	of the boolean type, but if the type was defined inductively,
	498	they would need to cover the base and inductive cases. In both
	499	matters, each case in a complete proof maps to exactly one of
	500	the possible rewrite rules for the type predicate. (and vice versa)
	501
	502	Let's prove the type of a slightly more complex rewrite rule,
	503	one which has multiple subterms which can vary. (This `and`
	504	constructor has already been introduced, and we've claimed we
	505	can use it in the definition of well-founded inductive types;
	506	but this code proves that it is indeed well-founded, and it
	507	doesn't rely on it already being defined.)
	508
	509	\| let
	510	\| and(true,true) -> true
	511	\| and(_,_) -> false
	512	\| in theorem
	513	\| forall X where boolean(X)
	514	\| forall Y where boolean(Y)
	515	\| boolean(and(X,Y)) ~> true
	516	\| proof
	517	\| case X = true
	518	\| case Y = true
	519	\| boolean(and(true,true))
	520	\| -> boolean(true) [by and.1]
	521	\| -> true [by boolean.1]
	522	\| case Y = false
	523	\| boolean(and(true,false))
	524	\| -> boolean(false) [by and.2]
	525	\| -> true [by boolean.2]
	526	\| case X = false
	527	\| case Y = true
	528	\| boolean(and(false,true))
	529	\| -> boolean(false) [by and.2]
	530	\| -> true [by boolean.2]
	531	\| case Y = false
	532	\| boolean(and(false,false))
	533	\| -> boolean(false) [by and.2]
	534	\| -> true [by boolean.2]
	535	\| qed
	536	= true
	537
	538	Unwieldy, you say! And you are correct. But making something
	539	easy to use was never my goal.
	540
	541	Note that the definition of `and()` is a bit more open-ended than
	542	`not()`. `and.2` allows terms like `and(dog,cat)` to rewrite to `false`.
	543	But our proof only shows that the result of reducing `and(A,B)` is
	544	a boolean when both A and B are booleans. So it, in fact,
	545	tells us nothing about the type of `and(dog,cat)`, nor in fact anything
	546	at all about the properties of `and(A,B)` when one or more of `A` and
	547	`B` are not of boolean type. So be it.
	548
	549	Anyway, since we were speaking of inductively defined types
	550	previously, let's do that now. With the help of `and()`, here is
	551	a type for binary trees.
	552
	553	\| type tree is
	554	\| tree(leaf) -> true
	555	\| tree(branch(X,Y)) -> and(tree(X),tree(Y))
	556	\| in
	557	\| rewrite tree(branch(leaf,leaf))
	558	= true
	559
	560	We can define some rewrite rules on trees. To start small,
	561	let's define a simple predicate on trees.
	562
	563	\| type tree is
	564	\| tree(leaf) -> true
	565	\| tree(branch(X,Y)) -> and(tree(X),tree(Y))
	566	\| in let
	567	\| empty(leaf) -> true
	568	\| empty(branch(_,_)) -> false
	569	\| in empty(branch(branch(leaf,leaf),leaf))
	570	= false
	571
	572	\| type tree is
	573	\| tree(leaf) -> true
	574	\| tree(branch(X,Y)) -> and(tree(X),tree(Y))
	575	\| in let
	576	\| empty(leaf) -> true
	577	\| empty(branch(_,_)) -> false
	578	\| in empty(leaf)
	579	= true
	580
	581	Now let's prove that our predicate always rewrites to a boolean
	582	(i.e. that it has boolean type) when its argument is a tree.
	583
	584	\| type tree is
	585	\| tree(leaf) -> true
	586	\| tree(branch(X,Y)) -> and(tree(X),tree(Y))
	587	\| in let
	588	\| empty(leaf) -> true
	589	\| empty(branch(_,_)) -> false
	590	\| in theorem
	591	\| forall X where tree(X)
	592	\| boolean(empty(X)) ~> true
	593	\| proof
	594	\| case X = leaf
	595	\| boolean(empty(leaf))
	596	\| -> boolean(true) [by empty.1]
	597	\| -> true [by boolean.1]
	598	\| case X = branch(S,T)
	599	\| boolean(empty(branch(S,T)))
	600	\| -> boolean(false) [by empty.2]
	601	\| -> true [by boolean.2]
	602	\| qed
	603	= true
	604
	605	This isn't really a proof by induction yet, but it's getting closer.
	606	This is still really us examining the cases to determine the type.
	607	But, we have an extra guarantee here; in `case X = branch(S,T)`, we
	608	know `tree(S) -> true`, and `tree(T) -> true`, because `tree(X) -> true`.
	609	This is one more reason why `and(X,Y)` is built-in to Madison; it
	610	needs to leverage what it means and make use of this information in a
	611	proof. We don't really use that extra information in this proof, but
	612	we will later on.
	613
	614	Structural Induction
	615	--------------------
	616
	617	Let's try something stronger, and get into something that could be
	618	described as real structural induction. This time, we won't just prove
	619	something's type. We'll prove something that actually walks and talks
	620	like a real (albeit simple) theorem: the reflection of the reflection
	621	of any binary tree is the same as the original tree.
	622
	623	\| type tree is
	624	\| tree(leaf) -> true
	625	\| tree(branch(X,Y)) -> and(tree(X),tree(Y))
	626	\| in let
	627	\| reflect(leaf) -> leaf
	628	\| reflect(branch(A,B)) -> branch(reflect(B),reflect(A))
	629	\| in theorem
	630	\| forall X where tree(X)
	631	\| reflect(reflect(X)) ~> X
	632	\| proof
	633	\| case X = leaf
	634	\| reflect(reflect(leaf))
	635	\| -> reflect(leaf) [by reflect.1]
	636	\| -> leaf [by reflect.1]
	637	\| case X = branch(S, T)
	638	\| reflect(reflect(branch(S, T)))
	639	\| -> reflect(branch(reflect(T),reflect(S))) [by reflect.2]
	640	\| -> branch(reflect(reflect(S)),reflect(reflect(T))) [by reflect.2]
	641	\| -> branch(S,reflect(reflect(T))) [by IH]
	642	\| -> branch(S,T) [by IH]
	643	\| qed
	644	= true
	645
	646	Finally, this is a proof using induction! In the [by IH] clauses,
	647	IH stands for "inductive hypothesis", the hypothesis that we may
	648	assume in making the proof; namely, that the property holds for
	649	"smaller" instances of the type of X -- in this case, the "smaller"
	650	trees S and T that are used to construct the tree `branch(S, T)`.
	651
	652	Relying on the IH is valid only after we have proved the base case.
	653	After having proved `reflect(reflect(S)) -> S` for the base cases of
	654	the type of S, we are free to assume that `reflect(reflect(S)) -> S`
	655	in the induction cases. And we do so, to rewrite the last two steps.
	656
	657	Like cases, the induction in a proof maps directly to the
	658	induction in the definition of the type of the variable being
	659	universally quantified upon. If the induction in the type is well-
	660	founded, so too will be the induction in the proof. (Indeed, the
	661	relationship between induction and cases is implicit in the
	662	concepts of the "base case" and "inductive case (or step)".)
	663
	664	Stepping Back
	665	-------------
	666
	667	So, we have given a simple term-rewriting-based language for proofs,
	668	and shown that it can handle a proof of a property over an infinite
	669	universe of things (trees.) That was basically my goal in designing
	670	this language. Now let's step back and consider some of the
	671	implications of this system.
	672
	673	We have, here, a typed programming language. We can define types
	674	that look an awful lot like algebraic data types. But instead of
	675	glibly declaring the type of any given term, like we would in most
	676	functional languages, we actually have to prove that our terms
	677	always rewrite to a value of that type. That's more work, of
	678	course, but it's also stronger: in proving that the term always
	679	rewrites to a value of the type, we have, naturally, proved that
	680	it always rewrites -- that its rewrite sequence is terminating.
	681	There is no possibility that its rewrite sequence will enter an
	682	infinite loop. Often, we establish this with the help of previously
	683	established basis that our inductively-defined types are well-founded,
	684	which is itself proved on the basis that the subterm relationship is
	685	well-founded.
	686
	687	Much like we can prove termination in course of proving a type,
	688	we can prove a type in the course of proving a property -- such
	689	as the type of `reflect(reflect(T))` above. (This does not directly
	690	lead to a proof of the type of `reflect`, but whatever.)
	691
	692	And, of course, we are only proving the type of term on the
	693	assumption that its subterms have specific types. These proofs
	694	say nothing about the other cases. This may provide flexibility
	695	for extending rewrite systems -- or it might not, I'm not sure.
	696	It might be nice to prove that all other types result in some
	697	error term. (One of the more annoying things about term-rewriting
	698	languages is how an error can result in a half-rewritten program
	699	instead of a recgonizable error code. There seems to be a tradeoff
	700	between extensibility and producing recognizable errors.)
	701
	702	Grammar so Far
	703	--------------
	704
	705	I think I've described everything I want in the language above, so
	706	the grammar should, modulo tweaks, look something like this:
	707
	708	Madison ::= Block.
	709	Block ::= LetBlock \| TypeBlock \| ProofBlock \| RewriteBlock.
	710	LetBlock ::= "let" {Rule} "in" Block.
	711	TypeBlock ::= "type" Symbol "is" {Rule} "in" Block.
	712	RewriteBlock ::= "rewrite" Term.
	713	Rule ::= Term "->" Term.
	714	Term ::= Atom \| Constructor \| Variable.
	715	Atom ::= Symbol.
	716	Constructor ::= Symbol "(" Term {"," Term} ")".
	717	ProofBlock ::= "theorem" Statement "proof" Proof "qed".
	718	Statement ::= Quantifier Statement \| MultiStep.
	719	Quantifiers ::= "forall" Variable "where" Term.
	720	MultiStep ::= Term "~>" Term.
	721	Proof ::= Case Proof {Case Proof} \| Trace.
	722	Trace ::= Term {"->" Term [RuleRef]}.
	723	RuleRef ::= "[" "by" (Symbol "." Integer \| "IH") "]".
	724
	725	Discussion
	726	----------
	727
	728	I think that basically covers it. This document is still a little
	729	rough, but that's what major version zeroes are for, right?
	730
	731	I have essentially convinced myself that the above-described system
	732	is sufficient for simple proof checking. There are three significant
	733	things I had to convince myself of to get to this point, which I'll
	734	describe here.
	735
	736	One is that types have to be well-founded in order for them to serve
	737	as scopes for universal quantification. This is obvious in
	738	retrospect, but getting them into the language in a way where it was
	739	clear they could be checked for well-foundedness took a little
	740	effort. The demarcation of type-predicate rewrite rules was a big
	741	step, and a little disappointing because it introduces the loaded
	742	term `type` into Madison's vernacular, which I wanted to avoid.
	743	But it made it much easier to think about, and to formulate the
	744	rules for checking that a type is well-founded. As I mentioned, it
	745	could go away -- Madison could just as easily check that any
	746	constructor used to scope a universal quantification is well-founded.
	747	But that would probably muddy the presentation of the idea in this
	748	document somewhat. It would be something to keep in mind for a
	749	subsequent version of Madison that further distances itself from the
	750	notion of "types".
	751
	752	Also, it would probably be possible to extend the notion of well-
	753	founded rewriting rules to mutually-recursive rewriting rules.
	754	However, this would complicate the procedure for checking that a
	755	type predicate is well-founded.
	756
	757	The second thing I had to accept to get to this conviction is that
	758	`and(X,Y)` is built into the language. It can't just be defined
	759	in Madison code, because while this would be wonderful from a
	760	standpoint of minimalism, Madison has to know what it means to let
	761	you write non-trivial inductive proofs. In a nutshell, it has to
	762	know that `foo(X) -> and(bar(X),baz(X))` means that if `foo(X)` is
	763	true, then `bar(X)` is also true, and `baz(X)` is true as well.
	764
	765	I considered making `or(X,Y)` a built-in as well, but after some
	766	thought, wasn't convinced that it was that valuable in the kinds
	767	of proofs I wanted to write.
	768
	769	Lastly, the third thing I had to come to terms with was, in general,
	770	how we know a stated proof is complete. As I've tried to describe
	771	above, we know it's complete because each of the cases maps to a
	772	possible rewrite rule, and induction maps to the inductive definition
	773	of a type predicate, which we know is well-founded because of the
	774	checks Madison does on it (ultimately based on the assumption that
	775	the subterm property is well-founded.) There Is Nothing Else.
	776
	777	This gets a little more complicated when you get into proofs by
	778	induction. The thing there is that we can assume the property
	779	we want to prove, in one of the cases (the inductive case) of the
	780	proof, so long as we have already proved all the other cases (the
	781	base case.) This is perfectly sound in proofs by hand, so it is
	782	likewise perfectly sound in a formal proof checker like Madison;
	783	the question is how Madison "knows" that it is sound, i.e. how it
	784	can be programmed to reject proofs which are not structured this
	785	way. Well, if we limit it to what I've just described above --
	786	check that the scope of the universal quantification is well-
	787	founded, check that there are two cases, and check that we've already
	788	proved one case, then allow the inductive hypothesis to be used as a
	789	rewrite rule in the other case of the proof -- this is not difficult
	790	to see how it could be mechanized.
	791
	792	However, this is also very limited. Let's talk about limitations.
	793
	794	For real data structures, you might well have multiple base cases;
	795	for example, a tree with two kinds of leaf nodes. Does this start
	796	breaking down? Probably. It probably breaks down with multiple
	797	inductive cases, as well, although you might be able to get around
	798	that with breaking the proof into multiple proofs, and having
	799	subsequent proofs rely on properties proved in previous proofs.
	800
	801	Another limitation I discovered when trying to write a proof that
	802	addition in Peano arithmetic is commutative. It seemingly can't
	803	be done in Madison as it currently stands, as Madison only knows
	804	how to rewrite something into something else, and cannot express
	805	the fact that two things (like `add(A,B)` and `add(B,A)`) rewrite
	806	to the same thing. Such a facility would be easy enough to add,
	807	and may appear in a future version of Madison, possibly with a
	808	syntax like:
	809
	810	theorem
	811	forall A where nat(A)
	812	forall B where nat(B)
	813	add(A,B) ~=~ add(B,A)
	814	proof ...
	815
	816	You would then show that `add(A,B)` reduces to something, and
	817	that `add(B,A)` reduces to something, and Madison would check
	818	that the two somethings are in fact the same thing. This is, in
	819	fact, a fairly standard method in the world of term rewriting.
	820
	821	As a historical note, Madison is one of the pieces of fallout from
	822	the overly-ambitious project I started a year and a half ago called
	823	Rho. Rho was a homoiconic rewriting language with several very
	824	general capabilities, and it wasn't long before I decided it was
	825	possible to write proofs in it, as well as the other things it was
	826	designed for. Of course, this stretched it to about the limit of
	827	what I could keep track of in a single project, and it was soon
	828	afterwards abandoned. Other fallout from Rho made it into other
	829	projects of mine, including Pail (having `let` bindings within
	830	the names of other `let` bindings), Falderal (the test suite from
	831	the Rho implementation), and Q-expressions (a variant of
	832	S-expressions, with better quoting capabilities, still forthcoming.)
	833
	834	Happy proof-checking!
	835	Chris Pressey
	836	December 2, 2011
	837	Evanston, Illinois

+403

-0

mdpn/doc/mdpn.html less more

	0	<html>
	1	<head>
	2	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
	3	<meta name="Description" content="Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation">
	4	<title>Cat's Eye Technologies: MDPN: Multi-Dimensional Pattern Notation</title>
	5	<!-- begin html doc dynamic markup -->
	6	<script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
	7	<script type="text/javascript" src="/scripts/documentation.js"></script>
	8	<!-- end html doc dynamic markup -->
	9	</head>
	10	<body>
	11
	12	<h1>Multi-Directional Pattern Notation</h1>
	13	<p>Final - Sep 6 1999</p>
	14	<hr>
	15
	16	<h2>Introduction</h2>
	17
	18	<p>MDPN is an extension to EBNF, which attributes it for
	19	the purposes of scanning and parsing input files which
	20	assume a non-unidirectional form. A familiarity with
	21	EBNF is assumed for the remainder of this document.</p>
	22
	23	<p>MDPN was developed by Chris Pressey in late 1998,
	24	built on an earlier, less successful attempt at a "2D EBNF" devised
	25	to fill a void that the
	26	mainstream literature on parsing seemed to rarely if
	27	ever approach, with much help provided by John Colagioia
	28	throughout 1998.</p>
	29
	30	<p>MDPN has possible uses in the construction of
	31	parsers and subsequently compilers for multi-directional and
	32	multi-dimensional languages such as Orthogonal, Befunge,
	33	Wierd, Blank, Plankalkül, and even less contrived notations
	34	like structured Flowchart and Object models of systems.</p>
	35
	36	<p>As the name indicates, MDPN provides a notation for
	37	describing multidimensional patterns by
	38	extending the concept of linear scanning and matching
	39	with geometric attributes in a given number of dimensions.</p>
	40
	41	<h2>Preconditions for Multidirectional Parsing</h2>
	42
	43	<p>The multidirectional parsing that MDPN concerns itself
	44	with assumes that any portion of the input file is accessable
	45	at any time. Concepts such as LL(1) are fairly meaningless
	46	in a non-unidirectional parsing system of this sort.
	47	The unidirectional input devices such as paper tape and
	48	punch cards that were the concern of original parsing methods
	49	have been superceded
	50	by modern devices such as hard disk drives and ample, cheap RAM.</p>
	51
	52	<p>In addition, MDPN is limited to an orthogonal
	53	representation of the input file, and this
	54	document is generally less concerned about forms of
	55	four or higher dimensions, to reduce unnecessary complexity.</p>
	56
	57	<h2>Notation from EBNF</h2>
	58
	59	<p>Syntax is drawn from EBNF. It is slightly modified, but
	60	should not surprise anyone who is familiar with EBNF.</p>
	61
	62	<p>A freely-chosen unadorned ('bareword') alphabetic
	63	multicharacter identifier indicates
	64	the name of a nonterminal (named pattern) in the
	65	grammar. e.g. <code>foo</code>. (Single characters have special
	66	meaning as operators.) Case matters: <code>foo</code> is not the
	67	same name as <code>Foo</code> or <code>FOO</code>.</p>
	68
	69	<p>Double quotes begin and end literal terminals (symbols.)
	70	e.g. <code>"bar"</code>.</p>
	71
	72	<p>A double-colon-equals-sign (<code>::=</code>)
	73	describes a production (pattern match)
	74	by associating a single nonterminal on the left with
	75	a pattern on the right, terminated with a period.
	76	e.g. <code>foo ::= "bar".</code></p>
	77
	78	<p>A pattern is a series of terminals, nonterminals,
	79	operators, and parenthetics.</p>
	80
	81	<p>The <code>\|</code> operator denotes
	82	alternatives. e.g. <code>"foo" \| "bar"</code></p>
	83
	84	<p>The <code>(</code> and <code>)</code> parentheses denote
	85	precedence and grouping.</p>
	86
	87	<p>The <code>[</code> and <code>]</code> brackets denote
	88	that the contents may be omitted, that is, they
	89	may occur zero or one times. e.g. <code>"bar" ["baz"]</code></p>
	90
	91	<p>The <code>{</code> and <code>}</code> braces denote
	92	that the contents may be omitted or may be repeated any number of times.
	93	e.g. <code>"bar" {baz "quuz"}</code></p>
	94
	95	<h2>Deviations from EBNF</h2>
	96
	97	<p>The input file is spatially related to a
	98	coordinate system and it is useful to
	99	think of the input mapped to an orthogonally
	100	distributed (Cartesian) form with no arbitrary
	101	limit imposed on its size, hereinafter referred to
	102	as <i>scan-space</i>.</p>
	103
	104	<p>The input file is mapped to scan-space. The
	105	first printable character in the input file always
	106	maps to the <i>origin</i> of scan-space regardless of
	107	the number of dimensions. The origin is enumerated
	108	with coordinates (0) in one dimension, (0,0) in two
	109	dimensions, (0,0,0) in three dimensions, etc.</p>
	110
	111	<p>Scan-space follows the 'computer storage' co-ordinate
	112	system so that <i>x</i> coordinates increase to
	113	the 'east' (rightwards), <i>y</i> coordinates increase to the
	114	'south' (downwards), and <i>z</i> coordinates increase on each
	115	successive 'page'.</p>
	116
	117	<p>Successive characters in the input file indicate
	118	successive coordinate (<i>x</i>) values in scan-space.
	119	For two and three dimensions, end-of-line markers are assumed
	120	to indicate "reset the <i>x</i> dimension and increment
	121	the <i>y</i> dimension", and end-of-page markers
	122	indicate "reset the <i>y</i> dimension and increment
	123	the <i>z</i> dimension",
	124	thus following the
	125	commonplace mapping of computer text printouts.</p>
	126
	127	<p>Whitespace in the input file are <b>not</b> ignored.
	128	The terminal <code>" "</code>, however, will match any
	129	whitespace (including tabs, which are <b>not</b> expanded.)
	130	The pattern <code>{" "}</code> may be used to indicate
	131	any number of whitespaces; <code>" " {" "}</code> may be used to
	132	indicate one or more whitespaces. Areas of scan-space
	133	beyond the bounds of the input file are considered to be filled
	134	with whitespaces.</p>
	135
	136	<p>Therefore, <code>"hello"</code> as a terminal is exactly
	137	the same as <code>"h" "e" "l" "l" "o"</code> as an pattern of
	138	terminals.</p>
	139
	140	<p>A <code>}</code> closing brace can be followed by a <code>^</code>
	141	(<i>constraint</i>)
	142	operator, which is followed by an expression in parentheses.</p>
	143
	144	<ul>
	145	<p>This expression is actually in a subnotation which supports
	146	a very simple form of algebra. The expression (built with
	147	terms connected by infix <code>+-*/%</code> operators with their
	148	C language meanings) can either reduce to
	149	<ul>
	150	<li>a constant value, as in <code>{"X"} ^ (5)</code>, which
	151	would match five <code>X</code> terminals in a line; or
	152	<li>an unknown value, which can involve any single lowercase
	153	letters, which indicate variables local to the production,
	154	as in <code>{"+"}^(x) {"-"}^(x*2)</code>, which would match only twice
	155	as many minus signs as plus signs.
	156	</ul></p>
	157
	158	<p>Complex algebraic expressions in constraints can and probably
	159	should be avoided when constructing a MDPN grammar
	160	for a real (non-contrived) compiler. MDPN-based
	161	compiler-compilers aren't expected to support more than
	162	one or two unknowns per expression, for example.
	163	There is no such restriction, of course,
	164	when using MDPN as a guide for hand-coding a multidimensional
	165	parser, or otherwise using it as a more
	166	sophisticated pattern-matching tool.</p>
	167	</ul>
	168
	169	<h2>The Scan Pointer</h2>
	170
	171	<p>It is useful to imagine a <i>scan pointer</i> (SP,
	172	not to be confused with a <i>stack pointer</i>, which is not
	173	the concern of this document) which is analogous to the
	174	current token in a single-dimensional parser, but exists
	175	in MDPN as a free spatial relationship to the input file, and
	176	thus also has associated geometric attributes such as
	177	direction.</p>
	178
	179	<p>The SP's <i>location</i> is advanced through scan-space by its
	180	<i>heading</i> as terminals in the productions
	181	are successfully matched with symbols in the input buffer.</p>
	182
	183	<p>The following geometric attribution operators modify the
	184	properties of the SP. Note that injudicious
	185	use of any of these operators <i>can</i> result in an infinite loop
	186	during scanning. There is no built-in contingency measure to
	187	escape from an infinite parsing loop in MDPN (but see
	188	exclusivity, below, for a possible way to overcome this.)</p>
	189
	190	<p><code>t</code> is the relative translation operator. It is
	191	followed by a vector, in parentheses, which is added to the
	192	location of the SP. This does not change its heading.</p>
	193
	194	<p>For example, <code>t (0,-1)</code> moves the
	195	SP one symbol above the current symbol (the symbol which was
	196	<i>about</i> to be matched.)</p>
	197
	198	<p>As a more realistic example of how this works, consider that the
	199	pattern <code>"." t(-1,1) "!" t(0,-1)</code> will match a
	200	period with an exclamation point directly below it, like:
	201
	202	<pre>.
	203	!</pre>
	204	</p>
	205
	206	<p><code>r</code> is the relative rotation operator. It is
	207	followed by an axis identifier (optional: see below)
	208	and an orthogonal angle (an angle <i>a</i>
	209	such that \|<i>a</i>\| <b>mod</b> 90 degrees = 0) assumed to
	210	be measured in degrees, both in parentheses.
	211	The angle is added to the SP's heading.
	212	Negative angle arguments are allowed.</p>
	213
	214	<p>Described in two dimensions, the (default) heading 0 denotes 'east,' that
	215	is, parsing character by character in a rightward direction, where
	216	the SP's <i>x</i> axis coordinate increases and all other axes coordinates
	217	stay the same. Increasing angles ascend counterclockwise (90 = 'north',
	218	180 = 'west', 270 = 'south'.)</p>
	219
	220	<p>For example, <code>">" r(-90) "+^"</code> would match
	221
	222	<pre>>+
	223	^</pre>
	224	</p>
	225
	226	<p>The axis identifier indicates which axis this rotation
	227	occurs around. If the axis identifier is omitted,
	228	the <i>z</i> axis is to be assumed, since this is certainly the
	229	most common axis to rotate about, in two dimensions.</p>
	230
	231	<p>If the axis identifier is present, it may be a single letter
	232	in the set <code>xyz</code> (these unsurprisingly indicate
	233	the <i>x</i>, <i>y</i>, and <i>z</i> dimensions respectively),
	234	or it may be a non-negative integer value, where 0 corresponds to the
	235	<i>x</i> dimension, 1 corresponds to the <i>y</i> dimension, etc.
	236	(Implementation note: in more than two dimensions,
	237	the SP's heading property should probably be broken up
	238	internally into theta, rho, &c components as appropriate.)</p>
	239
	240	<p>For example, <code>r(z,180)</code> rotates the SP's heading
	241	180 degrees about the <i>z</i> (dimension #2) axis, as does <code>r(2,180)</code>
	242	or even just <code>r(180)</code>.</p>
	243
	244	<p><code><</code> and <code>></code> are the push
	245	and pop state-stack operators, respectively. Alternately,
	246	they can be viewed as lookahead-assertion parenthetics, since the stack
	247	is generally assumed to be
	248	local to the production. (Compiler-compilers should probably
	249	notify the user, but not necessarily panic, if they find unbalanced
	250	<code><></code>'s.)</p>
	251
	252	<p>All properties of the SP (including location and heading,
	253	and scale factor if supported) are pushed as a group
	254	onto the stack during <code><</code> and popped as a group
	255	off the stack during <code>></code>.</p>
	256
	257	<h2>Advanced SP Features</h2>
	258
	259	<p>These features are not absolutely necessary for most
	260	non-contrived multi-directional grammars. MDPN compiler-compilers
	261	are not expected to support them.</p>
	262
	263	<p><code>T</code> is the absolute translation operator. It is
	264	followed by a vector which is assigned to the location of the SP.
	265	e.g. <code>T (0,0)</code> will 'home' the scan.</p>
	266
	267	<p><code>R</code> is the absolute rotation operator. It is
	268	followed by an optional axis identifier, and an
	269	orthogonal angle assumed to be measured in degrees.
	270	The SP's heading is set to this
	271	angle. e.g. <code>R(270)</code> sets the
	272	SP scanning line after line down the input text, downwards. See
	273	the <code>r</code> operator, above, for how the axis identifier functions.</p>
	274
	275	<p><code>S</code> is the absolute scale operator. It is
	276	followed by an orthogonal <i>scaling factor</i> (a scalar <i>s</i>
	277	such that <i>s</i> = <b>int</b>(<i>s</i>) and <i>s</i> >= 1).
	278	The SP's scale factor is set to this value. The finest
	279	possible scale, 1, indicates a 1:1 map with the input file;
	280	for each one input symbol matched, the SP advances one symbol
	281	in its path. When the scale factor is two, then for each
	282	one input symbol matched, the SP advances two symbols, skipping
	283	over an interim symbol. Etc.</p>
	284
	285	<p><code>s</code> is the relative scale operator. It is
	286	followed by a scalar integer which is added to the SP's scaling factor
	287	(so long as it does not cause the scaling factor to be zero or negative.)</p>
	288
	289	<p>Scale operators may also take an optional axis identifier
	290	(as in <code>S(y,2)</code>), but when the axis identifier is omitted,
	291	all axes are assumed (non-distortional scaling).</p>
	292
	293	<p><code>!></code> is a state-assertion alternative to
	294	<code>></code>, for the purpose of determining that
	295	the SP successfully and completely
	296	reverted to a previous state that
	297	was pushed onto the stack ('came full circle'). This
	298	operator is something of a luxury; a
	299	grammar which uses constraints correctly should
	300	never <i>need</i> it, but it can come in handy.</p>
	301
	302	<h2>Other Advanced Features: Exclusivity</h2>
	303
	304	<p>Lastly, in the specification of a production,
	305	the <i>exclusivity</i> applying to that production can be given
	306	between a hyphen following
	307	the name of the nonterminal, and the <code>::=</code> operator.</p>
	308
	309	<p>Exclusivity is a list of productions, named by their
	310	nonterminals, and comes into play at any particular <i>instance</i>
	311	of the production (i.e. when the production successfully
	312	matches specific symbols at specific points in scan-space during a parse,
	313	called the <i>domain</i>.) The exclusivity describes how the domain of each
	314	instance is protected from being the domain of any further instances.
	315	The domain of any subsequent instances of any productions listed
	316	in the exclusivity is restricted from sharing points in scan-space
	317	with the established domain.</p>
	318
	319	<p>Exclusivity is a measure to prevent so-called <i>crossword
	320	grammars</i> - that is, where instances of productions
	321	can <i>overlap</i> and share common symbols - if desired.
	322	Internally it's generally considered a list of
	323	'used-by-this-production' references
	324	associated with each point in scan-space.
	325	An example of the syntax to specify exclusivity is
	326	<code>bar - bar quuz ::= foo {"a"} baz</code>.
	327	Note that the domain of an instance of <code>bar</code> is the
	328	sum of the
	329	domains <code>foo</code>, <code>baz</code> and the chain of "<code>a</code>" terminals,
	330	and that neither a subsequent instance of <code>quuz</code> nor <code>bar</code>
	331	again can overlap it.</p>
	332
	333	<h2>Examples of MDPN-described Grammars</h2>
	334
	335	<p><b>Example 1.</b> A grammar for describing boxes.</p>
	336
	337	<p>The task of writing a translator to recognize a
	338	two-dimensional construct such as a box can easily
	339	be realized using a tool such as MDPN.</p>
	340
	341	<p>An input file might contain a of box drawn in
	342	ASCII characters, such as
	343
	344	<pre>+------+
	345	\| \|
	346	\| \|
	347	+------+</pre>
	348	</p>
	349
	350	<p>Let's also say that boxes have a minimum height of four (they
	351	must contain at least two rows), but
	352	no minimum width. Also, edge characters must match up with
	353	which edge they are on. So, the following forms are both
	354	illegal inputs:
	355
	356	<pre>+-+
	357	+-+</pre>
	358
	359	<pre>+-\|-+
	360	\| \|
	361	*
	362	\| \|
	363	+-\|-+</pre>
	364	</p>
	365
	366	<p>The MDPN production used to describe this box might be
	367
	368	<pre> Box ::= "+" {"-"}^(w) r(-90) "+" "\|\|" {"\|"}^(h) r(-90)
	369	"+" {"-"}^(w) r(-90) "+" "\|\|" {"\|"}^(h) r(-90).</pre>
	370	</p>
	371
	372	<p><b>Example 2.</b> A simplified grammar for Plankalkül's assignments.</p>
	373
	374	<p>An input file might contain an ASCII approximation of something
	375	Zuse might have jotted down on paper:
	376
	377	<pre> \|Z + Z => Z
	378	V\|1 2 3
	379	S\|1.n 1.n 1.n</pre>
	380	</p>
	381
	382	<p>Simplified MDPN productions used to describe this might be
	383
	384	<pre>
	385	Staff ::= Spaces "\|" TempVar AddOp TempVar Assign TempVar.<br>
	386	TempVar ::= "Z" t(-1,1) Index t(-1,1) Structure t(0,-2) Spaces.
	387	Index ::= <Digit {Digit}>.
	388	Digit ::= "0" \| "1" \| "2" \| "3" \| "4" \| "5" \| "6" \| "7" \| "8" \| "9".
	389	Structure ::= <Digit ".n">.
	390	AddOp ::= ("+" \| "-") Spaces.
	391	Assign ::= "=>" Spaces.
	392	Spaces ::= {" "}.
	393	</pre>
	394	</p>
	395
	396	<hr>
	397	<i>Last Updated Mar 2 <img src="/images/icons/copyright.gif"
	398	align=absmiddle width=12 height=12 alt="(c)" border=0>2004 <a
	399	href="/">Cat's Eye Technologies</a>.
	400	All rights reserved.
	401	This document may be freely redistributed in its unchanged form.</i>
	402	</body></html>

+545

-0

oozlybub-and-murphy/doc/oozlybub-and-murphy.html less more

	0	<?xml version="1.0"?>
	1	<!-- encoding: utf-8 -->
	2	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
	3	<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
	4	<head>
	5	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
	6	<title>The Oozlybub and Murphy Programming Language</title>
	7	<!-- begin html doc dynamic markup -->
	8	<script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
	9	<script type="text/javascript" src="/scripts/documentation.js"></script>
	10	<!-- end html doc dynamic markup -->
	11	</head>
	12	<body>
	13
	14	<h1>Oozlybub and Murphy</h1>
	15
	16	<p>Language version 1.1</p>
	17
	18	<h2>Overview</h2>
	19
	20	<p>This document describes a new programming language.
	21	The name of this language is <dfn>Oozlybub and Murphy</dfn>.
	22	Despite appearances, this name refers to a single language.
	23	The majority of the language is
	24	named Oozlybub. The fact that the language is not entirely named
	25	Oozlybub is named Murphy. Deal with it.</p>
	26
	27	<p>For the sake of providing an "olde tyme esoterickal de-sign", the
	28	language combines several unusual features, including multiple interleaved parse
	29	streams, infinitely long variable names, gratuitously strong typing, and
	30	only-conjectural Turing completeness.
	31	While no implementation of the language exists as of this writing, it is thought to
	32	be sufficiently consistent to be implementable, modulo any errors in this docunemt.</p>
	33
	34	<p>In places the language may resemble <a href="/projects/smith/">SMITH</a>
	35	and <a href="/projects/quylthulg/">Quylthulg</a>, but this was not intended,
	36	and the similarities are purely emergent.</p>
	37
	38	<h2>Program Structure</h2>
	39
	40	<p>A Oozlybub and Murphy program consists of a number of <dfn>variables</dfn> and
	41	a number of objects called <dfn>dynasts</dfn>. A Oozlybub and Murphy program text
	42	consists of multiple <dfn>parse streams</dfn>. Each parse stream contains
	43	zero or more variable declarations, and optionally a single dynast.</p>
	44
	45	<h3>Parse Streams</h3>
	46
	47	<p>A parse stream is just a segment, possibly non-contiguous, of the text of
	48	a Oozlybub and Murphy program. A program starts out with a single parse stream, but
	49	certain <dfn>parse stream manipulation pragmas</dfn> can change this.
	50	These pragmas have the form <code>{@<var>x</var>}</code> and
	51	have a similar syntactic status as comments; they can appear anywhere except
	52	inside a lexeme.</p>
	53
	54	<p>Parse streams are arranged as a ring (a cyclic doubly linked list.) When parsing
	55	of the program text begins initially, there is already a single pre-created parse stream.
	56	When the program text ends, all parse streams which may be active are deleted.</p>
	57
	58	<p>The meanings of the pragmas are:</p>
	59
	60	<ul>
	61	<li><code>{@+}</code> Create a new parse stream to the right of the current one.</li>
	62	<li><code>{@>}</code> Switch to the parse stream to the right of the current one.</li>
	63	<li><code>{@<}</code> Switch to the parse stream to the left of the current one.</li>
	64	<li><code>{@-}</code> Delete the current parse stream. The parse stream to the
	65	left of the deleted parse stream will become the new current parse stream.</li>
	66	</ul>
	67
	68	<p>Deleting a parse stream while it contains an unfinished syntactic construct is a syntax
	69	error, just as an end-of-file in that circumstance would be in most other languages.</p>
	70
	71	<p>Providing a concrete example of parse streams in action will be difficult in the
	72	absence of defined syntax for the rest of Oozlybub and Murphy, so we will, for the purposes of
	73	the following demonstration only, pretend that the contents of a parse stream is a sentence of English.
	74	Here is how three parse streams might be managed:</p>
	75
	76	<p><code>The quick {@+}brown{@>}Now is the time{@<}fox{@<} for all good
	77	men to {@+}{@>}Wherefore art thou {@>} jumped over {@>}{@>}Romeo?{@-}
	78	come to the aid of {@>}the lazy dog's tail.{@-}their country.{@-}
	79	</code></p>
	80
	81	<h3>Variables</h3>
	82
	83	<p>All variables are declared in a block at the beginning of a parse stream.
	84	If there is also a dynast in that stream, the variables are private to that dynast;
	85	otherwise they are global and shared by all dynasts. (<em>Defined in 1.1</em>)
	86	Any dynamically created dynast gets its own private copies of any private
	87	variables the original dynast had; they will initially hold the values they
	88	had in the original, but they are not shared.</p>
	89
	90	<p>The name of a variable in Oozlybub and Murphy is not a fixed, finite-length string of symbols,
	91	as you would find in other programming languages. No sir! In Oozlybub and Murphy, each
	92	variable is named by a possibly-infinite set of strings
	93	(over the alphanumeric-plus-spaces alphabet <code>[a-zA-Z0-9 ]</code>),
	94	at least one of which must be infinitely long. (<em>New in 1.1</em>: spaces
	95	[but no other kinds of whitespace] are allowed in these strings.)</p>
	96
	97	<p>To accomodate this method of identifying a variable, in Oozlybub and Murphy programs,
	98	which are finite, variables are identified using regular expressions
	99	which match their set of names.
	100	An equivalence class of regular expressions is a set of all regular expressions which
	101	accept exactly the same set of strings; each equivalence class of regular expressions
	102	refers to the same, unique Oozlybub and Murphy variable.</p>
	103
	104	<p>(In case you wonder about the implementability of this:
	105	Checking that two regular expressions are
	106	equivalent is decidable: we convert them both to NFAs, then
	107	to DFAs, then minimize those DFAs, then check if the transition
	108	graphs of those DFAs are isomorphic.
	109	Checking that the regular expression accepts at least
	110	one infinitely-long string is also decidable: just look for
	111	a cycle in the DFA's graph.)</p>
	112
	113	<p>Note that these identifier-sets need not be disjoint. <code>/ma*/</code>
	114	and <code>/mb*/</code> are distinct variables, even though both contain the
	115	string <code>m</code>. (Note also that we are fudging slightly on how we
	116	consider to have described an infinitely long name; technically we would
	117	want to have a Büchi automaton that specifies an unending repetition with
	118	<sup>ω</sup> instead of *. But the distinction is subtle enough in this
	119	context that we're gonna let it slide.)</p>
	120
	121	<p>Syntax for giving a variable name is fairly straightforward: it is delimited on
	122	either side by <code>/</code> symbols; the alphanumeric symbols are literals;
	123	textual concatenation is regular expression sequencing,
	124	<code>\|</code> is alteration, <code>(</code> and <code>)</code> increase precedence,
	125	and <code>*</code> is Kleene asteration (zero or more occurrences).
	126	Asteration has higher precedence than sequencing, which has higher precedence
	127	than alteration. Because none of these operators is alphanumeric nor a space, no
	128	escaping scheme needs to be installed.</p>
	129
	130	<p>Variables are declared with the following syntax (<code>i</code> and <code>a</code>
	131	are the types of the variables, described in the next section):</p>
	132
	133	<pre>
	134	VARIABLES ARE i /pp/, i /qq/, a /(0\|1)*/.
	135	</pre>
	136
	137	<p>This declares an integer variable identified by the names {<code>p</code>, <code>pp</code>, <code>ppp</code>, ...},
	138	an integer variable identified by the names {<code>q</code>, <code>qq</code>, <code>qqq</code>, ...},
	139	and an array variable identified by the names of all strings of <code>0</code>'s and <code>1</code>'s.</p>
	140
	141	<p>When not in wimpmode (see below), any regular expression which denotes a
	142	variable may not be literally repeated anywhere else in the program. So in the above example,
	143	it would not be legal to refer to <code>/pp*/</code> further down in the program; an equivalent regular
	144	expression, such as <code>/p\|ppp/</code> or <code>/pp/</code> or
	145	<code>/pp\|pp\|pp*/</code> would have to be used instead.</p>
	146
	147	<h3>Types</h3>
	148
	149	<p>Oozlybub and Murphy is a statically-typed language, in that variables
	150	as well as values have types, and a value of one type cannot be stored in
	151	a variable of another type. The types of values, however, are not entirely
	152	disjoint, as we will see, and special considerations may arise for checking
	153	and conversion because of this.</p>
	154
	155	<p>The basic types are:</p>
	156
	157	<ul>
	158	<li><code>i</code>, the type of integers.
	159	<p>These are integers of unbounded extent, both positive and negative.
	160	Literal constants of type <code>i</code> are given in the usual decimal format.
	161	Variables of this type initially contain the value 0.</p>
	162	</li>
	163
	164	<li><code>p</code>, the type of prime numbers.
	165	<p>All prime numbers are integers but not all integers are prime numbers.
	166	Thus, values of prime number type will automatically be coerced to
	167	integers in contexts that require integers; however the reverse is not true,
	168	and in the other direction a conversion function (<code>P?</code>) must
	169	be used. There are no literal constants of type <code>p</code>. Variables
	170	of this type initially contain the value 2.</p>
	171	</li>
	172
	173	<li><code>a</code>, the type of arrays of integers.
	174	<p>An integer array has an integer index which is likewise of unbounded
	175	extent, both positive and negative. Variables of this type initially
	176	contain an empty array value, where all of the entries are 0.</p>
	177	</li>
	178
	179	<li><code>b</code>, the type of booleans.
	180	<p>A boolean has two possible values, <code>true</code> and <code>false</code>.
	181	Note that there are no literal constants of type <code>b</code>; these must be
	182	specified by constructing a tautology or contradiction with boolean (or other) operators.
	183	It is illegal to retrieve the value of a variable of this type before first assigning it,
	184	except to construct a tautology or contradiction.</p>
	185	</li>
	186
	187	<li><code>t</code>, the type of truth-values.
	188	<p>A truth-value has two possible values, <code>yes</code> and <code>no</code>.
	189	There are no literal constants of type <code>t</code>.
	190	It is illegal to retrieve the value of a variable of this type before first assigning it,
	191	except to construct a tautology or contradiction.</p>
	192	</li>
	193
	194	<li><code>z</code>, the type of bits.
	195	<p>A bit has two possible values, <code>one</code> and <code>zero</code>.
	196	There are no literal constants of type <code>z</code>.
	197	It is illegal to retrieve the value of a variable of this type before first assigning it,
	198	except to construct a tautology or contradiction.</p>
	199	</li>
	200
	201	<li><code>c</code>, the type of conditions.
	202	<p>A condition has two possible values, <code>go</code> and <code>nogo</code>.
	203	There are no literal constants of type <code>c</code>.
	204	It is illegal to retrieve the value of a variable of this type before first assigning it,
	205	except to construct a tautology or contradiction.</p>
	206	</li>
	207
	208	</ul>
	209
	210	<h3>Wimpmode</h3>
	211
	212	<p>(<em>New in 1.1</em>) An Oozlybub and Murphy program is in <dfn>wimpmode</dfn> if it
	213	declares a global variable of integer type which matches the string <code>am a wimp</code>,
	214	for example:</p>
	215
	216	<pre>
	217	VARIABLES ARE i /am a wimp/.
	218	</pre>
	219
	220	<p>Certain language constructs, noted in this document as such,
	221	are only permissible in wimpmode.
	222	If they are used in a program in which wimpmode is not in effect, a
	223	compile-time error shall occur and the program shall not be executed.</p>
	224
	225	<h3>Dynasts</h3>
	226
	227	<p>Each dynast is labeled with a positive integer and contains an expression.
	228	Only one dynast may be denoted in any given parse stream, but dynasts may also
	229	be created dynamically during program execution.</p>
	230
	231	<p>Program execution begins at the lowest-numbered dynast that exists in the initial program.
	232	When a dynast is executed, the expression of that dynast is evaluated for its side-effects.
	233	If there is a dynast labelled with the next higher integer (i.e. the successor of the label
	234	of the current dynast),
	235	execution continues with that dynast; otherwise, the program halts.
	236	Once a dynast has been executed, it continues to exist until the program
	237	halts, but it may never be executed again.</p>
	238
	239	<p>Evaluation of an expression may have side-effects, including
	240	writing characters to an output channel, reading characters from
	241	an input channel, altering the value of a variable, and creating
	242	a new dynast.</p>
	243
	244	<p>Dynasts are written with the syntax <code>dynast(<var>label</var>) <-> <var>expr</var></code>. A concrete example follows:</p>
	245
	246	<pre>
	247	dynast(100) <-> for each prime /p/ below 1000 do write (./p\|p/+1.)
	248	</pre>
	249
	250	<h3>TRIVIA PORTION OF SHOW</h3>
	251
	252	<p>WHO WAS IT FAMOUS MAN THAT SAID THIS?</p>
	253
	254	<ul>
	255	<li>A) RONALD REAGAN</li>
	256	<li>B) RONALD REAGAN</li>
	257	<li>B) RONALD STEWART</li>
	258	<li>C) RENALDO</li>
	259	</ul>
	260
	261	<p>contestant enters lightning round now</p>
	262
	263	<h3>Expressions</h3>
	264
	265	<p>In the following, the letter preceding <var>-expr</var> or <var>-var</var> indicates
	266	the expected type, if any, of that expression or variable. Where the expressions listed below are infix
	267	expressions, they are listed from highest to lowest precedence. Unless noted otherwise,
	268	subexpressions are evaluated left to right.</p>
	269
	270	<ul>
	271
	272	<li><code>(.<var>expr</var>.)</code>
	273	<p>Surrounding an expression with <dfn>dotted parens</dfn> gives it that precedence boost
	274	that's just the thing to have it be evaluated before the expression it's in, but
	275	there is a catch. The number of parens in the dotted parens expression must
	276	match the nesting depth in the following way: if a set of dotted parens is
	277	nested within <var>n</var> dotted parens, it must contain fib(<var>n</var>)
	278	parens, where fib(<var>n</var>) is the <var>n</var>th member of the
	279	Fibonacci sequence. For example, <code>(.(.0.).)</code> and <code>(.(.((.(((.(((((.0.))))).))).)).).)</code>
	280	are syntactically well-formed expressions (when not nested in any other dotted paren
	281	expression), but <code>(.(((.0.))).)</code> and
	282	<code>(.(.(.0.).).)</code> are not.</p>
	283	</li>
	284
	285	<li><code><var>var</var></code>
	286	<p>A variable evaluates to the value it contains at that point in execution.</p>
	287	</li>
	288
	289	<li><code>0</code>, <code>1</code>, <code>2</code>, <code>3</code>, etc.
	290	<p>Decimal literals evaluate to the expected value of type <code>i</code>.</p>
	291	</li>
	292
	293	<li><code>#myself#</code>
	294	<p>This special nullary token evaluates to the numeric label of the currently
	295	executing dynast.</p>
	296	</li>
	297
	298	<li><code><var>var</var> := <var>expr</var></code>
	299	<p>Evaluates the <var>expr</var> and stores the result in the specified variable.
	300	The variable and the expression must have the same type. Evaluates to whatever
	301	<var>expr</var> evaluated to.</p>
	302	</li>
	303
	304	<li><code><var>a-expr</var> [<var>i-expr</var>]</code>
	305	<p>Evaluates to the <code>i</code> stored at the location in the
	306	array given by <var>i-expr</var>.</p>
	307	</li>
	308
	309	<li><code><var>a-expr</var> [<var>i-expr</var>] := <var>i-expr</var></code>
	310	<p>Evaluates the second <var>i-expr</var> and stores the result in the location
	311	in the array given by the first <var>i-expr</var>. Evaluates to whatever
	312	the second <var>i-expr</var> evaluated to.</p>
	313	</li>
	314
	315	<li><code><var>a-expr</var> ? <var>i-expr</var></code>
	316	<p>Evaluates to <code>go</code> if <code><var>a-expr</var> [<var>i-expr</var>]</code>
	317	and <code><var>i-expr</var></code> evaluate to the same thing, <code>nogo</code> otherwise.
	318	The <var>i-expr</var> is only evaluated once.</p>
	319	</li>
	320
	321	<li><code>minus <var>i-expr</var></code>
	322	<p>Evaluate to the integer that is zero minus the result of evaluating <var>i-expr</var>.</p>
	323	</li>
	324
	325	<li><code>write <var>i-expr</var></code>
	326	<p>Write the Unicode code point whose number is obtained by evaluating <var>i-expr</var>, to
	327	the standard output channel. Writing a negative number shall produce one of a number of
	328	amusing and informative messages which are not defined by this document.</p>
	329	</li>
	330
	331	<li><code>#read#</code>
	332	<p>Wait for a Unicode character to become available on the standard
	333	input channel and evaluate to its integer code point value.</p>
	334	</li>
	335
	336	<li><code>not? <var>z-expr</var></code>
	337	<p>Converts a bit value to a boolean value (<code>zero</code> becomes <code>true</code> and <code>one</code> becomes <code>false</code>).</p>
	338	</li>
	339
	340	<li><code>if? <var>b-expr</var></code>
	341	<p>Converts a boolean value to condition value (true becomes go and false becomes nogo).</p>
	342	</li>
	343
	344	<li><code>cvt? <var>c-expr</var></code>
	345	<p>Converts a condition value to a truth-value (<code>go</code> becomes <code>yes</code> and <code>nogo</code> becomes <code>no</code>).</p>
	346	</li>
	347
	348	<li><code>to? <var>t-expr</var></code>
	349	<p>Converts a truth-value to a bit value (<code>yes</code> becomes <code>one</code> and <code>no</code> becomes <code>zero</code>).</p>
	350	</li>
	351
	352	<li><code>P? <var>i-expr</var> [<var>t-var</var>]</code>
	353	<p>If the result of evaluating <var>i-expr</var> is a prime number, evaluates to that
	354	prime number (and has the type <code>p</code>). If it is not prime, stores the value <code>no</code>
	355	into <var>t-var</var> and evaluates to 2.</p>
	356	</li>
	357
	358	<li><code><var>i-expr</var> * <var>i-expr</var></code>
	359	<p>Evaluates to the product of the two <var>i-expr</var>s. The result is never
	360	of type <code>p</code>, but the implementation doesn't need to do anything
	361	based on that fact.</p>
	362	</li>
	363
	364	<li><code><var>i-expr</var> + <var>i-expr</var></code>
	365	<p>Evaluates to the sum of the two <var>i-expr</var>s.</p>
	366	</li>
	367
	368	<li><code>exists/dynast <var>i-expr</var></code>
	369	<p>Evaluates to <code>one</code> if a dynast exists with the given label,
	370	or <code>zero</code> if one does not.</p>
	371	</li>
	372
	373	<li><code>copy/dynast <var>i-expr</var>, <var>p-expr</var>, <var>p-expr</var></code>
	374	<p>Creates a new dynast based on an existing one. The existing one is identified by
	375	the label given in the <var>i-expr</var>. The new dynast is a copy of the existing
	376	dynast, but with a new label. The new label is the sum of the two <var>p-expr</var>s.
	377	If a dynast with that label already exists, the program terminates.
	378	(<em>Defined in 1.1</em>) This expression evaluates to the value of the given <var>i-expr</var>.</p>
	379	</li>
	380
	381	<li><code>create/countably/many/dynasts <var>i-expr</var>, <var>i-expr</var></code>
	382	<p>Creates a countably infinite number of dynasts based on an existing one. The existing one is identified by
	383	the label given in the first <var>i-expr</var>. The new dynasts are copies of the existing
	384	dynast, but with new labels. The new labels start at the first odd integer
	385	greater than the second <var>i-expr</var>, and consist of every odd integer greater than that.
	386	If any dynast with such a label already exists, the program terminates.
	387	(<em>Defined in 1.1</em>) This expression evaluates to the value of the first given <var>i-expr</var>.</p>
	388	</li>
	389
	390	<li><code><var>b-expr</var> and <var>b-expr</var></code>
	391	<p>Evaluates to <code>one</code> if both <var>b-expr</var>s are <code>true</code>,
	392	<code>zero</code> otherwise. Note that this is not short-circuting; both <var>b-expr</var>s
	393	are evaluated.</p>
	394	</li>
	395
	396	<li><code><var>c-expr</var> or <var>c-expr</var></code>
	397	<p>Evaluates to <code>yes</code> if either or both <var>c-expr</var>s are <code>go</code>,
	398	<code>no</code> otherwise. Note that this is not short-circuting; both <var>c-expr</var>s
	399	are evaluated.</p>
	400	</li>
	401
	402	<li><code>do <var>expr</var></code>
	403	<p>Evaluates the <var>expr</var>, throws away the result, and evaluates to <code>go</code>.</p>
	404	</li>
	405
	406	<li><code><var>c-expr</var> then <var>expr</var></code>
	407	<p><strong>Wimpmode only.</strong> Evaluates the <var>c-expr</var> on the left-hand side for its side-effects only,
	408	throwing away the result, then evaluates to the result of evaluating the right-hand
	409	side <var>expr</var>.</p>
	410	</li>
	411
	412	<li><code><var>c-expr</var> ,then <var>i-expr</var></code>
	413	<p>(<em>New in 1.1</em>) Evaluates the <var>c-expr</var> on the left-hand side; if it is <code>go</code>,
	414	evaluates to the result of evaluating the right-hand side <var>i-expr</var>;
	415	if it is <code>nogo</code>, evaluates to an unspecified and quite possibly random
	416	integer between 1 and 1000000 inclusive, without evaluating the right-hand side.
	417	Note that this operator has the same precedence as <code>then</code>.</p>
	418	</li>
	419
	420	<li><code>for each prime <var>var</var> below <var>i-expr</var> do <var>i-expr</var></code>
	421	<p>The <var>var</var> must be a declared variable of type <code>p</code>. The first
	422	<var>i-expr</var> must evaluate to an integer, which we will call <var>k</var>.
	423	The second <var>i-expr</var> is evaluated once for each prime number between
	424	<var>k</var> and 2, inclusive; each time it is evaluated, <var>var</var> is bound to
	425	a successively smaller prime number between <var>k</var> and 2.
	426	(<em>Defined in 1.1</em>) Evaluates to the result of the final evaluation of the second <var>i-expr</var>.</p>
	427	</li>
	428
	429	</ul>
	430
	431	<h3>Grammar</h3>
	432
	433	<p>This section attempts to capture and summarize the syntax rules (for a single
	434	parse stream) described above, using an EBNF-like syntax extended with a few
	435	ad-hoc annotations that I don't feel like explaining right now.</p>
	436
	437	<pre>
	438	ParseStream ::= VarDeclBlock {DynastLit}.
	439	VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".".
	440	VarDecl ::= TypeSpec VarName.
	441	TypeSpec ::= "i" \| "p" \| "a" \| "b" \| "t" \| "z" \| "c".
	442	VarName ::= "/" Pattern "/".
	443	Pattern ::= {[a-zA-Z0-9 ]}
	444	\| Pattern "\|" Pattern /* ignoring precedence here */
	445	\| Pattern "" / and here */
	446	\| "(" Pattern ")".
	447	DynastLit ::= "dynast" "(" Gumber ")" "<->" Expr.
	448	Expr ::= Expr1[c] {"then" Expr1 \| ",then" Expr1[i]}.
	449	Expr1 ::= Expr2[c] {"or" Expr2[c]}.
	450	Expr2 ::= Expr3[b] {"and" Expr3[b]}.
	451	Expr3 ::= Expr4[i] {"+" Expr4[i]}.
	452	Expr4 ::= Expr5[i] {"*" Expr5[i]}.
	453	Expr5 ::= Expr6[a] {"?" Expr6[i]}.
	454	Expr6 ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]].
	455	Prim ::= {"("}* "." Expr "." {")"}* /* remember the Fibonacci rule! */
	456	\| VarName [":=" Expr]
	457	\| Gumber
	458	\| "#myself#"
	459	\| "minus" Expr[i]
	460	\| "write" Expr[i]
	461	\| "#read#"
	462	\| "not?" Expr[z]
	463	\| "if?" Expr[b]
	464	\| "cvt?" Expr[c]
	465	\| "to?" Expr[t]
	466	\| "P?" Expr[i]
	467	\| "exists/dynast" Expr[i]
	468	\| "copy/dynast" Expr[i] "," Expr[p] "," Expr[p]
	469	\| "create/countably/many/dynasts"
	470	Expr[i] "," Expr[i]
	471	\| "do" Expr
	472	\| "for" "each" "prime" VarName "below"
	473	Expr[i] "do" Expr[i].
	474	Gumber ::= {[0-9]}.
	475	</pre>
	476
	477	<h3>Boolean Idioms</h3>
	478
	479	<p>Here we show how we can get any value of any of the <code>b</code>, <code>t</code>,
	480	<code>z</code>, and <code>c</code> types, without any constants or variables with
	481	known values of these types.</p>
	482
	483	<pre>
	484	VARIABLES ARE b /b*/.
	485	zero = /b\|b/ and not? to? cvt? if? /b\|b*/
	486	true = not? zero
	487	go = if? true
	488	yes = cvt? go
	489	one = to? yes
	490	false = not? one
	491	nogo = if? false
	492	no = cvt? nogo
	493	</pre>
	494
	495	<h3>Computational Class</h3>
	496
	497	<p>Because the single in-dynast looping construct, <code>for each prime below</code>, is
	498	always a finite loop, the execution of any fixed number of dynasts cannot be Turing-complete.
	499	We must create new dynasts at runtime, and continue execution in them, if we want
	500	any chance at being Turing-complete. We demonstrate this by showing an example of
	501	a (conjecturally) infinite loop in Oozlybub and Murphy, an idiom which will doubtless
	502	come in handy in real programs.</p>
	503
	504	<pre>
	505	VARIABLES ARE p /p/, p /q/.
	506	dynast(3) <->
	507	(. do (. if? not? exists/dynast 5 ,then
	508	create/countably/many/dynasts #myself#, 5 .) .) ,then
	509	(. for each prime /p*\|p/ below #myself#+2 do
	510	for each prime /q\|q/ below /p\|pp/+1 do
	511	if? not? exists/dynast /p\|p\|p/+/q\|q\|q/ ,then
	512	copy/dynast #myself#, /p\|ppp/, /q\|qqq/ .)
	513	</pre>
	514
	515	<p>As you can see, the ability to loop indefinitely in Oozlybub and Murphy hinges on whether
	516	Goldbach's Conjecture is true or not. Looping forever requires creating an unbounded number of
	517	new dynasts. We can create all the odd-numbered dynasts at once, but that won't be enough
	518	to loop forever, as we must proceed to the next highest numbered dynast after executing a
	519	dynast. So we must create new dynasts with successively higher even integer labels, and these
	520	can only be created by summing two primes. So, if Goldbach's conjecture is false, then there
	521	is some even number greater than two which is not the sum of two primes; thus there is some
	522	dynast that cannot be created by a running Oozlybub and Murphy program, thus it is not possible
	523	to loop forever in Oozlybub and Murphy, thus Oozlybub and Murphy is not Turing-complete
	524	(because it cannot simulate any Turing machine that loops forever.)</p>
	525
	526	<p>It should not however be difficult to show that Oozlybub and Murphy is Turing-complete under
	527	the assumption that Goldbach's Conjecture is true. If Goldbach's Conjecture is true, then the above program
	528	is an infinite loop. We need only add to it appropriate conditional instructions to,
	529	say, simulate the execution of an arbitrarily-chosen Turing machine. An array can serve as the
	530	tape, and an integer can serve as the head. Another integer can serve as the state of the finite
	531	control. The integer can be tested against various fixed integers by establishing an array for each
	532	of these fixed integers and using the <code>?</code> operator against each in turn; each branch
	533	can mutate the tape, tape head, and finite control as desired. The program can halt by neglecting
	534	to create a new even dynast to execute next, or by trying to create a dynast with a label that
	535	already exists.</p>
	536
	537	<p>Happy FLIMPING,
	538	<br/>Chris Pressey
	539	<br/>December 1, 2010
	540	<br/>Evanston, Illinois, USA
	541	</p>
	542
	543	</body>
	544	</html>

+120

-0

opus-2/doc/opus-2.html less more

	0	<html><head>
	1	<meta http-equiv="Content-Type" content="text/html;CHARSET=iso-8859-1">
	2	<title>Cat's Eye Technologies: The Opus-2 Language</title>
	3	</head>
	4	<body bgcolor="#ffffff" text="#000000" link="#1c3f7f"
	5	vlink="#204020" background="/img/sinewhite.gif" >
	6
	7	<h1>Opus-2</h1>
	8
	9	<p>Opus-2 is an abstract artlang composed by Chris Pressey at or around
	10	March 10, 2001.
	11
	12	<h3>Design Goals</h3>
	13
	14	<p>Eliminate word order entirely. Despite the appearance of the
	15	resulting language, this was the only real design goal to begin with.
	16
	17	<h3>Grammatical Overview</h3>
	18
	19	<p>Verbs in Opus-2 take the form of colours. Nouns take the form of
	20	sounds. Adjectives take the form of smells. Adverbs take the form of
	21	inner-ear sensations. Certain tenses and phrasings are indicated by
	22	tastes.
	23
	24	<p>To distinguish between the roles of the nouns in a sentence,
	25	objects are quieter, <i>sotto voce</i> sounds, and subjects are
	26	foreground sounds. It is important to remember that the sensations
	27	corresponding to object, subject, and verb all occur at the same time
	28	in an event termed an <i>sentence-experience</i>.
	29
	30	<p>This dominant-recessive relationship is also present in strong
	31	and weak scents which indicate whether an adjective describes the
	32	subject or the object, and in intense and gentle inner-ear sensations
	33	(feelings of sudden or gradual acceleration) to determine the target
	34	of an adverb.
	35
	36	<h3>Vocabulary Overview</h3>
	37
	38	<p>Sample dictionary:
	39
	40	<p><center><table border=0>
	41
	42	<tr><td colspan=2 align=center><b>verbs</b></td></tr>
	43	<tr><td>flee</td><td><i>pale green</i></td></tr>
	44	<tr><td>approach</td><td><i>deep orange</i></td></tr>
	45	<tr><td>examine</td><td><i>medium grey</i></td></tr>
	46	<tr><td>glorify</td><td><i>deep red</i></td></tr>
	47
	48	<tr><td colspan=2 align=center><b>nouns</b></td></tr>
	49	<tr><td>man</td><td><i>Eb below middle C, trombone</i></td></tr>
	50	<tr><td>woman</td><td><i>F above middle C, french horn</i></td></tr>
	51	<tr><td>world</td><td><i>car door slamming</i></td></tr>
	52	<tr><td>child</td><td><i>middle C, tubular bells</i></td></tr>
	53	<tr><td>building</td><td><i>F, tympani roll</i></td></tr>
	54	<tr><td>radio</td><td><i>harp sweep</i></td></tr>
	55
	56	<tr><td colspan=2 align=center><b>adjectives</b></td></tr>
	57	<tr><td>fast</td><td><i>burning rubber</i></td></tr>
	58	<tr><td>dangerous</td><td><i>mothballs</i></td></tr>
	59
	60	<tr><td colspan=2 align=center><b>adverbs</b></td></tr>
	61	<tr><td>quickly</td><td><i>leaning 40 degrees left</i></td></tr>
	62	<tr><td>dangerously</td><td><i>leaning 25 degrees right</i></td></tr>
	63
	64	</table></center>
	65
	66	<h3>Context Overview</h3>
	67
	68	<p>While each sentence is "instantaneous" in the sense that
	69	there is no internal word order, sentence-experiences still
	70	follow one another, and each sentence-experience does take a certain
	71	amount of time to perceive.
	72	Tense is thus implied by context between successive sentences
	73	and the duration of each sentence. The shorter a sentence is,
	74	the further into the future it is presumed to refer to.
	75	(Thanks to Rob Norman and Panu Kalliokosi for suggesting these ideas.)
	76
	77	<h3>Examples of Usage</h3>
	78
	79	<p>Example sentence-experience: "The building glorifies the woman":
	80
	81	<p><center><table border=2>
	82	<tr><td><i>deep red</i></td></tr>
	83	<tr><td><i>F, tympani roll, forte</i></td></tr>
	84	<tr><td><i>F, french horn, piano</i></td></tr>
	85	</table></center>
	86
	87	<p>Example sentence-experience: "The man quickly flees the dangerous child":
	88
	89	<p><center><table border=2>
	90	<tr><td><i>pale green</i></td></tr>
	91	<tr><td><i>Eb, trombone, forte</i></td></tr>
	92	<tr><td><i>leaning 40 degrees left (sudden)</i></td></tr>
	93	<tr><td><i>C, tubular bells, piano</i></td></tr>
	94	<tr><td><i>mothballs (gentle whiff)</i></td></tr>
	95	</table></center>
	96
	97	<h3>Who Speaks Opus-2?</h3>
	98
	99	<p>This language was designed purely as an abstract exercise in language
	100	design. Thus it was not designed for any preconceived group of speakers,
	101	and little consideration was given to their culture and capabilities.
	102	It is neither specifically a conversation language, nor a formalized
	103	language (e.g. a programming language.)
	104
	105	<p>Most of the problems of finding speakers of Opus-2 is in finding creatures
	106	that can create smells and inner-ear sensations as easily as humans can
	107	create complex sounds.
	108
	109	However, some popular opinions of who or what might speak Opus-2 have
	110	been suggested since its unveiling:
	111
	112	<ul>
	113	<li>An efficient-yet-entertaining form of future communication using direct neural jacks.
	114	<li>A code used by e.g. Neo (from <i>The Matrix</i>) to communicate to
	115	subjects unknowingly trapped in a virtual reality.
	116	<li>A pidgin spoken between highly-telepathic beings and marginally-telepathic beings.
	117	</ul>
	118
	119	</body></html>

+280

-0

tamerlane/doc/tamerlane.html less more

	0	<html><head>
	1	<meta http-equiv="Content-Type" content="text/html;CHARSET=iso-8859-1">
	2	<meta name="Description" content="Cat's Eye Technologies: The Tamerlane Programming Language">
	3	<title>Cat's Eye Technologies: The Tamerlane Programming Language</title>
	4	</head>
	5	<body bgcolor="#ffffff" text="#000000" link="#1c3f7f"
	6	vlink="#204020" background="/img/sinewhite.gif" >
	7	<center>
	8	<h1>Tamerlane</h1>
	9	<p>Chris Pressey
	10	<br>Created Jan 29 2000
	11	</center>
	12
	13	<h3>Introduction to Tamerlane</h3>
	14
	15	<p>Tamerlane is a "constraint flow" language. The point of its creation
	16	is to attempt to break as many paradigmal stereotypes and idioms that
	17	I'm aware of; at least, to make it tricky and confounding
	18	to pigeonhole easily.
	19
	20	<p>It has some concepts in it from imperative languages,
	21	some from functional languages, some from dataflow and object oriented
	22	languages, and some from graph rewriting
	23	and other constraint-based languages, and they're all muddled together
	24	into a ridiculous <i>potpourri</i>.
	25
	26	<p>Despite being such a mutt, there might actually be some algorithms that
	27	would be dreadfully easy to write in Tamerlane.
	28
	29	<h3>Overview of Tamerlane</h3>
	30
	31	<p>A Tamerlane program consists of a mutably weighted directed graph.
	32
	33	<p>Each node is considered to be an independent updatable store.
	34	The data held in each store is represented by the weights of the
	35	arcs exiting the node.
	36
	37	<p>An arc of weight zero is functionally equivalent to the absence
	38	of an arc.
	39
	40	<h3>Description and Example</h3>
	41
	42	<p>At this point we may introduce the syntax in a sample ASCII
	43	notation for a simple, almost pathological Tamerlane program:
	44
	45	<pre>
	46	Point-A: 1 Point-B,
	47	Point-B: 1 Point-C,
	48	Point-C: 1 Point-A.
	49	</pre>
	50
	51	<p>The user of a Tamerlane program may submit <i>messages</i>
	52	to the program at runtime. In this sense the user and the program are
	53	both objects which share the symmetrical relationship
	54	<b>user-of/used-by</b>. The program object's interface exposes a
	55	<tt>query</tt> method to the user, which is to be considered
	56	runtime-polymorphic.
	57
	58	<p>Using this message-passing mechanism, queries are submitted by the user to a
	59	running Tamerlane program, much as queries would be submitted
	60	to a running Prolog program.
	61
	62	<p>Queries have their own syntax and semantics.
	63	Unlike Prolog, the user's queries are interpreted as <i>rules</i>,
	64	perhaps accompanied by information about
	65	where and when the rules are "introduced" into the graph.
	66
	67	<p>As an example of a query that could be submitted to the above
	68	Tamerlane program:
	69
	70	<pre>
	71	1 Point-A -> 0 Point-A @ Point-A
	72	</pre>
	73
	74	This would introduce the rewriting rule
	75
	76	<ul>"Replace occurances of <tt>1 Point-A</tt> with <tt>0 Point-A</tt>"
	77	</ul>
	78
	79	<p>This rule is applied to the weights of the nodes in the graph
	80	starting with the node specified after the <tt>@</tt> symbol. In this
	81	instance it would start by trying to apply the rewrite to
	82	the node <tt>Point-A</tt>, but finding <tt>Point-A</tt> to contain <tt>1 Point-B</tt>,
	83	nothing would change.
	84
	85	<p>Each time a further query is submitted, each rule which has
	86	been introduced into the graph, disappears from the node it
	87	was working on, and is transmitted to the adjacent node with
	88	the lowest positive weight value. If there is a tie for
	89	lowest weight, the rule is transmitted to all of the
	90	adjacent nodes with the same lowest weight.
	91
	92	<p>For efficacy we can consider the user able to submit a <tt>nop</tt>
	93	query. This would not introduce any new rules into the graph,
	94	but it would cause all active rules to propogate to new nodes
	95	on the graph.
	96
	97	<p>So assume the user submits <tt>nop</tt>. The rule that was
	98	introduced by the last query 'moves' from the
	99	node labelled <tt>Point-A</tt> to the node labelled <tt>Point-B</tt> (since it's
	100	the only positive route out of <tt>Point-A</tt>.) It tries to rewrite
	101	<tt>Point-B</tt>, but finding only <tt>1 Point-C</tt> in <tt>Point-B</tt>, nothing happens.
	102
	103	<p>Assume the user <tt>nop</tt>s again. The rule is propogated to <tt>Point-C</tt>.
	104	Finally the pattern match succeeds, and <tt>Point-C</tt> is rewritten to
	105	a new <tt>Point-C</tt>:
	106
	107	<pre>
	108	Point-C: 0 Point-A.
	109	</pre>
	110
	111	<p>After one more <tt>nop</tt>, the engine will generate a
	112
	113	<pre>
	114	Rule stopped at Point-C (no adjacent nodes)
	115	</pre>
	116
	117	<p>message back to the user. This uses the operation that the user object's
	118	interface supplies to the running program, called <tt>messageback</tt>,
	119	which the user must supply, but is, like <tt>query</tt>, considered
	120	runtime-polymorphic.
	121
	122	<h2>Advanced Topics</h2>
	123
	124	<h3>Rule Priority</h3>
	125
	126	<p>Obviously, the user can enter more than one query in
	127	succession; s/he doesn't need to explicity <tt>nop</tt> to
	128	wait for rules to resolve. Since the graph can contain cycles, this would lead
	129	to a form of inherent, synchronous concurrency: each time the <tt>query</tt> or <tt>nop</tt>
	130	methods are invoked, more than one rule may be applied to the same
	131	node.
	132
	133	<p>The order in which these competing rules is applied is based
	134	on the rule's <i>priority</i>. Rules can be submitted with a specific
	135	priority in the following manner:
	136
	137	<pre>
	138	1 Point-A -> 0 Point-A @ Point-A ! 10
	139	</pre>
	140
	141	<p>If there is a tie in priority when two rules are competing to
	142	rewrite the same node, the outcome is guaranteed to be
	143	not simply undefined, but rather, non-deterministic, or at least
	144	probabilistic.
	145
	146	<p>The user can also specify a delay, measured in number of method calls
	147	(<tt>query</tt>s or <tt>nop</tt>s)
	148	from the present query, at which the rule will be introduced into
	149	the graph, like so:
	150
	151	<pre>
	152	1 Point-A -> 0 Point-A @ Point-A in 10
	153	</pre>
	154
	155	<p>And, for the sake of efficacy, the <tt>nop</tt> method on the
	156	program object has overloaded syntaxes whose semantics are
	157	'keep <tt>nop</tt>ing until some or all rules have stopped.'
	158
	159	<h3>Negative Weights</h3>
	160
	161	<p>A negative weight from one node to another is interpreted in a
	162	rather negative fashion with respect to the running program.
	163
	164	<p>A negative weight is selected for propogation when it is closer
	165	to zero (while still being non-zero) than any other weight (absolute
	166	value.) When a negative weight is selected, however, the
	167	semantics of propogation are different.
	168
	169	<p>When a rule encounters an arc of the form:
	170
	171	<pre>
	172	X: -1 Y.
	173	</pre>
	174
	175	<p>The rule is not "just" copied to Y like "usual". Instead, it
	176	"rolls back" the rule to Y in the fashion of a functional
	177	continuation.
	178
	179	<p>All of the nodes that have been "touched" by this rule, since it
	180	was last at node Y, are reset to their condition when the rule
	181	was last at node Y. The rule itself is deleted from X and
	182	propogated to Y, and rewriting continues from there.
	183
	184	<p>If the rule has never visited Y, however, then such an exit arc is
	185	not considered a valid candidate. It has an effective weight of
	186	0 (non-existent.)
	187
	188	<h3>Lambda Graphs</h3>
	189
	190	<p>Lambda abstraction is a powerful form of referring to non-numeric
	191	calculatory items such as functions and, in the case of Tamerlane, graphs.
	192
	193	<p>A user may submit a rule in the form
	194
	195	<pre>
	196	1 X -> 1 Y ? Y: 1 Z, Z: 1 Y
	197	</pre>
	198
	199	<p>Y is like a 'local variable' in this respect. When the rule is
	200	applied to the node, and the pattern match succeeds, the nodes
	201	named Y and Z need not actually exist, as they are simply
	202	created dynamically and 'attached' to the program graph.
	203
	204	<p>Lambda graphs are subject to garbage collection, since they can
	205	only be used by the rule that created them.
	206
	207	<h3>Pigeonholes (Updatable)</h3>
	208
	209	<p>Variables may be supplied which are explicitly updatable,
	210	and these are termed pigeonholes. Pigeonholes acknowledge events.
	211	Their assignment is associated with both nodes and arcs. They
	212	can occur when:
	213
	214	<ul>
	215	<li>a rule enters a node
	216	<li>a rule leaves a node
	217	<li>a rule chooses an arc
	218	<li>a rule passes through an arc
	219	<li>a rule rewrites an arc
	220	</ul>
	221
	222	<p>Updatable variables are always
	223	named after the node (or node.arc) they're associated with, preceded by
	224	a <tt>$</tt> symbol. The data in the variable is, of course, the arc weight
	225	(or in the case of nodes, the node's internal key.)
	226
	227	<p>If the pigeonhole is assigned the special value <tt>%Cancel</tt>,
	228	the rule is cancelled from moving to the node/through the arc/rewriting the
	229	arc.
	230
	231	<h3>Placeholders (Unification)</h3>
	232
	233	<p>Variables may also be supplied as "placeholders". These are the
	234	same as bindable (unifiable) variables in logical languages like
	235	Prolog. An example of such might be:
	236
	237	<pre>
	238	1 X 7 ^Y -> 7 X 1 ^Y
	239	</pre>
	240
	241	<p>Which would replace "1 X 7 Foo" with "7 X 1 Foo", "1 X 7 Bar" with
	242	"7 X 1 Bar", etc.
	243
	244	<h3>Horn Rules</h3>
	245
	246	<p>Horn rules only succeed if all of their rules succeed. A Horn rule
	247	is specified like:
	248
	249	<pre>
	250	1 X -> 1 Y + 1 Z 0 G -> 1 G
	251	</pre>
	252
	253	<p>(Remember that 0 G is equivalent to "the absence of an arc to G." If you
	254	have a pattern like
	255
	256	<pre>
	257	^a G ^b G 0 G -> ^b Q
	258	</pre>
	259
	260	<p>It will only match when there are exactly two different exit arcs to
	261	the same node G, which is not disallowed (nor is a node with an exit
	262	arc which points to itself.))
	263
	264	<p>Note also that arcs are unordered amongst themselves. The rule
	265
	266	<pre>
	267	1 A 1 B 1 C -> 1 E 3 D
	268	</pre>
	269
	270	<p>is the same as
	271
	272	<pre>
	273	1 B 1 A 1 C -> 3 D 1 E
	274	</pre>
	275
	276	<hr>
	277	<center><font size=-1>Last Updated Jan 27 <img src="/img/copyright.gif"
	278	align=absmiddle width=12 height=12 alt="(c)" border=0>2001 <a href="/">Cat's Eye Technologies</a>.</font></center>
	279	</body></html>

+281

-0

urreading/doc/urreading.html less more

	0	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
	1	<!-- encoding: UTF-8 -->
	2	<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
	3	<head>
	4	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
	5	<title>You are Reading the Name of this Esolang</title>
	6	<!-- begin html doc dynamic markup -->
	7	<script type="text/javascript" src="/contrib/jquery-1.6.4.min.js"></script>
	8	<script type="text/javascript" src="/scripts/documentation.js"></script>
	9	<!-- end html doc dynamic markup -->
	10	</head>
	11	<body>
	12
	13	<h1>You are Reading the Name of this Esolang</h1>
	14
	15	<p>November 2007, Chris Pressey, Cat's Eye Technologies</p>
	16
	17	<h2>Introduction</h2>
	18
	19	<p>This programming language, called <b>You are Reading the Name of this Esolang</b>,
	20	is my first foray into the design space of programming languages whose programs contain
	21	undecidable elements. In the case of You are Reading the Name of this Esolang,
	22	these elements are the instructions themselves — or rather, the symbols that
	23	the instructions are composed of.</p>
	24
	25	<p>Before we begin, some lexical notes. The name of this language is not
	26	pronounced exactly how it looks; rather, it is pronounced as an English speaker
	27	would pronounce the phrase
	28	"you are hearing the name of this esolang." In addition, it is strongly discouraged
	29	to refer to this language by the name "Yartnote", whether spoken or in writing, and
	30	in any capitalization scheme. After all, there may actually be a completely
	31	unrelated esolang called Yartnote one day, Zeus willing.
	32	A similar logic applies to the taboo on calling it "YRNE".</p>
	33
	34	<h2>Program structure</h2>
	35
	36	<p>A You are Reading the Name of this Esolang program is a string of
	37	symbols drawn from the alphabet <code>0</code>, <code>1</code>, <code>[</code>, and
	38	<code>]</code>.</p>
	39
	40	<p><code>0</code> and <code>1</code> are interpreted as they are in the programming
	41	language Spoon, or rather, the slightly more clearly-specified version of Spoon that
	42	follows. The Spoon tape is considered unbounded in both directions. Each cell of
	43	the Spoon tape may contain any non-negative integer (again, unbounded.) In addition,
	44	attempting to decrement a cell below 0 results in an immediate termination of the
	45	program. Oh, and there's no way to change what the symbols are.
	46	But that's the extent of the difference, I think.</p>
	47
	48	<p>For your convenience, the Spoon instructions are repeated here
	49	(taken from the public-domain <a href="http://esolangs.org/wiki/Spoon">Spoon</a>
	50	entry on the <a href="http://esolangs.org/wiki/">Esolang wiki</a>):</p>
	51
	52	<pre>1 Increment the memory cell under the pointer
	53	000 Decrement the memory cell under the pointer
	54	010 Move the pointer to the right
	55	011 Move the pointer to the left
	56	0011 Jump back to the matching 00100
	57	00100 Jump past the matching 0011 if the cell under the pointer is zero
	58	001010 Output the character signified by the cell at the pointer
	59	0010110 Input a character and store it at the cell in the pointer
	60	00101110 Output the entire memory array
	61	00101111 Immediately terminate program execution</pre>
	62
	63	<p>Each <code>[</code> must be matched with a <code>]</code>; between them lies a
	64	subprogram with the same structure as a general You are Reading the Name of this Esolang
	65	program. The meaning of this subprogram is determined from its structure as follows.
	66	The subprogram is considered to be given the same input as the entire program. If the
	67	subprogram halts on this input, it is reduced to a <code>1</code>, and if it loops
	68	forever on this input, it is reduced to a <code>0</code>. These reduced instructions
	69	are interpreted as they are in Spoon, as described above. Any output produced by the
	70	subprogram is simply discarded.</p>
	71
	72	<p>Subprograms may themselves contain subprograms, nested to arbitrary depth, in which case the
	73	reduction above is recursively applied, from the inside out, until a string of only
	74	<code>0</code> and <code>1</code> symbols remains. This is then executed as if it
	75	were a Spoon program. Any syntactically ill-formed program or subprogram is
	76	considered to halt immediately, producing no output. Note however that, as a consequence
	77	of this, a subprogram can be syntactically ill-formed (for example consisting of a
	78	single <code>0</code>) while the parent program can still be syntactically OK
	79	(the subprogram just reduces to a <code>1</code> in the parent program.)</p>
	80
	81	<h2>Implementation notes</h2>
	82
	83	<p>An implementation will determine if a particular subprogram halts, or not, if it can.
	84	Implementations may vary in the power of their proof methods used for this, but at a
	85	minimum must be able to recognize at least one subprogram that halts on any input, and
	86	one subprogram that loops forever on any input. This implementation-dependence should
	87	not strike anyone as too bizarre, I don't think — it is quite similar to how
	88	different implementations of a traditional systems-programming language can, for example,
	89	provide different levels of support for sizes of numerical data types like integers.</p>
	90
	91	<p>Recall that the problem of telling if an arbitrary program in some given Turing-complete language
	92	halts on some input is <em>undecidable</em>, or equivalently, that set of all programs
	93	that halt on some input is <em>recursively enumerable</em>. The set of all programs that loop
	94	forever on some given input is the complement of this set, and it is called <em>co-recursively
	95	enumerable</em> (<em>co-r.e.</em> for short.)</p>
	96
	97	<p>Despite this, there are many methods, ranging from simplistic to sophisticated,
	98	that can be used to prove in <em>specific</em> circumstances that a given program,
	99	on a given input, will either halt or fail to halt. These methods can be used in
	100	a You are Reading the Name of this Esolang implementation.</p>
	101
	102	<p>The simplest method for proving that a subprogram halts is probably just to
	103	simulate it on the given input indefinately, returning <code>1</code> if it halts.
	104	If the subprogram does indeed halt, this technique will (eventually) reveal
	105	that fact. The simulation can be peformed concurrently with other subprograms,
	106	so that if no proof of halting for one subprogram is ever found, this will not
	107	prevent other subprograms from being checked.</p>
	108
	109	<p>The simplest method for proving that a subprogram loops forever is probably to
	110	check it against a library of subprograms known to loop forever. For example, it
	111	can check if the program is <code>0010000000111001000011</code>
	112	(in Brainfuck: <code>[-]+[]</code>.
	113	You can readily assure yourself that this program loops forever, on any input.)
	114	This technique of course
	115	limits the number of recognizably looping subprograms that the implementation can
	116	handle to a finite number. Checking the general case, that is, recognizing an infinite number
	117	of forever-looping programs, is more difficult, however. Such an implementation
	118	will require techniques such as automatically finding a proof by induction,
	119	or abstract interpretation. However, ultimately, we know from the Halting Problem that there
	120	is no perfectly general technique that will recognize <em>every</em> program that
	121	loops forever.</p>
	122
	123	<h2>Computability class</h2>
	124
	125	<p>You are Reading the Name of this Esolang can be trivially shown to be as powerful
	126	as Spoon, since valid Spoon programs are valid You are Reading the Name of this Esolang
	127	programs. (Modulo the absence of negative-valued tape cells and the other little variations
	128	mentioned above. Since these issues have been dealt with extensively in the
	129	Brainfuck "literature", such as it is, I'm not going to worry about them.)</p>
	130
	131	<p>What about the other way around?</p>
	132
	133	<p>Well, take as a starting point the fact (in classical logic, at least)
	134	that every Spoon program either halts at some point or loops forever. (Whether we can
	135	<em>discover</em> which of these is the case is a different story.)
	136	This means that every You are Reading the Name of this Esolang program has a
	137	"canonical" form consisting of just <code>1</code>'s and <code>0</code>'s —
	138	again, whether we have an interpreter powerful enough to discover it or not.
	139	At this level, Spoon is as powerful as "canonical" You are Reading the Name of this Esolang,
	140	because "canonical" You are Reading the Name of this Esolang programs are valid
	141	Spoon programs.</p>
	142
	143	<p>Now we can go in the other direction. We can start with any "canonical"
	144	You are Reading the Name of this Esolang program, and replace each <code>1</code>
	145	with any You are Reading the Name of this Esolang subprogram that always halts.
	146	Since even the simple method, described above, of proving that a subprogram halts
	147	will always resolve to <code>1</code> if the subprogram does indeed halt, this subset
	148	of You are Reading the Name of this Esolang programs is still executable
	149	in Spoon (or any other Turing-complete language). We only need to add the halting
	150	proof mechanism to rewrite the program into "canonical" form, before executing
	151	or simulating it.</p>
	152
	153	<p>This extends recursively to any arbitrary level of nesting, too: we can
	154	replace each <code>1</code> in each subprogram with subprograms that always
	155	halt, with no bound. We only have to test these subprograms recursively,
	156	from the inside out, to eventually recover a "canonical" program.</p>
	157
	158	<p>However, something strange happens when we turn our attention to <code>0</code>'s.
	159	If we replace even one <code>0</code> with a subprogram that loops forever on
	160	some input, there is always a possibility that: a) the You are Reading the Name of this Esolang
	161	program will be run with that input, and that b) the interpreter cannot prove that
	162	the subprogram loops forever with that input. Because the set of programs that loop forever
	163	is co-r.e., there is no Turing machine (or Spoon program) that can look at any
	164	one Spoon program and say, yep, I'm certain that this here program loops forever on this
	165	input, so it should darn well be rewritten into a <code>0</code> symbol.</p>
	166
	167	<p>Thus it seems that there are You are Reading the Name of this Esolang programs
	168	which no Spoon interpreter — indeed, not any interpreter for any Turing-complete
	169	language — is able to interpret.</p>
	170
	171	<p>Since the case of <code>0</code>'s seems to mirror the case of <code>1</code>'s
	172	when it comes to expanding them into subprograms, we may conjecture that, instead
	173	of remaining basically the same as we consider more and more deeply nested subprograms
	174	(as it was with expanding <code>1</code>'s,) maybe the problem
	175	becomes more intractable the deeper we go in expanding <code>0</code>'s.
	176	Perhaps we are climbing up the arithmetic hierarchy?</p>
	177
	178	<p>At any rate, <em>I</em> certainly initially conjectured that that was the case,
	179	but it appears to be off the mark. Say you have a Spoon interpreter
	180	that's equipped with an oracle. You can feed an input string and a Spoon program into
	181	the oracle, and the oracle tells you whether or not that program halts on that input.
	182	You could then use that oracle to resolve a given You are Reading the Name of this Esolang
	183	subprogram into a <code>0</code> or <code>1</code>. But, you could also do this
	184	recursively, resolving them from the inside outward. A Spoon interpreter with such
	185	an oracle would be able to simulate any You are Reading the Name of this Esolang program
	186	— no more powerful oracle is needed, no matter how deep the subprograms are
	187	nested.</p>
	188
	189	<h2>Discussion</h2>
	190
	191	<p>I started designing You are Reading the Name of this Esolang shortly after
	192	reading about the programming language Gravity, while trying to determine the
	193	sense in which it is "non-computable." In particular, I noticed that these two statements
	194	(taken from the <a href="http://esolangs.org/wiki/Gravity">Gravity</a>
	195	entry on the <a href="http://esolangs.org/wiki/">Esolang wiki</a>,)
	196	on which the claim of Gravity's "non-computability" apparently rests,
	197	have ready analogies to problems the world of Turing machines:</p>
	198
	199	<ul>
	200	<li>"Although [Gravity's] behavior is well-defined and deterministic, the evolution of its
	201	space is in general non-computable [...]"
	202	<p>The evolution of the state-space (set of successive configurations) of a universal
	203	Turing machine is also in general non-computable (there's no Turing machine that can tell
	204	you that some given configuration will never be reached.)
	205	</p></li>
	206	<li>"It can be shown that a Turing machine cannot compute, in the general case, whether
	207	even a single collision [in a given Gravity program] will ever happen."
	208	<p>It can also be shown that a Turing machine cannot compute, in the general case,
	209	whether or not even a single given state of another Turing machine's finite control will
	210	ever be reached. (Just make that state a halt state, and you have the Halting Problem
	211	right there.)</p></li>
	212	</ul>
	213
	214	<p>Because of this, I am skeptical that Gravity is any more "non-computable" than
	215	a universal Turing machine. (I am, however, far from an expert on the computability
	216	of differential equations; it could be that the rather nonspecific term "non-computable",
	217	as used in that subfield, means something stronger than simply "undecidable".)</p>
	218
	219	<p>At any rate, the idea interested me enough to spur me into designing a language
	220	that I <em>could</em> be reasonably certain was non-computable, in
	221	some sense that I could explain. The name You are Reading the Name of this Esolang
	222	was drifting around nearby
	223	in the æther at that moment, and seemed fitting enough for this monstrosity.</p>
	224
	225	<p>The general approach was to simply force the
	226	language interpreter to decide — that is, to reduce to either a <code>0</code> or a
	227	<code>1</code> — some problem that is undecidable. This led to looking for
	228	something that needed specifically either a <code>0</code> or a <code>1</code>
	229	to specify something necessary, and that in turn
	230	led to the choice of Spoon as a base language. (Of course, I could have picked
	231	just about any language which is "its own binary Gödel numbering"; there are
	232	plenty to choose from there, but Spoon had a cool name. What can I say — I
	233	like The Tick.)</p>
	234
	235	<p>The obvious choice of undecidable
	236	problem was whether another program halts or not. Making the subject of
	237	this problem a <em>subprogram</em> with
	238	the same structure as the general program let me examine the case of unbounded
	239	recursive descent. This turned out to be not quite as interesting as I hoped, but perhaps
	240	still somewhat illuminating. (Just what <em>would</em> it take, to require that
	241	a Spoon interpreter have a more powerful oracle than HP, to run every You are Reading
	242	the Name of this Esolang program? Perhaps
	243	<a href="http://esolangs.org/wiki/Banana_Scheme">Banana Scheme</a> could provide
	244	some inspiration, here. <em>It</em> certainly seems to be climbing the arithmetic
	245	hierarchy, although I can't quite say how far. Possibly "damned far.")</p>
	246
	247	<p>I suppose one or two other things can be said about
	248	You are Reading the Name of this Esolang.</p>
	249
	250	<p>Unlike both Gravity and Banana Scheme,
	251	You are Reading the Name of this Esolang has a recursively enumerable <em>syntax</em>: the problem of
	252	whether or not a given string over the alphabet <code>0</code>, <code>1</code>,
	253	<code>[</code>, and <code>]</code> is even a well-formed You are Reading the Name of
	254	this Esolang program is undecidable!</p>
	255
	256	<p>It's not entirely clear how to interpret the instruction
	257	<code>00101110</code>, "Output the entire memory array," in the context of
	258	having a tape unbounded in both directions. I suppose I ought to stipulate that
	259	we are to just output the portion of the tape that has been "touched", i.e. that
	260	the tape head has moved over. But really, it's not so important for the
	261	goals of You are Reading the Name of this Esolang, so maybe I should just leave
	262	it undefined, for kicks.</p>
	263
	264	<p>The fact that every subprogram takes the <em>same</em> input (same as the
	265	main program) might lead to some interesting programs — programs which
	266	are unduly sensitive to changes in the input, I imagine. Of course,
	267	this doesn't affect the undecidability of subprograms, since they are always
	268	free to ignore input completely.</p>
	269
	270	<p>There is no implementation, yet, but constructing an efficient one would be
	271	a good exercise in static program analysis.</p>
	272
	273	<p>Happy undeciding!</p>
	274
	275	<p>-Chris Pressey
	276	<br/>November 5, 2007
	277	<br/>Chicago, Illinois, USA</p>
	278
	279	</body>
	280	</html>

+453

-0

xoomonk/doc/Xoomonk.falderal less more

	0	-> encoding: UTF-8
	1
	2	The Xoomonk Programming Language
	3	================================
	4
	5	Language version 0.1 Pretty Much Complete But Maybe Not Totally Finalized
	6
	7	Abstract
	8	--------
	9
	10	_Xoomonk_ is a programming language in which _malingering updatable stores_
	11	are first-class objects. Malingering updatable stores unify several language
	12	constructs, including procedure activations, named parameters, and object-like
	13	data structures.
	14
	15	Description
	16	-----------
	17
	18	Overall, Xoomonk looks like a "typical" imperative language. The result
	19	of evaluating an expression can be assigned to a variable, and the contents
	20	of a variable can be used in a further expression.
	21
	22	\| a := 1
	23	\| b := a
	24	\| print b
	25	= 1
	26
	27	However, blocks of these imperative statements can appear as terms in
	28	expressions; such blocks evaluate to entire updatable stores.
	29
	30	\| a := {
	31	\| c := 5
	32	\| d := c
	33	\| }
	34	\| print a
	35	= [c=5,d=5]
	36
	37	Once created, a store can be updated and accessed.
	38
	39	\| a := {
	40	\| c := 5
	41	\| d := c
	42	\| }
	43	\| print a
	44	\| a.d := 7
	45	\| print a
	46	\| print a.c
	47	= [c=5,d=5]
	48	= [c=5,d=7]
	49	= 5
	50
	51	A store can also be assigned to a variable after creation. Stores are
	52	accessed by reference, so this creates two aliases to the same store.
	53
	54	\| a := {
	55	\| c := 5
	56	\| d := c
	57	\| }
	58	\| b := a
	59	\| b.c := 17
	60	\| print a
	61	\| print b
	62	= [c=17,d=5]
	63	= [c=17,d=5]
	64
	65	To create an independent copy of the store, the postfix `*` operator is
	66	used.
	67
	68	\| a := {
	69	\| c := 5
	70	\| d := c
	71	\| }
	72	\| b := a*
	73	\| b.c := 17
	74	\| print a
	75	\| print b
	76	= [c=5,d=5]
	77	= [c=17,d=5]
	78
	79	Empty blocks are permissible.
	80
	81	\| a := {}
	82	\| print a
	83	= []
	84
	85	Once a store has been created, only those variables defined in the store
	86	can be updated and accessed -- new variables cannot be added.
	87
	88	\| a := { b := 6 }
	89	\| print a.c
	90	? Attempt to access an undefined variable
	91
	92	\| a := { b := 6 }
	93	\| a.c := 12
	94	? Attempt to assign an undefined variable
	95
	96	Stores and integers are the only two data types in Xoomonk. However, there
	97	are some special forms of the print statement, demonstrated here, which
	98	allow for textual output.
	99
	100	\| a := 65
	101	\| print char a
	102	\| print string "Hello, world!"
	103	\| print string "The value of a is ";
	104	\| print a;
	105	\| print "!"
	106	= A
	107	= Hello, world!
	108	= The value of a is 65!
	109
	110	Xoomonk enforces strict block scope. Variables can be shadowed in an
	111	inner block.
	112
	113	\| a := 14
	114	\| b := {
	115	\| a := 12
	116	\| print a
	117	\| }
	118	\| print a
	119	= 12
	120	= 14
	121
	122	The special identifier `^` refers to the lexically enclosing store, that is,
	123	the store which was in effect when the this store was defined.
	124
	125	\| a := 14
	126	\| b := {
	127	\| a := 12
	128	\| print ^.a
	129	\| }
	130	\| print b.a
	131	= 14
	132	= 12
	133
	134	As long as a variable is assigned somewhere in a block, it can be accessed
	135	before it is first assigned. In this case, it will hold the integer value 0.
	136	It need not stay an integer, for typing is dynamic.
	137
	138	\| b := {
	139	\| print a
	140	\| a := 12
	141	\| print a
	142	\| a := { c := 13 }
	143	\| print a
	144	\| }
	145	\| print b
	146	= 0
	147	= 12
	148	= [c=13]
	149	= [a=[c=13]]
	150
	151	We now present today's main feature.
	152
	153	It's important to understand that a block need not define all the
	154	variables used in it. Such blocks do not immediately evaluate to a store.
	155	Instead, they evaluate to an object called an _unsaturated store_.
	156
	157	Or, to put it another way:
	158
	159	If, in a block, you refer to a variable which has not yet been given
	160	a value in that updatable store, the computations within the block are not
	161	performed until that variable is given a value. Such a store is called an
	162	_unsaturated store_.
	163
	164	\| a := {
	165	\| d := c
	166	\| }
	167	\| print a
	168	= [c=?,d=*]
	169
	170	An unsaturated store behaves similarly to a saturated store in certain
	171	respects. In particular, unsaturated stores can be updated. If doing
	172	so means that all of the undefined variables in the store are now defined,
	173	the block associated with that store is evaluated, and the store becomes
	174	saturated. In this sense, an unsaturated store is like a promise, and
	175	this bears some resemblance to lazy evaluation (thus the term _malingering_).
	176
	177	\| a := {
	178	\| print string "executing block"
	179	\| d := c
	180	\| }
	181	\| print a
	182	\| a.c := 7
	183	\| print a
	184	= [c=?,d=*]
	185	= executing block
	186	= [c=7,d=7]
	187
	188	Once a store has become saturated, the block associated with it is not
	189	executed again.
	190
	191	\| a := {
	192	\| d := c
	193	\| }
	194	\| a.c := 7
	195	\| print a
	196	\| a.c := 4
	197	\| print a
	198	= [c=7,d=7]
	199	= [c=4,d=7]
	200
	201	The main program is a block. If it is unsaturated, no execution occurs
	202	at all.
	203
	204	\| print "hello"
	205	\| a := 14
	206	\| b := c
	207	=
	208
	209	Variables cannot generally be accessed from an unsaturated store.
	210
	211	\| a := {
	212	\| d := c
	213	\| }
	214	\| x := a.c
	215	? Attempt to access an unassigned variable
	216
	217	\| a := {
	218	\| d := c
	219	\| }
	220	\| x := a.d
	221	? Attempt to access an unresolved variable
	222
	223	This is true, even if the variable is assigned a constant expression
	224	inside the block.
	225
	226	\| a := {
	227	\| b := 7
	228	\| d := c
	229	\| }
	230	\| x := a.b
	231	? Attempt to access an unresolved variable
	232
	233	If, however, the unsaturated store contains some variables that have
	234	been updated since the store was created, those variable may be accessed.
	235
	236	\| a := {
	237	\| print "executing block"
	238	\| p := q
	239	\| d := c
	240	\| }
	241	\| a.q = 7
	242	\| print a.q
	243	= 7
	244
	245	Nor is it possible to assign a variable in an unsaturated store which is
	246	defined somewhere in the block.
	247
	248	\| a := {
	249	\| b := 7
	250	\| d := c
	251	\| }
	252	\| a.b := 4
	253	? Attempt to assign an unresolved variable
	254
	255	A variable is considered unresolved even if it is assigned within the block,
	256	if that assignment takes place during or after its first use in an expresion.
	257
	258	\| a := {
	259	\| b := b
	260	\| }
	261	\| a.b := 5
	262	\| print a
	263	= [b=5]
	264
	265	\| a := {
	266	\| print string "executing block"
	267	\| l := b
	268	\| b := 3
	269	\| l := 3
	270	\| }
	271	\| print string "saturating store"
	272	\| a.b := 5
	273	\| print a
	274	= saturating store
	275	= executing block
	276	= [b=3,l=3]
	277
	278	We now describe how this language is (we reasonably assume) Turing-complete.
	279
	280	Operations are accomplished with certain built-in unsaturated stores. For
	281	example, there is a store called `add`, which can be used for addition. These
	282	built-in stores are globally available; they do not exist in any particular
	283	store themselves. One uses the `$` prefix operator to access this global
	284	namespace.
	285
	286	\| print $add
	287	\| $add.x := 3
	288	\| $add.y := 5
	289	\| print $add.result
	290	\| print $add
	291	= [x=?,y=?,result=*]
	292	= 8
	293	= [x=3,y=5,result=8]
	294
	295	Because using a built-in operation store in this way saturates it, it cannot
	296	be used again. Typically you want to make a copy of the store first, and use
	297	that, leaving the built-in store unmodified.
	298
	299	\| o1 := $add*
	300	\| o1.x := 4
	301	\| o1.y := 7
	302	\| o2 := $add*
	303	\| o2.x := o1.result
	304	\| o2.y := 9
	305	\| print o2.result
	306	= 20
	307
	308	Since Xoomonk is not a strictly minimalist language, there is a selection
	309	of built-in stores which provide useful operations: `$add`, `$sub`, `$mul`,
	310	`$div`, `$gt`, and `$not`.
	311
	312	Decision-making is also accomplished with a built-in store, `if`. This store
	313	contains variables caled `cond`, `then`, and `else`. `cond` should
	314	be an integer, and `then` and `else` should be unsaturated stores where `x` is
	315	unassigned. When the first three are assigned values, if `cond` is nonzero,
	316	it is assigned to `x` in the `then` store; otherwise, if it is zero, it is
	317	assigned to `x` in the `else` store.
	318
	319	\| o1 := $if*
	320	\| o1.then := {
	321	\| y := x
	322	\| print "condition is true"
	323	\| }
	324	\| o1.else := {
	325	\| y := x
	326	\| print "condition is false"
	327	\| }
	328	\| o1.cond := 0
	329	= condition is false
	330
	331	\| o1 := $if*
	332	\| o1.then := {
	333	\| y := x
	334	\| print "condition is true"
	335	\| }
	336	\| o1.else := {
	337	\| y := x
	338	\| print "condition is false"
	339	\| }
	340	\| o1.cond := 1
	341	= condition is true
	342
	343	Repetition is also accomplished with a built-in store, `loop`. This store
	344	contains an unassigned variable called `do`. When it is assigned a value,
	345	assumed to be an unsaturated store, a copy of it is made. The variable
	346	`x` inside that copy is assigned the value 0. This is supposed to saturate
	347	the store. The variable `continue` is then accessed from the store. If
	348	it is nonzero, the process repeats, with another copy of the `do` store
	349	getting 0 assigned to its `x`, and so forth.
	350
	351	\| l := $loop*
	352	\| counter := 5
	353	\| l.do := {
	354	\| y := x
	355	\| print ^.counter
	356	\| o := $sub*
	357	\| o.x := ^.counter
	358	\| o.y := 1
	359	\| ^.counter := o.result
	360	\| continue := o.result
	361	\| }
	362	\| print string "done!"
	363	= 5
	364	= 4
	365	= 5
	366	= 2
	367	= 1
	368	= done!
	369
	370	Because the `loop` construct will always execute the `do` store at least once
	371	(even assuming its only unassigned variable is `x`), it acts like a so-called
	372	`repeat` loop. It can be used in conjunction with `if` to simulate a
	373	so-called `while` loop. With this loop, the built-in operations provided,
	374	and variables which may contain unbounded integer values, Xoomonk should
	375	be uncontroversially Turing-complete.
	376
	377	Finally, there is no provision for defining functions or procedures, because
	378	malingering stores can act as these constructs.
	379
	380	\| perimeter := {
	381	\| o1 := $mul*
	382	\| o1.x := x
	383	\| o1.y := 2
	384	\| o2 := $mul*
	385	\| o2.x := y
	386	\| o2.y := 2
	387	\| o3 := $add*
	388	\| o3.x := o1.result
	389	\| o3.y := o2.result
	390	\| result := o3.result
	391	\| }
	392	\| p1 := perimeter*
	393	\| p1.x := 13
	394	\| p1.y := 6
	395	\| print p1.result
	396	\| p2 := perimeter*
	397	\| p2.x := 4
	398	\| p2.y := 1
	399	\| print p2.result
	400	= 38
	401	= 10
	402
	403	Grammar
	404	-------
	405
	406	Xoomonk ::= { Stmt }.
	407	Stmt ::= Assign \| Print.
	408	Assign ::= Ref ":=" Expr.
	409	Print ::= "print" ("string" <string> \| "char" Expr \| Expr) [";"].
	410	Expr ::= (Block \| Ref \| Const) ["*"].
	411	Block ::= "{" { Stmt } "}".
	412	Ref ::= Name {"." Name}.
	413	Name ::= "^" \| "$" <alphanumeric> \| <alphanumeric>.
	414	Const ::= <integer-literal>.
	415
	416	Discussion
	417	----------
	418
	419	There is some similarity with Wouter van Oortmerssen's language Bla, in that
	420	function environments are very close cousins of updatable stores. But
	421	Xoomonk, quite unlike Bla, is an imperative language; once created, a store
	422	may be updated at any point. And, of course, this property is exploited in
	423	the introduction of malingering stores.
	424
	425	Xoomonk originally had an infix operator `&`, which takes two stores as its
	426	arguments, at least one of which must be saturated, and evaluates to a third
	427	store which is the result of merging the two argument stores. This result store
	428	may be saturated even if only one of the argument stores was saturated, if
	429	the saturated store gave all the values that the unsaturated store needed.
	430	This operator was dropped because it is mostly syntactic sugar for assigning
	431	each of the desired variables from one store to the other. However, it does
	432	admittedly provide a very natural syntax, which orthogonally includes "named
	433	arguments", when using unsaturated stores as procedures:
	434
	435	perimeter = {
	436	# see example above
	437	}
	438	g := perimeter* & { x := 13 y := 6 }
	439	print g.result
	440
	441	Xoomonk, as a project, is also an experiment in _test-driven language
	442	design_. As you can see, I've described the language with a series of
	443	examples, written in (something close to) Falderal format, each of which
	444	could be used as a test case. This should make implementing an interpreter
	445	or compiler for the language much easier, when I get around to that.
	446
	447	Happy malingering!
	448
	449	-Chris Pressey
	450	Cat's Eye Technologies
	451	August 7, 2011
	452	Evanston, IL