oozlybub-and-murphy.markdown @master — view markup · raw · history · blame
Oozlybub and Murphy
Language version 1.1
Overview
This document describes a new programming language. The name of this language is Oozlybub and Murphy. Despite appearances, this name refers to a single language. The majority of the language is named Oozlybub. The fact that the language is not entirely named Oozlybub is named Murphy. Deal with it.
For the sake of providing an "olde tyme esoterickal de-sign", the language combines several unusual features, including multiple interleaved parse streams, infinitely long variable names, gratuitously strong typing, and only-conjectural Turing completeness. While no implementation of the language exists as of this writing, it is thought to be sufficiently consistent to be implementable, modulo any errors in this docunemt.
In places the language may resemble SMITH and Quylthulg, but this was not intended, and the similarities are purely emergent.
Program Structure
A Oozlybub and Murphy program consists of a number of variables and a number of objects called dynasts. A Oozlybub and Murphy program text consists of multiple parse streams. Each parse stream contains zero or more variable declarations, and optionally a single dynast.
Parse Streams
A parse stream is just a segment, possibly non-contiguous, of the text
of a Oozlybub and Murphy program. A program starts out with a single
parse stream, but certain parse stream manipulation pragmas can change
this. These pragmas have the form {@x}
and have a similar syntactic
status as comments; they can appear anywhere except inside a lexeme.
Parse streams are arranged as a ring (a cyclic doubly linked list.) When parsing of the program text begins initially, there is already a single pre-created parse stream. When the program text ends, all parse streams which may be active are deleted.
The meanings of the pragmas are:
{@+}
Create a new parse stream to the right of the current one.{@>}
Switch to the parse stream to the right of the current one.{@<}
Switch to the parse stream to the left of the current one.{@-}
Delete the current parse stream. The parse stream to the left of the deleted parse stream will become the new current parse stream.
Deleting a parse stream while it contains an unfinished syntactic construct is a syntax error, just as an end-of-file in that circumstance would be in most other languages.
Providing a concrete example of parse streams in action will be difficult in the absence of defined syntax for the rest of Oozlybub and Murphy, so we will, for the purposes of the following demonstration only, pretend that the contents of a parse stream is a sentence of English. Here is how three parse streams might be managed:
The quick {@+}brown{@>}Now is the time{@<}fox{@<} for all good men to {@+}{@>}Wherefore art thou {@>} jumped over {@>}{@>}Romeo?{@-} come to the aid of {@>}the lazy dog's tail.{@-}their country.{@-}
Variables
All variables are declared in a block at the beginning of a parse stream. If there is also a dynast in that stream, the variables are private to that dynast; otherwise they are global and shared by all dynasts. (Defined in 1.1) Any dynamically created dynast gets its own private copies of any private variables the original dynast had; they will initially hold the values they had in the original, but they are not shared.
The name of a variable in Oozlybub and Murphy is not a fixed,
finite-length string of symbols, as you would find in other programming
languages. No sir! In Oozlybub and Murphy, each variable is named by a
possibly-infinite set of strings (over the alphanumeric-plus-spaces
alphabet [a-zA-Z0-9 ]
), at least one of which must be infinitely long.
(New in 1.1: spaces [but no other kinds of whitespace] are allowed in
these strings.)
To accomodate this method of identifying a variable, in Oozlybub and Murphy programs, which are finite, variables are identified using regular expressions which match their set of names. An equivalence class of regular expressions is a set of all regular expressions which accept exactly the same set of strings; each equivalence class of regular expressions refers to the same, unique Oozlybub and Murphy variable.
(In case you wonder about the implementability of this: Checking that two regular expressions are equivalent is decidable: we convert them both to NFAs, then to DFAs, then minimize those DFAs, then check if the transition graphs of those DFAs are isomorphic. Checking that the regular expression accepts at least one infinitely-long string is also decidable: just look for a cycle in the DFA's graph.)
Note that these identifier-sets need not be disjoint. /ma*/
and
/mb*/
are distinct variables, even though both contain the string m
.
(Note also that we are fudging slightly on how we consider to have
described an infinitely long name; technically we would want to have a
Büchi automaton that specifies an unending repetition with ^ω^ instead
of *. But the distinction is subtle enough in this context that we're
gonna let it slide.)
Syntax for giving a variable name is fairly straightforward: it is
delimited on either side by /
symbols; the alphanumeric symbols are
literals; textual concatenation is regular expression sequencing, |
is
alteration, (
and )
increase precedence, and *
is Kleene
asteration (zero or more occurrences). Asteration has higher precedence
than sequencing, which has higher precedence than alteration. Because
none of these operators is alphanumeric nor a space, no escaping scheme
needs to be installed.
Variables are declared with the following syntax (i
and a
are the
types of the variables, described in the next section):
VARIABLES ARE i /pp*/, i /qq*/, a /(0|1)*/.
This declares an integer variable identified by the names {p
, pp
,
ppp
, ...}, an integer variable identified by the names {q
, qq
,
qqq
, ...}, and an array variable identified by the names of all
strings of 0
's and 1
's.
When not in wimpmode (see below), any regular expression which denotes a
variable may not be literally repeated anywhere else in the program. So
in the above example, it would not be legal to refer to /pp*/
further
down in the program; an equivalent regular expression, such as
/p|ppp*/
or /p*p/
or /pp*|pp*|pp*/
would have to be used instead.
Types
Oozlybub and Murphy is a statically-typed language, in that variables as well as values have types, and a value of one type cannot be stored in a variable of another type. The types of values, however, are not entirely disjoint, as we will see, and special considerations may arise for checking and conversion because of this.
The basic types are:
-
i
, the type of integers.These are integers of unbounded extent, both positive and negative. Literal constants of type
i
are given in the usual decimal format. Variables of this type initially contain the value 0. -
p
, the type of prime numbers.All prime numbers are integers but not all integers are prime numbers. Thus, values of prime number type will automatically be coerced to integers in contexts that require integers; however the reverse is not true, and in the other direction a conversion function (
P?
) must be used. There are no literal constants of typep
. Variables of this type initially contain the value 2. -
a
, the type of arrays of integers.An integer array has an integer index which is likewise of unbounded extent, both positive and negative. Variables of this type initially contain an empty array value, where all of the entries are 0.
-
b
, the type of booleans.A boolean has two possible values,
true
andfalse
. Note that there are no literal constants of typeb
; these must be specified by constructing a tautology or contradiction with boolean (or other) operators. It is illegal to retrieve the value of a variable of this type before first assigning it, except to construct a tautology or contradiction. -
t
, the type of truth-values.A truth-value has two possible values,
yes
andno
. There are no literal constants of typet
. It is illegal to retrieve the value of a variable of this type before first assigning it, except to construct a tautology or contradiction. -
z
, the type of bits.A bit has two possible values,
one
andzero
. There are no literal constants of typez
. It is illegal to retrieve the value of a variable of this type before first assigning it, except to construct a tautology or contradiction. -
c
, the type of conditions.A condition has two possible values,
go
andnogo
. There are no literal constants of typec
. It is illegal to retrieve the value of a variable of this type before first assigning it, except to construct a tautology or contradiction.
Wimpmode
(New in 1.1) An Oozlybub and Murphy program is in wimpmode if it
declares a global variable of integer type which matches the string
am a wimp
, for example:
VARIABLES ARE i /am *a *wimp/.
Certain language constructs, noted in this document as such, are only permissible in wimpmode. If they are used in a program in which wimpmode is not in effect, a compile-time error shall occur and the program shall not be executed.
Dynasts
Each dynast is labeled with a positive integer and contains an expression. Only one dynast may be denoted in any given parse stream, but dynasts may also be created dynamically during program execution.
Program execution begins at the lowest-numbered dynast that exists in the initial program. When a dynast is executed, the expression of that dynast is evaluated for its side-effects. If there is a dynast labelled with the next higher integer (i.e. the successor of the label of the current dynast), execution continues with that dynast; otherwise, the program halts. Once a dynast has been executed, it continues to exist until the program halts, but it may never be executed again.
Evaluation of an expression may have side-effects, including writing characters to an output channel, reading characters from an input channel, altering the value of a variable, and creating a new dynast.
Dynasts are written with the syntax dynast(label) <-> expr
. A concrete
example follows:
dynast(100) <-> for each prime /p*/ below 1000 do write (./p*|p/+1.)
TRIVIA PORTION OF SHOW
WHO WAS IT FAMOUS MAN THAT SAID THIS?
- A) RONALD REAGAN
- B) RONALD REAGAN
- B) RONALD STEWART
- C) RENALDO
contestant enters lightning round now
Expressions
In the following, the letter preceding -expr or -var indicates the expected type, if any, of that expression or variable. Where the expressions listed below are infix expressions, they are listed from highest to lowest precedence. Unless noted otherwise, subexpressions are evaluated left to right.
-
(.expr.)
Surrounding an expression with dotted parens gives it that precedence boost that's just the thing to have it be evaluated before the expression it's in, but there is a catch. The number of parens in the dotted parens expression must match the nesting depth in the following way: if a set of dotted parens is nested within n dotted parens, it must contain fib(n) parens, where fib(n) is the nth member of the Fibonacci sequence. For example,
(.(.0.).)
and(.(.((.(((.(((((.0.))))).))).)).).)
are syntactically well-formed expressions (when not nested in any other dotted paren expression), but(.(((.0.))).)
and(.(.(.0.).).)
are not. -
var
A variable evaluates to the value it contains at that point in execution.
-
0
,1
,2
,3
, etc.Decimal literals evaluate to the expected value of type
i
. -
#myself#
This special nullary token evaluates to the numeric label of the currently executing dynast.
-
var := expr
Evaluates the expr and stores the result in the specified variable. The variable and the expression must have the same type. Evaluates to whatever expr evaluated to.
-
a-expr [i-expr]
Evaluates to the
i
stored at the location in the array given by i-expr. -
a-expr [i-expr] := i-expr
Evaluates the second i-expr and stores the result in the location in the array given by the first i-expr. Evaluates to whatever the second i-expr evaluated to.
-
a-expr ? i-expr
Evaluates to
go
ifa-expr [i-expr]
andi-expr
evaluate to the same thing,nogo
otherwise. The i-expr is only evaluated once. -
minus i-expr
Evaluate to the integer that is zero minus the result of evaluating i-expr.
-
write i-expr
Write the Unicode code point whose number is obtained by evaluating i-expr, to the standard output channel. Writing a negative number shall produce one of a number of amusing and informative messages which are not defined by this document.
-
#read#
Wait for a Unicode character to become available on the standard input channel and evaluate to its integer code point value.
-
not? z-expr
Converts a bit value to a boolean value (
zero
becomestrue
andone
becomesfalse
). -
if? b-expr
Converts a boolean value to condition value (true becomes go and false becomes nogo).
-
cvt? c-expr
Converts a condition value to a truth-value (
go
becomesyes
andnogo
becomesno
). -
to? t-expr
Converts a truth-value to a bit value (
yes
becomesone
andno
becomeszero
). -
P? i-expr [t-var]
If the result of evaluating i-expr is a prime number, evaluates to that prime number (and has the type
p
). If it is not prime, stores the valueno
into t-var and evaluates to 2. -
i-expr * i-expr
Evaluates to the product of the two i-exprs. The result is never of type
p
, but the implementation doesn't need to do anything based on that fact. -
i-expr + i-expr
Evaluates to the sum of the two i-exprs.
-
exists/dynast i-expr
Evaluates to
one
if a dynast exists with the given label, orzero
if one does not. -
copy/dynast i-expr, p-expr, p-expr
Creates a new dynast based on an existing one. The existing one is identified by the label given in the i-expr. The new dynast is a copy of the existing dynast, but with a new label. The new label is the sum of the two p-exprs. If a dynast with that label already exists, the program terminates. (Defined in 1.1) This expression evaluates to the value of the given i-expr.
-
create/countably/many/dynasts i-expr, i-expr
Creates a countably infinite number of dynasts based on an existing one. The existing one is identified by the label given in the first i-expr. The new dynasts are copies of the existing dynast, but with new labels. The new labels start at the first odd integer greater than the second i-expr, and consist of every odd integer greater than that. If any dynast with such a label already exists, the program terminates. (Defined in 1.1) This expression evaluates to the value of the first given i-expr.
-
b-expr and b-expr
Evaluates to
one
if both b-exprs aretrue
,zero
otherwise. Note that this is not short-circuting; both b-exprs are evaluated. -
c-expr or c-expr
Evaluates to
yes
if either or both c-exprs arego
,no
otherwise. Note that this is not short-circuting; both c-exprs are evaluated. -
do expr
Evaluates the expr, throws away the result, and evaluates to
go
. -
c-expr then expr
Wimpmode only. Evaluates the c-expr on the left-hand side for its side-effects only, throwing away the result, then evaluates to the result of evaluating the right-hand side expr.
-
c-expr ,then i-expr
(New in 1.1) Evaluates the c-expr on the left-hand side; if it is
go
, evaluates to the result of evaluating the right-hand side i-expr; if it isnogo
, evaluates to an unspecified and quite possibly random integer between 1 and 1000000 inclusive, without evaluating the right-hand side. Note that this operator has the same precedence asthen
. -
for each prime var below i-expr do i-expr
The var must be a declared variable of type
p
. The first i-expr must evaluate to an integer, which we will call k. The second i-expr is evaluated once for each prime number between k and 2, inclusive; each time it is evaluated, var is bound to a successively smaller prime number between k and 2. (Defined in 1.1) Evaluates to the result of the final evaluation of the second i-expr.
Grammar
This section attempts to capture and summarize the syntax rules (for a single parse stream) described above, using an EBNF-like syntax extended with a few ad-hoc annotations that I don't feel like explaining right now.
ParseStream ::= VarDeclBlock {DynastLit}.
VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".".
VarDecl ::= TypeSpec VarName.
TypeSpec ::= "i" | "p" | "a" | "b" | "t" | "z" | "c".
VarName ::= "/" Pattern "/".
Pattern ::= {[a-zA-Z0-9 ]}
| Pattern "|" Pattern /* ignoring precedence here */
| Pattern "*" /* and here */
| "(" Pattern ")".
DynastLit ::= "dynast" "(" Gumber ")" "<->" Expr.
Expr ::= Expr1[c] {"then" Expr1 | ",then" Expr1[i]}.
Expr1 ::= Expr2[c] {"or" Expr2[c]}.
Expr2 ::= Expr3[b] {"and" Expr3[b]}.
Expr3 ::= Expr4[i] {"+" Expr4[i]}.
Expr4 ::= Expr5[i] {"*" Expr5[i]}.
Expr5 ::= Expr6[a] {"?" Expr6[i]}.
Expr6 ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]].
Prim ::= {"("}* "." Expr "." {")"}* /* remember the Fibonacci rule! */
| VarName [":=" Expr]
| Gumber
| "#myself#"
| "minus" Expr[i]
| "write" Expr[i]
| "#read#"
| "not?" Expr[z]
| "if?" Expr[b]
| "cvt?" Expr[c]
| "to?" Expr[t]
| "P?" Expr[i]
| "exists/dynast" Expr[i]
| "copy/dynast" Expr[i] "," Expr[p] "," Expr[p]
| "create/countably/many/dynasts"
Expr[i] "," Expr[i]
| "do" Expr
| "for" "each" "prime" VarName "below"
Expr[i] "do" Expr[i].
Gumber ::= {[0-9]}.
Boolean Idioms
Here we show how we can get any value of any of the b
, t
, z
, and
c
types, without any constants or variables with known values of these
types.
VARIABLES ARE b /b*/.
zero = /b*|b/ and not? to? cvt? if? /b*|b*/
true = not? zero
go = if? true
yes = cvt? go
one = to? yes
false = not? one
nogo = if? false
no = cvt? nogo
Computational Class
Because the single in-dynast looping construct, for each prime below
,
is always a finite loop, the execution of any fixed number of dynasts
cannot be Turing-complete. We must create new dynasts at runtime, and
continue execution in them, if we want any chance at being
Turing-complete. We demonstrate this by showing an example of a
(conjecturally) infinite loop in Oozlybub and Murphy, an idiom which
will doubtless come in handy in real programs.
VARIABLES ARE p /p*/, p /q*/.
dynast(3) <->
(. do (. if? not? exists/dynast 5 ,then
create/countably/many/dynasts #myself#, 5 .) .) ,then
(. for each prime /p*|p/ below #myself#+2 do
for each prime /q*|q/ below /p*|pp/+1 do
if? not? exists/dynast /p*|p|p/+/q*|q|q/ ,then
copy/dynast #myself#, /p*|ppp/, /q*|qqq/ .)
As you can see, the ability to loop indefinitely in Oozlybub and Murphy hinges on whether Goldbach's Conjecture is true or not. Looping forever requires creating an unbounded number of new dynasts. We can create all the odd-numbered dynasts at once, but that won't be enough to loop forever, as we must proceed to the next highest numbered dynast after executing a dynast. So we must create new dynasts with successively higher even integer labels, and these can only be created by summing two primes. So, if Goldbach's conjecture is false, then there is some even number greater than two which is not the sum of two primes; thus there is some dynast that cannot be created by a running Oozlybub and Murphy program, thus it is not possible to loop forever in Oozlybub and Murphy, thus Oozlybub and Murphy is not Turing-complete (because it cannot simulate any Turing machine that loops forever.)
It should not however be difficult to show that Oozlybub and Murphy is
Turing-complete under the assumption that Goldbach's Conjecture is true.
If Goldbach's Conjecture is true, then the above program is an infinite
loop. We need only add to it appropriate conditional instructions to,
say, simulate the execution of an arbitrarily-chosen Turing machine. An
array can serve as the tape, and an integer can serve as the head.
Another integer can serve as the state of the finite control. The
integer can be tested against various fixed integers by establishing an
array for each of these fixed integers and using the ?
operator
against each in turn; each branch can mutate the tape, tape head, and
finite control as desired. The program can halt by neglecting to create
a new even dynast to execute next, or by trying to create a dynast with
a label that already exists.
Happy FLIMPING,
Chris Pressey
December 1, 2010
Evanston, Illinois, USA