diff --git a/doc/emmental.html b/doc/emmental.html
new file mode 100755
index 0000000..2061417
--- /dev/null
+++ b/doc/emmental.html
@@ -0,0 +1,525 @@
+<html><head><title>The Emmental Programming Language</title></head>
+<body>
+
+<h1>The Emmental Programming Language</h1>
+
+<p>November 2007, Chris Pressey, Cat's Eye Technologies</p>
+
+<h2>Introduction</h2>
+
+<p>Emmental is a self-modifying programming language.  That is not to say
+that is a language in which programs are self-modifying; rather, it is
+the language itself, as defined by a meta-circular interpreter, that can be
+modified during the course of a running program.  Indeed, this is how
+Emmental, without conventional conditional and repetition/recursion constructs,
+achieves Turing-completeness.</p>
+
+<h2>Meta-circular Interpreters</h2>
+
+<p>One way to attempt to define a language is by giving what's called
+a <em>meta-circular interpreter</em> (often shortened to "MCI" in this document.)
+This is an interpreter for some language which is written in that same language
+(or at least, a language which is very close to it.)</p>
+
+<p>Meta-circular interpreters were a popular way to the describe the
+semantics of programming languages, especially
+LISP-like languages, and especially before the advent of denotational
+semantics.  The term "meta-circular" was apparently coined by John C. Reynolds
+in his paper "Definitional Interpreters for Higher-Order Programming
+Languages" (1972 Proceedings ACM National Conference.)</p>
+
+<p>Of course, in the real world, MCI's are not often used.
+They certainly <em>can</em> be used: if you have a working Scheme interpreter
+that came with your computer system, there
+is nothing stopping you from writing another Scheme interpreter in Scheme,
+and running your programs on your interpreter (which is itself running on your
+system's interpreter.)  However, this is quite a bit
+less efficient due to the duplication of effort.  A somewhat more realistic
+case might be if your system came with, say, a Scheme compiler.  You might then feed your
+Scheme interpreter (written in Scheme) through that to make a native Scheme
+interpreter, and use that to interpret your programs.
+(In this setup, the interpreter is usually described as "self-hosting"
+rather than "meta-circular".)</p>
+
+<p>But, as should be obvious, you already need an implementation of Scheme
+for your Scheme interpreter written in Scheme to be of much practical
+use to you.  If your meta-circular interpreter is all you have, you won't
+be able to use it to run or understand Scheme programs.  Because the MCI is
+defined in terms of itself, you'll need some other source of "understanding
+how it works" to make it complete.  This understanding might come
+from an implementation in some other programming language, or a specification
+in some formal language, or a description in some natural language, or simply
+from intuition &mdash; but it has to come from somewhere.</p>
+
+<p>Assuming that we do have that external source of understanding, the
+meta-circular interpreter can come in quite handy in codifying the semantics
+of the language.  And, in Emmental's case, <em>altering</em> those semantics:
+Emmental's MCI supports operations which instruct Emmental's MCI to modify
+its behaviour.</p>
+
+<h2>Interpreter Structure</h2>
+
+<p>To describe the structure of Emmental's MCI, we first examine the
+general structure of interpreters.  If you've ever written a virtual machine
+executor in, say, C, you've noticed that it has the general form</p>
+
+<pre>
+    pc = start;
+    while (!done) {
+        switch (instruction[pc]) {
+            case INSTR_ONE:
+                /* implement semantics of INSTR_ONE */
+                pc += advance(INSTR_ONE);
+                break;
+            case INSTR_TWO:
+                /* implement semantics of INSTR_TWO */
+                pc += advance(INSTR_TWO);
+                break;
+            /* ... */
+            case INSTR_HALT:
+                done = 1;
+                break;
+            default:
+                perror("Invalid opcode");
+        }
+    }
+</pre>
+
+<p>Note that <code>advance()</code> is some function that computes how far the
+program counter is advanced on that instruction.  This value is typically
++1 for most instructions, but more or less than 1 (and dependent on the
+state of the program) for a handful of "branch" instructions.
+Note also that <code>advance()</code> would not typically be implemented in C
+as a function; I'm just showing it like this to emphasize the
+regular structure.</p>
+
+<p>From this we infer that the basic structure of an interpreter is
+a <em>dictionary</em> or <em>map</em> that associates program symbols with operations.
+There is some extra housekeeping like the fetch-execute cycle that surrounds
+this dictionary, but this can (hopefully) be handled mostly automatically,
+freeing us to concentrate on <em>symbols</em> and <em>operations</em>.</p>
+
+<p>The symbols could be taken from any finite alphabet, but in Emmental,
+to keep things relatively simple, we'll just use the ASCII character set.
+(Actually, to be precise, this is the full 8-bit ASCII character set.
+Non-printable control characters are allowed, as are characters between
+128 and 255, and each has a distinct meaning.  But their representations
+are not defined.)</p>
+
+<p>The operations can be thought of, abstractly, as functions which
+transform program states.  Or they can be thought of, concretely, as
+segments of code &mdash; mini-programs which implement these functions.
+In the case of a meta-circular interpreter, these mini-programs would be written
+<em>in the language being interpreted</em>.</p>
+
+<p>To extend this idea to a <em>self-modifying</em> meta-circular interpreter,
+the operations can be thought of as functions which transform
+both program states <em>and</em> interpreter definitions.
+(Alternatively, the interpreter definition could be thought of as
+part of the program state, although I feel that's a bit gorier a way
+to look at it, and I prefer the other view, at least for Emmental.)</p>
+
+<p>In Emmental, most operations leave the interpreter definition unchanged.
+However, there is one operation which alters the interpreter
+definition, and it is this altered definition that is used to interpret
+the remainder of the program.</p>
+
+<h2>Emmental Semantics (in Emmental)</h2>
+
+<p>Emmental is essentially a stack-based language.  (There's also a queue,
+but it's not as central.)  All operations implicitly
+get data from, and implicitly deposit results back onto, a single
+stack.  For orthogonality's sake, this stack may contain only ASCII symbols.
+(And note that trying to pop an empty stack, or dequeue an empty queue,
+is an error that aborts the program.)</p>
+
+<p>Note that because we've established that an interpreter (at least,
+insofar as Emmental ever needs to know) is simply a map that takes
+symbols to operations, and that operations in Emmental are defined
+(meta-circularly) as Emmental programs, we can use the following notation
+to describe interpreters:</p>
+
+<pre>
+% &rarr; XYZ+*!
+&amp; &rarr; 'ap'ag'ag
+</pre>
+
+<p>That is, the symbol <code>%</code>, when encountered in an Emmental
+program, indicates an operation that
+is defined by the Emmental program <code>XYZ+*!</code>, and so forth.</p>
+
+<p>When a main Emmental program begins execution for the first time,
+it starts with what's called the <em>initial Emmental interpreter</em>.
+(This fact, of course, doesn't apply to any further point of execution inside
+an Emmental program, or execution of operations defined in
+Emmental's MCI, since these would be considered subprograms of a sort.
+These cases use whichever interpreter happens to be in force in that point in time.)</p>
+
+<p>The inital Emmental interpreter is defined as follows:</p>
+
+<pre>
+a &rarr; a
+b &rarr; b
+c &rarr; c
+...
+</pre>
+
+<p>That is, for every symbol <var>x</var> in the ASCII set,
+<var>x</var> <code>&rarr;</code> <var>x</var>.</p>
+
+<p>Doesn't tell us a lot about Emmental's semantics, does it?  No.
+Nothing at all, really.  But remember what I said about needing an external
+source of understanding, in order to actually get full value out of an MCI.
+And remember the purpose of Emmental's MCI: it is
+not there so much to help us understand Emmental, but to allow us to
+<em>change</em> Emmental, from inside an Emmental program.</p>
+
+<p>And, for all that our description of the initial Emmental interpreter
+is unhelpfully tautological, it is not incorrect: the semantics of
+<code>a</code> can in fact be thought of as being defined by an Emmental
+program that consists of only one instruction, <code>a</code>.  This
+happy state of affairs comes about because Emmental is stack-based;
+the "signature" (the requirements for the "before" and "after" stacks)
+of the symbol <code>a</code> is the same as the signature of the program
+containing the single symbol <code>a</code>.  No extra syntax to specify
+arity and the like is necessary.</p>
+
+<p>Above all, don't panic: we <em>will</em> describe what symbols like
+<code>a</code> actually mean in Emmental, we'll just need to do it in something
+besides Emmental.  In fact, let's do that right now.</p>
+
+<h2>Emmental Semantics (in English)</h2>
+
+<h3>Primitive Arithmetic</h3>
+
+<p><code>#</code> pushes the symbol NUL (ASCII 0) onto the stack.</p>
+
+<p>The symbols <code>0</code>, <code>1</code>, ... <code>9</code>
+pop a symbol from the stack, multiply its ASCII value by 10 modulo 256, add
+the value 0, 1, ... 9 (respectively) to that value modulo 256, and
+push the resulting symbol back onto the stack.</p>
+
+<p>The upshot of these 11 operations is that you can push arbitrary
+symbols onto the stack by spelling out their ASCII values in decimal.
+For example, <code>#64</code> pushes a <code>@</code> onto the stack.</p>
+
+<p><code>+</code> pops two symbols off the stack, adds together their
+ASCII values modulo 256, and pushes the symbol with the resultant ASCII value
+back onto the stack.</p>
+
+<p><code>-</code> pops two symbols off the stack, subtracts the ASCII value
+of the first popped from the ASCII value of the second popped modulo 256,
+and pushes the symbol with the resultant ASCII value back onto the stack.</p>
+
+<p><code>~</code> ("log") pops a symbol from the stack, computes the discrete
+base-2 logarithm of the ASCII value of that symbol, 
+and pushes the symbol with the resultant ASCII value back onto the stack.
+The discrete base-2 logarithm of a number is the floor or integer part
+of the base-2 logarithm of that number.  Alternately, it is the number of
+the highest bit position (starting with the LSB = bit position 0)
+with a bit set when the number is viewed as binary.
+Because the base-2 logarithm of the number 0 itself is undefined,
+the number 0 is treated as 256 for this operation; its discrete base-2
+logarithm is 8.</p>
+
+<h3>Stack and Queue Manipulation</h3>
+
+<p><code>^</code> ("enqueue a copy") pops a symbol off the stack, makes a copy of it, pushes
+it back onto the stack, and enqueues the copy onto the queue.</p>
+
+<p><code>v</code> ("dequeue") dequeues a symbol from the queue and pushes it onto the
+stack.</p>
+
+<p>Using these operations in combination, one can form "discard", "duplicate",
+"swap", and other more advanced stack manipulation operations.  For example,
+assuming an empty queue and more than two elements on the stack, "swap" can
+be accomplished with the code <code>^v^-+^^v^v^v-+^v-+^v-+vv</code>.</p>
+
+<p>Despite this fact, the operation <code>:</code> duplicates the top value
+of the stack.  (Emmental is not an absolutely minimal language; note, for
+instance, that it has all ten decimal digits as operations when these
+could surely have been defined in terms of only one or two operations.
+The reasons for a seperate <code>:</code> operation are given below in the
+section on Computational Class.)</p>
+
+<h3>I/O</h3>
+
+<p><code>.</code> pops a symbol off the stack and sends it to the
+standard output as an ASCII symbol.</p>
+
+<p><code>,</code> waits for an ASCII symbol to arrive on standard input,
+and pushes it onto the stack.</p>
+
+<h3>Interpreter Modification and Reflection</h3>
+
+<p>First let's define what it means to <em>pop a string</em> off the stack.
+Symbols are popped off the stack until a <code>;</code> symbol is
+found on the stack.  The symbols popped off are considered a string
+in the reverse order they were popped; i.e. the last symbol popped
+is the first symbol of the string.  The <code>;</code> symbol is
+popped off the stack, but is not made part of the string; it is simply discarded.</p>
+
+<p>Since an Emmental program is a string, popping a program is the same
+as popping a string, just that the string is interpreted as a program.</p>
+
+<p><code>!</code> (sometimes called "supplant")
+pops a symbol, which we call <var>s</var>, off the stack.
+Then it pops a program <var>t</var>.
+It then inserts the association <var>s</var> <code>&rarr;</code> <var>t</var> into the
+interpreter definition.  This overwrites whatever mapping of <var>s</var> might have
+been in the interpreter definition previously.  This new interpreter definition
+is used for all subsequent execution (until it is changed again, of course.)</p>
+
+<p>Note that <code>!</code> does <em>early binding</em>.  That is,
+the meaning of each symbol in this program <var>t</var> is
+the meaning of that symbol <em>at the time <code>!</code> is executed</em>.
+If some subsequent <code>!</code> operation later changes the meaning of one
+of the symbols that occurs in <var>t</var>, the meaning of <var>t</var>
+is not changed.  The semantics of <var>t</var> are "captured" or "frozen".
+This implies, among other things, that supplanting some
+symbol <var>z</var> with itself (a program consisting only of the symbol <var>z</var>)
+is a no-op, because <var>z</var>'s meaning, at the time that <code>!</code>
+is executed, is invariably <var>z</var>.</p>
+
+<p><code>?</code> (sometimes called "eval") pops a symbol, which we call <var>s</var>, off the stack.
+It then executes that symbol (interprets it as an operation)
+with the interpreter currently in effect.</p>
+
+<p>Note that <code>?</code> does <em>late binding</em>.  That is,
+in contrast with <code>!</code>, <code>?</code> never "freezes" the semantic
+definition of the thing that it is executing.  This is true even when
+<code>?</code> occurs in a operation redefinition (i.e. the program
+that supplanted some symbol's semantics when an <code>!</code> was executed.)
+This implies, among other things, that supplanting some symbol <var>z</var>
+with the program that consists of instructions that push the ASCII value of
+<var>z</var> onto the stack, followed by a <code>?</code> instruction,
+creates a <em>cyclic meaning</em> for <var>z</var>.
+This is because the <var>z</var> that will be executed by the <code>?</code>
+will always be the present <var>z</var>, that is, the <var>z</var>
+that is executing the <code>?</code>.</p>
+
+<p>For convenience's sake, <code>;</code> pushes the symbol <code>;</code>
+onto the stack.</p>
+
+<p>All other symbols are no-ops.</p>
+
+<h2>Computational Class</h2>
+
+<p>I believe Emmental is Turing-complete with only the operations that
+have been given so far, but I haven't proved it yet.
+All the elements are there, and although some of them are somewhat "cramped",
+they look viable.</p>
+
+<p>If you want to try thinking about how you could write real programs
+(like a universal Turing-machine simulator) in Emmental, you might want
+to skip this section, since it contains "spoilers".</p>
+
+<p>Repetition can be accomplished by assigning a symbol a cyclic semantics,
+by use of a <code>?</code> within a <code>!</code>.  For example, we can
+redefine the semantics of <code>0</code> to be <code>#48?</code>.  This is
+simply a program that pushes the symbol <code>0</code> onto the stack and
+executes it with the current interpreter, and, since <code>0</code> has been
+redefined to mean <code>#48?</code> in the current interpreter, this will
+loop forever.  The entire program to do this to <code>0</code> and run the
+infinite loop is:</p>
+
+<pre>
+;#35#52#56#63#48!0
+</pre>
+
+<p>This technique can also be used to "jump" from one definition to
+another, by using <code>?</code> to execute some <em>other</em> symbol
+at the end of a definition (that is, some symbol other than the symbol
+being defined.)</p>
+
+<p>Conditionals are a little more restrictive.  The trick to them is,
+strangely, the discrete log operator <code>~</code> in combination
+with the eval operator <code>?</code>.  Since <code>~</code> maps
+all symbols into a set of nine symbols, and <code>?</code> executes
+the symbol on the stack, <code>~?</code> will execute one of the
+symbols from ASCII 0 (NUL) to ASCII 8 (BS).  We can then, for instance,
+define NUL to do one thing, define SOH through BEL to do the same
+as NUL, and define BS to do some other thing; this essentially
+distinguishes between 0 (which executes BS) and every other value
+(which executes NUL).  Further, we can use this in conjunction
+with <code>-</code> to compare two values for equality.  So, for
+example, a program which inputs a character, and outputs Y if
+the character is M and N otherwise, would be:</p>
+
+<pre>
+#59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~?
+</pre>
+
+<p>In case NUL through BS are in use for some reason, we can always
+add 9 to the result of the log (<code>~#9+?</code>)
+to map the answer onto HT through DC1.  Or, of course, any of
+a great number of other arithmetical mappings of our choosing.
+The most severe constraint is that there be 9 available symbols
+to act as "destinations" for our "branch".  Even if we never
+overwrite one "branch" with another (and we can do that in Emmental!)
+and even if we allocate one extra symbol to be the "launch point"
+of the branch, we still have room for 25 branches in the ASCII
+character set.</p>
+
+<p>So these parts look good.  If there's a problem, it's
+with the queue.  Specifically, the problem seems to be
+the need to know the present size of the queue in order to do stack housework
+like "duplicate" and the subsequent need for "duplicate" to achieve
+"discard."  (Duplicate can be defined as <code>^v</code>, but this only
+works when the queue is empty.  Discard can be defined as duplicate plus
+<code>-+</code>, but this only works when there are other elements below
+the element being discarded.  [This last point is not generally a problem
+since we can push arbitrary values onto the stack before any
+program.])</p>
+
+<p>However, if it turns out that we need "duplicate" or "discard" in order
+to write routines that can handle a variable-sized queue &mdash; and that
+strikes me as likely &mdash; then it looks like we have a severe problem.</p>
+
+<p>Here's one way I could try to deal with it.  I could say that the queue
+is <em>local</em> to the operation being defined (or the main program.)
+Then you could define <code>:</code> to be <code>^v</code>, and inside
+<code>:</code>'s definition, the queue would always initially be empty,
+so the definition would work.</p>
+
+<p>But... we need the queue to store our global data.  For example, if
+we are going to simulate a Turing machine, we'd need to use the queue
+to store the tape (perhaps "doubled up", with one symbol of each
+pair telling us "the next symbol is a simulated tape symbol" or
+"the next symbol is some housekeeping value.")  We can't store the
+tape on just one stack.  And, once you are looping in Emmental, you've
+left the "main program" forever; you're jumping from definition to
+definition, and each has their own queue.  At best, you'd need to
+"dump" the queue onto the stack each time you switched definitions,
+and even then you'd need a loop to do that, and to loop you need to
+switch definitions.  It's a royal mess.</p>
+
+<p>So here's how I will deal with it.  I will add a primitive duplicate
+operation, <code>:</code>, to Emmental.  Proving that Emmental is Turing-complete
+is still, then, a challenge, although a doable-seeming challenge.
+I will then propose a more formidable challenge: prove that
+the language formed by removing the <code>:</code> operation from
+Emmental is Turing-complete.  (Equivalently, prove that the set of
+Emmental programs that begin with <code>;#0#58!</code> is Turing-complete.
+The nice thing about Emmental is that you can always shoot yourself in
+the foot &mdash; until you erase your pistol, that is.)</p>
+
+<p>And if you <em>really</em> like a challenge, try proving that Emmental
+without <code>~</code> is Turing-complete.  I don't think that it is,
+although it's possible for it to compute parity, at least (input a
+symbol, output E if its ASCII value is even, and O if it's odd.
+To accomplish this, multiply the input's ASCII value by 128 by adding 127
+copies of it to it; this is modulo 256, so the only results can be
+0 or 128.  Define those operations to print out E and O respectively.
+But that's as far as I've gotten.)</p>
+
+<h2>Discussion</h2>
+
+<h3>Design Decisions</h3>
+
+<p>I would've liked to have given Emmental a <code>'</code> or <code>"</code>
+instruction similar to Funge's "quote" and "quote-mode" instructions;
+instructions that treat one or more of the following symbols in the program
+literally, pushing them, as symbols, onto the stack, instead of executing them.
+However, such instructions are somewhat problematic, both theoretically and
+(for the approach I took implementing Emmental) practically.  There are
+two ways of thinking about the problems that arise.</p>
+
+<p>One is that the function which implements <code>'</code> is given
+access to the program text itself, and possibly the position within the program,
+and it uses these to extract the "immediate mode" symbol past the <code>'</code>.  This information
+could be available because these pieces of information are considered extra arguments
+to the function, or because they are (gorily) considered part of the overall
+program state.  Either way, this operation is given a lot of information to work with,
+and for consistency (since we want to be all nice and neat and say that all operations
+have the same signature so that our dictionary has a nice uniform type,) <em>all</em> operations
+have access to this information.  This is almost too much information; that is, it
+is so much that operations don't really <em>need</em> the dictionary.  We could just say
+there is <em>one</em> operation, defined by a function, and that function is given the
+current symbol and has to decide what it means through whatever means it likes.</p>
+
+<p>This approach is very powerful, of course, but it's just not the style that
+Emmental embodies.  (In fact, the idea to view interpreters as dictionaries
+was one of the foundational design choices for Emmental, to the point where I
+started constructing a "little theory of interpreters as maps."  It really
+wasn't exploited as much as I think it could have been.  If an interpreter is
+a map of symbols to strings of symbols, it's much more tractable than an
+opaque function would be; you can define all sorts of operations on them,
+for example concatenating two interpreters (for all symbols <var>s</var>
+in interpreter <var>a</var> and interpreter <var>b</var>,
+<var>c</var>[<var>s</var>] <code>&rarr;</code> <var>a</var>[<var>s</var>]<var>b</var>[<var>s</var>] &mdash;
+that sort of thing,) computing union or intersection of interpreters,
+Cartesian product, etc.)</p>
+
+<p>The other way of looking at it is to say that there are in fact
+<em>multiple</em> meta-circular interpreters available inside Emmental, and symbols
+like <code>'</code> switch temporarily to an alternate MCI.  This alternate MCI
+interprets every symbol as "push this symbol", then reinstates the previous MCI.
+I like this explication better than the one above &mdash; MCIs begin
+to look a bit like continuations! &mdash; but to do it justice would take some
+work.  I envision a language where the program has fine control
+over which MCI is in effect, possibly by keeping a map from symbols to MCIs,
+or maybe even being able to push MCIs onto the stack.
+This is a wee bit much for Emmental too though, and
+perhaps I'll explore these possibilities in a future language.</p>
+
+<h3>Turing-completeness</h3>
+
+<p>You can make the argument that Emmental's way of being Turing-complete
+is really nothing new: when you redefine some symbol, you're really just
+defining a new function, and when you use <code>?</code> to execute that
+symbol from within its own definition, you're just making a recursive
+function call.</p>
+
+<p>Well, yes, you can make that argument.  But it has to do with how you think
+about "what is a language", I think.  Does a Pascal program fragment which
+defines a procedure called <code>PrintFibonacci</code> represent another
+programming language, one different from Pascal?  You could certainly
+say that it does &mdash; it's
+the language Pascal where the token <code>PrintFibonacci</code> has
+some meaning that it doesn't have in Pascal.</p>
+
+<p>In that view, any language where you can define procedures, or
+functions, or standard libraries, or the like, is an extensible language.
+But even languages where you <em>can't</em> define new procedures or
+functions is arguably an extensible language.  Take some initial Brainfuck
+program fragment, for instance.  After it executes, it leaves the Brainfuck
+tape and while-stack in some state that depends on the input.
+Any Brainfuck fragment that executes after that, will execute in that
+environment, and that environment is arguably a version of the language
+Brainfuck, suitably extended.</p>
+
+<p>You don't normally think of it that way, I bet, but you
+<em>could</em> &mdash; and you would need to, to some degree,
+to claim that Emmental is "just" defining new functions.
+The reason you don't typically look at languages like this
+(unless you are very strange) is because it's much more useful to
+divide the world into "languages" and "programs."  And Emmental
+<em>does</em> make this division, it just makes it in a slightly
+different place than usual.</p>
+
+<p>As far as I'm concerned, if I describe what Emmental does as
+modifying the Emmental language via its MCI, and what Emmental actually does is
+consistent with the idea of modifying the Emmental language via its MCI, then
+what Emmental effectively does is modify the Emmental language via its
+MCI.  And if it needs to do this in a certain way in order to simulate
+a universal Turing machine, then that difference (however slight)
+sets it apart from systems where this simulation needs to be done by defining
+recursive functions.</p>
+
+<h2>Implementation</h2>
+
+<p><code>emmental.hs</code> is a reference interpreter for Emmental
+written in Haskell.  Run the function <code>emmental</code> on a
+string; you can also run <code>debug</code> on a string to view
+the state of the program (stack &amp; queue) during execution.
+(Note that <code>debug</code> is <em>not</em> able to show
+program states that occur internal to an operation.)</p>
+
+<p>Happy interpreter-redefining!
+<br>Chris Pressey
+<br>Chicago, IL
+<br>November 11, 2007</p>
+
+</body>
+</html>
diff --git a/src/emmental.hs b/src/emmental.hs
new file mode 100755
index 0000000..25537ed
--- /dev/null
+++ b/src/emmental.hs
@@ -0,0 +1,447 @@
+--
+-- emmental.hs
+-- Interpreter for the Emmental Programming Language
+-- Chris Pressey, Cat's Eye Technologies
+--
+-- $Id: emmental.hs 5 2007-11-12 04:36:43Z catseye $
+--
+
+--
+-- Copyright (c)2007 Cat's Eye Technologies.  All rights reserved.
+--
+-- Redistribution and use in source and binary forms, with or without
+-- modification, are permitted provided that the following conditions
+-- are met:
+--
+--  1. Redistributions of source code must retain the above copyright
+--     notices, this list of conditions and the following disclaimer.
+--  2. Redistributions in binary form must reproduce the above copyright
+--     notices, this list of conditions, and the following disclaimer in
+--     the documentation and/or other materials provided with the
+--     distribution.
+--  3. Neither the names of the copyright holders nor the names of their
+--     contributors may be used to endorse or promote products derived
+--     from this software without specific prior written permission. 
+--
+-- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+-- ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+-- LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+-- FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
+-- COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+-- INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+-- BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+-- LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+-- CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+-- LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+-- ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+-- POSSIBILITY OF SUCH DAMAGE.
+--
+
+
+import qualified Data.Map as Map
+import qualified Data.Char as Char
+
+-----------------------------------------------------------------------
+-- ============================ Symbols ============================ --
+-----------------------------------------------------------------------
+
+type Symbol = Char
+
+
+-----------------------------------------------------------------------
+-- ======================== Program States ========================= --
+-----------------------------------------------------------------------
+
+data State = State [Symbol] [Symbol]
+     deriving (Ord, Eq, Show)
+
+pop (State (head:tail) queue) = (head, State tail queue)
+push (State list queue) sym  = (State (sym:list) queue)
+
+popString (State (';':tail) queue) = ([], State tail queue)
+popString (State (head:tail) queue) =
+    let
+        (string, state') = popString (State tail queue)
+    in
+        (string ++ [head], state')
+
+enqueue (State stack queue) symbol = State stack (symbol:queue)
+dequeue (State stack queue) =
+    let
+        symbol = last queue
+        queue' = init queue
+    in
+        (symbol, State stack queue')
+
+
+-----------------------------------------------------------------------
+-- ========================= Interpreters ========================== --
+-----------------------------------------------------------------------
+
+data Interpreter = Interp (Map.Map Symbol Operation)
+
+fetch (Interp map) sym = Map.findWithDefault (fitRegOp opNop) sym map
+supplant (Interp map) sym op = (Interp (Map.insert sym op map))
+
+
+-----------------------------------------------------------------------
+-- ========================== Operations =========================== --
+-----------------------------------------------------------------------
+
+type Operation = State -> Interpreter -> IO (State, Interpreter)
+
+composeOps :: Operation -> Operation -> Operation
+
+composeOps op1 op2 = f where
+    f state interpreter = do
+        (state', interpreter') <- op1 state interpreter
+        op2 state' interpreter'
+
+createOp :: Interpreter -> [Symbol] -> Operation
+
+createOp interpreter [] = 
+    (fitRegOp opNop)
+createOp interpreter (head:tail) =
+    composeOps (fetch interpreter head) (createOp interpreter tail)
+
+--
+-- It's useful for us to express a lot of our operators as non-monadic
+-- functions that don't affect the interpreter.  This is a little "adapter"
+-- function that lets us create monadic functions with the right signature
+-- from them.
+--
+
+fitRegOp :: (State -> State) -> Operation
+
+fitRegOp regop = f where
+    f state interpreter =
+        let
+            state' = regop state
+        in
+            do return (state', interpreter)
+
+
+------------------------------------------------------------
+--------------- The operations themselves. -----------------
+------------------------------------------------------------
+
+--
+-- Redefine the meaning of the symbol on the stack with
+-- a mini-program also popped off the stack.
+--
+
+opSupplant state interpreter =
+    let
+        (opSym, state') = pop state
+        (newOpDefn, state'') = popString state'
+        newOp = createOp interpreter newOpDefn
+    in
+        do return (state'', supplant interpreter opSym newOp)
+
+--
+-- Execute the symbol on the stack with the current interpreter.
+--
+
+opEval state interpreter =
+    let
+        (opSym, state') = pop state
+        newOp = createOp interpreter [opSym]
+    in
+        newOp state' interpreter
+
+--
+-- I/O.
+--
+
+opInput state interpreter = do
+    symbol <- getChar
+    do return (push state symbol, interpreter)
+
+opOutput state interpreter =
+    let
+        (symbol, state') = pop state
+    in do
+        putChar symbol
+        return (state', interpreter)
+
+--
+-- Primitive arithmetic.
+--
+
+opAdd state =
+    let
+        (symA, state') = pop state
+        (symB, state'') = pop state'
+    in
+        push state'' (Char.chr (((Char.ord symB) + (Char.ord symA)) `mod` 256))
+
+opSubtract state =
+    let
+        (symA, state') = pop state
+        (symB, state'') = pop state'
+    in
+        push state'' (Char.chr (((Char.ord symB) - (Char.ord symA)) `mod` 256))
+
+discreteLog 0 = 8
+discreteLog 1 = 0
+discreteLog 2 = 1
+discreteLog n = (discreteLog (n `div` 2)) + 1
+
+opDiscreteLog state =
+    let
+        (symbol, state') = pop state
+    in
+        push state' (Char.chr (discreteLog (Char.ord symbol)))
+
+--
+-- Stack manipulation.
+--
+
+--
+-- Pop the top symbol of the stack, make a copy of it, push it back onto the
+-- stack, and enqueue the copy onto the queue.
+--
+
+opEnqueueCopy state =
+    let
+        (sym, _) = pop state
+    in
+        enqueue state sym
+
+--
+-- Dequeue a symbol from the queue and push it onto the stack.
+--
+
+opDequeue state =
+    let
+        (sym, state') = dequeue state
+    in
+        push state' sym
+
+--
+-- Duplicate the top symbol of the stack.
+--
+
+opDuplicate state =
+    let
+        (symbol, _) = pop state
+    in
+        push state symbol
+
+--
+-- Miscellaneous operations.
+--
+
+opNop state =
+    state
+
+--
+-- Parameterizable operations.
+--
+
+opPushValue value state =
+    push state (Char.chr value)
+
+opAccumValue value state =
+    let
+        (sym, state') = pop state
+        value' = ((Char.ord sym) * 10) + value
+    in
+        push state' (Char.chr (value' `mod` 256))
+
+
+-----------------------------------------------------------------------
+-- ===================== Debugging Functions ======================= --
+-----------------------------------------------------------------------
+
+type Debugger = State -> Interpreter -> IO ()
+
+debugNop s i = do
+    return ()
+
+debugPrintState s i = do
+    putStr ((show s) ++ "\n")
+    return ()
+
+
+-----------------------------------------------------------------------
+-- ============================ Executor =========================== --
+-----------------------------------------------------------------------
+
+execute :: [Symbol] -> State -> Interpreter -> Debugger -> IO (State, Interpreter)
+
+execute [] state interpreter debugger =
+    return (state, interpreter)
+execute (opSym:program') state interpreter debugger =
+    let
+        operation = fetch interpreter opSym
+    in do
+        (state', interpreter') <- operation state interpreter
+        debugger state' interpreter'
+        execute program' state' interpreter' debugger
+
+
+-----------------------------------------------------------------------
+-- ====================== Top-Level Function ======================= --
+-----------------------------------------------------------------------
+
+initialInterpreter = Interp
+  (Map.fromList
+    [
+        ('.', opOutput),
+        (',', opInput),
+
+        ('#', fitRegOp (opPushValue 0)),
+        ('0', fitRegOp (opAccumValue 0)),
+        ('1', fitRegOp (opAccumValue 1)),
+        ('2', fitRegOp (opAccumValue 2)),
+        ('3', fitRegOp (opAccumValue 3)),
+        ('4', fitRegOp (opAccumValue 4)),
+        ('5', fitRegOp (opAccumValue 5)),
+        ('6', fitRegOp (opAccumValue 6)),
+        ('7', fitRegOp (opAccumValue 7)),
+        ('8', fitRegOp (opAccumValue 8)),
+        ('9', fitRegOp (opAccumValue 9)),
+
+        ('+', fitRegOp opAdd),
+        ('-', fitRegOp opSubtract),
+        ('~', fitRegOp opDiscreteLog),
+
+        ('^', fitRegOp opEnqueueCopy),
+        ('v', fitRegOp opDequeue),
+        (':', fitRegOp opDuplicate),
+
+        ('!', opSupplant),
+        ('?', opEval),
+
+        (';', fitRegOp (opPushValue (Char.ord ';')))
+    ]
+  )
+
+initialState = State [] []
+
+emmental string = do
+    (state, interpreter) <- execute string initialState initialInterpreter debugNop
+    return state
+
+debug string = do
+    (state, interpreter) <- execute string initialState initialInterpreter debugPrintState
+    return state
+
+
+-----------------------------------------------------------------------
+-- ========================== Test Cases =========================== --
+-----------------------------------------------------------------------
+
+--
+-- Drivers for test cases.  'demo' runs them straight, whereas 'test'
+-- uses the debugger.
+--
+
+demo n = emmental (testProg n)
+
+test n = debug (testProg n)
+
+--
+-- Here we introduce a bit of a cheat, in order to make writing
+-- complex Emmental programs tolerable.  You can still see the
+-- programs in their fully glory by executing "show (testProg n)".
+--
+
+quote [] = []
+quote (symbol:rest) = "#" ++ (show (Char.ord symbol)) ++ (quote rest)
+
+--
+-- Add one and one.
+--
+
+testProg 1 = "#1#1+"
+
+--
+-- Redefine & as "+".
+--
+
+testProg 2 = ";#43#38!#1#1&"            -- 59,43,38 ==> ";+&"
+
+--
+-- Redefine 0 as "9".
+--
+
+testProg 3 = ";#57#48!#0"               -- 59,57,48 ==> ";90"
+
+--
+-- Redefine 0 as "#48?".  This results in an infinite loop when 0 is executed.
+--
+
+testProg 4 = ";#35#52#56#63#48!0"       -- 59,35,52,56,63,48 ==> ";#48?0"
+
+--
+-- Redefine $ as ".#36?".  This results in a loop that pops symbols and
+-- and prints them, until the stack underflows, when $ is executed.
+--
+
+testProg 5 = ";#46#35#51#54#63#36! #65#66#67#68#69$"
+
+--
+-- Duplicate the top stack element (assuming an empty queue.)
+-- This shows that the : operation is not strictly necessary
+-- (when you know the size of the queue.)
+--
+
+testProg 6 = "#65^v"
+
+--
+-- Discard the top stack element (assuming more than one element
+-- on the stack, and an empty queue.)
+--
+
+testProg 7 = "#33#123^v-+"
+
+--
+-- Swap the top two elements of the stack (assuming an empty queue.)
+--
+
+testProg 8 = "#67#66#65^v^-+^^v^v^v-+^v-+^v-+vv"
+
+--
+-- Input a symbol.  Report whether its ASCII value is even or odd.
+--
+
+testProg 9 = (quote ";^v:") ++ "!" ++                           -- : = dup
+             (quote ";#69.") ++ "#!" ++                         -- NUL = print "E"
+             (quote ";#79.") ++ "#128!" ++                      -- \128 = print "O"
+             (quote (";" ++ (take 127 [':',':'..]) ++           -- m = mul by 128
+             (take 127 ['+','+'..]) ++ "m")) ++ "!" ++
+             ",m?"
+
+--
+-- Input a symbol.  Report whether it is M or not.
+--
+
+testProg 10 = (quote ";#78.") ++ "#!" ++                        -- NUL = print "N"
+              ";##1!" ++                                        -- SOH = same as NUL
+              ";##2!" ++                                        -- STX = same as NUL
+              ";##3!" ++                                        -- ETX = same as NUL
+              ";##4!" ++                                        -- EOT = same as NUL
+              ";##5!" ++                                        -- ENQ = same as NUL
+              ";##6!" ++                                        -- ACK = same as NUL
+              ";##7!" ++                                        -- BEL = same as NUL
+              (quote ";#89.") ++ "#8!" ++                       -- BS = print "Y"
+              ",#77-~?"
+
+--
+-- Same as testProg 5, except stop printing when a NUL is
+-- encountered, instead of just underflowing the stack.
+--
+
+testProg 11 = ";" ++ (quote ":~?$") ++ "!" ++                    -- $ = dup & test
+              ";" ++ (quote ".$") ++ "#!" ++                     -- NUL = print & repeat
+              ";#0#1!" ++                                        -- SOH = same as NUL
+              ";#0#2!" ++                                        -- STX = same as NUL
+              ";#0#3!" ++                                        -- ETX = same as NUL
+              ";#0#4!" ++                                        -- EOT = same as NUL
+              ";#0#5!" ++                                        -- ENQ = same as NUL
+              ";#0#6!" ++                                        -- ACK = same as NUL
+              ";#0#7!" ++                                        -- BEL = same as NUL
+                                                                 -- BS = stop (nop)
+              "#0" ++ (quote (reverse "Hello!")) ++ "$"