git @ Cat's Eye Technologies Shelta / 0724891
Initial import of Shelta version 1.0 revision 1999.1223 sources. Cat's Eye Technologies 10 years ago
21 changed file(s) with 1886 addition(s) and 0 deletion(s). Raw diff Collapse all Expand all
0 @echo off
1 REM BOOTSTRP.BAT v2002.1208 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 REM Builds the bootstrapped versions (S & S2) of the Shelta compiler.
3 @echo on
4 call bin\shelta 86 prj\sheltas
5 copy prj\sheltas.com bin\sheltas.com
6 call bin\shelta s prj\sheltas
7 copy prj\sheltas.com bin\sheltas2.com
8 call bin\shelta s2 prj\sheltas
9 diff prj\sheltas.com bin\sheltas2.com
10 del prj\sheltas.com
Binary diff not shown
0 @echo off
1 REM SHELTA.BAT v2002.1208 (c)2002 Cat's-Eye Technologies.
2 REM A 'make'-like utility for Shelta compilers, as an MS-DOS batch.
3
4 REM -- Change the following lines to tailor what libraries are
5 REM -- included by default. See readme.txt
6 type lib\8086\8086.she >s
7 type lib\8086\gupi.she >>s
8 type lib\8086\dos.she >>s
9 type lib\8086\string.she >>s
10 type lib\gupi\linklist.she >>s
11
12 REM -- This section builds the source file, always called 'S'.
13 if not exist %2.she echo Can't find project file %2.she!
14 if exist %3.she type %3.she >>s
15 if exist %4.she type %4.she >>s
16 if exist %5.she type %5.she >>s
17 if exist %6.she type %6.she >>s
18 if exist %7.she type %7.she >>s
19 if exist %8.she type %8.she >>s
20 if exist %9.she type %9.she >>s
21 if exist %2.she type %2.she >>s
22 type null.txt >>s
23
24 bin\shelta%1.com <s > %2.com
25
26 if errorlevel 32 echo Source file could not be opened.
27 if errorlevel 16 echo Error - Unknown identifier in source file.
28 del s
Binary diff not shown
Binary diff not shown
Binary diff not shown
0 Making the Snake Eat its Tail: Bootstrapping
1 --------------------------------------------
2 Oct 20 1999, Chris Pressey, Cat's Eye Technologies.
3
4 What is bootstrapping?
5 ----------------------
6
7 Bootstrapping is the act of implementing a compiler for a language in
8 that same language, or a subset of it. It is a well-understood aspect
9 of compilation and the translation of compilers from one machine onto
10 a different machine.
11
12 Bootstrapping is a fairly esoteric discipline, however, partly because
13 there's little need to do it more than once for any given compiler and
14 any given machine, but also because at a basic level, bootstrapping is
15 somewhat difficult to understand.
16
17 How, for example, do you write a compiler that compiles itself, without
18 first having the compiler??? Sounds more than a little bit like the
19 "Which came first, the chicken or the egg" paradox.
20
21 The term 'bootstrap' itself comes from the whimsical idea that if you
22 were to bend over and tug at the straps on your own boots, you could
23 lift yourself off the ground.
24
25 So what's the trick to making a compiler levitate?
26 --------------------------------------------------
27
28 Well, first of all let me put your mind at rest - there's no paradox
29 or magic or anything else spooky involved, although it can feel that
30 way sometimes. There are in fact two realistic options for
31 bootstrapping:
32
33 - Write (on paper) the compiler in the language which it compiles,
34 then hand-translate (i.e. manually compile) it to assembly or
35 machine language. This approach has been the one taken for the
36 first compilers for both Pascal and LISP.
37
38 - First write a compiler for the language in another, already-available
39 language, such as assembly language, or C. Then re-write that compiler
40 in the language which it compiles. This is the approach many
41 bootstraps have taken.
42
43 How was Shelta bootstrapped?
44 ----------------------------
45
46 I took the second approach.
47
48 First I wrote the Shelta compiler SHELTA86.COM in assembly language
49 (SHELTA86.ASM) using Turbo Assembler 3.1.
50
51 +----------------+
52 SHELTA86.ASM | TASM ---> 8086 | SHELTA86.COM
53 +----+ +----+
54 | 8086 |
55 +------+
56 TASM.EXE
57
58 This is a 'tee diagram' as is commonly used by people who have to
59 do these sorts of things... it's pretty simple to understand.
60
61 The filename on the left is the input into the 'tee', the filename on
62 the right is the output of the 'tee'. The filename on the bottom is
63 the tool which is translating the input into the output. The formats
64 listed inside the tee are the languages each of the files is written in.
65
66 Because the output of this tee is a compiler, however, it can exist as
67 a tee in it's own right:
68
69 +-----------------+
70 | Shelta --> 8086 |
71 +----------+-----+ +----+
72 SHELTA86.ASM | TASM ---> 8086 | 8086 |
73 +----+ +----+------+
74 | 8086 | SHELTA86.COM
75 +------+
76 TASM.EXE
77
78 So I re-wrote SHELTA86.ASM in Shelta/GUPI, calling it SHELTAS.SHE.
79
80 +-----------------+
81 SHELTAS.SHE | Shelta --> 8086 | SHELTAS.COM
82 +----------+-----+ +----+
83 SHELTA86.ASM | TASM ---> 8086 | 8086 |
84 +----+ +----+------+
85 | 8086 | SHELTA86.COM
86 +------+
87 TASM.EXE
88
89 Lo and behold! A Shelta compiler written in Shelta. But that's not
90 the whole story - at this point the bootstraps have been pulled taut,
91 but there is one more tug that must be made to actually get levitating.
92 The compiler SHELTAS.COM must prove it's worth, meeting it's maker
93 so to speak:
94
95 +-----------------+
96 SHELTAS.SHE | Shelta --> 8086 | SHELTAS2.COM
97 +-----------+-----+ +----+
98 SHELTAS.SHE | Shelta --> 8086 | 8086 |
99 +----------+-----+ +----+------+
100 SHELTA86.ASM | TASM ---> 8086 | 8086 | SHELTAS.COM
101 +----+ +----+------+
102 | 8086 | SHELTA86.COM
103 +------+
104 TASM.EXE
105
106 Now, because of some subtle differences in SHELTA86.ASM and SHELTAS.SHE
107 (the assembly language version does no optimization), the sizes and
108 contents of all three of these Shelta compilers differ slightly. But
109 if the process was carried on one step further, the resultant compiler
110 would be the same as SHELTAS2.COM. The following might help clarify why
111 this is:
112
113 SHELTAS.SHE +-----------------+
114 Optimizing | Shelta --> 8086 | SHELTAS2.COM
115 SHELTAS.SHE +-----------+-----+ +----+ Optimizing
116 Optimizing | Shelta --> 8086 | 8086 | Optimized
117 +----------+-----+ +----+------+
118 SHELTA86.ASM | TASM ---> 8086 | 8086 | SHELTAS.COM
119 NonOptimizing+----+ +----+------+ Optimizing
120 Hand-Optimized | 8086 | SHELTA86.COM Non-Optimized
121 +------+ Non-Optimizing
122 Hand-Optimized
123
124 OK, but why did you choose to do this, anyway?
125 ----------------------------------------------
126
127 Well, there was certainly no reason to. I was not moving Shelta from
128 one machine to another, nor was I treating SHELTA86.ASM as a quick
129 hack which would be discarded once an optimizing compiler could be
130 bootstrapped.
131
132 On the other hand, there was no reason *not* to, so...
133
134 I did it mainly to say that I could. Not everyone can design a language,
135 write a compiler for it in the form of a 1/2-kbyte COM file, then
136 bootstrap it. I'm not sure I can say it was the hardest thing I've
137 ever done, but it was difficult enough.
138
139 Plus, well, it's the kind of freaky self-referential thing I've always
140 been interested in. A compiler written in the language which it
141 compiles, which in the end appears to have been compiled by itself.
142
143 In the preceding section I may have made what I did seem like a walk
144 in the park, but it wasn't. A large portion of the time was spent on:
145
146 - fixing bugs in SHELTA86.ASM
147 - building GUPI so that Shelta could be powerful enough to bootstrap
148 meaningfully (I could have just included SHELTA86.COM entirely as
149 inline assembly, but that would kind of defeat the purpose)
150 - fixing bugs in SHELTAS.SHE
151 - testing SHELTAS2.COM - more concentration than time, actually.
152 When you have five Shelta compilers, two in source form
153 (Turbo Assembler and Shelta) and three in executable form, you're
154 bound to get a little disoriented from having to keep track of
155 their interdependencies.
156
157 I can also offer the following piece of advice to anyone who is going
158 to be trying something similar: if you've already squashed down your
159 first compiler's source code in order to (say) claim bragging rights on
160 having built an 512-byte compiler, DO NOT attempt to simply translate
161 the optimized assembly code into another language. Rewrite it instead.
162 Especially for a program of this size. Initially trying to do a
163 literal translation from SHELTA86.ASM to SHELTAS.SHE was easily the
164 biggest mistake I made.
165
166 Where can I find further information on bootstrapping?
167 ------------------------------------------------------
168
169 Two books are of note: the notorious "Dragon" book by Aho, Sethi and
170 Ullman gives it a brief once-over; "Compilers and Compiler Generators"
171 by Terry gives it a more thorough and readable treatment.
172
173 Happy levitating!
174
175 Chris Pressey, Oct 20 1999
176 Cat's Eye Technologies, Winnipeg, Manitoba, Canada
0 Shelta <Maentwrog Mk IV>
1 ------------------------
2 * * * NEAR-BETA VERSION v1999.12.23 * * *
3
4 Shelta <Maentwrog Mk IV> NEAR-BETA (c)1999 Chris Pressey, Cat's-Eye Technologies.
5 All rights reserved. No warranty for suitability of any kind.
6 Consider yourself lucky if your head doesn't blow up.
7 This product is freely redistributable provided that all distributed copies
8 include the entire original unmodified archive.
9
10 What is Shelta <Maentwrog Mk IV>?
11 ---------------------------------
12
13 Shelta <Maentwog Mk IV> (which I'll generally just call Shelta during
14 the scope of it's documentation) is a set of interrelated software
15 systems:
16
17 - Shelta the Language - somewhere between FORTH, FALSE, and Assembler
18 - SHELTA86.COM the Tool - a Shelta compiler written in 8086 asm
19 - SHELTA.BAT the Compiler - organizer ('make') for Shelta files & libraries
20 - SHELTAS.COM and SHELTAS2.COM - Shelta compilers written in Shelta
21 - The GUPI Protocol - a standardized library of Shelta definitions
22
23 What's the history of Shelta?
24 -----------------------------
25
26 My first programming language ever was called Maentwrog, a term taken
27 from that wholly remarkable book, _The Meaning of Liff_, by Douglas
28 Adams.
29
30 Maentwrog sucked. But, it worked, kind of. It was interpreted, but
31 as I recall, it wasn't even tokenized... making the interpreter more
32 than a little slow. It was basically a subset of FORTH - not much
33 special there.
34
35 My second programming language ever was based on Maentwrog, and it
36 spawned a big hit called Befunge. Befunge left Maentwrog in the dust,
37 because there WAS much special there - Befunge is 2D, and that's
38 trippy. If you haven't tried to program in Befunge yet... try it!
39
40 However, I've always felt that I fell somewhat short of the mark I
41 was trying to make with Befunge-93. After all, it was inspired by
42 FALSE and Brainf*ck, but unlike either of them, it was not a small
43 machine-dependent compiler. It was a big, portable interpreter.
44 (In 1998 I rewrote the interpreter in assembly language to make a
45 Befunge-93 interpreter that fit into 2K. But it's just not the same.)
46 The urge to write a tiny compiler has been gnawing at me for the
47 past few years.
48
49 As such, Maentwrog has not gone totally forgotten. Over the years I've
50 made a few attempts at reworking the Maentwrog language, with little
51 success, until now. The main thing holding back Maentwrog for so
52 long was it's lack of strict design principles. Only now has the
53 subconscious philosophy of Maentwrog evolved to a point where it means
54 anything. The result is Shelta.
55
56 What is Shelta's Etymology?
57 ---------------------------
58
59 _The Oxford Dictionary of Current English_, 1996, describes Shelta as an
60 "ancient hybrid secret language used by Irish tinkers, Gypsies, etc."
61 Shelta <Maentwrog Mk IV> is targeted at a similar present-day audience.
62 Would you sometimes rather consider yourself a tinker (or a Gypsy) than
63 a computer programmer? Then Shelta may just be for you.
64
65 What is Shelta's Philosophy?
66 ----------------------------
67
68 Shelta's philosophy is one of simplicity of translation. Shelta is
69 easy to implement in assembly language. Shelta is nearly as low-level
70 as assembly language. Very small Shelta compilers can be built.
71
72 Shelta is also relatively easy to bootstrap - that is, it's not that
73 difficult to implement a Shelta compiler in Shelta itself. In fact,
74 that (along with writing a ridiculously small compiler) was my main
75 motivation for designing Shelta and building SHELTA86.COM. For more
76 information on the bootstrapped Shelta compilers and bootstrapping in
77 general, see the file bootstrp.txt.
78
79 In and of itself, Shelta has no actual functional semantics: only
80 structural ones. It relies on a either library of functions (such as
81 GUPI, described below) or inline machine language in order to be
82 considered Turing-Complete. That is, not unlike the ancient Shelta
83 language, Shelta <Maentwrog Mk IV> is hybridized: the actual programming
84 is usually done in Shelta/GUPI.
85
86 What are Shelta's Influences?
87 -----------------------------
88
89 Shelta is influenced largely by the wholly remarkable programming
90 language FALSE, by Wouter van Oortmerssen - "FORTH with lambda
91 functions". However, it is lower-level than FALSE. It is more like
92 FORTH in some ways - for example, multicharacter user-defined names
93 can be used to name unlimited variables, not just the a-z in FALSE.
94 Lastly, it is unlike FORTH and more like Assembler in that there is
95 no FORTH-like environment nor any fixed-size blocks of text as
96 input files.
97
98 What is Shelta's Syntax?
99 ------------------------
100
101 Tokens are delimited by whitespace - any whitespace and as much of it
102 as you like, but as long as two non-whitespace characters are adjacent,
103 they are considered part of the same token. Shelta's idea of
104 whitespace is, in ASCII, everything from #32 (space) down to #1 (^A).
105 (#0 (NUL) is considered synonymous with an end-of-file condition.)
106
107 The exception to the above rule is a comment block, which begins
108 (anywhere) with a ";" character and ends at the next ";" character.
109 This can occur even in the middle of a token, so "HE; foo ;LLO" is
110 taken to be the token "HELLO".
111
112 User-defined tokens - depicted with "Name" in the following table -
113 can contain any non-whitespace characters, and can begin with any
114 non-whitespace characters except for "[", "]", "\", '^', "_" and "`".
115 (This includes digits - "1" by itself is a name, not a number.)
116
117 What are the recognized tokens of Shelta?
118 -----------------------------------------
119
120 [ Begin block.
121 * ] End block, push pointer.
122 ]=Name End block, name pointer.
123 ]:Name End block, name pointer to compile-time-only block.
124 * ]Name End block, push named pointer.
125 ^Name Push pointer to previously named block.
126 _^Name Insert pointer to previously named block.
127 Name Insert contents of previously named block.
128
129 * `ABC Insert string.
130 _123 Insert decimal byte.
131 __1234 Insert decimal word.
132 \123 Push decimal word.
133
134 * = not available in SHELTA86.
135
136 What are some common syntactic idioms in Shelta?
137 ------------------------------------------------
138
139 [ `ABC ] Push pointer to string.
140 [ _5 _5 _5 _3 ] Push pointer to byte array.
141 [ __1234 __9999 ] Push pointer to word array.
142 [ _5 _5 ]=my-data Name a byte array.
143 [ _^my-data ]=my-refs Name an array of references to data.
144 _88 Insert anonymous inline machine code.
145 [ _88 ]:xyz Define xyz as inline machine code.
146 xyz Insert named inline machine code.
147 [ bar baz ]:foo Declare 'foo' as an inline proc
148 foo Insert 'foo' as an inline proc.
149 [ ]=bar Define named label.
150
151 Where do you use : instead of = after ]?
152 ----------------------------------------
153
154 Originally, Shelta did not distinguish between blocks used as
155 updatable stores, subroutines, or templates for inlined instructions.
156 As such, it would include all of them into the resulting executable,
157 even the blocks only used at compile-time to define inline instructions.
158
159 By using : instead of = after ], the Shelta compiler will treat the
160 block as containing information which is only used at compile-time.
161 This is essentially a contract between the programmer and the compiler;
162 the programmer promises not to expect the ^Name or _^Name syntax to work
163 on the block, and the compiler ensures the block does not show up
164 extraneously in the resulting executable.
165
166 What are some of the quirks of Shelta?
167 --------------------------------------
168
169 Shelta's lambda syntax is not uniform. On the top level, an empty
170 block such as this:
171 [ ]=label
172 is not necessarily defined to actually work. It is only defined to
173 produce sensible results when nested within another block like this:
174 [ [ ]=label foo ]=block
175 Also, this is NOT the same thing as saying:
176 [ [ foo ]=block1 bar ]=block2
177 This linearizes block1 out of block2, almost as if you had said
178 [ foo ]=block1 [ bar ]=block2
179 Except that the identifier block1 is 'supposed' to be local to block2
180 (it ISN'T, but it might be good programming practice to treat it that
181 way anyway! :-)
182
183 To make things even worse, nesting more than two levels deep like so
184 [ foo [ bar [ baz ] quuz ] phlef ]
185 probably doesn't do what you expect. Feel free to experiment, though.
186
187 Shelta does not have or use forward references. That seems to be no
188 problem, with the lambda-like declarations, but it can often force you
189 to write weird and awkwardly structured code. If you need to refer to
190 the current block from within it, you can always name a block twice:
191
192 [ foo ^this bar ]=this ; won't work! ;
193 [ [ ]=-this foo ^-this bar ]=this ; works! ^this == ^-this ;
194
195 What is SHELTA86.COM?
196 ---------------------
197
198 The Shelta compiler is implemented in 8086 assembly language and assembles
199 to a tiny (LESS THAN HALF A KILOBYTE! :-) executable program. There are
200 several restrictions on the program in order to trim fat:
201
202 - Input file goes in standard input, .COM file comes out standard output.
203 File should end in a ^@ (NUL) character to indicate EOF.
204 This NUL should be preceded by whitespace (otherwise it'll form
205 part of a token - you don't want that! :-)
206 - If a file error occurs, error code 32 is returned.
207 - If an undefined token is found, error code 16 is returned.
208 - The forms ] (End block push pointer,) ]Name (End block push named
209 pointer,) and `xyz (Insert String) are not supported. It is not
210 difficult to work around these by explicitly naming and pushing
211 blocks and using ASCII decimal sequences for strings. These
212 inequities could even be addressed by a simple pre-processor.
213
214 What is SHELTA.BAT?
215 -------------------
216
217 SHELTA.BAT allows one to harness a Shelta compiler such as SHELTA86.COM,
218 without having to directly put up with it's silly interface.
219
220 Usage:
221 SHELTA compiler project-file {library-files...}
222
223 'compiler' is one of: 86 (the assembly-language compiler,) S (the
224 compiler written in Shelta and compiled with 86), or S2 (the
225 compiler written in Shelta and compiled with S.) (See the file
226 bootstrp.txt for more information on the Shelta compilers written
227 in Shelta.)
228
229 You don't need to append '.she' to project-file or any library-file
230 you choose to include, SHELTA will do that for you and will
231 automatically name the output 'project-file.COM'.
232
233 SHELTA should support up to nine arguments, so you can specify seven
234 library files on the command line (there's no 'include' directive in
235 Shelta itself.)
236
237 As an example of how to use SHELTA, here's how to build and test one
238 of the example Shelta/GUPI programs, "Hello, world!":
239
240 (Updated Dec 8 2002 to reflect new directory structure:)
241
242 cd shelta-<<version>>
243 bin\shelta s2 prj\hello
244 prj\hello
245
246 How can I specify what libraries for SHELTA.BAT to use by default?
247 ------------------------------------------------------------------
248
249 By default SHELTA.BAT includes the following libraries:
250
251 8086\8086.she 8086 subset definition
252 8086\gupi.she General GUPI library (defined in 8086 subset)
253 8086\string.she GUPI string functions (defined in 8086 subset)
254 8086\dos.she DOS-dependent GUPI I/O (defined in 8086 subset)
255 gupi\linklist.she Linked list library (defined in GUPI)
256
257 You can edit SHELTA.BAT to change which libraries it uses by default.
258 (It is just a .BAT file after all.)
259
260 8086.she: One could presumably replace these inline instructions with
261 equivalent instructions for another relative-addressing processor,
262 change a few lines of SHELTA86.ASM, and voila! You could compile
263 Shelta to some other CPU. It'd be a cute trick...
264
265 gupi.she: While Shelta comes with GUPI, you don't need to use GUPI
266 with Shelta! You can comment out gupi.she and completely redefine the
267 semtantics of your Shelta. For example, you could use a very small
268 set of instructions (a tar pit) and use Shelta to compile languages
269 very similar to Brainf*ck, Malbolge, etc.
270
271 dos.she: You can replace the dos.she library with the bios.she library;
272 it does the same thing but goes directly through the BIOS instead, and
273 you can write code that will work without DOS loaded (so you could even
274 build your own OS or embedded controller code with Shelta! ;-)
275
276 What is GUPI?
277 -------------
278
279 GUPI stands for Generic Utilitarian Programming Interface. GUPI
280 is a set of Shelta definitions that acts as a standard library.
281
282 (Fact is, I'm not a big fan of how GUPI turned out; it is a
283 contrived and contingent beast, rather than the beautifully
284 designed and conceptually airtight scheme I had hoped for.
285 But I figure that if it was good enough to get me this far, it's
286 worth keeping around, and the hybrid design of Shelta makes it
287 easy to swap it for something else at a later time, anyway.)
288
289 The GUPI semantics as presented here work on a stack of word values
290 in a FORTH-like manner. Note that GUPI is not yet well documented.
291 Nor is it guaranteed not to change (although it looks unlikely;
292 any major change will warrant it's own library; "GUPII" perhaps? :-)
293
294 What are some of the naming conventions of GUPI?
295 ------------------------------------------------
296
297 Generally speaking...
298 The suffix b indicates 'byte'.
299 The suffix c indicates 'character'.
300 The suffix if indicates 'decision on a boolean'.
301 The suffix s indicates 'string with length' (block).
302 The suffix w indicates 'word' (normally 16-bit).
303 The suffix z indicates 'null-terminated string' (ASCIIZ).
304
305 Lack of any suffix usually indicates 'any type'.
306
307 What are the basic GUPI commands?
308 ---------------------------------
309
310 pop pop and discard top stack element
311 dup duplicate top stack element
312 swap pop a, pop b, push a, push b
313
314 to pop pointer, machine unary jump to pointer
315 do pop pointer, machine sub call pointer
316 toif pop pointer, pop boolean, unary jump if nonzero
317 doif pop pointer, pop boolean, sub call if nonzero
318 begin pop return pointer and push onto call stack
319 end push return pointer from call stack
320
321 begin, end, and do/doif lead to the following GUPI idiom:
322 [ begin baz end ]=bar Declare 'bar' as a subroutine.
323 ^bar do Call 'bar' as a subroutine call.
324
325 What are the memory-access commands?
326 ------------------------------------
327
328 getb pop pointer, push byte data at pointer
329 putb pop pointer, pop byte value, write at pointer
330 getw pop pointer, push word data at pointer
331 putw pop pointer, pop word value, write at pointer
332
333 What are the arithmetic and logic commands?
334 -------------------------------------------
335
336 ++ pop a, push a + 1
337 -- pop a, push a - 1
338 ** pop a, push a << 1
339 // pop a, push a >> 1
340 << pop a, pop b, push b << a
341 >> pop a, pop b, push b >> a
342 + pop a, pop b, push b + a
343 - pop a, pop b, push b - a
344 * pop a, pop b, push b * a
345 / pop a, pop b, push b / a
346 % pop a, pop b, push b mod a
347 *1 /% pop a, pop b, push b / a, push b mod a
348 ! pop a, push binary not a
349 zero pop a, push 1 if a = 0, push 0 otherwise
350 & pop a, pop b, push a binary and b
351 | pop a, pop b, push a binary or b
352 ~ pop a, pop b, push a binary xor b
353
354 *1. The algorithm commonly used for binary division actually computes
355 the results of both division and modulo (remainder for a > 0). If
356 both results are desired by the program, using /% is usually nearly
357 twice as efficient as using / and % seperately.
358
359 Indirect arithmetic?
360 --------------------
361
362 @++ pop pointer, increment word at pointer
363 @-- pop pointer, decrement word at pointer
364
365 How does GUPI interface with the operating system?
366 --------------------------------------------------
367
368 outs pop length, pop pointer, send bytes to stdout
369 outc pop word, send low byte to stdout
370 inc wait for input on stdin, push character read
371 qinc quietly wait for input on stdin, push character
372 chkin immediately return input status (is a char waiting?)
373 flin flush all unread input
374 halt pop a, stop program and return to operating system
375 with error code 'a'
376
377 And dynamic memory?
378 -------------------
379
380 malloc pop size, push ptr to memory of length size
381 mfree pop ptr, reset heap ptr to ptr
382 (Note that mfree will free ALL pointers that were
383 allocated with malloc since the pointer that is being
384 freed was allocated. It's good for local
385 linked lists and such, but be careful!)
386
387 What is "Portable Shelta/GUPI"?
388 -------------------------------
389
390 The short answer is, "Portable Shelta/GUPI" is the subset of the
391 union of the Shelta and GUPI languages where, through patience
392 and restraint - i.e. discipline - the Shelta/GUPI programmer
393 does not use any machine-dependent or self-modifying code, and
394 restricts themselves to the GUPI functions that do likewise or
395 are specified precisely and abstractly enough to be ported,
396 that is, re-written in some other machine or VM bytecode.
397
398 Where can I get updates on Shelta's condition?
399 ----------------------------------------------
400
401 Shelta's official web site is located at:
402
403 http://www.catseye.mb.ca/esoteric/shelta/
404
405 Happy tinkering!
406
407 Chris Pressey, Dec 23 1999
408 Cat's-Eye Technologies, Winnipeg, Manitoba, Canada
0 ;
1 99.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
2 The song "Ninety-Nine Bottles of Beer" implemented in Shelta/GUPI.
3 ;
4
5 [ _32 `bottles _32 `of _32 `beer _32 `on _32 `the _32 `wall, _13 _10 ]=L1
6 [ _32 `bottles _32 `of _32 `beer, _13 _10 ]=L2
7 [ `Take _32 `one _32 `down, _32 `pass _32 `it _32 `around, _13 _10 ]=L3
8 [ _32 `bottles _32 `of _32 `beer _32 `on _32 `the _32 `wall. _13 _10 _13 _10 ]=L4
9
10 [ `9 ]=bh [ `9 ]=bl
11 [ begin ^bh getb outc ^bl getb outc end ]=btls
12 [ begin ^bh getb \1 - ^bh putb \57 ^bl putb end ]=digit
13
14 [ [ ]=iloop
15
16 ^btls do
17 ^L1 \31 outs
18
19 ^btls do
20 ^L2 \19 outs
21
22 ^L3 \32 outs
23
24 ^bl getb \1 - ^bl putb
25 ^bl getb \47 - zero ^digit doif
26
27 ^btls do
28 ^L4 \33 outs
29
30 ^bh getb \47 - ^iloop toif
31
32 \0 halt
33 ] to
0 ;
1 demo.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
2 A demonstration of some of the basic features of Shelta and GUPI.
3 ;
4
5 [
6 [ ]=hw `Hello, _32 `world! ; an empty block denotes a label ;
7 [ ]=eol _13 _10
8 ]=hello
9
10 [ _0 ]=i
11 [ _0 ]=pad
12 [ __0 ]=hptr
13
14 [
15 begin
16 \1024 malloc ^hptr putw
17 [ ]=wloop
18 ^i getb ^hptr getw ^i getb + putb
19 ^i getb ++ ^i putb
20 ^i getb ^wloop toif
21 ^hptr getw \32 + \223 outs
22 end
23 ] do
24
25 ^hello \15 outs
26
27 ^hw \12 outs
28 ^eol \2 outs
29
30 ^hello getb outc
31
32 ^hello \1 + getb outc
33
34 \65 ^hello putb ^hello \15 outs
35
36 \1000 \8 / outc
37 \8 \8 * ++ outc
38 \8 \9 * ++ outc
39 \9 \9 * -- outc
40
41 flin
42 [
43 [ ]=loop
44 inc outc ^loop to ;forever!;
45 ] to
0 ;
1 hello.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
2 The ubiquitous greeting message, implemented in Shelta/GUPI.
3 ;
4 [ `Hello, _32 `world! _13 _10 ] \15 outs \0 halt
0 ;
1 sheltas.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
2 A bootstrappable Shelta compiler written in Shelta/GUPI.
3 ;
4
5 [ __0 ]=safestart
6 [ __0 ]=namestart
7
8 [ __0 ]=codeba
9 [ __0 ]=stacba
10 [ __0 ]=safeba
11 [ __0 ]=macrba
12 [ __0 ]=tokenba
13
14 [ __0 ]=symthead
15 [ __0 ]=codeh
16 [ __0 ]=stach
17 [ __0 ]=safeh
18 [ __0 ]=macrh
19 [ __0 ]=tokenh
20
21 [ begin \16 halt end ]=badtok
22 [ begin dupz end ]=fndupz
23
24 ;--------------------------------------;
25
26 [ __0 ]=newn
27 [ ; addr dlen strz -> void ;
28 begin
29 ^fndupz do
30
31 ; link up the new node ;
32 ^symthead getw \6 ll-node dup ^newn putw ll-link
33 ^newn getw ^symthead putw
34
35 ; addr dlen strz ;
36 ^newn getw ll-dptr putw
37 ^newn getw ll-dptr \2 + putw
38 ^newn getw ll-dptr \4 + putw
39
40 end
41 ]=InsertSymbol
42
43 [ __0 ]=lui
44 [ __0 ]=luitok
45 [ \0 end ]=luno
46 [ ^lui getw ll-dptr \4 + getw
47 ^lui getw ll-dptr \2 + getw end ]=luyes
48 [ ; strz -> dlen addr, that is, addr is pushed first;
49 begin
50 ^luitok putw
51 ^symthead getw ^lui putw
52
53 [ ]=luloop
54 ^lui getw zero ^luno toif
55 ^lui getw ll-dptr getw ^luitok getw eqzz ^luyes toif
56 ^lui getw ll-next ^lui putw
57 ^luloop to
58
59 ]=LookupSymbol
60
61 ;--------------------------------------;
62
63 [ __0 ]=ddtoken ; contains pointer into token where to decipher ;
64 [ __0 ]=ddvalue ; contains running tally of the value ;
65 [
66 begin
67 ^tokenba getw + ^ddtoken putw
68 \0 ^ddvalue putw
69 [ ]=ddLoop
70 ^ddvalue getw \10 *
71 ^ddtoken getw getb \48 - +
72 ^ddvalue putw
73
74 ^ddtoken @++
75 ^ddtoken getw getb \47 > ^ddLoop toif
76
77 ^ddvalue getw
78 end
79 ]=DecipherDecimal
80
81 ;--------------------------------------;
82
83 [
84 begin
85 ^codeh getw ++ putw
86 \184 ^codeh getw putb
87 \80 ^codeh getw \3 + putb
88 ^codeh getw \4 + ^codeh putw
89 end
90 ]=WritePush
91
92 [
93 begin
94 \1 ^DecipherDecimal do ^WritePush do
95 end
96 ]=PushWord
97
98 ;--------------------------------------;
99
100 [
101 ^tokenba getw \2 + ^LookupSymbol do pop
102 dup zero ^badtok toif
103 ^safeba getw - \260 +
104 ^codeh getw putw
105 ^codeh getw \2 + ^codeh putw
106 end
107 ]=LiteralSymbol
108 [
109 \2 ^DecipherDecimal do ^codeh getw putw
110 ^codeh getw \2 + ^codeh putw
111 end
112 ]=LiteralWord
113 [
114 begin
115
116 ^tokenba getw ++ getb \95 - zero ^LiteralWord toif
117 ^tokenba getw ++ getb \94 - zero ^LiteralSymbol toif
118 \1 ^DecipherDecimal do
119 ^codeh getw putb
120 ^codeh @++
121 end
122 ]=LiteralByte
123
124 ;--------------------------------------;
125
126 [
127 begin
128 ^tokenba getw ++ ^LookupSymbol do pop
129 dup zero ^badtok toif
130 ^safeba getw - \260 + ^WritePush do
131 end
132 ]=PushPointer
133
134 ;--------------------------------------;
135
136 [ __0 ]=strct
137 [
138 begin
139 \1 ^strct putw
140 [ ]=strLoop
141
142 ^tokenba getw ^strct getw + getb
143
144 ^codeh getw putb
145
146 ^codeh @++
147 ^strct @++
148
149 ^tokenba getw ^strct getw + getb ^strLoop toif
150
151 end
152 ]=String
153
154 ;--------------------------------------;
155
156 [
157 begin
158 ^codeh getw
159 ^stach getw putw
160 ^stach getw \2 + ^stach putw
161 end
162 ]=BeginBlock
163
164 [ __0 ]=ebtokptr
165 [ __0 ]=ebtoklen
166 [ __0 ]=ebdatlen
167 [ __0 ]=origcodeh
168 [
169 begin
170 ; adjust namestart ... possibly the weirdest Shelta-ism ;
171 ^namestart getw ^origcodeh getw + ^stach getw \2 - getw - ^namestart putw
172 end
173 ]=AdjustName
174 [ __0 ]=nei ; a shared counter the for next two subroutines ;
175 [
176 ^macrh getw ^namestart putw
177
178 ; copy everything from origcodeh to codeh into the macro area ;
179
180 ^origcodeh getw ^nei putw
181
182 [ ]=mloop
183
184 ^nei getw getb ^macrh getw putb
185 ^nei @++
186 ^macrh @++
187 ^nei getw ^codeh getw - ^mloop toif
188
189 ; change codeh back to origcodeh ;
190
191 ^origcodeh getw ^codeh putw
192
193 end
194 ]=MacroInstead
195 [
196 begin
197 ^tokenba getw ++ getb \58 - zero ^MacroInstead toif
198
199 ; copy everything from origcodeh to codeh into a safe area ;
200
201 ^origcodeh getw ^nei putw
202
203 [ ]=neloop
204
205 ^nei getw getb ^safeh getw putb
206 ^nei @++
207 ^safeh @++
208 ^nei getw ^codeh getw - ^neloop toif
209
210 ; change codeh back to origcodeh ;
211
212 ^origcodeh getw ^codeh putw
213 end
214 ]=NotEmpty
215 [ begin ^tokenba getw \2 + ^ebtokptr putw end ]=incebtokptr
216 [
217 ; insert name into dictionary ;
218 ^namestart getw ^ebdatlen getw ^ebtokptr getw ^InsertSymbol do
219 end
220 ]=NameIt
221 [
222 begin
223
224 ^tokenba getw ++ ^ebtokptr putw
225 ^tokenba getw ++ getb \58 - zero ^incebtokptr doif
226 ^tokenba getw ++ getb \61 - zero ^incebtokptr doif
227
228 ^safeh getw dup ^safestart putw ^namestart putw ; track starts ;
229
230 ^ebtokptr getw lenz ^ebtoklen putw
231
232 ^stach getw \2 - ^stach putw
233 ^stach getw getw ^origcodeh putw ; get original code head ;
234
235 ^codeh getw ^origcodeh getw - ^ebdatlen putw
236
237 ^stach getw ^stacba getw - ^AdjustName doif
238
239 ^ebdatlen getw \0 > ^NotEmpty doif
240
241 ; write push instruction if '=' or ':' not used ;
242
243 ^tokenba getw ++ getb \58 - zero ^NameIt toif
244 ^tokenba getw ++ getb \61 - zero ^NameIt toif
245
246 \184 ^codeh getw putb
247 \80 ^codeh getw \3 + putb
248 ^safestart getw ^safeba getw \260 - - ^codeh getw ++ putw
249
250 ^codeh getw \4 + ^codeh putw
251
252 ^tokenba getw ++ getb ^NameIt toif
253 end
254 ]=EndBlock
255
256 ;--------------------------------------;
257
258 [ __0 ]=urctr
259 [ __0 ]=urlen
260 [ __0 ]=uraddr
261 [
262 ^codeh getw -- -- ^codeh putw
263 ; ^codeh \4 \2 fwrite ^crlf \2 \2 fwrite ;
264 end
265 ]=wipeit
266 [
267 ^codeh getw \2 - getb \80 - zero ^wipeit toif end
268 ]=peep
269 [
270 ^codeh getw -- getb \88 - zero ^peep toif end
271 ]=peepok
272 [
273 begin
274 ^codeh getw ^codeba getw - ^peepok toif end
275 ]=clean
276 [
277 [ ]=urloop
278 ^uraddr getw getb ^codeh getw putb
279 ^uraddr @++
280 ^codeh @++
281 ^urlen @--
282 ^urctr @++
283
284 ^urctr getw -- zero ^clean doif
285
286 ^urlen getw ^urloop toif
287 end
288 ]=curloop
289 [
290 begin
291 \0 ^urctr putw ^tokenba getw ^LookupSymbol do ^urlen putw
292 dup zero ^badtok toif
293 ^uraddr putw
294
295 ; copy urlen bytes from uraddr to codeh ;
296
297 ; 1999.10.14 peephole optimization commented out.
298 it crashes and it's not strictly necessary. someday, perhaps... ;
299
300 ^urlen getw ^curloop toif
301
302 end
303 ]=Unroll
304
305 ;--------------------------------------;
306
307 [ end ]=goodc ; char was dupped and is on stack ;
308 [
309 begin
310 [ ]=floop
311 qinc dup \59 - ^goodc toif pop ; return good char if not semicolon ;
312 [ ]=cloop
313 qinc \59 - zero ^floop toif
314 ^cloop to
315 end
316 ]=scanc
317
318 [ _0 ]=inbyte
319 [ _0 ]=eoff
320 [ \1 ^eoff putb end ]=goteof
321 [
322 begin
323 ^tokenba getw ^tokenh putw
324 [ ]=scanloop
325 ^scanc do ^inbyte putb
326 ^inbyte getb zero ^goteof toif
327 \33 ^inbyte getb > ^scanloop toif
328
329 [ ]=scisloop
330 ^inbyte getb ^tokenh getw putb ^tokenh @++ ;write char to token;
331
332 ^scanc do ^inbyte putb
333 ^inbyte getb zero ^goteof toif
334 ^inbyte getb \32 > ^scisloop toif
335
336 \0 ^tokenh getw putb
337 end
338 ]=scantok
339
340 ; --- startup --- get dynamic memory off of heap --- ;
341
342 \16384 malloc dup ^safeba putw \2 + ^safeh putw
343 \4096 malloc dup ^macrba putw ^macrh putw
344 \4096 malloc dup ^codeba putw ^codeh putw
345 \256 malloc dup ^stacba putw ^stach putw
346 \128 malloc dup ^tokenba putw ^tokenh putw
347
348 [
349 ; write output file ;
350
351 ; put in a jump over the safe area ;
352
353 ^safeh getw ^safeba getw - ++
354
355 \233 outc
356 dup \255 & outc \8 >> \255 & outc
357 \144 outc
358
359 ; make the first word of the safe area an offset ;
360 ; to just past the last word of the code ;
361
362 ^safeh getw ^safeba getw - ^codeh getw + ^codeba getw \260 - - ^safeba getw putw
363
364 ^safeba getw ^safeh getw ^safeba getw - outs
365 ^codeba getw ^codeh getw ^codeba getw - outs
366 \0 halt
367 ]=tail
368 [ [ ]=main
369 ^scantok do
370 ^eoff getb ^tail toif
371 ^tokenba getw getb \91 - \5 > ^Unroll doif
372 ^tokenba getw getb \91 - zero ^BeginBlock doif
373 ^tokenba getw getb \92 - zero ^PushWord doif
374 ^tokenba getw getb \93 - zero ^EndBlock doif
375 ^tokenba getw getb \94 - zero ^PushPointer doif
376 ^tokenba getw getb \95 - zero ^LiteralByte doif
377 ^tokenba getw getb \96 - zero ^String doif
378 ^main to
379 ]=Shelta ^Shelta to
380
381 ; end of sheltas.she ;
0 ;
1 str.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies.
2 Demonstrates searching a list of strings.
3 ;
4
5 [ __0 ]=head
6 [ __0 ]=newn
7
8 [ `Moe _0 ]=s1
9 [ `Curly _0 ]=s2
10 [ `Larry _0 ]=s3
11
12 [ `Larry _0 ]=target ; change this variable to test ;
13
14 [ ; strz -> strz ;
15 begin dup lenz ++ malloc cpzz end
16 ]=strdup
17
18 [ ; stooge -> void ;
19 begin
20 ^strdup do
21 ^head getw \2 ll-node dup ^newn putw ll-link
22 ^newn getw dup ^head putw ll-dptr putw
23 end
24 ]=add-stooge
25
26 [ [ `No ] \2 outs \1 halt ]=no
27 [ [ `Yes ] \3 outs \0 halt ]=yes
28
29 [
30
31 ^s1 ^add-stooge do
32 ^s2 ^add-stooge do
33 ^s3 ^add-stooge do
34
35 ^head getw ^newn putw
36
37 [ ]=cloop
38 ^newn getw zero ^no toif
39 ^newn getw ll-dptr getw ^target eqzz ^yes toif
40 ^newn getw ll-next ^newn putw
41 ^cloop to
42
43 ] to
0 ;
1 8086\8086.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 Defines the instructions of the Intel 8086 chip and it's successors.
3 ;
4
5 [ _244 ]:hlt
6
7 [ _146 ]:xchg-dx-ax
8 [ _147 ]:xchg-bx-ax
9 [ _145 ]:xchg-cx-ax
10
11 [ _80 ]:push-ax
12 [ _83 ]:push-bx
13 [ _81 ]:push-cx
14 [ _82 ]:push-dx
15
16 [ _255 _55 ]:push[bx]
17
18 [ _57 _195 ]:cmp-bx-ax
19
20 [ _161 ]:mov-ax[]
21 [ _163 ]:mov[]ax
22
23 [ _142 _6 ]:mov-es[]
24
25 [ _88 ]:pop-ax
26 [ _91 ]:pop-bx
27 [ _89 ]:pop-cx
28 [ _90 ]:pop-dx
29
30 [ _95 ]:pop-di
31 [ _94 ]:pop-si
32
33 [ _86 ]:push-si
34 [ _87 ]:push-di
35
36 [ _138 _4 ]:mov-al[si]
37 [ _58 _5 ]:cmp-al[di]
38
39 [ _139 _5 ]:mov-ax[di]
40
41 [ _211 _224 ]:shl-ax-cl
42 [ _211 _232 ]:shr-ax-cl
43
44 [ _209 _224 ]:shl-ax-1
45 [ _209 _232 ]:shr-ax-1
46
47 [ _180 ]:mov-ah
48 [ _176 ]:mov-al
49 [ _177 ]:mov-cl
50 [ _185 ]:mov-cx
51 [ _187 ]:mov-bx
52 [ _179 ]:mov-bl
53
54 [ _255 _208 ]:call-ax
55 [ _255 _224 ]:jmp-ax
56
57 [ _50 _192 ]:xor-al-al
58 [ _50 _228 ]:xor-ah-ah
59 [ _48 _255 ]:xor-bh-bh
60 [ _38 ]:es
61
62 [ _137 _195 ]:mov-bx-ax
63 [ _137 _194 ]:mov-dx-ax
64 [ _137 _193 ]:mov-cx-ax
65
66 [ _136 _204 ]:mov-ah-cl
67 [ _136 _206 ]:mov-dh-cl
68
69 [ _139 _210 ]:mov-bx-dx
70 [ _136 _7 ]:mov[bx]al
71 [ _137 _7 ]:mov[bx]ax
72 [ _137 _15 ]:mov[bx]cx
73 [ _136 _15 ]:mov[bx]cl
74 [ _139 _7 ]:mov-ax[bx]
75 [ _138 _15 ]:mov-cl[bx]
76
77 [ _136 _5 ]:mov[di]al
78
79 [ _116 ]:je
80 [ _117 ]:jne
81 [ _114 ]:jb
82 [ _119 ]:ja
83 [ _235 ]:jmp
84 [ _11 _192 ]:or-ax-ax
85 [ _10 _192 ]:or-al-al
86 [ _9 _210 ]:or-dx-dx
87 [ _247 _208 ]:not-ax
88
89 [ _131 _251 ]:cmp-bx
90
91 [ _70 ]:inc-si
92 [ _71 ]:inc-di
93
94 [ _67 ]:inc-bx
95 [ _74 ]:dec-dx
96
97 [ _128 _228 ]:and-ah
98 [ _35 _194 ]:and-ax-dx
99 [ _11 _194 ]:or-ax-dx
100 [ _51 _194 ]:xor-ax-dx
101
102 [ _49 _201 ]:xor-cx-cx
103 [ _51 _192 ]:xor-ax-ax
104 [ _49 _210 ]:xor-dx-dx
105
106 [ _1 _208 ]:add-ax-dx
107 [ _41 _208 ]:sub-ax-dx
108 [ _41 _194 ]:sub-dx-ax
109
110 [ _247 _234 ]:imul-dx
111 [ _247 _249 ]:idiv-cx
112
113 [ _64 ]:inc-ax
114 [ _72 ]:dec-ax
115
116 [ _159 ]:lahf
117 [ _235 ]:jmp
118 [ _144 ]:nop
119
120 [ _205 ]:int
0 ;
1 8086\bios.she v1999.12.23 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 BIOS interface for the OS-dependent part of GUPI.
3 ;
4
5 ;interrupt # for keybd ; [ _22 ]:keybd
6 ;interrupt # for video ; [ _16 ]:video
7
8 ; void -> halt; [ pop-ax jmp _254 ]:halt
9
10 ; char -> void; [ pop-ax mov-ah _14 mov-bl _15 int video ]:outc
11
12 ;string sizeb -> void; [ pop-dx pop-si mov-al[si]
13 mov-ah _14 mov-bl _15 int video
14 inc-si dec-dx or-dx-dx jne _242 ]:outs
15
16 ; void -> char; [ xor-ah-ah int keybd xor-ah-ah push-ax ]:qinc
17 ; void -> char; [ qinc dup outc ]:inc
18 ; void -> bool; [ mov-ah _1 int keybd je _4 inc-ax jmp _3 nop xor-ax-ax push-ax ]:chkin
19 ; void -> void; [ mov-ah _1 int keybd je _6 xor-ah-ah int keybd jmp _244 ]:flin
0 ;
1 8086\dos.she v1999.12.23 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 DOS interface for the OS-dependent part of GUPI.
3 ;
4
5 ;interrupt # for DOS ; [ _33 ]:dos
6
7 ; void -> halt; [ pop-ax mov-ah _76 int dos ]:halt
8 ;string sizeb -> void; [ mov-ah _64 _187 _1 _0 pop-cx pop-dx int dos ]:outs
9 ; char -> void; [ mov-ah _2 pop-dx int dos ]:outc
10 ; void -> char; [ mov-ah _1 int dos push-ax ]:inc
11 ; void -> char; [ mov-ah _7 int dos xor-ah-ah push-ax ]:qinc
12 ; void -> bool; [ mov-ah _11 int dos xor-ah-ah push-ax ]:chkin
13 ; void -> void; [ xor-ax-ax mov-ah _12 int dos ]:flin
14
0 ;
1 8086\fileio.she v1999.12.23
2 (c)1999 Chris Pressey, Cat's-Eye Technologies.
3 DOS file functions.
4 ;
5
6 ; zfnm -> fhdl ;
7 [ pop-dx mov-ah _61 xor-al-al int dos push-ax ]:fopenz
8
9 ; string fhdl -> fstat ;
10 [ mov-ah _63 pop-bx mov-cx __1 pop-dx int dos push-ax ]:freadc
11
12 ; zfnm -> fhdl ;
13 [ pop-dx mov-ah _60 xor-cx-cx int dos ]:fcreatez
14
15 ;str szb fhdl -> void ;
16 [ mov-ah _64 pop-bx pop-cx pop-dx int dos ]:fwrite
17
18 ; fhdl -> void ;
19 [ pop-bx mov-ah _62 int dos ]:fclose
20
21 ; quit? [ begin ^tokenba getw dup lenz \2 fwrite halt end ]=fnhalt ;
0 ;
1 8086\gupi.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 8086-compatible semantics for GUPI.
3 ;
4
5 ; input stack -> output stack ;
6 ; bottom..top -> top..bottom ;
7
8 ; word -> void ; [ pop-ax ]:pop
9 ; word -> word word; [ pop-ax push-ax push-ax ]:dup
10 ; wrd1 wrd2 -> wrd1 wrd2; [ pop-ax pop-bx push-ax push-bx ]:swap
11
12 ; addr -> byte ; [ pop-ax xchg-bx-ax mov-ax[bx] xor-ah-ah push-ax ]:getb
13 ; byte addr -> void ; [ pop-ax xchg-bx-ax pop-cx mov[bx]cl ]:putb
14 ; addr -> word ; [ pop-ax xchg-bx-ax push[bx] ]:getw
15 ; word addr -> void ; [ pop-ax xchg-bx-ax pop-cx mov[bx]cx ]:putw
16
17 ; word -> word ; [ pop-ax inc-ax push-ax ]:++
18 ; word -> word ; [ pop-ax dec-ax push-ax ]:--
19 ; word -> word ; [ pop-ax shl-ax-1 push-ax ]:**
20 ; word -> word ; [ pop-ax shr-ax-1 push-ax ]://
21
22 ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax shl-ax-cl push-ax ]:<<
23 ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax shr-ax-cl push-ax ]:>>
24
25 ; addr -> void ; [ pop-bx mov-ax[bx] inc-ax mov[bx]ax ]:@++
26 ; addr -> void ; [ pop-bx mov-ax[bx] dec-ax mov[bx]ax ]:@--
27
28 ; word word -> word ; [ pop-ax pop-dx add-ax-dx push-ax ]:+
29 ; word word -> word ; [ pop-ax pop-dx sub-dx-ax xchg-dx-ax push-ax ]:-
30 ; word word -> word ; [ pop-ax pop-dx imul-dx push-ax ]:*
31 ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-ax ]:/
32 ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-dx ]:%
33 ; word word -> word word; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-dx push-ax ]:/%
34 ; word word -> word ; [ pop-ax pop-dx or-ax-dx push-ax ]:|
35 ; word word -> word ; [ pop-ax pop-dx and-ax-dx push-ax ]:&
36 ; word word -> word ; [ pop-ax pop-dx xor-ax-dx push-ax ]:~
37 ; word -> word ; [ pop-ax not-ax push-ax ]:!
38 ; word -> word ; [ pop-ax or-ax-ax je _4 xor-ax-ax jmp _1 inc-ax push-ax ]:zero
39 ; word word -> word ; [ pop-ax pop-bx cmp-bx-ax ja _4 xor-ax-ax jmp _1 inc-ax push-ax ]:>
40
41 ; addr -> (call) ; [ pop-ax call-ax ]:do
42 ; addr -> (branch) ; [ pop-ax jmp-ax ]:to
43 ; bool addr -> (call) ; [ pop-ax pop-dx or-dx-dx je _2 jmp-ax ]:toif
44 ; bool addr -> (branch) ; [ pop-ax pop-dx or-dx-dx je _2 call-ax ]:doif
45
46 ; memory for call stack: ; [ __0 __0 __0 __0 __0 __0 __0 __0
47 __0 __0 __0 __0 __0 __0 __0 __0
48 __0 __0 __0 __0 __0 __0 __0 __0
49 __0 __0 __0 __0 __0 __0 __0 __0 ]=clstk
50 ; memory for stack pointer; [ __0 ]=clsp
51
52 ; (call) -> void ; [ pop-ax _139 _30 _^clsp _137 _135 _^clstk _131 _6 _^clsp _2 ]:begin
53 ; void -> (return) ; [ _131 _46 _^clsp _2 _139 _30 _^clsp _139 _135 _^clstk push-ax _195 ]:end
54
55 ; sizw -> ptrw ;
56 [ mov-bx __260 mov-ax[bx]
57 pop-dx
58 push-ax
59 add-ax-dx
60 mov-bx __260 mov[bx]ax ]:malloc
61
62 ; ptrw -> void ;
63 [ pop-ax mov-bx __260 mov[bx]ax ]:mfree
64
0 ;
1 8086\string.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 GUPI string-manipulation extensions.
3 ;
4
5 ; strz strz -> bool ;
6 [
7 pop-di pop-si
8 mov-al[si] cmp-al[di]
9 je _5
10 xor-ax-ax
11 jmp _13
12 nop
13 or-al-al
14 je _4
15 inc-si inc-di
16 jmp _237
17 xor-ah-ah mov-al _1
18 push-ax
19 ]:eqzz
20
21 ; strz strz -> strz(2) ;
22 [
23 pop-di pop-si push-di
24 mov-al[si] mov[di]al
25 or-al-al
26 je _4
27 inc-si inc-di
28 jmp _244
29 ]:cpzz
30
31 ; strz -> word ;
32 [
33 pop-si
34 push-si
35 mov-al[si]
36 or-al-al
37 je _3
38 inc-si
39 jmp _247
40 pop-dx
41 push-si
42 pop-ax
43 sub-ax-dx
44 push-ax
45 ]:lenz
46
47 [ ; strz -> strz ;
48 dup lenz ++ malloc cpzz
49 ]:dupz
50
51
0 ;
1 gupi\linklist.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies.
2 GUPI linked list extensions.
3 ;
4
5 ; size -> node ; [ \2 + malloc ]:ll-node
6 ; next node -> void ; [ putw ]:ll-link
7 ; node -> next ; [ getw ]:ll-next
8 ; node -> data-ptr ; [ \2 + ]:ll-dptr
0 IDEAL
1
2 ; shelta86.asm v1999.10.20 (c)1999 Chris Pressey, Cat's-Eye Technologies.
3 ; Implements an assembler/compiler for the Shelta language, in 8086 assembly.
4
5 ; * Special thanks to Ben Olmstead (BEM) for his suggestions for how to
6 ; reduce SHELTA86.COM's size even further.
7
8 MODEL tiny
9 P8086
10
11 DATASEG
12
13 symth dw symt
14 codeh dw code
15 stach dw stac
16 safeh dw safe + 2
17 macrh dw macr
18
19 ttable dw BeginBlock, PushWord, EndBlock, PushPointer, LiteralByte ; , String
20 ; [ \ ] ^ _ `
21
22 UDATASEG
23
24 token db 128 dup (?)
25
26 safestart dw ?
27 namestart dw ?
28 toklength dw ?
29
30 safe db 16384 dup (?)
31 symt db 16384 dup (?) ; 16K + 16K = 32K
32 code db 4096 dup (?) ;
33 macr db 4096 dup (?) ; + 8K = 40K
34 stac db 256 dup (?)
35
36 CODESEG
37 ORG 0100h
38
39 ; EQUATES
40
41 safeadj EQU (offset safe - 0104h)
42 codeadj EQU (offset code - 0104h)
43
44 ; Main program.
45 PROC Main
46
47 WhileFile:
48
49 ; ----- begin scanning token
50
51 call ScanChar ; get char -> al
52 or al, al
53 jz @@EndFile
54 cmp al, 32
55 jbe WhileFile ; repeat if char is whitespace
56
57 mov di, offset token
58 cld
59
60 @@TokenLoop: stosb ; put char in token
61 call ScanChar ; get char
62 cmp al, 32
63 ja @@TokenLoop ; repeat if char is not whitespace
64
65 @@Terminate: mov [byte di], 0 ; return null-terminated token
66
67 ; ----- end scanning token
68
69 mov si, offset token + 1
70
71 mov al, [byte token]
72 sub al, '['
73 cmp al, 4
74 ja @@Unroll
75
76 xor ah, ah
77 shl ax, 1
78 xchg bx, ax
79 mov ax, [offset ttable + bx]
80 jmp ax ; jump to handler as listed in ttable
81
82 @@Unroll: dec si ; start at first character of token
83 call LookupSymbol ; destroys DI & SI, but that's OK
84
85 ; copy cx bytes from ax to codeh
86
87 xchg ax, si
88 mov di, [codeh] ; use di to track codeh
89 rep movsb
90
91 UpCodeH: mov [codeh], di
92 jmp short WhileFile
93
94 @@EndFile: ; put in a jump over the safe area
95
96 mov ax, [safeh]
97 sub ax, offset safe - 1
98 mov bx, offset token ; re-use token
99 mov [byte bx], 0e9h
100 mov [word bx + 1], ax
101 mov [byte bx + 3], 90h
102
103 mov cx, 4
104 mov dx, offset token
105 call WriteIt
106
107 ; make the first word of the safe area an offset
108 ; to just past the last word of the code
109
110 mov cx, [safeh]
111 mov dx, offset safe
112 sub cx, dx
113 mov ax, cx
114 add ax, [codeh]
115 sub ax, codeadj
116 mov [word safe], ax
117
118 call WriteIt
119
120 mov cx, [codeh]
121 mov dx, offset code
122 sub cx, dx
123 call WriteIt
124
125 xor al, al
126
127 GlobalExit: mov ah, 4ch ; exit to DOS
128 int 21h
129 ENDP Main
130
131 PROC WriteIt
132
133 mov ah, 40h
134 mov bx, 1
135 int 21h
136 jnc @@OK
137 mov al, 32
138 jmp short GlobalExit
139 @@OK: ret
140 ENDP WriteIt
141
142 ; -------------------------------- HANDLERS --------------------------- ;
143 ; When coming into any handler, di will equal the address of the null
144 ; (that is, the number of characters in the token + offset token)
145
146 ; ==== [ ==== BEGIN BLOCK ==== ;
147
148 BeginBlock: mov di, [stach] ; push [ onto stack
149 mov ax, [codeh]
150 stosw ; mov [bx], ax
151 mov [stach], di
152 jmp WhileFile
153
154 ; ==== ] ==== END BLOCK ==== ;
155
156 EndBlock: ;mov si, offset token + 1 ; si = token + 1 until...
157 ;cmp [byte ds:si], '='
158 ;je @@Smaller
159 ;cmp [byte ds:si], ':'
160 ;je @@Smaller
161 ;jmp short @@CarryOn
162 ; remove : or = from length
163 @@Smaller: dec di ; di left over from scanning token
164
165 @@CarryOn: mov bx, di ; di now free to hold something until @@WName
166 sub bx, si ; get length
167
168 mov ax, [safeh]
169 mov [safestart], ax
170 mov [namestart], ax
171 xchg ax, di ; di now holds safe area head location
172
173 mov [toklength], bx ; length of token
174 sub [stach], 2
175 mov bx, [stach] ; pop [ from stack
176
177 mov ax, [bx] ; ax = codeh when [ happened
178
179 mov bp, [codeh] ; find length
180 sub bp, ax
181 ; mov bp, bx ; bp = length of data between [ ... ]
182 ; until @@WName below... ugh
183
184 cmp [stach], offset stac
185 je @@StackEmpty
186
187
188 mov bx, [stach]
189 sub bx, 2
190 mov cx, [bx]
191
192 ; namestart = [namestart] - (cx - ax)
193
194 sub cx, ax
195 sub [namestart], cx
196
197 ; if dlength > 0,
198
199 @@StackEmpty: ;or bp, bp
200 ;jz @@Empty
201
202 cmp [byte si], ':' ; si still = offset token + 1
203 jne @@PreCopyLoop
204
205 mov di, [macrh] ; use macro area instead of safe if :
206 mov [namestart], di
207
208 ; copy everything from ax to codeh into the di area
209
210 @@PreCopyLoop: mov dx, ax
211 mov cx, bp ; [codeh] sub cx, ax
212 push si
213 xchg si, ax
214 rep movsb
215 pop si
216
217 ; change codeh back to dx (old codeh before [)
218
219 mov [codeh], dx
220
221 ;mov si, offset token + 1
222 cmp [byte si], ':' ; si still = offset token + 1
223 je @@UpdateMacr
224
225 mov [safeh], di
226 jmp short @@Empty
227 @@UpdateMacr: mov [macrh], di
228 ;jmp short @@NameIt
229
230 ; write push instruction if '=' or ':' not used
231
232 @@Empty: ;cmp [byte si], '=' ; si still = offset token + 1
233 ;je @@NameIt
234
235 ;mov ax, [safestart]
236 ;sub ax, safeadj
237 ;mov bx, [word codeh]
238 ;mov [byte bx], 0b8h
239 ;mov [word bx + 1], ax
240 ;mov [byte bx + 3], 50h
241 ;add [codeh], 4
242
243 ;cmp [byte si], 0 ; still offset token + 1!
244 ;je @@Anonymous
245
246 ; insert namestart into dictionary
247
248 @@NameIt: mov cx, [namestart]
249 mov ax, [toklength]
250
251 ;cmp [byte si], '='
252 ;je @@Bigger
253 ;cmp [byte si], ':'
254 ;je @@Bigger
255 ;jmp short @@WName
256
257 @@Bigger: inc si
258
259 @@WName: ; Destroys DI but that's OK.
260 ; INPUT: bx = ADDRESS of token to insert, ax = length of symbol,
261 ; cx = pointer to data, dx = length of data
262 ; OUTPUT: ds:bx = pointer to newly allocated symbol
263
264 mov di, [symth] ; di no longer contains macrh/safeh
265 add ax, 6 ; 1 word for length, 1 for ptr, 1 for data length
266 add [symth], ax
267
268 stosw ; mov [word di], ax ; place ax length in symt
269
270 sub ax, 6
271 xchg cx, ax ; cx <- ax; ax <- cx
272 stosw ; mov [word di], cx ; place cx (ptr to data)
273 xchg ax, bp
274 stosw ; mov [word di], bp ; place bp (ptr length)
275
276 rep movsb
277
278 mov [symth], di
279
280 @@Anonymous: jmp WhileFile
281
282 ; ==== ^ ==== PUSH POINTER ==== ;
283
284 PushPointer: ;mov si, offset token + 1
285 call LookupSymbol ; destroys di & si, should be OK
286
287 sub ax, safeadj
288 mov di, [word codeh]
289 jmp short WritePush
290
291 ; ==== ` ==== STRING ==== ;
292 ;
293 ;String: ;mov si, offset token + 1
294 ; mov di, [codeh]
295 ;@@Loop: mov al, [byte ds:si]
296 ; stosb
297 ; inc si
298 ; cmp [byte ds:si], 0
299 ; jne @@Loop
300 ; jmp UpCodeH
301
302 ; ==== _ ==== LITERAL BYTE ==== ;
303
304 LiteralByte: ;mov si, offset token + 1
305 cmp [byte si], '_'
306 je LiteralWord
307 cmp [byte si], '^'
308 je LiteralSymbol
309 call DecipherDecimal ; destroys DI, that's OK
310 stosb ; mov [byte bx], al
311 CheapTrick: mov [codeh], di
312 jmp WhileFile
313
314 ; ==== __ ==== LITERAL WORD ==== ;
315
316 LiteralWord: inc si
317 call DecipherDecimal ; destroys DI, that's OK
318 FunkyTrick: stosw ; mov [word bx], ax
319 jmp short CheapTrick
320
321 ; ==== _^ ==== LITERAL SYMBOL ==== ;
322
323 LiteralSymbol: inc si
324 call LookupSymbol ; destroys DI & SI, that's OK
325
326 sub ax, safeadj
327
328 mov di, [word codeh]
329 jmp short FunkyTrick
330 ;mov [word bx], ax
331 ;inc [codeh]
332 ;jmp short CheapTrick
333
334 ; ==== \ ==== PUSH WORD ==== ;
335
336 PushWord: ;mov si, offset token + 1
337 call DecipherDecimal ; destroys di, that's OK
338
339 WritePush: mov [byte di], 0b8h ; B8h, low byte, high byte, 50h
340 inc di
341 stosw ; mov [word di + 1], ax
342 mov al, 50h
343 stosb
344 mov [codeh], di
345 jmp WhileFile
346
347 ; -------------------------------- SUBROUTINES --------------------------- ;
348
349 PROC DecipherDecimal ; uses and destroys DI
350 ; INPUT: si = address of token
351 ; OUTPUT: ax = value, di = codeh
352
353
354 xor di, di
355
356 @@Loop: lodsb ; mov al, [byte ds:si], inc si
357
358 mov bx, di
359 mov cl, 3
360 shl bx, cl
361 mov cx, di
362 shl cx, 1
363 add bx, cx
364
365 sub al, '0'
366 cbw
367 add bx, ax
368 mov di, bx
369
370 cmp [byte ds:si], '0'
371 jae @@Loop
372
373 xchg ax, di
374 mov di, [word codeh]
375 ret
376 ENDP DecipherDecimal
377
378 PROC ScanChar
379 ; Scans a single character from the input file, placing
380 ; it in register al, which will be 0 upon error
381 ; or eof (so don't embed nulls in the Shelta source...)
382
383 mov ah, 7 ; read from stdin one byte
384 int 21h
385 cmp al, ';' ; check for comment
386 je @@Comment
387 ret
388 @@Comment: mov ah, 7 ; read from stdin one byte
389 int 21h
390 cmp al, ';' ; check for comment
391 jne @@Comment
392 jmp short ScanChar
393
394 ENDP ScanChar
395
396 PROC LookupSymbol
397 ; INPUT: si = address of symbol to find, di = address of null termination
398 ; OUTPUT: ds:ax = pointer to contents or zero if not found
399 ; cx = length of contents
400
401 mov bx, offset symt ; bx starts at symbol table
402 mov bp, si
403 sub di, si
404
405 @@Loop: mov ax, [word bx] ; first word = token size
406
407 mov dx, bx ; keep track of start of this symt entry
408
409 sub ax, 6
410 cmp ax, di
411 jne @@Exit ; if it doesn't fit, you must acquit
412
413 add bx, 6 ; bx now points to token in symbol table
414
415 ; exit if right token
416
417 xor si, si ; reset si to token
418 @@Inner: mov al, [byte ds:bx] ; get byte from bx=symt
419 cmp [byte bp + si], al ; compare to si=token
420 jne @@Exit
421 inc bx
422 inc si
423 cmp si, di ; hit the length yet?
424 jb @@Inner ; no, repeat
425
426 ; a match!
427
428 mov bx, dx
429 mov cx, [word bx + 4] ; third word = data length
430 mov ax, [word bx + 2] ; second word = data ptr
431 ret
432
433 @@Exit: mov bx, dx
434 mov ax, [word bx]
435 add bx, ax
436 cmp bx, [symth]
437 jb @@Loop
438
439 mov al, 16 ; return 16 if unknown identifier
440 jmp GlobalExit
441
442 ENDP LookupSymbol
443
444 END Main