Initial import of Shelta version 1.0 revision 1999.1223 sources.
Cat's Eye Technologies
10 years ago
0 | @echo off | |
1 | REM BOOTSTRP.BAT v2002.1208 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | REM Builds the bootstrapped versions (S & S2) of the Shelta compiler. | |
3 | @echo on | |
4 | call bin\shelta 86 prj\sheltas | |
5 | copy prj\sheltas.com bin\sheltas.com | |
6 | call bin\shelta s prj\sheltas | |
7 | copy prj\sheltas.com bin\sheltas2.com | |
8 | call bin\shelta s2 prj\sheltas | |
9 | diff prj\sheltas.com bin\sheltas2.com | |
10 | del prj\sheltas.com⏎ |
Binary diff not shown
0 | @echo off | |
1 | REM SHELTA.BAT v2002.1208 (c)2002 Cat's-Eye Technologies. | |
2 | REM A 'make'-like utility for Shelta compilers, as an MS-DOS batch. | |
3 | ||
4 | REM -- Change the following lines to tailor what libraries are | |
5 | REM -- included by default. See readme.txt | |
6 | type lib\8086\8086.she >s | |
7 | type lib\8086\gupi.she >>s | |
8 | type lib\8086\dos.she >>s | |
9 | type lib\8086\string.she >>s | |
10 | type lib\gupi\linklist.she >>s | |
11 | ||
12 | REM -- This section builds the source file, always called 'S'. | |
13 | if not exist %2.she echo Can't find project file %2.she! | |
14 | if exist %3.she type %3.she >>s | |
15 | if exist %4.she type %4.she >>s | |
16 | if exist %5.she type %5.she >>s | |
17 | if exist %6.she type %6.she >>s | |
18 | if exist %7.she type %7.she >>s | |
19 | if exist %8.she type %8.she >>s | |
20 | if exist %9.she type %9.she >>s | |
21 | if exist %2.she type %2.she >>s | |
22 | type null.txt >>s | |
23 | ||
24 | bin\shelta%1.com <s > %2.com | |
25 | ||
26 | if errorlevel 32 echo Source file could not be opened. | |
27 | if errorlevel 16 echo Error - Unknown identifier in source file. | |
28 | del s |
Binary diff not shown
Binary diff not shown
Binary diff not shown
0 | Making the Snake Eat its Tail: Bootstrapping | |
1 | -------------------------------------------- | |
2 | Oct 20 1999, Chris Pressey, Cat's Eye Technologies. | |
3 | ||
4 | What is bootstrapping? | |
5 | ---------------------- | |
6 | ||
7 | Bootstrapping is the act of implementing a compiler for a language in | |
8 | that same language, or a subset of it. It is a well-understood aspect | |
9 | of compilation and the translation of compilers from one machine onto | |
10 | a different machine. | |
11 | ||
12 | Bootstrapping is a fairly esoteric discipline, however, partly because | |
13 | there's little need to do it more than once for any given compiler and | |
14 | any given machine, but also because at a basic level, bootstrapping is | |
15 | somewhat difficult to understand. | |
16 | ||
17 | How, for example, do you write a compiler that compiles itself, without | |
18 | first having the compiler??? Sounds more than a little bit like the | |
19 | "Which came first, the chicken or the egg" paradox. | |
20 | ||
21 | The term 'bootstrap' itself comes from the whimsical idea that if you | |
22 | were to bend over and tug at the straps on your own boots, you could | |
23 | lift yourself off the ground. | |
24 | ||
25 | So what's the trick to making a compiler levitate? | |
26 | -------------------------------------------------- | |
27 | ||
28 | Well, first of all let me put your mind at rest - there's no paradox | |
29 | or magic or anything else spooky involved, although it can feel that | |
30 | way sometimes. There are in fact two realistic options for | |
31 | bootstrapping: | |
32 | ||
33 | - Write (on paper) the compiler in the language which it compiles, | |
34 | then hand-translate (i.e. manually compile) it to assembly or | |
35 | machine language. This approach has been the one taken for the | |
36 | first compilers for both Pascal and LISP. | |
37 | ||
38 | - First write a compiler for the language in another, already-available | |
39 | language, such as assembly language, or C. Then re-write that compiler | |
40 | in the language which it compiles. This is the approach many | |
41 | bootstraps have taken. | |
42 | ||
43 | How was Shelta bootstrapped? | |
44 | ---------------------------- | |
45 | ||
46 | I took the second approach. | |
47 | ||
48 | First I wrote the Shelta compiler SHELTA86.COM in assembly language | |
49 | (SHELTA86.ASM) using Turbo Assembler 3.1. | |
50 | ||
51 | +----------------+ | |
52 | SHELTA86.ASM | TASM ---> 8086 | SHELTA86.COM | |
53 | +----+ +----+ | |
54 | | 8086 | | |
55 | +------+ | |
56 | TASM.EXE | |
57 | ||
58 | This is a 'tee diagram' as is commonly used by people who have to | |
59 | do these sorts of things... it's pretty simple to understand. | |
60 | ||
61 | The filename on the left is the input into the 'tee', the filename on | |
62 | the right is the output of the 'tee'. The filename on the bottom is | |
63 | the tool which is translating the input into the output. The formats | |
64 | listed inside the tee are the languages each of the files is written in. | |
65 | ||
66 | Because the output of this tee is a compiler, however, it can exist as | |
67 | a tee in it's own right: | |
68 | ||
69 | +-----------------+ | |
70 | | Shelta --> 8086 | | |
71 | +----------+-----+ +----+ | |
72 | SHELTA86.ASM | TASM ---> 8086 | 8086 | | |
73 | +----+ +----+------+ | |
74 | | 8086 | SHELTA86.COM | |
75 | +------+ | |
76 | TASM.EXE | |
77 | ||
78 | So I re-wrote SHELTA86.ASM in Shelta/GUPI, calling it SHELTAS.SHE. | |
79 | ||
80 | +-----------------+ | |
81 | SHELTAS.SHE | Shelta --> 8086 | SHELTAS.COM | |
82 | +----------+-----+ +----+ | |
83 | SHELTA86.ASM | TASM ---> 8086 | 8086 | | |
84 | +----+ +----+------+ | |
85 | | 8086 | SHELTA86.COM | |
86 | +------+ | |
87 | TASM.EXE | |
88 | ||
89 | Lo and behold! A Shelta compiler written in Shelta. But that's not | |
90 | the whole story - at this point the bootstraps have been pulled taut, | |
91 | but there is one more tug that must be made to actually get levitating. | |
92 | The compiler SHELTAS.COM must prove it's worth, meeting it's maker | |
93 | so to speak: | |
94 | ||
95 | +-----------------+ | |
96 | SHELTAS.SHE | Shelta --> 8086 | SHELTAS2.COM | |
97 | +-----------+-----+ +----+ | |
98 | SHELTAS.SHE | Shelta --> 8086 | 8086 | | |
99 | +----------+-----+ +----+------+ | |
100 | SHELTA86.ASM | TASM ---> 8086 | 8086 | SHELTAS.COM | |
101 | +----+ +----+------+ | |
102 | | 8086 | SHELTA86.COM | |
103 | +------+ | |
104 | TASM.EXE | |
105 | ||
106 | Now, because of some subtle differences in SHELTA86.ASM and SHELTAS.SHE | |
107 | (the assembly language version does no optimization), the sizes and | |
108 | contents of all three of these Shelta compilers differ slightly. But | |
109 | if the process was carried on one step further, the resultant compiler | |
110 | would be the same as SHELTAS2.COM. The following might help clarify why | |
111 | this is: | |
112 | ||
113 | SHELTAS.SHE +-----------------+ | |
114 | Optimizing | Shelta --> 8086 | SHELTAS2.COM | |
115 | SHELTAS.SHE +-----------+-----+ +----+ Optimizing | |
116 | Optimizing | Shelta --> 8086 | 8086 | Optimized | |
117 | +----------+-----+ +----+------+ | |
118 | SHELTA86.ASM | TASM ---> 8086 | 8086 | SHELTAS.COM | |
119 | NonOptimizing+----+ +----+------+ Optimizing | |
120 | Hand-Optimized | 8086 | SHELTA86.COM Non-Optimized | |
121 | +------+ Non-Optimizing | |
122 | Hand-Optimized | |
123 | ||
124 | OK, but why did you choose to do this, anyway? | |
125 | ---------------------------------------------- | |
126 | ||
127 | Well, there was certainly no reason to. I was not moving Shelta from | |
128 | one machine to another, nor was I treating SHELTA86.ASM as a quick | |
129 | hack which would be discarded once an optimizing compiler could be | |
130 | bootstrapped. | |
131 | ||
132 | On the other hand, there was no reason *not* to, so... | |
133 | ||
134 | I did it mainly to say that I could. Not everyone can design a language, | |
135 | write a compiler for it in the form of a 1/2-kbyte COM file, then | |
136 | bootstrap it. I'm not sure I can say it was the hardest thing I've | |
137 | ever done, but it was difficult enough. | |
138 | ||
139 | Plus, well, it's the kind of freaky self-referential thing I've always | |
140 | been interested in. A compiler written in the language which it | |
141 | compiles, which in the end appears to have been compiled by itself. | |
142 | ||
143 | In the preceding section I may have made what I did seem like a walk | |
144 | in the park, but it wasn't. A large portion of the time was spent on: | |
145 | ||
146 | - fixing bugs in SHELTA86.ASM | |
147 | - building GUPI so that Shelta could be powerful enough to bootstrap | |
148 | meaningfully (I could have just included SHELTA86.COM entirely as | |
149 | inline assembly, but that would kind of defeat the purpose) | |
150 | - fixing bugs in SHELTAS.SHE | |
151 | - testing SHELTAS2.COM - more concentration than time, actually. | |
152 | When you have five Shelta compilers, two in source form | |
153 | (Turbo Assembler and Shelta) and three in executable form, you're | |
154 | bound to get a little disoriented from having to keep track of | |
155 | their interdependencies. | |
156 | ||
157 | I can also offer the following piece of advice to anyone who is going | |
158 | to be trying something similar: if you've already squashed down your | |
159 | first compiler's source code in order to (say) claim bragging rights on | |
160 | having built an 512-byte compiler, DO NOT attempt to simply translate | |
161 | the optimized assembly code into another language. Rewrite it instead. | |
162 | Especially for a program of this size. Initially trying to do a | |
163 | literal translation from SHELTA86.ASM to SHELTAS.SHE was easily the | |
164 | biggest mistake I made. | |
165 | ||
166 | Where can I find further information on bootstrapping? | |
167 | ------------------------------------------------------ | |
168 | ||
169 | Two books are of note: the notorious "Dragon" book by Aho, Sethi and | |
170 | Ullman gives it a brief once-over; "Compilers and Compiler Generators" | |
171 | by Terry gives it a more thorough and readable treatment. | |
172 | ||
173 | Happy levitating! | |
174 | ||
175 | Chris Pressey, Oct 20 1999 | |
176 | Cat's Eye Technologies, Winnipeg, Manitoba, Canada |
0 | Shelta <Maentwrog Mk IV> | |
1 | ------------------------ | |
2 | * * * NEAR-BETA VERSION v1999.12.23 * * * | |
3 | ||
4 | Shelta <Maentwrog Mk IV> NEAR-BETA (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
5 | All rights reserved. No warranty for suitability of any kind. | |
6 | Consider yourself lucky if your head doesn't blow up. | |
7 | This product is freely redistributable provided that all distributed copies | |
8 | include the entire original unmodified archive. | |
9 | ||
10 | What is Shelta <Maentwrog Mk IV>? | |
11 | --------------------------------- | |
12 | ||
13 | Shelta <Maentwog Mk IV> (which I'll generally just call Shelta during | |
14 | the scope of it's documentation) is a set of interrelated software | |
15 | systems: | |
16 | ||
17 | - Shelta the Language - somewhere between FORTH, FALSE, and Assembler | |
18 | - SHELTA86.COM the Tool - a Shelta compiler written in 8086 asm | |
19 | - SHELTA.BAT the Compiler - organizer ('make') for Shelta files & libraries | |
20 | - SHELTAS.COM and SHELTAS2.COM - Shelta compilers written in Shelta | |
21 | - The GUPI Protocol - a standardized library of Shelta definitions | |
22 | ||
23 | What's the history of Shelta? | |
24 | ----------------------------- | |
25 | ||
26 | My first programming language ever was called Maentwrog, a term taken | |
27 | from that wholly remarkable book, _The Meaning of Liff_, by Douglas | |
28 | Adams. | |
29 | ||
30 | Maentwrog sucked. But, it worked, kind of. It was interpreted, but | |
31 | as I recall, it wasn't even tokenized... making the interpreter more | |
32 | than a little slow. It was basically a subset of FORTH - not much | |
33 | special there. | |
34 | ||
35 | My second programming language ever was based on Maentwrog, and it | |
36 | spawned a big hit called Befunge. Befunge left Maentwrog in the dust, | |
37 | because there WAS much special there - Befunge is 2D, and that's | |
38 | trippy. If you haven't tried to program in Befunge yet... try it! | |
39 | ||
40 | However, I've always felt that I fell somewhat short of the mark I | |
41 | was trying to make with Befunge-93. After all, it was inspired by | |
42 | FALSE and Brainf*ck, but unlike either of them, it was not a small | |
43 | machine-dependent compiler. It was a big, portable interpreter. | |
44 | (In 1998 I rewrote the interpreter in assembly language to make a | |
45 | Befunge-93 interpreter that fit into 2K. But it's just not the same.) | |
46 | The urge to write a tiny compiler has been gnawing at me for the | |
47 | past few years. | |
48 | ||
49 | As such, Maentwrog has not gone totally forgotten. Over the years I've | |
50 | made a few attempts at reworking the Maentwrog language, with little | |
51 | success, until now. The main thing holding back Maentwrog for so | |
52 | long was it's lack of strict design principles. Only now has the | |
53 | subconscious philosophy of Maentwrog evolved to a point where it means | |
54 | anything. The result is Shelta. | |
55 | ||
56 | What is Shelta's Etymology? | |
57 | --------------------------- | |
58 | ||
59 | _The Oxford Dictionary of Current English_, 1996, describes Shelta as an | |
60 | "ancient hybrid secret language used by Irish tinkers, Gypsies, etc." | |
61 | Shelta <Maentwrog Mk IV> is targeted at a similar present-day audience. | |
62 | Would you sometimes rather consider yourself a tinker (or a Gypsy) than | |
63 | a computer programmer? Then Shelta may just be for you. | |
64 | ||
65 | What is Shelta's Philosophy? | |
66 | ---------------------------- | |
67 | ||
68 | Shelta's philosophy is one of simplicity of translation. Shelta is | |
69 | easy to implement in assembly language. Shelta is nearly as low-level | |
70 | as assembly language. Very small Shelta compilers can be built. | |
71 | ||
72 | Shelta is also relatively easy to bootstrap - that is, it's not that | |
73 | difficult to implement a Shelta compiler in Shelta itself. In fact, | |
74 | that (along with writing a ridiculously small compiler) was my main | |
75 | motivation for designing Shelta and building SHELTA86.COM. For more | |
76 | information on the bootstrapped Shelta compilers and bootstrapping in | |
77 | general, see the file bootstrp.txt. | |
78 | ||
79 | In and of itself, Shelta has no actual functional semantics: only | |
80 | structural ones. It relies on a either library of functions (such as | |
81 | GUPI, described below) or inline machine language in order to be | |
82 | considered Turing-Complete. That is, not unlike the ancient Shelta | |
83 | language, Shelta <Maentwrog Mk IV> is hybridized: the actual programming | |
84 | is usually done in Shelta/GUPI. | |
85 | ||
86 | What are Shelta's Influences? | |
87 | ----------------------------- | |
88 | ||
89 | Shelta is influenced largely by the wholly remarkable programming | |
90 | language FALSE, by Wouter van Oortmerssen - "FORTH with lambda | |
91 | functions". However, it is lower-level than FALSE. It is more like | |
92 | FORTH in some ways - for example, multicharacter user-defined names | |
93 | can be used to name unlimited variables, not just the a-z in FALSE. | |
94 | Lastly, it is unlike FORTH and more like Assembler in that there is | |
95 | no FORTH-like environment nor any fixed-size blocks of text as | |
96 | input files. | |
97 | ||
98 | What is Shelta's Syntax? | |
99 | ------------------------ | |
100 | ||
101 | Tokens are delimited by whitespace - any whitespace and as much of it | |
102 | as you like, but as long as two non-whitespace characters are adjacent, | |
103 | they are considered part of the same token. Shelta's idea of | |
104 | whitespace is, in ASCII, everything from #32 (space) down to #1 (^A). | |
105 | (#0 (NUL) is considered synonymous with an end-of-file condition.) | |
106 | ||
107 | The exception to the above rule is a comment block, which begins | |
108 | (anywhere) with a ";" character and ends at the next ";" character. | |
109 | This can occur even in the middle of a token, so "HE; foo ;LLO" is | |
110 | taken to be the token "HELLO". | |
111 | ||
112 | User-defined tokens - depicted with "Name" in the following table - | |
113 | can contain any non-whitespace characters, and can begin with any | |
114 | non-whitespace characters except for "[", "]", "\", '^', "_" and "`". | |
115 | (This includes digits - "1" by itself is a name, not a number.) | |
116 | ||
117 | What are the recognized tokens of Shelta? | |
118 | ----------------------------------------- | |
119 | ||
120 | [ Begin block. | |
121 | * ] End block, push pointer. | |
122 | ]=Name End block, name pointer. | |
123 | ]:Name End block, name pointer to compile-time-only block. | |
124 | * ]Name End block, push named pointer. | |
125 | ^Name Push pointer to previously named block. | |
126 | _^Name Insert pointer to previously named block. | |
127 | Name Insert contents of previously named block. | |
128 | ||
129 | * `ABC Insert string. | |
130 | _123 Insert decimal byte. | |
131 | __1234 Insert decimal word. | |
132 | \123 Push decimal word. | |
133 | ||
134 | * = not available in SHELTA86. | |
135 | ||
136 | What are some common syntactic idioms in Shelta? | |
137 | ------------------------------------------------ | |
138 | ||
139 | [ `ABC ] Push pointer to string. | |
140 | [ _5 _5 _5 _3 ] Push pointer to byte array. | |
141 | [ __1234 __9999 ] Push pointer to word array. | |
142 | [ _5 _5 ]=my-data Name a byte array. | |
143 | [ _^my-data ]=my-refs Name an array of references to data. | |
144 | _88 Insert anonymous inline machine code. | |
145 | [ _88 ]:xyz Define xyz as inline machine code. | |
146 | xyz Insert named inline machine code. | |
147 | [ bar baz ]:foo Declare 'foo' as an inline proc | |
148 | foo Insert 'foo' as an inline proc. | |
149 | [ ]=bar Define named label. | |
150 | ||
151 | Where do you use : instead of = after ]? | |
152 | ---------------------------------------- | |
153 | ||
154 | Originally, Shelta did not distinguish between blocks used as | |
155 | updatable stores, subroutines, or templates for inlined instructions. | |
156 | As such, it would include all of them into the resulting executable, | |
157 | even the blocks only used at compile-time to define inline instructions. | |
158 | ||
159 | By using : instead of = after ], the Shelta compiler will treat the | |
160 | block as containing information which is only used at compile-time. | |
161 | This is essentially a contract between the programmer and the compiler; | |
162 | the programmer promises not to expect the ^Name or _^Name syntax to work | |
163 | on the block, and the compiler ensures the block does not show up | |
164 | extraneously in the resulting executable. | |
165 | ||
166 | What are some of the quirks of Shelta? | |
167 | -------------------------------------- | |
168 | ||
169 | Shelta's lambda syntax is not uniform. On the top level, an empty | |
170 | block such as this: | |
171 | [ ]=label | |
172 | is not necessarily defined to actually work. It is only defined to | |
173 | produce sensible results when nested within another block like this: | |
174 | [ [ ]=label foo ]=block | |
175 | Also, this is NOT the same thing as saying: | |
176 | [ [ foo ]=block1 bar ]=block2 | |
177 | This linearizes block1 out of block2, almost as if you had said | |
178 | [ foo ]=block1 [ bar ]=block2 | |
179 | Except that the identifier block1 is 'supposed' to be local to block2 | |
180 | (it ISN'T, but it might be good programming practice to treat it that | |
181 | way anyway! :-) | |
182 | ||
183 | To make things even worse, nesting more than two levels deep like so | |
184 | [ foo [ bar [ baz ] quuz ] phlef ] | |
185 | probably doesn't do what you expect. Feel free to experiment, though. | |
186 | ||
187 | Shelta does not have or use forward references. That seems to be no | |
188 | problem, with the lambda-like declarations, but it can often force you | |
189 | to write weird and awkwardly structured code. If you need to refer to | |
190 | the current block from within it, you can always name a block twice: | |
191 | ||
192 | [ foo ^this bar ]=this ; won't work! ; | |
193 | [ [ ]=-this foo ^-this bar ]=this ; works! ^this == ^-this ; | |
194 | ||
195 | What is SHELTA86.COM? | |
196 | --------------------- | |
197 | ||
198 | The Shelta compiler is implemented in 8086 assembly language and assembles | |
199 | to a tiny (LESS THAN HALF A KILOBYTE! :-) executable program. There are | |
200 | several restrictions on the program in order to trim fat: | |
201 | ||
202 | - Input file goes in standard input, .COM file comes out standard output. | |
203 | File should end in a ^@ (NUL) character to indicate EOF. | |
204 | This NUL should be preceded by whitespace (otherwise it'll form | |
205 | part of a token - you don't want that! :-) | |
206 | - If a file error occurs, error code 32 is returned. | |
207 | - If an undefined token is found, error code 16 is returned. | |
208 | - The forms ] (End block push pointer,) ]Name (End block push named | |
209 | pointer,) and `xyz (Insert String) are not supported. It is not | |
210 | difficult to work around these by explicitly naming and pushing | |
211 | blocks and using ASCII decimal sequences for strings. These | |
212 | inequities could even be addressed by a simple pre-processor. | |
213 | ||
214 | What is SHELTA.BAT? | |
215 | ------------------- | |
216 | ||
217 | SHELTA.BAT allows one to harness a Shelta compiler such as SHELTA86.COM, | |
218 | without having to directly put up with it's silly interface. | |
219 | ||
220 | Usage: | |
221 | SHELTA compiler project-file {library-files...} | |
222 | ||
223 | 'compiler' is one of: 86 (the assembly-language compiler,) S (the | |
224 | compiler written in Shelta and compiled with 86), or S2 (the | |
225 | compiler written in Shelta and compiled with S.) (See the file | |
226 | bootstrp.txt for more information on the Shelta compilers written | |
227 | in Shelta.) | |
228 | ||
229 | You don't need to append '.she' to project-file or any library-file | |
230 | you choose to include, SHELTA will do that for you and will | |
231 | automatically name the output 'project-file.COM'. | |
232 | ||
233 | SHELTA should support up to nine arguments, so you can specify seven | |
234 | library files on the command line (there's no 'include' directive in | |
235 | Shelta itself.) | |
236 | ||
237 | As an example of how to use SHELTA, here's how to build and test one | |
238 | of the example Shelta/GUPI programs, "Hello, world!": | |
239 | ||
240 | (Updated Dec 8 2002 to reflect new directory structure:) | |
241 | ||
242 | cd shelta-<<version>> | |
243 | bin\shelta s2 prj\hello | |
244 | prj\hello | |
245 | ||
246 | How can I specify what libraries for SHELTA.BAT to use by default? | |
247 | ------------------------------------------------------------------ | |
248 | ||
249 | By default SHELTA.BAT includes the following libraries: | |
250 | ||
251 | 8086\8086.she 8086 subset definition | |
252 | 8086\gupi.she General GUPI library (defined in 8086 subset) | |
253 | 8086\string.she GUPI string functions (defined in 8086 subset) | |
254 | 8086\dos.she DOS-dependent GUPI I/O (defined in 8086 subset) | |
255 | gupi\linklist.she Linked list library (defined in GUPI) | |
256 | ||
257 | You can edit SHELTA.BAT to change which libraries it uses by default. | |
258 | (It is just a .BAT file after all.) | |
259 | ||
260 | 8086.she: One could presumably replace these inline instructions with | |
261 | equivalent instructions for another relative-addressing processor, | |
262 | change a few lines of SHELTA86.ASM, and voila! You could compile | |
263 | Shelta to some other CPU. It'd be a cute trick... | |
264 | ||
265 | gupi.she: While Shelta comes with GUPI, you don't need to use GUPI | |
266 | with Shelta! You can comment out gupi.she and completely redefine the | |
267 | semtantics of your Shelta. For example, you could use a very small | |
268 | set of instructions (a tar pit) and use Shelta to compile languages | |
269 | very similar to Brainf*ck, Malbolge, etc. | |
270 | ||
271 | dos.she: You can replace the dos.she library with the bios.she library; | |
272 | it does the same thing but goes directly through the BIOS instead, and | |
273 | you can write code that will work without DOS loaded (so you could even | |
274 | build your own OS or embedded controller code with Shelta! ;-) | |
275 | ||
276 | What is GUPI? | |
277 | ------------- | |
278 | ||
279 | GUPI stands for Generic Utilitarian Programming Interface. GUPI | |
280 | is a set of Shelta definitions that acts as a standard library. | |
281 | ||
282 | (Fact is, I'm not a big fan of how GUPI turned out; it is a | |
283 | contrived and contingent beast, rather than the beautifully | |
284 | designed and conceptually airtight scheme I had hoped for. | |
285 | But I figure that if it was good enough to get me this far, it's | |
286 | worth keeping around, and the hybrid design of Shelta makes it | |
287 | easy to swap it for something else at a later time, anyway.) | |
288 | ||
289 | The GUPI semantics as presented here work on a stack of word values | |
290 | in a FORTH-like manner. Note that GUPI is not yet well documented. | |
291 | Nor is it guaranteed not to change (although it looks unlikely; | |
292 | any major change will warrant it's own library; "GUPII" perhaps? :-) | |
293 | ||
294 | What are some of the naming conventions of GUPI? | |
295 | ------------------------------------------------ | |
296 | ||
297 | Generally speaking... | |
298 | The suffix b indicates 'byte'. | |
299 | The suffix c indicates 'character'. | |
300 | The suffix if indicates 'decision on a boolean'. | |
301 | The suffix s indicates 'string with length' (block). | |
302 | The suffix w indicates 'word' (normally 16-bit). | |
303 | The suffix z indicates 'null-terminated string' (ASCIIZ). | |
304 | ||
305 | Lack of any suffix usually indicates 'any type'. | |
306 | ||
307 | What are the basic GUPI commands? | |
308 | --------------------------------- | |
309 | ||
310 | pop pop and discard top stack element | |
311 | dup duplicate top stack element | |
312 | swap pop a, pop b, push a, push b | |
313 | ||
314 | to pop pointer, machine unary jump to pointer | |
315 | do pop pointer, machine sub call pointer | |
316 | toif pop pointer, pop boolean, unary jump if nonzero | |
317 | doif pop pointer, pop boolean, sub call if nonzero | |
318 | begin pop return pointer and push onto call stack | |
319 | end push return pointer from call stack | |
320 | ||
321 | begin, end, and do/doif lead to the following GUPI idiom: | |
322 | [ begin baz end ]=bar Declare 'bar' as a subroutine. | |
323 | ^bar do Call 'bar' as a subroutine call. | |
324 | ||
325 | What are the memory-access commands? | |
326 | ------------------------------------ | |
327 | ||
328 | getb pop pointer, push byte data at pointer | |
329 | putb pop pointer, pop byte value, write at pointer | |
330 | getw pop pointer, push word data at pointer | |
331 | putw pop pointer, pop word value, write at pointer | |
332 | ||
333 | What are the arithmetic and logic commands? | |
334 | ------------------------------------------- | |
335 | ||
336 | ++ pop a, push a + 1 | |
337 | -- pop a, push a - 1 | |
338 | ** pop a, push a << 1 | |
339 | // pop a, push a >> 1 | |
340 | << pop a, pop b, push b << a | |
341 | >> pop a, pop b, push b >> a | |
342 | + pop a, pop b, push b + a | |
343 | - pop a, pop b, push b - a | |
344 | * pop a, pop b, push b * a | |
345 | / pop a, pop b, push b / a | |
346 | % pop a, pop b, push b mod a | |
347 | *1 /% pop a, pop b, push b / a, push b mod a | |
348 | ! pop a, push binary not a | |
349 | zero pop a, push 1 if a = 0, push 0 otherwise | |
350 | & pop a, pop b, push a binary and b | |
351 | | pop a, pop b, push a binary or b | |
352 | ~ pop a, pop b, push a binary xor b | |
353 | ||
354 | *1. The algorithm commonly used for binary division actually computes | |
355 | the results of both division and modulo (remainder for a > 0). If | |
356 | both results are desired by the program, using /% is usually nearly | |
357 | twice as efficient as using / and % seperately. | |
358 | ||
359 | Indirect arithmetic? | |
360 | -------------------- | |
361 | ||
362 | @++ pop pointer, increment word at pointer | |
363 | @-- pop pointer, decrement word at pointer | |
364 | ||
365 | How does GUPI interface with the operating system? | |
366 | -------------------------------------------------- | |
367 | ||
368 | outs pop length, pop pointer, send bytes to stdout | |
369 | outc pop word, send low byte to stdout | |
370 | inc wait for input on stdin, push character read | |
371 | qinc quietly wait for input on stdin, push character | |
372 | chkin immediately return input status (is a char waiting?) | |
373 | flin flush all unread input | |
374 | halt pop a, stop program and return to operating system | |
375 | with error code 'a' | |
376 | ||
377 | And dynamic memory? | |
378 | ------------------- | |
379 | ||
380 | malloc pop size, push ptr to memory of length size | |
381 | mfree pop ptr, reset heap ptr to ptr | |
382 | (Note that mfree will free ALL pointers that were | |
383 | allocated with malloc since the pointer that is being | |
384 | freed was allocated. It's good for local | |
385 | linked lists and such, but be careful!) | |
386 | ||
387 | What is "Portable Shelta/GUPI"? | |
388 | ------------------------------- | |
389 | ||
390 | The short answer is, "Portable Shelta/GUPI" is the subset of the | |
391 | union of the Shelta and GUPI languages where, through patience | |
392 | and restraint - i.e. discipline - the Shelta/GUPI programmer | |
393 | does not use any machine-dependent or self-modifying code, and | |
394 | restricts themselves to the GUPI functions that do likewise or | |
395 | are specified precisely and abstractly enough to be ported, | |
396 | that is, re-written in some other machine or VM bytecode. | |
397 | ||
398 | Where can I get updates on Shelta's condition? | |
399 | ---------------------------------------------- | |
400 | ||
401 | Shelta's official web site is located at: | |
402 | ||
403 | http://www.catseye.mb.ca/esoteric/shelta/ | |
404 | ||
405 | Happy tinkering! | |
406 | ||
407 | Chris Pressey, Dec 23 1999 | |
408 | Cat's-Eye Technologies, Winnipeg, Manitoba, Canada |
0 | ; | |
1 | 99.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies. | |
2 | The song "Ninety-Nine Bottles of Beer" implemented in Shelta/GUPI. | |
3 | ; | |
4 | ||
5 | [ _32 `bottles _32 `of _32 `beer _32 `on _32 `the _32 `wall, _13 _10 ]=L1 | |
6 | [ _32 `bottles _32 `of _32 `beer, _13 _10 ]=L2 | |
7 | [ `Take _32 `one _32 `down, _32 `pass _32 `it _32 `around, _13 _10 ]=L3 | |
8 | [ _32 `bottles _32 `of _32 `beer _32 `on _32 `the _32 `wall. _13 _10 _13 _10 ]=L4 | |
9 | ||
10 | [ `9 ]=bh [ `9 ]=bl | |
11 | [ begin ^bh getb outc ^bl getb outc end ]=btls | |
12 | [ begin ^bh getb \1 - ^bh putb \57 ^bl putb end ]=digit | |
13 | ||
14 | [ [ ]=iloop | |
15 | ||
16 | ^btls do | |
17 | ^L1 \31 outs | |
18 | ||
19 | ^btls do | |
20 | ^L2 \19 outs | |
21 | ||
22 | ^L3 \32 outs | |
23 | ||
24 | ^bl getb \1 - ^bl putb | |
25 | ^bl getb \47 - zero ^digit doif | |
26 | ||
27 | ^btls do | |
28 | ^L4 \33 outs | |
29 | ||
30 | ^bh getb \47 - ^iloop toif | |
31 | ||
32 | \0 halt | |
33 | ] to |
0 | ; | |
1 | demo.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies. | |
2 | A demonstration of some of the basic features of Shelta and GUPI. | |
3 | ; | |
4 | ||
5 | [ | |
6 | [ ]=hw `Hello, _32 `world! ; an empty block denotes a label ; | |
7 | [ ]=eol _13 _10 | |
8 | ]=hello | |
9 | ||
10 | [ _0 ]=i | |
11 | [ _0 ]=pad | |
12 | [ __0 ]=hptr | |
13 | ||
14 | [ | |
15 | begin | |
16 | \1024 malloc ^hptr putw | |
17 | [ ]=wloop | |
18 | ^i getb ^hptr getw ^i getb + putb | |
19 | ^i getb ++ ^i putb | |
20 | ^i getb ^wloop toif | |
21 | ^hptr getw \32 + \223 outs | |
22 | end | |
23 | ] do | |
24 | ||
25 | ^hello \15 outs | |
26 | ||
27 | ^hw \12 outs | |
28 | ^eol \2 outs | |
29 | ||
30 | ^hello getb outc | |
31 | ||
32 | ^hello \1 + getb outc | |
33 | ||
34 | \65 ^hello putb ^hello \15 outs | |
35 | ||
36 | \1000 \8 / outc | |
37 | \8 \8 * ++ outc | |
38 | \8 \9 * ++ outc | |
39 | \9 \9 * -- outc | |
40 | ||
41 | flin | |
42 | [ | |
43 | [ ]=loop | |
44 | inc outc ^loop to ;forever!; | |
45 | ] to |
0 | ; | |
1 | hello.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies. | |
2 | The ubiquitous greeting message, implemented in Shelta/GUPI. | |
3 | ; | |
4 | [ `Hello, _32 `world! _13 _10 ] \15 outs \0 halt |
0 | ; | |
1 | sheltas.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies. | |
2 | A bootstrappable Shelta compiler written in Shelta/GUPI. | |
3 | ; | |
4 | ||
5 | [ __0 ]=safestart | |
6 | [ __0 ]=namestart | |
7 | ||
8 | [ __0 ]=codeba | |
9 | [ __0 ]=stacba | |
10 | [ __0 ]=safeba | |
11 | [ __0 ]=macrba | |
12 | [ __0 ]=tokenba | |
13 | ||
14 | [ __0 ]=symthead | |
15 | [ __0 ]=codeh | |
16 | [ __0 ]=stach | |
17 | [ __0 ]=safeh | |
18 | [ __0 ]=macrh | |
19 | [ __0 ]=tokenh | |
20 | ||
21 | [ begin \16 halt end ]=badtok | |
22 | [ begin dupz end ]=fndupz | |
23 | ||
24 | ;--------------------------------------; | |
25 | ||
26 | [ __0 ]=newn | |
27 | [ ; addr dlen strz -> void ; | |
28 | begin | |
29 | ^fndupz do | |
30 | ||
31 | ; link up the new node ; | |
32 | ^symthead getw \6 ll-node dup ^newn putw ll-link | |
33 | ^newn getw ^symthead putw | |
34 | ||
35 | ; addr dlen strz ; | |
36 | ^newn getw ll-dptr putw | |
37 | ^newn getw ll-dptr \2 + putw | |
38 | ^newn getw ll-dptr \4 + putw | |
39 | ||
40 | end | |
41 | ]=InsertSymbol | |
42 | ||
43 | [ __0 ]=lui | |
44 | [ __0 ]=luitok | |
45 | [ \0 end ]=luno | |
46 | [ ^lui getw ll-dptr \4 + getw | |
47 | ^lui getw ll-dptr \2 + getw end ]=luyes | |
48 | [ ; strz -> dlen addr, that is, addr is pushed first; | |
49 | begin | |
50 | ^luitok putw | |
51 | ^symthead getw ^lui putw | |
52 | ||
53 | [ ]=luloop | |
54 | ^lui getw zero ^luno toif | |
55 | ^lui getw ll-dptr getw ^luitok getw eqzz ^luyes toif | |
56 | ^lui getw ll-next ^lui putw | |
57 | ^luloop to | |
58 | ||
59 | ]=LookupSymbol | |
60 | ||
61 | ;--------------------------------------; | |
62 | ||
63 | [ __0 ]=ddtoken ; contains pointer into token where to decipher ; | |
64 | [ __0 ]=ddvalue ; contains running tally of the value ; | |
65 | [ | |
66 | begin | |
67 | ^tokenba getw + ^ddtoken putw | |
68 | \0 ^ddvalue putw | |
69 | [ ]=ddLoop | |
70 | ^ddvalue getw \10 * | |
71 | ^ddtoken getw getb \48 - + | |
72 | ^ddvalue putw | |
73 | ||
74 | ^ddtoken @++ | |
75 | ^ddtoken getw getb \47 > ^ddLoop toif | |
76 | ||
77 | ^ddvalue getw | |
78 | end | |
79 | ]=DecipherDecimal | |
80 | ||
81 | ;--------------------------------------; | |
82 | ||
83 | [ | |
84 | begin | |
85 | ^codeh getw ++ putw | |
86 | \184 ^codeh getw putb | |
87 | \80 ^codeh getw \3 + putb | |
88 | ^codeh getw \4 + ^codeh putw | |
89 | end | |
90 | ]=WritePush | |
91 | ||
92 | [ | |
93 | begin | |
94 | \1 ^DecipherDecimal do ^WritePush do | |
95 | end | |
96 | ]=PushWord | |
97 | ||
98 | ;--------------------------------------; | |
99 | ||
100 | [ | |
101 | ^tokenba getw \2 + ^LookupSymbol do pop | |
102 | dup zero ^badtok toif | |
103 | ^safeba getw - \260 + | |
104 | ^codeh getw putw | |
105 | ^codeh getw \2 + ^codeh putw | |
106 | end | |
107 | ]=LiteralSymbol | |
108 | [ | |
109 | \2 ^DecipherDecimal do ^codeh getw putw | |
110 | ^codeh getw \2 + ^codeh putw | |
111 | end | |
112 | ]=LiteralWord | |
113 | [ | |
114 | begin | |
115 | ||
116 | ^tokenba getw ++ getb \95 - zero ^LiteralWord toif | |
117 | ^tokenba getw ++ getb \94 - zero ^LiteralSymbol toif | |
118 | \1 ^DecipherDecimal do | |
119 | ^codeh getw putb | |
120 | ^codeh @++ | |
121 | end | |
122 | ]=LiteralByte | |
123 | ||
124 | ;--------------------------------------; | |
125 | ||
126 | [ | |
127 | begin | |
128 | ^tokenba getw ++ ^LookupSymbol do pop | |
129 | dup zero ^badtok toif | |
130 | ^safeba getw - \260 + ^WritePush do | |
131 | end | |
132 | ]=PushPointer | |
133 | ||
134 | ;--------------------------------------; | |
135 | ||
136 | [ __0 ]=strct | |
137 | [ | |
138 | begin | |
139 | \1 ^strct putw | |
140 | [ ]=strLoop | |
141 | ||
142 | ^tokenba getw ^strct getw + getb | |
143 | ||
144 | ^codeh getw putb | |
145 | ||
146 | ^codeh @++ | |
147 | ^strct @++ | |
148 | ||
149 | ^tokenba getw ^strct getw + getb ^strLoop toif | |
150 | ||
151 | end | |
152 | ]=String | |
153 | ||
154 | ;--------------------------------------; | |
155 | ||
156 | [ | |
157 | begin | |
158 | ^codeh getw | |
159 | ^stach getw putw | |
160 | ^stach getw \2 + ^stach putw | |
161 | end | |
162 | ]=BeginBlock | |
163 | ||
164 | [ __0 ]=ebtokptr | |
165 | [ __0 ]=ebtoklen | |
166 | [ __0 ]=ebdatlen | |
167 | [ __0 ]=origcodeh | |
168 | [ | |
169 | begin | |
170 | ; adjust namestart ... possibly the weirdest Shelta-ism ; | |
171 | ^namestart getw ^origcodeh getw + ^stach getw \2 - getw - ^namestart putw | |
172 | end | |
173 | ]=AdjustName | |
174 | [ __0 ]=nei ; a shared counter the for next two subroutines ; | |
175 | [ | |
176 | ^macrh getw ^namestart putw | |
177 | ||
178 | ; copy everything from origcodeh to codeh into the macro area ; | |
179 | ||
180 | ^origcodeh getw ^nei putw | |
181 | ||
182 | [ ]=mloop | |
183 | ||
184 | ^nei getw getb ^macrh getw putb | |
185 | ^nei @++ | |
186 | ^macrh @++ | |
187 | ^nei getw ^codeh getw - ^mloop toif | |
188 | ||
189 | ; change codeh back to origcodeh ; | |
190 | ||
191 | ^origcodeh getw ^codeh putw | |
192 | ||
193 | end | |
194 | ]=MacroInstead | |
195 | [ | |
196 | begin | |
197 | ^tokenba getw ++ getb \58 - zero ^MacroInstead toif | |
198 | ||
199 | ; copy everything from origcodeh to codeh into a safe area ; | |
200 | ||
201 | ^origcodeh getw ^nei putw | |
202 | ||
203 | [ ]=neloop | |
204 | ||
205 | ^nei getw getb ^safeh getw putb | |
206 | ^nei @++ | |
207 | ^safeh @++ | |
208 | ^nei getw ^codeh getw - ^neloop toif | |
209 | ||
210 | ; change codeh back to origcodeh ; | |
211 | ||
212 | ^origcodeh getw ^codeh putw | |
213 | end | |
214 | ]=NotEmpty | |
215 | [ begin ^tokenba getw \2 + ^ebtokptr putw end ]=incebtokptr | |
216 | [ | |
217 | ; insert name into dictionary ; | |
218 | ^namestart getw ^ebdatlen getw ^ebtokptr getw ^InsertSymbol do | |
219 | end | |
220 | ]=NameIt | |
221 | [ | |
222 | begin | |
223 | ||
224 | ^tokenba getw ++ ^ebtokptr putw | |
225 | ^tokenba getw ++ getb \58 - zero ^incebtokptr doif | |
226 | ^tokenba getw ++ getb \61 - zero ^incebtokptr doif | |
227 | ||
228 | ^safeh getw dup ^safestart putw ^namestart putw ; track starts ; | |
229 | ||
230 | ^ebtokptr getw lenz ^ebtoklen putw | |
231 | ||
232 | ^stach getw \2 - ^stach putw | |
233 | ^stach getw getw ^origcodeh putw ; get original code head ; | |
234 | ||
235 | ^codeh getw ^origcodeh getw - ^ebdatlen putw | |
236 | ||
237 | ^stach getw ^stacba getw - ^AdjustName doif | |
238 | ||
239 | ^ebdatlen getw \0 > ^NotEmpty doif | |
240 | ||
241 | ; write push instruction if '=' or ':' not used ; | |
242 | ||
243 | ^tokenba getw ++ getb \58 - zero ^NameIt toif | |
244 | ^tokenba getw ++ getb \61 - zero ^NameIt toif | |
245 | ||
246 | \184 ^codeh getw putb | |
247 | \80 ^codeh getw \3 + putb | |
248 | ^safestart getw ^safeba getw \260 - - ^codeh getw ++ putw | |
249 | ||
250 | ^codeh getw \4 + ^codeh putw | |
251 | ||
252 | ^tokenba getw ++ getb ^NameIt toif | |
253 | end | |
254 | ]=EndBlock | |
255 | ||
256 | ;--------------------------------------; | |
257 | ||
258 | [ __0 ]=urctr | |
259 | [ __0 ]=urlen | |
260 | [ __0 ]=uraddr | |
261 | [ | |
262 | ^codeh getw -- -- ^codeh putw | |
263 | ; ^codeh \4 \2 fwrite ^crlf \2 \2 fwrite ; | |
264 | end | |
265 | ]=wipeit | |
266 | [ | |
267 | ^codeh getw \2 - getb \80 - zero ^wipeit toif end | |
268 | ]=peep | |
269 | [ | |
270 | ^codeh getw -- getb \88 - zero ^peep toif end | |
271 | ]=peepok | |
272 | [ | |
273 | begin | |
274 | ^codeh getw ^codeba getw - ^peepok toif end | |
275 | ]=clean | |
276 | [ | |
277 | [ ]=urloop | |
278 | ^uraddr getw getb ^codeh getw putb | |
279 | ^uraddr @++ | |
280 | ^codeh @++ | |
281 | ^urlen @-- | |
282 | ^urctr @++ | |
283 | ||
284 | ^urctr getw -- zero ^clean doif | |
285 | ||
286 | ^urlen getw ^urloop toif | |
287 | end | |
288 | ]=curloop | |
289 | [ | |
290 | begin | |
291 | \0 ^urctr putw ^tokenba getw ^LookupSymbol do ^urlen putw | |
292 | dup zero ^badtok toif | |
293 | ^uraddr putw | |
294 | ||
295 | ; copy urlen bytes from uraddr to codeh ; | |
296 | ||
297 | ; 1999.10.14 peephole optimization commented out. | |
298 | it crashes and it's not strictly necessary. someday, perhaps... ; | |
299 | ||
300 | ^urlen getw ^curloop toif | |
301 | ||
302 | end | |
303 | ]=Unroll | |
304 | ||
305 | ;--------------------------------------; | |
306 | ||
307 | [ end ]=goodc ; char was dupped and is on stack ; | |
308 | [ | |
309 | begin | |
310 | [ ]=floop | |
311 | qinc dup \59 - ^goodc toif pop ; return good char if not semicolon ; | |
312 | [ ]=cloop | |
313 | qinc \59 - zero ^floop toif | |
314 | ^cloop to | |
315 | end | |
316 | ]=scanc | |
317 | ||
318 | [ _0 ]=inbyte | |
319 | [ _0 ]=eoff | |
320 | [ \1 ^eoff putb end ]=goteof | |
321 | [ | |
322 | begin | |
323 | ^tokenba getw ^tokenh putw | |
324 | [ ]=scanloop | |
325 | ^scanc do ^inbyte putb | |
326 | ^inbyte getb zero ^goteof toif | |
327 | \33 ^inbyte getb > ^scanloop toif | |
328 | ||
329 | [ ]=scisloop | |
330 | ^inbyte getb ^tokenh getw putb ^tokenh @++ ;write char to token; | |
331 | ||
332 | ^scanc do ^inbyte putb | |
333 | ^inbyte getb zero ^goteof toif | |
334 | ^inbyte getb \32 > ^scisloop toif | |
335 | ||
336 | \0 ^tokenh getw putb | |
337 | end | |
338 | ]=scantok | |
339 | ||
340 | ; --- startup --- get dynamic memory off of heap --- ; | |
341 | ||
342 | \16384 malloc dup ^safeba putw \2 + ^safeh putw | |
343 | \4096 malloc dup ^macrba putw ^macrh putw | |
344 | \4096 malloc dup ^codeba putw ^codeh putw | |
345 | \256 malloc dup ^stacba putw ^stach putw | |
346 | \128 malloc dup ^tokenba putw ^tokenh putw | |
347 | ||
348 | [ | |
349 | ; write output file ; | |
350 | ||
351 | ; put in a jump over the safe area ; | |
352 | ||
353 | ^safeh getw ^safeba getw - ++ | |
354 | ||
355 | \233 outc | |
356 | dup \255 & outc \8 >> \255 & outc | |
357 | \144 outc | |
358 | ||
359 | ; make the first word of the safe area an offset ; | |
360 | ; to just past the last word of the code ; | |
361 | ||
362 | ^safeh getw ^safeba getw - ^codeh getw + ^codeba getw \260 - - ^safeba getw putw | |
363 | ||
364 | ^safeba getw ^safeh getw ^safeba getw - outs | |
365 | ^codeba getw ^codeh getw ^codeba getw - outs | |
366 | \0 halt | |
367 | ]=tail | |
368 | [ [ ]=main | |
369 | ^scantok do | |
370 | ^eoff getb ^tail toif | |
371 | ^tokenba getw getb \91 - \5 > ^Unroll doif | |
372 | ^tokenba getw getb \91 - zero ^BeginBlock doif | |
373 | ^tokenba getw getb \92 - zero ^PushWord doif | |
374 | ^tokenba getw getb \93 - zero ^EndBlock doif | |
375 | ^tokenba getw getb \94 - zero ^PushPointer doif | |
376 | ^tokenba getw getb \95 - zero ^LiteralByte doif | |
377 | ^tokenba getw getb \96 - zero ^String doif | |
378 | ^main to | |
379 | ]=Shelta ^Shelta to | |
380 | ||
381 | ; end of sheltas.she ; |
0 | ; | |
1 | str.she v1999.12.23 (c)2000 Chris Pressey, Cat's Eye Technologies. | |
2 | Demonstrates searching a list of strings. | |
3 | ; | |
4 | ||
5 | [ __0 ]=head | |
6 | [ __0 ]=newn | |
7 | ||
8 | [ `Moe _0 ]=s1 | |
9 | [ `Curly _0 ]=s2 | |
10 | [ `Larry _0 ]=s3 | |
11 | ||
12 | [ `Larry _0 ]=target ; change this variable to test ; | |
13 | ||
14 | [ ; strz -> strz ; | |
15 | begin dup lenz ++ malloc cpzz end | |
16 | ]=strdup | |
17 | ||
18 | [ ; stooge -> void ; | |
19 | begin | |
20 | ^strdup do | |
21 | ^head getw \2 ll-node dup ^newn putw ll-link | |
22 | ^newn getw dup ^head putw ll-dptr putw | |
23 | end | |
24 | ]=add-stooge | |
25 | ||
26 | [ [ `No ] \2 outs \1 halt ]=no | |
27 | [ [ `Yes ] \3 outs \0 halt ]=yes | |
28 | ||
29 | [ | |
30 | ||
31 | ^s1 ^add-stooge do | |
32 | ^s2 ^add-stooge do | |
33 | ^s3 ^add-stooge do | |
34 | ||
35 | ^head getw ^newn putw | |
36 | ||
37 | [ ]=cloop | |
38 | ^newn getw zero ^no toif | |
39 | ^newn getw ll-dptr getw ^target eqzz ^yes toif | |
40 | ^newn getw ll-next ^newn putw | |
41 | ^cloop to | |
42 | ||
43 | ] to |
0 | ; | |
1 | 8086\8086.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | Defines the instructions of the Intel 8086 chip and it's successors. | |
3 | ; | |
4 | ||
5 | [ _244 ]:hlt | |
6 | ||
7 | [ _146 ]:xchg-dx-ax | |
8 | [ _147 ]:xchg-bx-ax | |
9 | [ _145 ]:xchg-cx-ax | |
10 | ||
11 | [ _80 ]:push-ax | |
12 | [ _83 ]:push-bx | |
13 | [ _81 ]:push-cx | |
14 | [ _82 ]:push-dx | |
15 | ||
16 | [ _255 _55 ]:push[bx] | |
17 | ||
18 | [ _57 _195 ]:cmp-bx-ax | |
19 | ||
20 | [ _161 ]:mov-ax[] | |
21 | [ _163 ]:mov[]ax | |
22 | ||
23 | [ _142 _6 ]:mov-es[] | |
24 | ||
25 | [ _88 ]:pop-ax | |
26 | [ _91 ]:pop-bx | |
27 | [ _89 ]:pop-cx | |
28 | [ _90 ]:pop-dx | |
29 | ||
30 | [ _95 ]:pop-di | |
31 | [ _94 ]:pop-si | |
32 | ||
33 | [ _86 ]:push-si | |
34 | [ _87 ]:push-di | |
35 | ||
36 | [ _138 _4 ]:mov-al[si] | |
37 | [ _58 _5 ]:cmp-al[di] | |
38 | ||
39 | [ _139 _5 ]:mov-ax[di] | |
40 | ||
41 | [ _211 _224 ]:shl-ax-cl | |
42 | [ _211 _232 ]:shr-ax-cl | |
43 | ||
44 | [ _209 _224 ]:shl-ax-1 | |
45 | [ _209 _232 ]:shr-ax-1 | |
46 | ||
47 | [ _180 ]:mov-ah | |
48 | [ _176 ]:mov-al | |
49 | [ _177 ]:mov-cl | |
50 | [ _185 ]:mov-cx | |
51 | [ _187 ]:mov-bx | |
52 | [ _179 ]:mov-bl | |
53 | ||
54 | [ _255 _208 ]:call-ax | |
55 | [ _255 _224 ]:jmp-ax | |
56 | ||
57 | [ _50 _192 ]:xor-al-al | |
58 | [ _50 _228 ]:xor-ah-ah | |
59 | [ _48 _255 ]:xor-bh-bh | |
60 | [ _38 ]:es | |
61 | ||
62 | [ _137 _195 ]:mov-bx-ax | |
63 | [ _137 _194 ]:mov-dx-ax | |
64 | [ _137 _193 ]:mov-cx-ax | |
65 | ||
66 | [ _136 _204 ]:mov-ah-cl | |
67 | [ _136 _206 ]:mov-dh-cl | |
68 | ||
69 | [ _139 _210 ]:mov-bx-dx | |
70 | [ _136 _7 ]:mov[bx]al | |
71 | [ _137 _7 ]:mov[bx]ax | |
72 | [ _137 _15 ]:mov[bx]cx | |
73 | [ _136 _15 ]:mov[bx]cl | |
74 | [ _139 _7 ]:mov-ax[bx] | |
75 | [ _138 _15 ]:mov-cl[bx] | |
76 | ||
77 | [ _136 _5 ]:mov[di]al | |
78 | ||
79 | [ _116 ]:je | |
80 | [ _117 ]:jne | |
81 | [ _114 ]:jb | |
82 | [ _119 ]:ja | |
83 | [ _235 ]:jmp | |
84 | [ _11 _192 ]:or-ax-ax | |
85 | [ _10 _192 ]:or-al-al | |
86 | [ _9 _210 ]:or-dx-dx | |
87 | [ _247 _208 ]:not-ax | |
88 | ||
89 | [ _131 _251 ]:cmp-bx | |
90 | ||
91 | [ _70 ]:inc-si | |
92 | [ _71 ]:inc-di | |
93 | ||
94 | [ _67 ]:inc-bx | |
95 | [ _74 ]:dec-dx | |
96 | ||
97 | [ _128 _228 ]:and-ah | |
98 | [ _35 _194 ]:and-ax-dx | |
99 | [ _11 _194 ]:or-ax-dx | |
100 | [ _51 _194 ]:xor-ax-dx | |
101 | ||
102 | [ _49 _201 ]:xor-cx-cx | |
103 | [ _51 _192 ]:xor-ax-ax | |
104 | [ _49 _210 ]:xor-dx-dx | |
105 | ||
106 | [ _1 _208 ]:add-ax-dx | |
107 | [ _41 _208 ]:sub-ax-dx | |
108 | [ _41 _194 ]:sub-dx-ax | |
109 | ||
110 | [ _247 _234 ]:imul-dx | |
111 | [ _247 _249 ]:idiv-cx | |
112 | ||
113 | [ _64 ]:inc-ax | |
114 | [ _72 ]:dec-ax | |
115 | ||
116 | [ _159 ]:lahf | |
117 | [ _235 ]:jmp | |
118 | [ _144 ]:nop | |
119 | ||
120 | [ _205 ]:int |
0 | ; | |
1 | 8086\bios.she v1999.12.23 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | BIOS interface for the OS-dependent part of GUPI. | |
3 | ; | |
4 | ||
5 | ;interrupt # for keybd ; [ _22 ]:keybd | |
6 | ;interrupt # for video ; [ _16 ]:video | |
7 | ||
8 | ; void -> halt; [ pop-ax jmp _254 ]:halt | |
9 | ||
10 | ; char -> void; [ pop-ax mov-ah _14 mov-bl _15 int video ]:outc | |
11 | ||
12 | ;string sizeb -> void; [ pop-dx pop-si mov-al[si] | |
13 | mov-ah _14 mov-bl _15 int video | |
14 | inc-si dec-dx or-dx-dx jne _242 ]:outs | |
15 | ||
16 | ; void -> char; [ xor-ah-ah int keybd xor-ah-ah push-ax ]:qinc | |
17 | ; void -> char; [ qinc dup outc ]:inc | |
18 | ; void -> bool; [ mov-ah _1 int keybd je _4 inc-ax jmp _3 nop xor-ax-ax push-ax ]:chkin | |
19 | ; void -> void; [ mov-ah _1 int keybd je _6 xor-ah-ah int keybd jmp _244 ]:flin |
0 | ; | |
1 | 8086\dos.she v1999.12.23 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | DOS interface for the OS-dependent part of GUPI. | |
3 | ; | |
4 | ||
5 | ;interrupt # for DOS ; [ _33 ]:dos | |
6 | ||
7 | ; void -> halt; [ pop-ax mov-ah _76 int dos ]:halt | |
8 | ;string sizeb -> void; [ mov-ah _64 _187 _1 _0 pop-cx pop-dx int dos ]:outs | |
9 | ; char -> void; [ mov-ah _2 pop-dx int dos ]:outc | |
10 | ; void -> char; [ mov-ah _1 int dos push-ax ]:inc | |
11 | ; void -> char; [ mov-ah _7 int dos xor-ah-ah push-ax ]:qinc | |
12 | ; void -> bool; [ mov-ah _11 int dos xor-ah-ah push-ax ]:chkin | |
13 | ; void -> void; [ xor-ax-ax mov-ah _12 int dos ]:flin | |
14 |
0 | ; | |
1 | 8086\fileio.she v1999.12.23 | |
2 | (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
3 | DOS file functions. | |
4 | ; | |
5 | ||
6 | ; zfnm -> fhdl ; | |
7 | [ pop-dx mov-ah _61 xor-al-al int dos push-ax ]:fopenz | |
8 | ||
9 | ; string fhdl -> fstat ; | |
10 | [ mov-ah _63 pop-bx mov-cx __1 pop-dx int dos push-ax ]:freadc | |
11 | ||
12 | ; zfnm -> fhdl ; | |
13 | [ pop-dx mov-ah _60 xor-cx-cx int dos ]:fcreatez | |
14 | ||
15 | ;str szb fhdl -> void ; | |
16 | [ mov-ah _64 pop-bx pop-cx pop-dx int dos ]:fwrite | |
17 | ||
18 | ; fhdl -> void ; | |
19 | [ pop-bx mov-ah _62 int dos ]:fclose | |
20 | ||
21 | ; quit? [ begin ^tokenba getw dup lenz \2 fwrite halt end ]=fnhalt ; |
0 | ; | |
1 | 8086\gupi.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | 8086-compatible semantics for GUPI. | |
3 | ; | |
4 | ||
5 | ; input stack -> output stack ; | |
6 | ; bottom..top -> top..bottom ; | |
7 | ||
8 | ; word -> void ; [ pop-ax ]:pop | |
9 | ; word -> word word; [ pop-ax push-ax push-ax ]:dup | |
10 | ; wrd1 wrd2 -> wrd1 wrd2; [ pop-ax pop-bx push-ax push-bx ]:swap | |
11 | ||
12 | ; addr -> byte ; [ pop-ax xchg-bx-ax mov-ax[bx] xor-ah-ah push-ax ]:getb | |
13 | ; byte addr -> void ; [ pop-ax xchg-bx-ax pop-cx mov[bx]cl ]:putb | |
14 | ; addr -> word ; [ pop-ax xchg-bx-ax push[bx] ]:getw | |
15 | ; word addr -> void ; [ pop-ax xchg-bx-ax pop-cx mov[bx]cx ]:putw | |
16 | ||
17 | ; word -> word ; [ pop-ax inc-ax push-ax ]:++ | |
18 | ; word -> word ; [ pop-ax dec-ax push-ax ]:-- | |
19 | ; word -> word ; [ pop-ax shl-ax-1 push-ax ]:** | |
20 | ; word -> word ; [ pop-ax shr-ax-1 push-ax ]:// | |
21 | ||
22 | ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax shl-ax-cl push-ax ]:<< | |
23 | ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax shr-ax-cl push-ax ]:>> | |
24 | ||
25 | ; addr -> void ; [ pop-bx mov-ax[bx] inc-ax mov[bx]ax ]:@++ | |
26 | ; addr -> void ; [ pop-bx mov-ax[bx] dec-ax mov[bx]ax ]:@-- | |
27 | ||
28 | ; word word -> word ; [ pop-ax pop-dx add-ax-dx push-ax ]:+ | |
29 | ; word word -> word ; [ pop-ax pop-dx sub-dx-ax xchg-dx-ax push-ax ]:- | |
30 | ; word word -> word ; [ pop-ax pop-dx imul-dx push-ax ]:* | |
31 | ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-ax ]:/ | |
32 | ; word word -> word ; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-dx ]:% | |
33 | ; word word -> word word; [ pop-ax xchg-cx-ax pop-ax xor-dx-dx idiv-cx push-dx push-ax ]:/% | |
34 | ; word word -> word ; [ pop-ax pop-dx or-ax-dx push-ax ]:| | |
35 | ; word word -> word ; [ pop-ax pop-dx and-ax-dx push-ax ]:& | |
36 | ; word word -> word ; [ pop-ax pop-dx xor-ax-dx push-ax ]:~ | |
37 | ; word -> word ; [ pop-ax not-ax push-ax ]:! | |
38 | ; word -> word ; [ pop-ax or-ax-ax je _4 xor-ax-ax jmp _1 inc-ax push-ax ]:zero | |
39 | ; word word -> word ; [ pop-ax pop-bx cmp-bx-ax ja _4 xor-ax-ax jmp _1 inc-ax push-ax ]:> | |
40 | ||
41 | ; addr -> (call) ; [ pop-ax call-ax ]:do | |
42 | ; addr -> (branch) ; [ pop-ax jmp-ax ]:to | |
43 | ; bool addr -> (call) ; [ pop-ax pop-dx or-dx-dx je _2 jmp-ax ]:toif | |
44 | ; bool addr -> (branch) ; [ pop-ax pop-dx or-dx-dx je _2 call-ax ]:doif | |
45 | ||
46 | ; memory for call stack: ; [ __0 __0 __0 __0 __0 __0 __0 __0 | |
47 | __0 __0 __0 __0 __0 __0 __0 __0 | |
48 | __0 __0 __0 __0 __0 __0 __0 __0 | |
49 | __0 __0 __0 __0 __0 __0 __0 __0 ]=clstk | |
50 | ; memory for stack pointer; [ __0 ]=clsp | |
51 | ||
52 | ; (call) -> void ; [ pop-ax _139 _30 _^clsp _137 _135 _^clstk _131 _6 _^clsp _2 ]:begin | |
53 | ; void -> (return) ; [ _131 _46 _^clsp _2 _139 _30 _^clsp _139 _135 _^clstk push-ax _195 ]:end | |
54 | ||
55 | ; sizw -> ptrw ; | |
56 | [ mov-bx __260 mov-ax[bx] | |
57 | pop-dx | |
58 | push-ax | |
59 | add-ax-dx | |
60 | mov-bx __260 mov[bx]ax ]:malloc | |
61 | ||
62 | ; ptrw -> void ; | |
63 | [ pop-ax mov-bx __260 mov[bx]ax ]:mfree | |
64 |
0 | ; | |
1 | 8086\string.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | GUPI string-manipulation extensions. | |
3 | ; | |
4 | ||
5 | ; strz strz -> bool ; | |
6 | [ | |
7 | pop-di pop-si | |
8 | mov-al[si] cmp-al[di] | |
9 | je _5 | |
10 | xor-ax-ax | |
11 | jmp _13 | |
12 | nop | |
13 | or-al-al | |
14 | je _4 | |
15 | inc-si inc-di | |
16 | jmp _237 | |
17 | xor-ah-ah mov-al _1 | |
18 | push-ax | |
19 | ]:eqzz | |
20 | ||
21 | ; strz strz -> strz(2) ; | |
22 | [ | |
23 | pop-di pop-si push-di | |
24 | mov-al[si] mov[di]al | |
25 | or-al-al | |
26 | je _4 | |
27 | inc-si inc-di | |
28 | jmp _244 | |
29 | ]:cpzz | |
30 | ||
31 | ; strz -> word ; | |
32 | [ | |
33 | pop-si | |
34 | push-si | |
35 | mov-al[si] | |
36 | or-al-al | |
37 | je _3 | |
38 | inc-si | |
39 | jmp _247 | |
40 | pop-dx | |
41 | push-si | |
42 | pop-ax | |
43 | sub-ax-dx | |
44 | push-ax | |
45 | ]:lenz | |
46 | ||
47 | [ ; strz -> strz ; | |
48 | dup lenz ++ malloc cpzz | |
49 | ]:dupz | |
50 | ||
51 |
0 | ; | |
1 | gupi\linklist.she v1999.10.10 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
2 | GUPI linked list extensions. | |
3 | ; | |
4 | ||
5 | ; size -> node ; [ \2 + malloc ]:ll-node | |
6 | ; next node -> void ; [ putw ]:ll-link | |
7 | ; node -> next ; [ getw ]:ll-next | |
8 | ; node -> data-ptr ; [ \2 + ]:ll-dptr |
0 | IDEAL | |
1 | ||
2 | ; shelta86.asm v1999.10.20 (c)1999 Chris Pressey, Cat's-Eye Technologies. | |
3 | ; Implements an assembler/compiler for the Shelta language, in 8086 assembly. | |
4 | ||
5 | ; * Special thanks to Ben Olmstead (BEM) for his suggestions for how to | |
6 | ; reduce SHELTA86.COM's size even further. | |
7 | ||
8 | MODEL tiny | |
9 | P8086 | |
10 | ||
11 | DATASEG | |
12 | ||
13 | symth dw symt | |
14 | codeh dw code | |
15 | stach dw stac | |
16 | safeh dw safe + 2 | |
17 | macrh dw macr | |
18 | ||
19 | ttable dw BeginBlock, PushWord, EndBlock, PushPointer, LiteralByte ; , String | |
20 | ; [ \ ] ^ _ ` | |
21 | ||
22 | UDATASEG | |
23 | ||
24 | token db 128 dup (?) | |
25 | ||
26 | safestart dw ? | |
27 | namestart dw ? | |
28 | toklength dw ? | |
29 | ||
30 | safe db 16384 dup (?) | |
31 | symt db 16384 dup (?) ; 16K + 16K = 32K | |
32 | code db 4096 dup (?) ; | |
33 | macr db 4096 dup (?) ; + 8K = 40K | |
34 | stac db 256 dup (?) | |
35 | ||
36 | CODESEG | |
37 | ORG 0100h | |
38 | ||
39 | ; EQUATES | |
40 | ||
41 | safeadj EQU (offset safe - 0104h) | |
42 | codeadj EQU (offset code - 0104h) | |
43 | ||
44 | ; Main program. | |
45 | PROC Main | |
46 | ||
47 | WhileFile: | |
48 | ||
49 | ; ----- begin scanning token | |
50 | ||
51 | call ScanChar ; get char -> al | |
52 | or al, al | |
53 | jz @@EndFile | |
54 | cmp al, 32 | |
55 | jbe WhileFile ; repeat if char is whitespace | |
56 | ||
57 | mov di, offset token | |
58 | cld | |
59 | ||
60 | @@TokenLoop: stosb ; put char in token | |
61 | call ScanChar ; get char | |
62 | cmp al, 32 | |
63 | ja @@TokenLoop ; repeat if char is not whitespace | |
64 | ||
65 | @@Terminate: mov [byte di], 0 ; return null-terminated token | |
66 | ||
67 | ; ----- end scanning token | |
68 | ||
69 | mov si, offset token + 1 | |
70 | ||
71 | mov al, [byte token] | |
72 | sub al, '[' | |
73 | cmp al, 4 | |
74 | ja @@Unroll | |
75 | ||
76 | xor ah, ah | |
77 | shl ax, 1 | |
78 | xchg bx, ax | |
79 | mov ax, [offset ttable + bx] | |
80 | jmp ax ; jump to handler as listed in ttable | |
81 | ||
82 | @@Unroll: dec si ; start at first character of token | |
83 | call LookupSymbol ; destroys DI & SI, but that's OK | |
84 | ||
85 | ; copy cx bytes from ax to codeh | |
86 | ||
87 | xchg ax, si | |
88 | mov di, [codeh] ; use di to track codeh | |
89 | rep movsb | |
90 | ||
91 | UpCodeH: mov [codeh], di | |
92 | jmp short WhileFile | |
93 | ||
94 | @@EndFile: ; put in a jump over the safe area | |
95 | ||
96 | mov ax, [safeh] | |
97 | sub ax, offset safe - 1 | |
98 | mov bx, offset token ; re-use token | |
99 | mov [byte bx], 0e9h | |
100 | mov [word bx + 1], ax | |
101 | mov [byte bx + 3], 90h | |
102 | ||
103 | mov cx, 4 | |
104 | mov dx, offset token | |
105 | call WriteIt | |
106 | ||
107 | ; make the first word of the safe area an offset | |
108 | ; to just past the last word of the code | |
109 | ||
110 | mov cx, [safeh] | |
111 | mov dx, offset safe | |
112 | sub cx, dx | |
113 | mov ax, cx | |
114 | add ax, [codeh] | |
115 | sub ax, codeadj | |
116 | mov [word safe], ax | |
117 | ||
118 | call WriteIt | |
119 | ||
120 | mov cx, [codeh] | |
121 | mov dx, offset code | |
122 | sub cx, dx | |
123 | call WriteIt | |
124 | ||
125 | xor al, al | |
126 | ||
127 | GlobalExit: mov ah, 4ch ; exit to DOS | |
128 | int 21h | |
129 | ENDP Main | |
130 | ||
131 | PROC WriteIt | |
132 | ||
133 | mov ah, 40h | |
134 | mov bx, 1 | |
135 | int 21h | |
136 | jnc @@OK | |
137 | mov al, 32 | |
138 | jmp short GlobalExit | |
139 | @@OK: ret | |
140 | ENDP WriteIt | |
141 | ||
142 | ; -------------------------------- HANDLERS --------------------------- ; | |
143 | ; When coming into any handler, di will equal the address of the null | |
144 | ; (that is, the number of characters in the token + offset token) | |
145 | ||
146 | ; ==== [ ==== BEGIN BLOCK ==== ; | |
147 | ||
148 | BeginBlock: mov di, [stach] ; push [ onto stack | |
149 | mov ax, [codeh] | |
150 | stosw ; mov [bx], ax | |
151 | mov [stach], di | |
152 | jmp WhileFile | |
153 | ||
154 | ; ==== ] ==== END BLOCK ==== ; | |
155 | ||
156 | EndBlock: ;mov si, offset token + 1 ; si = token + 1 until... | |
157 | ;cmp [byte ds:si], '=' | |
158 | ;je @@Smaller | |
159 | ;cmp [byte ds:si], ':' | |
160 | ;je @@Smaller | |
161 | ;jmp short @@CarryOn | |
162 | ; remove : or = from length | |
163 | @@Smaller: dec di ; di left over from scanning token | |
164 | ||
165 | @@CarryOn: mov bx, di ; di now free to hold something until @@WName | |
166 | sub bx, si ; get length | |
167 | ||
168 | mov ax, [safeh] | |
169 | mov [safestart], ax | |
170 | mov [namestart], ax | |
171 | xchg ax, di ; di now holds safe area head location | |
172 | ||
173 | mov [toklength], bx ; length of token | |
174 | sub [stach], 2 | |
175 | mov bx, [stach] ; pop [ from stack | |
176 | ||
177 | mov ax, [bx] ; ax = codeh when [ happened | |
178 | ||
179 | mov bp, [codeh] ; find length | |
180 | sub bp, ax | |
181 | ; mov bp, bx ; bp = length of data between [ ... ] | |
182 | ; until @@WName below... ugh | |
183 | ||
184 | cmp [stach], offset stac | |
185 | je @@StackEmpty | |
186 | ||
187 | ||
188 | mov bx, [stach] | |
189 | sub bx, 2 | |
190 | mov cx, [bx] | |
191 | ||
192 | ; namestart = [namestart] - (cx - ax) | |
193 | ||
194 | sub cx, ax | |
195 | sub [namestart], cx | |
196 | ||
197 | ; if dlength > 0, | |
198 | ||
199 | @@StackEmpty: ;or bp, bp | |
200 | ;jz @@Empty | |
201 | ||
202 | cmp [byte si], ':' ; si still = offset token + 1 | |
203 | jne @@PreCopyLoop | |
204 | ||
205 | mov di, [macrh] ; use macro area instead of safe if : | |
206 | mov [namestart], di | |
207 | ||
208 | ; copy everything from ax to codeh into the di area | |
209 | ||
210 | @@PreCopyLoop: mov dx, ax | |
211 | mov cx, bp ; [codeh] sub cx, ax | |
212 | push si | |
213 | xchg si, ax | |
214 | rep movsb | |
215 | pop si | |
216 | ||
217 | ; change codeh back to dx (old codeh before [) | |
218 | ||
219 | mov [codeh], dx | |
220 | ||
221 | ;mov si, offset token + 1 | |
222 | cmp [byte si], ':' ; si still = offset token + 1 | |
223 | je @@UpdateMacr | |
224 | ||
225 | mov [safeh], di | |
226 | jmp short @@Empty | |
227 | @@UpdateMacr: mov [macrh], di | |
228 | ;jmp short @@NameIt | |
229 | ||
230 | ; write push instruction if '=' or ':' not used | |
231 | ||
232 | @@Empty: ;cmp [byte si], '=' ; si still = offset token + 1 | |
233 | ;je @@NameIt | |
234 | ||
235 | ;mov ax, [safestart] | |
236 | ;sub ax, safeadj | |
237 | ;mov bx, [word codeh] | |
238 | ;mov [byte bx], 0b8h | |
239 | ;mov [word bx + 1], ax | |
240 | ;mov [byte bx + 3], 50h | |
241 | ;add [codeh], 4 | |
242 | ||
243 | ;cmp [byte si], 0 ; still offset token + 1! | |
244 | ;je @@Anonymous | |
245 | ||
246 | ; insert namestart into dictionary | |
247 | ||
248 | @@NameIt: mov cx, [namestart] | |
249 | mov ax, [toklength] | |
250 | ||
251 | ;cmp [byte si], '=' | |
252 | ;je @@Bigger | |
253 | ;cmp [byte si], ':' | |
254 | ;je @@Bigger | |
255 | ;jmp short @@WName | |
256 | ||
257 | @@Bigger: inc si | |
258 | ||
259 | @@WName: ; Destroys DI but that's OK. | |
260 | ; INPUT: bx = ADDRESS of token to insert, ax = length of symbol, | |
261 | ; cx = pointer to data, dx = length of data | |
262 | ; OUTPUT: ds:bx = pointer to newly allocated symbol | |
263 | ||
264 | mov di, [symth] ; di no longer contains macrh/safeh | |
265 | add ax, 6 ; 1 word for length, 1 for ptr, 1 for data length | |
266 | add [symth], ax | |
267 | ||
268 | stosw ; mov [word di], ax ; place ax length in symt | |
269 | ||
270 | sub ax, 6 | |
271 | xchg cx, ax ; cx <- ax; ax <- cx | |
272 | stosw ; mov [word di], cx ; place cx (ptr to data) | |
273 | xchg ax, bp | |
274 | stosw ; mov [word di], bp ; place bp (ptr length) | |
275 | ||
276 | rep movsb | |
277 | ||
278 | mov [symth], di | |
279 | ||
280 | @@Anonymous: jmp WhileFile | |
281 | ||
282 | ; ==== ^ ==== PUSH POINTER ==== ; | |
283 | ||
284 | PushPointer: ;mov si, offset token + 1 | |
285 | call LookupSymbol ; destroys di & si, should be OK | |
286 | ||
287 | sub ax, safeadj | |
288 | mov di, [word codeh] | |
289 | jmp short WritePush | |
290 | ||
291 | ; ==== ` ==== STRING ==== ; | |
292 | ; | |
293 | ;String: ;mov si, offset token + 1 | |
294 | ; mov di, [codeh] | |
295 | ;@@Loop: mov al, [byte ds:si] | |
296 | ; stosb | |
297 | ; inc si | |
298 | ; cmp [byte ds:si], 0 | |
299 | ; jne @@Loop | |
300 | ; jmp UpCodeH | |
301 | ||
302 | ; ==== _ ==== LITERAL BYTE ==== ; | |
303 | ||
304 | LiteralByte: ;mov si, offset token + 1 | |
305 | cmp [byte si], '_' | |
306 | je LiteralWord | |
307 | cmp [byte si], '^' | |
308 | je LiteralSymbol | |
309 | call DecipherDecimal ; destroys DI, that's OK | |
310 | stosb ; mov [byte bx], al | |
311 | CheapTrick: mov [codeh], di | |
312 | jmp WhileFile | |
313 | ||
314 | ; ==== __ ==== LITERAL WORD ==== ; | |
315 | ||
316 | LiteralWord: inc si | |
317 | call DecipherDecimal ; destroys DI, that's OK | |
318 | FunkyTrick: stosw ; mov [word bx], ax | |
319 | jmp short CheapTrick | |
320 | ||
321 | ; ==== _^ ==== LITERAL SYMBOL ==== ; | |
322 | ||
323 | LiteralSymbol: inc si | |
324 | call LookupSymbol ; destroys DI & SI, that's OK | |
325 | ||
326 | sub ax, safeadj | |
327 | ||
328 | mov di, [word codeh] | |
329 | jmp short FunkyTrick | |
330 | ;mov [word bx], ax | |
331 | ;inc [codeh] | |
332 | ;jmp short CheapTrick | |
333 | ||
334 | ; ==== \ ==== PUSH WORD ==== ; | |
335 | ||
336 | PushWord: ;mov si, offset token + 1 | |
337 | call DecipherDecimal ; destroys di, that's OK | |
338 | ||
339 | WritePush: mov [byte di], 0b8h ; B8h, low byte, high byte, 50h | |
340 | inc di | |
341 | stosw ; mov [word di + 1], ax | |
342 | mov al, 50h | |
343 | stosb | |
344 | mov [codeh], di | |
345 | jmp WhileFile | |
346 | ||
347 | ; -------------------------------- SUBROUTINES --------------------------- ; | |
348 | ||
349 | PROC DecipherDecimal ; uses and destroys DI | |
350 | ; INPUT: si = address of token | |
351 | ; OUTPUT: ax = value, di = codeh | |
352 | ||
353 | ||
354 | xor di, di | |
355 | ||
356 | @@Loop: lodsb ; mov al, [byte ds:si], inc si | |
357 | ||
358 | mov bx, di | |
359 | mov cl, 3 | |
360 | shl bx, cl | |
361 | mov cx, di | |
362 | shl cx, 1 | |
363 | add bx, cx | |
364 | ||
365 | sub al, '0' | |
366 | cbw | |
367 | add bx, ax | |
368 | mov di, bx | |
369 | ||
370 | cmp [byte ds:si], '0' | |
371 | jae @@Loop | |
372 | ||
373 | xchg ax, di | |
374 | mov di, [word codeh] | |
375 | ret | |
376 | ENDP DecipherDecimal | |
377 | ||
378 | PROC ScanChar | |
379 | ; Scans a single character from the input file, placing | |
380 | ; it in register al, which will be 0 upon error | |
381 | ; or eof (so don't embed nulls in the Shelta source...) | |
382 | ||
383 | mov ah, 7 ; read from stdin one byte | |
384 | int 21h | |
385 | cmp al, ';' ; check for comment | |
386 | je @@Comment | |
387 | ret | |
388 | @@Comment: mov ah, 7 ; read from stdin one byte | |
389 | int 21h | |
390 | cmp al, ';' ; check for comment | |
391 | jne @@Comment | |
392 | jmp short ScanChar | |
393 | ||
394 | ENDP ScanChar | |
395 | ||
396 | PROC LookupSymbol | |
397 | ; INPUT: si = address of symbol to find, di = address of null termination | |
398 | ; OUTPUT: ds:ax = pointer to contents or zero if not found | |
399 | ; cx = length of contents | |
400 | ||
401 | mov bx, offset symt ; bx starts at symbol table | |
402 | mov bp, si | |
403 | sub di, si | |
404 | ||
405 | @@Loop: mov ax, [word bx] ; first word = token size | |
406 | ||
407 | mov dx, bx ; keep track of start of this symt entry | |
408 | ||
409 | sub ax, 6 | |
410 | cmp ax, di | |
411 | jne @@Exit ; if it doesn't fit, you must acquit | |
412 | ||
413 | add bx, 6 ; bx now points to token in symbol table | |
414 | ||
415 | ; exit if right token | |
416 | ||
417 | xor si, si ; reset si to token | |
418 | @@Inner: mov al, [byte ds:bx] ; get byte from bx=symt | |
419 | cmp [byte bp + si], al ; compare to si=token | |
420 | jne @@Exit | |
421 | inc bx | |
422 | inc si | |
423 | cmp si, di ; hit the length yet? | |
424 | jb @@Inner ; no, repeat | |
425 | ||
426 | ; a match! | |
427 | ||
428 | mov bx, dx | |
429 | mov cx, [word bx + 4] ; third word = data length | |
430 | mov ax, [word bx + 2] ; second word = data ptr | |
431 | ret | |
432 | ||
433 | @@Exit: mov bx, dx | |
434 | mov ax, [word bx] | |
435 | add bx, ax | |
436 | cmp bx, [symth] | |
437 | jb @@Loop | |
438 | ||
439 | mov al, 16 ; return 16 if unknown identifier | |
440 | jmp GlobalExit | |
441 | ||
442 | ENDP LookupSymbol | |
443 | ||
444 | END Main |