git @ Cat's Eye Technologies SixtyPical / c66b339
Spiffy up the README, move meaty stuff into docs. Cat's Eye Technologies 7 years ago
2 changed file(s) with 135 addition(s) and 100 deletion(s).
4949 ### Abstract Interpretation ###
5050
5151 SixtyPical tries to prevent the program from using data that has no meaning.
52 For example, the following:
52
53 The instructions of a routine are analyzed using abstract interpretation.
54 One thing we specifically do is determine which registers and memory locations
55 are *not* affected by the routine. For example, the following:
5356
5457 routine do_it {
5558 lda #0
6265 * the A register is declared to be a meaningful output of `update_score`
6366 * `update_score` was determined to not change the value of the A register
6467
65 The first must be done with an explicit declaration on `update_score` (NYI).
66 The second will be done using abstract interpretation of the code of
67 `update_score` (needs to be implemented again, now, and better).
68 The first case must be done with an explicit declaration on `update_score`.
69 The second case will be inferred using abstract interpretation of the code
70 of `update_score`.
6871
6972 ### Structured Programming ###
7073
71 You get an `if` and a `repeat` and instructions like `sei` work like `with`
72 where they are followed by a block and the `cli` instruction is implicitly
73 (and unavoidably) added at the end.
74
75 For more information, see the docs (which are written in the form of a
76 Falderal literate test suite.)
77
78 Concepts
79 --------
80
81 ### Routines ###
74 SixtyPical eschews labels for code and instead organizes code into _blocks_.
8275
8376 Instead of the assembly-language subroutine, SixtyPical provides the _routine_
84 as the abstraction for a reusable sequence of code.
77 as the abstraction for a reusable sequence of code. A routine may be called,
78 or may be included inline, by another routine. The body of a routine is a
79 block.
8580
86 A routine may be called, or may be included inline, by another routine.
81 Along with routines, you get `if`, `repeat`, and `with` constructs which take
82 blocks. The `with` construct takes an instruction like `sei` and implicitly
83 (and unavoidably) inserts the corresponding `cli` at the end of the block.
8784
88 There is one top-level routine called `main` which represents the entire
89 program.
85 For More Information
86 --------------------
9087
91 The instructions of a routine are analyzed using abstract interpretation.
92 One thing we specifically do is determine which registers and memory locations
93 are *not* affected by the routine.
88 For more information, see the docs (which are written in the form of
89 Falderal literate test suites. If you have Falderal installed, you can run
90 the tests with `./test.sh`.)
9491
95 If a register is not affected by a routine, then a caller of that routine may
96 assume that the value in that register is retained.
92 Ideas
93 -----
9794
98 Of course, a routine may intentionally affect a register or memory location,
99 as an output. It must declare this. We're not there yet.
95 These aren't implemented yet:
96
97 * Abstract interpretation must extend to `if`, `repeat`, and `with`
98 blocks. The two incoming contexts must be merged, and any storage
99 locations updated differently or poisoned in either context, will be
100 considered poisoned in the result context.
100101
101 ### Addresses ###
102
103 The body of a routine may not refer to an address literally. It must use
104 a symbol that was declared previously.
105
106 An address may be declared with `reserve`, which is like `.data` or `.bss`
107 in an assembler. This is an address into the program's data. It is global
108 to all routines.
109
110 An address may be declared with `locate`, which is like `.alias` in an
111 assembler, with the understanding that the value will be treated "like an
112 address." This is generally an address into the operating system or hardware
113 (e.g. kernal routine, I/O port, etc.)
114
115 Not there yet:
116
117 > Inside a routine, an address may be declared with `temporary`. This is like
118 > `static` in C, except the value at that address is not guaranteed to be
119 > retained between invocations of the routine. Such addresses may only be used
120 > within the routine where they are declared. If analysis indicates that two
121 > temporary addresses are never used simultaneously, they may be merged
122 > to the same address.
123
124 An address knows what kind of data is stored at the address:
125
126 * `byte`: an 8-bit byte. not part of a word. not to be used as an address.
127 (could be an index though.)
128 * `word`: a 16-bit word. not to be used as an address.
129 * `vector`: a 16-bit address of a routine. Only a handful of operations
130 are supported on vectors:
131
132 * copying the contents of one vector to another
133 * copying the address of a routine into a vector
134 * jumping indirectly to a vector, i.e. to the code at the address
135 contained in the vector (this can only happen at the end of a
136 routine; NYI)
137 * `jsr`'ing indirectly to a vector (which is done with a fun
138 generated trick (NYI))
139
140 * `byte table`: a series of `byte`s contiguous in memory starting from the
141 address. This is the only kind of address that can be used in
142 indexed addressing.
143
144 ### Blocks ###
145
146 Each routine is a block. It may be composed of inner blocks, if those
147 inner blocks are attached to certain instructions.
148
149 SixtyPical does not have instructions that map literally to the 6502 branch
150 instructions. Instead, it has an `if` construct, with two blocks (for the
151 "then" and `else` parts), and the branch instructions map to conditions for
152 this construct.
153
154 Similarly, there is a `repeat` construct. The same branch instructions can
155 be used in the condition to this construct. In this case, they branch back
156 to the top of the `repeat` loop.
157
158 The abstract states of the machine at each of the different block exits are
159 merged during analysis. If any register or memory location is treated
160 inconsistently (e.g. updated in one branch of the test, but not the other,)
161 that register cannot subsequently be used without a declaration to the effect
162 that we know what's going on. (This is all a bit fuzzy right now.)
163
164 There is also no `rts` instruction. It is included at the end of a routine,
165 but only when the routine is used as a subroutine. Also, if the routine
166 ends by `jsr`ing another routine, it reserves the right to do a tail-call
167 or even a fallthrough.
168
169 There are also _with_ instructions, which are associated with three opcodes
170 that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
171 instructions take a block. The natural symmetrical opcode is inserted at
172 the end of the block.
102 * Inside a routine, an address may be declared with `temporary`. This is like
103 `static` in C, except the value at that address is not guaranteed to be
104 retained between invocations of the routine. Such addresses may only be used
105 within the routine where they are declared. If analysis indicates that two
106 temporary addresses are never used simultaneously, they may be merged
107 to the same address.
173108
174109 TODO
175110 ----
177112 * Initial values for reserved, incl. tables
178113 * give length for tables, must be there for reserved, if no init val
179114 * Character tables ("strings" to everybody else)
180 * Work out the analyses again and document them
181115 * Addressing modes — indexed mode on more instructions
182116 * `jsr (vector)`
183117 * `jmp routine`
184118 * insist on EOL after each instruction. need spacesWOEOL production
185119 * asl .a
120 * `outputs` on externals
99
1010 -> Functionality "Check SixtyPical program" is implemented by
1111 -> shell command "bin/sixtypical check %(test-file)"
12
13 Some Basic Syntax
14 -----------------
1215
1316 `main` must be present.
1417
4447 | }
4548 = True
4649
47 A program may `reserve` and `assign`.
50 Addresses
51 ---------
52
53 An address may be declared with `reserve`, which is like `.data` or `.bss`
54 in an assembler. This is an address into the program's data. It is global
55 to all routines.
56
57 | reserve byte lives
58 | routine main {
59 | lda #3
60 | sta lives
61 | }
62 | routine died {
63 | dec lives
64 | }
65 = True
66
67 An address may be declared with `assign`, which is like `.alias` in an
68 assembler, with the understanding that the value will be treated "like an
69 address." This is generally an address into the operating system or hardware
70 (e.g. kernal routine, I/O port, etc.)
71
72 | assign byte screen $0400
73 | routine main {
74 | lda #0
75 | sta screen
76 | }
77 = True
78
79 The body of a routine may not refer to an address literally. It must use
80 a symbol that was declared previously with `reserve` or `assign`.
81
82 | routine main {
83 | lda #0
84 | sta $0400
85 | }
86 ? unexpected "$"
87
88 | assign byte screen $0400
89 | routine main {
90 | lda #0
91 | sta screen
92 | }
93 = True
94
95 Test for many combinations of `reserve` and `assign`.
4896
4997 | reserve byte lives
5098 | assign byte gdcol 647
213261 | lda screen
214262 | }
215263 ? incompatible types 'Vector' and 'Byte'
264
265 ### Addresses ###
266
267 An address knows what kind of data is stored at the address:
268
269 * `byte`: an 8-bit byte. not part of a word. not to be used as an address.
270 (could be an index though.)
271 * `word`: a 16-bit word. not to be used as an address.
272 * `vector`: a 16-bit address of a routine. Only a handful of operations
273 are supported on vectors:
274
275 * copying the contents of one vector to another
276 * copying the address of a routine into a vector
277 * jumping indirectly to a vector, i.e. to the code at the address
278 contained in the vector (this can only happen at the end of a
279 routine; NYI)
280 * `jsr`'ing indirectly to a vector (which is done with a fun
281 generated trick (NYI))
282
283 * `byte table`: a series of `byte`s contiguous in memory starting from the
284 address. This is the only kind of address that can be used in
285 indexed addressing.
286
287 ### Blocks ###
288
289 Each routine is a block. It may be composed of inner blocks, if those
290 inner blocks are attached to certain instructions.
291
292 SixtyPical does not have instructions that map literally to the 6502 branch
293 instructions. Instead, it has an `if` construct, with two blocks (for the
294 "then" and `else` parts), and the branch instructions map to conditions for
295 this construct.
296
297 Similarly, there is a `repeat` construct. The same branch instructions can
298 be used in the condition to this construct. In this case, they branch back
299 to the top of the `repeat` loop.
300
301 The abstract states of the machine at each of the different block exits are
302 merged during analysis. If any register or memory location is treated
303 inconsistently (e.g. updated in one branch of the test but not the other),
304 that register cannot subsequently be used without a declaration to the effect
305 that we know what's going on. (This is all a bit fuzzy right now.)
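
The merge described above can be sketched roughly as follows. This is only an illustration in Python, not SixtyPical's analyzer: the names `POISONED` and `merge_contexts`, the use of dicts keyed by register or location name, and the choice to treat a location missing from one context as poisoned are all assumptions made for the example.

```python
# Hypothetical marker for a location whose value can no longer be trusted.
POISONED = "poisoned"


def merge_contexts(first, second):
    """Merge two abstract contexts (dicts: location name -> abstract value).

    A location keeps its abstract value only if both contexts agree on it.
    If the contexts disagree, or either one already marks the location as
    poisoned (or does not mention it, an assumption of this sketch), the
    result marks it as poisoned.
    """
    merged = {}
    for location in first.keys() | second.keys():
        a = first.get(location, POISONED)
        b = second.get(location, POISONED)
        if a == b and a != POISONED:
            merged[location] = a
        else:
            merged[location] = POISONED
    return merged


# The two branches of an `if` agree on X but update A differently.
then_context = {"A": "constant 0", "X": "unchanged"}
else_context = {"A": "constant 1", "X": "unchanged"}
result = merge_contexts(then_context, else_context)
```

Here `result` marks `A` as poisoned and leaves `X` as `"unchanged"`, so a later use of `A` would need an explicit declaration.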
306
307 There is also no `rts` instruction. It is included at the end of a routine,
308 but only when the routine is used as a subroutine. Also, if the routine
309 ends by `jsr`ing another routine, it reserves the right to do a tail-call
310 or even a fallthrough.
311
312 There are also _with_ instructions, which are associated with three opcodes
313 that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
314 instructions take a block. The natural symmetrical opcode is inserted at
315 the end of the block.
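
As a rough illustration of how a _with_ block pairs its opening opcode with the symmetrical closing one, here is a small Python sketch. The pairing `sei`/`cli` is stated in the README; `pha`/`pla` and `php`/`plp` are the standard 6502 counterparts. The function name `expand_with` and the representation of a block as a list of instruction strings are hypothetical and not taken from the SixtyPical implementation.

```python
# Opening opcode -> the opcode that undoes it (standard 6502 pairs).
CLOSING_OPCODE = {
    "pha": "pla",  # push accumulator / pull accumulator
    "php": "plp",  # push processor status / pull processor status
    "sei": "cli",  # set interrupt disable / clear interrupt disable
}


def expand_with(opcode, body):
    """Return the flat instruction list for a `with` block.

    `body` is a list of instruction strings (a hypothetical representation).
    The closing opcode is always appended, so it cannot be forgotten.
    """
    if opcode not in CLOSING_OPCODE:
        raise ValueError(f"{opcode!r} cannot open a with block")
    return [opcode, *body, CLOSING_OPCODE[opcode]]


expanded = expand_with("sei", ["lda #0", "sta screen"])
```

With this input, `expanded` is the body wrapped in `sei` and `cli`.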