49 | 49 |
### Abstract Interpretation ###
|
50 | 50 |
|
51 | 51 |
SixtyPical tries to prevent the program from using data that has no meaning.
|
52 | |
For example, the following:
|
|
52 |
|
|
53 |
The instructions of a routine are analyzed using abstract interpretation.
|
|
54 |
One thing we specifically do is determine which registers and memory locations
|
|
55 |
are *not* affected by the routine. For example, the following:
|
53 | 56 |
|
54 | 57 |
routine do_it {
|
55 | 58 |
lda #0
|
|
62 | 65 |
* the A register is declared to be a meaningful output of `update_score`
|
63 | 66 |
* `update_score` was determined to not change the value of the A register
|
64 | 67 |
|
65 | |
The first must be done with an explicit declaration on `update_score` (NYI).
|
66 | |
The second will be done using abstract interpretation of the code of
|
67 | |
`update_score` (needs to be implemented again, now, and better).
|
|
68 |
The first case must be done with an explicit declaration on `update_score`.
|
|
69 |
The second case will be inferred using abstract interpretation of the code
|
|
70 |
of `update_score`.
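
The two cases can be sketched as a small check. This is only an illustration in Python, not SixtyPical's actual analyzer: the `WRITES` table and the names `unaffected` and `meaningful_after_call` are made up for the example, only the A, X and Y registers are modelled, and flags and memory locations are ignored.

```python
# Hypothetical sketch: which registers does a routine leave alone, and is a
# register still meaningful to the caller after the call?

REGISTERS = frozenset({"a", "x", "y"})

# Registers written by a few 6502 opcodes (flags and memory are ignored here).
WRITES = {
    "lda": {"a"}, "ldx": {"x"}, "ldy": {"y"},
    "tax": {"x"}, "tay": {"y"}, "txa": {"a"}, "tya": {"a"},
    "inx": {"x"}, "iny": {"y"}, "dex": {"x"}, "dey": {"y"},
    "sta": set(), "stx": set(), "sty": set(),
}


def unaffected(opcodes):
    """Return the registers that no instruction in the routine writes."""
    written = set()
    for op in opcodes:
        written |= WRITES[op]
    return REGISTERS - written


def meaningful_after_call(register, declared_outputs, body):
    """A register is usable after a call if the routine declares it as an
    output, or if analysis shows the routine never changes it."""
    return register in declared_outputs or register in unaffected(body)


# An assumed routine body that only touches X and memory.
body = ["ldx", "inx", "stx"]
print(sorted(unaffected(body)))                  # ['a', 'y']
print(meaningful_after_call("a", set(), body))   # True: A is never written
print(meaningful_after_call("x", set(), body))   # False: X changed, not declared
print(meaningful_after_call("x", {"x"}, body))   # True: X is a declared output
```

A real analysis would also have to track flags and memory, and would need to follow calls made by the routine itself.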
|
68 | 71 |
|
69 | 72 |
### Structured Programming ###
|
70 | 73 |
|
71 | |
You get an `if` and a `repeat` and instructions like `sei` work like `with`
|
72 | |
where they are followed by a block and the `cli` instruction is implicitly
|
73 | |
(and unavoidably) added at the end.
|
74 | |
|
75 | |
For more information, see the docs (which are written in the form of a
|
76 | |
Falderal literate test suite).
|
77 | |
|
78 | |
Concepts
|
79 | |
--------
|
80 | |
|
81 | |
### Routines ###
|
|
74 |
SixtyPical eschews labels for code and instead organizes code into _blocks_.
|
82 | 75 |
|
83 | 76 |
Instead of the assembly-language subroutine, SixtyPical provides the _routine_
|
84 | |
as the abstraction for a reusable sequence of code.
|
|
77 |
as the abstraction for a reusable sequence of code. A routine may be called,
|
|
78 |
or may be included inline, by another routine. The body of a routine is a
|
|
79 |
block.
|
85 | 80 |
|
86 | |
A routine may be called, or may be included inline, by another routine.
|
|
81 |
Along with routines, you get `if`, `repeat`, and `with` constructs which take
|
|
82 |
blocks. The `with` construct takes an instruction like `sei` and implicitly
|
|
83 |
(and unavoidably) inserts the corresponding `cli` at the end of the block.
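
As a rough picture of what "implicitly inserts" means, here is a sketch in Python of flattening a `with` block into a plain instruction list. The function name `expand_with` and the symbol `score` are invented for the example. The `sei`/`cli` pair comes from the text above; treating `pha`/`pla` and `php`/`plp` the same way is an assumption about which other opcodes qualify.

```python
# Hypothetical sketch of how a `with` block could be flattened.

# Opening opcode -> the opcode that undoes it on the 6502.
CLOSERS = {"sei": "cli", "pha": "pla", "php": "plp"}


def expand_with(opener, block):
    """Return the block wrapped in `opener` and its matching closing opcode."""
    if opener not in CLOSERS:
        raise ValueError(f"{opener!r} cannot open a `with` block")
    return [opener, *block, CLOSERS[opener]]


print(expand_with("sei", ["lda #0", "sta score"]))
# ['sei', 'lda #0', 'sta score', 'cli']
```

Because the closing opcode is appended by the expansion rather than written by the programmer, there is no way to forget it or to leave the block without it.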
|
87 | 84 |
|
88 | |
There is one top-level routine called `main` which represents the entire
|
89 | |
program.
|
|
85 |
For More Information
|
|
86 |
--------------------
|
90 | 87 |
|
91 | |
The instructions of a routine are analyzed using abstract interpretation.
|
92 | |
One thing we specifically do is determine which registers and memory locations
|
93 | |
are *not* affected by the routine.
|
|
88 |
For more information, see the docs (which are written in the form of
|
|
89 |
Falderal literate test suites. If you have Falderal installed, you can run
|
|
90 |
the tests with `./test.sh`.)
|
94 | 91 |
|
95 | |
If a register is not affected by a routine, then a caller of that routine may
|
96 | |
assume that the value in that register is retained.
|
|
92 |
Ideas
|
|
93 |
-----
|
97 | 94 |
|
98 | |
Of course, a routine may intentionally affect a register or memory location,
|
99 | |
as an output. It must declare this. We're not there yet.
|
|
95 |
These aren't implemented yet:
|
|
96 |
|
|
97 |
* Abstract interpretation must extend to `if`, `repeat`, and `with`
|
|
98 |
blocks. The two incoming contexts must be merged, and any storage
|
|
99 |
locations updated differently or poisoned in either context will be
|
|
100 |
considered poisoned in the result context.
|
100 | 101 |
|
101 | |
### Addresses ###
|
102 | |
|
103 | |
The body of a routine may not refer to an address literally. It must use
|
104 | |
a symbol that was declared previously.
|
105 | |
|
106 | |
An address may be declared with `reserve`, which is like `.data` or `.bss`
|
107 | |
in an assembler. This is an address into the program's data. It is global
|
108 | |
to all routines.
|
109 | |
|
110 | |
An address may be declared with `locate`, which is like `.alias` in an
|
111 | |
assembler, with the understanding that the value will be treated "like an
|
112 | |
address." This is generally an address into the operating system or hardware
|
113 | |
(e.g. kernal routine, I/O port, etc.)
|
114 | |
|
115 | |
Not there yet:
|
116 | |
|
117 | |
> Inside a routine, an address may be declared with `temporary`. This is like
|
118 | |
> `static` in C, except the value at that address is not guaranteed to be
|
119 | |
> retained between invocations of the routine. Such addresses may only be used
|
120 | |
> within the routine where they are declared. If analysis indicates that two
|
121 | |
> temporary addresses are never used simultaneously, they may be merged
|
122 | |
> to the same address.
|
123 | |
|
124 | |
An address knows what kind of data is stored at the address:
|
125 | |
|
126 | |
* `byte`: an 8-bit byte. not part of a word. not to be used as an address.
|
127 | |
(could be an index though.)
|
128 | |
* `word`: a 16-bit word. not to be used as an address.
|
129 | |
* `vector`: a 16-bit address of a routine. Only a handful of operations
|
130 | |
are supported on vectors:
|
131 | |
|
132 | |
* copying the contents of one vector to another
|
133 | |
* copying the address of a routine into a vector
|
134 | |
* jumping indirectly to a vector (i.e. to the code at the address
|
135 | |
contained in the vector; this can only happen at the end of a
|
136 | |
routine (NYI))
|
137 | |
* `jsr`'ing indirectly to a vector (which is done with a fun
|
138 | |
generated trick (NYI))
|
139 | |
|
140 | |
* `byte table`: a series of `byte`s contiguous in memory starting from the
|
141 | |
address. This is the only kind of address that can be used in
|
142 | |
indexed addressing.
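
The rule that only a `byte table` may be indexed is the sort of thing a symbol table can enforce. The following Python sketch assumes a plain dictionary from declared names to kinds. The `Kind` enum, `check_indexed`, and the symbols `score` and `screen` are all hypothetical and do not reflect how SixtyPical is actually implemented.

```python
from enum import Enum


class Kind(Enum):
    """The kinds of data an address can hold, as listed above."""
    BYTE = "byte"
    WORD = "word"
    VECTOR = "vector"
    BYTE_TABLE = "byte table"


def check_indexed(symbols, name):
    """Raise unless `name` is a declared byte table, the only kind that
    may be used with indexed addressing."""
    if name not in symbols:
        raise NameError(f"undeclared address: {name}")
    if symbols[name] is not Kind.BYTE_TABLE:
        raise TypeError(f"{name} is a {symbols[name].value}, not a byte table")


# Made-up declarations for illustration.
symbols = {"score": Kind.BYTE, "screen": Kind.BYTE_TABLE}

check_indexed(symbols, "screen")   # accepted silently
try:
    check_indexed(symbols, "score")
except TypeError as err:
    print(err)                     # score is a byte, not a byte table
```

The same lookup also covers the rule that a routine may only refer to previously declared symbols: an unknown name is rejected before its kind is even considered.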
|
143 | |
|
144 | |
### Blocks ###
|
145 | |
|
146 | |
Each routine is a block. It may be composed of inner blocks, if those
|
147 | |
inner blocks are attached to certain instructions.
|
148 | |
|
149 | |
SixtyPical does not have instructions that map literally to the 6502 branch
|
150 | |
instructions. Instead, it has an `if` construct, with two blocks (for the
|
151 | |
"then" and `else` parts), and the branch instructions map to conditions for
|
152 | |
this construct.
|
153 | |
|
154 | |
Similarly, there is a `repeat` construct. The same branch instructions can
|
155 | |
be used in the condition to this construct. In this case, they branch back
|
156 | |
to the top of the `repeat` loop.
|
157 | |
|
158 | |
The abstract states of the machine at each of the different block exits are
|
159 | |
merged during analysis. If any register or memory location is treated
|
160 | |
inconsistently (e.g. updated in one branch of the test, but not the other),
|
161 | |
that register cannot subsequently be used without a declaration to the effect
|
162 | |
that we know what's going on. (This is all a bit fuzzy right now.)
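
To make the merge a little less fuzzy, here is one way it could look, sketched in Python. Everything here is an assumption made for illustration: a context is modelled as a dictionary from storage location to an abstract value, the value strings are invented, and a location missing from one context is treated as poisoned.

```python
# Hypothetical sketch of merging the abstract states at two block exits.

POISONED = "poisoned"


def merge(ctx_a, ctx_b):
    """Merge two contexts mapping storage location -> abstract value.

    A location keeps its value only when both contexts agree on it;
    otherwise it is poisoned in the result.
    """
    merged = {}
    for loc in ctx_a.keys() | ctx_b.keys():
        a = ctx_a.get(loc, POISONED)
        b = ctx_b.get(loc, POISONED)
        merged[loc] = a if a == b else POISONED
    return merged


before = {"a": "unchanged", "x": "unchanged"}
then_exit = {"a": "const 0", "x": "unchanged"}   # branch that ran `lda #0`
else_exit = dict(before)                          # branch that did nothing

result = merge(then_exit, else_exit)
print(result["a"])   # poisoned
print(result["x"])   # unchanged
```

After the merge, A was updated in only one branch and so cannot be relied on, while X was treated the same way on both paths and stays usable.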
|
163 | |
|
164 | |
There is also no `rts` instruction. It is included at the end of a routine,
|
165 | |
but only when the routine is used as a subroutine. Also, if the routine
|
166 | |
ends by `jsr`ing another routine, it reserves the right to do a tail-call
|
167 | |
or even a fallthrough.
|
168 | |
|
169 | |
There are also _with_ instructions, which are associated with three opcodes
|
170 | |
that have natural symmetrical opcodes: `pha`, `php`, and `sei`. These
|
171 | |
instructions take a block. The natural symmetrical opcode is inserted at
|
172 | |
the end of the block.
|
|
102 |
* Inside a routine, an address may be declared with `temporary`. This is like
|
|
103 |
`static` in C, except the value at that address is not guaranteed to be
|
|
104 |
retained between invocations of the routine. Such addresses may only be used
|
|
105 |
within the routine where they are declared. If analysis indicates that two
|
|
106 |
temporary addresses are never used simultaneously, they may be merged
|
|
107 |
to the same address.
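
The merging of temporaries could work along the following lines. This Python sketch assumes the analysis has already reduced each temporary to a single inclusive range of instruction positions where it is in use, which is a simplification (loops and branches make real liveness less tidy). The name `assign_slots` and the example ranges are invented.

```python
def assign_slots(live_ranges):
    """Give each temporary a slot number; temporaries whose live ranges
    never overlap may share a slot.

    live_ranges maps name -> (first_use, last_use), both inclusive.
    """
    slots = []        # slots[i] = last_use of the temporary currently in slot i
    assignment = {}
    by_start = sorted(live_ranges.items(), key=lambda item: item[1][0])
    for name, (start, end) in by_start:
        for i, busy_until in enumerate(slots):
            if busy_until < start:    # slot i is free again, so reuse it
                slots[i] = end
                assignment[name] = i
                break
        else:                         # every slot is still in use
            slots.append(end)
            assignment[name] = len(slots) - 1
    return assignment


# t1 and t2 are never in use at the same time; t3 overlaps both.
ranges = {"t1": (0, 3), "t2": (4, 7), "t3": (2, 5)}
slots = assign_slots(ranges)
print(slots["t1"] == slots["t2"])   # True: they can share one address
print(slots["t1"] == slots["t3"])   # False: t3 needs its own address
```

Each slot number would then be mapped to one reserved address, so three temporaries here cost only two bytes of storage.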
|
173 | 108 |
|
174 | 109 |
TODO
|
175 | 110 |
----
|
|
177 | 112 |
* Initial values for reserved, incl. tables
|
178 | 113 |
* give length for tables, must be there for reserved, if no init val
|
179 | 114 |
* Character tables ("strings" to everybody else)
|
180 | |
* Work out the analyses again and document them
|
181 | 115 |
* Addressing modes — indexed mode on more instructions
|
182 | 116 |
* `jsr (vector)`
|
183 | 117 |
* `jmp routine`
|
184 | 118 |
* insist on EOL after each instruction. need spacesWOEOL production
|
185 | 119 |
* asl .a
|
|
120 |
* `outputs` on externals
|