git @ Cat's Eye Technologies SixtyPical / 0.12
Merge pull request #8 from catseye/develop-0.12 Develop 0.12 Chris Pressey authored 3 years ago GitHub committed 3 years ago
16 changed file(s) with 877 addition(s) and 376 deletion(s). Raw diff Collapse all Expand all
00 History of SixtyPical
11 =====================
2
3 0.12
4 ----
5
6 * `copy` is now understood to trash `a`, thus it is not valid to use `a` in `copy`.
7 To compensate, indirect addressing is supported in `ld` and `st`, for example,
8 as `ld a, [ptr] + y` and `st a, [ptr] + y`.
9 * Implements the "union rule for trashes" when analyzing `if` blocks.
10 * Even if we `goto` another routine, we can't trash an output.
11 * `static` storage locations local to routines can now be defined within routines.
12 * Small grammar changes that obviate the need for:
13 * the parentheses in type expressions like `vector (routine ...) table[256]`
14 * the `forward` keyword in forward references in source of `copy` instruction
15 * Fixed bug where `trash` was not marking the location as being virtually altered.
216
317 0.11
418 ----
00 SixtyPical
11 ==========
22
3 _Version 0.11. Work-in-progress, everything is subject to change._
3 _Version 0.12. Work-in-progress, everything is subject to change._
44
55 SixtyPical is a very low-level programming language, similar to 6502 assembly,
66 with static analysis through abstract interpretation.
3838 TODO
3939 ----
4040
41 ### `low` and `high` address operators
42
43 To turn `word` type into `byte`.
44
4541 ### Save registers on stack
4642
4743 This preserves them, so that, semantically, they can be used later even though they
5551 This should be tracked in the abstract interpretation.
5652 (If only because abstract interpretation is the major point of this project!)
5753
58 ### Routine-local static memory locations
54 Range-checking buffers might be too difficult. Range checking tables will be easier.
55 If a value is ANDed with 15, its range must be 0-15, etc.
5956
60 These would not need to appear in the inputs/outputs/trashes sets of the routines
61 that call this routine.
57 ### Re-order routines and optimize tail-calls to fallthroughs
6258
63 These might be forced to specify an initial value so that they can always be
64 assumed to be meaningful.
65
66 ### More modes for `copy`
67
68 * don't allow `copy foo, a` probably. insist on `ld a, foo` for this.
69 * have `copy` instruction able to copy a byte to a user-def mem loc, etc.
70 * `copy x, [ptr] + y`
71 * Maybe even `copy [ptra] + y, [ptrb] + y`, which can be compiled to indirect LDA then indirect STA!
72
73 ### Union rule for trashes in `if`
74
75 If one branch trashes {`a`} and the other branch trashes {`b`} then the whole
76 `if` statement trashes {`a`, `b`}.
59 Not because it saves 3 bytes, but because it's a neat trick. Doing it optimally
60 is probably NP-complete. But doing it adeuqately is probably not that hard.
7761
7862 ### And at some point...
7963
64 * `low` and `high` address operators - to turn `word` type into `byte`.
65 * `const`s that can be used in defining the size of tables, etc.
66 * Tests, and implementation, ensuring a routine can be assigned to a vector of "wider" type
67 * Related: can we simply view a (small) part of a buffer as a byte table? If not, why not?
8068 * Check that the buffer being read or written to through pointer, appears in approporiate inputs or outputs set.
69 (Associate each pointer with the buffer it points into.)
70 * `static` pointers -- currently not possible because pointers must be zero-page, thus `@`, thus uninitialized.
71 * Question the value of the "consistent initialization" principle for `if` statement analysis.
8172 * `interrupt` routines -- to indicate that "the supervisor" has stored values on the stack, so we can trash them.
82 * error messages that include the line number of the source code
83 * add absolute addressing in shl/shr, absolute-indexed for add, sub, etc.
84 * check and disallow recursion.
85 * automatic tail-call optimization (could be tricky, w/constraints?)
86 * re-order routines and optimize tail-calls to fallthroughs
73 * Error messages that include the line number of the source code.
74 * Add absolute addressing in shl/shr, absolute-indexed for add, sub, etc.
75 * Automatic tail-call optimization (could be tricky, w/constraints?)
76 * Possibly `ld x, [ptr] + y`, possibly `st x, [ptr] + y`.
77 * Maybe even `copy [ptra] + y, [ptrb] + y`, which can be compiled to indirect LDA then indirect STA!
2525 The intent is not to make it absolutely impossible to make such errors,
2626 just harder.
2727
28 ### Things it will Not Do ###
29
30 To emphasize the point, the intent is not to make it impossible to make
31 data-usage (and other) errors, just harder.
32
33 Here are some things SixtyPical will not try to detect or prevent you
34 from doing:
35
36 * Check that a vector is initialized before it's called.
37 * Check that the stack has enough room on it.
38 * Prevent bad things happening (e.g. clobbering a static storage
39 location) because of a recursive call. (You can always recursively
40 call yourself through a vector.)
41 * Check that reads and writes to a buffer are in bounds. (This may
42 happen someday, but it's difficult. It's more likely that this
43 will happen for tables, than for buffers.)
44
45 At one point I wanted to do a call-tree analysis to find sets of
46 routines that would never be called together (i.e. would never be on
47 the call stack at the same time) and allow any static storage locations
48 defined within them to occupy the same addresses, i.e. allow storage
49 to be re-used across these routines. But, in the presence of vectors,
50 this becomes difficult (see "Prevent bad things happening", above.)
51 Also, it would usually only save a few bytes of storage space.
52
2853 ### Some Background ###
2954
3055 The ideas in SixtyPical came from a couple of places.
147147 LDA ($02), Y
148148 STA ($02), Y
149149
150 There are extended modes of `copy` for using these types of memory location.
150 There are extended instruction modes for using these types of memory location.
151151 See `copy` below, but here is some illustrative example code:
152152
153153 copy ^buf, ptr // this is the only way to initialize a pointer
237237 Some combinations, such as `ld x, y`, are illegal because they do not map to
238238 underlying opcodes.
239239
240 There is another mode of `ld` which reads into `a` indirectly through a pointer.
241
242 ld a, [<src-memory-location>] + y
243
244 The memory location in this syntax must be a pointer.
245
246 This syntax copies the contents of memory at the pointer (offset by the `y`
247 register) into a register (which must be the `a` register.)
248
249 In addition to the constraints above, `y` must be initialized before
250 this mode is used.
251
240252 ### st ###
241253
242254 st <src-memory-location>, <dest-memory-location> [+ <index-memory-location>]
252264 changed by this instruction (unless of course dest is a flag.)
253265
254266 If and only if dest is a byte table, the index-memory-location must be given.
267
268 There is another mode of `st` which write `a` into memory, indirectly through
269 a pointer.
270
271 st a, [<dest-memory-location>] + y
272
273 The memory location in this syntax must be a pointer.
274
275 This syntax copies the constents of the `a` register into
276 the contents of memory at the pointer (offset by the `y` register).
277
278 In addition to the constraints above, `y` must be initialized before
279 this mode is used.
255280
256281 ### copy ###
257282
511536 Program ::= {TypeDefn} {Defn} {Routine}.
512537 TypeDefn::= "typedef" Type Ident<new>.
513538 Defn ::= Type Ident<new> [Constraints] (":" Literal | "@" LitWord).
514 Type ::= "(" Type ")" | TypeExpr ["table" TypeSize].
539 Type ::= TypeTerm ["table" TypeSize].
515540 TypeExpr::= "byte"
516541 | "word"
517542 | "buffer" TypeSize
518543 | "pointer"
519 | "vector" Type
544 | "vector" TypeTerm
520545 | "routine" Constraints
546 | "(" Type ")"
521547 .
522548 TypeSize::= "[" LitWord "]".
523549 Constrnt::= ["inputs" LocExprs] ["outputs" LocExprs] ["trashes" LocExprs].
0 // SixtyPical 0.11 introduced a new syntax for defining routines
1 // (routine was made into a type, and you can now say: define name type.)
2
3 typedef routine
4 inputs x
5 outputs x
6 trashes z, n
7 routine_type
8
9 vector routine_type vec
10
11 define foo routine_type
12 {
13 inc x
14 }
15
16 define main routine
17 outputs vec
18 trashes a, z, n
19 {
20 copy foo, vec
21 }
3030 //
3131
3232 typedef routine
33 inputs joy2, button_down, press_fire_msg, dispatch_game_state, save_x,
34 actor_pos, pos, new_pos, actor_delta, delta, actor_logic,
33 inputs joy2, press_fire_msg, dispatch_game_state,
34 actor_pos, actor_delta, actor_logic,
3535 screen, screen1, screen2, screen3, screen4, colormap1, colormap2, colormap3, colormap4
36 outputs button_down, dispatch_game_state,
37 actor_pos, pos, new_pos, actor_delta, delta, actor_logic,
36 outputs dispatch_game_state,
37 actor_pos, actor_delta, actor_logic,
3838 screen, screen1, screen2, screen3, screen4, colormap1, colormap2, colormap3, colormap4
39 trashes a, x, y, c, z, n, v, ptr, save_x, compare_target, dispatch_logic
39 trashes a, x, y, c, z, n, v, pos, new_pos, delta, ptr, dispatch_logic
4040 game_state_routine
4141
4242 //
5050 typedef routine
5151 inputs pos, delta, joy2, screen
5252 outputs pos, delta, new_pos, screen, c
53 trashes a, x, y, z, n, v, ptr, compare_target
53 trashes a, x, y, z, n, v, ptr
5454 logic_routine
5555
5656 // ----------------------------------------------------------------
8686 word table[256] actor_delta
8787 word delta
8888
89 vector (logic_routine) table[256] actor_logic
89 vector logic_routine table[256] actor_logic
9090 vector logic_routine dispatch_logic
9191
92 byte button_down : 0 // effectively static-local to check_button
9392 byte table[32] press_fire_msg: "PRESS`FIRE`TO`PLAY"
94
95 byte save_x
96 word compare_target
9793
9894 //
9995 // Points to the routine that implements the current game state.
157153 // call this routine.
158154 // Upon return, if carry is set, the button was pressed then released.
159155
160 routine check_button
161 inputs joy2, button_down
162 outputs c, button_down
156 define check_button routine
157 inputs joy2
158 outputs c
163159 trashes a, z, n
160 static byte button_down : 0
164161 {
165162 ld a, button_down
166163 if z {
217214 add new_pos, delta
218215 }
219216
220 routine check_new_position_in_bounds
217 define check_new_position_in_bounds routine
221218 inputs new_pos
222219 outputs c
223 trashes compare_target, a, z, n, v
220 trashes a, z, n, v
221 static word compare_target : 0
224222 {
225223 copy 1000, compare_target
226224 st on, c
242240
243241 routine init_game
244242 inputs actor_pos, actor_delta, actor_logic
245 outputs actor_pos, actor_delta, pos, actor_logic
246 trashes a, y, z, n, c, v
243 outputs actor_pos, actor_delta, actor_logic
244 trashes pos, a, y, z, n, c, v
247245 {
248246 ld y, 0
249247 copy word 0, pos
250248 repeat {
251249 copy pos, actor_pos + y
252250 copy word 40, actor_delta + y
253 copy forward enemy_logic, actor_logic + y
251 copy enemy_logic, actor_logic + y
254252
255253 st off, c
256254 add pos, word 7
262260 ld y, 0
263261 copy word 0, actor_pos + y
264262 copy word 0, actor_delta + y
265 copy forward player_logic, actor_logic + y
263 copy player_logic, actor_logic + y
266264 }
267265
268266 // ----------------------------------------------------------------
283281 ld y, 0
284282
285283 // check collision.
286 copy [ptr] + y, a
284 ld a, [ptr] + y
287285 // if "collision" is with your own self, treat it as if it's blank space!
288286 cmp a, 81
289287 if z {
306304 st off, c
307305 } else {
308306 st on, c
309 trash n
310 trash a
311 trash z
307 }
308
309 // FIXME these trashes, strictly speaking, probably shouldn't be needed,
310 // but currently the compiler cares a little too much about values that are
311 // initialized in one branch of an `if`, but not the other, but are trashed
312 // at the end of the routine anyway.
313 trash ptr
314 trash y
315 trash v
316 } else {
317 st off, c
318 }
319 }
320
321 define enemy_logic logic_routine
322 static word compare_target : 0
323 {
324 call calculate_new_position
325 call check_new_position_in_bounds
326
327 if c {
328 copy ^screen, ptr
329 st off, c
330 add ptr, new_pos
331 ld y, 0
332
333 // check collision.
334 ld a, [ptr] + y
335 // if "collision" is with your own self, treat it as if it's blank space!
336 cmp a, 82
337 if z {
338 ld a, 32
339 }
340 cmp a, 32
341 if z {
342 copy ^screen, ptr
343 st off, c
344 add ptr, pos
345 copy 32, [ptr] + y
346
347 copy new_pos, pos
348
349 copy ^screen, ptr
350 st off, c
351 add ptr, pos
352 copy 82, [ptr] + y
353
354 st off, c
355 } else {
356 st on, c
312357 }
313358
314359 // FIXME these trashes, strictly speaking, probably shouldn't be needed,
317362 // at the end of the routine anyway.
318363 trash ptr
319364 trash y
320 trash a
321 trash v
322 } else {
323 st off, c
324 }
325 }
326
327 define enemy_logic logic_routine
328 {
329 call calculate_new_position
330 call check_new_position_in_bounds
331
332 if c {
333 copy ^screen, ptr
334 st off, c
335 add ptr, new_pos
336 ld y, 0
337
338 // check collision.
339 copy [ptr] + y, a
340 // if "collision" is with your own self, treat it as if it's blank space!
341 cmp a, 82
342 if z {
343 ld a, 32
344 }
345 cmp a, 32
346 if z {
347 copy ^screen, ptr
348 st off, c
349 add ptr, pos
350 copy 32, [ptr] + y
351
352 copy new_pos, pos
353
354 copy ^screen, ptr
355 st off, c
356 add ptr, pos
357 copy 82, [ptr] + y
358
359 st off, c
360 } else {
361 st on, c
362 trash n
363 trash a
364 trash z
365 }
366
367 // FIXME these trashes, strictly speaking, probably shouldn't be needed,
368 // but currently the compiler cares too much about values that are
369 // initialized in one branch of an `if`, but not the other, but trashed
370 // at the end of the routine anyway.
371 trash ptr
372 trash y
373 trash a
374365 } else {
375366 copy delta, compare_target
376367 st on, c
380371 } else {
381372 copy $ffd8, delta
382373 }
383 trash compare_target
384374 }
385375
386376 st off, c
410400 if c {
411401 call clear_screen
412402 call init_game
413 copy forward game_state_play, dispatch_game_state
414
415 // FIXME these trashes, strictly speaking, probably shouldn't be needed,
416 // but currently the compiler cares too much about values that are
417 // initialized in one branch of an `if`, but not the other, but trashed
418 // at the end of the routine anyway.
419 trash a
420 trash n
421 trash z
422 } else {
423 trash y
424 trash c
425 trash v
403 copy game_state_play, dispatch_game_state
426404 }
427405
428406 goto save_cinv
429407 }
430408
431409 define game_state_play game_state_routine
410 static byte save_x : 0
432411 {
433412 ld x, 0
434413 repeat {
443422 if c {
444423 // Player died! Want no dead! Break out of the loop (this is a bit awkward.)
445424 call clear_screen
446 copy forward game_state_game_over, dispatch_game_state
425 copy game_state_game_over, dispatch_game_state
447426 ld x, 15
448427 } else {
449428 ld x, save_x
450 trash c
451429 }
452430
453431 copy pos, actor_pos + x
469447 call clear_screen
470448 call init_game
471449 copy game_state_title_screen, dispatch_game_state
472
473 // FIXME these trashes, strictly speaking, probably shouldn't be needed,
474 // but currently the compiler cares too much about values that are
475 // initialized in one branch of an `if`, but not the other, but trashed
476 // at the end of the routine anyway.
477 trash a
478 trash n
479 trash z
480 } else {
481 trash y
482 trash c
483 trash v
484450 }
485451
486452 goto save_cinv
0 //
1 // Demonstrates vector tables.
2 // Prints "AABAB".
3 //
4
05 vector routine
16 trashes a, z, n
27 print
3035 trashes print, a, x, z, n, c
3136 {
3237 ld x, 0
33 copy printa, print
34 copy print, vectors + x
38 copy printa, vectors + x
3539 inc x
3640 copy printa, print
3741 copy print, vectors + x
3943 copy printb, print
4044 copy print, vectors + x
4145 inc x
42 copy printa, print
43 copy print, vectors + x
46 copy printa, vectors + x
4447 inc x
45 copy printb, print
46 copy print, vectors + x
47
48 copy printa, print
48 copy printb, vectors + x
4949
5050 ld x, 0
5151 repeat {
+0
-51
eg/vector-table2.60p less more
0 vector routine
1 trashes a, z, n
2 print
3
4 vector (routine
5 trashes a, z, n)
6 table[32] vectors
7
8 routine chrout
9 inputs a
10 trashes a
11 @ 65490
12
13 routine printa
14 trashes a, z, n
15 {
16 ld a, 65
17 call chrout
18 }
19
20 routine printb
21 trashes a, z, n
22 {
23 ld a, 66
24 call chrout
25 }
26
27 routine main
28 inputs vectors
29 outputs vectors
30 trashes print, a, x, z, n, c
31 {
32 ld x, 0
33 copy printa, vectors + x
34 inc x
35 copy printa, vectors + x
36 inc x
37 copy printb, vectors + x
38 inc x
39 copy printa, vectors + x
40 inc x
41 copy printb, vectors + x
42
43 ld x, 0
44 repeat {
45 copy vectors + x, print
46 call print
47 inc x
48 cmp x, 5
49 } until z
50 }
00 #!/bin/sh
11
2 if [ "X$X64" = "X" ]; then
3 X64=x64
4 fi
25 SRC=$1
36 if [ "X$1" = "X" ]; then
47 echo "Usage: ./loadngo.sh <source.60p>"
69 fi
710 OUT=/tmp/a-out.prg
811 bin/sixtypical --traceback --basic-prelude $SRC > $OUT || exit 1
12 ls -la $OUT
913 if [ -e vicerc ]; then
10 x64 -config vicerc $OUT
14 $X64 -config vicerc $OUT
1115 else
12 x64 $OUT
16 $X64 $OUT
1317 fi
1418 rm -f $OUT
5050
5151 class IncompatibleConstraintsError(ConstraintsError):
5252 pass
53
54
55 def routine_has_static(routine, ref):
56 if not hasattr(routine, 'statics'):
57 return False
58 for static in routine.statics:
59 if static.location == ref:
60 return True
61 return False
5362
5463
5564 class Context(object):
100109 c._writeable = set(self._writeable)
101110 return c
102111
103 def set_from(self, c):
104 assert c.routines == self.routines
105 assert c.routine == self.routine
106 self._touched = set(c._touched)
107 self._meaningful = set(c._meaningful)
108 self._writeable = set(c._writeable)
109
110112 def each_meaningful(self):
111113 for ref in self._meaningful:
112114 yield ref
118120 def assert_meaningful(self, *refs, **kwargs):
119121 exception_class = kwargs.get('exception_class', UnmeaningfulReadError)
120122 for ref in refs:
123 # statics are always meaningful
124 if routine_has_static(self.routine, ref):
125 continue
121126 if ref.is_constant() or ref in self.routines:
122127 pass
123128 elif isinstance(ref, LocationRef):
126131 if kwargs.get('message'):
127132 message += ' (%s)' % kwargs['message']
128133 raise exception_class(message)
134 elif isinstance(ref, IndexedRef):
135 self.assert_meaningful(ref.ref, **kwargs)
136 self.assert_meaningful(ref.index, **kwargs)
129137 else:
130138 raise NotImplementedError(ref)
131139
132140 def assert_writeable(self, *refs, **kwargs):
133141 exception_class = kwargs.get('exception_class', ForbiddenWriteError)
134142 for ref in refs:
143 # statics are always writeable
144 if routine_has_static(self.routine, ref):
145 continue
135146 if ref not in self._writeable:
136147 message = '%s in %s' % (ref.name, self.routine.name)
137148 if kwargs.get('message'):
203214 if routine.block is None:
204215 # it's an extern, that's fine
205216 return
206 type = routine.location.type
207 context = Context(self.routines, routine, type.inputs, type.outputs, type.trashes)
217 type_ = routine.location.type
218 context = Context(self.routines, routine, type_.inputs, type_.outputs, type_.trashes)
219
208220 if self.debug:
209221 print "at start of routine `{}`:".format(routine.name)
210222 print context
223
211224 self.analyze_block(routine.block, context)
225 trashed = set(context.each_touched()) - context._meaningful
226
212227 if self.debug:
213228 print "at end of routine `{}`:".format(routine.name)
214229 print context
230 print "trashed: ", LocationRef.format_set(trashed)
231 print "outputs: ", LocationRef.format_set(type_.outputs)
232 trashed_outputs = type_.outputs & trashed
233 if trashed_outputs:
234 print "TRASHED OUTPUTS: ", LocationRef.format_set(trashed_outputs)
235 print ''
236 print '-' * 79
237 print ''
238
239 # even if we goto another routine, we can't trash an output.
240 for ref in trashed:
241 if ref in type_.outputs:
242 raise UnmeaningfulOutputError('%s in %s' % (ref.name, routine.name))
243
215244 if not self.has_encountered_goto:
216 for ref in type.outputs:
245 for ref in type_.outputs:
217246 context.assert_meaningful(ref, exception_class=UnmeaningfulOutputError)
218247 for ref in context.each_touched():
219 if ref not in type.outputs and ref not in type.trashes:
248 if ref not in type_.outputs and ref not in type_.trashes and not routine_has_static(routine, ref):
220249 message = '%s in %s' % (ref.name, routine.name)
221250 raise ForbiddenWriteError(message)
222251 self.current_routine = None
235264 src = instr.src
236265
237266 if opcode == 'ld':
238 if instr.index:
239 if TableType.is_a_table_type(src.type, TYPE_BYTE) and dest.type == TYPE_BYTE:
267 if isinstance(src, IndexedRef):
268 if TableType.is_a_table_type(src.ref.type, TYPE_BYTE) and dest.type == TYPE_BYTE:
240269 pass
241270 else:
242271 raise TypeMismatchError('%s and %s in %s' %
243 (src.name, dest.name, self.current_routine.name)
272 (src.ref.name, dest.name, self.current_routine.name)
244273 )
245 context.assert_meaningful(instr.index)
274 context.assert_meaningful(src, src.index)
275 elif isinstance(src, IndirectRef):
276 # copying this analysis from the matching branch in `copy`, below
277 if isinstance(src.ref.type, PointerType) and dest.type == TYPE_BYTE:
278 pass
279 else:
280 raise TypeMismatchError((src, dest))
281 context.assert_meaningful(src.ref, REG_Y)
246282 elif src.type != dest.type:
247283 raise TypeMismatchError('%s and %s in %s' %
248284 (src.name, dest.name, self.current_routine.name)
249285 )
250 context.assert_meaningful(src)
286 else:
287 context.assert_meaningful(src)
251288 context.set_written(dest, FLAG_Z, FLAG_N)
252289 elif opcode == 'st':
253 if instr.index:
254 if src.type == TYPE_BYTE and TableType.is_a_table_type(dest.type, TYPE_BYTE):
255 pass
256 else:
257 raise TypeMismatchError((src, dest))
258 context.assert_meaningful(instr.index)
290 if isinstance(dest, IndexedRef):
291 if src.type == TYPE_BYTE and TableType.is_a_table_type(dest.ref.type, TYPE_BYTE):
292 pass
293 else:
294 raise TypeMismatchError((src, dest))
295 context.assert_meaningful(dest.index)
296 context.set_written(dest.ref)
297 elif isinstance(dest, IndirectRef):
298 # copying this analysis from the matching branch in `copy`, below
299 if isinstance(dest.ref.type, PointerType) and src.type == TYPE_BYTE:
300 pass
301 else:
302 raise TypeMismatchError((src, dest))
303 context.assert_meaningful(dest.ref, REG_Y)
304 context.set_written(dest.ref)
259305 elif src.type != dest.type:
260306 raise TypeMismatchError('%r and %r in %s' %
261307 (src, dest, self.current_routine.name)
262308 )
309 else:
310 context.set_written(dest)
263311 context.assert_meaningful(src)
264 context.set_written(dest)
265312 elif opcode == 'add':
266313 context.assert_meaningful(src, dest, FLAG_C)
267314 if src.type == TYPE_BYTE:
318365 context.set_touched(ref)
319366 context.set_unmeaningful(ref)
320367 elif opcode == 'if':
368 incoming_meaningful = set(context.each_meaningful())
369
321370 context1 = context.clone()
322371 context2 = context.clone()
323372 self.analyze_block(instr.block1, context1)
324373 if instr.block2 is not None:
325374 self.analyze_block(instr.block2, context2)
375
376 outgoing_meaningful = set(context1.each_meaningful()) & set(context2.each_meaningful())
377 outgoing_trashes = incoming_meaningful - outgoing_meaningful
378
326379 # TODO may we need to deal with touched separately here too?
327380 # probably not; if it wasn't meaningful in the first place, it
328381 # doesn't really matter if you modified it or not, coming out.
329382 for ref in context1.each_meaningful():
383 if ref in outgoing_trashes:
384 continue
330385 context2.assert_meaningful(
331386 ref, exception_class=InconsistentInitializationError,
332387 message='initialized in block 1 but not in block 2 of `if {}`'.format(src)
333388 )
334389 for ref in context2.each_meaningful():
390 if ref in outgoing_trashes:
391 continue
335392 context1.assert_meaningful(
336393 ref, exception_class=InconsistentInitializationError,
337394 message='initialized in block 2 but not in block 1 of `if {}`'.format(src)
338395 )
339 context.set_from(context1)
396
397 # merge the contexts. this used to be a method called `set_from`
398 context._touched = set(context1._touched) | set(context2._touched)
399 context._meaningful = outgoing_meaningful
400 context._writeable = set(context1._writeable) | set(context2._writeable)
401
402 for ref in outgoing_trashes:
403 context.set_touched(ref)
404 context.set_unmeaningful(ref)
405
340406 elif opcode == 'repeat':
341407 # it will always be executed at least once, so analyze it having
342408 # been executed the first time.
351417 context.assert_meaningful(src)
352418
353419 elif opcode == 'copy':
420 if dest == REG_A:
421 raise ForbiddenWriteError("{} cannot be used as destination for copy".format(dest))
422
354423 # 1. check that their types are compatible
355424
356425 if isinstance(src, AddressRef) and isinstance(dest, LocationRef):
410479 context.set_written(dest.ref)
411480 elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef):
412481 context.assert_meaningful(src.ref, REG_Y)
413 # TODO this will need to be more sophisticated. the thing ref points to is touched, as well.
414482 context.set_written(dest)
415483 elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef):
416484 context.assert_meaningful(src, dest.ref, dest.index)
427495 context.set_written(dest)
428496
429497 context.set_touched(REG_A, FLAG_Z, FLAG_N)
430 context.set_unmeaningful(FLAG_Z, FLAG_N)
431
432 # FIXME: this is just to support "copy [foo] + y, a". consider disallowing `a` as something
433 # that can be used in `copy`. should be using `st` or `ld` instead, probably.
434 if dest == REG_A:
435 context.set_touched(REG_A)
436 context.set_written(REG_A)
437 else:
438 context.set_unmeaningful(REG_A)
498 context.set_unmeaningful(REG_A, FLAG_Z, FLAG_N)
439499
440500 elif opcode == 'with-sei':
441501 self.analyze_block(instr.block, context)
460520
461521 self.has_encountered_goto = True
462522 elif opcode == 'trash':
523 context.set_touched(instr.dest)
463524 context.set_unmeaningful(instr.dest)
464525 else:
465526 raise NotImplementedError(opcode)
2727 class Compiler(object):
2828 def __init__(self, emitter):
2929 self.emitter = emitter
30 self.routines = {}
31 self.labels = {}
32 self.trampolines = {} # Location -> Label
30 self.routines = {} # routine.name -> Routine
31 self.routine_statics = {} # routine.name -> { static.name -> Label }
32 self.labels = {} # global.name -> Label ("global" includes routines)
33 self.trampolines = {} # Location -> Label
34 self.current_routine = None
3335
3436 # helper methods
3537
4143 else:
4244 raise NotImplementedError(index)
4345
46 def compute_length_of_defn(self, defn):
47 length = None
48 type_ = defn.location.type
49 if type_ == TYPE_BYTE:
50 length = 1
51 elif type_ == TYPE_WORD or isinstance(type_, (PointerType, VectorType)):
52 length = 2
53 elif isinstance(type_, TableType):
54 length = type_.size * (1 if type_.of_type == TYPE_BYTE else 2)
55 elif isinstance(type_, BufferType):
56 length = type_.size
57 if length is None:
58 raise NotImplementedError("Need size for type {}".format(type_))
59 return length
60
61 def get_label(self, name):
62 if self.current_routine:
63 static_label = self.routine_statics.get(self.current_routine.name, {}).get(name)
64 if static_label:
65 return static_label
66 return self.labels[name]
67
4468 # visitor methods
4569
4670 def compile_program(self, program):
4771 assert isinstance(program, Program)
4872
73 defn_labels = []
74
4975 for defn in program.defns:
50 # compute length of memory pointed to. this is awful.
51 length = None
52 type_ = defn.location.type
53 if type_ == TYPE_BYTE:
54 length = 1
55 elif type_ == TYPE_WORD or isinstance(type_, (PointerType, VectorType)):
56 length = 2
57 elif isinstance(type_, TableType):
58 length = type_.size * (1 if type_.of_type == TYPE_BYTE else 2)
59 elif isinstance(type_, BufferType):
60 length = type_.size
61 if length is None:
62 raise NotImplementedError("Need size for type {}".format(type_))
63 self.labels[defn.name] = Label(defn.name, addr=defn.addr, length=length)
76 length = self.compute_length_of_defn(defn)
77 label = Label(defn.name, addr=defn.addr, length=length)
78 self.labels[defn.name] = label
79 defn_labels.append((defn, label))
6480
6581 for routine in program.routines:
6682 self.routines[routine.name] = routine
6985 label.set_addr(routine.addr)
7086 self.labels[routine.name] = label
7187
88 if hasattr(routine, 'statics'):
89 static_labels = {}
90 for defn in routine.statics:
91 length = self.compute_length_of_defn(defn)
92 label = Label(defn.name, addr=defn.addr, length=length)
93 static_labels[defn.name] = label
94 defn_labels.append((defn, label))
95 self.routine_statics[routine.name] = static_labels
96
7297 self.compile_routine(self.routines['main'])
7398 for routine in program.routines:
7499 if routine.name != 'main':
76101
77102 for location, label in self.trampolines.iteritems():
78103 self.emitter.resolve_label(label)
79 self.emitter.emit(JMP(Indirect(self.labels[location.name])))
104 self.emitter.emit(JMP(Indirect(self.get_label(location.name))))
80105 self.emitter.emit(RTS())
81106
82107 # initialized data
83 for defn in program.defns:
108 for defn, label in defn_labels:
84109 if defn.initial is not None:
85 label = self.labels[defn.name]
86110 initial_data = None
87111 type_ = defn.location.type
88112 if type_ == TYPE_BYTE:
98122 self.emitter.emit(initial_data)
99123
100124 # uninitialized, "BSS" data
101 for defn in program.defns:
125 for defn, label in defn_labels:
102126 if defn.initial is None and defn.addr is None:
103 label = self.labels[defn.name]
104127 self.emitter.resolve_bss_label(label)
105128
106
107129 def compile_routine(self, routine):
130 self.current_routine = routine
108131 assert isinstance(routine, Routine)
109132 if routine.block:
110 self.emitter.resolve_label(self.labels[routine.name])
133 self.emitter.resolve_label(self.get_label(routine.name))
111134 self.compile_block(routine.block)
112135 self.emitter.emit(RTS())
136 self.current_routine = None
113137
114138 def compile_block(self, block):
115139 assert isinstance(block, Block)
130154 self.emitter.emit(TYA())
131155 elif isinstance(src, ConstantRef):
132156 self.emitter.emit(LDA(Immediate(Byte(src.value))))
133 elif instr.index == REG_X:
134 self.emitter.emit(LDA(AbsoluteX(self.labels[src.name])))
135 elif instr.index == REG_Y:
136 self.emitter.emit(LDA(AbsoluteY(self.labels[src.name])))
137 else:
138 self.emitter.emit(LDA(Absolute(self.labels[src.name])))
157 elif isinstance(src, IndexedRef) and src.index == REG_X:
158 self.emitter.emit(LDA(AbsoluteX(self.get_label(src.ref.name))))
159 elif isinstance(src, IndexedRef) and src.index == REG_Y:
160 self.emitter.emit(LDA(AbsoluteY(self.get_label(src.ref.name))))
161 elif isinstance(src, IndirectRef) and isinstance(src.ref.type, PointerType):
162 self.emitter.emit(LDA(IndirectY(self.get_label(src.ref.name))))
163 else:
164 self.emitter.emit(LDA(Absolute(self.get_label(src.name))))
139165 elif dest == REG_X:
140166 if src == REG_A:
141167 self.emitter.emit(TAX())
142168 elif isinstance(src, ConstantRef):
143169 self.emitter.emit(LDX(Immediate(Byte(src.value))))
144 elif instr.index == REG_Y:
145 self.emitter.emit(LDX(AbsoluteY(self.labels[src.name])))
146 else:
147 self.emitter.emit(LDX(Absolute(self.labels[src.name])))
170 elif isinstance(src, IndexedRef) and src.index == REG_Y:
171 self.emitter.emit(LDX(AbsoluteY(self.get_label(src.ref.name))))
172 else:
173 self.emitter.emit(LDX(Absolute(self.get_label(src.name))))
148174 elif dest == REG_Y:
149175 if src == REG_A:
150176 self.emitter.emit(TAY())
151177 elif isinstance(src, ConstantRef):
152178 self.emitter.emit(LDY(Immediate(Byte(src.value))))
153 elif instr.index == REG_X:
154 self.emitter.emit(LDY(AbsoluteX(self.labels[src.name])))
155 else:
156 self.emitter.emit(LDY(Absolute(self.labels[src.name])))
179 elif isinstance(src, IndexedRef) and src.index == REG_X:
180 self.emitter.emit(LDY(AbsoluteX(self.get_label(src.ref.name))))
181 else:
182 self.emitter.emit(LDY(Absolute(self.get_label(src.name))))
157183 else:
158184 raise UnsupportedOpcodeError(instr)
159185 elif opcode == 'st':
167193 REG_X: STX,
168194 REG_Y: STY
169195 }.get(src, None)
170 mode_cls = {
171 REG_X: AbsoluteX,
172 REG_Y: AbsoluteY,
173 None: Absolute
174 }.get(instr.index, None)
196
197 if isinstance(dest, IndexedRef):
198 mode_cls = {
199 REG_X: AbsoluteX,
200 REG_Y: AbsoluteY,
201 }[dest.index]
202 label = self.get_label(dest.ref.name)
203 elif isinstance(dest, IndirectRef) and isinstance(dest.ref.type, PointerType):
204 mode_cls = IndirectY
205 label = self.get_label(dest.ref.name)
206 else:
207 mode_cls = Absolute
208 label = self.get_label(dest.name)
209
175210 if op_cls is None or mode_cls is None:
176211 raise UnsupportedOpcodeError(instr)
177 self.emitter.emit(op_cls(mode_cls(self.labels[dest.name])))
212 self.emitter.emit(op_cls(mode_cls(label)))
178213 elif opcode == 'add':
179214 if dest == REG_A:
180215 if isinstance(src, ConstantRef):
181216 self.emitter.emit(ADC(Immediate(Byte(src.value))))
182217 else:
183 self.emitter.emit(ADC(Absolute(self.labels[src.name])))
218 self.emitter.emit(ADC(Absolute(self.get_label(src.name))))
184219 elif isinstance(dest, LocationRef) and src.type == TYPE_WORD and dest.type == TYPE_WORD:
185220 if isinstance(src, ConstantRef):
186 dest_label = self.labels[dest.name]
221 dest_label = self.get_label(dest.name)
187222 self.emitter.emit(LDA(Absolute(dest_label)))
188223 self.emitter.emit(ADC(Immediate(Byte(src.low_byte()))))
189224 self.emitter.emit(STA(Absolute(dest_label)))
191226 self.emitter.emit(ADC(Immediate(Byte(src.high_byte()))))
192227 self.emitter.emit(STA(Absolute(Offset(dest_label, 1))))
193228 elif isinstance(src, LocationRef):
194 src_label = self.labels[src.name]
195 dest_label = self.labels[dest.name]
229 src_label = self.get_label(src.name)
230 dest_label = self.get_label(dest.name)
196231 self.emitter.emit(LDA(Absolute(dest_label)))
197232 self.emitter.emit(ADC(Absolute(src_label)))
198233 self.emitter.emit(STA(Absolute(dest_label)))
203238 raise UnsupportedOpcodeError(instr)
204239 elif isinstance(dest, LocationRef) and src.type == TYPE_WORD and isinstance(dest.type, PointerType):
205240 if isinstance(src, ConstantRef):
206 dest_label = self.labels[dest.name]
241 dest_label = self.get_label(dest.name)
207242 self.emitter.emit(LDA(ZeroPage(dest_label)))
208243 self.emitter.emit(ADC(Immediate(Byte(src.low_byte()))))
209244 self.emitter.emit(STA(ZeroPage(dest_label)))
211246 self.emitter.emit(ADC(Immediate(Byte(src.high_byte()))))
212247 self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1))))
213248 elif isinstance(src, LocationRef):
214 src_label = self.labels[src.name]
215 dest_label = self.labels[dest.name]
249 src_label = self.get_label(src.name)
250 dest_label = self.get_label(dest.name)
216251 self.emitter.emit(LDA(ZeroPage(dest_label)))
217252 self.emitter.emit(ADC(Absolute(src_label)))
218253 self.emitter.emit(STA(ZeroPage(dest_label)))
228263 if isinstance(src, ConstantRef):
229264 self.emitter.emit(SBC(Immediate(Byte(src.value))))
230265 else:
231 self.emitter.emit(SBC(Absolute(self.labels[src.name])))
266 self.emitter.emit(SBC(Absolute(self.get_label(src.name))))
232267 elif isinstance(dest, LocationRef) and src.type == TYPE_WORD and dest.type == TYPE_WORD:
233268 if isinstance(src, ConstantRef):
234 dest_label = self.labels[dest.name]
269 dest_label = self.get_label(dest.name)
235270 self.emitter.emit(LDA(Absolute(dest_label)))
236271 self.emitter.emit(SBC(Immediate(Byte(src.low_byte()))))
237272 self.emitter.emit(STA(Absolute(dest_label)))
239274 self.emitter.emit(SBC(Immediate(Byte(src.high_byte()))))
240275 self.emitter.emit(STA(Absolute(Offset(dest_label, 1))))
241276 elif isinstance(src, LocationRef):
242 src_label = self.labels[src.name]
243 dest_label = self.labels[dest.name]
277 src_label = self.get_label(src.name)
278 dest_label = self.get_label(dest.name)
244279 self.emitter.emit(LDA(Absolute(dest_label)))
245280 self.emitter.emit(SBC(Absolute(src_label)))
246281 self.emitter.emit(STA(Absolute(dest_label)))
257292 elif dest == REG_Y:
258293 self.emitter.emit(INY())
259294 else:
260 self.emitter.emit(INC(Absolute(self.labels[dest.name])))
295 self.emitter.emit(INC(Absolute(self.get_label(dest.name))))
261296 elif opcode == 'dec':
262297 if dest == REG_X:
263298 self.emitter.emit(DEX())
264299 elif dest == REG_Y:
265300 self.emitter.emit(DEY())
266301 else:
267 self.emitter.emit(DEC(Absolute(self.labels[dest.name])))
302 self.emitter.emit(DEC(Absolute(self.get_label(dest.name))))
268303 elif opcode == 'cmp':
269304 cls = {
270305 'a': CMP,
276311 if isinstance(src, ConstantRef):
277312 self.emitter.emit(cls(Immediate(Byte(src.value))))
278313 else:
279 self.emitter.emit(cls(Absolute(self.labels[src.name])))
314 self.emitter.emit(cls(Absolute(self.get_label(src.name))))
280315 elif opcode in ('and', 'or', 'xor',):
281316 cls = {
282317 'and': AND,
287322 if isinstance(src, ConstantRef):
288323 self.emitter.emit(cls(Immediate(Byte(src.value))))
289324 else:
290 self.emitter.emit(cls(Absolute(self.labels[src.name])))
325 self.emitter.emit(cls(Absolute(self.get_label(src.name))))
291326 else:
292327 raise UnsupportedOpcodeError(instr)
293328 elif opcode in ('shl', 'shr'):
301336 raise UnsupportedOpcodeError(instr)
302337 elif opcode == 'call':
303338 location = instr.location
304 label = self.labels[instr.location.name]
339 label = self.get_label(instr.location.name)
305340 if isinstance(location.type, RoutineType):
306341 self.emitter.emit(JSR(Absolute(label)))
307342 elif isinstance(location.type, VectorType):
313348 raise NotImplementedError
314349 elif opcode == 'goto':
315350 location = instr.location
316 label = self.labels[instr.location.name]
351 label = self.get_label(instr.location.name)
317352 if isinstance(location.type, RoutineType):
318353 self.emitter.emit(JMP(Absolute(label)))
319354 elif isinstance(location.type, VectorType):
371406 if isinstance(src, (LocationRef, ConstantRef)) and isinstance(dest, IndirectRef):
372407 if src.type == TYPE_BYTE and isinstance(dest.ref.type, PointerType):
373408 if isinstance(src, ConstantRef):
374 dest_label = self.labels[dest.ref.name]
409 dest_label = self.get_label(dest.ref.name)
375410 self.emitter.emit(LDA(Immediate(Byte(src.value))))
376411 self.emitter.emit(STA(IndirectY(dest_label)))
377412 elif isinstance(src, LocationRef):
378 src_label = self.labels[src.name]
379 dest_label = self.labels[dest.ref.name]
413 src_label = self.get_label(src.name)
414 dest_label = self.get_label(dest.ref.name)
380415 self.emitter.emit(LDA(Absolute(src_label)))
381416 self.emitter.emit(STA(IndirectY(dest_label)))
382417 else:
384419 else:
385420 raise NotImplementedError((src, dest))
386421 elif isinstance(src, IndirectRef) and isinstance(dest, LocationRef):
387 if dest == REG_A and isinstance(src.ref.type, PointerType):
388 src_label = self.labels[src.ref.name]
389 self.emitter.emit(LDA(IndirectY(src_label)))
390 elif dest.type == TYPE_BYTE and isinstance(src.ref.type, PointerType):
391 src_label = self.labels[src.ref.name]
392 dest_label = self.labels[dest.name]
422 if dest.type == TYPE_BYTE and isinstance(src.ref.type, PointerType):
423 src_label = self.get_label(src.ref.name)
424 dest_label = self.get_label(dest.name)
393425 self.emitter.emit(LDA(IndirectY(src_label)))
394426 self.emitter.emit(STA(Absolute(dest_label)))
395427 else:
396428 raise NotImplementedError((src, dest))
397429 elif isinstance(src, AddressRef) and isinstance(dest, LocationRef) and \
398430 isinstance(src.ref.type, BufferType) and isinstance(dest.type, PointerType):
399 src_label = self.labels[src.ref.name]
400 dest_label = self.labels[dest.name]
431 src_label = self.get_label(src.ref.name)
432 dest_label = self.get_label(dest.name)
401433 self.emitter.emit(LDA(Immediate(HighAddressByte(src_label))))
402434 self.emitter.emit(STA(ZeroPage(dest_label)))
403435 self.emitter.emit(LDA(Immediate(LowAddressByte(src_label))))
404436 self.emitter.emit(STA(ZeroPage(Offset(dest_label, 1))))
405437 elif isinstance(src, LocationRef) and isinstance(dest, IndexedRef):
406438 if src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD):
407 src_label = self.labels[src.name]
408 dest_label = self.labels[dest.ref.name]
439 src_label = self.get_label(src.name)
440 dest_label = self.get_label(dest.ref.name)
409441 self.emitter.emit(LDA(Absolute(src_label)))
410442 self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label)))
411443 self.emitter.emit(LDA(Absolute(Offset(src_label, 1))))
412444 self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256))))
413445 elif isinstance(src.type, VectorType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType):
414446 # FIXME this is the exact same as above - can this be simplified?
415 src_label = self.labels[src.name]
416 dest_label = self.labels[dest.ref.name]
447 src_label = self.get_label(src.name)
448 dest_label = self.get_label(dest.ref.name)
417449 self.emitter.emit(LDA(Absolute(src_label)))
418450 self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label)))
419451 self.emitter.emit(LDA(Absolute(Offset(src_label, 1))))
420452 self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(Offset(dest_label, 256))))
421453 elif isinstance(src.type, RoutineType) and isinstance(dest.ref.type, TableType) and isinstance(dest.ref.type.of_type, VectorType):
422 src_label = self.labels[src.name]
423 dest_label = self.labels[dest.ref.name]
454 src_label = self.get_label(src.name)
455 dest_label = self.get_label(dest.ref.name)
424456 self.emitter.emit(LDA(Immediate(HighAddressByte(src_label))))
425457 self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label)))
426458 self.emitter.emit(LDA(Immediate(LowAddressByte(src_label))))
429461 raise NotImplementedError
430462 elif isinstance(src, ConstantRef) and isinstance(dest, IndexedRef):
431463 if src.type == TYPE_WORD and TableType.is_a_table_type(dest.ref.type, TYPE_WORD):
432 dest_label = self.labels[dest.ref.name]
464 dest_label = self.get_label(dest.ref.name)
433465 self.emitter.emit(LDA(Immediate(Byte(src.low_byte()))))
434466 self.emitter.emit(STA(self.addressing_mode_for_index(dest.index)(dest_label)))
435467 self.emitter.emit(LDA(Immediate(Byte(src.high_byte()))))
438470 raise NotImplementedError
439471 elif isinstance(src, IndexedRef) and isinstance(dest, LocationRef):
440472 if TableType.is_a_table_type(src.ref.type, TYPE_WORD) and dest.type == TYPE_WORD:
441 src_label = self.labels[src.ref.name]
442 dest_label = self.labels[dest.name]
473 src_label = self.get_label(src.ref.name)
474 dest_label = self.get_label(dest.name)
443475 self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label)))
444476 self.emitter.emit(STA(Absolute(dest_label)))
445477 self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256))))
446478 self.emitter.emit(STA(Absolute(Offset(dest_label, 1))))
447479 elif isinstance(dest.type, VectorType) and isinstance(src.ref.type, TableType) and isinstance(src.ref.type.of_type, VectorType):
448480 # FIXME this is the exact same as above - can this be simplified?
449 src_label = self.labels[src.ref.name]
450 dest_label = self.labels[dest.name]
481 src_label = self.get_label(src.ref.name)
482 dest_label = self.get_label(dest.name)
451483 self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(src_label)))
452484 self.emitter.emit(STA(Absolute(dest_label)))
453485 self.emitter.emit(LDA(self.addressing_mode_for_index(src.index)(Offset(src_label, 256))))
461493 if isinstance(src, ConstantRef):
462494 raise NotImplementedError
463495 else:
464 src_label = self.labels[src.name]
465 dest_label = self.labels[dest.name]
496 src_label = self.get_label(src.name)
497 dest_label = self.get_label(dest.name)
466498 self.emitter.emit(LDA(Absolute(src_label)))
467499 self.emitter.emit(STA(Absolute(dest_label)))
468500 elif src.type == TYPE_WORD and dest.type == TYPE_WORD:
469501 if isinstance(src, ConstantRef):
470 dest_label = self.labels[dest.name]
502 dest_label = self.get_label(dest.name)
471503 self.emitter.emit(LDA(Immediate(Byte(src.low_byte()))))
472504 self.emitter.emit(STA(Absolute(dest_label)))
473505 self.emitter.emit(LDA(Immediate(Byte(src.high_byte()))))
474506 self.emitter.emit(STA(Absolute(Offset(dest_label, 1))))
475507 else:
476 src_label = self.labels[src.name]
477 dest_label = self.labels[dest.name]
508 src_label = self.get_label(src.name)
509 dest_label = self.get_label(dest.name)
478510 self.emitter.emit(LDA(Absolute(src_label)))
479511 self.emitter.emit(STA(Absolute(dest_label)))
480512 self.emitter.emit(LDA(Absolute(Offset(src_label, 1))))
481513 self.emitter.emit(STA(Absolute(Offset(dest_label, 1))))
482514 elif isinstance(src.type, VectorType) and isinstance(dest.type, VectorType):
483 src_label = self.labels[src.name]
484 dest_label = self.labels[dest.name]
515 src_label = self.get_label(src.name)
516 dest_label = self.get_label(dest.name)
485517 self.emitter.emit(LDA(Absolute(src_label)))
486518 self.emitter.emit(STA(Absolute(dest_label)))
487519 self.emitter.emit(LDA(Absolute(Offset(src_label, 1))))
488520 self.emitter.emit(STA(Absolute(Offset(dest_label, 1))))
489521 elif isinstance(src.type, RoutineType) and isinstance(dest.type, VectorType):
490 src_label = self.labels[src.name]
491 dest_label = self.labels[dest.name]
522 src_label = self.get_label(src.name)
523 dest_label = self.get_label(dest.name)
492524 self.emitter.emit(LDA(Immediate(HighAddressByte(src_label))))
493525 self.emitter.emit(STA(Absolute(dest_label)))
494526 self.emitter.emit(LDA(Immediate(LowAddressByte(src_label))))
144144 # just to be sure.
145145 equal = isinstance(other, self.__class__) and other.name == self.name
146146 if equal:
147 assert other.type == self.type
147 assert other.type == self.type, repr((self, other))
148148 return equal
149149
150150 def __hash__(self):
1313 self.ast_node = ast_node
1414 self.model = model
1515
16 def __repr__(self):
17 return "%s(%r, %r)" % (self.__class__.__name__, self.ast_node, self.model)
18
1619
1720 class Parser(object):
1821 def __init__(self, text):
1922 self.scanner = Scanner(text)
20 self.symbols = {} # token -> SymEntry
21 self.typedefs = {} # token -> Type AST
23 self.symbols = {} # token -> SymEntry
24 self.current_statics = {} # token -> SymEntry
25 self.typedefs = {} # token -> Type AST
2226 for token in ('a', 'x', 'y'):
2327 self.symbols[token] = SymEntry(None, LocationRef(TYPE_BYTE, token))
2428 for token in ('c', 'z', 'n', 'v'):
2529 self.symbols[token] = SymEntry(None, LocationRef(TYPE_BIT, token))
2630 self.backpatch_instrs = []
2731
32 def soft_lookup(self, name):
33 if name in self.current_statics:
34 return self.current_statics[name].model
35 if name in self.symbols:
36 return self.symbols[name].model
37 return None
38
2839 def lookup(self, name):
29 if name not in self.symbols:
40 model = self.soft_lookup(name)
41 if model is None:
3042 raise SyntaxError('Undefined symbol "%s"' % name)
31 return self.symbols[name].model
43 return model
3244
3345 # --- grammar productions
3446
129141 return size
130142
131143 def defn_type(self):
144 type_ = self.defn_type_term()
145
146 if self.scanner.consume('table'):
147 size = self.defn_size()
148 type_ = TableType(type_, size)
149
150 return type_
151
152 def defn_type_term(self):
132153 type_ = None
133154
134155 if self.scanner.consume('('):
141162 elif self.scanner.consume('word'):
142163 type_ = TYPE_WORD
143164 elif self.scanner.consume('vector'):
144 type_ = self.defn_type()
165 type_ = self.defn_type_term()
145166 if not isinstance(type_, RoutineType):
146167 raise SyntaxError("Vectors can only be of a routine, not %r" % type_)
147168 type_ = VectorType(type_)
160181 raise SyntaxError("Undefined type '%s'" % type_name)
161182 type_ = self.typedefs[type_name]
162183
163 if self.scanner.consume('table'):
164 size = self.defn_size()
165 type_ = TableType(type_, size)
166
167184 return type_
168185
169186 def defn_name(self):
208225 type_ = self.defn_type()
209226 if not isinstance(type_, RoutineType):
210227 raise SyntaxError("Can only define a routine, not %r" % type_)
228 statics = []
211229 if self.scanner.consume('@'):
212230 self.scanner.check_type('integer literal')
213231 block = None
214232 addr = int(self.scanner.token)
215233 self.scanner.scan()
216234 else:
235 statics = self.statics()
236
237 self.current_statics = self.compose_statics_dict(statics)
217238 block = self.block()
239 self.current_statics = {}
240
218241 addr = None
219242 location = LocationRef(type_, name)
220243 return Routine(
221244 name=name, block=block, addr=addr,
222 location=location
245 location=location, statics=statics
223246 )
247
248 def compose_statics_dict(self, statics):
249 c = {}
250 for defn in statics:
251 name = defn.name
252 if name in self.symbols or name in self.current_statics:
253 raise SyntaxError('Symbol "%s" already declared' % name)
254 c[name] = SymEntry(defn, defn.location)
255 return c
224256
225257 def labels(self):
226258 accum = []
243275 accum.append(self.locexpr())
244276 return accum
245277
246 def locexpr(self):
278 def locexpr(self, forward=False):
247279 if self.scanner.token in ('on', 'off'):
248280 loc = ConstantRef(TYPE_BIT, 1 if self.scanner.token == 'on' else 0)
249281 self.scanner.scan()
258290 loc = ConstantRef(TYPE_WORD, int(self.scanner.token))
259291 self.scanner.scan()
260292 return loc
293 elif forward:
294 name = self.scanner.token
295 self.scanner.scan()
296 loc = self.soft_lookup(name)
297 if loc is not None:
298 return loc
299 else:
300 return name
261301 else:
262302 loc = self.lookup(self.scanner.token)
263303 self.scanner.scan()
264304 return loc
265305
266 def indlocexpr(self):
267 if self.scanner.consume('forward'):
268 return self.label()
269 elif self.scanner.consume('['):
306 def indlocexpr(self, forward=False):
307 if self.scanner.consume('['):
270308 loc = self.locexpr()
271309 self.scanner.expect(']')
272310 self.scanner.expect('+')
276314 loc = self.locexpr()
277315 return AddressRef(loc)
278316 else:
279 loc = self.locexpr()
280 index = None
281 if self.scanner.consume('+'):
282 index = self.locexpr()
283 loc = IndexedRef(loc, index)
317 loc = self.locexpr(forward=forward)
318 if not isinstance(loc, basestring):
319 index = None
320 if self.scanner.consume('+'):
321 index = self.locexpr()
322 loc = IndexedRef(loc, index)
284323 return loc
324
325 def statics(self):
326 defns = []
327 while self.scanner.consume('static'):
328 defn = self.defn()
329 if defn.initial is None:
330 raise SyntaxError("Static definition {} must have initial value".format(defn))
331 defns.append(defn)
332 return defns
285333
286334 def block(self):
287335 instrs = []
315363 self.scanner.expect('forever')
316364 return Instr(opcode='repeat', dest=None, src=src,
317365 block=block, inverted=inverted)
318 elif self.scanner.token in ("ld", "add", "sub", "cmp", "and", "or", "xor"):
366 elif self.scanner.token in ("ld",):
367 # the same as add, sub, cmp etc below, except supports an indlocexpr for the src
368 opcode = self.scanner.token
369 self.scanner.scan()
370 dest = self.locexpr()
371 self.scanner.expect(',')
372 src = self.indlocexpr()
373 return Instr(opcode=opcode, dest=dest, src=src, index=None)
374 elif self.scanner.token in ("add", "sub", "cmp", "and", "or", "xor"):
319375 opcode = self.scanner.token
320376 self.scanner.scan()
321377 dest = self.locexpr()
330386 self.scanner.scan()
331387 src = self.locexpr()
332388 self.scanner.expect(',')
333 dest = self.locexpr()
334 index = None
335 if self.scanner.consume('+'):
336 index = self.locexpr()
337 return Instr(opcode=opcode, dest=dest, src=src, index=index)
389 dest = self.indlocexpr()
390 return Instr(opcode=opcode, dest=dest, src=src, index=None)
338391 elif self.scanner.token in ("shl", "shr", "inc", "dec"):
339392 opcode = self.scanner.token
340393 self.scanner.scan()
351404 elif self.scanner.token in ("copy",):
352405 opcode = self.scanner.token
353406 self.scanner.scan()
354 src = self.indlocexpr()
407 src = self.indlocexpr(forward=True)
355408 self.scanner.expect(',')
356409 dest = self.indlocexpr()
357410 instr = Instr(opcode=opcode, dest=dest, src=src)
8787 | ld x, 0
8888 | }
8989 = ok
90
91 This is true regardless of whether it's an input or not.
92
93 | routine main
94 | inputs x
95 | {
96 | ld x, 0
97 | }
98 ? ForbiddenWriteError: x in main
99
100 | routine main
101 | inputs x
102 | outputs x, z, n
103 | {
104 | ld x, 0
105 | }
106 = ok
107
108 | routine main
109 | inputs x
110 | trashes x, z, n
111 | {
112 | ld x, 0
113 | }
114 = ok
115
116 If a routine trashes a location, this must be declared.
117
118 | routine foo
119 | trashes x
120 | {
121 | trash x
122 | }
123 = ok
124
125 | routine foo
126 | {
127 | trash x
128 | }
129 ? ForbiddenWriteError: x in foo
130
131 | routine foo
132 | outputs x
133 | {
134 | trash x
135 | }
136 ? UnmeaningfulOutputError: x in foo
137
138 If a routine causes a location to be trashed, this must be declared in the caller.
139
140 | routine trash_x
141 | trashes x, z, n
142 | {
143 | ld x, 0
144 | }
145 |
146 | routine foo
147 | trashes x, z, n
148 | {
149 | call trash_x
150 | }
151 = ok
152
153 | routine trash_x
154 | trashes x, z, n
155 | {
156 | ld x, 0
157 | }
158 |
159 | routine foo
160 | trashes z, n
161 | {
162 | call trash_x
163 | }
164 ? ForbiddenWriteError: x in foo
165
166 | routine trash_x
167 | trashes x, z, n
168 | {
169 | ld x, 0
170 | }
171 |
172 | routine foo
173 | outputs x
174 | trashes z, n
175 | {
176 | call trash_x
177 | }
178 ? UnmeaningfulOutputError: x in foo
90179
91180 If a routine reads or writes a user-define memory location, it needs to declare that too.
92181
12321321 be initialized in the other.
12331322
12341323 | routine foo
1235 | inputs x
12361324 | outputs x
12371325 | trashes a, z, n, c
12381326 | {
1327 | ld x, 0
12391328 | ld a, 0
12401329 | cmp a, 42
12411330 | if z {
12461335 | }
12471336 = ok
12481337
1338 | routine foo
1339 | inputs x
1340 | outputs x
1341 | trashes a, z, n, c
1342 | {
1343 | ld a, 0
1344 | cmp a, 42
1345 | if z {
1346 | ld x, 7
1347 | } else {
1348 | ld a, 23
1349 | }
1350 | }
1351 = ok
1352
12491353 An `if` with a single block is analyzed as if it had an empty `else` block.
12501354
12511355 | routine foo
12851389 | }
12861390 | }
12871391 = ok
1392
1393 The cardinal rule for trashes in an `if` is the "union rule": if one branch
1394 trashes {`a`} and the other branch trashes {`b`} then the whole `if` statement
1395 trashes {`a`, `b`}.
1396
1397 | routine foo
1398 | inputs a, x, z
1399 | trashes a, x
1400 | {
1401 | if z {
1402 | trash a
1403 | } else {
1404 | trash x
1405 | }
1406 | }
1407 = ok
1408
1409 | routine foo
1410 | inputs a, x, z
1411 | trashes a
1412 | {
1413 | if z {
1414 | trash a
1415 | } else {
1416 | trash x
1417 | }
1418 | }
1419 ? ForbiddenWriteError: x in foo
1420
1421 | routine foo
1422 | inputs a, x, z
1423 | trashes x
1424 | {
1425 | if z {
1426 | trash a
1427 | } else {
1428 | trash x
1429 | }
1430 | }
1431 ? ForbiddenWriteError: a in foo
12881432
12891433 ### repeat ###
12901434
14611605 | }
14621606 = ok
14631607
1608 The understanding is that, because `copy` trashes `a`, `a` cannot be used
1609 as the destination of a `copy`.
1610
1611 | byte source : 0
1612 | byte dest
1613 |
1614 | routine main
1615 | inputs source
1616 | outputs dest
1617 | trashes a, z, n
1618 | {
1619 | copy source, a
1620 | }
1621 ? ForbiddenWriteError
1622
14641623 Can `copy` from a `word` to a `word`.
14651624
14661625 | word source : 0
15031662 | }
15041663 ? TypeMismatchError
15051664
1506 ### copy[] ###
1507
1508 Buffers and pointers.
1665 ### Buffers and pointers ###
15091666
15101667 Note that `^buf` is a constant value, so it by itself does not require `buf` to be
15111668 listed in any input/output sets.
15841741 | ld y, 0
15851742 | copy ^buf, ptr
15861743 | copy [ptr] + y, foo
1744 | }
1745 = ok
1746
1747 Read through a pointer to the `a` register. Note that this is done with `ld`,
1748 not `copy`.
1749
1750 | buffer[2048] buf
1751 | pointer ptr
1752 | byte foo
1753 |
1754 | routine main
1755 | inputs buf
1756 | outputs a
1757 | trashes y, z, n, ptr
1758 | {
1759 | ld y, 0
1760 | copy ^buf, ptr
1761 | ld a, [ptr] + y
1762 | }
1763 = ok
1764
1765 Write the `a` register through a pointer. Note that this is done with `st`,
1766 not `copy`.
1767
1768 | buffer[2048] buf
1769 | pointer ptr
1770 | byte foo
1771 |
1772 | routine main
1773 | inputs buf
1774 | outputs buf
1775 | trashes a, y, z, n, ptr
1776 | {
1777 | ld y, 0
1778 | copy ^buf, ptr
1779 | ld a, 255
1780 | st a, [ptr] + y
15871781 | }
15881782 = ok
15891783
20882282 | copy foo, vec
20892283 | }
20902284 = ok
2285
2286 ### static ###
2287
2288 When memory locations are defined static to a routine, they cannot be
2289 directly input, nor directly output; and since they are always initialized,
2290 they cannot be trashed. Thus, they really don't participate in the analysis.
2291
2292 | define foo routine
2293 | inputs x
2294 | outputs x
2295 | trashes z, n
2296 | static byte t : 0
2297 | {
2298 | st x, t
2299 | inc t
2300 | ld x, t
2301 | }
2302 |
2303 | define main routine
2304 | trashes a, x, z, n
2305 | static byte t : 0
2306 | {
2307 | ld x, t
2308 | call foo
2309 | }
2310 = ok
528528 = $081A INX
529529 = $081B RTS
530530
531 Copy routine (by forward reference) to vector.
532
533 | vector routine
534 | inputs x
535 | outputs x
536 | trashes z, n
537 | bar
538 |
539 | routine main
540 | outputs bar
541 | trashes a, n, z
542 | {
543 | copy foo, bar
544 | }
545 |
546 | routine foo
547 | inputs x
548 | outputs x
549 | trashes z, n
550 | {
551 | inc x
552 | }
553 = $080D LDA #$18
554 = $080F STA $081A
555 = $0812 LDA #$08
556 = $0814 STA $081B
557 = $0817 RTS
558 = $0818 INX
559 = $0819 RTS
560
531561 Copy word to word table and back, with both `x` and `y` as indexes.
532562
533563 | word one
633663 | outputs x
634664 | trashes a, z, n
635665 | one
636 | vector (routine
666 | vector routine
637667 | outputs x
638 | trashes a, z, n)
668 | trashes a, z, n
639669 | table[256] many
640670 |
641671 | routine bar outputs x trashes a, z, n {
845875 | ld y, 0
846876 | copy ^buf, ptr
847877 | copy [ptr] + y, foo
848 | copy [ptr] + y, a
878 | ld a, [ptr] + y
849879 | }
850880 = $080D LDY #$00
851881 = $080F LDA #$1F
856886 = $0819 STA $101F
857887 = $081C LDA ($FE),Y
858888 = $081E RTS
889
890 Write the `a` register through a pointer.
891
892 | buffer[2048] buf
893 | pointer ptr @ 254
894 | byte foo
895 |
896 | routine main
897 | inputs buf
898 | outputs buf
899 | trashes a, y, z, n, ptr
900 | {
901 | ld y, 0
902 | copy ^buf, ptr
903 | ld a, 255
904 | st a, [ptr] + y
905 | }
906 = $080D LDY #$00
907 = $080F LDA #$1C
908 = $0811 STA $FE
909 = $0813 LDA #$08
910 = $0815 STA $FF
911 = $0817 LDA #$FF
912 = $0819 STA ($FE),Y
913 = $081B RTS
859914
860915 Add a word memory location, and a literal word, to a pointer, and then read through it.
861916 Note that this is *not* range-checked. (Yet.)
920975 = $080D TAX
921976 = $080E LDA #$00
922977 = $0810 RTS
978
979 ### static ###
980
981 Memory locations defined static to a routine are allocated
982 just the same as initialized global storage locations are.
983
984 | define foo routine
985 | inputs x
986 | outputs x
987 | trashes z, n
988 | static byte t : 255
989 | {
990 | st x, t
991 | inc t
992 | ld x, t
993 | }
994 |
995 | define main routine
996 | trashes a, x, z, n
997 | static byte t : 7
998 | {
999 | ld x, t
1000 | call foo
1001 | }
1002 = $080D LDX $081F
1003 = $0810 JSR $0814
1004 = $0813 RTS
1005 = $0814 STX $081E
1006 = $0817 INC $081E
1007 = $081A LDX $081E
1008 = $081D RTS
1009 = $081E .byte $FF
1010 = $081F .byte $07
407407 | }
408408 = ok
409409
410 A routine can be copied into a vector before the routine appears in the program,
411 *however*, it must be marked as such with the keyword `forward`.
410 A routine can be copied into a vector before the routine appears in the program.
411 This is known as a "forward reference". You are only allowed to make forward
412 references in the source of a `copy` instruction.
412413
413414 | vector routine
414415 | inputs cinv, a
424425 | routine foo {
425426 | ld a, 0
426427 | }
427 ? SyntaxError: Undefined symbol
428
429 | vector routine
430 | inputs cinv, a
431 | outputs cinv, x
432 | trashes a, x, z, n
433 | cinv @ 788
434 | routine main {
435 | with interrupts off {
436 | copy forward foo, cinv
437 | }
438 | call cinv
439 | }
440 | routine foo {
441 | ld a, 0
442 | }
443428 = ok
444429
445430 goto.
528513 | ld a, 0
529514 | }
530515 ? SyntaxError
516
517 Memory locations can be defined static to a routine.
518
519 | define foo routine
520 | inputs x
521 | outputs x
522 | trashes z, n
523 | static byte t : 0
524 | {
525 | st x, t
526 | inc t
527 | ld x, t
528 | }
529 |
530 | define main routine
531 | trashes a, x, z, n
532 | static byte t : 0
533 | {
534 | ld x, t
535 | call foo
536 | }
537 = ok
538
539 Static memory locations must always be given an initial value.
540
541 | define main routine
542 | inputs x
543 | outputs x
544 | trashes z, n
545 | static byte t
546 | {
547 | st x, t
548 | inc t
549 | ld x, t
550 | }
551 ? SyntaxError
552
553 Name of a static cannot shadow an existing global or static.
554
555 | byte t
556 |
557 | define main routine
558 | inputs x
559 | outputs x
560 | trashes z, n
561 | static byte t
562 | {
563 | st x, t
564 | inc t
565 | ld x, t
566 | }
567 ? SyntaxError
568
569 | define main routine
570 | inputs x
571 | outputs x
572 | trashes z, n
573 | static byte t
574 | static byte t
575 | {
576 | st x, t
577 | inc t
578 | ld x, t
579 | }
580 ? SyntaxError