Flesh out and describe the fallthru optimization algorithm.
Chris Pressey
4 years ago

77 | 77 | Not because it saves 3 bytes, but because it's a neat trick. Doing it optimally |

78 | 78 | is probably NP-complete. But doing it adequately is probably not that hard. |

79 | 79 | |

80 | > Every routine is fallen through to by zero or more routines. | |

81 | > Don't consider the main routine. | |

82 | > For each routine α that is finally-fallen through to by a set of routines R(α), | |

83 | > pick a movable routine β from R, move β in front of α, remove the `jmp` at the end of β and | |

84 | > mark β as unmovable. | |

85 | > Note this only works if β finally-falls through. If there are multiple tail | |

86 | > positions, we can't eliminate all the `jmp`s. | |

87 | > Note that if β finally-falls through to α it can't finally-fall through to anything | |

88 | > else, so the sets R(α) should be disjoint for every α. (Right?) | |

89 | ||

90 | 80 | ### And at some point... |

91 | 81 | |

92 | 82 | * `low` and `high` address operators - to turn `word` type into `byte`. |

3 | 3 | This is a test suite, written in [Falderal][] format, for SixtyPical's |

4 | 4 | ability to detect which routines make tail calls to other routines, |

5 | 5 | and thus can be re-arranged to simply "fall through" to them. |

6 | ||

7 | The theory is as follows. | |

8 | ||

9 | SixtyPical supports a `goto`, but it can only appear in tail position. | |

10 | If a routine r1 ends with a unique `goto` to a fixed routine r2 it is said | |

11 | to *potentially fall through* to r2. | |

12 | ||

13 | A *unique* `goto` means that there are not multiple different `goto`s in | |

14 | tail position (which can happen if, for example, an `if` is the last thing | |

15 | in a routine, and each branch of that `if` ends with a different `goto`.) | |

16 | ||

17 | A *fixed* routine means a routine which is known at compile time, as | |

18 | opposed to the target of a `goto` through a vector. | |

19 | ||

20 | Consider the set R of all routines in the program. | |

21 | ||

22 | Every routine r1 ∈ R either potentially falls through to a single routine | |

23 | r2 ∈ R (r2 ≠ r1) or it does not potentially fall through to any routine. | |

24 | We can say out(r1) = {r2} or out(r1) = ∅. | |

25 | ||

26 | Every routine r ∈ R also has a set of zero or more | |

27 | routines which potentially fall through to it. Call this | |

28 | in(r). It is the case that out(r1) = {r2} → r1 ∈ in(r2). | |

29 | ||

30 | We can trace out the connections by following the in- or out- sets of | |

31 | a given routine. Because each routine potentially falls through to only | |

32 | a single routine, the structures we find will be tree-like, not DAG-like. | |

33 | ||

34 | But they do permit cycles. | |

35 | ||

36 | So, we first break those cycles. We will be left with out() sets which | |

37 | are disjoint trees, i.e. if r1 ∈ in(r2), then r1 ∉ in(r3) for all r3 ≠ r2. | |
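
The cycle-breaking step might be sketched in Python as follows. This is a minimal sketch, not SixtyPical's actual implementation: it assumes the out() sets are held in a dict mapping each routine name to the single routine it potentially falls through to (a routine with out(r) = ∅ is simply absent), and the name `break_cycles` is hypothetical. Which edge of a cycle gets dropped is an arbitrary choice here; the text above does not specify one.

```python
def break_cycles(out):
    """Return a copy of `out` with one edge dropped from every cycle.

    `out` maps a routine name to the one routine it potentially falls
    through to. Hypothetical helper; the edge chosen for removal is an
    assumption made for this sketch.
    """
    out = dict(out)                      # leave the caller's dict untouched
    for start in sorted(out):            # sorted only to make results repeatable
        path = [start]
        while path[-1] in out:
            nxt = out[path[-1]]
            if nxt in path:
                # Following this edge would revisit a routine already on the
                # path, so it closes a cycle: drop it.
                del out[path[-1]]
                break
            path.append(nxt)
    return out


if __name__ == "__main__":
    # a -> b -> c -> a is a cycle; d falls through into it.
    print(break_cycles({"a": "b", "b": "c", "c": "a", "d": "a"}))
```

A routine whose edge is dropped simply keeps its `goto`; it just stops being a candidate for fall-through.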

38 | ||

39 | We then follow an algorithm something like this. Treat R as a mutable | |

40 | set and start with an empty list L. Then, | |

41 | ||

42 | - Pick a routine r from R where out(r) = ∅. | |

43 | - Find the longest chain of routines r1,r2,...rn in R where out(r1) = {r2}, | |

44 | out(r2) = {r3}, ... out(rn-1) = {rn}, and rn = r. | |

45 | - Remove (r1,r2,...,rn) from R and append them to L in that order. | |

46 | Mark (r1,r2,...rn-1) as "will have their final `goto` removed." | |

47 | - Repeat until R is empty. | |

48 | ||

49 | When the time comes to generate code, generate it in the order given by L. | |
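
The ordering steps above could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the project's code: `order_routines` is a hypothetical name, `out` is a dict from routine name to its single fall-through target with cycles already broken, and a routine whose target has already been moved to L is treated as a chain end (the text only says out(r) = ∅, but without this assumption R would never become empty when two routines share a target).

```python
def order_routines(routines, out):
    """Return (L, drop_goto) for an acyclic `out` mapping.

    L is the emission order; drop_goto is the set of routines whose final
    `goto` can be removed. All names here are hypothetical.
    """
    remaining = set(routines)                     # the mutable set R
    in_sets = {r: set() for r in remaining}       # in(r) derived from out()
    for r1, r2 in out.items():
        in_sets[r2].add(r1)

    def longest_chain_to(r):
        # Longest chain r1, ..., rn = r using only routines still in R.
        best = [r]
        for p in sorted(in_sets[r]):
            if p in remaining:
                candidate = longest_chain_to(p) + [r]
                if len(candidate) > len(best):
                    best = candidate
        return best

    ordered = []                                  # the list L
    drop_goto = set()
    while remaining:
        # Chain ends: no target, or (assumption) the target was already placed.
        ends = sorted(r for r in remaining if out.get(r) not in remaining)
        chain = longest_chain_to(ends[0])
        ordered.extend(chain)
        drop_goto.update(chain[:-1])              # all but the last fall through
        remaining.difference_update(chain)
    return ordered, drop_goto


if __name__ == "__main__":
    order, drop = order_routines(
        ["a", "b", "c", "d", "e"],
        {"a": "b", "b": "c", "d": "c"},
    )
    print(order, sorted(drop))
```

In this example both `b` and `d` potentially fall through to `c`, but only one of them can be placed directly in front of it; `d` ends up in a chain of its own and keeps its `goto`.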

6 | 50 | |

7 | 51 | [Falderal]: http://catseye.tc/node/Falderal |

8 | 52 |