git @ Cat's Eye Technologies MARYSUE / 1.0-2019.0222
Import "Overview of a Story Compiler" into this repository. Chris Pressey 2 years ago
2 changed file(s) with 200 addition(s) and 0 deletion(s).
22 22 This code is in the public domain; see the file [UNLICENSE](UNLICENSE)
23 23 in this directory.
24 24
25 Also in this repository is a short article describing this generator, written
26 during NaNoGenMo 2015 (while the generator was still under construction):
27 [Overview of a Story Compiler][].
28
25 29 [NaNoGenMo 2015]: https://github.com/dariusk/NaNoGenMo-2015/
26 30 [A Time for Destiny]: http://catseye.tc/modules/MARYSUE/generated/A_Time_for_Destiny.html
31 [Overview of a Story Compiler]: doc/Overview%20of%20a%20Story%20Compiler.md
0 Overview of a "Story Compiler"
1 ------------------------------
2
3 _Chris Pressey, Nov 12 2015_
4
5 This is an extremely simplified description of the story generator I'm working
6 on. It glosses over most of the details, but hopefully provides an overview
7 of the architecture. I've tried to write it for a general intermediate-programmer
8 audience; no knowledge of compiler construction is assumed.
9
10 ### First, about the name ###
11
12 A compiler is a program which, typically, takes
13 program source code (almost always in a text file) as input, and produces
14 a file in some format that the computer can more easily execute. Well-known
15 examples are the Java Compiler (`javac`) and the GNU C Compiler (`gcc`), but
16 in fact lots of programming language interpreters contain a compiler in them
17 somewhere. (JavaScript, for example, is compiled "just in time" in a web
18 browser before it's run.)
19
20 However, I've been calling this generator a "story compiler" _not_ because it works
21 like a typical compiler, but because a lot of its internal parts look a
22 lot like the internal parts of a typical compiler. It works _sort of_ like a
23 compiler, but the analogy is sometimes strained.
24
25 So, when reading this, you can just forget about the "compiler" angle if you like.
26 It will probably be easier to follow if you just think of it as a story generator.
27
28 ### How the story is represented ###
29
30 The story, at any given point, is represented by a _tree_ where each tree
31 node can have any number of children.
32
33 Unfortunately, trees are difficult to draw in ASCII. And recursive data
34 structures with references might not be everyone's cup of tea.
35
36 Luckily, there's a way to think about them and write them out that is usually
37 simpler: a tree is basically _a list that can contain other lists inside it_
38 (and those sublists can contain lists inside them, and so forth on down).
39
40 So, this is a tree:
41
    [a, [b, c], d, [[e, f], g]]
43
44 At every point in the program, the story is represented by something that
45 looks like that.
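
As a rough illustration, the list-of-lists view maps directly onto nested lists in Python. This sketch is not taken from the generator: strings stand in for tree nodes, and the helper names `count_leaves` and `depth` are hypothetical.

```python
# A tree written as a list that can contain other lists.
# Strings stand in for leaf nodes (an assumption for illustration).
tree = ["a", ["b", "c"], "d", [["e", "f"], "g"]]


def count_leaves(node):
    """Count the non-list items anywhere inside a nested list."""
    if not isinstance(node, list):
        return 1  # a leaf counts as one
    return sum(count_leaves(child) for child in node)


def depth(node):
    """Return how deeply the lists are nested (a leaf has depth 0)."""
    if not isinstance(node, list):
        return 0
    return 1 + max((depth(child) for child in node), default=0)


print(count_leaves(tree))  # 7
print(depth(tree))         # 3
```

Both helpers walk the tree the same way: handle a leaf directly, otherwise combine the results for each child.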
46
47 ### What we do with these trees ###
48
49 Inside the generator, we have a bunch of functions. Each of them takes a
50 tree as input, and returns a slightly different tree.
51
52 The functions are often called back-to-back, one after the other, like
53
    tree = transform_tree_in_some_way(tree)
    tree = transform_tree_in_some_other_way(tree)
    tree = apply_yet_another_transformation(tree)
57
58 and we call this pattern a _pipeline_. We call each function call in it
59 a _stage_.
60
61 This generator is basically one long pipeline. Currently, it has about
62 a dozen stages.
63
64 (Note that, each time we call a function, we get a _new_ tree. We don't
65 change the old tree, and in fact we do what we can to prevent it from being
66 changed — it's an _immutable_ data structure. This is often less efficient
67 than changing a tree directly, but it is also often easier to reason about.)
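
To make the pipeline idea concrete, here is a minimal sketch in Python. It assumes trees are nested tuples (which Python will not let us modify in place), and both stages, `drop_placeholders` and `uppercase_leaves`, are made-up stand-ins rather than stages from the actual generator.

```python
from functools import reduce


def drop_placeholders(tree):
    """Hypothetical stage: return a copy of the tree without "*" leaves."""
    return tuple(
        drop_placeholders(child) if isinstance(child, tuple) else child
        for child in tree
        if child != "*"
    )


def uppercase_leaves(tree):
    """Hypothetical stage: return a copy with every leaf upper-cased."""
    return tuple(
        uppercase_leaves(child) if isinstance(child, tuple) else child.upper()
        for child in tree
    )


def run_pipeline(tree, stages):
    """Feed the tree through each stage in order, one call after another."""
    return reduce(lambda current, stage: stage(current), stages, tree)


original = ("intro", ("*", "theft"), "ending")
result = run_pipeline(original, [drop_placeholders, uppercase_leaves])

print(result)    # ('INTRO', ('THEFT',), 'ENDING')
print(original)  # ('intro', ('*', 'theft'), 'ending')
```

The second `print` shows that the input tree is untouched: every stage builds and returns a new tuple.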
68
69 This isn't the entire picture. There is also a "database" of things —
70 characters, items, settings, and the like — that exists alongside the
71 pipeline. Parts of the tree can refer to objects in this database.
72 But the pipeline is where most of the activity happens.
73
74 ### Where do we begin? ###
75
76 No story is generated purely out of thin air. You have to start with
77 _something_. Because this generator works on trees, naturally, it starts
78 with a tree.
79
80 In principle, the "story compiler" could read this initial tree from a
81 text file (written in e.g. JSON or YAML), and it would probably be more
82 deserving of the name "compiler" if it did. But, I only have a month,
83 so for expediency, the initial tree is hard-coded in the generator.
84
85 Early in my discussion thread, I mentioned the "null story":
86
87 > Once upon a time, they lived happily ever after.
88
89 The compiler starts with a tree representation which basically matches that.
90 It looks something like this:
91
    [IntroduceCharacters, *, CharactersConvalesce]
93
94 Think of the `*` as a placeholder for the parts of the story that aren't written yet.
95
96 ### How this is turned into a story ###
97
98 One of the first stages of the pipeline is the "plot complicator", which takes
99 this initial tree and creates a new tree where every `*` is replaced by some subplot
100 that it picks out of a hat (more or less). For example, after complication, the new tree might be
101
    [IntroduceCharacters, [JewelsStolen, *, JewelsRecovered, *], CharactersConvalesce]
103
104 If we want a fairly involved story, we don't have to run this stage just once;
105 we can run it many times. And if all the subplots themselves contain `*`'s,
106 this process can continue for as long as we like. Currently, it's run about
107 five times.
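
A minimal sketch of such a complicator stage, in Python, might look like the following. The function name `complicate` and the `SUBPLOTS` table are assumptions for illustration; only the jewel subplot comes from the example above, and the kidnapping one is invented to give the "hat" more than one entry.

```python
import random

# Hypothetical subplot templates; only the jewel subplot appears in the text.
SUBPLOTS = [
    ["JewelsStolen", "*", "JewelsRecovered", "*"],
    ["CharacterKidnapped", "*", "CharacterRescued", "*"],
]


def complicate(tree, rng):
    """Return a new tree where each "*" is replaced by a random subplot."""
    new_children = []
    for child in tree:
        if child == "*":
            # Copy the template so later stages never share list objects.
            new_children.append(list(rng.choice(SUBPLOTS)))
        elif isinstance(child, list):
            new_children.append(complicate(child, rng))
        else:
            new_children.append(child)
    return new_children


rng = random.Random(2015)  # seeded so repeated runs give the same plot
plot = ["IntroduceCharacters", "*", "CharactersConvalesce"]
for _ in range(3):  # each pass adds one more level of subplots
    plot = complicate(plot, rng)
print(plot)
```

Because a freshly inserted subplot is not revisited during the same pass, each pass deepens the plot by exactly one level, which keeps the number of passes a simple knob for story complexity.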
108
109 Once we're happy with how complex the plot is, there's a stage that takes that
110 final plot tree, removes any remaining `*`'s, and flattens it, producing a tree
111 like:
112
    [IntroduceCharacters, JewelsStolen, JewelsRecovered, CharactersConvalesce]
114
115 And from that, the generator can print out a fairly nice synopsis.
116
117 (Note that flattening a tree like this is a convenient thing to do at various
118 points in the pipeline. Just because a list _can_ contain embedded sublists
119 doesn't mean it _has_ to.)
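
Flattening is short enough to sketch in full. The version below is an illustration in Python, not the generator's own code; it assumes plot developments are strings and the placeholder is the string `"*"`, and the name `flatten_plot` is hypothetical.

```python
def flatten_plot(tree):
    """Return a flat list of plot developments with "*" placeholders removed."""
    flat = []
    for child in tree:
        if isinstance(child, list):
            flat.extend(flatten_plot(child))  # splice the sublist in place
        elif child != "*":
            flat.append(child)
    return flat


plot = [
    "IntroduceCharacters",
    ["JewelsStolen", "*", "JewelsRecovered", "*"],
    "CharactersConvalesce",
]
print(flatten_plot(plot))
# ['IntroduceCharacters', 'JewelsStolen', 'JewelsRecovered', 'CharactersConvalesce']
```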
120
121 Then there's a stage that turns those plot developments into sequences of events.
122
123 This is actually a very murky area in the generator, and a lot of it is written
124 in an ad-hoc fashion, and I'm not happy about that... but for now, let's just
125 pretend it's simple. Say it basically looks for particular plot developments,
126 and replaces them with particular sequences of events, like so:
127
    IntroduceCharacters → [DescribeBurglar, DescribeDetective]
    JewelsStolen → [BurglarTakesJewels, BurglarEscapes]
    JewelsRecovered → [DetectiveCatchesBurglar, DetectiveTakesJewels]
    CharactersConvalesce → [BurglarEscapes, DetectiveGoesHome]
132
133 So the resulting tree after this stage looks like:
134
    [
        [DescribeBurglar, DescribeDetective],
        [BurglarTakesJewels, BurglarEscapes],
        [DetectiveCatchesBurglar, DetectiveTakesJewels],
        [BurglarEscapes, DetectiveGoesHome],
    ]
141
142 Which is then flattened:
143
    [DescribeBurglar, DescribeDetective, BurglarTakesJewels, BurglarEscapes,
     DetectiveCatchesBurglar, DetectiveTakesJewels, BurglarEscapes, DetectiveGoesHome]
146
147 and then ultimately text is generated. This part is a bit murky too, but for
148 simplicity, just assume that we go through the tree and for every event we see,
149 we print out a corresponding sentence:
150
151 > The burglar was a tall person. The detective was a short person. The burglar took the
152 > jewels. The burglar escaped. The detective caught the burglar. The detective
153 > took the jewels. The burglar escaped. The detective went home.
154
155 And there we have a story.
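
The last three steps (expanding plot developments into events, flattening, and printing a sentence per event) can be sketched together in Python. This is the "let's pretend it's simple" version only: the two lookup tables are copied from the example above, while the function names `expand`, `flatten` and `render` are assumptions for illustration.

```python
# Rewrite rules from the example above: plot development -> events.
EVENTS = {
    "IntroduceCharacters": ["DescribeBurglar", "DescribeDetective"],
    "JewelsStolen": ["BurglarTakesJewels", "BurglarEscapes"],
    "JewelsRecovered": ["DetectiveCatchesBurglar", "DetectiveTakesJewels"],
    "CharactersConvalesce": ["BurglarEscapes", "DetectiveGoesHome"],
}

# One sentence per event, copied from the sample story above.
SENTENCES = {
    "DescribeBurglar": "The burglar was a tall person.",
    "DescribeDetective": "The detective was a short person.",
    "BurglarTakesJewels": "The burglar took the jewels.",
    "BurglarEscapes": "The burglar escaped.",
    "DetectiveCatchesBurglar": "The detective caught the burglar.",
    "DetectiveTakesJewels": "The detective took the jewels.",
    "DetectiveGoesHome": "The detective went home.",
}


def expand(plot):
    """Replace each plot development with a copy of its list of events."""
    return [list(EVENTS[development]) for development in plot]


def flatten(tree):
    """Return all leaves of a nested list as one flat list."""
    flat = []
    for child in tree:
        if isinstance(child, list):
            flat.extend(flatten(child))
        else:
            flat.append(child)
    return flat


def render(events):
    """Join the sentence for each event into a single paragraph."""
    return " ".join(SENTENCES[event] for event in events)


plot = ["IntroduceCharacters", "JewelsStolen", "JewelsRecovered", "CharactersConvalesce"]
story = render(flatten(expand(plot)))
print(story)
```

Keeping the rules in plain dictionaries means a new plot development or event only needs a new table entry, though the real generator's logic here is, as noted, a good deal murkier.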
156
157 And that is basically how this generator works, if we ignore all the messy details.
158
159 ### What does a stage actually do? ###
160
161 Earlier I mentioned that a stage takes a tree and returns another tree, but that
162 might leave you wondering how the stage actually does that.
163
164 Well, a tree is a recursive data structure, so the easiest way to do that is to
165 write each stage as a recursive function. If you're familiar with design patterns,
166 you may know this as a "visitor". But if you're not comfortable with recursion,
167 this may be perplexing at first — it does take a while to wrap your head around it.
168
169 I'll give a simple example in pseudo-code. Say we wanted to take a tree, and
170 return a new tree where all the events of a certain type have been removed.
171 (There are actually stages in this generator that do that.) Say we want to
172 get rid of all events that involve the burglar, just before we write out the
173 story. (We're going for a "Garfield minus Garfield" feel, I guess.)
174 We could write a stage like this:
175
    function remove_burglar_events(tree) {
        new_children = []  // an empty list

        for each child in tree {
            if child is an event that involves the burglar {
                // do nothing!
            } else {
                new_child = remove_burglar_events(child)
                append new_child to new_children
            }
        }

        return new Tree(new_children)
    }
190
191 Notice how, in the `else` block, this function calls itself; that's the recursion. We actually make many new trees, one for each subtree of the tree we're given, and we "glue" them back together to form the new tree that we return to the caller.
192
193 Most of the stages in this generator look more or less like that, only with
194 more complex logic in the middle part.
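
For readers who would like something they can run, here is one way the pseudo-code might be written in Python. It is a sketch under assumptions, not the generator's code: trees are nested lists, events are strings, and the hypothetical helper `involves_burglar` just checks whether the event's name mentions the burglar. Unlike the pseudo-code, it copies a leaf event directly instead of recursing into it, since a string has no children.

```python
def involves_burglar(node):
    """Hypothetical test: an event is a string whose name mentions the burglar."""
    return isinstance(node, str) and "Burglar" in node


def remove_burglar_events(tree):
    """Return a new tree with every burglar-related event left out."""
    new_children = []  # an empty list
    for child in tree:
        if involves_burglar(child):
            continue  # do nothing: the event is simply not copied
        if isinstance(child, list):
            # Recurse into the subtree and glue the result back in.
            new_children.append(remove_burglar_events(child))
        else:
            new_children.append(child)  # keep any other event as-is
    return new_children


events = [
    ["DescribeBurglar", "DescribeDetective"],
    ["BurglarTakesJewels", "BurglarEscapes"],
    ["DetectiveCatchesBurglar", "DetectiveTakesJewels"],
    ["BurglarEscapes", "DetectiveGoesHome"],
]
print(remove_burglar_events(events))
# [['DescribeDetective'], [], ['DetectiveTakesJewels'], ['DetectiveGoesHome']]
```

Note that the second sublist comes back empty rather than disappearing; a later flattening stage would take care of that.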