git @ Cat's Eye Technologies Fountain / 886b255
Merge branch 'master' into develop-0.4 Chris Pressey 1 year, 6 months ago
1 changed file(s) with 34 addition(s) and 14 deletion(s).
210210
211211 The fact that every CSL can be captured by an LBA suggests maybe we
212212 could just linear-bound the amount of storage used by a Fountain
213 grammar in the worst case. Maybe; I haven't thought much about
214 this yet. My initial impression is that it seems a bit artificial.
215
216 Thinking about it a bit more, to make this work for both parsing
217 and generation, what we really need to do is to show the context
218 and the string (whether it be the input for parsing or the result
219 of generation) are related in size linearly -- that one is never
220 more than _k_ times bigger than the other, where _k_ is a constant
221 for the grammar. It might be feasible to do that in simple cases.
222 My intuition at the moment is that it surely breaks down at some
223 point, but it's not immediately clear where that point begins.
213 grammar in the worst case.
214
215 Having thought about it, this is probably the way to go. When
216 processing (parsing or generating) a Fountain grammar, the user ought
217 to be able to specify a "fuel efficiency" _E_, which is the linear bound.
218
219 (Whether this is specified in the source file, or through some other
220 means like a command-line option, is immaterial for the present
221 purposes. Presumably, though, its omission doesn't stop us from
222 processing the grammar; we simply don't make the check in that case.)
223
224 Each time a character is consumed from the input (resp. generated to
225 the output), _E_ units of "fuel" are gained. Each time a new unit of
226 storage is allocated for storing the context used by the grammar,
227 one unit of "fuel" is expended. Expending more fuel than has been
228 accumulated so far results in some kind of warning or error condition
229 (the salient thing being that the user is made aware that this grammar
230 exceeds the linear bound).
231
232 Note that integers are unbounded, so an
233 operation like `a += 1` may or may not allocate a new unit of
234 storage (a machine word, say); the usage therefore needs to be
235 recalculated afterwards.
236
237 Freeing up storage does not allow the grammar to reclaim "fuel".
238
239 This check could probably be done statically, using some kind of
240 abstract interpretation; but it would also be possible (and probably
241 a lot easier) to add it as a dynamic check while processing the
242 grammar.
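
The dynamic variant of the check described above might look roughly like the following Python sketch. It is not taken from the Fountain implementation; the names `FuelGauge`, `FuelExhausted`, `words_needed` and `charge_for_update`, and the choice of a 64-bit machine word as the unit of storage, are all assumptions made for illustration.

```python
class FuelExhausted(Exception):
    """Signals that context storage has exceeded the linear bound."""


class FuelGauge:
    """Hypothetical fuel accounting for one parse or generation run."""

    def __init__(self, efficiency):
        self.efficiency = efficiency  # E: fuel gained per character
        self.gained = 0               # total fuel accumulated so far
        self.spent = 0                # total fuel expended so far

    def on_character(self):
        # Called when a character is consumed (parsing)
        # or emitted (generation).
        self.gained += self.efficiency

    def on_allocate(self, units=1):
        # Called when new units of context storage are allocated.
        self.spent += units
        if self.spent > self.gained:
            raise FuelExhausted(
                f"spent {self.spent} units, only {self.gained} gained"
            )

    def on_free(self, units=1):
        # Freeing storage deliberately does not give any fuel back.
        pass


WORD_BITS = 64  # assumed size of one unit of storage


def words_needed(value):
    # Number of machine words needed to hold an unbounded integer.
    return max(1, (abs(value).bit_length() + WORD_BITS - 1) // WORD_BITS)


def charge_for_update(gauge, old_value, new_value):
    # After an operation such as `a += 1`, recalculate the usage
    # and charge only for any growth in storage.
    growth = words_needed(new_value) - words_needed(old_value)
    if growth > 0:
        gauge.on_allocate(growth)
```

A processor would call `on_character` once per character and `on_allocate` (or `charge_for_update`) at each point where the context grows. Whether exceeding the bound should raise an error, as here, or only emit a warning is left open by the text above.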
224243
225244 #### Does all this talk of complexity classes even mean anything?
226245
315334 When generating from a grammar, we often want to take a "random sample"
316335 of the space of utterances that the grammar defines. There are methods
317336 that have been developed to do this; not just for grammars, but any
318 recursive description of a structure; for example [Boltzmann Samplers][].
337 recursive description of a structure; for example [Boltzmann Samplers][]
338 (PDF).
319339
320340 We should probably go in this direction.
321341
364384 [Exanoke]: https://catseye.tc/node/Exanoke
365385 [Tamsin]: https://catseye.tc/node/Tamsin
366386 [Tandem]: https://catseye.tc/node/Tandem
367 [Boltzmann Samplers]: https://github.com/cpressey/Some-Papers-I-Really-Liked#boltzmann-samplers-for-the-random-generation-of-combinatorial-structures
368 [ambinate.py]: https://gist.github.com/cpressey/dd3f63eda91b33e429fa
387 [Boltzmann Samplers]: https://algo.inria.fr/flajolet/Publications/DuFlLoSc04.pdf
388 [ambinate.py]: https://codeberg.org/catseye/Dipple/src/branch/master/python/ambinate.py