git @ Cat's Eye Technologies NaNoGenLab / cfd0eff
Needs more science, I said! Chris Pressey 10 years ago
4 changed file(s) with 49 addition(s) and 26 deletion(s).
00 ending-concordance
11 ==================
22
3 Requirements
4 ------------
3 Hypothesis
4 ----------
5
6 We hypothesize that words in English can be roughly categorized using a
7 characteristic as simple as the pair of letters that they end with, and
8 that this can be exploited to form sentences which look almost plausible.
9
10 Apparatus
11 ---------
512
613 * Python 2.7.6 (probably works with older versions too)
714 * The `gutenberg.py` module from [gutenizer](https://github.com/okfn/gutenizer/)
815 * A bunch of Project Gutenberg texts in plain text format
916
10 Basic Strategy
11 --------------
17 Method
18 ------
1219
1320 * Read in all the words.
1421 * Index the words based on the final two letters in each word that is four or
1522 more letters long.
1623 * Write out words randomly chosen from two alternating end-two-letters groups.
1724
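The method above could be sketched roughly as follows. This is not the script from the repository, only a minimal illustration: the function names, the regex tokenizer, the de-duplication of words, and the choice to keep the same two ending groups for the whole run are all assumptions.

```python
import random
import re
from collections import defaultdict


def index_by_ending(text, min_length=4):
    """Group the distinct words of `text` by their final two letters."""
    groups = defaultdict(set)
    # Assumption: a "word" is a run of ASCII letters, compared in lower case.
    for word in re.findall(r"[a-z]+", text.lower()):
        if len(word) >= min_length:
            groups[word[-2:]].add(word)
    # Sorted lists keep random choices reproducible under a fixed seed.
    return dict((ending, sorted(words)) for ending, words in groups.items())


def alternate_endings(groups, count, rng=random):
    """Return `count` words, alternating between two randomly chosen groups."""
    # Assumption: the two groups are picked once and reused throughout.
    first, second = rng.sample(sorted(groups), 2)
    return [rng.choice(groups[(first, second)[i % 2]]) for i in range(count)]
```

Something like `" ".join(alternate_endings(index_by_ending(text), 12))` would then produce one line of output. The sketch needs at least two distinct endings in the input, otherwise `rng.sample` raises `ValueError`.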
18 Sample Output
19 -------------
25 Observations
26 ------------
2027
2128 When run on _Principles of Scientific Management_, I got:
2229
00 evaporating-text
11 ================
22
3 Requirements
4 ------------
3 Hypothesis
4 ----------
5
6 We hypothesize that a novel, under the right circumstances, can evaporate.
7
8 Apparatus
9 ---------
510
611 * Python 2.7.6 (probably works with older versions too)
712 * The `gutenberg.py` module from [gutenizer](https://github.com/okfn/gutenizer/)
813 * An input text (possibly from Project Gutenberg)
914
10 Basic Strategy
11 --------------
15 Method
16 ------
1217
1318 * Collect all the sentences and count them: _s_ is the number of sentences.
1419 * In each sentence, erase words. The probability of a word being erased
1520 is _n_/_s_ where _n_ is the sentence number; the first sentence is
1621 numbered 0.
1722
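As a rough illustration of the erasure rule, here is a minimal sketch. It assumes the sentences have already been collected into a list, and it assumes that "erasing" a word means replacing it with spaces of the same length; the actual script may do either step differently. The name `evaporate` is hypothetical.

```python
import random


def evaporate(sentences, rng=random):
    """Erase each word of sentence n (numbered from 0) with probability n/s."""
    s = len(sentences)
    result = []
    for n, sentence in enumerate(sentences):
        probability = float(n) / s
        # Assumption: an erased word becomes a run of spaces of equal length.
        words = [word if rng.random() >= probability else " " * len(word)
                 for word in sentence.split()]
        result.append(" ".join(words))
    return result
```

Since the first sentence is numbered 0, its erasure probability is 0 and it always survives intact. The last sentence is numbered _s_ - 1, so its words are erased with probability (_s_ - 1)/_s_, which approaches but never reaches 1.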
18 Sample Output
19 -------------
23 Observations
24 ------------
2025
2126 When run on Voltaire's "Candide": at the beginning...
2227
00 infix-neologisms
11 ================
22
3 Requirements
4 ------------
3 Hypothesis
4 ----------
5
6 We hypothesize that new words can be formed from existing words by splitting
7 them open and sticking a word inside.
8
9 Apparatus
10 ---------
511
612 * Python 2.7.6 (probably works with older versions too)
713 * A set of input words
814
9 Basic Strategy
10 --------------
15 Method
16 ------
1117
1218 * Pick an input word.
1319 * Split it into two parts. Pick another input word and insert it in
1521 * Possibly repeat step #2.
1622 * Output the word and repeat from step #1.
1723
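One way the steps above might look in code is sketched below. It follows the hypothesis in placing the second word between the two parts of the first. The function names, the random split point, and the `repeat_chance` parameter (standing in for "possibly repeat") are assumptions, and every input word is assumed to have at least two letters.

```python
import random


def infix(word, filler, rng=random):
    """Split `word` at a random interior point and put `filler` in the gap."""
    cut = rng.randint(1, len(word) - 1)  # both parts stay non-empty
    return word[:cut] + filler + word[cut:]


def neologism(words, repeat_chance=0.25, rng=random):
    """Build one new word; each further insertion happens with `repeat_chance`."""
    result = infix(rng.choice(words), rng.choice(words), rng)
    while rng.random() < repeat_chance:
        result = infix(result, rng.choice(words), rng)
    return result
```

Calling `neologism(words)` in a loop over a list of words read from the input file would correspond to repeating from step #1.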
18 Sample Output
19 -------------
24 Observations
25 ------------
2026
2127 Running it on `../generic-corpora/containers.txt`, which I just threw together
2228 after a few internet searches, you might get:
00 naive-cut-up
11 ============
22
3 Requirements
4 ------------
3 Hypothesis
4 ----------
5
6 We hypothesize that if we cut up a newspaper. We also hypothesize that
7 we cut up a newspaper.
8
9 Apparatus
10 ---------
511
612 * Python 2.7.6 (probably works with older versions too)
713 * [Pillow](http://python-pillow.github.io/) (it might work with PIL too)
814 * Some scanned images of newspapers, books, etc., in PNG format, for example
915 obtained by [fetch-chronam](../fetch-chronam/)
1016
11 Basic Strategy
12 --------------
17 Method
18 ------
1319
1420 * Start with "blank" canvas. For simplicity, we actually use one of the
1521 input images as the "canvas".
1723 * Copy the image within the rectangle to a random location on the canvas.
1824 * Repeat from step 2 until we guess we've covered the canvas.
1925
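The copy-and-paste loop could be sketched as below. This is an illustration under several assumptions, not the repository's script. The step that picks the rectangle is not shown in this diff, so the sketch simply takes a random rectangle from a randomly chosen input image. The fixed `pastes` count stands in for "until we guess we've covered the canvas", and the names and default sizes are made up. The functions only rely on the `size`, `crop()` and `paste()` members of Pillow images, so the images are opened by the caller.

```python
import random


def random_box(width, height, max_w, max_h, rng=random):
    """Return a random (left, upper, right, lower) box inside width x height."""
    w = rng.randint(1, min(max_w, width))
    h = rng.randint(1, min(max_h, height))
    left = rng.randint(0, width - w)
    upper = rng.randint(0, height - h)
    return (left, upper, left + w, upper + h)


def cut_up(canvas, sources, pastes=200, max_w=400, max_h=200, rng=random):
    """Paste `pastes` random cut-outs from `sources` onto `canvas`, in place."""
    canvas_w, canvas_h = canvas.size
    for _ in range(pastes):
        source = rng.choice(sources)
        box = random_box(source.size[0], source.size[1], max_w, max_h, rng)
        scrap = source.crop(box)
        # Pick a top-left corner that keeps the scrap on the canvas if it fits.
        x = rng.randint(0, max(0, canvas_w - (box[2] - box[0])))
        y = rng.randint(0, max(0, canvas_h - (box[3] - box[1])))
        canvas.paste(scrap, (x, y))
    return canvas

# Hypothetical usage with Pillow (file names are placeholders):
#   from PIL import Image
#   pages = [Image.open(name) for name in ("page1.png", "page2.png")]
#   cut_up(pages[0], pages).save("output.png")
```

Using the first input image as the canvas, as in the commented usage, mirrors the simplification described in step 1.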
20 Usage
21 -----
26 ### Detailed procedure ###
2227
2328 First, we assume some PNGs of scanned newspaper pages involving some topic
2429 (in this example, cheese) have been obtained. (PNG format is probably not
5964
6065 (You may wish to use a less clumsy image viewer than Ristretto, yourself.)
6166
62 Sample Output
63 -------------
67 Observations
68 ------------
6469
6570 It may be difficult to tell in this scaled-down sample, but the result was
6671 surprisingly thematic in its reference to cheese: