Merge pull request #10 from catseye/develop-0.10
Develop 0.10
Chris Pressey authored 5 years ago
GitHub committed 5 years ago
0 | 0 | History of Feedmark |
1 | 1 | =================== |
2 | ||
3 | 0.10 | |
4 | ---- | |
5 | ||
6 | * Format of refdex files has changed: an entry can now have a | |
7 | key `filenames`, which is like `filename`, but can be a list. | |
8 | This is backwards-compatible on input, and you can pass the | |
9 | flag `--output-refdex-single-filename` to cause the output | |
10 | from `--output-refdex` to strip all but the last filename | |
11 | and produce only `filename` entries on output. | |
12 | * Parser now allows trailing `###` on h3-level section headers. | |
2 | 13 | |
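As an illustration of the refdex change noted above, here is a minimal sketch rather than the project's own code. The sample entries and the helper name `to_single_filename` are assumptions made for the example; only the `filename`, `filenames`, and `anchor` keys and the keep-the-last-filename rule come from the notes above.

```python
import json

# Hypothetical refdex in the 0.10 format: one entry lists two files.
refdex = {
    "Llamas: It's Time to Spot Them": {
        "filenames": [
            "eg/Recent Llama Sightings.md",
            "eg/Referenced Llama Sightings.md",
        ],
        "anchor": "llamas-its-time-to-spot-them",
    },
    "Maybe sighting the llama": {
        # Old-style entries with a single `filename` remain valid on input.
        "filename": "eg/Ancient Llama Sightings.md",
        "anchor": "maybe-sighting-the-llama",
    },
}


def to_single_filename(refdex):
    """Illustrative helper: keep only the last filename of each entry."""
    result = {}
    for title, entry in refdex.items():
        if "filenames" in entry:
            result[title] = {
                "filename": entry["filenames"][-1],
                "anchor": entry["anchor"],
            }
        else:
            # Already in single-filename form; reuse the entry unchanged.
            result[title] = entry
    return result


single = to_single_filename(refdex)
print(json.dumps(single, indent=4, sort_keys=True))
```

Each entry in the result carries exactly one `filename`, which is roughly what `--output-refdex-single-filename` is described as producing.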
3 | 14 | 0.9-2019.105 |
4 | 15 | ------------ |
0 | Copyright (c)2019 Chris Pressey, Cat's Eye Technologies | |
1 | ||
2 | Permission is hereby granted, free of charge, to any person obtaining a | |
3 | copy of this software and associated documentation files (the "Software"), | |
4 | to deal in the Software without restriction, including without limitation | |
5 | the rights to use, copy, modify, merge, publish, distribute, sublicense, | |
6 | and/or sell copies of the Software, and to permit persons to whom the | |
7 | Software is furnished to do so, subject to the following conditions: | |
8 | ||
9 | The above copyright notice and this permission notice shall be included in | |
10 | all copies or substantial portions of the Software. | |
11 | ||
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | |
13 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | |
14 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | |
15 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | |
16 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING | |
17 | FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER | |
18 | DEALINGS IN THE SOFTWARE. |
0 | 0 | Feedmark |
1 | 1 | ======== |
2 | 2 | |
3 | *Version 0.9-2019.1015. Subject to change in backwards-incompatible ways without notice.* | |
3 | *Version 0.10. Subject to change in backwards-incompatible ways without notice.* | |
4 | 4 | |
5 | 5 | **Feedmark** is a format for embedding structured data in Markdown files |
6 | 6 | in a way which is both human-readable and machine-extractable. |
108 | 108 | Feedmark is a subset of Markdown, which is something it has in common |
109 | 109 | with [Falderal][]; however, it has decidedly different goals.
110 | 110 | |
111 | TODO | |
112 | ---- | |
113 | ||
114 | Research whether JSON Schema could be used for validation as well. | |
115 | ||
116 | "common" properties on document which all entries within inherit. | |
117 | ||
118 | Sub-entries. Somehow. For individual games in a series, implementations | |
119 | or variations on a programming language, etc. | |
120 | ||
121 | Allow trailing `###` on h3-level headings. | |
122 | ||
123 | Index creation from refdex, for permalinks. | |
111 | See [TODO.md](TODO.md) for planned features and [HISTORY.md](HISTORY.md) | |
112 | for a record of features added in past versions. | |
124 | 113 | |
125 | 114 | [Falderal]: http://catseye.tc/node/Falderal |
126 | 115 | [Chrysoberyl]: http://git.catseye.tc/Chrysoberyl/ |
0 | TODO for Feedmark | |
1 | ----------------- | |
2 | ||
3 | "common" properties on document which all entries within inherit. | |
4 | ||
5 | Sub-entries. Somehow. For individual games in a series, implementations | |
6 | or variations on a programming language, etc. |
0 | This is free and unencumbered software released into the public domain. | |
1 | ||
2 | Anyone is free to copy, modify, publish, use, compile, sell, or | |
3 | distribute this software, either in source code form or as a compiled | |
4 | binary, for any purpose, commercial or non-commercial, and by any | |
5 | means. | |
6 | ||
7 | In jurisdictions that recognize copyright laws, the author or authors | |
8 | of this software dedicate any and all copyright interest in the | |
9 | software to the public domain. We make this dedication for the benefit | |
10 | of the public at large and to the detriment of our heirs and | |
11 | successors. We intend this dedication to be an overt act of | |
12 | relinquishment in perpetuity of all present and future rights to this | |
13 | software under copyright law. | |
14 | ||
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, | |
16 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF | |
17 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. | |
18 | IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR | |
19 | OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, | |
20 | ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR | |
21 | OTHER DEALINGS IN THE SOFTWARE. | |
22 | ||
23 | For more information, please refer to <http://unlicense.org/> |
24 | 24 | |
25 | 25 | Aenean ullamcorper ex at tellus bibendum semper. Donec lectus augue, vestibulum vel justo in, euismod feugiat libero. Nam fringilla iaculis fermentum. Sed ac felis quis nunc fringilla mattis. Suspendisse potenti. Suspendisse a eros vel lacus luctus venenatis ut id tortor. Etiam nisi orci, scelerisque et aliquam sit amet, blandit non mi. Ut fringilla est sed metus facilisis convallis. Aliquam pharetra iaculis lobortis. |
26 | 26 | |
27 | ### Llamas: It's Time to Spot Them | |
27 | ### Llamas: It's Time to Spot Them ### | |
28 | 28 | |
29 | 29 | * date: Nov 1 2016 09:00:00 |
30 | 30 |
5 | 5 | |
6 | 6 | setuptools.setup( |
7 | 7 | name='Feedmark', |
8 | version='0.9.2019.1015', | |
9 | description='Definition of Feedmark, a curation-oriented subset of Markdown, and tools for processing it', | |
8 | version='0.10', | |
9 | description='Feedmark, a curation-oriented subset of Markdown, and tools for processing it', | |
10 | 10 | long_description=long_description, |
11 | 11 | long_description_content_type="text/markdown", |
12 | 12 | author='Chris Pressey', |
18 | 18 | classifiers=[ |
19 | 19 | "Development Status :: 4 - Beta", |
20 | 20 | "Intended Audience :: Information Technology", |
21 | "License :: Public Domain", | |
21 | "License :: OSI Approved :: MIT License", | |
22 | 22 | "Operating System :: OS Independent", |
23 | 23 | "Programming Language :: Python :: 2.7", |
24 | 24 | "Programming Language :: Python :: 3", |
24 | 24 | for key, value in items(local_refdex): |
25 | 25 | if 'filename' in value: |
26 | 26 | value['filename'] = input_refdex_filename_prefix + value['filename'] |
27 | if 'filenames' in value: | |
28 | value['filenames'] = [input_refdex_filename_prefix + f for f in value['filenames']] | |
27 | 29 | refdex.update(local_refdex) |
28 | 30 | except: |
29 | 31 | sys.stderr.write("Could not read refdex JSON from '{}'\n".format(filename)) |
48 | 50 | value['filename'].encode('utf-8') |
49 | 51 | assert isinstance(value['anchor'], unicode_string) |
50 | 52 | value['anchor'].encode('utf-8') |
53 | elif 'filenames' in value and 'anchor' in value: | |
54 | assert len(value) == 2 | |
55 | for filename in value['filenames']: | |
56 | assert isinstance(filename, unicode_string) | |
57 | filename.encode('utf-8') | |
58 | assert isinstance(value['anchor'], unicode_string) | |
59 | value['anchor'].encode('utf-8') | |
51 | 60 | else: |
52 | 61 | raise NotImplementedError("badly formed refdex") |
53 | 62 | except: |
55 | 64 | raise |
56 | 65 | |
57 | 66 | return refdex |
67 | ||
68 | ||
69 | def convert_refdex_to_single_filename_refdex(input_refdex): | |
70 | """Note that this makes a partially shallow copy.""" | |
71 | refdex = {} | |
72 | for key, value in input_refdex.items(): | |
73 | if 'filenames' in value: | |
74 | refdex[key] = { | |
75 | 'filename': value['filenames'][-1], | |
76 | 'anchor': value['anchor'] | |
77 | } | |
78 | else: | |
79 | refdex[key] = value | |
80 | return refdex |
1 | 1 | import json |
2 | 2 | import sys |
3 | 3 | |
4 | from feedmark.loader import read_document_from, read_refdex_from | |
4 | from feedmark.loader import ( | |
5 | read_document_from, read_refdex_from, convert_refdex_to_single_filename_refdex, | |
6 | ) | |
5 | 7 | from feedmark.utils import items |
6 | 8 | |
7 | 9 | |
61 | 63 | argparser.add_argument('--input-refdexes', metavar='FILENAME', type=str, |
62 | 64 | help='Load these JSON files as the reference-style links index before processing' |
63 | 65 | ) |
66 | argparser.add_argument('--input-refdex-filename-prefix', type=str, default=None, | |
67 | help='After loading refdexes, prepend this to filename of each refdex' | |
68 | ) | |
64 | 69 | argparser.add_argument('--output-refdex', action='store_true', |
65 | 70 | help='Construct reference-style links index from the entries and write it to stdout as JSON' |
66 | 71 | ) |
67 | argparser.add_argument('--input-refdex-filename-prefix', type=str, default=None, | |
68 | help='After loading refdexes, prepend this to filename of each refdex' | |
72 | argparser.add_argument('--output-refdex-single-filename', action='store_true', | |
73 | help='When outputting a refdex, ensure that only entries with a single filename are ' | |
74 | 'output, by stripping all but the last filename from entries with multiple filenames.'
69 | 75 | ) |
70 | 76 | |
71 | 77 | argparser.add_argument('--limit', metavar='COUNT', type=int, default=None, |
72 | 78 | help='Process no more than this many entries when making an Atom or HTML feed' |
73 | 79 | ) |
74 | 80 | |
75 | argparser.add_argument('--version', action='version', version="%(prog)s 0.9") | |
81 | argparser.add_argument('--version', action='version', version="%(prog)s 0.10") | |
76 | 82 | |
77 | 83 | options = argparser.parse_args(args) |
78 | 84 | |
83 | 89 | for filename in options.input_files: |
84 | 90 | document = read_document_from(filename) |
85 | 91 | documents.append(document) |
92 | ||
93 | ### input: load input refdexes | |
86 | 94 | |
87 | 95 | input_refdexes = [] |
88 | 96 | if options.input_refdex: |
112 | 120 | if options.output_refdex: |
113 | 121 | for document in documents: |
114 | 122 | for section in document.sections: |
115 | refdex[section.title] = { | |
116 | 'filename': document.filename, | |
117 | 'anchor': section.anchor | |
118 | } | |
123 | if section.title in refdex: | |
124 | entry = refdex[section.title] | |
125 | if entry['anchor'] != section.anchor: | |
126 | raise ValueError("Inconsistent anchors: {} in refdex, {} in document".format(entry['anchor'], section.anchor)) | |
127 | if 'filename' in entry: | |
128 | entry['filenames'] = [] | |
129 | del entry['filename'] | |
130 | entry['filenames'].append(document.filename) | |
131 | else: | |
132 | refdex[section.title] = { | |
133 | 'filenames': [document.filename], | |
134 | 'anchor': section.anchor | |
135 | } | |
119 | 136 | |
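The accumulation step in this hunk can be sketched in isolation. This is a simplified stand-in, not the project's code: `build_refdex` is a hypothetical name, documents are modelled as plain `(filename, sections)` tuples instead of Feedmark document objects, and the refdex starts out empty, so the conversion of pre-existing `filename` entries is not covered.

```python
def build_refdex(documents):
    """Collect every file in which each section title occurs.

    `documents` is a list of (filename, [(title, anchor), ...]) pairs,
    an assumed stand-in for parsed Feedmark documents.
    """
    refdex = {}
    for filename, sections in documents:
        for title, anchor in sections:
            if title in refdex:
                entry = refdex[title]
                # The same title must always map to the same anchor.
                if entry["anchor"] != anchor:
                    raise ValueError(
                        "Inconsistent anchors: {} in refdex, {} in document".format(
                            entry["anchor"], anchor
                        )
                    )
                entry["filenames"].append(filename)
            else:
                refdex[title] = {"filenames": [filename], "anchor": anchor}
    return refdex


docs = [
    ("eg/Recent Llama Sightings.md",
     [("Llamas: It's Time to Spot Them", "llamas-its-time-to-spot-them")]),
    ("eg/Referenced Llama Sightings.md",
     [("Llamas: It's Time to Spot Them", "llamas-its-time-to-spot-them")]),
]
overlap = build_refdex(docs)
print(overlap["Llamas: It's Time to Spot Them"]["filenames"])
```

Files are appended in the order they are processed, so the last filename in the list is the last file named on the command line.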
120 | 137 | ### processing: rewrite references phase |
121 | 138 | |
126 | 143 | ### output |
127 | 144 | |
128 | 145 | if options.output_refdex: |
146 | if options.output_refdex_single_filename: | |
147 | refdex = convert_refdex_to_single_filename_refdex(refdex) | |
129 | 148 | sys.stdout.write(json.dumps(refdex, indent=4, sort_keys=True)) |
130 | 149 | |
131 | 150 | if options.dump_entries: |
19 | 19 | entry = refdex[name] |
20 | 20 | if 'filename' in entry and 'anchor' in entry: |
21 | 21 | filename = quote(entry['filename'].encode('utf-8')) |
22 | anchor = quote(entry['anchor'].encode('utf-8')) | |
23 | url = u'{}#{}'.format(filename, anchor) | |
24 | elif 'filenames' in entry and 'anchor' in entry: | |
25 | # pick the last one, for compatibility with single-refdex style | |
26 | filename = quote(entry['filenames'][-1].encode('utf-8')) | |
22 | 27 | anchor = quote(entry['anchor'].encode('utf-8')) |
23 | 28 | url = u'{}#{}'.format(filename, anchor) |
24 | 29 | elif 'url' in entry: |
243 | 248 | while self.is_blank_line(): |
244 | 249 | self.scan() |
245 | 250 | |
246 | match = re.match(r'^\#\#\#\s+(.*?)\s*$', self.line) | |
251 | match = re.match(r'^\#\#\#\s+(.*?)\s*(\#\#\#)?\s*$', self.line) | |
247 | 252 | if not match: |
248 | 253 | raise ValueError('Expected section, found "{}"'.format(self.line)) |
249 | 254 |
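To see what the revised pattern accepts, the following sketch applies the same regular expression outside the parser. The wrapper function `section_title` is a hypothetical name used only for this example; the pattern and the error message are copied from the hunk above.

```python
import re

# Same pattern as in the hunk above: an optional trailing `###` is allowed.
SECTION_RE = re.compile(r'^\#\#\#\s+(.*?)\s*(\#\#\#)?\s*$')


def section_title(line):
    """Return the title of an h3-level heading line, without any trailing `###`."""
    match = SECTION_RE.match(line)
    if not match:
        raise ValueError('Expected section, found "{}"'.format(line))
    return match.group(1)


plain = section_title("### Llamas: It's Time to Spot Them")
closed = section_title("### Llamas: It's Time to Spot Them ###")
print(plain)
print(closed)
```

Both forms should yield the same title, because the lazy `(.*?)` group stops before the optional closing `###` and any surrounding whitespace.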
127 | 127 | "<a href=\"https://daringfireball.net/projects/markdown/\">Markdown</a>\ncan be used.</p>\n" |
128 | 128 | "<p>To <a href=\"https://en.wikipedia.org/wiki/Site\">site</a> them.</p>\n<p>Sight them, sigh.</p>" |
129 | 129 | ) |
130 | # note that property values are bare HTML: there is no surrounding <p></p> or other element | |
130 | # note that property values are bare HTML fragments: there is no surrounding <p></p> or other element | |
131 | 131 | self.assertEqual( |
132 | 132 | data['documents'][0]['properties']['hopper'], |
133 | 133 | '<a href="https://en.wikipedia.org/wiki/Stephen_Hopper">Stephen</a>' |
183 | 183 | self.assertDictEqual(data, { |
184 | 184 | "2 Llamas Spotted Near Mall": { |
185 | 185 | "anchor": "2-llamas-spotted-near-mall", |
186 | "filename": "eg/Recent Llama Sightings.md" | |
186 | "filenames": ["eg/Recent Llama Sightings.md"], | |
187 | 187 | }, |
188 | 188 | "A Possible Llama Under the Bridge": { |
189 | 189 | "anchor": "a-possible-llama-under-the-bridge", |
190 | "filename": "eg/Recent Llama Sightings.md" | |
190 | "filenames": ["eg/Recent Llama Sightings.md"], | |
191 | 191 | }, |
192 | 192 | "Llamas: It's Time to Spot Them": { |
193 | 193 | "anchor": "llamas-its-time-to-spot-them", |
194 | "filename": "eg/Recent Llama Sightings.md" | |
194 | "filenames": ["eg/Recent Llama Sightings.md"], | |
195 | 195 | }, |
196 | 196 | "Maybe sighting the llama": { |
197 | 197 | "anchor": "maybe-sighting-the-llama", |
198 | "filename": "eg/Ancient Llama Sightings.md" | |
199 | } | |
198 | "filenames": ["eg/Ancient Llama Sightings.md"], | |
199 | } | |
200 | }) | |
201 | ||
202 | def test_output_refdex_with_overlap(self): | |
203 | # Both of these files contain an entry called "Llamas: It's Time to Spot Them". | |
204 | # The refdex is created with entries pointing to all files where the entry occurs. | |
205 | main(['eg/Recent Llama Sightings.md', 'eg/Referenced Llama Sightings.md', '--output-refdex']) | |
206 | data = json.loads(sys.stdout.getvalue()) | |
207 | self.assertDictEqual(data, { | |
208 | "2 Llamas Spotted Near Mall": { | |
209 | "anchor": "2-llamas-spotted-near-mall", | |
210 | "filenames": [ | |
211 | "eg/Recent Llama Sightings.md", | |
212 | ] | |
213 | }, | |
214 | "A Possible Llama Under the Bridge": { | |
215 | "anchor": "a-possible-llama-under-the-bridge", | |
216 | "filenames": [ | |
217 | "eg/Recent Llama Sightings.md", | |
218 | ], | |
219 | }, | |
220 | "Llamas: It's Time to Spot Them": { | |
221 | "anchor": "llamas-its-time-to-spot-them", | |
222 | "filenames": [ | |
223 | "eg/Recent Llama Sightings.md", | |
224 | "eg/Referenced Llama Sightings.md" | |
225 | ] | |
226 | }, | |
227 | }) | |
228 | ||
229 | def test_output_refdex_with_overlap_forcing_single_filename(self): | |
230 | # Both of these files contain an entry called "Llamas: It's Time to Spot Them" | |
231 | # The refdex is created pointing only to the file that was mentioned last. | |
232 | main(['eg/Recent Llama Sightings.md', 'eg/Referenced Llama Sightings.md', '--output-refdex', '--output-refdex-single-filename']) | |
233 | data = json.loads(sys.stdout.getvalue()) | |
234 | self.assertDictEqual(data, { | |
235 | "2 Llamas Spotted Near Mall": { | |
236 | "anchor": "2-llamas-spotted-near-mall", | |
237 | "filename": "eg/Recent Llama Sightings.md", | |
238 | }, | |
239 | "A Possible Llama Under the Bridge": { | |
240 | "anchor": "a-possible-llama-under-the-bridge", | |
241 | "filename": "eg/Recent Llama Sightings.md", | |
242 | }, | |
243 | "Llamas: It's Time to Spot Them": { | |
244 | "anchor": "llamas-its-time-to-spot-them", | |
245 | "filename": "eg/Referenced Llama Sightings.md", | |
246 | }, | |
200 | 247 | }) |
201 | 248 | |
202 | 249 | def test_input_refdex_output_markdown(self): |