git @ Cat's Eye Technologies Feedmark / 0.10
Merge pull request #10 from catseye/develop-0.10
Develop 0.10
Chris Pressey authored 2 years ago; GitHub committed 2 years ago
11 changed files with 153 additions and 57 deletions.
00 History of Feedmark
11 ===================
2
3 0.10
4 ----
5
6 * Format of refdex files has changed: an entry can now have a
7 key `filenames`, which is like `filename`, but can be a list.
8 This is backwards-compatible on input, and you can pass the
9 flag `--output-refdex-single-filename` to cause the output
10 from `--output-refdex` to strip all but the last filename
11 and produce only `filename` entries on output.
12 * Parser now allows trailing `###` on h3-level section headers.
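
To illustrate the two refdex entry shapes described in the 0.10 notes, here is a minimal sketch of reading either one. The helper name `entry_filenames` is hypothetical and is not part of Feedmark; the entry contents are borrowed from the test data in this commit.

```python
def entry_filenames(entry):
    """Return the filenames of a refdex entry as a list, whichever key it uses.

    Hypothetical helper for illustration only; not part of Feedmark.
    """
    if 'filenames' in entry:
        return list(entry['filenames'])
    if 'filename' in entry:
        return [entry['filename']]
    return []  # e.g. an entry that carries only a `url`


# Pre-0.10 shape: a single `filename`.
old_style = {
    'filename': 'eg/Recent Llama Sightings.md',
    'anchor': 'llamas-its-time-to-spot-them',
}

# 0.10 shape: a list under `filenames`.
new_style = {
    'filenames': [
        'eg/Recent Llama Sightings.md',
        'eg/Referenced Llama Sightings.md',
    ],
    'anchor': 'llamas-its-time-to-spot-them',
}

print(entry_filenames(old_style))
# The last filename is the one kept by --output-refdex-single-filename.
print(entry_filenames(new_style)[-1])
```

Treating both keys as a list on input is one way to get the backwards compatibility the notes describe; the actual loader in this commit validates the two shapes separately.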
213
314 0.9-2019.1015
415 ------------
0 Copyright (c)2019 Chris Pressey, Cat's Eye Technologies
1
2 Permission is hereby granted, free of charge, to any person obtaining a
3 copy of this software and associated documentation files (the "Software"),
4 to deal in the Software without restriction, including without limitation
5 the rights to use, copy, modify, merge, publish, distribute, sublicense,
6 and/or sell copies of the Software, and to permit persons to whom the
7 Software is furnished to do so, subject to the following conditions:
8
9 The above copyright notice and this permission notice shall be included in
10 all copies or substantial portions of the Software.
11
12 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
13 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
14 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
15 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
16 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
17 FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
18 DEALINGS IN THE SOFTWARE.
00 Feedmark
11 ========
22
3 *Version 0.9-2019.1015. Subject to change in backwards-incompatible ways without notice.*
3 *Version 0.10. Subject to change in backwards-incompatible ways without notice.*
44
55 **Feedmark** is a format for embedding structured data in Markdown files
66 in a way which is both human-readable and machine-extractable.
108108 Feedmark is a subset of Markdown, which is something it has in common
109109 with [Falderal][], however it has decidedly different goals.
110110
111 TODO
112 ----
113
114 Research whether JSON Schema could be used for validation as well.
115
116 "common" properties on document which all entries within inherit.
117
118 Sub-entries. Somehow. For individual games in a series, implementations
119 or variations on a programming language, etc.
120
121 Allow trailing `###` on h3-level headings.
122
123 Index creation from refdex, for permalinks.
111 See [TODO.md](TODO.md) for planned features and [HISTORY.md](HISTORY.md)
112 for a record of features added in past versions.
124113
125114 [Falderal]: http://catseye.tc/node/Falderal
126115 [Chrysoberyl]: http://git.catseye.tc/Chrysoberyl/
0 TODO for Feedmark
1 -----------------
2
3 "common" properties on document which all entries within inherit.
4
5 Sub-entries. Somehow. For individual games in a series, implementations
6 or variations on a programming language, etc.
UNLICENSE (+0, -24)
0 This is free and unencumbered software released into the public domain.
1
2 Anyone is free to copy, modify, publish, use, compile, sell, or
3 distribute this software, either in source code form or as a compiled
4 binary, for any purpose, commercial or non-commercial, and by any
5 means.
6
7 In jurisdictions that recognize copyright laws, the author or authors
8 of this software dedicate any and all copyright interest in the
9 software to the public domain. We make this dedication for the benefit
10 of the public at large and to the detriment of our heirs and
11 successors. We intend this dedication to be an overt act of
12 relinquishment in perpetuity of all present and future rights to this
13 software under copyright law.
14
15 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
16 EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
17 MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
18 IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
19 OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
20 ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
21 OTHER DEALINGS IN THE SOFTWARE.
22
23 For more information, please refer to <http://unlicense.org/>
2424
2525 Aenean ullamcorper ex at tellus bibendum semper. Donec lectus augue, vestibulum vel justo in, euismod feugiat libero. Nam fringilla iaculis fermentum. Sed ac felis quis nunc fringilla mattis. Suspendisse potenti. Suspendisse a eros vel lacus luctus venenatis ut id tortor. Etiam nisi orci, scelerisque et aliquam sit amet, blandit non mi. Ut fringilla est sed metus facilisis convallis. Aliquam pharetra iaculis lobortis.
2626
27 ### Llamas: It's Time to Spot Them
27 ### Llamas: It's Time to Spot Them ###
2828
2929 * date: Nov 1 2016 09:00:00
3030
55
66 setuptools.setup(
77 name='Feedmark',
8 version='0.9.2019.1015',
9 description='Definition of Feedmark, a curation-oriented subset of Markdown, and tools for processing it',
8 version='0.10',
9 description='Feedmark, a curation-oriented subset of Markdown, and tools for processing it',
1010 long_description=long_description,
1111 long_description_content_type="text/markdown",
1212 author='Chris Pressey',
1818 classifiers=[
1919 "Development Status :: 4 - Beta",
2020 "Intended Audience :: Information Technology",
21 "License :: Public Domain",
21 "License :: OSI Approved :: MIT License",
2222 "Operating System :: OS Independent",
2323 "Programming Language :: Python :: 2.7",
2424 "Programming Language :: Python :: 3",
2424 for key, value in items(local_refdex):
2525 if 'filename' in value:
2626 value['filename'] = input_refdex_filename_prefix + value['filename']
27 if 'filenames' in value:
28 value['filenames'] = [input_refdex_filename_prefix + f for f in value['filenames']]
2729 refdex.update(local_refdex)
2830 except:
2931 sys.stderr.write("Could not read refdex JSON from '{}'\n".format(filename))
4850 value['filename'].encode('utf-8')
4951 assert isinstance(value['anchor'], unicode_string)
5052 value['anchor'].encode('utf-8')
53 elif 'filenames' in value and 'anchor' in value:
54 assert len(value) == 2
55 for filename in value['filenames']:
56 assert isinstance(filename, unicode_string)
57 filename.encode('utf-8')
58 assert isinstance(value['anchor'], unicode_string)
59 value['anchor'].encode('utf-8')
5160 else:
5261 raise NotImplementedError("badly formed refdex")
5362 except:
5564 raise
5665
5766 return refdex
67
68
69 def convert_refdex_to_single_filename_refdex(input_refdex):
70 """Note that this makes a partially shallow copy."""
71 refdex = {}
72 for key, value in input_refdex.items():
73 if 'filenames' in value:
74 refdex[key] = {
75 'filename': value['filenames'][-1],
76 'anchor': value['anchor']
77 }
78 else:
79 refdex[key] = value
80 return refdex
11 import json
22 import sys
33
4 from feedmark.loader import read_document_from, read_refdex_from
4 from feedmark.loader import (
5 read_document_from, read_refdex_from, convert_refdex_to_single_filename_refdex,
6 )
57 from feedmark.utils import items
68
79
6163 argparser.add_argument('--input-refdexes', metavar='FILENAME', type=str,
6264 help='Load these JSON files as the reference-style links index before processing'
6365 )
66 argparser.add_argument('--input-refdex-filename-prefix', type=str, default=None,
67 help='After loading refdexes, prepend this to filename of each refdex'
68 )
6469 argparser.add_argument('--output-refdex', action='store_true',
6570 help='Construct reference-style links index from the entries and write it to stdout as JSON'
6671 )
67 argparser.add_argument('--input-refdex-filename-prefix', type=str, default=None,
68 help='After loading refdexes, prepend this to filename of each refdex'
72 argparser.add_argument('--output-refdex-single-filename', action='store_true',
73 help='When outputting a refdex, ensure that only entries with a single filename are '
74 'output, by stripping all but the last filename from multiple filenames entries.'
6975 )
7076
7177 argparser.add_argument('--limit', metavar='COUNT', type=int, default=None,
7278 help='Process no more than this many entries when making an Atom or HTML feed'
7379 )
7480
75 argparser.add_argument('--version', action='version', version="%(prog)s 0.9")
81 argparser.add_argument('--version', action='version', version="%(prog)s 0.10")
7682
7783 options = argparser.parse_args(args)
7884
8389 for filename in options.input_files:
8490 document = read_document_from(filename)
8591 documents.append(document)
92
93 ### input: load input refdexes
8694
8795 input_refdexes = []
8896 if options.input_refdex:
112120 if options.output_refdex:
113121 for document in documents:
114122 for section in document.sections:
115 refdex[section.title] = {
116 'filename': document.filename,
117 'anchor': section.anchor
118 }
123 if section.title in refdex:
124 entry = refdex[section.title]
125 if entry['anchor'] != section.anchor:
126 raise ValueError("Inconsistent anchors: {} in refdex, {} in document".format(entry['anchor'], section.anchor))
127 if 'filename' in entry:
128 entry['filenames'] = []
129 del entry['filename']
130 entry['filenames'].append(document.filename)
131 else:
132 refdex[section.title] = {
133 'filenames': [document.filename],
134 'anchor': section.anchor
135 }
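
The accumulation step in this hunk can be sketched in isolation. This is a simplified illustration, not the code from this commit: the function name `build_refdex` and the `(filename, title, anchor)` tuples are assumptions standing in for Feedmark's document and section objects, and it assumes any entries already in the refdex use the `filenames` key.

```python
def build_refdex(sections, refdex=None):
    """Collect every file in which each section title occurs.

    `sections` is an iterable of (filename, title, anchor) tuples, a
    hypothetical stand-in for Feedmark's document and section objects.
    """
    refdex = {} if refdex is None else refdex
    for filename, title, anchor in sections:
        entry = refdex.get(title)
        if entry is None:
            # First sighting of this title: start a new list of filenames.
            refdex[title] = {'filenames': [filename], 'anchor': anchor}
            continue
        if entry['anchor'] != anchor:
            raise ValueError(
                "Inconsistent anchors: {} in refdex, {} in document".format(
                    entry['anchor'], anchor
                )
            )
        # Same title seen again in another file: record that file too.
        entry['filenames'].append(filename)
    return refdex


sections = [
    ('eg/Recent Llama Sightings.md', "Llamas: It's Time to Spot Them",
     'llamas-its-time-to-spot-them'),
    ('eg/Referenced Llama Sightings.md', "Llamas: It's Time to Spot Them",
     'llamas-its-time-to-spot-them'),
]
print(build_refdex(sections))
```

Files are appended in the order they are processed, which is why the last filename corresponds to the file mentioned last on the command line.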
119136
120137 ### processing: rewrite references phase
121138
126143 ### output
127144
128145 if options.output_refdex:
146 if options.output_refdex_single_filename:
147 refdex = convert_refdex_to_single_filename_refdex(refdex)
129148 sys.stdout.write(json.dumps(refdex, indent=4, sort_keys=True))
130149
131150 if options.dump_entries:
1919 entry = refdex[name]
2020 if 'filename' in entry and 'anchor' in entry:
2121 filename = quote(entry['filename'].encode('utf-8'))
22 anchor = quote(entry['anchor'].encode('utf-8'))
23 url = u'{}#{}'.format(filename, anchor)
24 elif 'filenames' in entry and 'anchor' in entry:
25 # pick the last one, for compatibility with single-refdex style
26 filename = quote(entry['filenames'][-1].encode('utf-8'))
2227 anchor = quote(entry['anchor'].encode('utf-8'))
2328 url = u'{}#{}'.format(filename, anchor)
2429 elif 'url' in entry:
243248 while self.is_blank_line():
244249 self.scan()
245250
246 match = re.match(r'^\#\#\#\s+(.*?)\s*$', self.line)
251 match = re.match(r'^\#\#\#\s+(.*?)\s*(\#\#\#)?\s*$', self.line)
247252 if not match:
248253 raise ValueError('Expected section, found "{}"'.format(self.line))
249254
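
A quick way to see the effect of the new pattern is to run it against headings with and without the trailing `###`. The regular expression below is copied from this hunk; the wrapper function `section_title` is a hypothetical name used only for this sketch.

```python
import re

# Pattern copied from the parser change in this commit.
SECTION_RE = re.compile(r'^\#\#\#\s+(.*?)\s*(\#\#\#)?\s*$')


def section_title(line):
    """Return the title of an h3-level heading line (hypothetical helper)."""
    match = SECTION_RE.match(line)
    if not match:
        raise ValueError('Expected section, found "{}"'.format(line))
    return match.group(1)


# Both spellings of the heading yield the same title.
print(section_title("### Llamas: It's Time to Spot Them"))
print(section_title("### Llamas: It's Time to Spot Them ###"))
```

Because the title group is non-greedy and the closing `###` is optional, the closing marker and the whitespace around it are left out of the captured title.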
127127 "<a href=\"https://daringfireball.net/projects/markdown/\">Markdown</a>\ncan be used.</p>\n"
128128 "<p>To <a href=\"https://en.wikipedia.org/wiki/Site\">site</a> them.</p>\n<p>Sight them, sigh.</p>"
129129 )
130 # note that property values are bare HTML: there is no surrounding <p></p> or other element
130 # note that property values are bare HTML fragments: there is no surrounding <p></p> or other element
131131 self.assertEqual(
132132 data['documents'][0]['properties']['hopper'],
133133 '<a href="https://en.wikipedia.org/wiki/Stephen_Hopper">Stephen</a>'
183183 self.assertDictEqual(data, {
184184 "2 Llamas Spotted Near Mall": {
185185 "anchor": "2-llamas-spotted-near-mall",
186 "filename": "eg/Recent Llama Sightings.md"
186 "filenames": ["eg/Recent Llama Sightings.md"],
187187 },
188188 "A Possible Llama Under the Bridge": {
189189 "anchor": "a-possible-llama-under-the-bridge",
190 "filename": "eg/Recent Llama Sightings.md"
190 "filenames": ["eg/Recent Llama Sightings.md"],
191191 },
192192 "Llamas: It's Time to Spot Them": {
193193 "anchor": "llamas-its-time-to-spot-them",
194 "filename": "eg/Recent Llama Sightings.md"
194 "filenames": ["eg/Recent Llama Sightings.md"],
195195 },
196196 "Maybe sighting the llama": {
197197 "anchor": "maybe-sighting-the-llama",
198 "filename": "eg/Ancient Llama Sightings.md"
199 }
198 "filenames": ["eg/Ancient Llama Sightings.md"],
199 }
200 })
201
202 def test_output_refdex_with_overlap(self):
203 # Both of these files contain an entry called "Llamas: It's Time to Spot Them".
204 # The refdex is created with entries pointing to all files where the entry occurs.
205 main(['eg/Recent Llama Sightings.md', 'eg/Referenced Llama Sightings.md', '--output-refdex'])
206 data = json.loads(sys.stdout.getvalue())
207 self.assertDictEqual(data, {
208 "2 Llamas Spotted Near Mall": {
209 "anchor": "2-llamas-spotted-near-mall",
210 "filenames": [
211 "eg/Recent Llama Sightings.md",
212 ]
213 },
214 "A Possible Llama Under the Bridge": {
215 "anchor": "a-possible-llama-under-the-bridge",
216 "filenames": [
217 "eg/Recent Llama Sightings.md",
218 ],
219 },
220 "Llamas: It's Time to Spot Them": {
221 "anchor": "llamas-its-time-to-spot-them",
222 "filenames": [
223 "eg/Recent Llama Sightings.md",
224 "eg/Referenced Llama Sightings.md"
225 ]
226 },
227 })
228
229 def test_output_refdex_with_overlap_forcing_single_filename(self):
230 # Both of these files contain an entry called "Llamas: It's Time to Spot Them"
231 # The refdex is created pointing only to the file that was mentioned last.
232 main(['eg/Recent Llama Sightings.md', 'eg/Referenced Llama Sightings.md', '--output-refdex', '--output-refdex-single-filename'])
233 data = json.loads(sys.stdout.getvalue())
234 self.assertDictEqual(data, {
235 "2 Llamas Spotted Near Mall": {
236 "anchor": "2-llamas-spotted-near-mall",
237 "filename": "eg/Recent Llama Sightings.md",
238 },
239 "A Possible Llama Under the Bridge": {
240 "anchor": "a-possible-llama-under-the-bridge",
241 "filename": "eg/Recent Llama Sightings.md",
242 },
243 "Llamas: It's Time to Spot Them": {
244 "anchor": "llamas-its-time-to-spot-them",
245 "filename": "eg/Referenced Llama Sightings.md",
246 },
200247 })
201248
202249 def test_input_refdex_output_markdown(self):