git @ Cat's Eye Technologies yastasoti / 63dab22
Some updates to documentation Chris Pressey 4 years ago
2 changed file(s) with 25 addition(s) and 6 deletion(s).
 
 Was split off from Feedmark, which doesn't itself need to support this function.
 
-Features:
+### Features ###
 
 * input is a JSON list of objects containing links (such as those produced by Feedmark)
+* output is a JSON list of objects that could not be retrieved, which can be fed back
+  into the script as input
+* checks links with `HEAD` requests by default; if `--archive-links-to` is given,
+  fetches a copy of each resource with `GET` and saves it to disk
 * tries to be idempotent and not create a new local file if the remote file hasn't changed
-* planned: archive youtube links with youtube-dl.
-* TODO: logging
-* TODO: Handle redirects (301, 302) better when archiving external links.(?)
+* handles links that are local files; checks if the file exists locally
 
-Example:
-
+#### Planned features ####
+
+* archive youtube links with youtube-dl.
+* logging
+* ignore certain URLs
+* Handle failures (redirects, etc) better. Fall back to external tool like `wget` or `curl`.
+
+### Examples ###
+
+Check that the links in a set of Feedmark documents all resolve:
+
+    feedmark --output-links article/*.md | yastasoti --article-root=article/ - | tee results.json
+
+Since no `--archive-links` options were given, this will make only `HEAD`
+requests to check that the resources exist. It will not fetch them.
+
+Archive stuff off teh internets:
+
     yastasoti --archive-links-to=downloads links.json
 
 If it is only desired that the links be checked, `--check-links` will
+requests==2.17.3
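
The default behaviour documented in this change (check each link with a `HEAD` request, treat non-URL links as local files, and output the links that could not be retrieved) can be illustrated with a minimal sketch. This is not yastasoti's implementation: it uses only the standard library instead of `requests`, and the names `head_status`, `check_links`, `fetch_status`, and the `"url"` key on each link object are assumptions made for illustration.

```python
import os
import urllib.error
import urllib.request


def head_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def check_links(links, article_root=".", fetch_status=head_status):
    """Return the subset of `links` that could not be resolved.

    Each link is assumed to be a dict with a "url" key (hypothetical schema).
    """
    failures = []
    for link in links:
        url = link["url"]
        if url.startswith(("http://", "https://")):
            status = fetch_status(url)
            ok = status is not None and 200 <= status < 400
        else:
            # Local link: only check that the file exists under article_root.
            ok = os.path.exists(os.path.join(article_root, url))
        if not ok:
            failures.append(link)
    return failures
```

Because the result has the same shape as the input, it could be serialized with `json.dump` and passed through `check_links` again later, which mirrors the "fed back into the script as input" behaviour the README describes.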