git @ Cat's Eye Technologies ellsync / 0.2
Merge pull request #1 from cpressey/develop-0.2: Develop 0.2
Chris Pressey authored 1 year, 6 months ago; GitHub committed 1 year, 6 months ago
4 changed file(s) with 287 addition(s) and 82 deletion(s).
0 ellsync
1 =======
0 `ellsync`
1 =========
22
3 ellsync is an opinionated poka-yoke for rsync.
3 _Version 0.2_
4 | _Entry_ [@ catseye.tc](https://catseye.tc/node/ellsync)
5 | _See also:_ [yastasoti](https://github.com/catseye/yastasoti)
46
5 * opinionated: it was designed for a particular use case for rsync
7 - - - -
8
9 <img align="right" src="images/ellsync-logo.png?raw=true" />
10
11 **`ellsync`** is an opinionated poka-yoke for rsync.
12
13 * [opinionated][]: it was designed for a particular use case for rsync
614 (offline backups).
7 * poka-yoke: it exposes a restricted interface to rsync, which
15 * [poka-yoke][]: it exposes a restricted interface to rsync, which
816 prevents using it in dangerous ways.
917
10 As a side-effect it also provides some convenience, as the restricted
11 interface can be accessed by shorthand form instead of verbosely.
18 Because the restricted interface that `ellsync` presents can be accessed
19 by shorthand form, it also happens to provide some convenience over
20 using `rsync` directly — but its real purpose is to increase safety.
21 (I've been burned more than once when I've made a mistake using `rsync`.)
1222
13 ellsync's operation is based on a *backup router* which is a JSON file
23 Quick start
24 -----------
25
26 Make sure you have Python (2.7 or 3.x) installed, clone this repository,
27 and put its `bin` directory on your executable search path. You will
28 then be able to run `ellsync` from your terminal.
29
30 Usage guide
31 -----------
32
33 ### Backup router
34
35 `ellsync`'s operation is based on a *backup router* which is a JSON file
1436 that looks like this:
1537
1638 {
2850 are bona fide changes, but any change to the contents of the cache can be
2951 discarded.
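
As a rough illustration of how a router file might be consumed, here is a minimal sketch. It assumes only what the script in this change reads from each stream, namely a `from` entry and a `to` entry. The helper name `load_router` and the validation step are hypothetical and are not part of `ellsync`.

```python
import json
import os
import tempfile


def load_router(path):
    """Read a router file and check that each stream has 'from' and 'to'."""
    with open(path, "r") as f:
        router = json.load(f)
    for name, stream in router.items():
        for key in ("from", "to"):
            if key not in stream:
                raise ValueError("Stream '{}' lacks a '{}' entry".format(name, key))
    return router


# Hypothetical router with a single stream named "art".
example = {
    "art": {
        "from": "/home/user/art/",            # canonical
        "to": "/media/user/External1/art/",   # cache
    }
}

with tempfile.TemporaryDirectory() as tmp:
    router_path = os.path.join(tmp, "router.json")
    with open(router_path, "w") as f:
        json.dump(example, f, indent=4)
    loaded = load_router(router_path)

print(loaded["art"]["from"], "=>", loaded["art"]["to"])
```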
3052
31 With this router saved as `router.json` we can then say
53 ### `syncdirs` command
3254
33 ellsync router.json /home/user/art/ /media/user/External1/art/
55 With the above router saved as `router.json` we can then say
56
57 ellsync router.json syncdirs /home/user/art/ /media/user/External1/art/
3458
3559 and this will in effect run
3660
4165 involved will often remain in the filesystem cache, meaning a subsequent
4266 actual run will go quite quickly. To do that actual run, use `--apply`:
4367
44 ellsync router.json /home/user/art/ /media/user/External1/art/ --apply
68 ellsync router.json syncdirs /home/user/art/ /media/user/External1/art/ --apply
4569
4670 Note that if we try
4771
48 ellsync router.json /media/user/External1/art/ /home/user/art/
72 ellsync router.json syncdirs /media/user/External1/art/ /home/user/art/
4973
5074 this is rejected as an error, because the direction of
5175 the backup stream is always from canonical to cache.
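
The direction check can be sketched as a prefix match against each stream, closely following the loop in the `syncdirs` function of this change. The function name `find_stream` and the exact error wording are assumptions made for illustration.

```python
def find_stream(router, from_dir, to_dir):
    """Return the name of the stream that allows syncing from_dir to to_dir."""
    for name, stream in router.items():
        if from_dir.startswith(stream["from"]) and to_dir.startswith(stream["to"]):
            from_suffix = from_dir[len(stream["from"]):]
            to_suffix = to_dir[len(stream["to"]):]
            # Both sides must name the same subdirectory of the stream.
            if from_suffix != to_suffix:
                raise ValueError(
                    "Subdirectories differ: {!r} vs {!r}".format(from_suffix, to_suffix)
                )
            return name
    raise ValueError(
        "Stream {} => {} was not found in router".format(from_dir, to_dir)
    )


router = {"art": {"from": "/home/user/art/", "to": "/media/user/External1/art/"}}

# Canonical to cache matches the "art" stream.
print(find_stream(router, "/home/user/art/", "/media/user/External1/art/"))

# Cache to canonical matches no stream, so it is refused.
try:
    find_stream(router, "/media/user/External1/art/", "/home/user/art/")
except ValueError as err:
    print("refused:", err)
```

Because a reversed pair of directories never matches a stream's `from` and `to` prefixes, it falls through to the "not found" error rather than needing a separate direction test.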
5276
53 Various other configurations are prevented. You may have noticed that rsync
54 is sensitive about whether a directory name ends in a slash or not. ellsync
77 Various other configurations are prevented. You may have noticed that `rsync`
78 is sensitive about whether a directory name ends in a slash or not. `ellsync`
5579 detects when a trailing slash is missing and adds it. Thus
5680
57 ellsync router.json /media/user/External1/art /home/user/art/
81 ellsync router.json syncdirs /media/user/External1/art /home/user/art/
5882
5983 is still interpreted as
6084
6387 (but note that the directories in the router do need to have the
6488 trailing slashes.)
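
For background: with `rsync`, a source path ending in a slash means "the contents of this directory", while the same path without the slash means "the directory itself". The normalisation described above might look like the sketch below. The body of the script's own `clean_dir` is not shown in this change, so `ensure_trailing_slash` is a hypothetical stand-in rather than the actual implementation.

```python
def ensure_trailing_slash(dirname):
    """Append a trailing slash to dirname unless it already has one."""
    if not dirname.endswith("/"):
        dirname += "/"
    return dirname


print(ensure_trailing_slash("/home/user/art"))
print(ensure_trailing_slash("/home/user/art/"))
```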
6589
66 Since this configuration is named in the router, we don't even have to
67 give these directory names. We can just give the name of the stream,
68 followed by a colon (more on that in a second):
69
70 ellsync router.json art:
90 ### `list` command
7191
7292 Either the canonical or the cache (or both) may be offline storage (removable
7393 media), therefore neither directory is assumed to exist (it might not exist
7494 if the volume is not mounted.) If either of the directories does not exist,
75 ellsync will refuse to use this backup stream. Based on this, there is a
95 `ellsync` will refuse to use this backup stream. Based on this, there is a
7696 subcommand to list which streams are, at the moment, backupable:
7797
7898 ellsync router.json list
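
The test behind `list` is simply whether both directories of a stream exist right now. A minimal sketch, using the same `os.path.isdir` check as the `list_` function in this change; the helper name `backupable_streams` and the stream names are made up for the example.

```python
import os
import tempfile


def backupable_streams(router):
    """Return the names of streams whose 'from' and 'to' directories both exist."""
    return [
        name
        for name, stream in router.items()
        if os.path.isdir(stream["from"]) and os.path.isdir(stream["to"])
    ]


with tempfile.TemporaryDirectory() as tmp:
    canonical = os.path.join(tmp, "canonical")
    cache = os.path.join(tmp, "cache")
    os.mkdir(canonical)
    os.mkdir(cache)
    router = {
        "mounted": {"from": canonical + "/", "to": cache + "/"},
        # The 'from' directory of this stream was never created,
        # which stands in for a volume that is not mounted.
        "unmounted": {"from": os.path.join(tmp, "missing") + "/", "to": cache + "/"},
    }
    available = backupable_streams(router)

print(available)
```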
7999
100 ### `sync` command
101
102 Since each stream configuration is named in the router, we don't even have to
103 give these directory names. We can use the `sync` command where we give
104 just the name of the stream, followed by a colon (more on that in a second):
105
106 ellsync router.json sync art:
107
80108 Also, since the contents of the canonical and the cache normally
81 have the same directory structure, ellsync allows specifying that
109 have the same directory structure, `ellsync` allows specifying that
82110 only a subdirectory of a stream is to be synced:
83111
84 ellsync router.json /home/user/art/painting/ /media/user/External1/art/painting/
112 ellsync router.json sync /home/user/art/painting/ /media/user/External1/art/painting/
85113
86114 This is of course allowed only as long as it is the same subdirectory.
87115 This will fail:
88116
89 ellsync router.json /home/user/art/painting/ /media/user/External1/art/sculpture/
117 ellsync router.json sync /home/user/art/painting/ /media/user/External1/art/sculpture/
90118
91119 And this can be combined with the short, name-the-stream syntax, and
92120 explains why there is a colon in it:
93121
94 ellsync router.json art:painting/
122 ellsync router.json sync art:painting/
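
The colon syntax can be read as "stream name, then an optional subdirectory that is applied to both sides". A minimal sketch of that expansion, modelled on the `sync` function in this change; the name `resolve` is hypothetical, and the sketch splits on the first colon only, which is a small deviation from the script.

```python
import os


def resolve(router, spec):
    """Expand 'stream:subdir' into a (from_dir, to_dir) pair."""
    if ":" not in spec:
        raise ValueError("Argument must be of the form stream:subdir")
    stream_name, subdir = spec.split(":", 1)
    stream = router[stream_name]
    from_dir, to_dir = stream["from"], stream["to"]
    if subdir:
        # The same subdirectory is appended to both sides, so the two
        # directories can never name different subtrees.
        from_dir = os.path.join(from_dir, subdir)
        to_dir = os.path.join(to_dir, subdir)
    return from_dir, to_dir


router = {"art": {"from": "/home/user/art/", "to": "/media/user/External1/art/"}}
print(resolve(router, "art:"))
print(resolve(router, "art:painting/"))
```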
123
124 ### `rename` command
125
126 Sometimes you want to rename a subdirectory somewhere under the canonical of
127 one of the streams. It's completely fine to do this, but the next time it is synced,
128 `rsync` will treat it, in the cache, as the old subdirectory being deleted and
129 a new subdirectory being created. If there are a large number of files in the
130 subdirectory, this delete-and-create sync can take a long time. It's also not
131 obvious from `rsync`'s logging output that everything being deleted is also being
132 created somewhere else.
133
134 To ease this situation, `ellsync` has a `rename` command that works like so:
135
136 ellsync router.json rename art: sclupture sculpture
137
138 This renames the `/media/user/External1/art/sclupture` directory to
139 `/media/user/External1/art/sculpture` and *also* renames the `/home/user/art/sclupture`
140 directory to `/home/user/art/sculpture`. If the contents of the source and
141 destination directories were in sync before this rename occurred, they will
142 continue to be in sync after the rename happens.
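
One way to keep the two sides consistent is to make every check before renaming anything, so that a failure leaves both directories untouched. The sketch below illustrates that ordering. The helper name `rename_in_both` is hypothetical, and it is a simplified stand-in for the `rename` function in this change, not a copy of it.

```python
import os
import tempfile


def rename_in_both(from_dir, to_dir, old_name, new_name):
    """Rename old_name to new_name under both from_dir and to_dir."""
    pairs = [
        (os.path.join(base, old_name), os.path.join(base, new_name))
        for base in (from_dir, to_dir)
    ]
    # Check both sides first; nothing is renamed unless every check passes.
    for old_path, new_path in pairs:
        if not os.path.isdir(old_path):
            raise ValueError("Directory '{}' is not present".format(old_path))
        if os.path.exists(new_path):
            raise ValueError("Directory '{}' already exists".format(new_path))
    for old_path, new_path in pairs:
        os.rename(old_path, new_path)


with tempfile.TemporaryDirectory() as tmp:
    canonical = os.path.join(tmp, "canonical")
    cache = os.path.join(tmp, "cache")
    os.makedirs(os.path.join(canonical, "sclupture"))
    os.makedirs(os.path.join(cache, "sclupture"))
    rename_in_both(canonical, cache, "sclupture", "sculpture")
    renamed_ok = (
        os.path.isdir(os.path.join(canonical, "sculpture"))
        and os.path.isdir(os.path.join(cache, "sculpture"))
    )

print(renamed_ok)
```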
143
144 Hints and Tips
145 --------------
95146
96147 You might have a router you use almost always, in which case you might
97148 want to establish an alias like
100151
101152 (or whatever.)
102153
103 Note
104 ----
154 Notes
155 -----
105156
106157 If `rsync` encounters an error, it will abort, having only partially completed.
107158 In particular, if it encounters a directory which it cannot read, because it
108159 is for example owned by another user and not world-readable, it will abort.
109160 `ellsync` does not currently detect this properly (if it is detectable (I hope
110161 that it is!))
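
One possible way to detect such failures is to inspect the exit status of the child process, on the assumption that `rsync` reports them through a nonzero status (its manual lists 23 as "partial transfer due to error"). The sketch below is not part of `ellsync`. The helper `run_and_check` is hypothetical, and a small Python one-liner stands in for `rsync` so that the example runs anywhere.

```python
import subprocess
import sys


def run_and_check(argv):
    """Run argv, echo its output, and raise if it exits with a nonzero status."""
    p = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    for line in p.stdout:
        sys.stdout.write(line.decode("utf-8", "replace"))
        sys.stdout.flush()
    status = p.wait()
    if status != 0:
        raise RuntimeError("command exited with status {}".format(status))
    return status


# Stand-in for an rsync run that fails part-way through.
failing_command = [sys.executable, "-c", "import sys; sys.exit(23)"]
try:
    run_and_check(failing_command)
except RuntimeError as err:
    failure = str(err)

print(failure)
```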
162
163 History
164 -------
165
166 ### 0.2
167
168 Every piece of `ellsync` functionality now has an explicit subcommand (`list` and `sync` to
169 start.)
170
171 `sync` was split into `sync` (takes a stream) and `syncdirs` (takes to and
172 from dirs).
173
174 Added `rename` command.
175
176 ### 0.1
177
178 Initial release.
179
180 [opinionated]: https://softwareengineering.stackexchange.com/questions/12182/what-does-opinionated-software-really-mean
181 [poka-yoke]: https://en.wikipedia.org/wiki/Poka-yoke
Binary diff not shown
22 import os
33 import sys
44 from subprocess import Popen, STDOUT, PIPE
5
6
7 # - - - - utilities - - - -
58
69
710 def clean_dir(dirname):
1013 return dirname
1114
1215
13 def main(args):
16 def perform_sync(from_dir, to_dir, dry_run=True):
17 for d in (from_dir, to_dir):
18 if not os.path.isdir(d):
19 raise ValueError("Directory '{}' is not present".format(d))
20 rsync_options = '--dry-run ' if dry_run else ''
21 cmd = 'rsync {}--archive --verbose --delete "{}" "{}"'.format(rsync_options, from_dir, to_dir)
22 sys.stdout.write(cmd + '\n')
23 try:
24 p = Popen(cmd, shell=True, stderr=STDOUT, stdout=PIPE, encoding='utf-8')
25 decode_line = lambda line: line
26 except TypeError:
27 # python 2.x
28 p = Popen(cmd, shell=True, stderr=STDOUT, stdout=PIPE)
29 decode_line = lambda line: line.decode('utf-8')
30 pipe = p.stdout
31 for line in p.stdout:
32 sys.stdout.write(decode_line(line))
33 sys.stdout.flush()
34 p.wait()
1435
36
37 # - - - - commands - - - -
38
39
40 def list_(router, args):
41 for stream_name, stream in router.items():
42 if os.path.isdir(stream['from']) and os.path.isdir(stream['to']):
43 print("{}: {} => {}".format(stream_name, stream['from'], stream['to']))
44
45
46 def sync(router, args):
1547 argparser = ArgumentParser()
16
17 argparser.add_argument('router', metavar='ROUTER', type=str,
18 help='JSON file containing the backup router description'
48 argparser.add_argument('stream_name', metavar='STREAM', type=str,
49 help='Name of stream (or stream:subdirectory) to sync contents across'
1950 )
51 argparser.add_argument('--apply', default=False, action='store_true',
52 help='Actually run the rsync command'
53 )
54 options = argparser.parse_args(args)
55
56 if ':' in options.stream_name:
57 stream_name, subdir = options.stream_name.split(':')
58 else:
59 raise NotImplementedError("Arg must be stream:subdir")
60 stream = router[stream_name]
61 from_dir = stream['from']
62 to_dir = stream['to']
63 if subdir:
64 from_dir = os.path.join(from_dir, subdir)
65 to_dir = os.path.join(to_dir, subdir)
66
67 from_dir = clean_dir(from_dir)
68 to_dir = clean_dir(to_dir)
69
70 perform_sync(from_dir, to_dir, dry_run=(not options.apply))
71
72
73 def syncdirs(router, args):
74 argparser = ArgumentParser()
2075 argparser.add_argument('from_dir', metavar='FROM_DIR', type=str,
2176 help='Canonical directory to sync contents from, or name of stream to use'
2277 )
2782 argparser.add_argument('--apply', default=False, action='store_true',
2883 help='Actually run the rsync command'
2984 )
30
3185 options = argparser.parse_args(args)
86
87 from_dir = clean_dir(options.from_dir)
88 to_dir = clean_dir(options.to_dir)
89 selected_stream_name = None
90 for stream_name, stream in router.items():
91 if from_dir.startswith(stream['from']) and to_dir.startswith(stream['to']):
92 from_suffix = from_dir[len(stream['from']):]
93 to_suffix = to_dir[len(stream['to']):]
94 if from_suffix != to_suffix:
95 raise ValueError( (from_suffix, to_suffix) )
96 selected_stream_name = stream_name
97 break
98 if selected_stream_name is None:
99 raise ValueError("Stream {} => {} was not found in router".format(from_dir, to_dir))
100
101 perform_sync(from_dir, to_dir, dry_run=(not options.apply))
102
103
104 def rename(router, args):
105 argparser = ArgumentParser()
106 argparser.add_argument('stream_name', metavar='STREAM', type=str,
107 help='Name of stream to operate under'
108 )
109 argparser.add_argument('existing_subdir_name', metavar='DIRNAME', type=str,
110 help='Existing subdirectory to be renamed'
111 )
112 argparser.add_argument('new_subdir_name', metavar='DIRNAME', type=str,
113 help='New name for subdirectory'
114 )
115 options = argparser.parse_args(args)
116
117 stream_name = options.stream_name
118 if ':' in stream_name:
119 stream_name, subdir = options.stream_name.split(':')
120 assert subdir == ''
121
122 stream = router[stream_name]
123 from_dir = stream['from']
124 to_dir = stream['to']
125
126 existing_subdir_a = clean_dir(os.path.join(from_dir, options.existing_subdir_name))
127 new_subdir_a = clean_dir(os.path.join(from_dir, options.new_subdir_name))
128
129 if not os.path.isdir(existing_subdir_a):
130 raise ValueError("Directory '{}' is not present".format(existing_subdir_a))
131 if os.path.isdir(new_subdir_a):
132 raise ValueError("Directory '{}' already exists".format(new_subdir_a))
133
134 existing_subdir_b = clean_dir(os.path.join(to_dir, options.existing_subdir_name))
135 new_subdir_b = clean_dir(os.path.join(to_dir, options.new_subdir_name))
136
137 if not os.path.isdir(existing_subdir_b):
138 raise ValueError("Directory '{}' is not present".format(existing_subdir_b))
139 if os.path.isdir(new_subdir_b):
140 raise ValueError("Directory '{}' already exists".format(new_subdir_b))
141
142 print("Renaming {} to {}".format(existing_subdir_a, new_subdir_a))
143 os.rename(existing_subdir_a, new_subdir_a)
144 print("Renaming {} to {}".format(existing_subdir_b, new_subdir_b))
145 os.rename(existing_subdir_b, new_subdir_b)
146
147
148 # - - - - driver - - - -
149
150
151 def main(args):
152 argparser = ArgumentParser()
153
154 argparser.add_argument('router', metavar='ROUTER', type=str,
155 help='JSON file containing the backup router description'
156 )
157 argparser.add_argument('command', metavar='COMMAND', type=str,
158 help='The action to take. One of: list, sync, syncdirs, rename'
159 )
160
161 options, remaining_args = argparser.parse_known_args(args)
32162
33163 with open(options.router, 'r') as f:
34164 router = json.loads(f.read())
35165
36 if options.to_dir is None:
37 if ':' in options.from_dir:
38 stream_name, subdir = options.from_dir.split(':')
39 else:
40 command = options.from_dir
41 if command == 'list':
42 for stream_name, stream in router.items():
43 if os.path.isdir(stream['from']) and os.path.isdir(stream['to']):
44 print("{}: {} => {}".format(stream_name, stream['from'], stream['to']))
45 sys.exit(0)
46 else:
47 raise NotImplementedError("Arg must be stream:subdir or command; command must be one of: list")
48 stream = router[stream_name]
49 from_dir = stream['from']
50 to_dir = stream['to']
51 if subdir:
52 from_dir = clean_dir(os.path.join(from_dir, subdir))
53 to_dir = clean_dir(os.path.join(to_dir, subdir))
166 if options.command == 'list':
167 list_(router, remaining_args)
168 elif options.command == 'sync':
169 sync(router, remaining_args)
170 elif options.command == 'syncdirs':
171 syncdirs(router, remaining_args)
172 elif options.command == 'rename':
173 rename(router, remaining_args)
54174 else:
55 from_dir = clean_dir(options.from_dir)
56 to_dir = clean_dir(options.to_dir)
57 selected_stream_name = None
58 for stream_name, stream in router.items():
59 if from_dir.startswith(stream['from']) and to_dir.startswith(stream['to']):
60 from_suffix = from_dir[len(stream['from']):]
61 to_suffix = to_dir[len(stream['to']):]
62 if from_suffix != to_suffix:
63 raise ValueError( (from_suffix, to_suffix) )
64 selected_stream_name = stream_name
65 break
66 if selected_stream_name is None:
67 raise ValueError("Stream {} => {} was not found in router".format(from_dir, to_dir))
68
69 for d in (from_dir, to_dir):
70 if not os.path.isdir(d):
71 raise ValueError("Directory '{}' is not present".format(d))
72 rsync_options = '--dry-run ' if (not options.apply) else ''
73 cmd = 'rsync {}--archive --verbose --delete "{}" "{}"'.format(rsync_options, from_dir, to_dir)
74 sys.stdout.write(cmd + '\n')
75 try:
76 p = Popen(cmd, shell=True, stderr=STDOUT, stdout=PIPE, encoding='utf-8')
77 except TypeError:
78 # python 2.x
79 p = Popen(cmd, shell=True, stderr=STDOUT, stdout=PIPE)
80 pipe = p.stdout
81 for line in p.stdout:
82 sys.stdout.write(line.decode('utf-8'))
83 sys.stdout.flush()
84 p.wait()
175 argparser.print_usage()
176 sys.exit(1)
3232 'basic': {
3333 'from': 'canonical',
3434 'to': 'cache',
35 },
36 'other': {
37 'from': 'canonical2',
38 'to': 'cache2',
3539 }
3640 }
3741 with open('backup.json', 'w') as f:
4953 main(['backup.json'])
5054
5155 def test_dry_run(self):
52 main(['backup.json', 'canonical', 'cache'])
56 main(['backup.json', 'syncdirs', 'canonical', 'cache'])
5357 self.assertFalse(os.path.exists('cache/thing'))
5458 output = sys.stdout.getvalue()
5559 self.assertEqual(output.split('\n')[0], 'rsync --dry-run --archive --verbose --delete "canonical/" "cache/"')
5660 self.assertIn('DRY RUN', output)
5761
5862 def test_apply(self):
59 main(['backup.json', 'canonical', 'cache', '--apply'])
63 main(['backup.json', 'syncdirs', 'canonical', 'cache', '--apply'])
6064 self.assertTrue(os.path.exists('cache/thing'))
6165 output = sys.stdout.getvalue()
6266 self.assertEqual(output.split('\n')[:4], [
6670 ''
6771 ])
6872
73 def test_stream(self):
74 main(['backup.json', 'sync', 'basic:', '--apply'])
75 output = sys.stdout.getvalue()
76 self.assertEqual(output.split('\n')[:4], [
77 'rsync --archive --verbose --delete "canonical/" "cache/"',
78 'sending incremental file list',
79 'thing',
80 ''
81 ])
82
83 def test_stream_not_exist(self):
84 with self.assertRaises(ValueError) as ar:
85 main(['backup.json', 'sync', 'other:', '--apply'])
86 self.assertIn("Directory 'canonical2/' is not present", str(ar.exception))
87
88 def test_rename(self):
89 check_call("mkdir -p canonical/sclupture", shell=True)
90 check_call("mkdir -p cache/sclupture", shell=True)
91 main(['backup.json', 'rename', 'basic:', 'sclupture', 'sculpture'])
92 self.assertTrue(os.path.exists('canonical/sculpture'))
93 self.assertTrue(os.path.exists('cache/sculpture'))
94
95 def test_rename_not_both_subdirs_exist(self):
96 check_call("mkdir -p canonical/sclupture", shell=True)
97 with self.assertRaises(ValueError) as ar:
98 main(['backup.json', 'rename', 'basic:', 'sclupture', 'sculpture'])
99 self.assertIn("Directory 'cache/sclupture/' is not present", str(ar.exception))
100 self.assertFalse(os.path.exists('canonical/sculpture'))
101
102 def test_rename_new_subdir_already_exists(self):
103 check_call("mkdir -p canonical/sclupture", shell=True)
104 check_call("mkdir -p canonical/sculpture", shell=True)
105 check_call("mkdir -p cache/sclupture", shell=True)
106 with self.assertRaises(ValueError) as ar:
107 main(['backup.json', 'rename', 'basic:', 'sclupture', 'sculpture'])
108 self.assertIn("Directory 'canonical/sculpture/' already exists", str(ar.exception))
109 self.assertFalse(os.path.exists('cache/sculpture'))
110
69111
70112 if __name__ == '__main__':
71113 unittest.main()