[plug] File Synchronize - Open Source

Cameron Patrick cameron at patrick.wattle.id.au
Wed Aug 25 23:17:33 WST 2004


James Devenish wrote:

> Phew and Thank You! (I'm a declared Unison advocate, esp. as opposed to
> using rsync for regular bi-directional synchronisation. Rsync is great
> as a widely-deployed mechanism for mirroring, of course.)

Aha.  Well, now that I have you cowering in the corner facing
unspecified implements of prescriptive grammarian doom, you may be able
to help me with something that's been bugging me about Unison before I
go and re-implement half of it in Python :-)

Is there an easy way to get Unison to not stat() every file in the
replica?  I have a multi-hundred-MB replica (a subset of my e-mail, in
Maildir format) consisting of many thousands of files, which I
synchronise regularly between the mail server and my laptop.  It takes
quite a while (more than a minute if the results aren't in cache) for
Unison to look for changes on each machine.

One of the guarantees that the Maildir format makes is that file
contents are never altered without the file name also changing.  So in
principle, Unison should be able to get away with just a readdir(),
which is enormously faster[1].  In fact, it can do better: if a
directory's mtime hasn't changed, no entry has been added to, removed
from or renamed in that directory, and given the Maildir guarantee that
means nothing in it has changed.  A lot of the time, nothing will have
changed in most folders, so this should be a big win.
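
A rough Python sketch of the idea above, not anything Unison itself
does: one stat() per directory, and a readdir() only when the directory
mtime differs from the one remembered from the previous run.  The
function name scan_maildir and the shape of the cache dict are made up
for illustration, and it only looks at the cur/ and new/ subdirectories
of a single Maildir folder.

```python
import os

def scan_maildir(root, cache):
    """Report names added to / removed from cur/ and new/ under root.

    cache (hypothetical layout) maps a directory path to a tuple
    (mtime_ns, frozenset_of_names) remembered from the previous scan,
    and is updated in place.  Returns {path: (added, removed)} for the
    directories whose mtime changed or that were not in the cache.
    """
    changes = {}
    for sub in ("cur", "new"):
        path = os.path.join(root, sub)
        mtime = os.stat(path).st_mtime_ns  # one stat() per directory
        old = cache.get(path)
        if old is not None and old[0] == mtime:
            continue  # mtime unchanged: skip the readdir() entirely
        names = frozenset(os.listdir(path))  # readdir() only, no per-file stat()
        old_names = old[1] if old is not None else frozenset()
        changes[path] = (names - old_names, old_names - names)
        cache[path] = (mtime, names)
    return changes
```

Because Maildir renames a message whenever its flags change, a changed
message shows up here as one name removed and another added.  One caveat
with this sketch: on filesystems with coarse timestamps, a change made
within the same mtime tick as the previous scan could be missed.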

Cameron.

[1] Statistically meaningless test: on my server at home, it took
~2.5s to do an 'ls >/dev/null' vs ~110s to do an 'ls -l >/dev/null' on
my PLUG archive folder with 26000 messages in it.  However, this one
folder is marginally larger than all the mail that I regularly sync
with Unison :-)
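
For anyone wanting to repeat the comparison in [1] without ls, here is
a small Python sketch that times a bare directory listing against a
listing followed by an lstat() of every entry.  The function name
time_listing is made up for illustration, and the numbers will depend
heavily on whether the directory and inodes are already in cache.

```python
import os
import time

def time_listing(path):
    """Return (entry_count, listdir_seconds, lstat_seconds) for path."""
    t0 = time.perf_counter()
    names = os.listdir(path)  # roughly what plain 'ls' needs
    t1 = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(path, name))  # the extra per-file work of 'ls -l'
    t2 = time.perf_counter()
    return len(names), t1 - t0, t2 - t1
```

Calling time_listing() on a large Maildir cur/ directory straight after
a reboot should give the cold-cache figures.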



