On Fri, Jun 11, 2010 at 11:48 AM, Paul Winkler <sli...@gmail.com> wrote:
> On Fri, Jun 11, 2010 at 10:34 AM, Hanno Schlichting <ha...@hannosch.eu> wrote:
>> I tend to run rsync via "rsync -rP --rsh=ssh". The Data.fs is an
>> append-only file, so rsync is very efficient at handling it. Only
>> zeopack rewrites things all across the file and causes a subsequent
>> rsync to be slow again.
> Thanks. I'll do a trial run of this today.
It seems that a second rsync isn't exactly blazing fast, even with only a
few changes at the end of the 32G Data.fs. As near as I can tell, it spends
a good 10 minutes or so just comparing the two files to see whether it has
any work to do. Once that phase is done, it seems to spend a lot of its time
in IO, since by default it builds a new copy of the file and replaces the
existing one when it's done. Total time ~ 25 minutes.
The rsync man page paid off though: using the --append option (or
--append-verify on recent enough versions of rsync) seems to reduce the IO a
lot. It's tailor-made for this use case: updating in place, when the source
file has only been appended to and when potentially losing the target file
on failure is OK. (We can manually make a pristine copy before starting our
downtime, just in case we need to do it over for any reason.)
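
To illustrate why --append cuts the IO so much, here is a rough Python
sketch of the idea. It is not rsync's actual implementation, and
`append_tail` and its parameters are made-up names for illustration. Like
--append, it assumes the target is an exact prefix of the source and never
checks that assumption:

```python
import os


def append_tail(src_path, dst_path, chunk_size=1 << 20):
    """Append to dst only the bytes of src beyond dst's current length.

    Hypothetical helper: assumes dst is an exact prefix of src (the
    append-only case). Returns the number of bytes appended.
    """
    src_size = os.path.getsize(src_path)
    dst_size = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0

    # If the target is already as long as the source (or longer),
    # there is nothing to append, so skip the file.
    if dst_size >= src_size:
        return 0

    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        # Skip the part the target already has; no comparison, no rewrite.
        src.seek(dst_size)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

The existing bytes in the target are never read or rewritten, which is
where the savings come from, and also why a failure or a repacked source
can leave the target unusable. As I read the man page, --append-verify
differs by including the existing data in the final checksum verification;
this sketch does no verification at all.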
FWIW total time for the second `rsync -z --append Data.fs` was:
Last time I had to rebuild the index file it took ~ 30 minutes, so this
looks like a win. We'll go with rsync.
ZODB-Dev mailing list - ZODB-Dev@zope.org