On 2013-02-27 05:36, Ian Collins wrote:
Bob Friesenhahn wrote:
On Wed, 27 Feb 2013, Ian Collins wrote:
I am finding that rsync with the right options (to directly
block-overwrite) plus zfs snapshots is providing me with pretty
amazing "deduplication" for backups without even enabling
deduplication in zfs. Now backup storage goes a very long way.
We do the same for all of our "legacy" operating system backups. Take a
snapshot, then do an rsync; it is an excellent way of maintaining
backups for those.
Magic rsync options used:
-a --inplace --no-whole-file --delete-excluded
This causes rsync to overwrite the file blocks in place rather than
writing to a new temporary file first. As a result, zfs COW produces
primitive "deduplication" of at least the unchanged blocks (by writing
nothing) while writing new COW blocks for the changed blocks.
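The difference between the two write strategies can be seen even without ZFS. The sketch below is only a stand-in: dd and cp/mv imitate what rsync does with and without --inplace, and all file names are made up. An in-place overwrite keeps the same file (same inode), while the temporary-file approach ends with a brand new file, so on a COW filesystem every block of it is newly written.

```shell
#!/bin/sh
# Stand-in demonstration: dd and cp/mv imitate the two write strategies.
# This is not rsync itself, and all file names are hypothetical.
work=$(mktemp -d)

# Print the inode number of a file.
inode() { ls -i "$1" | awk '{print $1}'; }

# A 64 KiB "backup" file.
dd if=/dev/zero of="$work/file" bs=1024 count=64 2>/dev/null
before=$(inode "$work/file")

# Strategy 1: overwrite part of one block in place, leaving the rest of
# the file untouched (roughly what --inplace aims for).
printf 'changed' | dd of="$work/file" bs=1024 seek=10 conv=notrunc 2>/dev/null
inplace=$(inode "$work/file")

# Strategy 2: write a complete new copy under a temporary name, then
# rename it over the original (roughly rsync's default behaviour).
cp "$work/file" "$work/.file.tmp"
mv "$work/.file.tmp" "$work/file"
renamed=$(inode "$work/file")

echo "original inode:      $before"
echo "after in-place edit: $inplace"
echo "after temp + rename: $renamed"
rm -rf "$work"
```

The first two inode numbers should match and the third should differ.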
Do these options impact performance or reduce the incremental stream sizes?
I just use -a --delete and the snapshots don't take up much space
(compared with the incremental stream sizes).
Well, to be certain, you can create a dataset with a large file in it,
snapshot it, rsync a changed variant of the file over it, snapshot again,
and compare the used and referenced sizes. If the file was rewritten into
a new temporary one and then renamed over the original, you'd likely end
up with about as much extra used storage as the size of the original file.
If only the changes are written into it "in-place", then you'd use a lot
less space (and you'd not see a .garbledfilename in the directory during
the process).
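The test described above could be scripted roughly like this. It is only a sketch: the dataset name tank/rsynctest, its mountpoint, the source directory and the file name are all assumptions, and by default the script just prints the commands (set DRY_RUN=0 to really run them, which needs ZFS privileges and about 256 MB of space).

```shell
#!/bin/sh
# Sketch of the comparison test. DATASET, MNT, SRC and big.bin are
# hypothetical names. With DRY_RUN=1 (the default) commands are only printed.
DATASET=${DATASET:-tank/rsynctest}
MNT=${MNT:-/$DATASET}
SRC=${SRC:-/var/tmp/rsynctest-src}
RSYNC_OPTS=${RSYNC_OPTS:--a --inplace --no-whole-file}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

experiment() {
    run zfs create "$DATASET"
    run mkdir -p "$SRC"
    # Large file of random data, copied into the dataset and snapshotted.
    run dd if=/dev/urandom of="$SRC/big.bin" bs=1024k count=256
    run rsync -a "$SRC/" "$MNT/"
    run zfs snapshot "$DATASET@before"
    # Change 1 MB in the middle of the source file, then rsync it over.
    run dd if=/dev/urandom of="$SRC/big.bin" bs=1024k count=1 seek=100 conv=notrunc
    # RSYNC_OPTS is left unquoted on purpose so it splits into options.
    run rsync $RSYNC_OPTS "$SRC/" "$MNT/"
    run zfs snapshot "$DATASET@after"
    # USED of the @before snapshot is the space held only by that snapshot.
    run zfs list -r -t all -o name,used,referenced "$DATASET"
}

experiment
```

To compare, destroy the test dataset and repeat with RSYNC_OPTS=-a. If the reasoning above holds, USED for the @before snapshot should be close to the whole file size in that run, and close to the size of the changed region in the --inplace run.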
If you use rsync over the network to back up stuff, here's an example of
an SMF wrapper for rsyncd, and a config sample to make a snapshot after
completion of the rsync session.
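For readers who just want the general idea of snapshotting after an rsyncd session, here is a minimal sketch of a hook script. It is not the wrapper or config mentioned above. It assumes the module in rsyncd.conf points at the script with a "post-xfer exec" line, and BACKUP_DATASET, ZFS_CMD and the snapshot naming are made up for illustration. rsyncd passes RSYNC_EXIT_STATUS (0 on success) and RSYNC_MODULE_NAME to such commands in the environment.

```shell
#!/bin/sh
# Hypothetical post-transfer hook, referenced from an rsyncd.conf module as
#   post-xfer exec = /path/to/this/script
# BACKUP_DATASET and ZFS_CMD are assumptions of this sketch.
BACKUP_DATASET=${BACKUP_DATASET:-tank/backup/host1}
ZFS_CMD=${ZFS_CMD:-zfs}

snapshot_after_xfer() {
    # Only snapshot when the transfer finished cleanly; an unset status
    # is treated as a failure.
    if [ "${RSYNC_EXIT_STATUS:-1}" -ne 0 ]; then
        echo "transfer failed (status ${RSYNC_EXIT_STATUS:-unknown}), no snapshot" >&2
        return 1
    fi
    # UTC timestamp keeps snapshot names unique and sortable.
    stamp=$(date -u +%Y%m%d-%H%M%S)
    $ZFS_CMD snapshot "$BACKUP_DATASET@rsync-$stamp"
}

# Run the hook only when started by rsyncd, which sets RSYNC_MODULE_NAME.
if [ -n "${RSYNC_MODULE_NAME:-}" ]; then
    snapshot_after_xfer
fi
```

Setting ZFS_CMD=echo is a harmless way to check what snapshot name would be used.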
zfs-discuss mailing list