Re: [zfs-discuss] Building an On-Site and Off-Site ZFS server, replication question

Jim Klimov Fri, 05 Oct 2012 01:36:31 -0700

2012-10-05 11:17, Tiernan OToole wrote:

Also, as a follow-up question, but slightly unrelated: when it comes to
the ZFS send, I could use SSH to do the send directly to the machine...
Or I could upload the compressed, and possibly encrypted, dump to the
server... Which would be suggested for resumability and speed? And
if I were to go with an upload option, any suggestions on what I should
use?


As for this, the answer depends on network bandwidth, reliability,
and snapshot file size - ultimately, on the probability and retry
cost of an error during transmission.

Many posters on the list strongly object to using files as storage
for snapshot streams, because their reliability is (or may be) worse
than that of a single-disk pool with bitrot on it: a single-bit error
in a snapshot file can render it, and all newer snapshots that depend
on it, invalid and unreceivable.

Still, given enough scratch space on the sending and receiving sides
and a bad (slow, glitchy) network in between, I did go with compressed
files of zfs-send streams (perhaps doing the recursion myself and using
smaller files of one snapshot each - YMMV). For compression on multi-CPU
senders I can strongly suggest "pigz --fast $filename" (I did have
problems in pigz-1.7.1 compressing several files with one command;
maybe that's fixed now). If you're tighter on space/transfer size than
on CPU, you can try other parallel compressors - pbzip2, p7zip, etc.
Likewise, you can also pass the file through an encryptor of your choice.
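
A minimal sketch of the "one file per snapshot" approach described
above, assuming a POSIX shell and a single dataset. The function names,
the COMPRESSOR variable and the file naming scheme are made up for
illustration; only the zfs and pigz invocations themselves come from
the tools. The first snapshot is dumped as a full stream and each later
one as an increment from its predecessor, then every file is compressed
separately (one file per pigz call).

```shell
#!/bin/sh
# Illustrative sketch, not a finished backup script.
# COMPRESSOR is an assumed knob; it must compress a file in place
# the way "pigz --fast FILE" does (FILE -> FILE.gz).
COMPRESSOR="${COMPRESSOR:-pigz --fast}"

# snapshot_file_name pool/fs@snap  ->  pool_fs@snap.zfs
# (hypothetical naming scheme: slashes become underscores)
snapshot_file_name() {
    printf '%s.zfs\n' "$(printf '%s' "$1" | tr '/' '_')"
}

# dump_snapshots DATASET OUTDIR
# Writes one compressed stream file per snapshot of DATASET itself;
# snapshots of child datasets are skipped.
dump_snapshots() {
    ds_dataset=$1
    ds_outdir=$2
    ds_prev=""
    for ds_snap in $(zfs list -H -o name -s creation -t snapshot -r "$ds_dataset"); do
        case $ds_snap in
            "$ds_dataset"@*) ;;      # snapshot of the dataset itself
            *) continue ;;           # belongs to a child dataset
        esac
        ds_out="$ds_outdir/$(snapshot_file_name "$ds_snap")"
        if [ -z "$ds_prev" ]; then
            # oldest snapshot: full stream
            zfs send "$ds_snap" > "$ds_out" || return 1
        else
            # later snapshots: increment from the previous one
            zfs send -i "$ds_prev" "$ds_snap" > "$ds_out" || return 1
        fi
        # compress each file on its own, one file per command
        $COMPRESSOR "$ds_out" || return 1
        ds_prev=$ds_snap
    done
}
```

It would be called as, say, "dump_snapshots tank/data /scratch/out"
(both names hypothetical). Writing the raw stream to a file first costs
scratch space, but it lets the script check the exit status of zfs send
before compressing.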

Then I can rsync these files to the receiving server, using "rsync -c"
and/or md5sum, sha256sum, sha1sum or whatever tool(s) you like
to validate that the files received match those sent - better safe
than sorry. I usually use "rsync -cavPHK" for any work, which
gives you retryable transfers in case the network goes down, a progress
display, directory recursion and hardlink support, among other things.
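
As a rough sketch of the transfer-and-validate step, assuming a
sha256sum that accepts "-c" (as the GNU coreutils one does): the
function names, the SHA256SUMS file name and the "*.gz" pattern are
assumptions for illustration. The sender writes a checksum manifest
next to the stream files, rsync carries both across, and the receiver
re-checks every file against the manifest.

```shell
#!/bin/sh
# Illustrative sketch; function and file names are hypothetical.

# make_manifest DIR
# Sender side: record SHA-256 sums of all compressed streams in DIR.
make_manifest() {
    ( cd "$1" && sha256sum *.gz > SHA256SUMS )
}

# verify_manifest DIR
# Receiver side: exits non-zero if any listed file is missing
# or does not match its recorded checksum.
verify_manifest() {
    ( cd "$1" && sha256sum -c SHA256SUMS )
}

# push_streams SRCDIR DEST
# DEST is an rsync destination such as user@host:/scratch/in.
# Re-running the same command after a network failure retries
# the transfer, reusing partially sent files (-P keeps them).
push_streams() {
    rsync -cavPHK "$1"/ "$2"/
}
```

A typical sequence would be make_manifest and push_streams on the
sender, then verify_manifest on the receiver, repeating the push for
as long as verification fails.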

NFS is also retryable if so configured (even if the receiver gets
rebooted in the process), and if you, for example, already have a
VPN between the two sites, you might find it faster than rsync over
SSH, which adds its own layer of encryption - maybe redundant in the
VPN case.

Once the scratch area on the receiver holds the validated
compressed snapshot stream, I can gzcat it and pipe it into zfs recv.
This ultimately validates that the zfs-send stream arrived intact
and is fully receivable, and only then can I delete the temporary
files involved - or retry the send from an earlier step (it is
possible that the initial file was corrupted in RAM, etc.)
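
The "delete only after a successful receive" rule can be sketched
roughly as below. The function name is hypothetical, and "gzip -dc" is
used as an assumed equivalent of gzcat for systems that lack the
latter. The file is removed only if both the archive check and zfs recv
succeed; otherwise it is left in place for a retry.

```shell
#!/bin/sh
# Illustrative sketch; receive_stream is a hypothetical helper.

# receive_stream FILE TARGET
# FILE   - compressed zfs-send stream in the scratch area
# TARGET - dataset to receive into
receive_stream() {
    # cheap check first: is the compressed file itself intact?
    gzip -t "$1" || return 1
    # the pipeline's exit status is that of zfs recv, the last command
    gzip -dc "$1" | zfs recv "$2" || return 1
    # reached only when the stream was fully received
    rm -f "$1"
}
```

For incremental files the calls have to be made in snapshot order,
since each increment only applies on top of the snapshot before it.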

Note that such an approach via files essentially disables the zfs-send
deduplication which may be available in the protocol between two
active zfs commands, but AFAIK this does not preclude you from
receiving data into deduped datasets - local dedup happens upon
block writes anyway, like compression, encryption and the like.

HTH,
//Jim Klimov

