On 3/5/2013 10:27 AM, Bob Friesenhahn wrote:
On Tue, 5 Mar 2013, David Magda wrote:
It's also possible to reduce how much of the file tree rsync has to
walk.

Most folks simply do a "rsync --options /my/source/ /the/dest/", but if
you use "zfs diff", and parse/feed the output of that to rsync, then the
amount of thrashing can probably be minimized. Especially useful for file
hierarchies that have very many individual files, so you don't have to
stat() every single one.

zfs diff only works for ZFS filesystems. If one is using ZFS filesystems then rsync may not be the best option. In the real world, data may be sourced from many types of systems and filesystems.


Good point. Clearly this wouldn't work for my current Linux fileserver. I'm building a replacement that will run FreeBSD 9.1 with a ZFS storage pool. My backups go to a thumper running Solaris 10 and ZFS in another department. I have an arm's-length collaboration with the department that runs the thumper, which likely precludes a direct zfs send.

Rsync has allowed us to transfer data without getting too deep into each other's system administration. I run an rsync daemon with read-only access to my filesystem that accepts connections from the thumper. They serve the backups to me via a read-only NFS export. The only problem has been the IOPS load generated by my users' millions of small files. That's why the zfs diff idea excited me, but perhaps I'm missing some simpler approach.
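
For what it's worth, here is a rough sketch of the zfs diff idea in Python. It only does the parsing step: it takes the text printed by something like "zfs diff -H tank/home@prev tank/home@now" (dataset and snapshot names are made up) and turns it into a list of paths relative to the mountpoint, suitable for rsync's --files-from option. The function names and the /tank/home mountpoint are my own placeholders, and I'm assuming the tab-separated -H format where a rename shows up as old path and new path in separate fields. It does not undo the octal escaping that zfs diff applies to non-ASCII characters in file names, so treat it as a starting point rather than something tested against real data.

```python
import os


def changed_paths(diff_output, mountpoint):
    """Split `zfs diff -H` output into (to_send, removed) path lists.

    Assumes tab-separated lines: "<change>\t<path>", or for renames
    "R\t<old path>\t<new path>". Paths are returned relative to
    `mountpoint`, which is what rsync --files-from expects.
    """
    to_send, removed = set(), set()
    for line in diff_output.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or unrecognised lines
        change = fields[0]
        rel = [os.path.relpath(p, mountpoint) for p in fields[1:]]
        if change in ("M", "+"):
            to_send.add(rel[0])
        elif change == "-":
            removed.add(rel[0])
        elif change == "R" and len(rel) == 2:
            removed.add(rel[0])   # old name no longer exists
            to_send.add(rel[1])   # new name has to be transferred
    to_send.discard(".")  # the mountpoint itself is not worth listing
    return sorted(to_send), sorted(removed)


def rsync_command(list_file, source_dir, dest):
    """Build (but do not run) an rsync command that reads the path list."""
    return ["rsync", "-a", "--files-from=" + list_file, source_dir, dest]


if __name__ == "__main__":
    sample = (
        "M\t/tank/home/alice/notes.txt\n"
        "+\t/tank/home/bob/new.dat\n"
        "-\t/tank/home/bob/old.dat\n"
        "R\t/tank/home/alice/a\t/tank/home/alice/b\n"
    )
    send, gone = changed_paths(sample, "/tank/home")
    print("send:", send)
    print("removed:", gone)
```

Two caveats. Deletions and the old names of renamed files come back as a separate list, because rsync won't remove anything on the destination just from a --files-from list, so those would need handling some other way. And in a pull setup like mine, where the thumper connects to my rsync daemon, the list would have to be made available to the pulling side somehow; I haven't worked that part out.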

zfs-discuss mailing list
