[EMAIL PROTECTED] wrote on 02/02/2007 11:16:32 AM:

> Hi all,
>
> Longtime reader, first time poster... Sorry for the lengthy intro,
> and I'm not really sure the title matches what I'm trying to get at.
> I am trying to find a way for a zfs filesystem to shorten our backup
> window.  Currently, our backup solution takes data from ufs or vxfs
> filesystems on a split mirrored disk, mounts it off-host, and then
> writes it directly to tape from the off-host backup system.  This
> equates to a "full" backup and requires that the split mirror stay
> attached to that system for a significant amount of time until it
> can be returned.  I'd like to fit a zfs filesystem into the mix,
> hopefully make use of its space-saving snapshot capabilities, and
> find out if there is a known way to turn this into an "incremental"
> backup with known freeware or OS-level tools.

What we are playing with here is rsyncing --inplace diffs from vxfs/ufs
(large systems, 7+ TB, millions of files) to a thumper multiple times per
day, and having the thumper then spool to tape via NetBackup.  In our
situation this has shortened our full backup windows from 2 days on our
largest systems to < 1 hour.  On the thumper side we snap after each
rsync, and because of the --inplace the differential space requirements
for the snaps are very close to the actual data delta on the primary
server.  This also allows (in our case) 1 to 8 snaps per day to be kept
live nearline over extended periods with very little overhead, which
reduces the number of production-side snaps holding delta data.  Most
restore requests are served from snaps.
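
For the curious, a minimal sketch of one sync/snapshot cycle on the
thumper side (host, path, and dataset names are made up):

    # One sync/snapshot cycle.  --inplace rewrites changed blocks within
    # the existing destination files instead of writing new temp copies,
    # so unchanged blocks stay shared with earlier snapshots.
    STAMP=`date +%Y%m%d-%H%M`
    rsync -a --inplace --delete prodhost:/export/data/ /tank/backup/prod/ \
        && zfs snapshot tank/backup/prod@$STAMP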


>
> I get that if the source data storage were originally zfs instead of
> ufs|vxfs, I'd be able to take snapshots of the storage before the
> mirror split, mount that storage on the off-host system, and then
> take deltas between the different snapshots to turn into files, or
> apply them directly to another zfs filesystem on the off-host system
> that was originally created from the detached mirror.  We could also
> skip the mirror split entirely and just do a zfs send, piping the
> data out to a remote host where it would recreate that snapshot.  It
> will take a while to get zfs running in production, as it might
> involve some brainwashing of some DBAs, so in the meantime, what are
> some thoughts on how to do this without the data sitting on a zfs
> source?

This was the same issue we had: zfs is missing some features and has some
performance issues in certain workflows, which keeps us from migrating
most of our production systems (yet).  rsync is working well for us in
lieu of zfs send/receive.
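
For reference, once the source data does live on zfs, the incremental
send/receive flow described above looks roughly like this (pool,
dataset, and host names are hypothetical):

    # On the production host: snapshot, then ship only the delta between
    # the previous snapshot and the new one to the backup host.
    zfs snapshot pool/data@today
    zfs send -i pool/data@yesterday pool/data@today \
        | ssh backuphost zfs receive tank/backup/data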

>
> Some questions I have are about keeping this management of data at a
> "file" level.  I would like to use a zfs filesystem as the repository
> of the data and have that repository house the data as efficiently
> as possible.  If I sent over binary database files that were sourced
> on a ufs|vxfs filesystem to the zfs filesystem and then took a
> snapshot of that data, how could I update the data on that zfs
> filesystem with more current files and have zfs recognize that the
> files are mostly the same, with only some differing bits?  Can a
> file on the live zfs filesystem be fully overwritten by a file of
> the same name and still share the same "blocks" of space used by the
> snapshot?  I don't know if I'm stating that clearly.  I don't know
> how to recreate data on a zfs filesystem in such a way that a zfs
> snapshot shares the data that is unchanged.  I know that if I
> tar -c | tar -x or find | cpio data onto a zfs filesystem, take a
> snapshot of that zfs fs, then do the operation again on the same set
> of files and take another snapshot, both snapshots report consuming
> space equal to the total size of the files copied.  So they are not
> sharing the same blocks on disk.  Do I understand that correctly?

Again, rsync --inplace updates changed files in place, so only the blocks
that actually changed are rewritten -- minimizing the snap delta cost.
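
You can watch this on the backup dataset: after a full tar/cpio recopy
each snap charges nearly the whole data size, while after an rsync
--inplace pass the snaps stay close to the real delta.  Something like
(dataset name hypothetical):

    # USED on a snapshot is the space held only by that snapshot, i.e.
    # blocks overwritten since it was taken.  A full recopy pushes USED
    # toward the whole dataset size; an --inplace pass keeps it small.
    zfs list -t snapshot -o name,used,refer -r tank/backup/prod
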
>
> What I'm really looking for is a way to shrink our backup window by
> making use of some "tool" that can look at a binary file at 2
> different points in time, say one on a zfs snapshot and one from a
> different filesystem, i.e. a current split of a mirror housing a
> zfs|ufs|vxfs filesystem, mounted on a host that can see both
> filesystems.  Is there a way to compare the 2 files and write only
> the portions that differ to the copy on the zfs filesystem, so that
> after a new snapshot is taken, the snapshot would see the changes to
> that file as only the delta of the bits that changed inside it?  I
> thought rsync could deal with this, yet I think if the timestamp
> changes on your source file, it considers the whole file changed and
> copies the whole thing over; I'm really not that versed in rsync and
> could be completely wrong.
>

rsync uses timestamp/size differences to quickly flag files that may have
changed, and then goes further: it checksums blocks within those files
and transfers only the changed blocks.  It is pretty efficient.
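
One caveat for the mirror-split case above, where both copies are
visible on the same host: rsync disables its delta algorithm for
local-to-local copies (--whole-file becomes the default), so you have
to re-enable it explicitly:

    # --no-whole-file forces the block-checksum delta algorithm even for
    # a local copy, so only changed blocks get rewritten in the
    # destination files (paths are placeholders).
    rsync -a --inplace --no-whole-file /mnt/split-mirror/ /tank/backup/prod/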


> I guess I'm more after that "tool".  I know there exist agents that
> poll Oracle database files, find out which bits changed, and write
> those off somewhere.  RMAN can do that, but that still keeps things
> down at the DBA level, and I need to keep this backup processing at
> the SA level.  I'm just trying to find a way to migrate our data
> that is fast, reliable, and optimal.
>
> Was checking out these threads:
> http://www.opensolaris.org/jive/thread.jspa?threadID=20276&tstart=0
> http://www.opensolaris.org/jive/thread.jspa?threadID=22724&tstart=0
>
> And now just saw an update to http://blogs.sun.com/AVS/.  Maybe all
> my answers lie there... Will dig around there for more, but would
> welcome feedback and ideas for this.
>
> TIA