Ahhh, this has been...interesting...some real "personalities" involved in
this discussion.  :p  The following is long-ish, but I thought a recap was
in order.  I'm sure we'll never finish this discussion, but I want to at
least have a new plateau or base from which to consider these questions.

I've just read through EVERY post to this thread, so I want to recap the
best points in the vein of the original thread, and set a new base for
continuing the conversation.  Personally, I'm less interested in the
archival case; rather, I'm looking for the best way to either recover from
a complete system failure or recover an individual file or file set from
some backup media, most likely tape.

Now let's put all of this together, along with some definitions.  First,
the difference between archival storage (to tape or other) and backup.  I
think the best definition provided in this thread came from Darren Moffat.

As Carsten Aulbert mentioned, this discussion is fairly useless until we
start using the same terminology to describe a set of actions.

For this discussion, I am defining archival as taking the data and placing
it on some media - likely tape, but not necessarily - in the simplest
format possible that could hopefully be read by another device in the
future.  This could exclude capturing NTFS/NFSv4/ZFS ACLs, Solaris
extended attributes, or zpool properties (aka metadata for purposes of
this discussion).  With an archive, we may not go back and touch the data
for a long time, if ever again.

Backup, OTOH, is the act of making a perfect copy of the data to some
media (in my case tape, but again, not necessarily) which includes all of
the metadata associated with that data.  Such a copy would allow perfect
re-creation of the data in a new environment, recovery from a complete
system failure, or single file (or file set) recovery.  With a backup, we
have the expectation that we may need to return to it shortly after it is
created, so we have to be able to trust it...now.  Data restored from this
backup needs to be an exact replica of the original source - ZFS pool and
dataset properties, extended attributes, and ZFS ACLs included.

Now that I hopefully have common definitions for this conversation (and
I hope I captured Darren's meaning accurately), I'll divide this into 2
sections, starting with NDMP.

NDMP:

For those who are unaware (and to clarify my own understanding), I'll take
a moment to describe NDMP.  NDMP was invented by NetApp to allow direct
backup of their Filers to tape backup servers, and eventually onto tape.
It is designed to remove the need for indirect backup, that is, backing up
the NFS or CIFS shared file systems from the clients that mount them.
Instead, we back up the shared file systems directly from the Filer (or
other file server - say a Fishworks box or OpenSolaris server) to the
backup server via the network.  We avoid multiple copies of the shared
file systems.  NDMP is a network-based delivery mechanism to get data from
a storage server to a backup server, which is why the backup software must
also speak NDMP.  Hopefully, my description is mostly accurate, and it is
clear why this might be useful for people using (Open)Solaris + ZFS for
tape backup or archival purposes.

Darren Moffat made the point that NDMP could be used to do the tape
splitting, but I'm not sure this is accurate.  If "zfs send" from a file
server running (Open)Solaris to a tape drive over NDMP is viable -- which
it appears to be to me -- then the tape splitting would be handled by the
tape backup application.  In my world, that's typically NetBackup or some
similar enterprise offering.  I see no reason why it couldn't be Amanda or
Bacula or Arkeia or something else.  THIS is why I am looking for faster
progress on NDMP.

Now, NDMP doesn't do you much good for a locally attached tape drive,
as Darren and Svein pointed out.  However, provided the software which is
installed on this fictional server can talk to the tape in an appropriate
way, then all you have to do is pipe "zfs send" into it.  Right?  What did
I miss?
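
To make "pipe zfs send into it" concrete, here is a minimal sketch of what
I have in mind.  Everything specific in it is an assumption on my part: the
snapshot name, the /dev/rmt/0n device path, the 256k block size, and the
use of plain dd as a stand-in for whatever tape-aware software is really
installed.  The helper names are hypothetical.

```python
import subprocess
import sys


def pipe(producer, consumer):
    """Run `producer | consumer` and return (producer_rc, consumer_rc)."""
    src = subprocess.Popen(producer, stdout=subprocess.PIPE)
    dst = subprocess.Popen(consumer, stdin=src.stdout)
    # Close our copy of the pipe so the producer is signalled if the
    # consumer exits early instead of blocking forever.
    src.stdout.close()
    dst_rc = dst.wait()
    src_rc = src.wait()
    return src_rc, dst_rc


def zfs_send_to_tape_cmds(snapshot, tape_device, block_size="256k"):
    """Build the two argument lists for `zfs send SNAP | dd of=TAPE obs=N`.

    ASSUMPTION: dd stands in for the real tape-writing software, and the
    block size is a placeholder, not a recommendation.
    """
    producer = ["zfs", "send", snapshot]
    consumer = ["dd", f"of={tape_device}", f"obs={block_size}"]
    return producer, consumer


if __name__ == "__main__":
    # Hypothetical snapshot and device names; only prints the pipeline.
    send_cmd, tape_cmd = zfs_send_to_tape_cmds("tank/home@monday", "/dev/rmt/0n")
    print(" ".join(send_cmd), "|", " ".join(tape_cmd))
```

Calling pipe(send_cmd, tape_cmd) would run the pipeline for real.  Checking
both exit codes matters, because a failed "zfs send" can still leave the
consumer exiting cleanly with a truncated stream on the tape.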

ZVOLs and NTFS/NFSv4/ZFS ACLs:

The answer to both of my questions, about ZVOLs and about ACLs, is
"zfs send".

At the center of all of this attention is "zfs send".  As Darren Moffat
pointed out, it has all the pieces to do a proper, complete and correct
backup.  The big remaining issue that I see is how to place a "zfs send"
stream on a tape in a reliable fashion.  CR 6936195 would seem to handle
one complaint from Svein, Miles Nordin and others about reliability of the
send stream on the tape.  Again, I think NDMP may help answer this
question for file servers without attached tape devices.  For those with
attached tape devices, what's the equivalent answer?  Who is doing this,
and how?  I believe we've seen Ed Harvey say "NetBackup" and Ian Collins
say "NetVault".  Do these products capture all the metadata required to
call this copy a "backup"?  That's my next question.
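
As an illustration of the reliability concern only, the sketch below
splits an arbitrary byte stream into fixed-size segments and records a
SHA-256 digest per segment, so that damage can be pinned to one segment
instead of invalidating the whole stream.  This is my own toy example; it
is not what CR 6936195, "zfs send", or any backup product actually does,
and the function names are made up.

```python
import hashlib
import io


def split_stream(stream, segment_size):
    """Yield (index, sha256_hex, data) for each segment of `stream`.

    ASSUMPTION: `stream` is a buffered binary file object, so read(n)
    returns n bytes except at end of stream.
    """
    index = 0
    while True:
        data = stream.read(segment_size)
        if not data:
            break
        yield index, hashlib.sha256(data).hexdigest(), data
        index += 1


def verify_segments(segments):
    """Return the indexes of segments whose data no longer matches its digest."""
    return [
        index
        for index, digest, data in segments
        if hashlib.sha256(data).hexdigest() != digest
    ]


if __name__ == "__main__":
    # Stand-in for a send stream; real input would be a pipe or a file.
    segments = list(split_stream(io.BytesIO(b"abcdefghij"), 4))
    print(len(segments), "segments, damaged:", verify_segments(segments))
```

Detection is all this gives you.  Recovering the damaged segment would
need redundancy on top of it, which is a separate problem.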

Finally, Damon Atkins said:

"But there needs to be a tool:
* To restore an individual file or a zvol (with all ACLs/properties)
* That allows backup vendors (which place backups on tape or disk or CD or
..) build indexes of what is contained in the backup (e.g. filename, owner,
size, modification dates, type (dir/file/etc) )
* Stream output suitable for devices like tape drives.
* Should be able to tell if the file is corrupted when being restored.
* May support recovery of corrupt data blocks within the stream.
* Preferably gnutar command-line compatible
* That admins can use to backup and transfer a subset of files e.g user home
directory (which is not a file system) to another server or on to CD to be
sent to their new office location, or ????"
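
To show what the "build indexes" item might look like in practice, here is
a small sketch using a plain tar archive and Python's standard tarfile
module.  It only illustrates the index (filename, owner, size,
modification date, type); it does not capture ZFS ACLs or dataset
properties, so it does not meet my definition of a backup.  The sample
file names and owner are invented.

```python
import io
import tarfile


def build_index(archive_bytes):
    """Return one dict per member: name, owner, size, mtime, type."""
    index = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        for member in tar:
            if member.isdir():
                kind = "dir"
            elif member.isfile():
                kind = "file"
            else:
                kind = "other"
            index.append({
                "name": member.name,
                "owner": member.uname or str(member.uid),
                "size": member.size,
                "mtime": member.mtime,
                "type": kind,
            })
    return index


def make_sample_archive():
    """Build a tiny in-memory tar archive with made-up contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        home = tarfile.TarInfo("home/alice")
        home.type = tarfile.DIRTYPE
        home.uname = "alice"
        tar.addfile(home)

        payload = b"hello tape\n"
        notes = tarfile.TarInfo("home/alice/notes.txt")
        notes.size = len(payload)
        notes.uname = "alice"
        notes.mtime = 1268611200
        tar.addfile(notes, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    for entry in build_index(make_sample_archive()):
        print(entry)
```

An index like this is what lets a backup application find a single file on
tape without reading the whole archive back first.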

So far, I don't think I've seen any reference to a tool that fits this
description.  Erik Ableson provided a list of software that he says can do
all of this.  Is that list (incomplete though it may be) accurate?  Can
they all create what I define as a backup (to tape media) of a ZFS pool?

Note: Being a long-time Unix user, I could do without GNU command line
semantics, but that's just a personal thing.  Items 1 - 3 on the list are
needs, while 4 - 7 would be great if not quite needs, IMO.

Finally, if no such tool currently exists as described by Damon Atkins,
what is required to create one?  This is a curiosity question, admittedly.

-- 
"You can choose your friends, you can choose the deals." - Equity Private

"If Linux is faster, it's a Solaris bug." - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
