Ahhh, this has been...interesting...some real "personalities" involved in this discussion. :p The following is long-ish but I thought a re-cap was in order. I'm sure we'll never finish this discussion, but I want to at least have a new plateau or base from which to consider these questions.
I've just read through EVERY post to this thread, so I want to recap the best points in the vein of the original thread, and set a new base for continuing the conversation. Personally, I'm less interested in the archival case; rather, I'm looking for the best way to either recover from a complete system failure or recover an individual file or file set from some backup media, most likely tape.

Now let's put all of this together, along with some definitions. First, the difference between archival storage (to tape or other media) and backup. I think the best definition provided in this thread came from Darren Moffat. As Carsten Aulbert mentioned, this discussion is fairly useless until we start using the same terminology to describe a set of actions.

For this discussion, I am defining archival as taking the data and placing it on some media - likely tape, but not necessarily - in the simplest format possible that could hopefully be read by another device in the future. This could exclude capturing NTFS/NFSv4/ZFS ACLs, Solaris extended attributes, or zpool properties (collectively "metadata" for purposes of this discussion). With an archive, we may not go back and touch the data for a long time, if ever again.

Backup, OTOH, is the act of making a perfect copy of the data to some media (in my case tape, but again, not necessarily), including all of the metadata associated with that data. Such a copy would allow perfect re-creation of the data in a new environment, recovery from a complete system failure, or single file (or file set) recovery. With a backup, we have the expectation that we may need to return to it shortly after it is created, so we have to be able to trust it...now. Data restored from this backup needs to be an exact replica of the original source - ZFS pool and dataset properties, extended attributes, and ZFS ACLs included.
Now that I hopefully have common definitions for this conversation (and I hope I captured Darren's meaning accurately), I'll divide this into 2 sections, starting with NDMP.

NDMP:

For those who are unaware (and to clarify my own understanding), I'll take a moment to describe NDMP. NDMP was originally developed by NetApp (together with Intelliquent) to allow direct backup of their Filers to tape backup servers, and eventually onto tape. It is designed to remove the need for indirect backup, i.e. backing up the NFS or CIFS shared file systems through the clients that mount them. Instead, we back up the shared file systems directly from the Filer (or other file server - say a Fishworks box or an OpenSolaris server) to the backup server via the network. We avoid multiple copies of the shared file systems. NDMP is a network-based delivery mechanism to get data from a storage server to a backup server, which is why the backup software must also speak NDMP. Hopefully, my description is mostly accurate, and it is clear why this might be useful for people using (Open)Solaris + ZFS for tape backup or archival purposes.

Darren Moffat made the point that NDMP could be used to do the tape splitting, but I'm not sure this is accurate. If "zfs send" from a file server running (Open)Solaris to a tape drive over NDMP is viable -- which it appears to be to me -- then the tape splitting would be handled by the tape backup application. In my world, that's typically NetBackup or some similar enterprise offering. I see no reason why it couldn't be Amanda or Bacula or Arkeia or something else. THIS is why I am looking for faster progress on NDMP.

Now, NDMP doesn't do you much good for a locally attached tape drive, as Darren and Svein pointed out. However, provided the software installed on this fictional server can talk to the tape in an appropriate way, all you have to do is pipe "zfs send" into it. Right? What did I miss?

ZVOLs and NTFS/NFSv4/ZFS ACLs:

The answer is "zfs send" to both of my questions about ZVOLs and ACLs.
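To make the "pipe zfs send into it" idea for a locally attached tape drive a bit more concrete, here is a minimal sketch in Python. It is not how any of the products named in this thread work; it only copies a command's output to a destination in fixed-size blocks and computes a SHA-256 digest along the way, so the stream can be compared against a later read-back. The function name copy_stream_with_digest, the dataset tank/home@backup and the tape device /dev/rmt/0n are assumptions for illustration only. It does not split a stream across tapes and does not make a damaged stream recoverable.

```python
import hashlib
import subprocess
from typing import BinaryIO, Sequence


def copy_stream_with_digest(cmd: Sequence[str], sink: BinaryIO,
                            block_size: int = 1 << 20) -> str:
    """Run cmd, copy its stdout to sink block by block, and return the
    SHA-256 hex digest of everything that was written.

    cmd would be something like ["zfs", "send", "tank/home@backup"]
    (hypothetical dataset name); sink is any writable binary file object.
    """
    digest = hashlib.sha256()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        assert proc.stdout is not None
        while True:
            # Buffered read: returns block_size bytes unless EOF is reached.
            block = proc.stdout.read(block_size)
            if not block:
                break
            digest.update(block)
            sink.write(block)
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, list(cmd))
    return digest.hexdigest()
```

A possible use, again with hypothetical names, would be to open the tape device with open("/dev/rmt/0n", "wb"), pass it as sink together with ["zfs", "send", "tank/home@backup"], and store the returned digest somewhere off the tape. Reading the tape back through the same hashing loop and comparing digests would at least tell you whether the stream on tape is the one that was sent.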
At the center of all of this attention is "zfs send". As Darren Moffat pointed out, it has all the pieces to do a proper, complete and correct backup. The big remaining issue that I see is how you place a "zfs send" stream on a tape in a reliable fashion. CR 6936195 would seem to handle one complaint from Svein, Miles Nordin and others about reliability of the send stream on the tape. Again, I think NDMP may help answer this question for file servers without attached tape devices. For those with attached tape devices, what's the equivalent answer? Who is doing this, and how? I believe we've seen Ed Harvey say "NetBackup" and Ian Collins say "NetVault". Do these products capture all the metadata required to call this copy a "backup"? That's my next question.

Finally, Damon Atkins said:

"But their needs to be a tool:
* To restore an individual file or a zvol (with all ACLs/properties)
* That allows backup vendors (which place backups on tape or disk or CD or ..) build indexes of what is contain in the backup (e.g. filename, owner, size modification dates, type (dir/file/etc) )
* Stream output suitable for devices like tape drives.
* Should be able to tell if the file is corrupted when being restored.
* May support recovery of corrupt data blocks within the stream.
* Preferable gnutar command-line compatible
* That admins can use to backup and transfer a subset of files e.g user home directory (which is not a file system) to another server or on to CD to be sent to their new office location, or ????"

So far, I don't think I've seen any reference to a tool that fits this description. Erik Ableson provided a list of software that he says can do all of this. Is that list (incomplete though it may be) accurate? Can they all create what I define as a backup (to tape media) of a ZFS pool?

Note: Being a long-time Unix user, I could do without GNU command line semantics, but that's just a personal thing.
Items 1 - 3 on the list are needs, while 4 - 7 would be great if not quite needs, IMO.

Finally, if no such tool currently exists as described by Damon Atkins, what is required to create one? This is a curiosity question, admittedly.

--
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss