On 4/25/12 10:10 PM, Richard Elling wrote:
On Apr 25, 2012, at 8:30 PM, Carson Gaspar wrote:
And applications that don't pin the mount points, and can be idled
during the migration. If your migration is due to a dead server, and
you have pending writes, you have no choice but to reboot the
client(s) (and accept the data loss, of course).
Reboot requirement is a lame client implementation.
Then it's a lame client misfeature of every single NFS client I've ever
seen, assuming the mount is "hard" (and if a RW mount isn't, you're crazy).
To bring this back to ZFS, sadly ZFS doesn't support NFS HA without
shared / replicated storage, as ZFS send / recv can't preserve the
data necessary to have the same NFS filehandle, so failing over to a
replica causes stale NFS filehandles on the clients. Which frustrates
me, because the technology to do NFS shadow copy (which is possible in
Solaris - not sure about the open source forks) is a superset of that
needed to do HA, but can't be used for HA.
You are correct, a ZFS send/receive will result in different file
handles on the receiver, just like rsync, tar, ufsdump+ufsrestore, etc.
But unlike SnapMirror.
It is possible to preserve NFSv[23] file handles in a ZFS environment
using lower-level replication like TrueCopy, SRDF, AVS, etc. But those
have other architectural issues (aka suckage). I am open to looking at
what it would take to make a ZFS-friendly replicator that would do
this, but need to know the business case.
The beauty of AFS and others is that the file handle equivalent is not
a number. NFSv4 also has this feature. So I have a little bit of
heartburn when people say, "NFS sux because it has a feature I won't
use because I won't upgrade to NFSv4 even though it was released
10 years ago."
NFSv4 implementations are still iffy. We've tried it - it hasn't been
stable (on Linux, at least). However we haven't tested RHEL6 yet. Are
you saying that if we have a Solaris NFSv4 server serving Solaris and
Linux NFSv4 clients with ZFS send/recv replication, that we can flip a
VIP to point to the replica target and the clients won't get stale
filehandles? Or that this is not the case today, but would be easier to
make the case than for v[23] filehandles?
FWIW, you can build a metropolitan area ZFS-based, shared storage
cluster today for about 1/4 the cost of the NetApp Stretch Metro
software license. There is more than one way to skin a cat :-)
So if the idea is to get even lower than 1/4 the NetApp cost, it feels
like a race to the bottom.
Shared storage is evil (in this context). Corrupt the storage, and you
have no DR. That goes for all block-based replication products as well.
This is not acceptable risk. I keep looking for a non-block-based
replication system that allows seamless client failover, and can't find
anything but NetApp SnapMirror. Please tell me I haven't been looking
hard enough. Lustre et al. don't support Solaris clients (which I find
hilarious as Oracle owns it). I could build something on top of / under
AFS for RW replication if I tried hard, but it would be fairly fragile.
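
For anyone following the filehandle argument above, here is a minimal
toy sketch of why a logical copy (send/recv, rsync, tar) leaves clients
with stale handles while a block-level copy does not. Everything in it
is an assumption made for illustration: ToyFS, logical_copy, block_copy
and the (fsid, fileid, generation) handle layout are hypothetical, and
it does not claim to show which fields actually change on a real ZFS
receiver or how any real NFS server builds its handles.

```python
import copy
import itertools


class StaleHandle(Exception):
    """Toy stand-in for the NFS stale-filehandle error."""


# Hypothetical source of filesystem ids: each new ToyFS gets a fresh one.
_fsids = itertools.count(1)


class ToyFS:
    """Toy exported filesystem; handles are (fsid, fileid, generation)."""

    def __init__(self):
        self.fsid = next(_fsids)
        self._next_fileid = 1
        self.by_id = {}   # fileid -> (generation, data)
        self.names = {}   # path -> fileid

    def create(self, path, data, generation=1):
        fileid = self._next_fileid
        self._next_fileid += 1
        self.by_id[fileid] = (generation, data)
        self.names[path] = fileid

    def handle(self, path):
        fileid = self.names[path]
        generation, _ = self.by_id[fileid]
        return (self.fsid, fileid, generation)

    def read(self, handle):
        fsid, fileid, generation = handle
        entry = self.by_id.get(fileid)
        # Any mismatch in the identity fields makes the handle stale.
        if fsid != self.fsid or entry is None or entry[0] != generation:
            raise StaleHandle(handle)
        return entry[1]


def logical_copy(src):
    """Recreate the files by name on a brand-new filesystem (new fsid)."""
    dst = ToyFS()
    for path, fileid in src.names.items():
        dst.create(path, src.by_id[fileid][1])
    return dst


def block_copy(src):
    """Clone the filesystem bit for bit, identity fields included."""
    return copy.deepcopy(src)


primary = ToyFS()
primary.create("/export/a", b"payload")
client_handle = primary.handle("/export/a")

# Failing over to a block-level replica: the old handle still resolves.
print(block_copy(primary).read(client_handle))

# Failing over to a logical replica: same data, but the handle is stale.
try:
    logical_copy(primary).read(client_handle)
except StaleHandle:
    print("stale handle on logical replica")
```

The point of the toy is only that the client holds the handle, not the
path, so a replica has to reproduce the handle's identity fields, not
just the file contents, for a VIP flip to be transparent.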
zfs-discuss mailing list