On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote:
> To be safe, the protocol needs to be able to discover that the devices
> (host or disk) has been disconnected and reconnected or has been reset
> and that either parts assumptions about the state of the other has to
> be invalidated.
>
> I don't know enough about either SAS or SATA to say if they guarantee that
> you will be noticed. But if they don't, they aren't safe for cached writes.

Generally, ZFS will only notice a removed disk when it is trying to write to it -- or when it probes. ZFS does not necessarily get notified on hot device removal -- certainly not immediately. (I've written some code so that it *will* notice, even if no write ever goes there... that's the topic of another message.)

The other thing is that disk writes are generally idempotent. So, if a drive was removed after an I/O was finished but before the response was returned to the host, it isn't a problem. When the disk is returned, ZFS should automatically retry the I/O. (In fact, ZFS automatically retries failed I/O operations several times before finally "failing".)

The nasty race occurs if your system crashes or is powered off *after* the log has acknowledged the write, but before the bits get shoved to main pool storage. This is a data loss situation.

But assuming you don't take a system crash or some other fault, I would guess that removal of a log device and reinsertion would not cause any problems. (Except for possibly delaying synchronous writes.)

That said, I've not actually *tested* it.

 - Garrett

>
> /ragge

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
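
The idempotence argument in the reply above can be illustrated with a small sketch. This is not ZFS code: `FlakyDisk`, `write_with_retry`, and the limit of three attempts are all hypothetical names and values chosen for illustration (the message only says "several times"). It models a device that commits a write but is removed before the acknowledgement reaches the host, and shows that simply reissuing the same write leaves the block in the intended state.

```python
class FlakyDisk:
    """Hypothetical in-memory disk that can lose the acknowledgement of a write."""

    def __init__(self, drop_acks):
        self.blocks = {}
        # Number of upcoming writes whose acknowledgement will be lost.
        self.drop_acks = drop_acks

    def write(self, lba, data):
        # The data reaches the media first...
        self.blocks[lba] = data
        # ...and only then may the device "disappear" before replying.
        if self.drop_acks > 0:
            self.drop_acks -= 1
            raise IOError("device removed before acknowledgement")


def write_with_retry(disk, lba, data, max_attempts=3):
    """Reissue the same write until it is acknowledged.

    Writing the same data to the same block twice has the same effect as
    writing it once, so retrying is safe. Returns the number of attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            disk.write(lba, data)
            return attempt
        except IOError:
            continue  # assumption: the device comes back before the next try
    raise IOError(f"write to block {lba} failed after {max_attempts} attempts")


disk = FlakyDisk(drop_acks=2)
attempts = write_with_retry(disk, lba=7, data=b"payload")
print(attempts, disk.blocks)
```

Note that the sketch covers only the lost-acknowledgement case. It says nothing about the crash-after-log-acknowledgement race described above, where the host itself loses the data it would need in order to retry.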