On 9-May-07, at 3:44 PM, Bakul Shah wrote:

Robert Milkowski wrote:
Hello Mario,

Wednesday, May 9, 2007, 5:56:18 PM, you wrote:

MG> I've read that it's supposed to go at full speed, i.e. as fast as
MG> possible. I'm doing a disk replace and what zpool reports kind of
MG> surprises me. The resilver goes on at 1.6MB/s. Did resilvering get
MG> throttled at some point between the builds, or is my ATA controller
MG> having bigger issues?

Lot of small files perhaps? What kind of protection have you used?

Good question. Remember that resilvering is done in time order and from
the top-level metadata down, not by sequentially blasting bits.  Jeff
Bonwick describes this as top-down resilvering.
        http://blogs.sun.com/bonwick/entry/smokin_mirrors
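The top-down idea can be sketched as a walk of the block-pointer tree from the root metadata downward. This is a minimal illustration of the ordering only, not ZFS internals; every name here (Block, resilver, target) is invented:

```python
# Sketch of top-down resilvering: repair blocks by walking the metadata
# tree from the root, so only live blocks are ever touched.

class Block:
    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []   # child block pointers (metadata fans out)

def resilver(block, target):
    """Repair this block on the target disk, then descend into its children.

    Because the walk starts at the root metadata, every block reached is
    live; free space is never copied, so resilver time scales with space
    used rather than raw disk size."""
    target.append(block.data)            # metadata repaired before its children
    for child in block.children:
        resilver(child, target)

# Tiny "pool": root metadata pointing at two data blocks.
root = Block("meta", [Block("data-A"), Block("data-B")])
repaired = []
resilver(root, repaired)
print(repaired)   # ['meta', 'data-A', 'data-B']
```

Note the order of `repaired`: the metadata lands before the data it points to, which is the property the blog entry above calls top-down resilvering.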

From an MTTR and performance perspective, this means that ZFS recovery time is a function of the amount of space used, where that space is located (!), and the validity of the surviving or regenerated data. The big win is the amount of space used, as most file systems are not full.
  -- richard

It seems to me that once you copy the metadata, you can indeed
copy all live data sequentially.

I don't see this, given the top down strategy. For instance, if I understand the transactional update process, you can't commit the metadata until the data is in place.

Can you explain in more detail your reasoning?
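The transactional ordering behind the objection can be sketched roughly as follows. This is a toy model of COW commit ordering, not ZFS code; all names are invented:

```python
# Sketch of copy-on-write commit ordering: new data blocks are written
# first, then the metadata that points at them, and only then is the
# root (uberblock) atomically flipped to commit the transaction group.

disk = {}

def commit(txg, data_blocks, metadata):
    for addr, data in data_blocks.items():   # 1. data lands first
        disk[addr] = data
    for addr, meta in metadata.items():      # 2. then metadata pointing at it
        disk[addr] = meta
    disk["uberblock"] = ("root", txg)        # 3. atomic root flip commits txg

commit(1, {"d1": "file contents"}, {"m1": ("ptr", "d1")})
print(disk["uberblock"])   # ('root', 1)
```

Until step 3 completes, the old root still points at the old tree, which is why metadata cannot be committed ahead of the data it references.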

  Given that the vast majority
of disk blocks in use will typically contain data, this is a
winning strategy from a performance point of view, and it still
allows you to retrieve a fair bit of data in case of a second
disk failure (checksumming will catch the case where good
metadata points to an as-yet-uncopied data block).  If the amount of
live data is > 50% of disk space, you may as well do a whole-disk
copy, perhaps skipping over already-copied metadata.
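The containment argument above, that metadata checksums flag any block the copier never reached, can be sketched as a toy model. The names (`checksum`, `replica`, `expected`) are invented; only the failure mode is the point:

```python
import hashlib

# Sketch: metadata (with checksums) was copied first; the source disk
# then died before data block 1 was copied. A read of the uncopied
# block fails its checksum instead of silently returning garbage.

def checksum(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

source = {0: b"alpha", 1: b"beta"}
expected = {lba: checksum(data) for lba, data in source.items()}  # copied metadata

replica = {0: b"alpha"}           # block 1 never made it across

def read(lba):
    data = replica.get(lba, b"")  # uncopied block reads back as zeros/garbage
    if checksum(data) != expected[lba]:
        raise IOError(f"checksum mismatch on block {lba}")
    return data

print(read(0))     # survives: b'alpha'
try:
    read(1)
except IOError as e:
    print(e)       # caught: checksum mismatch on block 1
```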

Not only that, you can even start using the disk being
resilvered right away for writes.  The new write will be
either to a) an already-copied block
How can that be, under a COW régime?

--Toby

or b) an as-yet-uncopied
block.  In case a) there is nothing more to do.  In case b)
the copied-from block will have the new data, so in both cases
the right thing happens.  Any potential window between
reading a copied-from block and writing to the copied-to block
can be closed with careful coding/locking.
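The two cases can be modeled with a copy cursor and a lock. This is a sketch of the proposal above under invented names (Resilver, copy_step), not how ZFS actually tracks missing writes (ZFS uses dirty time logs rather than a cursor):

```python
import threading

# Sketch: a cursor tracks how far the sequential copy has progressed.
# Writes behind the cursor (case a) go to both disks; writes ahead of
# it (case b) only update the source, and the copier carries the new
# data across when it reaches that block.

class Resilver:
    def __init__(self, source):
        self.source = source           # blocks on the surviving disk
        self.target = [None] * len(source)
        self.cursor = 0                # next block the copier will copy
        self.lock = threading.Lock()   # closes the read/write race window

    def copy_step(self):
        with self.lock:
            if self.cursor < len(self.source):
                self.target[self.cursor] = self.source[self.cursor]
                self.cursor += 1

    def write(self, lba, data):
        with self.lock:
            self.source[lba] = data
            if lba < self.cursor:      # case a): already copied -> mirror it
                self.target[lba] = data
            # case b): not yet copied -> the copier will carry the new data

r = Resilver(["b0", "b1", "b2"])
r.copy_step()                  # block 0 copied
r.write(0, "B0'")              # case a): lands on both disks
r.write(2, "B2'")              # case b): only the source is updated for now
r.copy_step(); r.copy_step()   # copier reaches block 2, picks up the new data
print(r.target)                # target ends up identical to source
```

Holding one lock across both the copy step and the write is the "careful coding/locking" mentioned above: it prevents a write from slipping between the copier's read of the old block and its write to the target.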

If a second disk fails during copying, the current strategy
doesn't buy you much in almost any case.  You really don't want
to go through a zillion files looking for survivors.  If you
have a backup, you will restore from that rather than pick
through the debris.  Not to mention that you have made the window
for a potentially catastrophic failure much larger if
resilvering is significantly slower.

Comments?
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

