On 2013-01-20 16:56, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote:
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Jim Klimov

And regarding the "considerable activity" - AFAIK there is little way
for ZFS to reliably read and test "TXGs newer than X"

My understanding is like this:  When you make a snapshot, you're just creating
a named copy of the present latest TXG.  When you do an incremental zfs send from
one snapshot to another, you're creating the delta between two TXGs that happen
to have names.  So when you break a mirror and resilver, it's exactly the same
operation as an incremental zfs send: it needs to calculate the delta from the
latest (older) TXG on the previously UNAVAIL device up to the latest TXG on the
current pool.  Yes, this involves examining the meta tree structure, and yes,
the system will be very busy while that takes place.  But the workload is very
small relative to whatever else you're likely to do with your pool during normal
operation, because that's the nature of the meta tree structure ... it is very
small relative to the rest of your data.
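As a rough sketch of that traversal (illustrative names only, nothing to do with
the actual ZFS source): every block pointer records the TXG in which it was born,
and copy-on-write means a parent is rewritten whenever anything beneath it
changes, so the walk can prune any subtree whose root is no newer than the base
TXG.

from dataclasses import dataclass, field
from typing import List

@dataclass
class BlockPointer:
    birth_txg: int                          # TXG in which this block was written
    children: List["BlockPointer"] = field(default_factory=list)

def blocks_changed_since(bp: BlockPointer, base_txg: int) -> List[BlockPointer]:
    """Collect blocks born after base_txg (e.g. the last TXG the stale mirror saw)."""
    if bp.birth_txg <= base_txg:
        return []                           # whole subtree is older -> skip it
    changed = [bp]
    for child in bp.children:
        changed.extend(blocks_changed_since(child, base_txg))
    return changed

# Tiny example: root rewritten at TXG 120, one untouched branch (TXG 80),
# one new leaf (TXG 115); only two blocks need examining against base TXG 100.
tree = BlockPointer(120, [BlockPointer(80), BlockPointer(115)])
print(len(blocks_changed_since(tree, base_txg=100)))    # -> 2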

Hmmm... Given that many people use automatic snapshots, those do give us
many ready-made roots for the branches of the block-pointer tree created
after a certain TXG (the snapshot's creation, and the subsequent live
variant of the dataset).

This might allow resilvering to quickly select only those branches of the
metadata tree that are known, or assumed, to have changed after a disk was
temporarily lost - and to skip datasets (snapshots) that are known to have
been committed and closed (became read-only) while that disk was still
online.

I have no idea whether this optimization actually takes place in the ZFS code,
but it seems "bound to be there"... if not, it would be a worthy RFE, IMHO ;)
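To make the idea concrete, here is a hand-wavy sketch (made-up names, not the
real ZFS data structures) of how such a selection could look: a snapshot
committed while the disk was still online can only reference blocks born at or
before its own TXG, so the resilver would never need to walk it; only the live
heads, and snapshots taken after the disk dropped out, can reference blocks the
stale mirror is missing.

from dataclasses import dataclass
from typing import List

@dataclass
class DatasetRoot:
    name: str
    newest_txg: int          # highest birth TXG reachable from this root
    read_only: bool          # True for committed snapshots

def roots_to_resilver(roots: List[DatasetRoot], disk_offline_txg: int) -> List[str]:
    """Keep only the block-pointer tree roots that may hold post-outage blocks."""
    return [r.name for r in roots
            if not (r.read_only and r.newest_txg <= disk_offline_txg)]

roots = [
    DatasetRoot("tank/home@hourly-2013-01-19", 9000, True),    # closed before outage: skip
    DatasetRoot("tank/home@hourly-2013-01-20", 10500, True),   # taken after outage: walk
    DatasetRoot("tank/home", 11000, False),                    # live head: walk
]
print(roots_to_resilver(roots, disk_offline_txg=10000))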

//Jim

