On Dec 20, 2010, at 2:05 PM, Erik Trimble wrote:

> On 12/20/2010 11:56 AM, Mark Sandrock wrote:
>> Erik,
>> 
>>      just a hypothetical what-if ...
>> 
>> In the case of resilvering on a mirrored disk, why not take a snapshot, and
>> then resilver by doing a pure block copy from the snapshot? It would be
>> sequential, so long as the original data was unmodified; and random access
>> in dealing with the modified blocks only, right?
>> 
>> After the original snapshot had been replicated, a second pass would be done,
>> in order to update the clone to 100% live data.
>> 
>> Not knowing enough about the inner workings of ZFS snapshots, I don't know
>> why this would not be doable. (I'm biased towards mirrors for busy
>> filesystems.)
>> 
>> I'm supposing that a block-level snapshot is not doable -- or is it?
>> 
>> Mark
> Snapshots on ZFS are true snapshots - they take a picture of the current 
> state of the system. They DON'T copy any data around when created. So, a ZFS 
> snapshot would be just as fragmented as the ZFS filesystem was at the time.

But if one does a raw (block) copy, there isn't any fragmentation -- except for 
the COW updates.

If there were no updates to the snapshot, then it becomes a 100% sequential 
block copy operation.

But even with COW updates, presumably the large majority of the copy would
still be sequential I/O.

Maybe for the 2nd pass, the filesystem would have to be locked, so that the
operation would eventually complete, but if this is fairly short in relation
to the overall resilvering time, then it could still be a win in many cases.
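
To make the two-pass idea concrete, here is a minimal sketch in Python. It is
a toy model of a generic raw-device copy with dirty-block tracking, not a
description of how ZFS resilvering actually works. The function name
two_pass_copy, the concurrent_writes argument, and the in-place overwrite model
(no COW) are all assumptions made up for illustration.

```python
def two_pass_copy(source, concurrent_writes):
    """Toy two-pass mirror copy (hypothetical; not ZFS's resilver).

    source: list of block contents, indexed by device address.
    concurrent_writes: dict mapping a pass-1 address to a list of
        (address, value) writes that land on the source right after
        that address has been copied.
    Returns (target, number_of_blocks_recopied_in_pass_2).
    """
    target = [None] * len(source)
    dirty = set()

    # Pass 1: strictly sequential copy in address order.
    for addr in range(len(source)):
        target[addr] = source[addr]
        # Simulate live writes arriving while the copy is running.
        for waddr, value in concurrent_writes.get(addr, []):
            source[waddr] = value
            dirty.add(waddr)

    # Pass 2: the source is assumed to be locked here, so the dirty
    # set cannot grow; only the modified blocks are re-copied.
    for addr in sorted(dirty):
        target[addr] = source[addr]

    return target, len(dirty)


source = list("ABCDEFGH")
writes = {2: [(1, "b")], 5: [(7, "h")]}
target, recopied = two_pass_copy(source, writes)
print(target == source, recopied)
```

In this model the first pass never seeks, and the cost of the second pass
depends only on how many blocks were written while the first pass ran.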

I'm probably not explaining it well, and may be way off, but it seemed an 
interesting notion.

Mark

> 
> 
> The problem is this:
> 
> Let's say I write blocks A, B, C, and D on a clean zpool (what kind doesn't
> matter).  I now delete block C.  Later on, I write block E.  There is a
> probability (increasing dramatically as time goes on) that the on-disk
> layout will now look like:
> 
> A, B, E, D
> 
> rather than
> 
> A, B, [space], D, E
> 
> 
> So, in the first case, I can do a sequential read to get A & B, but then must
> seek to get D, and seek again to get E.
> 
> The "fragmentation" problem is mainly due to file deletion, NOT to file 
> re-writing.  (though, in ZFS, being a C-O-W filesystem, re-writing generally 
> looks like a delete-then-write process, rather than a modify process).
> 
> 
> -- 
> Erik Trimble
> Java System Support
> Mailstop:  usca22-123
> Phone:  x17195
> Santa Clara, CA
> Timezone: US/Pacific (GMT-0800)
> 
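
The A, B, E, D example above can be checked with a small toy model. This is a
sketch under simplifying assumptions: the disk is a flat list of slots, a
"seek" is any read that does not start where the previous read ended, and the
helper name count_seeks is hypothetical. It says nothing about how ZFS really
allocates or traverses blocks.

```python
def count_seeks(layout, read_order):
    """Count head repositionings needed to read blocks in read_order.

    layout: on-disk order of blocks; None marks a free slot.
    read_order: the order in which the reader wants the blocks.
    """
    position = {blk: i for i, blk in enumerate(layout) if blk is not None}
    seeks = 0
    head = None  # slot immediately after the previous read
    for blk in read_order:
        addr = position[blk]
        if head is not None and addr != head:
            seeks += 1
        head = addr + 1
    return seeks


logical_order = ["A", "B", "D", "E"]      # order in which the data was written
fragmented = ["A", "B", "E", "D"]         # E reused the slot freed by C
with_hole = ["A", "B", None, "D", "E"]    # E appended, C's slot left free

# Reading in logical order on the fragmented layout: seek to D, seek to E.
frag_seeks = count_seeks(fragmented, logical_order)
# Same read on the layout with a hole: one skip over the free slot.
hole_seeks = count_seeks(with_hole, logical_order)
# Reading the fragmented layout purely in address order (a raw block copy).
raw_seeks = count_seeks(fragmented, fragmented)
print(frag_seeks, hole_seeks, raw_seeks)
```

Under these assumptions the logical-order read of the fragmented layout needs
two seeks, while an address-order read of the same layout needs none, which is
the distinction the two posters are discussing.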

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
