On 4/26/2011 3:59 AM, Fred Liu wrote:
>
>> -----Original Message-----
>> From: Erik Trimble [mailto:erik.trim...@oracle.com]
>> Sent: Tuesday, April 26, 2011 12:47
>> To: Ian Collins
>> Cc: Fred Liu; ZFS discuss
>> Subject: Re: [zfs-discuss] How does ZFS dedup space accounting work
>> with quota?
>>
>> On 4/25/2011 6:23 PM, Ian Collins wrote:
>>>   On 04/26/11 01:13 PM, Fred Liu wrote:
>>>> Hmmmm, it seems dedup is pool-based not filesystem-based.
>>> That's correct. Although it can be turned off and on at the filesystem
>>> level (assuming it is enabled for the pool).
>> Which is effectively the same as choosing per-filesystem dedup, just
>> inverted: you turn it on at the pool level and off at the filesystem
>> level, which is equivalent to the "off at the pool level, on at the
>> filesystem level" approach that NetApp takes.
> My original thought was just to enable dedup on one filesystem to check
> whether it is mature enough for the production environment. And I have
> only one pool. If dedup were filesystem-based, the effects of dedup
> would be confined to that one filesystem and would not propagate to the
> whole pool. Just disabling dedup does not get rid of all the effects
> (such as possible performance degradation, etc.), because the already
> dedup'd data is still there and the DDT is still there. The only
> thorough way I can think of is to remove all the dedup'd data entirely.
> But is that really a thorough fix?
You can do that now. Enable Dedup at the pool level. Turn it OFF on all
the existing filesystems. Make a new "test" filesystem, and run your tests.
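
For example, assuming a pool named "tank" with existing filesystems
"fs1" and "fs2" (hypothetical names -- substitute your own), it would
look roughly like:

   zfs set dedup=on tank             # enable at the pool's root dataset
   zfs set dedup=off tank/fs1        # keep existing filesystems out of it
   zfs set dedup=off tank/fs2
   zfs create -o dedup=on tank/test  # new test filesystem with dedup on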

Remember, only data written AFTER the dedup property is turned on will be
deduped. Existing data will NOT be. And, though dedup is enabled at the
pool level, ZFS will only consider data written into filesystems that
have the dedup property set to ON.

Thus, in your case, writing to the single filesystem with dedup on will
NOT have ZFS check for duplicates against the other filesystems. It will
check only within itself, as it's the only filesystem with dedup enabled.

If the experiment fails, you can safely destroy your test dedup
filesystem, then unset dedup at the pool level, and you're fine.
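
Cleanup would be roughly (same hypothetical names as above):

   zfs destroy tank/test    # removes the test data and its DDT entries
   zfs set dedup=off tank   # no dedup for any future writes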


> And also the dedup space saving is kind of indirect. We cannot directly
> get the space saving in the filesystem where dedup is actually enabled,
> because it is pool-based. Even from the pool perspective it is still
> somewhat indirect and obscure, in my opinion: the real space saving is
> the absolute delta between the output of 'zpool list' and the sum of
> 'du' over all the folders in the pool (or 'df' on the mount-point
> folder; not sure whether a percentage like 123% can occur or not...
> grinning ^:^ ).
>
> But in NetApp, we can use 'df -s' to directly and easily get the space saving.
That is true. Honestly, however, it would be hard to do this on a
per-filesystem basis. ZFS allows the creation of an arbitrary number of
filesystems in a pool, far more than NetApp does, so the "filesystem"
concept is much more flexible in ZFS. The downside is that keeping dedup
statistics for a given arbitrary set of data is logistically difficult.

The analogous NetApp question would be: can you use any tool to find the
dedup ratio of an arbitrary directory tree INSIDE a NetApp filesystem?
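
What you CAN get directly is the pool-wide dedup ratio, e.g. (again
assuming the hypothetical pool name):

   zpool list tank             # the DEDUP column shows the pool-wide ratio
   zpool get dedupratio tank   # the same figure as a pool property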


> It is true, quota governs logical data, not physical data.
> Let's consider an interesting scenario -- say the pool is 100% full in
> logical data (i.e. 'df' tells you 100% used) but not full in physical
> data (i.e. 'zpool list' tells you there is still space available). Can
> we continue writing data into this pool?
>
Sure, you can keep writing to the volume. What matters to the OS is what
*it* thinks, not what some userland app thinks.
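
You can watch the two views diverge with something like (hypothetical
pool name and mountpoint again):

   df -h /tank       # the logical, userland view of space used
   zpool list tank   # the physical allocation, after dedup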


-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

