On Sat, Dec 19, 2009 at 05:25, Ian Collins <i...@ianshome.com> wrote:

> Stacy Maydew wrote:
>
>> The commands "zpool list" and "zpool get dedup <pool>" both show a ratio
>> of 1.10.
>> So thanks for that answer.  I'm a bit confused though if the dedup is
>> applied per zfs filesystem, not zpool, why can I only see the dedup on a per
>> pool basis rather than for each zfs filesystem?
>>
>> Seems to me there should be a way to get this information for a given zfs
>> filesystem?
>>
>>
>>
> The information, if present, would probably be meaningless.  Consider which
> filesystem holds the block and which the dupe?  What happens if the original
> is removed?
>

AHA - "original/copy" - I fell into the same trap.

This is the question I had back in November. Michael Schuster
(http://blogs.sun.com/recursion) helped me out and that's my reference point.

Here was my scenario:

> in /home/fred there's a photo collection
>     another collection exists in /home/janet
>     at some point in the past, fred sent janet a party picture, let's call
> it DSC4456.JPG
>     In the dataset, there are now two copies of the file, which are
> genuinely identical.
>
>     So then:
>     - When you de-dupe, which copy of the file gets flung?
>

Michael provided the following really illuminating explanation:

> dedup (IIRC) operates at block level, not file level, so the question, as it
> stands, has no answer. what happens - again, from what I read in Jeff's blog
> - is this: zfs detects that a copy of a block with the same hash is being
> created, so instead of storing the block again, it just increments the
> reference count and makes sure whatever "thing" references this piece of
> data points to the "old" data.
>
> In that sense, you could probably argue that the "new" copy never gets
> created.
>

("Jeff's blog" referred to above is here:
http://blogs.sun.com/bonwick/entry/zfs_dedup)
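The mechanism Michael describes can be sketched in a few lines of Python. This is a toy model, not ZFS internals (the class and names are mine, purely for illustration): blocks are keyed by their hash, and a duplicate write just bumps a reference count instead of storing the data again.

```python
import hashlib

class DedupStore:
    """Toy sketch of block-level dedup (NOT real ZFS code):
    blocks are keyed by their hash, each with a reference count."""

    def __init__(self):
        self.blocks = {}  # hash -> [data, refcount]

    def write(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            # Same hash already stored: don't write the block again,
            # just increment its reference count.
            self.blocks[key][1] += 1
        else:
            self.blocks[key] = [data, 1]
        return key

store = DedupStore()
fred = store.write(b"DSC4456.JPG contents")
janet = store.write(b"DSC4456.JPG contents")  # Janet's "copy" is never stored
assert fred == janet                # both names point at the same block
assert len(store.blocks) == 1      # one physical block on "disk"
assert store.blocks[fred][1] == 2  # ...with two references
```

In that sense the "new" copy really never gets created: the second write only touches metadata.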

OK, fair enough, but I still couldn't quite get my head around what's actually
happening, so I posed this followup question to cement the idea in my silly
head (because I still wasn't focused on "new copy never gets created")...

Fred has an image (DSC4456.JPG in my example) in his home directory, and he's
sent it to Janet. Arguably - when Janet pulled the attachment out of the
email and saved it to her $HOME - that copy never got written! Instead, the
reference count was incremented by one. Fair enough, but what is Janet
"seeing" when she does an ls and greps for that image? Is it:
- a symlink?
- an "apparition" of some kind?
she sees the file, it's there, but what exactly is she seeing?

Michael stepped in and described this:

> they're going to see the same file (the blocks of which now have a ref.
> counter that is one more than it was before).
>
> think posix-style hard links: two directory entries pointing to the same
> inode - both "files" are actually one, but as long as you don't change it,
> it doesn't matter. when you "remove" one (by removing the name), the other
> remains, the ref. count in the inode is decremented by one.
>
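Michael's hard-link analogy is something you can watch live on any POSIX system. A minimal Python sketch (the filenames here are just my example, echoing the scenario above):

```python
import os
import tempfile

# POSIX hard links: two names, one inode - analogous to two dedup references.
d = tempfile.mkdtemp()
fred = os.path.join(d, "DSC4456.JPG")
janet = os.path.join(d, "party.jpg")

with open(fred, "wb") as f:
    f.write(b"jpeg bytes")

os.link(fred, janet)  # second name, same inode - no data copied
assert os.stat(fred).st_ino == os.stat(janet).st_ino
assert os.stat(fred).st_nlink == 2  # the inode's link count is now 2

os.unlink(fred)  # "remove" one name...
assert os.stat(janet).st_nlink == 1  # ...the data survives under the other
```

Both "files" are one; removing a name just decrements the inode's link count, exactly as Michael says.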

So, coming full circle to your question, "What happens if the original is
removed?", it can be answered this way:

There is no original, there is no copy. There is one block with reference
counters.

- Fred can rm his "file" (because clearly it isn't a file, it's a filename
and that's all)
- result: the reference count is decremented by one - the data remains on
disk.
OR
- Janet can rm her "filename"
- result: the reference count is decremented by one - the data remains on
disk
OR
- both can rm their filenames; the reference count is now decremented by two -
but there were only two references, so now it's really REALLY gone.

Or is it really REALLY gone? Not if you snapshotted the pool! :)
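Continuing the same toy model (again, a sketch of the idea, not ZFS internals - the snapshot "hold" here is my simplification): removing a "file" just drops one reference, and the block is freed only when the count reaches zero AND no snapshot still pins it.

```python
import hashlib

# Toy model (NOT real ZFS): hash -> {data, refs, snap_holds}
blocks = {}

def write(data: bytes) -> str:
    k = hashlib.sha256(data).hexdigest()
    b = blocks.setdefault(k, {"data": data, "refs": 0, "snap_holds": 0})
    b["refs"] += 1  # duplicate writes only bump the refcount
    return k

def snapshot(key: str) -> None:
    blocks[key]["snap_holds"] += 1  # a snapshot pins the block

def remove(key: str) -> None:
    b = blocks[key]
    b["refs"] -= 1
    if b["refs"] == 0 and b["snap_holds"] == 0:
        del blocks[key]  # really REALLY gone

k = write(b"party pic")  # Fred's reference
write(b"party pic")      # Janet's reference (refcount -> 2)
snapshot(k)              # pool snapshot taken
remove(k)                # Fred rm's his filename
remove(k)                # Janet rm's hers
assert k in blocks                # refcount hit zero, but...
assert blocks[k]["refs"] == 0     # ...the snapshot keeps the data alive
```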

For me, the POSIX hard link reference at the core of the explanation somehow
tipped the scales and made me understand, but we all have mental hooks into
different parts of an explanation (the "aha" moment), so YMMV :)

Dedup is fascinating, and I hope you don't mind me sharing this little
list anecdote, because it honestly made a huge difference to my understanding
of the concept.

Once again, many thanks to Michael Schuster at Sun for having the patience
to walk a n00b through the steps towards enlightenment.

--
-Me
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
