On Fri, Jan 22, 2010 at 7:19 AM, Mike Gerdts <mger...@gmail.com> wrote:
> On Thu, Jan 21, 2010 at 2:51 PM, Andrey Kuzmin
> <andrey.v.kuz...@gmail.com> wrote:
>> Looking at dedupe code, I noticed that on-disk DDT entries are
>> compressed less efficiently than possible: key is not compressed at
>> all (I'd expect roughly 2:1 compression ratio with sha256 data),
>
> A cryptographic hash such as sha256 should not be compressible.  A

I'd certainly agree for block encryption, where the encrypted bytes are
uniformly distributed by design; I was not sure about digests.

And, anyway, a DDT entry includes other data as well, so I'd still consider
the compression efficiency question worth raising, especially for in-core
DDT entries.

Regards,
Andrey

> trivial example shows this to be the case:
>
> for i in {1..10000} ; do
>    echo $i | openssl dgst -sha256 -binary
> done > /tmp/sha256
>
> $ gzip -c <sha256 >sha256.gz
> $ compress -c <sha256 >sha256.Z
> $ bzip2 -c <sha256 >sha256.bz2
>
> $ ls -go sha256*
> -rw-r--r--   1  320000 Jan 22 04:13 sha256
> -rw-r--r--   1  428411 Jan 22 04:14 sha256.Z
> -rw-r--r--   1  321846 Jan 22 04:14 sha256.bz2
> -rw-r--r--   1  320068 Jan 22 04:14 sha256.gz
>
> --
> Mike Gerdts
> http://mgerdts.blogspot.com/
>
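
The quoted experiment can also be reproduced without openssl or the
command-line compressors. What follows is only a minimal sketch, assuming
Python's standard hashlib, zlib and bz2 modules are acceptable stand-ins
for the tools used above; the helper names digest_stream and
compressed_sizes are made up for illustration and do not come from ZFS or
from the original post. It hashes the same inputs ("1\n" through
"10000\n", since echo appends a newline) and reports how large each
compressed form is relative to the raw 32-byte digests.

```python
import bz2
import hashlib
import zlib


def digest_stream(count=10000):
    """Concatenate the binary SHA-256 digests of "1\\n" .. "<count>\\n"."""
    # `echo $i` writes the number followed by a newline, so hash exactly that.
    return b"".join(
        hashlib.sha256(f"{i}\n".encode("ascii")).digest()
        for i in range(1, count + 1)
    )


def compressed_sizes(data):
    """Return the raw size and the size after zlib and bz2 compression."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),  # DEFLATE, the algorithm gzip uses
        "bz2": len(bz2.compress(data, 9)),
    }


if __name__ == "__main__":
    sizes = compressed_sizes(digest_stream())
    for name, size in sizes.items():
        # A ratio near (or above) 1.0 means the compressor gained nothing.
        print(f"{name:>4}: {size} bytes ({size / sizes['raw']:.3f} of raw)")
```

Exact byte counts will differ slightly from the gzip and bzip2 file sizes
listed above because the container formats differ, but the ratios should
tell the same story: the digest bytes themselves leave nothing for a
general-purpose compressor to exploit.
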
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
