On 10/01/12 21:32, Richard Elling wrote:
> On Jan 9, 2012, at 7:23 PM, Jesus Cea wrote:
>> The page is written in Spanish, but the terminal transcriptions
>> should be useful for everybody.
>> In the process, maybe somebody finds this interesting too:
>> http://www.jcea.es/artic/zfs_flash01.htm
> Google translate works well for this :-)  Thanks for posting! --
> richard

Talking about this, there is something that bugs me.

For some reason, sync writes go to the ZIL only if they are "small".
Big sync writes are far slower, apparently bypassing the ZIL. Maybe
the concern is disk bandwidth, since the data would be written twice,
but that is only speculation on my part.

But this also happens when the ZIL is on an SSD. I would expect ZFS to
write sync writes to the SSD even if they are quite big (megabytes).

In the "zil.c" code I see things like:

 * Define a limited set of intent log block sizes.
 * These must be a multiple of 4KB. Note only the amount used (again
 * aligned to 4KB) actually gets written. However, we can't always just
 * allocate SPA_MAXBLOCKSIZE as the slog space could be exhausted.
uint64_t zil_block_buckets[] = {
    4096,               /* non TX_WRITE */
    8192+4096,          /* data base */
    32*1024 + 4096,     /* NFS writes */

 * Use the slog as long as the logbias is 'latency' and the current commit size
 * is less than the limit or the total list size is less than 2X the limit.
 * Limit checking is disabled by setting zil_slog_limit to UINT64_MAX.
uint64_t zil_slog_limit = 1024 * 1024;
#define USE_SLOG(zilog) (((zilog)->zl_logbias == ZFS_LOGBIAS_LATENCY) && \
        (((zilog)->zl_cur_used < zil_slog_limit) || \
        ((zilog)->zl_itx_list_sz < (zil_slog_limit << 1))))
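
To make the condition easier to reason about, here is a minimal
user-space sketch of the same predicate. The struct and function names
are hypothetical stand-ins, not the kernel's zilog_t, and the field
meanings in the comments are only my reading of the code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three zilog_t fields the macro reads. */
struct zilog_model {
    bool     latency_bias;  /* models zl_logbias == ZFS_LOGBIAS_LATENCY */
    uint64_t cur_used;      /* models zl_cur_used (assumed: bytes in the current commit) */
    uint64_t itx_list_sz;   /* models zl_itx_list_sz (assumed: bytes pending on the itx list) */
};

/* Same default as zil_slog_limit in the excerpt above. */
static const uint64_t slog_limit = 1024 * 1024;

/* Mirrors USE_SLOG(): latency bias AND (commit < limit OR list < 2 * limit). */
static bool use_slog(const struct zilog_model *z, uint64_t limit)
{
    return z->latency_bias &&
        (z->cur_used < limit || z->itx_list_sz < (limit << 1));
}
```

With the default 1MB limit, a 64KB commit qualifies for the slog, while
a 4MB commit with 4MB pending does not. Passing UINT64_MAX as the limit
makes every 'latency' commit qualify, which matches the comment in the
source.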

I have 2GB of ZIL on a mirrored SSD. I can write to it randomly at
240MB/s, so I guess the sync write restriction could be reexamined
when ZFS is using a separate ZIL device, with plenty of space to burn
:-). Am I missing anything?

Could I safely change the value of "zil_slog_limit" in the kernel (via
mdb) when using a ZIL device? Would it do what I expect?

My usual database block size is 64KB... :-(. A write to the writeahead
log can easily be bigger than 128KB (before and after data, plus some
changes in the parent nodes).

It seems faster to do several writes, each followed by a SYNC, than one
big write with a final SYNC. That is quite counterintuitive.
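
A rough way to compare the two patterns is sketched below. It is only
an illustration, not the test I actually ran: the function name is
hypothetical, it assumes a POSIX system, and on some platforms
clock_gettime() needs -lrt at link time.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Write `total` bytes to `path` in chunks of `chunk` bytes, calling
 * fsync() after every chunk. Returns elapsed seconds, or -1.0 on error.
 * chunk == total gives one big write with a single final sync.
 */
static double timed_sync_writes(const char *path, size_t total, size_t chunk)
{
    if (total == 0 || chunk == 0)
        return -1.0;

    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1.0;
    memset(buf, 0xA5, chunk);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        free(buf);
        return -1.0;
    }

    struct timespec t0, t1;
    int failed = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t done = 0; done < total; ) {
        size_t want = (total - done < chunk) ? total - done : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0 || fsync(fd) != 0) {   /* sync after each chunk */
            failed = 1;
            break;
        }
        done += (size_t)n;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    if (failed)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) +
        (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling it twice on a file in the pool under test, once with chunk set
to 64KB and once with chunk equal to total, gives the two timings to
compare.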

Am I hitting something else, like the "write throttle"?

PS: I am talking about Solaris 10 U10. My ZFS "logbias" attribute is

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
j...@jcea.es - http://www.jcea.es/     _/_/    _/_/  _/_/    _/_/  _/_/
jabber / xmpp:j...@jabber.org         _/_/    _/_/          _/_/_/_/_/
.                              _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz