> with recent bits ZFS compression is now handled concurrently with  
> many CPUs working on different records.
> So this load will burn more CPUs and achieve its results
> (compression) faster.
> 
> So the observed pauses should be consistent with a load
> generating high system time.
> The assumption is that compression now goes faster than when it was
> single-threaded.
> 
> Is this undesirable? We might seek a way to slow down compression in
> order to limit the system load.
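
(As a side note on the "burn more CPUs" point: a quick way to see how much
CPU time the concurrent compression actually consumes would be to time the
compress stage with fbt. This is only a sketch, and it assumes the fbt
provider exposes zio_write_compress entry/return probes on this build --
it is a pipeline-stage function, so that may need checking:)

#!/usr/sbin/dtrace -qs

/* Tally the on-CPU time spent inside zio_write_compress() per second. */
fbt:zfs:zio_write_compress:entry
{
        self->ts = vtimestamp;
}

fbt:zfs:zio_write_compress:return
/self->ts/
{
        @cpu_ms = sum((vtimestamp - self->ts) / 1000000);
        self->ts = 0;
}

tick-1s
{
        printa("zio_write_compress CPU ms/s: %@d\n", @cpu_ms);
        clear(@cpu_ms);
}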

According to this dtrace script....

#!/usr/sbin/dtrace -s

/* Record the kernel stack each time a zio_write_compress() taskq
   entry is enqueued. */
sdt:genunix::taskq-enqueue
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        @where[stack()] = count();
}

/* Print and reset the stack counts every 5 seconds. */
tick-5s {
        printa(@where);
        trunc(@where);
}




... I see bursts of ~ 1000 zio_write_compress() [gzip] taskq calls
enqueued into the "spa_zio_issue" taskq by zfs`spa_sync() and
its children:

  0  76337                         :tick-5s 
...
              zfs`zio_next_stage+0xa1
              zfs`zio_wait_for_children+0x5d
              zfs`zio_wait_children_ready+0x20
              zfs`zio_next_stage_async+0xbb
              zfs`zio_nowait+0x11
              zfs`dbuf_sync_leaf+0x1b3
              zfs`dbuf_sync_list+0x51
              zfs`dbuf_sync_indirect+0xcd
              zfs`dbuf_sync_list+0x5e
              zfs`dbuf_sync_indirect+0xcd
              zfs`dbuf_sync_list+0x5e
              zfs`dnode_sync+0x214
              zfs`dmu_objset_sync_dnodes+0x55
              zfs`dmu_objset_sync+0x13d
              zfs`dsl_dataset_sync+0x42
              zfs`dsl_pool_sync+0xb5
              zfs`spa_sync+0x1c5
              zfs`txg_sync_thread+0x19a
              unix`thread_start+0x8
             1092

  0  76337                         :tick-5s 



It seems that after such a batch of compress requests is
submitted to the "spa_zio_issue" taskq, the kernel is busy
for several seconds working on these taskq entries, and that
this blocks all other taskq activity inside the
kernel...
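
One way to check the "blocks everything else" theory would be to measure,
per taskq, how long entries sit between enqueue and the start of their
execution. A rough sketch -- it keys on the taskq_ent_t pointer, and
assumes the taskq-exec-start SDT probe is available next to the two
probes used above, with the same (taskq, entry) argument order:

#!/usr/sbin/dtrace -qs

sdt:genunix::taskq-enqueue
{
        /* remember when each entry was enqueued */
        enq[(uintptr_t)arg1] = timestamp;
}

sdt:genunix::taskq-exec-start
/enq[(uintptr_t)arg1]/
{
        /* dispatch latency, aggregated per taskq name */
        @lat[stringof(((taskq_t *)arg0)->tq_name)] =
            quantize(timestamp - enq[(uintptr_t)arg1]);
        enq[(uintptr_t)arg1] = 0;
}

tick-10s
{
        printa(@lat);
        trunc(@lat);
}

If the gzip batches really starve the other taskqs, their dispatch
latency histograms should blow out to seconds during the bursts.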



This dtrace script counts the number of 
zio_write_compress() calls enqueued / execed 
by the kernel per second:

#!/usr/sbin/dtrace -qs

/* Count zio_write_compress() taskq entries as they are enqueued... */
sdt:genunix::taskq-enqueue
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        this->tqe = (taskq_ent_t *)arg1;
        @enq[this->tqe->tqent_func] = count();
}

/* ... and again when their execution completes. */
sdt:genunix::taskq-exec-end
/((taskq_ent_t *)arg1)->tqent_func == (task_func_t *)&`zio_write_compress/
{
        this->tqe = (taskq_ent_t *)arg1;
        @exec[this->tqe->tqent_func] = count();
}

/* Print and reset both counts every second. */
tick-1s {
        /*
        printf("%Y\n", walltimestamp);
        */
        printf("TS(sec): %u\n", timestamp / 1000000000);
        printa("enqueue %a: %@d\n", @enq);
        printa("exec    %a: %@d\n", @exec);
        trunc(@enq);
        trunc(@exec);
}




I see bursts of zio_write_compress() calls enqueued / execed,
and periods of time where no zio_write_compress() taskq calls
are enqueued or execed.

10#  ~jk/src/dtrace/zpool_gzip7.d 
TS(sec): 7829
TS(sec): 7830
TS(sec): 7831
TS(sec): 7832
TS(sec): 7833
TS(sec): 7834
TS(sec): 7835
enqueue zfs`zio_write_compress: 1330
exec    zfs`zio_write_compress: 1330
TS(sec): 7836
TS(sec): 7837
TS(sec): 7838
TS(sec): 7839
TS(sec): 7840
TS(sec): 7841
TS(sec): 7842
TS(sec): 7843
TS(sec): 7844
enqueue zfs`zio_write_compress: 1116
exec    zfs`zio_write_compress: 1116
TS(sec): 7845
TS(sec): 7846
TS(sec): 7847
TS(sec): 7848
TS(sec): 7849
TS(sec): 7850
TS(sec): 7851
TS(sec): 7852
TS(sec): 7853
TS(sec): 7854
TS(sec): 7855
TS(sec): 7856
TS(sec): 7857
enqueue zfs`zio_write_compress: 932
exec    zfs`zio_write_compress: 932
TS(sec): 7858
TS(sec): 7859
TS(sec): 7860
TS(sec): 7861
TS(sec): 7862
TS(sec): 7863
TS(sec): 7864
TS(sec): 7865
TS(sec): 7866
TS(sec): 7867
enqueue zfs`zio_write_compress: 5
exec    zfs`zio_write_compress: 5
TS(sec): 7868
enqueue zfs`zio_write_compress: 774
exec    zfs`zio_write_compress: 774
TS(sec): 7869
TS(sec): 7870
TS(sec): 7871
TS(sec): 7872
TS(sec): 7873
TS(sec): 7874
TS(sec): 7875
TS(sec): 7876
enqueue zfs`zio_write_compress: 653
exec    zfs`zio_write_compress: 653
TS(sec): 7877
TS(sec): 7878
TS(sec): 7879
TS(sec): 7880
TS(sec): 7881
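
Given the spa_sync() stack above, the on/off pattern presumably tracks
the txg sync intervals. A small sketch like this (assuming fbt probes for
spa_sync are available) would print the sync window boundaries so they can
be lined up with the per-second counts above:

#!/usr/sbin/dtrace -qs

fbt:zfs:spa_sync:entry
{
        self->in = timestamp;
        printf("%u spa_sync() start\n", timestamp / 1000000000);
}

fbt:zfs:spa_sync:return
/self->in/
{
        printf("%u spa_sync() done, took %d ms\n",
            timestamp / 1000000000, (timestamp - self->in) / 1000000);
        self->in = 0;
}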


And a final dtrace script, which monitors scheduler activity while
filling a gzip-compressed pool:

#!/usr/sbin/dtrace -qs

sched:::off-cpu,
sched:::on-cpu,
sched:::remain-cpu,
sched:::preempt
{
        /*
        @[probename, stack()] = count();
        */
        @[probename] = count();
}


tick-1s {
        printf("%Y", walltimestamp);
        printa(@);
        trunc(@);
}
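
A per-CPU variant of that clause (only the aggregation key changes) would
show whether every CPU goes quiet at the same time or only some of them:

sched:::off-cpu,
sched:::on-cpu,
sched:::remain-cpu,
sched:::preempt
{
        /* same counts, broken down per CPU */
        @[cpu, probename] = count();
}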


The script shows periods of time with absolutely *no*
scheduling activity (I guess this is when the
"spa_zio_issue" taskq is working on such a big
batch of submitted gzip compression calls):

21# ~jk/src/dtrace/zpool_gzip9.d
2007 May  6 21:38:12
  preempt                                                          13
  off-cpu                                                         808
  on-cpu                                                          808
2007 May  6 21:38:13
  preempt                                                           1
  off-cpu                                                         446
  on-cpu                                                          446
2007 May  6 21:38:14
  preempt                                                           6
  off-cpu                                                        2308
  on-cpu                                                         2308
2007 May  6 21:38:15
  preempt                                                           1
  off-cpu                                                         429
  on-cpu                                                          429
2007 May  6 21:38:16
  off-cpu                                                         412
  on-cpu                                                          412
2007 May  6 21:38:17
  off-cpu                                                         711
  on-cpu                                                          711
2007 May  6 21:38:18
  preempt                                                           2
  off-cpu                                                         238
  on-cpu                                                          238
2007 May  6 21:38:19
  preempt                                                          48
  off-cpu                                                        7740
  on-cpu                                                         7740
2007 May  6 21:38:20
  preempt                                                         114
  off-cpu                                                       20823
  on-cpu                                                        20823
2007 May  6 21:38:21                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:22                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:23                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:24
  preempt                                                         181
  off-cpu                                                       32486
  on-cpu                                                        32486
2007 May  6 21:38:25
  preempt                                                           1
  off-cpu                                                        1612
  on-cpu                                                         1612
2007 May  6 21:38:26
  preempt                                                           6
  off-cpu                                                        1068
  on-cpu                                                         1068
2007 May  6 21:38:27
2007 May  6 21:38:28
  preempt                                                          56
  off-cpu                                                        9756
  on-cpu                                                         9756
2007 May  6 21:38:29
  preempt                                                         200
  off-cpu                                                       36195
  on-cpu                                                        36195
2007 May  6 21:38:30                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:31                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:32                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:33
  preempt                                                         144
  off-cpu                                                       25982
  on-cpu                                                        25982
2007 May  6 21:38:34
  preempt                                                         118
  off-cpu                                                       20705
  on-cpu                                                        20705
2007 May  6 21:38:35                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:36                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:37                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:38
  preempt                                                         231
  off-cpu                                                       42003
  on-cpu                                                        42003
2007 May  6 21:38:39
  preempt                                                          42
  off-cpu                                                        4781
  on-cpu                                                         4781
2007 May  6 21:38:40                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:41                                               
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2007 May  6 21:38:42
  preempt                                                           4
  off-cpu                                                         486
  on-cpu                                                          486
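
One way to confirm that guess would be to sample on-CPU kernel stacks
during those quiet windows; if the theory is right, the hottest stacks
should be dominated by the gzip / zio_write_compress code. A minimal
sketch:

#!/usr/sbin/dtrace -qs

/* Sample kernel stacks at 997 Hz; print the five hottest every 5 seconds. */
profile-997
/arg0 != 0/
{
        @ks[stack(10)] = count();
}

tick-5s
{
        printf("%Y\n", walltimestamp);
        trunc(@ks, 5);
        printa(@ks);
        trunc(@ks);
}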
 
 