It looks like I was posting to the wrong mailing list; I thought this list included developers. The experiment I ran is not for any commercial purpose. The goal of the comparison is to find optimization opportunities across the entire software stack on both Linux and Solaris.
As for this zfs root lock: currently a special re-entrant rwlock is implemented here for ZFS only. The interesting part is that all HTTP and HTTPS requests in my benchmark perform read operations on the Apache server, yet we still see lots of mutex spin events on this rwlock in the lockstat report. I'll continue to investigate whether this is a conflict of design philosophy. At the least, this lock does not behave well in this read-mostly, write-rarely case. If anyone is interested in this topic, I can send updates offline.

Thanks,
-Aubrey

On Tue, Mar 27, 2012 at 9:42 PM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." <laot...@gmail.com> wrote:
> hi
> you did not answer the question: what is the RAM of the server? how many
> sockets and cores, etc.?
> what is the block size of zfs?
> what is the cache RAM of your SAN array?
> what is the block size/stripe size of the RAID in your SAN array? RAID 5 or
> what?
> what is your test program, and from what kind of client is it driven?
> regards
>
>
> On 3/26/2012 11:13 PM, Aubrey Li wrote:
>>
>> On Tue, Mar 27, 2012 at 1:15 AM, Jim Klimov <j...@cos.ru> wrote:
>>>
>>> Well, as a further attempt down this road, is it possible for you to
>>> rule out ZFS from swapping - i.e. if RAM amounts permit, disable swap
>>> entirely (swap -d /dev/zvol/dsk/rpool/swap) or relocate it to dedicated
>>> slices of the same disks, or better yet separate disks?
>>>
>> Thanks Jim for your suggestion!
>>
>>
>>> If you do have lots of swapping activity (which can be seen in the
>>> si/so columns of "vmstat 1") going on in a zvol, you're likely to get
>>> much fragmentation in the pool, and searching for contiguous stretches
>>> of space can become tricky (and time-consuming), or larger writes can
>>> get broken down into many smaller random writes and/or "gang blocks",
>>> which is also slower. At least such waiting on disks can explain the
>>> overall large kernel times.
>>
>> I took swapping activity into account; even when CPU% is 100%, "si"
>> (swap-ins) and "so" (swap-outs) are always zero.
>>
>>> You can also see the disk wait-time ratio in "iostat -xzn 1" column
>>> "%w" and the disk busy-time ratio in "%b" (second and third from the
>>> right). I don't remember you posting that.
>>>
>>> If these are in the tens, or close or equal to 100%, then your disks
>>> are the actual bottleneck. Speeding up that subsystem, including
>>> adding cache (ARC RAM, L2ARC SSD, maybe ZIL SSD/DDRDrive) and
>>> combatting fragmentation by moving swap and other scratch spaces to
>>> dedicated pools or raw slices, might help.
>>
>> My storage system is not very busy, and there are only read operations.
>> =====================================
>> # iostat -xnz 3
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   112.4    0.0 1691.5    0.0  0.0  0.5    0.0    4.8   0  41 c11t0d0
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   118.7    0.0 1867.0    0.0  0.0  0.5    0.0    4.5   0  42 c11t0d0
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   127.7    0.0 2121.6    0.0  0.0  0.6    0.0    4.7   0  44 c11t0d0
>>                     extended device statistics
>>     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>>   141.3    0.0 2158.5    0.0  0.0  0.7    0.0    4.6   0  48 c11t0d0
>> ==============================================
>>
>> Thanks,
>> -Aubrey
>> _______________________________________________
>> zfs-discuss mailing list
>> firstname.lastname@example.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
>
> --
> Hung-Sheng Tsao Ph D.
> Founder & Principal
> HopBit GridComputing LLC
> cell: 9734950840
>
> http://laotsao.blogspot.com/
> http://laotsao.wordpress.com/
> http://blogs.oracle.com/hstsao/