On Mon, Mar 26, 2012 at 8:24 PM, Jim Mauro <james.ma...@oracle.com> wrote:
> You care about #2 and #3 because you are fixated on a ZFS root
> lock contention problem, and not open to a broader discussion
> about what your real problem actually is. I am not saying there is
> not lock contention, and I am not saying there is - I'll look at the
> data carefully later when I have more time.
> Your problem statement, which took 20 emails to glean, is that the
> Solaris system consumes more CPU than Linux on the same
> hardware, doing roughly the same amount of work, and delivering
> roughly the same level of performance - is that correct?
> Please consider that, in Linux, you have no observability into
> kernel lock statistics (at least, none that I know of) - Linux uses kernel
> locks also, and for this workload, it seems likely to me that could
> you observe those statistics, you would see numbers that would
> lead you to conclude you have lock contention in Linux.
> Let's talk about THE PROBLEM - Linux is 15% sys, 55% usr,
> Solaris is 30% sys, 70% usr, running the same workload,
> doing the same amount of work, delivering the same level
> of performance. Please validate that problem statement.

You're definitely right.

I'm running the same workload, doing the same amount of work, and
delivering the same level of performance on the same hardware
platform; even the root disk types are exactly the same.

So the userland software stack is exactly the same.
The difference is:
        linux    is 15% sys, 55% usr.
        solaris is 30% sys, 70% usr.
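
To make the gap concrete, here is a small Python sketch that works out total busy CPU and the kernel's share of it from the four figures above. The numbers are the ones quoted in this thread; the helper name `summarize` is only illustrative.

```python
# CPU figures quoted in this thread, as percent of total CPU.
linux = {"sys": 15, "usr": 55}
solaris = {"sys": 30, "usr": 70}


def summarize(cpu):
    """Return (busy percent, fraction of busy time spent in the kernel)."""
    busy = cpu["sys"] + cpu["usr"]
    return busy, cpu["sys"] / busy


for name, cpu in (("linux", linux), ("solaris", solaris)):
    busy, sys_share = summarize(cpu)
    print(f"{name:8s} busy={busy}%  sys share of busy={sys_share:.1%}")

# Extra CPU Solaris spends in each mode, in percentage points.
extra_sys = solaris["sys"] - linux["sys"]
extra_usr = solaris["usr"] - linux["usr"]
print(f"extra sys={extra_sys} points, extra usr={extra_usr} points")
```

By these figures the extra CPU is split evenly between kernel and user mode, so the usr% side deserves as much attention as the sys% side.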

Basically I agree with Fajar. This is probably not a fair comparison.
A robust system is not only about delivering the highest performance;
we should also consider reliability, availability and serviceability,
energy efficiency, and other server-related features. No doubt ZFS is
the most excellent file system on the planet.

As Richard pointed out, if we look at the mpstat output:
SET  minf mjf xcal  intr  ithr   csw  icsw  migr   smtx  srw  syscl usr sys wt idl sze
  0 35140   0 2380 59742 19476 93056 30906 32919 256336 1104 967806  65  35  0   0  32

smtx 256336: spins on mutex

I also need to look at:

icsw 30906:  involuntary context switches
migr 32919:  thread migration
syscl 967806: system calls
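
As a rough way to put these counters in proportion, the sketch below pairs the mpstat header with the data row quoted above and divides the counters of interest by the number of CPUs. It assumes the row is an aggregate over the whole processor set and that `sze` is the number of CPUs in that set; the function names are hypothetical.

```python
# Header and data row copied from the mpstat output quoted above.
HEADER = "SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze"
ROW = "0 35140 0 2380 59742 19476 93056 30906 32919 256336 1104 967806 65 35 0 0 32"


def parse_mpstat_row(header, row):
    """Map each mpstat column name to its integer value."""
    names = header.split()
    values = [int(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def per_cpu(sample, fields):
    """Divide the selected counters by the set size.

    Assumption: the counters are summed over the set and 'sze' is the
    number of CPUs in it.
    """
    ncpu = sample["sze"]
    return {f: sample[f] / ncpu for f in fields}


sample = parse_mpstat_row(HEADER, ROW)
rates = per_cpu(sample, ["smtx", "icsw", "migr", "syscl"])
for name, value in rates.items():
    print(f"{name:>6}: {sample[name]:>7} total, {value:10.1f} per CPU")
```

Running the same script on samples taken at different load levels would show whether smtx grows faster than syscl as load increases.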

And given that Solaris spends 70% of CPU in userland, I probably need
to break down how much time is spent in apache, libphp, libc, etc.
(BTW, what is your approach for the usr% you mentioned above?)

I am not open to a broader discussion because this is the ZFS mailing list.
ZFS root lock contention is what I have observed so far, so that is what
I can post on this forum. I can take care of the other aspects myself and
may ask for help somewhere else.

I admit I didn't dig into Linux, and I agree there could be lock contention
there as well. But since Solaris consumes more CPU, and hence more system
power, than Linux, I think I have to look at Solaris first to see whether
any tuning needs to be done.

Do you agree this is the right way to go ahead?

zfs-discuss mailing list