On Mon, Nov 10, 2008 at 12:30 AM, Mike Gerdts <[EMAIL PROTECTED]> wrote:
> On Sun, Nov 9, 2008 at 7:54 PM, Jeff Victor <[EMAIL PROTECTED]> wrote:
>> <zonestat intro snipped>
>> If you have any comments, or suggestions for improvement, please let
>> me know on this e-mail list or via private e-mail.
> I've had such needs for a while and have developed some tools to help my 
> organization with that.
> Unfortunately, I'm not able to share that code.  I am able to share 
> suggestions...
> I am in a habit of:
> #! /usr/bin/perl -w
> use strict;

Yes, those generated warnings when I used them earlier. I wanted
to get the code "out the door" and took a couple of shortcuts to do
that. I will address the warnings soon and put those checks back in.

> That catches a lot of mistakes that may be masked by:
> close STDERR;
> which I never do. :)

:-) Another of the shortcuts. I hope to remove those shortcuts in
v1.3, which should be done this week.

> Please do not use /etc/release as a test of kernel functionality.
> Those that patch to an equivalent level as the update release have a
> similar level of functionality.  A better mechanism would be to check
> for specific kernel patches.

Great idea, I'll look into it.

> # Get amount and cap of memory locked by processes in each zone.
> # "kstat -p caps:*:lockedmem_zone_*" conveniently summarizes all zones for us.
> #
> open(KSTAT, "/usr/bin/kstat -p caps:*:lockedmem_zone_* |");
> while (<KSTAT>) {
>
> You could just use Sun::Solaris::Kstat rather than forking another perl
> script.

Yup, that was in the ToDo list: convert all uses of /usr/bin/kstat to
uses of the Kstat module. I might sneak that into v1.3 along with
significant improvements in identifying zone->project mappings.
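
A minimal sketch of what that conversion might look like, not the
actual zonestat code. It assumes the caps kstats expose "usage",
"value" and "zonename" statistics, as the kstat -p output suggests.
The sub name lockedmem_by_zone is made up for illustration. The kstat
hash is passed in as an argument so the logic can be tried with a
plain nested hash on a non-Solaris box.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the "caps" module of a kstat hash and return
# { zonename => { usage => ..., cap => ... } } for locked memory.
# On Solaris, pass in Sun::Solaris::Kstat->new(); a plain nested hash with
# the same module/instance/name/statistic layout works as well.
sub lockedmem_by_zone {
    my ($ks) = @_;
    my %result;
    my $caps = $ks->{caps} or return \%result;
    foreach my $instance (keys %$caps) {
        foreach my $name (keys %{ $caps->{$instance} }) {
            # Skip the other caps kstats (CPU caps, swap, and so on).
            next unless $name =~ /^lockedmem_zone_\d+$/;
            my $stat = $caps->{$instance}{$name};
            # Assumption: these statistic names match your release.
            $result{ $stat->{zonename} } = {
                usage => $stat->{usage},
                cap   => $stat->{value},
            };
        }
    }
    return \%result;
}
```

On a Solaris system the call would be along the lines of
lockedmem_by_zone(Sun::Solaris::Kstat->new()), which avoids the fork
of /usr/bin/kstat entirely.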

> My feeling on capped memory is that if it becomes an issue and capped
> swap is not really close to capped memory, the over-consumptive zone
> has too high of a chance of causing horrible I/O problems for all
> zones.  That is, the cap is likely to do more harm than good.  This
> may change if swap can go onto solid state disk.  I only mention this,
> because I don't see a purpose in capping RSS, rather I cap swap.

For "fast leaks" and DoS attacks, I agree. The RAM cap helps with slow
leaks and temporary overconsumption of RAM.

> FWIW, I tend to use the term "reserved memory" instead of "swap"
> because that is less confusing to most people.

That's a useful perspective. If you choose the swap cap - which is
really a VM cap - so that the sum of the swap caps is less than RAM,
you have effectively implemented 'reserved memory.' (I'm ignoring RAM
usage of the global zone, which shouldn't be ignored in practice.)
But you must be careful: nothing prevents you from 'over-reserving'
memory. If you have 'reserved' all of system memory in this way, and
add a new zone with its own 'reserve,' you will have over-subscribed
memory. That might be a good thing, as long as no one is surprised if
the system starts paging.

However, the entire concept of reserved memory limits the scalability
of the system. Imagine 4 zones with swap caps of 4GB, on a system with
16GB of RAM. (Again, I'm ignoring the GZ.) Unless you allow yourself
to over-subscribe RAM, you can't add more zones, even if those 4 zones
are only using 1GB each during normal conditions.
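
The arithmetic behind that example can be sketched in a few lines of
Perl. The sub name unreserved_gb and the zone names are hypothetical,
and global-zone usage is ignored here just as it is in the example.

```perl
use strict;
use warnings;

# Hypothetical helper: given RAM size and a hash of zone => swap cap
# (all in GB), return how much RAM is still unreserved. A negative
# result means the swap caps over-subscribe RAM.
sub unreserved_gb {
    my ($ram_gb, $caps) = @_;
    my $reserved = 0;
    $reserved += $_ for values %$caps;
    return $ram_gb - $reserved;
}

# Four zones, each with a 4GB swap cap, on a 16GB system.
my %caps = map { ("zone$_" => 4) } 1 .. 4;
printf "unreserved: %dGB\n", unreserved_gb(16, \%caps);

# Adding a fifth zone with its own 4GB 'reserve'.
$caps{zone5} = 4;
printf "unreserved: %dGB\n", unreserved_gb(16, \%caps);
```

The first call reports 0GB left and the second reports -4GB, which is
the over-subscribed case: fine if paging is acceptable, a problem if
it is not.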

Balance is needed. When paging must be avoided at all costs,
'reserving' memory by setting proper swap caps makes a great deal of
sense. When paging is unlikely because the workload is well
understood, and a small amount of paging would not be horrible, and
zone 'density' is important, reserving memory would not make sense.
Many situations would call for memory 'reservations' on some zones,
and RAM caps on others.

> For CPU related stats, take a look at a discussion I started a while back:
> http://mail.opensolaris.org/pipermail/perf-discuss/2005-November/002048.html

Cool. Also, Jim Fiori had a simple idea for counting CPU time per zone
with almost no perf impact: use DTrace to implement a probe which
fires every M microseconds, and increments a per-zone counter. But
that's a short-term solution. We need a per-zone counter in the kernel
that tallies CPU time per zone.

> One project I would like to kick off sometime is doing per user, per
> project, and per zone microstate accounting.

Excellent idea. I'll watch for it! :-)


> I didn't have a chance to check logic closely or run it on a test
> system.  I'll offer more feedback if needed when I get a chance to
> test it.  It is a great start and I can't wait to see it progress.


zones-discuss mailing list
