Very belatedly, I'm just getting around to reviewing this.  Overall
I think it looks good.  Comments in-line.

> 1) "Hard" vs. "Soft" RM configuration within zonecfg
>
...
>               dedicated-cpu
>                       ncpus (a positive integer or range, default value 1)
>                       importance (a positive integer, default value 1)
>                       max-lwps (an integer >= 100)

Why >= 100?  I can envision a minimized zone where 100 is too many.

>               capped-cpu
>                       cpu-cap (a positive integer, default value 100 which
>                                represents 100% of one cpu)

I'm scared of this default.  To put it another way, why did you pick
100?  Should there be a value which represents infinity?  What is
the meaning of specifying 0, or is that an error?

>                       max-lwps (an integer >= 100)
>                       cpu-shares (a positive integer)
>               dedicated-memory
>                       TBD - once msets [12] are completed
>               capped-memory
>                       cap (a positive decimal number with optional k, m, g,
>                            or t as a modifier, no modifier defaults to units
>                            of megabytes(m), must be at least 1m)

I think this set of rules is too complex and too confusing for users--
it's weird to have the default units be larger than the smallest
available units.  Let's mandate that the user *always* specify units.
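
To make the "always specify units" rule concrete, here is a rough sketch
in Python.  It is illustration only, not zonecfg code: the name parse_cap
is made up, and I'm assuming binary multipliers (1k = 1024 bytes) and
case-insensitive modifiers, neither of which the proposal states.

```python
import re

# Multipliers for the k, m, g, t modifiers named in the proposal.
# Assumption: binary units (1k = 1024 bytes).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
_MIN_BYTES = _UNITS["m"]  # the proposal's "must be at least 1m"


def parse_cap(text):
    """Return the cap in bytes; a value with no unit modifier is an error."""
    # Assumption: modifiers are accepted in either case.
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kmgt])", text.strip().lower())
    if match is None:
        raise ValueError(
            f"cap {text!r} must be a positive decimal followed by k, m, g, or t"
        )
    value = int(float(match.group(1)) * _UNITS[match.group(2)])
    if value < _MIN_BYTES:
        raise ValueError(f"cap {text!r} is below the 1m minimum")
    return value
```

With this rule "512m" and "1.5g" are accepted, while a bare "512" is
rejected outright instead of being silently read as megabytes.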

> 
> 2) Temporary Pools.
> 
...
>       If a dedicated-cpu (or eventually a dedicated-memory) resource is
>       configured for the zone, then when the zone boots zoneadmd will create
>       a temporary pool dedicated for the zones use.  Zoneadmd will
>       dynamically create a pool & pset (or eventually a mset) and assign the
>       number of cpus specified in zonecfg to that pset.  The temporary pool
>       & pset will be named 'SUNWzone{zoneid}'.

Could we somehow work the zone name into this?  It would be nice for
e.g. poolstat(1) observability.  Otherwise the user experience is going
to be all about trying to work out what 'SUNWzone34' maps to, which
seems poor.

>       Zoneadmd will set the 'pset.min' and 'pset.max' pset properties, as
>       well as the 'pool.importance' pool property, based on the values
>       specified for dedicated-cpu's 'ncpus' and 'importance' properties
>       in zonecfg.

Is importance mandatory?  Will it have a default value?  What values can
it have?  What does it mean?  Please be a little more specific.

>       If the cpu (or memory) resources needed to create the temporary pool
>       are unavailable, zoneadmd will issue an error and the zone won't boot.
> 
>       When the zone is halted, the temporary pool & pset will be destroyed.

What about during a reboot?  It seems like it'd be good to not tear
down the temporary pool during reboot, but maybe that's hard.  It would
seem weird to me if my pool was 2-4 CPUs, and I had 2, then rebooted and
had 4.

>       We will add a new boolean property ('temporary') that can exist on
>       pools and any resource set.  The 'temporary' property indicates that
>       the pool or resource set should never be committed to a static
>       configuration (e.g. pooladm -s) and that it should never be destroyed
>       when updating the dynamic configuration from a static configuration
>       (e.g. pooladm -c).  These temporary pools/resources can only be managed
>       in the dynamic configuration.  These changes will be implemented within
>       libpool(3LIB).
> 
>       It is our expectation that most users will never need to manage
>       temporary pools through the existing poolcfg(1M) commands.  For users
>       who need more sophisticated pool configuration and management, the
>       existing 'pool' resource within zonecfg should be used and users
>       should manually create a permanent pool using the existing mechanisms.
> 
> 3) Resource controls in zonecfg will be simplified.
...
>       Here are the aliases we will define for the rctls:
>               alias           rctl
>               -----           ----
>               max-lwps        zone.max-lwps
>               cpu-shares      zone.cpu-shares
>               cpu-cap         zone.cpu-cap (future, once cpu-caps integrate)

You've mentioned that you will substitute in sort of "the right"
defaults for the privileged and action fields.  It seems like you should
spell out what those will be...

    alias         rctl
    --------------------------------------------------------------
    cpu-shares=X  zone.cpu-shares(privileged, X, none)
     ...
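
As a sketch of what I mean by spelling out the expansion, something
like the following Python would do.  It is illustration only:
ALIAS_TABLE and expand_alias are made-up names, and only the cpu-shares
row is filled in, because the other defaults are exactly what I'm asking
you to specify.  The output uses the existing zonecfg rctl syntax.

```python
# alias -> (rctl name, privilege, action).
# Only the cpu-shares row comes from this thread; the other aliases are
# deliberately left out until their defaults are spelled out.
ALIAS_TABLE = {
    "cpu-shares": ("zone.cpu-shares", "privileged", "none"),
}


def expand_alias(alias, limit):
    """Render an alias setting as the equivalent zonecfg rctl commands."""
    if alias not in ALIAS_TABLE:
        raise KeyError(f"no rctl expansion defined for alias {alias!r}")
    name, priv, action = ALIAS_TABLE[alias]
    return [
        "add rctl",
        f"set name={name}",
        f"add value (priv={priv},limit={limit},action={action})",
        "end",
    ]
```

For example, expand_alias("cpu-shares", 20) yields an 'add rctl' stanza
whose value line is "(priv=privileged,limit=20,action=none)".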

>       If an rctl was already defined that did not match the expected value
>       (e.g. it had 'action=none' or multiple values), then the 'max-lwps'
>       alias will be disabled.  An attempt to set 'max-lwps' within
>       'dedicated-cpu' would print the following error:
>               "One or more incompatible rctls already exist for this
>                property"
> 
>       This rctl alias enhancement is fully backward compatible with the
>       existing rctl syntax.  That is, zonecfg output will continue to display
>       rctl settings in the current format (in addition to the new aliased
>       format) and zonecfg will continue to accept the existing input syntax
>       for setting rctls.  This ensures full backward compatibility for any
>       existing tools/scripts that parse zonecfg output or configure zones.

Maybe I missed it-- but what is the behavior of 'zonecfg export' going to be?

> 4) Enable rcapd to limit zone memory while running in the global zone
> 
>       Currently, to use rcapd(1M) to limit zone memory consumption, the
>       rcapd process must be run within the zone.  This exposes a loophole
>       since the zone administrator, who might be untrusted, can change the
>       rcapd limit.

Suggest rewording: "While useful in some configurations, in situations
where the zone administrator is untrusted, this is ineffective, since
the zone administrator could simply change the rcapd limit."

>       We will enhance rcapd so that it can limit zone's memory consumption
>       while it is running in the global zone.  This closes the rcapd
>       loophole and allows the global zone administrator to set memory
>       caps that can be enforced by a single, trusted process.

Ditto on the rewording (basically, I think "loophole" is too vague).

>       The rcapd limit for a zone will be configured using the new

Here you say "a zone"-- can you be precise?  Does that include the
global zone?

>       'capped-memory' resource and 'cap' property within zonecfg.
>       When a zone with 'capped-memory' boots, zoneadmd will automatically
>       start rcapd in the global zone, if necessary.  The interfaces to

Would it be better to say "enable the rcap service"?

>       communicate memory cap information between zoneadmd and rcapd
>       are project private.

At an architectural level, it'd be nice to summarize them; for example,
does one need to reboot the zone to get the new setting?  Is there
any way to do online tuning of the value?  Should this just be done
with SMF properties?

>       As part of this overall project, we will be enhancing the internal
>       rcapd rss accounting so that rcapd will have a more accurate
>       measurement of the overall rss for each zone.

More detail would be appreciated.

> 5) Use FSS when zone.cpu-shares is set 
> 
>       Although the zone.cpu-shares rctl can be set on a zone, the Fair Share
>       Scheduler (FSS) is not the default scheduling class so this rctl
>       frequently has no effect, unless the user also sets FSS as the
>       default scheduler or changes the zones processes to use FSS with the
>       priocntl(1M) command.  This means that users can easily think
>       they have configured their zone for a behavior that they are not
>       actually getting.
> 
>       We will enhance zoneadmd so that if the zone.cpu-shares rctl is set
>       and FSS is not already the default scheduling class, zoneadmd will set
>       the scheduling class to be FSS for processes in the zone.

Just for that zone?  This still seems confusing to users-- you could
have three zones with FSS on, and two without.  How about *also* issuing
a warning at zone boot if FSS is not the machine-wide default?
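
Roughly, the check I have in mind looks like the Python sketch below.
It is illustration only: fss_boot_warning is a made-up name, the message
wording is just a placeholder, and I'm assuming the caller already knows
the system default scheduling class (however zoneadmd would obtain it).

```python
def fss_boot_warning(zone_name, has_cpu_shares, system_default_class):
    """Return a warning string for zone boot, or None if none is needed."""
    # No warning when cpu-shares is unset, or when FSS is already the
    # machine-wide default and the zone behaves like everything else.
    if not has_cpu_shares or system_default_class == "FSS":
        return None
    return (
        f"warning: zone '{zone_name}' will run under FSS because "
        f"zone.cpu-shares is set, but the system default scheduling "
        f"class is {system_default_class}"
    )
```

The zone still boots with FSS as proposed; the only addition is that
the administrator is told the machine is running mixed scheduling classes.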

Apropos my earlier (today) comment about dispadmin, should we have
some sort of 'dispadmin -d -do-it-now' option?

> 7) Pools system objective defaults to weighted-load (wt-load)[4]
> 
>       Currently pools are delivered with no objective set.  This means that
>       if you enable the poold(1M) service, nothing will actually happen on
>       your system.
> 
>       As part of this project, we will set weighted load
>       (system.poold.objectives=wt-load) to be the default objective.
>       Delivering this objective as the default does not impact systems out
>       of the box since poold is disabled by default.

What happens if you boot a zone which uses temporary pools, but pools
are not enabled?  Should booting zones enable poold?

        -dp

-- 
Daniel Price - Solaris Kernel Engineering - [EMAIL PROTECTED] - blogs.sun.com/dp
_______________________________________________
zones-discuss mailing list
zones-discuss@opensolaris.org
