Hi Jerry,
Jerry Jelinek wrote on 08/17/06 09:20:
Steffen,
Steffen Weiberle wrote:
Jerry Jelinek wrote on 08/16/06 18:14:
Steffen,
Thanks for your comments. Responses in-line.
Steffen Weiberle wrote:
Hi Jerry, this is great.
I have a few comments below.
Thanks
Steffen
1) "Hard" vs. "Soft" RM configuration within zonecfg
We will enhance zonecfg(1M) so that the user can configure basic RM capabilities in a structured way.
Various existing and upcoming RM features can be broken down into "hard" vs. "soft" partitioning of the system's resources. With "hard" partitioning, resources are dedicated to the zone using processor sets (psets) and memory sets (msets). With "soft" partitioning, resources are shared, but capped, with an upper limit on their use by the zone.
         | Hard  | Soft
---------+-------+----------
cpu      | psets | cpu-caps
memory   | msets | rcapd
Within zonecfg we will organize these various RM features into four basic zonecfg resources so that it is simple for a user to understand and configure the RM features that are to be used with their zone.

Note that zonecfg "resources" are not the same as the system's cpu & memory resources or "resource management". Within zonecfg, a "resource" is the name of a top-level property group for the zone (see zonecfg(1M) for more information).
Are you saying just the names are different, or are there other
differences as well?
Unfortunately the word "resource" is overloaded here. zonecfg(1M) uses it to mean a group of properties, which has nothing to do with system resources (e.g. cpu or memory) or how the word "resource" is used under the umbrella of Solaris Resource Management.
Now I am more confused. You mean
zonecfg              pool[adm,cfg]
-------              -------------
dedicated-cpu     != CPU resource pool (pset and pool)
capped-cpu        != cpu-caps
dedicated-memory  != memory resource pool
Sorry, I misunderstood your question here. The mapping *is* to these
RM features. I actually have this shown about two paragraphs down
from here. I thought you were talking about the word 'resource'
and how we use that word to mean different things in zones and RM.
OK Thanks
The four new zonecfg resources are:
dedicated-cpu
capped-cpu (future, after cpu-caps are integrated)
dedicated-memory (future, after memory sets are integrated)
capped-memory
Each of these zonecfg resources will have properties that are appropriate to the RM capabilities associated with that resource. Zonecfg will only allow one instance of each of these resources to be configured, and it will not allow conflicting resources to be added (e.g. dedicated-cpu and capped-cpu are mutually exclusive).
The mapping of these new zonecfg resources to the underlying RM feature is:

dedicated-cpu     -> temporary pset
dedicated-memory  -> temporary mset
capped-cpu        -> cpu-cap rctl [14]
capped-memory     -> rcapd running in global zone
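As a sketch, configuring a dedicated-cpu resource might look something like this in zonecfg (the zone name 'myzone' and the values are hypothetical, and the exact property syntax is illustrative, based on the 'ncpus' and 'importance' properties described in section 2):

    # zonecfg -z myzone
    zonecfg:myzone> add dedicated-cpu
    zonecfg:myzone:dedicated-cpu> set ncpus=1-4
    zonecfg:myzone:dedicated-cpu> set importance=10
    zonecfg:myzone:dedicated-cpu> end
    zonecfg:myzone> exit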
2) Temporary Pools.
We will implement the concept of "temporary pools" within the pools framework.
To improve the integration of zones and pools we are allowing the configuration of some basic pool attributes within zonecfg, as described above in section 1. However, we do not want to extend zonecfg to completely and directly manage standard pool configurations. That would lead to confusion and inconsistency regarding which tool to use and where configuration data is stored. Temporary pools sidestep this problem and allow zones to dynamically create a simple pool/pset configuration for the basic case where a sysadmin just wants a specified number of processors dedicated to the zone (and eventually a dedicated amount of memory).
We believe that the ability to simply specify a fixed number of cpus (and eventually a mset size) meets the needs of a large percentage of zones users who need "hard" partitioning (e.g. to meet licensing restrictions).
If a dedicated-cpu (and/or eventually a dedicated-memory) resource is configured for the zone, then when the zone boots zoneadmd will enable pools if necessary and create a temporary pool dedicated for the zone's use. Zoneadmd will dynamically create a pool & pset (and/or eventually a mset) and assign the number of cpus specified in zonecfg to that pset. The temporary pool & pset will be named 'SUNWtmp_{zonename}'. Zonecfg validation will disallow an explicit 'pool' property name beginning with 'SUNWtmp'.
Zoneadmd will set the 'pset.min' and 'pset.max' pset properties, as well as the 'pool.importance' pool property, based on the values specified for dedicated-cpu's 'ncpus' and 'importance' properties in zonecfg, as described above in section 1.
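What zoneadmd does at boot is roughly equivalent to the following manual poolcfg(1M) operations on the dynamic configuration (the zone name 'myzone' and the values are hypothetical, and zoneadmd will actually use libpool(3LIB) directly rather than running poolcfg):

    # poolcfg -dc 'create pset SUNWtmp_myzone (uint pset.min = 1; uint pset.max = 4)'
    # poolcfg -dc 'create pool SUNWtmp_myzone (int pool.importance = 10)'
    # poolcfg -dc 'associate pool SUNWtmp_myzone (pset SUNWtmp_myzone)'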
If the cpu (or memory) resources needed to create the temporary pool are unavailable, zoneadmd will issue an error and the zone won't boot. When the zone is halted, the temporary pool & pset will be destroyed.
We will add a new boolean libpool(3LIB) property ('temporary') that can exist on pools and any pool resource set. The 'temporary' property indicates that the pool or resource set should never be committed to a static configuration (e.g. pooladm -s) and that it should never be destroyed when updating the dynamic configuration from a static configuration (e.g. pooladm -c). These temporary pools/resources can only be managed in the dynamic configuration. Support for temporary pools will be implemented within libpool(3LIB) using the two new consolidation private functions listed in the interface table below.
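In other words, the temporary property changes the behavior of the existing pooladm(1M) save/restore operations roughly as follows (assuming a temporary pool named SUNWtmp_myzone exists in the dynamic configuration):

    # pooladm -s    # save dynamic config to /etc/pooladm.conf;
                    #   the temporary pool/pset is skipped, not written out
    # pooladm -c    # instantiate the static config;
                    #   the temporary pool/pset is left intact, not destroyed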
It is our expectation that most users will never need to manage temporary pools through the existing poolcfg(1M) commands. For users who need more sophisticated pool configuration and management, the existing 'pool' resource within zonecfg should be used and users should manually create a permanent pool using the existing mechanisms.
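For reference, binding a zone to such a manually created permanent pool uses the existing 'pool' property in zonecfg, e.g. (pool name hypothetical):

    zonecfg:myzone> set pool=mypool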
Will the existing pool commands show the results as if they were
created using those commands? It would be a useful learning and
templating tool to apply the resulting configuration(s) to scripts
using the existing commands for the future.
No, that is not part of this proposal. I am actually not quite sure what you are asking here. Maybe we could take that offline?
I'll clarify here because it may be of general interest. When playing with CPU resource pools you use poolcfg, pooladm, and poolstat. If I have not configured any resource pools and configure a zone with resources via zonecfg, when the zone is running, do the pool* commands reflect the temporary pool or do they show just the default?

If you have not configured a pool, poolstat, for example, returns

    poolstat: couldn't open pools state file: Facility is not active

Will it still behave like that? Or will the system now reflect that resource pools are in use, just not configured using the existing commands? Does the concept of "temporary" make it invisible to the standard pool tools?
No, the temporary pools are fully visible using the standard tools and
can even be manipulated dynamically using the poolcfg command. It is just
that the temporary pools cannot be committed to a static configuration file
and will not be overwritten by updates from a static configuration file.
Another way to say this is that you have to use the '-d' flag on poolcfg
to manipulate temporary pools.
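For example, with a zone running that has a dedicated-cpu resource, something like the following should work against the dynamic configuration (zone name and value hypothetical):

    # poolcfg -dc info
    # poolcfg -dc 'modify pset SUNWtmp_myzone (uint pset.max = 8)'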
I see it now.
Here are the aliases we will define for the rctls:
alias        rctl              priv         action
-----        ----              ----         ------
max-lwps     zone.max-lwps     privileged   deny
cpu-shares   zone.cpu-shares   privileged   none
Coming in the near future, once the associated projects integrate [14, 17, 18]:

alias               rctl                     priv         action
-----               ----                     ----         ------
cpu-cap             zone.cpu-cap             privileged   deny
max-locked-memory   zone.max-locked-memory   privileged   deny
max-shm-memory      zone.max-shm-memory      privileged   deny
max-shm-ids         zone.max-shm-ids         privileged   deny
max-msg-ids         zone.max-msg-ids         privileged   deny
max-sem-ids         zone.max-sem-ids         privileged   deny
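As a sketch of how the aliases might be used as simple properties in zonecfg (values hypothetical):

    zonecfg:myzone> set max-lwps=1000
    zonecfg:myzone> set cpu-shares=20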
What is the purpose of some of these zone.* controls? Is it to limit what a privileged user can set the values to for projects, etc. in that zone, or does it set the defaults for the zone as well?
These set the upper limits for the zone as a whole. Thus, the
non-global zone admin cannot exceed these since they are controlled
by the global zone admin.
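For instance, the global zone admin can inspect a zone-wide limit on a running zone with the existing prctl(1) command, something like (zone name hypothetical):

    # prctl -n zone.max-lwps -i zone myzone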
OK
I can see it being easier to configure different DB zones from the global zone via zonecfg than having to enter, delegate, and educate the zone users on how to set them. I'm leaning towards making these the defaults for the zone, not just the limit.
You won't be able to override these in the non-global zone.
What happens if zonecfg sets them lower than the Solaris defaults? Will they be at the default, and when an admin tries to set them the limit is enforced, or will they already be lowered? Or will the setting not be accepted, with an error message?
Setting these lower than the Solaris defaults is the whole point of the rctl. That is, if you don't want a lower limit, you would not set the rctl in the first place. I think maybe I am not understanding what you are really asking here. The bottom line is that these are rctls, just like the other zone-wide rctls we have, and the purpose is to put an upper limit on the controlled resource that is lower than the system maximum. These new rctls are actually part of different projects:

PSARC 2004/580 zone/project.max-locked-memory Resource Controls
PSARC 2006/451 System V resource controls for Zones

This project is just describing rctl aliases for these new rctl capabilities.
And I got caught up in the System V ones, which is where that part was going. If the zone's parameters are lower than the defaults, e.g. for max-shm-ids, what happens? I was confusing that with the other resource limits.
Thanks again,
Steffen
Thanks,
Jerry