On Thu 01 Dec 2011 at 10:39AM, Derek McEachern wrote:
> Have a peculiar problem that I haven't seen before.
> When starting a system that has about 35 - 40 zones on it occasionally we
> see that one of the zones doesn't come up properly. You can log into the
> zone but none of the /etc/rc3.d scripts have been run.
> /var/adm/messages is completely empty and when running who -r to see the
> run level it doesn't report anything.
Take a look at the output of svcs -x inside the affected zone. Most
likely a service that svc:/milestone/multi-user-server:default depends
on (directly or indirectly) has timed out and has therefore been placed
in maintenance. Because that dependency is not satisfied, the milestone
never comes online, so the rc3 scripts are not run.
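To make that check concrete, here is a rough sketch of the commands I would run inside the zone. The function name check_milestone is only for illustration; svcs -x, svcs -l and svcs -d are the standard SMF queries.

```shell
# Sketch: find out why the multi-user-server milestone did not come online.
# Run inside the affected zone (for example after zlogin <zonename>).
# check_milestone is a hypothetical helper name, not an existing tool.
check_milestone() {
    fmri="${1:-svc:/milestone/multi-user-server:default}"

    # Explain services that are enabled but not running, or that are
    # blocking other services (look for "maintenance" in the output).
    svcs -x

    # Show the milestone's own state and its dependency list.
    svcs -l "$fmri"

    # List the services the milestone depends on, with their states.
    svcs -d "$fmri"
}
```

Calling check_milestone with no argument inspects the default milestone; any service reported in maintenance is the first thing to look at, along with the log file that svcs -x names for it.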
My guess is the timeout is because so many zones are starting at once
that the disks are being thrashed. The resulting I/O backlog slows down
the startup of services, which leads to timeouts, which lead to some
services failing to even try to start.
A Google search and a 5-second read suggest that this link may help
with adjusting the timeout of services that need a longer one:
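Separately from that link, a minimal sketch of how a start-method timeout is commonly raised with svccfg. The function name raise_start_timeout and the 300-second default are my own assumptions, and depending on the manifest the start property group may live on the service rather than the instance, in which case the service FMRI is the one to pass.

```shell
# Sketch: raise the start-method timeout of one SMF service.
# raise_start_timeout is a hypothetical helper name; the FMRI and the
# number of seconds are supplied by the caller (300 is an assumed default).
raise_start_timeout() {
    fmri="${1:?usage: raise_start_timeout FMRI [SECONDS]}"
    seconds="${2:-300}"

    # Set the timeout on the start method's property group.
    svccfg -s "$fmri" setprop start/timeout_seconds = count: "$seconds"

    # Make the running configuration pick up the new value.
    svcadm refresh "$fmri"
}
```

For a service that is already in maintenance, running svcadm clear on it afterwards lets SMF retry the start with the new timeout.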
Solaris Core OS / Zones http://blogs.oracle.com/zoneszone/
zones-discuss mailing list