On Wed, 23 Dec 2009 15:26:17 +0100, Glenn Brunette <glenn.brune...@sun.com> wrote:
> Just verified that something is still wrong in b129, but the problem is
> _not_ with a vanilla configuration. This time around boot/halt #102,
> the system apparently shutdown/panic'ed? I was running it overnight
> and came in to a system that had been rebooted. I did not see any
> problem in the audit log nor in /var/adm/messages. Any pointers?
> I am running an Immutable Service Container configuration, based upon
> the installation steps at:
> pfexec pkg install SUNWmercurial
> hg clone https://kenai.com/hg/isc~source isc
> pfexec isc/bin/iscadm.ksh -N 0
> pfexec bootadm update-archive
> pfexec shutdown -g 0 -i 0 -y
> [after reboot]
> zlogin -C isc1
> [wait for zone isc1 to fully complete boot process]
> then run the script that I provided that stops and starts the zone.
> Apparently, there must be something wrong with the interaction of
> components. In this configuration, we have things like resource
> controls, auditing, IP Filter/IP NAT, and zones all enabled.
> Would it be possible for you to try the steps above on a fresh
> install of 2009.06 or later (b129 is where I am right now). Also,
> if you have other debugging methods, please let me know.
Hey Glenn, the good news is that I have an OSOL_130 system with ISC installed
as you described that reliably reproduces _something_.
That something is the system hanging completely when I run your script:
batsc...@osol:~# while : ; do echo "`date`:ZONE BOOT"; pfexec zoneadm -z isc1
boot; sleep 10; echo "`date`: ZONE HALT"; pfexec zoneadm -z isc1 halt; sleep
Note that sleep 30 didn't do it: it ran for 17 hours without an issue. After
changing this to sleep 10, however, I can reliably hang the system, usually
within 5 hours. Remote access is no longer possible, and even the local
console stops responding.
Taking a dump with F1-A when booted into kmdb does work, however.
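
For anyone who wants to compare pause lengths, here is a minimal sketch of
the boot/halt loop with the interval as a parameter and a cycle counter, so
the last line printed shows which cycle the system hung on. It is not
Glenn's original script. The names cycle_zone, ZONEADM and the optional
cycle limit are made up for illustration, and using the same pause after
boot and after halt is an assumption.

```shell
# cycle_zone ZONE PAUSE [MAX_CYCLES]
# Boot and halt ZONE repeatedly, sleeping PAUSE seconds after each step.
# MAX_CYCLES of 0 (the default) means loop until interrupted.
# ZONEADM is a hypothetical override so the loop can be dry-run,
# e.g. ZONEADM="echo zoneadm"; it defaults to "pfexec zoneadm".
cycle_zone() {
    zone=$1
    pause=$2
    max=${3:-0}
    n=0
    while [ "$max" -eq 0 ] || [ "$n" -lt "$max" ]; do
        n=$((n + 1))
        echo "`date`: ZONE BOOT #$n"
        ${ZONEADM:-pfexec zoneadm} -z "$zone" boot
        sleep "$pause"
        echo "`date`: ZONE HALT #$n"
        ${ZONEADM:-pfexec zoneadm} -z "$zone" halt
        sleep "$pause"
    done
}
```

With this, "cycle_zone isc1 10" would correspond to the failing case and
"cycle_zone isc1 30" to the one that survived 17 hours.
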
The bad news is that I'm not getting the dumps, sigh. This is due to bug:
6911155 kernel dump fails if panic happens in interrupt service routine
which is fixed in build 131.
So I will pursue this further once OSOL_131 has been released and this
system has been upgraded. I will finally have dumps by then.
I'll also contact you offline about how you can set up your systems
to capture crash dumps and anything else we might need.
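
As a general starting point only, and not the offline instructions mentioned
above, crash dump capture on Solaris is normally configured with dumpadm(1M).
The sketch below uses the standard -c, -s and -y options. The function name
setup_crash_dumps and the DUMPADM override are made up for illustration, and
the savecore directory is assumed to exist already.

```shell
# setup_crash_dumps SAVECORE_DIR
# Configure the system to save kernel crash dumps into SAVECORE_DIR.
# DUMPADM is a hypothetical override for dry runs, e.g. DUMPADM="echo dumpadm";
# it defaults to "pfexec dumpadm".
setup_crash_dumps() {
    dir=$1
    # Dump kernel memory pages only.
    ${DUMPADM:-pfexec dumpadm} -c kernel
    # Directory where savecore writes the dump files (assumed to exist).
    ${DUMPADM:-pfexec dumpadm} -s "$dir"
    # Run savecore automatically on reboot after a panic.
    ${DUMPADM:-pfexec dumpadm} -y
    # Print the resulting configuration for verification.
    ${DUMPADM:-pfexec dumpadm}
}
```

For example: setup_crash_dumps /var/crash/`uname -n`
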
zones-discuss mailing list