Dave Pigliavento wrote:
I have a Solaris 10 release 6/06 system that is configured with several zones, 
one of which is now in an unusable state after a reboot attempt.

The status of the zone is shutting_down and there is a defunct process that 
can't be killed. preap just hangs indefinitely and won't even get rid of the 
defunct process.

2 oracle-grid      shutting_down  /zones/oracle-grid

UID   PID  PPID   C    STIME TTY         TIME CMD
root   551     1   0   Mar 19 ?           0:00 zsched
0005000  1919     1   0   Mar 19 ?           0:05 /u01/app/oracle/OracleHomes/oms10g/Apache/Apache/bin/httpd -d /u01/app/oracle/O
0005000  1956  1919   0        - ?           0:00 <defunct>


Aside from rebooting the entire system, is there anything I can do to clean this 
up?

Any assistance would be greatly appreciated.

It doesn't look like the zombie process is the problem, since its parent
is still there and that parent process is apparently hung in the kernel.
Whenever something like this happens it probably means you are hitting
a kernel bug.  It is almost certainly not a zones bug but a bug in some
other part of the kernel or in a driver.  In order to get a bug filed,
or to see if it is a known bug, we really need a system dump so that we
can debug the problem a bit and then get it assigned to the proper team.
Since your system is still responsive, you should be able to run
'savecore -L' to take a live dump, provided you have a dedicated dump
device.  If you don't have a dedicated dump device, then instead of simply
rebooting, you can use one of the other techniques, such as 'halt -d', to
force a dump when you shut down.
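
As a rough sketch of the live-dump route, assuming you are root in the global zone and a dedicated dump device is already configured: the function name take_live_dump is made up for illustration, while dumpadm and savecore are the standard Solaris utilities.

```shell
#!/bin/sh
# Sketch only: take a live crash dump on Solaris 10.
# Assumes root in the global zone and a dedicated dump device.
# take_live_dump is a hypothetical helper name, not a system command.
take_live_dump() {
    # Print the current dump configuration (dump device, savecore directory)
    # so you can confirm a dedicated dump device is in use.
    dumpadm || return 1

    # Save a snapshot of the running kernel without rebooting.
    savecore -L
}

# If there is no dedicated dump device, 'halt -d' forces a crash dump
# on the way down instead of a plain reboot.
```

Call take_live_dump once you have checked the dumpadm output; the resulting crash dump files are written to the savecore directory that dumpadm reports.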

Unfortunately, when a thread is stuck in the kernel like this there is
probably no way to kill it without rebooting the system, but if we can
get a system dump then we can at least try to get to the root cause of the bug.
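
If you want a first look at where the thread is stuck before taking the dump, something along these lines may help. It is only a sketch: kernel_stacks is a hypothetical helper name, it assumes root in the global zone, and the PID would be that of the hung parent (1919 in the ps output above).

```shell
#!/bin/sh
# Sketch only: print the kernel stack of every thread in a process
# using mdb against the live kernel. Needs root in the global zone.
# kernel_stacks is a hypothetical helper name.
kernel_stacks() {
    pid=$1
    # The 0t prefix tells mdb the PID is decimal.
    echo "0t${pid}::pid2proc | ::walk thread | ::findstack -v" | mdb -k
}
```

For example, 'kernel_stacks 1919' would show which kernel function each thread of the hung httpd is blocked in, which is useful to include in a bug report.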

Jerry
_______________________________________________
zones-discuss mailing list
zones-discuss@opensolaris.org
