That's what I was thinking - do a re-install.  I ran broken for a month.
I was at my limit and gave the directory moves a shot.

Prior to that I did updates, and even upgrades (it originally happened
on a U1 install), but it continued to fail after the U2 and even U3b3 upgrades.

I did the grep and only the packages SUNWrcmdr and SUNWtnetr have
postinstalls that reference pam.conf.

But like I said, ever since I did the "move the directory entry of SUNWcsr
before the others", the system has been working again.  But I believe I
can still recreate this on my other BE.

Let me know if you want me to investigate further with the other BE
and I will, otherwise I'll just forget about it and hope it doesn't
happen again...

Enda O'Connor ( Sun Micro Systems Ireland) wrote:
I strongly suspect that somehow updatemanager has created some inconsistency.

If the old BE lists pam.conf as type e pamconf in /var/sadm/install/contents, then as long as the pspool/pkgmap entry is also e pamconf this should just install.

Perhaps grep for pam.conf in

and see if somehow something else is modifying this file outside of those packages that depend on SUNWcsr.

I'm running out of ideas short of a reinstall.


Brian Kolaci wrote:

Yep, it's there.  And the dependency is listed there.

The save/pspool... pkgmap entry is:

1 e pamconf etc/pam.conf 0644 root sys 3224 28137 1106348054
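
For anyone wanting to repeat this check, here is a minimal sketch that pulls the file type and class for one path out of a pkgmap. It assumes the leading "part" field is present, as in the entry above. The function name pkgmap_entry and the AWK variable are made up for illustration; on Solaris 10 you would probably want AWK=nawk or AWK=/usr/xpg4/bin/awk, since the old /usr/bin/awk does not take -v.

```shell
# Print "ftype class" for one path in a pkgmap file.
# Assumed field layout: part ftype class path mode owner group size cksum modtime
pkgmap_entry() {
    pkgmap=$1
    path=$2
    ${AWK:-awk} -v p="$path" '$4 == p { print $2, $3 }' "$pkgmap"
}

# Example, using the path from this thread:
#   pkgmap_entry /var/sadm/pkg/SUNWcsr/save/pspool/SUNWcsr/pkgmap etc/pam.conf
# For the entry quoted above this prints: e pamconf
```

Anything other than "e pamconf" here would point at a damaged pspool entry.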

I'm just stunned/amazed that the "directory swapping" actually worked.
But I know that was the only change that fixed the issue.  I had created
a zone right before it, did the directory entry swapping and created another
zone right after it, and the second zone was OK.

Also before the directory swapping I used DTrace to confirm that
something else was manipulating /etc/pam.conf in the zone prior
to the pamconf action script.

I still have an old BE that had the issue (with solaris 10 U3 beta 3).
I can boot into that to diagnose this further (just go to single user,
mount it and luactivate it, right?)  Let me know what I should do/try
to find out what is being installed in the zone prior to SUNWcsr.

Enda O'Connor ( Sun Micro Systems Ireland) wrote:

There should be a file /var/sadm/pkg/SUNWkrbu/save/pspool/SUNWkrbu/install/depend

which should explicitly call out SUNWcsr.

Could you check the /var/sadm/pkg/SUNWcsr/save/pspool/SUNWcsr/pkgmap and tell me what the entry is for pam.conf?
I suspect that this entry is corrupt.

Very much suspect that update manager has caused the system to become out of sync.

Basically in my experience updatemanager needs a lot of space in /var to work. It downloads and uncompresses the patches, which if installing a lot of patches can be space consuming, given that the avg patch is 6m zipped to start with.


Brian Kolaci wrote:

I believe that SUNWkrbu is creating the file /etc/pam.conf.
How are the dependencies expressed?  What tells (or should tell)
the system that SUNWcsr must be installed before SUNWkrbu (or
any other package)?

I don't have the logs from the failed update, but this all started
when "updatemanager" failed to update patches on the system.  The
reason for updatemanager failing is clear though (zonepath for the
local zones was on the root filesystem and it filled to 100% and
couldn't boot the zones since the /zones filesystem wasn't mounted).

Enda O'Connor ( Sun Micro Systems Ireland) wrote:

Hi Brian
Not clear to me why this is happening; my system appears to have correct dependencies on SUNWcsr. The only way that I know this could happen is if etc/pam.conf got converted to type "f" by mistake, which does not appear to be the case on your failed machine.
Otherwise, either a package creates pam.conf without a dependency on SUNWcsr,
or the pspool/SUNWcsr/pkgmap entry for pam.conf is damaged.

what is the pam.conf entry like in
/var/sadm/pkg/SUNWcsr/save/pspool/SUNWcsr/pkgmap (is it type e and pamconf CAS?)

Almost looks like the system is somehow corrupted by a patchrm failure.
But that is just a guess without any concrete evidence.
Did you remove any patches (if so, which ones)?
Do you have an install_log from a failed zone install (in /var/sadm/system/logs in the non-global zone)?

What release and what patches have you applied to the failed system?

Brian Kolaci wrote:

Hi Enda,

the grep returns:

/etc/pam.conf e pamconf 0644 root sys 3224 28137 1106348054 SUNWcsr
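
A small sketch for reading that contents line mechanically: it prints the file type, the class, and every package that claims the path. It assumes the regular-file layout seen above (path, type, class, mode, owner, group, size, cksum, modtime, then package names) and does not handle links or directories. contents_owners and the AWK variable are illustrative names only.

```shell
# Print "ftype class pkg..." for one path in an install contents file.
# Only handles regular-file style entries (types such as f, e, v).
contents_owners() {
    contents=$1
    path=$2
    ${AWK:-awk} -v p="$path" '$1 == p {
        printf "%s %s", $2, $3
        for (i = 10; i <= NF; i++) printf " %s", $i
        printf "\n"
    }' "$contents"
}

# Example:
#   contents_owners /var/sadm/install/contents /etc/pam.conf
# For the line above this prints: e pamconf SUNWcsr
```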

Yes, there was one of the other packages.  I looked and it appears
it was SUNWkrbu.

Strangely, when I mv'd the directories around (at least the way I did), it changed the order they came back from ls enough to make it start working again. My rationale was to look at the directories there, then clear the directory entry toward the top of the list by moving the directory out of the way, then move SUNWcsr back into the package directory (again assuming that it will be assigned to the first free slot), then move the other directory back in. Like I said, it did work, but having that work didn't give me a
warm & fuzzy feeling...

Enda O'Connor ( Sun Micro Systems Ireland) wrote:

Hi Brian
Packages are installed in dependency order, i.e. SUNWcsr always installs before any other package that requires it, no matter the ordering of them in /var/sadm/pkg. Basically any patch can change the order of the packages in /var/sadm/pkg in relation to time stamps etc., so we do need to do this in dependency order. Basically SUNWtnetr has /var/sadm/pkg/SUNWtnetr/install/depend which calls out SUNWcsr.
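
To check which packages actually declare that dependency, something along these lines could be used. It is only a sketch: it assumes each package keeps its dependencies in install/depend with prerequisite lines of the form "P pkgname description", and the function names has_prereq and missing_prereq (and the AWK variable) are made up here.

```shell
# Succeed if <pkgdir>/install/depend lists <prereq> on a "P" (prerequisite) line.
has_prereq() {
    depend=$1/install/depend
    [ -f "$depend" ] || return 1
    ${AWK:-awk} -v want="$2" '$1 == "P" && $2 == want { found = 1 }
                              END { exit !found }' "$depend"
}

# Print every named package under <pkgroot> that does NOT declare <prereq>.
missing_prereq() {
    pkgroot=$1
    want=$2
    shift 2
    for pkg in "$@"; do
        has_prereq "$pkgroot/$pkg" "$want" || echo "$pkg"
    done
}

# Example:
#   missing_prereq /var/sadm/pkg SUNWcsr SUNWsshdr SUNWtnetr SUNWrcmdr SUNWwebminu
```

A package with no depend file at all is reported as missing the prerequisite, which is the conservative choice for this kind of hunt.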

I mv'ed packages in /var/sadm/pkg around and it had no effect on the ordering, see <zone-path>/root/var/sadm/system/logs/install_log

it should never change with respect to SUNWcsr really, basically SUNWcsr will always install before any other package that calls out a dependency installs.

Could I see a log of a failed install? It seems some package is installing pam.conf without a dependency on SUNWcsr.

But unless the system is corrupted the packages you mention:

SUNWsshdr SUNWtnetr SUNWrcmdr SUNWwebminu

all install after SUNWcsr.

what does grep etc/pam.conf /var/sadm/install/contents  say?


Brian Kolaci wrote:

Well, I finally solved this obscure case.  I think this is a silly
way to determine package ordering and dependencies which can cripple
an installation.  I believe a bug should be filed, but I'm not sure
what to file it against.

Due to a failed patch update, something happened with the on-disk
directory ordering in /var/sadm/pkg.  I found that /etc/pam.conf
is referenced in the packages SUNWcsr SUNWsshdr SUNWtnetr SUNWrcmdr
SUNWwebminu SUNWman.  Apparently their installation order is based
on the order they're returned by opendir/readdir. There should be a
dependency on SUNWcsr by all packages that reference /etc/pam.conf
in any class action scripts, since SUNWcsr needs to install it
before any other package can modify it.

So the logic to determine package installation order needs to
be updated to include the above dependency.  What dependency checks
are used to calculate the order?  Is there any "official" order the
packages should be installed in (so that I may rebuild the /var/sadm/pkg
directory to be in the proper order)?

Should you run into this problem, I'll quickly post how I fixed this.

# cd /var/sadm/pkg
# ls -ltd SUNWcsr SUNWsshdr SUNWtnetr SUNWrcmdr SUNWwebminu

if SUNWcsr isn't at the top of the list, you're going to have problems.

I fixed it by creating a directory and moving the one at the top of the list into the tmp folder, then mv SUNWcsr to the tmp folder, then mv SUNWcsr back
followed by moving the other directory back.

# mkdir tmp
# mv SUNWtnetr tmp
# mv SUNWcsr tmp
# ls -ltd SUNWcsr SUNWsshdr SUNWtnetr SUNWrcmdr SUNWwebminu
SUNWcsr: No such file or directory
SUNWtnetr: No such file or directory
drwxr-xr-x   4 root     root         512 Jul 18 15:22 SUNWsshdr
drwxr-xr-x   4 root     root         512 Jul 18 15:21 SUNWrcmdr
drwxr-xr-x   4 root     root         512 May  3  2005 SUNWwebminu
# mv tmp/SUNWcsr .
# mv tmp/SUNWtnetr .
# ls -ltd SUNWcsr SUNWsshdr SUNWtnetr SUNWrcmdr SUNWwebminu

and maybe I just got lucky, but SUNWcsr was now at the top of the list. All zone creations now work properly and /etc/pam.conf matches the global zone.
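
One caveat when checking this: ls -t sorts by modification time, so it does not necessarily show the order readdir returns. A rough sketch for looking at raw directory order is below, using ls -f (which lists entries unsorted, in directory order). dir_order and dir_position are hypothetical helper names, and whether zone installation really follows this order is an assumption in this thread, not something the sketch proves.

```shell
# List entries of a directory in on-disk (unsorted) order, skipping . and ..
dir_order() {
    ls -f "$1" | ${AWK:-awk} '$0 != "." && $0 != ".."'
}

# Print the 1-based position of one entry within that order.
dir_position() {
    dir_order "$1" | ${AWK:-awk} -v n="$2" '$0 == n { print NR }'
}

# Example:
#   dir_position /var/sadm/pkg SUNWcsr
#   dir_position /var/sadm/pkg SUNWtnetr
```

Comparing the two positions before and after the mv shuffle would show whether the directory slot really changed.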

Brian Kolaci wrote:

Thanks for the reply.
I'm digging through i.pamconf to find out why it's not copying the file.
This seems to be the problem.  It's doing the editing, but not the
initial copy of the file. I checked the CLEANUP_FILE and found that it had logged messages "default entries updated", which means it is
not copying the file, which means it already exists.

Perhaps there's some kind of package installation ordering issue.
The i.pamconf script checks for the existence of /etc/pam.conf and
only copies it if it doesn't exist. If another package gets installed
before SUNWcsr that tries to manipulate /etc/pam.conf and actually
creates it, then the copy will never be done. This looks like what is happening.

What order are the packages installed in? Is there a way to adjust that order to assure that SUNWcsr comes before the other one(s) that are manipulating the file? What is the correct order for package installation?
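
To make the suspected failure mode concrete, here is a stripped-down stand-in for the copy-or-edit decision. It is NOT the real i.pamconf, just a sketch of the behavior described above. It relies on class action scripts being fed "source destination" pairs on stdin; install_pamconf is a made-up name.

```shell
# Simplified stand-in for a class action script (not the real i.pamconf).
# Reads "source destination" pairs from stdin, one pair per line.
install_pamconf() {
    while read src dest; do
        if [ ! -f "$dest" ]; then
            # First install: copy the packaged file into place.
            cp "$src" "$dest"
            echo "copied $dest"
        else
            # File already exists: the real script would merge entries here,
            # so a file created earlier by another package is never replaced.
            echo "exists, editing in place: $dest"
        fi
    done
}

# Example:
#   echo "/tmp/pkg/pam.conf /tmp/zone/etc/pam.conf" | install_pamconf
```

If anything creates the destination first, even as a partial file, the copy branch is skipped for good.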

Renaud Manus wrote:

SUNWcsr pkgmap defines /etc/pam.conf as an 'e' (editable) type file with a class action script 'pamconf'. In this situation, when you install a new zone, when it comes to installing the SUNWcsr package, the class action script will just copy the file from /var/sadm/pkg/SUNWcsr/save/... to [ZONEROOTPATH]/etc.

After that, it's possible that some packages need to modify the pam.conf (eg. SUNWtnetr), to add new entries for example, then they do so in their
postinstall script.

You could find all the files on both systems that manipulate pam.conf and compare them.


# find /var/sadm/pkg/SUNWcsr -type f -exec /usr/xpg4/bin/grep -q pam.conf {} \; -print

-- Renaud
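
A possible way to widen that search to every package's install scripts, so the two machines can be compared, is sketched below. pkg_scripts_matching is a hypothetical helper name, and the sketch assumes the scripts live under <pkg>/install/.

```shell
# Print every file under <pkgroot>/*/install/ whose contents match <pattern>.
pkg_scripts_matching() {
    pkgroot=$1
    pattern=$2
    for f in "$pkgroot"/*/install/*; do
        [ -f "$f" ] || continue
        if grep "$pattern" "$f" >/dev/null 2>&1; then
            echo "$f"
        fi
    done
}

# Example: save the list on each machine, then diff the two files.
#   pkg_scripts_matching /var/sadm/pkg 'pam\.conf' > /tmp/pamconf-scripts.txt
```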

Brian Kolaci wrote:


I'm still having zone creation issues where my /etc/pam.conf is corrupt.

I have 2 machines, one works fine, the other always creates the
zone with a bad /etc/pam.conf.

I used the DTrace toolkit "opensnoop" program to watch on both machines. On the "good" machine, where it creates /etc/pam.conf correctly, I see that a process properly copies the file from the pspool directory:

0 29509 cp 4 /var/sadm/pkg/SUNWcsr/save/pspool/SUNWcsr/reloc/etc/pam.conf

This happens during the "Initializing package <x> of <y>: percent complete: ??%" phase. I never see this on the machine having issues. In fact what I do see is:

0 16561 cat -1 /pool1/zones/bktest2/root/etc/pam.conf
0 16564 grep 7 /pool1/zones/bktest2/root/etc/pam.conf
0 16565 sh 7 /pool1/zones/bktest2/root/etc/pam.conf
0 17485 cat 6 /pool1/zones/bktest2/root/etc/pam.conf
0 17487 cat 6 /tmp/pam.conf.17484
0 17489 grep 6 /pool1/zones/bktest2/root/etc/pam.conf
0 17490 sh 6 /pool1/zones/bktest2/root/etc/pam.conf
0 17491 grep 6 /pool1/zones/bktest2/root/etc/pam.conf
0 17492 sh 6 /pool1/zones/bktest2/root/etc/pam.conf

so it appears to be trying to manipulate the file rather than just copy it.
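
For comparing captures from the two machines, a filter like the following could answer the one question that matters: did a cp from the pspool copy of pam.conf ever happen? It is a sketch that assumes opensnoop's default columns (UID PID COMM FD PATH) with one record per line; saw_pspool_copy and the AWK variable are illustrative names.

```shell
# Read opensnoop output on stdin; succeed if any "cp" opened the
# packaged pam.conf under a save/pspool directory.
# Assumed columns: UID PID COMM FD PATH
saw_pspool_copy() {
    ${AWK:-awk} '$3 == "cp" && $5 ~ /\/save\/pspool\/.*\/etc\/pam\.conf$/ { found = 1 }
                 END { exit !found }'
}

# Example, with a capture saved to a file:
#   if saw_pspool_copy < /tmp/opensnoop.out; then
#       echo "pam.conf was copied from pspool"
#   else
#       echo "no pspool copy seen"
#   fi
```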

What determines whether a file is copied from the save/pspool/... directory
rather than just a postinstall script trying to manipulate it?

I've even upgraded the system to the latest U3 beta and the problem persists.

Is the process flow for creating zones documented somewhere?
