On Thu, 17 Dec 2009 23:16:19 +0100, Dan Price <d...@eng.sun.com> wrote:
>> On Thu, Dec 17, 2009 at 07:17:50PM +0100, Frank Batschulat (Home) wrote: >> > May I have 2 code reviewers for: >> > >> > 6911329 Incorrect code in kstat_delete causes panic >> > http://cr.opensolaris.org/~batschul/onnvkstat/ >> > >> > Description >> > A colleague was looking into a crash and the reason turned out to be a >> > NULL pointer dereference in kstat_delete(): >> > >> > kstat_delete(kstat_t *ksp) >> > { kmutex_t *lp; >> > ekstat_t *e = (ekstat_t *)ksp; >> > zoneid_t zoneid = e->e_zone.zoneid; >> > kstat_zone_t *kz; >> > >> > if (ksp == NULL) >> > return; >> > >> > Note that there is a dereference of 'ksp' [via 'e'] before the check for >> > ksp being NULL. >> > >> > unfortunately we don't have a dump/stacktrace anymore to inspect who >> > called kstat_delete(NULL) and why. > > Do we really think that ksp being NULL is a invalid condition? Yes, I think we do. kstat_create() offically and documented returns NULL in the error case. ie. the usual sequence for a user would be ksp = kstat_create() if (ksp != NULL) kstat_install() kstat_delete(9F) PARAMETERS ksp Pointer to a currently installed kstat(9S) structure. > If it's invalid, then why not add an assertion, so we can root-cause. absolutely ! webrev updated: http://cr.opensolaris.org/~batschul/onnvkstat/ > Or has this if (ksp == NULL) been there forever and ever and there > are drivers abusing it? nope, it got introduced when Jeff re-wrote the kstat framework in Solaris 9 via: 4460914 kstat implementation has escaped and doesn't scale > I see a bunch of cmn_err's in kstat_create-- are there log files > from the machine which might indicate that there was a kstat_create > which returned NULL? unfortunately I do have nothing at all. However what we do know is that the inital kstat_create() can't have returned NULL, because in that case the following kstat_install() would've already paniced in face of the NULL ksp due to the various dereferences it does on ksp. this implies 2 things: 1) a bug in the kstat framework itself 2) a kstat user calling kstat_delete() twice or with a NULL ksp. I'll run tests with the usual big kstat consumers like Zones, UFS, ZFS and NFS with the debug bits for testing, perhaps I can uncover an offender. thanks frankB _______________________________________________ zones-discuss mailing list zones-discuss@opensolaris.org