Thanks for reviewing this. My responses to your comments are
Edward Pilatowicz wrote:
On Tue, Dec 15, 2009 at 08:39:12AM -0700, Jerry Jelinek wrote:
I have an initial code review for the fix for bug:
6768950 panic[cpu1]/thread=ffffff084ce0b3e0: syscall_asm_amd64.s:480
lwp ffffff0756a8cdc0, pcb_rupdate != 0
There is a webrev at:
The code changes in the sn1 and solaris10 brands are basically
identical. I know there is a lot of common code there but I
didn't want to clutter up this bug fix with the unrelated changes
necessary to make the code common. I'll be addressing that with
a separate fix.
My initial testing of these changes looks good but I still need
to run more extensive tests.
this looks great. i have some initial comments.
- could you update the following lines with comments:
xchgq CPTRSIZE(%rbp), %rax /* swap JMP table offset and ret addr */
shrq $4, %rax /* JMP table offset / JMP size = syscall num */
movq %rax, EH_LOCALS_GREG(REG_RAX)(%rbp) /* save syscall num */
- don't you need to update this file as well? have you tested 32-bit
No, this doesn't need to be updated since this code doesn't touch the
user's stack. I have done preliminary testing with 32 bit kernels and
the callbacks work correctly with the code as is. Thats because the
32 bit code is more like the 64 bit code that handles an interrupt stack
where we already have the right data pushed.
- perhaps you could do the following renames:
BRAND_GET_RET_REG -> BRAND_URET_FROM_REG
BRAND_GET_RET_STACK -> BRAND_URET_FROM_INTR_STACK
- wrt this code:
cmpq $NSYSCALL, %rax /* is 0 <= syscall <= MAX? */
jbe 0f /* yes, syscall is OK */
xorq %rax, %rax /* no, zero syscall number */
it's duplicated in every brand callback right after
CALLBACK_PROLOGUE(). why not make it part of CALLBACK_PROLOGUE?
also, if the syscall num is > NSYSCALL, why not just jump to 9: and
let the normal syscall path detect and return the error?
OK. I was modeling this on the way lx did it but your suggestion seems
- it seems like there should be a macro for this rough block of code
(which calculates the jmp table address):
GET_P_BRAND_DATA(%esp, 1, %edx);/* get p_brand_data ptr */
movl SPD_HANDLER(%edx), %edx /* get p_brand_data->spd_handler ptr */
shll $4, %eax
addl %eax, %edx /* we'll return to our handler */
I'll put one together.
- prior to these changes V_URET_ADDR wasn't always set, so the different
brand syscall callbacks would get the userland return address from
their syscall specific locations (registers, interrupt stack, etc). but
now since V_URET_ADDR is always set, perhaps the callback handlers could
be made more consistent by always getting the value from the stack (ie,
- so following up with the last comment (and getting more into potential
comminization work) it seems to me like it might be benificial to move
all the syscall mechanism specific handling code out of the actual brand
callbacks and into BRAND_CALLBACK. you've already started doing this by
having BRAND_CALLBACK be aware of how to get the userland return
address. (prior to that it didn't have any dependancy upon the
different syscall mechanisms, except when deciding which brand callback
to invoke.) continuing down that path we could move all the syscall
specific handling code into BRAND_CALLBACK. then each brand would only
deliver a single callback which would take one parameter, the syscall
number. it would return one value, a userland return address. then
BRAND_CALLBACK could handle all the different syscall specific return
paths. this would also be benificial in the future since if a new
syscall mechanism was introduced, we wouldn't have to update any actual
brand callbacks, just BRAND_CALLBACK. thoughts?
For these last two I agree that there are some good opportunities here and
I was torn between doing a bunch more clean up on this and deferring that
work to the fix for:
6900207 code can be shared between solaris10 and ipkg brands
Since bug 6768950 is serious and I'd like to get the fix done sooner
rather than later, I'd like to defer some of these other changes to 6900207.
I was about to start on that anyway so once 6768950 is done I'm going to
immediately start work on a bunch of ideas I have for making the code shared
and simpler. I was also going to roll a fix for:
6887823 brandz on x86 should ignore %gs on 64-bit kernels
into that same set of cleanup. I definitely agree with your comments here
but I'm worried about the fix for 6768950 taking too long.
zones-discuss mailing list