Ben,

Thanks for this change. As of today, I've applied a very small portion 
of it, by introducing debug_msg() calls into the path(s) where you've 
added XLOG warnings.

Ben Greear wrote:
> On some error conditions related to interface removal, the PIM 
> callbacks would
> not handle the next task, and so nothing would ever look at the task 
> queue
> again, effectively hanging the multicast routing daemon.
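
To check that I'm reading the failure mode correctly, here is a minimal 
sketch of it. All names here (TaskQueue, completion_cb, stalled_tasks) 
are hypothetical and are not taken from the PIM code; the sketch only 
assumes that each task's completion callback is the sole place that 
starts the next task.

```cpp
#include <cstddef>
#include <deque>
#include <functional>

// Hypothetical stand-in for a queue of pending tasks; not XORP's classes.
class TaskQueue {
public:
    using Task = std::function<bool()>;  // returns false on error

    explicit TaskQueue(bool advance_on_error)
        : _advance_on_error(advance_on_error) {}

    void add(Task t) { _tasks.push_back(std::move(t)); }

    // Run the front task, then hand its result to the completion callback.
    void run_next() {
        if (_tasks.empty())
            return;
        Task t = _tasks.front();  // copy, so the queued entry stays intact
        bool ok = t();
        completion_cb(ok);
    }

    std::size_t pending() const { return _tasks.size(); }

private:
    // The callback is the only place that advances the queue.
    void completion_cb(bool ok) {
        if (!ok && !_advance_on_error) {
            // Stalling path: return without popping or starting the next
            // task, so nothing ever looks at the queue again.
            return;
        }
        _tasks.pop_front();
        run_next();
    }

    bool _advance_on_error;
    std::deque<Task> _tasks;
};

// Queue three tasks, the second of which fails, and report how many
// tasks are left in the queue after it has been kicked once.
std::size_t stalled_tasks(bool advance_on_error) {
    TaskQueue q(advance_on_error);
    q.add([] { return true; });
    q.add([] { return false; });  // e.g. the interface has gone away
    q.add([] { return true; });
    q.run_next();
    return q.pending();
}
```

With advance_on_error set to false, the failing task and the one behind 
it stay queued; with it set to true, the queue drains. Whether draining 
past an error is actually the right policy is the open question below.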

I think we need to look very carefully at changes which affect the flow 
of RPC calls in and out of PIM, as we are gearing up for significant 
refactoring in that area.

Were you able to pin the task list hang down to a specific PIM RPC call 
or set of events? It could be argued that failure of the Finder still 
shouldn't be regarded as a purely transient failure.

This is especially the case if we're in a situation where we're using 
in-flight shared memory and user-space synchronization mechanisms (e.g. 
futex, umtx) to control access to that shared memory, so I'd err on the 
side of caution and not commit this change in full for now.

thanks,
BMS

_______________________________________________
Xorp-hackers mailing list
[email protected]
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-hackers
