On Thu, 2007-05-24 at 11:43 +0200, Jan Kiszka wrote:
> > You don't want to run rpi_pop() after the bulk of xnpod_delete_thread()
>
> Not after, I meant before. But then...
>
> > has run for the same thread, and you don't want a preemption to occur
> > anytime between the thread deletion and the update of the RPI state that
> > still refers to it. You might try moving the call to rpi_pop() to the
> > prologue of both routines instead of waiting for the deletion hook to do
> > it, but this would introduce a priority issue in the do_taskexit_event
> > callout, since we might have lowered the root priority in rpi_pop() and
> > be switched out in the middle of the deletion process. Unless we hold
> > the nklock the whole time, that is. Back to square #1, I'm afraid.
>
> ...there might be other issues, OK.
>
> So we might need some reference counter for the xnthread object, so that
> the last user actually frees it, and some flags to check what jobs remain
> to be done for cleanup.
>
> >>>>> latency hit is certainly not seen there; the real issues are left in
> >>>>> the native and POSIX skin thread deletion routines (heap management
> >>>>> and other housekeeping calls under nklock and so on).
> >>>>
> >>>> That's surely true. Besides RPI manipulation, there is really no
> >>>> problematic code left here latency-wise. Other hooks are trickier. But
> >>>> my point is that we first need some infrastructure to provide an
> >>>> alternative cleanup context before we can start moving things. My patch
> >>>> today is a hack...err...workaround from this perspective.
> >>>>
> >>> I don't see it this way. This code is very specific, in the sense that
> >>> it lies on the inter-domain boundary between Linux and Xenomai's primary
> >>> mode for a critical operation like thread deletion; this is what makes
> >>> things a bit confusing. What's below and above this layer has to be as
> >>> seamless as possible; this code in particular can't be.
> >>
> >> Well, we could move the memory release of the shadow thread to Linux
> >> context, e.g. (release on idle, a bit like RCU - uuuh).
> >
> > This is basically what Gilles did in a recent patch to fix some treading
> > on freed memory, by releasing the shadow TCB through
> > xnheap_schedule_free(). I think we should have a look at the removal
> > from registry operations now.
>
> Look, we now have xnheap_schedule_free,

Actually, we have had this for the last two years now; it just happened to
be underused.
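Just so that we are talking about the same pattern, here is roughly what
such deferred release boils down to - a sketch only, with made-up names
(deferred_block, schedule_block_free, flush_pending_free) and the nklock
protection elided, not the actual nucleus code: the deletion path merely
chains the block, and the real release happens later, from a context where
calling into Linux is harmless.

	/* Deferred-release queue (illustrative names, locking omitted). */

	struct deferred_block {
		struct deferred_block *next;
	};

	static struct deferred_block *pending_free;

	/* May run in primary mode, even under nklock: O(1), no heap work. */
	static void schedule_block_free(void *block)
	{
		struct deferred_block *db = block;

		db->next = pending_free;
		pending_free = db;
	}

	/* Runs later from Linux context (idle hook, APC, ...), where the
	 * actual heap housekeeping does not hurt worst-case latency. */
	static void flush_pending_free(void (*do_free)(void *block))
	{
		struct deferred_block *db, *next;

		for (db = pending_free, pending_free = NULL; db; db = next) {
			next = db->next;
			do_free(db);
		}
	}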
> next we need a kind of xnregistry_schedule_free - that's my point, provide
> a generic platform for all these things. More may follow (for new kinds of
> skins, e.g.).

Yes, what we need is a common policy for object creation and deletion that
properly uses the APC mechanism. The logic behind this being that any object
creation which involves secondary-domain work to set the object up - such as
threads, heaps, queues, pipes - will invariably require the deletion process
to follow the converse path, i.e. rely on Linux kernel services. So far,
this has been achieved by setting the lostage bit in the syscall exec mode,
but this has two drawbacks:

1) kernel-based threads take no advantage of this, since they don't go
   through the syscall mechanism;

2) when such a routine wants to switch to primary mode in order to finish
   the housekeeping work, it has to perform this transition "manually", by
   a call to the hardening service, which opens Pandora's box if badly used.

From the interface design POV, I'd really prefer that xnshadow_relax/harden
remain internal calls only used by the nucleus. I guess we both agree on
that. But, to sum up, let's drop the micro-optimization approach to solving
this issue; it just would not scale.

PS: The registry is special; it has to defer some work to Linux by design,
so basically all registrable objects are concerned here.
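To make that a bit more concrete, the kind of generic platform I have in
mind could look roughly like this - again only a sketch with made-up names
(cleanup_req, xnobject_schedule_cleanup, cleanup_handler do not exist),
with locking and the actual wiring to the HAL's APC support left out:

	/* Generic deferred-cleanup sketch.  Objects needing Linux-based
	 * teardown are queued along with a destructor; the queue is drained
	 * by a handler running over the Linux domain, kicked via an APC. */

	struct cleanup_req {
		struct cleanup_req *next;
		void (*dtor)(void *obj);	/* runs in Linux context */
		void *obj;
	};

	static struct cleanup_req *cleanup_q;	/* nklock-protected for real */

	/* Callable from any domain, including primary mode: just queue the
	 * request and kick the APC; no Linux service is touched here. */
	static void xnobject_schedule_cleanup(struct cleanup_req *req,
					      void (*dtor)(void *), void *obj)
	{
		req->dtor = dtor;
		req->obj = obj;
		req->next = cleanup_q;
		cleanup_q = req;
		/* then: schedule the cleanup APC here. */
	}

	/* APC handler, always runs over Linux: registry removal, heap
	 * release, vfree, whatever requires secondary-mode services. */
	static void cleanup_handler(void *cookie)
	{
		struct cleanup_req *req, *next;

		for (req = cleanup_q, cleanup_q = NULL; req; req = next) {
			next = req->next;
			req->dtor(req->obj);
		}
	}

The point being that xnshadow_relax/harden would never show up in the
picture anymore: skins just register a destructor, and the nucleus decides
when and where it runs.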
> Jan

--
Philippe.