On Wed, Mar 30, 2011 at 1:27 PM, Henri Roosen <[email protected]> wrote:
> On Wed, Mar 30, 2011 at 10:15 AM, Philippe Gerum <[email protected]> wrote:
>> On Wed, 2011-03-30 at 09:30 +0200, Henri Roosen wrote:
>>> On Wed, Mar 30, 2011 at 6:58 AM, Philippe Gerum <[email protected]> wrote:
>>> > On Tue, 2011-03-29 at 21:29 +0200, Gilles Chanteperdrix wrote:
>>> >> Philippe Gerum wrote:
>>> >> > On Tue, 2011-03-29 at 21:19 +0200, Gilles Chanteperdrix wrote:
>>> >> >> Philippe Gerum wrote:
>>> >> >>> On Tue, 2011-03-29 at 21:11 +0200, Gilles Chanteperdrix wrote:
>>> >> >>>> Philippe Gerum wrote:
>>> >> >>>>> On Tue, 2011-03-29 at 16:41 +0200, Henri Roosen wrote:
>>> >> >>>>>> Hi,
>>> >> >>>>>>
>>> >> >>>>>> I have several Xenomai RT threads (prio > 0) that get ready to run all
>>> >> >>>>>> at the same time. Priority coupling is enabled in the kernel.
>>> >> >>>>>>
>>> >> >>>>>> If one of them (unfortunately) makes a Linux system call, I see that
>>> >> >>>>>> first other lower and same priority Xenomai tasks are scheduled before
>>> >> >>>>>> the switched task is run in the Linux domain. As I understand,
>>> >> >>>>>> priority coupling should prevent this.
>>> >> >>>>>>
>>> >> >>>>>> To rule out a problem in the application, this is also tested with a
>>> >> >>>>>> simple application based on the rt_print example. In my opinion, with
>>> >> >>>>>> priority coupling enabled this should print:
>>> >> >>>>>> Wakeup! - I am - awake! - Me too!
>>> >> >>>>>> But I get:
>>> >> >>>>>> Wakeup! - I am - Me too! - awake!
>>> >> >>>>>> So task 2 gets run before task 3 completes in the Linux domain.
>>> >> >>>>>>
>>> >> >>>>>> Please find attached the test application and the .config file.
>>> >> >>>>> The fine print with priority coupling is that it stops immediately
>>> >> >>>>> whenever the thread blocks linux-wise; this is actually why, after all
>>> >> >>>>> this time debugging it, I'm pondering now whether I should keep this
>>> >> >>>>> behavior/feature in 3.x.
>>> >> >>>>>
>>> >> >>>>> Initially, this was aimed at enforcing the right scheduling sequence
>>> >> >>>>> with traditional RTOS APIs, specifically when it comes to create
>>> >> >>>>> threads, so that high priority children do run prior to low priority
>>> >> >>>>> parents (some legacy apps may expect this). But the fact is that this
>>> >> >>>>> behavior also carries a number of uncertainties, and having the thread
>>> >> >>>>> de-boosted when blocked by Linux is a serious one.
>>> >> >>>> Maybe each thread could have a bit telling whether or not it should run
>>> >> >>>> under priority coupling, this bit would be disabled at all times, except
>>> >> >>>> during the thread creation routines, and at other times if the user
>>> >> >>>> called xnpod_set_mode to enable it if he wants?
>>> >> >>>>
>>> >> >>> This bit exists, it is XNRPIOFF. What I'm pondering is whether this all
>>> >> >>> makes sense to provide priority coupling without any mean to actually
>>> >> >>> control the impact the regular kernel may have on it.
>>> >> >>>
>>> >> >> without the irq shield you mean :-)
>>> >> >>
>>> >> >
>>> >> > No, it is not related. The issue now is with the inability to determine
>>> >> > whether and when the kernel may cause the priority boost to drop without
>>> >> > the user knowing about it.
>>> >> >
>>> >> Maybe we could add a new SIGDEBUG reason ?
>>> >>
>>> >
>>> > SIGDEBUG is for detecting a misuse of some feature, the issue may be
>>> > that the feature could be a misuse of the scheduling system in itself.
>>> > This is what should be pondered before any other move.
>>> >
>>> > --
>>> > Philippe.
>>> >
>>> >
>>> >
>>>
>>> Using a data array to track the switches and replace gettimeofday()
>>> with sched_yield() shows the same sequence of events. Actually the
>>> problem was shown in our main application that already uses a data
>>> array for trace data, The rt_print based app was just for simple
>>> reproducing the problem.
>>>
>>> Our realtime thread should actually not do Linux system calls, neither
>>> should it cause exceptions, but unfortunately we don't have total
>>> control over that. So when it does make a system call we rely on
>>> priority coupling that the task completes before the lower priority
>>> realtime threads are scheduled. Our tracing tool shows this is not the
>>> case.
>>>
>>> What can I do to help fixing the priority coupling?
>>
>> As discussed earlier, it still remains to show whether linux blocks the
>> task for whatever reason when issuing the syscall. In such a case, there
>> is not much you could do, since you would simply face a limitation of
>> the prio coupling design, there is no fix for this one.
>>
>> I would suggest to instrument rpi_switch(), to check whether the task is
>> de-boosted for that reason, to make sure we are not chasing wild gooses.
>
> There are 2 calls to rpi_switch each loop:
> First is the switch to task 3: this is when Linux actually schedules
> the task the first time for doing the system call, right?
> Second is the switch to the gatekeeper: this is when the task calls
> the rt_event_wait for waiting in the Xenomai domain, right?
>
> So from the rpi_switch tracing I cannot see Linux blocking the task.
> Also non of the rpi_switch calls enter the first 'if'.
>
> What would be the thing to check next?
>
>>
>>>
>>> Thanks,
>>> Henri.
>>
>> --
>> Philippe.
>>
>>
>>
>
I did some more tracing to see why the lower priority thread is
scheduled before the higher priority thread has finished.

The highest priority task makes a system call and gets relaxed by
xnshadow_relax. The rpi is pushed here and a Linux call with
LO_WAKEUP_REQ is scheduled. Then I see the scheduler switching to the
ROOT task. So far so good!

In the Linux domain, we run into the lostage_handler, where the
scheduled LO_WAKEUP_REQ is executed. Here there is a call to
xnpod_schedule() which actually causes a switch back to the primary
domain and the lower priority Xenomai task to be scheduled in, even
before the relaxed task is woken up.

Now, I am unsure what is faulty here; maybe Philippe or someone else
can answer that. Personally I would have expected xnpod_schedule() (or
xnsched_pick_next) to know about the rpi list and not schedule a task
with a lower priority than any task on that list. I was unable to find
such code.

A quick and dirty test of commenting out the xnpod_schedule() call at
LO_WAKEUP_REQ makes my test application show the correct sequence of
events, but that cannot be the fix...

Any suggestions?

Thanks,
Henri

_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help
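
The expectation described in the message above (a pick-next routine
that consults the rpi list before choosing a real-time task) can be
illustrated with a toy model in plain C. This is only a sketch under
stated assumptions, not Xenomai code: toy_task, toy_rpi_top_prio and
toy_pick_next are hypothetical names invented for this illustration,
and the real nucleus data structures are not reproduced here. The
model assumes that the root (Linux) thread inherits the highest
priority found on the rpi list, and that a ready real-time task may
preempt it only when its priority is strictly higher.

```c
#include <stddef.h>

/* Toy model only: none of these types or functions exist in Xenomai. */
struct toy_task {
    const char *name;
    int prio;   /* higher value means higher priority */
    int ready;  /* nonzero if runnable in the real-time domain */
};

/*
 * Highest priority among the relaxed tasks on the (hypothetical) rpi
 * list, or -1 when the list is empty.
 */
static int toy_rpi_top_prio(const int *rpi_prios, size_t nrpi)
{
    int top = -1;

    for (size_t i = 0; i < nrpi; i++)
        if (rpi_prios[i] > top)
            top = rpi_prios[i];
    return top;
}

/*
 * Return the ready real-time task that should run next, or NULL when
 * the root (Linux) thread should keep the CPU.
 */
static const struct toy_task *toy_pick_next(const struct toy_task *tasks,
                                            size_t ntasks,
                                            const int *rpi_prios,
                                            size_t nrpi)
{
    const struct toy_task *best = NULL;
    int root_prio = toy_rpi_top_prio(rpi_prios, nrpi);

    /* Find the highest priority ready real-time task. */
    for (size_t i = 0; i < ntasks; i++) {
        if (!tasks[i].ready)
            continue;
        if (best == NULL || tasks[i].prio > best->prio)
            best = &tasks[i];
    }

    /*
     * Assumed rule: lower and equal priority tasks must wait while a
     * relaxed task with that priority is still on the rpi list.
     */
    if (best != NULL && best->prio > root_prio)
        return best;
    return NULL;
}
```

In this model, with a relaxed task of priority 3 on the rpi list and
ready tasks of priority 2 and 1, toy_pick_next() returns NULL, so the
root thread keeps running until the relaxed task has been woken up and
has completed its work in the Linux domain. Whether the actual
scheduler is meant to behave this way is exactly the open question in
this thread.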
