On 2013-02-13 11:01, Jan Kiszka wrote:
> On 2013-02-13 10:49, Henri Roosen wrote:
>> On Wed, Feb 13, 2013 at 10:26 AM, Michael Haberler <[email protected]> wrote:
>>
>>>
>>> We have a report from 'the field' which we cannot make sense of.
>>>
>>> The situation:
>>> - an AMD board: http://www.asus.com/Motherboard/F1A75M_PRO
>>> - dmesg post boot: http://pastebin.com/38XrxNBy
>>> - xeno-regression-test runs well, max 32us jitter
>>> - John's Xenomai kernel packages:  3.5.7/2.6.2.1 [1]
>>> - a native-skin userland RT threads application (linuxcnc[3])
>>>  - 2 threads
>>>  - jitter measured with its own GUI application 'latency-test'
>>>  - successfully tested on several other platforms
>>>
>>>
>>> what we observed:
>>>
>>> 1. Problem behaviour
>>> ---------------------
>>> - boot
>>> - run LinuxCNC latency-test
>>> - observe massive spikes in latency
>>>  - >100us on a 25us thread!
>>>  - http://static.mah.priv.at/public/latency/skunkworks-unprimed.png
>>>
>>> Now any of 2), 3) or 4) improves latency:
>>>
>>> 2. run switchtest:  temporary change
>>> ------------------------------------
>>> - while still running LinuxCNC latency-test from 1) above,
>>> - running "/usr/lib/xenomai/testsuite/switchtest -s 1000" in a separate
>>> window
>>> - hit 'Reset Statistics' on the latency-test window
>>> - max latency drops massively
>>> - see http://static.mah.priv.at/public/latency/skunkworks-primed.png [3]
>>> - ^C-ing out of the switchtest makes latency rise again
>>>
>>>
>>> 3. running a trivial shell script:  temporary change during script execution
>>> ----------------------------------------------------------------------------
>>> - reboot
>>> - run latency-test, again observe latency spikes
>>> - in a separate window, run:
>>>  - while true; do echo "nothing" > /dev/null; done
>>> - again, latency-test shows rather low latency figures after hitting
>>> 'reset statistics' *as long as the above script is running*
>>> - quote from Sam: "BTW - I ran the latency-test all night with the
>>> donothing script and it peaked at about 19.6us latency."
>>> - killing the script makes the latency spikes reappear.
>>>
>>>
>>> 4. running xeno-regression-test and breaking out: permanent drop in latency
>>> --------------------------------------------------------------------------
>>> - reboot
>>> - run latency-test, again observe latency spikes
>>> - in a separate window, run:
>>>   sudo xeno-regression-test -l "/usr/lib/xenomai/testsuite/dohell -m /tmp 100 " -t 2
>>> - latency drops
>>> - the key observation: if you break by ^C out of xeno-regression-test,
>>> *latency figures remain low*
>>> - note that breaking out of xeno-regression-test left some processes
>>> running, obviously dd and ls:  http://pastebin.ca/2313116
>>> - once these processes complete (http://pastebin.ca/2313117) latency
>>> goes up again.
>>>
>>> second data point:
>>> we have a report from another user, same kernel, Intel Q8200 Quad core
>>> board, which confirms 'dohell 900' in a separate window does drop latency
>>> significantly. This suggests it might not be board specific.
>>>
>>>
>>> This leaves us puzzled as to the causality here. We would really like to
>>> get rid of the latency spikes, but the shell script approach isn't appealing.
>>>
>>> Any suggestions?
>>>
>>
>> I've seen similar behaviour. In my case it had to do with the latency of
>> transitions between the CPU's idle states. The problem was worked around by
>> passing "nohlt idle=poll" on the kernel command line. I'm sure it is
>> documented somewhere on the xenomai website too.
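
One way to judge whether idle-state transitions are a plausible suspect is to list the exit latency each state advertises. The C sketch below assumes the kernel exposes the cpuidle sysfs directory (typically /sys/devices/system/cpu/cpu0/cpuidle, with the latency file in microseconds). The struct and function names are made up for illustration. When booted with idle=poll, or without cpuidle support, the directory may be missing, and the function then reports zero states.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_STATES 16

/* Illustrative record for one idle state (names are hypothetical). */
struct idle_state {
    char name[32];         /* contents of stateN/name, e.g. "C1" */
    long exit_latency_us;  /* contents of stateN/latency */
};

/* Read the first line of 'path' into 'buf'; 0 on success, -1 on failure. */
static int read_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';  /* strip trailing newline */
    return 0;
}

/* Fill 'out' with the states found under 'base'; return how many were read. */
static int read_idle_states(const char *base, struct idle_state *out, int max)
{
    int i;

    assert(base != NULL && out != NULL);
    for (i = 0; i < max; i++) {
        char path[256], buf[64];

        snprintf(path, sizeof path, "%s/state%d/name", base, i);
        if (read_line(path, out[i].name, sizeof out[i].name) != 0)
            break;  /* no more states, or cpuidle not available */
        snprintf(path, sizeof path, "%s/state%d/latency", base, i);
        if (read_line(path, buf, sizeof buf) != 0)
            break;
        if (sscanf(buf, "%ld", &out[i].exit_latency_us) != 1)
            break;
    }
    return i;
}
```

Calling read_idle_states("/sys/devices/system/cpu/cpu0/cpuidle", states, MAX_STATES) and printing each name with its exit latency gives a rough picture. States whose exit latency approaches the thread period (25us in the report above) would be the first candidates to disable.
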
> 
> That will burn quite a bit of power, though. Maybe there is some BIOS
> switch to relax power saving mode a bit without giving up on halt. Also,
> do you see the same effect in text mode?
> 
> In any case, confirming the latency source via the ipipe tracer is a
> good first step: http://xenomai.org/index.php/I-pipe:Tracer. You could
> programmatically break the trace once you detect the spike in your
> program. Post the resulting trace here for public discussion.
> 
> Jan
> 
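
As a rough illustration of breaking the trace from the program itself, the C sketch below compares each wakeup against its expected time and calls a hook the first time the deviation exceeds a threshold. All names here are hypothetical and nothing in it is Xenomai API. The hook is the place where a tracer freeze call would go. The Xenomai 2.x latency test appears to use xntrace_user_freeze() for that purpose, but check the headers of your installed version before relying on it.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical hook type: called once, on the first spike. */
typedef void (*freeze_hook_t)(int64_t latency_ns);

struct spike_detector {
    int64_t threshold_ns;  /* deviation above this counts as a spike */
    int64_t max_ns;        /* worst deviation seen so far */
    int fired;             /* non-zero once the hook has been called */
    freeze_hook_t freeze;  /* may be NULL */
};

/* Placeholder hook: only reports. A real one would freeze the trace here. */
static void report_spike(int64_t latency_ns)
{
    fprintf(stderr, "spike: %" PRId64 " ns\n", latency_ns);
}

static void spike_detector_init(struct spike_detector *d,
                                int64_t threshold_ns, freeze_hook_t freeze)
{
    assert(d != NULL);
    d->threshold_ns = threshold_ns;
    d->max_ns = 0;
    d->fired = 0;
    d->freeze = freeze;
}

/* Feed one wakeup; returns 1 if this sample triggered the hook, else 0. */
static int spike_detector_sample(struct spike_detector *d,
                                 int64_t expected_ns, int64_t actual_ns)
{
    int64_t latency = actual_ns - expected_ns;

    if (latency < 0)
        latency = -latency;
    if (latency > d->max_ns)
        d->max_ns = latency;
    if (latency > d->threshold_ns && !d->fired) {
        d->fired = 1;  /* fire only once so the first spike is preserved */
        if (d->freeze)
            d->freeze(latency);
        return 1;
    }
    return 0;
}
```

In the periodic thread, spike_detector_sample() would be called right after each wakeup with the scheduled and the measured timestamp. Firing only once keeps the trace of the first spike from being overwritten by later ones.
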

BTW, though unrelated to the latency issues: Running a 32-bit kernel on
a box with >1GB RAM is not a good choice, performance-wise. PAE is slow...

i386 is for low-end embedded only today. I'll continue testing it, but
it will surely receive less attention than x86-64, just like in mainline
Linux.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
