On 02/13/2013 11:06 AM, Jan Kiszka wrote:
> On 2013-02-13 11:01, Jan Kiszka wrote:
>> On 2013-02-13 10:49, Henri Roosen wrote:
>>> On Wed, Feb 13, 2013 at 10:26 AM, Michael Haberler
>>> <[email protected]> wrote:
>>>
>>>>
>>>> We have a report from 'the field' which we cannot make sense of.
>>>>
>>>> The situation:
>>>> - an AMD board: http://www.asus.com/Motherboard/F1A75M_PRO
>>>> - dmesg post boot: http://pastebin.com/38XrxNBy
>>>> - xeno-regression-test runs well, max 32us jitter
>>>> - John's Xenomai kernel packages: 3.5.7/2.6.2.1 [1]
>>>> - a native-skin userland RT threads application (linuxcnc[3])
>>>> - 2 threads
>>>> - jitter measured with its own GUI application 'latency-test'
>>>> - successfully tested on several other platforms
>>>>
>>>>
>>>> what we observed:
>>>>
>>>> 1. Problem behaviour
>>>> ---------------------
>>>> - boot
>>>> - run LinuxCNC latency-test
>>>> - observe massive spikes in latency
>>>> - >100uS on a 25uS thread!
>>>> - http://static.mah.priv.at/public/latency/skunkworks-unprimed.png
>>>>
>>>> now any of 2), 3) or 4) improve latency:
>>>>
>>>> 2. run switchtest: temporary change
>>>> ------------------------------------
>>>> - while still running LinuxCNC latency-test from 1) above,
>>>> - running "/usr/lib/xenomai/testsuite/switchtest -s 1000" in a separate
>>>> window
>>>> - hit 'Reset Statistics' on the latency-test window
>>>> - max latency drops massively
>>>> - see http://static.mah.priv.at/public/latency/skunkworks-primed.png [3]
>>>> - ^C-ing out of the switchtest makes latency rise again
>>>>
>>>>
>>>> 3. running a trivial shell script: temporary change during script execution
>>>> ------------------------------------------------------------------------------
>>>> - reboot
>>>> - run latency-test, again observe latency spikes
>>>> - in a separate window, run:
>>>> - while true; do echo "nothing" > /dev/null; done
>>>> - again, latency-test shows rather low latency figures after hitting
>>>> 'reset statistics' *as long as the above script is running*
>>>> - quote from Sam: "BTW - I ran the latency-test all night with the
>>>> donothing script and it peaked at about 19.6us latency."
>>>> - killing the script makes the latency spikes reappear.
>>>>
>>>>
>>>> 4. running xeno-regression-test and breaking out: permanent drop in latency
>>>> --------------------------------------------------------------------------
>>>> - reboot
>>>> - run latency-test, again observe latency spikes
>>>> - in a separate window, run:
>>>> sudo xeno-regression-test -l "/usr/lib/xenomai/testsuite/dohell -m /tmp 100 " -t 2
>>>> - latency drops
>>>> - the key observation: if you break by ^C out of xeno-regression-test,
>>>> *latency figures remain low*
>>>> - note that breaking out of xeno-regression-test left some processes
>>>> running, obviously dd and ls: http://pastebin.ca/2313116
>>>> - once these processes complete (http://pastebin.ca/2313117) latency
>>>> goes up again.
>>>>
>>>> second data point:
>>>> we have a report from another user, same kernel, Intel Q8200 Quad core
>>>> board, which confirms 'dohell 900' in a separate window does drop latency
>>>> significantly. This suggests it might not be board-specific.
>>>>
>>>>
>>>> This leaves us puzzled as to the causality here. We would really like to
>>>> get rid of the latency spikes, but the shell script approach isn't
>>>> appealing.
>>>>
>>>> Any suggestions?
>>>>
>>>
>>> I've seen similar behaviour. In my case it had to do with the latency of
>>> transitions of the cpu's idle states. The problem was worked around by
>>> providing "nohlt idle=poll". I'm sure it is documented somewhere on the
>>> xenomai website too.
>>
>> That will burn quite a bit of power, though. Maybe there is some BIOS
>> switch to relax power saving mode a bit without giving up on halt. Also,
>> do you see the same effect in text mode?
>>
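To check the idle-state theory before touching boot parameters or the BIOS, it may help to look at which idle states the kernel exposes and what exit latency each one advertises. The sketch below reads the standard cpuidle sysfs entries (`name` and `latency`, the latter in microseconds). The function name and its `sysfs_root` parameter are made up for illustration, and the directory may simply be absent if cpuidle is not enabled on the kernel in question.

```python
from pathlib import Path


def read_cpuidle_states(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return [(state_name, exit_latency_us), ...] for one CPU.

    Hypothetical helper: it only reads the cpuidle sysfs files and
    returns an empty list when the cpuidle directory does not exist.
    """
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    if not base.is_dir():
        return []

    # Keep only directories named state<N> and order them numerically,
    # so that state10 sorts after state2.
    state_dirs = [p for p in base.glob("state*") if p.name[5:].isdigit()]
    state_dirs.sort(key=lambda p: int(p.name[5:]))

    states = []
    for state_dir in state_dirs:
        try:
            name = (state_dir / "name").read_text().strip()
            latency_us = int((state_dir / "latency").read_text())
        except (OSError, ValueError):
            continue  # skip entries that are unreadable or malformed
        states.append((name, latency_us))
    return states
```

Printing the result of `read_cpuidle_states()` on the affected box shows whether any exposed state advertises an exit latency in the range of the observed spikes. That is only a hint, not proof: the advertised figures come from firmware tables and may not match what the hardware really does.
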
>> In any case, confirming the latency source via the ipipe tracer is a
>> good first step: http://xenomai.org/index.php/I-pipe:Tracer. You could
>> programmatically break the trace once you detect the spike in your
>> program. Post the resulting trace here for public discussion.
>>
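The "break the trace once you detect the spike" logic can be sketched roughly as follows. This is an illustration only, in Python rather than the application's own code. It assumes jitter is measured as the deviation of the interval between consecutive wakeups from the nominal period, and `on_spike` is a hypothetical hook standing in for whatever call actually freezes the I-pipe trace (that call is deliberately not shown). The class and method names are invented for the example.

```python
class SpikeDetector:
    """Track wakeup jitter and fire a one-shot hook on the first spike."""

    def __init__(self, period_ns, threshold_ns, on_spike):
        self.period_ns = period_ns        # nominal thread period
        self.threshold_ns = threshold_ns  # jitter above this counts as a spike
        self.on_spike = on_spike          # hypothetical hook, e.g. "freeze the trace"
        self.max_jitter_ns = 0
        self.fired = False
        self._last_wakeup_ns = None

    def record_wakeup(self, now_ns):
        """Feed one wakeup timestamp; return this cycle's jitter in ns."""
        if self._last_wakeup_ns is None:
            # First sample: nothing to compare against yet.
            self._last_wakeup_ns = now_ns
            return 0

        interval_ns = now_ns - self._last_wakeup_ns
        jitter_ns = abs(interval_ns - self.period_ns)
        self._last_wakeup_ns = now_ns
        self.max_jitter_ns = max(self.max_jitter_ns, jitter_ns)

        # Fire only once, so the frozen trace keeps the path that led to
        # the first spike instead of being overwritten by later ones.
        if jitter_ns > self.threshold_ns and not self.fired:
            self.fired = True
            self.on_spike(jitter_ns)
        return jitter_ns
```

For the 25us thread from the report, `SpikeDetector(25_000, 50_000, hook)` would call `hook` on the first cycle that is more than 50us off. The threshold is a guess and should sit just above the jitter seen in the "good" state.
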
>> Jan
>>
>
> BTW, though unrelated to the latency issues: Running a 32-bit kernel on
> a box with >1GB RAM is not a good choice, performance-wise. PAE is slow...
>
> i386 is for low-end embedded only today. I'll continue testing it, but
> it will surely receive less attention than x86-64, just like in mainline
> Linux.
I can continue running xeno-regression-test on Geode and dual PIII for
the I-pipe releases as I did up to now.
--
Gilles.