On 2013-02-13 11:01, Jan Kiszka wrote:
> On 2013-02-13 10:49, Henri Roosen wrote:
>> On Wed, Feb 13, 2013 at 10:26 AM, Michael Haberler <[email protected]> wrote:
>>
>>> We have a report from 'the field' which we cannot make sense of.
>>>
>>> The situation:
>>> - an AMD board: http://www.asus.com/Motherboard/F1A75M_PRO
>>> - dmesg post boot: http://pastebin.com/38XrxNBy
>>> - xeno-regression-test runs well, max 32us jitter
>>> - John's Xenomai kernel packages: 3.5.7/2.6.2.1 [1]
>>> - a native-skin userland RT threads application (linuxcnc[3])
>>>   - 2 threads
>>>   - jitter measured with its own GUI application 'latency-test'
>>>   - successfully tested on several other platforms
>>>
>>>
>>> what we observed:
>>>
>>> 1. Problem behaviour
>>> ---------------------
>>> - boot
>>> - run LinuxCNC latency-test
>>> - observe massive spikes in latency
>>>   - >100us on a 25us thread!
>>>   - http://static.mah.priv.at/public/latency/skunkworks-unprimed.png
>>>
>>> now any of 2), 3) or 4) improve latency:
>>>
>>> 2. run switchtest: temporary change
>>> ------------------------------------
>>> - while still running LinuxCNC latency-test from 1) above,
>>> - running "/usr/lib/xenomai/testsuite/switchtest -s 1000" in a
>>>   separate window
>>> - hit 'Reset Statistics' on the latency-test window
>>> - max latency drops massively
>>> - see http://static.mah.priv.at/public/latency/skunkworks-primed.png [3]
>>> - ^C-ing out of the switchtest makes latency rise again
>>>
>>>
>>> 3. running a trivial shell script: temporary change during script execution
>>> ----------------------------------------------------------------------------
>>> - reboot
>>> - run latency-test, again observe latency spikes
>>> - in a separate window, run:
>>>   - while true; do echo "nothing" > /dev/null; done
>>> - again, latency-test shows rather low latency figures after hitting
>>>   'reset statistics' *as long as the above script is running*
>>> - quote from Sam: "BTW - I ran the latency-test all night with the
>>>   donothing script and it peaked at about 19.6us latency."
>>> - killing the script makes the latency spikes reappear.
>>>
>>>
>>> 4. running xeno-regression-test and breaking out: permanent drop in latency
>>> --------------------------------------------------------------------------
>>> - reboot
>>> - run latency-test, again observe latency spikes
>>> - in a separate window, run:
>>>   sudo xeno-regression-test -l "/usr/lib/xenomai/testsuite/dohell -m /tmp 100 " -t 2
>>> - latency drops
>>> - the key observation: if you break by ^C out of xeno-regression-test,
>>>   *latency figures remain low*
>>> - note that breaking out of xeno-regression-test left some processes
>>>   running, obviously dd and ls: http://pastebin.ca/2313116
>>> - once these processes complete ( http://pastebin.ca/2313117) latency
>>>   goes up again.
>>>
>>> second data point:
>>> we have a report from another user, same kernel, Intel Q8200 Quad core
>>> board, which confirms 'dohell 900' in a separate window does drop latency
>>> significantly. This suggests it might not be board specific.
>>>
>>>
>>> This leaves us puzzled as to the causality here. We would really like to
>>> get rid of the latency spikes, but the shell script approach isn't
>>> appealing.
>>>
>>> Any suggestions?
>>>
>>
>> I've seen similar behaviour. In my case it had to do with the latency of
>> transitions of the cpu's idle states. The problem was worked around by
>> providing "nohlt idle=poll". I'm sure it is documented somewhere on the
>> xenomai website too.
>
> That will burn quite a bit of power, though. Maybe there is some BIOS
> switch to relax power saving mode a bit without giving up on halt. Also,
> do you see the same effect in text mode?
>
> In any case, confirming the latency source via the ipipe tracer is a
> good first step: http://xenomai.org/index.php/I-pipe:Tracer. You could
> programmatically break the trace once you detect the spike in your
> program. Post the resulting trace here for public discussion.
>
> Jan
>
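
To make the "break the trace once you detect the spike" suggestion above a
bit more concrete, here is a minimal sketch in plain C of the detection side
only. Everything in it is an assumption for illustration: the names
spike_detector, spike_detector_init, spike_detector_feed and spike_hook_t
are hypothetical, and the threshold is whatever you pick yourself. The hook
is the place where the tracer's freeze call would go; that call is
deliberately not shown here and should be taken from the I-pipe tracer page
linked above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: invoked once, on the first spike. In a real
 * program this is where the tracer would be frozen; that call is not
 * shown because it depends on the Xenomai/I-pipe version in use. */
typedef void (*spike_hook_t)(int64_t latency_ns, void *ctx);

struct spike_detector {
    int64_t      period_ns;      /* nominal thread period */
    int64_t      threshold_ns;   /* latency above this counts as a spike */
    int64_t      max_latency_ns; /* worst latency seen so far */
    bool         tripped;        /* true once the hook has fired */
    spike_hook_t hook;           /* may be NULL */
    void        *hook_ctx;
};

static void spike_detector_init(struct spike_detector *d,
                                int64_t period_ns, int64_t threshold_ns,
                                spike_hook_t hook, void *hook_ctx)
{
    assert(d != NULL);
    d->period_ns = period_ns;
    d->threshold_ns = threshold_ns;
    d->max_latency_ns = 0;
    d->tripped = false;
    d->hook = hook;
    d->hook_ctx = hook_ctx;
}

/* Feed the measured interval between two consecutive wakeups of the
 * periodic thread. Returns true only for the sample that trips the
 * detector, so the trace is frozen once and not overwritten later. */
static bool spike_detector_feed(struct spike_detector *d, int64_t interval_ns)
{
    assert(d != NULL);

    /* Latency is the absolute deviation from the nominal period. */
    int64_t latency = interval_ns - d->period_ns;
    if (latency < 0)
        latency = -latency;

    if (latency > d->max_latency_ns)
        d->max_latency_ns = latency;

    if (latency > d->threshold_ns && !d->tripped) {
        d->tripped = true;
        if (d->hook != NULL)
            d->hook(latency, d->hook_ctx);
        return true;
    }
    return false;
}
```

In the periodic thread you would call spike_detector_feed() once per cycle
with the difference between the current and the previous wakeup timestamp.
Firing the hook only once is a design choice: the first spike is the one
whose trace you want to keep, and later spikes should not replace it.
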

BTW, though unrelated to the latency issues: Running a 32-bit kernel on a
box with >1GB RAM is not a good choice, performance-wise. PAE is slow...

i386 is for low-end embedded only today. I'll continue testing it, but
it will surely receive less attention than x86-64, just like in mainline
Linux.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
