Sebastian Smolorz wrote:
> Jan Kiszka wrote:
>> Cornelius Köpp wrote:
>>> Hello,
>>> I ran the latency test from the testsuite on several hardware and
>>> software configurations. Running on Xenomai 2.4.2, Linux 2.6.24, the
>>> results show a "strange" behavior: in kernel mode (-t1) the latencies
>>> decrease steadily and linearly. See the attached plot
>>> 'drifting_latencys_in_kernelmode.png' of the latency test running for
>>> 48h on a Pentium3 700. This effect could be reproduced, even on other
>>> hardware (Pentium-M 1400).
>> As our P3 boards did not support APIC-based timing (IIRC), your kernel
>> has correctly disabled the related kernel support. But the Pentium M
>> should be fine. So could you check if we are seeing some TSC clocks
>> vs. PIT timer rounding issue by enabling the local APIC on the Pentium M?
> Enabling the local APIC on the Pentium M makes no difference WRT this
> bug.
>>> The user mode (-t0) did not show any drift, but is influenced by a
>>> test run in kernel mode beforehand.
>> What do you mean by "is influenced"?
> Cornelius saw the following behaviour: If the latency test was run in
> user space first, no drift appeared over time. If latency was run in
> kernel space (with the reported negative drift), a subsequent latency
> test in user space also showed negative values, but with no additional
> drift over time.
>>> I talked with Sebastian Smolorz about this and he built his own
>>> independent kernel-config to check. He got the same drifting effect
>>> with Xenomai 2.4.2 and Xenomai 2.4.3 running latency over several
>>> hours. His kernel-config is attached as
>>> 'config-2.6.24-xenomai-2.4.3__ssm'.
>>> Our kernel-configs are both based on a config used with Xenomai 2.3.4
>>> and Linux without any drifting effects.
>> 2.3.x did not incorporate the new TSC-to-ns conversion. Maybe it is
>> not a PIC vs. APIC thing, but rather a rounding problem of larger TSC
>> values (that naturally show up when the system runs for a longer time).
> This hint seems to point in the right direction. I tried out a
> modified pod_32.h (xnarch_tsc_to_ns() commented out) so that the old
> implementation in include/asm-generic/bits/pod.h was used. The drifting
> bug disappeared. So there seems to be a buggy x86-specific
> implementation of this routine.

Hmm, maybe even a conceptual issue: the multiply-shift-based
xnarch_tsc_to_ns is not as precise as the still multiply-divide-based
xnarch_ns_to_tsc. So when converting from tsc over ns back to tsc, we
may lose some bits, maybe too many bits...
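
To make the suspicion concrete, here is a minimal sketch of such a
round trip. It is NOT the Xenomai code: struct tsc_scale, scale_init()
and the other helper names are made up for illustration, and the way
the multiplier is derived (truncating division, 32-bit result) is only
an assumption about how a multiply-shift factor might be set up.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical scale factor: ns ~= (tsc * mul) >> shift. */
struct tsc_scale {
    uint64_t freq_hz;    /* TSC frequency */
    uint32_t mul;        /* truncated multiplier (assumption) */
    unsigned int shift;  /* 0..32 */
};

static struct tsc_scale scale_init(uint64_t freq_hz, unsigned int shift)
{
    struct tsc_scale s;
    uint64_t mul;

    assert(freq_hz > 0 && shift <= 32);
    mul = (NSEC_PER_SEC << shift) / freq_hz;  /* fractional part is lost */
    assert(mul <= UINT32_MAX);
    s.freq_hz = freq_hz;
    s.mul = (uint32_t)mul;
    s.shift = shift;
    return s;
}

/* Multiply-divide: floor(ns * freq / 1e9), split to avoid overflow. */
static uint64_t ns_to_tsc_div(const struct tsc_scale *s, uint64_t ns)
{
    return (ns / NSEC_PER_SEC) * s->freq_hz
         + (ns % NSEC_PER_SEC) * s->freq_hz / NSEC_PER_SEC;
}

/* Multiply-shift: floor(tsc * mul / 2^shift), computed in two halves. */
static uint64_t tsc_to_ns_mulshift(const struct tsc_scale *s, uint64_t tsc)
{
    uint64_t hi = tsc >> 32, lo = tsc & 0xffffffffULL;

    return ((hi * s->mul) << (32 - s->shift)) + ((lo * s->mul) >> s->shift);
}

/* ns -> tsc (divide) -> ns (multiply-shift), minus the original ns. */
static int64_t roundtrip_error_ns(const struct tsc_scale *s, uint64_t ns)
{
    uint64_t tsc = ns_to_tsc_div(s, ns);

    return (int64_t)tsc_to_ns_mulshift(s, tsc) - (int64_t)ns;
}
```

With scale_init(700000000ULL, 31), roundtrip_error_ns() is zero at
ns = 0 and becomes more negative as the absolute value grows, on the
order of a microsecond per hour of uptime for these made-up parameters.
The error scales with the absolute tsc value, which would look like a
linear drift.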

It looks like this bites us in the kernel latency tests (-t2 should
suffer as well). Those recalculate their timeouts each round based on
absolute nanoseconds. In contrast, the periodic user mode task of -t0
uses a periodic timer that is forwarded via a tsc-based interval.
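
The difference between the two schemes can be sketched as follows.
This is a toy model, not the testsuite code: TSC_HZ, SHIFT, MUL and
both latency_*() helpers are invented for illustration, the timer is
assumed to fire a fixed number of ticks (hw_delay_tsc) after the
programmed tsc value, and the user-mode path is assumed to convert
only the small tsc delta to nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
/* Illustrative values, not taken from Xenomai. */
#define TSC_HZ 700000000ULL
#define SHIFT  31
#define MUL    ((NSEC_PER_SEC << SHIFT) / TSC_HZ)  /* truncated */

/* Multiply-divide, exact floor(ns * TSC_HZ / 1e9). */
static uint64_t ns_to_tsc(uint64_t ns)
{
    return (ns / NSEC_PER_SEC) * TSC_HZ
         + (ns % NSEC_PER_SEC) * TSC_HZ / NSEC_PER_SEC;
}

/* Multiply-shift, floor(tsc * MUL / 2^SHIFT), computed in two halves. */
static uint64_t tsc_to_ns(uint64_t tsc)
{
    uint64_t hi = tsc >> 32, lo = tsc & 0xffffffffULL;

    return ((hi * MUL) << (32 - SHIFT)) + ((lo * MUL) >> SHIFT);
}

/* Kernel-test style: deadline kept in absolute ns, wake-up time read
 * back through tsc_to_ns(). Returns the latency seen in the last round. */
static int64_t latency_abs_ns(uint64_t rounds, uint64_t period_ns,
                              uint64_t hw_delay_tsc)
{
    uint64_t expected_ns = 0, wake_tsc = 0;

    assert(period_ns > 0);
    for (uint64_t i = 0; i < rounds; i++) {
        expected_ns += period_ns;
        wake_tsc = ns_to_tsc(expected_ns) + hw_delay_tsc;
    }
    return (int64_t)tsc_to_ns(wake_tsc) - (int64_t)expected_ns;
}

/* Periodic-timer style: deadline forwarded by a fixed tsc interval,
 * only the (small) tsc delta is converted to ns. */
static int64_t latency_tsc_forwarded(uint64_t rounds, uint64_t period_ns,
                                     uint64_t hw_delay_tsc)
{
    uint64_t period_tsc = ns_to_tsc(period_ns);
    uint64_t expected_tsc = 0, wake_tsc = 0;

    assert(period_ns > 0);
    for (uint64_t i = 0; i < rounds; i++) {
        expected_tsc += period_tsc;
        wake_tsc = expected_tsc + hw_delay_tsc;
    }
    return (int64_t)tsc_to_ns(wake_tsc - expected_tsc);
}
```

Calling both with a 1 ms period and hw_delay_tsc = 7000, the
tsc-forwarded variant reports the same latency after one round and
after an hour's worth of rounds, while the absolute-ns variant reports
a smaller value after the hour. Under these assumptions the conversion
error only matters when it is applied to absolute values.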

You (or Cornelius) could try to analyse the calculation path of the
involved timeouts, specifically to understand why the scheduled timeout
of the underlying task timer (which is tsc-based) tends to diverge from
the calculated one (ns-based).


Xenomai-core mailing list