Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> So, OK, I will try to do something for x86 (either reduce the number of
>>>>>> registers used by the C code, or reduce the assembly to the bare
>>>>>> minimum). But, please, pick my generic implementation of llmulshft, it
>>>>>> was carefully written.
>>>>> Yes, it is the better choice for 32 bit archs (my previous tests didn't
>>>>> reflect the usage in Xenomai truly, redoing them made my generic
>>>>> version fall behind yours). Will include it.
>>>> Done, see -v6. Then I added that two-liner for x86_64 rthal_llmulshft,
>>>> fixed the BITS_PER_LONG bug, and enabled generic-based support for ARM
>>>> (testing welcome!).
>>>> While at it: My series now also includes rthal_llimd for x86_64,
>>>> another two-liner.
>>> v6 is not in the download area.
>> Mpf, forgot to press "update". Done.
> Ok, I agree with the fast-tsc-to-ns patch: I could not get gcc to
> generate code with fewer moves on x86 (which is, for me, if it was still
> needed, yet another proof that these register moves are harmless).

No question -- from the average performance POV.

> However, I do not agree with the x86_64 llimd: it will not work if m is
> greater than 2G, that is why we implement llimd in terms of ullimd on
> other architectures.

Please help me, I don't see it yet:

m is 32 bit and gets zero-extended to 64 bit, i.e. without considering any
sign (as it should be). Then we do a signed 64x64 bit multiplication, but we
know for sure that the second factor is always positive. The same holds for
the division. Basic
tests ((-1*1000000000)/2 vs. (-1*3000000000)/2) confirmed this on the
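
For reference, the argument above can be sketched in portable C. This is only an illustration, not the actual x86_64 rthal_llimd from the series: the names llimd_zext and llimd_sext are hypothetical, and the sketch assumes the product op * m fits into 64 bit. llimd_zext zero-extends m and d before the signed multiply and divide, which is the case described above. llimd_sext shows what would happen if m were sign-extended instead, which is presumably the failure mode feared for m greater than 2G.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: m and d are zero-extended to 64 bit, so both are
 * non-negative when used in the signed 64x64 multiply and the divide.
 * Assumes op * m does not overflow 64 bit. */
static int64_t llimd_zext(int64_t op, uint32_t m, uint32_t d)
{
    return (op * (int64_t)m) / (int64_t)d;
}

/* Hypothetical counter-example: m is reinterpreted as a signed 32 bit
 * value before the multiply. The sign extension is written out
 * arithmetically to avoid an implementation-defined conversion. */
static int64_t llimd_sext(int64_t op, uint32_t m, uint32_t d)
{
    int64_t sm = (int64_t)m - ((m & 0x80000000u) ? 4294967296LL : 0);

    return (op * sm) / (int64_t)d;
}

/* The two basic tests mentioned above, plus the sign-extended variant. */
static void llimd_self_check(void)
{
    assert(llimd_zext(-1, 1000000000u, 2) == -500000000LL);
    assert(llimd_zext(-1, 3000000000u, 2) == -1500000000LL);
    /* With sign extension, m = 3000000000 turns into -1294967296. */
    assert(llimd_sext(-1, 3000000000u, 2) == 647483648LL);
}
```

Calling llimd_self_check() from a small main() is enough to run the checks. Only the sign-extended variant goes wrong for m above 2G; the zero-extended one returns the expected negative result in both cases.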



Xenomai-core mailing list
