On 26/04/2025 8:55 pm, Linus Torvalds wrote:
> So I think that manual cmov pattern for x86-32 should be replaced with
>
>         bool zero;
>
>         asm("bsfl %[in],%[out]"
>             CC_SET(z)
>             : CC_OUT(z) (zero),
>               [out]"=r" (r)
>             : [in] "rm" (x));
>
>         return zero ? 0 : r+1;
>
> instead (that's ffs(), and fls() would need the same thing except with
> bsrl instead, of course).
>
> I bet that would actually improve code generation.

It is possible to do better still.

ffs() and fls() are commonly found inside loops where x is also the loop
condition.  Using statically_true() to provide a form that skips the
zero handling whenever the compiler can prove x is nonzero therefore
turns out to be a win.
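
As a rough illustration of the shape (a sketch, not the actual patch):
sketch_ffs() is a made-up name, and __builtin_ctz() stands in for the
bsfl asm above because, like BSF, its result is undefined for a zero
input.  statically_true() is written here the way I'd expect it to be
defined.  Whether the fast path is selected inside a given loop depends
on the compiler and optimisation level; both paths return the same
answer for nonzero x, so correctness does not depend on it.

```c
/* True only when the compiler can prove the condition at build time. */
#define statically_true(x) (__builtin_constant_p(x) && (x))

/* Hypothetical helper: ffs() with an optional zero-free fast path. */
static inline __attribute__((always_inline)) int sketch_ffs(unsigned int x)
{
    /* Fast path: x is provably nonzero, so no zero handling is emitted. */
    if (statically_true(x != 0))
        return __builtin_ctz(x) + 1;

    /* General path: preserve the ffs(0) == 0 contract. */
    return x ? __builtin_ctz(x) + 1 : 0;
}

/* Typical caller: x is both the loop condition and the ffs() input. */
static inline unsigned int sketch_count_bits(unsigned int x)
{
    unsigned int n = 0;

    while (x) {
        x &= ~(1u << (sketch_ffs(x) - 1)); /* clear the lowest set bit */
        n++;
    }

    return n;
}
```

In sketch_count_bits() the loop condition already establishes x != 0,
which is exactly the case where the zero check is redundant.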

> And I also bet it doesn't actually matter, of course.

Something that neither Linux nor Xen had, which makes a reasonable
difference, is a for_each_set_bit() optimised over a scalar value.  The
existing APIs make it all too easy to spill the loop condition to memory.
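
A minimal sketch of what I mean, assuming GCC/Clang builtins.  The name
for_each_set_bit_scalar() and its exact form are illustrative only, not
the API in either tree.  Linux's for_each_set_bit(bit, addr, size) takes
a pointer to a bitmap, and taking the address of a local is what tends
to force it out to memory.  Here the working copy is a plain local that
can stay in a register.

```c
/*
 * Hypothetical macro: iterate over the set bits of a scalar value,
 * lowest first.  'bit' must be an integer lvalue declared by the caller.
 */
#define for_each_set_bit_scalar(bit, val)                       \
    for (unsigned long long v_ = (val);                         \
         v_ && ((bit) = __builtin_ctzll(v_), 1);                \
         v_ &= v_ - 1) /* clear the lowest set bit */

/* Usage example: sum the positions of all set bits in a mask. */
static inline unsigned int sketch_sum_bit_positions(unsigned long long mask)
{
    unsigned int bit, sum = 0;

    for_each_set_bit_scalar(bit, mask)
        sum += bit;

    return sum;
}
```

The loop condition tests v_ before __builtin_ctzll() runs, so the
builtin never sees a zero input and no separate zero check is needed
in the body.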

~Andrew
