On 26/04/2025 8:55 pm, Linus Torvalds wrote:
> So I think that manual cmov pattern for x86-32 should be replaced with
>
>         bool zero;
>
>         asm("bsfl %[in],%[out]"
>             CC_SET(z)
>             : CC_OUT(z) (zero),
>               [out]"=r" (r)
>             : [in] "rm" (x));
>
>         return zero ? 0 : r+1;
>
> instead (that's ffs(), and fls() would need the same thing except with
> bsrl instead, of course).
>
> I bet that would actually improve code generation.

It is possible to do better still.  ffs/fls are commonly found inside
loops where x is the loop condition too.  Therefore, using
statically_true() to provide a form without the zero compatibility turns
out to be a win.

> And I also bet it doesn't actually matter, of course.

Something that neither Linux nor Xen had, which makes a reasonable
difference, is a for_each_set_bit() optimised over a scalar value.  The
existing APIs make it all too easy to spill the loop condition to memory.

~Andrew
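
P.S. For illustration, a minimal sketch of both ideas, not the actual Linux
or Xen code.  It assumes GCC or Clang, uses __builtin_ctz() in place of the
bsfl asm so that it stays portable, and every sketch_* name is hypothetical.
statically_true() is written in the shape Linux uses; whether the compiler
really proves x != 0 at a given call site depends on the compiler and the
optimisation level.

```c
#include <stdio.h>

/* True only when the compiler can prove the condition at build time. */
#define statically_true(x) (__builtin_constant_p(x) && (x))

/*
 * Hypothetical helper: 1-based index of the lowest set bit, or 0 if x == 0.
 * __builtin_ctz() stands in for bsfl here (assumption, for portability).
 */
static inline unsigned int sketch_ffs(unsigned int x)
{
    /* Form without the zero compatibility: no check, no cmov/branch. */
    if (statically_true(x != 0))
        return __builtin_ctz(x) + 1;

    /* General form: __builtin_ctz(0) is undefined, so test for zero first. */
    return x ? __builtin_ctz(x) + 1 : 0;
}

/*
 * Hypothetical scalar iterator.  The value is copied into a loop-local
 * variable, so the loop condition can live in a register, and v_ is known
 * to be nonzero at the point where sketch_ffs() is called.
 * Visits set bits from least to most significant; iter is 0-based.
 */
#define sketch_for_each_set_bit(iter, val)                      \
    for (unsigned int v_ = (val), iter;                         \
         v_ && ((iter) = sketch_ffs(v_) - 1, 1);                \
         v_ &= v_ - 1)

/* Usage: for mask 0x8005 this visits bits 0, 2 and 15, in that order. */
static inline void sketch_print_bits(unsigned int mask)
{
    sketch_for_each_set_bit(i, mask)
        printf("%u ", i);
    printf("\n");
}
```

The v_ &= v_ - 1 step clears the lowest set bit, so the loop runs once per
set bit rather than once per bit position.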