On April 29, 2025 3:04:30 PM PDT, Linus Torvalds <torva...@linux-foundation.org> wrote:
>On Tue, 29 Apr 2025 at 14:59, Andrew Cooper <andrew.coop...@citrix.com> wrote:
>>
>> do_variable_ffs() doesn't quite work.
>>
>> REP BSF is LZCNT, and unconditionally writes its output operand, and
>> defeats the attempt to preload with -1.
>>
>> Drop the REP prefix, and it should work as intended.
>
>Bah. That's what I get for just doing it blindly without actually
>looking at the kernel source. I just copied the __ffs() thing - and
>there the 'rep' is not for the zero case - which we don't care about -
>but because lzcnt performs better on newer CPUs.
>
>So you're obviously right.
>
>         Linus

Yeah, the encoding of lzcnt as REP BSR was a real mistake, because the outputs are different (lzcnt counts leading zeros, bsr returns the bit index), so you still need instruction-specific postprocessing.
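
To make the difference concrete, here is a rough sketch in plain C that models the four instructions in software (no inline asm). One correction to the mnemonics used above: the F3-prefixed BSF encoding decodes as TZCNT, and LZCNT is the F3-prefixed BSR. The model_* and ffs_via_* names are made up for this illustration and are not kernel functions. The models assume BSF/BSR leave the destination untouched when the source is zero, which is what the "preload with -1" trick relies on (AMD documents that behavior; the Intel SDM calls the destination undefined).

```c
#include <assert.h>
#include <stdint.h>

/* BSF model: index of the lowest set bit. For src == 0 the destination
 * is modeled as unchanged (assumption, see the note above). */
int model_bsf(uint32_t src, int dest)
{
    if (src == 0)
        return dest;
    int i = 0;
    while (!(src & 1u)) {
        src >>= 1;
        i++;
    }
    return i;
}

/* TZCNT model (the REP BSF encoding): always writes the destination,
 * returning the operand size (32) for a zero source. */
int model_tzcnt(uint32_t src)
{
    return src ? model_bsf(src, 0) : 32;
}

/* BSR model: index of the highest set bit, destination unchanged for 0. */
int model_bsr(uint32_t src, int dest)
{
    if (src == 0)
        return dest;
    int i = 31;
    while (!(src & 0x80000000u)) {
        src <<= 1;
        i--;
    }
    return i;
}

/* LZCNT model (the REP BSR encoding): count of leading zero bits. */
int model_lzcnt(uint32_t src)
{
    return src ? 31 - model_bsr(src, 0) : 32;
}

/* ffs()-style result: 1-based bit index, 0 for a zero input. */
int ffs_via_bsf(uint32_t x)     { return model_bsf(x, -1) + 1; }

/* Same code with the REP prefix: the -1 preload is overwritten. */
int ffs_via_rep_bsf(uint32_t x) { return model_tzcnt(x) + 1; }

void check_models(void)
{
    assert(ffs_via_bsf(8) == 4 && ffs_via_rep_bsf(8) == 4); /* agree when nonzero */
    assert(ffs_via_bsf(0) == 0);      /* preload survives */
    assert(ffs_via_rep_bsf(0) == 33); /* preload defeated */
    assert(model_bsr(8, 0) == 3 && model_lzcnt(8) == 28);   /* 28 == 31 - 3 */
}
```

So for nonzero inputs BSF and TZCNT are interchangeable, while BSR and LZCNT never are: a caller that wants a bit index has to compute 31 - lzcnt, which is the instruction-specific postprocessing.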