On 03/11/2011 07:19 AM, Alexandre Julliard wrote:
Henri Verbeet<[email protected]>  writes:

Something like the following may be faster:

static const int bit_pos[] =
{
     0x00, 0x01, 0x0e, 0x02, 0x0f, 0x1b, 0x12, 0x03,
     0x1e, 0x10, 0x1c, 0x16, 0x18, 0x13, 0x0a, 0x04,
     0x1f, 0x0d, 0x1a, 0x11, 0x1d, 0x15, 0x17, 0x09,
     0x0c, 0x19, 0x14, 0x08, 0x0b, 0x07, 0x06, 0x05,
};
const unsigned int c = 0x7dc4d65;
return bit_pos[((x&  -x) * c)>>  27];

I'd expect that mostly due to eliminating the branches and the
implicit shift&  mask operations in something like "y.c[1]".
Maybe, but there's no point. This is just a fallback implementation for
broken platforms, any platform that we care about should never use this.

For systems with fast multiplication that way is probably faster. This was just something I noticed that took me 5 mins and was clearly better than what we had.


Reply via email to