I accidentally hit reply instead of reply all.

> > Shouldn't that be (i & 3) != 0?
> > An offset of 0 should not enter this loop, but 0 & 3 does not equal 1.
>
> The idea really is that offset of 1 doesn't enter the loop, thus the
> main slicing-by-4 loop is misaligned. I don't know why it makes a
> difference and I'm no longer even sure why I decided to try it. You can
> try different (i & 3) != { 0, 1, 2, 3 } combinations.

I misunderstood your intent. I thought you intended to get the for
loop onto 4-byte alignment.

I updated the benchmark to test with offsets [0,1,2] and also to reduce
the length by an additional [0,1,2]. This should provide a good mix of
inputs that need alignment at the beginning and have extra bytes at
the end.
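The grid of inputs described above could be enumerated roughly as
follows. This is only a sketch, not the actual benchmark code: the class
name OffsetLengthGrid, the method combinations, and the reading of
"reducing the length by an additional [0,1,2]" as len = size - off - trim
are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the actual benchmark: lists the (off, len) pairs
// for offsets 0..2 combined with trimming 0..2 extra bytes from the end.
final class OffsetLengthGrid {
    static List<int[]> combinations(int size) {
        List<int[]> result = new ArrayList<>();
        for (int off = 0; off <= 2; off++) {
            for (int trim = 0; trim <= 2; trim++) {
                // Assumed reading: the length shrinks by the offset and by the trim.
                result.add(new int[] { off, size - off - trim });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the nine (off, len) pairs for a 1024-byte buffer.
        for (int[] c : combinations(1024)) {
            System.out.println("off=" + c[0] + " len=" + c[1]);
        }
    }
}
```

Each pair would then be passed to update(buf, off, len), so that both
the leading bytes and the trailing bytes vary between runs.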

Thus far I have only tested on JDK 11 on 64-bit Windows, but the
fairly clear winner is:

    public void update(byte[] buf, int off, int len) {
        final int end = off + len;
        int i = off;
        if (len > 3) {
            // Consume (i & 3) leading bytes one at a time; the cases
            // fall through on purpose.
            switch (i & 3) {
                case 3:
                    crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
                case 2:
                    crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
                case 1:
                    crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            }
            for (int j = end - 3; i < j; i += 4) {
                final int tmp = (int)crc;
                crc = TABLE[3][(tmp & 0xFF) ^ (buf[i] & 0xFF)] ^
                      TABLE[2][((tmp >>> 8) & 0xFF) ^ (buf[i + 1] & 0xFF)] ^
                      (crc >>> 32) ^
                      TABLE[1][((tmp >>> 16) & 0xFF) ^ (buf[i + 2] & 0xFF)] ^
                      TABLE[0][((tmp >>> 24) & 0xFF) ^ (buf[i + 3] & 0xFF)];
            }
        }
        // Consume the remaining 0 to 3 trailing bytes; the cases fall
        // through on purpose.
        switch ((end - i) & 3) {
            case 3:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            case 2:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            case 1:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
        }
    }
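
A quick way to sanity-check the method above is to compare it against a
plain byte-at-a-time loop for every offset and length of a small buffer.
The sketch below is not part of the benchmark: the class name
Crc64SliceCheck, the helpers step, bytewise, sliced and allMatch, and the
table setup (reflected ECMA-182 polynomial, four 256-entry tables) are
assumptions made so the example is self-contained. The method is restated
as a static function of crc instead of updating a field.

```java
import java.util.Random;

// Sketch only: all names here are illustrative, and the table setup is assumed.
final class Crc64SliceCheck {
    private static final long[][] TABLE = new long[4][256];
    static {
        final long poly = 0xC96C5795D7870F42L; // assumed: reflected ECMA-182
        for (int s = 0; s < 4; s++) {
            for (int b = 0; b < 256; b++) {
                // TABLE[s][b] is the effect of byte b followed by s zero bytes.
                long r = (s == 0) ? b : TABLE[s - 1][b];
                for (int k = 0; k < 8; k++) {
                    r = ((r & 1) == 1) ? (r >>> 1) ^ poly : r >>> 1;
                }
                TABLE[s][b] = r;
            }
        }
    }

    // One byte-at-a-time step, the same expression as in the switch cases.
    static long step(long crc, byte b) {
        return TABLE[0][(b ^ (int) crc) & 0xFF] ^ (crc >>> 8);
    }

    // Reference implementation: one table lookup per byte.
    static long bytewise(long crc, byte[] buf, int off, int len) {
        for (int i = off, end = off + len; i < end; i++) {
            crc = step(crc, buf[i]);
        }
        return crc;
    }

    // Restatement of the posted update method as a pure function.
    static long sliced(long crc, byte[] buf, int off, int len) {
        final int end = off + len;
        int i = off;
        if (len > 3) {
            switch (i & 3) { // intentional fall-through
                case 3: crc = step(crc, buf[i++]);
                case 2: crc = step(crc, buf[i++]);
                case 1: crc = step(crc, buf[i++]);
            }
            for (int j = end - 3; i < j; i += 4) {
                final int tmp = (int) crc;
                crc = TABLE[3][(tmp & 0xFF) ^ (buf[i] & 0xFF)] ^
                      TABLE[2][((tmp >>> 8) & 0xFF) ^ (buf[i + 1] & 0xFF)] ^
                      (crc >>> 32) ^
                      TABLE[1][((tmp >>> 16) & 0xFF) ^ (buf[i + 2] & 0xFF)] ^
                      TABLE[0][((tmp >>> 24) & 0xFF) ^ (buf[i + 3] & 0xFF)];
            }
        }
        switch ((end - i) & 3) { // intentional fall-through
            case 3: crc = step(crc, buf[i++]);
            case 2: crc = step(crc, buf[i++]);
            case 1: crc = step(crc, buf[i++]);
        }
        return crc;
    }

    // Compares both versions for offsets 0..3 and every length that fits.
    static boolean allMatch() {
        byte[] buf = new byte[67];
        new Random(1).nextBytes(buf);
        for (int off = 0; off <= 3; off++) {
            for (int len = 0; off + len <= buf.length; len++) {
                if (bytewise(-1L, buf, off, len) != sliced(-1L, buf, off, len)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allMatch() ? "all match" : "MISMATCH");
    }
}
```

A check like this only confirms that the two code paths agree on the
result. It says nothing about speed, which still needs the benchmark.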


Brett
