# Re: [xz-devel] java crc64 implementation

```
I accidentally hit reply instead of reply all.

> > Shouldn't that be (i & 3) != 0?
> > An offset of 0 should not enter this loop, but 0 & 3 does not equal 1.
>
> The idea really is that offset of 1 doesn't enter the loop, thus the
> main slicing-by-4 loop is misaligned. I don't know why it makes a
> difference and I'm no longer even sure why I decided to try it. You can
> try different (i & 3) != { 0, 1, 2, 3 } combinations.
```
```
I misunderstood your intent. I thought you were trying to get the
for loop onto 4-byte alignment.

I updated the benchmark to test with offsets [0,1,2] and also to reduce
the length by an additional [0,1,2]. This should provide a good mix of
inputs that need alignment at the beginning and leave extra bytes at
the end.

Thus far I have only tested with JDK 11 on 64-bit Windows, but the fairly
clear winner is:

public void update(byte[] buf, int off, int len) {
    final int end = off + len;
    int i = off;

    if (len > 3) {
        // Consume (off & 3) leading bytes one at a time; the cases
        // intentionally fall through.
        switch (i & 3) {
            case 3:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            case 2:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            case 1:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
        }

        // Main slicing-by-4 loop: four input bytes per iteration.
        for (int j = end - 3; i < j; i += 4) {
            final int tmp = (int) crc;
            crc = TABLE[3][(tmp & 0xFF) ^ (buf[i] & 0xFF)] ^
                  TABLE[2][((tmp >>> 8) & 0xFF) ^ (buf[i + 1] & 0xFF)] ^
                  (crc >>> 32) ^
                  TABLE[1][((tmp >>> 16) & 0xFF) ^ (buf[i + 2] & 0xFF)] ^
                  TABLE[0][((tmp >>> 24) & 0xFF) ^ (buf[i + 3] & 0xFF)];
        }
    }

    // Consume the remaining 0 to 3 trailing bytes; the cases fall through.
    switch ((end - i) & 3) {
        case 3:
            crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
        case 2:
            crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
        case 1:
            crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
    }
}

Brett

```
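For anyone who wants to sanity-check the variant above before timing it, here is a minimal sketch that compares it with a plain byte-at-a-time loop over the same offsets [0,1,2] and length reductions [0,1,2] described in the message. It only checks that the results agree; it is not the benchmark used for the timings and says nothing about speed. The class name `Crc64Sketch` and the helpers `updateSimple` and `matchesReference` are hypothetical, and the table generation and initial value are assumed to follow the usual CRC-64/XZ definition (reflected ECMA-182 polynomial, state initialised to all ones and inverted on output).

```java
import java.util.Random;

/** Sketch: checks the slicing-by-4 update against a byte-at-a-time reference. */
class Crc64Sketch {
    // Assumption: reflected ECMA-182 polynomial, as in the CRC-64/XZ definition.
    private static final long POLY = 0xC96C5795D7870F42L;
    private static final long[][] TABLE = new long[4][256];

    static {
        // TABLE[s][b] is the state after byte b followed by s zero bytes.
        for (int s = 0; s < 4; ++s) {
            for (int b = 0; b < 256; ++b) {
                long r = (s == 0) ? b : TABLE[s - 1][b];
                for (int k = 0; k < 8; ++k) {
                    r = ((r & 1) == 1) ? (r >>> 1) ^ POLY : r >>> 1;
                }
                TABLE[s][b] = r;
            }
        }
    }

    private long crc = -1;

    long getValue() {
        return ~crc;
    }

    /** Hypothetical reference: one table lookup per input byte. */
    void updateSimple(byte[] buf, int off, int len) {
        for (int i = off, end = off + len; i < end; ++i) {
            crc = TABLE[0][(buf[i] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
        }
    }

    /** The variant from the message above, copied without behavioural changes. */
    void update(byte[] buf, int off, int len) {
        final int end = off + len;
        int i = off;
        if (len > 3) {
            switch (i & 3) { // cases fall through on purpose
                case 3:
                    crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
                case 2:
                    crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
                case 1:
                    crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            }
            for (int j = end - 3; i < j; i += 4) {
                final int tmp = (int) crc;
                crc = TABLE[3][(tmp & 0xFF) ^ (buf[i] & 0xFF)] ^
                      TABLE[2][((tmp >>> 8) & 0xFF) ^ (buf[i + 1] & 0xFF)] ^
                      (crc >>> 32) ^
                      TABLE[1][((tmp >>> 16) & 0xFF) ^ (buf[i + 2] & 0xFF)] ^
                      TABLE[0][((tmp >>> 24) & 0xFF) ^ (buf[i + 3] & 0xFF)];
            }
        }
        switch ((end - i) & 3) { // cases fall through on purpose
            case 3:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            case 2:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
            case 1:
                crc = TABLE[0][(buf[i++] ^ (int) crc) & 0xFF] ^ (crc >>> 8);
        }
    }

    /** Compares both paths for offsets 0..2 and length reductions 0..2. */
    static boolean matchesReference(int size, long seed) {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        for (int off = 0; off <= 2; ++off) {
            for (int trim = 0; trim <= 2; ++trim) {
                int len = size - off - trim;
                if (len < 0) {
                    continue; // buffer too small for this combination
                }
                Crc64Sketch fast = new Crc64Sketch();
                Crc64Sketch slow = new Crc64Sketch();
                fast.update(data, off, len);
                slow.updateSimple(data, off, len);
                if (fast.getValue() != slow.getValue()) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int size = 0; size <= 64; ++size) {
            if (!matchesReference(size, 42L)) {
                throw new AssertionError("mismatch at size " + size);
            }
        }
        System.out.println("all offset/length combinations match");
    }
}
```

Small buffer sizes are included on purpose, since they exercise the `len > 3` guard and the case where the leading switch leaves fewer than four bytes for the main loop.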