While experimenting with loop unrolling for the crc64 implementation, I also tried unrolling the "legacy" mismatch implementation and found it gave a worthwhile speedup: roughly 7% to 25% lower average time per operation in the runs below, depending on the file. The improvements were most pronounced on 32-bit JDK 11:
32 jdk 11 - LEGACY

    Benchmark                        (file)           Mode  Cnt      Score     Error  Units
    XZCompressionBenchmark.compress  ihe_ovly_pr.dcm  avgt    3     17.812 ±   0.588  ms/op
    XZCompressionBenchmark.compress  image1.dcm       avgt    3   8404.259 ± 391.678  ms/op
    XZCompressionBenchmark.compress  large.xml        avgt    3  16037.416 ± 467.379  ms/op

Unrolled

    Benchmark                        (file)           Mode  Cnt      Score     Error  Units
    XZCompressionBenchmark.compress  ihe_ovly_pr.dcm  avgt    3     13.624 ±   0.845  ms/op
    XZCompressionBenchmark.compress  image1.dcm       avgt    3   7833.118 ±  28.132  ms/op
    XZCompressionBenchmark.compress  large.xml        avgt    3  12838.831 ± 192.884  ms/op

32 jdk 11 - LEGACY (server)

    Benchmark                        (file)           Mode  Cnt      Score     Error  Units
    XZCompressionBenchmark.compress  ihe_ovly_pr.dcm  avgt    3     14.105 ±   0.081  ms/op
    XZCompressionBenchmark.compress  image1.dcm       avgt    3   8474.630 ± 518.903  ms/op
    XZCompressionBenchmark.compress  large.xml        avgt    3  16009.553 ± 529.315  ms/op

Unrolled

    Benchmark                        (file)           Mode  Cnt      Score     Error  Units
    XZCompressionBenchmark.compress  ihe_ovly_pr.dcm  avgt    3     10.513 ±   0.290  ms/op
    XZCompressionBenchmark.compress  image1.dcm       avgt    3   7900.578 ± 309.317  ms/op
    XZCompressionBenchmark.compress  large.xml        avgt    3  12871.200 ± 570.491  ms/op

The unrolled method:

    /**
     * Simply loops over all of the bytes, comparing one at a time.
     */
    @SuppressWarnings("unused")
    private static int legacyMismatch(
            byte[] a,
            int aFromIndex,
            int bFromIndex,
            int length) {
        int i = 0;
        for (int j = length - 7; i < j; i += 8) {
            if (a[aFromIndex + i] != a[bFromIndex + i])
                return i;
            if (a[aFromIndex + i + 1] != a[bFromIndex + i + 1])
                return i + 1;
            if (a[aFromIndex + i + 2] != a[bFromIndex + i + 2])
                return i + 2;
            if (a[aFromIndex + i + 3] != a[bFromIndex + i + 3])
                return i + 3;
            if (a[aFromIndex + i + 4] != a[bFromIndex + i + 4])
                return i + 4;
            if (a[aFromIndex + i + 5] != a[bFromIndex + i + 5])
                return i + 5;
            if (a[aFromIndex + i + 6] != a[bFromIndex + i + 6])
                return i + 6;
            if (a[aFromIndex + i + 7] != a[bFromIndex + i + 7])
                return i + 7;
        }
        for ( ; i < length; ++i) {
            if (a[aFromIndex + i] != a[bFromIndex + i]) {
                return i;
            }
        }
        return length;
    }
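Since the unrolled loop has to return exactly what the one-byte-at-a-time loop returns, a quick randomized comparison of the two can be reassuring before trusting the numbers. The following is only a sketch: the class name `UnrolledMismatchSketch` and the helpers `simpleMismatch` and `agreeOnRandomInputs` are made up for illustration and are not part of the library, and `unrolledMismatch` is a copy of the method posted above.

```java
import java.util.Random;

// Hypothetical stand-alone harness; none of these names exist in the library.
final class UnrolledMismatchSketch {

    /** Reference version: one comparison per loop iteration. */
    static int simpleMismatch(byte[] a, int aFromIndex, int bFromIndex, int length) {
        for (int i = 0; i < length; ++i) {
            if (a[aFromIndex + i] != a[bFromIndex + i]) {
                return i;
            }
        }
        return length;
    }

    /** Unrolled version, same logic as the method posted above. */
    static int unrolledMismatch(byte[] a, int aFromIndex, int bFromIndex, int length) {
        int i = 0;
        // Main loop: runs only while at least 8 bytes remain (i + 7 < length).
        for (int j = length - 7; i < j; i += 8) {
            if (a[aFromIndex + i] != a[bFromIndex + i]) return i;
            if (a[aFromIndex + i + 1] != a[bFromIndex + i + 1]) return i + 1;
            if (a[aFromIndex + i + 2] != a[bFromIndex + i + 2]) return i + 2;
            if (a[aFromIndex + i + 3] != a[bFromIndex + i + 3]) return i + 3;
            if (a[aFromIndex + i + 4] != a[bFromIndex + i + 4]) return i + 4;
            if (a[aFromIndex + i + 5] != a[bFromIndex + i + 5]) return i + 5;
            if (a[aFromIndex + i + 6] != a[bFromIndex + i + 6]) return i + 6;
            if (a[aFromIndex + i + 7] != a[bFromIndex + i + 7]) return i + 7;
        }
        // Tail loop: handles the remaining 0 to 7 bytes.
        for (; i < length; ++i) {
            if (a[aFromIndex + i] != a[bFromIndex + i]) {
                return i;
            }
        }
        return length;
    }

    /** Returns true if both versions agree on every randomly generated input. */
    static boolean agreeOnRandomInputs(long seed, int rounds) {
        Random random = new Random(seed);
        byte[] data = new byte[256];
        for (int round = 0; round < rounds; ++round) {
            // Mostly zeros, so that long matching runs (and the unrolled loop) get exercised.
            for (int k = 0; k < data.length; ++k) {
                data[k] = (byte) (random.nextInt(32) == 0 ? 1 : 0);
            }
            int aFrom = random.nextInt(128);   // 0..127
            int bFrom = random.nextInt(128);   // 0..127
            int length = random.nextInt(129);  // 0..128, so the highest index read is 254
            int expected = simpleMismatch(data, aFrom, bFrom, length);
            int actual = unrolledMismatch(data, aFrom, bFrom, length);
            if (expected != actual) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean ok = agreeOnRandomInputs(42L, 100_000);
        System.out.println(ok ? "versions agree" : "versions DISAGREE");
    }
}
```

The lengths are deliberately allowed to be 0 and to be non-multiples of 8, so both the `length - 7` bound of the main loop and the tail loop are covered.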