> Have you tested with 32-bit Java too? It's quite possible that it's
> better to use ints than longs on 32-bit system. If so, that should be
> detected at runtime too, I guess.
I have now run benchmarks using the 32-bit JRE on a 64-bit Windows system.
That adds another variable: the 32-bit JRE uses the client JVM by default,
which does not use the C2 compiler. The longs appear to be a bit faster in
client mode (not as optimized) and the ints a bit faster in server mode.

> In XZ Utils the arrays have extra room at the end so that memcmplen.h
> can always read 4/8/16 bytes at a time. Since this is easy to do, I
> think it should be done in XZ for Java too to avoid special handling of
> the last bytes.

Somewhat surprisingly, this actually appears to make things slightly worse,
because it introduces a check to see whether a detected difference was
beyond the actual length.

> Since Java in general is memory safe, having bound checks with Unsafe is
> nice as long as it doesn't hurt performance too much. This
>
>     if (aFromIndex < 0 || aFromIndex + length > a.length ||
>             bFromIndex < 0 || bFromIndex + length > b.length) {
>
> is a bit relaxed though since it doesn't catch integer overflows.
> Something like this would be more strict:
>
>     if (length < 0 ||
>             aFromIndex < 0 || aFromIndex > a.length - length ||
>             bFromIndex < 0 || bFromIndex > b.length - length) {

Nice catch. Arrays approaching 2GB are not common yet, but seem likely in
the future.

> Comparing byte arrays as ints or longs results in unaligned/misaligned
> memory access. MethodHandles.byteArrayViewVarHandle docs say that this
> is OK. A quick web search gave me an impression that it might not be
> safe with Unsafe though. Can you verify how it is with Unsafe? If it
> isn't allowed, dropping support for Unsafe may be fine. It's just the
> older Java versions that would use it anyway.

This is a bit trickier. The closest I can find to documentation is from the
getInt method[1].
    The object referred to by o is an array, and the offset is an integer
    of the form B+N*S, where N is a valid index into the array, and B and
    S are the values obtained by arrayBaseOffset and arrayIndexScale
    (respectively) from the array's class. The value referred to is the
    Nth element of the array.
    ...
    However, the results are undefined if that variable is not in fact of
    the type returned by this method.

Taken in context with what the new jdk.internal.misc.Unsafe.getLongUnaligned
implementation looks like[2] (it only calls getLong when the offset is
8-byte aligned and otherwise assembles the value from narrower reads),
unaligned access is not safe with Unsafe.getLong.

> Do you have a way to check how these methods behave on Android and ARM?
> (I understand that this might be too much work to check. This may be
> skipped.)

I /might/ be able to run some benchmarks on AWS Graviton2 instances.

I will send out updated code soon. I currently have implementations for
JDK 9+ that use VarHandle on x86 (ints for 32-bit, longs for 64-bit) and
the vectorized Arrays.mismatch on non-x86. For older JDKs, x86 will use
Unsafe (again, ints for 32-bit and longs for 64-bit). There are also
VarHandle implementations that first compare individual bytes to try to
bring the subsequent memory reads into alignment. The default choice of
implementation can be overridden by setting a system property.

[1] - https://github.com/openjdk/jdk11u-dev/blob/master/src/jdk.unsupported/share/classes/sun/misc/Unsafe.java#L127-L139
[2] - https://github.com/openjdk/jdk11u-dev/blob/master/src/java.base/share/classes/jdk/internal/misc/Unsafe.java#L3398
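
P.S. To make the integer overflow point from the bounds-check discussion concrete, here is a minimal sketch. The class and method names are made up for illustration and are not from the patch, and each method checks a single array rather than both. It only shows why the addition-based check can be fooled while the subtraction-based one cannot.

```java
final class BoundsCheckSketch {
    // Relaxed form: fromIndex + length can wrap around to a negative int,
    // which then compares as "not greater than" arrayLength.
    static boolean relaxedInBounds(int arrayLength, int fromIndex, int length) {
        return !(fromIndex < 0 || fromIndex + length > arrayLength);
    }

    // Strict form: once length >= 0 is established, arrayLength - length
    // cannot overflow because arrayLength is never negative.
    static boolean strictInBounds(int arrayLength, int fromIndex, int length) {
        return !(length < 0 || fromIndex < 0 || fromIndex > arrayLength - length);
    }

    public static void main(String[] args) {
        int arrayLength = 16;
        int fromIndex = 8;
        int length = Integer.MAX_VALUE; // 8 + MAX_VALUE wraps to a negative int

        System.out.println("relaxed accepts: "
                + relaxedInBounds(arrayLength, fromIndex, length));
        System.out.println("strict accepts:  "
                + strictInBounds(arrayLength, fromIndex, length));
    }
}
```

With these inputs the relaxed form accepts the range and the strict form rejects it. A negative length behaves the same way, which is why the strict form tests length < 0 first.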