I have added a comment to the PR with updated benchmark results: https://github.com/tukaani-project/xz-java/pull/13#issuecomment-1977705691
On Fri, Mar 1, 2024 at 6:23 AM Brett Okken <brett.okken...@gmail.com> wrote: > > I found and resolved the difference: > https://github.com/tukaani-project/xz-java/pull/13/commits/1e4550e06d8cbec4079b2b2fba4a2245307cc4e6 > > It was indeed in BT4 and had to do with searching for the > niceLenLimit. I will update the benchmarks over the weekend, as they > take some time to run. > > Brett > > On Thu, Feb 29, 2024 at 8:47 PM Brett Okken <brett.okken...@gmail.com> wrote: > > > > > Thanks! Ideally there would be one commit to add the minimal portable > > > version, then separate commits for each optimized variant. > > > > Would you like me to remove the Unsafe based impl from > > https://github.com/tukaani-project/xz-java/pull/13? > > > > > So far I have given it only a quick try. array_comp_incremental seems > > > faster than xz-java.git master. Compression time was reduced by about > > > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my > > > computer is old so I don't doubt your higher numbers. > > > > How are you testing? I am using jmh, so it has a warm up period before > > actually measuring, giving the jvm plenty of opportunity to perform > > optimizations. If you are doing single shot executions to compress a > > file, that could provide pretty different results. > > > > > With array_comparison_performance the improvement seems to be less, > > > maybe 5 %. I didn't test much yet but it still seems clear that > > > array_comp_incremental is faster on my computer. > > > > Going back to the previous question, this could be due to fact that I > > collapsed some class hierarchy in the _incremental pr. This could take > > the optimizer a bit longer to figure out. > > > > > However, your code produces different output compared to xz-java.git > > > master so the speed comparison isn't entirely fair. I assume there was > > > no intent to affect the encoder output with these changes so I wonder > > > what is going on. Both of your branches produce the same output so it's > > > something common between them that makes the difference. > > > > This was definitely not the intent, and I had not noticed this previously. > > > > With the 3 files I test with, none have any difference with preset of > > 3. The smallest file (ihe_ovly_pr.cm) also has no difference at preset > > 6. > > > > With the ~25MB image1.dcm (mostly a greyscale bmp), the PR versions > > produce more compressed content at preset 6. > > 1.9 = 4,041,476 > > PR = 4,004,156 > > > > There is a smaller difference with the ~50MB xml file, but strangely, > > the PR version is slightly bigger. > > 1.9 = 1,589,512 > > PR = 1,589,564 > > > > Given that I am only seeing differences with preset of 6, I am > > guessing the difference must be in BT4. > > The result still seems to be valid (at least the java XZInputStream > > reads it back correctly). > > There is clearly a subtle "defect" somewhere, but I cannot tell if it > > is in the current trunk or the PR. My best guess is that there is an > > off by 1 error in one or the other. > > > > Brett > > > > On Thu, Feb 29, 2024 at 11:35 AM Lasse Collin <lasse.col...@tukaani.org> > > wrote: > > > > > > On 2024-02-25 Brett Okken wrote: > > > > I created https://github.com/tukaani-project/xz-java/pull/13 with the > > > > bare bones changes to utilize a utility for array comparisons and an > > > > Unsafe implementation. > > > > When/if that is reviewed and approved, we can move on through the > > > > other implementation options. > > > > > > Thanks! Ideally there would be one commit to add the minimal portable > > > version, then separate commits for each optimized variant. > > > > > > So far I have given it only a quick try. array_comp_incremental seems > > > faster than xz-java.git master. Compression time was reduced by about > > > 10 %. :-) This is with OpenJDK 21.0.2, only a quick test, and my > > > computer is old so I don't doubt your higher numbers. > > > > > > With array_comparison_performance the improvement seems to be less, > > > maybe 5 %. I didn't test much yet but it still seems clear that > > > array_comp_incremental is faster on my computer. > > > > > > However, your code produces different output compared to xz-java.git > > > master so the speed comparison isn't entirely fair. I assume there was > > > no intent to affect the encoder output with these changes so I wonder > > > what is going on. Both of your branches produce the same output so it's > > > something common between them that makes the difference. > > > > > > I plan to get back to this next week. > > > > > > > > One thing I wonder is if JNI could help. > > > > > > > > It would most likely make things faster, but also more complicated. I > > > > like the java version for the simplicity. I am not necessarily looking > > > > to compete with native performance, but would like to get improvements > > > > where they are reasonably available. Here there is some complexity in > > > > supporting multiple implementations for different versions and/or > > > > architectures, but that complexity does not intrude into the core of > > > > the xz code. > > > > > > I think your thoughts are similar to mine here. Java version is clearly > > > slower but it's nicer code to read too. A separate class for buffer > > > comparisons indeed doesn't hurt the readability of the core code. > > > > > > On the other hand, if Java version happened to be used a lot then JNI > > > could save both time (up to 50 %) and even electricity. java.util.zip > > > uses native zlib for the performance-critical code. > > > > > > In the long run both faster Java code and JNI might be worth doing. > > > There's more than enough pure Java stuff to do for now so any JNI > > > thoughts have to wait. > > > > > > -- > > > Lasse Collin > > >