On Wed, Feb 3, 2021 at 2:56 PM Lasse Collin <lasse.col...@tukaani.org> wrote:
>
> On 2021-02-01 Brett Okken wrote:
> > I have played with this quite a bit and have come up with a slightly
> > modified change which does not regress for the smallest of the sample
> > objects and shows a nice improvement for the 2 larger files.
>
> It seems to regress horribly if dist is zero. A file with a very long
> sequence of the same byte is good for testing.
>
Would this be a valid test of what you are describing?

    final byte[] bytes = new byte[16 * 1024];
    Arrays.fill(bytes, (byte) -75);
    final byte[] compressed;
    try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
         final XZOutputStream xos = new XZOutputStream(baos, new LZMA2Options())) {
        for (int i = 0; i < 10240; ++i) {
            xos.write(bytes);
        }
        xos.finish();
        compressed = baos.toByteArray();
    }

The source is effectively 160MB of the same byte value.

I found a strange bit of behavior with this case in the compression. In
LZMAEncoderNormal.calcLongRepPrices, I am seeing a case where

    int len2Limit = Math.min(niceLen, avail - len - 1);

results in -1 (avail and len are both 8). This results in calling
LZEncoder.getMatchLen with a lenLimit of -1. Is that expected? When I was
testing with java 8 and the ArrayUtil changes, this resulted in an
ArrayIndexOutOfBoundsException.
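To make the arithmetic concrete, here is a minimal standalone sketch of the computation described above. The niceLen value of 64 is an assumption for illustration (in the real encoder it comes from the LZMA2Options configuration); avail and len are the values observed in the failing case. The clamp at the end is only a hypothetical guard, not a claim about how the encoder should be fixed:

```java
public class Len2LimitSketch {
    public static void main(String[] args) {
        int niceLen = 64;   // assumed value for illustration
        int avail = 8;      // observed in the failing case
        int len = 8;        // observed in the failing case

        // avail - len - 1 == -1, so the min() goes negative
        int len2Limit = Math.min(niceLen, avail - len - 1);
        System.out.println(len2Limit);  // prints -1

        // A hypothetical defensive clamp would keep the limit non-negative
        // before it is passed along as a length limit:
        int clamped = Math.max(0, len2Limit);
        System.out.println(clamped);    // prints 0
    }
}
```

With a negative limit, any loop bounded by it that indexes into the match buffer can walk out of range, which would be consistent with the ArrayIndexOutOfBoundsException seen under the java 8 ArrayUtil changes.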