Re: [xz-devel] [PATCH v3] liblzma: Add multi-threaded decoder

Sebastian Andrzej Siewior Sat, 04 Dec 2021 14:55:19 -0800

On 2021-11-30 00:25:11 [+0200], Lasse Collin wrote:
> Hello!
Hello,

> On 2021-02-05 Sebastian Andrzej Siewior wrote:
> > - Added enum `lzma_memlimit_opt' to lzma_stream_decoder_mt() as an
> >   init parameter. The idea is to specify how to obey the memory limit
> >   so the user can keep using one API and not worry to fail due to the
> >   memory limit. Lets assume the archive has a 9MiB dictionary, 24MiB
> >   block of uncompressed data. The archive contains two compressed
> >   blocks of 10 MiB each. Using two threads, the memory requirement is
> >   roughly (9 + 24 + 10) * 2 = 86 MiB
> > 
> >   On a system with 64 MiB of memory with additional 128MiB of swap it
> >   likely leads to the use of (say 30 MiB) swap memory during
> >   decompression which will slow down the whole operation.
> >   The synchronous API would do just fine with only 9 MiB of memory.
> > 
> >   So to not complicate things, invoking lzma_stream_decoder_mt() with
> >   a memory limit of 32 MiB three scenarios are possible:
> >   - LZMA_MEMLIMIT_THREAD
> >     One thread requires 43MiB of memory and would exceed the memory
> >     limit. However, continue with one thread instead of possible two.
> > 
> >   - LZMA_MEMLIMIT_NO_THREAD
> >     One thread requires 43MiB of memory and would exceed the memory
> >     limit. Fallback to the synchronous API without buffered input /
> >     output memory.
> > 
> >   - LZMA_MEMLIMIT_COMPLETE
> >     In this scenario it would behave like LZMA_MEMLIMIT_NO_THREAD.
> >     However, with a dictionary size > 32MiB it would abort.
> 
> In the old single-threaded code, if no memory usage limit is specified
> the worst case memory usage with LZMA2 is about 4 GiB (the format
> allows 4 GiB dict although the current encoder only supports 1536 MiB).
> With the threaded decoder it's the same with LZMA_MEMLIMIT_NO_THREAD.
> 
> However, LZMA_MEMLIMIT_THREAD sounds a bit scary. There are no practical
> limits to the block size so there can be a .xz file that makes the
> decoder allocate a huge amount of memory. It doesn't even need to be an
> intentionally malicious file, it just needs to have the size fields
> present. Thus, I think LZMA_MEMLIMIT_THREAD should be removed.
> One-thread multi-threaded mode will still be used with
> LZMA_MEMLIMIT_NO_THREAD if the limit is high enough.


You are right. I can't think of a reason to force one thread.

> LZMA_MEMLIMIT_NO_THREAD should be the default in xz when no memory
> usage limit has been explicitly specified. There needs to be a default
> "soft limit" (the MemAvailable method is such) that will drop xz to
> single-threaded mode if the soft limit is too high for threaded mode
> (even with just one thread).
> 
> LZMA_MEMLIMIT_COMPLETE could be the mode to use when a memlimit is
> explicitly specified (a "hard limit") on the xz command line. This would
> match the existing behavior of the old single-threaded decoder. It
> would be good to have a way to specify a soft limit on the xz command
> line too.
> 
> It could make sense to have both soft and hard limit at the same time
> but perhaps it gets too confusing: Soft limit that would be used to
> restrict the number of threads (and even drop to single-threaded mode)
> and hard limit which can return LZMA_MEMLIMIT_ERROR. If one is fine to
> use 300 MiB in threaded mode but still wants to allow up to 600 MiB in
> case the file *really* requires that much even in single-threaded mode,
> then this would be useful.

I see. This might sounds like advanced usage.
Looking from the package-manager point of view (installs packages and
needs it decompressed), the work flow is "I have here 80MiB of free
memory but I need to have this decompressed so do what needs to be done
in order fulfil your job." So if the dictionary is 128MiB in size, do it
without threading and hope for swap space but this *used* to work before
that, too.

> Separate soft and hard limits might be convenient from implementation
> point of view though. xz would need --memlimit-soft (or some better
> name) which would always have some default value (like MemAvailable).
> The threaded decoder in liblzma would need to take two memlimit values.
> Then there would be no need for an enum (or a flag) to specify the
> memlimit mode (assuming that LZMA_MEMLIMIT_THREAD is removed).

Ah I see. So one would say soft-limit 80MiB, hard-limit 2^60bytes and
would get no threading at all / LZMA_MEMLIMIT_NO_THREAD. And with soft
1GiB, hard 2^60bytes would get the threading mode. (2^60 is made up
no limit).

> Extra idea, maybe useless: The --no-adjust option could be used to
> specify that if the specified number of threads isn't possible due to
> a memlimit then xz will abort. This is slightly weird as it doesn't
> provide real performance guarantees anyway (block sizes could vary a
> lot) but it's easy to implement if it is wanted.
> 
> I wonder if relying on the lzma_mt struct is useful for the decoder.
> Perhaps the options could be passed directly as arguments as there are
> still 2-3 fewer than needed for the encoder.

Thre is
- num threads
- flags
- memlimit
- timeout

One struct to rule them all and you could extend it without the need to
change the ABI.
I took one of the reserved ones for the memlimit. If you put the two
memory limits and number of threads in one init/configure function then
only flags and timeout is left. Maybe that would be enought then.

> I've made some other minor edits locally already so I would prefer to
> *not* get new patch revisions until I have committed something.
> Comments are very welcome. :-)
> 
> Thanks!

Sebastian

Re: [xz-devel] [PATCH v3] liblzma: Add multi-threaded decoder

Reply via email to