Re: [xz-devel] [PATCH v3] liblzma: Add multi-threaded decoder

Lasse Collin Mon, 29 Nov 2021 14:25:17 -0800

Hello!

On 2021-02-05 Sebastian Andrzej Siewior wrote:
> - Added enum `lzma_memlimit_opt' to lzma_stream_decoder_mt() as an
>   init parameter. The idea is to specify how to obey the memory limit
>   so the user can keep using one API and not worry to fail due to the
>   memory limit. Lets assume the archive has a 9MiB dictionary, 24MiB
>   block of uncompressed data. The archive contains two compressed
>   blocks of 10 MiB each. Using two threads, the memory requirement is
>   roughly (9 + 24 + 10) * 2 = 86 MiB
> 
>   On a system with 64 MiB of memory with additional 128MiB of swap it
>   likely leads to the use of (say 30 MiB) swap memory during
>   decompression which will slow down the whole operation.
>   The synchronous API would do just fine with only 9 MiB of memory.
> 
>   So to not complicate things, invoking lzma_stream_decoder_mt() with
>   a memory limit of 32 MiB three scenarios are possible:
>   - LZMA_MEMLIMIT_THREAD
>     One thread requires 43MiB of memory and would exceed the memory
>     limit. However, continue with one thread instead of possible two.
> 
>   - LZMA_MEMLIMIT_NO_THREAD
>     One thread requires 43MiB of memory and would exceed the memory
>     limit. Fallback to the synchronous API without buffered input /
>     output memory.
> 
>   - LZMA_MEMLIMIT_COMPLETE
>     In this scenario it would behave like LZMA_MEMLIMIT_NO_THREAD.
>     However, with a dictionary size > 32MiB it would abort.


In the old single-threaded code, if no memory usage limit is specified
the worst case memory usage with LZMA2 is about 4 GiB (the format
allows 4 GiB dict although the current encoder only supports 1536 MiB).
With the threaded decoder it's the same with LZMA_MEMLIMIT_NO_THREAD.

However, LZMA_MEMLIMIT_THREAD sounds a bit scary. There are no practical
limits to the block size so there can be a .xz file that makes the
decoder allocate a huge amount of memory. It doesn't even need to be an
intentionally malicious file, it just needs to have the size fields
present. Thus, I think LZMA_MEMLIMIT_THREAD should be removed.
One-thread multi-threaded mode will still be used with
LZMA_MEMLIMIT_NO_THREAD if the limit is high enough.

LZMA_MEMLIMIT_NO_THREAD should be the default in xz when no memory
usage limit has been explicitly specified. There needs to be a default
"soft limit" (the MemAvailable method is such) that will drop xz to
single-threaded mode if the soft limit is too high for threaded mode
(even with just one thread).

LZMA_MEMLIMIT_COMPLETE could be the mode to use when a memlimit is
explicitly specified (a "hard limit") on the xz command line. This would
match the existing behavior of the old single-threaded decoder. It
would be good to have a way to specify a soft limit on the xz command
line too.

It could make sense to have both soft and hard limit at the same time
but perhaps it gets too confusing: Soft limit that would be used to
restrict the number of threads (and even drop to single-threaded mode)
and hard limit which can return LZMA_MEMLIMIT_ERROR. If one is fine to
use 300 MiB in threaded mode but still wants to allow up to 600 MiB in
case the file *really* requires that much even in single-threaded mode,
then this would be useful.

Separate soft and hard limits might be convenient from implementation
point of view though. xz would need --memlimit-soft (or some better
name) which would always have some default value (like MemAvailable).
The threaded decoder in liblzma would need to take two memlimit values.
Then there would be no need for an enum (or a flag) to specify the
memlimit mode (assuming that LZMA_MEMLIMIT_THREAD is removed).

Extra idea, maybe useless: The --no-adjust option could be used to
specify that if the specified number of threads isn't possible due to
a memlimit then xz will abort. This is slightly weird as it doesn't
provide real performance guarantees anyway (block sizes could vary a
lot) but it's easy to implement if it is wanted.

I wonder if relying on the lzma_mt struct is useful for the decoder.
Perhaps the options could be passed directly as arguments as there are
still 2-3 fewer than needed for the encoder.

I've made some other minor edits locally already so I would prefer to
*not* get new patch revisions until I have committed something.
Comments are very welcome. :-)

Thanks!

-- 
Lasse Collin

Re: [xz-devel] [PATCH v3] liblzma: Add multi-threaded decoder

Reply via email to