On 2021-01-20 Sebastian Andrzej Siewior wrote:
> On 2021-01-18 23:52:50 [+0200], Lasse Collin wrote:
> > I have understood that *in practice* the problem with the xz command
> > line tool is limited to "xz -T0" usage so fixing this use case is
> > enough for most people. Please correct me if I missed something.
>
> Correct.

There is some code for special behavior with -T0 now for both
compression and decompression. I haven't updated the man page yet but
the commit messages should be helpful. I hope it can be documented so
that it sounds simple enough. :-)

> In the parallel decompress I added code on Linux to query the
> available memory. I would prefer that as an upper limit on 64bit if no
> limit is given. The reason is that *this* amount of memory is safe to
> use without over-committing / involving swap.

This may be the way to go on Linux but I didn't add it yet. The
committed code uses total_ram / 4. Since MemAvailable is Linux-specific,
something more broadly available needs to exist for better portability,
and total_ram / 4 could perhaps be it. It can be tweaked if needed,
it's just a starting point.

> For 32bit applications I would cap that limit to 2.5 GiB or so. The
> reason is that the *normal* case is to run 32bit application on a
> 32bit kernel and so likely only 3GiB can be addressed at most (minus
> a few details like linked in libs, NULL page, guard pages and so on).
> The 32bit application on 64bit kernel is probably a shortcut where
> something is done a 32bit chroot - like building a package.
>
> I'm not sure what a sane upper limit is on other OSes. Limitting it on
> 32bit does probably more good than bad if there is no -M parameter.

I think a generic cap needs to be below 2 GiB. For example, 32-bit
MIPS might be able to do only 2 GiB. There could be OS+arch-specific
exceptions though.

The code currently in xz.git uses 1400 MiB. There needs to be some
extra room in case repeated mallocs and frees fragment the address
space a little. Perhaps it's too conservative but it allows eight
compression threads at the default xz -6, and one thread at -9 in
threaded mode (so it can create a file that can be decompressed in
threaded mode).
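
To make the two numbers above concrete, here is a minimal sketch of how
such a default could be computed. It is not the code in xz.git: the
function name default_threaded_memlimit() and the pointer_bits parameter
are made up for illustration, and it only combines the total_ram / 4
starting point with the 1400 MiB cap for 32-bit builds.

```c
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)

/*
 * Hypothetical helper, not the actual xz code.
 * total_ram is the amount of physical RAM in bytes and pointer_bits
 * is the pointer size of the running binary (32 or 64).
 */
uint64_t default_threaded_memlimit(uint64_t total_ram, unsigned pointer_bits)
{
	/* Portable starting point: a quarter of the physical RAM. */
	uint64_t limit = total_ram / 4;

	/* On 32-bit, stay well below 2 GiB so that some address space
	 * is left over if malloc/free cycles fragment it. */
	if (pointer_bits <= 32 && limit > 1400 * MiB)
		limit = 1400 * MiB;

	return limit;
}
```

With 16 GiB of RAM this gives 4 GiB for a 64-bit build and 1400 MiB for
a 32-bit one. An OS+arch-specific exception would just be another branch
before the generic cap.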

> > An alternative "fix" for the liblzma case could be adding a simple
> > API function that would scale down the number of threads in a
> > lzma_mt structure based on a memory usage limit and if the
> > application is 32 bits. Currently the thread count and LZMA2
> > settings adjusting code is in xz, not in liblzma.
>
> It might help. dpkg checks the memlimit with
> lzma_stream_encoder_mt_memusage() and decreases the memory limit until
> it fits. It looks simpler compared to rpm's attempt and various
> exceptions.

Now that the lzma_mt structure contains memlimit_threading already, a
flag could be added to use it to reduce the number of threads at the
encoder initialization.

I suppose reducing the thread count would go a long way. It doesn't
affect the compressed output so it can be done even when people want
reproducible output.

> > The idea for the current 4020 MiB special limit is based on a patch
> > that was in use in FreeBSD to solve the problem of 32-bit xz on
> > 64-bit kernel. So at least FreeBSD should be supported to not make
> > 32-bit xz worse under 64-bit FreeBSD kernel.
>
> Is this a common case?

I don't *know* but I guess some people build 32-bit packages on a
64-bit kernel so it may be a common enough use case.

> While poking around, Linux has this personality() syscall/function.
> There is a flag called PER_LINUX32_3GB and PER_LINUX_32BIT which are
> set if the command is invoked with `linux32' say
>     linux32 xz
>
> then it would set that flag set and could act. It is not set by
> starting a 32bit application on a 64bit kernel on its own or on a
> 32bit kernel. I don't know if this is common practise but I use this
> in my chroots. So commands like `uname -m' return `i686' instead of
> `x86_64'. If other chroot environments do it as well then it could be
> used as a hack to assume that it is run on 64bit kernel. That is if
> we want that ofcourse :)

I haven't looked at this but it sounds like it could be useful.
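
A rough, Linux-only sketch of such a check is below. It is untested
against real chroot setups and rests on an assumption: that `linux32'
selects PER_LINUX32 as the base persona. It doesn't cover
PER_LINUX_32BIT. The helper names persona_is_linux32() and
running_under_linux32() are made up for illustration.

```c
#include <stdbool.h>
#include <sys/personality.h>

/*
 * Hypothetical helper: true if the base persona is PER_LINUX32.
 * PER_MASK strips flag bits such as ADDR_LIMIT_3GB, so
 * PER_LINUX32_3GB matches as well.
 */
bool persona_is_linux32(unsigned long persona)
{
	return (persona & PER_MASK) == PER_LINUX32;
}

/*
 * Hypothetical helper: query the current persona without changing it.
 * Passing 0xffffffff to personality() only reads the current value.
 */
bool running_under_linux32(void)
{
	int persona = personality(0xffffffff);
	if (persona == -1)
		return false;	/* Query failed, assume nothing. */

	return persona_is_linux32((unsigned long)persona);
}
```

A 32-bit xz could treat a true result as a hint that it runs on a
64-bit kernel. It stays a heuristic because nothing forces a chroot
setup to use linux32.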
If xz knows that it has 4 GiB of address space the default limit could
be much higher.

-- 
Lasse Collin