Hello!

This sounds great but unfortunately I haven't been able to read it properly yet. It may take a few more days. I apologize. :-(

As a minor update from my side, I committed lzma_outq changes that I
mostly did two weeks ago. I believe it should now be usable for
threaded decompression too to decouple output buffers from threads.
However, I don't mean that you must use it. If you wish to use it,
it's OK to do the change later.

The docs are poor but:

  - When starting a Block and a new output buffer is needed,
    these must be called in this order:

      * lzma_outq_has_buf(): if it fails, all buffers are
        in use already.

      * lzma_outq_prealloc_buf(): ensures that a buffer of
        requested size is available in the cache.

      * lzma_outq_get_buf(): gets a buffer from the cache.

      * lzma_outq_enable_partial_output() [MUTEX]: calls a callback
        to tell the thread at the head of the queue to start making
        the progress available to the main thread. Must be called
        with the main mutex locked.

  - When reading decompressed output from the queue:

      * lzma_outq_is_readable() [MUTEX] can be used to poll if
        there is output available.

      * lzma_outq_read() [MUTEX] is used for reading. It will return
        LZMA_STREAM_END after the end of each buffer. In the
        decompressor this is a sign that
        lzma_outq_enable_partial_output() should be called before
        trying to read more data.

      * lzma_outq_is_empty() can be used to detect when no more
        buffers are pending and thus the end of the file may have
        been reached.

  - Worker threads:

      * lzma_outbuf.pos and .finished must be touched only with
        the main mutex locked.

      * A simple call-back is needed for use with
        lzma_outq_enable_partial_output().

On 2020-12-24 Sebastian Andrzej Siewior wrote:
> I moved parts of the memcpy() out the locked section. Only the thread,
> that is currently decompressing is waking the main thread. However the
> current output position is updated under the main-thread's mutex. So
> that might be not optimal.

I would expect it to be fine. When only one worker thread is updating
its status to the main thread, there won't be that much contention on
the mutex.

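To make the locking pattern under discussion concrete, here is a minimal sketch using plain pthreads. It is not the lzma_outq code or the patch itself: demo_outbuf, demo_worker() and demo_run() are hypothetical names, and a memcpy() from a static input array stands in for decompression. The worker copies each chunk without holding the mutex and takes it only to publish the new position and wake the main thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { CHUNK = 4096, CHUNKS = 64, TOTAL = CHUNK * CHUNKS };

/* Hypothetical stand-in for an output buffer shared by one worker
 * and the main thread. This is not the real lzma_outbuf. */
struct demo_outbuf {
	pthread_mutex_t mutex;  /* plays the role of the main mutex */
	pthread_cond_t cond;
	size_t pos;             /* bytes published; guarded by mutex */
	bool finished;          /* guarded by mutex */
	unsigned char data[TOTAL];
};

static unsigned char demo_input[TOTAL];

static void *demo_worker(void *arg)
{
	struct demo_outbuf *buf = arg;
	size_t local_pos = 0;

	while (local_pos < TOTAL) {
		/* The expensive copy runs without the mutex. The main
		 * thread only reads below buf->pos, so the two ranges
		 * never overlap. */
		memcpy(buf->data + local_pos, demo_input + local_pos, CHUNK);
		local_pos += CHUNK;

		/* Only the position update and the wakeup are locked. */
		pthread_mutex_lock(&buf->mutex);
		buf->pos = local_pos;
		buf->finished = (local_pos == TOTAL);
		pthread_cond_signal(&buf->cond);
		pthread_mutex_unlock(&buf->mutex);
	}
	return NULL;
}

/* Runs one worker and consumes its output as it becomes available.
 * Returns the number of bytes read, or 0 on error or data mismatch. */
static size_t demo_run(void)
{
	static struct demo_outbuf buf;
	pthread_t thread;
	size_t read_pos = 0;
	bool done = false;
	bool ok = true;

	for (size_t i = 0; i < TOTAL; ++i)
		demo_input[i] = (unsigned char)(i * 31u);

	buf.pos = 0;
	buf.finished = false;
	pthread_mutex_init(&buf.mutex, NULL);
	pthread_cond_init(&buf.cond, NULL);

	if (pthread_create(&thread, NULL, demo_worker, &buf) != 0)
		return 0;

	while (!done) {
		/* Sleep until the worker has published more output. */
		pthread_mutex_lock(&buf.mutex);
		while (buf.pos == read_pos && !buf.finished)
			pthread_cond_wait(&buf.cond, &buf.mutex);
		size_t avail = buf.pos;
		done = buf.finished;
		pthread_mutex_unlock(&buf.mutex);

		/* Consume the newly published range outside the lock. */
		if (memcmp(buf.data + read_pos, demo_input + read_pos,
				avail - read_pos) != 0)
			ok = false;
		read_pos = avail;
	}

	pthread_join(thread, NULL);
	pthread_cond_destroy(&buf.cond);
	pthread_mutex_destroy(&buf.mutex);
	return ok ? read_pos : 0;
}
```

With a single worker publishing, the mutex is held only for two stores and a signal per chunk, which is why little contention is expected.
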
In the new lzma_outq, lzma_outq_read() is called with the main mutex
locked, so the copying from the intermediate buffer to the final
output buffer is done with the mutex locked. Your code is better
here. This isn't hard to change in lzma_outq but I didn't do it for
this commit to keep it less messy.

Anyway, I will read your patch carefully as soon as I'm able to focus
on it. Thanks a lot for your help!

-- 
Lasse Collin | IRC: Larhzu @ IRCnet & Freenode