> > > This: > > > > > > diff --git a/src/liblzma/common/stream_decoder_mt.c > > > b/src/liblzma/common/stream_decoder_mt.c > > > --- a/src/liblzma/common/stream_decoder_mt.c > > > +++ b/src/liblzma/common/stream_decoder_mt.c > > > @@ -786,7 +786,7 @@ read_output_and_wait(struct lzma_stream_coder *coder, > > > if (mythread_cond_timedwait(&coder->cond, > > > &coder->mutex, > > > wait_abs) != 0) { > > > - ret = LZMA_TIMED_OUT; > > > + ret = LZMA_OK; > > > break; > > > } > > > } else { > > > > > > Should "fixes" it. At some point the main thread needs to check if the > > > next thread is able to make progress or not and then return LZMA_OK so > > > that the upper layer can figure out that no progress is made. Otherwise > > > it stucks in the LZMA_TIMED_OUT loop. > > > > This fixes it just for xz, but if no timeout is specified it will > > still deadlock. > > I haven't looked at the code enough to understand how to fix it for both, > > but I will start to look into that. > > No, it is for everyone. In the timeout case we need to check if the > first thread in line can make progress. If we don't provide data new > data to the thread and the thread consumed everything it had then the > thread won't make progress. If the function gets invoked with 0 new data > then we should return LZMA_OK at which point the upper layer (between XZ > binary and the library) will notice that no progress is made and return > an error to xz.
I tested this locally by setting the timeout for xz to 0 on line 57 of src/xz/coder.c and it deadlocks on a truncated xz file. It seems like a race condition because it deadlocks ~90% of the time in my liblzma tests. When debugging, I can see that the main thread does not return from lzma_code when the LZMA_FINISH action is given. When the decoder receives LZMA_FINISH, it should probably signal to the worker threads that no more input is coming and to exit with error if they are in the middle of a block. Even if LZMA_FINISH is not given, we still need to avoid deadlock at all costs. I will come up with a patch for this since it sounds like a fun problem :) > The above patch is not correct because if you do it as I suggeted then > it is possible that an error is returned because the thread is slow and > did not yet make progress. I agree that the above patch only "fixes" the problem. Small timeouts and slow threads will result in false positive deadlock detections. > > I was following the conversation about the soft and hard memory limiting. > > If a user wanted decoding to fail if it can't be done multithreaded and > > update > > the memory limit as needed, that can't be done right now. It's a minor issue > > that only matters for liblzma, but it would be nice to be able to update > > both > > limits after decoding has started. I don't consider this a bug, more like > > missing a nice to have feature. One easy solution is to add a new API > > function to liblzma to update the soft memory limit for the multithreaded > > decoder and do nothing / return an error on all other coders. I will add > > a patch for this if you guys think it is a good idea. > > Maybe I missundertand you. But if you set memlimit_stop to the same > value as memlimit_threading then you have either multi threaded > decompression or none at all right? I understand that setting memlimit_stop = memlimit_threading will result in either threaded decoding or no decoding. The problem is only the memlimit_stop can be updated. There is no way to update memlimit_threading after initialization to ensure the "give me threaded decoding or none at all" is still true. I don't think most users will care about this, but it would be nice to provide the flexibility for those that do. Jia Tan