On 2022-04-21 Jia Tan wrote: > The current behavior of LZMA_FINISH in the decoder is a little > confusing because it requires calling lzma_code a few times without > providing more input to trigger a LZMA_BUF_ERROR.
The current behavior basically ignores the use LZMA_FINISH when determining if LZMA_BUF_ERROR should be returned. I understand that it can be confusing since after LZMA_FINISH there is nothing a new call to lzma_code() can do to avoid the problem. However, I don't think it's a problem in practice: - Application that calls lzma_code() in a loop will just call lzma_code() again and eventually get LZMA_BUF_ERROR. - Application that does single-shot decoding without a loop tends to check for LZMA_STREAM_END as a success condition and treats other codes, including LZMA_OK, as a problem. In the worst case a less robust application could break if this LZMA_OK becomes LZMA_BUF_ERROR as the existing API doc says that LZMA_BUF_ERROR won't be returned immediately. The docs don't give any indication that LZMA_FINISH could affect this behavior. - An extra call or two to lzma_code() in an error condition doesn't matter in terms of performance. > This patch replaces return LZMA_OK lines with: > > return action == LZMA_FINISH && *out_pos != out_size ? LZMA_BUF_ERROR > : LZMA_OK; I don't like replacing a short statement with a copy-pasted long statement since it is needed in so many places. A benefit of the current approach is that the handling of LZMA_BUF_ERROR is in lzma_code() and (most of the time) the rest of code can ignore the problem completely. Also, the condition *out_pos != out_size is confusing in a few places. For example, in SEQ_STREAM_HEADER: --- a/src/liblzma/common/stream_decoder.c +++ b/src/liblzma/common/stream_decoder.c @@ -118,7 +118,8 @@ stream_decode(void *coder_ptr, const lzma_allocator *allocator, // Return if we didn't get the whole Stream Header yet. if (coder->pos < LZMA_STREAM_HEADER_SIZE) - return LZMA_OK; + return action == LZMA_FINISH && *out_pos != out_size + ? LZMA_BUF_ERROR : LZMA_OK; coder->pos = 0; In SEQ_STREAM_HEADER no output can be produced, only input will be read. Still the condition checks for full output buffer which is not only confusing but wrong: if there was an empty Stream ahead, having no output space would be fine! In such a situation this can return LZMA_OK even when the intention was to return LZMA_BUF_ERROR due to truncated input. To make this work, only places that can produce output should check if the output buffer is full. However, I don't think the current behavior is worth changing. As you pointed out, it is a bit weird (and I had never noticed it myself before you mentioned it). It's not actually broken though and some applications doing single-shot decoding might even rely on the current behavior. Trying to change this could cause problems in rare cases and, if not done carefully enough, introduce new bugs. So I thank you for the patch but it won't be included. -- Lasse Collin