On Fri, Sep 28, 2012 at 7:44 AM, Stefan Behnel <[email protected]> wrote:
> there is an unfortunate interaction between the "progressive" parsing mode
> and the loading of an external DTD, e.g. to inject defaulted attribute
> values. I see this in lxml's iterparse() implementation that started
> failing to inject them in libxml2 2.9.0. It uses incremental push parsing.
I ran into this problem too. Here is a test case that reproduces it
using xmllint:
% cat x.xml
<!DOCTYPE x SYSTEM "x.dtd">
<x/>
% cat x.dtd
<!ELEMENT x EMPTY>
% xmllint --valid --noout --stream x.xml
x.xml:2: element x: validity error : No declaration for element x
<x/>
^
x.xml:3: element x: validity error : No declaration for element x
^
Document x.xml does not validate
>
> The problem results from the fact that xmlSAX2ExternalSubset() in SAX2.c
> reuses the existing parser context, which, in this case, is in progressive
> mode. When it calls into xmlParseExternalSubset(), that starts by running
> the "GROW" macro, which is a no-op in progressive mode. Thus, no data is
> available and xmlParseExternalSubset() terminates without doing anything.
>
> I'm not currently sure why it worked in older releases. I suspect that one
> of the many additional places that now set the ctxt->progressive field to 1
> might have triggered it.
A git bisect session points to this commit as the problem:
[5353bbf7dda0a01462109337c5fa34859d3e6d0b] More fixups on the push
parser behaviour
>
> I'm not entirely sure about the right way to fix this. Maybe
> xmlSAX2ExternalSubset() should also back up and restore the "progressive"
> field of the context and then set it to 0 before calling
> xmlParseExternalSubset()? I attached a patch that does that and that fixes
> the problem for me.
This patch fixes my test case as well.
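
For anyone reading this without the attachment, here is a rough sketch of
the save/restore idea described above. It is not the actual patch and it
does not use libxml2 at all: mock_ctxt, mock_grow(),
mock_parse_external_subset() and mock_external_subset_fixed() are
hypothetical stand-ins that only model the behaviour discussed in this
thread (a refill step that does nothing while the "progressive" flag is
set).

```c
/* Hypothetical stand-in for the parser context; NOT libxml2's xmlParserCtxt. */
typedef struct {
    int progressive; /* 1 while push ("progressive") parsing is active */
    int avail;       /* bytes currently buffered for the parser */
    int pending;     /* bytes a refill could still pull in from the input */
} mock_ctxt;

/* Models the behaviour described for GROW: refill only when not progressive. */
void mock_grow(mock_ctxt *ctxt)
{
    if (ctxt->progressive)
        return; /* no-op in progressive mode */
    ctxt->avail += ctxt->pending;
    ctxt->pending = 0;
}

/* Models the external subset parse: returns the number of bytes it saw. */
int mock_parse_external_subset(mock_ctxt *ctxt)
{
    int consumed;

    mock_grow(ctxt);
    consumed = ctxt->avail;
    ctxt->avail = 0;
    return consumed;
}

/* The proposed approach: back up the flag, clear it, parse, restore it. */
int mock_external_subset_fixed(mock_ctxt *ctxt)
{
    int old_progressive = ctxt->progressive;
    int consumed;

    ctxt->progressive = 0;
    consumed = mock_parse_external_subset(ctxt);
    ctxt->progressive = old_progressive;
    return consumed;
}
```

With a context initialised as { 1, 0, 19 }, calling
mock_parse_external_subset() directly sees no data, while going through
mock_external_subset_fixed() sees all 19 pending bytes and leaves
progressive set to 1 afterwards, so the caller's push parsing continues
as before.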
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
[email protected]
https://mail.gnome.org/mailman/listinfo/xml