Hi,

There is an unfortunate interaction between the "progressive" parsing mode
and the loading of an external DTD, e.g. to inject defaulted attribute
values. I see this in lxml's iterparse() implementation, which uses
incremental push parsing and started failing to inject them with
libxml2 2.9.0.

The problem results from the fact that xmlSAX2ExternalSubset() in SAX2.c
reuses the existing parser context, which, in this case, is in progressive
mode. When it calls into xmlParseExternalSubset(), that starts by running
the "GROW" macro, which is a no-op in progressive mode. Thus, no data is
available and xmlParseExternalSubset() terminates without doing anything.
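
To make the mechanism concrete, here is a small toy model of it. This is
not libxml2 code: toy_ctxt, toy_grow() and toy_parse_external_subset() are
hypothetical stand-ins, and the only behaviour it assumes is the one
described above, i.e. that the refill step does nothing while the
"progressive" flag is set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the parser context; NOT libxml2's xmlParserCtxt. */
typedef struct {
    int progressive;     /* non-zero: push mode, the caller feeds the data */
    const char *source;  /* data that could still be read from the input */
    size_t available;    /* bytes currently visible to the parser */
} toy_ctxt;

/* Models a GROW-like refill step (assumption: a no-op in progressive mode). */
static void toy_grow(toy_ctxt *ctxt)
{
    if (ctxt->progressive)
        return;          /* push mode: wait for the caller to supply data */
    ctxt->available = strlen(ctxt->source);
}

/* Models the nested DTD parse: returns how many bytes it was able to see. */
static size_t toy_parse_external_subset(toy_ctxt *ctxt)
{
    toy_grow(ctxt);
    return ctxt->available;
}

/* Runs the nested parse once with the flag inherited and once cleared. */
static void toy_demo(void)
{
    toy_ctxt ctxt = { 1, "<!ATTLIST a b CDATA \"x\">", 0 };

    /* Flag inherited from the push parser: the DTD input is never read. */
    assert(toy_parse_external_subset(&ctxt) == 0);

    /* Flag cleared: the whole DTD input becomes visible. */
    ctxt.progressive = 0;
    assert(toy_parse_external_subset(&ctxt) == strlen(ctxt.source));
}
```

Calling toy_demo() from a main() runs both cases; the first one corresponds
to what I see with the reused context.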

I'm not currently sure why it worked in older releases. I suspect that one
of the many additional places that now set the ctxt->progressive field to 1
might have triggered it.

I'm not entirely sure about the right way to fix this. Maybe
xmlSAX2ExternalSubset() should also back up and restore the "progressive"
field of the context and then set it to 0 before calling
xmlParseExternalSubset()? I attached a patch that does that and that fixes
the problem for me.
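
Stripped of the libxml2 details, the idea is an ordinary save/clear/restore
around the nested parse. The following is only a sketch on a toy context
(toy_ctxt and both functions are hypothetical, not the real API); the
actual change is in the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the parser context; NOT libxml2's xmlParserCtxt. */
typedef struct {
    int progressive;     /* non-zero: push mode, the caller feeds the data */
    const char *source;  /* data that could still be read from the input */
    size_t available;    /* bytes currently visible to the parser */
} toy_ctxt;

/* Models the nested DTD parse; it only reads input when not progressive. */
static size_t toy_parse_external_subset(toy_ctxt *ctxt)
{
    if (!ctxt->progressive)
        ctxt->available = strlen(ctxt->source);
    return ctxt->available;
}

/* Models the proposed fix: clear the flag for the nested parse only. */
static size_t toy_external_subset(toy_ctxt *ctxt)
{
    int oldprogressive = ctxt->progressive;  /* back up the caller's mode */
    size_t seen;

    ctxt->progressive = 0;                   /* let the nested parse pull data */
    seen = toy_parse_external_subset(ctxt);
    ctxt->progressive = oldprogressive;      /* outer push parse continues as before */
    return seen;
}

/* Checks that the DTD is read and the outer mode survives the call. */
static void toy_fix_demo(void)
{
    toy_ctxt ctxt = { 1, "<!ATTLIST a b CDATA \"x\">", 0 };

    assert(toy_external_subset(&ctxt) == strlen(ctxt.source));
    assert(ctxt.progressive == 1);
}
```

The restore has to happen on every exit path, which is why the patch also
resets the field in the early-return branch.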

BTW, is it correct that "ctxt->progressive" is sometimes set to "1" and
sometimes to things like "XML_PARSER_COMMENT" or "XML_PARSER_PI" in
parser.c? Those values are more commonly assigned to the "instate" field.

Stefan
diff -r 58415f6342ee SAX2.c
--- a/SAX2.c	Wed Sep 26 10:21:06 2012 +0800
+++ b/SAX2.c	Fri Sep 28 13:40:08 2012 +0200
@@ -411,6 +411,7 @@
 	xmlParserInputPtr input = NULL;
 	xmlCharEncoding enc;
 	int oldcharset;
+	int oldprogressive;
 
 	/*
 	 * Ask the Entity resolver to load the damn thing
@@ -432,6 +433,7 @@
 	oldinputMax = ctxt->inputMax;
 	oldinputTab = ctxt->inputTab;
 	oldcharset = ctxt->charset;
+	oldprogressive = ctxt->progressive;
 
 	ctxt->inputTab = (xmlParserInputPtr *)
 	                 xmlMalloc(5 * sizeof(xmlParserInputPtr));
@@ -442,11 +444,13 @@
 	    ctxt->inputMax = oldinputMax;
 	    ctxt->inputTab = oldinputTab;
 	    ctxt->charset = oldcharset;
+	    ctxt->progressive = oldprogressive;
 	    return;
 	}
 	ctxt->inputNr = 0;
 	ctxt->inputMax = 5;
 	ctxt->input = NULL;
+	ctxt->progressive = 0;
 	xmlPushInput(ctxt, input);
 
 	/*
@@ -487,6 +491,7 @@
 	ctxt->inputMax = oldinputMax;
 	ctxt->inputTab = oldinputTab;
 	ctxt->charset = oldcharset;
+	ctxt->progressive = oldprogressive;
 	/* ctxt->wellFormed = oldwellFormed; */
     }
 }