On Tue, Feb 16, 2010 at 10:00:03AM +0300, Eugene Pimenov wrote:
> Hello everyone,
> 
> As my colleague pointed out in December 
> (http://mail.gnome.org/archives/xml/2009-December/msg00036.html ; although he 
> didn't do it in a clear manner), there're real world examples of  HTML pages 
> that overflows stack. We're using libxml through nokogiri ( 
> http://nokogiri.org/ it's a Ruby library). 
> 
> E. g.
>       >> 
> Nokogiri::HTML::SAX::Parser.new(Nokogiri::XML::SAX::Document.new).parse_memory("<b>"*100_000)
>       #=> SystemStackError: stack level too deep
> 
> In the patch I change htmlParseElement to return immediately and let the 
> caller htmlParseContent do the job.
> 
> htmlParseElement is not a static function, and I changed it behavior! I 
> googled around 
> (http://google.com/codesearch?q=htmlParseElement&hl=en&btnG=Search+Code) and 
> I don't see everyone actually using it. But if this is an issue, I can make 
> htmlParseElement call the secret (static) htmlParseElement and then 
> htmlParseContent until level matches. I'd rather see htmlParseElement 
> converted to static though.

  The patch as is also breaks the ABI by changing the ordering of
elements in the public structure of a parser context.
  Also changing the behaviour of the function to that extend is not
correct and googling is not sufficient to answer if that behaviour would
be in use somewhere. libxml2 is used in many embedded project, you won't
find their source in google !
  So at the minimum the public function must be preserved. The new
elements in the parser context must be added at the end of the
structure (I also need to be convinced this is really needed) and
all the regression tests must still pass with the patch applied.

> I also attach weirdness.patch that deletes double definitions, and sets 
> nameMax to 0 if it fails to allocate some memory.

  That one sounds fine, right !

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
[email protected]  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml

Reply via email to