Bugs item #1231997, was opened at 2005-07-03 22:31
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1231997&group_id=6473

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bryan Rink (holopoj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Memory leak in sgmlop.SGMLParser.register?

Initial Comment:
The following code runs fine:

    from xml.dom.ext.reader import Sgmlop
    from xml.parsers import sgmlop

    while True:
        a = Sgmlop.HtmlParser()
        b = sgmlop.SGMLParser()
        #a.parser = b
        b.register(a)

But if the commented-out line is uncommented, this leaks memory (very quickly). The garbage collector must be having trouble with the fact that the two objects reference each other.

This isn't a contrived example; the code above was adapted from lines 48-51 of xml.dom.ext.reader.Sgmlop:

    def initParser(self, parser):
        self._parser = parser
        self._parser.register(self)
        return

And HtmlParser.initParser calls that function like this:

    SgmlopParser.initParser(self, sgmlop.SGMLParser())

initParser is called from xml.dom.ext.reader.HtmlLib.Reader.fromStream, which is how I came across this error. I was parsing many HTML documents and creating a new Reader for each one. There is no problem if I use only one Reader, so that's the workaround I will use, but it still seems that the first snippet of code above should not leak memory.
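
A minimal sketch of the suspected mechanism, using pure-Python stand-ins rather than the real PyXML classes (Handler, Parser and make_cycle are hypothetical names, not part of any library). It shows that the same two-way reference is reclaimed by gc.collect() when both objects are ordinary Python instances. If sgmlop.SGMLParser is a C extension type that does not take part in cyclic garbage collection (an assumption, not verified here), the collector could not see the reference from the parser back to the handler, and each pair would stay allocated.

```python
import gc
import weakref


class Handler:
    """Hypothetical pure-Python stand-in for Sgmlop.HtmlParser."""
    parser = None


class Parser:
    """Hypothetical pure-Python stand-in for sgmlop.SGMLParser."""

    def register(self, target):
        # Keep a strong reference to the target, as a registering parser would.
        self.target = target


def make_cycle():
    a = Handler()
    b = Parser()
    a.parser = b    # handler -> parser
    b.register(a)   # parser -> handler, closing the cycle
    # Weak references let us observe the objects without keeping them alive.
    return weakref.ref(a), weakref.ref(b)


gc.disable()        # no automatic collection, so the check is deterministic
ref_a, ref_b = make_cycle()
# Reference counting alone cannot free the pair: each object keeps the other alive.
alive_before = ref_a() is not None and ref_b() is not None
gc.collect()        # the cycle detector finds and frees both objects
alive_after = ref_a() is not None or ref_b() is not None
gc.enable()

print(alive_before, alive_after)
```

With pure-Python objects the cycle survives until the collector runs and is then freed, so a leak in the original loop would point at the extension type rather than at the cycle itself. Reusing a single Reader, as described above, sidesteps the problem because only one such pair is ever created.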