I have a sample XML file containing a single element, <text>‡‡ .... </text>,
with 8,000,000 (eight million) repetitions of '‡'.
A test program (in Python, using lxml) that loads the file and then writes
it back out is:
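import sys
# To compare against cElementTree, swap the lxml import for this one:
#import cElementTree as ET
from lxml import etree as ET
# Open in binary mode so the parser sees the raw bytes and detects the
# encoding itself; the file is closed as soon as parsing is done.
with open(sys.argv[1], 'rb') as f:
    et = ET.ElementTree(file=f)
# Write with the default encoding (this is the call that never returns
# when lxml is used).
et.write('ooo')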
When it is run with cElementTree, it completes successfully in about 1
minute.
When it is run with lxml, which uses libxml2, it does not complete even
after 12 hours, and the process stays at 100% CPU the whole time.
Further testing showed that it reaches the write() call quickly and gets
stuck there.
Writing with encoding="UTF-8" instead completes quickly.
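
For reference, here is a minimal sketch of the workaround, written for
Python 3. It generates a small sample file and writes it back with an
explicit UTF-8 encoding. The helper names (make_sample, rewrite_utf8), the
file names, and the count of 1000 are placeholders chosen for illustration;
raise the count to 8,000,000 to match the file described above. The
fallback import is only there so the sketch runs without lxml installed.
As far as I understand, lxml's default output encoding is ASCII, so every
'‡' has to be written as a character reference; that this is the slow path
is an assumption, not something I have confirmed.

```python
import os
import tempfile

try:
    from lxml import etree as ET            # the library used in the report
except ImportError:                          # fallback so the sketch still runs
    import xml.etree.ElementTree as ET


def make_sample(path, repetitions):
    """Write <text>...</text> holding `repetitions` copies of U+2021."""
    with open(path, "w", encoding="utf-8") as out:
        out.write("<text>" + "\u2021" * repetitions + "</text>")


def rewrite_utf8(src, dst):
    """Parse src, then write it to dst with an explicit UTF-8 encoding."""
    with open(src, "rb") as f:
        et = ET.ElementTree(file=f)
    # An explicit encoding lets non-ASCII characters be written as raw
    # UTF-8 bytes instead of being escaped as character references.
    et.write(dst, encoding="UTF-8")


workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "sample.xml")
dst = os.path.join(workdir, "ooo")
make_sample(src, 1000)   # placeholder count; the report used 8,000,000
rewrite_utf8(src, dst)
```

With the explicit encoding, the output file should contain the '‡'
characters as raw UTF-8 bytes rather than as escaped references.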
TIA
Moshe
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml