On Mon, 2005-10-10 at 11:11 +0000, [EMAIL PROTECTED] wrote:
> Hi,
>
> * I am a beginner with XML processing, so please bear with me !
>
> * I have looked over the Python/XML HOWTO and I am currently reading
> Python & XML (Jones & Drake) O'Reilly book. Did not find what I am looking
> for.
>
> OBJECTIVES:
> ---------------
> * I would like to parse the attached XML file using Python and a simple DOM
> API, however I would like the following additional features:
> a) Use a Validating Reader (I would like to use a DTD at run-time within my
> application)
> b) XML processor to ignore all the trailing line feeds (used to visually
> format the XML file).
>
> SAMPLE XML FILE:
> ---------------------
> <?xml version="1.0" encoding="US-ASCII"?>
> <!DOCTYPE casefile SYSTEM "cases.dtd">
> <casefile name="data" revision="PA1" date="2005-10-01">
>   <case date="2005-10-01">
>     <problem> Find an apartment </problem>
>     <solution> Use Google </solution>
>     <outcome> successful </outcome>
>   </case>
> </casefile>
>
> QUESTIONS:
> --------------
> a) Does PyXML offer a Validating DOM Reader ? (Or, is a Validating Reader
> only available for SAX?)

Well, you can build a DOM from SAX events delivered by the validating reader.

> b) Would using a DOM Validating DOM Reader automatically eliminate the
> extra trailing line feeds in my DOM object ? If not, how do I get the DOM
> object to drop the extra line feeds ?

Depending on your DTD, those interstitial newlines might be ignorable
whitespace. They are ignorable unless the content model of the enclosing
element in the DTD allows #PCDATA. If they are ignorable, a validating
parser reports them to SAX through an ignorableWhitespace event rather than
a characters event. You could use this to tweak the creation of the DOM,
for example by skipping those events when building nodes.

> c) Can I do the above without using the 4Suite XML package ?

I think you can, by hacking the SAX2-based readers in 4DOM (which is part of
PyXML, not 4Suite). Then again, 4Suite has a very fast SAX -> DOM walker. It
doesn't validate, but it also has a very fast whitespace stripper that would
eliminate the interstitial whitespace.

-- 
Uche Ogbuji                               Fourthought, Inc.
http://uche.ogbuji.net                    http://fourthought.com
http://copia.ogbuji.net                   http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/
_______________________________________________
XML-SIG maillist  -  XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
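
P.S. To make the whitespace-stripping idea above concrete, here is a minimal
sketch. It is not the 4Suite stripper and not PyXML code. It assumes the
standard library's xml.dom.minidom, which does not validate and so has no DTD
to consult. As a stand-in for "ignorable", it treats whitespace-only text as
interstitial only when the parent also has element children, which is a
heuristic and can be wrong for mixed content. The function name
strip_interstitial_whitespace is made up for illustration, and the sample
leaves out the XML declaration and DOCTYPE so that it runs without cases.dtd.

```python
from xml.dom import Node, minidom

# Sample document from the question, minus the XML declaration and DOCTYPE.
SAMPLE = """<casefile name="data" revision="PA1" date="2005-10-01">
  <case date="2005-10-01">
    <problem> Find an apartment </problem>
    <solution> Use Google </solution>
    <outcome> successful </outcome>
  </case>
</casefile>"""


def strip_interstitial_whitespace(node):
    """Drop whitespace-only text nodes that sit between child elements.

    Hypothetical helper: without a DTD we guess that an element has
    "element content" when at least one of its children is an element.
    """
    children = list(node.childNodes)  # copy, since we mutate while walking
    has_element_child = any(c.nodeType == Node.ELEMENT_NODE for c in children)
    for child in children:
        if child.nodeType == Node.ELEMENT_NODE:
            strip_interstitial_whitespace(child)
        elif (has_element_child
              and child.nodeType == Node.TEXT_NODE
              and not child.data.strip()):
            node.removeChild(child)


doc = minidom.parseString(SAMPLE)
root = doc.documentElement
strip_interstitial_whitespace(root)

case = root.firstChild
child_names = [c.nodeName for c in case.childNodes]
problem_text = case.firstChild.firstChild.data
```

After the call, the case element has only element children, and the text
inside problem, solution and outcome is left untouched, including its leading
and trailing spaces. With a validating SAX reader you would not need the
heuristic: you would skip ignorableWhitespace events while building the DOM.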