Thanks for your response. However, the Xerces documentation FAQ contains this question:
"Can I use Xerces C++ to parse HTML?" and the answer is "Yes.. only if the HTML follows the rules given in the XML specification..."
What does that mean? Can you give me a pointer here?
Regards.
At 12:08 AM 8/3/2003 -0700, you wrote:
Xerces-C is an XML, not an SGML/HTML parser, so you cannot use it to parse HTML.
Dave
Hi:
Has any of you managed to parse/validate an HTML file after pre-parsing the grammar from the W3C DTD?
I'm doing:
____________________________
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
parser->setFeature(XMLUni::fgXercesUseCachedGrammarInParse, true);
Grammar* gr = parser->loadGrammar(dtd_file_in_source, Grammar::DTDGrammarType, true); // here gr is not NULL!!
parser->parse(some_html_in_source);
____________________________
But the parse stops at <meta ...>, because the tag is not slash-terminated, even though the DTD says it does not have to be. ;-) So, is the DTD being parsed at all?
Any pointers would be appreciated. Regards.
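On the <meta ...> question: under the XML specification, well-formedness is checked independently of any DTD. Every start tag needs a matching end tag or must be written as an empty-element tag (<meta ... />), and a DTD cannot relax that. The toy C++ sketch below only illustrates this rule with a tag stack. It is not how Xerces-C is implemented, the function name firstUnclosedElement is made up for this illustration, and it deliberately ignores comments, CDATA sections, and '>' inside attribute values.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy well-formedness check (an illustration only, NOT Xerces-C internals).
// Returns the name of the first element that is left unclosed or mismatched,
// or an empty string if every start tag is properly closed.
std::string firstUnclosedElement(const std::string& doc) {
    std::vector<std::string> open;  // stack of currently open element names
    std::size_t pos = 0;
    while ((pos = doc.find('<', pos)) != std::string::npos) {
        std::size_t end = doc.find('>', pos);
        if (end == std::string::npos) break;  // truncated tag: stop scanning
        std::string tag = doc.substr(pos + 1, end - pos - 1);
        pos = end + 1;

        // Skip DOCTYPE declarations and processing instructions.
        if (tag.empty() || tag[0] == '!' || tag[0] == '?') continue;

        const bool isEndTag = (tag[0] == '/');        // </name>
        const bool isEmptyElement = (tag.back() == '/');  // <name ... />

        // Extract the element name (up to whitespace or '/').
        std::size_t i = isEndTag ? 1 : 0;
        std::string name;
        while (i < tag.size() &&
               !std::isspace(static_cast<unsigned char>(tag[i])) &&
               tag[i] != '/') {
            name += tag[i++];
        }

        if (isEndTag) {
            // An end tag must close the most recently opened element.
            if (open.empty()) return name;
            if (open.back() != name) return open.back();
            open.pop_back();
        } else if (!isEmptyElement) {
            open.push_back(name);  // start tag: must be closed later
        }
    }
    return open.empty() ? std::string() : open.back();
}
```

Called on a document containing <head><meta name="a" content="b"></head>, the function reports "meta", because </head> arrives while meta is still open. With <meta name="a" content="b" /> it returns an empty string. This mirrors the kind of fatal error an XML parser raises at that point regardless of the loaded grammar.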