Dave:

Thanks for your response. However, the Xerces documentation FAQ contains this question:

"Can I use Xerces C++ to parse HTML?" and the answer is "Yes.. only if the HTML follows the rules given in the XML specification..."

What does that mean? Can you give me a pointer here?


Regards.




At 12:08 AM 8/3/2003 -0700, you wrote:

Xerces-C is an XML, not an SGML/HTML parser, so you cannot use it to parse
HTML.

Dave



Hi:

Has anyone managed to parse/validate an HTML file by pre-parsing the grammar
with the W3C DTD?

I'm doing:
____________________________
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
// Reuse the cached (pre-parsed) grammar during parse().
parser->setFeature(XMLUni::fgXercesUseCachedGrammarInParse, true);

// Pre-parse the DTD and cache it (third argument: toCache = true).
Grammar* gr = parser->loadGrammar(dtd_file_in_source,
                                  Grammar::DTDGrammarType, true);
// Here gr is not NULL!

parser->parse(some_html_in_source);
____________________________


But the parse stops at <meta....>. In HTML the meta tag doesn't have to be slash-terminated, and that rule is in the DTD. ;-) So, is the DTD being parsed?
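
To make the <meta> problem concrete, here is a minimal sketch of the nesting rule an XML parser applies before any DTD comes into play. It does not use Xerces at all: firstUnbalancedElement is a hypothetical helper written only for illustration, and it deliberately ignores comments, CDATA sections and '>' characters inside attribute values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of Xerces-C.
// Returns the name of the first element that violates XML's nesting rule,
// or an empty string if every start tag is either closed by a matching end
// tag or written as an empty-element tag (<name ... />).
// Simplification: comments, CDATA and '>' inside attribute values are not handled.
std::string firstUnbalancedElement(const std::string& doc) {
    std::vector<std::string> open;  // stack of currently open element names
    std::size_t pos = 0;
    while ((pos = doc.find('<', pos)) != std::string::npos) {
        const std::size_t end = doc.find('>', pos);
        if (end == std::string::npos) return "<";  // truncated tag
        const std::string tag = doc.substr(pos + 1, end - pos - 1);
        pos = end + 1;

        // Skip empty brackets, declarations (<!...>) and processing instructions (<?...?>).
        if (tag.empty() || tag[0] == '!' || tag[0] == '?') continue;
        // Empty-element tags such as <meta ... /> need no end tag.
        if (tag.back() == '/') continue;

        const bool closing = (tag[0] == '/');
        const std::size_t start = closing ? 1 : 0;
        const std::size_t nameEnd = tag.find_first_of(" \t\r\n/", start);
        const std::string name = (nameEnd == std::string::npos)
                                     ? tag.substr(start)
                                     : tag.substr(start, nameEnd - start);

        if (!closing) {
            open.push_back(name);
        } else if (open.empty()) {
            return name;            // end tag with nothing open
        } else if (open.back() != name) {
            return open.back();     // innermost open element was never closed
        } else {
            open.pop_back();
        }
    }
    return open.empty() ? std::string() : open.back();
}
```

With this sketch, a head section containing <meta name="a" content="b"> reports "meta" as the unclosed element, while the same markup written as <meta name="a" content="b"/> returns an empty string. That is the sense in which HTML has to follow the XML rules: the check is about well-formedness, so it fails no matter which DTD has been loaded.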

Any pointers will be appreciated.
Regards.

