Hi everyone,
I'm currently working on an information retrieval utility in Java.
I use JTidy to clean the retrieved HTML into well-formed XHTML,
which I then parse and transform using Xerces/Xalan.
While optimizing my application, I found that (under certain
specific conditions) a parser that IGNORES XHTML entities would
improve the program's performance. By that I mean a non-validating
parser that would parse, for instance, the following file without
generating any errors:

<?xml...?>
<html>
<body>
<p>ignore this entity &copy;</p>
<p>ignore this one too &ordf;</p>
</body>
</html>
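
A minimal sketch of one way this behaviour might be approximated with the
stock JAXP parser: escape every named entity reference other than the five
predefined XML ones before parsing, so that an undeclared reference such as
&copy; reaches the parser as literal text instead of raising an error. The
class name EntityIgnoringParse and the helper escapeUnknownEntities are
hypothetical, the whole document is assumed to fit in a String, and whether
this actually improves performance is untested.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of JTidy, Xerces or Xalan.
final class EntityIgnoringParse {

    // Named entity references, excluding the five predefined by XML
    // (amp, lt, gt, quot, apos). Numeric references such as &#169; do not match.
    private static final Pattern NAMED_ENTITY =
        Pattern.compile("&(?!(?:amp|lt|gt|quot|apos);)([A-Za-z][A-Za-z0-9]*);");

    // Turns "&copy;" into "&amp;copy;" so the parser sees plain text.
    static String escapeUnknownEntities(String xml) {
        return NAMED_ENTITY.matcher(xml).replaceAll("&amp;$1;");
    }

    // Parses the pre-processed text with a non-validating DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        String escaped = escapeUnknownEntities(xml);
        return builder.parse(new InputSource(new StringReader(escaped)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html><body><p>ignore this entity &copy;</p></body></html>";
        Document doc = parse(xml);
        // The reference survives as the literal characters "&copy;".
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Note that this rewrites the text blindly, so references inside comments or
CDATA sections are altered as well, and the entities come out as literal
text in the tree rather than being dropped.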

I'd be grateful for any help.

cheers,
marwan.
