Hi everyone,
I'm currently working on an information retrieval utility in Java. I use JTidy to clean retrieved HTML into well-formed XHTML, which I then parse and transform using Xerces/Xalan. While optimizing the application, I found that (under certain specific conditions) a parser that IGNORES XHTML entities would improve the program's performance. By that I mean a non-validating parser that would parse, for instance, the following file without generating any errors:
<?xml...?>
<html>
  <body>
    <p>ignore this entity &copy;</p>
    <p>ignore this one too &ordf;</p>
  </body>
</html>
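To make the request concrete, here is a minimal sketch of one possible workaround, not a Xerces feature: since a document without a DTD that references an undeclared named entity (such as &copy;) is not well-formed XML, the text is pre-processed so that such references reach the parser as literal text. The class name EntityTolerantParse and its two helper methods are hypothetical names chosen for illustration; only standard JAXP and java.util.regex calls are used.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class (not part of JTidy, Xerces, or Xalan).
class EntityTolerantParse {
    // Matches named references such as &copy; but leaves the five XML
    // built-ins (&amp; &lt; &gt; &quot; &apos;) and numeric references
    // such as &#169; untouched.
    private static final Pattern NAMED_ENTITY =
        Pattern.compile("&(?!(?:amp|lt|gt|quot|apos);)([A-Za-z][A-Za-z0-9]*);");

    // Escapes the leading '&' so the parser sees plain text, e.g.
    // "&copy;" becomes "&amp;copy;".
    static String neutralizeEntities(String xml) {
        return NAMED_ENTITY.matcher(xml).replaceAll("&amp;$1;");
    }

    // Parses the pre-processed text with a non-validating JAXP parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        String safe = neutralizeEntities(xml);
        return builder.parse(new InputSource(new StringReader(safe)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html><body><p>ignore this entity &copy;</p></body></html>";
        Document doc = parse(xml);
        // The reference survives as literal text in the DOM.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

With this approach the references are kept as literal text rather than dropped, so a later step could still resolve them if needed. Note that the regular expression is applied to the raw text, so it would also rewrite matching sequences inside CDATA sections and comments; that is an accepted limitation of this sketch.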
I'd be grateful if anyone could help.
Cheers, Marwan.
