Takumi Fujiwara wrote:
> Does/Will the NekoHTML parser work with any JAXP parser,
> e.g. Piccolo at http://piccolo.sourceforge.net?
> I think Piccolo is faster than Xerces, so I would like
> to take advantage of Piccolo for parsing/correcting
> HTML.
NekoHTML requires the Xerces Native Interface (XNI), not
Xerces (per se). If you instantiate a NekoHTML DOM or SAX
parser, you will get a subclass of the Xerces parser but
that does *not* mean that you are using Xerces. NekoHTML
swaps the parsing pipeline in the DOM/SAX parser with its
own. So the scanning and tag-balancing operations are
strictly NekoHTML's. Therefore, if NekoHTML is slow, then
that's my fault, not Xerces'. :)
The perceived slowness of Xerces comes from its being a
conformant XML parser. "Faster" XML parsers usually gain
their speed by not implementing validation or by supporting
only a limited number of character encodings. So keep this
in mind when evaluating parsers and pick the parser that is
appropriate for your application.
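If you want to compare parsers on your own documents rather than
rely on published numbers, a rough timing harness along these lines
may help. It is only a sketch: the class name ParserTiming, the
helper methods, and the generated test document are made up for
illustration, it exercises whatever SAX parser JAXP hands back (not
NekoHTML itself), and a simple loop like this is no substitute for
a proper benchmark.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of NekoHTML, Xerces, or Piccolo.
final class ParserTiming {

    /** Parses xml with a parser from the given factory; returns the number of start tags seen. */
    static int countElements(SAXParserFactory factory, String xml) {
        final int[] count = {0};
        try {
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
        } catch (Exception e) {
            // Wrap checked parser/IO exceptions so callers stay simple.
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }

    /** Returns the elapsed nanoseconds for parsing xml the given number of times. */
    static long timeParses(SAXParserFactory factory, String xml, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            countElements(factory, xml);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Build a small synthetic document; substitute your real input here.
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < 1000; i++) {
            sb.append("<item id=\"").append(i).append("\">text</item>");
        }
        sb.append("</root>");
        String xml = sb.toString();

        // JAXP picks the implementation; print it so you know what you measured.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println("Implementation: " + factory.getClass().getName());

        timeParses(factory, xml, 50);               // warm-up, result discarded
        int runs = 200;
        long nanos = timeParses(factory, xml, runs);
        System.out.println("Elements per parse: " + countElements(factory, xml));
        System.out.println("Average ms per parse: " + nanos / runs / 1_000_000.0);
    }
}
```

To measure a different JAXP-compliant parser, put it on the
classpath and point the javax.xml.parsers.SAXParserFactory system
property at that implementation's factory class. Run the same
documents, encodings, and feature settings (validation, namespaces)
through each parser, otherwise the comparison says little.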
--
Andy Clark * [EMAIL PROTECTED]