Hi everybody,

in my search for an XML parser/DOM with namespace support, I discovered
Xerces-J. I ran some simple tests with it and found that it is not quite
usable for my purposes because of a few problems I encountered. Maybe you
are interested in them, and maybe you can give me some tips:

  - In my case, I don't need validation, since I have a fixed set of
    documents. So I tried to turn validation off. But the parser still
    attempts to load the DTD and/or the schemas specified via DOCTYPE/the
    namespace URIs. This is a performance/memory overhead I would like to
    get rid of.

  - Since I want to use the original namespace URIs (e.g. for HTML), the
    parser tries to download files from the W3C server. I would like to
    avoid this download for each parsed document, especially as I want
    to work offline. My idea was to use my own implementation of
    EntityResolver, which maps the URIs to local files (or to empty
    input streams). This works for the DTD/DOCTYPE, but unfortunately
    not for the namespace URIs. I tracked the problem down: the schema
    loader creates a new instance of an XML parser and sets that
    parser's EntityResolver to a default one, instead of reusing the
    EntityResolver of the originating parser.

  - I know that DOM2 is still in the draft phase, and therefore it makes
    sense that the org.w3c.dom packages still contain only DOM Level 1.
    On the other hand, it would be much easier to switch from the DOM2 WD
    APIs to the final APIs than from the xerces.dom.* APIs. A compile
    test in which I "borrowed" the DOM2 WD interfaces from OpenXML showed
    that Xerces apparently already implements all current interfaces.
    So why don't you supply a build that includes the DOM2 WD APIs? You
    could mark all classes as "deprecated" with corresponding comments,
    so that every user will notice the "draft" status.
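
For illustration, the EntityResolver workaround from the second point
could look roughly like the sketch below. It assumes the standard SAX
EntityResolver interface and the JAXP DocumentBuilderFactory API rather
than any Xerces-specific class; the names OfflineResolver, map and
parseOffline are made up for this example. It only covers the
DTD/DOCTYPE case that already works, not the schema loader problem.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/**
 * Hypothetical resolver: known system IDs are redirected to local copies,
 * everything else is answered with empty input so nothing is downloaded.
 */
public class OfflineResolver implements EntityResolver {
    // system ID (remote URI) -> URI of a local copy
    private final Map<String, String> localCopies = new HashMap<>();

    /** Registers a local copy for a remote system ID. */
    public void map(String systemId, String localUri) {
        localCopies.put(systemId, localUri);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = localCopies.get(systemId);
        if (local != null) {
            // Let the parser open the local copy instead of the remote URI.
            InputSource source = new InputSource(local);
            source.setPublicId(publicId);
            return source;
        }
        // Unknown entity: return empty input instead of going to the network.
        InputSource empty = new InputSource(new StringReader(""));
        empty.setPublicId(publicId);
        empty.setSystemId(systemId);
        return empty;
    }

    /** Parses a document without validation, using the given resolver. */
    public static Document parseOffline(String xml, OfflineResolver resolver)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);     // no validation wanted
        factory.setNamespaceAware(true);  // namespace support is required
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

Returning empty input means that entities declared in the DTD are not
available to the document, which is acceptable only for documents that
do not rely on them.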

Thanks for reading up to here :-)

regards,

Klaus Malorny
