Hi,

        Not sure if you want to go this way, but take a look at Andy
Clark's NekoHTML parser -- I think he 'corrects' HTML much as you want
to correct your XML. If you're curious: http://www.apache.org/~andyc/.

        Chris


-----Original Message-----
From: [EMAIL PROTECTED]
Sent: Monday, March 04, 2002 5:29 AM
To: [EMAIL PROTECTED]
Subject: Using xerces to correct non well-formed documents



I am trying to use the Xerces SAX parser to correct documents that are
potentially not well-formed, i.e. some end-tags may be missing. I have
already written a ContentHandler that generates the missing end-tags, but
I don't know how to make the parser aware of my corrections (if I don't,
it aborts with a fatal error, since it expects well-formed input). Let me
illustrate the situation:

input file   corrector behaviour
----------   -------------------
.
.
.
<A>          <!-- corrector writes <A> to output file  -->
  <B>        <!-- corrector writes <B> to output file  -->
    <C>      <!-- corrector writes <C> to output file  -->
    </C>     <!-- corrector writes </C> to output file -->

</A>         <!-- missing </B>:
.                 - corrector writes </B> to output file
.                 - parser aborts with fatal error
.            -->

---
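
The abort shown above can be reproduced with a few lines of Java. This is
only a minimal sketch, assuming a JAXP-compliant SAX parser is available;
the class name AbortDemo and the method events are made up for this
illustration and are not part of Xerces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX events until the parser gives up.
final class AbortDemo {

    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                log.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                log.add("end " + qName);
            }
            // DefaultHandler.fatalError() rethrows the SAXParseException,
            // which is what ends the parse below.
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            // The handler never sees </A>; the parser stops at the mismatch.
            log.add("fatal");
        }
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Same shape as the illustration: </B> is missing before </A>.
        for (String event : events("<A><B><C></C></A>")) {
            System.out.println(event);
        }
    }
}
```

The handler receives the events for A, B and C, but by the time it could
react to the missing </B>, the parser has already raised the fatal error.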

Is it possible to make Xerces aware of the correction so that it continues
parsing the input document?
If not, do you know of any other XML parser that might be able to deal with
this?
If not, do you have any solutions/ideas/suggestions/experiences concerning
this problem?
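
One idea I can think of, as a rough sketch only: repair the text in a
separate pass before the SAX parser ever sees it, using a stack of open
element names. This is not a Xerces feature and not my ContentHandler; the
class TagBalancer and its method repair are hypothetical names, and the
sketch assumes that comments, CDATA sections and attribute values never
contain '<' or '>'.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-pass: inserts missing end-tags so that a normal XML
// parser can read the result. Comments, CDATA and PIs are NOT handled.
final class TagBalancer {

    // group 1: "/" for an end-tag, group 2: element name,
    // group 3: "/" for an empty-element tag such as <br/>
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_:][\\w:.\\-]*)[^<>]*?(/?)>");

    static String repair(String input) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // innermost element first
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start()); // copy text between tags
            last = m.end();
            String name = m.group(2);
            boolean isEndTag = !m.group(1).isEmpty();
            boolean isEmptyTag = !m.group(3).isEmpty();
            if (!isEndTag) {
                if (!isEmptyTag) {
                    open.push(name);
                }
                out.append(m.group());
            } else if (open.contains(name)) {
                // Close every element that was opened after `name`.
                while (!open.peek().equals(name)) {
                    out.append("</").append(open.pop()).append('>');
                }
                open.pop();
                out.append(m.group());
            }
            // An end-tag with no matching start-tag is silently dropped.
        }
        out.append(input.substring(last));
        // Close whatever is still open at the end of the input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(repair("<A><B><C></C></A>"));
        // prints <A><B><C></C></B></A>
    }
}
```

The repaired string could then be handed to the parser as usual, but this
costs a second pass over the document and guesses where the missing end-tag
belongs (always directly before the next enclosing end-tag).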

Thank you for your help!

Nicolas Wettstein
[EMAIL PROTECTED]


PS: I was unable to access the mailing-list archives, so I don't know if
this topic has been discussed before. Sorry for that!



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


