Re: Error reporting from XML Schema and from Schematron (long)

Jan Dvorak 21 Jun 2002 09:31:44 -0000

Rick,

I agree with you: I think pretty much any application has to rewrite the 
messages of any XML Schema validator, so that they speak the language of the 
problem domain, rather than that of the XML Schema spec.


Plus, applications might have a need to refine the errors. Violating a 
constraint in the context of one element (or type) can mean something quite 
different than violating the same constraint in the context of another 
element (or type), and there should be a way to distinguish these. And there 
can be different causes of a content model mismatch...
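
To make the idea concrete, here is a minimal sketch in Python of rewriting messages per context. The error code "missing-child", the element names and the message texts are all invented for illustration; no real validator uses them.

```python
# Hypothetical validator error codes and element names, for illustration only.
DOMAIN_MESSAGES = {
    # (error code, element name) -> message in problem-domain language
    ("missing-child", "invoice"): "The invoice has no line items.",
    ("missing-child", "address"): "The address is missing a street.",
    # (error code, None) -> fallback for any other element
    ("missing-child", None): "A required part of <{element}> is missing.",
}


def rewrite_message(code, element, raw_message):
    """Return a domain-specific message for a validator error.

    Looks up the most specific entry first (code + element), then a
    per-code fallback, and finally returns the validator's own text.
    """
    for key in ((code, element), (code, None)):
        template = DOMAIN_MESSAGES.get(key)
        if template is not None:
            return template.format(element=element)
    return raw_message
```

The same constraint violation then reads differently for an invoice than for an address, and anything the table does not know about still passes through unchanged.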


Rick Jelliffe wrote:

> There has been talk of adding XPath to the locators used in
> SAX.  That would be a great idea.  Line numbers are
> useful sometimes, and paths are useful at others, so IMHO we
> need a SAX infrastructure that can provide either.

Yes.
XNI has the mechanism of augmentations.
Perhaps that's the way for SAX as well?
Is there a revision of SAX planned?
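
Until something like that exists, an application can track the path itself next to the locator. A rough sketch using Python's xml.sax (not Xerces or XNI); the class and function names are my own, and the path uses raw element names with sibling positions, ignoring namespaces:

```python
import xml.sax


class PathTrackingHandler(xml.sax.ContentHandler):
    """Records (XPath-like path, line, column) for every element start."""

    def __init__(self):
        super().__init__()
        self._locator = None
        self._path = []      # path steps of the currently open elements
        self._counts = [{}]  # per open element: how often each child name occurred
        self.positions = []

    def setDocumentLocator(self, locator):
        # The parser hands over its locator before parsing starts.
        self._locator = locator

    def startElement(self, name, attrs):
        siblings = self._counts[-1]
        siblings[name] = siblings.get(name, 0) + 1
        self._path.append("%s[%d]" % (name, siblings[name]))
        self._counts.append({})
        line = column = None
        if self._locator is not None:
            line = self._locator.getLineNumber()
            column = self._locator.getColumnNumber()
        self.positions.append(("/" + "/".join(self._path), line, column))

    def endElement(self, name):
        self._path.pop()
        self._counts.pop()


def element_positions(xml_text):
    """Parse a document and return the recorded (path, line, column) triples."""
    handler = PathTrackingHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.positions
```

An error handler could look up the current path in the same way, so that a report carries both the line number and the path.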

> Personally, I think an error object should be able to provide
>   - file/line/character number
Of the real problem spot, if that can at all be defined.
>   - XPath
Yes!
>   - severity indicator
>   - sender ID
>   - nickname or error-code
>   - single line overview
>   - multiline diagnostic, XML
>   - icon for that error
>   - URL for see also
>   - unique ID for keying a repair method
>   - unique ID for diagnostic generating function
Yes!!!

An XML Schema validator will never be able to provide all of this on its 
own, unless it is given very detailed instructions.
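
Just to illustrate the shape of such an object, here is a sketch as a Python dataclass. The field names are my own rendering of Rick's list, not an existing SAX or Xerces interface, and which fields are mandatory is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationError:
    # Field names are illustrative; they follow Rick's list item by item.
    severity: str                          # e.g. "warning", "error", "fatal"
    sender_id: str                         # which validator produced the error
    code: str                              # nickname or error code
    overview: str                          # single-line overview
    file: Optional[str] = None
    line: Optional[int] = None
    column: Optional[int] = None
    xpath: Optional[str] = None
    diagnostic_xml: Optional[str] = None   # multi-line diagnostic, as XML
    icon: Optional[str] = None
    see_also_url: Optional[str] = None
    repair_id: Optional[str] = None        # key for a repair method
    diagnostic_function_id: Optional[str] = None

    def location(self) -> str:
        """Describe where the error is, using whatever locators are set."""
        parts = []
        if self.file is not None:
            pos = self.file
            if self.line is not None:
                pos += ":%d" % self.line
                if self.column is not None:
                    pos += ":%d" % self.column
            parts.append(pos)
        if self.xpath is not None:
            parts.append(self.xpath)
        return " ".join(parts) if parts else "(unknown location)"
```

Everything beyond the first four fields is optional, since a validator on its own can rarely fill in all of them.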

> This would support Schematron and XSD well.
Yes, errors from the two should be unified.

> My company has also been using Xerces-J as well as
> Schematron (and also RELAX NG and DTDs) in
> an editor product now in beta testing.
>
> I had to rewrite almost all the Xerces error messages
> because they were incomprehensible to end-users.
> (I don't know if it is worthwhile contributing these,
> because some of them are specific to our system
> or leave out diagnostics of errors that cannot happen
> for us.)

> One improvement that I found useful was to first
> classify all errors as either document errors
> or schema errors. At the moment everything is
> mixed together, and a layman has no way of
> knowing whether the document is bad or the
> schema is bad. So the first thing I did was
> to prefix all schema errors with (Schema error).
> Then I rewrote all the other errors for end-users,
> in product specific terms.

I used the IBM Schema Quality Checker to debug my schema, so I didn't 
experience any schema errors from Xerces. It was of great utility to me, as I 
was learning XML Schema in the process.
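
Rick's prefixing scheme could be sketched roughly as below. How to tell the two kinds of error apart is my assumption: here an error counts as a schema error when the file it is reported against is one of the known schema files. The function names are made up.

```python
def classify(error_system_id, schema_system_ids):
    """Return "schema" if the error was reported against a schema file,
    otherwise "document". This test is an assumption, not what Xerces does."""
    return "schema" if error_system_id in schema_system_ids else "document"


def label_error(message, error_system_id, schema_system_ids):
    """Prefix schema errors so end-users can tell them from document errors."""
    if classify(error_system_id, schema_system_ids) == "schema":
        return "(Schema error) " + message
    return message
```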

> Actually, I do tend to think that one should always
> expect to rewrite error messages for a particular
> system.  But for Xerces' case, it would be nice
> if the messages were a little less programmer-oriented
> in the first place.

Or if there were a mechanism to customize them, with a clean interface.
The 'Validation errors' thread on this list has a discussion of exactly these 
matters.

> The two worst offenders are:
>  1) Error messages relating to the DOCTYPE
> declaration.  A missing system identifier in
> the DOCTYPE declaration is diagnosed as
> being caused by a missing space.  If there is
> no entity, then IIRC the user gets a message
> to the effect that there is an error in "null".
> Problems that occur before the perceived
> start of the document are very off-putting.

Yes, that's a pain.

>  2) The XSD error messages.  These are
> fairly poor: you have to learn to ignore
> the reference to the XSD outcome code
> and the parenthetic content models at
> the end.

... and the rest is not particularly clear either.
It is my impression that Xerces-J 2 has moved towards even more technical language.

I'm not saying this is Xerces' fault.
It's doing its job, and doing it well.
It is just that real-life use of the tool takes some adaptation.

> Finally, on the issue of migrating from
> XSD to Schematron.  One thing that may
> be helpful is Francis Norton's typeTagger.
> This is an XSLT stylesheet that adds
> xsi:type attributes to a document, based
> on an XSD schema.  So you can continue
> to describe your basic structures and
> datatypes using XSD, but give the
> Schematron access to that typing
> information:
>   <rule context="[EMAIL PROTECTED]:type='address']">
>     <assert test="[EMAIL PROTECTED]:type='street']"
>     >A <name/> is an address, and so
>     it needs some kind of street, for example &lt;strasse>.</assert>
>   </rule>

Sounds interesting. 
Also, type information should be available in XPath 2.0.
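
For what it is worth, the check itself is simple once the xsi:type attributes are in place. A sketch in Python, assuming the rule is meant to match any element typed 'address' and to require a child typed 'street' (my reading of the rule above), and comparing xsi:type values as plain strings rather than as QNames:

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def addresses_without_street(xml_text):
    """Return the tag names of 'address'-typed elements that have no
    'street'-typed child element."""
    root = ET.fromstring(xml_text)
    failures = []
    for element in root.iter():
        if element.get(XSI_TYPE) != "address":
            continue
        has_street = any(child.get(XSI_TYPE) == "street" for child in element)
        if not has_street:
            failures.append(element.tag)
    return failures
```

The element names do not matter here, only the type annotations, which is the point of the typeTagger approach.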

> Cheers
> Rick Jelliffe

Jan Dvorak
