Re: Cores when using XercesDOMParser::getSrcOffset()

Axel Weiß Sat, 05 Mar 2005 08:25:31 -0800

On Saturday, 5 March 2005 12:01, Bob Freitas wrote:
> Hi Axel,
>
>
>
> I have exactly the same problem that you posted on xerces-c-dev about
> finding the line number of your text in your XML. I tried using the
> getSrcOffset() on the XercesDOMParser, to try to get the character
> position, which I thought I could use with the original input file to
> determine the line number, but it didn't work for me.  The call to the
> getSrcOffset() method kept coring the program.  I opened a bug with
> them.  I did see some other posts talking about using the getSrcOffset
> with the SAXParser.  I haven't tried that yet.  I kind of have a lot
> invested in the XercesDOMParser, so I would need a pretty compelling
> reason to change at this point.
>
> Did you happen to figure something out?


Hello Bob,

I read your Jira entry with interest this morning, but then had to leave 
to attend the wedding of good friends :).

What I figured out about the offset tracking of the Xerces DOM parser is 
this: getSrcOffset() is only usable while the parser is still running, 
that is, when it encounters an XML syntax error or a schema validation 
error and calls your error handler. At that point you can direct the 
user to the offset in the XML file. Once the whole file has been parsed 
(which is already the case when you get the DOM tree with 
doc->getDocumentElement()), the source offset information is no longer 
valid, and calling getSrcOffset() will crash. 
(Xerces developers, please correct me if I'm wrong.)
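For the error-handler case, turning an offset into a line number is a 
small computation once you have the raw file contents in memory. The 
following is only a sketch and does not use Xerces at all: locateOffset 
and SourcePos are names I made up for illustration, and I am assuming 
that the offset is a 0-based byte offset into the file and that lines 
end with '\n'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// 1-based line and column of a position in a text buffer.
struct SourcePos {
    std::size_t line;
    std::size_t column;
};

// Hypothetical helper: translate a 0-based byte offset into a 1-based
// line/column pair by counting the '\n' bytes that precede it.
// Offsets past the end of the buffer are clamped to the end.
SourcePos locateOffset(const std::string& text, std::size_t offset) {
    offset = std::min(offset, text.size());
    SourcePos pos{1, 1};
    for (std::size_t i = 0; i < offset; ++i) {
        if (text[i] == '\n') {
            ++pos.line;      // a newline starts the next line
            pos.column = 1;
        } else {
            ++pos.column;    // columns are counted in bytes, not characters
        }
    }
    assert(pos.line >= 1 && pos.column >= 1);
    return pos;
}
```

Because columns are counted in bytes, the column will be off for lines 
that contain multi-byte characters; the line number is unaffected.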

For some reason (maybe memory efficiency), DOMNodes do not contain any 
information about their origin in the source. So, if you run into more 
subtle errors beyond syntax parsing or schema validation, there is no 
way to tell the user something like "file://validated.xml, line 137: 
failed to load url 'http://www.some-domain.de/xml/some-file.xml'".

At this point, I'm assuming that the author of an XML file would prefer 
to use an XML editor (rather than a text editor), which shows the 
structure instead of lines of XML source and always produces 'correct' 
XML syntax. So it may be more useful to refer to the XML structure when 
pointing the author to the origin of the failed processing. In the end, 
my preferred way to report such errors is something like: 
"file://validated.xml, xpath=namesp:root-element/[EMAIL PROTECTED]: 
failed to access referenced url 
'http://www.some-domain.de/xml/some-file.xml'", which reflects the parsed 
XML structure instead of printing 'silly' line numbers.
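To make the idea of a structural location a bit more concrete, here is a 
minimal sketch of building such a path by walking from a node up to the 
root. It is not my actual code and does not use Xerces: Node is a 
hypothetical stand-in for a DOM element, and structuralPath and 
indexAmongSameNamedSiblings are made-up names. With Xerces one would 
presumably walk DOMNode::getParentNode() and read DOMNode::getNodeName() 
in the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM element: a name plus parent/child links.
struct Node {
    std::string name;
    Node* parent = nullptr;
    std::vector<Node*> children;
};

// 1-based position of `node` among the siblings that share its name.
std::size_t indexAmongSameNamedSiblings(const Node& node) {
    if (node.parent == nullptr) return 1;
    std::size_t index = 0;
    for (const Node* sibling : node.parent->children) {
        if (sibling->name == node.name) ++index;
        if (sibling == &node) break;
    }
    return index;
}

// Build an XPath-like location such as "/catalog/item[2]/link[1]" by
// prepending one step per ancestor until the root is reached.
std::string structuralPath(const Node* node) {
    assert(node != nullptr);
    std::string path;
    for (const Node* cur = node; cur != nullptr; cur = cur->parent) {
        std::string step = "/" + cur->name;
        if (cur->parent != nullptr) {
            // The root is unique, so only non-root steps get an index.
            step += "[" + std::to_string(indexAmongSameNamedSiblings(*cur)) + "]";
        }
        path = step + path;
    }
    return path;
}
```

The positional index keeps the path unambiguous when several siblings 
have the same element name; an error message can then print this path 
next to the file name.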

Hope it helps...
                        Axel

-- 
Humboldt-Universität zu Berlin
Institut für Informatik
Signalverarbeitung und Mustererkennung
Dipl.-Inf. Axel Weiß
Rudower Chaussee 25
12489 Berlin-Adlershof
+49-30-2093-3050
** www.freesp.de **
