I really don't have a clue. You might want to debug into the method
XMLScanner::scanProlog(). It will first call scanXMLDecl() to scan the
XMLDecl, strangely enough. That should exit when it sees the > character. It
will then come back to scanProlog(), which goes back around to the top of its
loop. It will get the next character and test it against a number of legal
characters that could start something legal at that point. It should test
true for XMLReader::isWhitespace() and process it as character data.
However, it obviously is not, and is falling down to the bottom of the loop
where it complains that it doesn't understand what it's seeing following the
XMLDecl (the 'invalid document structure' error it emits). It would be
interesting to see what character you *do* get back at that point.
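
To make that flow concrete, here is a rough model of the decision the loop
makes on the character that follows the XMLDecl. This is only a sketch, not
the actual Xerces source: isXmlWhitespace(), PrologAction and
classifyPrologChar() are hypothetical names made up for illustration, and the
real loop tests for more cases than these.

```cpp
// Hypothetical stand-in for XMLReader::isWhitespace(); NOT the Xerces code.
// XML 1.0 whitespace is space, tab, carriage return and line feed.
static bool isXmlWhitespace(wchar_t ch)
{
    return ch == 0x20 || ch == 0x09 || ch == 0x0D || ch == 0x0A;
}

// Hypothetical outcomes for one pass of the prolog loop.
enum class PrologAction { Whitespace, Markup, InvalidStructure };

// Simplified model: markup and whitespace are accepted, anything else falls
// to the bottom of the loop and is reported as invalid document structure.
static PrologAction classifyPrologChar(wchar_t ch)
{
    if (ch == L'<')
        return PrologAction::Markup;
    if (isXmlWhitespace(ch))
        return PrologAction::Whitespace;
    return PrologAction::InvalidStructure;
}
```

In this model a correctly read CR (0x0D) is treated as whitespace, while
anything else that is neither whitespace nor '<' (for example a CR that was
widened incorrectly into some other value) ends up as InvalidStructure. That
is why the actual value returned at that point is worth looking at in the
debugger.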

Another option is to take out the XMLDecl and see if it works ok. The
XMLDecl is scanned separately, in order to find encoding="" info, so it's a
special case. That might explain how we got that far. If it then fails on
the first character of the file, then it's got to be something in the
transcoding code, which comes into play after the XMLDecl is scanned (or
immediately if there is no XMLDecl.)
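
If editing the file by hand is a nuisance, something like the following could
produce the test input. It is a minimal sketch that assumes the whole document
has already been read into a std::string; stripXmlDecl() is a hypothetical
helper, not part of Xerces.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the document with a leading XMLDecl removed.
// If the text does not start with an XMLDecl it is returned unchanged.
std::string stripXmlDecl(const std::string& doc)
{
    const std::string open = "<?xml";
    if (doc.compare(0, open.size(), open) != 0)
        return doc;

    // Require whitespace after "<?xml" so that a processing instruction
    // such as "<?xml-stylesheet ...?>" is not mistaken for the XMLDecl.
    if (doc.size() <= open.size())
        return doc;
    const char next = doc[open.size()];
    if (next != ' ' && next != '\t' && next != '\r' && next != '\n')
        return doc;

    const std::size_t end = doc.find("?>", open.size());
    if (end == std::string::npos)
        return doc;

    // Keep everything after the closing "?>", including the CR/LF.
    return doc.substr(end + 2);
}
```

With personal.xml the result would start with the CR/LF pair, so the very
first character the parser sees goes through the transcoding path.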

Perhaps there is some code in the Latin1 transcoder that has failed to take
into account character size, but that seems unlikely since that code is
common on all platforms, some of which also have a 32 bit wchar_t (and
hence a 32 bit XMLCh.)
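
For reference, this is the kind of per-character widening that does not care
how big the target character type is. It is an illustration only, not the
Xerces Latin1 transcoder: WideCh and widenLatin1() are hypothetical names, and
WideCh is assumed to be wchar_t as in the Borland build.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption for this sketch: the wide character type is wchar_t.
typedef wchar_t WideCh;

// Hypothetical Latin-1 to wide conversion. Each source byte maps to the code
// point with the same value, so every byte is widened individually rather
// than copied as raw memory.
std::vector<WideCh> widenLatin1(const std::string& bytes)
{
    std::vector<WideCh> out;
    out.reserve(bytes.size());
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        // Go through unsigned char first so bytes above 0x7F are not
        // sign-extended on platforms where plain char is signed.
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        out.push_back(static_cast<WideCh>(b));
    }
    return out;
}
```

Code that instead copies or reinterprets the buffer with a hard-coded
character size is the sort of thing that could turn the 0x0D into some other
value when wchar_t is a different width than expected.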

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
[EMAIL PROTECTED]



michael gaida <[EMAIL PROTECTED]> on 04/04/2000 12:36:27 PM

Please respond to [EMAIL PROTECTED]

To:   [EMAIL PROTECTED]
cc:
Subject:  xerces-c-110 parsing (newline?) problem



I posted this one before but now I have some more
results:

Parsing with the SAXPrint and DOMPrint samples
doesn't work when I use self-built samples (built
with BCB 4.0). The prebuilt ones work fine.


When parsing the personal.xml sample I get:

Fatal Error at file "D:\build\personal.xml", line 1,
column 44 Message:
<?xml version='1.0' encoding='ISO-8859-1' ?>


hexview output

 text:
 <?xml version="1
 .0" encoding="is
 o-8859-1"?>  <!D
            ^XMLCh chCR
             line 1, char 44

 binary:
 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E 3D 22 31
 2E 30 22 20 65 6E 63 6F 64 69 6E 67 3D 22 69 73
 6F 2D 38 38 35 39 2D 31 22 3F 3E 0D 0A 3C 21 44
                                  ^^ XMLCh chCR
                                     line 1, char 44


I don't think this has anything to do with
the CR character. XMLCh is typedef'd to wchar_t in
BorlandCDefs.hpp - this should be all right.


- Michael


