Hello,


I need your help to understand the behaviour described below.



I have an XML file (attached) whose tags may contain Western European,
Russian or Greek characters, possibly mixed together.
I ran xmllint --debug --sax on the file to check that everything is OK when
a tag contains a mixed-character string, and I was surprised to see that the
characters callback is invoked twice: once for the first four characters
(which are Western European) and once for the remaining part of the string
(Russian).
The output of xmllint is as follows:

SAX.setDocumentLocator()
SAX.startDocument()
SAX.startElementNs(tag1, NULL, NULL, 2, xmlns:xsi='
http://www.w3.org/2001/XMLSchema-instance', xmlns:xsd='
http://www.w3.org/2001/XMLSchema', 5, 0,
xsi:noNamespaceSchemaLocation='myxs...', 9, Version='1.2"...', 3,
CreationDate='2007...', 10, CreationTime='17:0...', 8,
CreationTimeOffset='+01"...', 3)
SAX.characters(
  , 3)
SAX.startElementNs(tag2, NULL, NULL, 0, 0, 0)
SAX.characters(
    , 5)
SAX.startElementNs(tag3, NULL, NULL, 0, 0, 0)
SAX.characters(AAAA, 4)
SAX.characters(закончилась, 22)
SAX.endElementNs(tag3, NULL, NULL)
SAX.characters(
  , 3)
SAX.endElementNs(tag2, NULL, NULL)
SAX.characters(
, 1)
SAX.endElementNs(tag1, NULL, NULL)
SAX.endDocument()
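
A side note on the lengths in this output: the 22 reported for the
11-letter Russian word looks like a byte count rather than a character
count, assuming the text reaches the callback as UTF-8 (each Cyrillic
letter here takes two bytes). A quick check, written in Python purely for
illustration and not tied to libxml2 itself:

```python
# Compare character counts with UTF-8 byte counts for the two chunks
# reported by SAX.characters() in the output above.
latin = "AAAA"
russian = "закончилась"

latin_bytes = len(latin.encode("utf-8"))
russian_bytes = len(russian.encode("utf-8"))

# Each line prints: number of characters, number of UTF-8 bytes.
print(len(latin), latin_bytes)
print(len(russian), russian_bytes)
```

So the two lengths add up to the byte length of the whole string, and no
text seems to be lost between the two callbacks.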

This happens neither when I move the first four characters to the end of
the string nor when I move them to the middle.
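
As far as I understand, the SAX interface allows a parser to deliver one
run of text in several chunks, so a handler probably should not rely on
getting the whole string in a single call. Below is a minimal sketch of a
handler that buffers the chunks until the element ends. It uses Python's
standard xml.sax module instead of libxml2, so it only illustrates the
idea; the class name TextCollector and its attributes are made up for this
example.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collect element text even if characters() is called several times."""

    def __init__(self):
        super().__init__()
        self.chunks = []   # pieces of text seen since the last start tag
        self.texts = {}    # element name -> joined text (last occurrence)

    def startElement(self, name, attrs):
        # A new element starts: forget text that belonged to the parent.
        self.chunks = []

    def characters(self, content):
        # May be called once or many times for one run of text.
        self.chunks.append(content)

    def endElement(self, name):
        # Only now is it safe to treat the buffered text as complete.
        self.texts[name] = "".join(self.chunks)
        self.chunks = []


# Reduced version of the attached document (attributes omitted).
document = "<tag1><tag2><tag3>AAAAзакончилась</tag3></tag2></tag1>"

handler = TextCollector()
xml.sax.parseString(document.encode("utf-8"), handler)
print(handler.texts["tag3"])
```

With this kind of buffering, the joined text is the same whether the
parser makes one call or two.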



I have searched the mailing list for similar cases, as well as the xmlsoft
website and other resources, but honestly I am still puzzled by the
behaviour of the parser.

Am I overlooking something?


Best regards.
Massimo Comba
<?xml version="1.0" encoding="utf-8"?>
<tag1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:noNamespaceSchemaLocation="myxsd.xsd" Version="1.2" CreationDate="2007-11-02" CreationTime="17:01:12" CreationTimeOffset="+01">
  <tag2>
    <tag3>AAAAзакончилась</tag3>
  </tag2>
</tag1>
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
