Hello,
I need your help to understand the following. I have an XML file (attached) whose tags may contain Western European, Russian or Greek characters, possibly mixed within one string. I ran xmllint --debug --sax on the file to check that everything is OK with a mixed-character string, and I was surprised to see that the characters callback is invoked twice: once for the first four characters (which are Western European) and once for the remaining part of the string (Russian).

The output of xmllint is as follows:

SAX.setDocumentLocator()
SAX.startDocument()
SAX.startElementNs(tag1, NULL, NULL, 2, xmlns:xsi=' http://www.w3.org/2001/XMLSchema-instance', xmlns:xsd=' http://www.w3.org/2001/XMLSchema', 5, 0, xsi:noNamespaceSchemaLocation='myxs...', 9, Version='1.2"...', 3, CreationDate='2007...', 10, CreationTime='17:0...', 8, CreationTimeOffset='+01"...', 3)
SAX.characters( , 3)
SAX.startElementNs(tag2, NULL, NULL, 0, 0, 0)
SAX.characters( , 5)
SAX.startElementNs(tag3, NULL, NULL, 0, 0, 0)
SAX.characters(AAAA, 4)
SAX.characters(закончилась, 22)
SAX.endElementNs(tag3, NULL, NULL)
SAX.characters( , 3)
SAX.endElementNs(tag2, NULL, NULL)
SAX.characters( , 1)
SAX.endElementNs(tag1, NULL, NULL)
SAX.endDocument()

This happens neither when I move the first four characters to the end of the string nor when I move them to the middle. I have searched the mailing list for similar cases, as well as the xmlsoft website and other resources, but honestly I am still puzzled by the behaviour of the parser. Am I overlooking something?

Best regards,
Massimo Comba
<?xml version="1.0" encoding="utf-8"?>
<tag1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:noNamespaceSchemaLocation="myxsd.xsd" Version="1.2" CreationDate="2007-11-02" CreationTime="17:01:12" CreationTimeOffset="+01">
  <tag2>
    <tag3>AAAAзакончилась</tag3>
  </tag2>
</tag1>
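For reference, SAX-style parsers are generally allowed to deliver one run of text in several characters callbacks, so a handler is usually written to buffer the pieces and join them when the element ends. The following is only a minimal sketch of that pattern, and it assumes Python's standard xml.sax module rather than libxml2 itself; the exact points where text is split may differ between the two parsers. The names TextCollector, collect_text and texts are hypothetical and chosen for illustration.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Buffers character data so that split characters() calls are harmless."""

    def __init__(self):
        super().__init__()
        self._chunks = []   # pieces received since the last start/end tag
        self.texts = {}     # element name -> joined text (hypothetical storage)

    def startElement(self, name, attrs):
        # A new element begins: discard text that belonged to the parent.
        self._chunks = []

    def characters(self, content):
        # May be called once or several times for the same text node.
        self._chunks.append(content)

    def endElement(self, name):
        # Only here is the text of the element known to be complete.
        self.texts[name] = "".join(self._chunks)
        self._chunks = []


def collect_text(xml_bytes):
    """Parse a UTF-8 encoded document and return the text of each element."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.texts


# Reduced version of the attached file: same mixed Latin/Cyrillic string.
doc = "<tag1><tag2><tag3>AAAAзакончилась</tag3></tag2></tag1>".encode("utf-8")
texts = collect_text(doc)
print(texts["tag3"])
```

With this approach the handler sees the whole string for tag3 regardless of how many callbacks the parser used to deliver it. The same idea should carry over to a libxml2 characters callback by appending each (ch, len) chunk to a buffer and consuming it in the end-element callback.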
_______________________________________________ xml mailing list, project page http://xmlsoft.org/ [email protected] http://mail.gnome.org/mailman/listinfo/xml
