Ok, you're correct about that. Apparently file:write() still does XML-
ish serialization by default. So that explains the introduction of
entities and character references.

However, there is still a bug here, as far as I can tell. I'm attaching
a new reproducer. You should run it with

  zorba -f -q foo.xq --serialize-text

This uses the "text" serialization method, which should simply dump the
in-memory contents of the string to the screen. When run this way, all
the entities and character references are gone. However, unescaped
double quotes and backslashes remain, as do several raw control
characters (you can see the latter by piping the output through
cat -vet). The output also contains literal newlines inside JSON
strings, which I'm pretty sure aren't legal. The tab character is gone
entirely, but I think that happens during query parsing; I'm not sure.
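
As a quick cross-check that does not depend on Zorba, the captured output can
be fed to any strict JSON parser. Here is a minimal sketch in Python; the
helper name is_valid_json is made up for illustration, and it relies on the
standard json module rejecting raw control characters inside strings by
default (strict mode).

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON under Python's strict parser."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


if __name__ == "__main__":
    samples = [
        '{"a": "x\\ny"}',       # newline written as the two characters \ and n
        '{"a": "x\ny"}',        # literal newline inside the string
        '{"a": "say "hi""}',    # unescaped double quotes
        '{"a": "C:\\dir"}',     # lone backslash followed by 'd'
    ]
    for sample in samples:
        print(repr(sample), is_valid_json(sample))
```

Only the first sample should be accepted; the other three show the kinds of
problems described above (a raw newline, unescaped quotes, a stray backslash).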

So, unless there is something else going on, the output string from
json:serialize() is still not guaranteed to be valid JSON. The first
thing you should do is put some debug code into the implementation of
that function to output the actual return value, to be 100% sure that
the on-screen output is in fact exactly the same bytes as the return
value. But, assuming that it is, there's a bug.
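
To compare the two byte sequences, something along these lines would do. This
is only a sketch: it assumes you have already captured both the debug dump of
the return value and the on-screen output as bytes (for example by writing
each to a file), and first_difference is a hypothetical helper name, not
anything in Zorba.

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    returned = b'{"a": "x\ty"}'   # assumed: bytes dumped by the debug code
    on_screen = b'{"a": "xy"}'    # assumed: bytes captured from stdout
    print(first_difference(returned, on_screen))
```

A result of None means the on-screen output really is the return value, byte
for byte; any other result is the offset at which to start looking.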

** Attachment added: "foo.xq"
   https://bugs.launchpad.net/zorba/+bug/878508/+attachment/3234618/+files/foo.xq

-- 
You received this bug notification because you are a member of Zorba
Coders, which is the registrant for Zorba.
https://bugs.launchpad.net/bugs/878508

Title:
  JSON Module not escaping escape characters

Status in Zorba - The XQuery Processor:
  Confirmed

Bug description:
  The module doesn't convert escaped characters as you would expect.
  Instead, you get a string containing the unescaped value. A conversion
  needs to be implemented, something like:
  JSON <-> XML
  \"    <-> &quot;
  \\    <-> \
  \/    <-> /
  \b    <-> &#x8;
  \f    <-> &#xC;
  \n    <-> *actual newline*
  \r    <-> *actual carriage return*
  \t    <-> '   '
  \u$$$$<-> &#x$$$$; or &#$$$$$; with the correct hex-to-decimal conversion
  <     <-> &lt;
  >     <-> &gt;
  &     <-> &amp;
  '     <-> &apos;

  This proposal might cause a regression related to bug #866757.
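
  The JSON side of the mapping above can be sketched as follows. This is an
  illustration in Python, not the Zorba implementation; escape_json_string is
  a hypothetical name. It assumes the input is the already-decoded in-memory
  string (so XML entities such as &quot; have been resolved), and it leaves
  "/" alone because JSON permits but does not require the \/ escape.

```python
import json

# Short escapes JSON defines for specific characters (RFC 8259, section 7).
_SHORT_ESCAPES = {
    '"': '\\"',
    '\\': '\\\\',
    '\b': '\\b',
    '\f': '\\f',
    '\n': '\\n',
    '\r': '\\r',
    '\t': '\\t',
}


def escape_json_string(value: str) -> str:
    """Return `value` as a quoted JSON string literal."""
    out = []
    for ch in value:
        if ch in _SHORT_ESCAPES:
            out.append(_SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters must use the \uXXXX form.
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    return '"' + ''.join(out) + '"'


if __name__ == "__main__":
    raw = 'say "hi"\\\n\tend\x01'
    literal = escape_json_string(raw)
    print(literal)
    # Round trip through a strict parser to confirm the literal is valid.
    assert json.loads(literal) == raw
```

  The round trip through json.loads at the end is the property the module
  needs: whatever the in-memory string contains, the serialized form must
  parse back to exactly that string.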

To manage notifications about this bug go to:
https://bugs.launchpad.net/zorba/+bug/878508/+subscriptions
