On Mon, Sep 29, 2014 at 10:09:57AM +0200, Martin Kletzander wrote:
> [resending because it appears that my previous posting haven't reached
>  the list]
> 
> Since commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5, when you parse
> and save an URI that has no server (or similar) part, two slashes
> after the 'schema:' get lost.  It means 'uri:///noserver' is turned
> into 'uri:/noserver'.
> 
> This can break some applicatication that rely on those slashes to be
> in (e.g. libvirt).  This micro-series proposes 2 different approaches
> to fixing this that have one slight difference.  Approach 1 adds a
> field "slashes_used" into the xmlURI structure that is set to 1 if and
> only if double slashes were skipped when parsing.  The other approach
> simply checks whether path is absolute (starting with '/' and adds
> those two slashes when that condition is true.
> 
> The difference is that the second approach changes 'uri:/only/path' to
> 'uri:///only/path', but doesn't fiddle with the structure insides.
> 
> The probelm is caused by adaptations in RFC 3986 being ambiguously
> implemented in uri.c; for example after skipping "//", the parsing can
> follow almost the same rules or for example after skipping first '/'
> in the path the code can be exactly the same but there are more
> functions for that.  Nevertheless this series aims to fix the issue
> caused by commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5 and leaves
> these cleanups to be done as a follow-up patch later on.
> 
> I'm Cc'ing the author of 8eb55d782a2b9afacc7938694891cc6fad7b42a5 in
> order for him to be able to check this patch and let me (us) know
> whether the approaches proposed here are also usable as a fix for
> their use case.

  Okay I think I have a better option:

basically
   foo:///only/path

means a host of "" while

   foo:/only/path

means no host at all

  So the best fix IMHO is to fix the URI parser to record the first
case and an empty host string and the second case as a NULL host string

 I would not revert the initial patch, we should not 'invent' those
slash, but we should instead when parsing keep the information that
it's a host based path and that foo:/// means the presence of a host
but an empty one.

Once applied the resulting patch below, all cases seems to be saved
properly:

thinkpad:~/XML -> ./testURI uri:/noserver
uri:/noserver
thinkpad:~/XML -> ./testURI uri:///noserver
uri:///noserver
thinkpad:~/XML -> ./testURI uri://server/foo
uri://server/foo
thinkpad:~/XML -> ./testURI uri:/noserver/foo
uri:/noserver/foo
thinkpad:~/XML -> ./testURI uri:///
uri:///
thinkpad:~/XML -> ./testURI uri://
uri://
thinkpad:~/XML -> ./testURI uri:/
uri:/
thinkpad:~/XML ->

  If you revert the initial patch that last case fails

The problem is that I don't want to change the xmlURI structure to
minimize ABI breakage, so I could not extend the field. The natural
solution is to denote that uri:/// has an empty host by making
the uri server field an empty string which works very well but breaks
applications (like libvirt ;-) who blindly look at uri->server
not being NULL to try to reach it !
Simplest was to stick the port to -1 in that case, instead of 0
application don't bother looking at the port of there is no server
string, this makes the patch more complex than a 1 liner, but
is better for ABI.

With that patch libvirt test suite passes again except one case:

115) QEMU XML-2-ARGV disk-drive-network-gluster
... 
Offset 295
Expect [//V]
Actual [V]
                                                                      ...
FAILED

  I expect that to be a bug in the way the gluster/rdb driver builds
an URI since all other URI building work in a compatible fashion,
this need to be examined, but I don't consider this a libxml2 bug at
this point,

  thanks,

Daniel

diff --git a/uri.c b/uri.c
index d4dcd2f..ff47abb 100644
--- a/uri.c
+++ b/uri.c
@@ -759,6 +759,8 @@ xmlParse3986HierPart(xmlURIPtr uri, const char **str)
         cur += 2;
        ret = xmlParse3986Authority(uri, &cur);
        if (ret != 0) return(ret);
+       if (uri->server == NULL)
+           uri->port = -1;
        ret = xmlParse3986PathAbEmpty(uri, &cur);
        if (ret != 0) return(ret);
        *str = cur;
@@ -1106,7 +1108,7 @@ xmlSaveUri(xmlURIPtr uri) {
            }
        }
     } else {
-       if (uri->server != NULL) {
+       if ((uri->server != NULL) || (uri->port == -1)) {
            if (len + 3 >= max) {
                 temp = xmlSaveUriRealloc(ret, &max);
                 if (temp == NULL) goto mem_error;
@@ -1143,22 +1145,24 @@ xmlSaveUri(xmlURIPtr uri) {
                }
                ret[len++] = '@';
            }
-           p = uri->server;
-           while (*p != 0) {
-               if (len >= max) {
-                    temp = xmlSaveUriRealloc(ret, &max);
-                    if (temp == NULL) goto mem_error;
-                    ret = temp;
+           if (uri->server != NULL) {
+               p = uri->server;
+               while (*p != 0) {
+                   if (len >= max) {
+                       temp = xmlSaveUriRealloc(ret, &max);
+                       if (temp == NULL) goto mem_error;
+                       ret = temp;
+                   }
+                   ret[len++] = *p++;
                }
-               ret[len++] = *p++;
-           }
-           if (uri->port > 0) {
-               if (len + 10 >= max) {
-                    temp = xmlSaveUriRealloc(ret, &max);
-                    if (temp == NULL) goto mem_error;
-                    ret = temp;
+               if (uri->port > 0) {
+                   if (len + 10 >= max) {
+                       temp = xmlSaveUriRealloc(ret, &max);
+                       if (temp == NULL) goto mem_error;
+                       ret = temp;
+                   }
+                   len += snprintf((char *) &ret[len], max - len, ":%d", 
uri->port);
                }
-               len += snprintf((char *) &ret[len], max - len, ":%d", 
uri->port);
            }
        } else if (uri->authority != NULL) {
            if (len + 3 >= max) {

-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml

Reply via email to