Hi,

attached you can find a patch for urlgrabber to preserve query parameters in URLs.
Some CDNs do token authentication by appending a token to the URL as a query parameter. So the baseurl could be something like:

https://host.domain.top/path/?abcdef1234567890

Simply appending the relative part to it will result in something like this:

https://host.domain.top/path/?abcdef1234567890/requested/file.txt

which is simply wrong: the requested file ends up inside the query string instead of the path.

-- 
Regards

Michael Calmer

--------------------------------------------------------------------------
Michael Calmer
SUSE LINUX Products GmbH, Maxfeldstr. 5, D-90409 Nuernberg
T: +49 (0) 911 74053 0
F: +49 (0) 911 74053575 - e-mail: michael.cal...@suse.com
--------------------------------------------------------------------------
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
HRB 16746 (AG Nürnberg)
From 6166930c6d5d1823b0b4e47f86078c5a76418c23 Mon Sep 17 00:00:00 2001
From: Michael Calmer <m...@suse.de>
Date: Fri, 12 Sep 2014 13:01:55 +0200
Subject: [PATCH] preserve queryparams in urls

Some CDN do token authentication by appending a token
to the URL as query parameter. So the baseurl could be something like:

https://host.domain.top/path/?abcdef1234567890

Simply appending the relative part to it will result in an invalid URL.
---
 urlgrabber/mirror.py | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/urlgrabber/mirror.py b/urlgrabber/mirror.py
index f3c2664..6a7cdb2 100644
--- a/urlgrabber/mirror.py
+++ b/urlgrabber/mirror.py
@@ -94,6 +94,7 @@ CUSTOMIZATION
 import sys
 import six
 import random
+import urlparse
 from six.moves import _thread as thread  # needed for locking to make this threadsafe
 
 from urlgrabber.grabber import URLGrabError, CallbackObject, DEBUG, _to_utf8
@@ -395,11 +396,12 @@ class MirrorGroup:
     # by overriding the configuration methods :)
 
     def _join_url(self, base_url, rel_url):
-        if base_url.endswith('/') or rel_url.startswith('/'):
-            return base_url + rel_url
+        (scheme, netloc, path, query, fragid) = urlparse.urlsplit(base_url)
+        if path.endswith('/') or rel_url.startswith('/'):
+            return urlparse.urlunsplit((scheme, netloc, path + rel_url, query, fragid))
         else:
-            return base_url + '/' + rel_url
-        
+            return urlparse.urlunsplit((scheme, netloc, path + '/' + rel_url, query, fragid))
+
     def _mirror_try(self, func, url, kw):
         gr = GrabRequest()
         gr.func = func
-- 
1.8.1.4
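
For anyone who wants to try the behaviour outside of urlgrabber, here is a minimal standalone sketch of the same idea. It is not the code from mirror.py: the function names join_url and naive_join are made up for illustration, and it assumes Python 3, where urlsplit and urlunsplit live in urllib.parse rather than in the Python 2 urlparse module the patch imports.

```python
from urllib.parse import urlsplit, urlunsplit


def naive_join(base_url, rel_url):
    """Old behaviour: plain string concatenation (hypothetical helper)."""
    if base_url.endswith('/') or rel_url.startswith('/'):
        return base_url + rel_url
    return base_url + '/' + rel_url


def join_url(base_url, rel_url):
    """Append rel_url to the path of base_url, keeping query and fragment.

    Illustrative stand-in for the patched MirrorGroup._join_url.
    """
    scheme, netloc, path, query, fragment = urlsplit(base_url)
    if path.endswith('/') or rel_url.startswith('/'):
        new_path = path + rel_url
    else:
        new_path = path + '/' + rel_url
    # Reassemble the URL; the token stays in the query component.
    return urlunsplit((scheme, netloc, new_path, query, fragment))


if __name__ == '__main__':
    base = 'https://host.domain.top/path/?abcdef1234567890'
    print(naive_join(base, 'requested/file.txt'))
    print(join_url(base, 'requested/file.txt'))
```

With the baseurl from the description, the naive variant puts the file name after the token, while the split/unsplit variant yields https://host.domain.top/path/requested/file.txt?abcdef1234567890. Base URLs without a query string come out the same as before.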