On Wednesday 09 February at 16:06, Mike Brown wrote:
> Sylvain Thénault wrote:
> > Thanks a lot. Actually, almost all the work is already done right there.
> > Here is what I've worked on. Once we reach a consensus, I'll add it
> > to pyxml. So I've attached to this mail:
> >
> > - a light version of 4Suite Uri.py including the following functions:
> >   SplitUriRef, UnsplitUriRef (it was really less annoying to use those
> >   two functions than the equivalent urllib ones), Absolutize,
> >   MakeUrllibSafe, _RemoveDotSegments, BaseJoin, GetScheme and
> >   IsAbsolute. With the presented solution, the last 3 are not used
> >   and could be removed, but I've kept them in for now.
>
> Doc strings will need to be updated to reflect the promotion from
> "rfc2396bis" to RFC 3986. Also there's one place where I have "RFC
> (newline)2396bis" which should also be fixed.

Done. However, do the sections of rfc2396bis match the sections of RFC 3986?

> In MakeUrllibSafe, you should catch the UnicodeError that could result
> from the attempt to force unicode to a byte string:
>
>     if isinstance(uri, unicode):
>         try:
>             uri = uri.encode('us-ascii')
>         except UnicodeError:
>             raise ValueError("uri %r must consist of ASCII characters." % uri)

Done.

> > All the tests for Absolutize from 4Suite are still passing.
>
> I forgot to point you to my tests. They do not use unittest, so they
> would need to be adapted, but it would be easy since the comparisons
> are string-in to string-out (or exception), and I've labeled them
> pretty clearly:
>
> http://cvs.4suite.org/viewcvs/4Suite/test/Lib/test_uri.py?view=markup
>
> As you will see, they are fairly comprehensive.

I did find them. As I said, I've run the relevant tests against the
restricted version of Uri.py and all of them pass.

> > - a modified version of saxutils, expecting the Uri module above to be
> >   in the _xmlplus directory (i.e. importable as xml.Uri). I've refactored
> >   prepare_input_source to ease testing of the URI merging stuff.
>
> You might want to grep for "emacspymodestink" in your code. :)

Right, forgot that :) I've also made the following modification to
prepare_input_source since I sent it here:

@@ -510,7 +510,7 @@
         source = xmlreader.InputSource()
         source.setByteStream(f)
         if hasattr(f, "name"):
-            source.setSystemId(f.name)
+            source.setSystemId('file:%s' % f.name)
     if source.getByteStream() is None:
         sysid = absolute_system_id(source.getSystemId(), base)
         source.setSystemId(sysid)

> > - a unittest file, which includes some test cases for the URI merging
> >   function. Please take a look at the existing test cases to check that
> >   everything looks fine to you. If you have other cases to add, please let
> >   me know (or maybe I can add this file to the cvs first).
> >   Note that to run the tests, you should have a "quotes.xml" file in the
> >   same directory as the test file (there is one in the test directory of
> >   pyxml). As a bonus, I've converted the escape function test from
> >   test_utils into a unittest in the same file.

Did you take a look at those tests? Do they sound good to everyone here?
Any more tests to add?

> > Anyway, having SplitUriRef/UnsplitUriRef replacing
> > urlparse.urlsplit/urlunsplit and Absolutize or BaseJoin replacing
> > urlparse.urljoin would definitely be the right thing.
>
> On python-dev in Sep 2004, I was discussing with Martin v. Löwis what
> principles we think should be embraced by urlparse, urllib and urllib2. He
> feels that we should simultaneously shoot for both URI and IRI support
> according to the RFCs (3986 and 3987), with unicode arguments being assumed
> to be IRIs.
>
> I would hold off on any stdlib changes until the APIs can be discussed in
> more detail.

OK.

-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

_______________________________________________
XML-SIG maillist  -  XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig
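
A minimal sketch of two points from this thread: the ASCII check suggested
for MakeUrllibSafe, and the split / resolve operations that SplitUriRef,
UnsplitUriRef and Absolutize would take over from the standard library.
Everything here is an assumption made for illustration: it is written for
Python 3, ensure_ascii_uri is a hypothetical helper (not part of Uri.py),
urllib.parse is only a stand-in for the 4Suite functions (whose behaviour
may differ in edge cases), and the example.org URIs are made up.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def ensure_ascii_uri(uri):
    """Return uri encoded as ASCII bytes, or raise ValueError.

    Hypothetical helper: it mirrors only the error handling discussed
    for MakeUrllibSafe, not the rest of that function.
    """
    try:
        return uri.encode("us-ascii")
    except UnicodeError:
        # UnicodeEncodeError is a subclass of UnicodeError.
        raise ValueError("uri %r must consist of ASCII characters." % uri)


# Split / unsplit round trip with the standard library
# (stand-in for SplitUriRef / UnsplitUriRef).
parts = urlsplit("http://example.org/a/b/doc.xml?x=1#frag")
rebuilt = urlunsplit(parts)

# Resolve a relative reference against a base URI
# (stand-in for Absolutize; the "../" dot segment is removed).
resolved = urljoin("http://example.org/a/b/doc.xml", "../quotes.xml")

print(parts.scheme, parts.path)
print(rebuilt)
print(resolved)
```

Raising ValueError keeps the caller from having to know about the encoding
step: a non-ASCII URI is reported as a bad argument rather than as an
encoding failure deep inside the function.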