Salut Daniel,

Daniel Veillard <veill...@redhat.com> writes:

> On Tue, Apr 22, 2014 at 10:11:46AM +0000, Susanne Oberhauser-Hirschoff wrote:
>> 
>> The xml:base is not just the directory, it also contains the file name,
>> right? 
>
>   right but it is not needed, in that case all your files are in the
> same directory, no need to add an xml:base it doesn't change any
> further URI-Reference done from the included portion

Ah, now I understand where you are coming from: you were only concerned
about external URI references going out from the xi:included portion!


Now I'm doing DocBook processing, and I find xml:base extremely useful
for telling the origin of each part of the xi:include-processed document:
go to the closest ancestor with xml:base defined, voilà, that's where
this part originates :-)
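
To make that lookup concrete, here is a minimal sketch.  It uses Python's
standard xml.etree.ElementTree rather than lxml, the sample document is
made up, and origin_of is a hypothetical helper name, not part of any
library:

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:base attribute with the XML namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Made-up document, shaped like the output of xi:include processing.
DOC = """<top>
  <elem1 xml:base="2.xml">
    <elem2 xml:base="3.xml">
      <a fileref="x.svg"/>
    </elem2>
    <b/>
  </elem1>
</top>"""


def origin_of(elem, parents):
    """Return the xml:base of the closest element (elem itself or an
    ancestor) that carries one, or None if there is none."""
    node = elem
    while node is not None:
        base = node.get(XML_BASE)
        if base is not None:
            return base
        node = parents.get(node)  # the root has no parent, so this ends the loop
    return None


root = ET.fromstring(DOC)
# ElementTree keeps no parent pointers, so build a child -> parent map.
parents = {child: parent for parent in root.iter() for child in parent}

print(origin_of(root.find(".//a"), parents))
print(origin_of(root.find(".//b"), parents))
```

This only works if every included fragment actually carries its xml:base,
which is exactly what goes missing when all files share one directory.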

When I shuffle the included parts into subdirectories, all is well.  But
when all fragments are in the same directory, the current xml:base fixup
kicks them out :-(


>> The whole XInclude test suite behaves like that, see below.
>
>   http://www.w3.org/TR/xinclude/#base
>
> the goal was really to make sure any further URI-Reference would not
> be broken.

I'm confused.  Whose goal?

>> So it _should_ look like this, shouldn't it?  This is what I get with
>> the attached patch to libxml:
>> 
>> ### correct output ################################################
>> xmllint --xinclude 1.xml 
>> <?xml version="1.0"?>
>> <top xmlns:xi="http://www.w3.org/2001/XInclude">
>>   <elem1 xmlns:xi="http://www.w3.org/2001/XInclude" xml:base="2.xml">
>>   <elem2 xml:base="3.xml">
>>   <a fileref="x.svg"/>
>> </elem2>
>> </elem1>
>> </top>
>> ###################################################################
>
>   and with or without the xml:base the fileref URI reference will work
> correctly. on the other hand having tons of xml:base getting in the
> final document is more a nuisance than a benefit, especially if you
> have a lot of top level XIncluded element


The lxml.etree._Element sourceline and base go directly to libxml2's
xml:base.  Without the xml:base fixup, sourceline is useless: since the
base remains unset under the current logic, the base/sourceline
combination is simply wrong.

I see no other way than xml:base to track which file some part of the
document comes from.

>> The XInclude test suite agrees, when run with the attached script, like
>> this.
>> 
>> ###################################################################
>> cvs -d:pserver:anonym...@dev.w3.org:/sources/public \
>>    co  2001/XInclude-Test-Suite  XInclude-Test-Suite
>> 
>> cd XInclude-Test-Suite
>> 
>> python3 PATH-TO/run-tests-with-lxml.py
>> ###################################################################
>> 
>> This gets about 15 less failures when run with the patch below, and
>> afaict from a review with/without patch, there is no additional ones.
>> 
>> So it should be an improvement :)
>> 
>
>   Not completely sure TBH, your test suite output will look nice, the
> users document not so ... which one is most important ?

It's not about looking nice.  If it were about looking nice, I'd try to
get rid of the duplicate xmlns:xi namespace declarations which remain in
the document after xi:include processing :-)


The problem is that with the current implementation, xml:base in libxml2
is restricted to exactly one use case: relative references to external
resources.  However, for identifying where each part of the document
originates, a correct filename in xml:base is key.



Also I'm not sure about the amount of additional information.  The
xml:base attribute is only added to the root that replaces the
xi:include.  It is *not* added to any other internal node.
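
A toy illustration of that behaviour, not libxml2's actual code: the
sketch below expands xi:include elements from an in-memory dictionary of
made-up files (all "in the same directory") and sets xml:base only on the
element that replaces each xi:include.  FILES, expand and
_expand_children are hypothetical names, and the href is used as the
xml:base value as-is.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Made-up stand-ins for 1.xml, 2.xml and 3.xml in one directory.
FILES = {
    "1.xml": '<top xmlns:xi="http://www.w3.org/2001/XInclude">'
             '<xi:include href="2.xml"/></top>',
    "2.xml": '<elem1 xmlns:xi="http://www.w3.org/2001/XInclude">'
             '<xi:include href="3.xml"/></elem1>',
    "3.xml": '<elem2><a fileref="x.svg"/></elem2>',
}


def expand(name):
    """Parse FILES[name] and expand all xi:include elements below its root."""
    root = ET.fromstring(FILES[name])
    _expand_children(root)
    return root


def _expand_children(parent):
    # Iterate over a copy; replacing parent[i] keeps the indices stable.
    for i, child in enumerate(list(parent)):
        if child.tag == XI_INCLUDE:
            href = child.get("href")
            included = expand(href)
            # Only the element replacing the xi:include gets xml:base.
            included.set(XML_BASE, href)
            included.tail = child.tail
            parent[i] = included
        else:
            _expand_children(child)


doc = expand("1.xml")
with_base = [(e.tag, e.get(XML_BASE)) for e in doc.iter() if XML_BASE in e.attrib]
print(with_base)
```

Two includes yield two xml:base attributes; the inner <a> element and the
<top> root get none.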

In anything beyond simple test cases with tiny document fragments, the
signal-to-noise ratio actually improves, as every xml:base that's added
is relevant metadata.


I think that's also why the test suite has xml:base in _all_ places
where xi:include was processed.


On the list I saw you had a fight getting xml:base processing in at
all.  It made changes to DTDs necessary.  However, when the files are in
subdirectories (or parent directories), the xml:base will be added
anyhow, so the DTDs / schemas have to deal with xml:base already.


What are the use cases that do thousands of xi:includes of tiny XML
fragments, making the current tuning necessary?

If that's a real use case, I could redo the patch with an option.

Though I'd prefer not to :)

Thx,


S.

-- 
Susanne Oberhauser                     SUSE LINUX Products GmbH
+49-911-74053-574                      Maxfeldstraße 5
Processes and Infrastructure           90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)