Daniel Veillard wrote:
> On Wed, Jun 06, 2007 at 11:35:09AM +0200, Oliver Meyer wrote:
>> Hi everybody,
>>
>> in xml 1.1 you are allowed to have e.g. &#7; as an attribute value. My 
>> xmllint does not support that version.
>> Are you planning to support xml 1.1?
>>
>> Kind Regards,
>> Oliver
>>
>> foo.xml=
>>
>>    <?xml version = "1.1" encoding = "UTF-8"?>
>>    <foo a= '&#7;'/>
> 
> And what is the meaning of that &#7; ?

BEL? I don't care :-)

And what is the _meaning_ of &#65; ?

It's ASCII.

> My point on the subject is the following:
>   - 1.1 allows to dump invalid content unchecked from database without
>     worrying about semantic. Does this help interoperability ? No,
>     clean up your databases
>   - Also note that 1.1 rejects documents which are well-formed from
>     an 1.0 perspective, see production RestrictedChar, the code point
>     [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F] which used to be allowed as-is
>     will now raise a well-formedness error.
> 
> I am part of the Working Group which created XML-1.1, there were good intents
> for it like cleanup w.r.t. Unicode, but some big vendors also pushed for
> allowing characters which were IMHO rightfully blocked in 1.0 . And it's
> unfortunately not backward compatible. 
> While I would be sensible to request driven by the good intents, yours
> is from my perspective due to the fact that you have not well defined data
> and you would like to make this 'portable'. Please clean your data 
> instead of sending the problem to the next person in the food chain.
> 
>   I don't see how '&#7;' could make any sense if I received it in a
> text document (yes XML is fundamentally text), maybe I need to be enlightened 
> !
I still don't see _why_ an XML parser has to know the meaning of &#7; or &#65; .

> But this was debated to death in the Working Group before, my opinion is
> well set, and I prefer to protect my users base from the real use of 1.1
> (and thanks to the Web gods, the request to allow code point 0 was blocked !)
> 
>   In a nutshell, no, clean up your data, or use something else, if
> you really want to send raw data, why not use binary directly ? That's
> just fine, but don't pretend it's a text format.
We use XML for the structure. So moving to something else (another, 
binary format) would be a step back.

Our problem area has been ISO 2709 records, which are converted to MARCXML 
(from network sources beyond our control). Right now problematic chars, say 
&#7;, are just thrown away. Another option, to avoid data loss, would be for 
us to define _private_ semantics, e.g. <char num="7"/>.
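
A minimal sketch of that last option, in Python, just to illustrate the 
idea. The function name encode_private and the <char num="N"/> element are 
hypothetical and not part of MARCXML or of any library. It only works for 
element content, since an attribute value cannot contain elements, and it 
ignores surrogates for brevity.

```python
import re
from xml.sax.saxutils import escape

# Characters that XML 1.0 forbids outright, even as &#N; references:
# the C0 controls other than TAB, LF and CR, plus U+FFFE and U+FFFF.
_XML10_FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def encode_private(text):
    """Return XML 1.0 element content for `text`.

    Forbidden characters become the private element <char num="N"/>
    (N is the decimal code point); everything else is escaped normally.
    """
    parts = []
    pos = 0
    for match in _XML10_FORBIDDEN.finditer(text):
        # Escape the ordinary text that precedes the forbidden character.
        parts.append(escape(text[pos:match.start()]))
        # Replace the forbidden character with the private marker element.
        parts.append('<char num="%d"/>' % ord(match.group()))
        pos = match.end()
    # Escape whatever follows the last forbidden character.
    parts.append(escape(text[pos:]))
    return "".join(parts)


print(encode_private("bell\x07 & <tag>"))
# prints: bell<char num="7"/> &amp; &lt;tag&gt;
```

The receiving side would of course have to know about the private element 
to restore the original character, which is exactly why it is only a 
private convention.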

/ Adam

> 
> Daniel
> 

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml