[ACTION-4] Re: W3C TAG Response to CURIE Last Call (PR#8055)

Shane McCarron Wed, 08 Oct 2008 10:20:27 -0700


Noah,

The working group resolved to accept the TAG's comments and integrate them into the draft. I have embedded notes below as to how that integration was done. You can see an updated editor's draft via http://www.w3.org/MarkUp/Drafts#curie

Note that the working group has not reviewed this response, but was instrumental in formulating it. Any errors are entirely my fault. You should expect a more formal response once the working group has completed their review.

Thanks again for taking the time to carefully review this specification. We look forward to a smooth transition to CR in the near future.


Shane McCarron

[EMAIL PROTECTED] wrote:

* The introduction contains the statement:

<current>
"Unfortunately, QNames are unsuitable in most cases because 1) they are NOT intended for use in attribute values, and 2) ...".
</current>
Whether or not they were originally intended for such use, QNames are routinely used in attribute values, e.g. in XML Schema Documents, where their use is required. We suggest that a better explanation might be along the lines of:
<proposed>
"Unfortunately, QNames are unsuitable in most cases because 1) the use of QName as identifiers in attribute values and element content is problematic as discussed in [2], and 2) ..."
</proposed>

We have used your proposed text. We have also added an informative reference to the TAG finding.

* The TAG has decided to formally (re)raise a concern that I raised privately in a note sent in early August [3], and which the TAG itself raised in an earlier round of comments [4]. The concern remains that it is inappropriate to allow for use of new CURIE or safe_CURIE syntax in languages for which the specifications do not allow for it. Similarly, it is inappropriate to interpret existing syntax (e.g. pref:xxx) as a CURIE in cases where the specifications require it be interpreted as a URI. Accordingly, we suggest that the text that currently reads:
<current>
"In some cases language designers will want to use both URIs and CURIEs as the value of an attribute. For example, in XHTML+RDFa [XHTMLRDFa] the about attribute allows a URI to be specified that some metadata is "about", but it is also be useful to abbreviate this URI, using the compact syntax. However, the problem is that it is not possible for the language parser to be completely sure whether it has located a CURIE or a URI. For example, a resource could be specified as follows:
        <p rel="foaf:homePage" about="http://www.example.org/home.html">home</p>
There is no way to be sure that this is a normal URI, or a CURIE. Therefore the syntax for carrying a CURIE when there is any possibility of ambiguity is to enclose the CURIE in square brackets [...]
</current>

Be replaced with:

<proposed>
CURIEs and safe_CURIEs map to IRIs, but neither a CURIE nor a safe_CURIE <italic>is</italic> an IRI or URI. Accordingly, CURIEs and safe_CURIEs MUST NOT be used as values for attributes or other content that are specified to contain only URIs, IRIs, URI-references, IRI-references, etc. Specifications for particular attribute values or other content MAY be written to allow either CURIEs or IRIs (or URIs, etc.). The specifications for such languages MUST provide rules for disambiguation in situations where the same string could be interpreted as either a CURIE or an IRI. One way to do this is to require that all CURIEs be expressed as safe_CURIEs, implying that all unbracketed strings are to be interpreted directly as IRIs.
</proposed>

We have integrated your proposed text. Note that at the point in the document where you suggested we put this, however, safe curie had not yet been introduced. On balance I decided that this was not a big deal, and just made the term a forward reference to its production.
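
To illustrate the bracket-based rule in the proposed text, here is a minimal Python sketch. It is not taken from the draft. The function name classify_attribute_value is hypothetical, and the sketch assumes a host language that requires every CURIE to be written as a safe_CURIE, so that any unbracketed string is read directly as an IRI.

```python
from typing import Tuple


def classify_attribute_value(value: str) -> Tuple[str, str]:
    """Classify an attribute value under the "all CURIEs are safe_CURIEs" rule.

    Hypothetical helper: returns ("safe_curie", inner_text) for a bracketed
    value and ("iri", value) for everything else.
    """
    # A safe_CURIE is a CURIE wrapped in square brackets, so it needs at
    # least three characters: "[", one character of CURIE, and "]".
    if len(value) >= 3 and value.startswith("[") and value.endswith("]"):
        return ("safe_curie", value[1:-1])
    # Assumption: unbracketed strings are interpreted directly as IRIs.
    return ("iri", value)
```

With this rule, "[foaf:homePage]" is handed to CURIE processing as foaf:homePage, while "http://www.example.org/home.html" is left alone as an IRI. A host language that allows bare CURIEs alongside IRIs would need a different rule.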

* In the introduction, the term "value space" is used in a quite general manner to refer to a set of values that are grouped together and thus distinct from similar values in other groups. Later, in the syntax section, the statement is made: "Note that while the set of IRIs represents the lexical space of a CURIE, the value space is the set of URIs (IRIs after canonicalization - see [IRI])." This seems to appeal, without reference, to notions intended to be either similar in spirit to, or exactly the same as, the similarly named concepts defined for XML Schema Datatypes [5,6]. We suggest that, first of all, the inconsistency in usage between the Introduction and the Syntax section should be resolved. Secondly, the syntax section should be clearer on whether there is an assumption that an XML Schema Datatype for CURIE is being defined (as it is eventually in the Appendix), in which case the terms "lexical space" and "value space" should probably be made hyperlinks to the XSD Recommendation. If there is no specific assumption of an XSD Datatype in the syntax section, then the terms lexical space and value space should either be dropped from this section, or clarified. We would expect that, if the terms lexical and value space are retained in this section, the lexical space would be the set of strings conforming to the BNF for CURIE, SafeCURIE, etc. If so, those correspondences should be made clear. Looking ahead to Appendix A, the types you define there are subtypes of xsd:string. For those types, the correspondence between lexical and value space is of necessity 1:1 (I.e., as required by the XSD Recommendation), and thus the value space is also of the form pref:xxxxx. In any case, the whole story about datatypes, lexical, and value spaces, needs to be clarified, and needs to be made more consistent with XSD where appropriate.
On balance we suggest you retain the definitions in Appendix A (with the corrections given below), but replace the word/phrase 'value' and "value space" in the Introduction with 'name' and "name collection" respectively.

There was *no* intent that the term "value" in the introduction map to anything from XSD. It was an unfortunate coincidence. All we were trying to do was describe collections of scoped data. So basically, we did what you suggested. We have changed "value" to "name" and made some other changes as needed so the sentences would scan well. The term "value" is only used when we are talking about XML "attribute values", since I personally have no idea how else one might talk about the string within quotation marks attached to an attribute name in an XML document.

* There is a related, and serious, problem in section 3.  The sentence:

<current>
"Note that while the set of IRIs represents the lexical space of a CURIE, the value space is the set of URIs (IRIs after canonicalization - see [IRI])."
</current>

FWIW this was just an error in the last call draft. We knew what we meant (lexical space is CURIEs, value space is IRIs) and said it wrong. It had been corrected already in various intermediate drafts. However, thanks very much for helping us to tighten this still further.

is wrong on two counts, even after we decouple the terminology from XML Schema's usage:
      1) The 'lexical space' is a subset of strings,
         as specified by the BNF at the top of
         section 3 (after correction).

      2) The 'value space' is strings (intended for use in)
         representing IRIs.
So, and given the recommendation below as well, we suggest you replace the paragraph containing the above sentence with something along the following lines:
<proposed>
"CURIEs are an abbreviation for strings which are >intended< to represent IRIs (see [IRI]), but >checking that intent is not part of CURIE conformance<. The intended IRI is constructed by concatenating the prefix binding with the reference part, if any. There MUST be a prefix binding for the prefix (or the default prefix, if the prefix is absent) in scope."
</proposed>


We have integrated this change with some minor word smithing.
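
As a rough illustration of the expansion rule in the proposed paragraph (concatenate the prefix binding with the reference part, and require a binding to be in scope), here is a minimal Python sketch. The names expand_curie, prefixes and default_binding are hypothetical and do not come from the draft. Treating a CURIE with no colon the same way as one with a leading colon is a simplifying assumption; a host language may define those two cases differently.

```python
from typing import Dict, Optional


def expand_curie(curie: str,
                 prefixes: Dict[str, str],
                 default_binding: Optional[str] = None) -> str:
    """Expand a CURIE by concatenating its prefix binding and its reference."""
    if not curie:
        raise ValueError("the empty string is not a CURIE")

    # Split on the first colon only; the reference itself may contain colons.
    prefix, colon, reference = curie.partition(":")
    if not colon:
        # No colon at all: the whole string is the reference (assumption:
        # it is resolved against the default binding, like ":reference").
        prefix, reference = "", curie

    if prefix:
        # There MUST be a binding in scope for an explicit prefix.
        if prefix not in prefixes:
            raise ValueError(f"no binding in scope for prefix {prefix!r}")
        binding = prefixes[prefix]
    else:
        # Prefix absent: fall back to the host-language default binding.
        if default_binding is None:
            raise ValueError("no default prefix binding in scope")
        binding = default_binding

    return binding + reference
```

For example, with prefixes = {"foaf": "http://xmlns.com/foaf/0.1/"}, the CURIE foaf:homePage expands to http://xmlns.com/foaf/0.1/homePage. The sketch does not check that the result is a well-formed IRI, in line with the statement that checking that intent is not part of CURIE conformance.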

Care should be taken to check throughout that the word 'CURIE' is always used to refer to strings of the form [prefix :] reference. If a name is needed for the IRI which this maps to, perhaps a phrase such as "expanded CURIE" should be used, paralleling the term "expanded name" from XML namespaces; we are unsure as to whether there is, on balance, a need for such a term.

Since it was never the intent that CURIE mean "expanded CURIE", you are correct that we don't need such a term. We have done a scan of the document and are confident that when we say CURIE we mean the lexical form, not its expansion.

* Section 3 says:

<current>
"A CURIE processor that encounters a value that does not conform the constraints defined by this specification and by the host language SHOULD ignore that value. A host language MAY require other behavior."
</current>
This seems to make unwarranted assumptions about the host languages, whether each such language in fact has a notion of "ignoring" content, and if so, whether that is in fact the most appropriate error handling strategy. Accordingly, we recommend instead:
<proposed>
"It is an error if a string required by a host language to be a CURIE or SafeCURIE fails to satisfy the constraints defined above. Error handling is implementation-defined." Or, if you prefer, replace that last sentence with "Rules for error reporting and/or recovery should be provided in the specification for the host language."
</proposed>


The group agreed to put the onus on host language specifications.  Thanks!

The following comments apply to Appendix A, which defines XML Schema Datatypes relating to CURIEs:
* The status of Appendix A needs to be clarified -- it's currently described as normative, but at the very least the list of types needs cross-referencing to the BNF for CURIE and SafeCURIE.
* The syntax in section 3 and the regexps in Appendix A need to be brought into line. We recommend that this might be done by:
     a) Changing the CURIE production to read:

         curie := [ prefix ':' ] reference

        with a bit of prose saying that the empty
        string is _not_  a CURIE.

We have added that prose. Note, however, that the original CURIE production was correct. It is possible to have a CURIE with no "prefix" and only a leading colon - such a CURIE uses the host language-defined default prefix.

     b) Changing the core part of the regexps to read:

         ([\i-[:]][\c-[:]]*:)?.*

As per advice from C. M. Sperberg-McQueen, we have changed the regexps to read (([\i-[:]][\c-[:]]*)?:)?.+ - we believe this matches the production. If not, please let us know!

     c) Adding a facet to CURIE:

         <xs:minLength value="1"/>

     d) Adding a facet to SafeCURIE:

         <xs:minLength value="3"/>


We have added these facets - thanks!
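
For readers who want to experiment with the revised pattern and the two minLength facets outside an XSD processor, here is a rough Python approximation. Python's re module does not support the XSD escapes \i and \c or character class subtraction, so the sketch substitutes ASCII-only character classes; that substitution is an assumption and is narrower than the real XML name character ranges. The helper names is_curie and is_safe_curie are hypothetical, and wrapping the core pattern in literal square brackets for SafeCURIE is likewise assumed rather than quoted from Appendix A.

```python
import re

# ASCII-only stand-ins (assumption) for the XSD classes [\i-[:]] and [\c-[:]]:
# a name start character and a name character, both excluding the colon.
NAME_START = r"[A-Za-z_]"
NAME_CHAR = r"[A-Za-z0-9._-]"

# Mirrors (([\i-[:]][\c-[:]]*)?:)?.+ from the reply above.
CURIE_CORE = rf"(({NAME_START}{NAME_CHAR}*)?:)?.+"
CURIE_RE = re.compile(CURIE_CORE)
SAFE_CURIE_RE = re.compile(rf"\[{CURIE_CORE}\]")


def is_curie(text: str) -> bool:
    # XSD patterns match the whole string, hence fullmatch; the length
    # check corresponds to <xs:minLength value="1"/>.
    return len(text) >= 1 and CURIE_RE.fullmatch(text) is not None


def is_safe_curie(text: str) -> bool:
    # The length check corresponds to <xs:minLength value="3"/>.
    return len(text) >= 3 and SAFE_CURIE_RE.fullmatch(text) is not None
```

Because the pattern ends in .+, the empty string is rejected by the pattern alone and "[]" is rejected as a safe CURIE; the explicit length checks only restate the facets. Note that the trailing .+ is very permissive, so this is a lexical check only and says nothing about whether a prefix binding exists.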

We trust that our changes will resolve your last call comments.

--
Shane P. McCarron                          Phone: +1 763 786-8160 x120
Managing Director                            Fax: +1 763 786-8180
ApTest Minnesota                            Inet: [EMAIL PROTECTED]
