TAG comments on: http://www.w3.org/TR/2007/WD-curie-20071126/ "CURIE Syntax 1.0" from Williams, Stuart (HP Labs, Bristol) on 2008-03-28 (www-html-editor@w3.org from January to March 2008)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Fri, 28 Mar 2008 12:39:08 +0000
To: "www-html-editor@w3.org" <www-html-editor@w3.org>
Message-ID: <9674EA156DA93A4F855379AABDA4A5C611A1D57A8D@G5W0277.americas.hpqcorp.net>

With respect to your work on "CURIE Syntax 1.0" [1], the TAG has asked me to post the comments attached below on its behalf.

The TAG reached consensus on the comments it wished to send during its meeting on 27th March 2008 (minutes to be published).

I'd like to thank you for your patience in awaiting our comments.

Best regards

Stuart Williams (co-chair)
on behalf of W3C TAG

[1] http://www.w3.org/TR/2007/WD-curie-20071126/
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
===============================================================================

The TAG appreciates that the XHTML 2 WG is attempting to address a frequently expressed need with the CURIE design.  Aside from the relatively minor comments given at the end, which we hope you can address to improve the way the spec. reads, we have some overall concerns which we invite you to consider.

[Note that although most of these comments were written against the 22 January 2008 Editors' Draft [1], some were based on the public WD of 26 November 2007 [2], and may have been overtaken.]

1) The spec. as it stands doesn't really make clear what the
   requirements for CURIEs are.  What _precisely_ is the requirement
   you are trying to address?

2) The overlap with existing usage of the 'xxx:yyy' pattern in
   XML-based languages is troubling.  It would be helpful if you could
   at least explain the background which has led you to reject all
   suggestions that a different separator character, or XML entity
   syntax, should be used.

3) The fact that you feel compelled to provide for potential confusion
   in contexts where URIs are expected in XML languages is very
   troubling, if we read it as implying that CURIEs are intended for
   use in existing XML languages in places where only URIs are allowed
   today. We can't tell whether this is actually your intention,
   because the spec. is equivocal on this point. In section 5.2 [1]
   the (existing) 'href' attribute of XHTML is mentioned in the prose
   (worrying), but the _examples_ which follow only show CURIEs in the
   (presumably proposed for XHTML2 or HTML5 or . . .) 'resource'
   attribute (OK).

   In this connection we find the prose about CURIEs in the current
   RDFa spec. [2] troubling. The implication that CURIEs can be used
   in existing URI-only contexts is made explicit in one of the
   examples therein [3]:

     <link about="[_:a]" rel="foaf:knows" href="[_:b]" />

   and more generally there by the fact that the DTD for XHTML+RDFa
   defines several of its _new_ attributes, e.g. 'resource' and
   'about', as containing URI references.

   One can imagine an alternative proposal which made clear that it
   was only addressing the need for an abbreviated URI format in
   non-XML languages, or new XML languages, or new contexts within old
   XML languages, where _only_ such abbreviated forms are
   allowed. That is, a position taken _against_ any possibility that
   CURIEs might be used where URIs are called for in XML languages
   today. It would though have to acknowledge the possible negative
   consequences of success in going down this path, namely that
   ordinary users will not understand that 'safe' CURIEs
   ([xxx:yyy]-form CURIEs) are not a universal alternative to URIs,
   and will start using them in existing languages where URIs are
   expected, causing tools to break and users to be frustrated.

   All of this adds up to saying: please consider _very_ carefully
   whether the use cases/candidate requirements you have for the
   'safe' CURIE, i.e. a CURIE that can be used in an XML language
   where a URI can also be used, are really compelling. We note in
   this regard that we are aware of no requests for an analogous
   form for QNames.
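
   To make the concern concrete, here is a minimal sketch in Python
   (not taken from either spec.) of what a processor has to do in a
   context that admits both URIs and 'safe' CURIEs, set against what a
   URI-only tool sees. The prefix map and the function names are
   assumptions made purely for illustration, as is the treatment of
   the '_' prefix as a blank node label.

```python
from urllib.parse import urlsplit

# Hypothetical prefix map, for illustration only.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def interpret(value, prefixes=PREFIXES):
    """Classify a value in a context that admits URIs and safe CURIEs."""
    if value.startswith("[") and value.endswith("]"):
        prefix, sep, reference = value[1:-1].partition(":")
        if sep and prefix == "_":
            # Assumption: '_' marks a blank node label, as in "[_:a]".
            return ("bnode", reference)
        if sep and prefix in prefixes:
            return ("curie", prefixes[prefix] + reference)
        return ("error", value)
    # Anything without brackets is taken to be an ordinary URI reference.
    return ("uri", value)


def legacy_scheme(value):
    """What a URI-only tool sees: the scheme component, if any."""
    return urlsplit(value).scheme


print(interpret("[foaf:knows]"))
print(interpret("foaf:knows"))
print(legacy_scheme("foaf:knows"))
```

   Without the brackets, 'foaf:knows' is indistinguishable from a URI
   whose scheme is 'foaf', and a tool that knows nothing about CURIEs
   will not expand the bracketed form either.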

4) Have you considered that if you get what you've asked for, you
   won't have (everything) you need?  That is, have you considered
   that being able to write xxx:37b and have that treated as
   "http://www.example.com/feeds/thursday.xml#37b" will _not_ make
   that a useable URI?  '37b' is not an NCName, so the URI is not a
   valid shorthand XPointer.  '37b' is not an XML Name, so is not a
   valid value for an ID-typed attribute, and so cannot be an anchor
   in a valid XML document.

   You may say that this is not your problem, but by allowing, even
   encouraging, the use of CURIEs of this form, you are encouraging
   people to deploy broken data.
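
   The point can be checked mechanically. The following Python sketch
   assumes that the prefix 'xxx' is bound to
   "http://www.example.com/feeds/thursday.xml#", and it uses a
   simplified, ASCII-only approximation of the NCName production (the
   real production admits many more characters). The function names
   are hypothetical.

```python
import re

# Simplified, ASCII-only approximation of NCName: a letter or underscore,
# followed by letters, digits, '.', '-' or '_'. No colon is allowed.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")

# Assumed binding, inferred from the example in the text above.
PREFIXES = {"xxx": "http://www.example.com/feeds/thursday.xml#"}


def expand_curie(curie, prefixes=PREFIXES):
    """Concatenate the prefix value and the reference."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference


def fragment_is_ncname(uri):
    """True if the fragment could serve as a shorthand XPointer or an ID."""
    _, _, fragment = uri.partition("#")
    return bool(NCNAME_ASCII.match(fragment))


uri = expand_curie("xxx:37b")
print(uri, fragment_is_ncname(uri))
```

   Under these assumptions the expansion succeeds, but the fragment
   check fails because '37b' begins with a digit.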

More specific comments:

Section 1:

  The implication that CURIEs are simply a reworking of QNames to
  eliminate some inconvenient restrictions is misleading:
  the value space of QNames is a space of _pairs_ of URIs and NCNames,
  whereas the value space of CURIEs is a space of singletons.  So
  statements such as "values that are valid QNames are a subset of
  [CURIEs]" and "in other words, the principle used in QNames --- that
  of combining a namespace name with a local part to generate a URI"
  should be removed, or at least heavily modified.
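
  The difference in value spaces can be sketched in a few lines of
  Python. The prefix bindings and function names below are
  hypothetical and chosen only so that two distinct QName values
  collapse into one CURIE value.

```python
# Hypothetical bindings, for illustration only.
BINDINGS = {
    "p": "http://example.org/ns/",
    "q": "http://example.org/ns/doc",
}


def qname_value(lexical, bindings=BINDINGS):
    """A QName denotes a pair: (namespace name, local part)."""
    prefix, _, local = lexical.partition(":")
    return (bindings[prefix], local)


def curie_value(lexical, bindings=BINDINGS):
    """A CURIE denotes a single IRI: prefix value plus reference."""
    prefix, _, reference = lexical.partition(":")
    return bindings[prefix] + reference


# As QNames these two are different values; as CURIEs they are the same.
print(qname_value("p:docs") == qname_value("q:s"))
print(curie_value("p:docs") == curie_value("q:s"))
```

  The pair keeps the boundary between namespace name and local part;
  the concatenation discards it.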

Section 1.  "1) [QNames] are NOT intended for use in attribute values"

  This is at best misleading -- W3C XML Schema datatypes, of which
  QName is one, are explicitly intended for use in defining the
  allowed content of both attributes and elements.

  Insofar as the TAG has warned about QNames in content (not just
  attribute values), this has to do with the vulnerability of prefix
  mappings, and CURIEs seem to inherit all of those problems.

Section 3.  Prefixes and even colons are optional (again/still).

  This is just asking for trouble, in my view, particularly the 'no
  colon' case.  What use cases require default prefixes?  The absence
  of _any_ visible signal seems very dangerous.

Section 3.  "The concatenation of the prefix value associated with a
               CURIE and its reference MUST be an IRI [IRI]."

  Just what production is meant here?  I.e. the IRI production itself
  (I hope so) or the IRI-reference production (I hope not)?
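
  The practical difference is that the IRI production requires a
  scheme, whereas IRI-reference also admits relative references. A
  rough Python sketch of the stricter reading follows; it tests only
  for the presence of a scheme and is not a full check against the
  IRI grammar. The function name and sample values are hypothetical.

```python
from urllib.parse import urlsplit


def concatenation_is_iri(prefix_value, reference):
    """Rough check: the concatenation must carry a scheme (be absolute).

    This does not validate the full IRI grammar; it only separates
    absolute forms from relative references.
    """
    candidate = prefix_value + reference
    return bool(urlsplit(candidate).scheme)


print(concatenation_is_iri("http://example.org/ns#", "term"))
print(concatenation_is_iri("../ns#", "term"))
```

  Under the IRI-reference reading the second case would be
  acceptable; under the IRI reading it would not.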

Section 3.

  This draft reintroduces the possibility of "additional prefix
  mapping definition mechanisms".

  We are made uneasy by this for XML-based languages, as it only
  _increases_ the risk of prefix-bindings being lost. If this
  provision cannot be removed, at least some justification for it
  ought to be provided.

  It is particularly worrying that these mechanisms are used in
  section 4.3 immediately after the statement "documents annotated
  with RDFa could use the xmlns mechanism to define prefixes". Given
  that they could, we think at the very least they SHOULD, and would
  prefer MUST.

Section 5.  "lexical value"

  This is at best a confusing phrase -- I suggest using "lexical form"
  instead of "raw CURIE", and "value as IRI" instead of "lexical
  value" (or "value as URI", depending on which you actually mean).

Section 5.2

  The end of the first paragraph suggests that the following example
  is about an email address, but that appears not to be the case. At
  least, it's not clear how to interpret the example:

     <span rel="foaf:homePage" resource="http://...">home</span>

  as an email address.

If this spec. is in fact intended to define an XSD datatype, then a schema document, or at least simple type definitions for CURIEs and safe CURIEs, would be a good addition.

[1] http://www.w3.org/MarkUp/2008/ED-curie-20080122/
[2] http://www.w3.org/TR/2007/WD-curie-20071126/
Received on Friday, 28 March 2008 12:43:20 UTC