Draft response to: TAG comments on: http://www.w3.org/TR/2007/WD-curie-20071126/ "CURIE Syntax 1.0" from Steven Pemberton on 2008-04-16 (public-xhtml2@w3.org from April 2008)

From: Steven Pemberton <steven.pemberton@cwi.nl>
Date: Wed, 16 Apr 2008 15:33:51 +0200
To: "XHTML WG" <public-xhtml2@w3.org>
Message-ID: <op.t9phaphzsmjzpq@acer3010>
To: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>,  
"www-html-editor@w3.org" <www-html-editor@w3.org>

Hello Stuart. Thanks for the comments. This email is a reply to the  
general comments only. A separate mail will address the specific points.

> With respect to your work on "CURIE Syntax 1.0" [1], the TAG has asked  
> me to post the comments attached below on its behalf.
>
> The TAG reached consensus on the comments it wished to send during their  
> meeting on 27th March 2008 (minutes to be published).
>
> I'd like to thank you for your patience in awaiting our comments.

You just got in under the wire :-) We were just about to go to last call  
when your comments arrived.

> The TAG appreciates that the XHTML 2 WG is attempting to address a  
> frequently expressed need with the CURIE design.  Aside from the  
> relatively minor comments given at the end, which we hope you can  
> address to improve the way the spec. reads, we have some overall  
> concerns which we invite you to consider.
>
> [Note that although most of these comments were written against the 22  
> January 2008 Editors' Draft [1], some were based on the public WD of
> 26 November [2], and may have been overtaken.]
>
> 1) The spec. as it stands doesn't really make clear what the
>    requirements for CURIEs are.  What _precisely_ is the requirement
>    you are trying to address?

Hmm. The introduction tries to explain that. To summarise: Many specs use  
QNames to represent URIs, but unfortunately QNames are unable to represent  
all URIs, so this spec fixes that. What more would you like to see?

> 2) The overlap with existing usage of the 'xxx:yyy' pattern in
>    XML-based languages is troubling.  It would be helpful if you could
>    at least explain the background which has led you to reject all
>    suggestions that a different separator character, or XML entity
>    syntax, should be used.

It is true that there was a long discussion on this. The final decision  
went in favour of the current syntax because (a) it looks like what it is  
extending, so future specs could use it without invalidating current  
content; (b) the xxx:yyy syntax already has mind-share, making it easy for  
people to understand intuitively; and (c) the syntax is already in  
widespread use in other software, such as Wikis. But are you asking for  
that discussion to be repeated in the spec? Personally I am not sure the  
specification is the right place for that sort of discussion.

> 3) The fact that you feel compelled to provide for potential confusion
>    in contexts where URIs are expected in XML languages is very
>    troubling, if we read it as implying that CURIEs are intended for
>    use in existing XML languages in places where only URIs are allowed
>    today. We can't tell whether this is actually your intention,
>    because the spec. is equivocal on this point. In section 5.2 [1]
>    the (existing) 'href' attribute of XHTML is mentioned in the prose
>    (worrying), but the _examples_ which follow only show CURIEs in the
>    (presumably proposed for XHTML2 or HTML5 or . . .) 'resource'
>    attribute (OK).

It is true that the clash between the syntax of QNames and URIs is  
regrettable. However, everyone now seems so used to using QName-like  
syntax for representing URIs, especially in Semantic Web contexts (just  
look at Turtle, SPARQL, and N3), that designing yet another syntax seemed  
like asking for trouble.

But note that CURIEs are only the syntactic space. The value space of  
CURIEs is URIs. CURIEs are not intended to be sent over the wire for  
dereferencing.
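
As a rough illustration of that distinction (a sketch only, not spec  
text), a processor maps the CURIE it reads to the URI it actually means by  
looking up the prefix and appending the reference. The function name, the  
prefix table, and the way prefixes get declared are all assumptions made  
for the example; a host language would define its own mechanism.

```python
# Hypothetical prefix table. In a real host language the mappings would
# come from whatever prefix-declaration mechanism that language defines.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xxx": "http://www.example.com/feeds/thursday.xml#",
}


def expand_curie(value, prefixes):
    """Return the URI that a CURIE (or bracketed 'safe' CURIE) stands for."""
    # A safe CURIE is the same thing wrapped in square brackets.
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]

    # Split on the first colon only: the reference part may contain more.
    prefix, sep, reference = value.partition(":")
    if not sep:
        raise ValueError("no prefix separator in %r" % value)
    if prefix not in prefixes:
        raise ValueError("undeclared prefix %r" % prefix)

    # The value is plain concatenation; only this URI goes over the wire.
    return prefixes[prefix] + reference


print(expand_curie("xxx:37b", PREFIXES))
print(expand_curie("[foaf:knows]", PREFIXES))
```

The CURIE itself never leaves the document; anything that dereferences  
does so with the expanded URI.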

>    In this connection we find the prose about CURIEs in the current
>    RDFA spec. [2] troubling. The implication that CURIEs can be used
>    in existing URI-only contexts is made explicit in one of the
>    examples therein [3]:
>
>      <link about="[_:a]" rel="foaf:knows" href="[_:b]" />
>
>    and more generally there by the fact that the DTD for XHTML+RDFa
>    defines several of its _new_ attributes, e.g. 'resource' and
>    'about', as containing URI references.

Yes, if the semantic web hadn't used this ambiguous shortened form for  
URIs we wouldn't be walking this difficult line. But clearly in contexts  
where people use URIs, for consistency (and good engineering reasons) they  
will also want to be able to use CURIEs. So we chose a syntax that clearly  
delineates them from URIs, but still looks like something recognisable.  
However, we only provide the datatype in this spec. How language designers  
use it is up to them.

>    One can imagine an alternative proposal which made clear that it
>    was only addressing the need for an abbreviated URI format in
>    non-XML languages, or new XML languages, or new contexts within old
>    XML languages, where _only_ such abbreviated forms are
>    allowed. That is, a position taken _against_ any possibility the
>    CURIEs might be used where URIs are called for in XML languages
>    today. It would though have to acknowledge the possible negative
>    consequences of success in going down this path, namely that
>    ordinary users will not understand that 'safe' CURIEs
>    ([xxx:yyy]-form CURIEs) are not a universal alternative to URIs,
>    and will start using them in existing languages where URIs are
>    expected, causing tools to break and users to be frustrated.
>    All of this adds up to saying: please consider _very_ carefully
>    whether the use cases/candidate requirements you have for the
>    'safe' CURIE, i.e. a CURIE that can be used in an XML language
>    where a URI can also be used, are really compelling. We note in
>    this regard that we are aware of no requests for an analogous
>    form for QNames.

The only reason they are in there is because we had compelling use cases  
for them.

> 4) Have you considered that if you get what you've asked for, you
>    won't have (everything) you need?  That is, have you considered
>    that being able to write xxx:37b and have that treated as
>    "http://www.example.com/feeds/thursday.xml#37b" will _not_ make
>    that a useable URI?  '37b' is not an NCName, so the URI is not a
>    valid shorthand XPointer.  '37b' is not an XML Name, so is not a
>    valid value for an ID-typed attribute, and so cannot be an anchor
>    in a valid XML document.

Sure. We spent a lot of time investigating how to define the syntax of  
CURIEs so that the expansion was a valid URI, but it is provably not  
possible. So all you can say is "The expansion must be a valid URI".

In fact whether a URI is valid or not is in many cases a dynamic property,  
such as the example you gave, since it depends on the media type of the  
returned resource. You don't know whether "#37b" is OK until you have the  
media type of "http://www.example.com/feeds/thursday.xml".
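
To make the point concrete, here is a small sketch of the one check that  
can be done statically: whether the fragment would be acceptable as a  
shorthand XPointer if the resource turned out to be XML. The helper name  
is made up for the example, and the pattern is a deliberately simplified,  
ASCII-only approximation of NCName, not the full XML production.

```python
import re
from urllib.parse import urldefrag

# Simplified, ASCII-only approximation of the XML NCName production
# (an assumption for illustration; the real production allows many
# non-ASCII characters as well).
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def fragment_is_ncname(uri):
    """True if the URI's fragment looks like an (ASCII) NCName."""
    fragment = urldefrag(uri).fragment
    return NCNAME_ASCII.fullmatch(fragment) is not None


base = "http://www.example.com/feeds/thursday.xml"
print(fragment_is_ncname(base + "#37b"))  # starts with a digit
print(fragment_is_ncname(base + "#b37"))
```

Even a negative result here only matters for media types whose fragment  
syntax requires an NCName; for other media types "#37b" may be perfectly  
fine, which is why the check cannot be settled by the CURIE syntax alone.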

>    You may say that this is not your problem, but by allowing, even
>    encouraging, the use of CURIEs of this form, you are encouraging
>    people to deploy broken data.

I'm afraid I don't follow your reasoning.

Best wishes,

Steven Pemberton
For the XHTML2 WG
Received on Wednesday, 16 April 2008 13:34:28 UTC