[ISSUE 56] CURIE last call draft might need stronger admonition on using CURIEs where URIs are expected

First of all, these comments are my own and so do not represent any 
official or unofficial position of the TAG at this point. 

I have taken a quick look at the Last Call CURIE Syntax draft of 6 May 
2008 [1].  One of the issues discussed is the possible use of CURIEs in 
situations where URIs or IRIs would otherwise be expected.  The draft 
introduces a notion of "safe-CURIE", which is a CURIE wrapped in square 
brackets, like this: [sample:12345].  It also says:

<fromLastCallDraft>
"In some cases language designers will want to use both URIs and CURIEs as 
the value of an attribute. For example, in XHTML+RDFa [XHTMLRDFa] the 
about attribute allows a URI to be specified that some metadata is 
"about", but it is also be useful to abbreviate this URI, using the 
compact syntax. However, the problem is that it is not possible for the 
language parser to be completely sure whether it has located a CURIE or a 
URI. For example, a resource could be specified as follows:

<p rel="foaf:homePage" about="http://www.example.org/home.html">home</p>

There is no way to be sure that this is a normal URI, or a CURIE. 
Therefore the syntax for carrying a CURIE when there is any possibility of 
ambiguity is to enclose the CURIE in square brackets [...]
</fromLastCallDraft>

I don't think this is strong enough.  It doesn't clearly indicate whether 
it is or isn't OK to use CURIEs in languages where a URI is currently 
required.  Also, the phrase "possibility of ambiguity" is sufficiently 
vague that it will be hard for designers of specifications that use CURIEs 
to tell whether they are conforming.  Must the [xxx:yyy] syntax be used 
around all CURIEs in fields that otherwise accept URIs or IRIs, or only if 
the particular CURIE string is also syntactically legal as a URI?  I 
therefore think the draft should include something along the following 
lines:

<proposed>
CURIEs and safe-CURIEs map to IRIs, but neither a CURIE nor a safe-CURIE 
<italic>is</italic> an IRI or URI.  Accordingly, CURIEs and safe-CURIEs 
MUST NOT be used as values for attributes that are specified to contain 
only URIs, IRIs, URI-references, IRI-references, etc.   Specifications for 
particular attribute values or other content MAY be written to allow 
either CURIEs or IRIs (or URIs, etc.).  The specifications for such 
languages MUST provide rules for disambiguation in situations where the 
same string could be interpreted as either a CURIE or an IRI.  One way to 
do this is to require that all CURIEs be expressed as safe-CURIEs, 
implying that all unbracketed strings are to be interpreted as IRIs.
</proposed>
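
As a rough illustration of the last rule in the proposed text, the 
following Python sketch treats every bracketed value as a safe-CURIE and 
every unbracketed value as an IRI.  The function name, the prefix table and 
the "sample" namespace are assumptions made up for this example; nothing 
here is taken from the draft, and a real language would define its own 
prefix bindings and error handling.

```python
from typing import Dict, Tuple

# Hypothetical prefix bindings; a host language would supply its own.
PREFIXES: Dict[str, str] = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sample": "http://example.org/sample/",  # invented for illustration
}


def resolve(value: str, prefixes: Dict[str, str] = PREFIXES) -> Tuple[str, str]:
    """Classify an attribute value that may hold an IRI or a safe-CURIE.

    Bracketed values are expanded as CURIEs; everything else is passed
    through untouched as an IRI, so no guessing is ever needed.
    """
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        prefix, sep, reference = value[1:-1].partition(":")
        if not sep or prefix not in prefixes:
            raise ValueError(f"unresolvable safe-CURIE: {value!r}")
        return ("curie", prefixes[prefix] + reference)
    return ("iri", value)


if __name__ == "__main__":
    for v in ("[foaf:homePage]", "foaf:homePage",
              "http://www.example.org/home.html"):
        print(v, "->", resolve(v))
```

Under this rule the unbracketed string foaf:homePage is never expanded: it 
is read as an IRI whose scheme happens to be "foaf", which is exactly the 
ambiguity that the bracket requirement removes.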

What's important in the proposed text is that (1) it allows IRI/CURIE 
mixing only where specifications explicitly provide for it, and thus not 
in existing specifications, and (2) it makes clear that detailed rules for 
disambiguation are needed.  I hope these comments are helpful.

Noah

[1] http://www.w3.org/TR/2008/WD-curie-20080506/

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Monday, 4 August 2008 15:42:35 UTC