curie compatibility, scope, timeline from Dan Connolly on 2005-10-28 (public-rdf-in-xhtml-tf@w3.org from October 2005)

From: Dan Connolly <connolly@w3.org>
Date: Thu, 27 Oct 2005 21:44:12 -0500
To: public-rdf-in-xhtml task force <public-rdf-in-xhtml-tf@w3.org>
Message-Id: <1130467452.27261.683.camel@dirk>

This sort of example raises all sorts of questions, for me.

  Find out more about <a href="[wiki:Thales]">Thales</a>
  -- http://www.w3.org/2001/sw/BestPractices/HTML/2005-10-27-CURIE

Practically, there's a lot of software that assumes the value
of an href attribute is a URI/IRI reference... with every
good reason: the spec says so.

If you assume a string is a URI, you can parse it quick-n-dirty
style, using, e.g. this regex from appendix B of the URI standard:

      ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?

      -- http://www.ietf.org/rfc/rfc3986.txt

The resulting parse is:

scheme:  [wiki
auth:
path:  Thales]
query:
frag:

Now '[' isn't allowed in a scheme name, but software that assumes
it's dealing with a URI reference isn't going to check for that.
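
A minimal sketch of that quick-n-dirty parse, in Python purely for
illustration. The regex is the one from appendix B; the names
quick_parse and scheme_is_valid are made up for this example, and the
scheme check follows the grammar in section 3.1 of RFC 3986, which is
exactly the check that quick-n-dirty software tends to skip:

```python
import re

# Regex from appendix B of RFC 3986, copied verbatim.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Scheme grammar from RFC 3986 section 3.1:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def quick_parse(ref):
    """Split a string into URI components without validating any of them.

    Hypothetical helper: every part of the regex is optional, so it
    matches any input and never reports an error.
    """
    m = URI_RE.match(ref)
    return {
        "scheme": m.group(2),
        "auth": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "frag": m.group(9),
    }


def scheme_is_valid(scheme):
    """Return True only if the scheme matches the RFC 3986 grammar."""
    return scheme is not None and SCHEME_RE.fullmatch(scheme) is not None


parts = quick_parse("[wiki:Thales]")
print(parts["scheme"], parts["path"])     # the bogus scheme and the path
print(scheme_is_valid(parts["scheme"]))   # the extra check flags it
```

Components that are absent come back as None here, where the listing
above leaves them blank.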

A quick check with Firefox shows that Firefox uses a different
algorithm. It treats [wiki:Thales] as a relative URI reference;
it combines it with a base of

 file:///home/connolly/,curi.html

to get

 file:///home/connolly/%5Bwiki:Thales%5D

So the user never gets any feedback that the reference is
being misinterpreted.
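
The same outcome can be approximated with Python's urllib.parse; this
is only a sketch of generic relative-reference resolution, not a
description of Firefox's actual algorithm. The percent-encoding step
and the choice of safe characters are assumptions made to mirror the
result shown above:

```python
from urllib.parse import quote, urljoin

base = "file:///home/connolly/,curi.html"
ref = "[wiki:Thales]"

# '[' cannot start a scheme, so urljoin treats the whole string as a
# relative path and merges it with the base, replacing the last segment.
resolved = urljoin(base, ref)

# Percent-encode the brackets; ':' and '/' are kept as-is (an assumption
# chosen to match the address shown above).
escaped = quote(resolved, safe=":/")

print(escaped)
```

Nothing in this path raises an error or warning, which is the point:
the bracketed string is silently resolved as if it were an ordinary
relative reference.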

Now these CURIEs are designed for use in XHTML 2... it's
one thing to make new features invisible to old software
so that only users that upgrade get a benefit. But it's
something else entirely to assume that this construct
will somehow never get fed into old/current software.

It seems much more straightforward to use a different
attribute from href for this sort of reference.

But even that design doesn't seem worth pursuing, to me.
Making up a new form of URI reference seems to be stretching
the scope of the charter of all the relevant groups.
I hope you're prepared to get review from the URI Interest
group, the TAG, etc.

TAG's position on QNames in content seems relevant.
Any WD on curies is going to need to answer these points,
eventually...

[[
6 Architectural Statement
In so far as the identification mechanism of the Web is the URI and
QNames are not URIs, it is a mistake to use a QName for identification
when a URI would serve.

That said, the TAG recognizes that there are sometimes pragmatic reasons
for chosing short, lexical representations of more complex names and
accepts that QNames are an established mechanism for doing so. Further,
it must be observed that some things are identified by QNames: element
and attribute names, types in W3C XML Schema, etc.

Where there is a compelling reason to use QNames instead of URIs for
identification, it is imperative that specifications provide a mapping
between QNames and URIs, if such a mapping is possible.

Finally, we observe that a whole class of interpretation problems can be
avoided if the use of QNames can be restricted to contexts where their
identification is natural and unambiguous (element and attribute names,
simple content of type xs:QName, etc.) and we encourage developers to
employ such restrictions wherever possible.
]]
 -- http://www.w3.org/2001/tag/doc/qnameids.html

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

Received on Friday, 28 October 2005 02:44:15 UTC