Re: Objection to Debate Scheduling [rdfmsQnameUriMapping-6, qnameAsId-18, URIGoodPractice-40] from Dan Connolly on 2006-04-26 (www-tag@w3.org from April 2006)

From: Dan Connolly <connolly@w3.org>
Date: Wed, 26 Apr 2006 15:22:02 -0400
To: "Mark Birbeck" <mark.birbeck@x-port.net>
Cc: <steve@w3.org>, <swick@w3.org>, <djweitzner@w3.org>, <www-tag@w3.org>, <public-rdf-in-xhtml-tf@w3.org>, "'Ben Adida'" <ben@mit.edu>, <Vincent.Quint@inrialpes.fr>, <schreiber@cs.vu.nl>, <ht@inf.ed.ac.uk>, <em@w3.org>, <dwood@tucanatech.com>
Message-Id: <d2f25ff22f15340ce6c4247d22e7a08b@w3.org>
On Apr 21, 2006, at 8:24 AM, Mark Birbeck wrote:

> Vincent,
>
> Great to hear about the 'organisation of a discussion on a new topic',
> albeit indirectly. But it seems an odd way to approach it--that 'the  
> TF/WG
> will be represented so that the issue is not misrepresented'. Perhaps  
> the
> seriousness of the problem that CURIEs is trying to solve hasn't been  
> fully
> conveyed, but regardless of RDFA and XHTML 2's need for CURIEs there  
> is a
> *serious* problem within W3C specifications in the promiscuous use of  
> QNames
> in places where they are inappropriate.
>
> So, I would suggest that CURIEs is discussed in the context of the  
> *already
> existing* QName problem, and not as some kind of upstart looking to  
> rock the
> boat, and in that context I recommend to anyone who might be involved  
> in
> that discussion that they look at the references below.

Yes, thanks for the summary.

Meanwhile, note that the already existing QName problem has been
discussed in www-tag before; there are two relevant issues...

http://www.w3.org/2001/tag/issues.html#rdfmsQnameUriMapping-6
http://www.w3.org/2001/tag/issues.html#qnameAsId-18

which are both currently closed; they were addressed by...

Using Qualified Names (QNames) as Identifiers in XML Content
TAG Finding 17 March 2004
http://www.w3.org/2001/tag/doc/qnameids.html

I pointed this out in the RDF/XHTML task force last October.
http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2005Oct/0071

Perhaps it's time to reconsider these issues...

>
> The references break down roughly into two groups; the first group is  
> of
> comments that motivate the need for CURIEs, and that group includes my
> original proposal. The second group are those that also address the  
> problems
> with using QNames as a mechanism for abbreviating URIs, which in broad
> summary are:
>
>  * there are URIs that cannot be abbreviated by
>    using QNames;

While we're collecting history, this is a postponed issue from the
RDF Core WG:

Issue rdfms-qnames-cant-represent-all-uris: The RDF XML syntax cannot
represent all possible Property URI's.
Raised Wed, 14 Feb 2001 by Graham Klyne
http://www.w3.org/2000/03/rdf-tracking/#rdfms-qnames-cant-represent-all-uris


>  * QNames is a syntax for expressing XML elements
>    and attributes, and not for expressing URIs.

Well, when QNames are used for something other than abbreviations
for URIs, we bump up against item #1 of web architecture, i.e.

"Identify with URIs

To benefit from and increase the value of the World Wide Web, agents  
should provide URIs as identifiers for resources."
  -- http://www.w3.org/TR/2004/REC-webarch-20041215/#pr-use-uris

as a corollary, we have...

"QName Mapping

A specification in which QNames serve as resource identifiers MUST  
provide a mapping to URIs."
http://www.w3.org/TR/2004/REC-webarch-20041215/#qname-mapping
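
To make the "mapping to URIs" requirement concrete, here is a minimal
sketch of the concatenation-style mapping that RDF uses (namespace name
followed by local name). The function name expand_qname, the prefix
table, and the namespace URI http://example.org/iptc/ are all
hypothetical, chosen only for illustration; other specifications may
define a different mapping.

```python
def expand_qname(qname, prefixes):
    """Map a prefix:local name to a URI by simple concatenation.

    This is only a sketch of one possible mapping (the RDF-style one);
    `prefixes` is a hypothetical table from prefix to namespace URI.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError("no prefix in %r" % qname)
    if prefix not in prefixes:
        raise KeyError("unbound prefix %r" % prefix)
    # Namespace name + local name gives the full URI.
    return prefixes[prefix] + local


# Hypothetical namespace binding, for illustration only.
PREFIXES = {"iptc": "http://example.org/iptc/"}
uri = expand_qname("iptc:_15002000", PREFIXES)
```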


>  It
>    is therefore inappropriate and confusing to use
>    the 'object type' in other specifications as a
>    way to scope things (e.g., XPath functions) or
>    abbreviate URIs (e.g., RDF-related standards).

I don't understand what you mean by that.

> This second point is particularly important and in my view is the  
> discussion
> that the W3C in some form or another should be having; there are a many
> situations that have absolutely nothing to do with XML that have  
> adopted
> QNames, and each time it is done it further muddies the waters.
>
> To summarise:
>
>  * a mechanism is needed in many different specs
>    to abbreviate URIs that is independent of XML,
>    and this mechanism should be able to cope with
>    *all* URIs.

Hmm... that's not clear to me. First, we already have relative URI
references, which are independent of XML.

Second, it's not clear why further abbreviation mechanisms shouldn't
be specific to the language in which they occur. There are usability
reasons that argue for sharing across languages that should be
considered when evaluating designs, but it's not at all clear that a
"must work with all formats" requirement at the start is justified.

Third, as to "*all* URIs," it's already the case that some URIs are
better than others; i.e. some invariants hold for lots of URIs but not
for some edge cases.
For example, the usual case is:

   wrt("http://example/a/b/", "http://example/a/b/c/d", "c/d")
   join("http://example/a/b/", "c/d") = "http://example/a/b/c/d"

i.e.
   wrt(from, to, path)
   join(from, path) = to

but this only works for "well behaved" URIs, i.e. not
http://example/a/b/../c/d .
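
The invariant can be checked with a short sketch. Here join is
Python's urllib.parse.urljoin, while wrt is a hypothetical helper
written just for this illustration: it assumes both URIs share scheme
and authority, and it ignores query and fragment. It is not a general
relativization algorithm.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def join(base, path):
    """Resolve a relative reference against a base URI."""
    return urljoin(base, path)


def wrt(base, target):
    """Hypothetical helper: a relative path from base to target.

    Sketch only: assumes same scheme and authority, and ignores
    query and fragment components.
    """
    b, t = urlsplit(base), urlsplit(target)
    if (b.scheme, b.netloc) != (t.scheme, t.netloc):
        return target
    base_dir = posixpath.dirname(b.path) or "/"
    # relpath normalizes dot segments, which is exactly why the
    # round trip fails for targets that contain "..".
    return posixpath.relpath(t.path, base_dir)


def round_trips(base, target):
    """True if join(base, wrt(base, target)) gives back target."""
    return join(base, wrt(base, target)) == target


base = "http://example/a/b/"
good = round_trips(base, "http://example/a/b/c/d")
bad = round_trips(base, "http://example/a/b/../c/d")
```

In this sketch the first target round-trips and the second does not,
because resolving any relative reference removes the ".." segment
rather than reproducing it.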

There's also the fact that l and 1 look alike, so using them in a URI
introduces transcription error risks. People used to laugh about this
issue until the phishers used Unicode characters that look alike to
defraud banking customers.

The relevant TAG issue is
  http://www.w3.org/2001/tag/issues.html#URIGoodPractice-40
and it's still open. We haven't figured out a nice set of rules for
avoiding all the relevant snags.

But in this case...

[[
The IPTC's taxonomy has a set of subject codes for news articles; to  
pick an example, the code 15002000 represents alpine skiing. The IPTC  
would like to be able to represent these codes in a convenient form in  
their documents, in such a way that they are not only compact, but it's  
also easy for news organisations to add their own codes. The obvious  
choice was to use QNames for this, since they allow different  
organisations to adopt their own namespaces to qualify the values. But  
as with our ISBN example, iptc:15002000 is not a valid QName.
]]
  -- http://internet-apps.blogspot.com/2005/10/curies-compact-uri-syntax-semantic.html

A pattern I have seen used a number of times for making URI fragments
out of numerals is to put an XML name start character such as '_' in
front of them. iptc:_15002000 is a valid QName.
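
A rough way to see the difference is the check below. It is a
simplified sketch: the pattern covers only the ASCII subset of the
NCName production from Namespaces in XML (the real production admits
many more Unicode characters), and the names is_qname and
to_local_name are hypothetical.

```python
import re

# ASCII-only approximation of NCName: starts with a letter or '_',
# followed by letters, digits, '.', '-' or '_'. The full XML
# production allows many non-ASCII characters as well.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
QNAME = re.compile(r"(?:%s:)?%s" % (NCNAME, NCNAME))


def is_qname(text):
    """True if text looks like a QName under the ASCII approximation."""
    return QNAME.fullmatch(text) is not None


def to_local_name(code):
    """Prefix '_' when a code would not otherwise start a name."""
    return code if re.fullmatch(NCNAME, code) else "_" + code


plain = is_qname("iptc:15002000")
prefixed = is_qname("iptc:" + to_local_name("15002000"))
```

Under this approximation the plain numeric form is rejected and the
underscore-prefixed form is accepted.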

I think it's worthwhile keeping this option in mind while evaluating
proposals like CURIEs, especially since URIs are so widely used and
hence the coordination costs of finding an abbreviation mechanism that
works across all formats are considerable.

> It would obviously be great if the W3C was able to coordinate the
> standardisation of such a mechanism since a 'de facto' standard already
> exists 'in the wild'...on Wikis, in server configuration files, in  
> XPath
> function names, in XML schema datatypes, in RDF/XML, in SPARQL, in  
> WRL, and
> so on. (See my blog entry below.) CURIEs is the proposal that I came  
> up with
> as part of a discussion with the IPTC (who have quite rightly forced  
> this
> issue onto the agenda) but it may not be the best one. The unfortunate  
> thing
> is that I haven't seen any other proposals to solve the QName issue  
> yet.
>
> Anyway, here are the links:
>
> The proposal itself:
>
>   <http://www.w3.org/2001/sw/BestPractices/HTML/2005-10-21-curie>
>
> Motivation for it:
>
>
>   <http://internet-apps.blogspot.com/2005/10/curies-compact-uri-syntax-semantic.html>
>
> Explanation of how to get out of the QName-abuse problem by using  
> CURIEs:
>
>    
>   <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2005Nov/0049>
>
>   <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2005Nov/0021.html>
>
> Norman Walsh replied to the second of the above two links saying that,
> provided CURIEs use the same namespace prefix mechanism as QNames, his
> concerns would be reduced (he still has other concerns though):
>
>    
>   <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2005Nov/0023>
>
> Observation that WRL has come to the same conclusion (that QNames are
> inappropriate as a format for abbreviated URIs), and in place of  
> QNames uses
> 'serialised QNames':
>
>    
>   <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2006Jan/0006>
>
> Comment by John Cowan saying that he has no problem with CURIEs:
>
>   <http://lists.w3.org/Archives/Public/public-xml-core-wg/2006Mar/0061>
>
> Regards,
>
> Mark
>
>
-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Wednesday, 26 April 2006 19:21:44 UTC