
Re: Clarification of charmod-uri

From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
Date: Thu, 02 May 2002 12:19:24 +0100
To: Dan Connolly <connolly@w3.org>
Cc: Aaron Swartz <me@aaronsw.com>, Jeremy Carroll <jjc@hplb.hpl.hp.com>, RDF Core <w3c-rdfcore-wg@w3.org>
At 11:24 AM 5/1/02 -0500, Dan Connolly wrote:
>It seems to me that we're no longer talking about
>URIs-as-specified-in-RFC2396, but instead URIs-as-used-and-implemented.

I think that RFC 2396 can be read either way on this.  In section 2.1, the 
discussion of "URI character sequence" and "original character sequence" 
can be taken as saying that a URI can be a sequence of "URI characters" or 
"original characters".
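To make the section 2.1 distinction concrete, here is a rough sketch of the original characters -> octets -> URI characters mapping as I read it. The function name and the example.org URI are mine, purely for illustration. As I understand it, RFC 2396 does not itself fix the character-to-octet step, so the charset is passed in as an explicit assumption.

```python
from urllib.parse import quote

def to_uri_characters(original: str, charset: str) -> str:
    """Map original characters -> octets -> URI characters (%hh escapes).

    Hypothetical helper: the charset argument is an assumption, since the
    character-to-octet mapping is treated here as a choice left to the caller.
    """
    octets = original.encode(charset)   # original characters -> octets
    return quote(octets, safe="/:")     # octets -> URI characters; "/" and ":" kept as-is

original = "http://example.org/caf\u00e9"      # illustrative URI containing e-acute
print(to_uri_characters(original, "utf-8"))    # e-acute becomes %C3%A9
print(to_uri_characters(original, "latin-1"))  # e-acute becomes %E9
```

The same original character sequence gives two different URI character sequences depending on the charset chosen, which is part of why the question of what counts as "the URI" matters for RDF.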

I used to think that the intent was that for inclusion in computer data, 
URI characters should always be used, but I can see a point of view that 
Unicode characters in XML are little different from handwritten 
hieroglyphics insofar as qualifying as "original characters".

My point is that if one accepts all this, then using full Unicode in RDF 
URIs is not extending RFC 2396 or stretching our charter.  It's a big "if", 
and like you I'm not always sure of this, but in the CC/PP work we got a 
lot of pressure from I18N folks to go with this kind of treatment of URIs.

So if this is definitely the direction of URIs, and if RDF developers can 
swallow this interpretation without too much pain, it seems better to make 
the decision sooner rather than later.  This was the basis on which I 
supported the decision.  What would persuade me to vote the other way?

- sufficiently compelling protests from the RDF developer community, 
especially those with live deployed software

- a compelling argument or broad consensus that, according to RFC 2396, a 
true URI consists only of a limited set of US-ASCII characters.

FWIW, on the URI list, Roy Fielding has suggested that he sees just one 
thing from the IRI proposal that might be incorporated into the URI 
standard - a default interpretation of %hh as describing a UTF-8 encoding.
- http://lists.w3.org/Archives/Public/uri/2002May/0006.html
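For what it's worth, a default UTF-8 interpretation of %hh might look something like the sketch below. This is only my illustration of the idea, not anything taken from the IRI proposal or from Roy's message; the function name and the example.org URIs are hypothetical.

```python
from typing import Optional
from urllib.parse import unquote

def decode_escapes_as_utf8(uri: str) -> Optional[str]:
    """Interpret %hh escapes as UTF-8 octets (the assumed default).

    Returns the decoded character sequence, or None when the escaped
    octets are not valid UTF-8 (for example, a Latin-1 encoded URI).
    """
    try:
        return unquote(uri, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None

print(decode_escapes_as_utf8("http://example.org/caf%C3%A9"))  # valid UTF-8 escapes
print(decode_escapes_as_utf8("http://example.org/caf%E9"))     # not valid UTF-8, gives None
```

Under this reading, escapes produced from some other charset simply fail to decode, so existing URIs stay usable as opaque character sequences but have no defined "original characters".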


Graham Klyne
Received on Thursday, 2 May 2002 07:11:06 UTC
