
Re: URI terminology demystified (I18N details)

From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
Date: Thu, 20 Sep 2001 18:40:31 +0100
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Cc: w3c-rdfcore-wg@w3.org

At 05:04 PM 9/20/01 +0100, Jeremy Carroll wrote:

> > FWIW, I'm having a separate discussion with Martin Duerst about this issue
> > with respect to CC/PP (an application of RDF);  Martin seems to think the
> > XML system identifier rules should apply to URI values in RDF -- I'm
> > pressing for clarity about why this is so, given that URIs per se cannot
> > contain non-US-ASCII characters.
> >
>I think this is a no-brainer from an internationalization point of view.
>When a non-English speaker wishes to write a meaningful rdf:about or
>rdf:ID value, they will use non-US-ASCII characters.
>Since URIs are US-ASCII, somewhere someone has to do the conversion, and
>the %HH encoding of UTF-8 is the correct conversion to do.
>It is necessary for a spec to say who does the conversion, and given
>that RDF/XML is meant to be (barely) end user readable, it should be in
>their language. Hence the RDF/XML processor needs to do the conversion.
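
The conversion described in the quoted text can be sketched as follows. This is only an illustration, assuming Python's standard urllib.parse.quote as the escaping routine; the function name to_uri_chars, the example.org value, and the choice to leave '#' and '%' untouched are assumptions made here, not anything taken from the RDF or XML specs.

```python
from urllib.parse import quote

# Characters left as-is: the RFC 2396 reserved set and "mark" characters,
# plus '#' (fragment delimiter) and '%' (existing escapes). Treating '#'
# and '%' as safe is an assumption made for this sketch.
SAFE = ";/?:@&=+$,-_.!~*'()#%"


def to_uri_chars(reference: str) -> str:
    """Escape every character outside SAFE (and outside ASCII letters and
    digits) as the %HH form of its UTF-8 octets."""
    return quote(reference, safe=SAFE, encoding="utf-8")


# Hypothetical rdf:about value written with non-US-ASCII characters.
print(to_uri_chars("http://example.org/résumé#größe"))
# -> http://example.org/r%C3%A9sum%C3%A9#gr%C3%B6%C3%9Fe
```

A value that is already pure US-ASCII URI syntax passes through unchanged, so under this sketch the conversion only affects what the author wrote in their own language.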

I have no problem with that position.  I just don't think it's clear from 
the XML spec (for system identifiers), the XML namespace spec (for 
namespace URIs), or the current RDF spec (for URI-valued attributes, etc.).

I think that when one says a piece of text has URI syntax, and that it may 
also contain non-US-ASCII characters, then the latter has to be stated very 
clearly.  This is not completely at odds with RFC2396, which talks about 
characters -> octets -> URI-character mappings.  But I do think that when 
one talks of URIs in data streams, what is usually meant is a sequence of 
URI-characters;  i.e. the US-ASCII subset used by URIs.
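
To make the characters -> octets -> URI-characters chain concrete, here is a rough sketch. The names URI_CHARS, is_uri_character_sequence and escape_character are hypothetical, the character set is a simplified reading of RFC 2396 ('#' and '%' are admitted without checking that '%' is followed by two hex digits), and nothing here is a full URI parser.

```python
import string

# Simplified US-ASCII repertoire for URI references: alphanumerics, the
# RFC 2396 reserved and "mark" characters, plus '%' and '#'.
# Assumption: '%' is accepted without validating the two hex digits after it.
URI_CHARS = set(string.ascii_letters + string.digits + ";/?:@&=+$,-_.!~*'()%#")


def is_uri_character_sequence(text: str) -> bool:
    """True if text uses only the US-ASCII subset listed in URI_CHARS."""
    return all(ch in URI_CHARS for ch in text)


def escape_character(ch: str, charset: str = "utf-8") -> str:
    """Map one character to octets in the given charset, then map each
    octet to a %HH escape (three URI characters per octet)."""
    octets = ch.encode(charset)                      # character -> octets
    return "".join(f"%{o:02X}" for o in octets)     # octets -> URI characters


print(is_uri_character_sequence("http://example.org/caf%C3%A9"))  # True
print(is_uri_character_sequence("http://example.org/café"))       # False
print(escape_character("é"))                                      # %C3%A9
print(escape_character("é", charset="latin-1"))                   # %E9
```

The last two lines show why the character-to-octet step has to be stated explicitly: the same character yields different URI-character sequences depending on the charset chosen.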


Graham Klyne                    MIMEsweeper Group
Strategic Research              <http://www.mimesweeper.com>
Received on Thursday, 20 September 2001 14:01:28 UTC
