
Re: URI terminology demystified

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Thu, 20 Sep 2001 14:24:41 +0100
Message-ID: <3BA9EE19.BFC38FFB@hplb.hpl.hp.com>
To: w3c-rdfcore-wg@w3.org

Hmmm, I was just examining the XML specs concerning system identifiers



Your quote from the old RDF spec:

Dan Connolly wrote:
>   Note: Although non-ASCII characters in URIs are not allowed by [URI], [XML]
>   specifies a convention to avoid unnecessary incompatibilities in extended URI
>   syntax. Implementors of RDF are encouraged to avoid further incompatibility and
>   use the XML convention for system identifiers. Namely, that a non-ASCII character
>   in a URI be represented in UTF-8 as one or more bytes, and then these bytes be
>   escaped with the URI escaping mechanism (i.e., by converting each byte to %HH,
>   where HH is the hexadecimal notation of the byte value).
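
For concreteness, the convention in the quoted note could be sketched roughly as below. This is only an illustration of the rule as quoted (UTF-8 encode each non-ASCII character, then write each byte as %HH); the function name escape_non_ascii and the example.org URIs are made up for the example, and the sketch deliberately leaves every ASCII character, including any existing "%", untouched.

```python
def escape_non_ascii(uri: str) -> str:
    """Escape only the non-ASCII characters of a URI reference.

    Hypothetical helper: each non-ASCII character is encoded as UTF-8
    and every resulting byte is written as %HH (uppercase hex).
    ASCII characters are copied through unchanged.
    """
    out = []
    for ch in uri:
        if ord(ch) < 0x80:
            # ASCII: left as-is in this sketch
            out.append(ch)
        else:
            # Non-ASCII: one or more UTF-8 bytes, each escaped as %HH
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(escape_non_ascii("http://example.org/caf\u00e9"))
```

A URI that is already pure US-ASCII passes through unchanged, so applying the function twice gives the same result as applying it once.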

This seems to be a misinterpretation of the XML spec, which the erratum
We should, IMO, hence go along with the clarification, and the RDF/XML
processor is responsible for escaping non-permitted characters in

I also note that this is consistent with our test case:



which has not been approved, seems to suggest the following:

1: IDs are subject to the same URI encoding rule.
2: N-triple URIs are in US-ASCII and must be already encoded.

These seem like good things.

Dan - do you know about namespace declarations? 
    - are the URIs in Unicode (needing escaping) or US-ASCII?

Received on Thursday, 20 September 2001 09:20:31 UTC
