Re: URI terminology demystified

Hmmm, I was just examining the XML specs concerning system identifiers
...

See:

http://www.w3.org/XML/xml-V10-2e-errata#E4


Your quote from the old RDF spec:

Dan Connolly wrote:
> 
>   Note: Although non-ASCII characters in URIs are not allowed by [URI],
>   [XML] specifies a convention to avoid unnecessary incompatibilities
>   in extended URI syntax. Implementors of RDF are encouraged to avoid
>   further incompatibility and use the XML convention for system
>   identifiers. Namely, that a non-ASCII character in a URI be
>   represented in UTF-8 as one or more bytes, and then these bytes be
>   escaped with the URI escaping mechanism (i.e., by converting each
>   byte to %HH, where HH is the hexadecimal notation of the byte value).
> 


This seems to be a misinterpretation of the XML spec, which the erratum
clarifies.
IMO we should therefore go along with the clarification: the RDF/XML
processor is responsible for escaping non-permitted characters in
URI-refs.
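
To make the escaping step concrete, here is a rough Python sketch of
what I understand the processor would do for the non-ASCII case
described in the quoted note. The function name and the example URI are
made up for illustration, and the sketch covers only non-ASCII
characters; the erratum should be consulted for any other characters it
treats as disallowed.

```python
def escape_non_ascii(uri_ref: str) -> str:
    """Escape each non-ASCII character as the %HH form of its UTF-8 bytes.

    Hypothetical helper, not taken from any spec or library. ASCII
    characters, including existing %HH escapes, are passed through as-is.
    """
    out = []
    for ch in uri_ref:
        if ord(ch) < 128:
            # ASCII: left untouched in this sketch.
            out.append(ch)
        else:
            # Non-ASCII: encode as UTF-8, then write each byte as %HH.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# Made-up example URI-ref containing one non-ASCII character (U+00E9).
print(escape_non_ascii("http://example.org/caf\u00e9"))
```

Under this reading, a URI-ref that is already pure US-ASCII comes out
unchanged, so applying the step twice is harmless.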



I also note that this is consistent with our test case:

http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.nt

http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.rdf


This test case, which has not been approved, seems to suggest the following:

1: IDs are subject to the same URI encoding rule.
2: N-triple URIs are in US-ASCII and must be already encoded.

These seem like good things.

Dan - do you know about namespace declarations?
    - are the URIs there in Unicode (needing escaping) or US-ASCII?

Jeremy

Received on Thursday, 20 September 2001 09:20:31 UTC