- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Thu, 20 Sep 2001 14:24:41 +0100
- To: w3c-rdfcore-wg@w3.org
Hmmm, I was just examining the XML specs concerning system identifiers
....
See:
http://www.w3.org/XML/xml-V10-2e-errata#E4
Your quote from the old RDF spec:
Dan Connolly wrote:
>
> Note: Although non-ASCII characters in URIs are not allowed by [URI],
> [XML] specifies a convention to avoid unnecessary incompatibilities in
> extended URI syntax. Implementors of RDF are encouraged to avoid further
> incompatibility and use the XML convention for system identifiers.
> Namely, that a non-ASCII character in a URI be represented in UTF-8 as
> one or more bytes, and then these bytes be escaped with the URI escaping
> mechanism (i.e., by converting each byte to %HH, where HH is the
> hexadecimal notation of the byte value).
>
This seems to be a misinterpretation of the XML spec, which the erratum
clarifies.
We should therefore, IMO, go along with the clarification, so that the
RDF/XML processor is responsible for escaping non-permitted characters in
URI-refs.
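
For illustration, the escaping step described in the quoted note could be sketched roughly as below. This is a minimal sketch, not taken from either spec: the function name escape_uri_ref and the example URI are made up, and it handles only non-ASCII characters (any ASCII characters that are also not permitted in URIs are left untouched here).

```python
def escape_uri_ref(uri_ref: str) -> str:
    """Escape non-ASCII characters in a URI reference.

    Hypothetical helper: each non-ASCII character is encoded as UTF-8,
    and every resulting byte is written as %HH (two hex digits).
    ASCII characters are passed through unchanged.
    """
    out = []
    for ch in uri_ref:
        if ord(ch) < 0x80:
            # Plain US-ASCII: keep as-is in this sketch.
            out.append(ch)
        else:
            # Non-ASCII: one %HH escape per UTF-8 byte.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


# Example with a made-up URI containing U+00E9 (UTF-8 bytes C3 A9).
print(escape_uri_ref("http://example.org/caf\u00e9"))
```

Under this reading, the RDF/XML document itself may carry the unescaped character, and the processor applies the escaping before handing the URI-ref on.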
I also note that this is consistent with our test case:
http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.nt
http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.rdf
which has not been approved, but seems to suggest the following:
1: IDs are subject to the same URI encoding rule.
2: N-triple URIs are in US-ASCII and must already be encoded.
These seem like good things.
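
To make the two points concrete, here is a rough sketch of how a processor might turn an ID into an N-triples URI. The base URI, the ID value, and the function name id_to_ntriples_uri are all invented for illustration and are not taken from the test case; the only assumptions carried over are that the ID is escaped by the same rule as any other URI-ref, and that the N-triples output must be pure US-ASCII.

```python
def id_to_ntriples_uri(base: str, rdf_id: str) -> str:
    """Build an N-triples URI term from a base URI and an ID (hypothetical helper)."""
    # Form the URI-ref as base + "#" + ID.
    uri = f"{base}#{rdf_id}"
    # Apply the same escaping rule: UTF-8 encode, then %HH per byte.
    escaped = "".join(
        ch if ord(ch) < 0x80
        else "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))
        for ch in uri
    )
    # Point 2: whatever reaches N-triples must already be US-ASCII.
    assert all(ord(ch) < 0x80 for ch in escaped)
    return f"<{escaped}>"


# Made-up example: an ID containing U+00FC (UTF-8 bytes C3 BC).
print(id_to_ntriples_uri("http://example.org/doc", "D\u00fcrst"))
```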
Dan - do you know how this applies to namespace declarations?
- are the URIs there in Unicode (needing escaping) or US-ASCII?
Jeremy
Received on Thursday, 20 September 2001 09:20:31 UTC