- From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- Date: Thu, 20 Sep 2001 15:33:19 +0100
- To: Dan Connolly <connolly@w3.org>
- Cc: Jeremy Carroll <jjc@hplb.hpl.hp.com>, w3c-rdfcore-wg@w3.org
FWIW, I'm having a separate discussion with Martin Duerst about this issue
with respect to CC/PP (an application of RDF); Martin seems to think the XML
system identifier rules should apply to URI values in RDF -- I'm pressing for
clarity about why this is so, given that URIs per se cannot contain
non-US-ASCII characters. (I think part of the motivation is to prepare the
ground for deployment of IRIs.)

I had been planning to report the outcome of the discussion back to this
group, as it relates to some wording in the existing RDF spec.

#g
--

At 09:11 AM 9/20/01 -0500, Dan Connolly wrote:
>Jeremy Carroll wrote:
> >
> > Hmmm, I was just examining the XML specs concerning system identifiers
> > ....
> >
> > See:
> >
> > http://www.w3.org/XML/xml-V10-2e-errata#E4
> >
> > Your quote from the old RDF spec:
> >
> > Dan Connolly wrote:
> > >
> > > Note: Although non-ASCII characters in URIs are not allowed by [URI],
> > > [XML] specifies a convention to avoid unnecessary incompatibilities in
> > > extended URI syntax. Implementors of RDF are encouraged to avoid further
> > > incompatibility and use the XML convention for system identifiers.
> > > Namely, that a non-ASCII character in a URI be represented in UTF-8 as
> > > one or more bytes, and then these bytes be escaped with the URI escaping
> > > mechanism (i.e., by converting each byte to %HH, where HH is the
> > > hexadecimal notation of the byte value).
> > >
> >
> > This seems to be a misinterpretation of the XML spec, which the erratum
> > clarifies.
>
>Strictly speaking, it's not; system identifiers only occur
>in things like <!ENTITY ...> declarations. The value of
>an rdf:resource attribute isn't a system identifier (unless
>we change RDF 1.0 to say that it is for some reason).
>
> > We should, IMO, hence go along with the clarification, and the RDF/XML
> > processor is responsible for escaping non-permitted characters in
> > URI-refs.
>
>It's not XML 1.0 that compels us to go with the
>Unicode->URI escaping in resource/about/ID,
>but the history of HTML 4.0 href, the text from RDF 1.0
>excerpted above, the precedent of the XLink REC (xlink:href),
>and the recent opinion of the I18N WG expressed
>in the charmod spec.
>
> > I also note that this is consistent with our test case:
> >
> > > http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.nt
> >
> > > http://www.w3.org/2000/10/rdf-tests/rdfcore/rdfms-difference-between-ID-and-about/test2.rdf
> >
> > which has not been approved, seems to suggest the following
> >
> > 1: ID's are subject to the same URI encoding rule.
>
>Yup. (that is: values of rdf:ID attributes.)
>
> > 2: N-triple URIs are in US-ASCII and must be already encoded.
>
>Yes; to be crystal clear: All URIs are in US-ASCII.
>URIs appear in N-triple syntax as-is, with no further encoding.
>
> > These seem like good things.
>
>Agreed.
>
> > Dan - do you know about namespace declarations?
> > - are the URIs in Unicode (needing escaping) or US-ASCII?
>
>I think namespace declarations must use URI references as-is;
>i.e. you're not allowed to put non-uri characters in them.
>This follows from
> (a) a literal reading of the namespaces REC,
> which says that the value of an xmlns attribute
> is a namespace name and a namespace name *is* URI references
> (not that they can be decoded into URI references).
> Nobody has suggested changing/clarifying this
> aspect of the namespace spec, to my knowledge.
>
> (b) my own observation that the XML infrastructure
> treats namespace names as plain old strings, and
> never decodes or otherwise mangles them (other
> than normal XML attribute value literal interpretation).
>
>It's at least worth a health-warning to say "if you
>put non-URI characters in your namespace names, LOOK OUT!
>We know of no software that's going to help you!"
>
>And it's worth a test case or two. Care to cook some up?
>
>--
>Dan Connolly, W3C http://www.w3.org/People/Connolly/

------------------------------------------------------------
Graham Klyne                    MIMEsweeper Group
Strategic Research              <http://www.mimesweeper.com>
<Graham.Klyne@MIMEsweeper.com>
------------------------------------------------------------
Received on Thursday, 20 September 2001 10:41:07 UTC
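
For illustration only, the escaping convention quoted from the RDF 1.0 note in the message above (represent each non-ASCII character in UTF-8, then write each resulting byte as %HH) might be sketched in Python roughly as follows. This is not code from the thread: the function name escape_non_ascii and the example.org URI are hypothetical, and the sketch deliberately leaves ASCII characters untouched, so it does not deal with other characters that are not permitted in URIs (spaces, for instance).

```python
def escape_non_ascii(uri_ref: str) -> str:
    """Escape non-ASCII characters as UTF-8 bytes written in %HH form.

    Hypothetical helper: ASCII characters, including any existing
    %HH escapes, are copied through unchanged.
    """
    out = []
    for ch in uri_ref:
        if ord(ch) < 128:
            # US-ASCII character: keep as-is.
            out.append(ch)
        else:
            # Non-ASCII character: one or more UTF-8 bytes, each as %HH.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    # U+00E9 is the bytes C3 A9 in UTF-8.
    print(escape_non_ascii("http://example.org/r\u00e9sum\u00e9"))
```

Under this reading, a value that is already pure US-ASCII, as N-triple URIs are said to be in the message above, passes through unchanged.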