- From: Richard Tobin <richard@inf.ed.ac.uk>
- Date: Thu, 16 Feb 2006 15:33:29 +0000 (GMT)
- To: public-xml-core-wg@w3.org
I have a few comments on Francois's XRI definition:

> [Definition: *XML resource identifiers* are XML string meant to be
> used as IRI references or URI references]. System identifiers are XML
> resource identifiers.

I think we agreed to change "XML string" to "string" yesterday. The definition should probably be changed to be in the singular too: "An *XML resource identifier* is a string ...".

> An XML resource identifier may contain characters that, according to
> [IETF RFC 3987] and [IETF RFC 3986], must be escaped before the string
> can be used to retrieve the referenced resource. To convert an XML
> resource identifier to an IRI reference, the following characters must
> be escaped:
>
> * the control characters #x0 to #x1F and #x7F (most of which cannot
>   appear in XML)

Most of these *can* appear in XML 1.1 as character references. Character references cannot be used in system identifiers, but you can construct an internal entity containing a system identifier containing a control character. I suggest dropping the parenthesized comment.

> * space #x20
>
>   Note: Authors are advised to avoid unescaped spaces, as XML Schema
>   has identified them as an interoperability risk.
>
> * the delimiters < #x3C, > #x3E and " #x22
> * the unwise characters { #x7B, } #x7D, | #x7C, \ #x5C, ^ #x5E
>   and ` #x60
>
> These characters are escaped by applying to them steps 2.1 to 2.3 of
> Section 3.1 of [IETF RFC 3987].
>
> If necessary for the implementation, an IRI reference is converted to
> a URI reference by following the prescriptions of Section 3.1 of
> [IETF RFC 3987]. This conversion MUST be performed only when
> absolutely necessary and as late as possible in a processing chain. In
> particular, neither the process of converting a relative IRI to an
> absolute one nor the process of passing an IRI reference to a process
> or software component responsible for dereferencing it SHOULD trigger
> escaping.

What about the XRI->IRI escaping? Must it happen late? And I can no longer remember exactly what we're getting at here; how exactly can you tell whether the conversion was done early or late?

Also, I would prefer it if the definition actually defined which strings are legal XRIs. It's implicit that they are ones that do in fact result in IRIs after the escaping, but this should either be stated explicitly or a production should be given. This seems particularly important for Namespaces, where no escaping is in fact done.

-- Richard
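
For illustration only, the XRI-to-IRI escaping described in the quoted draft might be sketched in Python roughly as follows. The function name `xri_to_iri` and the set `ESCAPED_CODE_POINTS` are hypothetical, chosen for this sketch; the character list is one reading of the draft's bullet list, and the percent-encoding follows steps 2.1 to 2.3 of Section 3.1 of RFC 3987 (UTF-8 octets, each written as %HH). It does not check whether the result is actually a legal IRI reference.

```python
# Code points that the quoted draft lists as needing escaping
# (an assumed reading of the bullet list, not a normative table).
ESCAPED_CODE_POINTS = (
    set(range(0x00, 0x20)) | {0x7F}            # control characters
    | {0x20}                                   # space
    | {0x3C, 0x3E, 0x22}                       # delimiters < > "
    | {0x7B, 0x7D, 0x7C, 0x5C, 0x5E, 0x60}     # unwise characters { } | \ ^ `
)


def xri_to_iri(xri: str) -> str:
    """Percent-encode only the listed characters; leave everything else alone."""
    out = []
    for ch in xri:
        if ord(ch) in ESCAPED_CODE_POINTS:
            # Step 2.1: convert the character to UTF-8 octets.
            # Step 2.2: write each octet as %HH (uppercase hex).
            # Step 2.3: replace the character with that sequence.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
        else:
            # Non-ASCII characters and "%" pass through unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(xri_to_iri("http://example.org/my file{1}.xml"))
```

Note that this sketch leaves non-ASCII characters untouched, since those are only escaped in the later IRI-to-URI conversion, and that it does not answer the question of which input strings count as legal XRIs.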
Received on Thursday, 16 February 2006 15:33:35 UTC