- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Tue, 21 Aug 2007 15:34:09 +0100
- To: public-xml-core-wg <public-xml-core-wg@w3.org>
Oops -- I failed to edit the inclusion sensibly -- revised version:

We would like to suggest that the best way to move forward with our effort to reconcile the differences between the way in which various specifications in the XML family allow a superset of IRIs, and the IRI spec. itself, would be to incorporate a new section in the revision of the IRI spec. that you are currently working on, which would name and define a single concept to be referenced from all those XML specs, along the following lines:

Name (negotiable): Legacy Extended IRIs (LEIRIs)

Definition (taken from [1]):

  A Legacy Extended International Resource Identifier (LEIRI) is a sequence of Unicode characters that can be converted into an IRI by the application of a few simple encoding rules.

  To convert a Legacy Extended International Resource Identifier to an IRI reference, the following characters MUST be percent encoded:

   * the control characters #x0 to #x1F and #x7F to #x9F
   * space #x20
   * the delimiters "<" #x3C, ">" #x3E, and '"' #x22
   * the unwise characters "{" #x7B, "}" #x7D, "|" #x7C, "\" #x5C, "^" #x5E, and "`" #x60
   * characters in the Unicode private use area (#xE000-#xF8FF), except where they appear in the query part of the resulting IRI.

  These characters are percent encoded by applying [steps 2.1 to 2.3 of Section 3.1 of RFC 3987] to them.

Health Warning: We would be happy to see some text added to warn against creating new LEIRIs using most or indeed almost all of the characters allowed by this, perhaps expanding on what is already present in [1]:

  "[A]uthors of [LEIRI]s are advised to percent encode space characters themselves, rather than rely on the processor to do so, because spaces are often used to separate [LEIRI]s in a sequence."

We would expect to go ahead and publish several specs. which are waiting for a resolution of this issue, e.g. XML Base 2e and XLink 1.1, once there is a stable and agreed-final Internet Draft of a new edition of 3987 including agreed prose along the lines given above, leaving the insertion of the final RFC number to subsequent errata.

-------------
ht

[1] http://www.w3.org/XML/2007/04/hrri/draft-walsh-tobin-hrri-01c.html

-- 
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
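The LEIRI-to-IRI conversion rules quoted in the message can be sketched roughly as follows. This is only an illustration, not text from the draft: the function name `leiri_to_iri` is hypothetical, the query part is assumed to run from the first "?" that precedes any "#" up to that "#" (or the end of the string), and steps 2.1 to 2.3 of Section 3.1 of RFC 3987 are taken to mean "encode the character as UTF-8 and write each octet as %HH in uppercase hex".

```python
# Characters that are always percent encoded, per the list in the message:
# space, the delimiters < > ", and the "unwise" characters { } | \ ^ `
ALWAYS_ENCODE = set(' <>"{}|\\^`')


def _is_control(cp: int) -> bool:
    # Control characters #x0 to #x1F and #x7F to #x9F
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def _is_private_use(cp: int) -> bool:
    # Unicode private use area #xE000-#xF8FF
    return 0xE000 <= cp <= 0xF8FF


def _percent_encode(ch: str) -> str:
    # Assumed reading of RFC 3987 Section 3.1, steps 2.1 to 2.3:
    # UTF-8 octets, each written as %HH with uppercase hex digits.
    return "".join(f"%{octet:02X}" for octet in ch.encode("utf-8"))


def leiri_to_iri(leiri: str) -> str:
    """Hypothetical helper: convert a LEIRI to an IRI reference."""
    # Assumption: the fragment starts at the first "#", and the query is
    # whatever lies between the first "?" before that point and the "#".
    frag_start = leiri.find("#")
    if frag_start == -1:
        frag_start = len(leiri)
    qmark = leiri.find("?", 0, frag_start)
    query_start = qmark + 1 if qmark != -1 else frag_start  # empty if no query

    out = []
    for i, ch in enumerate(leiri):
        cp = ord(ch)
        in_query = query_start <= i < frag_start
        if (
            ch in ALWAYS_ENCODE
            or _is_control(cp)
            or (_is_private_use(cp) and not in_query)
        ):
            out.append(_percent_encode(ch))
        else:
            # Everything else, including existing "%" escapes and other
            # non-ASCII characters, is passed through unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(leiri_to_iri("http://example.org/a b/{x}?q=\ue000#frag"))
```

Under these assumptions a space in the path becomes `%20`, while a private use character survives only inside the query part. The sketch does not validate that the result is a well-formed IRI, and it does not split whitespace-separated sequences of LEIRIs, which is the situation the health warning above is concerned with.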
Received on Tuesday, 21 August 2007 14:34:12 UTC