- From: Chris Lilley <chris@w3.org>
- Date: Wed, 23 Mar 2005 16:41:27 +0100
- To: www-i18n-comments@w3.org
Hello www-i18n-comments,

in the specification

  Character Model for the World Wide Web 1.0: Resource Identifiers
  W3C Candidate Recommendation 22 November 2004
  http://www.w3.org/TR/2004/CR-charmod-resid-20041122/

>> C060 [S] Specifications that define new syntax for URIs, such as a
>> new URI scheme or a new kind of fragment identifier, MUST specify
>> that characters outside the US-ASCII repertoire are encoded using
>> UTF-8 and %HH-escaping.

>> This is in accordance with Guidelines for new URL Schemes [RFC 2718],
>> Section 2.2.5.

While working on implementing this requirement in a specification, it was pointed out that requiring escaping for fragment identifiers, while safe, is sort of pointless.

Using a notation where capital letters represent some characters outside the repertoire of US-ASCII, then given this IRI

  http://example.org/Zfoo.bar#ABCD

what is hexified and sent to the server is

  http://example.org/Zfoo.bar

while the fragment, ABCD, is not sent to the server and is merely applied once the resource and its media type have been returned. Thus, whether the protocol is 8-bit clean is irrelevant, and whether the fragment was hexified or not is not detectable by observing the implementation.

The guidelines make good sense for other parts of the IRI, such as queries, but they do not seem to be necessary or to provide any benefit for fragments, and conformance for fragments does not seem to be testable short of reading the source code.

-- 
Chris Lilley mailto:chris@w3.org
Chair, W3C SVG Working Group
W3C Graphics Activity Lead
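
The split described in the message can be sketched in Python as follows. This is only an illustration under stated assumptions, not how any particular user agent is implemented: the function name `request_uri_and_fragment` is hypothetical, the sample IRI uses real non-ASCII characters in place of the capital-letter notation, and host (IDN) handling is left out.

```python
from urllib.parse import quote, urlsplit


def request_uri_and_fragment(iri: str) -> tuple[str, str]:
    """Hypothetical helper: return (what goes to the server, what stays local)."""
    parts = urlsplit(iri)
    # Path and query are encoded as UTF-8 and then %HH-escaped.
    request_uri = f"{parts.scheme}://{parts.netloc}{quote(parts.path, safe='/')}"
    if parts.query:
        request_uri += "?" + quote(parts.query, safe="=&")
    # The fragment is never transmitted, so it is returned untouched.
    return request_uri, parts.fragment


# Assumed sample IRI: U+00E9 in the path, two Greek letters in the fragment.
uri, fragment = request_uri_and_fragment("http://example.org/\u00e9foo.bar#\u03b1\u03b2")
print(uri)  # http://example.org/%C3%A9foo.bar
```

Escaping the returned fragment or leaving it raw changes nothing in the request, which is the sense in which the requirement is not observable from outside the implementation.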
Received on Wednesday, 23 March 2005 15:41:27 UTC