
Re: Why hexify fragments?

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 23 Mar 2005 17:27:18 +0100
To: Chris Lilley <chris@w3.org>
Cc: www-i18n-comments@w3.org
Message-ID: <424692d3.65836093@smtp.bjoern.hoehrmann.de>

* Chris Lilley wrote:
>>> C060 [S] Specifications that define new syntax for URIs, such as a
>>> new URI scheme or a new kind of fragment identifier, MUST specify
>>> that characters outside the US-ASCII repertoire are encoded using
>>> UTF-8 and %HH-escaping.
>>> This is in accordance with Guidelines for new URL Schemes [RFC 2718],
>>> Section 2.2.5.
>While working on implementing this requirement in a specification, it
>was pointed out that requiring escaping for fragment identifiers, while
>safe, is sort of pointless.

If, for example, image/svg+xml does not define how fragment identifiers
are encoded, then svg#Bj%F6rn might be legal, and it would be highly
unclear what it should match against. Equally, if the specification does
not say that the escaping is based on UTF-8, implementations might not
consider svg#Bj%C3%B6rn to match an element with id="Björn". Reversing
the %xx escaping is currently required for image/svg+xml, yet most
implementations fail to do this, so stating it explicitly is not
pointless.
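
To make the ambiguity concrete, here is a minimal Python sketch of the two
readings. The helper name fragment_matches is hypothetical and only
illustrates the comparison; it assumes the rule from C060, i.e. that
%HH-escapes are reversed and the resulting bytes are interpreted as UTF-8.

```python
from urllib.parse import quote, unquote


def fragment_matches(fragment: str, element_id: str) -> bool:
    """Hypothetical helper: reverse %HH-escaping as UTF-8, then compare.

    A fragment whose escaped bytes are not valid UTF-8 (for example the
    Latin-1 form Bj%F6rn) is treated as matching nothing.
    """
    try:
        decoded = unquote(fragment, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return decoded == element_id


# UTF-8 based escaping round-trips to the id value.
print(quote("Björn"))                          # UTF-8 plus %HH form
print(fragment_matches("Bj%C3%B6rn", "Björn"))

# The Latin-1 based form only matches if the decoder guesses Latin-1,
# which is exactly what an explicit UTF-8 rule takes off the table.
print(fragment_matches("Bj%F6rn", "Björn"))
print(unquote("Bj%F6rn", encoding="latin-1"))
```

Without a stated encoding, an implementation is free to pick either
decoding, and the same fragment can then identify different elements (or
none) in different user agents.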

>while the fragment, ABCD, is not sent to the server and is merely
>applied once the resource and its Media type have been returned. Thus,
>whether the protocol is 8-bit clean is irrelevant, and whether the
>fragment was hexified or not is not detectable by observing the

This depends on information flow requirements. What is testable is
whether the escaping is reversed, something like (in various encodings)

  <svg xmlns="http://www.w3.org/2000/svg" version="1.1"
       xmlns:xlink="http://www.w3.org/1999/xlink">
    <rect id="Björn" fill="green" width="100" height="100" />
    <rect id="test" fill="red" width="100" height="100" />
    <use xlink:href="#Bj%C3%B6rn" />
  </svg>

would do. This fails in ASV6, Batik 1.5-dev, Opera 8... For text/html
see http://lists.w3.org/Archives/Public/www-html-editor/2002OctDec/0001
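
The same check can be sketched outside a renderer. The Python below is an
illustration only, not how any of the implementations named above work:
resolve_use is a hypothetical function, the document is a completed copy
of the test case (closing tags and the xlink namespace declaration are
filled in), and the lookup is a plain scan over id attributes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Completed copy of the test case; green is the expected target.
SVG = """<svg xmlns="http://www.w3.org/2000/svg" version="1.1"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <rect id="Björn" fill="green" width="100" height="100" />
  <rect id="test" fill="red" width="100" height="100" />
  <use xlink:href="#Bj%C3%B6rn" />
</svg>"""


def resolve_use(root, unescape=True):
    """Hypothetical resolver: return the element the <use> points at.

    With unescape=True the %HH-escaping is reversed as UTF-8 before the
    id comparison; with unescape=False the raw fragment is compared.
    """
    use = root.find(f"{{{SVG_NS}}}use")
    fragment = use.get(XLINK_HREF).partition("#")[2]
    if unescape:
        fragment = unquote(fragment, encoding="utf-8")
    for element in root.iter():
        if element.get("id") == fragment:
            return element
    return None


root = ET.fromstring(SVG)
print(resolve_use(root))                   # finds the rect with id="Björn"
print(resolve_use(root, unescape=False))   # raw comparison finds nothing
```

An implementation that skips the unescaping step behaves like the second
call: the reference resolves to nothing and the red rectangle stays
visible.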
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Wednesday, 23 March 2005 16:28:35 UTC
