
Re: Why hexify fragments?

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 23 Mar 2005 17:27:18 +0100
To: Chris Lilley <chris@w3.org>
Cc: www-i18n-comments@w3.org
Message-ID: <424692d3.65836093@smtp.bjoern.hoehrmann.de>

* Chris Lilley wrote:
>>> C060 [S] Specifications that define new syntax for URIs, such as a
>>> new URI scheme or a new kind of fragment identifier, MUST specify
>>> that characters outside the US-ASCII repertoire are encoded using
>>> UTF-8 and %HH-escaping.
>
>>> This is in accordance with Guidelines for new URL Schemes [RFC 2718],
>>> Section 2.2.5.
>
>While working on implementing this requirement in a specification, it
>was pointed out that requiring escaping for fragment identifiers, while
>safe, is sort of pointless.

If, for example, image/svg+xml does not define how fragment identifiers
are encoded, then svg#Bj%F6rn might be legal, and it would be highly
unclear what it should match against. Equally, if it does not say that
the escaping is based on UTF-8, implementations might not consider
svg#Bj%C3%B6rn to match an element with id="Björn". Reversing the %xx
escaping is currently required for image/svg+xml, yet most
implementations fail to do it, so stating this explicitly is not
pointless.
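
To make the difference between the two escapings concrete, here is a
minimal Python sketch. It assumes the rule from C060, i.e. that the
escaped octets are UTF-8; the helper names decode_fragment and
fragment_matches_id are made up for illustration and are not taken from
any specification or implementation.

```python
from urllib.parse import unquote


def decode_fragment(fragment):
    """Reverse %HH-escaping and read the resulting octets as UTF-8.

    Returns None when the octets are not valid UTF-8 (an assumption of
    this sketch; a real processor might report an error instead).
    """
    try:
        return unquote(fragment, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


def fragment_matches_id(fragment, element_id):
    """True if the unescaped fragment equals the id, compared as-is."""
    decoded = decode_fragment(fragment)
    return decoded is not None and decoded == element_id


# %C3%B6 is the UTF-8 encoding of U+00F6, so this one matches.
print(fragment_matches_id("Bj%C3%B6rn", "Bj\u00f6rn"))
# %F6 is U+00F6 in ISO-8859-1 but not valid UTF-8, so this one does not.
print(fragment_matches_id("Bj%F6rn", "Bj\u00f6rn"))
```

Without a stated encoding, a processor could just as well read %F6 as
ISO-8859-1 and accept the second form, which is exactly the ambiguity
described above.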

>while the fragment, ABCD, is not sent to the server and is merely
>applied once the resource and its Media type have been returned. Thus,
>whether the protocol is 8-bit clean is irrelevant, and whether the
>fragment was hexified or not is not detectable by observing the
>implementation.

This depends on information flow requirements. What is testable is
whether the escaping is reversed. Something like the following (in
various encodings)

  <svg xmlns="http://www.w3.org/2000/svg" version="1.1"
       xmlns:xlink="http://www.w3.org/1999/xlink">
  
    <rect id="Björn" fill="green" width="100" height="100" />
    <rect id="test" fill="red" width="100" height="100" />
    <use xlink:href="#Bj%C3%B6rn" />
  
  </svg>

would do. This fails in ASV6, Batik 1.5-dev, and Opera 8, among others.
For text/html, see
http://lists.w3.org/Archives/Public/www-html-editor/2002OctDec/0001
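
For reference, the behaviour the SVG test expects can be sketched in
Python as follows. This is an illustration only, assuming UTF-8 based
%HH-escaping and a plain string comparison against id attributes; the
function name resolve_use_targets is hypothetical and it does not model
how any of the implementations above actually work.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# The test document from this message, with the id written as an escape
# so the source file encoding of this script does not matter.
TEST_SVG = """<svg xmlns="http://www.w3.org/2000/svg" version="1.1"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <rect id="Bj\u00f6rn" fill="green" width="100" height="100" />
  <rect id="test" fill="red" width="100" height="100" />
  <use xlink:href="#Bj%C3%B6rn" />
</svg>"""


def resolve_use_targets(svg_text):
    """Return (href, target element or None) for each same-document <use>."""
    root = ET.fromstring(svg_text)
    # Index every element that carries an id attribute.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    results = []
    for use in root.iter("{%s}use" % SVG_NS):
        href = use.get("{%s}href" % XLINK_NS, "")
        if not href.startswith("#"):
            continue  # only same-document references are handled here
        try:
            name = unquote(href[1:], encoding="utf-8", errors="strict")
        except UnicodeDecodeError:
            name = None  # escaped octets are not valid UTF-8
        results.append((href, by_id.get(name)))
    return results


for href, target in resolve_use_targets(TEST_SVG):
    print(href, "->", None if target is None else target.get("fill"))
```

A processor that reverses the escaping resolves the reference to the
green rectangle; one that compares the raw string "Bj%C3%B6rn" against
the ids finds nothing to render for the use element.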
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Wednesday, 23 March 2005 16:28:35 GMT
