
Re: a few URI/href issues captured with test cases

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 22 May 2009 22:28:42 +0200
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Cc: <www-tag@w3.org>, public-iri@w3.org
Message-ID: <c0qd155fp8a1q7la7j9lblqhjd5ivmk9dd@hive.bjoern.hoehrmann.de>
* Martin J. Dürst wrote:

W3C's Character Model specification defines a reference processing model
which disallows processing text differently depending on the encoding. A
format that requires something else has problems with operations such as
composition and identity transformations.

If you receive an e-mail containing a link, and you click that link, you
may get one document, and if you instead copy the link and paste it into
the address bar of your browser, you may get some other document.

Consider this example:

  <?xml version='1.0' encoding='X'?>
    <data xml:id='Bjo&#x308;rn'>A</data>
    <data xml:id='Bj&#xF6;rn'>B</data>
    <ref src='#Bjo&#x308;rn' />

Depending only on the value of X, you have two very different documents.
And if you have a document like

  <?xml version='1.0' encoding='X'?>
  <!DOCTYPE example ...>
    <ref src='&A;&B;' />

then there is no specification that explains what the value of src would
resolve to, as that may depend on X, on the encoding of the document
declaring entity A, and on the encoding of the document declaring
entity B.
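
The first example (the one with the two xml:id values) can be sketched
in a few lines of Python. This is only an illustration under a stated
assumption: that the rule under discussion applies NFC normalization
during IRI-to-URI conversion when the source is in a non-Unicode
encoding, and skips it otherwise. The function name and the boolean
parameter are hypothetical, not taken from any specification or library.

```python
import unicodedata
from urllib.parse import quote


def fragment_to_uri(fragment: str, source_is_unicode: bool) -> str:
    """Hypothetical helper: percent-encode a fragment identifier.

    Assumption: NFC is applied only when the containing document is
    in a non-Unicode encoding, which is the behavior being criticized.
    """
    if not source_is_unicode:
        fragment = unicodedata.normalize("NFC", fragment)
    # Percent-encode the UTF-8 bytes of the (possibly normalized) text.
    return quote(fragment.encode("utf-8"), safe="")


decomposed = "Bjo\u0308rn"   # 'o' followed by U+0308, as in the ref
precomposed = "Bj\u00f6rn"   # single U+00F6, as in the second xml:id

# Same character references, different result depending on "X":
as_unicode_doc = fragment_to_uri(decomposed, source_is_unicode=True)
as_legacy_doc = fragment_to_uri(decomposed, source_is_unicode=False)

print(as_unicode_doc)  # Bjo%CC%88rn
print(as_legacy_doc)   # Bj%C3%B6rn
print(as_legacy_doc == fragment_to_uri(precomposed, True))  # True
```

Under that assumption the same ref matches the first xml:id in one case
and the second xml:id in the other, with nothing changed but the
declared encoding.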

In fact, I am not entirely sure the examples are valid, as one may argue
that the implementation supports IRIs for all the relevant operations,
so you do not apply the IRI-to-URI conversion and hence never normalize.
Then you could use external resources over some URI-only protocol
instead.

What the specification does right now is trade convenience in some cases
for confusion in other cases. As I said above, a link may work from the
e-mail client directly, but copying and pasting it into the address bar
of your browser may fail, for reasons most users would not understand.

Changing the requirement from MUST to SHOULD would not change that: you
would still have it working sometimes and not working other times, at
least in theory. In practice, there are few if any implementations that
do as the specification requires.

If removing the requirement is not an option, some data on why this
trade-off is a very good one would help in finding alternatives or in
convincing the implementers to do as the specification requires.
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Friday, 22 May 2009 20:29:23 UTC
