Re: Cool IRIs & diacritics, for a change

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Sat, 05 Feb 2011 01:26:02 +0100
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Cc: www-international@w3.org
Message-ID: <jq5pk6lvarc0q2s932f7nnkcnv4lblhp4t@hive.bjoern.hoehrmann.de>
* Leif Halvard Silli wrote:
>Questions and conclusions:

>	- is the article [1] simply outdated? Have new things happened?
>	  or perhaps it doesn't speak about how to link to *filenames*?

If you used 'http' addresses in your tests, then the question is how the
browser sends the address to the server. It is then up to the server how
to interpret it, and different servers may do different things. The
server should obviously accept the NFC variant, but if it does not, then
that is the server operator's business. For 'file' addresses it is up to
the browser and the OS how they resolve them.
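
To make the NFC point concrete, here is a minimal Python sketch of what
the two normalization forms of 'å' look like once they are
percent-encoded as UTF-8. The helper name percent_encoded_forms is made
up for illustration; which form a given browser actually sends is
exactly the open question above.

```python
import unicodedata
from urllib.parse import quote


def percent_encoded_forms(text):
    """Return (NFC, NFD) percent-encoded UTF-8 forms of text.

    Hypothetical helper, for illustration only.
    """
    nfc = unicodedata.normalize("NFC", text)  # precomposed: U+00E5
    nfd = unicodedata.normalize("NFD", text)  # decomposed: 'a' + U+030A
    return quote(nfc), quote(nfd)


nfc_form, nfd_form = percent_encoded_forms("å")
print(nfc_form)  # %C3%A5
print(nfd_form)  # a%CC%8A
```

A server that compares request paths byte for byte will treat these as
two different resources unless it normalizes them first.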

>	- why does Wikipedia work, then? I suppose that a *composed*
>	  'å', such as when you type an 'å' in the URL bar,
>	  is *ambiguous*: it can be interpreted two ways, perhaps.
>	  But wikipedia has probably hardcoded 'å' (%C3%A5) to mean
>	  'å'. OTOH, I don't understand why browsers consider '%C3%A5'
>	  ambiguous when the page is UTF-8 encoded ... ???

MediaWiki uses a Unicode normalizer when available and does some things
on its own to map some kinds of user input, so it is likely that
Wikipedia normalizes page names.
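
As a rough illustration of that kind of server-side normalization (this
is not MediaWiki's actual code, and the function name and the '/wiki/'
path are assumptions), a server could decode the request path, convert
it to NFC, and re-encode it before looking the page up:

```python
import unicodedata
from urllib.parse import quote, unquote


def canonical_path(raw_path):
    """Map a percent-encoded path to its NFC percent-encoded form.

    Hypothetical sketch: decode as UTF-8, normalize, re-encode.
    """
    decoded = unquote(raw_path)  # percent-decoding, UTF-8 by default
    normalized = unicodedata.normalize("NFC", decoded)
    return quote(normalized)  # '/' is left unescaped by default


# The decomposed and precomposed spellings end up at the same path.
print(canonical_path("/wiki/Ma%CC%8Alform"))  # /wiki/M%C3%A5lform
print(canonical_path("/wiki/M%C3%A5lform"))   # /wiki/M%C3%A5lform
```

With a step like this in front of the lookup, it no longer matters which
form the browser chose to send.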
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Saturday, 5 February 2011 00:26:45 GMT