Re: Shrinking HTML5 some more — Anne’s Weblog from Julian Reschke on 2009-03-30 (public-html@w3.org from March 2009)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 30 Mar 2009 10:32:51 +0200
To: Anne van Kesteren <annevk@opera.com>
CC: "public-html@w3.org" <public-html@w3.org>
Message-ID: <49D083B3.8010004@gmx.de>

Anne van Kesteren wrote:
> On Sat, 28 Mar 2009 13:15:35 +0100, Julian Reschke 
> <julian.reschke@gmx.de> wrote:
>> That distinction doesn't make sense to me. Why normalize when the 
>> source is ISO8859-1, and not when it was UTF-8?
> 
> Per http://lists.w3.org/Archives/Public/www-style/2005May/0022.html the 
> main importance is documents encoded in Windows-1258 written in 
> Vietnamese, but it does not elaborate on why that is so. (Or why making 

It sounds like this is an edge case: that encoding could contain 
decomposed characters (a base letter followed by a combining mark), 
which would be mapped to a sequence of decomposed Unicode characters 
when the mapping is done in the simplest way.
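
To make the edge case concrete, here is a minimal sketch. It assumes 
Python's built-in "cp1258" codec as a stand-in for "the most simple" 
mapping, and the sample letter is my own choice, not taken from the 
thread:

```python
import unicodedata

# Vietnamese "ệ" as a two-part sequence: precomposed "ê" (U+00EA)
# followed by a combining dot below (U+0323). Both have a byte in
# Windows-1258, so the pair round-trips through that encoding.
decomposed = "\u00ea\u0323"
raw = decomposed.encode("cp1258")

# A plain byte-to-code-point mapping keeps the two-part sequence.
decoded = raw.decode("cp1258")

# NFC folds the sequence into the single code point U+1EC7.
normalized = unicodedata.normalize("NFC", decoded)

# The same letter arriving as precomposed UTF-8 is already in NFC.
from_utf8 = "\u1ec7".encode("utf-8").decode("utf-8")

print(len(decoded), len(normalized), normalized == from_utf8)
```

Without the normalization step, the two decoded strings would compare 
as different even though they render as the same letter.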

> IRIs work for that encoding was important but making them work for HTML, 
> CSS, etc. was not.)

I'd say they work just fine; you just need to preprocess them. Also, 
the work-in-progress revision of RFC 3987 already addresses this (at 
least partly) by introducing LEIRIs 
(<http://tools.ietf.org/html/draft-duerst-iri-bis-05#section-7>).
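
As a rough illustration of what such preprocessing could look like, 
the sketch below NFC-normalizes a string and then percent-encodes 
every non-ASCII character as UTF-8. The helper name `iri_to_uri` and 
the example address are hypothetical, and the sketch deliberately 
ignores host names (IDNA) and the extra characters LEIRIs allow, so 
it is not a full implementation of either spec:

```python
import unicodedata

def iri_to_uri(iri: str) -> str:
    """Hypothetical helper: NFC-normalize, then percent-encode
    each non-ASCII character as its UTF-8 bytes."""
    nfc = unicodedata.normalize("NFC", iri)
    out = []
    for ch in nfc:
        if ord(ch) < 128:
            # ASCII characters are passed through untouched.
            out.append(ch)
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)

# Decomposed and precomposed input end up as the same URI.
uri_a = iri_to_uri("http://example.org/vi\u00ea\u0323t")
uri_b = iri_to_uri("http://example.org/vi\u1ec7t")
print(uri_a, uri_a == uri_b)
```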

 > ...

BR, Julian

Received on Monday, 30 March 2009 08:33:38 UTC