Re: Shrinking HTML5 some more — Anne’s Weblog

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 30 Mar 2009 12:05:07 +0200
Message-ID: <49D09953.8020906@gmx.de>
To: Anne van Kesteren <annevk@opera.com>
CC: "public-html@w3.org" <public-html@w3.org>

Anne van Kesteren wrote:
>> It sounds like this is an edge case, in that that encoding could 
>> potentially contain decomposed characters, which would be mapped to a 
>> sequence of decomposed Unicode characters when the mapping is done in 
>> the most simple way.
> 
> Yes, but because of that edge case the specification has this silly 
> requirement which affects all non-Unicode encodings. Decomposed 
> characters are easy to get using character escapes.

Indeed, forgot about that.

In that case, I think it would be best for iri-bis to lift this 
requirement. It doesn't make sense that the same character sequence is 
handled differently depending on where it came from.
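
To illustrate the inconsistency, here is a minimal Python sketch. It assumes the RFC 3987 rule that text converted from a non-Unicode encoding is normalized to NFC, while text that is already Unicode (for example, produced by character escapes) is left as-is. The variable names and the sample string are made up for illustration.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: a decomposed sequence.
decomposed = "cafe\u0301"

# Path A (assumed): the characters came from a non-Unicode encoding,
# so the conversion step normalizes them to NFC.
from_legacy = unicodedata.normalize("NFC", decomposed)

# Path B (assumed): the characters came from character escapes in a
# document that is already Unicode, so no normalization is applied.
from_escapes = decomposed

# Same logical input, two different code point sequences.
print([hex(ord(c)) for c in from_legacy])
print([hex(ord(c)) for c in from_escapes])
print(from_legacy == from_escapes)
```

Both paths start from the same character sequence, yet only one of them ends up with the precomposed U+00E9.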

>>> IRIs work for that encoding was important but making them work for 
>>> HTML, CSS, etc. was not.)
>>
>> I'd say they work just fine; you just need to preprocess them.
> 
> The preprocessing you need to do involves converting the input to a URI 
> which seems highly suboptimal.

You could also preprocess to an IRI.

Anyway, the actual processing is the same, so what we're really 
discussing is simply how and where it's defined.
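
For reference, the core of the IRI-to-URI step can be sketched in a few lines of Python. This is a simplified illustration, not the full RFC 3987 algorithm: it only percent-encodes non-ASCII characters as UTF-8 octets and ignores host name (IDNA) handling. The function name `iri_to_uri_sketch` and the example URLs are hypothetical.

```python
from urllib.parse import quote


def iri_to_uri_sketch(iri: str) -> str:
    """Percent-encode every non-ASCII character as its UTF-8 octets.

    ASCII characters, including existing %XX escapes, pass through
    unchanged. Host names are NOT converted to punycode here.
    """
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="")
        for ch in iri
    )


# Precomposed and decomposed input yield different URIs, because no
# normalization happens in this step.
precomposed = iri_to_uri_sketch("http://example.org/caf\u00e9")
decomposed = iri_to_uri_sketch("http://example.org/cafe\u0301")
print(precomposed)
print(decomposed)
```

Whether the result is treated as a URI or kept as an IRI with the same escaping applied later, the octets that go over the wire are the same.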

>> And also, the work-in-progress revision of RFC 3987 already addresses 
>> this (at least partly), by introducing LEIRIs 
>> (<http://tools.ietf.org/html/draft-duerst-iri-bis-05#section-7>).
> 
> LEIRIs are not a solution.

Please elaborate.

BR, Julian
Received on Monday, 30 March 2009 10:05:49 GMT