RE: [CSS21] uri() from Ian Hickson on 2005-05-07 (public-i18n-core@w3.org from April to June 2005)

From: Ian Hickson <ian@hixie.ch>
Date: Sat, 7 May 2005 06:26:13 +0000 (UTC)
To: Martin Duerst <duerst@it.aoyama.ac.jp>
Cc: Addison Phillips <addison.phillips@quest.com>, Richard Ishida <ishida@w3.org>, www-style@w3.org, public-i18n-core@w3.org
Message-ID: <Pine.LNX.4.61.0505070622290.19319@dhalsim.dreamhost.com>

On Wed, 4 May 2005, Martin Duerst wrote:
>> 
>> This would be nigh-on-impossible for UAs to sanely implement because it 
>> would require character encoding information to be propagated through 
>> the implementation into parts of the code that are completely unrelated 
>> to the parsing of the original document (e.g. the DOM code).
> 
> The original character encoding is part of the Infoset.

First, CSS has no infoset; second, the infoset is a highly theoretical 
construct that doesn't exist in real browsers. So I don't think this is 
very relevant to the discussion. :-)

> Well, the IRI RFC, as well as the IDN-related RFCs, are still in Draft 
> Stage. Feedback on what parts are easy or difficult to implement is 
> definitely welcome. I can imagine that a future version of these specs 
> would e.g. contain some provision for small devices to not do the 
> normalization/nameprep, if there is enough feedback in this direction.

My main feedback would be: Please don't make anything dependent on the 
encoding that the URI is found in. Internally, in Web browsers, things are 
implemented such that the incoming data stream is decoded before it is 
parsed, after which point the original encoding information is lost, and 
all the data is in UTF-16 (or sometimes UTF-8). Anything that requires 
that the original encoding information be used to change the behaviour of 
later processing will most likely not be reliably implemented.
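
A minimal sketch of the problem, in Python rather than real browser code. 
The file name, the choice of ISO-8859-1 as the legacy encoding, and the 
variable names are all assumptions made for illustration. It shows that 
two stylesheets in different encodings decode to the same string, so any 
later step that needs the original encoding has nothing to go on:

```python
from urllib.parse import quote

# Hypothetical stylesheet fragment, served in two different encodings.
css_source = "url(caf\u00e9.png)"
bytes_latin1 = css_source.encode("iso-8859-1")
bytes_utf8 = css_source.encode("utf-8")

# On the wire the two documents differ...
wire_differs = bytes_latin1 != bytes_utf8

# ...but the decoder runs before the parser, and its output is plain
# Unicode text with no record of which encoding it came from.
text_a = bytes_latin1.decode("iso-8859-1")
text_b = bytes_utf8.decode("utf-8")
same_after_decoding = text_a == text_b

# Later code (e.g. whatever resolves the URI) only ever sees the string.
# Escaping via UTF-8 needs no extra information:
path = "caf\u00e9.png"
escaped_utf8 = quote(path, encoding="utf-8")

# Escaping "in the document's encoding" would need the charset passed
# alongside every string, which is the propagation problem described above.
escaped_latin1 = quote(path, encoding="iso-8859-1")

print(wire_differs, same_after_decoding)
print(escaped_utf8, escaped_latin1)
```

The two escaped forms differ, so a rule that depends on the document 
encoding gives different URIs for what is, after decoding, identical text.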

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Saturday, 7 May 2005 06:26:27 UTC