[whatwg] Default encoding to UTF-8? from Anne van Kesteren on 2012-04-03 (public-whatwg-archive@w3.org from April 2012)

From: Anne van Kesteren <annevk@opera.com>
Date: Tue, 03 Apr 2012 21:08:41 +0200
Message-ID: <op.wb7d4rdh64w2qv@annevk-macbookpro.local>

On Tue, 03 Apr 2012 13:59:25 +0200, Henri Sivonen <hsivonen at iki.fi> wrote:
> On Wed, Jan 4, 2012 at 12:34 AM, Leif Halvard Silli
> <xn--mlform-iua at xn--mlform-iua.no> wrote:
>>> A solution that would border on reasonable would be decoding as
>>> US-ASCII up to the first non-ASCII byte
>>
>> Thus possibly prescan of more than 1024 bytes?
>
> I didn't mean a prescan.  I meant proceeding with the real parse and
> switching decoders in midstream. This would have the complication of
> also having to change the encoding the document object reports to
> JavaScript in some cases.

On IRC (#whatwg), zcorpan pointed out that this would break URLs where
entities are used to encode non-ASCII code points in the query component.
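
A rough sketch of the problem, as I understand it (not taken from the IRC
discussion). It assumes the query component is percent-encoded using the
document's encoding after character references are resolved, and that
windows-1252 is the legacy default. The helper names `first_non_ascii` and
`encode_query` are made up for illustration, and `encode_query` is a
simplification of what a real URL parser does:

```python
from html import unescape
from urllib.parse import quote


def first_non_ascii(data: bytes) -> int:
    """Return the index of the first byte >= 0x80, or -1 if the data is pure ASCII."""
    for i, b in enumerate(data):
        if b >= 0x80:
            return i
    return -1


def encode_query(raw_query: str, document_encoding: str) -> str:
    """Hypothetical helper: resolve character references, then percent-encode
    non-ASCII characters using the document's encoding (simplified)."""
    return quote(unescape(raw_query), safe="=&", encoding=document_encoding)


# The markup is pure ASCII, so a decoder that only switches on the first
# non-ASCII byte never sees a reason to switch.
markup = b'<a href="/search?q=caf&eacute;">search</a>'
print(first_non_ascii(markup))                        # -1

# The same link produces different bytes on the wire depending on which
# encoding the document ends up with.
print(encode_query("q=caf&eacute;", "windows-1252"))  # q=caf%E9
print(encode_query("q=caf&eacute;", "utf-8"))         # q=caf%C3%A9
```

A server that expects %E9 for such a link would receive %C3%A9 instead if an
all-ASCII page silently ended up as UTF-8.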


-- 
Anne van Kesteren
http://annevankesteren.nl/

Received on Tuesday, 3 April 2012 12:08:41 UTC