
[whatwg] Default encoding to UTF-8?

From: Anne van Kesteren <annevk@opera.com>
Date: Tue, 03 Apr 2012 21:08:41 +0200
Message-ID: <op.wb7d4rdh64w2qv@annevk-macbookpro.local>

On Tue, 03 Apr 2012 13:59:25 +0200, Henri Sivonen <hsivonen at iki.fi> wrote:
> On Wed, Jan 4, 2012 at 12:34 AM, Leif Halvard Silli
> <xn--mlform-iua at xn--mlform-iua.no> wrote:
>>> A solution that would border on reasonable would be decoding as
>>> US-ASCII up to the first non-ASCII byte
>> Thus possibly prescan of more than 1024 bytes?
> I didn't mean a prescan.  I meant proceeding with the real parse and
> switching decoders in midstream. This would have the complication of
> also having to change the encoding the document object reports to
> JavaScript in some cases.
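
A rough sketch of the idea quoted above, to make the discussion concrete. It is not any browser's implementation: the function name, the "try UTF-8, else windows-1252" choice after the switch, and the use of a whole-buffer decode instead of a streaming parser are all assumptions made for illustration.

```python
def decode_ascii_then_switch(data: bytes, fallback: str = "windows-1252"):
    """Decode as US-ASCII up to the first non-ASCII byte, then switch decoders.

    Returns (text, encoding_reported). Hypothetical helper, not a real API.
    """
    # Index of the first byte outside the ASCII range, or len(data) if none.
    split = next((i for i, b in enumerate(data) if b >= 0x80), len(data))
    head = data[:split].decode("ascii")
    if split == len(data):
        # No non-ASCII byte was ever seen, so nothing forced a decision.
        return head, "us-ascii"
    tail = data[split:]
    try:
        # Assumption: prefer UTF-8 when the remainder is valid UTF-8.
        return head + tail.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: otherwise fall back to a legacy default.
        return head + tail.decode(fallback, errors="replace"), fallback


print(decode_ascii_then_switch(b"abc"))                   # ('abc', 'us-ascii')
print(decode_ascii_then_switch("café".encode("utf-8")))   # ('café', 'utf-8')
print(decode_ascii_then_switch(b"caf\xe9"))               # ('café', 'windows-1252')
```

The second element of the result changes depending on where (and whether) a non-ASCII byte shows up, which is the complication Henri mentions about the encoding the document object reports to JavaScript.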

On IRC (#whatwg), zcorpan pointed out that this would break URLs where
entities are used to encode non-ASCII code points in the query component.
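
To illustrate, here is a minimal sketch of how I read the objection. The markup below is pure ASCII at the byte level, so no non-ASCII byte ever triggers a decoder switch, yet the query still contains a non-ASCII code point once the entity is expanded. The helper name and the sample URL are made up, and the percent-encoding step is a simplification of what a real URL parser does.

```python
import html
from urllib.parse import quote


def query_as_sent(href_attr: str, document_encoding: str) -> str:
    """Expand entities in an href value, then percent-encode the query
    using the document's encoding. Hypothetical helper for illustration."""
    url = html.unescape(href_attr)          # "&eacute;" becomes U+00E9
    path, sep, query = url.partition("?")
    # Simplification: keep "=" and "&" literal, encode everything else
    # that is not unreserved with the given encoding.
    return path + sep + quote(query, safe="=&", encoding=document_encoding)


source = b'<a href="search?q=caf&eacute;">'
print(all(b < 0x80 for b in source))                          # True
print(query_as_sent("search?q=caf&eacute;", "windows-1252"))  # search?q=caf%E9
print(query_as_sent("search?q=caf&eacute;", "utf-8"))         # search?q=caf%C3%A9
```

The server receives different bytes depending on which encoding the document ends up with, even though the page itself never contained a non-ASCII byte.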

Anne van Kesteren
Received on Tuesday, 3 April 2012 12:08:41 GMT
