[whatwg] Default encoding to UTF-8? from Jukka K. Korpela on 2011-12-06 (public-whatwg-archive@w3.org from December 2011)

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Tue, 06 Dec 2011 23:27:11 +0200
Message-ID: <4EDE88AF.6030308@cs.tut.fi>

2011-12-06 22:58, Leif Halvard Silli wrote:

> There is now a bug, and the editor says the outcome depends on "a
> browser vendor to ship it":
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=15076
>
> Jukka K. Korpela Tue Dec 6 00:39:45 PST 2011
>
>> what is this proposed change to defaults supposed to achieve. [?]
>
> I'd say the same as in XML: UTF-8 as a reliable, common default.

The "bug" was filed with the following argument:
"It would be nice to minimize number of declarations a page needs to 
include."

That is, author convenience - so that authors could work sloppily and 
produce documents that could fail on user agents that haven't 
implemented this change.
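
To illustrate the failure mode (a minimal sketch, not taken from the bug 
report): assume a hypothetical page saved as UTF-8 with no encoding 
declaration, and a user agent that still defaults to windows-1252. The 
sample text and variable names are made up for the example.

```python
# Hypothetical page body, saved as UTF-8 and served with no declaration.
page_bytes = "<p>päivää</p>".encode("utf-8")

# A user agent whose default is UTF-8 decodes the page as intended.
as_utf8 = page_bytes.decode("utf-8")

# A user agent that still defaults to windows-1252 (cp1252 in Python)
# decodes the same bytes without error, but every "ä" becomes two
# unrelated characters.
as_legacy = page_bytes.decode("cp1252")

print(as_utf8)
print(as_legacy)
```

The second decoding raises no error, so the user simply sees garbage.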

This sounds more absurd than I can describe.

XML was created as a new data format; it was an entirely different issue.

>> If there's something that should be added to or modified in the
>> algorithm for determining character encoding, then I'd say it's error
>> processing. I mean user agent behavior when it detects, [...]
>
> There is already an (optional) detection step in the algorithm - but UAs
> treat that step differently, it seems.

I'm afraid I can't find it - I mean the treatment of a document for 
which some encoding has been deduced (say, directly from HTTP headers) 
and which then turns out to violate the rules of the encoding.
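
The situation meant here can be sketched as follows, assuming the encoding 
is taken from a Content-Type header value and the body is then decoded 
strictly. The function name decode_declared and the sample data are 
hypothetical; the sketch only detects the violation and says nothing about 
what a user agent should do next, which is the open question.

```python
from email.message import Message


def decode_declared(body: bytes, content_type: str):
    """Decode body with the charset named in a Content-Type value.

    Returns (text, None) on success or (None, reason) on failure.
    """
    # Reuse the standard library's header parser to extract the charset.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()
    if charset is None:
        return None, "no charset declared"
    try:
        # Strict decoding raises as soon as the bytes break the rules
        # of the declared encoding.
        return body.decode(charset, errors="strict"), None
    except LookupError:
        return None, f"unknown charset {charset!r}"
    except UnicodeDecodeError as exc:
        return None, f"invalid data for {charset} at byte offset {exc.start}"


# Body actually encoded as windows-1252, but declared as UTF-8.
body = "päivää".encode("cp1252")
text, error = decode_declared(body, "text/html; charset=UTF-8")
print(text, error)
```

Here the declaration and the data disagree, and strict decoding reports 
the first offending byte instead of silently producing wrong characters.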

Yucca

Received on Tuesday, 6 December 2011 13:27:11 UTC