- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Mon, 12 Oct 2009 10:42:47 +0300
- To: Mark Davis ☕ <mark@macchiato.com>
- Cc: Ian Hickson <ian@hixie.ch>, Larry Masinter <masinter@adobe.com>, Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Phillips, Addison" <addison@amazon.com>, Andrew Cunningham <andrewc@vicnet.net.au>, Richard Ishida <ishida@w3.org>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
On Oct 12, 2009, at 07:14, Mark Davis ☕ wrote:

> • Test if the bytes are valid UTF-8. If they are, return
> that encoding, with the confidence tentative, and abort these steps.
> • [include note about UTF-8 patterns, maybe reworded a bit.]
> • The user agent may attempt to autodetect the character encoding
> [include rest of #5]

So you are suggesting making UTF-8 autodetection mandatory while leaving the rest of chardet optional? Do any of the top five browsers do that?

> • Otherwise, return an implementation-defined or user-specified
> default character encoding, with the confidence tentative. Due to
> its widespread use as a default in legacy content, windows-1252 is
> recommended as a default in the absence of other information.

I think it would be useful to include a table listing the locales for which browsers traditionally ship with a non-Windows-1252 default, together with the default encoding for each.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
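
For illustration only, the order of steps in the quoted proposal (mandatory UTF-8 validity test first, then an implementation-defined default) could be sketched roughly as below. The function name `sniff_encoding` and its `default` parameter are hypothetical and not taken from any spec text or browser; the optional chardet step is deliberately left out, and the sketch assumes the whole byte stream is available rather than a truncated prefix.

```python
from typing import Tuple


def sniff_encoding(data: bytes, default: str = "windows-1252") -> Tuple[str, str]:
    """Return (encoding, confidence) for a byte stream with no declared encoding.

    Hypothetical helper: mirrors only the two steps quoted in the thread.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to the implementation-defined or
        # user-specified default, still with tentative confidence.
        return default, "tentative"
    # Valid UTF-8 (this includes pure ASCII input).
    return "utf-8", "tentative"


if __name__ == "__main__":
    print(sniff_encoding("Dürst".encode("utf-8")))
    print(sniff_encoding("Dürst".encode("windows-1252")))
```

A locale-dependent default would be passed in through `default`; which encoding belongs to which locale is exactly the table asked for above, so none is assumed here.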
Received on Monday, 12 October 2009 07:43:26 UTC