
Re: HTML5 Issue 11 (encoding detection): I18N WG response...

From: Mark Davis ☕ <mark@macchiato.com>
Date: Sun, 11 Oct 2009 21:28:32 -0700
Message-ID: <30b660a20910112128r27cfe9e0ied68a231508fb5d8@mail.gmail.com>
To: Ian Hickson <ian@hixie.ch>
Cc: Larry Masinter <masinter@adobe.com>, Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Phillips, Addison" <addison@amazon.com>, Andrew Cunningham <andrewc@vicnet.net.au>, Richard Ishida <ishida@w3.org>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
FYI, for the Unicode conference I updated the Unicode Growth chart with
newer data. It continues the trends we found about 15 months earlier.

http://www.macchiato.com/main/updated-unicode-growth

Mark


On Sun, Oct 11, 2009 at 21:14, Mark Davis ☕ <mark@macchiato.com> wrote:

> I'm a little late to this discussion, so please forgive me if I'm covering
> ground people have already discussed.
>
> But focusing on advice to developers, I'd suggest replacing 6 and 7 in
> http://dev.w3.org/html5/spec/Overview.html#determining-the-character-encoding,
> by the following 3 numbered items.
>
>    - Test if the bytes are valid UTF-8. If they are, return that
>    encoding, with the confidence <http://dev.w3.org/html5/spec/Overview.html#concept-encoding-confidence>
>    *tentative*, and abort these steps.
>       - *[include note about UTF-8 patterns, maybe reworded a bit.]*
>    - The user agent may attempt to autodetect the character encoding *[include
>    rest of #5]*
>    - Otherwise, return an implementation-defined or user-specified default
>    character encoding, with the confidence <http://dev.w3.org/html5/spec/Overview.html#concept-encoding-confidence>
>    *tentative*. Due to its widespread use as a default in legacy content,
>    windows-1252 is recommended as a default in the absence of other
>    information.
>
>
> Mark
>
>
>
> On Sun, Oct 11, 2009 at 19:57, Ian Hickson <ian@hixie.ch> wrote:
>
>> On 11 Oct 2009, at 18:39, Larry Masinter <masinter@adobe.com> wrote:
>>
>>> Can someone please explain, again, why the discussion of default
>>> configurations of a particular category of user agent in various
>>> regions belongs in the definition of the HyperText Markup Language?
>>>
>>> What benefit can any author of a web page derive, please, from
>>> knowing the default settings of various browsers in products
>>> sold into various language environments?
>>>
>>
>> Authors aren't the only target audience of this specification.
>> Implementors benefit from advice suggesting default encodings. Users benefit
>> from consistency in implementations.
>>
>> --
>> Ian Hickson
>>
>>
>
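
The three replacement steps proposed in the quoted message (try UTF-8, then optional autodetection, then a default) can be sketched roughly as follows. This is only an illustration under stated assumptions, not text from the spec: the function name `sniff_encoding`, the `autodetect` callback, and the tuple return shape are hypothetical, and the sketch checks the whole buffer rather than a bounded prefix.

```python
from typing import Callable, Optional, Tuple

# Hypothetical helper illustrating the proposed fallback order.
# Returns (encoding name, confidence); every result here is "tentative".
def sniff_encoding(
    data: bytes,
    autodetect: Optional[Callable[[bytes], Optional[str]]] = None,
    default: str = "windows-1252",
) -> Tuple[str, str]:
    # Step 1: if the bytes decode cleanly as UTF-8, return UTF-8.
    # Note that pure ASCII input also passes this test.
    try:
        data.decode("utf-8", errors="strict")
        return ("utf-8", "tentative")
    except UnicodeDecodeError:
        pass

    # Step 2: the user agent may attempt autodetection.
    # `autodetect` is an assumed, caller-supplied heuristic.
    if autodetect is not None:
        guess = autodetect(data)
        if guess:
            return (guess, "tentative")

    # Step 3: otherwise fall back to the implementation-defined or
    # user-specified default (windows-1252 is the one suggested above).
    return (default, "tentative")
```

For example, `sniff_encoding("café".encode("utf-8"))` takes the first branch, while `sniff_encoding(b"caf\xe9")` fails the UTF-8 test and, with no detector supplied, falls through to the default. A real user agent would typically examine only a bounded prefix of the stream, in which case a multi-byte sequence cut off at the boundary would need separate handling.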
Received on Monday, 12 October 2009 04:29:12 GMT
