[Bug 7381] New: Clarify default encoding wording and add some examples for non-latin locales.

From: <bugzilla@wiggum.w3.org>
Date: Thu, 20 Aug 2009 07:32:26 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-7381-2486@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=7381

           Summary: Clarify default encoding wording and add some examples
                    for non-latin locales.
           Product: HTML WG
           Version: unspecified
          Platform: PC
               URL: http://dev.w3.org/html5/spec/Overview.html#determining-the-character-encoding
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec bugs
        AssignedTo: dave.null@w3.org
        ReportedBy: mjs@apple.com
         QAContact: public-html-bugzilla@w3.org
                CC: ian@hixie.ch, mike@w3.org, public-html@w3.org


Step 7 of the encoding algorithm says:

"Otherwise, return an implementation-defined or user-specified default
character encoding, with the confidence tentative. In non-legacy environments,
the more comprehensive UTF-8 encoding is recommended. Due to its use in legacy
content, windows-1252 is recommended as a default in predominantly Western
demographics instead. Since these encodings can in many cases be distinguished
by inspection, a user agent may heuristically decide which to use as a
default."
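
As a rough illustration of the "distinguished by inspection" remark above, the
following Python sketch picks between the two defaults by checking whether the
available bytes are valid UTF-8. The function name guess_default_encoding is
hypothetical, and the strict-decode test is only one possible heuristic; it is
not taken from the spec text.

```python
def guess_default_encoding(data: bytes) -> str:
    """Pick a tentative default for bytes with no declared encoding.

    Assumption: anything that decodes as strict UTF-8 is treated as UTF-8,
    and everything else falls back to windows-1252.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Invalid UTF-8 byte sequence: assume legacy Western content.
        return "windows-1252"
    return "utf-8"


if __name__ == "__main__":
    print(guess_default_encoding("café".encode("utf-8")))
    print(guess_default_encoding("café".encode("cp1252")))
```

Pure ASCII input passes the UTF-8 check, which is harmless because both
encodings agree on ASCII. A user agent that inspects only a prefix of the
stream would also need to tolerate a multi-byte sequence cut off at the end
of that prefix, which this sketch does not do.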

The I18N WG suggests wording along these lines:

"Otherwise, return an implementation-defined or user-specified default
character encoding, with the confidence tentative. The UTF-8 encoding is
recommended as a default. The default may also be set according to the
expectations and predominant legacy content encodings for a given demographic
or audience. For example, windows-1252 is recommended as the default encoding
for Western European language environments. Other encodings may also be used.
For example, "windows-949" might be an appropriate default in a Korean language
runtime environment."
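
One way to picture the suggested wording is a per-locale lookup with UTF-8 as
the general default. The Python sketch below is illustrative only: the table
name LEGACY_DEFAULTS, the function default_encoding_for, and the particular
list of Western European language codes are assumptions, not part of the
proposed text. Only the windows-1252 and windows-949 examples come from the
wording itself.

```python
# Hypothetical set of Western European primary language subtags.
WESTERN_EUROPEAN = ("en", "fr", "de", "es", "it", "pt", "nl")

# Hypothetical table of legacy defaults keyed by primary language subtag.
LEGACY_DEFAULTS = {lang: "windows-1252" for lang in WESTERN_EUROPEAN}
LEGACY_DEFAULTS["ko"] = "windows-949"  # Korean example from the wording


def default_encoding_for(locale_tag: str, fallback: str = "utf-8") -> str:
    """Return a tentative default encoding label for a locale tag.

    Accepts tags such as "ko-KR" or "en_US"; only the primary language
    subtag is consulted. Unknown locales get the fallback (UTF-8).
    """
    primary = locale_tag.replace("_", "-").split("-")[0].lower()
    return LEGACY_DEFAULTS.get(primary, fallback)


if __name__ == "__main__":
    for tag in ("ko-KR", "en_US", "ja"):
        print(tag, default_encoding_for(tag))
```

The returned strings are encoding labels as written in the proposal, not
codec names of any particular library.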

Henri and I suggested striking the UTF-8 recommendation since it's not likely
to be followed.


Received on Thursday, 20 August 2009 07:32:35 UTC
