
[Bug 13396] New: i18n-ISSUE-77: HTTP and defaulting to UTF-16LE

From: <bugzilla@jessica.w3.org>
Date: Wed, 27 Jul 2011 18:20:46 +0000
To: public-i18n-core@w3.org
Message-ID: <bug-13396-3493@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=13396

           Summary: i18n-ISSUE-77: HTTP and defaulting to UTF-16LE
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: public-i18n-core@w3.org
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


8.2.2.2 Character encodings
http://www.w3.org/TR/html5/parsing.html#character-encodings-0

Supported by the i18n WG.

"When a user agent is to use the UTF-16 encoding but no BOM has been found,
user agents must default to UTF-16LE."

If the HTTP header declares the file to be UTF-16BE, which I believe is
allowed, then a BOM should *not* be used, and defaulting to UTF-16LE would be
wrong. If the HTTP header declares the file to be UTF-16, then there must be a
BOM, so I assume that this rule is a recovery mechanism for the case where
someone declares UTF-16 in HTTP but omits the BOM. If so, I think the text
should be clarified to say that, and the case should perhaps be flagged as an
error.
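
To make the distinction concrete, here is a minimal Python sketch of the
behaviour described above. It is not taken from the spec: the function name
resolve_utf16, its return shape, and the warning text are hypothetical, and
it assumes that an explicit UTF-16BE or UTF-16LE label is honoured as
declared, while the UTF-16LE default applies only when the label is plain
UTF-16 and no BOM is present.

```python
import codecs


def resolve_utf16(declared, body):
    """Pick a byte order for a UTF-16 family label (illustrative only).

    Returns (codec_name, bom_length, warning). `declared` is the charset
    label from the HTTP Content-Type header; `body` is the raw bytes.
    """
    label = declared.strip().lower()

    # Assumption: an explicit byte order in the label is taken at face
    # value, and no BOM is expected or stripped.
    if label == "utf-16be":
        return ("utf-16-be", 0, None)
    if label == "utf-16le":
        return ("utf-16-le", 0, None)
    if label != "utf-16":
        raise ValueError("not a UTF-16 family label: %r" % declared)

    # Plain "UTF-16": the BOM is supposed to decide the byte order.
    if body.startswith(codecs.BOM_UTF16_BE):
        return ("utf-16-be", 2, None)
    if body.startswith(codecs.BOM_UTF16_LE):
        return ("utf-16-le", 2, None)

    # Recovery case: fall back to UTF-16LE as the quoted sentence
    # requires, but report the problem (hypothetical warning text).
    return ("utf-16-le", 0, "UTF-16 declared without a BOM; assuming UTF-16LE")
```

A caller would then decode with body[bom_length:].decode(codec_name) and
surface the warning, if any, as the error message suggested above.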

Received on Wednesday, 27 July 2011 18:20:47 GMT
