- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 20 Aug 2009 07:29:13 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=7380
Summary: Suggest heuristic detection of UTF-8
Product: HTML WG
Version: unspecified
Platform: PC
URL:
http://dev.w3.org/html5/spec/Overview.html#determining-the-character-encoding
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: HTML5 spec bugs
AssignedTo: dave.null@w3.org
ReportedBy: mjs@apple.com
QAContact: public-html-bugzilla@w3.org
CC: ian@hixie.ch, mike@w3.org, public-html@w3.org
Step 6 of the encoding detection algorithm should specifically suggest the
possibility of algorithmically detecting UTF-8. Here is some suggested wording
from the I18N WG:
"Note: The UTF-8 encoding has a highly detectable bit pattern. Documents that
contain bytes > 0x7F which match the UTF-8 pattern are very likely to be UTF-8,
while documents that do not match it definitely are not. While not full
autodetection, it may be appropriate for a user-agent to search for this common
encoding."
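
As a rough illustration of the check the note describes (not wording proposed for the spec), here is a minimal Python sketch. The function name classify_utf8 and its three return labels are hypothetical, chosen only for this example. It relies on Python's built-in strict UTF-8 decoder to test whether the bytes match the UTF-8 pattern, and it treats pure-ASCII input as giving no evidence either way, since the note only draws a conclusion from bytes > 0x7F.

```python
def classify_utf8(data: bytes) -> str:
    """Classify a byte sequence as 'ascii', 'utf-8', or 'not-utf-8'.

    Hypothetical helper for illustration; the labels are not from the spec.
    """
    try:
        # The strict decoder rejects stray continuation bytes, truncated
        # sequences, overlong forms, and encoded surrogates.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Bytes that break the pattern: definitely not UTF-8.
        return "not-utf-8"
    if any(b > 0x7F for b in data):
        # Non-ASCII bytes that all fit the pattern: very likely UTF-8.
        return "utf-8"
    # Only bytes <= 0x7F: valid UTF-8, but no evidence for it over
    # other ASCII-compatible encodings.
    return "ascii"


if __name__ == "__main__":
    print(classify_utf8("café".encode("utf-8")))
    print(classify_utf8("café".encode("latin-1")))
    print(classify_utf8(b"plain text"))
```

One caveat under these assumptions: a user agent that inspects only a prefix of the document could cut a multi-byte sequence in half at the end of the buffer, which this sketch would report as 'not-utf-8'. A real implementation would need to tolerate an incomplete trailing sequence.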
Received on Thursday, 20 August 2009 07:29:24 UTC