
[Bug 16768] New: Update HTML to make use of the Encoding Standard

From: <bugzilla@jessica.w3.org>
Date: Wed, 18 Apr 2012 07:54:43 +0000
To: public-html@w3.org
Message-ID: <bug-16768-2495@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=16768

           Summary: Update HTML to make use of the Encoding Standard
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: annevk@opera.com
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


The IANA registry is unbounded, does not match implementations in its
encodings and their labels, does not detail the extensions to encodings that
need to be supported, and does not detail error handling for encodings; it is
inadequate by today's standards.
http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html was written to solve
this problem. By using it in HTML we can simplify the following:

* Instead of "preferred MIME name" we can now talk about "name" of the
"encoding".
* "ASCII-compatible character encoding" is no longer needed, as utf-16 and
utf-16be are the only ASCII-incompatible encodings in the restricted list.
* The "decode a byte string as UTF-8, with error handling" algorithm can be
removed in favor of using "utf-8 decode" which has the correct error handling
(should be identical).
* For encoding (URLs and <form>) a custom "encoder error" needs to be defined,
by returning from the encoder algorithm and feeding it the intended replacement
characters. (You do not know in advance which code points cannot be encoded.)
* In the suggested default encoding list the encoding names can be updated to
use the canonical name rather than a label.
* "Misinterpreted for compatibility" is no longer needed and the encoding
overrides table can also be removed.
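
To make three of the points above concrete (label to name, "utf-8 decode",
and the encoder error for forms), here is a rough Python sketch. It is not
text from the Encoding Standard or from HTML. The function names and the
LABELS table are hypothetical, the table is a tiny hand-picked subset, and
Python's built-in codecs stand in for the standard's own decoders and
encoders, which they only approximate. Writing unencodable code points as
decimal character references is an assumption about the "intended
replacement characters" for forms.

```python
import string

# Hypothetical, hand-picked subset of a label -> name table (not the full list).
LABELS = {
    "utf8": "utf-8",
    "utf-8": "utf-8",
    "latin1": "windows-1252",
    "iso-8859-1": "windows-1252",
    "ascii": "windows-1252",
    "windows-1252": "windows-1252",
}

ASCII_WHITESPACE = "\t\n\x0c\r "
# Lowercase A-Z only, so non-ASCII characters are never case-folded.
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def get_encoding_name(label):
    """Map a label to an encoding name, or None if the label is unknown."""
    key = label.strip(ASCII_WHITESPACE).translate(ASCII_LOWER)
    return LABELS.get(key)


def utf8_decode(data):
    """Decode bytes as UTF-8, replacing malformed sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


def encode_for_form(text, name):
    """Encode text; code points the encoding lacks become &#NNNN; references."""
    return text.encode(name, errors="xmlcharrefreplace")


if __name__ == "__main__":
    print(get_encoding_name(" Latin1\n"))
    print(utf8_decode(b"a\xffb"))
    print(encode_for_form("caf\u00e9 \u4e2d", "windows-1252"))
```

The caller of encode_for_form never has to check in advance which code
points the target encoding supports; the error handler deals with each one
as the encoder reaches it, which is the property the parenthetical in the
list above asks for.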

Received on Wednesday, 18 April 2012 10:38:08 GMT
