[Bug 11973] HTML Spec confuses character sets with character encodings

From: <bugzilla@jessica.w3.org>
Date: Thu, 03 Feb 2011 20:05:19 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Pl5Qd-0000Hc-Cb@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11973

Julian Reschke <julian.reschke@gmx.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |julian.reschke@gmx.de

--- Comment #2 from Julian Reschke <julian.reschke@gmx.de> 2011-02-03 20:05:19 UTC ---
(In reply to comment #0)
> Let me rephrase this: the text is encoded with UTF-8 using the windows-1252
> character set (which is what MS Word uses).

That doesn't make sense to me.

The character set of HTML (as in: the repertoire of characters that can be
used) is fixed to be Unicode.

A document (a non-broken one, at least) is *encoded* in exactly one encoding,
no matter what the metadata says.
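
To illustrate the distinction, here is a minimal Python sketch. The sample
string and variable names are made up for this example; the point is only that
the same Unicode characters come out as different bytes under UTF-8 and
windows-1252, and that a wrong label changes how the bytes get misread, not how
they were actually encoded.

```python
# Hypothetical sample text: every character is a Unicode character.
text = "caf\u00e9 \u20ac"  # "café €"

# One character repertoire (Unicode), two different encodings.
utf8_bytes = text.encode("utf-8")            # b'caf\xc3\xa9 \xe2\x82\xac'
cp1252_bytes = text.encode("windows-1252")   # b'caf\xe9 \x80'

# Decoding each byte sequence with its own encoding recovers the same text.
same_text = (
    utf8_bytes.decode("utf-8") == cp1252_bytes.decode("windows-1252") == text
)

# Decoding UTF-8 bytes under a wrong windows-1252 label gives mojibake;
# the bytes themselves are still UTF-8.
mojibake = utf8_bytes.decode("windows-1252")
```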

It's unfortunate that some attributes/params say "charset" when they should say
"encoding", but that's something that can't be easily changed.
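
As a rough illustration of that naming problem, the sketch below uses Python's
standard `email.message.Message` class to read the `charset` parameter of a
Content-Type header. The header value and the sample bytes are assumptions made
up for this example. The parameter's value is used as the name of an encoding
(a codec), not of a character repertoire.

```python
from email.message import Message

# Hypothetical header, as it might accompany an HTML document.
msg = Message()
msg["Content-Type"] = 'text/html; charset="windows-1252"'

# The parameter is called "charset", but its value names an encoding.
label = msg.get_param("charset")

# Hypothetical body bytes; the label is passed straight to the decoder.
body = b"caf\xe9"
decoded = body.decode(label)
```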

Received on Thursday, 3 February 2011 20:05:20 GMT