
[Bug 11973] HTML Spec confuses character sets with character encodings

From: <bugzilla@jessica.w3.org>
Date: Thu, 03 Feb 2011 20:05:19 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Pl5Qd-0000Hc-Cb@jessica.w3.org>

Julian Reschke <julian.reschke@gmx.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |julian.reschke@gmx.de

--- Comment #2 from Julian Reschke <julian.reschke@gmx.de> 2011-02-03 20:05:19 UTC ---
(In reply to comment #0)
> Let me rephrase this: the text is encoded with UTF-8 using the windows-1252
> character set (which is what MS Word uses).

That doesn't make sense to me.

The character set of HTML (as in: the repertoire of characters that can be
used) is fixed to be Unicode.

A document is *encoded* in exactly one encoding (at least if it is not broken),
no matter what the metadata says.
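
To make the distinction concrete, here is a minimal Python sketch (the sample
string and variable names are illustrative only, not taken from the bug). The
same Unicode characters produce different bytes under UTF-8 and windows-1252,
and a given byte sequence decodes back to the original text only under the
encoding that was actually used:

```python
# Illustrative sample: one non-ASCII character from the Unicode repertoire.
text = "caf\u00e9"  # "café"

# One repertoire (Unicode), two different encodings of it.
utf8_bytes = text.encode("utf-8")
cp1252_bytes = text.encode("windows-1252")

# Same characters, different byte sequences.
assert utf8_bytes != cp1252_bytes

# Decoding with the encoding that was actually used recovers the text.
assert utf8_bytes.decode("utf-8") == text
assert cp1252_bytes.decode("windows-1252") == text

# Decoding UTF-8 bytes as windows-1252 gives mojibake, whatever a label claims.
mojibake = utf8_bytes.decode("windows-1252")

# The windows-1252 bytes are not even valid UTF-8.
try:
    cp1252_bytes.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
```

A byte stream is therefore in one encoding or the other; it cannot be "UTF-8
using windows-1252".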

It's unfortunate that some attributes/params say "charset" when they should say
"encoding", but that's something that can't be easily changed.
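
As a rough illustration of that naming, the sketch below uses Python's standard
library to read the "charset" parameter of a Content-Type header (the header
value is a made-up example). The value it carries is an encoding label, which
resolves to a codec rather than to a character repertoire:

```python
import codecs
from email.message import Message

# Hypothetical header value, chosen for illustration.
msg = Message()
msg["Content-Type"] = "text/html; charset=windows-1252"

# The parameter is called "charset", but its value names an encoding.
label = msg.get_content_charset()

# Resolving the label yields a codec, i.e. a characters <-> bytes mapping.
codec = codecs.lookup(label)

# The codec turns a Unicode character into bytes.
encoded = "\u00e9".encode(codec.name)
```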

Received on Thursday, 3 February 2011 20:05:20 UTC
