
[Bug 11973] HTML Spec confuses character sets with character encodings

From: <bugzilla@jessica.w3.org>
Date: Thu, 03 Feb 2011 20:03:31 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Pl5Ot-0000AC-IN@jessica.w3.org>

--- Comment #1 from Tab Atkins Jr. <jackalmage@gmail.com> 2011-02-03 20:03:31 UTC ---
(In reply to comment #0)
> Let me rephrase this: the text is encoded with UTF-8 using the windows-1252
> character set (which is what MS Word uses).

I'm not certain I understand. Do you mean that the document is using UTF-8
encoding, but with the windows-1252 character set masquerading as codepoints?
So that, for example, € is encoded as if its codepoint were 0x80 rather than
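
To make the scenario in the question concrete, here is a minimal Python sketch. It assumes the comparison is with the euro sign's actual Unicode code point, U+20AC; the variable names are hypothetical and not taken from the bug report. It shows the single windows-1252 byte for €, the correct UTF-8 bytes, and the bytes that result if 0x80 is treated as the code point and then encoded as UTF-8.

```python
# Hypothetical illustration; names here are not from the bug report.
euro = "\u20ac"  # EURO SIGN, Unicode code point U+20AC

# windows-1252 maps the euro sign to the single byte 0x80.
cp1252_bytes = euro.encode("cp1252")

# Correct UTF-8 encodes the real code point U+20AC as three bytes.
utf8_bytes = euro.encode("utf-8")

# The scenario asked about: the windows-1252 byte value 0x80 is treated
# as if it were the code point, and that number is encoded as UTF-8.
masqueraded_bytes = chr(cp1252_bytes[0]).encode("utf-8")

print(cp1252_bytes.hex())       # 80
print(utf8_bytes.hex())         # e282ac
print(masqueraded_bytes.hex())  # c280
```

A conforming UTF-8 decoder would read the last byte sequence as U+0080, a C1 control character, not as the euro sign, which is presumably why such a document would display incorrectly.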

Received on Thursday, 3 February 2011 20:03:32 UTC
