
[Bug 11973] HTML Spec confuses character sets with character encodings

From: <bugzilla@jessica.w3.org>
Date: Thu, 03 Feb 2011 20:03:31 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Pl5Ot-0000AC-IN@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11973

--- Comment #1 from Tab Atkins Jr. <jackalmage@gmail.com> 2011-02-03 20:03:31 UTC ---
(In reply to comment #0)
> Let me rephrase this: the text is encoded with UTF-8 using the windows-1252
> character set (which is what MS Word uses).

I'm not certain I understand. Do you mean that the document is encoded as
UTF-8, but with windows-1252 byte values masquerading as code points? So
that, for example, € is encoded as if its code point were 0x80 rather than
0x20AC?
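
To make the two readings concrete, here is a minimal Python sketch of what
each one would put on the wire. The variable names are illustrative only,
and the "masquerade" case is my guess at what the reporter might mean, not
something confirmed in the bug.

```python
# The euro sign as a real Unicode code point, U+20AC.
euro = "\u20ac"

# Normal case: UTF-8 encodes U+20AC as three bytes.
utf8_bytes = euro.encode("utf-8")
print(utf8_bytes.hex())        # e282ac

# In windows-1252, the euro sign is the single byte 0x80.
cp1252_bytes = euro.encode("cp1252")
print(cp1252_bytes.hex())      # 80

# Hypothetical "masquerade": treat the windows-1252 byte value 0x80 as if
# it were the code point U+0080, then encode that code point as UTF-8.
masquerade_bytes = chr(cp1252_bytes[0]).encode("utf-8")
print(masquerade_bytes.hex())  # c280
```

If the document really contains C2 80 where a euro sign is intended, a
conforming UTF-8 decoder would yield U+0080 (a C1 control character), not
the euro sign.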

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Thursday, 3 February 2011 20:03:32 GMT
