- From: <bugzilla@jessica.w3.org>
- Date: Thu, 03 Feb 2011 21:07:50 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11973

--- Comment #6 from Craig S <craig.e.shea@gmail.com> 2011-02-03 21:07:50 UTC ---
(In reply to comment #5)
> As far as I can tell, the spec is not confused; it's just as Julian says that
> some attributes/params have unfortunate names for legacy reasons.

I agree. However, I still maintain that the spec really should say that "The charset attribute specifies the character set used by the document", as this seems to be the way UAs are in fact treating it. This would at least make the spec "definition" align with the attribute name.

In addition, perhaps the spec could mandate that all HTML files are to be encoded (read: stored or saved) as UTF-8. Then, with the combination of the mandated encoding and the declaration of the character set, a UA knows how to interpret the document. Also, it preserves 99.999% of all web pages in the wild (since plain ASCII is already valid UTF-8). Furthermore, the spec can continue to say that the default character set for HTML is UTF-8 (and should you want anything different, be sure to specify it with the META tag using one of the specified methods).

I found a snippet of text on stackoverflow.com (http://stackoverflow.com/questions/2014069/windows-1252-to-utf-8-encoding) that was interesting: "While utf8 is valid Win-1252, the reverse is not true: win-1252 is NOT valid UTF-8." This explains why I see "funky" characters in my HTML page when it is sent as charset=UTF-8 as opposed to charset=windows-1252 (which displays correctly).

Thank you all for your comments.
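The "funky" characters mentioned above can be reproduced with a short Python sketch. It is only an illustration: the sample string "café" and the helper name decode_lenient are made up for this example, and replacing undecodable bytes with U+FFFD is an assumption about how a lenient UA might behave, not something taken from the spec.

```python
# Hypothetical sample text containing one non-ASCII character.
text = "café"

# The same text stored on disk in two different encodings.
cp1252_bytes = text.encode("windows-1252")  # 'é' becomes the single byte 0xE9
utf8_bytes = text.encode("utf-8")           # 'é' becomes the two bytes 0xC3 0xA9


def decode_lenient(data: bytes, charset: str) -> str:
    """Decode bytes under a declared charset, replacing invalid bytes with U+FFFD."""
    return data.decode(charset, errors="replace")


# windows-1252 bytes declared as UTF-8: a lone 0xE9 is not valid UTF-8.
mislabelled = decode_lenient(cp1252_bytes, "utf-8")

# The same bytes declared correctly.
correct = decode_lenient(cp1252_bytes, "windows-1252")

# UTF-8 bytes declared as windows-1252: they decode, but into the wrong characters.
mojibake = decode_lenient(utf8_bytes, "windows-1252")

# Pure ASCII text is byte-for-byte identical in both encodings.
ascii_same = "plain text".encode("utf-8") == "plain text".encode("windows-1252")

print(repr(mislabelled), repr(correct), repr(mojibake), ascii_same)
```

In this sketch the windows-1252 bytes only round-trip when the declared charset matches the encoding the file was actually saved in, which is consistent with the page displaying correctly under charset=windows-1252 but not under charset=UTF-8.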
Received on Thursday, 3 February 2011 21:07:52 UTC