- From: Larry Masinter <masinter@adobe.com>
- Date: Thu, 30 Jul 2009 22:59:13 -0700
- To: Larry Masinter <masinter@adobe.com>, Anne van Kesteren <annevk@opera.com>, Ian Hickson <ian@hixie.ch>, "Dr. Olaf Hoffmann" <Dr.O.Hoffmann@gmx.de>
- CC: HTML WG <public-html@w3.org>
To be specific about my advice on handling of charset:

>> What the document should say, rather than having a 'willful'
>> misinterpretation, is that ISO-8859-1 means ISO-8859-1, but that for
>> backward compatibility with existing (broken) web content, HTTP
>> interpreting agents SHOULD treat characters outside of the ISO-8859-1
>> repertoire as if they were in Windows-1252.

>That's exactly what it says, as far as I can tell. Could you elaborate on
>exactly what text in the spec you are objecting to? Maybe I don't
>understand your request.

My request:

The definition of "charset" in the HTML 4.01 specification is much more legible and understandable, and the current draft's language is opaque. Readopt most of HTML 4.01 section 5.2 text; it would be a great improvement in legibility.

Remove the tables in 2.7 Character Encodings from the body of the specification and put into a separate document or appendix "Browser Implementation Compatibility Guide" which begins with wording to the effect:

"For compatibility with some existing legacy content deployed on the web in various degrees, the following implementation advice is provided. Conforming HTML interpreters MAY apply these equivalences, but conforming HTML generators and editing tools MUST NOT rely on these mappings. Over time, it is expected that use of incorrect charset labels will decrease."

Other wording around "willful violation" should be replaced with advice on how incompatibility should be reduced in the future.

Larry
--
http://larry.masinter.net
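
For illustration only, the MAY-apply equivalence described above could be sketched roughly as follows in Python. The names `LEGACY_LABEL_OVERRIDES`, `effective_encoding`, and `decode_body` are hypothetical and do not come from any specification; the only mapping shown is the ISO-8859-1 to Windows-1252 case discussed in this message, and the opt-out flag is an assumption about how an interpreter might expose the choice.

```python
# Hypothetical table of label equivalences an interpreter MAY apply.
# Only the ISO-8859-1 entry is taken from the discussion above.
LEGACY_LABEL_OVERRIDES = {
    "iso-8859-1": "windows-1252",
}


def effective_encoding(label: str, apply_legacy_overrides: bool = True) -> str:
    """Return the encoding name to decode with for a declared charset label."""
    normalized = label.strip().lower()
    if apply_legacy_overrides:
        # Fall back to the declared label when no equivalence is listed.
        return LEGACY_LABEL_OVERRIDES.get(normalized, normalized)
    return normalized


def decode_body(body: bytes, label: str, apply_legacy_overrides: bool = True) -> str:
    """Decode a message body according to its declared charset label."""
    return body.decode(effective_encoding(label, apply_legacy_overrides))


# Bytes 0x93 and 0x94 are curly quotation marks in Windows-1252,
# but C1 control characters in ISO-8859-1.
sample = b"\x93quoted\x94"
lenient = decode_body(sample, "ISO-8859-1")
strict = decode_body(sample, "ISO-8859-1", apply_legacy_overrides=False)
```

Under these assumptions, `lenient` contains the curly quotes that the (mislabeled) content most likely intended, while `strict` contains the control characters that ISO-8859-1 literally specifies. A generator or editing tool would correspond to the strict path: it labels its output with the encoding it actually used and does not rely on the override table.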
Received on Friday, 31 July 2009 06:00:06 UTC