- From: <bugzilla@jessica.w3.org>
- Date: Mon, 30 Jun 2014 15:57:13 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=23646

--- Comment #26 from Addison Phillips <addison@lab126.com> ---
(In reply to Paul Eggert from comment #23)
> It was never common practice to use charset="us-ascii" when the text was
> actually Latin-1 or some other extension to ASCII. The default was Latin-1,
> and some validators would recommend charset="us-ascii" when the text was
> limited to characters in the range 00-7F. So the longstanding meaning of
> charset="us-ascii" was "This document is not using any characters outside
> the ASCII range, and I've checked it and that's what I want".

Look at it from the browser (or search engine or document consumer) point of view. If you have a document that declares "us-ascii", but, in fact, contains non-ASCII byte values, what should happen to those byte values when the document is interpreted?

I find myself writing text here that I already said in or around comment 1, so I won't repeat myself.

>
> Again, I'm not asking that the standard be *changed*, only that this issue
> be *explained*. Currently this stuff is entirely a mystery to a non-expert
> (and it appears, even to some experts). That's not right.

I agree that an explanation is desirable. There is no discussion of superset encodings or why any of this occurs in the Encoding spec. A note is probably called for so that it won't be a mystery. Perhaps just after the "violation of UTS#22" note in section 4.2:

--
In many cases the legacy single-byte encoding selected has a larger character repertoire than that of the label actually used in the document. For example, both the "iso8859-1" and "us-ascii" labels use the "windows-1252" encoding. This is because user-agents historically have applied the larger "super-set" encoding in practice because document authors tend to be imprecise in identifying the correct label.
--

-- 
You are receiving this mail because:
You are on the CC list for the bug.
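To make the consumer's point of view concrete, here is a minimal Python sketch of the superset behavior the proposed note describes. It is an illustration under stated assumptions, not the Encoding spec's algorithm: the `LABEL_TO_ENCODING` table and the `decode_as_browser` helper are hypothetical names, and the table holds only the two labels mentioned in the note.

```python
# Hypothetical label table: only the two labels named in the proposed note.
# The real Encoding spec defines many more labels than this.
LABEL_TO_ENCODING = {
    "us-ascii": "windows-1252",
    "iso8859-1": "windows-1252",
}


def decode_as_browser(data: bytes, label: str) -> str:
    """Decode using the superset encoding for a declared label (sketch only)."""
    encoding = LABEL_TO_ENCODING.get(label.strip().lower(), label)
    return data.decode(encoding)


# A document declared as "us-ascii" that in fact contains non-ASCII bytes.
sample = b"caf\xe9 \x93quoted\x94"

# A strict ASCII decoder has no answer for bytes above 0x7F and fails.
try:
    sample.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The superset encoding assigns a character to each of those bytes.
text = decode_as_browser(sample, "us-ascii")
```

Under strict ASCII the bytes 0xE9, 0x93, and 0x94 cannot be interpreted at all, while the windows-1252 superset turns them into an accented letter and a pair of curly quotes, which is the practical outcome the note is trying to explain.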
Received on Monday, 30 June 2014 15:57:15 UTC