- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Fri, 11 Nov 2005 16:38:39 +0200 (EET)
- To: www-validator@w3.org
- Cc: "BHA in L.A." <bha_in_la@yahoo.com>
On Fri, 11 Nov 2005, Lachlan Hunt wrote:

> BHA in L.A. wrote:
>> I /choose/ to use the Windows-1252 character set despite all your advice.
>> (Compatibility with the occasional old browser is one reason.)
>
> Rubbish. Windows-1252 is not recommended /because/ it is not supported by
> the occasional old browser.

Unless I've missed something, the issue is a _warning_ about _references_ to non-SGML characters (e.g., &#150;), not an _error message_ about occurrences of non-SGML characters in the data as such (e.g., the octet 150 decimal in a data stream purported to be e.g. ISO-8859-1 or UTF-8). These are related problems, but quite different. And the problem under discussion has nothing to do with _encodings_ such as windows-1252. The meaning of a character reference in HTML does not depend on the encoding.

Technically, a reference like &#150; is not a reportable markup error (hence no error message, only a warning), but its meaning is undefined.

I think it would be reasonable to have, in general, a set of checkboxes for selecting which warnings the user of a validator would like to see. It would be natural to make references to non-SGML characters one of them.

It is generally not advisable to use windows-1252 on the Web (though _many_ do so), but in practice it works quite widely, though not universally. There is certainly nothing formally wrong with it; it is a registered encoding.

Using &#150; and similar references is a different issue, but it too works quite widely, though not universally. It's a trick based on the assumption that browsers will interpret such references as if the document character set were windows-1252. I think it is reasonable for a validator to warn about them, but also to let the user switch such warnings off without switching other warnings off.

> Instead of using &#146; for a right single quotation mark, for example, you
> should use &#8217; (decimal) or &#x2019; (hexadecimal).

If compatibility with old browsers is relevant, the common method of using the ASCII apostrophe ' as a replacement for the right single quote is surely safest.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
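
A minimal Python sketch of the reference issue discussed above, assuming the goal is to rewrite numeric references in the range 128-159 as references to the Unicode characters that a browser treating them as windows-1252 would display. The names `cp1252_equivalent` and `fix_references` are hypothetical, chosen only for illustration; the mapping itself comes from Python's standard `cp1252` codec.

```python
import re

# Matches decimal (&#150;) and hexadecimal (&#x96;) character references.
NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def cp1252_equivalent(code_point):
    """Return the character that windows-1252 assigns to this position,
    or None if the position is outside 128..159 or unassigned."""
    if not 128 <= code_point <= 159:
        return None
    try:
        return bytes([code_point]).decode("cp1252")
    except UnicodeDecodeError:
        # Positions 129, 141, 143, 144 and 157 are unassigned in windows-1252.
        return None


def fix_references(html):
    """Rewrite references such as &#150; as references to the intended
    Unicode character; leave every other reference untouched."""
    def repl(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        ch = cp1252_equivalent(code_point)
        if ch is None:
            return match.group(0)
        return "&#%d;" % ord(ch)

    return NCR.sub(repl, html)


if __name__ == "__main__":
    print(fix_references("It&#146;s pages 1&#150;2"))
```

Note that the function never looks at the document's encoding: it operates only on the references, which is consistent with the point that the meaning of a character reference does not depend on the encoding.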
Received on Friday, 11 November 2005 14:38:52 UTC