Re: [VE][410] Error Message Feedback

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Fri, 11 Nov 2005 16:38:39 +0200 (EET)
To: www-validator@w3.org
Cc: "BHA in L.A." <bha_in_la@yahoo.com>
Message-ID: <Pine.GSO.4.63.0511111625200.10101@korppi.cs.tut.fi>

On Fri, 11 Nov 2005, Lachlan Hunt wrote:

> BHA in L.A. wrote:
>> I /choose/ to use the Windows-1252 character set despite all your advice.
>> (Compatibility with the occasional old browser is one reason.)
> Rubbish.  Windows-1252 is not recommended /because/ it is not supported by 
> the occasional old browser.

Unless I've missed something, the issue is a _warning_ about _references_
to non-SGML characters (e.g., &#150;), not an _error message_ about
occurrences of non-SGML characters in the data as such (e.g., the
octet 150 decimal in a data stream purported to be ISO-8859-1
or UTF-8). These are related but quite different problems. And the
problem under discussion has nothing to do with _encodings_ such as
windows-1252. The meaning of a character reference in HTML does not
depend on the encoding.
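
To illustrate the distinction, here is a minimal sketch in Python (chosen
only for illustration; the name ref_code_point is hypothetical and this is
not what any validator actually runs). An octet in the data decodes
differently under different encodings, whereas a decimal character
reference simply names a code point number:

```python
# Octet 150 (0x96) as raw data: its meaning depends on the encoding.
raw = bytes([150])
as_cp1252 = raw.decode("cp1252")      # EN DASH, U+2013
as_latin1 = raw.decode("iso-8859-1")  # a C1 control character, U+0096


def ref_code_point(ref: str) -> int:
    """Return the code point named by a decimal reference like '&#150;'.

    No encoding is consulted: the number refers to the document
    character set directly.
    """
    if not (ref.startswith("&#") and ref.endswith(";")):
        raise ValueError(f"not a decimal character reference: {ref!r}")
    return int(ref[2:-1])


# The reference names position 150 however the file is encoded.
code_point = ref_code_point("&#150;")
```

Here code_point is 150 whether the document is sent as ISO-8859-1,
windows-1252 or UTF-8, and position 150 is not an assigned graphic
character in the document character set of HTML.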

Technically, a reference like &#150; is not a reportable markup error
(hence, no error message, but a warning), but its meaning is undefined.

I think it would be reasonable to have, in general, a set of checkboxes
for selecting which warnings the user of a validator would like to see.
It would be natural to make references to non-SGML characters one of them.

It is generally not advisable to use windows-1252 on the Web (though 
_many_ do so), but in practice, it works quite widely, though not
universally. There is certainly nothing formally wrong with it; it is
a registered encoding.

Using &#150; and similar references is a different issue, but it too works 
quite widely, though not universally. It's a trick based on the 
assumption that browsers will interpret such references as if the document 
character set were windows-1252. I think it is reasonable for a
validator to warn about them but also to let the user switch such
warnings off without switching other warnings off.
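
As a rough sketch of how an author might repair such references (the
function name fix_c1_references is hypothetical, and Python's "cp1252"
codec is assumed here as a stand-in for the browsers' windows-1252
interpretation), decimal references in the range 128 to 159 can be
rewritten to the code points that the trick relies on:

```python
import re

# Matches decimal character references such as &#150;
_DECIMAL_REF = re.compile(r"&#(\d+);")


def fix_c1_references(html: str) -> str:
    """Rewrite &#128;..&#159; to the code points browsers commonly show."""

    def replace(match: re.Match) -> str:
        number = int(match.group(1))
        if 128 <= number <= 159:
            try:
                char = bytes([number]).decode("cp1252")
            except UnicodeDecodeError:
                # Position unassigned in windows-1252: leave it alone.
                return match.group(0)
            return f"&#{ord(char)};"
        return match.group(0)

    return _DECIMAL_REF.sub(replace, html)


fixed = fix_c1_references("1990&#150;2005, the user&#146;s choice")
```

With this input, fixed contains &#8211; and &#8217; in place of &#150;
and &#146;. Hexadecimal references are not handled in this sketch.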

> Instead of using &#146; for a right single quotation mark, for example, you 
> should use &#8217; (decimal) or &#x2019; (hexadecimal).

If compatibility with old browsers is relevant, the common method of
using ASCII apostrophe ' as a replacement for the right single quote
is surely safest.

Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Friday, 11 November 2005 14:38:52 UTC
