
Re: default charset broken

From: Karl Ove Hufthammer <karl@huftis.org>
Date: Sat, 07 Jun 2003 18:50:59 +0200
Message-Id: <n2m-g.Xns9393BFBFC138huftis@ID-99504.news.dfncis.de>
To: www-validator@w3.org

Terje Bless <link@pobox.com> wrote in
news:f02000001-1026-364C400E990111D7B1DF0030657B83E8@[193.157.66.23]:

> Yes, well, in the interest of full disclosure, let me add that
> another significant factor in the Validator's current
> behaviour is that the HTTP defaulting behaviour is considered
> harmful to i18n and all those users for whom iso-8859-1 is
> insufficient.

Well, even if those users have no control over their servers,
there is no problem, in theory or in practice, with using
characters not in ISO-8859-1. They could always use numeric
character references (which work better anyway).
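To illustrate the numeric character reference approach, here is a minimal
sketch in Python. The function name and the sample string are hypothetical,
chosen only for illustration; the sketch relies on the standard
'xmlcharrefreplace' error handler, which replaces every character the target
encoding cannot represent with a decimal reference.

```python
import html


def to_latin1_with_ncrs(text: str) -> bytes:
    """Encode text as ISO-8859-1, writing any character outside that
    repertoire as a decimal numeric character reference (&#NNN;)."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# Hypothetical sample: 'é' fits in ISO-8859-1, the other non-ASCII
# characters do not and are turned into references.
sample = "Žižek, café, 日本"
encoded = to_latin1_with_ncrs(sample)
print(encoded)  # b'&#381;i&#382;ek, caf\xe9, &#26085;&#26412;'

# A consumer that decodes the bytes as ISO-8859-1 and resolves the
# references recovers the original text.
assert html.unescape(encoded.decode("iso-8859-1")) == sample
```

The resulting bytes are valid ISO-8859-1, so a document encoded this way
should be read correctly under that default even when the author cannot
change the server's Content-Type header.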

But in my opinion the main problem is that the validator is
labeling perfectly valid documents as invalid. I think this is
more serious than not labeling invalid documents as invalid
because of character encoding issues.

> In particular, if we allow for your interpretation above, we
> would in effect default to ISO-8859-1 not only for pages such
> as Kjetil's (which are most certainly correct and the author
> very aware of what he is doing), but also for Joe
> Web-duh-signer and his clueless little hosting company where
> there is _no_ conscious decision involved and ISO-8859-1 is
> the _wrong_ value more often than not.

'More often than not'? Isn't ISO-8859-1 the most used encoding for
valid documents?

And if there is '_no_ conscious decision involved', I doubt the
Web pages would be valid even with an explicit character encoding
declaration.

-- 
Karl Ove Hufthammer
Received on Saturday, 7 June 2003 12:51:18 GMT
