- From: Andreas Prilop <aprilop2007@trashmail.net>
- Date: Wed, 28 Nov 2007 18:15:08 +0100 (MET)
- To: www-validator@w3.org
I still believe that the following behaviour is illogical and
not really helpful. (It has been discussed before.)
Consider a webpage that does not specify any encoding (charset).
Unfortunately, this still happens, and such pages are mostly
Windows-1251 or Windows-1252 encoded.
Then validator.w3.org reports:
(1) No Character Encoding Found! Falling back to UTF-8.
(2) Sorry, I am unable to validate this document because on line ...
it contained one or more bytes that I cannot interpret as utf-8
(in other words, the bytes found are not valid values in
the specified Character Encoding).
This makes no sense, and it doesn't help the user.
The logical procedure would be:
(1) On line ... the document contained one or more bytes
that I cannot interpret as UTF-8 (in other words, the bytes
found are not valid values in UTF-8).
(2) Therefore I don't fall back to UTF-8.
N.B.
I am not suggesting any specific alternative fallback encoding or
fallback behaviour. I am only saying that it is illogical to first
assume UTF-8 and then immediately claim that UTF-8 is impossible.
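
The procedure proposed above could be sketched roughly as follows.
This is only an illustration in Python, not the validator's actual
code: the function name check_undeclared_encoding and its return
shape are hypothetical, and the wording of the messages is adapted
from the text above. In keeping with the N.B., the sketch does not
pick any other fallback encoding; it merely declines to announce
UTF-8 when the bytes rule it out.

```python
from typing import Optional, Tuple


def check_undeclared_encoding(data: bytes) -> Tuple[Optional[str], str]:
    """Hypothetical helper for a page that declares no charset.

    Returns (encoding, message). The encoding is "utf-8" only if the
    bytes really are valid UTF-8; otherwise it is None, and no other
    fallback is chosen.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on invalid bytes.
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # Count the newlines before the first offending byte to get
        # a 1-based line number.
        line = data.count(b"\n", 0, err.start) + 1
        message = (
            f"On line {line} the document contained one or more bytes "
            "that I cannot interpret as UTF-8. "
            "Therefore I don't fall back to UTF-8."
        )
        return None, message
    return "utf-8", "No Character Encoding Found! Falling back to UTF-8."
```

Called on the bytes of a Windows-1252 page containing a lone 0xE9,
for instance, this would return None together with the line number
of that byte, instead of first announcing UTF-8 and then rejecting
it.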
Received on Wednesday, 28 November 2007 17:15:22 UTC