Re: Fallback to UTF-8

Hi Frank,

Thanks a lot for going through your test cases. Much appreciated. I
failed to specify that the feedback I asked for was mostly about documents
without a declared character encoding, but you found other interesting
things.

Replies inline.

On 28-Apr-08, at 3:53 PM, Frank Ellermann wrote:
> | Software error:
> | Undefined subroutine
> | &W3C::Validator::EventHandler::abort_if_error_flagged
> | called at /home/link/web/HEAD/httpd/cgi-bin/check line 2756.
> <http://hmdmhdfmhdjmzdtjmzdtzktdkztdjz.googlepages.com/IDN-XML-test.htm>
> <http://hmdmhdfmhdjmzdtjmzdtzktdkztdjz.googlepages.com/IDN-IRI-test.htm>
> (The two test cases for bugs 5279 and 5280)

Fixed. That is, the "Undefined subroutine" error is fixed. Bugs 5279 and
5280 themselves are untouched.


> <http://freenet-homepage.de/Xyzzy/home/test/res.htm>
> (Another "line 2756", it used to work, valid HTML 2.0 Strict)

OK now.

> <http://home.claranet.de/xyzzy/w3c/p3p.xml>
> (Another "line 2756", it used to work, well formed XML)

OK now.

>
> <http://home.claranet.de/xyzzy/sitemap.xml>
> (Another "line 2756", it used to work, well formed XML)

OK now.

> [warning] Unable to Determine Parse Mode!
> [...]
> | Type (-//IETF//DTD HTML i18n//EN) is not in the validator's catalog
> <http://freenet-homepage.de/Xyzzy/home/test/res.html>
> (SGML is correct - RFC 2070 DTD republished by IANA)

It does validate right now. I guess you are pointing out that “Unable
to Determine Parse Mode” could use better wording in this case?


> [warning] Missing "charset" attribute for "text/xml" document.
> <http://freenet-homepage.de/Xyzzy/home/test/utf-4.xml>
> (this text/xml document really uses encoding US-ASCII)

HEAD http://freenet-homepage.de/Xyzzy/home/test/utf-4.xml
Content-Type: text/xml

The response carries no charset parameter, so I guess the validator's
warning, which says that the spec specifies a strong default of
"us-ascii" for text/xml, is OK here?
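To make the rule concrete, here is a minimal Python sketch of how that default could be applied. It is not the validator's actual code: the function name `effective_charset` and the constant `TEXT_XML_DEFAULT` are made up for illustration, and the "us-ascii" default for text/xml without a charset parameter is taken from RFC 3023 as I understand it.

```python
from email.message import Message
from typing import Optional, Tuple

# Assumption: per RFC 3023, text/xml served without a charset
# parameter is treated as us-ascii.
TEXT_XML_DEFAULT = "us-ascii"


def effective_charset(content_type: str) -> Tuple[str, Optional[str], bool]:
    """Return (media type, charset, defaulted?) for a Content-Type value.

    `defaulted` is True when no charset parameter was present and the
    text/xml default was applied, which is the case that would trigger
    a "Missing charset attribute" warning.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    media_type = msg.get_content_type()   # lowercased type/subtype
    charset = msg.get_content_charset()   # lowercased charset, or None
    if charset is None and media_type == "text/xml":
        return media_type, TEXT_XML_DEFAULT, True
    return media_type, charset, False


if __name__ == "__main__":
    print(effective_charset("text/xml"))
    print(effective_charset("text/xml; charset=utf-8"))
```

With the header shown above, the third element of the result is True, so a warning seems appropriate even when the document happens to be pure US-ASCII.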


> [warning] Mismatch between Public and System identifiers
> <http://freenet-homepage.de/Xyzzy/colour.htm>
> (the released validator has no problem with using System
> identifiers pointing to its own catalog, maybe it's an
> artefact of the qa-dev.w3.org != validator.w3.org setup)

That's a new feature. Some recent feedback prompted the addition of a
check for consistency between the public identifier (FPI) and the system
identifier (SI) of the doctype. It's a warning, so as usual, it can be
ignored if you are sure of your doctype.
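Roughly, the idea of the check can be sketched as follows in Python. This is only an illustration, not the code in the validator: the table `KNOWN_DOCTYPES` and the function `fpi_si_warning` are hypothetical names, and the table lists only a few well-known W3C doctypes rather than the real catalog.

```python
from typing import Optional

# Hypothetical excerpt: public identifier -> canonical system identifier.
KNOWN_DOCTYPES = {
    "-//W3C//DTD HTML 4.01//EN":
        "http://www.w3.org/TR/html4/strict.dtd",
    "-//W3C//DTD HTML 4.01 Transitional//EN":
        "http://www.w3.org/TR/html4/loose.dtd",
    "-//W3C//DTD XHTML 1.0 Strict//EN":
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
}


def fpi_si_warning(fpi: str, si: Optional[str]) -> Optional[str]:
    """Return a warning message if the SI does not match the known FPI."""
    expected = KNOWN_DOCTYPES.get(fpi)
    if expected is None or si is None:
        # Unknown FPI or no SI given: nothing to compare against.
        return None
    if si != expected:
        return ("Mismatch between Public and System identifiers: "
                f"expected {expected!r}, got {si!r}")
    return None


if __name__ == "__main__":
    print(fpi_si_warning("-//W3C//DTD HTML 4.01//EN",
                         "http://www.w3.org/TR/html4/loose.dtd"))
```

A strict string comparison like this one would also flag a system identifier that points at a different copy of the same DTD, which may be what happens with your document.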

> [warning] Character Encoding mismatch!
> | The character encoding specified in the HTTP header
> | (iso-8859-1) is different from the value in the <meta>
> | element (windows-1252). I will use the value from the
> | HTTP header (iso-8859-1) for this validation.
> <http://home.claranet.de/xyzzy/ascii.htm>
> (Nikita consistently hates u+0080 based on an iso-8859-1
> assumption, and the document uses a windows-1252 0x80 €)

Hmm, sorry, I am not sure whether you are reporting an issue or saying
that this works as it should here. Can you give more details?
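For reference, here is a small Python sketch of what I understand the situation to be. The helper `charset_for_validation` is a hypothetical name that only mirrors the precedence stated in the quoted warning (HTTP header over <meta>); the decoding lines use Python's standard codecs to show how the same byte 0x80 reads under each encoding.

```python
from typing import Optional


def charset_for_validation(http_charset: Optional[str],
                           meta_charset: Optional[str]) -> Optional[str]:
    """Pick the charset to validate with; the HTTP header wins if present."""
    return http_charset or meta_charset


raw = b"\x80"

# Under iso-8859-1, byte 0x80 maps to U+0080, a C1 control character.
as_latin1 = raw.decode("iso-8859-1")

# Under windows-1252, the same byte maps to U+20AC, the euro sign.
as_cp1252 = raw.decode("windows-1252")

if __name__ == "__main__":
    print(charset_for_validation("iso-8859-1", "windows-1252"))
    print(f"U+{ord(as_latin1):04X}", f"U+{ord(as_cp1252):04X}")
```

If that matches your case, the complaint about u+0080 follows directly from the HTTP header taking precedence over the <meta> element.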


Thanks a lot.

-- 
olivier

Received on Monday, 28 April 2008 07:07:29 UTC