Autodetection failure from Elliotte Rusty Harold on 2002-12-08 (www-validator@w3.org from December 2002)

From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Sun, 8 Dec 2002 09:13:54 -0500
To: www-validator@w3.org
Message-Id: <p04330100ba1903b5a9ce@[192.168.254.4]>

When attempting to validate a document which I identified as XHTML 
1.1 using the pop-up menu, I received the following message:

I was not able to extract a character encoding labeling from any of 
the valid sources for such information. Without encoding information 
it is impossible to validate the document. The sources I tried are:

     * The HTTP Content-Type field.
     * The XML Declaration.
     * The HTML "META" element.

And I even tried to autodetect it using the algorithm defined in 
Appendix F of the XML 1.0 Recommendation.

Since none of these sources yielded any usable information, I will 
not be able to validate this document. Sorry. Please make sure you 
specify the character encoding in use.

I believe that in this case the fallback for XHTML should be UTF-8.
XML itself defaults to UTF-8 when there is no byte order mark,
encoding declaration, or external encoding information, and I don't
think there's any reason XHTML should be different. If everything
else fails, assume UTF-8.
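
A minimal sketch of that lookup order with the UTF-8 fallback, in
Python. This is not the validator's actual code: the function name
guess_encoding and its http_charset parameter are hypothetical, only
a subset of the Appendix F cases is handled (no EBCDIC), and the
HTML "META" element is skipped for brevity.

```python
import re
from typing import Optional

# Byte order marks, ordered so the 4-byte forms are tested before
# the 2-byte forms that share their first two bytes.
_BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
]

# Leading "<" or "<?" as it appears in wide encodings without a BOM.
_PATTERNS = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
]

# Encoding pseudo-attribute of an ASCII-compatible XML declaration.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def guess_encoding(data: bytes, http_charset: Optional[str] = None) -> str:
    """Return a lowercase encoding label, falling back to UTF-8."""
    # 1. External information (the HTTP Content-Type charset) wins.
    if http_charset:
        return http_charset.lower()
    # 2. Byte order mark.
    for prefix, name in _BOMS:
        if data.startswith(prefix):
            return name
    # 3. Byte patterns of "<" / "<?" in UTF-16 and UTF-32.
    for prefix, name in _PATTERNS:
        if data.startswith(prefix):
            return name
    # 4. Explicit encoding declaration in the XML declaration.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # 5. Nothing usable found: assume UTF-8 instead of giving up.
    return "utf-8"
```

With this ordering, a document that carries no encoding label at all,
such as b"<html></html>", is treated as UTF-8 rather than rejected.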
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+

Received on Sunday, 8 December 2002 09:16:35 UTC