- From: Terje Bless <link@pobox.com>
- Date: Wed, 27 Nov 2002 01:33:57 +0100
- To: W3C Validator <www-validator@w3.org>
- cc: Eric Anderson <anderson@cs.uoregon.edu>
Eric Anderson <anderson@cs.uoregon.edu> wrote:

>As of this current update, a whole set of pages which had been
>validating correctly stopped. Specifically, I'm getting the following
>error message for all of them:
>
>" I was not able to extract a character encoding labeling from any of
>the valid sources for such information. Without encoding information it
>is impossible to validate the document. The sources I tried are: [...] "
>
>I don't know if this reveals some formerly unknown error in our web
>server, or whether it's a Validator problem. But it might be the
>latter.
>
>Here's one of the URLs for which this is happening:
>
>http://www.cs.uoregon.edu/~anderson/gtf/cis111/

There are actually several things going on here.

First of all, you are serving XML as "text/html" (which is unfortunate, but may be necessary). Secondly, your server is not sending a "charset" parameter for the Content-Type field in the HTTP response, so we have no clear indication of what Character Encoding is being used. And finally, your example document is not Valid XML; it contains an XML Declaration, but not as the first thing in the file. cf.:

    <!-- $Id: index.html,v 1.11 2002/11/21 02:24:49 anderson Exp anderson $ -->
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html [...]

Without the comment we might have been able to use the autodetect algorithm from Appendix F of the XML Recommendation to find the correct Encoding; and if you set an explicit character encoding in the HTTP headers there will be no doubt.

The reason this changed is that the new version of the Validator is stricter about proper labelling of character encoding, to avoid giving false results.

I would recommend that 1) you move the CVS comment somewhere else, and 2) you configure your web server to send the proper "charset" parameter. Depending on what kinds of documents you typically serve, you may be able to just set a default encoding of "UTF-8" in the main configuration file for the server.
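
To illustrate why the leading comment matters, here is a rough Python sketch of the kind of autodetection Appendix F describes. It is not the Validator's actual code: the function name `sniff_xml_encoding` is made up for this example, and it covers only a small subset of the byte patterns in Appendix F (the UTF-8 and UTF-16 byte order marks, plus an ASCII-compatible XML Declaration at the very start of the data). It also returns None when it finds nothing, rather than assuming any default.

```python
import re

# Byte order marks for a few of the encodings listed in Appendix F.
# Simplification: the UCS-4 and EBCDIC patterns are not handled here,
# so a UCS-4 little-endian BOM would be misreported as UTF-16LE.
_BOMS = [
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16BE"),
    (b"\xff\xfe", "UTF-16LE"),
]

# Matches an XML Declaration only at offset 0 and captures the value
# of its "encoding" pseudo-attribute.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data: bytes):
    """Return an encoding name guessed from the first bytes, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Without a BOM, the declaration must be the first thing in the file.
    if data.startswith(b"<?xm"):
        match = _DECL.match(data)
        if match:
            return match.group(1).decode("ascii")
    return None


declaration_first = b'<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE html>'
comment_first = b'<!-- $Id$ -->\n<?xml version="1.0" encoding="UTF-8"?>\n'

print(sniff_xml_encoding(declaration_first))
print(sniff_xml_encoding(comment_first))
```

With the declaration first, the sketch reports the declared encoding; with a comment in front of it, the first bytes are "<!--" and nothing is detected, which is the situation your document is in.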
-- If you believe that will stop spammers, you're sadly misled. Rusty hooks, rectally administered fuel oil enemas, and the gutting of their machines, *that* stops spammers! -- Saundo
Received on Tuesday, 26 November 2002 19:34:04 UTC