- From: Doug Ewell <dewell@adelphia.net>
- Date: Sun, 4 May 2003 21:47:04 -0700
- To: <www-validator@w3.org>
Hello gurus,

I've found a problem with the XHTML 1.0 validation service provided at <http://validator.w3.org/>.

I created a bare-bones XHTML document in UTF-8, as follows:

-----8<-----cut here-----8<-----cut here-----8<-----
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
<title>Minimal XHTML document</title>
</head>
<body>
<p>
This is a minimal XHTML 1.0 document. ☺
</p>
</body>
</html>
-----8<-----cut here-----8<-----cut here-----8<-----

Note the non-ASCII character (U+263A WHITE SMILING FACE) in the body text.

When I use the "Local File" option of the validator, I get the following message:

"Sorry, I am unable to validate this document because on line 12 it contained one or more bytes that I cannot interpret as us-ascii (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication."

It's complaining about the character U+263A; if I remove it, the file validates. But notice that I declared the document to be UTF-8 in TWO different ways, in the XML declaration and again in the "meta http-equiv" statement. Yet for some reason the validator is still treating the file as US-ASCII.

Now, if I upload the document to my Web site and validate it from there, by URI, everything works fine. So there is some discrepancy between the way the validator handles a file referenced by URI and the same file read from a local disk.

Please copy any responses to me directly, as I am not subscribed to the list.

Thanks,

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/
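
The error message is consistent with the uploaded bytes being decoded as US-ASCII instead of UTF-8. The following is a minimal Python sketch of that mismatch only; it assumes nothing about how the validator is actually implemented, and the helper name `try_decode` is hypothetical. It shows that U+263A becomes three bytes in UTF-8, all above 0x7F, so the same line decodes cleanly as UTF-8 but fails as US-ASCII.

```python
# U+263A WHITE SMILING FACE, the non-ASCII character from the test document.
SMILEY = "\u263a"


def try_decode(data: bytes, encoding: str) -> str:
    """Report whether `data` decodes under `encoding` (hypothetical helper)."""
    try:
        data.decode(encoding)
        return "ok"
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that could not be decoded.
        return f"error at byte offset {err.start}"


# The body line from the test document, encoded the way the file is saved.
line = f"This is a minimal XHTML 1.0 document. {SMILEY}".encode("utf-8")

print(SMILEY.encode("utf-8").hex(" "))  # e2 98 ba
print(try_decode(line, "utf-8"))        # ok
print(try_decode(line, "us-ascii"))     # error at byte offset 38
```

Removing the smiley leaves only bytes below 0x80, which are identical in US-ASCII and UTF-8, which would explain why the file validates without it.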
Received on Monday, 5 May 2003 00:56:55 UTC