Problem with XHTML validation

Hello gurus,

I've found a problem with the XHTML 1.0 validation service provided at
<http://validator.w3.org/>.  I created a bare-bones XHTML document in
UTF-8, as follows:

-----8<-----cut here-----8<-----cut here-----8<-----

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
   PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-type" content="text/html; charset=UTF-8" />
<title>Minimal XHTML document</title>
</head>
<body>
<p>
This is a minimal XHTML 1.0 document. ☺
</p>
</body>
</html>

-----8<-----cut here-----8<-----cut here-----8<-----

Note the non-ASCII character (U+263A WHITE SMILING FACE) in the body
text.

When I use the "Local File" option of the validator, I get the following
message:

"Sorry, I am unable to validate this document because on line 12 it
contained one or more bytes that I cannot interpret as us-ascii (in
other words, the bytes found are not valid values in the specified
Character Encoding). Please check both the content of the file and the
character encoding indication."

The validator is complaining about the character U+263A, which falls on
line 12 of the file; if I remove it, the file validates.  But notice that
I declared the document to be UTF-8 in TWO different ways: in the XML
declaration and again in the "meta http-equiv" element.  Yet for some
reason the validator is still treating the file as US-ASCII.
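
To show why the choice of encoding matters here, the following is a
minimal Python sketch (my own illustration, not anything taken from the
validator).  It encodes the body text as UTF-8 and then tries to read
the same bytes back as UTF-8 and as US-ASCII.  The helper name
decodes_as is made up for this example.

```python
# The body text from the sample document, including U+263A.
text = "This is a minimal XHTML 1.0 document. \u263a"

# In UTF-8, U+263A becomes three bytes, all of them above 0x7F.
utf8_bytes = text.encode("utf-8")
smiley_bytes = "\u263a".encode("utf-8")


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if every byte sequence in data is valid in encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print(smiley_bytes)                        # b'\xe2\x98\xba'
print(decodes_as(utf8_bytes, "utf-8"))     # True
print(decodes_as(utf8_bytes, "us-ascii"))  # False
```

US-ASCII defines only the values 0x00 through 0x7F, so any reader that
assumes US-ASCII will reject those three bytes, which is consistent with
the message quoted above.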

Now, if I upload the document to my Web site and validate it from there,
by URI, everything works fine.  So there is some discrepancy between the
way the validator handles a file referenced by URI and the same file
read from a local disk.
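
For what it's worth, here is a rough Python sketch of the behavior I
expected.  It is only a guess at a reasonable lookup order, NOT the
validator's actual code: use the charset from the HTTP Content-Type
header if one was sent, otherwise fall back to the XML declaration, then
to the "meta http-equiv" element, and only then to US-ASCII.  The
function name declared_encoding and the regular expressions are
hypothetical.

```python
import re
from typing import Optional


def declared_encoding(http_charset: Optional[str], head: bytes) -> str:
    """Guess an encoding label for a document (assumed lookup order)."""
    # 1. A charset from the HTTP header, if the document came by URI.
    if http_charset:
        return http_charset.lower()
    # 2. The encoding pseudo-attribute of the XML declaration.
    xml_decl = re.match(
        rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', head
    )
    if xml_decl:
        return xml_decl.group(1).decode("ascii").lower()
    # 3. A charset parameter inside a meta element.
    meta = re.search(
        rb'<meta[^>]*\bcharset\s*=\s*["\']?([A-Za-z0-9._-]+)',
        head,
        re.IGNORECASE,
    )
    if meta:
        return meta.group(1).decode("ascii").lower()
    # 4. Nothing declared anywhere: assume US-ASCII.
    return "us-ascii"


sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<meta http-equiv="Content-type" '
    b'content="text/html; charset=UTF-8" />\n'
)

# Uploaded file: no HTTP header, so the XML declaration should apply.
print(declared_encoding(None, sample))     # utf-8
# Fetched by URI, assuming the server sends charset=UTF-8.
print(declared_encoding("UTF-8", sample))  # utf-8
```

Under that assumed order, my file would come out as UTF-8 both ways,
which is why the US-ASCII result for the uploaded copy surprised me.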

Please copy any responses to me directly, as I am not subscribed to the
list.

Thanks,

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

Received on Monday, 5 May 2003 00:56:55 UTC