- From: Neil Zanella <nzanella@cs.mun.ca>
- Date: Thu, 26 Jun 2003 23:10:43 -0230 (NDT)
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- cc: www-validator@w3.org
OK, I have left out an important bit of information in my bug report:
Consider the following file:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<p>...</p>
</body>
</html>
If I name it hello.xml and run it through the validator, then everything
works fine. However, if I name it hello.html, the validator complains.
I am not sure why this is so and would like an explanation. I think it has
something to do with the fact that web servers state the type of a document
in the HTTP Content-Type response header before sending it. The web server
I used was configured to return the text/html MIME type for files with the
.html extension and text/xml for files with the .xml extension. But the
parser still recognized the .html file as XML, since it stated:
--- begin quote ----------------------------------------------------------
I was not able to extract a character encoding labeling from any of the
valid sources for such information. Without encoding information it is
impossible to validate the document. The sources I tried are:
* The HTTP Content-Type field.
* The XML Declaration.
* The HTML "META" element.
And I even tried to autodetect it using the algorithm defined in Appendix
F of the XML 1.0 Recommendation.
--- end quote ------------------------------------------------------------
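To make the message above concrete, here is a rough Python sketch of a
lookup over the three sources it lists. This is only my guess at the
general idea, not the validator's actual code: the function name
find_encoding_label and the regular expressions are made up for
illustration, and the Appendix F autodetection step is left out.

```python
import re

def find_encoding_label(content_type, body):
    """Return (source, label) for the first explicit encoding label, else None.

    Hypothetical helper: checks the three sources named in the validator
    message, in the order the message lists them.
    """
    # 1. charset parameter of the HTTP Content-Type field
    m = re.search(r'charset\s*=\s*"?([A-Za-z0-9._:-]+)', content_type or "", re.I)
    if m:
        return ("http", m.group(1))
    # 2. encoding pseudo-attribute of the XML declaration (must start the file)
    m = re.match(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', body)
    if m:
        return ("xml-declaration", m.group(1).decode("ascii"))
    # 3. charset given in an HTML "META" element
    m = re.search(rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._-]+)', body, re.I)
    if m:
        return ("meta", m.group(1).decode("ascii"))
    return None

# The file from this message: XML declaration without an encoding, no META.
doc = b'<?xml version="1.0"?>\n<html xmlns="http://www.w3.org/1999/xhtml"></html>'
print(find_encoding_label("text/html", doc))
print(find_encoding_label("text/html; charset=utf-8", doc))
```

With a bare "text/html" Content-Type the sketch finds no label in any of
the three places, which at least matches the complaint I got for
hello.html.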
I wonder whether XHTML documents should avoid the .html extension, since
otherwise the web server cannot tell them apart from ordinary text/html
documents. So what is the common convention? Should they have the .xml
extension?
Thanks!
Neil
In any case, from the
On Thu, 26 Jun 2003, Bjoern Hoehrmann wrote:
> * Neil Zanella wrote:
> >I have a document encoded in ASCII (a subset of UTF-8).
> >The XML 1.0 specification states:
> >
> >It is also a fatal error if an XML entity contains no encoding declaration
> >and its content is not legal UTF-8 or UTF-16.
> >
> >However, the validator should therefore validate correctly XHTML documents
> >starting with <?xml version="1.0"?> followed by a proper XHTML 1.0 DTD
> >followed by the actual content.
> >
> >However the validator.w3.org program insisted that I ought to specify it,
> >but that's not what the XML standard says, right?
>
> http://validator.w3.org/check?uri=http://www.bjoernsworld.de/temp/foo3.xml
>
Received on Thursday, 26 June 2003 21:40:48 UTC