- From: Neil Zanella <nzanella@cs.mun.ca>
- Date: Thu, 26 Jun 2003 23:10:43 -0230 (NDT)
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- cc: www-validator@w3.org
OK, I have left out an important bit of information in my bug report:

Consider the following file:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
</head>
<body>
<p>...</p>
</body>
</html>

If you name it hello.xml and run it through the validator, then everything
works fine. However, if I name it hello.html, then the validator complains.
I am not sure why this is so and would like an explanation. I think it has
something to do with the fact that web servers return the type of the
document in the HTTP response header before sending it. The web server I
used was configured to return the text/html MIME type for files with the
.html extension and text/xml for files with the .xml extension. But the
parser still recognized the file as XML since it stated:

--- begin quote ----------------------------------------------------------

I was not able to extract a character encoding labeling from any of the
valid sources for such information. Without encoding information it is
impossible to validate the document. The sources I tried are:

* The HTTP Content-Type field.
* The XML Declaration.
* The HTML "META" element.

And I even tried to autodetect it using the algorithm defined in Appendix F
of the XML 1.0 Recommendation.

--- end quote ------------------------------------------------------------

I wonder whether XHTML documents should not be given the .html extension
(otherwise the web server cannot tell them apart from text/html documents).
So what is the common convention? Should they have the .xml extension?

Thanks!

Neil

In any case, from the

On Thu, 26 Jun 2003, Bjoern Hoehrmann wrote:

> * Neil Zanella wrote:
> >I have a document encoded in ASCII (a subset of UTF-8).
> >The XML 1.0 specification states:
> >
> >It is also a fatal error if an XML entity contains no encoding declaration
> >and its content is not legal UTF-8 or UTF-16.
> >
> >However, the validator should therefore validate correctly XHTML documents
> >starting with <?xml version="1.0"?> followed by a proper XHTML 1.0 DTD
> >followed by the actual content.
> >
> >However the validator.w3.org program insisted that I ought to specify it,
> >but that's not what the XML standard says, right?
>
> http://validator.w3.org/check?uri=http://www.bjoernsworld.de/temp/foo3.xml
>
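
P.S. To make the three labeling sources named in the validator's message concrete, here is a rough sketch in Python of looking for an explicit encoding label in each of them. This is not the validator's code. The function names are made up for illustration, the regular expressions are deliberately simplistic, and checking the sources in the order the message lists them is my assumption. The Appendix F autodetection step is left out.

```python
import re


def charset_from_content_type(header):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not header:
        return None
    m = re.search(r'charset\s*=\s*"?([^\s;"]+)', header, re.IGNORECASE)
    return m.group(1) if m else None


def encoding_from_xml_declaration(data):
    """Return the encoding pseudo-attribute of a leading XML declaration, or None."""
    m = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', data
    )
    return m.group(1).decode("ascii") if m else None


def charset_from_meta(data):
    """Return a charset mentioned inside an HTML META element, or None."""
    m = re.search(
        rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._-]+)', data, re.IGNORECASE
    )
    return m.group(1).decode("ascii") if m else None


def find_declared_encoding(content_type, data):
    """Try the three sources in the order the validator message lists them
    (that this is the real precedence is an assumption of this sketch)."""
    return (
        charset_from_content_type(content_type)
        or encoding_from_xml_declaration(data)
        or charset_from_meta(data)
    )


# The sample file from this message: XML declaration without an encoding,
# no META element.
SAMPLE = (
    b'<?xml version="1.0"?>\n'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b"<head><title></title></head><body><p>...</p></body></html>"
)
```

With the sample file and a bare "text/html" header, find_declared_encoding returns None, which is consistent with the message I received. Adding a charset parameter to the Content-Type header, or an encoding to the XML declaration, gives it something to find. The sketch says nothing about why the text/xml case passes.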
Received on Thursday, 26 June 2003 21:40:48 UTC