- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Wed, 16 May 2007 08:03:00 +0300 (EEST)
- To: Muharrem Kaderli <m.rem@isnet.net.tr>
- cc: www-validator@w3.org
On Tue, 15 May 2007, Muharrem Kaderli wrote:
> Validating http://www.habune.com/
> Error [66]: "document type does not allow element X here; assuming
> missing Y start-tag"
First of all, I don't get such a message when I try to validate the
document. Instead, I get a message saying that the document cannot be
checked because it contains bytes that cannot be interpreted as UTF-8.
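To illustrate that message: bytes written in a legacy single-byte encoding often fail strict UTF-8 decoding. Here is a minimal sketch in Python, assuming a made-up sample phrase and Python's "cp1254" codec (its name for windows-1254, Windows Turkish). The helper name decodes_as is hypothetical, and this is not how the validator itself is implemented.

```python
# Hypothetical sample text; any Turkish phrase with non-ASCII letters would do.
sample = "Türkçe şarkı"

# Bytes as a server would send them if the file were saved as windows-1254.
raw = sample.encode("cp1254")


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data can be strictly decoded using the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print("valid UTF-8?       ", decodes_as(raw, "utf-8"))
print("valid windows-1254?", decodes_as(raw, "cp1254"))
```

A single-byte encoding accepts almost any byte sequence, so a successful cp1254 decode does not prove the guess is right. The failed UTF-8 decode is the informative result.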
The document is in fact encoded in windows-1254 (Windows Turkish). The
encoding should be declared in HTTP headers, or in the XML prologue,
<?xml version="1.0" encoding="windows-1254"?>
at the very start of the document, or both. Things get somewhat tricky,
since an XML declaration throws IE into "Quirks Mode", and sometimes
individual authors cannot control HTTP headers. So authors often resort to
"meta Ersatz", i.e. a meta tag inside the document to specify the
encoding. Although this does not comply with the specifications (XML rules
specify the default and allow it to be overridden in the XML prologue or at
a higher-level protocol such as HTTP, but not inside the document), it has
been observed to "work" in contemporary browsers. But then you need to
have the meta tag syntax right. You now have
<Meta http-equiv="Content-type" Content="text/html;" charset="windows-1254">
with two extra quotation marks; it should be
<meta http-equiv="Content-type" content="text/html;charset=windows-1254">
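The difference between the two tags can be checked mechanically. The sketch below uses Python's standard html.parser. It assumes that only the charset parameter inside the content attribute counts, which is how the http-equiv form is meant to be read. The class and function names are hypothetical, and real browsers sniff encodings in more elaborate ways.

```python
import re
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Collect the charset parameter from <meta http-equiv="Content-Type">."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag names and attribute names.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "content-type":
            return
        # Look for charset=... inside the content attribute value only.
        match = re.search(r"charset\s*=\s*([\w.:-]+)",
                          attr.get("content") or "", re.IGNORECASE)
        if match and self.charset is None:
            self.charset = match.group(1)


def meta_charset(html_text):
    """Return the charset declared via a meta Content-Type tag, or None."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


broken = ('<Meta http-equiv="Content-type" Content="text/html;" '
          'charset="windows-1254">')
fixed = ('<meta http-equiv="Content-type" '
         'content="text/html;charset=windows-1254">')

print(meta_charset(broken))
print(meta_charset(fixed))
```

In the first tag the extra quotation marks turn charset into a separate attribute, so the content value carries no charset parameter at all.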
If I manually set, in the validator's user interface, the encoding that
the validator uses to interpret the document (as you have probably done,
judging from the error message you mention), I get 229 error messages.
The first error message is:
"Error Line 4 column 6: document type does not allow element "title"
here; assuming missing "head" start-tag."
I think that's rather self-explanatory: the tag <head> is missing before
the <title> tag. The reason is that in XHTML, the tag <head> is
obligatory, i.e. it cannot be omitted as in previous versions of HTML.
Then there is a pile of error messages, largely caused by the use of
"classic HTML" syntax - uppercase letters in tag names etc. - in a
document purported to be XHTML 1.0.
Obviously, if you change the document type declaration to an HTML 4.01
doctype,
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
there will be far fewer problems to consider. (You'll still have to change
"&" to "&amp;" in many places, and things like that.)
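For the ampersand fix, here is a small sketch of generating a link with the "&" in its URL written as "&amp;", using Python's standard html.escape. The URL and link text are made-up examples, not taken from the page in question.

```python
from html import escape

# Hypothetical URL with a query string containing a bare ampersand.
url = "haber.php?id=1&kat=2"

# escape() turns & into &amp; (and, with quote=True, also escapes quotes),
# which is what the validator expects inside attribute values.
escaped = escape(url, quote=True)
link = f'<a href="{escaped}">Haber</a>'

print(link)
```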
There is no gain from using XHTML on the web now or in the foreseeable
future, unless you have some special use case where you combine XHTML with
other XML based languages and can deal with the fact that IE does not
understand XHTML at all (except when you make it treat XHTML as "classic
HTML"). But as you have seen, there are many traps and pitfalls that you
will find if you try to use XHTML.
--
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Wednesday, 16 May 2007 05:03:14 UTC