- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Tue, 10 Sep 2013 09:50:46 +0300
- To: Roland <silvermaplesoft@earthlink.net>
- CC: www-validator@w3.org
2013-09-10 2:15, Roland wrote:

> Error [65]: "document type does not allow element X here; missing one
> of Y start-tag"
>
> This error message appears in situations like
>   <body>
>     <a name="top">
> Using <a> in this manner is out of date, but it is the simplest
> example of the issue.

The situation occurs whenever the first child of body is an inline
element. The case <a name="top"> is not simpler than other inline
element start tags, like <b> or <span>.

The error message appears when validating against HTML 4.01 Strict or
XHTML 1.0 Strict (or HTML 2.0 Strict, for that matter!) or other DTD
that enforces the requirement that inline content is not allowed inside
body without intervening markup. It does not appear when validating e.g.
against HTML 4.01 Transitional or HTML5.

> No place, on the W3C Web site nor outside literature, do I recall
> seeing a statement that the <body> element forces a context-sensitive
> requirement that all children be block-level elements.

The rule that the validator is applying is

  <!ELEMENT BODY O O (%block;|SCRIPT)+ +(INS|DEL)>

which is mentioned right at the start of 7.5.1 The BODY element of the
HTML 4.01 specification:
http://www.w3.org/TR/REC-html40/struct/global.html#edef-BODY

Such formal syntax rules are what the validator is applying, when
performing markup validation in the SGML or XML sense. This is one
reason why validation is so overrated. You really need to understand
what validation is in order to benefit from it rather than get confused,
but it is often recommended as if it were simple and easy.

It seems that the HTML 4.01 specification does not describe this rule in
prose. It is not even mentioned in the part that discusses the
differences between HTML 4.01 Strict and HTML 4.01 Transitional. I was
astonished at this, but apparently I learned it from somewhere else,
like a textbook, years ago. But this is a problem with the
specification, not the validator.
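
As an illustration of the BODY content model, here is a minimal sketch of
a document that triggers the error under HTML 4.01 Strict, followed by
the usual fix. The title text and the anchor names are made up for the
example; only the structure matters.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head><title>Inline content directly in body</title></head>
<body>

<!-- Invalid in Strict: A is an inline element, so it does not match
     (%block;|SCRIPT) as a direct child of BODY. -->
<a name="top">Top of page</a>

<!-- Valid in Strict: the same kind of anchor wrapped in a block
     element. -->
<p><a name="top2">Top of page</a></p>

</body>
</html>
```

With the Transitional doctype instead, both forms should pass, since
that DTD allows inline content directly in body.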

> This is
> especially insidious when the error occurs somewhere deep within the
> code (I tested it as the last child) and it is easy to be oblivious
> to the fact that you're now back up to the child level.

The syntax rule is useful if and only if you wish to stick to coding
style where all content in body is wrapped in block containers. In HTML
4.0, this rule is bundled together with a rule that forbids most of
so-called presentational markup. There is really no logical connection
between the two rules, except that you might call both of them
"Puristic" (or "Strict"). But if you wish to have only one of them
applied, you need a custom DTD.

> The explanatory note includes the statement "This might mean that you
> need a containing element, ...." This is, of course, true, but it
> fails to note the special character of the <body> element.

In terms of SGML validation, which is what this is about, no element has
any special character. Different elements have different content models.

> The blockquote element has the same undocumented
> constraint.

It is not undocumented. It's just documented formally only.

> The paragraph element has the opposite constraint--it may not include
> a block-level element.

It's a completely different constraint.

> This is a nuisance when I want to include a
> <pre> within a paragraph.

The HTML concept of paragraph corresponds to a paragraph of text, which
may contain images and other embedded inline objects, but otherwise it's
just flow of text, possibly with text-level (phrase-level) markup.

The formal rule, unlike the rule forbidding direct inline content in
body, reflects browser reality: a browser implicitly closes an open p
element when it encounters <pre>. So it's more than just a rule "thou
shall not use pre within p"; you *cannot* use pre within p, any attempt
at doing so will fail, instead of just being formally wrong.
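
To make the implicit closing concrete, here is a small sketch with
made-up text content. The first fragment is what an author might write;
the second shows roughly how it is interpreted once the <pre> start tag
has ended the open p element.

```html
<!-- As written: the author intends PRE to sit inside the paragraph. -->
<p>Some text
<pre>preformatted text</pre>
more text</p>

<!-- As interpreted: the PRE start tag implies the end of the open P,
     so PRE becomes a sibling of P rather than its child. -->
<p>Some text</p>
<pre>preformatted text</pre>
more text</p>
```

In the second fragment, "more text" ends up outside any paragraph, and
the final </p> has no open p element left to close.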

> These context sensitive constraints (and any others) deserve special
> mention somewhere--within the error message would be most useful;
> after all, the parser knows the current token--without it the parser
> couldn't generate the list of possible predecessors.

I'm afraid the validator uses an old SGML parser that has no provisions
for indicating "the current token", due to the way the parser has been
coded. And I'm afraid nobody will work on the SGML validator; all work
is directed towards HTML5 validation, which is a completely different
animal (and does not have this issue, because HTML5 rules allow direct
inline content in body).

Yucca
Received on Tuesday, 10 September 2013 06:51:18 UTC