- From: Thomas Broyer <t.broyer@gmail.com>
- Date: Thu, 13 Nov 2008 18:14:03 +0100
- To: noah_mendelsohn@us.ibm.com
- Cc: public-html <public-html@w3.org>, www-tag@w3.org
On Thu, Nov 13, 2008 at 5:18 PM, wrote:
>
> e> XXXXXX (Isn't obviously HTML at all,
> but browser will presumably
> build a DOM and render XXXXXX)

Let me add:

f> <title>something</title>XXXXXX

> The best example I have of 'unclean' are (b), in which the close tags are
> in the wrong order, and (e), which has no tags at all. As far as I know,
> an HTML browser will accept both of these, built a DOM for them, allow
> scripting of that DOM, and render on the screen output per the HTML 5
> Recommendation.
>
> Perhaps all of those are therefore what we mean by legal or clean HTML 5,
> but I don't think so.

Actually only (f) would be (and it appears to also be legal HTML 4),
as all the others miss a title element.

> (a) seems to me to be legal HTML in a sense that
> (b), for example, is not. If I wrote an HTML editor and it put out
> content in the form of (b), I hope you'd tell me my editor was buggy, and
> that the tags should be properly nested.

Yup, that's covered in §8.1 "Writing HTML Documents"
http://www.w3.org/TR/html5/syntax.html#writing0

> So, that being the case, when there's a language as important as HTML 5, I
> think it's a good thing for there to be a high quality specification that
> makes very clear answers to questions such as:
>
> * What documents are part the language (or legal in the language if you
> prefer) and which ones not?

A document that parses (§8.2 Parsing HTML Documents) without "parse
errors" *and* produces a DOM that is valid wrt content models is "part
of the language".

> * What is the correct interpretation of the legal documents?

Er, what do you mean by "interpretation"?

See also §1.3 Conformance Requirements
http://www.w3.org/TR/html5/introduction.html#conformance
(has been renumbered as §2.2 since the publication of the WD)

> In short, this would be just a language specification, as distinct from
> the existing HTML 5 draft, which focusses on consuming and rendering HTML
> 5 as well as consuming and rendering other input. Note that, in
> principle, a language specification is not just for authors. It's a
> specification of what the language >is<. No doubt, the most common
> consumers of HTML 5 will be browsers, which will be much more liberal in
> what they accept, but the language specification should be referenced by
> anyone who wants to either produce or consume clean, legal, HTML (e.g. no
> badly nested tags). Usually, such a language specification will say
> nothing about documents like (b) that aren't in the language, except to
> make clear that they aren't.

Section 8.1 focuses on the "global" syntax, section 3.1 on
"microsyntaxes" and the rest of section 3 on "content models"
(elements, their attributes and what they can contain).

Note that section 8.1 obviously doesn't apply to the XML serialization
of an otherwise valid DOM/Infoset.

> Since the current HTML 5 draft is focussed to a significant degree on what
> browsers consume, it provides for processing and building DOMs from (a-e).

Well, that's only section 8.2...

My 2 c€nts

--
Thomas Broyer
Received on Thursday, 13 November 2008 17:14:42 UTC