Re: Comments on HTML WG face to face meetings in France Oct 08

On Thu, Nov 13, 2008 at 5:18 PM, wrote:
> e>  XXXXXX (Isn't obviously HTML at all,
>            but browser will presumably
>            build a DOM and render XXXXXX)

Let me add:

f> <title>something</title>XXXXXX

> The best examples I have of 'unclean' are (b), in which the close tags are
> in the wrong order, and (e), which has no tags at all.  As far as I know,
> an HTML browser will accept both of these, build a DOM for them, allow
> scripting of that DOM, and render on the screen output per the HTML 5
> Recommendation.
> Perhaps all of those are therefore what we mean by legal or clean HTML 5,
> but I don't think so.

Actually, only (f) would be legal HTML 5 (and it appears to be legal
HTML 4 as well), since all the others lack a title element.

> (a) seems to me to be legal HTML in a sense that
> (b), for example, is not.  If I wrote an HTML editor and it put out
> content in the form of (b), I hope you'd tell me my editor was buggy, and
> that the tags should be properly nested.

Yup, that's covered in §8.1 "Writing HTML Documents"

> So, that being the case, when there's a language as important as HTML 5, I
> think it's a good thing for there to be a high quality specification that
> makes very clear answers to questions such as:
> * What documents are part of the language (or legal in the language if you
> prefer) and which ones are not?

A document that parses (§8.2 Parsing HTML Documents) without "parse
errors" *and* produces a DOM that is valid wrt content models is "part
of the language".
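
To make that two-part test concrete, here is a toy sketch in Python. It is
emphatically not the §8.2 parsing algorithm: it assumes every non-void element
needs an explicit end tag (real HTML allows many to be omitted), it treats a
mismatched end tag as the only kind of "parse error", and it reduces "content
models" to the single rule that a title element must be present. The names
NestingChecker and check, and the misnested sample in the usage note, are made
up for illustration.

```python
from html.parser import HTMLParser

# Elements that never take an end tag (partial list, enough for the sketch).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Toy checker: records misnested end tags and whether <title> was seen."""

    def __init__(self):
        super().__init__()
        self.stack = []        # currently open elements
        self.errors = []       # stand-in for "parse errors"
        self.has_title = False # stand-in for "content model" validity

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.has_title = True
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            return
        self.errors.append(f"unexpected </{tag}>")
        # Recover roughly as a browser would: close everything up to the
        # matching open element, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def check(source):
    """Return the two verdicts for an HTML string (assumption-laden sketch)."""
    checker = NestingChecker()
    checker.feed(source)
    checker.close()
    return {"nesting_errors": checker.errors, "has_title": checker.has_title}
```

With this, check("<title>something</title>XXXXXX") reports no nesting errors
and a title, while a hypothetical misnested input such as "<b><i>x</b></i>"
reports nesting errors and no title. Bare text like "XXXXXX" nests fine but
still fails the title rule, which is the distinction drawn above.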

> * What is the correct interpretation of the legal documents?

Er, what do you mean by "interpretation"?

See also §1.3 Conformance Requirements
(renumbered as §2.2 since the WD was published).

> In short, this would be just a language specification, as distinct from
> the existing HTML 5 draft, which focusses on consuming and rendering HTML
> 5 as well as consuming and rendering other input.  Note that, in
> principle, a language specification is not just for authors.  It's a
> specification of what the language >is<.  No doubt, the most common
> consumers of HTML 5 will be browsers, which will be much more liberal in
> what they accept, but the language specification should be referenced by
> anyone who wants to either produce or consume clean, legal, HTML (e.g. no
> badly nested tags).  Usually, such a language specification will say
> nothing about documents like (b) that aren't in the language, except to
> make clear that they aren't.

Section 8.1 focuses on the "global" syntax, section 3.1 on
"microsyntaxes" and the rest of section 3 on "content models"
(elements, their attributes and what they can contain).

Note that section 8.1 obviously doesn't apply to the XML serialization
of an otherwise valid DOM/Infoset.

> Since the current HTML 5 draft is focussed to a significant degree on what
> browsers consume, it provides for processing and building DOMs from (a-e).

Well, that's only section 8.2...

My 2 c€nts

Thomas Broyer

