Re: Comments on HTML WG face to face meetings in France Oct 08

On Thu, Nov 13, 2008 at 5:18 PM, wrote:
>
> e>  XXXXXX (Isn't obviously HTML at all,
>            but browser will presumably
>            build a DOM and render XXXXXX)

Let me add:

f> <title>something</title>XXXXXX

> The best examples I have of 'unclean' are (b), in which the close tags are
> in the wrong order, and (e), which has no tags at all.  As far as I know,
> an HTML browser will accept both of these, build a DOM for them, allow
> scripting of that DOM, and render on the screen output per the HTML 5
> Recommendation.
>
> Perhaps all of those are therefore what we mean by legal or clean HTML 5,
> but I don't think so.

Actually, only (f) would be (and it appears to be legal HTML 4 as well),
as all the others lack a title element.
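
To illustrate the title requirement, here is a rough sketch in Python. It
uses the standard library's html.parser, which is NOT the HTML 5 parsing
algorithm and reports no parse errors; the class and function names are my
own, made up for illustration. It only answers "is there a title start tag
somewhere in the input?", nothing more.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Hypothetical helper: records whether a <title> start tag was seen."""

    def __init__(self):
        super().__init__()
        self.has_title = False

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        if tag == "title":
            self.has_title = True


def has_title(markup: str) -> bool:
    """Return True if the markup contains a <title> start tag."""
    finder = TitleFinder()
    finder.feed(markup)
    finder.close()
    return finder.has_title


if __name__ == "__main__":
    print(has_title("<title>something</title>XXXXXX"))  # example (f)
    print(has_title("XXXXXX"))                          # example (e)
```

A real conformance checker would also have to verify where the title
appears and what it contains, which this sketch makes no attempt to do.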

> (a) seems to me to be legal HTML in a sense that
> (b), for example, is not.  If I wrote an HTML editor and it put out
> content in the form of (b), I hope you'd tell me my editor was buggy, and
> that the tags should be properly nested.

Yup, that's covered in §8.1 "Writing HTML Documents"
http://www.w3.org/TR/html5/syntax.html#writing0
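
As a rough illustration of what "properly nested" means, here is a minimal
stack-based sketch in Python. It is built on the standard library's
html.parser, not on the HTML 5 parser, and it deliberately ignores the
optional-tag rules (so a legal document that omits </p> would be rejected).
The names NestingChecker and is_properly_nested are hypothetical, and the
sample markup in the usage lines is my own, since example (b) is not quoted
in this message.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they are never pushed on the stack.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "param", "source", "track", "wbr",
}


class NestingChecker(HTMLParser):
    """Hypothetical helper: flags end tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []      # stack of currently open element names
        self.misnested = False

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Assumption: treat <x/> as self-contained; it opens and closes nothing.
        pass

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.misnested = True


def is_properly_nested(markup: str) -> bool:
    """Return True if every end tag closes the innermost open element
    and nothing is left open at the end of the input."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return not checker.misnested and not checker.open_tags


if __name__ == "__main__":
    print(is_properly_nested("<b><i>text</i></b>"))  # tags closed in order
    print(is_properly_nested("<b><i>text</b></i>"))  # close tags in the wrong order
```

An editor emitting markup of the second kind is what I would call buggy;
a browser will still build a DOM from it, but that is a separate matter.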

> So, that being the case, when there's a language as important as HTML 5, I
> think it's a good thing for there to be a high quality specification that
> makes very clear answers to questions such as:
>
> * What documents are part of the language (or legal in the language if you
> prefer) and which ones not?

A document that parses (§8.2 Parsing HTML Documents) without "parse
errors" *and* produces a DOM that is valid wrt content models is "part
of the language".

> * What is the correct interpretation of the legal documents?

Er, what do you mean by "interpretation"?

See also §1.3 Conformance Requirements
http://www.w3.org/TR/html5/introduction.html#conformance
(has been renumbered as §2.2 since the publication of the WD)

> In short, this would be just a language specification, as distinct from
> the existing HTML 5 draft, which focusses on consuming and rendering HTML
> 5 as well as consuming and rendering other input.  Note that, in
> principle, a language specification is not just for authors.  It's a
> specification of what the language >is<.  No doubt, the most common
> consumers of HTML 5 will be browsers, which will be much more liberal in
> what they accept, but the language specification should be referenced by
> anyone who wants to either produce or consume clean, legal, HTML (e.g. no
> badly nested tags).  Usually, such a language specification will say
> nothing about documents like (b) that aren't in the language, except to
> make clear that they aren't.

Section 8.1 focuses on the "global" syntax, section 3.1 on
"microsyntaxes" and the rest of section 3 on "content models"
(elements, their attributes and what they can contain).

Note that section 8.1 obviously doesn't apply to the XML serialization
of an otherwise valid DOM/Infoset.

> Since the current HTML 5 draft is focussed to a significant degree on what
> browsers consume, it provides for processing and building DOMs from (a-e).

Well, that's only section 8.2...


My 2 c€nts

-- 
Thomas Broyer

Received on Thursday, 13 November 2008 19:45:28 UTC