Splitting up the spec from Jim Jewett on 2008-11-21 (public-html@w3.org from November 2008)

From: Jim Jewett <jimjjewett@gmail.com>
Date: Fri, 21 Nov 2008 03:20:41 -0500
To: "HTML WG" <public-html@w3.org>
Cc: "Jonas Sicking" <jonas@sicking.cc>
Message-ID: <fb6fbf560811210020x7b5f7662ld0b5029ba52d735a@mail.gmail.com>

Jonas Sicking wrote:

== Why splitting out error handling is a bad idea.

> First of all the reason that we are in this situation with HTML being a
> total mess to parse is in large parts because the HTML4 spec left error
> handling undefined. This resulted in different browsers doing different

I agree that an HTML consumer needs both parsing and error handling.

But an HTML producer needs neither.

Since browsers already implement dozens of specs, having them
implement both "HTML semantics" and "HTML parsing/processing/error
correction" isn't that much of an extra burden.

For a simple authoring tool, such as a report generator, or a
converter from another format, there is great value in being able to
say "I only care about the HTML semantics spec", and not having to
worry about the corner cases of parsing.
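
To make that concrete, here is a rough sketch of the kind of report
generator I have in mind. The function name render_report and the
(label, value) row format are invented for this example and come from no
spec or tool. It only emits markup and escapes text; it never parses
HTML or recovers from errors.

```python
from html import escape


def render_report(title, rows):
    """Return a small HTML document for a list of (label, value) rows.

    Hypothetical helper: the producer only needs to know which elements
    to emit and how to escape text, not how a consumer parses them.
    """
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        "<head><title>{}</title></head>".format(escape(title)),
        "<body>",
        "<h1>{}</h1>".format(escape(title)),
        "<table>",
    ]
    for label, value in rows:
        # Escape every piece of text so it cannot be mistaken for markup.
        lines.append(
            "<tr><th>{}</th><td>{}</td></tr>".format(
                escape(str(label)), escape(str(value))
            )
        )
    lines += ["</table>", "</body>", "</html>"]
    return "\n".join(lines)
```

Nothing in there depends on how a browser would handle a malformed
document, which is the whole point.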

> Another reason splitting error handling from the 'language spec' is that
> there are interdependencies. We've had to adjust aspects of the language
> due to how current browsers do error handling. Otherwise we would end up
> with a language which when sent to existing browsers would render
> gibberish.

This does limit the choices available to the HTML semantics spec, but
those limits apply regardless of whether the semantics section is
split out.

At most, splitting would suggest an extra informative note on the
order of "Yes, this seems sub-optimal, but there are legacy
constraints."  (In the amalgamated spec, the note wouldn't be needed
because the reader would see -- and perhaps get lost in -- the
constraints directly.)

[That said, I can't think of an example right now -- just how hairy
were the adjustments?]


== Why splitting out DOM is a bad idea.

> There are heavy interdependencies between the language and the scripting
> model.

Not really.  There are heavy interdependencies between a few specific
elements, and the scripting model.  I agree that most applications
using those elements will need to be aware of the processing part of
the spec -- but many pages just won't use those elements.

> For example <video> would not have made sense to add if the
> scripting model hadn't been taken into account. We would have simply
> said that <object> could have been used.

But as long as <video> exists, it is still useful without scripting.

I agree that <video> will probably have long sections in the
processing spec as well.  And <canvas> probably won't even be used by
people or tools that skip the processing spec.

But <h1> and <div> won't need the processing spec.

> Similarly <input> would probably not have the feature set it does
> today if scripting had been taken into account.

So figure out the processing model of things like event-source first,
and then poke just the tip of the iceberg through to the HTML
semantics spec.  While I expect the HTML semantics spec to stabilize
more quickly, it doesn't have to be completely frozen until the
processing spec is also ready.

> we would get lots of people just looking at one of the two specs
> and give review comments based on that.

Absolutely -- because their comments apply only to one of the two specs.

If the comments are really about something highly interactive, then
the reviewers need to look at the more complicated processing spec, and
they may need to consider both specs at once.

But if the comments are just about refining semantics, then why force
them to wade through long algorithms about what to do when the page
isn't even conformant?

-jJ
Received on Friday, 21 November 2008 08:21:16 UTC