Re: Splitting up the spec

On Sat, 22 Nov 2008, Toby A Inkster wrote:
> 
> Personally, I don't think they belong as part of the spec at all. Don't 
> get me wrong: it's a brilliant algorithm, and I've found it invaluable - 
> far better than my own attempts before it was pointed out to me. But it 
> doesn't need to be included in the spec. A separate W3C note would 
> suffice; or possibly as an informative appendix to the HTML5 markup 
> spec.

So you think that it's ok for two tools that claim to generate outlines 
for HTML documents to generate substantially different outlines?

That seems like a really bad situation to me. People complain all the time 
about browsers handling pages differently, and I see this as being no 
different (indeed, the two tools could well be Web browsers).


> Here's my current thoughts on how I think the HTML5 spec should be split:
> 
> 1. _HTML5 and DOM5_: Define the language syntax, the elements and their 
> content models, the attributes, and the DOM (i.e. the "document" object 
> in Javascript, but no other part of the document object). The parsing 
> algorithm should be included. The outline algorithm could be an 
> informative appendix or a separate note. Effectively chapters 1-3,8 of 
> the current spec.
> 
> 4. _Server-Sent Events for Javascript_: Chapter 6 of the current spec.
> 
> 5. _Offline Web Applications_: Section 4.7 of the current spec.

The above seems to be self-contradictory. Server-sent events are part of 
the processing model of <eventsource>, an HTML5 element. And the offline 
Web applications stuff is just the processing model of the <html 
manifest=""> attribute, again part of the language.


> 2. _Storage for Javascript_: the Storage interface, sessionStorage, 
> localStorage, the storage event. Section 4.11.2 of the current spec, and 
> some other parts of 4.11.1.
> 
> 3. _SQL for Javascript_: database storage. This would include the 
> scripting interface to the database, plus the minimal conforming subset 
> of SQL. Section 4.11.2 of the current spec, and some other parts of 
> 4.11.

I agree that these could be split out without much harm to the schedule, 
though I don't think splitting them out gains us much in practical terms. 
I propose to take these out and make separate documents for them once we 
have less feedback pending.


> 6. _HTML Link Vocabulary_: possibly not a spec, but a formal registry. 
> This would start off as 4.12.3 of the current spec, but would probably 
> expand.

No opinion here, though I should note that some types, e.g. rel=icon, 
interact with the language (sizes=""), and some others, e.g. 
rel=stylesheet, have DOM implications.


> 7. _The HTML5 Browser_: this would normatively reference all of the 
> above specifications, plus CSS 2.1, ECMAScript, XHR and Workers.

I don't think it's a good idea to make a spec that defines normatively 
what standards are "in" and what standards are "out" because that's the 
kind of decision that should be left up to the market. But as a post-facto 
spec that might make sense.


> Also a lot of the non-"document.*" Javascripty bits from the current 
> HTML5 spec should be included: Window, History, Location, UndoManager, 
> the draggable stuff, the contentEditable stuff, etc.

draggable and contenteditable are attributes of the language, fwiw.


> Same origin; content-type sniffing. Much of chapter 4 and all of chapter 
> 5 from the current spec.

The browsing context stuff is pretty important for the definition of 
document.write, which is pretty tightly linked to the parser, so I'm not 
sure it makes sense to split out the browsing context stuff from the 
parser spec. It certainly would be a huge amount of work.


> Also, included should be instructions for how user-agents should 
> interpret any obsolete elements which they may encounter - you know, the 
> really horrible ones that we don't want to be part of the markup spec, 
> but browsers must still know how to deal with: <xmp>, <font>, etc.

Your number 7 sounds like a combination of specs "1" and "10" in the list 
of sections to split out that I mentioned.


> It has been mentioned by some people in this discussion that there is a 
> lot of subtle interplay between these different parts, but actually I 
> think that's even better justification for splitting them. It will allow 
> us to see the ties between them because they will need to be made 
> explicit by including normative references between the specifications.

Why is this an advantage? We already know the ties; what does making them 
more explicit gain us?


> Once we can see these dependencies we can look at them one by one, and 
> decide how they should be resolved: by keeping the normative reference, 
> or perhaps redefining something in a different way to avoid the 
> dependency.

It's pretty hard to redefine a lot of this stuff; it's all de facto 
already. What benefit would there be to adjusting the _way_ something is 
defined if it ends up more complicated than it is now, while meaning the 
same thing?


> An example might be that server-sent events rely on the <event-source/> 
> element. When we saw that dependency between _Server-Sent Events for 
> Javascript_ and _HTML5 and DOM5_ we might think: why is <event-source/> 
> needed? Perhaps an event source could be established entirely via 
> script?

We've already made that design decision. It's not like the current design 
came out of nowhere. :-)


> > Splitting up the specs to help make progress faster, measured either as
> > reaching CR faster, decoupling dependencies that might reach REC faster,
> > getting specification text written faster, or, most importantly, getting
> > interoperable implementations faster, is interesting. IMHO, splitting up
> > the specs for no reason other than editorial preference is not
> 
> Splitting up the spec can serve several useful purposes:
> 
> A. It allows parts of the spec to be finalised before other parts.

I disagree with the idea that having the spec be one document means that 
parts _can't_ be finalised before other parts. Indeed, the HTML5 spec 
today is a counter-example, as e.g. <canvas> is far more stable than <q>.


> As I pointed out as an example before, a full and useful definition of 
> SQL for Javascript is still a long way off. Splitting the spec would 
> allow other important parts such as the markup language and DOM to enter 
> the recommendation track without being held up by less complete parts.

I think it would make sense to split sections out for this reason *once 
this becomes a problem*. But we're nowhere near there yet.


> B. It makes it easier for other markup languages to incorporate 
> components. For example, MathML, XHTML2 or SVG might like to incorporate 
> Javascript storage, or server-sent events; OpenDocument might use the 
> link vocab. This is a lot more practical if they are their own separate 
> components.

Again, if that becomes an issue, then it makes sense to then split out the 
sections. But again, as far as I'm aware, the only sections for which that 
is a problem right now are browsing contexts, Window, and origin, all of 
which are so tightly integrated with HTML that I can't really see them 
being separated anyway.


> C. The aforementioned additional clarity regarding how the different 
> parts are related.

See notes above.


> D. Going forward, past 2022, allowing new versions of each component 
> spec to be worked on independently.

Well, obviously, if individual parts need to evolve separately in 
future, the spec can be split at that point; that's no problem.


> Splitting the spec up wouldn't need to lead to the components diverging. 
> The same working group could continue to work on the collection.

In practice, *every single split that has yet happened* has ended up with 
the section going to another working group. So I don't buy that. (Such 
sections include XHR, Web Workers, Window, and Selectors API.)


> People writing HTML5 browsers would still have a definitive spec to aim 
> for: _The HTML5 Browser_.

I don't know why there is such an obsession over the browser. There are 
plenty of other UAs for which this work makes sense, like search engines, 
data mining tools, validators, authoring tools, etc.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Saturday, 22 November 2008 01:41:33 UTC