Splitting up the spec

Hi All,

There has been lots of discussion, and now one draft, about splitting up 
the spec.

== Why split up the spec

I agree that splitting up the spec holds some benefits. One is that 
other specs can refer to the various parts without having to depend on 
the whole thing. It also makes reviewing and reasoning about the 
individual parts easier.

There is also some advantage in spreading the editing load. However, Ian 
has generally (but not always) been very quick to respond, so if this is 
the reason we are splitting up the spec, it puts some pretty tough 
requirements on the new editor: he'd have to be more responsive than Ian 
for the split to make sense.


== What to split up

I do have some very big concerns about the way people are suggesting we 
split the spec though. Some have suggested moving things like the SQL 
interface into a separate spec. This would make a lot of sense to me as 
those interfaces are generally not HTML specific. Same thing goes for 
the localStorage/sessionStorage interfaces, as well as the WebSocket 
interface. Breaking out these parts would make it easier to reason about 
and review them.

However, some have suggested moving error handling into a separate spec. 
Others have suggested moving the DOM and scripting specific parts into a 
different spec. This makes much less sense to me.


== Why splitting out error handling is a bad idea.

First of all, the reason that we are in this situation, with HTML being 
a total mess to parse, is in large part that the HTML4 spec left error 
handling undefined. This resulted in different browsers doing different 
things, many of them not thought through. After a few years of all 
browsers trying to be compatible with websites, and websites being 
authored by testing in different browsers, every browser has a mish-mash 
of error handling techniques.

Please, let us not repeat that mistake by treating error handling like a 
second-class citizen in the spec.

Another reason splitting error handling from the 'language spec' is a 
bad idea is that there are interdependencies. We've had to adjust 
aspects of the language due to how current browsers do error handling. 
Otherwise we would end up with a language which, when sent to existing 
browsers, would render gibberish. This would discourage people from 
applying appropriate semantics, since those would only render correctly 
in the latest browsers.

I 100% agree that it is unfortunate that the situation is like this. But 
wishing things were different is not going to change anything. (And if 
you want to play the blame game, I'm up for it, but in a separate thread 
please).


== Why splitting out DOM is a bad idea.

There are similar reasons why splitting out the DOM is a bad idea. 
Lessons of the past have shown that when the DOM is designed separately 
from the main language we end up with poor specs. The CSSOM is a prime 
example of this. Pretty much everyone now agrees that the CSSOM is a 
very bad idea and needs to be replaced. However, replacing it is very 
hard, since it's been a Rec for a long time and has already been 
deployed. This also applies to the HTML DOM, which is in better shape, 
but not as good as it could have been if developed together with the 
language itself.

There are heavy interdependencies between the language and the scripting 
model. For example <video> would not have made sense to add if the 
scripting model hadn't been taken into account. We would have simply 
said that <object> could have been used. Similarly <input> would 
probably not have the feature set it does today if scripting had been 
taken into account.

Since scripting and the markup need to be developed together, it makes 
little sense to me to put them in separate specs. Doing so just makes it 
harder to review and reason about the specs. I am quite sure we would 
get lots of people looking at just one of the two specs and giving 
review comments based on that. In fact, it sounds like that might apply 
to a few people here. Such comments would be harder to take into account 
since they would not be based on the full set of needed data.


So in conclusion, there are definitely parts of HTML I'd like to see 
broken out. But breaking out parts that have to be designed together is 
something I strongly object to. The only result, it seems to me, would 
be increased confusion. It doesn't actually make these things 
independent.

As a wise man once said (sort of): Everything should be made as simple 
as possible, but no simpler.

/ Jonas

Received on Friday, 21 November 2008 00:30:36 UTC