Re: Renamed topic: focus and length of HTML5

On Sat, 5 Dec 2009, Shelley Powers wrote:
> >>
> >> Now is exactly the time to consider removing it.
> >
> > Now is a time to consider removing it entirely (it's always that time, 
> > for that matter), but it doesn't make sense to put it in a separate 
> > spec.
> 
> Ah, OK. You have a good point here.

Great, I'm glad you agree.

Does that mean you no longer support Manu's proposal?


On Sat, 5 Dec 2009, Julian Reschke wrote:
> Ian Hickson wrote:
> > > > Even if we grant for the sake of argument the premise that it is not, 
> > > > it is still the case that the WHATWG has reached a later milestone 
> > > > than the
> > >
> > > He who defines what a milestone is...
> > 
> > I'm not sure to what you are referring here. I would have thought the 
> > milestone of "zero open issues" was a pretty uncontroversial milestone 
> > definition -- it's the same one we're using in this working group, no?
> 
> I simply don't think "no open WHATWG issues" is a significant milestone, 
> when we have several other bug trackers containing open issues.

Do you think the W3C HTML WG should hold up progress if the WHATWG claimed 
to have unresolved issues, even if those same issues had been resolved in 
the W3C HTML WG?


> > > > HTML5 and a number of other specs (XHR, Origin RFC, Sniffing RFC, 
> > > > Web Sockets, CSSOM, CSSOM Views, Web Storage, etc) are working 
> > > > closely together as it stands today. It doesn't seem to have been 
> > > > a problem.
> > >
> > > You mentioned a lot of specs that are separate from HTML5. So this 
> > > seems to support the point that modularity is good.
> > 
> > That the specs exist is not a point in support of modularisation or a 
> > point against it. In fact there is also a spec that exists that has 
> > many of those specs in one document. By your logic that would support 
> > the point that modularity is bad, which is a contradiction (since both 
> > exist).
> 
> The existence of a document that combines the contents from a set of 
> smaller documents doesn't prove anything.

I agree. That's my point. The existence of a set of documents that split 
the contents from a bigger document doesn't prove anything either, counter 
to your claim above (where you said "this seems to support the point that 
modularity is good").


> > I wasn't implying that I would edit them. From the point of view of 
> > editing _any_ specification, having more smaller specifications that 
> > interact with it would make the editing take longer. For instance, it 
> > is easier for me to write specifications that interact with CSS2.1 
> > than CSS3 modules.
> 
> And that's why? In addition to the fact that you need a longer table of 
> document references?

I'm not sure why. I'm just reporting my experience here. My experience 
seems relevant if I'm to edit some of the specs in question; at least as 
relevant as Shelley's opinion and your opinion on the matter.


> > > > > Microdata doesn't need to be in the spec [...]
> > > >
> > > > Microdata "needs" to be part of HTML5 because it is part of the 
> > > > language. Taking Microdata out of HTML5 makes about as much sense 
> > > > as taking out <h1>, <p>, or title="", IMHO.
> > > 
> > > First of all, microdata is truly optional; it's not part of HTML4, 
> > > and it's not in actual use except in experiments.
> > 
> > Sure, it's optional in the same sense that <section> is optional, or 
> > in the same sense that <img> is optional. However, that doesn't mean 
> > we should put them in separate specifications -- if they're part of 
> > the language at all, then they should be defined in the same language 
> > specification.
> 
> If we didn't have consensus for <section>, then yes, the best thing 
> would be to remove it and develop a spec separately.

If we didn't have consensus for <section>, then we shouldn't have it at 
all. Similarly with microdata -- if the working group doesn't think we 
should be working on it, then we shouldn't be working on it -- which 
document something is in has nothing to do with whether we agree with it 
or not.


> <img> is in wide use and uncontroversial, so I don't see the connection.

Microdata is significantly less controversial today than <img> was when it 
was introduced.


> microdata has no significant deployment

It's only just reached LC at the WHATWG, and isn't even in LC at the W3C; 
surely it wouldn't be appropriate per W3C process for it to have any 
deployment at all.


> is controversial

That doesn't seem particularly relevant. Lots of things are controversial; 
that doesn't mean we shouldn't address their use cases.


> and competes with another W3C technology, so the situation simply is 
> different.

Technologies competing is not a problem.


> > > How can you say "it's part of the language" then? Unless you mean 
> > > "it's in the spec, so by definition it is part of it".
> > 
> > I mean "it's part of the language", in the same way that <ruby> was 
> > part of XHTML, despite being defined in a different document. 
> > Microdata adds
> 
> Is it?

Yes.


> > elements, attributes, and DOM APIs that are an integral part of 
> > text/html documents. Defining parts of the language in different 
> > documents leads to a fragmented language with an incoherent design (a 
> > problem we've seen all too often in HTML's history for exactly this 
> > reason).
> 
> Defining a language with clear extension points helps developing specs 
> at different speeds, and helps adding stuff later.

Microdata _is_ an extension point.


> > IMHO this is not a relevant difference. I have been editor of several 
> > specifications that have gotten to LC or even CR and still had huge 
> > sections removed. I've also been part of several efforts (including 
> > HTML5 itself) that have dropped entire sections after those sections 
> > have reached REC. I do not think that the status of the document in 
> > the W3C process would have any effect on our ability to drop the 
> > sections.
> > 
> > (Other factors would; e.g. whether or not implementations exist.)
> 
> So a much less controversial way to do this would be to develop a 
> feature like this as a separate spec, and then *potentially* consider 
> integration once it has succeeded in practice.

That line of reasoning would have us split the HTML5 spec into dozens or 
maybe even hundreds of subspecs. I don't think that's a sensible approach 
to spec development.


> The only reason why Microdata is in the spec right now is because you 
> made that choice, and had the editorial power to do so.

We're not arguing about whether microdata is in the spec now, but whether 
it _should_ be in the spec now. If it wasn't in the spec now, we'd still 
be having this argument.


On Sat, 5 Dec 2009, Shelley Powers wrote:
> 
> Individual modules edited by multiple people would most likely take no 
> longer, and in fact, would take a shorter period of time than if one 
> person was editing.

This is academic, since we don't have more editors. There are literally 
dozens of specs that are just lingering with no editors right now. If we 
had editors, there are far more important and more useful things they 
could do than edit documents that already have editors.


> Even as a writer, I'm part of a team. There are individuals that 
> determine whether there is a market for a book. I have a lead editor who 
> helps ensure my Table of Contents and outline touches all of the 
> important points of the topic. There are tech reviewers who ensure that 
> the material is correct (or as correct as possible), and then copy 
> editors who help me find all my typos. Then there's the production 
> staff.

It sounds quite similar to HTML5's development. HTML5 has been developed 
by literally hundreds of people -- just see the acknowledgements.


> So, no, I completely disagree that it would take longer to have more 
> editors.

I didn't say it would take longer to have more editors. I said it would 
take longer -- does take longer, today, in practice -- when specs are 
divided into small modules instead of having one document per focus 
area. One document for the language layer (HTML/SVG/MathML/DOM), one 
document for the network layer (HTTP), one document for the styling layer 
(CSS/CSSOM/XBL), etc. We'll never get to the level I'd like, but we can 
get close, e.g. by splitting the language layer into four documents giving 
the vocabulary and APIs for HTML, SVG, MathML, and one for the common 
syntax (XML). Similarly, CSS and XBL can be split. But I think it would be 
a terrible mistake to further subdivide -- splitting HTML into smaller 
bits would cause problems just like splitting CSS into smaller bits has 
caused problems. It _does_ take longer when the specs are split up. This 
isn't theoretical, it's going on right now.


> More importantly, more editors ensures an essential comprehensiveness.

Actually in my experience it's the other way around -- editors tend to 
silo themselves, leading to gaps between specs. For example, separating 
HTML4, DOM2 HTML, and XHTML1 led to huge gaps in the specs that we spent 
significant effort fixing in HTML5. Avoiding this has been one of the 
important features of work with Adam, Anne, Lachlan, and Larry (who have 
edited specifications spun out of HTML5), and it has not been easy. Ask 
Anne, for example, about handling the event loop mechanism. Ask Adam or 
Larry about ensuring that we keep a coherent interface between their specs 
and HTML5. It's easy to see how having more editors can quickly result in 
a _loss_ of comprehensiveness -- quite the opposite of ensuring it, as you 
assert above.


> I'm curious -- you've said multiple times that you really don't like 
> Microdata in the spec. I actually linked an email that said this. So 
> now, you do like having Microdata in the spec?

I don't like microdata. I think if it exists, however, it should be in the 
spec. (I think microdata is a much better solution for its problem space 
than other technologies that have addressed that same problem space 
previously, but personally I don't really think we should be even trying 
to solve that problem space.)

My opinion is irrelevant to this discussion, though (as is yours) -- the 
chairs have indicated that we are to provide rationales, not argue based 
on our opinions.


> As for being defined in the same spec, because Microdata is based in
> HTML, why? There are probably hundreds of thousands of examples of
> libraries and extensions and modules that are based on a "language"
> but are not part of the core. Why would HTML be different? HTML is a
> markup language--nothing more, nothing less.

What should be part of HTML is what affects the conformance criteria for 
HTML conformance classes. Microdata does. I guess a parallel with software 
development would be that code that implements the interface for a library 
should be in that library.

(Also, HTML is far more than just a markup language. It's also a 
vocabulary and a set of APIs, and is part of an application platform.)


> >> If we'd buy that argument we'd have to re-open lots of discussions 
> >> about other features such as RDFa, @profile, whatnot.
> >
> > Yes, all of those should be in HTML5 if we think they should exist at 
> > all.
>
> I have to ask you to put on your application developer's hat and think 
> about what you would think if a person creating an application said, 
> "Oh, I can't use libraries, or other modules, or other people's objects 
> -- I have to create everything in this application, from scratch, and 
> all in one file". I believe you would be appalled at such an attitude, 
> or probably scathing of the resulting app.

This seems to be a complete non-sequitur. HTML makes use of a great deal 
of other technologies. We're not reinventing everything from scratch. That 
has nothing to do with what we're talking about.


> I could go on, but you're a tech, you know what the rules of good 
> application development are. The rules for developing the specification 
> should be no less.

The "rules" you speak of do not suggest randomly fragmenting a 
specification or application along arbitrary lines based on the 
preferences of members of the team. Applications separate logically 
separate components. Using your analogy, what you and Manu are proposing 
is akin to taking an application and putting some functions in another 
file despite them being part of the same component. What you are proposing 
is emphatically _not_ equivalent to separating unrelated code into 
different modules with minimum interaction.


> >> > > What's another thing we know from application development? It's 
> >> > > easier to add at a later time, than it is to remove.
> >> >
> >> > This indicates a lack of familiarity with HTML5's development over 
> >> > the past few years. We have dropped numerous sections with far less 
> >> > effort than they took to be written. Repetition templates, 
> >> > <datagrid>, form prefilling, space-separate form="", peer-to-peer 
> >> > TCPConnection, the <eventsource> element, <datatemplate>, <font>, 
> >> > the entire "out of scope" section, several introduction sections... 
> >> > Not to mention the numerous sections that were split into other 
> >> > specs, such as XMLHttpRequest, Web Storage, Web Database, Web 
> >> > Workers, Web Sockets API, Web Sockets protocol, the Server-sent 
> >> > Events API, Content-Type sniffing, and URL parsing.
> >>
> >> One difference is that these were removed *before* LC.
> >
> > IMHO this is not a relevant difference. I have been editor of several 
> > specifications that have gotten to LC or even CR and still had huge 
> > sections removed. I've also been part of several efforts (including 
> > HTML5 itself) that have dropped entire sections after those sections 
> > have reached REC. I do not think that the status of the document in 
> > the W3C process would have any effect on our ability to drop the 
> > sections.
> 
> It should impact, because the closer we get to CR, the more others 
> implement the functionality. For instance dialog: you'd be surprised at 
> the number of articles and tutorials online that cover dialog. And 
> that's with the spec still only in first draft.

You seem to have done a complete 180 in your argument here. First you were 
arguing that it IS easier to add a feature later than remove it, and now 
you are implying that it SHOULD be easier! What _should_ be easier or 
harder is not at issue here. The point is that if you think we need to 
ensure it is easy to remove microdata later, then we should make sure that 
it is -- and indeed we have, as I have explained.


> >> >> It's easier to get a specification cleanly finished when its 
> >> >> focused on the technology it's supposed to be focused on.
> >> >
> >> > HTML5 _is_ focused on the technology it's supposed to be focused 
> >> > on.
> >> >
> >> > Even if we grant for the sake of argument the premise that it is not, it 
> >> > is still the case that the WHATWG has reached a later milestone 
> >> > than the HTMLWG, despite the "lack of focus" you infer. Therefore 
> >> > it seems to be untrue that the current development model will 
> >> > prevent progress: it is in fact arguably the requests that the 
> >> > specification be more "focused" that are, in part, preventing that 
> >> > same progress in this working group.
> >>
> >> Yes, but the WhatWG group went to last call when there were still 
> >> issues and bugs pending in the W3C bug database and issues tracker.
> >
> > Are you saying that the W3C HTML WG should wait for WHATWG process 
> > issues to be addressed, if the situation was reversed?
> 
> I'm saying that knowing there were bugs and issues still open in a 
> database you partnered with. When we're ready to go to LC here, I would 
> hope that if there are bugs in the WhatWG database, that someone has 
> copied said bugs into the W3C bug database, so everything is tracked in 
> one place.

But what if the bugs were then resolved in the W3C bug database, but not 
the WHATWG one? Would you suggest that the W3C should continue waiting for 
a resolution on the WHATWG side before publishing, even if there was no 
timetable for doing so?

That's the equivalent of what happened here -- the W3C issues have all 
been given full consideration in the WHATWG, the WHATWG just came to 
conclusions faster.


> Regardless, the announcement caused confusion. We should do everything 
> in our power not to cause confusion.

I haven't seen any confusion. Do you have any pointers?


> >> >> It's easier to integrate the specification with other 
> >> >> specifications, if it's focused.
> >> >
> >> > What do you mean by "integrate" and "focused" in this context?
> >>
> >> No fancy stuff with words here, dictionary definitions will do.
> >
> > "integrate" means "combining parts so that they become a whole", which 
> > seems to be the opposite of what you are requesting, so I still don't 
> > understand what you mean.
> 
> No, I stand by this. A web page is made up of many parts, each of which 
> is defined in a different specification. CSS, HTML, ECMAScript, even the 
> protocols -- all working together to create a whole that works, and is 
> displayed as the web page.
> 
> All these specs don't have to be in the same document. The same as 
> applications don't have to consist of one single file.

I agree entirely. The separate parts should be in separate documents. CSS, 
HTML, ECMAScript, and HTTP are separate parts. Microdata, however, is part 
of the HTML part, just like title="", class="", id="", onclick="", <img>, 
and so on.

Your line of argumentation doesn't support your premise, unless you also 
think we should separate each set of elements and attributes into its own 
separate spec.


> > "focused" means "concentrating interest or activity on something", 
> > which we have done successfully for many years on HTML5 (the 
> > "something" being improving the state of the Web technology stack for 
> > the purposes of writing Web Applications).
> 
> HTML5 is HTML, an XML serialization, and the DOM. We should be focused 
> on HTML. This is not the Web Applications group. I wonder if perhaps 
> you're not happy being the editor of just HTML. Would you be happier 
> working more closely with the Web Apps group, and we could work on 
> finding new editors who are happy working on just HTML for this group's 
> efforts?

My happiness really has nothing to do with this discussion.


> I'm saying that a good specification, like a good library, can be 
> created in such a way as to encourage innovation, rather than strangle 
> it.

Splitting Microdata into its own specification would strangle innovation.

(Well, if you're allowed to make completely non-sequitur soundbite-like 
arguments without evidence to back up your position, I figure I am too.)


> >> >> Just the same as applications are cleaner, and better when they're 
> >> >> based on modularization.
> >> >
> >> > I do not think this statement is a truism.
> >
> > If you meant that they should be focused on small topics, rather than 
> > large topics, then my argument above still holds. Specifications 
> > focused on small topics taken as part of a group of specifications 
> > that cover a larger topic have historically in the W3C been 
> > significantly less successful than "monolithic" specifications focused 
> > on the larger topic as a whole. CSS3 vs CSS2, and the XHTML 
> > Modularisation specifications vs XHTML1.0, are good examples of this. 
> > I'm not aware of any examples where the opposite is true (where a 
> > specification focused on a small topic that is part of a group of 
> > specifications that cover a larger topic has been more successful than 
> > its monolithic counterpart).
> 
> Wow, I think you're extrapolating a lot. For one, I don't think CSS3 is 
> a failure. Seems to me, there's a lot of CSS3 in many of the new 
> browsers. As for XHTML 2.0, I don't believe the topic size was the 
> primary problem with that effort.

I didn't say CSS3 was a failure, I said CSS3 was less successful than 
CSS2. If you do not think this is the case, maybe you haven't been 
involved with the CSS working group enough.

I didn't mention XHTML 2.0 at all. I talked about XHTML Modularisation, 
which was a 1.x-series set of specifications. That you apparently haven't 
heard of it, or that you confused it with XHTML2, may in fact demonstrate 
my point perfectly.


> >> To me, the fact that HTML was separate from the DOM was separate from 
> >> the CSS and SVG and so on, meant that these specifications could 
> >> progress at their own rate, but there is a good degree of integration 
> >> between the specs.
> >
> > The fact that SVG and CSS were separate from each other led to a long 
> > set of problems that we are _still_ dealing with. Similarly, 
> > separating HTML and the DOM and CSS and the DOM led to such a 
> > complicated set of issues that we are still now, more than a decade 
> > later, trying to fix the mess. Separating SVG and HTML led to the pain 
> > felt in this very working group earlier this year and last year in 
> > trying to reintegrate them, and will lead to decades of ripple effects 
> > as the resulting integration affects every HTML parser from here on 
> > out, and every Web author that tries to use SVG and HTML in the same 
> > page.
> 
> There will always be decisions that were made that shouldn't have been 
> made, and bumps and hiccups along the way. We learn, and we work to do 
> better.

Apparently we _don't_ learn. You are, after all, arguing in favour of the 
kinds of mistaken decisions that led to the exact situation I described. 
If we were learning, we would avoid making those mistakes again.


> >> I don't see broad implementation of HTML5 in user agents or the 
> >> community.
> >
> > Then, with all due respect, you're not looking.
> >
> >   http://wiki.whatwg.org/wiki/Implementations_in_Web_browsers
> >   http://a.deveria.com/caniuse/#agents=All&eras=All&cats=HTML5&statuses=rec,cr,wd,ietf
> >   http://html5gallery.com/
> 
> I repeat, I do not see broad implementations of HTML in user agents, or 
> in the community.

I repeat, with all due respect, that can only mean that you aren't 
looking. I provided links above demonstrating my point. If you would like 
to disagree, I would request that you at least quantify your argument or 
provide some sort of backing for it. I'm sure that simply asserting 
positions that point-blank contradict provided evidence is not what the 
chairs had in mind when they suggested the use of rationales to support 
our arguments.


> >> >> Microdata doesn't need to be in the spec [...]
> >> >
> >> > Microdata "needs" to be part of HTML5 because it is part of the 
> >> > language. Taking Microdata out of HTML5 makes about as much sense 
> >> > as taking out <h1>, <p>, or title="", IMHO.
> >>
> >> Microdata has not been implemented as part of HTML4, or by any 
> >> browser that I know of. I don't think any but a few have implemented 
> >> in their sites.
> >
> > The same applies to many other features of HTML5 -- should we put them 
> > in their own spec too? Should we put everything that isn't in HTML4 in 
> > a separate spec than HTML5? This seems to be a non-tenable position 
> > that would lead to a highly fragmented, and not very useful, set of 
> > documents.
> 
> I think we need to stop trying to control everything.

That has absolutely nothing to do with the subject at hand. We're talking 
about whether or not a particular feature of HTML5 should be in the same 
document as the rest of HTML5. That doesn't affect control in any way.



> >> >> What's another thing we know from application development? It's 
> >> >> easier to add at a later time, than it is to remove.
> >> >
> >> > This indicates a lack of familiarity with HTML5's development over 
> >> > the past few years. We have dropped numerous sections with far less 
> >> > effort than they took to be written. Repetition templates, 
> >> > <datagrid>, form prefilling, space-separate form="", peer-to-peer 
> >> > TCPConnection, the <eventsource> element, <datatemplate>, <font>, 
> >> > the entire "out of scope" section, several introduction sections... 
> >> > Not to mention the numerous sections that were split into other 
> >> > specs, such as XMLHttpRequest, Web Storage, Web Database, Web 
> >> > Workers, Web Sockets API, Web Sockets protocol, the Server-sent 
> >> > Events API, Content-Type sniffing, and URL parsing.
> >> >
> >> > No, if there's one thing we _do_ know about HTML5, it's that we've 
> >> > had no trouble dropping features. In fact the _only_ exceptions I 
> >> > know about are the ones that you mentioned, and the only reason 
> >> > we've had trouble dropping them is that you and others have 
> >> > objected to dropping them (summary="", for example). I presume that 
> >> > wouldn't apply here, since you are specifically _asking_ that we 
> >> > drop them.
> >>
> >> Not after LC or CR. Once HTML5 hits the streets, it is going to be 
> >> extremely difficult to remove features.
> >
> > It's difficult to remove _implemented_ features regardless of what 
> > stage in the process we are at. However, splitting microdata into a 
> > separate specification (the decision that is before us) does nothing 
> > to change that.
> 
> Actually, this kind of counters what you wrote above.

Could you elaborate? What contradicts what?


> >> The size in pages has little to do with anything. Well, other than 
> >> some of the HTML5 related publications cause browsers to fail.
> >
> > I disagree with the premise that there are parts of HTML5 remaining 
> > that have nothing to do with HTML, its serialisations, and its DOM 
> > APIs. Certainly, the microdata section is an integral part of the HTML 
> > language and its APIs, and so the argument above at a minimum doesn't 
> > apply to this thread, even if it turns out there are parts to which it 
> > does apply.
> 
> We have to disagree then. Material specific to only a subset of user
> agents does not belong in the HTML 5 specification. And there are
> other aspects, as I've written up in bugs.

Is there _any_ content in HTML5 that is _not_ specific to only a subset of 
user agents? If that is your criterion for keeping something in the spec, 
what would you suggest we should _keep_?


> Speaking of which, how are you doing with handling all of the bugs? Do we 
> need to consider adding another editor to help you with all the bugs?

I have been working on other things these past few weeks (including a 
vacation), and will likely not return to dealing with bugs until January. 
Right now we have fewer than 200 bugs; the most we've ever had was 210, 
and that took less than a month to deal with. Since we currently have 32 
issues to deal with, and those take in the region of a month each to deal 
with, it is my understanding that I am not the bottleneck at the moment. 
However, I am in regular contact with the chairs and will ensure that bugs 
are dealt with at a suitable pace.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Sunday, 6 December 2009 05:15:23 UTC