
Re: Charters for review

From: L. David Baron <dbaron@dbaron.org>
Date: Tue, 21 Nov 2006 15:35:31 -0800
To: Ian Hickson <ian@hixie.ch>
Cc: Chris Lilley <chris@w3.org>, Hypertext CG <w3c-html-cg@w3.org>, timbl@w3.org, dino@w3.org, steve@w3.org, www-archive@w3.org
Message-ID: <20061121233531.GA2475@ridley.dbaron.org>
On Tuesday 2006-11-21 21:05 +0000, Ian Hickson wrote:
> TIMETABLE
> 
> The milestones in the charter are somewhat unrealistic. I would suggest 
> the following timetable would be far more likely, based on past experience 
> with HTML4 (which is still not fully implemented by any two UAs), DOM, and 
> CSS2.1 (which is the only large W3C specification to have attempted a 
> serious disambiguation period):
> 
>     First Working Draft in October 2007.
>     Last Call Working Draft in October 2009.
>     Call for contributions for the test suite in 2011.
>     Candidate Recommendation in 2012.
>     First draft of test suite in 2012.
>     Second draft of test suite in 2015.
>     Final version of test suite in 2019.
>     Reissued Last Call Working Draft in 2020.
>     Proposed Recommendation in 2022.

This seems a bit slow to me.  Ambitious schedules encourage faster
progress.  And I think given good organization, adequate tools,
and adequate amounts of people's time, things could move a good bit
faster than they have in the CSS WG.

I think a good bit of the tension over the schedule reflects tension
over how many improvements are needed to make it worth pushing the
whole set of improvements through the process as a new version.  I
usually prefer small and frequent increments to large and infrequent
ones.  But I think the W3C process has a tendency towards large and
infrequent increments due to administrative overhead (rechartering
per increment, publication overhead) and the tendency of commenters
to repeat the same comments at every increment.

To put it another way, schedules like this don't reflect the fact
that some parts of the specification will likely be stable long
before others are.

CSS has worked around this problem by splitting CSS3 into modules
that are independently versioned and advanced.  However, this
workaround also has significant problems:
 * It is harder to write, harder to publish (extra administrative
   overhead), and harder to read (it's hard to find all the pieces,
   and some parts are made unnecessarily abstract so that they can
   be modular).
 * The separation between modules started off as a logical
   separation, but then got tweaked so that features at the same
   stability level could be published at the same time.  This makes
   the specifications even more confusing.
 * Interactions between features in different modules are unlikely
   to be properly tested.

The problem with the approach where a single large document advances
according to its slowest part is that many of the checks provided by
the process (such as test suites to verify that the specification is
being implemented interoperably) happen much later than they should.
Sections that are stable now should have test suites written now so
that implementations can continually improve.  For example, the
WHATWG <canvas> specification has been implemented by three browsers
(so clearly that *part* of the specification has reached
call-for-implementations), but there's no test suite, and there are
interoperability problems that would have been caught by a test
suite.

Ideally, different parts of a large document could be marked as
being at different stages in the process.  Perhaps that's a lot to
ask (both of W3C management and of document reviewers), but I think
it would help align specifications with reality.

Perhaps this could be approximated by maintaining a master document,
using a tool to remove the parts that aren't at a given maturity
level, and publishing the incremental updates to each maturity level
as HTML 5.00, 5.01, etc.?
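To make the idea concrete, here is a minimal sketch (in Python, purely hypothetical) of such a tool.  It assumes the master document annotates each section with a maturity marker like `<!-- maturity: CR -->` -- the annotation format and the level names are my invention for illustration, not anything the W3C process defines:

```python
# Hypothetical filter for a "single master document" whose sections
# are annotated with maturity markers.  Each marker line of the form
# '<!-- maturity: CR -->' applies to the text that follows it, until
# the next marker.  The tool keeps only sections at or above the
# requested maturity level.

MATURITY_ORDER = ["ED", "WD", "LC", "CR", "PR", "REC"]  # assumed level names

def filter_by_maturity(master: str, level: str) -> str:
    """Return only the sections of `master` at or above `level`."""
    threshold = MATURITY_ORDER.index(level)
    keep = False
    out = []
    for line in master.splitlines():
        stripped = line.strip()
        if stripped.startswith("<!-- maturity:"):
            # Extract the level name from the marker and decide
            # whether the following section survives the cut.
            tag = stripped.removeprefix("<!-- maturity:").rstrip("->").strip()
            keep = MATURITY_ORDER.index(tag) >= threshold
            continue
        if keep:
            out.append(line)
    return "\n".join(out)
```

Running it once per maturity level would yield the per-level publications suggested above, with the stable sections appearing in every one.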

> TESTING
> 
> I also think it is critical that the volume required of the test suite 
> deliverable be quantified. The term "comprehensive test suite" is vague at 
> best; the working group is likely to steadily reduce its opinion of what 
> is "comprehensive" as the work progresses. It is far better, in my 
> opinion, to have a very specific goal in place, for example, "an average 
> of 2000 tests per chapter". This gives the working group an unambiguous 
> goal. The goal is somewhat arbitrary, and, after discussion, could be 
> changed by rechartering, but this would require a clear intent, rather 
> than it being a subconscious change over time.

And, behold, the specification containing 20 chapters suddenly
became a specification containing 1 chapter, with 20 sections within
the chapter.

I'd rather define a comprehensive test suite as one that:

 * has tests adequate to verify that each normative sentence in the
   specification applying to browser conformance (as opposed to
   document or authoring tool conformance) is correctly implemented
   in at least the basic cases, and

 * has tests adequate to verify correct interaction of features
   whose interaction is interesting or potentially problematic.

The rule about sentences should make the complexity of the test
suite scale with the complexity of the spec, and it also provides a
further incentive to keep the specification simple.
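One could even estimate the required size of such a test suite mechanically.  The sketch below (hypothetical; the keyword list and the naive sentence split are my assumptions, not part of any proposal above) counts sentences containing RFC 2119-style conformance keywords as a rough lower bound on the number of tests needed:

```python
import re

# Rough sizing sketch for the "tests per normative sentence" rule:
# count sentences that contain RFC 2119-style requirement keywords.
# A real tool would also have to separate browser-conformance
# requirements from document/authoring-tool conformance, which this
# toy version ignores.
NORMATIVE = re.compile(r"\b(must not|must|shall|required)\b", re.IGNORECASE)

def count_normative_sentences(spec_text: str) -> int:
    """Lower bound on test count: one per normative sentence."""
    # Naive sentence split on terminal punctuation followed by space.
    sentences = re.split(r"(?<=[.!?])\s+", spec_text)
    return sum(1 for s in sentences if NORMATIVE.search(s))
```

This is crude (keywords appear in non-normative notes, and not every requirement uses them), but it would at least make the "comprehensive" goal measurable per section rather than per chapter.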


(I might have some things to say about what you wrote about decision
process later -- including the last of your conclusions in the
openness section.)

-David

-- 
L. David Baron                                <URL: http://dbaron.org/ >
           Technical Lead, Layout & CSS, Mozilla Corporation

Received on Tuesday, 21 November 2006 23:35:58 GMT
