RE: Conversion of MS CSS 2.1 tests to reftests from John Jansen on 2010-09-20 (public-css-testsuite@w3.org from September 2010)

From: John Jansen <John.Jansen@microsoft.com>
Date: Mon, 20 Sep 2010 16:28:49 +0000
To: Geoffrey Sneddon <gsneddon@opera.com>, fantasai <fantasai.lists@inkedblade.net>
CC: Arron Eicholz <Arron.Eicholz@microsoft.com>, "public-css-testsuite@w3.org" <public-css-testsuite@w3.org>
Message-ID: <C340671BECD4364E8F9EBA27E8E2313219D2AF21@DF-M14-04.exchange.corp.microsoft.com>
> -----Original Message-----
> From: Geoffrey Sneddon [mailto:gsneddon@opera.com]
> Sent: Tuesday, September 14, 2010 10:23 AM
> To: fantasai
> Cc: John Jansen; Arron Eicholz; public-css-testsuite@w3.org
> Subject: Re: Conversion of MS CSS 2.1 tests to reftests
> 
> On 10/09/10 23:12, fantasai wrote:
> > On 09/10/2010 09:15 AM, John Jansen wrote:
> >>
> >>>
> >>> On 09/09/2010 12:46 PM, Geoffrey Sneddon wrote:
> >>>> Hey,
> >>>>
> >>>> Attached is a diff to convert 830 tests to reftests (with a mere
> >>>> four references!). The list of tests for each reference is based
> >>>> upon tests that pass in Opera with the same screenshot, so anything
> >>>> that we fail won't be included, but still, 830 tests is a nice
> >>>> start.
> >>>>
> >>>> It'd be nice if you could apply the diff (just adding five new
> >>>> files), and adding it to the build system, sometime in the near
> >>>> future.
> >>>
> >>> Arron, if this looks okay to you, I'm happy to check it in for you.
> >>>
> >>> ~fantasai
> >>>
> >> I would like to get some clarification on this ask of Arron before he
> >> commits to this work.
> >>
> >> Based on my understanding of how test ownership works, you are asking
> >> Microsoft to take ownership of these references moving forward, and
> >> so any changes to the tests referenced here will require that we
> >> update the references themselves. Correct?
> >
> > Only changes to the test that would also change the intended rendering
> > would require updating the references. For the kinds of tests Geoffrey
> > is creating references for, this doesn't seem particularly likely.
> >
> > Also, if the concern is about maintaining references, I suspect
> > Geoffrey would be willing to handle maintenance of the references if
> > Arron is willing to delegate that job to him. (Geoffrey can confirm or
> > deny that.) The main ownership issue here is that Arron is managing
> > those directories, and should be aware of anything that affects them.
> 
> Yeah, I'm happy to maintain the references (at the same time, I agree with
> fantasai that they're unlikely to change, though as more tests are automated
> that may become less true, but I'd hope the spec doesn't change in a way to
> break too many tests with the timeline Daniel was talking about at the F2F).
> 
> >> Is your expectation that the 830 tests here are just the beginning?
> >>  The test suite should be locked down in 5 days and then
> >> Implementation Reports should be complete within 30 days. If there
> >> are more changes coming, that seems to go against 'locking down'.
> >
> > Adding references doesn't change the test, it just makes it easier to
> > run. (The tests can still be run manually, too.)
> >
> > I think we do want most of the CSS2.1 test suite to be automatable,
> > although I don't think we'll get there within the REC time frame.
> 
> Nor do I, but I would be pleased to see a second edition of the test suite a
> number of months after the spec reaches REC (given the current
> timezone) which does reach that goal.
> 
My entire concern is the current time frame. I realize I'm new to the group, but I am really just concerned with being done with 2.1 and moving on. For me, if we need to discuss any 2.1 issues at TPAC, that is a failure on our part.

We've learned a lot from working on this suite, and I think that learning informs the 3.0 suite nicely, but I am loath to spend more time on 2.1 now that we are in the very end game.

We planned and reached consensus on a test process as a working group. Changing or adding any processes to the working group's test suite at this point will only add risk and delay. The test cases are done and the suite is nearly complete. Why would we re-engineer anything that is already done and working? I oppose any change to the tests (other than the ongoing agreed-upon review process) or to the agreed-upon test case process at this point, because I believe the end result will be a delay in getting the implementation reports.

As we've already seen with Mozilla's feedback over the weekend, the primary cost of the implementation reports is going to be reviewing the tests for accuracy, not actually running them. I believe the benefit of changing some percentage of them to reftests is only incremental and will cause an overall delay. Please note, too: the test suite has been complete since January, so I was surprised (again, I'm new to the group, so forgive my surprise) that the group would want to start this work now, when we are literally just weeks from being done.

> >> If there are no additional changes coming, it seems like reftesting
> >> ~8% of the suite isn't worth the cost/benefit analysis. Maybe I'm
> >> missing something, but I also think that each Vendor will want to run
> >> these tests manually anyway in order to verify your references.
> >
> > I think the vendors with reftest infrastructure will check that the
> > reference renders correctly, and then let the reftest infrastructure
> > run the tests. And then fill in the rest of the implementation reports
> > manually.
> 
> Indeed. (As it is now, I would expect both ourselves and Microsoft who
> already have the tests in our regression tracking systems to produce an
> implementation result from that; the only real difference is moving the
> verification of the screenshots to being automated).
> 
> >> That feels like not only a 0 sum gain, but in reality extra work for
> >> everyone as we are approaching the end to CSS2.1 (which is what I
> >> believe we all want - to be done and moving on with CSS3).
> >>
> >> Please verify, I may just be misunderstanding the expectations here.
> >
> > If Geoffrey is putting in the time to create references for tests,
> > that will reduce the amount of work in creating and updating
> > implementation reports. It might make the difference between getting
> > an implementation report from Mozilla and not getting one.
> 
> In our case, it will make the difference between us producing an
> implementation for a single shipping browser/platform combination (i.e.,
> Opera Desktop on Windows XP) or for multiple platforms, which may
> become relevant for things such as font-weight which are not implemented
> per spec in the majority of browsers on Windows.
> 
> Finally, can I reiterate what was said at the F2F: we're willing to help
> automate some of the testsuite, but we can't commit the resources to
> automate it all ourselves. Any help from anyone else would be appreciated.
> David Singer said he'd try and nag some people into helping; this would be
> much appreciated if the nagging succeeded. :)
> 
> --
> Geoffrey Sneddon - Opera Software
> <http://gsnedders.com>
> <http://opera.com>
Received on Monday, 20 September 2010 16:29:25 UTC