Re: Importing W3C's test suite into implementor's test suites. from Rebecca Hauck on 2013-03-21 (public-test-infra@w3.org from January to March 2013)

From: Rebecca Hauck <rhauck@adobe.com>
Date: Thu, 21 Mar 2013 14:36:08 -0700
To: Dirk Pranke <dpranke@chromium.org>, Tobie Langel <tobie@w3.org>
CC: James Graham <jgraham@opera.com>, Robin Berjon <robin@w3.org>, public-test-infra <public-test-infra@w3.org>, fantasai <fantasai.lists@inkedblade.net>, Kris Krueger <krisk@microsoft.com>
Message-ID: <CD70C36F.CD80%rhauck@adobe.com>

From: Dirk Pranke <dpranke@chromium.org>
Date: Thu, 21 Mar 2013 13:45:24 -0700
To: Tobie Langel <tobie@w3.org>
Cc: James Graham <jgraham@opera.com>, Robin Berjon <robin@w3.org>, public-test-infra <public-test-infra@w3.org>, fantasai <fantasai.lists@inkedblade.net>, Kris Krueger <krisk@microsoft.com>
Subject: Re: Importing W3C's test suite into implementor's test suites.

On Thu, Mar 21, 2013 at 7:28 AM, Tobie Langel <tobie@w3.org> wrote:
On Thursday, March 21, 2013 at 2:11 PM, James Graham wrote:
> Many vendors' systems are designed around the assumption that "all tests
> must pass" and, for the rare cases where tests don't pass, one is expected
> to manually annotate the test as failing. This is problematic if you
> suddenly import 10,000 tests for a feature that you haven't implemented
> yet. Or even 100 tests of which 27 fail. I don't have a good solution for
> this other than "don't design your test system like that" (which is rather
> late). I presume the answer will look something like a means of
> auto-marking tests as expected-fail on their first run after import.

Afaik, this is what Mozilla does now with CSS ref tests. Fantasai, please correct me if I'm wrong.

Tests are batch imported (monthly) and run. New tests which fail are marked as such in a manifest file (that Mozilla hosts) and get skipped in test runs. Not sure what happens to existing tests which were previously passing and now fail, but they're probably either skipped too or investigated.
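
As a rough illustration of the auto-marking idea described above, here is a minimal Python sketch. It is not Mozilla's actual tooling: the function name, the in-memory sets standing in for a manifest, and the pass/fail dictionary are all assumptions made for the example. Newly imported tests that fail are added to the expected-fail set, while failures of previously known tests are reported separately so they can be investigated.

```python
def mark_new_failures(known_tests, expected_fail, results):
    """Process one test run made right after an import (hypothetical helper).

    known_tests:   set of test ids seen in earlier runs
    expected_fail: set of test ids already marked as expected failures
    results:       dict mapping test id -> True (passed) or False (failed)

    Returns (updated_known, updated_expected_fail, regressions).
    """
    updated_expected = set(expected_fail)
    regressions = []
    for test_id, passed in sorted(results.items()):
        if passed:
            continue
        if test_id not in known_tests:
            # First run after import: record the failure instead of
            # turning the whole run red.
            updated_expected.add(test_id)
        elif test_id not in expected_fail:
            # A previously known test that was not expected to fail.
            regressions.append(test_id)
    return known_tests | set(results), updated_expected, regressions
```

The split between "new and failing" and "known and newly failing" is the part that matters; how the sets are stored (manifest file, database, annotations) is left open here.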

Would be great to understand how WebKit (or would that be vendor specific?) and Microsoft plan to proceed with this.

All of the public / "upstreamed" WebKit ports basically work the way James describes, except that the WebKit infrastructure has ways to mark entire directory trees (or suites) of tests as expected to fail (or as tests that should be skipped).
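
To make the directory-level marking concrete, here is a small Python sketch of one way such a lookup could work. It does not reproduce WebKit's actual expectations format or code; the mapping, the function name, and the "PASS"/"FAIL"/"SKIP" labels are hypothetical. The assumption is that the most specific entry (a file, then its nearest marked parent directory) wins.

```python
import posixpath


def expectation_for(test_path, expectations, default="PASS"):
    """Look up the expectation for a test (hypothetical helper).

    expectations maps a file or directory path to a label such as
    "FAIL" or "SKIP". The test's own path is checked first, then each
    parent directory in turn, so the most specific entry wins.
    """
    path = test_path
    while path:
        if path in expectations:
            return expectations[path]
        parent = posixpath.dirname(path)
        if parent == path:
            # Reached the filesystem root without a match.
            break
        path = parent
    return default
```

With an entry for a whole directory plus an override for one file inside it, a freshly imported suite can be marked as failing in one line and individual tests can be re-enabled as the feature lands.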

At the moment, tests are imported sporadically, in a completely ad-hoc and manual manner. There are people (including Rebecca) actively working to change this :)

FWIW, the reason I've started to address automating the import is that I'm currently in a third scenario that hasn't really been addressed directly here. Where Scenario 1 is upstreaming existing tests from implementors to the W3C and Scenario 2 is implementors importing existing tests from the W3C, I'm currently in a spot where I'm writing new tests and want to submit them to both places at once. Putting aside the challenges I faced authoring tests that work in both systems (despite reftest and testharness.js support in WebKit, I still had a few), I also obviously wanted to avoid submitting two versions of the same tests to different repos. I chose the model of submitting them first to the W3C and scripting their import from the W3C repo into WebKit, with the rule of never modifying them in WebKit, thus turning it into Scenario 2. A lightweight automated import that follows this rule is already happening in WebKit with the webperf repository from https://dvcs.w3.org/hg, so I just keyed off that.

I'm not sure how common this third scenario is, but it really seems it should be common (or at least encouraged) for anyone actively working on a spec and an implementation together. For the feature that my colleagues and I are currently working on (CSS Regions), advancing the spec and advancing the implementation in WebKit are equally important. I'd be curious to know if any other implementors have a model for this case. And sorry, Tobie, if I've just created another fork of this discussion.
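
For what it's worth, the "never modify imported tests" rule is easy to check mechanically. The Python sketch below is only an illustration and is not the import script mentioned above: the function names and the two-directory layout (a checkout of the upstream repo and the imported copy) are assumptions. It reports any imported file whose contents differ from the upstream version.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def locally_modified(upstream_dir, imported_dir):
    """List imported files that differ from upstream (hypothetical helper).

    Paths are returned relative to upstream_dir, using forward slashes.
    Files that exist upstream but have not been imported are ignored.
    """
    upstream_dir = Path(upstream_dir)
    imported_dir = Path(imported_dir)
    changed = []
    for src in sorted(upstream_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(upstream_dir)
        dst = imported_dir / rel
        if dst.is_file() and file_digest(src) != file_digest(dst):
            changed.append(rel.as_posix())
    return changed
```

An import job could run a check like this before overwriting the imported copy and refuse to proceed (or warn) when the list is non-empty, so local edits are pushed upstream rather than silently lost.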

Received on Thursday, 21 March 2013 21:36:47 UTC