Re: WebTV Help for Getting Engaged in W3C Test Effort

On 28/04/2014 15:49, Giuseppe Pascale wrote:
> Note that these questions are not intended as a request to add work or
> process to your group (which I doubt I could do anyhow), but to clarify
> some of the questions which were asked at the last workshop and to set
> the right expectations of what people may and may not find in a W3C
> test suite.

Sure thing.

>     It's a very simple process. When you first create a test, you
>     *might* get the metadata right. (Even then it's a big, big "might"
>     because most people will copy from an existing file, and through
>     that get wrong metadata.)
>
> I agree that an author may get things wrong, but the reviewer should be
> responsible for checking the spec reference.

The problem there is that this adds more work for the reviewer, and we 
already have a problem with insufficient reviewer bandwidth. Given our 
constraints it is unlikely that any change involving more work for the 
reviewers would be popular.

> Otherwise I'm not clear
> what a reviewed test actually means. Isn't the reviewer supposed to
> check if the test matches some spec text? If so, and if the author
> doesn't write which spec version he is testing, can the reviewer really
> know what he is supposed to check?

I am not aware of a review in which the specification version has been 
taken into account. To date I only know of reviews done against the 
latest and greatest specification. The idea is that master should always 
be testing the latest; that's where the value is. If someone needs the 
test suite to match a specific version we have branches for that purpose 
(but it is up to whoever has that need to produce the specific subset).

Also, in reality only a relatively limited subset of tests cleanly maps 
to a single section. It is usually the case that tests need to exercise 
more than one part of a specification at once. That would mean more 
links, etc.

In general if the reviewer can't figure out which part(s) of the spec 
you are testing simply by reading the code and having the spec open, 
there's a problem either with the test or with the spec. (Or it's a 
really obscure feature, but we shouldn't optimise for those.)

> Maybe the response is: always check the latest editor's draft at hand.
> If so, as discussed before, maybe the spec version can be auto-inferred
> from the commit date.

But for many specs at best that will give you a commit for the editor's 
draft, not an addressable snapshot. You can't use such heuristics to 
address specific snapshot specifications.
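
To illustrate, a minimal sketch of what that heuristic amounts to with 
plain git, assuming both the test suite and the spec's ED source live 
in git repositories (the file path, date and branch name below are made 
up for illustration):

    # in the test suite checkout: when was this test last touched?
    git log -1 --format=%ci -- dom/nodes/Node-cloneNode.html

    # in the spec's repository: which ED commit was current at that date?
    git rev-list -1 --before="2014-04-28 15:49" master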

Where specific snapshots need to be tested, presumably whoever is in 
charge of the snapshot knows what they are subsetting from the ED and 
can produce a subset snapshot test suite with a relatively quick 
application of grep and git rm in the appropriate branch. Or by listing 
the results to ignore in the implementation report.
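
As a rough sketch of what that quick application could look like (the 
branch name and grep pattern here are made up for illustration):

    # on the branch tracking the snapshot
    git checkout snapshot-2014
    # drop tests that exercise features not present in the snapshot
    grep -rl --exclude-dir=.git "FancyFeature" . | xargs git rm
    git commit -m "Subset the suite for the snapshot"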

(FWIW we do have that use case for HTML and to a lesser degree DOM, and 
so far it is working fine for us.)

>     But when it's updated what's your incentive to update the metadata?
>     What points you to remember to update it? Pretty much nothing. If
>     it's wrong, what will cause you to notice? Absolutely nothing since
>     it has no effect on the test.
>
> Once again I would expect a "reviewer" to be in charge of it in a
> structured review process (and I would expect an updated test to be
> subject to review). And I assume the reviewer actually checks a spec
> to see if a test is valid (otherwise how would he check its validity)?

Again, that seems to place a lot of extra work on the shoulders of 
reviewers when we already don't have enough of those. And it doesn't 
tell us what incentive the reviewers have to check the metadata when 
those we currently have participating aren't interested in having any.

We could have metadata-specific reviewers, but that is more overhead 
that would hurt everyone. That's why I suggest doing it orthogonally. It 
would be simpler for everyone.

> Maybe, also here, the answer is implicit (check the latest ED) and can
> then be autogenerated knowing the commit date.

But I don't see the use case. I know that there is a genuine demand in 
some communities for having proper snapshot specifications, and it 
logically follows that they might need snapshot test suites as well. 
But I've never heard any of those ask for pointers to specific ED 
commits, which is all you'd get with the above. I don't think it helps 
that use case.

>     So far, in the pool of existing contributors and reviewers, we have
>     people who benefit greatly from a working test suite, but to my
>     knowledge no one who would benefit from up to date metadata. Without
>     that, I see no reason that it would happen.
>
> The reason for raising this issue is that during the workshop we had
> some people ask about this, i.e. how they can know which tests to use
> given a set of specs they reference. E.g. how can I know which tests
> are up to date and which ones have been written against an old spec
> (and are maybe not valid anymore)?

The platform has a pretty strong commitment to backwards compatibility. 
The issue of tests applying only to an old specification is therefore 
less likely to arise in the first place.

That said, the process is simple: continuous maintenance. Tests are run 
continuously and are investigated when they fail. In fact, that's the 
only way I can think of to make this work. The fact that a test was 
written against an older specification tells you absolutely nothing 
about its validity — the odds are it is still valid (in fact, the older 
the spec, the more likely).

>     I believe everything is in place for the system described above to
>     be implemented relatively easily. I am fully confident that if there
>     is a community that genuinely requires testing metadata they could
>     bash together such a tool in under a month. And we're happy to help
>     answer questions and provide hooks (e.g. GitHub update hooks) where
>     needed.
>
> sounds like a sensible approach. Maybe that will also help inform this
> discussion, i.e. identify whether there is some basic metadata which is
> needed, missing, and which an external group cannot generate.

Sure, we're happy to work with anyone to enable third parties to reuse 
the WPT as much as possible.

>     This is a volunteer and so far largely unfunded project. It is also
>     by a wide margin the best thing available for Web testing today. Its
> shape and functionality match what current contributors are
>     interested in; if there are new interests not so far catered to, the
>     solution is simple: just bring in new contributors interested in this!
>
> The goal of this conversation is to bring in new contributors, as
> (understandably) some people didn't want to commit to something which
> looked like a black box (to them).

It's not at all a black box, everything is done in the open!

-- 
Robin Berjon - http://berjon.com/ - @robinberjon

Received on Monday, 28 April 2014 15:42:54 UTC