Re: Vendor harnesses from James Graham on 2011-05-10 (public-test-infra@w3.org from April to June 2011)

From: James Graham <jgraham@opera.com>
Date: Tue, 10 May 2011 23:04:40 +0200 (CEST)
To: "Linss, Peter" <peter.linss@hp.com>
cc: James Graham <jgraham@opera.com>, "public-test-infra@w3.org" <public-test-infra@w3.org>
Message-ID: <alpine.DEB.2.00.1105102223370.21605@sirius>
On Tue, 10 May 2011, Linss, Peter wrote:

> 
> On May 10, 2011, at 2:25 AM, James Graham wrote:
>
>       On 05/10/2011 12:55 AM, Linss, Peter wrote:
>
>             I get that vendors are all using their own home-grown systems, and
>             integrating those with the official W3C test suites is a problem
>             we've been wrestling with for years already in CSS. Personally, I'd
>             like to see the W3C test systems be robust enough that they cover the
>             internal needs of the vendors and they can simply leverage them
>             directly (with perhaps reasonable extension hooks for truly one-off
>             uses). Ideally the differences between the official test systems and
>             the internal testing systems should approach zero over time. But I'm
>             well aware that's a long way off and we have to have a way to get
>             there from here.
> 
>
>       I think that should be an explicit non-goal.
>
>       I believe trying to create a test runner that everyone is happy to use
>       for all their internal needs will lead to impossible-to-reconcile
>       conflicting requirements. I think the goal instead should be to produce
>       things that integrate as well as possible into a variety of existing
>       setups. For example Opera really like running our tests over the scope
>       protocol [1]. Expecting other vendors to adopt that approach is clearly
>       a non-starter, but we wouldn't want to replace our existing systems with
>       ones less well adapted to our requirements.
>
>       There are also more mundane issues to consider. For example "never rely
>       on an external server" is the first commandment of getting reliable
>       testing. So we will always have to make some adaptations to ensure that
>       testcases never do that.
>
>       [1] http://my.opera.com/dragonfly/blog/scope-protocol-specification
> 
> 
> I don't agree that synergy between vendors' existing internal testing 
> systems and the W3C's testing system is a "non-goal".

To be clear, my opinion is:

Goal:
Make all testcases usable (and therefore *used*) by vendors

Non-goal:
Make the actual software vendors use to run the test cases

> First and foremost is the issue of the tests themselves. In the CSS 
> testing effort we found that pretty much every vendor has their own test 
> suite and testing system. Getting people to write tests is hard, getting 
> them to write them _twice_ is virtually impossible. The W3C needs to be 
> aware of existing test suites and testing systems and goal #1 of the 
> W3C's test system needs to be the ability of vendors to take the W3C 
> test suites and drop them into their internal testing systems with as 
> little effort as possible, and vice versa. This way we all have one 
> common corpus of tests we all use and we maximize the availability of 
> tests. I don't really think there's any disagreement on this point.

No. Indeed the experience of CSS 2.1 is very much one that we don't ever 
want to repeat. But the critical thing is that the tests should be written 
so that they are easy to integrate into existing systems, not so that they 
require new systems or manual labour to run. In theory this need not be 
very complex. For example testharness.js exposes a simple result reporting 
API that should allow it to interface with existing results-collecting 
systems. Tests using it are also supposed to - and this is not yet 
universally followed - include testharnessreport.js, a file that is 
empty in the W3C repository but could be non-empty in vendor repositories. 
Internally we have a few lines of code in that file to report the results 
to our existing test system. Those few lines were the only additional 
changes we made to support testharness.js tests. I imagine that other 
vendors should have no more difficulty interfacing with their existing 
systems.
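
For illustration only, here is a minimal sketch of what a non-empty 
testharnessreport.js in a vendor repository might look like. It is not the 
code referred to above. It assumes the add_completion_callback() hook that 
testharness.js provides, and it assumes the numeric status values PASS=0, 
FAIL=1, TIMEOUT=2, NOTRUN=3. The names summarizeResults, 
reportToVendorSystem and vendorReporter are hypothetical stand-ins for 
whatever an in-house system exposes.

```javascript
// Illustrative vendor-side testharnessreport.js (a sketch, not any vendor's real file).
// Assumption: testharness.js numbers its statuses PASS=0, FAIL=1, TIMEOUT=2, NOTRUN=3.
var STATUS_NAMES = ["PASS", "FAIL", "TIMEOUT", "NOTRUN"];

// Convert testharness.js test objects into plain records for an in-house system.
function summarizeResults(tests) {
  return tests.map(function (test) {
    return {
      name: test.name,
      status: STATUS_NAMES[test.status] || "UNKNOWN",
      message: test.message || null
    };
  });
}

// Hypothetical hook: `vendorReporter` stands in for the existing results collector.
function reportToVendorSystem(records) {
  if (typeof vendorReporter !== "undefined") {
    vendorReporter.send(JSON.stringify(records));
  }
}

// Register with the harness only when testharness.js has been loaded first.
if (typeof add_completion_callback === "function") {
  add_completion_callback(function (tests, harness_status) {
    reportToVendorSystem(summarizeResults(tests));
  });
}
```

Because the W3C copy of the file stays empty, the tests themselves are 
identical in both repositories; only this one file differs per vendor.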

> When I talk about my goal here (and I did say it was a personal goal), 
> what I mean is that over time, I'd like to see the W3C system evolve to 
> the point that it can serve the same needs as what vendors need for 
> their internal testing. Ideally, if a new vendor comes to the scene down 
> the road, they'll be able to simply adopt the W3C testing system rather 
> than roll their own. And while I'm not expecting existing vendors to 
> drop their own systems and switch to the W3C's, I do want to see the 
> systems converge, so that at least where there are overlaps in 
> functionality, the same tools can be used. 

OK, it's fair enough if you have that as a personal goal, but I strongly 
feel that it isn't a good use of the group's time to work on it. I like 
simple things that will have obvious short-term benefits.


> I don't want to see the W3C testing system try to adapt to every 
> vendor's proprietary testing hooks, but it would be good if the W3C's 
> testing system was extensible so that vendors could write their own 
> adapters to connect their testing hooks to the W3C harness, for example. 
> Even better would be for browser vendors to agree on a standardized 
> testing API so that any browser can integrate into a standard testing 
> harness.
> 
> Let me give a concrete example: with Firefox, you can make a special 
> build of the browser that can automatically compare reference tests and 
> gather results. Wouldn't it be useful if that build could run the W3C's 
> testing harness directly and submit results there?

Why? I am generally not interested in the W3C collecting results and 
haven't really understood why other people are. Test results are only 
really useful for QA; you can see where you have bugs that you need to fix 
and ensure that you don't regress things that used to work. But that's a 
very vendor specific thing; it's not something that W3C has to do. When 
people try to use tests for non-QA purposes like stating that one browser 
is more awesome than another it leads to bad incentives for people 
submitting tests.

It is certainly useful to be able to run the *tests* directly but whether 
that is by using a big blob of W3C supplied code or vendor-specific code 
seems entirely uninteresting. Making it all standardised whilst meeting 
the desire of vendors to e.g. run tests on devices which cannot run the 
harness code directly but have to be remote-controlled seems like a very 
tough problem.

Anyway, I hope I am not coming across as negative. I just want to 
concentrate on simple, limited-scope solutions that are flexible enough 
to meet everyone's requirements.
Received on Tuesday, 10 May 2011 21:05:15 UTC