Re: Coverage analysis

On 11/02/2013 18:13, Tobie Langel wrote:
> On 2/11/13 4:47 PM, "Robin Berjon" <robin@w3.org> wrote:
>> I've now done this analysis for the HTML and Canvas specs (I would have
>> done Microdata too, but it doesn't seem to have approved tests yet).
>>
>> You can see it here, but be warned that you might not understand it
>> without reading the notes below:
>>
>>      http://w3c-test.org/html-testsuite/master/tools/coverage/
>>
>> I'm copying public-test-infra; in case anyone wants to do the same for
>> other specs I'd be happy to collaborate. If people think it would be
>> useful to provide such data on a regular basis, we can certainly
>> automate it. Note that for this purpose having the data in one big repo
>> would help.
>
> Thanks for doing this. This is great. I absolutely think we should be
> doing this for all specs.
>
> With better visuals, finer tuning of the weight of each metric (maybe even
> per spec tuning?), and data on the actual number of tests written for each
> section, this could give us fantastic overview of testing coverage at W3C,
> with the ability to dig into specifics when needed.
>
> Care to share the script(s) and discuss how to best move this forward?

The scripts are all in the repository (under tools/coverage). Here's a 
quick overview of the architecture (by which I mean "how it grew up to 
be" — it can be improved).

First is a script called test-data.js. It launches a web server at the 
root of the repo, lists all files that are candidates to contain tests, 
and opens each of those to count the number of tests it contains. Since 
the only way of knowing how many tests there are in a testharness.js 
test is to actually run it, it does this by calling get-data-for.phjs, 
which is a PhantomJS script, telling it to load 
"http://localhost:3017/tools/coverage/loader.html?/path/to/test". That's 
a special document that listens to testharness.js events and knows how 
to return a test count properly. The output of that is test-data.json, 
which lists the number of tests per file in the repo.
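
To make that concrete, here is roughly what that per-file call boils 
down to. This is an illustrative sketch rather than the real 
get-data-for.phjs: the callPhantom/onCallback handshake with 
loader.html, the timeout, and the output shape are stand-ins.

    // count-tests-sketch.phjs: illustrative only, not the real script.
    // Loads the loader page for one test file and waits for a count.
    var system = require("system");
    var page = require("webpage").create();

    var testPath = system.args[1]; // e.g. /html/dom/some-test.html
    var url = "http://localhost:3017/tools/coverage/loader.html?" + testPath;

    // Assume loader.html calls window.callPhantom({ count: n }) once
    // testharness.js signals completion for the test it loaded.
    page.onCallback = function (data) {
        console.log(JSON.stringify({ path: testPath, count: data.count }));
        phantom.exit(0);
    };

    page.open(url, function (status) {
        if (status !== "success") {
            console.log("Failed to load " + url);
            phantom.exit(1);
        }
    });

    // Give up if the test never completes; 30s is an arbitrary cut-off.
    setTimeout(function () {
        console.log("Timed out on " + testPath);
        phantom.exit(2);
    }, 30000);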

This part could easily be automated to run on a regular basis (probably 
against w3c-test.org rather than its own built-in server), could 
trivially be extended to cover any test suite, and could dump its 
output in some DB.

This step is also useful on its own in that it finds errors. I have a 
list of ~50 errors in the TS that it flags. I plan to investigate each 
of those, as I suspect that in most cases they are problems with the TS 
(though ten of them crash Phantom, and I expect those are bugs in 
Phantom itself).

The next step is running tests-per-section.js. This munges the previous 
data, using knowledge of how the test suite is laid out to map test 
counts to section IDs. It's a trivial script (outputting to 
tests-per-section.json) and could easily be folded into the previous 
step. I expect all test suites included in the Big Repo to follow the 
same layout rules, so this could apply equally to all of them, and 
likewise run regularly on its own.
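
The folding itself is tiny, something along these lines (note that the 
layout rule used here, taking the enclosing directory to name the 
section, and the shape of test-data.json are both assumptions; that is 
precisely where the suite-specific knowledge lives):

    // tests-per-section-sketch.js: illustrative; the real script knows
    // the actual layout rules. Assumes test-data.json maps path -> count.
    var fs = require("fs");

    var perFile = JSON.parse(fs.readFileSync("test-data.json", "utf8"));
    var perSection = {};

    Object.keys(perFile).forEach(function (path) {
        // e.g. "semantics/scripting-1/the-script-element/foo.html"
        // maps to section ID "the-script-element" (assumed convention)
        var parts = path.split("/");
        var section = parts[parts.length - 2];
        perSection[section] = (perSection[section] || 0) + perFile[path];
    });

    fs.writeFileSync("tests-per-section.json",
                     JSON.stringify(perSection, null, 2));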

The final step is analyse-specs.js. This is more convoluted. The script 
itself is fairly straightforward, but it calls upon 
get-analysis-for.phjs (another PhantomJS script), which does the actual 
specification analysis.

Here, we couldn't use the same script for all specs because it has to 
understand each spec's conventions for marking up examples, 
non-normative text, and so on. In many cases we could probably handle 
this without PhantomJS. But for the HTML spec (and all specs derived 
directly from that source) we're looking at such a markup nightmare 
(the sections aren't marked up as such; you essentially have to resort 
to DOM Ranges to extract content usefully) that PhantomJS really is the 
only option.
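
For the curious, the Range trick boils down to something like this (a 
simplified sketch of the technique, run inside the spec page, e.g. via 
page.evaluate; the heading selector and the flat carving between 
consecutive headings are assumptions, and the real analysis also has to 
cope with nesting, examples, and non-normative blocks):

    // Carve out each "section" by slicing the document between
    // consecutive headings with a DOM Range.
    function extractSections(doc) {
        var headings = doc.querySelectorAll(
            "h2[id], h3[id], h4[id], h5[id], h6[id]");
        var sections = [];
        for (var i = 0; i < headings.length; i++) {
            var range = doc.createRange();
            range.setStartAfter(headings[i]);
            if (i + 1 < headings.length) {
                range.setEndBefore(headings[i + 1]);
            } else {
                range.setEndAfter(doc.body.lastChild);
            }
            sections.push({
                id: headings[i].id,
                // cloneContents() yields a fragment whose content can
                // then be weighed (normative statements, examples, etc.).
                fragment: range.cloneContents()
            });
        }
        return sections;
    }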

I think there's no reason to despair though. If the HTML spec could be 
analysed, then others will be easier; in most of them the required 
information is clearly marked up. We should be able to define a small 
number of spec "styles" that we can set up and reuse.

The output from that is spec-data-*.json. Assuming we can solve the 
above issue (which we can, it's just a bit of work), this too can be 
automated and dumped to a DB.

If you can tell me what you mean by "better visuals" I can easily make 
it happen. Do you mean "make it look less like a train wreck, for 
instance by adding boilerplate like Bootstrap", or something else?

-- 
Robin Berjon - http://berjon.com/ - @robinberjon
