Re: Mismatch between CSS and web-platform-tests semantics for reftests

On 03/09/14 19:10, Peter Linss wrote:
>> 
>> So I was looking at adding this to web-platform-tests and the
>> current design adds some non-trivial complexity. As background, 
>> web-platform-tests uses a script to auto-generate a manifest file
>> with the test files themselves being the only required input. This
>> is rather slow, since it involves actually parsing the *ML files
>> and inspecting their DOM. Therefore it is important to be able to
>> perform incremental updates.
> 
> FWIW, we have a build step that scans the entire repository looking
> for tests, references, and support files, parses all the *ML files,
> generates manifests, human readable indices, then generates built
> test suites by re-serializing all *ML input files into HTML, XHTML,
> and XHTML-Print output files (where applicable). It also adjusts
> relative paths for reference links so that they remain correct in the
> built suites. The process currently takes about 6 minutes consuming
> over 21000 input files and generating 24 test suites. It has not been
> optimized for speed in any way at this point. Given that it runs
> daily on a build server, the burden is completely manageable.

Having a six-minute build step seems rather hostile to decentralised
workflows. Let's say I clone the CSS repository now. What steps do I
have to take before I can run the full set of tests, and how long does
each take? (I honestly don't know what you're supposed to do; if this is
documented somewhere I have overlooked it). For web-platform-tests, the
steps are:

* Clone the repo from GitHub; this installs the tests and all
dependencies apart from html5lib, which I plan to add as a submodule (a
few minutes, depending on location, network, etc.)

* Edit your hosts file to add the required domains; example entries are
below (about a minute, I guess, if you know how to do it)

* Run python serve.py (a few seconds)

* Go to http://web-platform.test in your browser and press "start". In
the background it builds a MANIFEST.json file containing all the data
required to run the tests (a complete rebuild takes 40s on my laptop
with an SSD; subsequent rebuilds only examine files that changed).
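
For reference, the hosts entries look something like this (I'm quoting
from memory; the README in the repo has the authoritative list, which
also includes a couple of punycode subdomains used by the IDN tests):

  127.0.0.1 web-platform.test
  127.0.0.1 www.web-platform.test
  127.0.0.1 www1.web-platform.test
  127.0.0.1 www2.web-platform.test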

So, if you know the steps, it is possible to go from a clean slate to
running tests in, I guess, 10 minutes, and incremental updates are
possible in under a minute. Getting a fully automated test run with
wptrunner is a little slower, because wptrunner requires some
Python-specific knowledge, but again it's a one-off cost.
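
To make "only examine files that changed" concrete, the update pass is
roughly of the following shape (an illustrative sketch, not the actual
manifest code; classify() stands in for the parse-and-inspect step
described above):

  import hashlib
  import json
  import os

  def update_manifest(root, manifest_path="MANIFEST.json"):
      # Load the previous manifest if one exists; otherwise start fresh.
      try:
          with open(manifest_path) as f:
              manifest = json.load(f)
      except IOError:
          manifest = {"items": {}}
      items = manifest["items"]

      for dirpath, dirnames, filenames in os.walk(root):
          for name in filenames:
              path = os.path.relpath(os.path.join(dirpath, name), root)
              with open(os.path.join(root, path), "rb") as f:
                  digest = hashlib.sha1(f.read()).hexdigest()
              old = items.get(path)
              # Unchanged files are skipped, so only new or edited
              # files pay the cost of being parsed.
              if old is not None and old["hash"] == digest:
                  continue
              items[path] = {"hash": digest, "type": classify(path)}

      with open(manifest_path, "w") as f:
          json.dump(manifest, f)

The point is just that the expensive *ML parse only happens for files
whose contents differ from the last run; everything else is a hash
comparison.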

>> Currently it is always possible to examine a single file and
>> determine what type of thing it represents (script test, reftest,
>> manual test, helper file, etc.). For example reftests are
>> identified as files with a <link rel=[mis]match> element. Since
>> (unlike in CSS) tests in general are not required to contain any
>> extra metadata, allowing references to link to other references
>> introduces a problem because determining whether a file is a
>> reference or a test now requires examining the entire chain, not
>> just one file.
> 
> I don't understand why you have to parse the entire chain to
> determine if a single file is a test or a reference; if a file has a
> single reference link, then it's a reftest, regardless of how many
> other references there may be. You do, of course, have to parse the
> entire chain to get the list of all references for the manifest, but
> really, that's not adding a lot of files to be parsed, many tests
> reuse references, and we use a cache so each file is only parsed
> once.

So your approach is to regard each file with one or more [mis]match
links as a test and then to allow the overall result of one test to
depend on other tests? I guess that works.
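
And yes, following the chain is cheap once every parse is cached. For
the manifest side I'm imagining something along these lines (an
illustrative sketch only, not code from either repo; it uses html5lib
since we already depend on it):

  import os
  import html5lib

  _links_cache = {}

  def reference_links(path):
      # Parse each file at most once; reused references therefore only
      # cost one parse, however many tests point at them.
      if path not in _links_cache:
          with open(path, "rb") as f:
              tree = html5lib.parse(f, treebuilder="etree",
                                    namespaceHTMLElements=False)
          links = []
          for link in tree.iter("link"):
              rel = link.get("rel", "")
              if rel in ("match", "mismatch"):
                  links.append((rel, link.get("href")))
          _links_cache[path] = links
      return _links_cache[path]

  def reference_chain(path, seen=None):
      # Collect every reference reachable from a test, guarding
      # against cycles with the `seen` set.
      if seen is None:
          seen = set()
      for rel, href in reference_links(path):
          ref = os.path.normpath(os.path.join(os.path.dirname(path),
                                              href))
          if ref not in seen:
              seen.add(ref)
              reference_chain(ref, seen)
      return seen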

> For that matter, at least in CSS land, we don't differentiate between
> tests and references based on the presence of {mis}match links, those
> only indicate that a test is a reftest. The difference between tests
> and references is done solely by file and directory naming
> convention. References are either in a "reference" directory or have
> a filename that matches any of: "*-ref*", "^ref-*", "*-notref*",
> "^notref-*". Furthermore, it's perfectly valid to use a _test_ (or a
> support file, like a PNG or SVG file) as a reference for another
> test, we have several instances of this in the CSS repo.

It's very unclear to me what this file naming convention is used for.
Does it provide any data that is actually required to run the suite, or
is it purely informative?
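
For concreteness, my reading of the convention you describe is roughly
the following (glossing the patterns as globs; correct me if I've
misread it):

  import fnmatch
  import os

  REFERENCE_PATTERNS = ["*-ref*", "ref-*", "*-notref*", "notref-*"]

  def is_reference(path):
      # A file counts as a reference if it sits in a "reference"
      # directory or if its basename matches one of the patterns above.
      if "reference" in path.split(os.sep):
          return True
      name = os.path.basename(path)
      return any(fnmatch.fnmatch(name, pat)
                 for pat in REFERENCE_PATTERNS)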

> Also, let me point out again, that the bulk of the code we use for
> our build process is in a factored out python library[1]. It can use
> some cleanup, but it contains code that parses all the acceptable
> input files and extracts (and in some cases allows manipulation of)
> the metadata. Shepherd also uses this library to do it's metadata
> extraction and validation checks. If we can converge on using this
> library (or something that grows out of it) then we don't have to
> re-write code managing metadata... I'm happy to put in some effort
> cleaning up and refactoring this library to make it more useful to
> you.

Can you point to the exact code that you have in mind? Lots of that code
seems to be concerned with problems that I'm not trying to solve at this
time; literally all I want is to be able to get a complete set of test
files and the data required to run the tests automatically, in a way
that works for w-p-t and CSS without special-casing.
