Re: Mismatch between CSS and web-platform-tests semantics for reftests from Peter Linss on 2014-09-04 (public-test-infra@w3.org from July to September 2014)

From: Peter Linss <peter.linss@hp.com>
Date: Thu, 4 Sep 2014 10:37:10 -0700
To: James Graham <james@hoppipolla.co.uk>
Cc: public-test-infra@w3.org
Message-Id: <A91749DF-3BFE-4108-9DA1-3E3146AB71ED@hp.com>
On Sep 4, 2014, at 5:29 AM, James Graham <james@hoppipolla.co.uk> wrote:

> On 03/09/14 19:10, Peter Linss wrote:
>>> 
>>> So I was looking at adding this to web-platform-tests and the
>>> current design adds some non-trivial complexity. As background, 
>>> web-platform-tests uses a script to auto-generate a manifest file
>>> with the test files themselves being the only required input. This
>>> is rather slow, since it involves actually parsing the *ML files
>>> and inspecting their DOM. Therefore it is important to be able to
>>> perform incremental updates.
>> 
>> FWIW, we have a build step that scans the entire repository looking
>> for tests, references, and support files, parses all the *ML files,
>> generates manifests, human readable indices, then generates built
>> test suites by re-serializing all *ML input files into HTML, XHTML,
>> and XHTML-Print output files (where applicable). It also adjusts
>> relative paths for reference links so that they remain correct in the
>> built suites. The process currently takes about 6 minutes consuming
>> over 21000 input files and generating 24 test suites. It has not been
>> optimized for speed in any way at this point. Given that it runs
>> daily on a build server, the burden is completely manageable.
> 
> Having a six minute build step seems rather hostile to decentralised
> workflows. Let's say I clone the CSS repository now. What steps do I
> have to take before I can run the full set of tests, and how long does
> each take? (I honestly don't know what you're supposed to do; if this is
> documented somewhere I have overlooked it).

The build step is documented at:
http://wiki.csswg.org/test/css2.1/harness

If the dependencies are installed, the steps are: clone the repo, then run tools/build.py.

Note that pre-built suites are available at:
http://test.csswg.org/suites

and they're also pre-loaded and ready to go in our online test harness:
http://test.csswg.org/harness

We also keep our tests ready to run "as-is" directly from the repo; the build step adds the format conversions so that all tests are available in both HTML and XHTML (except where it doesn't make sense).

> For web-platform tests, the
> steps are:
> 
> * Clone the repo from GitHub - installs tests and deps apart from
> html5lib, which I plan to add as a submodule (a few minutes, depending
> on location, network, etc.)
> 
> * Edit your hosts file to add the required domains (if you know how to
> do it about a minute I guess)
> 
> * Run python serve.py (a few seconds)
> 
> * Go to http://web-platform.test in your browser and press "start". In
> the background it builds a MANIFEST.json file containing all the data
> required to run the tests (a complete rebuild takes 40s on my laptop
> with an SSD; subsequent rebuilds only examine files that changed).
> 
> So, if you knew the steps, it is possible to go from a clean slate to
> running tests in, I guess, 10 minutes, and incremental updates are
> possible in under a minute. Getting a fully automated testrun with
> wptrunner is a little slower, because wptrunner requires some
> python-specific knowledge, but again it's a one-off cost.
> 
>>> Currently it is always possible to examine a single file and
>>> determine what type of thing it represents (script test, reftest,
>>> manual test, helper file, etc.). For example reftests are
>>> identified as files with a <link rel=[mis]match> element. Since
>>> (unlike in CSS) tests in general are not required to contain any
>>> extra metadata, allowing references to link to other references
>>> introduces a problem because determining whether a file is a
>>> reference or a test now requires examining the entire chain, not
>>> just one file.
>> 
>> I don't understand why you have to parse the entire chain to
>> determine if a single file is a test or a reference. If a file has
>> a single reference link, then it's a reftest, regardless of how many
>> other references there may be. You do, of course, have to parse the
>> entire chain to get the list of all references for the manifest, but
>> really, that's not adding a lot of files to be parsed, many tests
>> reuse references, and we use a cache so each file is only parsed
>> once.
> 
> So your approach is to regard each file with one or more [mis]match
> links as a test and then to allow the overall result of one test to
> depend on other tests? I guess that works.

No. Any file with a <link rel='help' /> is a test. (What we really care about is that the test is related to a spec through some kind of metadata, not rel='help' specifically; it would be a trivial change to infer the link from the file's path in the repo.) If a test _also_ has a [mis]match link, then it's a reference test. If the test contains a <script src='/resources/testharness.js'>, then it's a script test.
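
As a rough illustration of those three rules, here is a minimal Python sketch. It is not the code our build tools use; the names MetadataScanner and classify are made up for this example, and it assumes a file may be both a reference test and a script test, which the rules above don't rule out.

```python
from html.parser import HTMLParser


class MetadataScanner(HTMLParser):
    """Hypothetical helper: records the three signals found in one file."""

    def __init__(self):
        super().__init__()
        self.has_help = False
        self.has_match = False
        self.has_harness = False

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attrs = dict(attrs)
        if tag == "link":
            rels = (attrs.get("rel") or "").lower().split()
            if "help" in rels:
                self.has_help = True
            if "match" in rels or "mismatch" in rels:
                self.has_match = True
        elif tag == "script":
            src = attrs.get("src") or ""
            if src.endswith("/resources/testharness.js"):
                self.has_harness = True


def classify(markup):
    """Return the set of kinds that apply to a file's markup."""
    scanner = MetadataScanner()
    scanner.feed(markup)
    scanner.close()
    kinds = set()
    if not scanner.has_help:
        return kinds  # no spec link, so not a test at all
    kinds.add("test")
    if scanner.has_match:
        kinds.add("reftest")
    if scanner.has_harness:
        kinds.add("script test")
    return kinds
```

Note that only one file is examined per call; nothing here needs to follow the reference chain.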

> 
>> For that matter, at least in CSS land, we don't differentiate between
>> tests and references based on the presence of {mis}match links, those
>> only indicate that a test is a reftest. The difference between tests
>> and references is done solely by file and directory naming
>> convention. References are either in a "reference" directory or have
>> a filename that matches any of: "*-ref*", "^ref-*", "*-notref*",
>> "^notref-*". Furthermore, it's perfectly valid to use a _test_ (or a
>> support file, like a PNG or SVG file) as a reference for another
>> test, we have several instances of this in the CSS repo.
> 
> It's very unclear to me what this file naming convention is used for.
> Does it provide any data that is actually required to run the suite or
> is it purely informative?

The naming convention is used by the build tools (and Shepherd) to distinguish tests, support files, and references. Given that tests and support files can also be used as references, it's more of a convenience. In built test suites all reference files are put into a 'reference' directory regardless of their location in the source tree.

Tests do have to meet the test naming convention, which is basically: don't be a reference file, don't be a support file (i.e. in a 'support' directory), don't be a tool file (i.e. in a 'tools' directory), and don't be in a directory that is ignored.
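
A minimal sketch of that convention in Python, for illustration only. The function name classify_path, the ignored_dirs parameter, and the order in which the checks are applied are all assumptions of this sketch, not something taken from the build tools; the reference filename patterns are the ones quoted above, written as shell-style globs.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# "*-ref*", "^ref-*", "*-notref*", "^notref-*" expressed as globs;
# a glob is anchored at the start of the name, so "^" is dropped.
REFERENCE_NAME_PATTERNS = ("*-ref*", "ref-*", "*-notref*", "notref-*")


def classify_path(path, ignored_dirs=()):
    """Classify a repo-relative path by directory and file name only."""
    p = PurePosixPath(path)
    dirs = p.parts[:-1]  # every path component except the file name
    if any(d in ignored_dirs for d in dirs):
        return "ignored"
    if "tools" in dirs:
        return "tool"
    if "support" in dirs:
        return "support"
    if "reference" in dirs:
        return "reference"
    if any(fnmatchcase(p.name, pat) for pat in REFERENCE_NAME_PATTERNS):
        return "reference"
    return "test"  # anything left over meets the test naming convention
```

Since no file contents are read, this runs without parsing anything. One caveat of the glob form: "*-ref*" also matches names such as "box-reflow-001.html", so a name like that would be treated as a reference here.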

> 
>> Also, let me point out again, that the bulk of the code we use for
>> our build process is in a factored out python library[1]. It can use
>> some cleanup, but it contains code that parses all the acceptable
>> input files and extracts (and in some cases allows manipulation of)
>> the metadata. Shepherd also uses this library to do its metadata
>> extraction and validation checks. If we can converge on using this
>> library (or something that grows out of it) then we don't have to
>> re-write code managing metadata... I'm happy to put in some effort
>> cleaning up and refactoring this library to make it more useful to
>> you.
> 
> Can you point to the exact code that you have in mind? Lots of that code
> seems to be concerned with problems that I'm not trying to solve at this
> time; literally all I want is to be able to get a complete set of test
> files and the data required to run the tests automatically, in a way
> that works for w-p-t and CSS without special casing.

The code that should be interesting to you would be mostly in Sources.py[1]

The SourceTree class will, given a file path, determine if the file is a test, reference, support, or tool file. It's currently hard-coded against the CSS repo layout but will likely be mostly compatible with the wpt repo at this point.

The SourceCache class will (via the generateSource method) provide an instance of a FileSource subclass (and use a cache, obviously). You can then query the FileSource for its metadata.
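
To show the general idea of the cache (each file is parsed at most once, however many tests refer to it), here is a small Python sketch. It is not the w3ctestlib API: ParseOnceCache, its get method, and the parse callback are hypothetical names for this example, and the real SourceCache signatures should be taken from Sources.py itself.

```python
class ParseOnceCache:
    """Hypothetical cache: parses each path once and reuses the result."""

    def __init__(self, parse):
        self._parse = parse   # callable taking a path, returning metadata
        self._entries = {}    # path -> previously parsed metadata

    def get(self, path):
        if path not in self._entries:
            self._entries[path] = self._parse(path)
        return self._entries[path]


# Usage: a stand-in parser that records how often it is called.
calls = []


def fake_parse(path):
    calls.append(path)
    return {"path": path}


cache = ParseOnceCache(fake_parse)
first = cache.get("reference/green.xht")
second = cache.get("reference/green.xht")  # served from the cache
```

With many tests sharing a few references, this is why following reference chains adds little parsing work.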

These classes can be used pretty much by themselves. And as I said before, a lot of that code is old and crufty and can use some TLC (much of it dates back 7 or 8 years). I'm happy to do some cleanups and refactoring for you if you let me know your specific needs.

Peter

[1] http://hg.csswg.org/dev/w3ctestlib/file/tip/Sources.py
Received on Thursday, 4 September 2014 17:37:35 UTC