Re: Knowing which tests are in the repository

On Thu, Aug 22, 2013 at 9:55 AM, James Graham <james@hoppipolla.co.uk> wrote:

> On 22/08/13 17:45, Dirk Pranke wrote:
>
>> I mostly like this ... comments inline.
>>
>> On Thu, Aug 22, 2013 at 9:31 AM, James Graham <james@hoppipolla.co.uk>
>> wrote:
>>
>>     A modified proposal:
>>
>>     By default apply the following rules, in the order given:
>>
>>     * Any file with a name starting with a . or equal to
>>     "override.manifest" is a helper file
>>
>>
>> Are there helper files other than manifests that we should be worrying
>> about? I'm thinking of things like .htaccess, .gitignore, etc. I would
>> probably say "is not a test" (or possibly "can be ignored") rather than
>> "is a helper file".
>>
>
> Sure, I only reused "helper file" for this case because I couldn't think
> of a better term.
>
>
>>     * Any file with -manual in the name before the extension is a manual
>>     test.
>>
>>     * Any html, xhtml or svg file that links to testharness.js is a
>>     testharness test
>>
>>     * Any html, xhtml or svg file that has a file with the same name but
>>     the suffix -ref before the extension is a reftest file and the
>>     corresponding -ref file is a helper file.
>>
>>     * Any html, xhtml or svg file that contains a link rel=match or link
>>     rel=mismatch is a reftest file.
>>
>>
>> Strictly speaking, one could say that -manual is unneeded, but since I'd
>> prefer to stomp out as many manual tests as possible, I'm fine w/ making
>> their names uglier (and I do also like the clarity the naming provides).
>>
>
> I don't see how else you would distinguish manual tests and helper files.
>
>
As per the above, I'm not quite sure what all counts as a "helper file" to
you. If you're talking about subresources used by a page, I'd prefer that
they live by themselves in dedicated directories called "resources" (or
some such name) rather than being mixed in with the tests. Are there other
sorts of files (ones that might also share the same file extensions as
tests)?


>> Is it too much to ask that we have similar names for either testharness
>> tests or reftests, so that you can tell which kind a test is without
>> having to open the file? /me holds out a faint hope ...
>>
>
> I think it's too much effort to require that all testharness.js tests have
> something specific in the filename. Reftests have to be parsed to work out
> the reference anyway.
>
>
Well, yeah, but that way you could at least avoid having to parse the
testharness tests looking for references. Given that we have 10x as many
testharness tests as reftests in the web-platform-tests repo, this isn't a
small thing.

I'm not sure why this is much effort beyond a simple script and a bulk
rename (and some retraining of authors or a commit hook ...), but at any
rate this is hardly a deal-killer to me.
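
For concreteness, here's a rough sketch (in Python) of the kind of check a
manifest-generating script might do under your default rules, including
the catch-all further down that any other file is a helper file. It's
purely illustrative; the regexes and the exact ordering are my assumptions
about how the rules would be detected, not an agreed implementation:

import os
import re

# Illustrative sketch of the proposed defaults; the regexes and ordering
# are assumptions, not an agreed implementation.
TESTHARNESS_RE = re.compile(rb"<script[^>]+src=[^>]*testharness\.js")
REFLINK_RE = re.compile(rb"<link[^>]+rel=.?(?:mis)?match")

def classify(path):
    dirname, filename = os.path.split(path)
    name, ext = os.path.splitext(filename)
    if filename.startswith(".") or filename == "override.manifest":
        return "helper"
    if name.endswith("-manual"):
        return "manual"
    if ext not in (".html", ".xhtml", ".svg"):
        return "helper"
    with open(path, "rb") as f:
        content = f.read()
    if TESTHARNESS_RE.search(content):
        return "testharness"
    if os.path.exists(os.path.join(dirname, name + "-ref" + ext)):
        return "reftest"
    if REFLINK_RE.search(content):
        return "reftest"
    return "helper"

The point being that the testharness.js and rel=match checks are the only
steps that need the file contents, so a filename convention for
testharness tests would let us skip reading the bulk of the files
entirely.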


>
>>     * Any other file is a helper file.
>>
>>     These rules can be overridden by providing an override.manifest
>>     file. Such a file can contain a list of filenames to exclude from
>>     the normal processing above and a list of urls for tests, similar to
>>     my previous proposal. So for example one might have
>>
>>     [exclude]
>>     foo.html
>>
>>     [testharness]
>>     foo.html?subset=1
>>     foo.html?subset=2
>>
>>     I am still not sure how to deal with timeouts. One option would be
>>     to put the overall timeout in a meta value rather than in the
>>     javascript, since this will be easier to parse out. For tests where
>>     this doesn't work due to strong constraints on the html, one could
>>     use the override.manifest as above (and also specify the timeout in
>>     the js). I can't say I am thrilled with this idea though.
>>
>>
>> Ignoring the issues around query-param based tests and timeouts, is
>> there a reason we'd want to allow exceptions at all apart from the fact
>> that we have a lot of them now? I.e., I'd suggest that we don't allow
>> exceptions for new tests and figure out if we can rename/restructure
>> existing tests to get rid of the exceptions.
>>
>
> The point of the exceptions is only to handle the issues around query
> params and other exceptional circumstances. The point is not to allow
> deviations in cases that could conform to the scheme, but to allow
> flexibility where it is really required.


Okay, thanks for clarifying.
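
(Tangentially: the override.manifest format you sketched looks easy enough
to consume. A minimal sketch, assuming the [exclude] / [testharness]
sections from your example and treating every other non-blank line as an
entry; the comment and blank-line handling is my own assumption:

def read_override_manifest(path):
    # Parse "[section]" headers; collect each following non-blank,
    # non-comment line as an entry of that section.
    sections = {}
    current = None
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            if line.startswith("[") and line.endswith("]"):
                current = line[1:-1]
                sections.setdefault(current, [])
            elif current is not None:
                sections[current].append(line)
    return sections

So e.g. read_override_manifest(path).get("exclude", []) gives the files to
skip, and .get("testharness", []) the extra test URLs.)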


> Since we already have cases where it is really required, and the people
> who require it are typically advanced test authors, this seems quite
> acceptable.


We do? I haven't noticed any such cases yet, but it's quite likely I've
missed them and I'd appreciate pointers.


>
> As far as timeouts go, I'm still not sold on specifying them at all, or
>> at least specifying them regularly as part of the test input. I'd rather
>> have a rule along the lines of "no input file should take more than X
>> seconds to run" (obviously, details would qualify the class of hardware
>> and browser used as a baseline for that). I'd suggest X be on the order
>> of 1-2 seconds for a contemporary desktop production browser on
>> contemporary hardware. I would be fine w/ this being a recommendation
>> rather than a requirement, though.
>>
>
> Well, there are a lot of issues here. Obviously very-long-running tests
> can be problematic. On the other hand, splitting up tests where they could
> be combined creates a lot of overhead during execution. More importantly,
> some tests simply require long running times. It isn't uncommon to have
> tests that delay resource loads to ensure a particular order of events, or
> similar. Tests like these intrinsically take more than a few seconds to run
> and so need a longer timeout.
>
> I don't think we can simply dodge this issue.
>

I'm not trying to dodge the issue. I don't think Blink has any tests that
intrinsically need seconds of run time just to schedule and load resources,
though we do have some tests that take seconds to run (usually because
they're doing too much in one test, and sometimes because they're doing
something computationally very expensive). I would be curious to see
examples of tests in the CSS repos that are intrinsically slow (and
considered well-written). It's always good to have concrete examples to
talk about.

I'm not sure about your assertion that splitting up tests creates "a lot of
overhead". Do you mean in test execution time, or configuration / test
management overhead?

Certainly, creating and executing each standalone test page has a certain
amount of overhead (in Blink, on the order of a few milliseconds up to ten
on a desktop machine; not large, but it adds up over thousands of tests:
for example, 5 ms of per-test overhead across 10,000 tests is roughly 50
seconds per run). On the other hand, bundling a large number of individual
assertions into a single testable unit has its own problems, so in practice
we almost always end up making a tradeoff anyway.
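
Coming back to the suggestion upthread of putting the overall timeout in a
meta value: the attraction, as you say, is that a manifest generator could
pull it out with a cheap scan rather than executing any script. A rough
sketch; the meta name "timeout" and the value formats (a number of seconds
or a named bucket like "long") are purely illustrative, not an agreed
convention:

import re

# Illustrative only: the meta name and value format are assumptions.
TIMEOUT_META_RE = re.compile(
    rb"<meta[^>]+name=.?timeout[^>]+content=.?([0-9]+|long)")

def extract_timeout(path, default="10"):
    # Return the declared per-test timeout, or the default if none is set.
    with open(path, "rb") as f:
        match = TIMEOUT_META_RE.search(f.read())
    return match.group(1).decode("ascii") if match else default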

Received on Thursday, 22 August 2013 17:22:37 UTC