Re: Helping implementations test error messages

On Thu, Jul 23, 2015 at 2:49 PM, James Graham <james@hoppipolla.co.uk>
wrote:

> On 23/07/15 13:17, Jeffrey Yasskin wrote:
>
>> Hi testing folks,
>>
>> In implementing the Web Bluetooth spec, we'd like to test that it
>> produces good error messages for developers. However, we'd also like our
>> tests to be usable by other implementations without effectively making
>> the error messages part of the specification. testharness.js doesn't
>> currently have support for this, so I'm looking for guidance on what to
>> add to enable it.
>>
>> An initial proposal from the blink-dev thread on this
>> (
>> https://groups.google.com/a/chromium.org/d/topic/blink-dev/8YfKmyq7ugQ/discussion
>> )
>> is to have assert_throws(func, {name: 'FooError', printMessage: true})
>> call printErrorMessage(e.message), which would be provided by
>> testharnessreport.js. Then Blink would put its expected error messages
>> in our -expected.txt files, and other implementations would be free to
>> check their error messages or not, as they wish. This has the downsides
>> that every implementation has to check the same set of error messages,
>> if they check any, and that it can't really be used from concurrent tests.
>>
>
> Yeah, this strategy seems to me like it isn't really going to work. In
> general I'm finding it hard to think of a setup that's going to work well
> for Chromium, given the way that the expectation data format works there.
> For example I considered adding something like
>
> implementation_defined(unique_name, data);
>
> Which would add a list of (name, data) pairs to the test result message.
> An implementation using wptrunner could then use this data in conjunction
> with .ini files to make any additional implementation-specific checks on
> that data. It would take some effort to implement and be quite a lot of
> hassle for the people maintaining the tests to put the
> implementation-specific parts in the ini file rather than in the test
> directly.
>
> I can also imagine having some convention around using a
> __implementation__.js file in a directory that could be filled in with the
> data for implementation-specific asserts. Then a script could do something
> like:
>
> <script src="__implementation__.js"></script>
> <script>
> test(function() {
>     assert_equals(navigator.userAgent, implementation.userAgent);
> });
> </script>
>
> This would make updating the tests even more of a hassle because you would
> have to be careful not to overwrite these files. You'd still have to be
> careful to quarantine the UA-specific behaviour too.
>
> Honestly I'm not sure that there are so many cases where we have a mix of
> interoperable and implementation-defined behaviour that it's worthwhile to
> make a special case to support it. How much do we lose if people just don't
> upstream tests for implementation-defined behaviour? There are already a
> number of more fundamental things that are hard to upstream, like tests
> which need non-standard or non-web-exposed APIs.
>

We have a huge number of exceptions in the platform, all of which expose
both interoperable and implementation-dependent UI to developers in the
form of their name (interoperable) and message (implementation-dependent).
On the other hand, if we don't do something about this, we're going to have
a hard time upstreaming any of the Web Bluetooth tests, since we want to
check error messages there, and we don't want to have to write a completely
parallel set of error tests to do it.

To deal with "every implementation has to check the same set of error
messages", I'd argue that implementations ought to aim for testing *all* of
the messages, since they're all user-visible. We can't do it all at once,
but we can get there over time.

To deal with asynchronous tests, your implementation_defined(unique_name,
data) should work well. In Chromium, we'd gather calls up until the end of
the test, then sort them by the unique_name and print them for comparison
with the -expected.txt file. We could pass the unique_name in through
assert_throws(func, {name: 'FooError', printMessage: 'unique_name'}).
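
To make that a bit more concrete, here's a rough sketch of the shape I have in
mind. None of these names exist in testharness.js today: implementation_defined
is your proposal, printMessage came out of the blink-dev thread, and the dump
function and the "name: message" output format are just things I made up for
illustration.

// Sketch only: collect (unique_name, data) pairs as the tests run.
var implementation_defined_data = [];

function implementation_defined(unique_name, data) {
  implementation_defined_data.push([unique_name, data]);
}

// At the end of the run, Chromium (or anyone else who cares) could sort the
// pairs by name and print them for comparison against -expected.txt.
function dump_implementation_defined() {
  implementation_defined_data
      .sort(function(a, b) { return a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0; })
      .forEach(function(pair) { console.log(pair[0] + ': ' + pair[1]); });
}

// assert_throws(func, {name: 'FooError', printMessage: 'unique_name'}) would
// keep checking that e.name === 'FooError' as before, and additionally call
// implementation_defined('unique_name', e.message).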

It's not quite as convenient to put our error messages in the -expected.txt
file as it is to inline them in the test, but with the unique_name, I think
it's good enough.
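
For example (with completely made-up unique names and messages, just to show
the shape), the implementation-defined part of an -expected.txt file might end
up looking like:

requestDevice-no-match: No Bluetooth devices matched the given filters.
requestDevice-user-cancelled: User cancelled the requestDevice() chooser.

and an implementation that doesn't want to check messages would simply never
produce those lines.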

Thanks,
Jeffrey

Received on Sunday, 26 July 2015 19:28:20 UTC