Re: Ringmark, Core Mobile, and the Financial Times

On 06/11/2012 12:38 PM, Robin Berjon wrote:
> Hi James,
>
> On Jun 11, 2012, at 10:39 , James Graham wrote:
>> The best solution for bugginess is to have high quality public
>> testsuites for various features. The various W3C working groups
>> already curate tests for their various specs and have guidelines /
>> frameworks for writing tests. The main problem is sourcing the
>> tests themselves; so far the dominant contributors by far are
>> browser vendors, and even they do not yet contribute as much as you
>> might hope due to a history of writing tests in ways that are
>> specific to their particular browser or testing setup. This has
>> been improving recently, however.
>
> That's certainly part of the problem, but not the whole story either.
> For most of the issues that Robert brings up the features are
> actually supported and would be reported as such in traditional
> conformance test suites. The fact that they're there but simply
> unusable for developers would not be noted in the implementation
> report.

I agree that some of them are not obviously bugs per se.

> I think that we need to spend more time working on Quality of
> Implementation testing. It's not easy, but it's very valuable. I
> don't know of a single W3C group that has that on its radar — and
> it's definitely something that CoreMob could do. Allow me to start
> hashing some details out below, and let's see if they stick.

Since we do so badly with the low-hanging fruit (testable assertions), I
don't think we are yet at the point where expanding into maintaining
public lists of hard-to-test QoI issues should be a priority. I think it
is sufficient for people to take up these issues with the vendors
directly (e.g. via their bug-reporting systems).

The exception to this is performance benchmarks; it is always useful to
have good benchmarks. But it turns out that making *good* benchmarks is
very challenging, even for people you might expect to be competent (see
[1] for one example that I happen to recall; it is by no means the only
one).
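
To give a flavour of the kind of trap I mean (a hypothetical sketch of
my own, not the case in [1]): a micro-benchmark whose work has no
observable result ends up measuring the optimiser rather than the
feature:

    // Hypothetical sketch (not the case referenced in [1]): the loop's
    // result is never used, so an engine is free to optimise the work
    // away and the number reported says nothing about the feature.
    function naiveBenchmark() {
      var start = Date.now();
      for (var i = 0; i < 100000; i++) {
        Math.sqrt(i); // dead code as far as the engine is concerned
      }
      return Date.now() - start; // also too coarse for fast operations
    }

    // A less misleading variant keeps the result observable and reports
    // time per iteration instead of a single opaque total.
    function betterBenchmark() {
      var iterations = 100000;
      var sink = 0;
      var start = Date.now();
      for (var i = 0; i < iterations; i++) {
        sink += Math.sqrt(i); // result is live, so it cannot be elided
      }
      var elapsed = Date.now() - start;
      return { msPerIteration: elapsed / iterations, sink: sink };
    }

Getting details like this right across engines, devices and timer
resolutions is exactly the part that tends to go wrong.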


> The initial set of tests could include:
>
> 1. CSS transitions at reasonable speed.
> 2. Canvas running at reasonable speed.
> 3. CSS font family fallback used during font loading.
> 4. Reasonable audio latency and mixing.
> 5. AppCache working in reasonable ways.
> 6. Decent UIs for new HTML form control types (unlike the
>    <input type=date> "support" in Safari).
> 7. Have the UI behave "helpfully" on quota increases.

Apart from the problems that others have mentioned, e.g. hardware 
dependence, these also have the problem of being largely subjective. 
What is "reasonable" audio latency and mixing? What is a "decent" UI? To 
take those specific examples further, I don't think anyone disagrees 
that browsers have room to improve the QoI of their audio support; the 
problem is mostly one of priorities and technical complexity. So I don't 
see that the benefit would outweigh the difficulty of defining exactly 
what counts as good enough for the purposes of the test. Form controls 
are a slightly different matter, as some vendors have been known to rush 
out broken implementations just to appear to support a feature on 
feature-testing sites. But to the extent that one can define a minimal 
set of criteria for implementing a feature, those criteria could become 
actual (manual) tests.

>> In an ideal world web developers would also contribute tests,
>> particularly when they find bugs that impede their ability to get
>> things working. In particular it would be nice if the general
>> behaviour of developers on finding a bug was:
>>
>> 1. Write test (using public framework) demonstrating bugs
>> 2. Submit test to W3C
>> 3. Submit bug report to affected browser vendor pointing at public test
>> 4. Look for short-term workaround
>
> You know, this got me thinking: how much work would it actually be to
> set up a jsFiddle equivalent that would come with testharness.js
> preloaded, and would feature a dialog dedicated to saving this as a
> bug, with whatever details about platform, etc. are needed. I reckon
> not that much. We could probably start off from tinker.io
> (https://github.com/chielkunkels/tinker) since it's open source, or
> something similar (if someone has a suggestion, it's welcome).

Interesting idea.
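The sort of thing such a tool would want to make trivial to produce and 
save is a minimal testharness.js page, roughly along these lines (a 
rough sketch of mine; the feature under test is only an illustration):

    <!DOCTYPE html>
    <title>input type=date reflects its type (illustrative testcase)</title>
    <script src="/resources/testharness.js"></script>
    <script src="/resources/testharnessreport.js"></script>
    <script>
    // Rough sketch of the kind of testcase a developer might save
    // straight into a bug report; the assertion itself is only an example.
    test(function() {
      var input = document.createElement("input");
      input.setAttribute("type", "date");
      assert_equals(input.type, "date",
                    "a supporting browser should reflect type=date rather " +
                    "than falling back to text");
    }, "input type=date is reflected in the type IDL attribute");
    </script>

Anything that gets a developer from "I found a bug" to something of that 
shape in one click would lower the barrier a lot.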

> I've been thinking about how best to get input from developers, and
> frankly if you have to get them to find the right spec against which
> to test and then the right group to which to submit, it's just too
> much work for all but the most dedicated. Having us do the triage,
> perhaps partly automated, might make more sense (especially if we can
> then get help on the triage bit as a unit of work made simple
> enough).

I am slightly worried that people who aren't reading the spec are 
unlikely to be writing correct testcases. Working out what the expected 
behaviour is in complex cases is non-trivial (and not always just 
"whatever $popular_engine does" — especially since the cases of interest 
will presumably show browser differences). Also, there hasn't been much 
success in getting (browser) people to do up-front review of public 
testcases; no one's job description includes that task, so it doesn't 
happen much.

On the other hand, having too many testcases to handle would be a new 
and interesting problem to have, so I am not opposed to reducing the 
accidental complexity of making and submitting tests.

Received on Tuesday, 12 June 2012 08:49:50 UTC