Re: Ringmark, Core Mobile, and the Financial Times from Robin Berjon on 2012-06-11 (public-coremob@w3.org from June 2012)

From: Robin Berjon <robin@berjon.com>
Date: Mon, 11 Jun 2012 12:38:52 +0200
To: James Graham <jgraham@opera.com>
Cc: public-coremob@w3.org
Message-Id: <E34383AD-514B-446A-9607-684928EECC4E@berjon.com>

Hi James,

On Jun 11, 2012, at 10:39 , James Graham wrote:
> The best solution for bugginess is to have high quality public testsuites for various features. The various W3C working groups already curate tests for their various specs and have guidelines / frameworks for writing tests. The main problem is sourcing the tests themselves; so far the dominant contributors by far are browser vendors, and even they do not yet contribute as much as you might hope due to a history of writing tests in ways that are specific to their particular browser or testing setup. This has been improving recently, however.

That's certainly part of the problem, but not the whole story. For most of the issues that Robert brings up, the features are actually supported and would be reported as such in traditional conformance test suites. The fact that they're there but simply unusable for developers would not be noted in the implementation report.

I think that we need to spend more time working on Quality of Implementation testing. It's not easy, but it's very valuable. I don't know of a single W3C group that has that on its radar, and it's definitely something that CoreMob could do. Allow me to start hashing some details out below, and let's see if they stick.

This new suite would be called the Browser Implementation Quality test suite, or BrowserIQ for short. I reckon that we could start with a relatively short list of tests (say, ten), and for each test we gather results for each browser on an average device and rate it as "Fail", "Sluggish", or "Pass". We publish a list of results showing nice big red/orange/green squares next to browser names, ordered from best to worst.
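
To make the rating and ordering idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the type and function names, the numeric weights (Pass = 2, Sluggish = 1, Fail = 0), and the colour mapping are hypothetical, not an agreed BrowserIQ format.

```typescript
// Hypothetical names throughout; nothing here is an agreed BrowserIQ format.
type Rating = "Fail" | "Sluggish" | "Pass";

interface BrowserResult {
  browser: string;
  ratings: Rating[]; // one rating per test, in test order
}

// Assumed weights used only to order browsers from best to worst.
const SCORE: Record<Rating, number> = { Fail: 0, Sluggish: 1, Pass: 2 };

// Assumed colour for the square shown next to each result.
const COLOUR: Record<Rating, string> = {
  Fail: "red",
  Sluggish: "orange",
  Pass: "green",
};

// Sum of the weights of all ratings for one browser.
function totalScore(result: BrowserResult): number {
  return result.ratings.reduce((sum, rating) => sum + SCORE[rating], 0);
}

// Returns a new array ordered from best to worst; the input is not mutated.
function rank(results: BrowserResult[]): BrowserResult[] {
  return [...results].sort((a, b) => totalScore(b) - totalScore(a));
}

// Placeholder data, not real measurements.
const sample: BrowserResult[] = [
  { browser: "Browser A", ratings: ["Pass", "Fail", "Sluggish"] },
  { browser: "Browser B", ratings: ["Pass", "Pass", "Sluggish"] },
];

for (const result of rank(sample)) {
  console.log(result.browser, result.ratings.map((r) => COLOUR[r]).join(" "));
}
```

A simple sum is only one way to order the list; counting a "Fail" more heavily than a "Sluggish" is an equally plausible choice.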

We also ask developers to submit pointers to QoI issues they're having. We take the good ones to expand our list. Lather, rinse, repeat.

The initial set of tests could include:

    1. CSS transitions at reasonable speed.
    2. Canvas running at reasonable speed.
    3. CSS font family fallback used during font loading.
    4. Reasonable audio latency and mixing.
    5. AppCache working in reasonable ways.
    6. Decent UIs for new HTML form control types (unlike the <input type=date> "support" in Safari).
    7. Have the UI behave "helpfully" on quota increases.

There are certainly others — that's just off the top of my head.
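
For the "reasonable speed" items, one possible way to turn a measurement into a rating is sketched below. It assumes the test page records a timestamp for each rendered frame (for example from an animation callback), and the two thresholds are made-up numbers for illustration, not taken from any spec or from this group.

```typescript
type SpeedRating = "Fail" | "Sluggish" | "Pass";

// Assumed thresholds in frames per second; purely illustrative.
const PASS_FPS = 50;
const SLUGGISH_FPS = 20;

// frameTimestamps: one millisecond timestamp per rendered frame,
// in increasing order. Returns 0 when there is too little data.
function averageFps(frameTimestamps: number[]): number {
  if (frameTimestamps.length < 2) return 0;
  const first = frameTimestamps[0];
  const last = frameTimestamps[frameTimestamps.length - 1];
  const elapsedMs = last - first;
  if (elapsedMs <= 0) return 0;
  // N timestamps delimit N - 1 frame intervals.
  return ((frameTimestamps.length - 1) * 1000) / elapsedMs;
}

// Maps the measured average frame rate onto the three ratings.
function rateSpeed(frameTimestamps: number[]): SpeedRating {
  const fps = averageFps(frameTimestamps);
  if (fps >= PASS_FPS) return "Pass";
  if (fps >= SLUGGISH_FPS) return "Sluggish";
  return "Fail";
}

console.log(rateSpeed([0, 16, 32, 48])); // 3 intervals in 48 ms
console.log(rateSpeed([0, 40, 80])); // 2 intervals in 80 ms
```

An average hides stutter, so a real test would probably also want to look at the worst frame intervals, not just the mean.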

Thoughts?

> In an ideal world web developers would also contribute tests, particularly when they find bugs that impede their ability to get things working. In particular it would be nice if the general behaviour of developers on finding a bug was:
> 
> 1. Write test (using public framework) demonstrating bugs
> 2. Submit test to W3C
> 3. Submit bug report to affected browser vendor pointing at public test
> 4. Look for short-term workaround

You know, this got me thinking: how much work would it actually be to set up a jsFiddle equivalent that would come with testharness.js preloaded, and would feature a dialog dedicated to saving the test as a bug, with whatever details about platform, etc. are needed? I reckon not that much. We could probably start off from tinker.io (https://github.com/chielkunkels/tinker) since it's open source, or something similar (if someone has a suggestion, it's welcome).
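
As a rough idea of what the "save this as a bug" dialog might produce, here is a sketch of a report record. The field names and the `buildBugReport` helper are hypothetical; they are not part of testharness.js, jsFiddle, or tinker.

```typescript
// Hypothetical shape of what the dialog collects from the developer.
interface DialogInput {
  title: string;
  testSource: string; // the test the developer wrote in the editor
  userAgent: string;
  platform: string; // device / OS details, free text
}

// Hypothetical stored record: the dialog input plus a submission time.
interface BugReport extends DialogInput {
  submittedAt: string; // ISO 8601 timestamp
}

// Rejects reports with empty fields so that triage always has the
// test source and the platform details to work with.
function buildBugReport(input: DialogInput, now: Date = new Date()): BugReport {
  const fields: (keyof DialogInput)[] = [
    "title",
    "testSource",
    "userAgent",
    "platform",
  ];
  const missing = fields.filter((field) => input[field].trim() === "");
  if (missing.length > 0) {
    throw new Error("Missing fields: " + missing.join(", "));
  }
  return {
    title: input.title,
    testSource: input.testSource,
    userAgent: input.userAgent,
    platform: input.platform,
    submittedAt: now.toISOString(),
  };
}

// Placeholder values only.
const exampleReport = buildBugReport({
  title: "Example report title",
  testSource: "/* test source goes here */",
  userAgent: "ExampleBrowser/1.0",
  platform: "Example phone",
});
console.log(exampleReport.submittedAt);
```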

I may be able to get around to setting that up — suggestions welcome.

I've been thinking about how best to get input from developers, and frankly, if you have to get them to find the right spec against which to test, and then the right group to which to submit, it's just too much work for all but the most dedicated. Having us do the triage, perhaps partly automated, might make more sense (especially if triage can be made a simple enough unit of work that we can then get help with it).

> Quite how this interacts with ringmark I don't know. For various reasons ringmark can't possibly run all the tests needed to verify that specs are well implemented, both due to its architecture (one document, javascript tests only, etc.) and the amount of time more comprehensive testing would take. It is not really clear what the value proposition of ringmark as an incomplete testsuite is compared to having a static list of browsers that are believed (based on more comprehensive testsuites) to be "good enough" in their implementation of a given feature.

I've said this before, I'll say it again: Ringmark was provided as input to this group and it is up to us to take it forward in whichever ways we see fit. What's more, there is nothing that constrains this group inside Ringmark — we can create other tools, other testing approaches, etc. if we find them useful.

-- 
Robin Berjon - http://berjon.com/ - @robinberjon
Received on Monday, 11 June 2012 10:39:20 UTC