On testing HTML

A few of us got together recently with the idea of improving the state
of Web browser testing at W3C. Since this Group is discussing the
creation of an effort for the purpose of testing the HTML specification,
this is relevant here as well:


The idea started with the fact that we have a number of Working Groups
who are trying to review the way they do testing and, at the same time,
increase the number of tests they produce.

The CSS Working Group was foremost in our minds when it comes to testing.
The Group has several documents in Candidate Recommendation stage that
are waiting for tests and testing. The HTML Working Group is starting to
look into testing as well, and a key component of ensuring the success of
HTML 5 is testing. The specification is quite big, to say the least, and,
when it comes to testing, it's going to require a lot of work. We also
have more and more APIs within the Web Apps group, Device APIs,
Geolocation, etc. The SVG Working Group has a test suite for 1.2, but
they're looking at different ways of testing as well. The MWI Test Suites
framework allows two methods: one requires a human to look at the result
and select pass or fail; the other is more suitable for script tests,
i.e. API testing.

A bunch of us, namely Mike Smith, Fantasai, Jonathan Watt, Doug
Schepers, and myself, decided to get together to discuss this and figure
out how to improve the situation. We focused on three axes: test
submissions, test reviews and how to run a test.

First, we'd ideally like every single Web author to be able to submit
tests, so when they run into a browser bug related to a specification, it
should be easy for them to submit a test to W3C. The system should also
allow browser vendors to submit thousands of tests at once. There is the
question of how much metadata to require when submitting a test. For
example, we do need to know at some point which feature or part of a spec
is being tested. We should also accept as many test formats as possible:
reftests, mochitests, DOM-only tests, human tests, etc. The important
aspect here is to be able to run those tests on as many platforms and
browsers as possible; a test format that can only be run on one browser
is of no use to us.
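
To make the metadata question a bit more concrete, here is a rough
sketch, in Python, of the kind of record that could accompany a submitted
test. The field names and values are purely illustrative assumptions on
my part, not something we have agreed on:

    # Purely illustrative sketch of test-submission metadata.
    # None of these field names are settled; they only show the kind
    # of information we would need to know about each submitted test.
    import json

    test_record = {
        "id": "css21-margin-collapse-001",      # hypothetical test id
        "spec": "http://www.w3.org/TR/CSS21/",  # specification under test
        "section": "8.3.1",                     # part of the spec covered
        "format": "reftest",                    # reftest, script, human, ...
        "author": "author@example.org",
        "files": ["margin-collapse-001.html",
                  "margin-collapse-001-ref.html"],
    }

    print(json.dumps(test_record, indent=2))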

Once a test has been submitted, it needs to be reviewed. The basic idea
behind improving test reviews is to allow more individuals to contribute.
The resources inside W3C aren't enough to review tens of thousands of
tests. We need to involve the community at large through crowd reviews.
This will allow the Working Groups to focus only on the controversial
tests.

Once a test has been reviewed, we need to run it on browsers, as many as
possible. Human tests, for example, are easy to run on all of them, but
that requires a lot of humans. Automatic layout tests are a lot trickier,
especially on mobile devices. We focused on one method during our
gathering: screenshot-based comparison. The basic idea here is that a
screenshot of the page is compared to a reference. Mozilla developed a
technology called reftests that compares Web pages themselves: you write
two pages differently, in ways that are supposed to produce the exact
same rendering, and compare their screenshots. It avoids a lot of the
cross-platform issues one can run into. The way Mozilla is doing that is
via the mozPaint API in debug mode. That works well, but only works in
Mozilla. You can guess that other browser vendors have similar ways to
take screenshots automatically as well.
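
To illustrate the comparison step itself, here is a minimal sketch in
Python using the PIL imaging library. It only checks whether two
screenshots, however they were captured, are pixel-for-pixel identical;
the file names are made up for the example:

    # Minimal sketch: compare a test screenshot against its reference.
    # The two PNG files are assumed to exist already, captured by
    # whatever browser-specific mechanism is available.
    from PIL import Image, ImageChops

    def screenshots_match(test_png, reference_png):
        test = Image.open(test_png).convert("RGB")
        ref = Image.open(reference_png).convert("RGB")
        if test.size != ref.size:
            return False
        # getbbox() returns None when the difference image is all
        # black, i.e. when the two screenshots are identical.
        return ImageChops.difference(test, ref).getbbox() is None

    if __name__ == "__main__":
        print(screenshots_match("margin-collapse-001.png",
                                "margin-collapse-001-ref.png"))
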
We wanted to find a way to do this with all browsers, without forcing
them or us to write significant amounts of code. We found a Web site
called browsertests.org, got in touch with Sylvain Pasche, and, with his
help, started to make some improvements to his application. It works well
on desktops at least. Once again, we don't think W3C is big enough to
replicate all types of browser environments, so we should make it easy
for people to run the tests in their own browsers and report the results
back to us. Plenty of testing frameworks exist already, and we should try
to leverage them as much as possible.
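
As a sketch of what reporting a result back could look like, here is a
small Python example that posts a single result to a collection server.
The URL and the payload fields are placeholders of my own invention, not
an actual interface:

    # Hypothetical sketch of reporting one test result to a
    # collection server; the URL and field names are made up.
    import json
    import urllib.request

    payload = json.dumps({
        "test_id": "css21-margin-collapse-001",
        "browser": "ExampleBrowser/1.0",
        "outcome": "pass",
    }).encode("utf-8")

    req = urllib.request.Request(
        "http://example.org/results",   # placeholder URL
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)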

We started to set up a database for receiving the tests and their
results. We'd like to continue the efforts on the server/database side,
as well as continue to improve Sylvain's application, allowing more test
methods and formats. Testing the CSS or HTML5 parsers, for example,
should be possible.
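
As a rough idea of what such a database could hold, here is a minimal,
purely illustrative schema created with SQLite from Python. The table and
column names are my own assumptions and not the actual design of the
server:

    # Purely illustrative sketch of a tests/results database;
    # the schema is an assumption, not the actual server design.
    import sqlite3

    conn = sqlite3.connect("testing.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS tests (
        id      TEXT PRIMARY KEY,  -- e.g. css21-margin-collapse-001
        spec    TEXT,              -- specification being tested
        format  TEXT               -- reftest, script, human, ...
    );
    CREATE TABLE IF NOT EXISTS results (
        test_id  TEXT REFERENCES tests(id),
        browser  TEXT,             -- user agent string
        outcome  TEXT,             -- pass / fail / unknown
        reported TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    """)
    conn.commit()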

You'll find more information at our unstable server but keep in mind
that:

     1. we're in the very early stages
     2. this server is a temporary one that I managed to steal for a few
        days from our system folks. They'll want it back one of these
        days, and I need to find a more stable home before then.
        I'll update the link once this happens but expect it to break if
        you bookmark it.
     3. Unless I can secure more resources for the project, we won't go
        far by ourselves.

The server also contains links to more resources on the Web related to
various testing efforts, as well as a more complete list of what we wish
the testing framework to accomplish.

In conclusion, I'd like to thank Mike Smith and Doug Schepers, and
especially Jonathan Watt and Fantasai from the Mozilla Foundation. They
all agreed to argue and code for 8 days around the simple idea of
improving the state of testing at W3C. I hope we're going to be able to
get this project off the ground in the near future. If you're interested
in contributing and have ideas and time, don't hesitate to contact me.

Regards,

Philippe
