- From: Dirk Pranke <dpranke@chromium.org>
- Date: Tue, 1 Jul 2014 09:19:28 -0700
- To: James Graham <james@hoppipolla.co.uk>
- Cc: public-test-infra <public-test-infra@w3.org>
- Message-ID: <CAEoffTBrCGaVgQqn7iWQo0S7sa+Y1NBOU9HdfdCrZH9fTSwTow@mail.gmail.com>
On Tue, Jul 1, 2014 at 4:33 AM, James Graham <james@hoppipolla.co.uk> wrote:

> On 01/07/14 01:22, Dirk Pranke wrote:
> >
> > On Mon, Jun 30, 2014 at 5:06 PM, Anton Modella Quintana (Plain
> > Concepts Corporation) <v-antonm@microsoft.com
> > <mailto:v-antonm@microsoft.com>> wrote:
> >
> >     Hello public-test-infra,
> >
> >     As Erika said previously [1], Microsoft is working on adding
> >     support for IE to wptrunner and contributing back as much as we
> >     can. While we created our first internal prototype, one of the
> >     problems we found was with the ref tests. Some of them were failing
> >     just because the antialiasing on a curve was different depending on
> >     the browser. I don't think those tests should fail.
> >
> >     To mitigate the number of false negatives we tested different
> >     approaches, and in the end we decided to use ImageMagick, its
> >     compare tool and a fuzz factor [2]. Basically we compare how
> >     different the two images are, and if we get a factor equal to or
> >     less than 0.015 then we pass the test. This value is experimental
> >     and it is the best we got after trying different algorithms and
> >     factors. I've attached a few images for you to better see how, even
> >     if the images are not exactly equal, the test should pass (at
> >     least in this example).
> >
> >     Some concerns about this approach:
> >     * It has a dependency on ImageMagick (we could implement the
> >       algorithm to remove this dependency if needed)
> >     * There might be some tests where the factor should be tweaked or
> >       even disabled. This number could even change depending on the
> >       browser we are testing
> >
> >     So what does public-test-infra think of this?
> >
> > I believe that I have seen similar sorts of reftest failures in Blink
> > and WebKit over the years as well, though I'm not sure if we have them
> > currently (we probably do).
>
> I know we have similar problems with Mozilla reftests. I think our
> current solution is simply to quantify the maximum number of pixels that
> can be different. I was hoping we could avoid solving this for
> web-platform-tests, but maybe that's over-optimistic. Do you have a list
> of the tests that are giving incorrect results without the use of
> ImageMagick?
>
> > I would be a bit sad to pull in a dependency on ImageMagick given that
> > it is in Perl, but presumably different platforms can do different
> > things as need be.
>
> That requires us to understand the algorithm, to the level that we can
> reimplement it. I'm not sure we currently have that level of
> understanding of what ImageMagick does.

I certainly don't, that's true. But if I were being a standards purist, it
seems like defining fuzzy matching criteria would be a good idea, rather
than leaving it implementation-defined.

That said, I'm not being a standards purist and I'd rather focus on
whatever gets people running more tests more often :).

-- Dirk
Received on Tuesday, 1 July 2014 16:20:15 UTC
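For readers who want to experiment with the approach described in the thread, here is a minimal sketch. It assumes ImageMagick's `compare` tool is on PATH; the choice of the RMSE metric, the `null:` output target, and the helper name `images_fuzzily_equal` are illustrative assumptions, since the thread does not say exactly which metric or invocation Microsoft's prototype used beyond a fuzz comparison with an experimental 0.015 pass threshold.

```python
# Illustrative sketch only: the helper name, the RMSE metric, and the way
# the threshold is applied are assumptions; the thread only states that an
# ImageMagick fuzz comparison with an experimental 0.015 cut-off was used.
import re
import subprocess

FUZZY_PASS_THRESHOLD = 0.015  # experimental value mentioned in the thread


def images_fuzzily_equal(expected_png, actual_png,
                         threshold=FUZZY_PASS_THRESHOLD):
    """Return True if two reftest screenshots are 'close enough'.

    `compare -metric RMSE a.png b.png null:` prints "absolute (normalized)"
    to stderr and exits 0 for identical images, 1 for differing images,
    and 2 on error.
    """
    proc = subprocess.Popen(
        ["compare", "-metric", "RMSE", expected_png, actual_png, "null:"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    _, stderr = proc.communicate()
    output = stderr.decode("utf-8", "replace")
    if proc.returncode == 2:
        raise RuntimeError("ImageMagick compare failed: %s" % output)
    # Output looks like e.g. "352.847 (0.00538)"; the parenthesised number
    # is the RMSE normalised to the 0..1 range.
    match = re.search(r"\(([0-9.eE+-]+)\)", output)
    if not match:
        raise RuntimeError("Could not parse compare output: %s" % output)
    return float(match.group(1)) <= threshold
```

James's alternative of capping the maximum number of differing pixels could be approximated the same way by swapping `RMSE` for the `AE` (absolute error, i.e. differing-pixel count) metric and comparing against a pixel-count threshold instead of a normalised one.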