Re: Writing Modes PR & Testsuite

Le 2017-01-12 17:48, Geoffrey Sneddon a écrit :
> Yesterday, we resolved to publish either a CR or, preferably if the
> Process allows it, a PR of Writing Modes Level 3.
> However, I'm unconvinced we have a testsuite sufficient to leave CR
> (and proceed to PR) with; we seem to have hundreds of reftests which
> fail due to anti-aliasing differences.


Can you specify your operating system, your anti-aliasing settings, 
your browser and browser version, and your screen resolution?

> According to the test harness, Firefox passes 89.02% of the testsuite
> (which it makes out to be 1120 tests); running locally all automated
> tests,

What exactly are you doing to run the automated tests locally? I am 
trying to understand and reproduce your findings.

> I get 1045 tests (910 parents, 135 subtests;

3 comments on this.

1- It is true that a bunch of reftests are in fact multiple subtests.


is in fact 8 subtests.

2- There are 2 reftests that are "combo"-style tests.

3- Some tests are linked to multiple sections of the spec. Eg., the 
table-progression-vlr-003 test is listed only once in

but it will appear 3 times in the test harness results page.

So, the difference between the 1045 number you get and the 1120 number 
in the test harness results page is perfectly understandable and 
explainable.

> as far as I'm
> aware the harness has no notion of harnesses), of which 457 pass and
> 588 fail: this implies that 89.02% of the entire testsuite cannot
> pass.

This, I believe, boils down to how strict the person or the software 
running the 1045 tests is when deciding whether a reftest passes or 
fails.
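That strictness can be made concrete. Below is a minimal sketch (in 
Python; this is not the code of the actual test harness, and the pixel 
values are made up) of a screenshot comparison in which a per-channel 
tolerance decides whether anti-aliasing noise counts as a failure:

```python
def reftest_passes(test_px, ref_px, max_channel_diff=0, max_differing=0):
    """Compare two screenshots given as rows of (r, g, b) tuples.

    A strict comparison (the defaults) fails on any difference at all;
    allowing a small per-channel delta tolerates anti-aliasing noise.
    """
    differing = 0
    for test_row, ref_row in zip(test_px, ref_px):
        for (r1, g1, b1), (r2, g2, b2) in zip(test_row, ref_row):
            delta = max(abs(r1 - r2), abs(g1 - g2), abs(b1 - b2))
            if delta > max_channel_diff:
                differing += 1
    return differing <= max_differing

# Two renderings of the same glyph edge: identical except one pixel
# whose red channel differs by 3 (typical anti-aliasing noise).
a = [[(0, 0, 0), (255, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
b = [[(0, 0, 0), (252, 0, 0)], [(0, 0, 0), (0, 0, 0)]]

print(reftest_passes(a, b))                      # strict: False
print(reftest_passes(a, b, max_channel_diff=4))  # with tolerance: True
```

With a tolerance of zero, a single anti-aliased edge pixel is enough to 
fail a reftest; a human eyeballing the two renderings would call them 
identical.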
> Looking into many of the failures, it quickly became apparent that
> hundreds of these failures is down to anti-aliasing differences in
> reftests;

The tests were designed for the Writing Modes specification. As the 
author of many of those tests, I am convinced that tests should first 
be evaluated, reviewed, criticized (and eventually improved) from the 
perspective of the Writing Modes specification. If a test fails a 
software-run screenshot comparison because of anti-aliasing 
differences, that does not mean the test was incorrectly designed or 
incorrectly coded from a Writing Modes perspective. Browsers and 
operating systems control how a font is handled, including 
anti-aliasing, and trying to overcome each and every browser and os 
anti-aliasing idiosyncrasy for Ahem glyphs a) rotated 90° or b) 
translated upright or c) rendered full-width is currently impossible 
for a test author. (The vertical advance of rotated or translated 
glyphs is another such idiosyncrasy.)

> I've filed a bug for this at
> <>.

Please elaborate on the anti-aliasing differences you noticed between 
this test:

and its reference file:

because, over here, I see no difference between the two. Even at 
'font-size: 60px', I see no difference. Screenshots could help here.

- - - - - -

In your list,


have been marked as failed by Firefox 50+ in the test harness results.

> As such, I at least don't view the testsuite as ready to publish a PR
> and would raise a formal objection if we resolved to do so.
> /Geoffrey.

The problem is with the handling of the Ahem font by the browser and os 
when glyphs are rotated 90°, translated upright or rendered full-width. 
And right now, there is no known code solution that lets the tests work 
elegantly around this issue for all browsers and all os-es. The 
solution to the problem you see will have to come from the font itself 
and/or from the browser manufacturers and/or from the operating 
systems. (According to Xidorn Quan [1], Windows 10 and Edge have solved 
or worked around this anti-aliasing issue for [a minority? a majority? 
all?] writing modes tests.) It will not come from the tests themselves.

In all fairness, tests should first be evaluated by their design, their 
code and their goal, and only then by their precision. We adjusted and 
fixed close to one thousand (1000) CSS2.1 tests because their code was 
creating fractions of a pixel and there was a possible and elegant way 
to prevent such precision issues for all browsers and all os-es. And 
there are still a bunch of tests that have not been fixed so far: eg. 
the border-top-width-014 and word-spacing-043 tests generate fractional 
pixels.
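To illustrate the fractional-pixel problem (the numbers below are 
invented for illustration, not taken from the actual tests): an 
em-based length can compute to a fraction of a CSS pixel, and browsers 
are free to snap that fraction to whole device pixels differently, 
which then shows up as a one-pixel difference in a screenshot 
comparison.

```python
import math

# Hypothetical values, for illustration only.
font_size = 15          # px
border_width_em = 1.5   # a border declared as 1.5em

computed = font_size * border_width_em  # 22.5px: a fractional pixel

# Browsers must snap this to whole device pixels, and they may
# legitimately round it in different directions:
print(computed)              # 22.5
print(math.floor(computed))  # 22 (one possible snapping)
print(math.ceil(computed))   # 23 (another possible snapping)
```

Choosing font sizes and lengths whose products are whole pixels is the 
kind of "possible and elegant" fix that was applied to the CSS2.1 
tests; no analogous fix is known for anti-aliasing of rotated glyphs.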

[1] "Edge can render them perfectly, so I suppose it is somehow 


P.S.: please send testsuite-related matters to the public-css-testsuite 
mailing list.

Received on Monday, 16 January 2017 20:54:30 UTC