RE: [RC6] rgb(50%, ..., ...) or rgb(..., 50%, ...) or rgb(..., ..., 50%): fractional value! from Gérard Talbot on 2012-12-06 (public-css-testsuite@w3.org from December 2012)

From: Gérard Talbot <css21testsuite@gtalbot.org>
Date: Thu, 6 Dec 2012 17:55:10 -0500
To: "Arron Eicholz" <Arron.Eicholz@microsoft.com>
Cc: "Rebecca Hauck" <rhauck@adobe.com>, "Public CSS test suite mailing list" <public-css-testsuite@w3.org>
Message-ID: <119e2c062d3b405daef379adf5585302.squirrel@ed-sh-cp3.entirelydigital.com>
On Thu, 6 December 2012 14:09, Arron Eicholz wrote:
> On Wednesday, December 05, 2012 5:50 PM Gérard Talbot wrote:
>> >
>> > I encountered many of these tests when scrubbing the list of things we
>> > want to fix for the CSS2.1 Test Suite 2.0 release. In fact, there are
>> > ~100 tests with this issue, so coming up with a solution to this will
>> > have a big impact on the overall todo list.
>> >
>> > After looking at them, I noticed that all of them except for those
>> > named
>> > background-color-* use the same rgb() syntax for both the test and
>> > reference elements.
>>
>>
>> Rebecca,
>>
>> Correct. All of them except those named background-color-* use the same
>> rgb() syntax for both the test and reference elements.
>>
>> If the test is about color (and border-[bottom|left|right|top]-color,
>> background-color, outline-color are tests about color), then its
>> associated reftest should be using another method, a different method for
>> expressing/rendering such color. So, if both the test and reftest are
>> using rgb(%, %, %) and comparing a color with a background-color, then it
>> is *not* using a different method, another method of expressing colors,
>> of rendering colors and for comparing colors.
>>
>
> I disagree with what you think these tests are testing. The tests are
> testing if border-*-color supports the rgb() values correctly. If that
> is the case they should match the reference that is verified against the
> image cases. This would mean in anyone's implementation that you must
> have 3 things working correctly at the very least. 1. Image rendering
> must work correctly. 2. Background-color must work and match the image
> rendering. 3. Border-*-color must match the background reference.


If A matches B and B matches C, then A must match C also: transitive
relation. All these *-color-* tests rely on background-color tests
accurately matching an image rendering to begin with.


> If you implemented RGB separately for each property you may have made a
> mistake in your copy of the code path. Who knows how people write
> implementations. At the very least these need to be thought of as more
> or less independent implementations of RGB on a particular property and
> not as a common function for RGB as smarter implementers implement
> things.
>
>> So,
>>
>> http://test.csswg.org/suites/css2.1/nightly-unstable/html4/color-049.htm
>>
>> can never actually fail.
>>
>
> Actually it can fail if a developer implemented RGB for color and then
> some other developer went and implemented RGB for background-color and
> they did not implement things correctly or use a common function for
> RGB. We are making an assumption that there is a common RGB function and
> in some implementations they might not have gotten far enough into
> coding to need a common function yet but they want to test what they
> have. This test can absolutely fail.
>


Arron, what you describe is always possible. But it was not what I had
in mind. Let's assume/postulate that a common rgb() function exists for
background-color, border-*-color, color and outline-color: then a test
such as
http://test.csswg.org/suites/css2.1/nightly-unstable/html4/color-049.htm
will not fail and cannot fail.
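
To illustrate both positions, here is a minimal Python sketch. The two
conversion functions and the render() helper are hypothetical; they are
not taken from any browser. They only model "round to nearest" versus
"truncate" when mapping a percentage to a 0..255 channel. With one shared
conversion, test and reference match whichever rule is used; with two
independent conversions, one per property, they can differ.

```python
def to_channel_rounded(percent):
    """Map an integer percentage to 0..255, rounding half up (integer math)."""
    return (2 * percent * 255 + 100) // 200


def to_channel_truncated(percent):
    """Map an integer percentage to 0..255, discarding the fraction."""
    return (percent * 255) // 100


def render(percent, convert):
    """Hypothetical stand-in for painting rgb(p%, p%, p%) with one conversion."""
    channel = convert(percent)
    return (channel, channel, channel)


# Shared conversion: the test element and the reference element go through
# the same function, so they are equal by construction for either rule.
for convert in (to_channel_rounded, to_channel_truncated):
    for p in range(101):
        assert render(p, convert) == render(p, convert)

# Independent conversions (Arron's scenario): one property rounds, the
# other truncates. These are the percentages where the two would differ.
disagreeing = [
    p for p in range(101)
    if render(p, to_channel_rounded) != render(p, to_channel_truncated)
]
print(disagreeing)
```

Under these assumptions 1% and 50% are among the disagreeing values,
while 20% is not.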


>>
>> >
>> > Example: border-bottom-color-049 has:
>> >
>> >      #test
>> >      {
>> >         border-bottom-style: solid;
>> >         border-bottom-width: 1in;
>> >         border-bottom-color: rgb(1%, 1%, 1%);
>> >         height: 0;
>> >      }
>> >
>> >      #reference
>> >      {
>> >         background-color: rgb(1%, 1%, 1%);
>> >         margin-top: 10px;
>> >      }
>> >
>> >
>> >
>> > First point:
>> > The way this is written, is the fractional color value really a
>> > precision issue? In other words, whether the UA rounds up or down,
>> > isn't it fair to assume that it'd do the same thing for both the
>> test
>> > and reference elements so they'd always match?
>>
>>
>> It is fair to assume that it will do the same rounding (up or down)
>> for
>> both the test and reference and so they should *always* match: in
>> fact,
>> this is already verifiable and verified too. A consequence of this is
>> that, eg.
>>
>
> No you can't make that assumption unless you know exactly how the code
> was written.


Empirical results then:

http://www.gtalbot.org/BrowserBugsSection/css21testsuite/color-rgb-testing.html


> Our typical implementations seem consistent but what if I
> am some new implementer and have never written a browser before? Then
> what? We can't make assumptions based on our own browser experience. We
> need to think as if we are new implementers that do things piecemeal
> and may not make the best coding choices starting from scratch. The
> assumption I made was that at the very least background-color can be
> compared against a real baseline and then from there we can compare
> the others against background-color. If we don't like that choice


From the beginning, I was not aware of that testing design choice.


> then we
> need to use the image in all the *-color property cases.


I thought we should use the image in all the *-color property cases but
now that you have explained the pivoting transitivity with
background-color tests, I think we can spare ourselves all this
updating of *-color tests.


>
>> http://test.csswg.org/suites/css2.1/nightly-unstable/html4/color-049.htm
>>
>> can never actually fail.
>>
>
> Yes this case can fail as I stated above.
>
>
>>
>> > Or am I missing something here?
>> >
>> >
>> > Second point:
>> > However, does this expose a different weakness in the test?
>>
>>
>> I think it does expose a weakness in the test. An automated checking
>> of
>> background-color-* test (software comparing screenshots) would
>> eventually report failures that no human eyes would be able to
>> see/notice to begin with.
>>
>> rgb(1%, 1%, 1%) is *not* #020202 in some browsers.
>>
>
> This is a weakness in the spec not really a weakness in the test.


We want tests to work elegantly around difficulties (grey areas, spec
silence, user agent stylesheet default values) or known weaknesses of
the spec and only reveal the result about a well targeted feature or
code situation. I could provide several examples of doing so.



> The
> spec is not clear enough on how to handle these fractional values. I
> assumed that the rounding occurred based on common practices in all the
> painting apps I have ever used. I think for these specific cases we may
> need to do one of the solutions below:
>
> 1. Remove percentage from the spec since it is untestable and cannot be
> interoperable because we do not define rounding correctly or how
> scenarios like these can be verified.
> 2. Define rounding of color values explicitly. Right now the spec only
> says that values can be approximate, section 6.1.4. In my opinion
> rounding is the way we approximate things for this case. That is at
> least how every photo editing program works that I have ever tested.
> 3. Update the background-color cases to have 2 references, one ref on
> either side of the value being defined.


In the past, for situations involving fractional values, you have always
opted for solution 3.
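
To make solution 3 concrete, here is a minimal Python sketch that
computes the two candidate reference colors for a grey rgb(p%, p%, p%):
one from rounding the channel down, one from rounding it up. The function
name bracketing_references is hypothetical, and the sketch assumes the
only disagreement between user agents is the rounding direction.

```python
def bracketing_references(percent):
    """Return (low, high) hex colors on either side of rgb(p%, p%, p%)."""
    scaled = percent * 255
    low = scaled // 100            # channel value rounded down
    high = -((-scaled) // 100)     # channel value rounded up (integer ceiling)

    def grey(value):
        return "#{0:02x}{0:02x}{0:02x}".format(value)

    return grey(low), grey(high)


for p in (1, 50, 99):
    print(p, bracketing_references(p))
```

For 1% this gives #020202 and #030303, so a background-color test would
pass if the rendering matches either support image.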


>
>>
>> rgb(1%, 1%, 1%) is equivalent to #020202 in some other browsers.
>>
>> Therefore, there is a weakness in the test which can be explained.
>>
>
> Actually the weakness is not in the test the weakness is in the person
> verifying the test.


I presume that we also want tests to be formalized to a point where even
a robot would eventually be able to verify tests, especially
"boring"/tedious tests about colors and comparing colors. Or any
automated process like comparing screenshots.


> Though I am not saying that you have to have
> superhuman eyes, what I am saying is that the verification method may
> not be someone looking at the test either. And in some cases it might be
> best if the verification were not a normal person looking at the test.

You'd be surprised how nitpicking, thorough or strictly accurate some
people can be with colors and color names.

Some people (or software, or a robot) will say that
http://test.csswg.org/suites/css2.1/20110323/html4/inlines-020.htm
is failed by all browsers. You see, lime is not green and green is not
lime.

Compare
http://test.csswg.org/suites/css2.1/20110323/html4/blocks-011.htm
with
http://test.csswg.org/suites/css2.1/20110323/html4/background-image-cover-001.htm
and you will get some people (or even software or a robot) to make one
test fail because navy is not blue and blue is not navy. If the 2 tests
are taken consecutively, a minority of people will understandably
hesitate.

Some people will not be able to make a call/verdict for these 2 tests:
http://test.csswg.org/suites/css2.1/20110323/html4/border-top-style-009.htm
or
http://test.csswg.org/suites/css2.1/20110323/html4/border-left-style-009.htm

Etc.


>
>> The thing is:
>>
>> background-color-049 would eventually be reported as a failure in some
>> browsers while color-049 would never be reported as a failure in any
>> browser.
>>
>
> If that is the case for background-color-049, then how do we fix it? We
> can't unless we do:
>
> 1. Remove percentage from the spec since it is untestable and cannot be
> interoperable because we do not define rounding correctly or how
> scenarios like these can be verified.
> 2. Define rounding of color values explicitly. Right now the spec only
> says that values can be approximate, section 6.1.4. In my opinion
> rounding is the way we approximate things for this case. That is at
> least how every photo editing program works that I have ever tested.
> 3. Update the background-color cases to have 2 references, one ref on
> either side of the value being defined.
>


We do 3.




>> So, we're not even consistent in doing tests and we're not really
>> doing
>> good testing either.
>>
>
> I disagree with this statement completely. I believe you are looking at
> things from a known browser perspective. I am looking at these tests
> from a spec perspective like there were no browsers to even test this in
> to begin with. In fact I write almost all my test suite cases without
> ever opening them in a browser, that way I am never influenced by
> browser behavior. It is not until I am verifying my browser that I
> actually run them.
>


If we were face-to-face or on the phone, it would be interesting to
discuss this. I don't necessarily disagree with you on everything.


Gérard

>>
>> > Since this
>> > is
>> > testing rgb() with % args with a particular property in focus, is it
>> an
>> > accepted practice to use the same input for the reference rendering
>> > using
>> > a different property?
>>
>>
>> A different property: yes, in a very wide majority of tests and
>> situations. But here, what annoyed me from the beginning is that we
>> are
>> comparing 2 colors with the same feature, with the same method of
>> "creating" such color code, which is rgb(%, %, %).
>>
>
> We are testing that the property supports the color here. The verification of
> the color is actually tested by the image which is done in
> background-color cases.  Again it seems like you are mixing up what is
> actually being tested with browser testing.
>
>>
>> > I understand you must make assumptions about the
>> > behavior or stability of everything you use in a test file. But if
>> this
>> > test failed, it would be difficult to tell right away where the
>> point of
>> > failure is - the test property, the ref property or the rgb() value.
>>  If
>> > it is indeed acceptable to construct a test this way, then my first
>> > point
>> > still stands.
>> >
>> > Third point:
>> > Specifically for the tests with 50% values - Nothing about those tests
>> > is special to 50%. I think these can be changed to 20%, 40%, 60% or 80%
>> > and compared to non-fractional rgb values (that is, if the %'s need to
>> > be removed at all)
>>
>>
>> Those percentage numbers are good numbers:
>>
>> rgb(20%, 20%, 20%) == #333333  (20% of 255 == 51)
>>
>> rgb(40%, 40%, 40%) == #666666  (40% of 255 == 102)
>>
>> rgb(60%, 60%, 60%) == #999999  (60% of 255 == 153)
>>
>> rgb(80%, 80%, 80%) == #CCCCCC  (80% of 255 == 204)
>>
>>
>> >
>> > The tests named background-color-* all use pngs as a reference, so
>> those
>> > are definitely problematic. Changing the 50% tests to 40% would fix
>> some
>> > of them, but I don't have a solution for those testing 1% and 99%.
>> >
>> > And, speaking briefly about it with Arron this morning, I
>> understand
>> > there
>> > are several hundred more that have this issue - basically all
>> > color-related tests, so this extends wider than what we currently
>> have
>> > identified for the 2.0 release.
>> >
>> > Thoughts?
>>
>> Proposal
>>
>> a) remove all the tests with rgb(1%, , ) or with rgb( , 1%, ) or with
>> rgb( , , 1%) and with rgb(99%, , ) or with rgb( , 99%, ) or with
>> rgb( , , 99%)
>
> We cannot remove those tests: we are testing the boundary cases, and
> standard testing practice always tests boundary scenarios. It would be
> bad practice to go against that.
> Also, if we remove those cases, we have just proved that percentage
> values cannot be tested or be interoperable; this means that the feature
> should be removed from the specification, since we have just proved that
> it can't work interoperably.
>
>> b) convert the ones using rgb(50%, , ) or with rgb( , 50%, ) or with
>> rgb( , , 50%) to be using 40% instead
>> c) create support files with #660000, #006600, #000066 and #666666
>>
>
> I'm ok with this. The 50% values are nominal values and they can be any
> value between 2% and 98%, so it really doesn't matter what those values
> are. I just picked 50% because it was halfway between 0 and 100. This
> is the case for all nominal value cases. If there are other scenarios
> for nominal cases that fall into the fractional problem, we should
> probably address those in a similar way.
>
> --
> Thanks,
> Arron Eicholz
>
>
>
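
The 20%, 40%, 60% and 80% figures quoted above can be checked
mechanically. This is a minimal Python sketch, assuming the usual mapping
of p% to p * 255 / 100 on an 8-bit channel; the names exact_percentages
and hex_for are hypothetical. It lists every integer percentage that
lands on a whole channel value, which is what makes a non-fractional
reference color possible.

```python
# Integer percentages whose 8-bit channel value has no fractional part.
exact_percentages = [p for p in range(101) if (p * 255) % 100 == 0]


def hex_for(percent):
    """Hex grey for an exact percentage, e.g. 40 -> '#666666'."""
    value = (percent * 255) // 100
    return "#{0:02x}{0:02x}{0:02x}".format(value)


for p in exact_percentages:
    print("rgb({0}%, {0}%, {0}%) == {1}".format(p, hex_for(p)))
```

Only the multiples of 20 qualify, so 1%, 50% and 99% always leave a
fraction for the user agent to round.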


-- 
Contributions to the CSS 2.1 test suite:
http://www.gtalbot.org/BrowserBugsSection/css21testsuite/

CSS 2.1 Test suite RC6, March 23rd 2011:
http://test.csswg.org/suites/css2.1/20110323/html4/toc.html

CSS 2.1 test suite harness:
http://test.csswg.org/harness/

Contributing to the CSS 2.1 test suite:
http://www.gtalbot.org/BrowserBugsSection/css21testsuite/web-authors-contributions-css21-testsuite.html
Received on Thursday, 6 December 2012 22:55:45 UTC