[CSSWG] Minutes F2F 2009-06-03 Part I: Testing from fantasai on 2009-06-17 (www-style@w3.org from June 2009)

From: fantasai <fantasai.lists@inkedblade.net>
Date: Tue, 16 Jun 2009 23:23:52 -0700
To: www-style@w3.org
Message-ID: <4A388BF8.1060608@inkedblade.net>
Present:
   David Baron
   Bert Bos
   John Daggett
   Arron Eicholz
   Elika Etemad
   Sylvain Galineau
   Daniel Glazman
   Molly Holzschlag
   Håkon Wium Lie
   Chris Lilley
   Alex Mogilevsky
   Anne van Kesteren
   Steve Zilles

<RRSAgent> logging to http://www.w3.org/2009/06/03-CSS-irc
Meeting: CSS Working Group Face-to-Face
Chair: Daniel Glazman

Test Suites
-----------
Scribe: ChrisL

   Daniel: CSS2.1 test suite is a major item
   Daniel: we should have the test suite and implementation reports by end
           of 2009 to be on time for charter
   Daniel: problem is to decide when to stop, which we are not good at
   Daniel: it's for CR exit criteria and then secondly a continuously improved,
           large one
   Daniel: we can't improve the first one forever or we will never get to rec
   Daniel: also we have many modules getting ready for CR and no test suites
           for most of them
   fantasai: we have it for some, build scripts need some work
   Daniel: test suites are part of the technical work. it's part of what we
           do. we need sustained commitment to finish specs by doing test
           suites. otherwise the specs are useless
   Arron: is the process fully documented?
   jdaggett: at last f2f we parcelled out the tests. I looked at the font
             submissions, it was not clear where some of the tests came from
   jdaggett: there are build scripts that don't work, tests that are in svn,
             tests on ms page not in svn. format varies too
   jdaggett: so it all needs some work
   fantasai: yes, some is half finished
   fantasai: supposed to have build scripts continually running, that's not
             been done yet
   jdaggett: i was writing a script to try and pull all this together
             because looking at source one by one is tedious
   fantasai: ok so let's get that script and put it on the server
   jdaggett: it's not packaged, and volunteers need a packaged product and
             platform-independent instructions
   jdaggett: eg rendering can be off by half a pixel, rather than "switch
             off cleartype"
   jdaggett: we need packaged zips of tests, with a version number, so we
             know what people have tested
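
A minimal sketch of the versioned packaging jdaggett asks for, assuming a directory of already-built tests and a date-style version string. The function name `package_tests`, the archive name, and the `VERSION` file are hypothetical and are not part of the group's build scripts.

```python
import zipfile
from pathlib import Path


def package_tests(test_dir, out_dir, version):
    """Zip every file under test_dir into out_dir/css21-tests-<version>.zip.

    The archive name and the VERSION file are assumptions for illustration.
    """
    test_dir = Path(test_dir)
    archive = Path(out_dir) / f"css21-tests-{version}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # Record the version inside the archive so testers can report it.
        zf.writestr("VERSION", version + "\n")
        for path in sorted(test_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the test root, with forward slashes.
                zf.write(path, path.relative_to(test_dir).as_posix())
    return archive
```

An implementation report could then quote the version string from the archive it was run against.
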
   fantasai: not looked at platform specifics, probably mainly affects
             fonts, will need to look at this more
   fantasai: need to get the build scripts working
   jdaggett: need to prune out obsolete stuff like old build scripts, old
             instructions that are wrong, etc
   fantasai: yes some of it is a mess but it's possible to review from the source
   Arron: chapter 4 needs http
   jdaggett: there are instructions that say install this font then take
             it out "at the end" ... end of what?
   howcome: tests are better online, but yes we need versioning, dated
            versions like in CSS1 test suite
   jdaggett: main thing is it should be simple to run
   howcome: should not call them conformance tests
   (several agrees)
   howcome: want to see the term removed. don't call them a "conformance"
            test suite
   fantasai: easy to fix
   jdaggett: a lot of pages, some outdated. need to tidy them up
   jdaggett: never clear what the status was. some need to be marked as
             not maintained
   Anne: please cite specific examples
   <fantasai> http://www.w3.org/Style/CSS/Test/
   jdaggett: people need to know what to ignore
   Bert: wiki pages are just scratch pads, ignore those
   fantasai: no, wiki pages say how to contribute to the project
   jdaggett: are we going to merge these?
   fantasai: yes eventually
   Arron: this page is the central hub but it needs links to instructions
   fantasai: wiki has the instructions
   <jdaggett> http://wiki.csswg.org/test/css2.1
   jdaggett: wiki talks of cvs but now we are using svn
   fantasai: happy to move tests to the w3c site, but need them to be in
             svn as we often move directories
   jdaggett: so already reviewed tests will be supplemented by ms tests,
             once reformatted?
   fantasai: should be same format
   jdaggett: but there can be tests per chapter
   fantasai: will make snapshots more regularly, also combine so we have
             svn as the master repository
   jdaggett: main thing is to have the documentation more clear
   fantasai: (argues convincingly that it's easier to tidy up the structure
             than document the existing brokenness)
   Anne: if changes are made, do they propagate to the ms test suite?
   Arron: probably easier to send comments rather than test changes as we
          have multiple internal copies and some internal red tape to go
          through
   Arron: easier for me if people send feedback to mailing list rather
          than modifying tests
   Arron: want to have a same week turnaround for changes to the tests
          after review
   Daniel: so when will we be done?
   Arron: december is a good target if we have the reviews
   Daniel: need to decide when we are done (for CR)
   Daniel: The test suite is done when we have fewer tests coming in and the
           review comments are no longer finding lots of errors in the tests
   Daniel: deadline for submitting new tests would help, i can't decide that alone
   Arron: we have submitted all the ones that we think are needed
   Daniel: so it comes down to a problem of commitment. it's less interesting
   Daniel: we have a bad image in the consortium because of this
   Anne: color and namespaces, tests have come fast. also for media queries
   Daniel: yes, we have submissions but all over the place. no implementation
           report, no instructions, no review. having some tests is only
           the first step
   dbaron: for selectors we are lacking imp reports
   fantasai: selectors tests build script is broken
   Daniel: original one was a directory, it's become more complex now
   Molly: sounds like we need some project management
   fantasai: familiar with the technical basis but have no project
             management skills
   (laughter)
   howcome: you are the right person
   jdaggett: moz japan has funding for an intern to do testing
   Daniel: want a one-click setup of a framework for new tests, so it
           all builds and you just drop in tests
   Daniel: and reusable instructions
   fantasai: for most test suites, they can reuse the same instructions.
             it's only if they need something special
   Daniel: we could have 30 test suites if you look at all the modules
   fantasai: it should be just a few lines of config script
   dbaron: let's not add more requirements
   Daniel: trying to avoid revisiting this for each test suite
   Anne: can avoid having html, xhtml, xml versions
   fantasai: build scripts do more than that
   fantasai: makes TOC for example, by chapter and section
   Daniel: so, all tests submitted by september/october 2009? for css 2.1
   Daniel: tests by 15 Sept, reviews by 1 Dec. then we can do a release
           at new year
   Daniel: ok so let's really do it this time
   Daniel: we should be able to get to PR by end of year
   Anne: we do need the implementations
   ChrisL: need the reports to see where the implementation gaps are
   ChrisL: a coverage report is needed, to show spec coverage
   Anne: for navigation, just a directory listing is fine
   Anne: not sure we should update the build scripts
   dbaron: would like to discuss test format and do a demo
   Daniel: selectors test suite, all we need is imp reports
   fantasai: also needs some tests to be dropped and a couple others added
   dbaron: have sent imp reports four times already
   ChrisL: can just remove non-relevant lines from existing reports
   fantasai: so the build is broken and only hixie can fix
   dbaron: copied color build scripts from selectors?
   fantasai: no, from css2.1 in fact
   dbaron: how crucial are the newly added tests
   fantasai: format is really weird. it's some xml thing that hixie made up
   dbaron: but don't redesign the entire thing to add one more test
   ChrisL: better to make an xslt that converts it to a more directly
           useful format
   Anne: just link to the additional tests
   jdaggett: sucks
   fantasai: lachlan added tests, could not build it either
   Hixie helps fantasai debug the build scripts while the conversation continues.
   Daniel: so mozilla has almost complete coverage for selectors. what
           about opera and microsoft?
   dbaron: can we see what tests changed, rather than re-run the entire
           thing again?
   howcome: much easier
   howcome: depends on how many tests need to be re-run
   dbaron: selectors takes about an hour to re-run
   dbaron: last time i sent a report, no-one else did so .....
   Daniel: we have a two year charter and need to get it done
   howcome: why was the test suite changed
   Daniel: because we removed a section, and found some tests were missing
   dbaron: we did our reports and then some more tests arrived. we need to
           decide to freeze it and stick to that decision
   Anne: we changed selection
   howcome: who accepted the additional tests
   Anne: lachlan submitted them
   mh: do we have an acceptance policy
   Daniel: yes but we didn't follow it
   ACTION: chris run the six new selectors tests on opera
   ChrisL: color module seems to have mostly complete coverage now
   dbaron: sent in some implementation report already

Reftests
--------

ScribeNick: fantasai
   dbaron: One issue with our test suites is that they have to be run
           manually, and that's a laborious process
   dbaron: Insofar as we can run tests automatically, we should do that
   dbaron: It could reduce the amount of time needed to run the tests
   dbaron: This can make it much quicker to create an implementation report
   dbaron: and to keep implementations from regressing
   <howcome> howcome has joined #css
   dbaron: The idea of reftests is that the test consists of two HTML
           files plus an assertion that those two files either look the
           same or don't look the same
   dbaron: It's something that can be automated
   dbaron: but is also something you can run manually
   dbaron: Here are two tests I can run manually.
   dbaron: I can open them in two tabs and flip between them, to verify
           that they're the same
   dbaron: You can also run them automatically
   dbaron: Our implementation for running this automatically uses canvas
           and JavaScript
   dbaron: other implementations can use other automation frameworks
   dbaron: and you can also run the tests manually
   dbaron: There's a whole lot of stuff you can automate this way
           although not everything
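
The pass/fail rule dbaron describes can be sketched as follows, assuming each rendering has already been captured as a grid of (r, g, b) tuples. How the capture happens is vendor-specific and out of scope here; the function names are hypothetical and this is not Mozilla's canvas-based harness.

```python
def renderings_match(test_pixels, ref_pixels):
    """True if two renderings (rows of (r, g, b) tuples) are identical.

    Renderings of different dimensions compare as not identical.
    """
    return test_pixels == ref_pixels


def reftest_passes(test_pixels, ref_pixels, expect_equal=True):
    """A reftest passes when the comparison agrees with the stated assertion.

    expect_equal=True models a "look the same" assertion;
    expect_equal=False models a "don't look the same" assertion.
    """
    return renderings_match(test_pixels, ref_pixels) == expect_equal
```

Because the comparison is rendering against rendering on the same platform, no stored reference image is involved.
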
   dbaron: The counters tests I submitted for CSS2.1, for example, were
           originally reftests
   dbaron: I would like to be able to submit tests in this format
   discussion of some tests
   glazou: box-shadow with blur radius, combine with image?
   fantasai: can't do that because gradient for blur radius is explicitly
             undefined
   dbaron: The not-equals tests can be useful to check that assumptions
           are correct
   http://mxr.mozilla.org/mozilla/source/layout/reftests/
   Anne: Opera uses an image-based regression test framework
   Anne: we compare to screenshots
   jdaggett: For a lot of these tests you need to notice a 1px difference
   jdaggett: Some of the Microsoft tests, it's really hard to tell
   dbaron shows off his box model acid test
   http://mxr.mozilla.org/mozilla/source/layout/reftests/box-properties/CSS21-t100303.xhtml
   http://mxr.mozilla.org/mozilla/source/layout/reftests/box-properties/CSS21-t100303-ref.xhtml
   dbaron: It's much better to write a test that combines a lot of
           assertions than to require the tester to run 600 individual
           tests that are all almost the same
   jdaggett: The automation is always going to be vendor-specific
   dbaron: I'm not saying we should require tests in this format, but
           that we should allow tests to be submitted in this format
   Bert: comparing two PDFs will be a bit more difficult than comparing
         two screenshots
   Bert: This is fine for a browser
   Bert: But how am I going to test Prince, or Opera Mini, or HP's printer?
   Bert: Hold two pieces of paper against the light?
   <ChrisL> yes, compare them by hand. the tests can be run manually,
            but are also automatable in some cases
   Discussion of metadata
   Reftests should have all relevant metadata: author, help, assertion, flags, etc
   Chris: SVG compares SVG and a PNG image, we do side-by-side comparisons
          and have a harness for it
   jdaggett: The key point of this format is that it's not rendering vs
             image, but rendering vs rendering
   jdaggett: With images you can't guarantee the same image across multiple
             platforms (anti-aliasing, etc)
   Steve: There are a number of cases where we've consciously left things
          undefined, so you cannot assert pixel equivalence
   <ChrisL> yes, using raster images relies on specific font rendering so
            it makes it harder to automate. markup equivalents are better
            as the platform difference is evened out

Scribe: ChrisL
   Bert: why are we allowing another format that is different to what we
         decided some years ago?
   Daniel: because it allows automated testing and means we get more results
   Bert: think this is worse
   dbaron: it's much faster to run through tests in this format
   Alex: we have very little project management here, that's why we are lagging
   Anne: this format is a lot better for implementors
   Bert: some implementors
   Anne: no, most
   Daniel: it's good because it gets us closer to our exit criteria
   jdaggett: it's a practical matter, allows browsers to fix bugs and check
             for regression. excellent to catch regressions
   jdaggett: helps maintain a higher level of interoperability
   Sylvain: so we run these for imp reports?
   Daniel: yes
   Sylvain: does it get us to the deadline faster
   dbaron: yes. and new tests always need to be run
   Daniel: david is asking if this format is acceptable
   ChrisL: this is for css3 as well
   RESOLVED: we accept tests in ref-format as well, as long as they have the
             existing metadata style
   Arron: prefer to have them link to each other in the metadata
   dbaron: manifest file helps in the case of many-to-one links from refs to tests
   dbaron: and they can be concatenated easily
   (break)
Received on Wednesday, 17 June 2009 07:48:14 UTC