RE: Test Case Template/Meta Data

From: Kris Krueger <krisk@microsoft.com>
Date: Fri, 20 Nov 2009 05:12:04 +0000
To: 'James Graham' <jgraham@opera.com>
CC: "'public-html-testsuite@w3.org'" <public-html-testsuite@w3.org>
Message-ID: <3607818D324EC04E8767AA21E274C023202382EA@TK5EX14MBXW652.wingroup.windeploy.ntdev.microsoft.com>
I agree that having metadata in each test case would be too much to handle if the
suite ends up with thousands of test cases.

Though I do think we need to track which parts of the spec have test cases and which
parts do not. Eventually we want to have tests for the whole specification, so that
the spec can get to REC. Right?

For example, it was mentioned that the URI below is a set of HTML5 parser tests.

  http://gsnedders.html5.org/html5lib-tests/runner.html

Now, if one didn't know that this was a parser test, how long would it take to
understand it? A few questions:

 * What parts of HTML5 parsing does it specifically test? 9.2.2.4?
 * What parts of HTML5 parsing does it not test? 9.2.2.4?
 * What does failure look like?
 * What should we focus on adding next for parsing tests: section 9.2.2.4? 9.2.5.26?
 * When a section in the spec changes, which tests need to be updated? Which do not?
 * If a UA fails a test, how can its implementers be told why it fails? How can
   they suggest an update to the spec so that it is clearer, or even argue that
   the test is incorrect?

My fear is that without placing this metadata in the test suite we'll end up with
just a bunch of ad-hoc HTML5 tests that don't help the spec get to REC or lead
to interoperable implementations.

Hopefully we can agree that it would be good to add this metadata, in some form,
to the test suite.
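As a concrete illustration of the kind of out-of-band file discussed later in this thread, here is a minimal sketch of what a per-suite metadata manifest could look like, and how it could answer the coverage questions above. The XML layout, field names, test ids, and spec anchors are all invented for illustration; nothing here is an agreed W3C format.

```python
# Hypothetical out-of-band manifest; the fields (author, status, help, assert)
# follow the list discussed in this thread, not any agreed format.
import xml.etree.ElementTree as ET

MANIFEST = """
<testsuite>
  <test id="parser-0001">
    <author>example@example.org</author>
    <status>Submitted</status>
    <help>http://www.w3.org/TR/html5/syntax.html#tokenization</help>
    <assert>A bare ampersand is emitted as a character token</assert>
  </test>
  <test id="parser-0002">
    <status>Approved</status>
    <help>http://www.w3.org/TR/html5/syntax.html#tree-construction</help>
    <assert>Misnested b/i tags are handled by tree construction</assert>
  </test>
</testsuite>
"""

def coverage_by_section(xml_text):
    """Group test ids by the spec fragment their <help> link references."""
    sections = {}
    for test in ET.fromstring(xml_text).iter("test"):
        help_el = test.find("help")
        anchor = help_el.text.rsplit("#", 1)[-1] if help_el is not None else "(unlinked)"
        sections.setdefault(anchor, []).append(test.get("id"))
    return sections

print(coverage_by_section(MANIFEST))
```

Inverting this mapping against the spec's table of contents would show which sections have no tests at all, which is the tracking Kris asks for.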

-Kris 


--------------------------------------------------
9.2 Parsing HTML documents
9.2.1 Overview of the parsing model
9.2.2 The input stream
9.2.2.1 Determining the character encoding
9.2.2.2 Preprocessing the input stream
9.2.2.3 Changing the encoding while parsing
9.2.3 Parse state
9.2.3.1 The insertion mode
9.2.3.2 The stack of open elements
9.2.3.3 The list of active formatting elements
9.2.3.4 The element pointers
9.2.3.5 Other parsing state flags
9.2.4 Tokenization
9.2.4.1 Data state
9.2.4.2 Character reference data state
9.2.4.3 Tag open state
9.2.4.4 Close tag open state
9.2.4.5 Tag name state
9.2.4.6 Before attribute name state
9.2.4.7 Attribute name state
9.2.4.8 After attribute name state
9.2.4.9 Before attribute value state
9.2.4.10 Attribute value (double-quoted) state
9.2.4.11 Attribute value (single-quoted) state
9.2.4.12 Attribute value (unquoted) state
9.2.4.13 Character reference in attribute value state
9.2.4.14 After attribute value (quoted) state
9.2.4.15 Self-closing start tag state
9.2.4.16 Bogus comment state
9.2.4.17 Markup declaration open state
9.2.4.18 Comment start state
9.2.4.19 Comment start dash state
9.2.4.20 Comment state
9.2.4.21 Comment end dash state
9.2.4.22 Comment end state
9.2.4.23 Comment end bang state
9.2.4.24 Comment end space state
9.2.4.25 DOCTYPE state
9.2.4.26 Before DOCTYPE name state
9.2.4.27 DOCTYPE name state
9.2.4.28 After DOCTYPE name state
9.2.4.29 Before DOCTYPE public identifier state
9.2.4.30 DOCTYPE public identifier (double-quoted) state
9.2.4.31 DOCTYPE public identifier (single-quoted) state
9.2.4.32 After DOCTYPE public identifier state
9.2.4.33 Before DOCTYPE system identifier state
9.2.4.34 DOCTYPE system identifier (double-quoted) state
9.2.4.35 DOCTYPE system identifier (single-quoted) state
9.2.4.36 After DOCTYPE system identifier state
9.2.4.37 Bogus DOCTYPE state
9.2.4.38 CDATA section state
9.2.4.39 Tokenizing character references
9.2.5 Tree construction
9.2.5.1 Creating and inserting elements
9.2.5.2 Closing elements that have implied end tags
9.2.5.3 Foster parenting
9.2.5.4 The "initial" insertion mode
9.2.5.5 The "before html" insertion mode
9.2.5.6 The "before head" insertion mode
9.2.5.7 The "in head" insertion mode
9.2.5.8 The "in head noscript" insertion mode
9.2.5.9 The "after head" insertion mode
9.2.5.10 The "in body" insertion mode
9.2.5.11 The "in RAWTEXT/RCDATA" insertion mode
9.2.5.12 The "in table" insertion mode
9.2.5.13 The "in table text" insertion mode
9.2.5.14 The "in caption" insertion mode
9.2.5.15 The "in column group" insertion mode
9.2.5.16 The "in table body" insertion mode
9.2.5.17 The "in row" insertion mode
9.2.5.18 The "in cell" insertion mode
9.2.5.19 The "in select" insertion mode
9.2.5.20 The "in select in table" insertion mode
9.2.5.21 The "in foreign content" insertion mode
9.2.5.22 The "after body" insertion mode
9.2.5.23 The "in frameset" insertion mode
9.2.5.24 The "after frameset" insertion mode
9.2.5.25 The "after after body" insertion mode
9.2.5.26 The "after after frameset" insertion mode
9.2.6 The end
9.2.7 Coercing an HTML DOM into an infoset
9.2.8 An introduction to error handling and strange cases in the parser
9.2.8.1 Misnested tags: <b><i></b></i>
9.2.8.2 Misnested tags: <b><p></b></p>
9.2.8.3 Unexpected markup in tables
9.2.8.4 Scripts that modify the page as it is being parsed



-----Original Message-----
From: James Graham [mailto:jgraham@opera.com] 
Sent: Thursday, November 19, 2009 2:50 PM
To: Kris Krueger
Cc: 'public-html-testsuite@w3.org'
Subject: RE: Test Case Template/Meta Data

Quoting Kris Krueger <krisk@microsoft.com>:

> Would you agree that a test or set of tests should somehow contain
> this metadata?
>
> The metadata doesn't have to be embedded in every individual test.
> For example, it could be stored in another file (xml?) and still
> provide information about what the test is testing, its status, etc.

I tend to agree with Philip about the general approach we should take.
With regard to metadata, each additional piece of metadata that has to
accompany a test is a disincentive to write that test at all, because
it adds to the effort required to write the test and keep it up to
date. Therefore I think we should be looking to justify each piece of
required metadata as essential, or at least as being worth its cost
(e.g. by making the test suite more maintainable).

From the list you mentioned:
> Author       -> Who created the test

Can be handled by the VCS.

> Status/Phase -> Approved/Draft/Submitted

Can be stored out of band if we actually do this. (This approach
requires that tests have a unique associated identifier.)
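Stored out of band, status and reviewer need nothing more than a table keyed by a stable per-test identifier. A minimal sketch, assuming JSON for the side file; the identifiers, file paths, and field names below are invented for illustration:

```python
import json

# Hypothetical review-status side file, keyed by a stable per-test id;
# none of these ids or fields reflect an agreed format.
STATUS_JSON = """
{
  "html5lib/tokenizer/test1.test": {"status": "Approved", "reviewer": "reviewer@example.org"},
  "html5lib/tree-construction/tests1.dat": {"status": "Submitted", "reviewer": null}
}
"""

def unapproved(status_text):
    """List test ids whose status is not yet Approved."""
    records = json.loads(status_text)
    return sorted(tid for tid, rec in records.items() if rec["status"] != "Approved")

print(unapproved(STATUS_JSON))
```

Because the identifiers live outside the tests themselves, updating a test's status never requires touching the test file, which keeps the per-test authoring cost at zero.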

> Reviewer     -> Who reviewed the test case

Can be stored out of band if we do this.

> Help         -> URI back to the specification
> Assert       -> Describes what specifically the test case tests

These seem quite similar. Tests that test a particular conformance
criterion could link to the corresponding fragment id in the spec. For
areas like parsing, tests will inevitably touch on multiple
conformance criteria. With the html5lib tests we haven't really found
a good way to associate the tests with parts of the spec such that it
is obvious which tests need to change when a given part of the spec
changes. All the obvious ways to do this (e.g. listing all the
tokenizer and tree-construction phases that a given input should pass
through) end up being so much effort that no one would bother to write
any tests at all if this were required. Instead we have just dealt
with changes to the spec in an ad-hoc manner. This doesn't seem like a
bad strategy, since the spec should become more stable with time, not
less.
Received on Friday, 20 November 2009 05:12:57 GMT
