So how many tests are there?

Hello all,

At the meeting today, the question was asked: “how many tests are there?”

I said I’d write up an answer. It’s a tiny bit subtle because the test
suite works really hard to cover all the possible cases.

In brief, a catalog is a collection of test sets and every test set
contains either:

* a <grammar-test> and zero or more <test-case> elements, or
* one or more <test-case> elements.

Each of those counts as a test. Looking at the master branch at 17:49 BST
(16:49 GMT, 18:49 CEST) today, there are 230 tests.
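
A minimal sketch of that count in Python, for anyone who wants to
reproduce it. It assumes the catalog has been loaded as a single XML
document and matches elements by local name only; the namespace URI and
the test names in the sample are placeholders, not the real catalog's.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]

def count_tests(root):
    """Count every grammar-test and test-case element under root."""
    return sum(
        1
        for el in root.iter()
        if local_name(el.tag) in ("grammar-test", "test-case")
    )

# Placeholder catalog: the namespace URI and names are made up.
SAMPLE = """\
<tc:test-catalog xmlns:tc="urn:example:test-catalog">
  <tc:test-set name="a">
    <tc:grammar-test/>
    <tc:test-case name="a1"/>
    <tc:test-case name="a2"/>
  </tc:test-set>
  <tc:test-set name="b">
    <tc:test-case name="b1"/>
  </tc:test-set>
</tc:test-catalog>
"""

print(count_tests(ET.fromstring(SAMPLE)))  # 4
```

The first test set contributes three tests (one grammar-test plus two
test-cases) and the second contributes one.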

Note that there are sometimes test-case elements even when the
grammar-test asserts “not a grammar”. That allows the test suite to
probe how an implementation behaves if it fails to detect the error in
the grammar.

There’s one additional complication. Some tests have an app-info
annotation. These annotations allow the test-suite to distinguish
different possible results for different (implementation-defined)
options. Michael has used them in the hygiene tests. Consider:

  <tc:test-set name="multi-1">
    <tc:created by="cmsmcq" on="2022-03-12"/>
    <tc:description>
      <tc:p>A grammar with multiple rules for the same nonterminal.</tc:p>
    </tc:description>
    <tc:ixml-grammar>
      S = 'a'.
      S = 'b'.
    </tc:ixml-grammar>
    
    <tc:grammar-test>
      <tc:description>
        <tc:p>The ixml spec defines this as a non-conforming grammar.</tc:p>
      </tc:description>
      <tc:result>
        <tc:assert-not-a-grammar/>
      </tc:result>

So far so good. That is not-a-grammar because it has two S productions.
This test set goes on to say (in part):

      <tc:app-info>
        <tc:options ap:multiple-definitions="silence"/>
        <tc:assert-xml>
          <ixml
            ><rule name="S"
              ><alt><literal string="a"/></alt
            ></rule
            ><rule name="S"
              ><alt><literal string="b"/></alt
            ></rule
          ></ixml>
        </tc:assert-xml>

This says, if you’re running in a mode that silently ignores multiple
definitions of symbols, then the output should be as shown.

I recognize most of the Aparecium annotations. You could handle this by
specifying options when you run the test suite. So you’d get 230 results
for each run, but the results would be different depending on how you
configured the processor.

That required more fiddling with the runtime than I was prepared to sort
out, so I simply run the test once for each app-info that I recognize.
That’s why I report:

  261 reports for 230 cases. 261 pass, 0 fail, 0 skip, 0 inapplicable.

I pass all 230 cases, and I pass 31 additional flavors of hygiene test
(I think all the app-info elements are currently in the hygiene test
catalog).
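
The arithmetic (230 + 31 = 261) can be sketched like this: one report
per test, plus one more for each app-info whose options the runner
recognizes. This is only an illustration, not the actual test driver.
The RECOGNIZED set, the "some-other-option" attribute, the namespace
URIs, and the assumption that app-info sits somewhere inside the test
element are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of option names this runner understands.
RECOGNIZED = {"multiple-definitions"}

def local_name(name):
    """Strip the '{namespace}' prefix from a tag or attribute name."""
    return name.rsplit("}", 1)[-1]

def count_reports(root):
    """Return (tests, reports) for the catalog under root."""
    tests = reports = 0
    for el in root.iter():
        if local_name(el.tag) not in ("grammar-test", "test-case"):
            continue
        tests += 1
        reports += 1  # the base run of the test
        for info in el.iter():
            if local_name(info.tag) != "app-info":
                continue
            options = [o for o in info if local_name(o.tag) == "options"]
            # One extra run only if every option attribute is recognized.
            if options and all(
                local_name(a) in RECOGNIZED for o in options for a in o.attrib
            ):
                reports += 1
    return tests, reports

# Placeholder catalog: namespace URIs and the second option are made up.
SAMPLE = """\
<tc:test-set xmlns:tc="urn:example:test-catalog"
             xmlns:ap="urn:example:options" name="multi-1">
  <tc:grammar-test>
    <tc:result><tc:assert-not-a-grammar/></tc:result>
    <tc:app-info>
      <tc:options ap:multiple-definitions="silence"/>
    </tc:app-info>
    <tc:app-info>
      <tc:options ap:some-other-option="x"/>
    </tc:app-info>
  </tc:grammar-test>
  <tc:test-case name="t1"/>
</tc:test-set>
"""

tests, reports = count_reports(ET.fromstring(SAMPLE))
print(f"{reports} reports for {tests} cases.")  # 3 reports for 2 cases.
```

The second app-info is skipped because its option is not in RECOGNIZED,
so the sample yields two base runs plus one extra flavor.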

Hope that’s helpful. And correct :-)

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Tuesday, 31 May 2022 16:58:45 UTC