structure of a test suite (was Re: ixml and non-xml output - is there an error and if so where?)

On 22 Dec 2021, at 7:43 AM, Steven Pemberton <steven.pemberton@cwi.nl> wrote:

> I think this should be our philosophical position:

> ixml provides a way for the author to convert non-XML documents into
> XML.  It is up to the author to write ixml so that it produces
> correct XML, and therefore to ensure that:

>     * serialised names are correct XML names,
>     * attribute and element content do not contain illegal
>       characters,
>     * any element does not have more than one attribute of a given name
>       and not worry more about these issues within the ixml definition.

> As to classifications, in my recent mail on this I proposed a 5-way
> split

>     1. ixml grammar syntax errors
>     2. ixml grammar semantic errors
>     3. ixml grammar correct, input and grammar don't match
>     4. ixml grammar correct, input is ambiguous
>     5. ixml grammar correct, test completes correctly

> What you are proposing I think is a sixth:

>     6. ixml grammar correct, test completes correctly, resulting XML is
>        in error as a result of authoring errors.

I am a little confused; I thought the five-way split you proposed was
a way of organizing a test suite, not a way of classifying the outcome
of a call to an ixml processor.  I don’t see that the two need to have
anything to do with each other.

My answer has grown into two different messages:  one about test
suites and one about errors.  This is the one about test suites.

In organizing a test suite, many people find it helpful to identify
subsets of the tests, either by their focus or by the mechanism used
to produce the tests.  Those running a test suite may also find
subsets useful, especially when running the entire test suite may take
a long time and they want a simpler and faster smoke test.
Unfortunately, these two ways of grouping tests in a test suite are
not likely to produce the same arrangement; I don’t know any way
around that.

In the test suites I'm most familiar with (for XML Schema and XSLT and
XQuery), the top-level organization was often by contributor, and the
internal organization of the tests from any given contributor was
their concern.  Test cases that share infrastructure (same schema,
same data, ...) are often grouped together in a test set to allow the
common material to be specified just once.

I've been assuming the same is likely to apply to tests for ixml, and
so I have not been sure what to make of your proposal for a way of
organizing a test suite.  I have no objection to your organizing your
tests that way, but I had not expected to organize mine that way.

Your five-way division does correspond, more or less cleanly, to the
various ways the test catalog specified as part of the ixml-tests
project can specify the expected result of a test:

   1. ixml grammar syntax errors: <assert-not-a-grammar/> or
      (depending on the way the test catalog is written)
      <assert-not-a-sentence/> (see discussion below).
   
   3. ixml grammar correct, input and grammar don't match:
      <assert-not-a-sentence/>.
      
   4. ixml grammar correct, input is ambiguous: Multiple
      <assert-xml>...</> or <assert-xml-ref ... /> elements.
      
   5. ixml grammar correct, test completes correctly: A single
      <assert-xml>...</> or <assert-xml-ref ... /> element.

I am not sure what your class "2. ixml grammar semantic errors" means.
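
To make the correspondence concrete, here is a rough sketch of how
classes 3, 4, and 5 might sit together in one test set.  Everything
specific in it is hypothetical and made up for illustration: the file
names, the test names, and the expected XML.  It assumes a toy grammar
toy.ixml along the lines of

  S: A; B.  A: "a"; "x".  B: "a"; "y".

so that the input "x" has one parse, "a" has two, and "z" has none.
Metadata such as <created/> is left out, as is any marking of
ambiguity that a processor may add to its output.

```xml
<!-- Sketch only: file names, test names, and results are hypothetical. -->
<test-set name="toy-outcomes">
  <ixml-grammar-ref href="toy.ixml"/>

  <!-- Class 5: grammar correct, one parse, so one expected result. -->
  <test-case name="one-parse">
    <test-string-ref href="x.txt"/>
    <result>
      <assert-xml><S><A>x</A></S></assert-xml>
    </result>
  </test-case>

  <!-- Class 4: input is ambiguous, so several acceptable results. -->
  <test-case name="two-parses">
    <test-string-ref href="a.txt"/>
    <result>
      <assert-xml><S><A>a</A></S></assert-xml>
      <assert-xml><S><B>a</B></S></assert-xml>
    </result>
  </test-case>

  <!-- Class 3: grammar correct, input does not match it. -->
  <test-case name="no-parse">
    <test-string-ref href="z.txt"/>
    <result><assert-not-a-sentence/></result>
  </test-case>
</test-set>
```

All three outcomes share one grammar here, which is why they fit in a
single test set even though they fall in three different classes.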

Note that we can think of grammars in two ways: they can be used to
parse other input, and they are themselves input that can be parsed
using the grammar for grammars.  In the test-catalog schema as
currently defined we can formulate a test of a grammar in either of
these two ways.

We can create a test set that specifies the grammar and contains only
a grammar-test, with no test-cases applying the grammar to instances.
(Such test cases could not be run in any case, since the grammar has
errors.)

  <test-set name="class-range.ixml">
    <created by="SP" on="2021-12-16"/>
    <ixml-grammar-ref href="class-range.ixml"/>
    <grammar-test>
      <result>
        <assert-not-a-grammar/>
      </result>
    </grammar-test>
  </test-set>

A grammar test uses the grammar for ixml assumed to be built into an
ixml processor.  (In some cases, of course, the built-in knowledge is
limited to knowing where to find a copy of the ixml grammar when one
is needed.)

The formulation above illustrates one feature of the ixml-tests
catalog structure: when multiple test cases share a grammar, they can
be grouped into a test set and a grammar specified once for the entire
set of tests.

Or we can create a test case in which the ixml grammar is used to
parse the input.  Here there is nothing special about the fact that
the grammar used is the one defined by the ixml spec itself: it's just
a grammar being used to parse input.

  <test-set name="ixml syntax errors - ixml">
    <created by="SP" on="2021-12-16"/>
    <ixml-grammar-ref href="../../../ixml.ixml"/>
    
    <test-case name="class-range">
      <test-string-ref href="class-range.ixml"/>
      <result><assert-not-a-sentence/></result>
    </test-case>
    ...
  </test-set>

Here only a single test set is needed, because all the tests in the
syntaxtests directory share the same base grammar (namely, ixml.ixml).

TL;DR I think your classification is not bad as an enumeration of
possible outcomes of running a test case, which a test catalog needs
to distinguish and all of which need to be exercised in a test suite
with reasonable coverage.  But it doesn't make much sense to me as a
way to structure a test suite; maybe I just don't understand.

Michael

Received on Wednesday, 22 December 2021 18:29:47 UTC