Discussion of test case submission from KDE

Hello Andrew, Carlos, everyone,

I am working towards a submission of about 4000 test cases. This mail is not
the submission itself, but discusses various issues surrounding it.

First of all, some background on the test cases.

I have written these tests from scratch (I am the sole copyright holder)
during the development of an open source (GNU LGPL) XQuery/XSL-T
implementation for KDE. Development has been (and is) test driven, and I think
this has positively affected the creation of this test suite, especially when
contrasted with how the XQTS has been created. Sometimes the approach has been
similar to how I suspect the XQTS team has worked: a systematic reading of the
specifications and creation of tests corresponding to the assertions. However,
many of the tests have been created as responses to bugs, either discovered by
chance or whose existence I deduced from how KDE's implementation is written.
That is, the tests focus on what an implementation has trouble with.

I find it difficult to compare KDE's test suite to the XQTS (because I don't
know the exact coverage of the XQTS), but I know that I've found 10-15 bugs
in Saxon even though it has more or less passed the XQTS (although I have not
systematically run the tests with Saxon), and that I've occasionally seen
areas where I have tests while the XQTS doesn't. Examples are testing that
certain operators don't exist, or that invalid lexical representations of
date/time types are trapped. The majority of the tests are XPath-only, with
very few node tests, name tests, and path expressions, and a large focus on
casting and type checking.

My tests are not in the XQTS format, but an XSL-T 1.0 stylesheet using EXSL-T
extensions can convert them into it in a fully automated way. An "export" to
the XQTS format is not publicly available right now (although it's very easy
to produce), but all of the related files are here:

svn://anonsvn.kde.org/home/kde/trunk/tests/kxpathtests/tests
or via web interface at:
http://websvn.kde.org/trunk/tests/kxpathtests/tests/

On a Linux system with xsltproc and make installed (or on Windows with those
tools available), running `make xqts` should create a fully functional XQTS
suite. `make xqts-package` zips it up.

The reason I do not yet work on or publish it in the XQTS format is that as
soon as one does that, one potentially must resort to manual editing (of 4000
tests!) instead of adjusting the conversion process. Therefore, this mail
discusses the issues below, so that we all save work and achieve the best
possible result.

On to the details. It might be appropriate to discuss some points in the
Bugzilla database. If so, say so, and I'll open reports and we can continue
there.

* The testing of KDE's implementation is essential for it. If I lose a
test, it is potentially a regression. How do you decide whether a test is
accepted or not? For example, is it possible that a test is not accepted even
though it is valid and does not duplicate a test in XQTS? I (almost) must
have a very clear view of how this works, otherwise contributing to the
XQTS can get very costly for KDE's implementation. It can simply be a
question of close cooperation. If you decide to discard some of my tests, it
may simply be a matter of informing me which tests were discarded.

* This is how the conversion works: no test case uses input files, and the
query is merged with the catalog. Example:

<test-case type="ebv"
	   description="Test function fn:true().">true()</test-case>

Multiple tests are then put in files, where each file corresponds to an XQTS
test-group. An XQTS catalog file contains XInclude statements inside
test groups (XQTSCatalogSubmission.in.xml), and an identity transform produces
query files, expected outputs, and the "final" catalog
file (XQTSCatalogSubmission.xml).

Not all of the tests are converted. Tests concerning functions (about 1000)
have not been split into groups and are therefore not included from the
catalog. In other words, about 3000 tests are ready to go, modulo these
issues.
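
The custom format and the splitting step can be illustrated with a minimal
Python sketch. It is not the XSL-T stylesheet from the repository. The `tests`
wrapper element, its `group` attribute, and the generated test names are
assumptions made for the illustration; only the `test-case` element with its
`type` and `description` attributes is taken from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical input file: the <tests> wrapper and its "group" attribute are
# assumptions; only <test-case type=... description=...> comes from the mail.
SAMPLE = """<tests group="TrueFunc">
  <test-case type="ebv"
             description="Test function fn:true().">true()</test-case>
  <test-case type="ebv">not(false())</test-case>
</tests>"""


def split_tests(xml_text):
    """Return one record per test-case: name, type, description and query."""
    root = ET.fromstring(xml_text)
    group = root.get("group")
    records = []
    for number, case in enumerate(root.iter("test-case"), start=1):
        records.append({
            # Hypothetical naming scheme: group name plus a running number.
            "name": f"{group}-{number}",
            "type": case.get("type"),
            # None when the test has no description attribute.
            "description": case.get("description"),
            # The element content is the query itself.
            "query": (case.text or "").strip(),
        })
    return records


if __name__ == "__main__":
    for record in split_tests(SAMPLE):
        print(record["name"], "->", record["query"])
```

In a real conversion, each record would then be written out as a query file,
an expected-output file, and a test-case element in the catalog.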

* The produced test suite is based on XQTSCatalog.xsd version 0.8.6, including 
the changes in bug #3090.

* New test-groups were added: OptionDeclarationProlog, CopyNamespacesProlog,
ValCompTypeChecking, AnyURI, StringComp. These have been marked in the
catalog with an XML comment saying "NOTE: This is a new group ..."

* Currently there are tests for how a basic implementation should treat
schema imports and the like (not many, 10 or so). How should those be
organized? Currently they are in the group "Optional Features", but
schema-aware implementations will fail them.

* test-case/@is-XPath20 is currently set incorrectly, and I have no automated
way of setting it. I've gotten the impression that you have scripts that do
this automatically? If so, you can simply fix that when I "officially" do
the submission.

* I think the submission guidelines are followed.

* http://www.w3.org/XML/Query/test-suite/Guidelines for Test Submission.html 
reads: "Variable names, function names, etc., should not contain any 
copyrighted information or any company name or any other text identifying a 
company." However, XQTS' catalog contains test-group/@featureOwner listing
organizations such as NIST, Oracle, Microsoft, and so forth. I find these two
points contradictory. As I see it, when KDE's tests are merged into XQTS,
they will be listed under a certain company as the feature owner. I could use
a clarification in this area. Perhaps test-group/@featureOwner should be
removed?

* Some of the tests are generated from the table in "XQuery 1.0 and XPath 2.0
Functions and Operators, 17.1 Casting from primitive types to primitive
types", with fromCastingTable.xsl into casting-generated.xml (see the comments
in the XSL-T file). If some of those tests duplicate XQTS tests or are
inappropriate in some other way, the generation should be adjusted.

* The majority of the tests do not have descriptions. Those that don't get a
generic one generated: "A test whose essence is: `1 to 1 eq 1`", for example.
The generation knows what test-group a test belongs to, so it would be
possible to generate a description which is based on the test-group. In
general, I find the descriptions in XQTS very generic, often the same for all
tests in one group, so I see the generated descriptions as being on the same
level of quality. If one wants to add descriptions manually, one should
consider editing the custom format directly, since it's very simple and one
has the query right next to the 'description' attribute.
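
The fallback could look roughly like the following Python sketch. The
function name `describe` and the wording of the group-based suffix are
hypothetical; only the "A test whose essence is" wording comes from the
current generation.

```python
def describe(query, description=None, group=None):
    """Return the hand-written description, or a generated generic one."""
    if description:
        return description
    generic = f"A test whose essence is: `{query}`"
    if group:
        # Hypothetical variant: mention the test-group the test belongs to.
        return f"{generic} (test-group: {group})"
    return generic


if __name__ == "__main__":
    print(describe("1 to 1 eq 1"))
    print(describe("true()", description="Test function fn:true()."))
```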

* None of the tests have spec-citations; a dummy is added in order to conform
to XQTSCatalog.xsd. Again, one could add spec-citations based on the
test-group, for the sake of it. Is there another approach? Are such broad
spec-citations better than none?

* I have run the tests in the XQTS format against my implementation, and
all tests pass (error codes are tested too). I think the test driver is
reporting correct results, since I have regression tested it against an "XQTS
driver test suite".

However, statistically, I think the suite nevertheless contains errors. I may
have overlooked cases where different outcomes are valid, some tests may
simply be wrong (in which case my implementation is wrong as well, since it
passes them), and I may have forgotten to update some tests after
specification changes (although I think I've been relatively thorough on
that). I would personally not include the tests in XQTS before having run
them with at least one other implementation (Saxon perhaps?).

* The "XQTS driver test suite" is available in the XQTS format here:
svn://anonsvn.kde.org/home/kde/trunk/kdenonbeta/kdom/xpath/kxqts/diagnotics-ts
http://websvn.kde.org/trunk/kdenonbeta/kdom/xpath/kxqts/diagnotics-ts

It tests that the driver really marks cases as failed when it should, and so
forth. I'll gladly share it in any way, if it is of interest: clarify the
license, submit it, accept improvements and comments, etc.

* The tests can be said to be aligned with the November drafts (the candidate
releases), and in some cases aligned with the resolution of reports since
then (for example, string/anyURI promotion).

* There is a risk that some tests duplicate tests in XQTS. It is a very
large job (and error prone in several senses) to check this manually. Perhaps
one could write a tool which opens all queries, removes the initial comment,
and then compares the tests to find duplicates. Creating such a tool
would hopefully be useful with other submissions as well.
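
A rough Python sketch of such a tool follows. It assumes the queries have
already been read into a dictionary keyed by test name, that the initial
comment is an XQuery comment of the form (: ... :), possibly nested, and that
collapsing whitespace is an acceptable normalization. The last point is crude:
two queries that differ only in whitespace inside a string literal would be
reported as duplicates.

```python
from collections import defaultdict


def strip_leading_comments(query):
    """Remove XQuery comments, (: ... :), that precede the query body."""
    text = query.lstrip()
    while text.startswith("(:"):
        depth = 0
        i = 0
        while i < len(text):
            pair = text[i:i + 2]
            if pair == "(:":
                depth += 1      # comments may nest
                i += 2
            elif pair == ":)":
                depth -= 1
                i += 2
                if depth == 0:
                    break       # end of the outermost comment
            else:
                i += 1
        if depth != 0:
            break               # unterminated comment: leave the text alone
        text = text[i:].lstrip()
    return text


def find_duplicates(queries):
    """Group test names whose queries are equal after normalization."""
    groups = defaultdict(list)
    for name, text in queries.items():
        key = " ".join(strip_leading_comments(text).split())
        groups[key].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    sample = {
        "a": "(: Name: a :)\ntrue()",
        "b": "(: Name: b (: nested :) :)  true()",
        "c": "false()",
    }
    print(find_duplicates(sample))
```

Filling the dictionary from the query files of both suites is left out; any
exact-match approach like this only finds textually identical queries, not
tests that are equivalent in meaning.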

* If it is of interest, it's possible to work on this in KDE's SVN repository.
Getting an SVN account is a non-issue.

That's it. Comments and suggestions will be interesting to read.


Best Regards,

		Frans

Received on Monday, 17 April 2006 15:10:31 UTC