RE: The dubious XML schema test collection

The test collection certainly has its faults, but I found it an invaluable
resource when developing the schema processor in Saxon-SA. Yes, it would be
nice if it were better, but I don't believe one should look a gift horse in
the mouth. I certainly don't believe that it does more harm than good.

I didn't use the test definition files or the reference results myself: I
just tried to process all the schemas and validate the relevant documents
against them, comparing the results of my processor with those of other
processors. This is sometimes a bit hit-and-miss (you can reject an invalid
schema for the wrong reason) but it exercises the processor reasonably
thoroughly and experience in the field suggests that (with the help of some
carefully planned supplementary testing) it got most of the bugs out.
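
As a rough illustration of that comparison approach, here is a minimal
sketch in Python. It is not the actual harness: the processor names, file
names and outcomes are invented, and in practice the outcomes would come
from running each processor over the collection.

```python
from collections import Counter

# Hypothetical outcomes: for each processor, a map from schema file to
# "valid" or "invalid". All names and verdicts here are made up.
results = {
    "processor_a": {"s1.xsd": "valid", "s2.xsd": "invalid", "s3.xsd": "invalid"},
    "processor_b": {"s1.xsd": "valid", "s2.xsd": "invalid", "s3.xsd": "valid"},
    "processor_c": {"s1.xsd": "valid", "s2.xsd": "invalid", "s3.xsd": "valid"},
}


def find_disagreements(results):
    """Return {test: {processor: outcome}} for tests where processors differ."""
    all_tests = set()
    for outcomes in results.values():
        all_tests.update(outcomes)
    disagreements = {}
    for test in sorted(all_tests):
        # A processor with no recorded outcome for a test is marked "missing".
        verdicts = {
            name: outcomes.get(test, "missing")
            for name, outcomes in results.items()
        }
        if len(set(verdicts.values())) > 1:
            disagreements[test] = verdicts
    return disagreements


def minority_processors(verdicts):
    """Return the processors whose verdict differs from the most common one."""
    majority, _ = Counter(verdicts.values()).most_common(1)[0]
    return sorted(name for name, v in verdicts.items() if v != majority)


if __name__ == "__main__":
    for test, verdicts in find_disagreements(results).items():
        print(test, verdicts, "odd one out:", minority_processors(verdicts))
```

Note that agreement on "invalid" says nothing about whether the processors
rejected the schema for the same reason, which is the hit-and-miss part.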

Despite the very large numbers of tests, there are some areas where coverage
is not good. For example, I think there are only two tests that redefine a
schema with a change of namespace.

I'm not sure what you mean when you say some of the tests are "incorrect".
Of course, there are many invalid schemas there, as there should be. Also,
some of them are invalid because they use obsolete syntax that was changed
before the final Rec came out.

The three groups of tests complement each other quite well. The Sun suite is
a small set of tests that's quick to run, but manages quite a high coverage
of the spec. The NIST suite goes into exhaustive detail on testing the
validation of simple types. The Microsoft tests are very large in number and
manage a pretty broad coverage, though many of them are testing trivial
error conditions like dangling references, and the coverage of the deeper
semantic issues (like UPA) is much weaker.

I would love to share some of the improvements I have made to the test suite
but I simply don't have the time. I'm sure the same goes for other people
including the original contributors.

Michael Kay  

> -----Original Message-----
> From: xmlschema-dev-request@w3.org 
> [mailto:xmlschema-dev-request@w3.org] On Behalf Of Kasimier Buchcik
> Sent: 02 September 2004 12:02
> To: xmlschema-dev@w3.org
> Subject: The dubious XML schema test collection
> 
> 
> Hi,
> 
> The dubious XML schema test collection: does it do more harm than good?
> 
> After examining and using the XML schema test collection, which can be
> downloaded from [1], some questions arose:
> 
> 1. There is only a super tiny subpart of the MS tests defined in the
>    test definition file "testsMS_LTGfmt.xml" - this file is about 33 KB.
>    A test definition file generated from the online HTML pages
>    (e.g. [2]) has a size of about 1.8 MB and contains 4689 test
>    definitions; so quite a difference.
>    Does this mean that the 33 KB subset is the reliable part of it?
>    ;-) I hope not.
> 
> 2. Some of the MS tests are obviously not correct; will the incorrect
>     tests be erased, repaired, or at least marked as invalid soon?
>     What about NIST and SUN, are those suites correct? Does a summary
>     about incorrect tests exist?
> 
> 3. Questions sent to this list about the state of the test collection
>     and its incorrect tests have not been answered in the past
>     (e.g. [3 + 4]).
> 
> 4. A test definition file for the SUN tests is missing.
> 
> 
> I think there is enough confusion about the XML Schema spec itself
> out there, so it would be nice if there were at least a reliable and
> complete test collection.
> 
> 
> [1] http://www.w3.org/2001/05/xmlschema-test-collection.html
> 
> [2]
> http://www.w3.org/XML/2001/05/xmlschema-test-collection/result-ms-simpleType.htm
> 
> [3] 
> http://lists.w3.org/Archives/Public/xmlschema-dev/2003Nov/0091.html
> [4] 
> http://lists.w3.org/Archives/Public/xmlschema-dev/2002Nov/0098.html
> 
> Greetings,
> 
> Kasimier
> 
> 
> 

Received on Thursday, 2 September 2004 11:51:23 UTC