RE: "RE: The dubious XML schema test collection"'

From: Kevin Krouse <kkrouse@bea.com>
Date: Thu, 2 Sep 2004 10:04:39 -0700
Message-ID: <4B2B4C417991364996F035E1EE39E2E101E43CC2@uskiex01.amer.bea.com>
To: "Kasimier Buchcik" <kbuchcik@4commerce.de>, "Michael Kay" <mhk@mhk.me.uk>
Cc: <xmlschema-dev@w3.org>

+1

Having an up-to-date test suite would be wonderful.  I'd like to add to
Kasimier's proposals that the failing tests carry an expected error code
to check against false positives.
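
A minimal sketch of what such a check could look like, assuming each test
records an expected verdict plus an optional error code. All names here
(Expected, Actual, judge) and the result labels are hypothetical, and the
codes in the usage note are only illustrative; nothing below reflects the
format of the existing test definition files.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expected:
    """What the test definition says should happen (hypothetical format)."""
    valid: bool
    error_code: Optional[str] = None  # only meaningful when valid is False


@dataclass
class Actual:
    """What the processor under test reported."""
    valid: bool
    error_code: Optional[str] = None


def judge(expected: Expected, actual: Actual) -> str:
    """Return 'pass', 'fail', or 'wrong-reason' for one test case."""
    if expected.valid:
        # A valid schema or instance must simply be accepted.
        return "pass" if actual.valid else "fail"
    if actual.valid:
        # The processor accepted something that should be rejected.
        return "fail"
    if expected.error_code is None:
        # No code recorded: a rejection cannot be checked any further.
        return "pass"
    # Rejected as expected, but only a matching code proves it was
    # rejected for the intended reason.
    if actual.error_code == expected.error_code:
        return "pass"
    return "wrong-reason"
```

For example, judge(Expected(False, "cos-nonambig"), Actual(False, "src-resolve"))
yields "wrong-reason": the processor rejected the schema, but not for the
violation the test was written to exercise.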

Kevin Krouse
BEA Systems
http://xmlbeans.apache.org/
 

-----Original Message-----
From: xmlschema-dev-request@w3.org [mailto:xmlschema-dev-request@w3.org]
On Behalf Of Kasimier Buchcik
Sent: Thursday, September 02, 2004 5:50 AM
To: Michael Kay
Cc: xmlschema-dev@w3.org
Subject: 'Re: "RE: The dubious XML schema test collection"'


Hi,

on 9/2/2004 1:50 PM Michael Kay wrote:

> The test collection certainly has its faults, but I found it an invaluable
> resource when developing the schema processor in Saxon-SA. Yes, it would be
> nice if it were better, but I don't believe one should look a gift horse in
> the mouth. I certainly don't believe that it does more harm than good.

Yes, I feel that a test suite is invaluable indeed - I intended to
badge. I'm thankful for every piece of test to rely upon. But the suite
has been out there for 2-3 years now, it seems; it's not marked at all as
incorrect in some parts. There is just a table of results showing the
discrepancy of results produced by various XML Schema processors. We can
add another column with our results to it and it would not gain a bit.

> I didn't use the test definition files or the reference results myself: I
> just tried to process all the schemas and validate the relevant documents
> against them, comparing the results of my processor with that of other
> processors. This is sometimes a bit hit-and-miss (you can reject an invalid
> schema for the wrong reason) but it exercises the processor reasonably

This defines an "XML schema with corresponding XML document collection",
not a test collection.

> thoroughly and experience in the field suggests that (with the help of some
> carefully planned supplementary testing) it got most of the bugs out.
>
> Despite the very large numbers of tests, there are some areas where coverage
> is not good. For example, I think there are only two tests that redefine a
> schema with a change of namespace.
>
> I'm not sure what you mean when you say some of the tests are "incorrect".

"Incorrect" should mean that the XML Schema documents and the 
corresponding XML documents, regarding their expected validation 
results, do not conform to the spec.

> Of course, there are many invalid schemas there, as there should be. Also,

Yes.

> some of them are invalid because they use obsolete syntax that was changed
> before the final Rec came out.

So they are broken.

> The three groups of tests complement each other quite well. The Sun suite is
> a small set of tests that's quick to run, but manages quite a high coverage
> of the spec. The NIST suite goes into exhaustive detail on testing the
> validation of simple types. The Microsoft tests are very large in number and
> manage a pretty broad coverage, though many of them are testing trivial
> error conditions like dangling references, and the coverage of the deeper
> semantic issues (like UPA) is much weaker.

Yes, the NIST tests seem to be very detailed - and correct.

> I would love to share some of the improvements I have made to the test suite
> but I simply don't have the time. I'm sure the same goes for other people
> including the original contributors.

And this I can completely understand. Maybe there is a need to create some
room for people who actually have the time.

The conclusion for me:

I'll communicate "incorrect" tests to this mailing list when
   encountered.

Special offer:

If the test initiative needs the complete MS and SUN test definition 
files after 2 (3?) years, I can send a copy.

Proposals:

1. Mark the test-collection clearly as "work-in-progress" and incorrect
   in some parts.

2. Implementers should not be told to conform to the MS tests naively
   (unless the intent is to create
   non-spec-conformant-but-MS-conformant schema processors)


Regards,

Kasimier
Received on Thursday, 2 September 2004 17:08:14 UTC