RE: [ANN] XSDBench XML Schema Benchmark 1.0.0 released from Michael Kay on 2006-10-18 (xmlschema-dev@w3.org from October 2006)

From: Michael Kay <mike@saxonica.com>
Date: Wed, 18 Oct 2006 13:22:36 +0100
To: "'Boris Kolpackov'" <boris@codesynthesis.com>
Cc: <xmlschema-dev@w3.org>
Message-ID: <00c801c6f2b0$190e6ee0$6601a8c0@turtle>

Yes, but you can't measure the performance of a product with only one test case.


For example, many schema validators are likely to have an elapsed time for
validation of something like (aX + c), where X is the document size, a is the
per-unit cost and c is a fixed overhead. If you're only measuring one 12K
instance, then dividing the processing time by X gives (a + c/X), which isn't
a useful measure of throughput because you don't know what "c" is.

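As a rough illustration of the point above, the sketch below fits the
(aX + c) model to timings taken at several document sizes and compares the
fitted per-unit cost with the naive time/size figure from a single small
instance. The function name fit_linear and all the timing figures are made up
for illustration; they are not measurements from XSDBench or from any
validator.

```python
def fit_linear(sizes, times):
    """Ordinary least-squares fit of the model times = a * sizes + c."""
    n = len(sizes)
    if n != len(times):
        raise ValueError("sizes and times must have the same length")
    if len(set(sizes)) < 2:
        # A single document size cannot separate the slope a from the offset c.
        raise ValueError("need at least two distinct document sizes")
    mean_x = sum(sizes) / n
    mean_t = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(sizes, times))
    a = sxt / sxx
    c = mean_t - a * mean_x
    return a, c


# Hypothetical timings (ms) for documents of 12, 120 and 1200 KB, generated
# from a = 0.5 ms/KB and c = 20 ms purely for the sake of the example.
sizes_kb = [12, 120, 1200]
times_ms = [26.0, 80.0, 620.0]

a, c = fit_linear(sizes_kb, times_ms)
naive = times_ms[0] / sizes_kb[0]  # what a single 12K instance would suggest

print(f"fitted per-KB cost a = {a:.3f} ms/KB, fixed overhead c = {c:.1f} ms")
print(f"naive time/size from the 12 KB instance alone = {naive:.3f} ms/KB")
```

With these invented numbers the single-instance figure is several times
larger than the fitted slope, because the fixed overhead dominates the
smallest document.
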
If you're trying to prove that your product is the fastest, then you need to
produce something a bit more convincing; and if you're trying to provide a
tool that's useful to the community, then it needs to do a more thorough
analysis.

(Though I can't complain, because this one test did find a bug in my product
which none of the 3000 test cases in the W3C test suite had shown up!)

Michael Kay
http://www.saxonica.com/


> -----Original Message-----
> From: Boris Kolpackov [mailto:boris@codesynthesis.com] 
> Sent: 18 October 2006 12:42
> To: Michael Kay
> Cc: 'Boris Kolpackov'; xmlschema-dev@w3.org
> Subject: Re: [ANN] XSDBench XML Schema Benchmark 1.0.0 released
> 
> Hi Michael,
> 
> Michael Kay <mike@saxonica.com> writes:
> 
> > Have I missed something, or does this "benchmark" really consist of 
> > just a single schema and a single instance to be measured?
> 
> Yes, because we tried to make it as close to reality as 
> possible. The schema consists of multiple sub-tests for the 
> most commonly-used features of XML Schema (structure). It tests:
> 
>   * attribute
>   * anyAttribute
> 
>   * element
>   * any
> 
>   * all
>   * choice
>   * sequence
> 
>   * complex type empty content, including extension and restriction
>   * complex type simple content, including extension and restriction
>   * complex type complex content, including extension and restriction
> 
> The instance then exercises each of these sub-tests in a 
> number of ways. This way you get the overall performance of 
> the parsers on the set of most commonly used features. Of 
> course, you may not use some of them in your schemas. We 
> still think this is better than having a number of small 
> schemas that each exercise an individual feature, because:
> 
>   a) this is not what real-life schemas look like
> 
>   b) it is not clear how to interpret these results for practical
>      purposes (i.e., which parser will be the fastest for my schemas).
> 
> 
> 
> -boris
> 
> 
> --
> Boris Kolpackov
> Code Synthesis Tools CC
> http://www.codesynthesis.com
> tel: +27 76 1672134
> fax: +27 21 5526869

Received on Wednesday, 18 October 2006 12:22:51 UTC