- From: Gregor <iamgregor@gmail.com>
- Date: Thu, 28 Apr 2005 18:55:42 +0530
- To: xmlschema-dev@w3.org
I want to find out whether a parser is capable of validating all instances that may be generated by one particular XML vocabulary. The test suite tests different, let's call them "features", of the XML Schema language, like declaring an element with a ref attribute that then references another element declaration, etc. My approach therefore was to classify the different ways an element can be declared in an XML Schema and assign each an identifier based on the recommendation.

Rules: Component X is the n-th subchapter of chapter 3 of the recommendation, so n is its main identifier. The possible attributes are numbered m (1..c), c being the number of different attributes of a component, so the combination of a component and an attribute is n.m. Links to other components follow the XML Schema component data model: if a component includes another component, this is written n-o, where o is the main identifier of the other component.

For the element component I got the following numbering scheme:

3     Element Declaration
3.1   Element with attribute abstract
3.2   Element with attribute block
3.3   Element with attribute default
3.4   Element with attribute final
3.5   Element with attribute fixed
3.6   Element with attribute form
3.7   Element with attribute id
3.8   Element with attribute maxOccurs
3.9   Element with attribute minOccurs
3.10  Element with attribute name
3.11  Element with attribute nillable
3.12  Element with attribute ref
3.13  Element with attribute substitutionGroup
3.14  Element with attribute type
3-4   Element Declaration is defined by Complex Type Definition
3-11  Element Declaration contains Identity-constraint Definition
3-14  Element Declaration is defined by Simple Type Definition

I then wrote an appinfo element for the test case metadata.
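The numbering rules above can be sketched in a few lines (hypothetical code, not the author's actual tooling): the component gets its chapter number n, and its m-th attribute gets n.m.

```python
# Attribute names of the element declaration component, in the order
# used by the numbering scheme above.
ELEMENT_ATTRIBUTES = [
    "abstract", "block", "default", "final", "fixed", "form", "id",
    "maxOccurs", "minOccurs", "name", "nillable", "ref",
    "substitutionGroup", "type",
]

def element_identifiers(n=3):
    """Map each identifier (n, n.m) to a human-readable description."""
    ids = {str(n): "Element Declaration"}
    for m, attr in enumerate(ELEMENT_ATTRIBUTES, start=1):
        ids[f"{n}.{m}"] = f"Element with attribute {attr}"
    return ids
```

For instance, `element_identifiers()["3.12"]` yields "Element with attribute ref", matching the scheme above.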
It describes in a machine-readable way what the test group is testing:

<annotation>
  <documentation>3.3.2 XML Representation of Element Declaration Schema
    Components Specs section: 3.3.2 XML Representation of Element
    Declaration Schema Components - L Element with ref='foo' with foo is
    a declared element</documentation> <!-- This is already there -->
  <appinfo> <!-- this is new -->
    <td:technicalTestGroupDescription
        xmlns:td="http://xml.sap.com/2005/04/industrySpeak/validation#technicalDescription">
      <td:xmlSchema version="1.0" edition="1" />
      <fd:feature recommendationPart="1" featureNumber="3.12"
          xmlns:fd="http://xml.sap.com/2005/04/industrySpeak/validation#featureDescription">
        <fd:Description>Element with attribute ref</fd:Description>
      </fd:feature>
    </td:technicalTestGroupDescription>
  </appinfo>
</annotation>

I then analyze an XML business vocabulary, e.g. all CIDX schema files (www.cidx.org), and check whether any of these files matches the regular expression <(xs:|xsd:)?element ref=". If so, 3.12 is assigned the value "true", else "false". I do this for all 102 numbered components (components, component attributes and links whose numbers are generated by the rule set described above). Because the components are classified by element and attribute names, the vocabulary can be analysed using one regular expression per component number. A test group might (unlike the above) test several different component numbers. When running the final benchmark I would include only the test groups that test exactly those components that appear in the analyzed vocabulary (in this case CIDX).

The reason for the exercise: when running the whole test suite, a certain percentage (say 20%) of test cases fail. This only tells me that the parser does not support the whole XML Schema language and thus is not a minimally conforming parser.
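The analysis step described above could look roughly like this (a minimal sketch; the feature table and patterns are illustrative, not the author's full list of 102 components):

```python
import re
from pathlib import Path

# Hypothetical excerpt of the component-number -> regex table; the
# real table would cover all 102 numbered components.
FEATURE_PATTERNS = {
    "3.10": re.compile(r'<(xs:|xsd:)?element[^>]*\bname="'),
    "3.12": re.compile(r'<(xs:|xsd:)?element[^>]*\bref="'),
}

def analyze_vocabulary(schema_dir):
    """Return {component number: True/False} indicating which features
    appear in any .xsd file of the given vocabulary directory."""
    found = {num: False for num in FEATURE_PATTERNS}
    for path in Path(schema_dir).glob("*.xsd"):
        text = path.read_text(encoding="utf-8")
        for num, pattern in FEATURE_PATTERNS.items():
            if pattern.search(text):
                found[num] = True
    return found
```

Running this over the CIDX schema files would yield the true/false assignment per component number described above.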
On the other hand, it does not mean that it is unfit to perform its validation duty in a middleware processing CIDX messages, because CIDX messages use only a subset of the features provided by the schema language. If test cases in this reduced test suite fail, we can examine them more closely and pinpoint the problem. (Limitation of the approach: it cannot determine a minimum set of test groups, supposing there is one; it can only reduce the number of test cases to a certain amount.)

I would therefore be interested in your thoughts about:

1. Does my classification of the XML Schema features make sense?

2. What do you think of my appinfo? Is anything similar planned for all test suites?

3. What do you think of the idea of testing only "relevant" test groups to pinpoint the problems a parser might have with respect to a certain vocabulary? This is the benchmarking part (saying that a parser is unfit if it does not pass all the test cases). Do you think it is possible, with this classification or any classification and a finite number of test cases, to turn the argument around? Example: the analysis of a vocabulary has revealed that it uses the following features, inference rules, components...; on the other hand, we have test cases testing exactly those features, inference rules, components... If parser X validates these test cases correctly, it will also validate the vocabulary (in fact, all instances the vocabulary will ever produce) correctly.

I would love to see all test cases outfitted with some kind of machine-readable description (not at all necessarily the one described above) to allow for more targeted and agile testing. (This could also make tests during development a lot easier.) In a broader vision, the XML industry standard bodies could be asked to supply their test cases in a specified and documented format. I personally think that testing should become one major pillar of the implementation process.
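The selection of "relevant" test groups discussed above reduces to a set comparison (a sketch with hypothetical names): keep a test group only if every component number it tests also appears among the components the analyzed vocabulary uses.

```python
def select_test_groups(test_groups, used_components):
    """test_groups: {group name: set of component numbers it tests};
    used_components: set of component numbers found in the vocabulary.
    Returns the names of the test groups to include in the benchmark."""
    return {name for name, tested in test_groups.items()
            if tested <= used_components}  # subset test
```

For example, with groups {"ref-only": {"3.12"}, "ref-and-subst": {"3.12", "3.13"}} and a vocabulary using {"3.10", "3.12"}, only "ref-only" survives, since 3.13 (substitutionGroup) never occurs in the vocabulary.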
In my opinion, compliance could be greatly improved if the W3C supplied a reference test suite that a parser has to pass in order to gain W3C approval, instead of leaving the software companies to figure it out on their own (first the recommendation, and then later their own performance in implementing it).

Thank you for your time,
Gregor
Received on Thursday, 28 April 2005 16:23:46 UTC