- From: <noah_mendelsohn@us.ibm.com>
- Date: Thu, 8 Dec 2005 12:08:05 -0500
- To: "Michael Kay" <mike@saxonica.com>
- Cc: "'Bryan Rasmussen'" <brs@itst.dk>, "'Henry S. Thompson'" <ht@inf.ed.ac.uk>, xmlschema-dev@w3.org
- Message-ID: <OFED8D0103.F554A008-ON852570D1.005D5005-852570D1.005E9F21@lotus.com>
Michael Kay asks:

> Is it conformant to use a schema for validating instances
> without reporting the errors that appear in unused parts of the schema?

First, let's separate schema documents from schemas. The Rec has nothing to say about schema documents that look almost like schema documents but have one or more errors, just as the XML Rec has little to say about documents with missing end tags, except to say that they are not XML. From that point of view, if you aren't checking, you're on your own.

The schema Rec also makes clear that you can be minimally conforming by putting together a schema any way you like. The pertinent Rec text is (note that this is talking about components in general and not specifically about schema documents) [1]:

> Processors have the option to assemble (and perhaps to optimize or pre-compile) the entire schema prior to the start of an ·assessment· episode, or to gather the schema lazily as individual components are required. In all cases it is required that:
>
> * The processor succeed in locating the ·schema components· transitively required to complete an ·assessment· (note that components derived from ·schema documents· can be integrated with components obtained through other means);
>
> * no definition or declaration changes once it has been established;
>
> * if the processor chooses to acquire declarations and definitions dynamically, that there be no side effects of such dynamic acquisition that would cause the results of ·assessment· to differ from that which would have been obtained from the same schema components acquired in bulk.
>
> Note: the ·assessment· core is defined in terms of schema components at the abstract level, and no mention is made of the schema definition syntax (i.e. <schema>). Although many processors will acquire schemas in this format, others may operate on compiled representations, on a programmatic representation as exposed in some programming language, etc.
>
> The obligation of a schema-aware processor as far as the ·assessment· core is concerned is to implement one or more of the options for ·assessment· given below in Assessing Schema-Validity (§5.2). Neither the choice of element information item for that ·assessment·, nor which of the means of initiating ·assessment· are used, is within the scope of this specification.
>
> Although ·assessment· is defined recursively, it is also intended to be implementable in streaming processors. Such processors may choose to incrementally assemble the schema during processing in response, for example, to encountering new namespaces. The implication of the invariants expressed above is that such incremental assembly must result in an ·assessment· outcome that is the same as would be given if ·assessment· was undertaken again with the final, fully assembled schema.

The way to think about this is: at the end of your validation episode, you must have used some components. Whatever they are, taken together:

* They must meet the constraints on components, i.e. they must be a legal schema. Nothing says that they need to be the same schema you would have used for validating some other instance.

* None of the components may have changed during validation, in the following sense: if you were to take that final schema that you knew about at the end and revalidate the same instance, you must get the same PSVI. So the results must be the same as if you had not streamed and as if the component properties had been established from the start.

That's the story for minimal conformance [2].
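[Editor's illustration: to make the "no definition or declaration changes once it has been established" invariant concrete, here is a minimal Java sketch of a lazily populated component table. The names ComponentTable and SchemaComponent are invented for illustration; they are not part of any real validator's API.]

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of lazy component acquisition under the Rec's
// invariants; ComponentTable and SchemaComponent are invented names,
// not any real processor's API.
final class ComponentTable {

    // Immutable stand-in for an element declaration, type definition, etc.
    record SchemaComponent(String qname, String definition) {}

    private final Map<String, SchemaComponent> established = new ConcurrentHashMap<>();

    // Acquire a component on first use (e.g. on encountering a new
    // namespace mid-stream). computeIfAbsent runs the loader at most
    // once per name, so a definition, once established, never changes
    // for the rest of the assessment episode.
    SchemaComponent resolve(String qname, Function<String, SchemaComponent> loader) {
        return established.computeIfAbsent(qname, loader);
    }

    // At the end of the episode, the components actually used must,
    // taken together, satisfy the constraints on components, i.e. form
    // a legal schema; a real processor would run those checks here.
    Map<String, SchemaComponent> finalSchema() {
        return Map.copyOf(established);
    }
}
```

[Revalidating the same instance against the frozen result of finalSchema() should then yield the same PSVI as the incremental run, which is exactly the test the Rec's streaming note describes.]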
If you additionally wish to claim "conformance to the XML Representation of Schemas" [2], then you must indeed read your schema documents for correctness and for proper mappings to components. There has been some disagreement in the WG as to whether you need to check for errors in mappings of components you don't use, but surely you need to check the constraints on XML representations.

In any case, my reading is that you can surely do what you want and claim minimal conformance. Whether you can skip certain error checking on "unused" components from schema documents and claim "conformance to the XML Representation of Schemas" is a bit less clear, but I don't see why that should be a barrier to implementation.

This is my personal reading. YMMV.

Noah

[1] http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/#layer1
[2] http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/#concepts-conformance

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

"Michael Kay" <mike@saxonica.com>
Sent by: xmlschema-dev-request@w3.org
12/08/05 11:26 AM

To: "'Henry S. Thompson'" <ht@inf.ed.ac.uk>, "'Bryan Rasmussen'" <brs@itst.dk>
cc: <xmlschema-dev@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: RE: performance testing of schemas

> My experience is that schema 'compilation' completely swamps
> schema-based 'validation', so the first thing to do in any
> performance testing is to separate these two phases.

With this in mind I've been thinking about "lazy" or incremental compilation of schemas, to avoid compiling the parts that aren't used in a particular validation episode. Is it conformant to use a schema for validating instances without reporting the errors that appear in unused parts of the schema?

Michael Kay
http://www.saxonica.com/
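[Editor's illustration: the compile/validate split Kay describes is directly visible in the standard JAXP validation API, which a rough benchmark can use to time the two phases independently. A minimal sketch, assuming placeholder file names schema.xsd and instance.xml:]

```java
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.File;

// Sketch of two-phase timing with the standard JAXP validation API.
// The file names are placeholders.
public class TwoPhaseTiming {
    public static void main(String[] args) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

        // Phase 1: schema "compilation" -- parsing the schema documents
        // and assembling the component model. In Kay's experience this
        // phase swamps the next one.
        long t0 = System.nanoTime();
        Schema schema = factory.newSchema(new StreamSource(new File("schema.xsd")));
        long compileNanos = System.nanoTime() - t0;

        // Phase 2: instance validation against the precompiled schema.
        Validator validator = schema.newValidator();
        long t1 = System.nanoTime();
        validator.validate(new StreamSource(new File("instance.xml")));
        long validateNanos = System.nanoTime() - t1;

        System.out.printf("compile: %d ms, validate: %d ms%n",
            compileNanos / 1_000_000, validateNanos / 1_000_000);
    }
}
```

[Compiling the Schema object once and reusing it across many validation episodes keeps the expensive phase out of the per-instance measurement, which is the separation Kay recommends for performance testing.]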
Received on Thursday, 8 December 2005 17:08:40 UTC