- From: Kasimier Buchcik <kbuchcik@4commerce.de>
- Date: Wed, 16 Mar 2005 23:33:28 +0100
- To: noah_mendelsohn@us.ibm.com
- CC: xmlschema-dev@w3.org
Hi,

OK, I think I got it now with the help of you and Henry. It boils down to this: I have the freedom to say to the instance, "I cannot validate you in streaming mode, since the result would differ from a non-streaming validation". So only a subset of instances can be streamed, which seems to be in accordance with the spec, although it is not completely defined there; this might open a door for differing implementations.

Good to hear all that! I can only recommend that schema authors certify their instances as being streamable for all schema processors :-)

Thanks & regards,

Kasimier

noah_mendelsohn@us.ibm.com wrote:
> Replying to questions in several of your notes:
>
> Kasimier Buchcik writes:
>
>> Can you already make any statement whether
>> component identity checks will still play a role
>> in the forthcoming spec?
>
> Everyone agrees that having some well crafted notion of identity is
> important. For example, you probably want two different conforming
> processors to agree on how many components are created from a given
> combination of schema documents. The intention is that the rules be as
> consistent as possible with those in Schema 1.0, at least insofar as
> Schema 1.0 was unambiguous. How best to formulate notions of component
> identity in 1.1 is under discussion. Several proposals are being
> actively considered.
>
>> Wouldn't this break streaming validation? If
>> streaming, such a schema to be imported is not
>> known until that specific importing node is
>> reached - the preceding nodes of the tree do not
>> know of it. Are streaming validators expected to
>> prescan the instance, resulting in parsing an
>> instance document _twice_? Sounds strange to me,
>> but maybe I didn't get the statement right.
>
> No, the idea is that streaming should be practical in many cases, but I
> think you're confused about the model we use to explain it.
> The way I believe 1.0 works, and the way I believe 1.1 should work
> (not everyone quite agrees with me on this), is that streaming is often
> possible, but is just an optimization. You do NOT have to prescan the
> instance. What you do need to be sure of is that, when you finally
> stumble on and process an xsi:schemaLocation, its presence would not
> change any validation results to which you have already committed. For
> example, if you've already claimed that an element "e" couldn't be
> validated due to a missing element declaration, you can't then start
> validating later "e"s with a newly acquired declaration. The common case,
> however, is that you hit the xsi:schemaLocation before any elements to
> which it applies, and so no such conflict arises; you can pick up the
> element declaration and any associated types, and then act as if that
> declaration had been in the schema from the start.
>
> Stated differently: by the time you get to the end of your document, you
> will have incrementally assembled a schema. It MUST be the case that if
> you were to redo the entire validation using exactly that static schema,
> you would get the identical result. That's the sense in which streaming
> is an optimization.
>
> Thus, the result is computed in a streaming way, but is the same as one
> would have gotten IF you had prescanned and found the xsi:schemaLocation
> in advance. So, streaming and non-streaming processors must report the
> same results (except insofar as the recommendation provides latitude in
> other ways).
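
The rule described in the quoted message can be illustrated with a small sketch. This is only a toy model in Python, not how any real schema processor works: the event format, the names `validate_streaming`, `validate_static` and `StreamingConflict`, and the two outcome strings are all assumptions made for illustration. A "declare" event stands in for processing an xsi:schemaLocation that supplies an element declaration.

```python
class StreamingConflict(Exception):
    """A late declaration would change a result already committed to."""


def validate_streaming(events):
    """Validate a toy event stream in a single pass.

    events is a list of (kind, name) tuples (hypothetical format):
      ("declare", name): a declaration for `name` becomes available
      ("element", name): an element called `name` is encountered
    Returns (results, schema), where schema is the set of declarations
    assembled incrementally while streaming.
    """
    schema = set()             # declarations acquired so far
    committed_missing = set()  # names already reported as undeclared
    results = []
    for kind, name in events:
        if kind == "declare":
            if name in committed_missing:
                # Earlier elements were reported as having no declaration;
                # accepting this one now would contradict those results.
                raise StreamingConflict(name)
            schema.add(name)
        elif kind == "element":
            if name in schema:
                results.append((name, "validated"))
            else:
                committed_missing.add(name)
                results.append((name, "no declaration"))
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return results, schema


def validate_static(events, schema):
    """Redo the validation with the complete schema known up front."""
    return [
        (name, "validated" if name in schema else "no declaration")
        for kind, name in events
        if kind == "element"
    ]


# Common case: the declaration arrives before the elements it applies to.
events = [("declare", "e"), ("element", "e"), ("element", "f")]
results, schema = validate_streaming(events)
# Streaming is only an optimization: the static rerun must agree.
assert results == validate_static(events, schema)
```

In this sketch an instance such as `[("element", "e"), ("declare", "e")]` raises `StreamingConflict`, which corresponds to the processor telling the instance that it cannot be validated in streaming mode.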
Received on Wednesday, 16 March 2005 22:34:05 UTC