Re: Multiple and circular import/include from Kasimier Buchcik on 2005-03-16 (xmlschema-dev@w3.org from March 2005)

From: Kasimier Buchcik <kbuchcik@4commerce.de>
Date: Wed, 16 Mar 2005 23:33:28 +0100
To: noah_mendelsohn@us.ibm.com
CC: xmlschema-dev@w3.org
Message-ID: <4238B438.2030300@4commerce.de>
Hi,

OK, I think I got it now with the help of you and Henry. It boils down
to this: I have the freedom to actually say to the instance, "I cannot
validate you in streaming mode, since the result would differ from
a non-streaming validation". So only a subset of instances can be
streamed, which seems to be in line with the spec, although it is not
completely defined there; this might open a door for differing
implementations.
Good to hear all that! I can only recommend that schema authors
certify their instances as being streamable for all schema
processors :-)

Thanks & regards,

Kasimier

noah_mendelsohn@us.ibm.com wrote:
> Replying to questions in several of your notes: 
> 
> Kasimier Buchcik writes:
> 
> 
>>Can you already make any statement whether
>>component identity checks will still play a role
>>in the forthcoming spec?
> 
> 
> Everyone agrees that having some well crafted notion of identity is 
> important.  For example, you probably want two different conforming 
> processors to agree on how many components are created from a given 
> combination of schema documents.  The intention is that the rules be as 
> consistent as possible with those in Schema 1.0, at least insofar as 
> Schema 1.0 was unambiguous.  How best to formulate notions of component 
> identity in 1.1 is under discussion.   Several proposals are being 
> actively considered.
> 
> 
>>Wouldn't this break streaming validation? If
>>streaming, such a schema to be imported is not
>>known until that specific importing node is
>>reached - the preceding nodes of the tree do not
>>know of it. Are streaming validators expected to
>>prescan the instance, resulting in parsing an
>>instance document _twice_? Sounds strange to me,
>>but maybe I didn't get the statement right.
> 
> 
> No, the idea is that streaming should be practical in many cases, but I 
> think you're confused about the model we use to explain it.  The way I 
> believe 1.0 works, and the way I believe 1.1 should work (not everyone 
> quite agrees with me on this) is that streaming is often possible, but is 
> just an optimization.  You do NOT have to prescan the instance.  What you 
> do need to be sure of is that, when you finally stumble on and process an 
> xsi:schemaLocation, its presence would not change any validation results 
> to which you have already committed.  For example, if you've already 
> claimed that an element "e" couldn't be validated due to a missing element 
> declaration, you can't then start validating later "e"s with a newly 
> acquired declaration.  The common case, 
> however, is that you hit the xsi:schemaLocation before any elements to 
> which it applies, and so no such conflict arises; you can pick up the 
> element declaration and any associated types, and then act as if that 
> declaration had been in the schema from the start. 
> 
> Stated differently: by the time you get to the end of your document, you 
> will have incrementally assembled a schema.  It MUST be the case that if 
> you were to redo the entire validation using exactly that static schema, 
> you would get the identical result.  That's the sense in which streaming 
> is an optimization.
> 
> Thus, the result is computed in a streaming way, but is the same as one 
> would have gotten IF you had prescanned and found the xsi:schemaLocation 
> in advance.  So, streaming and non-streaming processors must report the 
> same results (except insofar as the recommendation provides latitude in 
> other ways.)
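
The rule described above (a schema picked up late must not change a result already reported, and the streamed result must equal a re-validation against the finally assembled schema) can be sketched roughly as follows. This is only an illustrative toy in Python, not how any real schema processor works: the event tuples, the names StreamingValidator, NotStreamable, validate_streaming and validate_static, and the two outcomes "valid" and "unknown" are all assumptions made up for the sketch, and "validation" is reduced to checking whether an element name has a declaration.

```python
from dataclasses import dataclass, field


class NotStreamable(Exception):
    """A late schema hint would change a result that was already reported."""


@dataclass
class StreamingValidator:
    # Hypothetical model: a "schema" is just a set of declared element names.
    declared: set = field(default_factory=set)
    committed: dict = field(default_factory=dict)  # name -> first outcome reported
    results: list = field(default_factory=list)

    def load_schema(self, names):
        # Called when an xsi:schemaLocation hint is reached in the stream.
        conflicts = sorted(n for n in names if self.committed.get(n) == "unknown")
        if conflicts:
            # We already claimed these elements had no declaration.
            raise NotStreamable(f"late declarations for: {conflicts}")
        self.declared.update(names)

    def element(self, name):
        outcome = "valid" if name in self.declared else "unknown"
        self.committed.setdefault(name, outcome)
        self.results.append((name, outcome))


def validate_streaming(events):
    """Single pass; the schema is assembled incrementally."""
    validator = StreamingValidator()
    for kind, payload in events:
        if kind == "schema":
            validator.load_schema(payload)
        else:
            validator.element(payload)
    return validator.results


def validate_static(events):
    """Reference result: prescan for all schema hints, then validate."""
    declared = set()
    for kind, payload in events:
        if kind == "schema":
            declared |= payload
    return [
        (payload, "valid" if payload in declared else "unknown")
        for kind, payload in events
        if kind == "element"
    ]
```

With the hint ahead of the elements it applies to, e.g. [("schema", {"e"}), ("element", "e")], both functions return the same list. With an "e" appearing before the hint that declares it, validate_streaming raises NotStreamable instead of reporting a result that differs from validate_static, which is the "I cannot validate you in streaming mode" case.
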
Received on Wednesday, 16 March 2005 22:34:05 UTC