RE: Xml Schema profile from noah_mendelsohn@us.ibm.com on 2006-07-19 (xmlschema-dev@w3.org from July 2006)

From: <noah_mendelsohn@us.ibm.com>
Date: Wed, 19 Jul 2006 16:07:16 -0400
To: "Michael Kay" <mike@saxonica.com>
Cc: lists@jeffrafter.com, "'Paul Kiel'" <paul@hr-xml.org>, xmlschema-dev@w3c.org
Message-ID: <OF7F2611BA.F940F598-ON852571B0.006D9C18-852571B0.006E881E@lotus.com>

Michael Kay writes:

> > There are issues that you can run into if you use a strange mix of 
> > namespaces and redefining of chameleon components. In the 
> > general case we have seen a lot of improvement in the past year.
> >
> 
> Saxon basically imposes the rule, in managing its schema cache, that once a
> type has been used then it can't be redefined. Using a type means using it
> for validating an instance or for compiling a query or stylesheet. This is a
> bit of a blunt instrument, but it seems to eliminate most of the potential
> problems. The rule ensures that all references to a type with a given name
> are referring to the same type.

My reading of the spec is that what you're doing is not only allowed, it's 
required.  The spec is intended, at least as far as I'm concerned, to say 
that the PSVI resulting from a schema/instance pair does not depend on the 
order in which you do your work.  So, from that perspective, I'd quibble 
with talking about "once a type has been used".   Note in particular where 
the rec says [1]:

"Although ·assessment· is defined recursively, it is also intended to be 
implementable in streaming processors. Such processors may choose to 
incrementally assemble the schema during processing in response, for 
example, to encountering new namespaces. The implication of the invariants 
expressed above is that such incremental assembly must result in an 
·assessment· outcome that is the same as would be given if ·assessment· 
was undertaken again with the final, fully assembled schema. "

Stated a bit differently, you must commit to the definition of any 
particular type (or other component) no later than the time it is first 
used, and after that point it must be immutable.  In the case of a 
redefined component, that rule requires that [2]:

"The modifications have a pervasive impact, that is, only the redefined 
components are used, even when referenced from other incorporated 
components, whether redefined themselves or not. "
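
To illustrate what "pervasive" means here, the following is a minimal sketch, not taken from the spec or from any real processor. It assumes components refer to types by name and resolve that name through the schema at lookup time. The names TypeDef, ElementDecl, Schema, and the max_length facet are hypothetical, and the sketch ignores the spec's requirement that a redefined type be derived from the original.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeDef:
    name: str
    max_length: int  # hypothetical single facet, for illustration only


@dataclass(frozen=True)
class ElementDecl:
    name: str
    type_name: str  # reference by name, resolved through the schema


class Schema:
    """Toy schema: one table of types, one table of element declarations."""

    def __init__(self):
        self.types = {}
        self.elements = {}

    def add_type(self, type_def):
        self.types[type_def.name] = type_def

    def add_element(self, element):
        self.elements[element.name] = element

    def redefine(self, type_def):
        # Replace the definition in the single shared table, so every
        # by-name reference sees the redefined component.
        if type_def.name not in self.types:
            raise KeyError(f"cannot redefine unknown type {type_def.name!r}")
        self.types[type_def.name] = type_def

    def type_of(self, element_name):
        return self.types[self.elements[element_name].type_name]


schema = Schema()
schema.add_type(TypeDef("Code", 10))
schema.add_element(ElementDecl("productCode", "Code"))  # not itself redefined
schema.redefine(TypeDef("Code", 4))

# The element was incorporated before the redefinition and was not
# redefined itself, yet it resolves to the redefined type.
resolved = schema.type_of("productCode")
```

Because there is only one table entry per type name, there is never an "old" and a "new" version of Code in play at once.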

Taken together these rules require what I think you're saying Saxon does, 
i.e. that the redefinition be pervasive and that, to the extent you build 
an implementation that works incrementally, the results be the same 
as if you had assembled the entire schema in advance, including 
redefinitions, and then done the validation.  Note that you don't actually 
have to decide in advance what the schema is going to be;  at the end of 
the assessment you must have determined what the schema is, and your 
results must be the same as if you had assembled in advance.
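
One way to get that guarantee in an incremental processor is to freeze each component at first use. The following is a minimal sketch of that idea only. It is not Saxon's actual implementation; the names SchemaCache, TypeInUseError, define, and use are hypothetical, and type definitions are reduced to plain strings.

```python
class TypeInUseError(Exception):
    """Raised when a type that has already been used would be changed."""


class SchemaCache:
    """Toy cache in which a type becomes immutable once it has been used."""

    def __init__(self):
        self._types = {}
        self._used = set()

    def define(self, name, definition):
        # A definition may be replaced freely until the first use; after
        # that, only an identical definition is accepted.
        if name in self._used and self._types[name] != definition:
            raise TypeInUseError(f"type {name!r} is already in use")
        self._types[name] = definition

    def use(self, name):
        # Look up first, so an unknown name raises KeyError without
        # being marked as used.
        definition = self._types[name]
        self._used.add(name)
        return definition


cache = SchemaCache()
cache.define("Code", "string, maxLength 10")
cache.define("Code", "string, maxLength 4")  # allowed: not yet used

first = cache.use("Code")  # from here on the definition is committed

try:
    cache.define("Code", "string, maxLength 2")
    rejected = False
except TypeInUseError:
    rejected = True

second = cache.use("Code")
```

Since every use of Code sees the same definition, validating with this cache gives the same answer as assembling the final schema first and validating afterwards.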

Other than giving a clean rule that one can reason about fairly easily, 
this also has the desirable effect of increasing compatibility between 
streaming and non-streaming processors.  When I send you an instance for 
validation, I do need to know which schema you will decide to use, but I 
don't need to know whether your processor is streaming or batch;  either 
way, we'll agree on what the PSVI is (modulo the few areas where schema 
gives a choice in the PSVI, such as whether to include full reflected type 
components or just the names of types.)

Noah

[1] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#layer1
[2] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#modify-schema

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Wednesday, 19 July 2006 20:07:34 UTC