RE: Permit (greedy) conflicting wildcards

Michael Kay writes:

> The thing I'm having trouble understanding is that this seems to assume a
> rather finite and predictable schema. It seems to make the outcome of a
> validation episode rather dependent on the set of element declarations that
> happen to be lying around in your schema cache.

You may well have some sort of cache as an implementation strategy, but 
it's not an abstraction that appears in the schema language.  There are 
already a number of constructs that have the same closed-world feel; 
substitution groups come to mind.  To know what a substitution group will 
validate, you need to know unambiguously what all your global element 
declarations are, at least sufficiently to know which ones claim to be in 
a given substitution group.  If you choose as a schema assembly strategy 
"whatever's lying around in my implementation cache", so be it, though I 
assume that precludes your running a validation at 9 AM with a schema that 
says <element name="e" type="float"/> and another at 10 AM with <element 
name="e" type="int"/>, unless you flush the cache in between.

The schema recommendation very clearly says [1]:

"Although ·assessment· is defined recursively, it is also intended to be 
implementable in streaming processors. Such processors may choose to 
incrementally assemble the schema during processing in response, for 
example, to encountering new namespaces. The implication of the invariants 
expressed above is that such incremental assembly must result in an 
·assessment· outcome that is the same as would be given if ·assessment· 
was undertaken again with the final, fully assembled schema."

While not written specifically to deal with these "action at a distance" 
mechanisms, this text makes it pretty clear that assessment is indeed 
defined over a completely assembled schema in which all components are 
known.  Anything more incremental is a processor implementation strategy 
that must not have externally visible characteristics conflicting with 
the normative rule.

I do agree that certain precompilation of NIS wildcards can't be done in 
advance of knowing which global elements exist.  I believe the same is 
true of compilation of the disjunction implied by a reference to a 
substitution group head.  The two seem fairly similar to me in that 
respect.
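
For readers who haven't seen it, here is a rough sketch of what a NIS 
wildcard might look like, using the notQName="##defined" spelling 
mentioned earlier (the spelling is the one we're likely to propose; the 
surrounding declarations are invented for illustration):

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="record">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="known" type="xs:string"/>
          <!-- Accept any element whose name does NOT match a global element
               declaration in the assembled schema.  Whether a given instance
               element matches therefore can't be decided until all global
               declarations are known. -->
          <xs:any notQName="##defined" processContents="skip"
                  minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

  </xs:schema>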

FWIW: I think there are downsides to this feature.  There is a sense in 
which it's tricky, and as with substitution group references, you really 
don't know what these wildcards validate just by examining the local 
component.  Still, I don't think they go beyond precedents already set in 
terms of closed-world assumptions.

Noah

[1] http://www.w3.org/TR/xmlschema11-1/#layer1



--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------


"Michael Kay" <mike@saxonica.com>
03/21/2007 01:43 PM
 
        To:     <noah_mendelsohn@us.ibm.com>, "'Pete Cordell'" <petexmldev@tech-know-ware.com>
        cc:     <xmlschema-dev@w3.org>
        Subject:        RE: Permit (greedy) conflicting wildcards


> Nonetheless, it was because we realized that some users would 
> want more help from the content model itself that we are 
> likely to propose the notQName="##defined" 
> construct (which, by the way, is known informally in the 
> workgroup and in some blog postings I think as the "Not In 
> Schema" or NIS wildcard.

The thing I'm having trouble understanding is that this seems to assume a
rather finite and predictable schema. It seems to make the outcome of a
validation episode rather dependent on the set of element declarations that
happen to be lying around in your schema cache. Validation of a document
containing @xml:space will succeed until you install a new version of your
schema processor that has a built-in attribute declaration for @xml:space,
and then it will suddenly start failing because @xml:space is now in "the
schema".

Michael Kay
http://www.saxonica.com/

Received on Wednesday, 21 March 2007 22:44:35 UTC