[Bug 5157] 3.4.2 example unclear

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5157


cmsmcq@w3.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needsAgreement
  Status Whiteboard|thimble, work               |medium, work, clarification
                   |                            |cluster




------- Comment #1 from cmsmcq@w3.org  2008-01-04 00:49 -------
Thank you for the comment.  The example and its accompanying text
(in section 3.4.2) clearly need to be revised to address the
obscurities and confusions you identify in it.

You ask 

    Is the definition of notQname telling me that it violates the
    partitioning of symbol spaces by treating both the local and
    global as "the same"?  Is it telling me that the local def of
    speaker would still be allowed because the wildcard notQname
    would match ONLY the GED?

It's not immediately clear to me how best to change the current text
to deal with these questions.  The notQName attribute on (the source
declaration of) a wildcard identifies a set of expanded names, with
the meaning that elements bearing those expanded names cannot be
attributed to / matched up with the wildcard; I'm experiencing
difficulty formulating any sentence describing the relation of the
QNames listed in the attribute to the concept of symbol spaces,
because they seem so fundamentally unconnected to me as to verge on
the non-comparable.  (This is not helpful, I realize, as an answer
to the question "How do the QNames in the notQName attribute relate
to the names in the various symbol spaces?", which I take to be
implicit in your comment.  I'm going to have to work on it.  It's
possible that the analysis of content-model matching would be
cleaner if we did bring the notion of symbol spaces to bear more
explicitly on the issue of matching input elements to wildcards.)
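
For concreteness, a wildcard carrying notQName might look like the
sketch below.  This is an illustration only, not the text of the
example in 3.4.2: it assumes that no default namespace is in scope,
so that the unprefixed QName 'speaker' resolves to an expanded name
in no namespace.

 <!-- Hypothetical fragment: matches any element whose expanded
      name is not 'speaker' -->
 <xs:any processContents="lax" notQName="speaker"/>

Nothing in the attribute says whether 'speaker' is meant as the name
of a local or of a global declaration; it is just an expanded name.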

But with the caveat that it doesn't seem natural to say it this way,
I suppose the answer to the first question you ask explicitly is
"yes, the expanded names associated with a wildcard are not
associated a priori with any particular symbol space", even though
when elements match a wildcard, governing declarations are sought
only among top-level (global) element declarations.

I don't understand the second question well enough to venture an
answer.

By way of trying to establish some common ground, though, consider
the alternative restriction

 <xs:complexType name="computer2">
  <xs:complexContent>
   <xs:restriction base="computer">
    <xs:all>
     <xs:element name="CPU"/>
     <xs:element name="memory"/>
     <xs:element name="monitor"/>
     <!-- Any additional information about the computer -->
     <xs:any processContents="lax"/>
    </xs:all>
   </xs:restriction>
  </xs:complexContent>
 </xs:complexType>

Like the base type 'computer', this restriction 'computer2' accepts
the sequence

  <CPU/><memory/><monitor/><speaker/>

and binds the first three elements to the local element declarations
for the elements of those names.  It does not bind the fourth child
to any element declaration at all.  (The base type 'computer', by
contrast, binds all four elements to the local declarations scoped
to /type::computer.)

Two observations about this alternative restriction seem relevant:

  (1) It is unlikely to satisfy a schema author who wants
      instances of type 'computer2' not to have 'speaker' elements.
      That's why we added the example we're discussing: to show how
      to get rid of local elements like 'speaker'.

  (2) It provides less information about the instance than does
      the base type.  (The base type, to be sure, assigns the
      'speaker' element to xs:anyType, so the restriction does not
      accept instances the base type would not have accepted.)  But
      instead of a binding to a local element declaration, we get no
      binding at all.  I hope it's clear that in some significant
      sense, 'computer2' provides less information than 'computer'.
      The type 'computer' thus fails to 'subsume' the type
      'computer2' in the fundamental sense: No thing X can subsume
      any thing Y if X contains / conveys / captures more
      information than Y does.  (The specific technical definition
      of subsumption we offer is merely an attempt to operationalize
      that basic concept for purposes of XSDL.)

On the question of default bindings, you are quite right that any
'speaker' element appearing as a child of an element with type
'computer' will have xs:anyType as its declared type definition, and
it's natural to ask, in that case, why it doesn't subsume a default
binding of xs:anyType.

There are two problems here.  

First, I fear that you have been betrayed by the plausible
assumption that 'default binding' denotes the declared type
definition of whatever element declaration governs an element
instance.  Plausible assumption, but not the case: the term 'default
binding' denotes something else, the possible values of which are a
confusing mishmash of element declarations, attribute declaration +
optional value constraint pairs, or the keywords 'strict', 'lax',
and 'skip'.

If I could think of a way to make the concept less ad hoc, I would
propose it in a heartbeat.  But thus far, I have failed to make much
of a dent.

Second, the text is faulty.  The definition of 'default binding',
whatever its flaws, does at least make clear that 'xs:anyType' is
not a possible value for the default binding of anything.  (The
prose accompanying the example reflects an earlier state of the
spec, in which the concept now called 'default binding' was referred
to as 'Test[ES,P]' and xsd:anyType figured among its possible
values.)

Not only is xsd:anyType not a possible default binding, but the type
'quietComputer' does not in fact provide any default binding at all
for elements named 'speaker': complex types provide default bindings
for children only if the sequence of children is locally valid against
the type.  No sequence of children containing a 'speaker' element is
locally valid against 'quietComputer', so no such sequence gets
default bindings.

It's not clear to me, at this point, whether the prose is correct to
say "if there is a top-level declaration for 'speaker', the
restriction is valid, but if there isn't one, it's not valid".
Certainly the reasoning given is bogus.

The relevant "if X, then ..., otherwise ..." construct may be:

  If the notQName attribute is supplied as shown, the restriction is
  valid.  If it were omitted, then the restriction might be valid or
  invalid: invalid if there is no top-level declaration for
  'speaker', because then the default bindings for 'speaker'
  elements in the input would be the local element declaration (for
  'computer') and the keyword 'lax' (for 'quietComputer'); valid if
  there is a top-level declaration for 'speaker' that is subsumed by
  the local declaration in 'computer' (see clause 4 of the
  definition of 'subsume').  

In practice that means the restriction would be valid unless the
top-level declaration for 'speaker' had identity constraints,
disallowed substitutions, or a type whose derivation from
xsd:anyType involves any extension steps, list construction, or
union construction.  Nillability and value constraints can also make
restrictions like this one invalid, but it's clear from what is
shown in the example that they cannot affect the validity of this
restriction.
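
To illustrate the type condition, here are two hypothetical
alternatives for a top-level declaration of 'speaker'.  Neither
appears in the spec; the names 'speakerType' and 'watts' are invented
for this sketch, and the verdicts hold only if the reading given
above is right.

 <!-- Alternative A (hypothetical): xs:string is reached from
      xs:anyType by restriction steps only, so a restriction
      without notQName would, on this reading, be valid. -->
 <xs:element name="speaker" type="xs:string"/>

 <!-- Alternative B (hypothetical): speakerType is derived by
      extension, so a restriction without notQName would, on this
      reading, be invalid. -->
 <xs:element name="speaker" type="speakerType"/>
 <xs:complexType name="speakerType">
  <xs:simpleContent>
   <xs:extension base="xs:string">
    <xs:attribute name="watts" type="xs:decimal"/>
   </xs:extension>
  </xs:simpleContent>
 </xs:complexType>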

The upshot is that although on the face of it the comment here looks
as if it ought to be editorial, the technical details are
problematic enough that this should probably be discussed by the
Schema WG.  So I'm marking this needsAgreement, rather than
editorial.

(While I'm here, I should point out that the example seems slightly
confusing to me, because as a reader I can't help thinking the
wildcard really is not very useful unless it has maxOccurs =
'unbounded'.  The use of xsd:anyType in the local declarations also
distracts me as a reader: I keep thinking "surely no one in their
right mind would do it that way!")

Received on Friday, 4 January 2008 00:49:10 UTC