comments on bug 6009 (second installment) from C. M. Sperberg-McQueen on 2009-04-12 (www-xml-schema-comments@w3.org from April to June 2009)

From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
Date: Sun, 12 Apr 2009 09:18:38 -0600
To: John Arwe <johnarwe@us.ibm.com>, www-xml-schema-comments@w3.org
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Message-Id: <7F8B415C-5B2B-46F1-B8B5-E2618491E32B@blackmesatech.com>
Installment two of my responses to John Arwe's comment in bug 6009
(http://www.w3.org/Bugs/Public/show_bug.cgi?id=6009) of 2 September
2008.

The wording proposal at

   http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b6009.html
   (member-only link)

has been updated to include the changes proposed below in addition to
those of the first installment.


--- 13 -------------------------------------------

 > 3.3.5.4 Element Validated by Type

 > "The first (·item isomorphic·) alternative above ..."

 > FYI, could be read as "nearest" or "immediately preceding" rather
 > than your intent (I think) of "first in this section".

 > Short of a link, hard to truly disambiguate.

Impossible to disambiguate, actually: the choice between alternatives
it mentions was removed in 2006.  I've recast the note:

     The [type definition] property is provided for applications such
     as query processors which need access to the full range of details
     about an item's ·assessment·, for example the type hierarchy; the
     [type definition type], [type definition namespace], [type
     definition name], and [type definition anonymous] properties are
     defined for the convenience of those specifying lighterweight
     interfaces, in which exposing the entire type hierarchy and full
     component details might be a significant burden.


--- 14 -------------------------------------------

 > 3.4.2.3 Mapping Rules for Complex Types with Complex Content

 > XML Mapping Summary for Complex Type Definition with complex content
 > Schema Component

 > clause "4.2.1 If the {base type definition} is ... as per clause 4.1
 > above;"

 > I think you need to add "..., substituting extension for
 > restriction;" or the language lawyers will have you

Point taken.  To avoid the on-the-fly substitution, I have suggested
changing it to read "as per clauses 4.1.1 and 4.1.2 above".


--- 15 -------------------------------------------

 > 3.4.4.1 Context-determined Type and Context-determined Type Table

 > "partial functional mapping" may be using a precise term w/o
 > definition, else it seems mushy ("partial" could be 0.0001%)

It is indeed using a precise term; the function is partial because no
type maps every possible element or attribute information item to a
type.  When the text was approved, I think some in the WG thought that
it needed no definition since it is being used in its normal
mathematical sense.  Your doubt suggests that its normal mathematical
sense is either not as widely known as the WG thought, or that its
applicability here is less obvious than was thought.
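
For readers less used to the mathematical term, a minimal sketch of a
partial function may help.  It assumes a hypothetical complex type
whose mapping is written out as a lookup table; the namespace, element
names, and type names are invented for illustration and come from no
real schema.  The mapping is defined for some expanded names and
undefined (None here) for all others.

```python
from typing import Optional, Tuple

# Hypothetical mapping determined by one complex type:
# expanded name (namespace, local name) -> type definition name.
# All names here are made up for illustration.
MAPPING = {
    ("http://example.org/po", "shipTo"): "po:Address",
    ("http://example.org/po", "quantity"): "xs:positiveInteger",
}

def mapped_type(expanded_name: Tuple[str, str]) -> Optional[str]:
    """Return the type for a name in the domain, or None where undefined."""
    return MAPPING.get(expanded_name)

# Inside the domain: the function yields a type.
defined = mapped_type(("http://example.org/po", "shipTo"))

# Outside the domain: the function is simply undefined for this name.
undefined = mapped_type(("http://example.org/po", "comment"))
```

"Partial" says only that the domain is not the set of all possible
items; it says nothing about how large the domain is.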

I don't think introducing a definition is useful; it's not a
schema-specific term, it's not used that often, and as a
non-mathematician I could easily get the definition wrong in painful
ways.

At the same time, I think it's helpful to think of types as
determining these (partial) functions, and I'm loath to rephrase
without mentioning functions, as

     Every Complex Type Definition maps some set of element or
     attribute information items (and their expanded names) to type
     definitions.

 > " This mapping serves as a context-determined type ..." important
 > enough to define, but it's not clear why.  I recognized it was used
 > later when I saw it come up again, but by then I was too tired to
 > compare the two.  Some hint as to the reason this term is important
 > enough to warrant its own name might help it stick better. Just an
 > idea.

 > Same thing occurs in definition of "context-determined type table"
 > further in this section (consistent, good, good)

At the beginning of the section, I've proposed that we add the
paragraph:

     This section defines the concepts of ·locally declared type· and
     ·context-determined type table·; these concepts are appealed to in
     checking that restrictions and extensions of complex types are
     legitimate.  The locally declared type is also used to help
     determine the governing element declaration and governing type
     definition of an element information item.


--- 16 -------------------------------------------

 > 3.7.2 XML Representation of Model Group Definition Schema Components
 > - final parag

 > "The name of this section is slightly misleading,...Also note that
 > ..."

 > Probably should be a Note: paragraph.

Agreed.  Thanks.


--- 17 -------------------------------------------

 > 3.10.4.2 Wildcard allows Expanded Name

 > Validation Rule: Wildcard allows Expanded Name - clause 4

 > The wording of this VR is "odd".  At the top level it starts "...all
 > of the following must be true:"

 > Clause 4 of the ALL is itself an If-Then however.  What does it mean
 > for an If-Then to be true?

 > Does the ALL wrt 4 AND in the result of 4's If condition? It's Then
 > clause? (how does one even AND an "action"?)

This section has just been revised, and I won't propose any
changes on the basis of this comment.

Here and elsewhere the XSD spec assigns to if-then statements in
constraints the conventional truth table for material implication: the
conditional is true if either the antecedent is false, or the
consequent is true, or both; it is false if and only if the antecedent
is true and the consequent is false.

I had not thought it worth while to say this explicitly, but in case
it's helpful we could add a sentence to the paragraph in section 1.5
which begins "Lists of normative constraints":

   If one of the items in a list of constraints has the form "If
   [antecedent] then [consequent]", then the item is taken to be true
   whenever the antecedent is false, or the consequent is true, or
   both; it is false if and only if the antecedent is true and the
   consequent is false.  (This is the truth table conventionally used
   for material implication in sentential logic.)
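
The truth table described above can be written out as a short sketch.
The function name `implies` and the sample list of clauses are
illustrative assumptions, not anything taken from the spec.

```python
from itertools import product

def implies(antecedent: bool, consequent: bool) -> bool:
    """Material implication: false only if antecedent is true and consequent false."""
    return (not antecedent) or consequent

# Full truth table, keyed by (antecedent, consequent).
truth_table = {
    (a, c): implies(a, c) for a, c in product([True, False], repeat=2)
}

# An "all of the following must be true" rule whose last clause is an
# if-then.  When the antecedent is false, that clause counts as true,
# so it does not prevent the whole rule from being satisfied.
clauses = [True, True, True, implies(False, False)]
rule_satisfied = all(clauses)
```

On this reading nothing is "ANDed with an action": the if-then clause
is just one more truth value among the conjuncts.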


--- 18 -------------------------------------------

 > 3.13.4.1 Assertion Satisfied

 > "Note: It is a consequence of this construction that attempts to
 > refer, in an assertion, to ... will be unsuccessful."

 > unsuccessful, meaning: error?  "just" never true?  must ...?

No, not an error.  The expression ancestor::* (for example), or the
expression preceding-sibling::* will return the empty sequence,
because in the data model instance used to evaluate the XPath
expression the context element has no ancestors or preceding siblings.

I propose to add

     Such attempted references are not in themselves errors, but the
     data model instance used to evaluate them does not include any
     representation of any parts of the document outside of E, so
     they cannot be referred to.
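
The effect can be imitated in a rough sketch.  This is not an XPath
2.0 evaluation and not how any XSD processor is known to work; it uses
Python's ElementTree and a hand-built parent map only to show what
"no ancestors in the data model instance" amounts to.  The sample
document and the helper names are assumptions made for illustration.

```python
import copy
import xml.etree.ElementTree as ET

doc = ET.fromstring("<root><a/><E><child/></E></root>")

def parent_map(root):
    """Map each element under root to its parent (root itself has no entry)."""
    return {child: parent for parent in root.iter() for child in parent}

def ancestors(node, parents):
    """Walk the parent map upward, collecting every ancestor of node."""
    found = []
    while node in parents:
        node = parents[node]
        found.append(node)
    return found

# View 1: E as it sits in the whole document; it has an ancestor.
e_in_doc = doc.find("E")
full_view = ancestors(e_in_doc, parent_map(doc))

# View 2: an instance built from E alone, as for assertion evaluation.
# Nothing outside E is represented, so the ancestor walk finds nothing.
e_alone = copy.deepcopy(e_in_doc)
restricted_view = ancestors(e_alone, parent_map(e_alone))
```

In the second view the lookup does not fail; it just yields nothing,
which corresponds to the empty sequence returned by ancestor::*.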

 > Might a processor able to detect this usefully inform its invoker
 > (e.g. via a warning code/msg)?

Not without backdoor communication between the schema validator and
the XPath interpreter.

 > I think the answers are likely "just never true" and "yes, but might
 > not be practical to detect", but that's based on my dealings with
 > 1.1 authors, not this spec.

 > Is there potentially any ability for another spec (like SML V-Next,
 > with its deref()) function to change this behavior without running
 > afoul of 1.1?  "just never true" suggests it might be.

Another spec can certainly define different behavior for the
evaluation of XPath expressions in its constructs; Schematron, for
example, allows XPath expressions to be evaluated against a data model
instance representing the entire document, not just E.  You can't
prescribe different behavior for XSD assertions while retaining XSD
compatibility, though.  Personally, I expect at least some XSD
processors to offer such (strictly speaking non-conformant) behavior
to users who are less impressed than some members of the XML Schema WG
by the beauty of the current design.  But I could be wrong.


--- 19 -------------------------------------------

 > 4 Schemas and Namespaces: Access and Composition

 > I trundled off to find whether "...namely access to one or more
 > schemas" was a sensible statement, given some of the discussions
 > we've had in the SML wg about words like "schema" vs "schema
 > document" and what each means (is there more than one schema for a
 > given assessment episode?  etc.).  I discovered that "schema" is in
 > fact not formally defined, that is, not in the glossary and lacking
 > the usual "[Definition:]" rendering. Seems like a fundamental item
 > to define, and I think you have a serviceable def in 2.1 already:

 > 2.1 Overview of XSD An XSD schema is a set of components such as
 > type definitions and element declarations.

You touch here on two vexed subjects.

Leaving aside the full explanation of why they are vexed -- let it
suffice to say that they involve topics on which the WG is far from
consensus -- I think that the current text is OK on both counts.

If you are creating a schema (singular) to validate a mixed-namespace
document containing (for example) XHTML, MathML, and SVG markup, you
might start from a schema document which does nothing more than import
those three namespaces.  The detailed description of the meaning of
'import' says that this means your processor will go forth into the
world, search for, retrieve, and return with the schemas (plural)
corresponding to the schema documents you pointed at -- or if you
didn't specify a schemaLocation, it will go forth and return with
*some* schemas containing components for those namespaces.
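
A sketch of such a schema document, embedded in a few lines of Python
that list its imports, may make the example concrete.  The three
namespace names are the real ones for XHTML, MathML, and SVG; the
target namespace and the schemaLocation file names are hypothetical.
The third import deliberately omits schemaLocation.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical driver schema document: it declares nothing itself and
# only imports three namespaces.
DRIVER = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/driver">
  <xs:import namespace="http://www.w3.org/1999/xhtml"
             schemaLocation="xhtml.xsd"/>
  <xs:import namespace="http://www.w3.org/1998/Math/MathML"
             schemaLocation="mathml.xsd"/>
  <xs:import namespace="http://www.w3.org/2000/svg"/>
</xs:schema>
"""

root = ET.fromstring(DRIVER)

# (namespace, schemaLocation) for each import; schemaLocation is None
# when the attribute is absent, leaving the processor to locate
# components for that namespace by whatever means it has.
imports = [
    (el.get("namespace"), el.get("schemaLocation"))
    for el in root.findall(f"{{{XS}}}import")
]
```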

It should perhaps be noted that some WG members hold that the
references to schemas other than the one being constructed are really
just figures of speech, not to be taken literally.  But on that
analysis, the reference you quote to schemas in the plural can
similarly be taken as a harmless figure of speech.

The Structures spec does define 'XSD schema' in section 2.2; the term
'schema' elsewhere in the spec generally refers to an XSD schema.



--- 20 -------------------------------------------

 > 4.1 Layer 1: Summary of the Schema-validity Assessment Core - bullet
 > 2

 > "no definition or declaration changes once it has been established;"

 > Someone is going to have to help me reconcile that statement with
 > the existence of redefine and override, which seems to do exactly
 > that.

I think you are imagining the following example: the schema document
you are working from (SD1) has a 'redefine' element pointing to schema
document SD2 and containing a redefinition of type T.  To create a
data structure representing the new T, you must first create one
representing the old T, and then modify it.  How does that fit with
the sentence you quote?

My answer is: the T you care about is in the schema you are
constructing to use for validation.  That T is 'established' just
once.  The process of doing so does require a possibly recursive
process of identifying other schemas (as mentioned above), and
possibly establishing the value of T in one or more of those.  But the
T in the other schema is not necessarily the same as the T in the
schema you are using for validation.
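
One way to picture this reading is the sketch below.  It is only an
illustration of the answer just given, not a description of any agreed
model of schema composition: schemas are reduced to dictionaries, a
type to a tuple of child element names, and the function name
`build_with_redefine` is invented.

```python
# The schema corresponding to SD2: T is established here once.
sd2_schema = {"T": ("a", "b")}

def build_with_redefine(included, redefinitions):
    """Construct a new schema; each component in it is set exactly once."""
    target = {}
    for name, definition in included.items():
        if name in redefinitions:
            # The new T is derived from the old T, but it is a distinct
            # component in a distinct schema; the old T is not modified.
            target[name] = redefinitions[name](definition)
        else:
            target[name] = definition
    return target

# The schema built from SD1, whose redefinition of T adds an element c.
validation_schema = build_with_redefine(
    sd2_schema, {"T": lambda old: old + ("c",)}
)
```

After construction the two schemas each hold their own T, and neither
T has changed since it was established.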

It could be phrased better, perhaps, but only by a WG which has
consensus on a clear story about schema construction and composition
-- i.e. not by this one.



--- 21 -------------------------------------------

 > 4.2.4 <override>

 > "Also, existing XSD processors have implemented conflicting and
 > non-interoperable interpretations of <redefine>, and the <redefine>
 > construct is ·deprecated·. "

 > Circular logic to deprecate <redefine> in this spec and then use
 > that decision to justify the need for <override>.

As I understand it (I speak here for myself), the redefine construct
is deprecated because its specification is contradictory and the WG
has been unable to agree on what it means, or on what it should mean,
or on how to repair it.

Historically, it is not the case that <override> was introduced to
fill a gap left by an earlier deprecation of <redefine>.  The
historical sequence of events was the reverse: the creation of a
plausible proposal for <override> provided an alternative to
<redefine> with roughly similar uses in an application, and thus made
it politically feasible to deprecate <redefine>.

I don't see any obvious alternate wording that would address your
concern, nor any realistic possibility of arriving at consensus in the
XML Schema WG on any alternate wording.

If that means you claim the right to make fun of us at the Tech
Plenary for our circular reasoning, we shall have to grin and bear it.


--- 22 -------------------------------------------

 > 4.2.5.1 Licensing References to Components Across Namespaces

 > "There is no need, for example, to import...HTML..."

 > This may be factually true, but is not the point _why_ this is true?
 > This point has been omitted.  It is not intuitively obvious from the
 > text here, my suspicion would be that Schema for Schemas has
 > processContents=skip at a convenient place.

 > Same question arises in the Example tableau.  The answer might
 > usefully be annotated in the <documentation> as a comment.

Thank you.  Your confusion takes me a little by surprise, and
so I am the more grateful for it.  The more surprising it is, the
more helpful new information it conveys.

I have expanded the sentence you refer to into a pair of notes which
talk about the questions you raise.  The area of schema composition is
full of mines and pitfalls, so there is a more than average chance
that careful members of the WG will find errors in the notes, and that
once they are fitted out with all the required caveats, conditions,
and exceptions the notes will become as opaque and unhelpful as the
formal text, but with a little luck they will still be helpful to a
reader.

Note: the WG has a pending proposal to change the text in this section
in order to resolve issue 5779.  To assist myself and other members of
the WG in navigating, I have included the proposal for 5779 in the
proposal for 6009, marked as non-status-quo text (pale colors instead
of bright colors for the changes).  The use of non-status-quo coloring
here should not be confused with my use of it elsewhere, to mark
changes which you or others may favor as ways to address questions
you raise in bug 6009 but which I do not strongly favor myself.  My
apologies if this double use of nsq coloring is confusing.


--- 23 -------------------------------------------

 > 4.2.5.1 Licensing References to Components Across Namespaces

 > "...prefixes declared with namespace declarations in the normal
 > way..."

 > 1.4 Dependencies on Other Specifications says, in essence, that all
 > 1.1 dependencies on XML Namespaces are via Datatypes, i.e. may be NS
 > 1.0 or 1.1 depending on the datatypes provided by the processor.

 > So "normal way" means according to either NS spec right, Schema has
 > nowhere that it depends on a XMLNS 1.1-only "feature"?

If I understand you correctly, the answer is yes.

The use of XML infoset and XML namespaces in schema documents does, I
believe, constitute a dependency of Structures on the Namespaces and
Infoset recs.  I believe that dependency (among others) is intended to
be covered by the sentence in section 1.4 which reads:

     The definition of XML Schema Definition Language: Structures
     depends on the following specifications: [XML-Infoset],
     [XML-Namespaces 1.1], [XPath 2.0], and [XML Schema: Datatypes].

The relevant mechanisms of the Namespaces rec have not changed between
NS 1.0 and NS 1.1, so the choice of versions to support is not called
out.  (The ability to undeclare a namespace prefix may be thought to
be a change to a "relevant mechanism", but unless I am mistaken
nothing in Structures depends on the ability or inability of an XML
author to undeclare namespace prefixes.  To make the case that the NS
1.0/1.1 difference makes a difference to Structures, it would be
necessary (and probably sufficient) to establish that the change in
the Namespaces rec makes the interpretation of some information sets
unstable or makes their interpretation depend on the choice between NS
1.0 and NS 1.1.  I have not spent time trying to construct a proof one
way or the other, and don't propose to, but off hand I would be
surprised if the NS change were visible at the infoset level.)

In sum: no, I don't think the declaration of namespace prefixes "in
the normal way" represents a namespace 1.0/1.1 issue that dependent
specs like SML need to worry about or address.

I have not proposed any wording change to the spec to address this
point.

...

Thank you again for your careful reading.

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Sunday, 12 April 2009 15:19:24 UTC