RE: LC124: Ignore Unknowns, some proposed text from Jonathan Marsh on 2005-06-15 (www-ws-desc@w3.org from June 2005)

From: Jonathan Marsh <jmarsh@microsoft.com>
Date: Wed, 15 Jun 2005 10:36:05 -0700
To: <paul.downey@bt.com>, <dorchard@bea.com>
Cc: <www-ws-desc@w3.org>
Message-ID: <7DA77BF2392448449D094BCEF67569A507E85395@RED-MSG-30.redmond.corp.microsoft.com>

Isn't one of the advantages of Henry's mechanism that the results
are predictable and therefore interoperable?  For instance, given a
schema for the structure:

  <name>
    <first/>
    <last/>
  </name>

And a message instance of:

  <name>
    <first/>
    <last/>
    <last role="alternate-spelling"/>
  </name>

Henry's algorithm should predictably trim one (the last, I think) <last/>
element.  I can imagine an equally reasonable algorithm that trims the
first instead.
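
To make the difference concrete, here is a minimal Python sketch of one
deterministic trimming policy.  It is not Henry's algorithm (which works
from the PSVI); it is only an illustration, and the function name
trim_unexpected and the flat list of expected child names are
assumptions made for this example.  It walks the children left to right
and keeps a child only when it matches the next expected name, so the
second <last/> is the one dropped:

```python
import xml.etree.ElementTree as ET


def trim_unexpected(parent, expected):
    """Greedy left-to-right trim (hypothetical policy, for illustration).

    Keeps a child only if its tag matches the next name in `expected`;
    every other child is removed together with its whole subtree.
    """
    position = 0
    for child in list(parent):  # iterate over a copy while removing
        if position < len(expected) and child.tag == expected[position]:
            position += 1
        else:
            parent.remove(child)
    return parent


message = ET.fromstring(
    '<name><first/><last/><last role="alternate-spelling"/></name>'
)
trimmed = trim_unexpected(message, ["first", "last"])
kept = [(child.tag, dict(child.attrib)) for child in trimmed]
print(kept)
```

A policy that scanned from the right instead would keep the
role="alternate-spelling" element and drop the plain one, which is
exactly the kind of divergence between receivers I am worried about.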

Which actually brings forth another question:  The schema is primarily
used for code generation, and for mapping the data to that code.  Given
the above example, which <last/> will be mapped into code?  Especially
if the following is allowed:

  <name>
    <first/>
    <last/>
    <last role="alternate-spelling"/>
  </name>

Do we need Henry's algorithm in play for extracting data from the
message, as well as for the optional validation step?  That would
introduce a mandatory validation step in all messages, which could be
undesirable.

> -----Original Message-----
> From: paul.downey@bt.com [mailto:paul.downey@bt.com]
> Sent: Wednesday, June 15, 2005 8:31 AM
> To: Jonathan Marsh; dorchard@bea.com; www-ws-desc@w3.org
> Subject: LC124: Ignore Unknowns, some proposed text
> 
> I think, as well as offering alternative syntax, Dave's proposal [1]
> is more tightly coupled to Henry's technique than mine [2].  So
> although Dave may have some different text in mind, here's my
> suggestion for the meaning of 'Ignore Unknowns':
> 
> Paul
> 
> [1] http://lists.w3.org/Archives/Public/www-ws-desc/2005Jun/0016.html
> [2] http://lists.w3.org/Archives/Public/www-ws-desc/2005Jun/0012.html
> 
> 
> The receiver of a message defined by an XML Schema 1.0 element marked
> with an ignoreUnknowns property of 'true' must ignore _unexpected
> items_ when processing the message.
> 
> Such additional, _unexpected items_ may be defined in a different
> version of the schema which may not be known or available to a
> sender, receiver or a third-party observing the message exchange,
> such as an XML Schema 1.0 validator.
> 
> _Unexpected items_ are attributes and elements not defined by the
> schema for a particular element. _Unexpected items_ may appear in any
> namespace, including the targetNamespace of a known schema, as well as
> in a namespace for which no schema is currently known.
> 
> In the case of an unexpected element, it is the entire element tree,
> including any child elements, child attributes and content which
> must be ignored.
> 
> Before checking the validity of a message's contents against an XML
> Schema element marked with an ignoreUnknowns property value of 'true',
> any _unexpected items_ should first be removed from the message.
> How this removal should be achieved is undefined by this
> specification.
> 
> [[
>     Note: A number of different methods of identifying and removing
>     _unexpected items_ exist. One such technique is to apply the XPath
>     "*[pe:validity()='notKnown']" on the Post Schema Validation Infoset
>     (PSVI) produced as a result of XML Schema 1.0 validation.
>     For more information see [XML Schema: Structures]
>     and [some Appendix|Primer|Note|Whatever with a write up of
>     Henry's demo]
> ]]
> 
>

Received on Wednesday, 15 June 2005 17:36:32 UTC