RE: LC124: Ignore Unknowns, some proposed text from David Orchard on 2005-06-17 (www-ws-desc@w3.org from June 2005)

From: David Orchard <dorchard@bea.com>
Date: Fri, 17 Jun 2005 10:11:37 -0700
To: "Jonathan Marsh" <jmarsh@microsoft.com>, <paul.downey@bt.com>
Cc: <www-ws-desc@w3.org>
Message-ID: <32D5845A745BFB429CBDBADA57CD41AF106FD842@ussjex01.amer.bea.com>
Jonathan,

IIRC, you were asking about extracting data from an instance when
validation had not been done.  I think the concern was that results
could be different if validation was done versus not done.

I think you have the same problem whether or not 2-pass validation is
done, so the problem is orthogonal to 2-pass validation.  Imagine no
2-pass validation, and a <first/><last role="alternate-spelling"/><last/>
is passed in.  This isn't validated, so the parser can blindly extract
content.  It may or may not get the value you want.
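
To make "may or may not get the value you want" concrete, here is a
minimal sketch using Python's xml.etree.ElementTree.  The text values
("Pat", "Smyth", "Smith") and the two extraction policies are invented
for illustration; they are not taken from any proposal in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: the element text values are made up.
DOC = """
<name>
  <first>Pat</first>
  <last role="alternate-spelling">Smyth</last>
  <last>Smith</last>
</name>
"""

def first_match(root):
    # Policy A: take the first <last> child in document order.
    return root.find("last").text

def last_match(root):
    # Policy B: let each later <last> child overwrite the earlier one.
    value = None
    for el in root.findall("last"):
        value = el.text
    return value

root = ET.fromstring(DOC)
print(first_match(root))
print(last_match(root))
```

Two receivers that each pick one of these equally plausible policies
extract different values from the same unvalidated message.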

There is a 2 by 2 decision matrix:

             | Validation not done | Validation done
  No 2-pass  | unpredictable       | content inaccessible
  2-pass     | unpredictable       | predictable

In either of the "validation not done" cases, you'll get unpredictable
results.  In the case of validation done and no 2-pass, you won't be
able to access the content because it will have failed validation.  In
the case of validation done and 2-pass, you'll be able to access the
content in a predictable way.  

This shows that 2-pass does no harm for the "no validation" case and
increases flexibility in the "validation done" case.  
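
The "validation done" column can be sketched as follows.  This is only
an illustration under stated assumptions: EXPECTED stands in for a real
schema, attributes are not modelled, and the pruning rule (keep the
first child that fits each expected slot, drop the rest) is a made-up
stand-in, not Henry's actual algorithm.

```python
import xml.etree.ElementTree as ET

# Assumed content model of <name>: one <first> followed by one <last>.
EXPECTED = ["first", "last"]

def is_valid(root):
    # Strict validation: child element names must match the model exactly.
    return [child.tag for child in root] == EXPECTED

def prune_unknowns(root):
    # Pass 1 (assumed rule): keep the first child that fits each expected
    # slot, in order, and remove every other child as an unexpected item.
    slot = 0
    for child in list(root):
        if slot < len(EXPECTED) and child.tag == EXPECTED[slot]:
            slot += 1
        else:
            root.remove(child)
    return root

DOC = '<name><first/><last role="alternate-spelling"/><last/></name>'

strict = ET.fromstring(DOC)
print(is_valid(strict))        # validation done, no 2-pass

two_pass = prune_unknowns(ET.fromstring(DOC))
print(is_valid(two_pass))      # validation done, 2-pass
print([child.attrib for child in two_pass])
```

Without the pruning pass the instance fails validation outright.  With
it, every receiver applying the same rule keeps the same <last/>, which
is the predictability being argued for, although which <last/> survives
depends entirely on the rule chosen.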

Cheers,
Dave



> -----Original Message-----
> From: Jonathan Marsh [mailto:jmarsh@microsoft.com]
> Sent: Wednesday, June 15, 2005 10:55 AM
> To: Jonathan Marsh; paul.downey@bt.com; David Orchard
> Cc: www-ws-desc@w3.org
> Subject: RE: LC124: Ignore Unknowns, some proposed text
> 
> Oops, third example was wrong - fixed inline below.
> 
> > -----Original Message-----
> > From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org]
> > On Behalf Of Jonathan Marsh
> > Sent: Wednesday, June 15, 2005 10:36 AM
> > To: paul.downey@bt.com; dorchard@bea.com
> > Cc: www-ws-desc@w3.org
> > Subject: RE: LC124: Ignore Unknowns, some proposed text
> >
> >
> > Isn't one of the advantages of Henry's mechanism that the results
> > are predictable and therefore interoperable?  For instance, given a
> > schema for the structure:
> >
> >   <name>
> >     <first/>
> >     <last/>
> >   </name>
> >
> > And a message instance of:
> >
> >   <name>
> >     <first/>
> >     <last/>
> >     <last role="alternate-spelling"/>
> >   </name>
> >
> > Henry's algorithm should predictably trim one (the last I think)
> > <last/> element.  I can imagine a reasonable algorithm that trims
> > the first.
> >
> > Which actually brings forth another question:  The schema is
> > primarily used for code generation, and for mapping the data to that
> > code.  Given the above example, which <last/> will be mapped to code?
> > Especially if the following is also allowed:
> >
> >   <name>
> >     <first/>
> >     <last role="alternate-spelling"/>
> >     <last/>
> >   </name>
> >
> > Do we need Henry's algorithm in play for extracting data from the
> > message, as well as for the optional validation step?  That would
> > introduce a mandatory validation step in all messages, which could
> > be undesirable.
> >
> > > -----Original Message-----
> > > From: paul.downey@bt.com [mailto:paul.downey@bt.com]
> > > Sent: Wednesday, June 15, 2005 8:31 AM
> > > To: Jonathan Marsh; dorchard@bea.com; www-ws-desc@w3.org
> > > Subject: LC124: Ignore Unknowns, some proposed text
> > >
> > > I think, as well as offering alternative syntax, Dave's proposal [1]
> > > is more tightly coupled to Henry's technique than mine [2].  So
> > > although Dave may have some different text in mind, here's my
> > > suggestion for the meaning of 'Ignore Unknowns'
> > >
> > > Paul
> > >
> > > [1] http://lists.w3.org/Archives/Public/www-ws-desc/2005Jun/0016.html
> > > [2] http://lists.w3.org/Archives/Public/www-ws-desc/2005Jun/0012.html
> > >
> > >
> > > The receiver of a message defined by an XML Schema 1.0 element
> > > marked with an ignoreUnknowns property of 'true' must ignore
> > > _unexpected items_ when processing the message.
> > >
> > > Such additional, _unexpected items_ may be defined in a different
> > > version of the schema which may not be known or available to a
> > > sender, receiver or a third-party observing the message exchange,
> > > such as an XML Schema 1.0 validator.
> > >
> > > _Unexpected items_ are attributes and elements not defined by the
> > > schema for a particular element. _Unexpected items_ may appear in
> > > any namespace including the targetNamespace of a known schema, as
> > > well as in a namespace for which no schema is currently known.
> > >
> > > In the case of an unexpected element, it is the entire element
> > > tree, including any child elements, child attributes and content
> > > which must be ignored.
> > >
> > > Before checking the validity of a message's contents against an
> > > XML Schema element marked with an ignoreUnknowns property value of
> > > 'true', any _unexpected items_ should first be removed from the
> > > message.  How this removal should be achieved is undefined by this
> > > specification.
> > >
> > > [[
> > >     Note: A number of different methods of identifying and removing
> > >     _unexpected items_ exist. One such technique is to apply the
> > >     XPath "*[pe:validity()='notKnown']" on the Post-Schema-Validation
> > >     Infoset (PSVI) produced as a result of XML Schema 1.0 validation.
> > >     For more information see [XML Schema: Structures]
> > >     and [some Appendix|Primer|Note|Whatever with a write up of
> > >     Henry's demo]
> > > ]]
> > >
> > >
> >
Received on Friday, 17 June 2005 17:12:15 UTC