LC124: Comment on V2S and [validity]=notKnown

I reviewed Henry Thompson's presentation [1], which describes the
validate twice with surgery (V2S) algorithm for ignoring additive
content. This algorithm makes certain assumptions about the additive
content, which I think need examination.

The basis of V2S is that a schema processor attempts to validate a
message and in the process makes contributions to the PSVI. The key
contribution is the [validity] property, which can be valid, invalid,
or notKnown. The last of these means that validation has not been
attempted because, e.g., there is no declaration available for the
element.

Consider the following example from Henry's presentation:

<shipTo>
  <ad:name>HM Queen</ad:name>
  <ad:street>Buck House</ad:street>
  <ad:city>London</ad:city>
  <nad:country>UK</nad:country>
</shipTo>

Here it is supposed that the <nad:country> element has been added to
version 2 of the schema for the message. After attempting validation,
the <nad:country> element has [validity]=notKnown, presumably because
the processor only has version 1 of the schema available. The surgery
step chops this element out, and then the message is valid wrt version
1.
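
To make the two passes concrete, here is a minimal sketch of V2S as I
understand it from the presentation. It is not a real schema
processor: the namespace URIs, the SCHEMA_V1 table, and the function
names are all my own assumptions, and a content model is reduced to an
exact sequence of child element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the example above does not give them.
AD = "http://example.org/address"
NAD = "http://example.org/new-address"

# Toy "version 1 schema": declared element name -> exact sequence of
# child element names it allows (empty tuple = no child elements).
SCHEMA_V1 = {
    "shipTo": (f"{{{AD}}}name", f"{{{AD}}}street", f"{{{AD}}}city"),
    f"{{{AD}}}name": (),
    f"{{{AD}}}street": (),
    f"{{{AD}}}city": (),
}

DOC = f"""
<shipTo xmlns:ad="{AD}" xmlns:nad="{NAD}">
  <ad:name>HM Queen</ad:name>
  <ad:street>Buck House</ad:street>
  <ad:city>London</ad:city>
  <nad:country>UK</nad:country>
</shipTo>"""


def assess(elem, schema):
    """Return {element: validity} for elem and all its descendants."""
    result = {}
    for child in elem:
        result.update(assess(child, schema))
    if elem.tag not in schema:
        result[elem] = "notKnown"   # no declaration available
    elif tuple(c.tag for c in elem) == schema[elem.tag]:
        result[elem] = "valid"
    else:
        result[elem] = "invalid"
    return result


def surgery(elem, validity):
    """Remove every descendant whose [validity] is notKnown."""
    for child in list(elem):
        if validity[child] == "notKnown":
            elem.remove(child)
        else:
            surgery(child, validity)


def v2s(root, schema):
    """Validate, cut out the notKnown elements, then validate again."""
    first = assess(root, schema)
    if first[root] == "valid":
        return "valid"
    surgery(root, first)
    return assess(root, schema)[root]


print(v2s(ET.fromstring(DOC), SCHEMA_V1))  # valid
```

In this toy model the first pass marks <nad:country> as notKnown and
<shipTo> as invalid; after surgery the second pass succeeds.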

I am troubled by this approach because it assumes you will only add
elements that are brand new, i.e. are not part of version 1 or any
schema it references.

I work with one customer that defines standard elements for common
data items such as customer ids, account numbers, etc. These are used
as the building blocks of messages. It seems very likely to me that
this customer may want to add one of these predefined elements to an
existing message. Even Henry's example suggests this type of
versioning. Adding something as standard as a country would likely be
done using an existing element declaration. So the element is in fact
known in this case, and the first validation pass would mark it as
[validity]=valid. The enclosing <shipTo> would still be invalid since
<nad:country> is not part of its version 1 content model. The surgery
step finds nothing marked notKnown to remove, so the second pass is
also invalid and V2S fails to give the desired result.
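
The difference between the two situations can be shown with a small
self-contained sketch. Again this is only a toy model under my own
assumptions: an element is a (name, children) pair, a content model is
an exact sequence of child names, surgery is applied one level deep,
and "ad:country" stands in for a hypothetical country element that
version 1 already declares for use elsewhere.

```python
# Toy version 1 schema: declared element name -> exact child sequence.
SCHEMA_V1 = {
    "shipTo": ["ad:name", "ad:street", "ad:city"],
    "ad:name": [],
    "ad:street": [],
    "ad:city": [],
    "ad:country": [],  # assumed: a predefined building block in v1
}


def leaf(name):
    """Build an element with no child elements."""
    return (name, [])


def validity(elem, schema):
    """Return the toy [validity] of a single element."""
    name, children = elem
    if name not in schema:
        return "notKnown"  # no declaration available
    names = [child[0] for child in children]
    return "valid" if names == schema[name] else "invalid"


def v2s(elem, schema):
    """Validate, cut out notKnown children, then validate again."""
    if validity(elem, schema) == "valid":
        return "valid"
    name, children = elem
    kept = [c for c in children if validity(c, schema) != "notKnown"]
    return validity((name, kept), schema)


base = [leaf("ad:name"), leaf("ad:street"), leaf("ad:city")]

# Brand new element: unknown to v1, so surgery removes it.
brand_new = ("shipTo", base + [leaf("nad:country")])
print(v2s(brand_new, SCHEMA_V1))  # valid

# Reused element: known to v1, so it is marked valid and survives.
reused = ("shipTo", base + [leaf("ad:country")])
print(validity(leaf("ad:country"), SCHEMA_V1))  # valid
print(v2s(reused, SCHEMA_V1))  # invalid
```

The only thing that changes between the two messages is whether the
added element already has a declaration that the version 1 processor
can find.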

I think we need to carefully understand the assumptions behind V2S and
decide if they are useful enough in practice to be enshrined in WSDL
2.0.

[1] http://www.markuptechnology.com/XMLEu2004/

Received on Wednesday, 20 July 2005 03:02:00 UTC