RE: Proposal for Simplifications to the Component Model from Arthur Ryman on 2005-02-09 (www-ws-desc@w3.org from February 2005)

From: Arthur Ryman <ryman@ca.ibm.com>
Date: Wed, 9 Feb 2005 12:57:44 -0500
To: www-ws-desc@w3.org
Message-ID: <OFA8E0B4CB.0E5E01C8-ON85256FA3.005F63E1-85256FA3.0062AAEB@ca.ibm.com>
Asir, 

Thx. See below:

> Related to my question (b) -
> > We define equivalence of each type of element at
> > the infoset level, e.g. two <interface> elements
> > are equivalent if they have the same <operation>
> > children in any order, etc.
>
> <operation> children may or may not be in the same <interface> element
> because of interface inheritance. <operation> children may originate
> from a different document. That is, multiple documents are involved.
> What is information set equivalence if multiple documents are involved
> via WSDL import and include? Apologies if I missed it.

Infoset equivalence takes care of inheritance by comparing the extends 
attributes and then recursively comparing the referenced Interface 
components.
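
To make the recursion concrete, here is a minimal sketch in Python. It is 
only an illustration: InterfaceInfo is a hypothetical stand-in for an 
<interface> element information item, and the defs_a/defs_b lookup tables 
are assumptions, not anything defined by the spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceInfo:
    """Hypothetical stand-in for an <interface> element information item."""
    qname: str
    extends: frozenset = frozenset()      # QNames listed in the extends attribute
    operations: frozenset = frozenset()   # operation names; child order is ignored


def equivalent(a, b, defs_a, defs_b, seen=None):
    """Compare two interface definitions, following extends recursively.

    defs_a and defs_b map QName -> InterfaceInfo for the set of documents
    in which each definition was found.
    """
    seen = set() if seen is None else seen
    if (a.qname, b.qname) in seen:        # pair already under comparison (cycle guard)
        return True
    seen.add((a.qname, b.qname))
    if a.qname != b.qname or a.operations != b.operations or a.extends != b.extends:
        return False
    # Recurse into every referenced interface; a dangling reference is "not equivalent".
    return all(
        q in defs_a and q in defs_b
        and equivalent(defs_a[q], defs_b[q], defs_a, defs_b, seen)
        for q in a.extends
    )
```

Two definitions of B that both extend D then compare as equivalent only 
when the two copies of D reachable from them also compare as equivalent.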

> 
> Related to my question (c) -
> > The spec deals with infosets, no matter where
> > they come from. They can come from parsing an
> > XML document, or calling the DOM API. That's the
> > whole point of Infosets.
>
> I believe that you missed my question. Let me re-state it. Say, I have
> an instance of the WSDL component model. I constructed it via an API
> (definitely not DOM, a higher level API) or User Interface. There
> aren't any information sets or XML documents. In this case, what is
> information set equivalence?

My suggestion is that we eliminate component equivalence by preventing 
duplicate components in the component model. Therefore, if you have a 
component model instance then infoset equivalence is irrelevant. The 
component model has constraints that ensure that you can't have two 
components with the same name. The API should prevent you from creating 
duplicate components.

Think of it this way. If we don't allow duplicate components in the 
component model, then we need to filter out duplicates when we create the 
component model instance from an infoset by testing infosets for 
equivalence. For this to work, infoset equivalence must be practical to 
compute. This means we may need to restrict the semantics of top-level 
extension properties.
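
A minimal sketch of that filtering step, in Python, might look like the 
following. The names build_components and DuplicateDefinitionError are 
hypothetical, and the equivalence test is passed in by the caller because 
it has to be defined separately for each kind of top-level element.

```python
class DuplicateDefinitionError(Exception):
    """Raised when two definitions share a QName but are not equivalent."""


def build_components(definitions, is_equivalent):
    """Return a dict holding exactly one definition per QName.

    definitions: iterable of (qname, infoset) pairs gathered from all
        imported and included documents.
    is_equivalent: caller-supplied predicate on two infosets.
    """
    model = {}
    for qname, infoset in definitions:
        if qname not in model:
            model[qname] = infoset            # first definition is kept
        elif not is_equivalent(model[qname], infoset):
            raise DuplicateDefinitionError(qname)
        # otherwise: an equivalent duplicate, silently filtered out
    return model
```

The resulting model never holds two entries with the same QName, so no 
notion of equivalence is needed once it has been built.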

> 
> Related to my question (d) -
> > We compare the Infoset after the defaults
> > have been applied. This is the Post Schema Validation
> > Infoset. We don't care if an attribute value is
> > actually present or is defined by a default.
>
> Let me bring back the example that I quoted in my previous e-mail, [1]
> "{style}= The set containing the URIs in the actual value of the style
> attribute information item if present, otherwise the set containing the
> URIs in the actual value of the styleDefault attribute information item
> of the [parent] interface element information item if present,
> otherwise empty."
>
> First, Post Schema Validation Information Set cannot help us. This is
> a complex mapping. This is more than simple attribute value
> contribution. Particularly, the phrase "otherwise the set containing
> the URIs in the actual value of the styleDefault attribute information
> item of the [parent] interface element". BTW, there are many such
> complex mapping rules.

In this case, we need to literally compare the infosets. Remember that the 
motivation for introducing component equivalence came primarily from 
diamond inheritance where the identical infoset gets included twice. We 
can also allow the case where a top level component is cut and pasted 
between documents. I don't think we need to allow the case of subtle 
editing changes that produce the same semantics. The point was to detect 
conflicting definitions, not to allow authors subtle freedoms in 
authoring. However, if you feel this is very important then infoset 
equivalence is unworkable and we get a more complex component model.

> 
> Second, the WSDL 2.0 Last Call draft does not mandate XML Schema
> validation of WSDL documents. Right?

That is a processor question and we eliminated the concept of conformant 
processor. I am discussing the semantics of the component model. Some of 
the semantics are expressed by XML Schema rules. All valid WSDL 2.0 
documents must obey those semantics whether or not a particular processor 
does schema validation. 

> 
> That brings back my question (d): How does information set equivalence
> take these default values (and other such constructs) into account?
> Again, these default values are more than simple attribute value
> contribution.

As I said above, things like default attribute values are easy to handle, 
but the more complex WSDL rules are not, e.g. the composition rules for 
Features and Properties (F and P). Those complex rules go beyond XML and 
enter into the domain of 
WSDL, and they therefore require expression in terms of the component 
model. If we want to allow a lot of freedom in authoring duplicate 
components, then we need component equivalence rules. If we just want to 
handle diamond inheritance and cut/paste of top level components, then 
infoset equivalence is enough.
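
The {style} rule quoted above shows the difference. In the Python sketch 
below (an illustration only: namespaces are omitted, the style URI is made 
up, and serialized bytes are a crude stand-in for a literal infoset 
comparison), two definitions differ as infosets yet map to the same 
{style} property.

```python
import xml.etree.ElementTree as ET


def style_property(interface, operation):
    """{style}: the operation's style attribute if present, otherwise the
    parent interface's styleDefault if present, otherwise the empty set."""
    value = operation.get("style")
    if value is None:
        value = interface.get("styleDefault")
    return frozenset(value.split()) if value else frozenset()


def literal(element):
    """Crude stand-in for literal infoset comparison: serialized bytes."""
    return ET.tostring(element)


# style given explicitly on the operation
explicit = ET.fromstring(
    '<interface name="A"><operation name="X" style="urn:example:s"/></interface>'
)
# style supplied through the parent's styleDefault
defaulted = ET.fromstring(
    '<interface name="A" styleDefault="urn:example:s"><operation name="X"/></interface>'
)

same_infoset = literal(explicit) == literal(defaulted)
same_style = (
    style_property(explicit, explicit[0]) == style_property(defaulted, defaulted[0])
)
```

Here same_infoset is False while same_style is True, so a literal 
comparison rejects this pair and only component-level rules would accept 
it.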

> 
> [1] http://www.w3.org/TR/2004/WD-wsdl20-20040803/#InterfaceOperation_Mapping
> 
> Regards,
> Asir S Vedamuthu
> asirv at webmethods dot com
> http://www.webmethods.com/
> 
> -----Original Message-----
> From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org] On
> Behalf Of Arthur Ryman
> Sent: Monday, February 07, 2005 6:15 PM
> To: www-ws-desc@w3.org
> Subject: RE: Proposal for Simplifications to the Component Model
> 
> 
> 
> Asir, 
> 
> My responses are below. 
> 
> > Arthur, 
> > 
> > I have a few questions of clarification here, 
> > 
> > > We should eliminate the concept of component equivalence 
> > > and use infoset equivalence instead 
> > 
> > (a) What is information set equivalence? 
> 
> Infoset equivalence is something we get to define. The idea is that if
> there are two infosets that each contain a definition for the same
> component, then we can look at their element and attribute information
> items and decide if the definitions are the same. I am suggesting we
> resolve duplicate definitions at the infoset level rather than the
> component level.
>
> Our spec describes how to construct a component model from a set of XML
> infosets. We also talk about equivalence of components since the "same"
> component may be defined in more than one infoset, e.g. via include or
> import.
> 
> 
> Now consider an Interface component. Suppose it is defined twice. That
> is OK as long as the two definitions are equivalent. So we have the
> following constraint on the component model:
>
> Rule A: For all Interface components, x, y, if the QName of x is the
> same as the QName of y then x MUST be equivalent to y.
>
> The alternative to the above rule is to detect equivalence in the
> infoset before creating the component model. Why bother having
> equivalent components floating around in the model? Then we get the
> following constraint:
>
> Rule B: For all Interface components, x, y, if the QName of x is the
> same as the QName of y then x MUST be equal to y.
>
> Rule B is stronger. It says that Interface components are uniquely
> identified by their QName within the component model.
> 
> I'd prefer Rule B, but it seems that if we allow arbitrary top level
> extension properties then it is very difficult to compute infoset
> equivalence. For example, suppose we allow a top level extension
> property that says "All bindings defined in this infoset require
> security." Then we can't just look at the infoset of the bindings when
> comparing two infosets. We need to understand the context, which means
> we need to understand the component model. In this case we need Rule A.
> 
> > 
> > (b) "An XML document has an information set" [1], what is 
> > information set equivalence if multiple documents are involved? (via
> > wsdl:import and wsdl:include) 
> 
> We define equivalence of each type of element at the infoset level,
> e.g. two <interface> elements are equivalent if they have the same
> <operation> children in any order, etc.
> 
> > 
> > (c) An instance of the WSDL component model may be constructed by an
> > API, User Interface, or by mapping from a collection of information 
> > sets (via mapping spelled out in WSDL 2.0 spec). If an instance of 
> > the WSDL component model is constructed by means other than mapping 
> > from a collection of information sets, there aren't any real 
> > information sets, that is, there aren't any real XML documents. In 
> > such cases, what is information set equivalence? 
> 
> The spec deals with infosets, no matter where they come from. They can
> come from parsing an XML document, or calling the DOM API. That's the
> whole point of Infosets.
> 
> > 
> > (d) In many instances, in Part 1, mapping brings in default values -
> > example, [2] "{style}= The set containing the URIs in the actual 
> > value of the style attribute information item if present, otherwise 
> > the set containing the URIs in the actual value of the styleDefault 
> > attribute information item of the [parent] interface element 
> > information item if present, otherwise empty." How does information 
> > set equivalence take these default values (and other such 
> > constructs) into account? 
> 
> We compare the Infoset after the defaults have been applied. This is
> the Post Schema Validation Infoset. We don't care if an attribute value
> is actually present or is defined by a default.
>
> But let me say again, there is no general definition of infoset
> equivalence. We have to define it for each of our component types. The
> point is that we filter out equivalent definitions before adding them
> to the component model.
> 
> 
> > 
> > I looked through this thread and did not find answers for these
> > questions.
> >
> > [1] http://www.w3.org/TR/2004/REC-xml-infoset-20040204/#intro
> > [2] http://www.w3.org/TR/2004/WD-wsdl20-20040803/#InterfaceOperation_Mapping
> > 
> > Regards, 
> > Asir S Vedamuthu
> > asirv at webmethods dot com
> > http://www.webmethods.com/ 
> > -----Original Message-----
> > From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org] 
> > On Behalf Of Arthur Ryman
> > Sent: Thursday, January 27, 2005 3:13 AM
> > To: www-ws-desc@w3.org
> > Subject: Proposal for Simplifications to the Component Model
> 
> > 
> > As I mentioned in an earlier note [1], I've hit problems trying to 
> > formally specify some aspects of the component model. These are 
> > related to the interactions between interface inheritance, component
> > equivalence, and extension elements. I'd like to propose some 
> > simplifications here so I can move forward. 
> > 
> > 1. The spec has the notion of component equivalence. This concept 
> > was introduced as a consequence of interface inheritance. The 
> > problem was that we wanted to allow diamond inheritance, e.g.: 
> > 
> > interface A extends B, C; 
> > interface B extends D; 
> > interface C extends D; 
> > 
> > The problem occurs because now it looks like interface A contains 
> > two, potentially conflicting, copies of the operations in D. We 
> > resolved this by saying that if the copy of D acquired via B is 
> > equivalent to the copy of D acquired via C, then all is well. 
> > Otherwise there is an error. The two copies will be equivalent if 
> > they come from the same document, which is the normal case. However,
> > we can't simply compare the URIs used to import or include D because
> > it is possible to have two different URIs resolve to the same 
> > document. We therefore need to compare the contents of the documents. 
> > 
> > The definition of component equivalence is recursive and can be
> > computed bottom-up, i.e. two components are equivalent if all their
> > properties are equivalent. Their properties could be either values
> > or component references. If component references, then apply this
> > definition recursively until you hit just values.
> > 
> > This would be fine if all component properties could be computed 
> > bottom-up. But there are some properties that are computed top-down,
> > e.g. in-scope Property and Features, or inherited Operation or Fault
> > components. Also, some Extension component properties might be like 
> > this. So the definition is a little circular and hard to specify
> > simply.
> > 
> > I'd like to propose a simplification. We should eliminate the 
> > concept of component equivalence and use infoset equivalence 
> > instead. In a sense, the infoset is really where this concept 
> > belongs since it arises from considering how we combine documents. 
> > The component model has no concept of document. It is built up from 
> > the infosets of documents. 
> > 
> > The impact of this change is that as we are building up the 
> > component model, we check to see that duplicate definitions of 
> > components have equivalent infosets. If the infosets differ then we 
> > have an error and we can't create the component model. The infoset 
> > definition is strictly bottom-up and can be computed without 
> > reference to derived component model properties. 
> > 
> > Furthermore, I suggest we apply this notion only to the top level 
> > elements: interface, binding, and service, since they are the 
> > components that are likely to appear more than once either via 
> > import or include or by cut and paste. 
> > 
> > 2. An implication of the above proposal is that we would disallow 
> > "accidental" duplication of operations or faults. For example, the 
> > following situation is disallowed: 
> > 
> > interface A { operation X }; 
> > interface B extends A { operation X}; 
> > 
> > The above is disallowed since operation X is defined in two 
> > different interfaces. This is disallowed even if the contents of 
> > operation (A/X) is identical to operation (B/X). The appearance of X
> > in B is considered to be an accident and an error. 
> > 
> > Similarly, the following is also illegal: 
> > 
> > interface A { operation X}; 
> > interface B { operation X}; 
> > interface C extends A, B; 
> > 
> > A and B may contain operations of the same name, but an error occurs
> > when C extends both of them, even if X is defined identically in 
> > both. Designers must factor common operations into a base interface,
> > e.g.
> > 
> > interface D {operation X}; 
> > interface A extends D {...}; 
> > interface B extends D {...}; 
> > interface C extends A, B; 
> > 
> > The same considerations apply to Fault components. 
> > 
> > An additional motivation for this rule is that now all components
> > have unique URIs. Every component is defined in a unique parent
> > component and we can assign it a URI by building up a path composed
> > of the names of its ancestors. In contrast, if we allowed accidental
> > equivalence, then in the first example, we only have one operation
> > component X, but it has 2 parents (A and B) and therefore 2 URIs:
> > nsuri#wsdl.operation(A/X) and nsuri#wsdl.operation(B/X). And we
> > would really have to compute its derived properties to determine
> > equivalence.
> > 
> > 3. Finally, for this to work, we should only permit extension 
> > elements and attributes in the top level elements: interface, 
> > binding, and service. This means they are disallowed as children of 
> > the root description element. 
> > 
> > The motivation for this is that extensions in the root element are 
> > scoped to the document, but there is no way to capture this scope 
> > within the component model. The only property pushed down from the 
> > document to the top level elements is the targetNamespace attribute 
> > which becomes the namespace name of the QNames of interface, 
> > binding, and service. 
> > 
> > Allowing root level extensions complicates the definition of infoset
> > equivalence of the top level elements since the semantics of the 
> > extensions might alter the meanings of the top level element, i.e. 
> > attach some inherited properties to them. 
> > 
> > The consequence is that if an extension is intended to have document
> > wide scope, then it must be explicitly copied into all the top level
> > elements. However, I am not aware of any such extensions in use today. 

> > 
> > One other pleasant consequence of this rule is that we can have a 
> > deterministic schema that enforces the order of the top level
> > elements, i.e.:
> > 
> > description = 
> >         (import | include) * 
> >         types ? 
> >         (interface | binding | service) * 
> > 
> > This avoids the need to introduce additional elements to enforce 
> > order as I proposed in [2]. 
> > 
> > [1] http://lists.w3.org/Archives/Public/public-ws-desc-comments/2005Jan/0007.html
> > [2] http://lists.w3.org/Archives/Public/public-ws-desc-comments/2005Jan/0006.html
> > 
> > Arthur Ryman,
> > Rational Desktop Tools Development
> > 
> > phone: +1-905-413-3077, TL 969-3077
> > assistant: +1-905-413-2411, TL 969-2411
> > fax: +1-905-413-4920, TL 969-4920
> > mobile: +1-416-939-5063, text: 4169395063@fido.ca
> > intranet: http://labweb.torolab.ibm.com/DRY6/
>
Received on Wednesday, 9 February 2005 17:58:17 UTC