RE: Proposal for Simplifications to the Component Model from Arthur Ryman on 2005-02-07 (www-ws-desc@w3.org from February 2005)

From: Arthur Ryman <ryman@ca.ibm.com>
Date: Mon, 7 Feb 2005 18:14:55 -0500
To: www-ws-desc@w3.org
Message-ID: <OF8B56F45B.3E0F4F2A-ON85256FA1.00799DF6-85256FA1.007FB485@ca.ibm.com>
Asir,

My responses are below.

> Arthur,
> 
> I have a few questions of clarification here,
> 
> > We should eliminate the concept of component equivalence 
> > and use infoset equivalence instead
> 
> (a) What is information set equivalence?

Infoset equivalence is something we get to define. The idea is that if 
there are two infosets that each contain a definition for the same 
component, then we can look at their element and attribute information 
items and decide if the definitions are the same. I am suggesting we 
resolve duplicate definitions at the infoset level rather than the 
component level.

Our spec describes how to construct a component model from a set of XML 
infosets. We also talk about equivalence of components since the "same" 
component may be defined in more than one infoset, e.g. via include, 
import.

Now consider an Interface component. Suppose it is defined twice. That is 
OK as long as the two definitions are equivalent. So we have the following 
constraint on the component model:

Rule A: For all Interface components, x, y, if the QName of x is the same 
as the QName of y then x MUST be equivalent to y.

The alternative to the above rule is to detect equivalence in the infoset 
before creating the component model. Why bother having equivalent 
components floating around in the model? Then we get the following 
constraint:

Rule B: For all Interface components, x, y, if the QName of x is the same 
as the QName of y then x MUST be equal to y.

Rule B is stronger. It says that Interface components are uniquely 
identified by their QName within the component model.
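
To make the difference concrete, here is a minimal Python sketch of building a model under Rule B. It is only an illustration: the InterfaceDef type, the build_model function, and the reduction of an interface to a set of operation names are assumptions made for this example, not anything taken from the spec.

```python
from typing import Dict, FrozenSet, Iterable, NamedTuple, Tuple

QName = Tuple[str, str]  # (namespace name, local name)


class InterfaceDef(NamedTuple):
    """Hypothetical, simplified stand-in for one <interface> definition."""
    qname: QName
    operations: FrozenSet[str]  # operation names only, order-insensitive


def build_model(definitions: Iterable[InterfaceDef]) -> Dict[QName, InterfaceDef]:
    """Keep at most one Interface component per QName (Rule B).

    Duplicate definitions are compared before they reach the model.
    Equivalent duplicates are dropped; conflicting ones are an error.
    """
    model: Dict[QName, InterfaceDef] = {}
    for definition in definitions:
        existing = model.get(definition.qname)
        if existing is None:
            model[definition.qname] = definition
        elif existing != definition:
            raise ValueError(f"conflicting definitions for {definition.qname}")
        # Otherwise the duplicate is equivalent and is simply not added.
    return model
```

Under Rule A the model would instead hold both copies and carry the equivalence constraint itself.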

I'd prefer Rule B, but it seems that if we allow arbitrary top level 
extension properties then it is very difficult to compute infoset 
equivalence. For example, suppose we allow a top level extension property 
that says "All bindings defined in this infoset require security." Then we 
cannot just look at the infoset of the bindings when comparing two infosets. 
We need to understand the context, which means we need to understand the 
component model. In this case we need Rule A.

> 
> (b) "An XML document has an information set" [1], what is 
> information set equivalence if multiple documents are involved? (via
> wsdl:import and wsdl:include)

We define equivalence of each type of element at the infoset level, e.g. 
two <interface> elements are equivalent if they have the same <operation> 
children in any order, etc.
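
As a rough sketch of what such a per-element definition could look like, the following Python compares two interface elements while ignoring the order of their operation children. The function names are hypothetical, the elements are left un-namespaced for brevity, and comparing only the name attribute and the attributes of each operation is an assumption; the real definition is whatever we decide to write into the spec.

```python
import xml.etree.ElementTree as ET


def canonical_operation(op):
    # Sort the attributes so that attribute order does not matter.
    return tuple(sorted(op.attrib.items()))


def interfaces_equivalent(a, b):
    """Order-insensitive comparison of two <interface> elements."""
    if a.get("name") != b.get("name"):
        return False
    ops_a = sorted(canonical_operation(o) for o in a.findall("operation"))
    ops_b = sorted(canonical_operation(o) for o in b.findall("operation"))
    return ops_a == ops_b


first = ET.fromstring(
    '<interface name="A"><operation name="X"/><operation name="Y"/></interface>'
)
second = ET.fromstring(
    '<interface name="A"><operation name="Y"/><operation name="X"/></interface>'
)
third = ET.fromstring(
    '<interface name="A"><operation name="X"/><operation name="Z"/></interface>'
)
```

Here first and second differ only in child order, while third defines a different operation.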

> 
> (c) An instance of the WSDL component model may be constructed by an
> API, User Interface, or by mapping from a collection of information 
> sets (via mapping spelled out in WSDL 20 spec). If an instance of 
> the WSDL component model is constructed by means other than mapping 
> from a collection of information sets, there aren't any real 
> information sets, that is, there aren't any real XML documents. In 
> such cases, what is information set equivalence?

The spec deals with infosets, no matter where they come from. They can 
come from parsing an XML document, or calling the DOM API. That's the 
whole point of Infosets.

> 
> (d) In many instances, in Part 1, mapping brings in default values -
> example, [2] "{style}= The set containing the URIs in the actual 
> value of the style attribute information item if present, otherwise 
> the set containing the URIs in the actual value of the styleDefault 
> attribute information item of the [parent] interface element 
> information item if present, otherwise empty." How does information 
> set equivalence take these default values (and other such 
> constructs) into account?

We compare the infosets after the defaults have been applied, i.e. the Post 
Schema Validation Infoset. We don't care whether an attribute value is 
actually present or is supplied by a default.
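
For the {style} example quoted above, a sketch of "apply the defaults, then compare" might look like this in Python. The effective_style helper is hypothetical, the elements are un-namespaced for brevity, and treating the attribute value as a whitespace-separated list of URIs is an assumption of this example.

```python
import xml.etree.ElementTree as ET


def effective_style(operation, interface):
    """Return the set of style URIs after defaulting.

    Uses the operation's own style attribute if present, otherwise the
    styleDefault attribute of the parent interface, otherwise the empty set.
    """
    raw = operation.get("style")
    if raw is None:
        raw = interface.get("styleDefault")
    return frozenset(raw.split()) if raw else frozenset()


# One definition states the style explicitly on the operation ...
explicit = ET.fromstring(
    '<interface name="A"><operation name="X" style="urn:example:s"/></interface>'
)
# ... the other relies on the default supplied by the interface.
defaulted = ET.fromstring(
    '<interface name="A" styleDefault="urn:example:s"><operation name="X"/></interface>'
)
```

Comparing effective_style for the two operations treats them as the same, even though only one of them carries the attribute.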

But let me say again, there is no general definition of infoset 
equivalence. We have to define it for each of our component types. The 
point is that we filter out equivalent definitions before adding them to 
the component model.


> 
> I looked through this thread and did not find answers for these 
> questions.
> 
> [1] http://www.w3.org/TR/2004/REC-xml-infoset-20040204/#intro
> [2] http://www.w3.org/TR/2004/WD-wsdl20-20040803/#InterfaceOperation_Mapping
> 
> Regards,
> Asir S Vedamuthu
> asirv at webmethods dot com
> http://www.webmethods.com/ 
> -----Original Message-----
> From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org] 
> On Behalf Of Arthur Ryman
> Sent: Thursday, January 27, 2005 3:13 AM
> To: www-ws-desc@w3.org
> Subject: Proposal for Simplifications to the Component Model

> 
> As I mentioned in an earlier note [1], I've hit problems trying to 
> formally specify some aspects of the component model. These are 
> related to the interactions between interface inheritance, component
> equivalence, and extension elements. I'd like to propose some 
> simplifications here so I can move forward. 
> 
> 1. The spec has the notion of component equivalence. This concept 
> was introduced as a consequence of interface inheritance. The 
> problem was that we wanted to allow diamond inheritance, eg: 
> 
> interface A extends B, C; 
> interface B extends D; 
> interface C extends D; 
> 
> The problem occurs because now it looks like interface A contains 
> two, potentially conflicting, copies of the operations in D. We 
> resolved this by saying that if the copy of D acquired via B is 
> equivalent to the copy of D acquired via C, then all is well. 
> Otherwise there is an error. The two copies will be equivalent if 
> they come from the same document, which is the normal case. However,
> we can't simply compare the URIs used to import or include D because
> it is possible to have two different URIs resolve to the same 
> document. We therefore need to compare the contents of the documents. 
> 
> The definition of component equivalence is recursive and can be 
> computed bottom-up, i.e. two components are equivalent if all their 
> properties are equivalent. Their properties could be either values 
> or component references. If component references, then apply this 
> definition recursively until you hit just values. 
> 
> This would be fine if all component properties could be computed 
> bottom-up. But there are some properties that are computed top-down,
> e.g. in-scope Property and Features, or inherited Operation or Fault
> components. Also, some Extension component properties might be like 
> this. So the definition is a little circular and hard to specify simply. 

> 
> I'd like to propose a simplification. We should eliminate the 
> concept of component equivalence and use infoset equivalence 
> instead. In a sense, the infoset is really where this concept 
> belongs since it arises from considering how we combine documents. 
> The component model has no concept of document. It is built up from 
> the infosets of documents. 
> 
> The impact of this change is that as we are building up the 
> component model, we check to see that duplicate definitions of 
> components have equivalent infosets. If the infosets differ then we 
> have an error and we can't create the component model. The infoset 
> definition is strictly bottom-up and can be computed without 
> reference to derived component model properties. 
> 
> Furthermore, I suggest we apply this notion only to the top level 
> elements: interface, binding, and service, since they are the 
> components that are likely to appear more than once either via 
> import or include or by cut and paste. 
> 
> 2. An implication of the above proposal is that we would disallow 
> "accidental" duplication of operations or faults. For example, the 
> following situation is disallowed: 
> 
> interface A { operation X }; 
> interface B extends A { operation X}; 
> 
> The above is disallowed since operation X is defined in two 
> different interfaces. This is disallowed even if the contents of 
> operation (A/X) is identical to operation (B/X). The appearance of X
> in B is considered to be an accident and an error. 
> 
> Similarly, the following is also illegal: 
> 
> interface A { operation X}; 
> interface B { operation X}; 
> interface C extends A, B; 
> 
> A and B may contain operations of the same name, but an error occurs
> when C extends both of them, even if X is defined identically in 
> both. Designers must factor common operations into a base interface, 
> e.g. 
> 
> interface D {operation X}; 
> interface A {...}; 
> interface B {...}; 
> interface C extends A, B; 
> 
> The same considerations apply to Fault components. 
> 
> An additional motivation for this rule is that now all components 
> have unique URIs. Every component is defined in a unique parent 
> component and we can assign it a URI by building up a path composed 
> of the names of its ancestors. In contrast, if we allowed accidental
> equivalence, then in the first example, we only have one operation 
> component X, but it has 2 parents (A and B) and therefore 2 URIs: 
> nsuri#wsdl.operation(A/X) and nsuri#wsdl.operation(B/X). And we 
> would really have to compute its derived properties to determine 
> equivalence.
> 
> 3. Finally, for this to work, we should only permit extension 
> elements and attributes in the top level elements: interface, 
> binding, and service. This means they are disallowed as children of 
> the root description element. 
> 
> The motivation for this is that extensions in the root element are 
> scoped to the document, but there is no way to capture this scope 
> within the component model. The only property pushed down from the 
> document to the top level elements is the targetNamespace attribute 
> which becomes the namespace name of the QNames of interface, 
> binding, and service. 
> 
> Allowing root level extensions complicates the definition of infoset
> equivalence of the top level elements since the semantics of the 
> extensions might alter the meanings of the top level element, i.e. 
> attach some inherited properties to them. 
> 
> The consequence is that if an extension is intended to have document
> wide scope, then it must be explicitly copied into all the top level
> elements. However, I am not aware of any such extensions in use today. 
> 
> One other pleasant consequence of this rule is that we can have a 
> deterministic schema that enforces the order of the top level elements, 
> i.e.:
> 
> description = 
>         (import | include) * 
>         types ? 
>         (interface | binding | service) * 
> 
> This avoids the need to introduce additional elements to enforce 
> order as I proposed in [2]. 
> 
> [1] http://lists.w3.org/Archives/Public/public-ws-desc-comments/2005Jan/0007.html 
> [2] http://lists.w3.org/Archives/Public/public-ws-desc-comments/2005Jan/0006.html 
> 
> Arthur Ryman,
> Rational Desktop Tools Development
> 
> phone: +1-905-413-3077, TL 969-3077
> assistant: +1-905-413-2411, TL 969-2411
> fax: +1-905-413-4920, TL 969-4920
> mobile: +1-416-939-5063, text: 4169395063@fido.ca
> intranet: http://labweb.torolab.ibm.com/DRY6/
Received on Monday, 7 February 2005 23:15:26 UTC