Comments - WSDL 2.0 Core from Mark Nottingham on 2004-05-18 (www-ws-desc@w3.org from May 2004)

From: Mark Nottingham <mark.nottingham@bea.com>
Date: Tue, 18 May 2004 16:31:25 -0700
To: www-ws-desc@w3.org
Message-Id: <7AFFAC9B-A923-11D8-9806-000A95BD86C0@bea.com>
I've just finished a first detailed reading of the WSDL 2.0 Core 
specification (located at 
http://www.w3.org/TR/2004/WD-wsdl20-20040326/), and would like to share 
some comments.

Although I've tracked the WG's work from a distance and looked at 
specific parts of the specifications-in-progress, this is the first 
time I've done a comprehensive review; therefore, my comments may be 
informed by the benefit of a new pair of eyes looking at the problem 
(or they may just bring up existing, or better left undisturbed, 
issues, for which I apologise in advance).

Generally, I'd like to congratulate the Working Group on this 
specification; the document as it stands seems very solid, and is 
obviously the product of a lot of hard work. In comparison with the 
WSDL 1.1 document, it is a great leap forward in clarity and precision, 
and I'm confident that it will see good adoption in the market because 
of this.

That said, here are some issues I came across:

* Substantive Comments (and Suggestions)
----------------------

- 2.1.1: "Type system components are element declarations drawn from 
some type system. They define the [local name], [namespace name], 
[children] and [attributes] properties of an element information item."

This effectively limits web services to an Infoset data model. However, 
in other places the document indicates that other data models are 
allowed by WSDL; which is it? E.g., in 2.5.1: "If a non-XML type system 
is in use (as considered in 3.2 Using Other Schema Languages) then 
additional properties would need to be added to the Message Reference 
Component (along with extensibility attributes to its XML 
representation) to allow associating such message types with the 
message reference." What is the benefit of this approach, instead of 
just using a more neutral reference mechanism (i.e., "content" or "ref" 
instead of "element") to determine the type of a message?

To me, this is a base requirement for WSDL; a good proportion of the 
content on the Web is NOT most naturally expressed or modeled as an XML 
Infoset, and even WSDL itself has chosen to specify its data model in a 
layer above the Infoset (the "Component Model"). Why should the 
messages WSDL describes be confined to the Infoset, or prejudiced by 
XMLisms like "element"?

I suggest removing any language (such as that above) that requires 
messages to be modelled as Infosets, and changing attributes named 
"element" (and similar) to "content" or another, more neutral term. I 
do not believe that this is a large change, nor is it one that will 
impact existing implementations greatly, but it will provide great 
benefit to the Web.

- 2.1.1: The component model is not well-defined; nowhere is it said 
that components have properties, nor are their aspects explained, 
and the {} notation's significance is not documented. I suggest adding 
a section detailing the principles and notation of the component model.

- 2.1.1: "imported/included" - There needs to be a reference to these 
processes. Also, the definition of how to arrive at a component model 
and how to interpret it are intertwined; while it makes sense to 
specify the semantics and mapping of individual components together, 
the separation of the import and include functionalities is awkward. I 
suggest that the import/include mechanism be documented along with the 
(expanded) definition of the component model, rather than after the use 
of that model. An explicit processing model could also be documented 
there, whereby one can deterministically convert an Infoset into a 
component model.

- 2.1.1: "an unambiguous name for the intended semantics of the 
components." -> "an unambiguous name *space* for the intended semantics 
of the components." (the namespace isn't used as a name on its own, is 
it?)

- 2.1.1: Why are QNames, rather than URIs, used to identify components? 
If there are good reasons for not using the primary identification 
mechanism in the Web, they should be documented here, along with 
caveats as to their use (e.g., if signing content, etc). If not, URIs 
should be used.

- 2.2.1: What are the semantics of interface extension? E.g., how are 
duplicate operations in the set handled? This is mentioned in a few 
places, but not comprehensively documented.

- Table 2-2 (and elsewhere): What is an "actual value"? Does this imply 
that it is not the [normalized value]?

- 2.3.1: Why is it advantageous to define a fault at the Interface 
level, if it's just repeating information in the operations? I suggest 
either removing this functionality or better motivating it.

- 2.4.2.3: The style attribute has a very loose semantic; it seems 
purpose-built for RPC, and therefore is effectively yet another 
extensibility mechanism. Also, it is readily imaginable for an 
operation to have more than one style; e.g., RPC as well as 
web:method="POST" semantics. Therefore, it needs to be able to carry 
multiple values; while this could be accommodated by making the value a 
list of URIs, I suggest it would be better to define this as an 
rpc-specific attribute with a boolean value (e.g., ext:rpc="1").

- 2.4.4: This section implies that you MUST define your messages in XML 
Schema to use RPC style; such a restriction is not necessary, as long 
as it is functionally equivalent. I suggest rewriting to the effect 
that other message definitions are allowed, as long as they are 
functionally equivalent.

- 2.4.4: "hence the rules which refer to the output element do not 
apply." Read literally, this has the (unintended?) effect of obviating 
the first rule.

- 2.8: The term "properties" is used throughout to denote a part of the 
component model; this section redefines it as something similar but 
different. Suggest using a distinguished term (perhaps "attributes"?).

- 2.9.1: In many places in the spec, the semantics and constraints on 
component properties (e.g., optionality) are described in the Infoset 
mappings, rather than in the properties themselves. For clarity and 
applicability to other mappings, it would be better to place them at 
the component model level. I suggest expanding the content model of 
each component property in the property lists, and removing redundant 
syntactic constraints from the infoset mappings.

- 2.11.2: "A REQUIRED ref attribute information item" - this requires 
all binding operations to refer to corresponding interface operations, 
despite earlier indications in 2.9.1 that bindings could be specified 
generically "across all operations of an interface." If that is true, 
how should one do so? I suggest that this requirement be dropped, and 
guidance be given on specifying generic operations.

- 2.12.1: It seems wasteful to duplicate the interface message into the 
binding if there is no additional information therein. Can it be 
omitted with no effect in this case? I.e., the specified properties 
only serve to identify the message, not to affect the concrete 
representation of it; it should be explicitly stated that the absence 
of those properties has no effect on the interpretation of the 
description.

- 2.15: "Two components of the same type are considered equivalent if, 
for each property, the value in the first component is the same as the 
value in the second component." Are simple values compared 
character-by-character? Is any schema information (e.g., defaulting, 
for canonicalisation) necessary? How are sets compared? Will this work 
for Properties (which have an associated value)?

- A.1: The "fragment identifiers" section of the media type 
registration needs to list the mechanism described in C.2.
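
To make the 2.15 questions above concrete, here is a minimal Python sketch of a property-by-property comparison. It is an illustration only, not the specification's algorithm: representing a component as a dict of property values, the property names used, and the `normalize` helper are all assumptions made for this example.

```python
from typing import Any, Mapping


def normalize(value: Any) -> Any:
    """Hypothetical normalisation: collapse whitespace in strings, recurse into sets."""
    if isinstance(value, str):
        return " ".join(value.split())
    if isinstance(value, (set, frozenset)):
        return frozenset(normalize(v) for v in value)
    return value


def equivalent(a: Mapping[str, Any], b: Mapping[str, Any], *, normalized: bool = False) -> bool:
    """Compare two components of the same type, one property at a time."""
    if a.keys() != b.keys():
        return False
    if normalized:
        return all(normalize(a[k]) == normalize(b[k]) for k in a)
    # Strict mode: strings compare character by character, sets ignore order.
    return all(a[k] == b[k] for k in a)


# Two made-up components that differ only in whitespace and set ordering.
first = {
    "name": "getQuote",
    "target namespace": "http://example.org/stock",
    "features": {"urn:example:a", "urn:example:b"},
}
second = {
    "name": " getQuote ",
    "target namespace": "http://example.org/stock",
    "features": {"urn:example:b", "urn:example:a"},
}

print(equivalent(first, second))                   # strict comparison
print(equivalent(first, second, normalized=True))  # whitespace-normalised comparison
```

The two calls give different answers for the same pair of components, which is the ambiguity the comment asks the specification to resolve.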


* Editorial Comments
--------------------

- 2.1.2: Is the pseudo-schema normative? Where are its vocabulary and 
rules explained?

- 2.2.1: "The interfaces a given interface extends MUST NOT themselves 
extend that interface either directly or indirectly." What does "that" 
refer to? (would be good to mention recursion).

- 2.2.2.3: There needs to be a description of, or references to, the 
properties here (e.g., {message references})

- 2.3.1: "execution of an operation of the interface." -> "execution of 
*any* operation of the interface." ?

- 2.3.1: "The reason... is because that" Poor English.

- 2.3.1: "If a non-XML type system is in use... then additional 
properties would need to be added..." Poor English.

- 2.3.1: "...to allow associating such.." Poor English.

- 2.3.1: "to allow associating such message types with the message 
reference" Shouldn't that be *fault* reference?

- 2.5.1: "A Message Reference component associates to a message 
exchanged in an operation an XML element declaration that specifies 
its message content." Tortured English.

- 2.5.1: "Message Reference components are identified by the role the 
message plays in the {message exchange pattern} that the operation is 
using. That is, a message exchange pattern defines a set of 
placeholder messages that participate in the pattern and assigns them 
unique names within the pattern." What does this mean? This passage is 
*very* confusing.

- 2.5.1: "element" is used often, but not defined; is this Element 
Information Item?

- 2.6.1: "A Fault Reference component associates a Fault component that 
defines the fault message type for a fault that occurs related to a 
message participating in an operation. Fault Reference components are 
identified by the role the related message plays in the {message 
exchange pattern} that the operation is using." What? Please have pity 
on your readers.

- 2.6.1: "The purpose of a Fault Reference component is to 
associate..." Bad English. Try: "A Fault Reference component's purpose 
is the association of..."

- 2.6.2.1: "The ref attribute information item refers to a fault 
component." Shouldn't this be "*interface* fault component."?

- 2.11.1: "Interface Operation components are local to Interface 
components; they cannot be referred to by QName, despite having both 
{name} and {target namespace} properties. That is, two Interface 
components sharing the same {target namespace} property but with 
different {name} properties MAY contain Interface Operation components 
which share the same {name} property. Thus, the {name} and {target 
namespace} properties of the Interface Operation components are not 
sufficient to form the unique identity of an Interface Operation 
component. To uniquely identify an Interface Operation component one 
must first identify the Interface component (by QName) and then 
identify the Interface Operation within that Interface component (by a 
further QName)." What is the effect of this statement upon bindings? It 
doesn't place any direct requirements on them.

- 2.13: Shouldn't 2.14 Endpoints come before this section?

- 2.13.1: "A Service component describes a set of endpoints (see 2.14 
Endpoint) at which the single interface of the service is provided." 
Circular definition; confusing.
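
The 2.2.1 constraint quoted above (an interface must not extend itself, directly or indirectly) amounts to requiring that the "extends" graph has no cycles. A minimal Python sketch of such a check follows, assuming interfaces are identified by plain strings and the graph is given as a dict; `find_extension_cycle` and the interface names are hypothetical and not taken from the specification.

```python
from typing import Dict, List, Optional, Set


def find_extension_cycle(extends: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the 'extends' graph as a list of names, or None."""
    path: List[str] = []   # interfaces on the current depth-first path
    done: Set[str] = set() # interfaces already shown to be cycle-free

    def visit(name: str) -> Optional[List[str]]:
        if name in done:
            return None
        if name in path:
            # Reached an interface already on the path: report the loop.
            return path[path.index(name):] + [name]
        path.append(name)
        for parent in extends.get(name, []):
            cycle = visit(parent)
            if cycle is not None:
                return cycle
        path.pop()
        done.add(name)
        return None

    for name in extends:
        cycle = visit(name)
        if cycle is not None:
            return cycle
    return None


# Made-up interface names for illustration.
valid = {"A": ["B"], "B": ["C"], "C": []}
direct = {"A": ["A"]}
indirect = {"A": ["B"], "B": ["C"], "C": ["A"]}

print(find_extension_cycle(valid))
print(find_extension_cycle(direct))
print(find_extension_cycle(indirect))
```

Stating the rule in these terms (no interface may be reachable from itself by following extension links) would cover both the direct and the indirect case without the ambiguous "that".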



--
Mark Nottingham   Principal Technologist
Office of the CTO   BEA Systems
Received on Tuesday, 18 May 2004 19:31:30 UTC