proposal for eliminating <message>

Hello,

During the Rennes F2F we (very) briefly discussed an approach for
eliminating <message>. In this note I'd like to expand on that
approach and modify the initial ideas a little. There is also
an ongoing work item within a subgroup to define some additions
to this proposal that make it possible to fully capture RPC signatures,
as was possible with WSDL 1.2 using <part>s. I believe that work
will build on this proposal, but some tweaking may be necessary.

Let me discuss this bottom up. In SOAP, a message contains a
body and header blocks. The body itself is a collection of elements
and, thanks to PASWA, even attachments are logically contained
within this body. The binding may certainly store them somewhere
else (e.g., inside the MIME envelope), but logically the only
payload is the body.

In WSDL 1.1, the <part>s of a message were intended to represent
the different things that are being sent. These "things" may
be RPC parameters, headers, or simply message components (an
XML document, an image, etc.). We all know the limitations
and evilness of <message>, so I won't bother illustrating them
again. One of the cool (yet misunderstood) features of WSDL's
<part> was its ability to describe things that are natively
typed using something other than XSD.

The basic idea for how <message> is to be eliminated is to
define a single complexType that represents all the stuff 
that goes in the SOAP body. No, the proposal is not SOAP 
specific, but I will use SOAP as a canonical binding to
consider. I will be happy to help explain this further with
another binding if such is deemed necessary.

Why a complexType instead of an element? SOAP allows you
to send more than one element in <soap:Body>, so it's not possible
to define one element as the payload. The complexType, OTOH,
will be the type of <soap:Body>! That too is slightly weird
because soap:Body already has a type given to it by the SOAP
schema. However, our schema wizards have asserted that it's
legit for an element to have two or more different types
(as long as they're consistent? not sure). In this case the
basic type of soap:Body is anything, so any type we define
for it is fine. I believe that attributes are not allowed,
however, and we'd need to clarify and support that.

OK so what about the header blocks? In WSDL 1.1 the intent
was that people would describe those as <part>s as well 
and then bind them to different places. (That's why there
was a "parts" attribute in soap:body and soap:header.)
A while ago we made a change whereby SOAP headers could 
only be introduced directly in the binding without ever
describing them abstractly. I have no problem with additional
headers being described in the binding only, but I do think
it's necessary to have a mechanism for someone to abstractly
define a "header" that may be used in multiple bindings.
Indeed some headers will only appear at runtime and hence 
may never be described in WSDL.

So, in addition to indicating the complexType of the 
payload, one should have the ability to list zero or
more header elements. What "header" means to each binding
is of course up to it.

Thus, the proposal is to define an operation as follows:

<operation name="ncname">
    <input body="qname-of-complexType" [headers="list-of-qnames"]/>
    <output .. same ../>
</operation>
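
To make this concrete, here is a rough sketch of how the @body form
might look. The x: namespace and every name in it (order, invoice,
receipt, transactionId, SubmitBody, SubmitResponseBody) are made up
purely for illustration:

<xsd:schema targetNamespace="http://example.org/x"
            xmlns:x="http://example.org/x"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <!-- hypothetical elements carried in the body -->
    <xsd:element name="order" type="xsd:string"/>
    <xsd:element name="invoice" type="xsd:string"/>
    <xsd:element name="receipt" type="xsd:string"/>
    <!-- hypothetical header element -->
    <xsd:element name="transactionId" type="xsd:string"/>

    <!-- everything that goes in the input body: two elements, in order -->
    <xsd:complexType name="SubmitBody">
        <xsd:sequence>
            <xsd:element ref="x:order"/>
            <xsd:element ref="x:invoice"/>
        </xsd:sequence>
    </xsd:complexType>

    <!-- everything that goes in the output body -->
    <xsd:complexType name="SubmitResponseBody">
        <xsd:sequence>
            <xsd:element ref="x:receipt"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

<operation name="submit">
    <input body="x:SubmitBody" headers="x:transactionId"/>
    <output body="x:SubmitResponseBody"/>
</operation>

Here the input body carries two elements, which is exactly the case
a single-element payload cannot describe.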

In many scenarios the body type will be a single-use definition; i.e.,
no other operation will use the same complexType. In those cases, it's
rather awkward to have to define a *named* complexType. We can avoid that
by allowing the following syntactic variant as well:

<operation name="ncname">
    <input [headers="list-of-qnames"]>
        <xsd:complexType>
            ...
        </xsd:complexType>
    </input>
    <output .. same ../>
</operation>

Basically the anonymous complexType within input etc. is the type of
the body.

Now, in many cases, it is likely that the body will indeed consist of
just one element (e.g., a well-known element like a purchase order). In
those scenarios it's rather awkward to have to define a type just to
include that element:

<operation name="foo">
    <input [headers="list-of-qnames"]>
        <xsd:complexType xmlns:xsd="http://www.w3.org/2001/XMLSchema">
            <xsd:sequence>
                <xsd:element ref="x:e1" />
            </xsd:sequence>
        </xsd:complexType>
    </input>
    <output [headers="list-of-qnames"]>
        <xsd:complexType xmlns:xsd="http://www.w3.org/2001/XMLSchema">
            <xsd:sequence>
                <xsd:element ref="x:e2" />
            </xsd:sequence>
        </xsd:complexType>
    </output>
</operation>

Clearly a shortcut syntax would be very useful for this case:

<operation name="foo">
    <input element="x:e1" [headers="list-of-qnames"]/>
    <output element="x:e2" [headers="list-of-qnames"]/>
</operation>

Note that the @element case is strictly a syntactic shortcut, as
one can always define a type which contains only that element and
refer to that type using @body. If we accept this approach, @element
should be written up as a shortcut syntax and not as a different
way to describe the payload.

OK, so here's the summary of the proposed replacement syntax:

<operation name="ncname">
    <input [(body="qname") | (element="qname")] 
           [headers="list-of-qnames"]>
        [<xsd:complexType> ... </xsd:complexType>]
    </input>
    <output [(body="qname") | (element="qname")] 
            [headers="list-of-qnames"]>
        [<xsd:complexType> ... </xsd:complexType>]
    </output>
</operation>

Semantics:
    - exactly one of operation/input/@body or operation/input/@element
      or operation/input/xsd:complexType must be present. Same applies
      for output, of course. 
    - if the nested complexType element is present it must not be named.
      That type defines all the payload content.
    - if @element is present then it refers to the single element
      that makes up the payload.
    - if @body is present then it refers to a complexType that's the
      type which defines all the payload content. In SOAP, that type 
      would be the type of soap:Body.
    - each of the @headers qnames must refer to a global element
      declaration.
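
As a quick illustration of the first rule (x:T, x:e1 and x:h1 are
placeholder names for a global complexType, a global element and a
header element, respectively; they are not part of the proposal):

<!-- OK: exactly one of @body, @element or a nested complexType -->
<input body="x:T"/>
<input element="x:e1" headers="x:h1"/>

<!-- not OK under the rules above: @body and @element together -->
<input body="x:T" element="x:e1"/>

<!-- not OK under the rules above: none of the three is present -->
<input headers="x:h1"/>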

The nice thing with this syntax is that it becomes only incrementally
more complex as the input becomes more complex (a single element, then
one-time-use content of more than one element, then re-usable
content definitions). However, as you can see from the rules above, the
syntax itself is rather too clever by half. I suggest we pick a
syntax that supports two cases:
    (1) What I believe to be the 80/20 case: a single element in the
        body
    (2) A named complexType defining everything in the body

Basically that means we drop the nested anonymous complexType
inclusion capability. The syntax would then be:

<operation name="ncname">
    <input (element|body)="qname" [headers="list-of-qnames"]/>
    <output (element|body)="qname" [headers="list-of-qnames"]/>
</operation>

If @element is used then only that element will appear in the
body. If @body is used then that's the type representing all
of body content, i.e., the type of soap:Body in the case of SOAP.
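
To illustrate, here is roughly what the two cases might look like on
the wire. This is only a sketch: it assumes a doc/lit SOAP binding that
puts each @headers element in soap:Header, uses the SOAP 1.1 envelope
namespace, and the x: names (e1, e2, h1, T) are made up.

<!-- <input element="x:e1" headers="x:h1"/> -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:x="http://example.org/x">
    <soap:Header>
        <x:h1>...</x:h1>
    </soap:Header>
    <soap:Body>
        <x:e1>...</x:e1>
    </soap:Body>
</soap:Envelope>

<!-- <input body="x:T"/>, where x:T is a sequence of x:e1 and x:e2 -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:x="http://example.org/x">
    <soap:Body>
        <x:e1>...</x:e1>
        <x:e2>...</x:e2>
    </soap:Body>
</soap:Envelope>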

One additional syntactic shortcut is to overload the
two usages into one attribute:

<operation name="ncname">
    <input body="qname" [headers="list-of-qnames"]/>
    <output body="qname" [headers="list-of-qnames"]/> 
</operation>

Now depending on whether @body refers to a type or an element,
we get the two previous cases.

BTW, the really cool thing with any of these syntactic approaches
is that they are highly amenable to default doc/lit binding. You 
may recall that one of the binding changes that Kevin and I have
proposed is a defaulting of bindings to SOAP doc/lit style (using
WSDL 1.1 terminology). 

Sanjiva.

Received on Sunday, 6 July 2003 09:17:41 UTC