SOAP 1.2 RPC encoding issue? from Robert van Engelen on 2002-08-12 (xml-dist-app@w3.org from August 2002)

From: Robert van Engelen <engelen@cs.fsu.edu>
Date: Mon, 12 Aug 2002 15:22:26 -0400 (EDT)
To: xml-dist-app@w3.org
Message-Id: <200208121922.g7CJMQD12698@diablo.cs.fsu.edu>

Hi,

I am having some trouble with the SOAP RPC encoding style proposed by the
SOAP 1.2 specification.  Any help in clarifying this issue would be greatly
appreciated.

The issue is not specific to any toolkit implementation; it appears to stem
from the choice of node labeling in SOAP 1.2.  It arises in SOAP RPC encoding
(and decoding) when XML schema extensions, multi-referenced objects, and
streaming XML parsing are combined.

This encoding problem is best illustrated with an example (this will also
keep the posting short).

Suppose a service handles the following type of requests (as defined by an
appropriate schema for "struct"):

<env:Envelope ...>
 <env:Body ...>
  <ns:call>
   <param>
    <struct>
     <a>...</a>
     <b>...</b>
    </struct>
   </param>
  </ns:call>
 </env:Body>
</env:Envelope>

I believe that, according to the SOAP RPC encoding rules and XML schema
extension, the following request message is also admissible (with an extra
element x in the struct):

<env:Envelope ...>
 <env:Body ...>
  <ns:call>
   <param>
    <struct>
     <x>...</x>
     <a>...</a>
     <b>...</b>
    </struct>
   </param>
  </ns:call>
 </env:Body>
</env:Envelope>

The service would have to ignore element x and only parse the a and b elements.

Now suppose that elements x, a, and b all refer to the same (multi-referenced)
object.  The SOAP 1.2 encoding will be:

<env:Envelope ...>
 <env:Body ...>
  <ns:call>
   <param>
    <struct>
     <x id="id1">...</x>
     <a href="#id1"/>
     <b href="#id1"/>
    </struct>
   </param>
  </ns:call>
 </env:Body>
</env:Envelope>

Without streaming XML parsing, the above could be handled by parsing the
contents of element x (ellipsis) according to expected types for elements a
and b. Some DOM traversals may need to be done to implement this.
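
A minimal sketch of this non-streaming approach, using Python's standard
xml.etree.ElementTree.  The payload "42", the namespace URIs, and the helper
name resolve_refs are assumptions made for illustration only; they do not come
from any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical message modeled on the SOAP 1.2 example above.  The payload
# "42" and the namespace URIs are placeholders, not taken from the post.
MESSAGE = """\
<env:Envelope xmlns:env="urn:example:env" xmlns:ns="urn:example:svc">
 <env:Body>
  <ns:call>
   <param>
    <struct>
     <x id="id1">42</x>
     <a href="#id1"/>
     <b href="#id1"/>
    </struct>
   </param>
  </ns:call>
 </env:Body>
</env:Envelope>
"""

def resolve_refs(xml_text, expected_types):
    """Decode the expected elements, following href links to id-labeled nodes."""
    root = ET.fromstring(xml_text)  # the whole tree is held in memory
    # Traversal 1: index every element that carries an id label.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    # Traversal 2: decode only the elements the service knows about, using
    # the type expected for the referring element (x is simply skipped).
    result = {}
    for el in root.iter():
        if el.tag not in expected_types:
            continue
        href = el.get("href")
        target = by_id[href[1:]] if href else el  # strip the leading '#'
        result[el.tag] = expected_types[el.tag](target.text)
    return result

values = resolve_refs(MESSAGE, {"a": int, "b": int})
print(values)
```

The contents of x are decoded only once a referrer with a known type (a or b)
is reached, which is easy here because the complete tree is available.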

It gets more interesting with streaming parsing.  I believe that the SOAP 1.1
and SOAP 1.2 specifications do not impose any restrictions or requirements on
the parsing method.  However, a streaming parser that reaches element x does
not yet know its expected type, so it must buffer the entire contents of x in
order to process elements a and b later.  This is possible, but diminishes the
usefulness of streaming parsing to optimize speed and minimize memory use
(e.g. when the contents are large arrays).

With SOAP 1.1, multi-referenced objects are referenced with forward pointers.
Hence, buffering is not required: sufficient type information can be
collected from the referring nodes to determine the contents of the
multi-referenced object BEFORE the multi-referenced object is parsed.  For
example:

<env:Envelope ...>
 <env:Body ...>
  <ns:call>
   <param>
    <struct>
     <x href="#id1"/>
     <a href="#id1"/>
     <b href="#id1"/>
    </struct>
   </param>
  </ns:call>
  <ref id="id1">...</ref>
 </env:Body>
</env:Envelope>

In this case element x can be safely ignored, and the labeling of elements a
and b provides sufficient information to determine the type of object stored
in the multi-referenced ref element.  (Actually, SOAP 1.1 allows inline string
labeling.  But since this is the only exception, it is easy to detect.)
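
The forward-pointer case can be sketched with a streaming pass over the
message.  This is only an illustration using ElementTree's iterparse; the
payload "42", the namespace URIs, and the helper name decode_forward_refs are
assumptions, and a real decoder would handle far more cases.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical SOAP 1.1 style message: referrers first, the independent
# multi-referenced element afterwards.  href "#id1" points at id "id1".
MESSAGE_11 = b"""\
<env:Envelope xmlns:env="urn:example:env" xmlns:ns="urn:example:svc">
 <env:Body>
  <ns:call>
   <param>
    <struct>
     <x href="#id1"/>
     <a href="#id1"/>
     <b href="#id1"/>
    </struct>
   </param>
  </ns:call>
  <ref id="id1">42</ref>
 </env:Body>
</env:Envelope>
"""

def decode_forward_refs(stream, expected_types):
    """Single pass: record expected types at referrers, decode at the target."""
    pending = {}  # id -> list of (referrer name, decoder) seen so far
    result = {}
    for event, el in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            # Attributes are available at the start tag; unknown referrers
            # such as x are ignored without storing anything.
            href = el.get("href")
            if href and el.tag in expected_types:
                pending.setdefault(href[1:], []).append(
                    (el.tag, expected_types[el.tag]))
        else:
            ident = el.get("id")
            if ident in pending:
                # The type is already known when the target is parsed.
                for name, decode in pending.pop(ident):
                    result[name] = decode(el.text)
            elif el.tag in expected_types and el.get("href") is None:
                result[el.tag] = expected_types[el.tag](el.text)  # inline value
            el.clear()  # nothing needs to be retained after the end tag
    return result

decoded = decode_forward_refs(io.BytesIO(MESSAGE_11), {"a": int, "b": int})
print(decoded)
```

Only the small table of pending references is kept between events; the
contents of the ref element are decoded as soon as they arrive.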

It seems that SOAP 1.1 RPC encoding can provide some performance and memory
usage guarantees and does not suffer from this node labeling problem.  These
guarantees don't come cheap with SOAP 1.2.  Further, SOAP 1.1 encoding with
independent multi-referenced elements is really a "safety net" for catching
dropped multi-ref elements.
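
To make the SOAP 1.2 cost concrete, here is a rough sketch of what a streaming
decoder has to do when the id-labeled element comes first and its type is
unknown.  The payload "42", the namespace URIs, and the helper name
decode_backward_refs are assumptions for illustration only.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical SOAP 1.2 style message: the content sits inline at the
# unknown element x, and the known elements a and b point back at it.
MESSAGE_12 = b"""\
<env:Envelope xmlns:env="urn:example:env" xmlns:ns="urn:example:svc">
 <env:Body>
  <ns:call>
   <param>
    <struct>
     <x id="id1">42</x>
     <a href="#id1"/>
     <b href="#id1"/>
    </struct>
   </param>
  </ns:call>
 </env:Body>
</env:Envelope>
"""

def decode_backward_refs(stream, expected_types):
    """Single pass that must buffer every id-labeled element it cannot type."""
    buffered = {}  # id -> raw content, retained until the end of the message
    result = {}
    for _event, el in ET.iterparse(stream, events=("end",)):
        ident, href = el.get("id"), el.get("href")
        if ident is not None:
            # The expected type is unknown here, so the raw content is kept.
            buffered[ident] = el.text or ""
        if el.tag in expected_types:
            raw = buffered[href[1:]] if href else (el.text or "")
            result[el.tag] = expected_types[el.tag](raw)
        el.clear()
    return result, buffered

result, buffered = decode_backward_refs(io.BytesIO(MESSAGE_12),
                                        {"a": int, "b": int})
print(result, buffered)
```

The buffered table grows with the size of every id-labeled element, which is
the memory cost described above when the contents are large arrays.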

Comments are highly appreciated.

- Robert van Engelen, Prof., Comp. Sc., FSU, engelen@cs.fsu.edu
Received on Monday, 12 August 2002 15:22:27 UTC