Re: Tunneling HTTP over SOAP

There are at least three related layers of function that are being 
proposed by the XMLP workgroup.  I would motivate them this way:

XOP:  extends the XML model to deal more efficiently with the sort of 
data that has traditionally been stored in binary files and/or MIME-typed 
streams.  So, we've always been in pretty good shape handling integers, 
dates, floats, etc., and we've had a base64 binary type.  XOP makes it 
more efficient than it used to be to employ that base64 type.  Note that 
things are happening at two levels here:  in one sense, we're just 
encouraging you to base64 encode such data in XML, and perhaps use some 
standard attributes to indicate the MIME types (see below), etc.  You 
don't need XOP for that, it's just a convention for using XML.  XOP adds 
an optimized non-XML encoding that is more efficient than the character 
form.  That encoding happens to use multipart/related.
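
To make the size argument concrete, here is a minimal sketch in Python (an 
arbitrary choice of language; the payload is made up for illustration).  It 
only compares raw bytes with their base64 character form and does not 
attempt any real XOP packaging.

```python
import base64

# Made-up stand-in for binary content such as a JPEG: 30720 bytes.
raw = bytes(range(256)) * 120

# Character form: what travels inside the XML when no optimization is used.
as_characters = base64.b64encode(raw)

# base64 maps every 3 bytes to 4 characters, so the character form is
# about one third larger than the binary it represents.
ratio = len(as_characters) / len(raw)
print(len(raw), len(as_characters), round(ratio, 3))
```

An optimized encoding can carry the bytes of `raw` directly, which is where 
the saving over the character form comes from.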

MTOM:  standardizes the use of XOP with SOAP.  Essentially, we're defining 
a new media type that is for a SOAP message infoset that's been XOP 
encoded.  The HTTP binding can use either the new media type or the older 
application/soap+xml.  Note that any message can be sent in either of 
these media types, whether or not there's base64 content.  The media types 
just optimize differently, that's all.

Representation header:  provides a Web resource representation as a SOAP 
message header.  The purpose here is >not< to tunnel all of HTTP through 
SOAP but rather,  in the case that the URI happens to use the http: 
scheme, to support a particular flavor of HTTP caching.  Example use 
cases for the Representation header include:  a) you're sending the 
message to a node, a PDA perhaps, that is likely to be disconnected from 
the network at the time a URI in the message is to be 
dereferenced (perhaps I've sent you a vCard with my picture); b) you're 
connected, but for security or performance reasons you don't let your SOAP 
transaction processing applications make random web connections while 
processing a transaction; c) (not sure how HTTP purists would view this) 
you want to convey that the representation provided was current at the 
time the message originated.  So, think of the Representation header as 
pre-loading an http (or other scheme) cache.  Except as an optimization, 
XOP and MTOM are orthogonal to the representation function.  You can put a 
Representation header into a message and not know whether it will be 
XOP-encoded on the wire or not.  Representation does use base64 encoding, 
which means XOP/MTOM should do a good job with it.

Which brings us to the whole reason we've proposed to do things this way.  
At the Infoset level, we have a completely consistent model of a SOAP 
message, and that model has nothing to do with XOP.    It's pure XML, 
including the JPEG in the example above.  We've encouraged the use of 
base64Binary encoding and given you the hint that we can often handle that 
more efficiently than you would have guessed.

Why does having this consistent view matter?  Consider a SOAP message 
that's relayed through a succession of 4 nodes, using 3 links.  The first 
and third link are XOP-aware, and the 2nd is not.  No problem.   Indeed, 
the higher level application code never sees the difference.   The message 
is XOP encoded as multipart for the first hop, and is sent in XML 1.0 
character form on the 2nd hop.  Whether to bother converting back to 
binary on the 3rd hop is an optimization tradeoff:  do you care more about 
the conversion time or the size on the wire?  The point is that the 
message has the same pure XML model at each of the 4 nodes.  Of course, 
because the model is Infoset, we can apply the full range of XML tools 
such as XPath to the whole message or document.  That's harder to do with 
SOAP+Attachments, where some of the data was modeled as MIME multipart. 
Only if that XPath or other tool actually references the characters within 
the optimized elements do we have to do the up-conversions to base64 
character form.
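
The relay scenario above can be sketched roughly as follows.  This is a toy 
model under stated assumptions, not XOP itself:  the dictionary message, the 
`photo_ref` field, and the function names are all hypothetical, and 
`character_view` stands in for "the message as pure XML characters".

```python
import base64

# Hypothetical stand-in for a message: one text field and one binary field.
# In the real architecture the model is an XML Infoset, not a dictionary.

def character_view(model):
    """Return the all-characters view, with the binary shown as base64."""
    return {"name": model["name"],
            "photo": base64.b64encode(model["photo"]).decode("ascii")}

def over_text_link(model):
    """Non-optimized hop: the binary crosses the wire as base64 characters."""
    wire = character_view(model)
    return {"name": wire["name"], "photo": base64.b64decode(wire["photo"])}

def over_optimized_link(model):
    """XOP-like hop: the raw bytes travel in a separate part, by reference."""
    root = {"name": model["name"], "photo_ref": "part-1"}
    parts = {"part-1": model["photo"]}
    return {"name": root["name"], "photo": parts[root["photo_ref"]]}

def relay(model, links):
    """Pass the model across each link and record what every node sees."""
    seen = [model]
    for link in links:
        model = link(model)
        seen.append(model)
    return seen

original = {"name": "example sender", "photo": b"\xff\xd8\xff\xe0 not a real JPEG"}
nodes = relay(original, [over_optimized_link, over_text_link, over_optimized_link])
views = [character_view(node) for node in nodes]
```

Every entry in `views` is the same, whichever kind of link the message just 
crossed, which is the property the 4-node example relies on.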

Of course, the performance gain from XOP is greatest when all links are 
XOP-enabled, and the APIs at both ends help you out.  Imagine that in the 
vCard example my picture came from some local JPEG file at the original 
sender.  I might use a local API like:


A naive implementation would immediately convert to base64 characters, but 
a smart one will hold off until the character children are actually 
requested, because we hope they never will be.  Sure enough, we're sending 
this message using a XOP wire format, which needs the exact binary we 
already have in the file.  Stream it right out into the MIME part.  At the 
next node, you've got the binary.  If the application asks for the 
character element children, you can always generate them, but with luck 
that node too will want that JPEG in binary form and a suitable API will 
get it without any conversion:

        jpegStream = 

So, you only pay the cost of character conversions if the application 
needs it, but anytime you want to view the message as XML, that's what 
you've got.  Pure XML, characters only, everything defined at the Infoset 
level.  There is always a unique XML 1.0 equivalent for a XOP encoded 
document.  The multipart is just an optimized external form.  In some 
ways, XOP's role in XML is a bit like an encoding such as UTF-8 or UTF-16, 
except that XOP is defined at a structural rather than a character level.
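
The "pay for characters only on demand" idea can be illustrated with a 
small sketch.  The class and method names here are hypothetical, invented 
for this example; they are not taken from any real SOAP or XOP API.

```python
import base64

class BinaryElement:
    """Hypothetical element that holds binary content and defers base64."""

    def __init__(self, data: bytes):
        self._data = data        # the binary we already have, e.g. from a file
        self._chars = None       # base64 character form, built only on demand
        self.conversions = 0     # counts how often we paid for a conversion

    def binary(self) -> bytes:
        """Hand back the bytes as-is; no conversion is needed."""
        return self._data

    def characters(self) -> str:
        """Produce the base64 character children, converting at most once."""
        if self._chars is None:
            self._chars = base64.b64encode(self._data).decode("ascii")
            self.conversions += 1
        return self._chars

element = BinaryElement(b"abc")
element.binary()                 # binary access leaves conversions at 0
text = element.characters()      # first character access converts once
element.characters()             # later accesses reuse the cached form
```

An application that only ever calls `binary()` never triggers the 
conversion, while one that asks for `characters()` always gets the 
character view.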

The above is just my own opinion, not speaking officially for XMLP.  I 
hope this helps explain what we're trying to do and why the architecture 
is layered as it is.

Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142

Received on Monday, 3 May 2004 16:47:22 UTC