- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 3 May 2004 16:45:06 -0400
- To: Tim Bray <Tim.Bray@Sun.COM>
- Cc: Elliotte Rusty Harold <elharo@metalab.unc.edu>, www-tag@w3.org
There are at least three related layers of function that are being
proposed by the XMLP workgroup. I would motivate them this way:
XOP: extends the XML model to deal more efficiently with the sort of
data that's traditionally been stored in binary files and/or MIME-typed
streams. So, we've always been in pretty good shape handling integers,
dates, floats, etc., and we've had a base64 binary type. XOP makes it
more efficient than it used to be to employ that base64 type. Note that
things are happening at two levels here: in one sense, we're just
encouraging you to base64 encode such data in XML, and perhaps use some
standard attributes to indicate the MIME types (see below), etc. You
don't need XOP for that, it's just a convention for using XML. XOP adds
an optimized non-XML encoding that is more efficient than the character
form. That encoding happens to use multipart/related.
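
To put a rough number on "more efficient than the character form", the sketch below measures how much base64 inflates a binary payload. It is only an illustration using Python's standard base64 module; the payload is made-up data, not taken from any XMLP draft.

```python
import base64

# Made-up binary payload standing in for a JPEG or similar (assumption).
payload = bytes(range(256)) * 12           # 3072 bytes of arbitrary binary data

# The character form that plain XML 1.0 would carry as element content.
encoded = base64.b64encode(payload)

# base64 emits 4 characters for every 3 input bytes.
print(len(payload), len(encoded))          # 3072 4096
print(f"overhead: {len(encoded) / len(payload):.2f}x")
```

Sending the raw bytes in a separate MIME part avoids both this size increase and the encode/decode work at each end.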
MTOM: standardizes the use of XOP with SOAP. Essentially, we're defining
a new media type that is for a SOAP message infoset that's been XOP
encoded. The HTTP binding can use either the new media type or the older
application/soap+xml. Note that any message can be sent in either of
these media types, whether or not there's base64 content. The media types
just optimize differently, that's all.
Representation header: provides a Web resource representation as a SOAP
message header. The purpose here is >not< to tunnel all of HTTP through
SOAP but rather, in the case that the URI happens to use the http:
scheme, to support a particular flavor of http caching. Example use
cases for the representation header include: a) you're sending the
message to a node, a PDA perhaps, that is likely to be disconnected from
the network at the time a URI in the message is to be
dereferenced (perhaps I've sent you a vCard with my picture); b) you're
connected but for security or performance reasons you don't let your SOAP
transaction processing applications make random web connections while
processing a transaction; c) not sure how HTTP purists would view this, but
you want to convey that the representation provided was current at the
time the message originated. So, think of the Representation header as
pre-loading an http (or other scheme) cache. Except as an optimization,
XOP and MTOM are orthogonal to the representation function. You can put a
Representation header into a message and not know whether it will be
XOP-encoded on the wire or not. Representation does use base64 encoding,
which means XOP/MTOM should do a good job with it.
Which brings us to the whole reason we've proposed to do things this way.
At the Infoset level, we have a completely consistent model of a SOAP
message, and that model has nothing to do with XOP. It's pure XML,
including the JPEG in the example above. We've encouraged the use of
base64Binary encoding and given you the hint that we can often handle that
more efficiently than you would have guessed.
Why does having this consistent view matter? Consider a SOAP message
that's relayed through a succession of 4 nodes, using 3 links. The first
and third link are XOP-aware, and the 2nd is not. No problem. Indeed,
the higher level application code never sees the difference. The message
is XOP encoded as multipart for the first hop, and is sent in XML 1.0
character form on the 2nd hop. Whether to bother converting back to
binary on the 3rd hop is an optimization tradeoff: do you care more about
the conversion time or the size on the wire? The point is that the
message has the same pure XML model at each of the 4 nodes. Of course,
because the model is Infoset, we can apply the full range of XML tools
such as XPath to the whole message or document. That's harder to do with
SOAP+Attachments, where some of the data was modeled as MIME multipart.
Only if that XPath or other tool actually references the characters within
the optimized elements do we have to do the up-conversions to base64
character form.
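
The multi-hop argument can be sketched in a few lines of Python. This is a toy model, not an implementation of XOP: the MIME package is reduced to a dict of parts, the function names optimize and reconstitute are hypothetical, and the Include namespace is an assumption (it is the one used in the later XOP Recommendation, not necessarily in the drafts current when this message was written).

```python
import base64
import copy
import xml.etree.ElementTree as ET

# Assumption: namespace from the later XOP Recommendation.
XOP_NS = "http://www.w3.org/2004/08/xop/include"
INCLUDE = f"{{{XOP_NS}}}Include"

def optimize(root, binary_tags):
    """Return (copy of root with Include placeholders, {content-id: bytes})."""
    root = copy.deepcopy(root)
    parts = {}
    for elem in list(root.iter()):
        if elem.tag in binary_tags and elem.text:
            cid = f"cid:part{len(parts)}"
            parts[cid] = base64.b64decode(elem.text)   # raw bytes for the MIME part
            elem.text = None
            ET.SubElement(elem, INCLUDE, href=cid)     # placeholder in the XML part
    return root, parts

def reconstitute(root, parts):
    """Rebuild the pure XML 1.0 character form from the optimized package."""
    root = copy.deepcopy(root)
    for elem in list(root.iter()):
        include = elem.find(INCLUDE)
        if include is not None:
            elem.remove(include)
            elem.text = base64.b64encode(parts[include.get("href")]).decode("ascii")
    return root

# A made-up vCard-like message; the photo bytes are not a real JPEG.
photo = base64.b64encode(b"\xff\xd8\xff\xe0 pretend JPEG bytes").decode("ascii")
original = ET.fromstring(f"<card><name>Noah</name><photo>{photo}</photo></card>")

hop1_xml, hop1_parts = optimize(original, {"photo"})   # XOP-aware link
hop2_xml = reconstitute(hop1_xml, hop1_parts)          # non-XOP link

# Queries that do not touch the binary need no up-conversion to base64.
print(hop1_xml.findtext("name"))
# The message model is the same after the round trip.
print(ET.tostring(hop2_xml) == ET.tostring(original))
```

The round trip is exact here because the base64 text is in canonical form, which is what makes the character form and the optimized form interchangeable in this sketch.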
Of course, the performance gain from XOP is greatest when all links are
XOP-enabled, and the APIs at both ends help you out. Imagine that in the
vCard example my picture came from some local JPEG file at the original
sender. I might use a local API like:
createXMLElementFromBinaryFile(fileName)
A naive implementation would immediately convert to base64 characters, but
a smart one will hold off until the character children are actually
requested, because we hope they never will be. Sure enough, we're sending
this message using a XOP wire format, which needs the exact binary we
already have in the file. Stream it right out into the MIME part. At the
next node, you've got the binary. If the application asks for the
character element children, you can always generate them, but with luck
that node too will want that JPEG in binary form and a suitable API will
get it without any conversion:
jpegStream = getBinaryStreamFromBase64XMLElement(elementWithPicture)
So, you only pay the cost of character conversions if the application
needs it, but anytime you want to view the message as XML, that's what
you've got. Pure XML, characters only, everything defined at the Infoset
level. There is always a unique XML 1.0 equivalent for a XOP encoded
document. The multipart is just an optimized external form. In some
ways, XOP's role in XML is a bit like an encoding such as UTF-8 or UTF-16,
except that XOP is defined at a structural rather than a character level.
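
One way to picture the "smart" API described above is an element that keeps the binary and produces base64 characters only on demand. The Python class below is a minimal sketch under that assumption; LazyBinaryElement and its members are hypothetical names, loosely mirroring the two API calls mentioned in the text, and do not belong to any real library.

```python
import base64
import io

class LazyBinaryElement:
    """Hypothetical element that holds binary content and defers base64 conversion."""

    def __init__(self, name, raw):
        self.name = name
        self._raw = raw          # the exact bytes, e.g. read from a JPEG file
        self._chars = None       # base64 character children, built lazily
        self.conversions = 0     # counts how often we paid for the conversion

    @classmethod
    def from_binary_file(cls, name, path):
        # Rough analogue of createXMLElementFromBinaryFile(fileName).
        with open(path, "rb") as f:
            return cls(name, f.read())

    def binary_stream(self):
        # Rough analogue of getBinaryStreamFromBase64XMLElement(element):
        # hands back the binary without any character conversion.
        return io.BytesIO(self._raw)

    @property
    def text(self):
        # Character children are generated only when somebody asks for them.
        if self._chars is None:
            self._chars = base64.b64encode(self._raw).decode("ascii")
            self.conversions += 1
        return self._chars

raw = b"\xff\xd8\xff\xe0 pretend JPEG bytes"   # made-up data, not a real image
elem = LazyBinaryElement("photo", raw)

streamed = elem.binary_stream().read()   # binary path: no conversion happens
before = elem.conversions
chars = elem.text                        # XML view: converts once
chars_again = elem.text                  # cached: no second conversion
print(before, elem.conversions)          # 0 1
```

A XOP-aware sender would stream the bytes from binary_stream() straight into the MIME part, so the conversion counter stays at zero unless some tool actually reads the character content.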
The above is just my own opinion, not speaking officially for XMLP. I
hope this helps explain what we're trying to do and why the architecture
is layered as it is.
--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Monday, 3 May 2004 16:47:22 UTC