Re: Allow EXI as characterization for XML in the JMS body ? from Amelia A Lewis on 2010-11-29 (public-soap-jms@w3.org from November 2010)

From: Amelia A Lewis <alewis@tibco.com>
Date: Mon, 29 Nov 2010 17:10:02 -0500
To: Eric Johnson <eric@tibco.com>
Cc: Jean-Baptiste Bugeaud <bugeaud@gmail.com>, public-soap-jms@w3.org
Message-ID: <20101129171002663938.d58f872a@tibco.com>

On Mon, 29 Nov 2010 12:24:45 -0800, Eric Johnson wrote:
> a) For incoming messages, how does one indicate that the initial 
> request (either SOAP/HTTP, or SOAP/JMS) could be gzip, or exi 
> format?  That seems like it would have to be either an implementation 
> specific detail (Axis2 apparently supports it), or some sort of 
> indication in the WSDL - and I could find no documentation of that 
> anywhere.  If a WSDL proposal exists, it would be useful to have a 
> non-normative reference to it.

I don't know that anyone is doing this; I don't know of a "standard" 
idiom for WSDL (1.1 or 2.0).  In WSDL 1.1, you *can* specify "header", 
but it's sort of ... anti-negotiation.

> b) Certain of the possible content encoding values make no sense 
> depending on circumstances.  For example, it doesn't make sense to 
> allow pack200 (ever, that I can tell).  Also, at least as far as I 
> can tell, "exi" only makes sense to specify in the JMS Message 
> properties if the actual SOAP payload is not MIME multi-part, whereas 
> gzip, deflate, and compress can probably all be used regardless of 
> circumstances.  Although we probably don't need to say anything 
> normative about these constraints, perhaps we should have some 
> commentary?

Well.  So far as I understand it, Content-Encoding *ought* to be 
applied to the parts of a multipart message (or single-part messages), 
but should *not* be applied to a multipart container; the reverse ought 
to be true for Transfer-Encoding.  That is, with a "Content-" prefix, 
something ought to apply only to "leaf" nodes of a multipart message, 
as a general rule (and with some notable (or notorious, if you prefer) 
exceptions).  For instance: Content-Length *may not* be applied to a 
multipart message as-a-whole.  On the other hand, "chunked" encoding 
*does* apply to the message as a whole.  "Content-Type" refers to a 
leaf, or identifies that a composite part is composite (and its 
structure, in general terms, if so).  Justification: RFC 2616, section 
3.6, which states that a transfer coding "differs from a content coding 
in that the transfer-coding is a property of the message, not of the 
original entity."

Transfer-Encoding, though, is in practice used almost only for 
chunking (RFC 2616 also permits compression codings there, but defines 
nothing for encryption).  Content-Encoding specifies things like 
compression.

Is it reasonable to apply Content-Encoding to a multipart message?  In 
a word: no.  Content-Encoding ought to apply to an entity, not to a 
composite.  Now ... that may not matter; we're in an under-specified 
area, so the real answer to the question is: what do implementations do?
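
To make the leaf-versus-composite distinction concrete, here is a 
minimal Python sketch.  The helper names (is_composite, content_encode) 
are hypothetical and come from no SOAP or JMS library; the sketch 
assumes gzip as the content coding and simply leaves composite types 
alone, following the argument above rather than any normative rule.

```python
import gzip
from typing import Dict, Tuple


def is_composite(content_type: str) -> bool:
    """Return True for MIME composite types (multipart/*, message/*)."""
    main_type = content_type.split(";", 1)[0].strip().lower()
    return main_type.startswith(("multipart/", "message/"))


def content_encode(content_type: str, body: bytes) -> Tuple[Dict[str, str], bytes]:
    """Apply gzip Content-Encoding to a leaf entity only (hypothetical helper).

    Composite bodies are returned unchanged, without a Content-Encoding header.
    """
    headers = {"Content-Type": content_type}
    if is_composite(content_type):
        return headers, body
    headers["Content-Encoding"] = "gzip"
    return headers, gzip.compress(body)
```

Whether real implementations behave this way is exactly the open 
question; the sketch only shows what the "leaf only" reading implies.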

> c) In the case of a multipart message using, for example, MTOM, is it 
> even possible to apply EXI for the XML part of the MIME multi-part 
> payload?

I want to say, "yes, certainly!" but then I went to look at the 
governing documents at the IETF, and I'm not so sure.

RFC 2616 *does not define* the handling of multipart (or "composite" to 
use the MIME-defined technical term) messages.  All HTTP entities are 
single resources.  There's an extension that allows a new form of 
composite to be used for PUT (but not for resource retrieval).  In 
order to permit multipart, you have to go to RFC 2557--defined before 
2616, and not since updated--which defines "MHTML", or MIME 
Encapsulation of Aggregate Documents.  SOAP with Attachments references 
MHTML, although with differences (MHTML is actually defined so as to 
allow transmission of HTML with its supporting files, such as images 
and stylesheets, over a MIME-compliant protocol, not over HTTP).  MTOM 
was defined, in part, because MHTML is a rather sloppy format (and in 
part because, as a MIME extension, it's fundamentally incompatible with 
HTTP, even though SOAP with Attachments used it precisely as a "MIME 
over HTTP" standard).

So ... in order to determine how these un-specified issues are properly 
resolved, we need to check how other specifications handle this.  The 
most prominent examples are SOAP with Attachments and MTOM/XOP.  This 
may clarify things a bit.  Neither of these specifications creates 
HTTP-style entities inside the outermost container.  That is: both 
specifications use pure MIME parts inside the multipart/related outer 
composite part.

Okay.  Distinguishing MIME from HTTP near-MIME:

  - MIME requires a MIME-Version header where HTTP forbids it.
  - MIME requires Content-Transfer-Encoding or defaults to 7bit, where 
    HTTP *forbids* Content-Transfer-Encoding and defaults to 8bit.
  - Content-Length is unknown for MIME where it is required by HTTP 
    (unless Transfer-Encoding specifies chunked).
  - Content-Encoding and Transfer-Encoding are unknown in MIME and are 
    defined by HTTP.

One consequence is that the "per-specification" answer to Eric's 
question is "No, that is not permitted."
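
As a rough illustration of that MIME/HTTP split, the following Python 
sketch flags which headers on an encapsulated part belong to which 
vocabulary.  The function names and the two header sets are 
assumptions made for illustration; they cover only the fields named 
in this message, not a complete reading of either specification.

```python
from typing import Dict, List, Mapping

# Fields this message describes as defined by HTTP and unknown to MIME.
HTTP_ONLY = frozenset({"content-encoding", "transfer-encoding", "content-length"})
# Field this message describes as MIME-only.
MIME_ONLY = frozenset({"content-transfer-encoding"})


def classify_part_headers(headers: Mapping[str, str]) -> Dict[str, List[str]]:
    """Report which header names on a part are HTTP-only or MIME-only."""
    names = {name.lower() for name in headers}
    return {
        "http_only": sorted(names & HTTP_ONLY),
        "mime_only": sorted(names & MIME_ONLY),
    }


def effective_transfer_encoding(headers: Mapping[str, str]) -> str:
    """Return the part's Content-Transfer-Encoding, using the MIME default of 7bit."""
    for name, value in headers.items():
        if name.lower() == "content-transfer-encoding":
            return value.strip().lower()
    return "7bit"
```

A part carrying Content-Encoding: gzip would show up under 
"http_only", which is the per-specification reason it is not a valid 
pure MIME part.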

Following the specifications would lead one to creating MIME-compliant 
"packages" for SOAP with Attachments or MTOM/XOP and then tearing off 
the "MIME-Version" and "Content-Transfer-Encoding" headers from the 
outermost part, replacing them with Content-Length, Transfer-Encoding, 
and Content-Encoding as needed.  If you do that, you won't 
interoperate.  All of the SOAP implementations that I know of use HTTP 
headers inside the encapsulated MIME parts (Content-Length, in 
particular, is often required, and I have never seen an implementation 
that treated a part missing a Content-Transfer-Encoding header as 7bit 
US-ASCII).
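
The "tearing off" step described above can be sketched in Python as 
follows.  This is only an illustration of the by-the-book reading: the 
function name and its parameters are hypothetical, headers are modeled 
as a plain dict, and it is not how any implementation I can point to 
behaves.

```python
from typing import Dict

# Headers the message says must be removed from the outermost part for HTTP.
MIME_ONLY_OUTER = ("mime-version", "content-transfer-encoding")


def to_http_outer_headers(
    mime_headers: Dict[str, str], body: bytes, chunked: bool = False
) -> Dict[str, str]:
    """Convert outermost MIME headers to HTTP-style headers (hypothetical helper)."""
    # Drop the MIME-only fields, keep everything else (e.g. Content-Type).
    http_headers = {
        name: value
        for name, value in mime_headers.items()
        if name.lower() not in MIME_ONLY_OUTER
    }
    # HTTP needs a length unless the body is sent with chunked transfer coding.
    if chunked:
        http_headers["Transfer-Encoding"] = "chunked"
    else:
        http_headers["Content-Length"] = str(len(body))
    return http_headers
```

Note that this touches only the outermost headers; the inner parts 
stay pure MIME, which is where practice diverges.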

Probably not really helping?  We're in an awkward position; the 
specifications don't really match practice.  We're writing a 
specification, but we need to write one that doesn't actually 
contradict practice.  I don't know how we find the way out.

Amy!
-- 
Amelia A. Lewis
Senior Architect
TIBCO/Extensibility, Inc.
alewis@tibco.com
Received on Monday, 29 November 2010 22:10:41 UTC