Re: Allow EXI as characterization for XML in the JMS body ? from Eric Johnson on 2010-11-29 (public-soap-jms@w3.org from November 2010)

From: Eric Johnson <eric@tibco.com>
Date: Mon, 29 Nov 2010 12:24:45 -0800
To: Jean-Baptiste Bugeaud <bugeaud@gmail.com>
CC: Amelia A Lewis <alewis@tibco.com>, public-soap-jms@w3.org
Message-ID: <4CF40C0D.8020109@tibco.com>
I've an action item to make a concrete proposal to resolve ISSUE-65 
(Supporting EXI).

To that end, I'm trying to tie down a few threads that are still running
loose in my head.

a) For incoming messages, how does one indicate that the initial request
(either SOAP/HTTP or SOAP/JMS) could be in gzip or exi format?  That
seems like it would have to be either an implementation-specific detail
(Axis2 apparently supports it) or some sort of indication in the WSDL,
and I could find no documentation of that anywhere.  If a WSDL proposal
exists, it would be useful to have a non-normative reference to it.

b) Some of the possible content encoding values make no sense in certain
circumstances.  For example, it doesn't make sense to allow
pack200 (ever, that I can tell).  Also, at least as far as I can tell, 
"exi" only makes sense to specify in the JMS Message properties if the 
actual SOAP payload is not MIME multi-part, whereas gzip, deflate, and 
compress can probably all be used regardless of circumstances.  Although 
we probably don't need to say anything normative about these 
constraints, perhaps we should have some commentary?

c) In the case of a multipart message using, for example, MTOM, is it 
even possible to apply EXI for the XML part of the MIME multi-part payload?
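
As a rough illustration of the constraints in (b), here is a minimal sketch in Python. It assumes the tentative rules stated above (gzip, deflate, and compress are always usable; exi only when the payload is not MIME multi-part; anything else, including pack200 variants, is rejected). The function name and the treatment of unknown values are hypothetical, not taken from any specification.

```python
# Hypothetical helper: the name and the rules are illustrative only.
# Encodings assumed to be usable regardless of the payload's media type.
ALWAYS_OK = {"identity", "gzip", "deflate", "compress"}


def encoding_makes_sense(content_encoding: str, content_type: str) -> bool:
    """Return True if the content encoding is plausible for this payload type."""
    # Strip parameters such as "; charset=utf-8" or "; type=...".
    media_type = content_type.split(";", 1)[0].strip().lower()
    is_multipart = media_type.startswith("multipart/")
    # Encoding tokens are compared case-sensitively.
    if content_encoding in ALWAYS_OK:
        return True
    if content_encoding == "exi":
        # EXI encodes XML, so it is assumed not to apply to a multipart body.
        return not is_multipart
    # Everything else (for example pack200 variants) is rejected in this sketch.
    return False
```

Whether such a check belongs in the specification at all, or only in commentary, is exactly the open question in (b).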

-Eric.

On 11/10/10 4:03 PM, Jean-Baptiste Bugeaud wrote:
> Hello Amy, George, Yves & al,
>
> Thanks to Yves for having answered the questions from Eric on EXI.
>
>> This appears to have quite a bit of HTTP content negotiation built into
>> it.
> Not so much: although the proposal is based on some HTTP-defined
> principles (namely content coding), it is not based on any negotiation
> mechanism but on a more straightforward declarative mechanism that is
> better suited to JMS implementations.
>
> In HTTP, the client makes a first request to indicate the encodings it
> accepts, and the server will use this to optimize the entity
> encoding of the response.
> In the accept encoding proposal the scenario differs slightly: the
> WSDL of the service that includes the accept list can either already
> exist at the client side (bundled as part of the application) or be
> fetched with a technique outside the scope of the proposal. In both
> cases, when performing the "call", the client can ignore that
> information or use it to perform some smart task. It is up to the
> client implementer to choose.
>
>>> At this time, there are things that prevent a content encoding (such
>>> as EXI) from being used :
>>>   - no way to store the exact content encoding used
>> In fact, JMS headers are a clear extensibility point, as are MIME
>> headers in the content of the message.
> Sure, but there is no standard for mimetype or content encoding. Thus
> any specification using JMS has to specify it.
>
>>>   - mandatory alignment of octet and content type
>> Not too clear on what that means.
> Typically, §2.4 of the current spec: "The bytes or characters of the
> JMS Message payload correspond to the MIME format as indicated by the
> definition of the contentType property". If you use the contentType
> property to store the real mimetype of the content before encoding
> (say, text/xml even when using EXI), this sentence will forbid it. If
> you store the contentEncoding in the contentType, then you will lose
> one piece of information, the real mimeType, and will be missing one
> property.
>
>>> ===========================
>>>
>>> Addendum to section 2.2.1 :
>>>
>>>
>>> [Definition: soapjms:acceptEncoding] (list of xsd:string)
>>>        * Identifies the list of accepted values for content encoding that
>>> can be set using soapjms:contentEncoding.
>>>        * [Definition: Each value indicated as accept content encoding MUST
>>> be supported by the target destination implementation.†]
>>>        * [Definition: A caller SHOULD only use Each values indicated as
>>> accept content encoding MUST be supported by the target.†]
>> I think all of this is in the realm of content negotiation, and not
>> necessary.
> I also don't think a real negotiation with back-and-forth is required.
>
> This is only an indication for a caller; the caller is free to do
> whatever he wants with that data. But if given, the caller is informed
> that the callee will support this. It is useful for tracing and
> checking alignment between the service specification (the service
> contract) and the effective runtime (application implementation,
> SOAPJMS implementation, etc).
> Implementers will be free to add extra checks or custom properties,
> for instance on the client side (downgrade strategy, preference
> strategy, etc).
>
>> Supply ContentEncodingNotSupported, and if someone sends the wrong
>> thing, you shrug, send an error, and you're done.  Supported content
>> encodings might be indicated in the WSDL; if there's not already a
>> pattern for that, then other folks don't consider it necessary either.
>>
>>> Addendum to section 2.2.3 :
>>>
>>> [Definition: soapjms:contentEncoding] (xsd:string)
>>>        * Identifies the transformation that has been applied to the message
>>> payload body.
>> No.  At least, I don't think so.
>>
>> As I understand EXI, it's part of support for MTOM and such.  It's the
>> attachments that are being encoded.  Yes/No?  If yes, then there will
>> never be a JMS-layer header indicating encoding.
>>
>> Now, if EXI can be used for a non-composite SOAP message itself (that
>> is, a SOAP message that is not part of a MIME multipart message), then
>> perhaps we need this.
> If the content encoding is GZip, which can be a good practice in some
> scenarios, and you don't have a contentEncoding, you will lose either
> the real mime type (XML or MTOM, say) or the fact that you have
> compressed it with GZip. You definitely need another property to
> indicate the content encoding. This is why the HTTP 1.1 editors did it
> this way.
>
> Actually, MTOM is not directly tied to EXI. EXI is tied to XML.
> MTOM addresses the situation where you have reasonably sized XML
> (outside its binary elements) and at least one binary content item
> that is large.
> EXI addresses the performance issue when you have large XML, whether
> or not it contains a large embedded binary item.
>
> From this perspective, EXI or another content encoding can bring new
> solutions that will definitely be helpful.
>
>>>          * [Definition: If the content encoding is specified, it is
>>> checked to ensure that it matches the content encoding values
>>> supported. A fault MUST be generated with subcode
>>> contentEncodingNotSupported if the encoding values do not match.†]
>> Errrr.  If the Content Encoding contains an unrecognized or unsupported
>> value, the client or server should generate a fault with subcode
>> ContentEncodingNotSupported.
> Yes :)
>
>>>        * [Definition: If no content encoding property is set or no value is
>>> set, the property MUST be assumed as "identity".†]
>> Identity is the only thing that our specification is likely to define.
> Yes, we should clearly indicate that "identity" means no transformation
> (content encoding) has been performed on the message content (body)
> whatsoever.
>
>>>        * [Definition: If soapjms:acceptEncoding was set, the contentEncoding
>>> value SHOULD be set to one of those values.†]
>> Well, no.  It MUST be set to a value corresponding to the encoding
>> used, about which we are not going to say anything.  I hope.
>>
> No, such a matching has to be done on the message and the
> contentEncoding. The acceptEncoding is only there as a helper for the
> caller to know the supported list of values.
>
> I thought "SHOULD" was suitable, but maybe it is a bit strong ...
>
>>> Update to section 2.4 :
>>>
>>> change
>>> "The bytes or characters of the JMS Message payload correspond to the
>>> MIME format as indicated by the definition of the contentType
>>> property"
>>> with
>>> "The bytes or characters of the JMS Message payload correspond to the
>>> MIME format as indicated by the definition of the contentType property
>>> and the contentEncoding property".
>> and if defined, the contentEncoding property.
>>
> Correct.
>
>>> Alter of 2.4.1 :
>>>   a new point in the list of considerations for TextMessage :
>>>        - Messages using the SOAP JMS content encoding will need to use
>>> Content-Transfer-Encoding for attachment parts.
>> I hope this is a typographic error?  You mean Content-Encoding, not
>> Content-Transfer-Encoding, correct?  They're very different things;
> <...>
>> Content-Transfer-Encoding in any HTTP-consistent pseudo-MIME supporting
>> protocol (it's forbidden to use in HTTP), but can use Content-Encoding.
>>
>> </network-protocol-geek>
>>
> This is definitely a bad & ugly typo. Oops ... Glad you did not
> fall into its trap.
> It was obviously contentEncoding ;-)
>
>>> Addendum to section 2.8 :
>>>
>>>   Add of :
>>>   - contentEncodingNotSupported
>>>   - contentEncodingMismatch
>> Mmmm.  With minimal explanation I think.  Vendors MUST support the
>> identity encoding; others will go without mention (my preference).
>>
>>> Addendum to section 3.4 :
>>>   Add the element acceptEncoding in the list.
>> Let's not go there.  We don't need content negotiation; in the first
>> iteration of SOAP/JMS, a clear error pattern is perfectly adequate.
> Again, it is simply an indication to detect a mismatch. If provided, you
> might use the indication, or not. It is up to implementers to go
> further with such an indication and perform on the client side a
> negotiation based on whatever rules. Whatever is implemented
> (ignoring it, using it as-is, applying negotiation-like rules), it
> will stay interoperable.
>
> I really think this accept mechanism is worth it because it eases the
> job of integration and production teams.
>
>>> Add of a new section  :
>>>   X.X Content Encoding
>>>   Content coding values indicate an encoding transformation that has
>>> been or can be applied to the JMS message body content.
>>>   Content codings are primarily used to allow a message body to be
>>> compressed or otherwise usefully transformed without losing the
>>> identity of its underlying media type and without loss of information.
>>>
>>>   All content-coding values are case-sensitive.
>>>
>>>   The Internet Assigned Numbers Authority (IANA) acts as a registry for
>>> content encoding value tokens. Initially the list of valid values is
>>> taken from the HTTP 1.1 Content Coding values (see
>>> http://www.iana.org/assignments/http-parameters/http-parameters.xml#http-parameters-1
>>> ).
>>>
>>>   New content-coding value tokens SHOULD be registered to allow
>>> interoperability between clients and servers, specifications of the
>>> content coding algorithms needed to implement a new value SHOULD be
>>> publicly available and adequate for independent implementation, and
>>> conform to the purpose of content coding defined in this section.
>>>
>>>   An implementation SHOULD support gzip or (and ?) exi content encoding.
>> I think all of this is superfluous--and tending to lead to a need to
>> add more and more references.  Maybe just the pointer to the IANA
>> registry, and the requirement to support identity?
> Well, we can link to the HTTP header entry at IANA, but doing so means
> that any entry there will be valid for SOAPJMS as well.
>
> I do not see a reason the two should diverge, so this point really
> needs to be discussed by all the editors.
>
>> Note: throughout the above, I speak for myself (and for my employer as
>> its representative on this working group), not for the working group as
>> a whole.
>>
>> Summarizing and rephrasing my responses: Thank you for the detailed
>> suggestions; that was very helpful.  I believe that they go too far in
>> some directions; we should require no more than identity support, and I
>> do not believe that we need content negotiation.  I'm not clear whether
>> we need a JMS Header defined, or if we need worry about
>> Content-Encoding only for the components of a multi-part message.
>>
> Thanks Amy for this interesting feedback.
>
> Regards,
> JB
>
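
For reference, the receiver-side behaviour discussed in the quoted thread (a missing contentEncoding treated as "identity", and a fault with subcode contentEncodingNotSupported for any unrecognized or unsupported value) could be sketched roughly as follows. This is only an illustration in Python: the "contentEncoding" property key, the exception class, and the set of supported encodings are hypothetical and are not taken from the SOAP/JMS specification.

```python
import gzip

# Hypothetical names throughout; nothing here is defined by the SOAP/JMS spec.
SUPPORTED_ENCODINGS = {"identity", "gzip"}


class ContentEncodingFault(Exception):
    """Stands in for a SOAP fault carrying the given subcode."""

    def __init__(self, subcode: str, detail: str):
        super().__init__(f"{subcode}: {detail}")
        self.subcode = subcode


def decode_body(body: bytes, properties: dict) -> bytes:
    """Undo the content encoding declared in the message properties."""
    # A missing or empty property is treated as "identity" (no transformation).
    encoding = properties.get("contentEncoding") or "identity"
    # Values are compared case-sensitively, as the proposal suggests.
    if encoding not in SUPPORTED_ENCODINGS:
        raise ContentEncodingFault("contentEncodingNotSupported", encoding)
    if encoding == "gzip":
        return gzip.decompress(body)
    return body
```

A receiver that only supports identity would shrink SUPPORTED_ENCODINGS to that single value, and every other encoding would then produce the fault.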
Received on Monday, 29 November 2010 20:25:22 UTC