Re: Allow EXI as characterization for XML in the JMS body ?

On Tue, 9 Nov 2010, Eric Johnson wrote:

> Hi Jean-Baptiste,
>
> I've filed http://www.w3.org/2002/ws/soapjms/tracker/issues/65 so that the 
> working group can track your concern.
>
> I've tried to follow the thread between you and Amy, and I also find myself 
> stuck on a few points.
>
> a) Large binary data - this can already be handled by use of MTOM. I don't 
> see the value of introducing EXI to handle this case.

The goal of EXI is to efficiently serialize an infoset. If you know that 
your payload is mainly binary, then MTOM is the way to go (but you might 
use EXI to compress the infoset part in your MTOM payload).

> b) I couldn't find any documentation/specification that describes how to use 
> EXI with SOAP/HTTP.  From what I can figure out, there are a few corner cases 
> even with HTTP, and it would help me to understand if there was complete 
> documentation of how it would work.

Same as gzip, apart from the fact that it cannot be a Transfer-Encoding.

> b.1) If I'm sending application/exi as the MIME type for a SOAP payload, this 
> obscures the underlying payload (application/soap+xml, for example, or 
> text/xml).  Does it make more sense to define application/soap+exi?

No, EXI can be used as a Content-Encoding (but not a Transfer-Encoding).
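
As a rough sketch of what that looks like on the HTTP side (Python, with 
gzip standing in for EXI because the standard library has no EXI codec; 
the function name and the sample envelope are made up for illustration), 
the media type stays application/soap+xml and only the Content-Encoding 
header records the transformation:

```python
import gzip

# Hypothetical SOAP 1.2 envelope used only as sample data.
SOAP_XML = (
    b'<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">'
    b"<env:Body/></env:Envelope>"
)


def encode_for_http(body: bytes, coding: str = "identity"):
    """Return (headers, wire_body) for a SOAP entity sent over HTTP.

    The Content-Type is never changed by the coding; the coding is only
    reported through Content-Encoding.
    """
    headers = {"Content-Type": "application/soap+xml; charset=utf-8"}
    if coding == "identity":
        wire = body
    elif coding == "gzip":
        wire = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    else:
        # An EXI codec would be plugged in here; none is assumed available.
        raise ValueError(f"unsupported content coding: {coding}")
    headers["Content-Length"] = str(len(wire))
    return headers, wire


headers, wire = encode_for_http(SOAP_XML, "gzip")
plain_headers, plain_wire = encode_for_http(SOAP_XML)
```

A receiver reverses the coding named in Content-Encoding first and then 
processes the result according to the unchanged Content-Type.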

> c) The JMS provider could itself be doing compression already.  Which means 
> that doubling down with gzip or EXI could actually result in worse 
> performance.  I just checked with our internal implementation folks, and they 
> confirmed that automated compression with JMS is already possible, but as 
> with many things JMS, it will be vendor specific.

Then I expect that it would be properly labelled somehow, and double 
compression would be avoided.
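
A minimal sketch of that labelling idea (Python; the "contentEncoding" 
key and the function name are assumptions for illustration, not 
something a JMS provider is known to expose): whoever compresses records 
it, and anything downstream leaves an already-labelled body alone.

```python
import gzip


def compress_if_unlabelled(properties: dict, body: bytes):
    """Gzip the body only when no content encoding is recorded yet."""
    if properties.get("contentEncoding", "identity") != "identity":
        # Already encoded upstream: return everything untouched.
        return properties, body
    labelled = dict(properties, contentEncoding="gzip")
    return labelled, gzip.compress(body)


original = b"<a>" + b"x" * 1000 + b"</a>"
props, once = compress_if_unlabelled({}, original)
props_again, twice = compress_if_unlabelled(props, once)
```

The second call is a no-op because the label is already present, so the 
body is compressed exactly once.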

> d) From my (very quick) review of the EXI specification, it looks like EXI is 
> not necessarily a lossless compression of the underlying data, the way that 
> gzip would be.  In particular, if the schema used as the basis for defining 
> the EXI format can be externalized, then that fundamentally changes the way 
> in which I can process the document, doesn't it?  I mention this, because it 
> comes back to the "mime type" question - if EXI cannot algorithmically be 
> applied (the way gzip can, for example) without reference to external 
> resources, then it really has to be a different MIME type.
>
> Maybe I'm mis-reading EXI on this point?

No, you are not: EXI is not entirely lossless (even if there are settings 
to limit what can be changed); the goal is to provide infoset equivalence. 
The fact that it is not lossless is the reason it cannot be used as a 
Transfer-Encoding in HTTP.
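
To illustrate the difference between byte-level and infoset-level 
equality (a Python 3.8+ sketch that uses XML canonicalization as a 
stand-in for "same infoset"; it does not involve an EXI codec, and the 
two sample documents are made up):

```python
import xml.etree.ElementTree as ET

# Two serializations that differ in attribute order, quoting, spacing
# and empty-element syntax.
doc_a = '<a x="1" y="2"/>'
doc_b = "<a y='2'   x='1'></a>"

# C14N 2.0 normalizes those lexical details away.
canon_a = ET.canonicalize(doc_a)
canon_b = ET.canonicalize(doc_b)

same_bytes = doc_a == doc_b
same_canonical_form = canon_a == canon_b
```

A codec that only promises infoset equivalence may hand back doc_b when 
given doc_a, which is fine for an XML processor but not for anything 
that expects the original octets back.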

> e) Since JMS is an API, unlike HTTP, the notion of content-encoding is 
> somewhat confusing.  It is unclear whether you're informing the API of what 
> it should do to the payload, or whether the payload itself needs to be 
> post-processed to deal with the encoding.  When defined at the protocol layer 
> for HTTP, the use of content-encoding with HTTP makes perfect sense, because 
> it clarifies the form of the data visible at the network transport layer. 
> This makes much less sense at the JMS API level, where the underlying message 
> delivery has already been removed.
>
> I looked for a discussion of how to use EXI with existing SOAP/HTTP, and if I 
> saw an example of that somewhere, that would greatly help my understanding, I 
> think.
>
> -Eric.
>
>
> On 11/9/10 11:26 AM, Jean-Baptiste Bugeaud wrote:
>> Hello Amy,
>> 
>> I do fully agree with your statements. And basically, I was simply
>> thinking of making sure the specification does not prevent EXI (or
>> GZip) from being used as a way to encode the content in an
>> interoperable way.
>> 
>> At this time, there are things that prevent a content encoding (such
>> as EXI) from being used :
>>   - no way to store the exact content encoding used
>>   - mandatory alignment of octet and content type
>>   - no clarification of error scenarios
>> 
>> To solve those issues, an idea would be to add a content encoding
>> feature inspired by the existing HTTP 1.1 Content Coding feature.
>> 
>> Practically, this means adding a SOAPJMS property for tagging the feature
>> and a WSDL property to indicate the content encoding features available
>> on the destination.
>> 
>> Doing so, there will be minimal impact for implementers but clear
>> interoperability and extensibility. If you don't plan to support
>> any content encoding, on the client side you have nothing to do. And,
>> on the server side, you only need to make sure that you reject any message
>> sent with a content encoding (i.e. not the "identity" encoding) according
>> to the spec (i.e. send the specified fault).
>> 
>> Here is a draft of the specification impact of such a proposal.
>> 
>> ===========================
>> 
>> Addendum to section 2.2.1 :
>> 
>> 
>> [Definition: soapjms:acceptEncoding] (list of xsd:string)
>>  * Identifies the list of accepted values for content encoding that
>> can be set using soapjms:contentEncoding.
>>  * [Definition: Each value indicated as an accepted content encoding MUST
>> be supported by the target destination implementation.?]
>>  * [Definition: A caller SHOULD only use values indicated as accepted
>> content encodings.?]
>> 
>> 
>> Addendum to section 2.2.3 :
>> 
>> 
>> [Definition: soapjms:contentEncoding] (xsd:string)
>>  * Identifies the transformation that has been applied to the message
>> payload body.
>>  * [Definition: If the content encoding is specified, it is
>> checked to ensure that it matches the content encoding values
>> supported. A fault MUST be generated with subcode
>> contentEncodingNotSupported if the encoding values do not match.?]
>>  * [Definition: If the content encoding is specified, it is checked to
>> ensure that it matches the encoding value from the supplied XML. A
>> fault MUST be generated with subcode contentEncodingMismatch if the
>> content encoding values do not match the encoded content.?]
>>  * [Definition: If no content encoding property is set or no value is
>> set, the property MUST be assumed as "identity".?]
>>  * [Definition: If soapjms:acceptEncoding was set, the contentEncoding
>> value SHOULD be set to one of those values.?]
>> 
>> Update to section 2.4 :
>> 
>> change
>> "The bytes or characters of the JMS Message payload correspond to the
>> MIME format as indicated by the definition of the contentType
>> property"
>> with
>> "The bytes or characters of the JMS Message payload correspond to the
>> MIME format as indicated by the definition of the contentType property
>> and the contentEncoding property".
>> 
>> change
>> "and specifies an appropriate value for the contentType property which"
>> with
>> "and specifies an appropriate value for the contentType property and
>> contentEncoding property which"
>> 
>> Alteration to section 2.4.1 :
>>   a new point in the list of considerations for TextMessage :
>>  - Messages using the SOAP JMS content encoding will need to use
>> Content-Transfer-Encoding for attachment parts.
>> 
>> 
>> Addendum to section 2.8 :
>>
>>   Addition of :
>>   - contentEncodingNotSupported
>>   - contentEncodingMismatch
>> 
>> Addendum to section 3.4 :
>>   Add the element acceptEncoding in the list.
>> 
>> Addition of a new section :
>>   X.X Content Encoding
>>   Content coding values indicate an encoding transformation that has
>> been or can be applied to the JMS message body content.
>>   Content codings are primarily used to allow a message body to be
>> compressed or otherwise usefully transformed without losing the
>> identity of its underlying media type and without loss of information.
>>
>>   All content-coding values are case-sensitive.
>>
>>   The Internet Assigned Numbers Authority (IANA) acts as a registry for
>> content encoding value tokens. Initially the list of valid values is
>> taken from the HTTP 1.1 Content Coding values (see
>> http://www.iana.org/assignments/http-parameters/http-parameters.xml#http-parameters-1
>> ).
>>
>>   New content-coding value tokens SHOULD be registered to allow
>> interoperability between clients and servers, specifications of the
>> content coding algorithms needed to implement a new value SHOULD be
>> publicly available and adequate for independent implementation, and
>> conform to the purpose of content coding defined in this section.
>>
>>   An implementation SHOULD support gzip or (and ?) exi content encoding.
>> 
>> =====================
>> 
>> Regards,
>> JB
>> 
>> 
>> 2010/11/9 Amelia A Lewis<alewis@tibco.com>:
>>> What specific changes would need to be made to the specification in
>>> order to avoid ruling out the use of EXI?
>>> 
>>> I'm phrasing this differently than you have, of course: I do not think
>>> incorporating an EXI *requirement* is in scope for the SOAP/JMS working
>>> group.  In fact, I do not think, at this stage of the specification
>>> work, that we ought to introduce a new optional dependency.  However, I
>>> will readily acknowledge that a vendor might wish to enable EXI in SOAP
>>> over JMS, and our specification shouldn't forbid it.
>>> 
>>> Rephrasing: I think we should not forbid, nor require, nor even specify
>>> as optional behavior the use of EXI in SOAP over JMS, but should phrase
>>> our specification in such a way that a person or group writing an
>>> extension specification could make using EXI possible and interoperable.
>>> 
>>> Amy!
>>> On Mon, 8 Nov 2010 19:40:58 +0100, Jean-Baptiste Bugeaud wrote:
>>>> Dear SOAP-JMS Editors,
>>>> 
>>>> Could it be possible to allow EXI characterisation (see
>>>> http://www.w3.org/TR/exi/) in the message body §2.4 as well ?
>>>> 
>>>> Using EXI makes sense in a JMS context for any mission critical system
>>>> where latency has to be kept to a minimum.
>>>> 
>>>> EXI would also help a lot :
>>>>   - keeping XML processing CPU overhead low (message inspection for
>>>> active routing for instance)
>>>>   - handling & streaming of huge attachments (base64bin saved as octet
>>>> binary set for instance)
>>>>   - etc
>>>> 
>>>> Regards,
>>>> JB BUGEAUD
>>>> 
>>>> 
>>> --
>>> Amelia A. Lewis
>>> Senior Architect
>>> TIBCO/Extensibility, Inc.
>>> alewis@tibco.com
>>> 
>
>
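
For the draft quoted above, the receiver-side checks on 
soapjms:contentEncoding could be sketched roughly as follows (Python; 
the fault class, the properties dict and the decoder table are 
hypothetical, only the two subcode names and the "identity" default come 
from the draft, and gzip is the only coding wired in since no EXI codec 
is assumed):

```python
import gzip
import zlib


class SoapJmsFault(Exception):
    """Hypothetical fault carrying one of the draft's subcodes."""

    def __init__(self, subcode: str):
        super().__init__(subcode)
        self.subcode = subcode


# Codings this (hypothetical) destination supports.
DECODERS = {"identity": lambda data: data, "gzip": gzip.decompress}


def decode_body(properties: dict, body: bytes) -> bytes:
    # A missing or empty property is treated as "identity".
    coding = properties.get("contentEncoding") or "identity"
    decoder = DECODERS.get(coding)
    if decoder is None:
        raise SoapJmsFault("contentEncodingNotSupported")
    try:
        return decoder(body)
    except (OSError, EOFError, zlib.error):
        # The body does not match the coding it was labelled with.
        raise SoapJmsFault("contentEncodingMismatch")


decoded = decode_body({"contentEncoding": "gzip"}, gzip.compress(b"<a/>"))
```

An implementation that supports no content encoding at all would keep 
only the "identity" entry and fault on everything else.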

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Wednesday, 10 November 2010 18:01:09 UTC