Re: ACTION-34: write up specific proposal on how to support TextMessage from Eric Johnson on 2008-09-23 (public-soap-jms@w3.org from September 2008)

From: Eric Johnson <eric@tibco.com>
Date: Mon, 22 Sep 2008 23:04:32 -0700
To: Phil Adams <phil_adams@us.ibm.com>
CC: public-soap-jms@w3.org, public-soap-jms-request@w3.org
Message-ID: <48D886F0.3090302@tibco.com>
A fairly detailed response.  Sorry that it is so close to our meeting.

Phil Adams wrote:
> 
> Hi Eric,
> I added some comments below...
> 
> Phil Adams
> WebSphere Development - Web Services
> IBM Austin, TX
> email: phil_adams@us.ibm.com
> office: (512) 838-6702  (tie-line 678-6702)
> mobile: (512) 750-6599
> 
> 
> 
> From:  Eric Johnson <eric@tibco.com>
> To:  Phil Adams/Austin/IBM@IBMUS
> Cc:  public-soap-jms@w3.org
> Date:  09/19/2008 12:07 PM
> Subject:  Re: ACTION-34: write up specific proposal on how to support 
> TextMessage
> 
> 
> ------------------------------------------------------------------------
> 
> 
> 
> Hi Phil,
> 
> In general, it looks like a good thorough write up.
> 
> I'm still slightly puzzled by the use-case, and I think this helps 
> clarify it - which is certainly the point!
> 
> For a request/reply exchange, it seems like you've clearly identified a 
> messageType property that does this:
> 
> Client -->  (via TextMessage)  --> (received by) server ... which 
> generates a reply
> Client <-- (via ????Message) <-- server sends reply
> 
> This is referenced in your update to section 2.6.2.3.
> *<pca>I think that in the vast majority of cases, the response message 
> will also be a TextMessage, but we have to account for the case where 
> the reply might contain attachments, in which case we'd have to back off 
> and send back a BytesMessage.  With very few exceptions, users will know 
> whether or not they are using attachments and in the majority of 
> situations where they might use "messageType=TEXT", there will be no 
> problem in using a TextMessage for the reply. </pca>*

(I don't believe we need to back off - see comment below on RFC 1341).

A solution I could live with has three possibilities:
  1. only text messages sent by binding (all messages for all ops)
  2. only bytes messages sent (likewise, all messages for all ops)
  3. text or bytes message can be sent by the client, and are supported 
by the server

What we've got now is #2.  Originally, I was pushing for option #3 - 
which says that the recipient has to support whichever it receives.  How 
the message producer decides which type of message to use is 
out-of-scope of the specification, and would be determined in a vendor 
specific way.  Almost certainly, this would mean a vendor defines WSDL 
extensions to carry that information.  It would not be a violation of 
the SOAP/JMS binding if the message was of the "wrong" type, but it 
would be up to the vendor to define whether exceptions would be thrown 
if the wrong type of message was received.  As far as Phil's use cases 
are concerned, IBM could go off and satisfy the customers, but only by 
extending the specification.

Now that I've thought through it more, and contemplated what it would 
mean to send text messages using MIME encoding (see comment below), I'm 
comfortable with an approach that requires all messages, to and fro, to 
be the same type, either text or bytes.

What I haven't figured out is how to express these three options.  Does 
it get specified as three different URLs for the @transport attribute, 
or as an extension attribute of the soap:binding element, or as 
something else?  (I lean towards the extension attribute)

> 
> Here's what makes this particularly interesting - if I understand your 
> use case  - in practice, neither the client nor the server particularly 
> cares whether the message sent is "TextMessage" - presumably, if they're 
> conforming to our (properly complete) specification, they can generate 
> and consume both TextMessage and BytesMessage.  The point is that 
> something in the middle between the two endpoints cares.  Does that 
> something in the middle care per input/output/fault message, per 
> operation, per portType/port, or the entire set of services provided by 
> a server?
> *<pca>The "something in the middle" (i.e. some sort of intermediary) is 
> use-case #3 below, and I'm not sure if we can define with 100% certainty 
> what the exact scope of "text message" should be.     In some cases, the 
> user would want ALL messages to be TextMessage (I suppose), but yet 
> other users might want only messages associated with certain services or 
> operations to use a TextMessage.</pca> *

I guess my concern here is simple - supporting all variations on 
ports, operations, and messages adds a *great* deal of complexity to 
the specification, including possibly annotating input and output 
messages, as well as defining the way defaults are inherited, and 
mapping all of that onto both WSDL 1.1 & WSDL 2.0.

I'm OK with IBM wanting to be able to add that level of complexity 
without violating the SOAP/JMS specification.  However, I'd rather the 
specification itself keep it simple, and have clear distinctions.

> 
> Further, we've written off the notion of using attachments with 
> TextMessage, but it is clearly possible - although I cannot figure any 
> obvious alternative to simply doing a Base64 encoding of the attachment 
> - which implies a 33% increase in the size of the message.  However, if 
> it is fundamentally a "MUST" that the message be carried as a 
> TextMessage, why are we making an exception with respect to 
> attachments?  Might customers want the choice of bigger payload versus 
> the convenience of auditing?
> *<pca>If I read RFC 1341 correctly, then I don't think it would be 
> correct to use base64 encoding of an attachment part.   Section 7.2 
> includes this: "*As stated in the definition of the 
> Content-Transfer-Encoding field, no encoding other than "7bit", "8bit", 
> or "binary" is permitted for entities of type "multipart".*"...  So I 
> don't think it would be correct to create a mime part whose 
> Content-Transfer-Encoding is "base64", which I think is what you are 
> suggesting.</pca>*

Hoping you don't mind my directness, you've read RFC 1341 incorrectly. 
From: http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html

The relevant section of the text, in its entirety, is this:

"As stated in the definition of the Content-Transfer-Encoding field, no 
encoding other than "7bit", "8bit", or "binary" is permitted for 
entities of type "multipart". The multipart delimiters and header fields 
are always 7-bit ASCII in any case, and data within the body parts can 
be encoded on a part-by-part basis, with Content-Transfer-Encoding 
fields for each appropriate body part."

Notice that part about "data within the body parts can be encoded on a 
part-by-part basis"?

The way I think about MIME is as arbitrarily nested boxes.  Any box 
that "contains" other boxes is a "multipart" box.  Any box that has no 
children has a different content type, and uses any of the 
content-transfer encoding types defined in Section 5:
http://www.w3.org/Protocols/rfc1341/5_Content-Transfer-Encoding.html

Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
                              "8BIT"   / "7BIT" /
                              "BINARY" / x-token

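To make the part-by-part point concrete, here is a minimal Java sketch that builds a multipart body where only the attachment part carries "Content-Transfer-Encoding: base64", so the whole body stays text and could in principle be the payload of a TextMessage.  The class and method names (MimeTextSketch, buildBody, encodedLength) are hypothetical, the part headers are illustrative assumptions rather than anything the SOAP/JMS draft mandates, and the enclosing multipart entity itself is assumed to stay 8bit.

```java
import java.util.Base64;

// Hypothetical helper for illustration; not part of the SOAP/JMS spec or the JMS API.
class MimeTextSketch {
    private static final String CRLF = "\r\n";

    /** Number of base64 characters needed for n bytes, before any line breaks. */
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    /** Builds a multipart body in which only the attachment part is base64-encoded. */
    static String buildBody(String boundary, String soapXml, byte[] attachment) {
        // The MIME encoder wraps its output at 76 characters per line.
        String encoded = Base64.getMimeEncoder().encodeToString(attachment);
        return "--" + boundary + CRLF
            + "Content-Type: text/xml; charset=utf-8" + CRLF
            + "Content-Transfer-Encoding: 8bit" + CRLF
            + CRLF
            + soapXml + CRLF
            + "--" + boundary + CRLF
            + "Content-Type: application/octet-stream" + CRLF
            + "Content-Transfer-Encoding: base64" + CRLF
            + CRLF
            + encoded + CRLF
            + "--" + boundary + "--" + CRLF;
    }
}
```

encodedLength(300) is 400, which is the roughly 33% size increase mentioned earlier (slightly more once the line breaks are counted).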

> 
> Some other comments below.
> 
> Phil Adams wrote:
> 
> Hi everyone,
> During our 9/16 call, I took an action item to write up a specific 
> proposal regarding TextMessages that we can hopefully make a decision on 
> fairly soon.
> 
> 
> Background:
> Customer feedback that I've received has indicated that the use of a 
> TextMessage as an alternative to a BytesMessage would be useful in 
> situations where:
> 1.        The user's application needs to interface with a legacy system 
> or another Web services toolkit which might only support TextMessage.
> 2.        The user needs to perform audit logging of SOAP request and 
> response messages.   A TextMessage would make this easier.
> 3.        Using a TextMessage would make it easier for an intermediary 
> to process a message for routing or other purposes, where the processing 
> needs to look at the message content.
> I actually don't buy this last justification for a single moment.  The 
> payload is *XML*.  Surely customers are not doing routing by scanning a 
> text string and hoping to get that right without being aware of 
> surrounding XML context (CDATA sections, comments, etc.)?  Which means 
> that, if the customer is doing this correctly (and I don't think we 
> should be supporting them doing it incorrectly!), they still have to 
> send the data to an XML parser.  Once you're doing that, at least in 
> Java, the difference between sending a bytes payload and a text payload 
> is at most one line of additional code - maybe five if you count the 
> required if statement and lines with "else" and braces.
> *<pca>Well, I'm certainly sorry that you don't buy this use-case for a 
> single moment :)    Perhaps I should challenge my users more when they 
> try to provide justifications for things like this.  Seriously though, I 
> agree that if the user is basing routing decisions on the XML content 
> (possibly a SOAP header value, etc.) then they should be retrieving the 
> XML content in the 'correct' way, and certainly not by simply looking 
> for text strings within the XML stream, etc.     However, the fact 
> remains that if the user perceives that a TextMessage would make that 
> easier then it's hard to convince them otherwise.   I'm not saying it's 
> correct; I'm just saying it is reality.</pca>*

As I said parenthetically above, I don't believe we should support 
incorrect uses of SOAP and XML.  If the customers think this is a valid 
use case, then IBM is welcome to support them, but I think a W3C blessed 
specification should not.
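
To illustrate the earlier claim that handling a text payload versus a bytes payload differs by about one branch, here is a minimal Java sketch using only the JDK's XML parser.  PayloadParseSketch and its methods are hypothetical names; in real JMS code the String would presumably come from TextMessage.getText() and the byte[] from reading a BytesMessage, which is left out here so the example runs without a JMS provider.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not taken from any SOAP/JMS implementation.
class PayloadParseSketch {
    /** Parses a payload that is either a String (text) or a byte[] (bytes). */
    static Document parse(Object payload) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            if (payload instanceof String) {
                // Text payload: wrap the string in a character stream.
                return builder.parse(new InputSource(new StringReader((String) payload)));
            } else if (payload instanceof byte[]) {
                // Bytes payload: the parser detects the encoding itself.
                return builder.parse(new ByteArrayInputStream((byte[]) payload));
            }
        } catch (Exception e) {
            throw new IllegalStateException("payload is not well-formed XML", e);
        }
        throw new IllegalArgumentException("expected a String or byte[] payload");
    }

    /** Text of the first element with the given tag name, or null if none exists. */
    static String firstText(Document doc, String tagName) {
        // Matches by qualified name; real SOAP code would match on namespace as well.
        NodeList nodes = doc.getElementsByTagName(tagName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }
}
```

Given a document where a comment contains a decoy element ahead of the real one, a plain string search finds the decoy first, while the parsed lookup returns the real value for either payload form.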

> 
> As to justification #2, this is somewhat suspect in my view.  What will 
> make auditing easier is a single standard for doing this, not the 
> introduction of TextMessage for the cases where attachments don't 
> apply.  Having never looked at the auditing tools for messages, I don't 
> know whether my analysis is simplistic in practice.  Not knowing the 
> details, I'm certainly willing to go along with this one.
> *<pca>In the situations where I've discussed this with users, I don't 
> think there are any *formalized* auditing tools being used mind you.  I 
> think it is just something the users have implemented in their 
> particular environments and would simply prefer TextMessages since it 
> would make it easier for them to implement the audit logging.</pca>*

In case it wasn't clear in all I wrote up, use-case #2 is where I'm 
willing to go along with a proposal to use text messages.

> 
> The first use-case is certainly quite compelling in principle.  In 
> practice, though, does it work?  If you're using the newly defined 
> standard SOAP/JMS binding for sending SOAP messages over JMS, how is it 
> ever going to be compatible with an existing SOAP/JMS binding that 
> expects a specific set of headers in a specific place?  For us to enable 
> that would require defining some means to map the properties we've 
> defined to JMS Message Properties, so that they could be carried in the 
> appropriate headers for the pre-existing binding.  In short, just 
> throwing a "TextMessage" into the mix maybe solves 10% of the 
> compatibility question.  Isn't it, rather, for us the vendors to support 
> those old bindings?
> *<pca>I certainly see your point on this one, but I don't think it's a 
> black or white issue.   I think it's more a case of the users' legacy 
> system already supporting TextMessage and they are familiar with it, for 
> lack of a better phrase.    I'm sort of speculating at this point, but 
> my guess is that some or most users would be willing to possibly make 
> some minor tweaks to support different property names, etc., but find it 
> more difficult to switch to a different message type.    I don't 
> necessarily think that switching to a BytesMessage would be all that 
> difficult or time-consuming, but I'm speculating that customers think 
> that.</pca>*

Speculating about the use-cases this late in the game makes me nervous 
(any SOAP/JMS consumers out there care to comment?)  Absent specifics 
about how what we might do here helps, I'm very hesitant to race down 
that path.

> 
> These are the main justifications that I've heard from customers.   
> There might be other reasons that some of you know about.
> 
> Proposal:
> My proposal for introducing TextMessage support to the SOAP/JMS spec 
> would include the following:
> 
> 1.  The JMS URI spec (located at: 
> _http://www.ietf.org/internet-drafts/draft-merrick-jms-uri-03.txt_) 
> would be updated as follows:
> 
>     * Introduce support for a new "shared parameter" called
>       "messageType".   Section 4.1.5 would be added to document this as
>       follows:
> 
>       4.1.5   messageType
> 
> This property specifies the JMS message type that should be used for the 
> request message.   The valid values for this property are "TEXT" and 
> "BYTES".  If this property is specified as "TEXT", then a JMS 
> TextMessage MUST be used.   If a value of "BYTES" is specified, then a 
> JMS BytesMessage MUST be used.    If this parameter is not specified, 
> then a value of "BYTES" SHOULD be used.
> 
>     * "messageType" should be added to the list of parameters in section
>       8.2.1.    The "messageType" parameter should also be removed from
>       the request URI that is set on the request message.
>     * Here's an example of the messageType parameter within a JMS URI:
>       jms:jndi:jms/MyQueue?jndiConnectionFactoryName=jms/MyCF&timeToLive=300&deliveryMode=PERSISTENT&messageType=TEXT
> 
> My suggestion is that we make no changes to the URI scheme internet 
> draft.  We can instead follow the pattern of what we did for the 
> "service" parameter that is exposed by the SOAP/JMS binding, but is not 
> defined by the URI scheme.  That is, we leave the URI scheme to continue 
> to capture the details of getting a Destination.
> *<pca>I'm not sure what you are referring to when you say "what we did 
> for the 'service' parameter".     If we didn't put the "messageType" 
> property in the JMS URI scheme internet draft, but instead put it solely 
> in the SOAP/JMS binding spec, would the name actually need to be 
> "SOAPJMS_messageType" when it appears in a JMS endpoint location URI?   
>   I would want it to appear simply as "messageType" to at least *appear* 
> to be similar to the other JMS-message related properties that are 
> already defined in the JMS URI spec.</pca>*

See section 2.2.4 of the spec, specifically how it deals with the 
"targetService" property:
http://dev.w3.org/cvsweb/~checkout~/2008/ws/soapjms/soapjms.html?content-type=text/html;%20charset=utf-8#binding-props-URI
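
Purely to show what the proposed "messageType" query parameter would ask of an implementation (read it, default to "BYTES", and exclude it from the request URI), here is a minimal Java sketch.  The class and method names are hypothetical, values are assumed to be case-sensitive, and plain string handling is used because java.net.URI treats a "jms:jndi:..." URI as opaque and does not expose its query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; the parameter itself is only a proposal.
class MessageTypeParamSketch {
    private static final String PREFIX = "messageType=";

    /** Returns "TEXT" or "BYTES"; defaults to "BYTES" when the parameter is absent. */
    static String messageType(String jmsUri) {
        int q = jmsUri.indexOf('?');
        if (q >= 0) {
            for (String param : jmsUri.substring(q + 1).split("&")) {
                if (param.startsWith(PREFIX)) {
                    String value = param.substring(PREFIX.length());
                    if (!value.equals("TEXT") && !value.equals("BYTES")) {
                        throw new IllegalArgumentException("invalid messageType: " + value);
                    }
                    return value;
                }
            }
        }
        return "BYTES";
    }

    /** Returns the URI with any messageType parameter removed, as for the request URI. */
    static String withoutMessageType(String jmsUri) {
        int q = jmsUri.indexOf('?');
        if (q < 0) {
            return jmsUri;
        }
        List<String> kept = new ArrayList<>();
        for (String param : jmsUri.substring(q + 1).split("&")) {
            if (!param.isEmpty() && !param.startsWith(PREFIX)) {
                kept.add(param);
            }
        }
        String base = jmsUri.substring(0, q);
        return kept.isEmpty() ? base : base + "?" + String.join("&", kept);
    }
}
```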

> 
> 2.  The SOAP/JMS spec (located at: 
> _http://www.w3.org/TR/2008/WD-soapjms-20080723/_) would be updated as 
> follows:
> 
>     * In section 2.2.2 (JMS Message Header properties) we would add a
>       section which describes the "messageType" property:
>       *
>       [Definition: soapjms:messageType] (xsd:string)
>        - *indicates the JMS message type for the request.   The valid
>       values are "TEXT" and "BYTES".  The default value is "BYTES".
>        - optional in URI, optional in WSDL, optional in environment
>        - if specified as "TEXT", then the message sending node MUST use
>       a JMS TextMessage for the request if there
>          are no attachments.   If there are attachments, then a fault is
>       generated. [Definition: use fault subcode *invalidMessageType*]
>        - if specified as "BYTES" or not specified at all, then the
>       message sending node MUST use a JMS BytesMessage for the request.
> 
>       Issue: It's possible to argue that "messageType" is not exactly in
>       the same category as the other "JMS Message Header properties" as
>       it dictates the type of the JMS message, rather than the value of
>       a JMS Message Header, so perhaps this property doesn't belong in
>       section 2.2.2 but belongs in a new section of its own.    I think
>       it's ok in section 2.2.2 but I certainly can understand if others
>       don't agree with that.
>     * In section 2.2.4 (Binding of Properties to URI), the following row
>       should be added to the table:
> 
>       "messageType", "as messageType query parameter", "Should exclude"
>     * In section 2.4 (The JMS Message Body), we should reword the first
>       sentence to indicate that the message should be a TextMessage if
>       messageType=TEXT is specified, and a BytesMessage otherwise.    
>        Also, at the end of section 2.4, we should perhaps add an
>       explanation that if the user requests a TextMessage and
>       attachments exist, then the message sending node MUST generate a
>       fault with subcode invalidMessageType.   This would be redundant
>       with section 2.2.2 (above), so perhaps we can omit it here?
>     * In section 2.6.1.1 (Init), we should change the second sentence so
>       that it specifies that a TextMessage MUST be used if
>       messageType=TEXT is specified, and a BytesMessage MUST be used if
>       messageType=BYTES is specified.      In addition, a row should be
>       added to the table to account for the fact that the messageType
>       property specifies the JMS message type.
>     * In section 2.6.2.3 (Receiving + Sending), we should re-word the
>       third sentence as follows:
> 
>       If the request message is a JMS TextMessage and no attachments
>       exist in the response, then the response MUST be created as a JMS
>       TextMessage, otherwise the response MUST be created as a JMS
>       BytesMessage.
> 
>       Issue: There might be some debate over the handling of this
>       situation.   My opinion is that the message receiving node should
>       try to respond in kind if at all possible, but should use a
>       BytesMessage if attachments exist.     My thought is that we
>       should not generate a fault in this case, but should do our best
>       to get the response back to the requester, although I can see both
>       sides of this issue.
> 
> This highlights the key question for me.  It seems to me like once you 
> want a TextMessage, you always want a text message - for all messages in 
> all operations for an entire portType.  Which means we might just as 
> well define an alternate "@transport" value specifically for carrying 
> TextMessage.
> *<pca>I don't think it's necessarily a given that once you want a 
> TextMessage you always want one, especially if the user also will be 
> invoking operations that involve attachments, etc. Note that not all 
> invocations made by a web services consumer will necessarily be destined 
> for a legacy system, or through an intermediary, etc.   Certainly, a 
> particular web services consumer has the ability to invoke operations on 
> various services which will be delivered to any number of destinations.*

(See my options outlined above.  If we're supporting text messages, I 
think consistency is good - if we have text message anywhere, we use it 
everywhere within a scope.)
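
For comparison, the rules in the proposed wording for sections 2.2.2 and 2.6.2.3 amount to something like the following Java sketch.  The names (MessageTypeRuleSketch, Kind, requestKind, responseKind) are hypothetical, and the IllegalStateException merely stands in for the proposed invalidMessageType fault.

```java
// Hypothetical sketch of the proposed rules; not text from the specification.
class MessageTypeRuleSketch {
    enum Kind { TEXT, BYTES }

    /** Request rule: default to BYTES; TEXT combined with attachments is a fault. */
    static Kind requestKind(Kind requested, boolean hasAttachments) {
        Kind kind = (requested == null) ? Kind.BYTES : requested;
        if (kind == Kind.TEXT && hasAttachments) {
            // Stands in for the proposed fault subcode invalidMessageType.
            throw new IllegalStateException("invalidMessageType");
        }
        return kind;
    }

    /** Response rule: reply in kind, but fall back to BYTES when attachments exist. */
    static Kind responseKind(Kind requestKind, boolean responseHasAttachments) {
        return (requestKind == Kind.TEXT && !responseHasAttachments) ? Kind.TEXT : Kind.BYTES;
    }
}
```

Note the asymmetry: a TEXT request with attachments faults, whereas a TEXT exchange whose response has attachments silently switches to BYTES.  That switch is exactly the mixed-type behaviour a "same type everywhere within a scope" rule would avoid.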

> 
> *Regarding the "transport" value...   the entire WSDL Usage chapter of 
> the SOAP/JMS binding spec is optional for a conforming runtime.   
> Therefore, a vendor might choose not to even use the soap binding's 
> "transport" attribute found in the WSDL (in fact, I think it's easy to 
> argue that a transport-related value like that doesn't even *belong* in 
> the soap binding element, but that's another story).    Given this, I 
> really don't think we should determine the "text vs bytes" question by 
> looking for a specific transport attribute value in the soap binding 
> element within the WSDL document.*
> *</pca>*

If I'm recalling correctly, the original use case for putting 
"deliveryMode", "timeToLive", and "priority" in the URI was so that a 
customer could take an existing WSDL, and configure it for the 
appropriate QOS, without making any other changes to a WSDL document 
other than changing the @transport attribute and the service URI to 
connect to a SOAP/JMS service.

Since those attributes affect the QOS, I think there's a case to be made 
there, though given a preference I'd just as soon restrict them to the 
WSDL, even if that would force other changes to the WSDL.

The text vs. bytes question, to me, doesn't affect the QOS, and 
therefore does not belong in the URI for the service.  If you need that 
level of control, you can add something to the WSDL.

The fact that the WSDL is optional is largely a matter of coupling.  If 
you can figure out some non-WSDL way for your SOAP/JMS services to 
describe themselves to each other, then have at it.  That won't prevent 
you from using the core binding.

However, metadata about configuring a binding deserves to be described 
in the metadata document for that binding - available as WSDL 1.1 or 
2.0.  We just don't require, with the SOAP/JMS binding, that you must 
use one of those metadata descriptions.  You could use another, although 
you'd want it to have equivalent information.

As per the WSDL 2.0 HTTP binding:
http://www.w3.org/TR/2007/REC-wsdl20-adjuncts-20070626/#http-binding

Metadata that configures things like the content encoding, the method to 
invoke, and additional HTTP headers, is all defined in the WSDL, not in 
the HTTP URL to invoke the service.

> 
> This differs from the other characteristics (priority, timeToLive, 
> deliveryMode), in that those properties may affect how the message is 
> routed, and quality of service, whereas using TextMessage shouldn't 
> fundamentally affect anything about the service - at least you've not 
> indicated that it should.
> *<pca>I guess I don't see the distinction quite as clearly as you do.   
>  Fundamentally, all those properties (priority, timeToLive, 
> deliveryMode, and messageType) control some sort of characteristic of 
> the JMS message, whether it's the amount of time the message should 
> "live", whether the message should be persisted or not, the relative 
> priority of the message, and whether the message itself should be a Bytes 
> or a Text message.     In general, these are all message 
> characteristics.    I suppose if we tried hard enough we could come up 
> with four distinct categories for these properties but I'm not sure how 
> useful that would be.</pca>*

Yes, the distinction is murky here.  My druthers would be to take *all* 
of these out of the URI, but I conceded that battle a while back for the 
ones we have.

-Eric.
Received on Tuesday, 23 September 2008 06:05:12 UTC