Re: ACTION-34: write up specific proposal on how to support TextMessage

Hi Eric,
I added some comments below...

Phil Adams 
WebSphere Development - Web Services
IBM Austin, TX
email: phil_adams@us.ibm.com
office: (512) 838-6702  (tie-line 678-6702)
mobile: (512) 750-6599




From: Eric Johnson <eric@tibco.com>
To: Phil Adams/Austin/IBM@IBMUS
Cc: public-soap-jms@w3.org
Date: 09/19/2008 12:07 PM
Subject: Re: ACTION-34: write up specific proposal on how to support TextMessage



Hi Phil,

In general, it looks like a good thorough write up.

I'm still slightly puzzled by the use-case, and I think this helps clarify 
it - which is certainly the point!

For a request/reply exchange, it seems like you've clearly identified a 
messageType property that does this:

Client -->  (via TextMessage)  --> (received by) server ... which 
generates a reply
Client <-- (via ????Message) <-- server sends reply

This is referenced in your update to section 2.6.2.3.
<pca>I think that in the vast majority of cases, the response message will 
also be a TextMessage, but we have to account for the case where the reply 
might contain attachments, in which case we'd have to back off and send 
back a BytesMessage.  With very few exceptions, users will know whether or 
not they are using attachments and in the majority of situations where 
they might use "messageType=TEXT", there will be no problem in using a 
TextMessage for the reply.</pca>

Here's what makes this particularly interesting - if I understand your use 
case - in practice, neither the client nor the server particularly cares 
whether the message sent is "TextMessage" - presumably, if they're 
conforming to our (properly complete) specification, they can generate and 
consume both TextMessage and BytesMessage.  The point is that something in 
the middle between the two endpoints cares.  Does that something in the 
middle care per input/output/fault message, per operation, per 
portType/port, or the entire set of services provided by a server?
<pca>The "something in the middle" (i.e. some sort of intermediary) is 
use-case #3 below, and I'm not sure if we can define with 100% certainty 
what the exact scope of "text message" should be.     In some cases, the 
user would want ALL messages to be TextMessage (I suppose), but yet other 
users might want only messages associated with certain services or 
operations to use a TextMessage.</pca> 

Further, we've written off the notion of using attachments with 
TextMessage, but it is clearly possible - although I cannot figure any 
obvious alternative to simply doing a Base64 encoding of the attachment - 
which implies a 33% increase in the size of the message.  However, if it 
is fundamentally a "MUST" that the message be carried as a TextMessage, 
why are we making an exception with respect to attachments?  Might 
customers want the choice of bigger payload versus the convenience of 
auditing?
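
The 33% figure follows from Base64 mapping every 3 input bytes to 4 output characters. A minimal Java sketch of that arithmetic, using the standard java.util.Base64 encoder (the class name Base64Overhead and the 3000-byte attachment size are made up for illustration):

```java
import java.util.Base64;

final class Base64Overhead {
    /** Length in characters of the unwrapped Base64 encoding of rawLength bytes. */
    static int encodedLength(int rawLength) {
        // The basic encoder pads to a multiple of 4 and inserts no line breaks.
        return Base64.getEncoder().encode(new byte[rawLength]).length;
    }

    public static void main(String[] args) {
        int raw = 3000; // hypothetical attachment size in bytes
        System.out.println(raw + " bytes -> " + encodedLength(raw) + " Base64 characters");
        // prints "3000 bytes -> 4000 Base64 characters"
    }
}
```

MIME-style Base64 also wraps lines, so the real overhead for an attachment carried that way would be slightly above one third.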
<pca>If I read RFC 1341 correctly, then I don't think it would be correct 
to use base64 encoding of an attachment part.   Section 7.2 includes this: 
"As stated in the definition of the Content-Transfer-Encoding field, no 
encoding other than "7bit", "8bit", or "binary" is permitted for entities 
of type "multipart"."...  So I don't think it would be correct to create a 
mime part whose Content-Transfer-Encoding is "base64", which I think is 
what you are suggesting.</pca>

Some other comments below.

Phil Adams wrote: 

Hi everyone, 
During our 9/16 call, I took an action item to write up a specific 
proposal regarding TextMessages that we can hopefully make a decision on 
fairly soon. 


Background: 
Customer feedback that I've received has indicated that the use of a 
TextMessage as an alternative to a BytesMessage would be useful in 
situations where: 
1.      The user's application needs to interface with a legacy system or 
another Web services toolkit which might only support TextMessage. 
2.      The user needs to perform audit logging of SOAP request and 
response messages.   A TextMessage would make this easier. 
3.      Using a TextMessage would make it easier for an intermediary to 
process a message for routing or other purposes, where the processing 
needs to look at the message content.
I actually don't buy this last justification for a single moment.  The 
payload is *XML*.  Surely customers are not doing routing by scanning a 
text string and hoping to get that right without being aware of 
surrounding XML context (CDATA sections, comments, etc.)?  Which means 
that, if the customer is doing this correctly (and I don't think we should 
be supporting them doing it incorrectly!), they still have to send the 
data to an XML parser.  Once you're doing that, at least in Java, the 
difference between sending a bytes payload and a text payload is at most 
one line of additional code - maybe five if you count the required if 
statement and lines with "else" and braces.
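
As a rough illustration of that "if/else" point, here is a sketch in plain Java. It assumes the payload has already been pulled out of the JMS message as either a String (for example via TextMessage.getText()) or a byte[] (read from a BytesMessage); the class and method names are hypothetical, and the JMS API itself is left out so the snippet stays self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class PayloadParser {
    /** Parses a SOAP payload that is either a String (text body) or a byte[] (bytes body). */
    static Document parse(Object payload) {
        InputSource source;
        if (payload instanceof String) {
            source = new InputSource(new StringReader((String) payload));
        } else if (payload instanceof byte[]) {
            source = new InputSource(new ByteArrayInputStream((byte[]) payload));
        } else {
            throw new IllegalArgumentException("payload must be String or byte[]");
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(source);
        } catch (Exception e) {
            // Wrap parser/IO failures so callers do not need checked exceptions.
            throw new IllegalArgumentException("payload is not well-formed XML", e);
        }
    }

    /** Example of a routing input taken from the parsed tree rather than from a text scan. */
    static String rootLocalName(Object payload) {
        return parse(payload).getDocumentElement().getLocalName();
    }
}
```

Everything after the InputSource is created is identical for both payload kinds, which is the point being made: the branch on message type is the only extra code.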
<pca>Well, I'm certainly sorry that you don't buy this use-case for a 
single moment :)    Perhaps I should challenge my users more when they try 
to provide justifications for things like this.  Seriously though, I agree 
that if the user is basing routing decisions on the XML content (possibly 
a SOAP header value, etc.) then they should be retrieving the XML content 
in the 'correct' way, and certainly not by simply looking for text strings 
within the XML stream, etc.     However, the fact remains that if the user 
perceives that a TextMessage would make that easier then it's hard to 
convince them otherwise.   I'm not saying it's correct; I'm just saying it is 
reality.</pca>

As to justification #2, this is somewhat suspect in my view.  What will 
make auditing easier is a single standard for doing this, not the 
introduction of TextMessage for the cases where attachments don't apply.  
Having never looked at the auditing tools for messages, I don't know 
whether my analysis is simplistic in practice.  Not knowing the details, 
I'm certainly willing to go along with this one.
<pca>In the situations where I've discussed this with users, I don't think 
there are any *formalized* auditing tools being used, mind you.  I think it 
is just something the users have implemented in their particular 
environments and would simply prefer TextMessages since it would make it 
easier for them to implement the audit logging.</pca>

The first use-case is certainly quite compelling in principle.  In 
practice, though, does it work?  If you're using the newly defined 
standard SOAP/JMS binding for sending SOAP messages over JMS, how is it ever 
going to be compatible with an existing SOAP/JMS binding that expects a 
specific set of headers in a specific place?  For us to enable that would 
require defining some means to map the properties we've defined to JMS 
Message Properties, so that they could be carried in the appropriate 
headers for the pre-existing binding.  In short, just throwing a 
"TextMessage" into the mix maybe solves 10% of the compatibility 
question.  Isn't it, rather, for us the vendors to support those old 
bindings?
<pca>I certainly see your point on this one, but I don't think it's a 
black or white issue.   I think it's more a case of the users' legacy 
system already supporting TextMessage and they are familiar with it, for 
lack of a better phrase.    I'm sort of speculating at this point, but my 
guess is that some or most users would be willing to possibly make some 
minor tweaks to support different property names, etc., but find it more 
difficult to switch to a different message type.    I don't necessarily 
think that switching to a BytesMessage would be all that difficult or 
time-consuming, but I'm speculating that customers think that.</pca>

These are the main justifications that I've heard from customers.   There 
might be other reasons that some of you know about. 

Proposal: 
My proposal for introducing TextMessage support to the SOAP/JMS spec would 
include the following: 

1.  The JMS URI spec (located at: 
http://www.ietf.org/internet-drafts/draft-merrick-jms-uri-03.txt) would be 
updated as follows:
Introduce support for a new "shared parameter" called "messageType".   
Section 4.1.5 would be added to document this as follows:

4.1.5   messageType
This property specifies the JMS message type that should be used for the 
request message.   The valid values for this property are "TEXT" and 
"BYTES".  If this property is specified as "TEXT", then a JMS TextMessage 
MUST be used.   If a value of "BYTES" is specified, then a JMS 
BytesMessage MUST be used.    If this parameter is not specified, then a 
value of "BYTES" SHOULD be used.
"messageType" should be added to the list of parameters in section 8.2.1. 
   The "messageType" parameter should also be removed from the request URI 
that is set on the request message.
Here's an example of the messageType parameter within a JMS URI:
jms:jndi:jms/MyQueue?jndiConnectionFactoryName=jms/MyCF&timeToLive=300&deliveryMode=PERSISTENT&messageType=TEXT
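
A minimal Java sketch of how a sending node might read this parameter and then strip it from the request URI, as proposed above. It assumes the query part starts at "?", that values are matched case-sensitively, and that no percent-decoding is needed; the class and method names are hypothetical and are not taken from the JMS URI draft:

```java
import java.util.StringJoiner;

final class JmsUriMessageType {
    private static final String PREFIX = "messageType=";

    /** Returns "TEXT" or "BYTES"; an absent parameter falls back to "BYTES". */
    static String messageType(String uri) {
        String value = "BYTES";
        int q = uri.indexOf('?');
        if (q >= 0) {
            for (String pair : uri.substring(q + 1).split("&")) {
                if (pair.startsWith(PREFIX)) {
                    value = pair.substring(PREFIX.length());
                }
            }
        }
        if (!value.equals("TEXT") && !value.equals("BYTES")) {
            throw new IllegalArgumentException("invalid messageType: " + value);
        }
        return value;
    }

    /** Rebuilds the URI without the messageType parameter, keeping the others in order. */
    static String withoutMessageType(String uri) {
        int q = uri.indexOf('?');
        if (q < 0) {
            return uri;
        }
        StringJoiner kept = new StringJoiner("&");
        for (String pair : uri.substring(q + 1).split("&")) {
            if (!pair.startsWith(PREFIX)) {
                kept.add(pair);
            }
        }
        String base = uri.substring(0, q);
        return kept.length() == 0 ? base : base + "?" + kept;
    }
}
```

With a URI that ends in messageType=TEXT, messageType(...) yields "TEXT" and withoutMessageType(...) returns the same URI minus that parameter.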
My suggestion is that we make no changes to the URI scheme internet 
draft.  We can instead follow the pattern of what we did for the "service" 
parameter that is exposed by the SOAP/JMS binding, but is not defined by 
the URI scheme.  That is, we leave the URI scheme to continue to capture 
the details of getting a Destination.
<pca>I'm not sure what you are referring to when you say "what we did for 
the 'service' parameter".     If we didn't put the "messageType" property 
in the JMS URI scheme internet draft, but instead put it solely in the 
SOAP/JMS binding spec, would the name actually need to be 
"SOAPJMS_messageType" when it appears in a JMS endpoint location URI? I 
would want it to appear simply as "messageType" to at least *appear* to be 
similar to the other JMS-message related properties that are already 
defined in the JMS URI spec.</pca>

2.  The SOAP/JMS spec (located at: 
http://www.w3.org/TR/2008/WD-soapjms-20080723/) would be updated as 
follows: 
In section 2.2.2 (JMS Message Header properties) we would add a section 
which describes the "messageType" property:

[Definition: soapjms:messageType] (xsd:string)
 - indicates the JMS message type for the request.   The valid values are 
"TEXT" and "BYTES".  The default value is "BYTES".
 - optional in URI, optional in WSDL, optional in environment
 - if specified as "TEXT", then the message sending node MUST use a JMS 
TextMessage for the request if there are no attachments.   If there are 
attachments, then a fault is generated. [Definition: use fault subcode 
invalidMessageType]
 - if specified as "BYTES" or not specified at all, then the message 
sending node MUST use a JMS BytesMessage for the request.
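
The request-side rule in this definition can be sketched as a small decision function. This is only an illustration of the wording above: the enum, the exception class standing in for a fault with subcode invalidMessageType, and the method name are all hypothetical:

```java
final class RequestMessageTypeRule {
    enum JmsBody { TEXT_MESSAGE, BYTES_MESSAGE }

    /** Hypothetical stand-in for a SOAP fault carrying subcode invalidMessageType. */
    static final class InvalidMessageTypeFault extends RuntimeException {
        InvalidMessageTypeFault(String message) {
            super(message);
        }
    }

    /**
     * messageType is the soapjms:messageType value, or null when it was not specified.
     */
    static JmsBody forRequest(String messageType, boolean hasAttachments) {
        if (messageType == null || messageType.equals("BYTES")) {
            return JmsBody.BYTES_MESSAGE; // default, and explicit BYTES
        }
        if (messageType.equals("TEXT")) {
            if (hasAttachments) {
                throw new InvalidMessageTypeFault("messageType=TEXT cannot carry attachments");
            }
            return JmsBody.TEXT_MESSAGE;
        }
        throw new IllegalArgumentException("invalid messageType: " + messageType);
    }
}
```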

Issue: It's possible to argue that "messageType" is not exactly in the 
same category as the other "JMS Message Header properties" as it dictates 
the type of the JMS message, rather than the value of a JMS Message 
Header, so perhaps this property doesn't belong in section 2.2.2 but 
belongs in a new section of its own.    I think it's ok in section 2.2.2 
but I certainly can understand if others don't agree with that.
In section 2.2.4 (Binding of Properties to URI), the following row should 
be added to the table:

"messageType", "as messageType query parameter", "Should exclude"
In section 2.4 (The JMS Message Body), we should reword the first sentence 
to indicate that the message should be a TextMessage if messageType=TEXT 
is specified, and a BytesMessage otherwise.      Also, at the end of 
section 2.4, we should perhaps add an explanation that if the user 
requests a TextMessage and attachments exist, then the message sending 
node MUST generate a fault with subcode invalidMessageType.   This would 
be redundant with section 2.2.2 (above), so perhaps we can omit it here?
In section 2.6.1.1 (Init), we should change the second sentence so that it 
specifies that a TextMessage MUST be used if messageType=TEXT is 
specified, and a BytesMessage MUST be used if messageType=BYTES is 
specified.      In addition, a row should be added to the table to account 
for the fact that the messageType property specifies the JMS message type.
In section 2.6.2.3 (Receiving + Sending), we should re-word the third 
sentence as follows:

If the request message is a JMS TextMessage and no attachments exist in 
the response, then the response MUST be created as a JMS TextMessage, 
otherwise the response MUST be created as a JMS BytesMessage.
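
Expressed as code, the proposed response rule is a single condition. A sketch with hypothetical names, assuming the receiving node knows the request's message type and whether the response has attachments:

```java
final class ResponseMessageTypeRule {
    enum JmsBody { TEXT_MESSAGE, BYTES_MESSAGE }

    /** Reply in kind to a TextMessage request unless the response carries attachments. */
    static JmsBody forResponse(JmsBody requestBody, boolean responseHasAttachments) {
        if (requestBody == JmsBody.TEXT_MESSAGE && !responseHasAttachments) {
            return JmsBody.TEXT_MESSAGE;
        }
        return JmsBody.BYTES_MESSAGE; // no fault: fall back to bytes
    }
}
```

Unlike the request side, no fault is raised here; a response with attachments simply falls back to a BytesMessage.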

Issue: There might be some debate over the handling of this situation.   
My opinion is that the message receiving node should try to respond in 
kind if at all possible, but should use a BytesMessage if attachments 
exist.     My thought is that we should not generate a fault in this case, 
but should do our best to get the response back to the requester, although 
I can see both sides of this issue.
This highlights the key question for me.  It seems to me like once you 
want a TextMessage, you always want a text message - for all messages in 
all operations for an entire portType.  Which means we might just as well 
define an alternate "@transport" value specifically for carrying 
TextMessage.
<pca>I don't think it's necessarily a given that once you want a 
TextMessage you always want one, especially if the user also will be 
invoking operations that involve attachments, etc. Note that not all 
invocations made by a web services consumer will necessarily be destined 
for a legacy system, or through an intermediary, etc.   Certainly, a 
particular web services consumer has the ability to invoke operations on 
various services which will be delivered to any number of destinations.

Regarding the "transport" value...   the entire WSDL Usage chapter of the 
SOAP/JMS binding spec is optional for a conforming runtime.   Therefore, a 
vendor might choose not to even use the soap binding's "transport" 
attribute found in the WSDL (in fact, I think it's easy to argue that a 
transport-related value like that doesn't even *belong* in the soap 
binding element, but that's another story).    Given this, I really don't 
think we should determine the "text vs bytes" question by looking for a 
specific transport attribute value in the soap binding element within the 
WSDL document.
</pca>

This differs from the other characteristics (priority, timeToLive, 
deliveryMode), in that those properties may affect how the message is 
routed, and quality of service, whereas using TextMessage shouldn't 
fundamentally affect anything about the service - at least you've not 
indicated that it should.
<pca>I guess I don't see the distinction quite as clearly as you do. 
Fundamentally, all those properties (priority, timeToLive, deliveryMode, 
and messageType) control some sort of characteristic of the JMS message, 
whether it's the amount of time the message should "live", whether the 
message should be persisted or not, the relative priority of the message, 
and whether the message itself should be a Bytes or a Text message.     In 
general, these are all message characteristics.    I suppose if we tried 
hard enough we could come up with four distinct categories for these 
properties but I'm not sure how useful that would be.</pca>

In section 2.6.2.3 (Receiving + Sending), we should add a row to the table 
to account for the JMS message type and explain that it is derived from 
the request message type.
In section 2.7.1 (Behaviour of Sending SOAP Node), the first sentence of 
the second paragraph should be re-worded to account for TextMessage as 
well as BytesMessage, similar to section 2.6.1.1 above.     In addition, 
we should add a row to the table to account for the JMS message type.
In section 2.8 (Faults), we should add "invalidMessageType" to the list of 
fault subcodes.
In section 3.4.1 (Example), we might consider adding an example of the 
"messageType" property to the existing WSDL 1.1 example, perhaps like 
this:

      <wsdl11:operation name="GetLastTradePrice">
        <wsdl11soap11:operation soapAction="http://example.com/GetLastTradePrice"/>
        <wsdl11:input>
            <wsdl11soap11:body use="literal"/>
        </wsdl11:input>
        <wsdl11:output>
            <wsdl11soap11:body use="literal"/>
        </wsdl11:output>
        <!-- proposed addition -->
        <soapjms:messageType>TEXT</soapjms:messageType>
      </wsdl11:operation>

This would indicate that a TextMessage should be used when the client 
invokes the "GetLastTradePrice" operation.
Do we need/want this level of per-operation granularity? It appears here 
that you're setting it on an operation - but the use cases above all 
suggest that it matters really only at the level of an entire 
portType/port.
<pca>I agree that the use-cases above might not justify setting the 
messageType property at the operation level, but I was looking at this 
from the standpoint of trying to make the "messageType" property look and 
behave like the other JMS message-related properties (timeToLive, 
priority, deliveryMode).   I personally think that the most useful means 
of setting "messageType" would be in the JMS endpoint location URI, but I 
would also like to see this specified in a fashion that is consistent with 
our already existing properties.</pca>


In section 3.6 (Properties), we should add a row to the table:

"messageType", "service, port/endpoint, binding"
I identified a couple of issues above, but there's one more that was 
discussed on our calls and I'm not sure there was a resolution to it.  It 
has to do with the encoding of the JMS message vs the encoding of the SOAP 
envelope within the JMS message.    We will need to discuss and resolve 
this... 
I may come across as being contrarian with regard to this proposal.  I do 
see the value of supporting TextMessage; I'm just not clear on the best 
way to support it.

-Eric.

Regards, 

Phil Adams 

Received on Monday, 22 September 2008 19:32:56 UTC