RE: Asynchronous Web Services from Edwin Khodabakchian on 2002-07-20 (www-ws-arch@w3.org from July 2002)

From: Edwin Khodabakchian <edwink@collaxa.com>
Date: Sat, 20 Jul 2002 15:04:27 -0700
To: "'David Booth'" <dbooth@w3.org>, "'Paul Prescod'" <paul@prescod.net>, <www-ws-arch@w3.org>
Message-ID: <001201c23039$6aba0440$cddffea9@collaxa.net>

David,

I very much agree with the two-correlation-ID approach:

The clientCorrelationId could be a URI if the client can host URIs, or
a UUID if the client cannot (an Excel application, for example, cannot
host a URI).

The serverCorrelationId should be the URI of the initiated long-lived
transaction[1]. A GET on that URI could provide the state of the
transaction, its status, its ATC, the client correlation ID, etc.
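
To make the two-correlation-ID idea concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions: the names
TransactionServer, Transaction and new_client_correlation_id, the
example.org base URI, the "started" status value and the field names of
the returned state are all hypothetical, and the server is an in-memory
stand-in rather than a real HTTP service.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass
class Transaction:
    server_correlation_id: str  # URI of the long-lived transaction
    client_correlation_id: str  # URI or UUID supplied by the client
    status: str = "started"     # hypothetical status value


class TransactionServer:
    """In-memory stand-in for a server hosting long-lived transactions."""

    def __init__(self, base_uri: str) -> None:
        self._base_uri = base_uri.rstrip("/")
        self._transactions: Dict[str, Transaction] = {}

    def initiate(self, client_correlation_id: str) -> str:
        """Start a transaction and return its URI (the serverCorrelationId)."""
        uri = f"{self._base_uri}/{uuid.uuid4()}"
        self._transactions[uri] = Transaction(uri, client_correlation_id)
        return uri

    def get(self, server_correlation_id: str) -> Dict[str, str]:
        """Roughly what a GET on the transaction URI might return."""
        tx = self._transactions[server_correlation_id]
        return {
            "serverCorrelationId": tx.server_correlation_id,
            "clientCorrelationId": tx.client_correlation_id,
            "status": tx.status,
        }


def new_client_correlation_id() -> str:
    """A client that cannot host a URI falls back to a UUID."""
    return str(uuid.uuid4())


# Usage: the client supplies its own ID and receives the transaction URI.
server = TransactionServer("http://example.org/transactions")
client_id = new_client_correlation_id()
transaction_uri = server.initiate(client_id)
state = server.get(transaction_uri)
print(state["status"], state["clientCorrelationId"] == client_id)
```

Because the state returned for the transaction URI carries the client
correlation ID, either identifier is enough to find the other, which is
the property David describes.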

But this is one of those cases where any standard approach would be
better than no standard approach. Defining an unambiguous way to
declare and implement asynchronous web services would be very valuable:
(1) a lot of vendors are creating their own proprietary approaches to
this problem.
(2) the lack of a standard holds back the adoption of asynchronous web
services, which I believe will be one of the dominant forms of web
services when people start to implement real-world applications.
(3) asynchrony will move developers from the current RPC incarnation of
SOAP to something that will be more reliable, scalable and adaptable.

Best,

Edwin

[1] Please note that I use the term long-lived transaction where you use
the term conversation. The two terms have similar requirements when it
comes to asynchrony and correlation, but they will differ when it comes
to flow control and transactional semantics.





> -----Original Message-----
> From: David Booth [mailto:dbooth@w3.org] 
> Sent: Saturday, July 20, 2002 1:43 PM
> To: Edwin Khodabakchian; 'Paul Prescod'; www-ws-arch@w3.org
> Subject: Re: Asynchronous Web Services
> 
> 
> At 04:54 PM 7/19/2002 -0700, Edwin Khodabakchian wrote:
> 
> >Paul,
> >
> >This is a very good explanation.
> >
> >How would you address the case where the client initiates the
> >long-lived transaction through a one-way protocol like SMTP or JMS?
> >It seems to me that in that case, the client needs to specify the
> >correlationId.
> 
> Another possibility is that the client and the service could each have
> their own correlation ID.  The client could supply a clientCorrelationID
> if the client initiates the interaction with the service, and the
> service could respond with a serverCorrelationID.  The
> clientCorrelationID could be a URI that GETs the client's state (which
> should include the corresponding serverCorrelationID as soon as it is
> known), and the serverCorrelationID could be a URI that GETs the
> server's state (which would also include the corresponding
> clientCorrelationID).  Since each one points to the other, either one
> could be used to unambiguously identify the conversation as a whole.
> 
> 
> >As to whether the correlationId should be a URI, some clients (like a
> >VB app) cannot host URIs: those clients interact with the long-lived
> >transaction through polling (passing in the correlationId as part of
> >each request).
> >
> >It seems to me that a correlationId is not a resource but rather a key
> >used as part of an asynchronous interaction.
> 
> Do you mean "key" as in "database key"?  I think it's true that a
> correlationID might conventionally be viewed like a database key -- with
> a scope limited to the particular application.  But I don't think this
> would be a good idea for a Web application.  In the global world of the
> Web and Web Services I think it would be better to give it global scope,
> as a URI.
> 
> We are accustomed to thinking of a URI (well, a URL actually) as being
> the address of a document.  But in fact, a URI really *is* a key, in the
> database sense.  The difference between a traditional database key and a
> URI is its scope.  A database key is a unique identifier for something
> that lives within the very limited scope of a particular database -- not
> the world as a whole -- and is difficult to use sensibly outside of that
> scope.  A URI is a unique identifier for something that lives within the
> global scope of the World Wide Web.  It is globally unique, which makes
> it easy to use sensibly in new contexts beyond the limited scope of the
> initial application.  This will be particularly important in Web
> Services, as smaller Services are combined in new ways to make larger
> Services.
> 
> 
> >Otherwise, I would tend to agree with you that the initiated long-lived
> >transaction should be a resource hosted by the server and whose state
> >should be available through GET.
> >
> >Best,
> >
> >Edwin
> >http://www.collaxa.com
> >
> >
> > > -----Original Message-----
> > > From: www-ws-arch-request@w3.org [mailto:www-ws-arch-request@w3.org]
> > > On Behalf Of Paul Prescod
> > > Sent: Wednesday, July 17, 2002 4:33 PM
> > > To: David Orchard; www-ws-arch@w3.org
> > > Subject: REST, Conversations and Reliability
> > >
> > >
> > >
> > > David offers the following URI:
> > >
> > > http://dev2dev.bea.com/techtrack/SOAPConversation.jsp
> > >
> > > In my mind, it is a perfect example of a protocol that can be 
> > > enhanced by applying some REST discipline.
> > >
> > > The BEA proposal introduces a concept of "ConversationID" which
> > > represents a conversation. It also introduces a state machine that
> > > allows the participants to move through the stages from "no
> > > conversation" to "talking" to "finished conversing". It defines ways
> > > that headers are used to move through those stages. It also defines
> > > how a callback URI can be presented. It has quite a resemblance to
> > > the ideas in the HTTPEvents draft.
> > >
> > > Now let me apply a combination of REST discipline and my own 
> > > thoughts about networking.
> > >
> > > Let's call the recipient of the first message the "server" and the
> > > sender of the first message the "client" although at an HTTP level
> > > they may switch roles if the exchange is asynchronous.
> > >
> > > The server needs to deal with N incoming conversations and needs to
> > > keep them all straight. Also, the server by definition has the
> > > capability to host URIs but the client may or may not. For this and
> > > other reasons, I feel that the conversation ID should be generated
> > > by the recipient, not the sender. Most important: the recipient can
> > > trivially generate IDs unique to them. The sender can at best use
> > > UUIDs to reduce the chances of collision.
> > >
> > > Second, the conversation ID should be a (surprise!) http URI. It
> > > should point to a conversation resource. Obviously if the
> > > conversation is necessary to the successful completion of the
> > > discussion then it is an important resource and deserves a URI. This
> > > isn't just theoretically clean; it is extremely important in practice
> > > as will become clear in a moment.
> > >
> > > Let's think about reliability.
> > >
> > > What happens if the conversation-constructing message is lost? 
> > > That's okay. The client can just send it again.
> > >
> > > What happens if the conversation-constructing response is lost?
> > > That's okay. The client can just set up a new conversation resource
> > > and the server can dispose of the unused one after a timeout.
> > >
> > > Now both partners are in the "conversing" state. But the big 
> > > difference between the original proposal and the REST proposal is 
> > > that the REST proposal makes this state explicit in terms of 
> > > universally addressable resources.
> > >
> > > According to the original proposal, callbacks refer to the
> > > conversation ID. In my proposal, callbacks would also refer to the
> > > conversation resource. But the conversation resource would be a real
> > > data-containing resource. For instance in an instant-messaging
> > > application, the conversation resource would list which users are
> > > involved with the discussion. In an order negotiation application,
> > > the conversation resource could point to the good being bought or
> > > sold. Note that the server by definition has access to this
> > > information so it is just a case of giving the information a URI so
> > > that it may be looked up at runtime by the client or third parties.
> > >
> > > This is important for a variety of reasons. First, it means that
> > > clients can be stateless and thus simpler. It means that the
> > > client-end of a conversation can migrate from one machine to another
> > > merely by passing the conversation ID URI (and authentication
> > > information). It means that an (authorized) third-party application
> > > like a logger, auditor or security filter can apprise itself of the
> > > full state of the conversation just by following the URI in the
> > > message.
> > >
> > > A conversation resource is not in any way tied to any particular
> > > nodes/endpoints. Once it is set up, dozens or hundreds of
> > > participants can be involved without any major architectural shift.
> > > The third, fourth, etc. participants are brought in merely by
> > > forwarding them the URI. There are no hard-coded roles of "client"
> > > and "server" after the conversation is set up. There is "the server
> > > maintaining the conversation" and "everybody else".
> > >
> > > Also, stateless presentation tools like XSLT stylesheets can extract
> > > information for rendering the transmitted message. Assertions can be
> > > made about conversation resources using RDF. An HTML representation
> > > of resources can be used for technical support and debugging.
> > >
> > > Most important: if the client or other participant misses a message,
> > > gets state corrupted or otherwise gets confused about the state of
> > > the conversation, it can refresh itself with a simple GET. That's a
> > > scalable approach to reliability. Under the original protocol, there
> > > is no way for a confused client that has missed a message to check
> > > whether the conversation is still ongoing and thus it should expect
> > > more messages. For instance if the client is momentarily offline,
> > > there is no way for it to check whether the server timed-out
> > > in the meantime.
> > >
> > > The original proposal says:
> > >
> > > "The ContinueHeader
> > >        MUST be sent on any messages to operations that are marked in
> > >        the WSDL as requesting a ContinueHeader."
> > >
> > > I feel that this is too large grained of a constraint. In some 
> > > cases, conversations will need to nest. For instance there is the 
> > > conversation that sets up a shopper/seller relationship and then 
> > > within that there are conversations on the price of individual 
> > > items. In the REST model these would be just different kinds of 
> > > conversation resources. Some operations would expect a reference to
> > > a "shopping conversation" and some operations would expect a
> > > reference to a "product price negotiation" conversation. Of course
> > > each resource would have a link to the other so that it is
> > > possible to easily go from one to another.
> > >
> > > The original proposal says that the conversation is ended in an 
> > > unspecified manner. I do not understand why it would specify some 
> > > things and leave that unspecified. Therefore I would say rather that
> > > the conversation is ended when either party DELETEs the conversation
> > > resource. There should be some standardized way for the server to
> > > indicate that it is doing so to a callback-capable client.
> > > Alternately, conversation resources could be immortal (for archival
> > > purposes) but could have a flag that says whether they are ongoing
> > > or historical.
> > >
> > > I hope this demonstrates that a REST approach is not at all at odds
> > > with a "named conversation" approach but a REST approach would say:
> > >
> > >  1. Conversations should be named as everything else on the Web is
> > > named, with URIs.
> > >
> > >  2. Conversations should be inspectable and introspectable as
> > > everything else on the Web is, through HTTP GET.
> > >
> > >  3. Any authorized party (especially confused clients) should be
> > > able to bring itself up to a full understanding of the state of the
> > > conversation by looking at the conversation URI (or things linked to
> > > the conversation URI).
> > >
> > >  4. Conversations will almost always have important associated data
> > > (the stuff being talked about) and the resource storing that
> > > information can easily serve as the conversation resource.
> > > --
> > > Come discuss XML and REST web services at:
> > >   Open Source Conference: July 22-26, 2002, conferences.oreillynet.com
> > >   Extreme Markup: Aug 4-9, 2002,  www.extrememarkup.com/extreme/
> > >
> > >
> 
> -- 
> David Booth
> W3C Fellow / Hewlett-Packard
> Telephone: +1.617.253.1273
> 
>
Received on Saturday, 20 July 2002 18:04:34 UTC