Re: The role of transfer protocols from David Hull on 2006-01-06 (xml-dist-app@w3.org from January 2006)

From: David Hull <dmh@tibco.com>
Date: Fri, 06 Jan 2006 12:47:34 -0500
To: Mark Baker <distobj@acm.org>
Cc: xml-dist-app@w3.org
Message-id: <43BEAD36.1030904@tibco.com>

In the solutions you outline below, what acknowledgment, of what, do I
get when?

What I /want/ is acknowledgment that my counterpart on the
intermittently attached device has received the message I sent (for
bonus points, that it has received it and done something meaningful with
it).

If I understand your terminology correctly, using HTTP as a /transfer/
protocol would mean that I would not get back an HTTP response until my
counterpart actually received the message, arbitrarily later. 
Similarly, SMTP would not send back an OK message until the counterpart
actually received the message.  As I understand it, this is at best bad
practice with HTTP and simply not what SMTP does at all (if I want to
know that someone actually got my email, I ask for a return receipt).

The general point is that *TP can only tell me that something was
transferred to the server on the other end of the connection.  The
connection itself may involve caches, proxies and such, but however you
get there, there's a particular process on the other end of the line. 
Often, I only want to know about that transfer, so everything works fine
as is.  Sometimes, though, I want to know more and thus have to do
more.  If this means using a "transfer" protocol as a "transport"
protocol, I'm frankly not bothered.
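
The difference between the two acknowledgments discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Buffer` class, its `submit` and `pull` methods, and the `on_receipt` callback are hypothetical names, not part of HTTP, SMTP or any SOAP binding. `submit` returns the hop-level acknowledgment (in the spirit of an HTTP 202 or SMTP 250), while the end-to-end receipt fires only when the intermittently attached device actually pulls the message.

```python
import uuid
from collections import deque


class Buffer:
    """Hypothetical store-and-forward buffer between a sender and a device."""

    def __init__(self):
        self._queue = deque()          # messages waiting for the device
        self._receipt_callbacks = {}   # message id -> sender's receipt callback

    def submit(self, payload, on_receipt):
        """Queue a message and return at once with a hop-level ack.

        The ack says only "received and queued", not "delivered".
        """
        msg_id = str(uuid.uuid4())
        self._queue.append((msg_id, payload))
        self._receipt_callbacks[msg_id] = on_receipt
        return {"status": "queued", "id": msg_id}

    def pull(self):
        """Called by the device whenever it comes back on line."""
        delivered = []
        while self._queue:
            msg_id, payload = self._queue.popleft()
            delivered.append(payload)
            # End-to-end ack: sent asynchronously, only after hand-off.
            self._receipt_callbacks.pop(msg_id)(msg_id)
        return delivered


receipts = []
buf = Buffer()
ack = buf.submit("hello", receipts.append)  # sender is not blocked here
print(ack["status"], len(receipts))         # queued, and no receipt yet
buf.pull()                                  # device reconnects, arbitrarily later
print(receipts == [ack["id"]])              # the receipt has now arrived
```

In this sketch the sender never holds a connection open while waiting for the device; the receipt travels back as a separate message, much like an email return receipt.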

Moving streams of messages around a network strikes me as significantly
different from exchanging documents between applications distributed
over a network.  For example, on a trading floor, people make bids and
offers on instruments. When a bid gets hit or an offer taken,
instruments change hands.  A stock quote is one particular synthesis of
these events.  A chart is another, an order book is another, etc.

You /could/ define a "document" consisting of a time series of every
single bid and offer for an instrument, define the usual toy example of
GetStockQuote as a particular query against that document and say that
bids, offers and sales are actually changes to that document, and
therefore that a market data delivery system is just propagating the
state of that document to various parties (who then often disregard or
quickly discard most of its contents), but why?  What's actually going
on is that people are doing things and other people want to know what
they're doing.

Alternatively, you could define each bid, offer or sale as a document in
and of itself.  IMHO, this is stretching the notion of document far
beyond any useful point.  Documents simply live in a different design
space from notifications.

Document exchange systems are generally aimed at moving reasonably large
hunks of data around with a premium on reliability and without great
regard for short and predictable latency.  Event notifications are
typically quite small.  Something like "machine 3 just caught fire" or
"Goldman bids $20.25/share for 1000 shares of FooCorp" manifestly takes
no more than dozens of bytes even without any attempt to remove
redundant information.  A carefully designed system can deliver dozens
of notifications in the packets it takes TCP to shake hands (I've seen
it done :-).
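
As a rough check on the "dozens of bytes" figure, the bid above can be measured both as plain text and in a fixed-width binary layout. The layout here (two 8-byte name fields, the price as an integer number of cents, an unsigned quantity) is purely an assumption for illustration and not any real market data format.

```python
import struct

text = "Goldman bids $20.25/share for 1000 shares of FooCorp"

# Hypothetical wire layout, network byte order, no padding:
#   8s  bidder name (padded with NUL bytes)
#   8s  instrument name (padded with NUL bytes)
#   q   price in cents, signed 64-bit integer
#   I   quantity, unsigned 32-bit integer
LAYOUT = "!8s8sqI"

packed = struct.pack(LAYOUT, b"Goldman", b"FooCorp", 2025, 1000)

print(len(text.encode("ascii")))   # size of the human-readable form
print(len(packed))                 # size of the binary form: 8 + 8 + 8 + 4
```

Either way the notification stays well under a hundred bytes, before any attempt at compression or at leaving out fields the receiver already knows.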

Guaranteed delivery of notifications is not always appropriate.  It most
likely would be for "machine 3 just caught fire", but often with market
data "old news is no news" and it's better to deliver the most recent
information quickly than to make sure that old information gets there. 
Similarly, latency and predictability thereof matter in real-time
notification systems.
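
The contrast between guaranteed delivery and "old news is no news" can be illustrated with two small queues. Both classes are hypothetical sketches, assuming each notification carries a key (a machine or an instrument) and a value; neither corresponds to a particular product or protocol.

```python
from collections import OrderedDict, deque


class GuaranteedQueue:
    """Keeps every notification until it is drained, in arrival order."""

    def __init__(self):
        self._items = deque()

    def publish(self, key, value):
        self._items.append((key, value))

    def drain(self):
        items = list(self._items)
        self._items.clear()
        return items


class ConflatingQueue:
    """Keeps only the most recent value per key; older values are dropped."""

    def __init__(self):
        self._latest = OrderedDict()

    def publish(self, key, value):
        self._latest.pop(key, None)   # discard the stale value, if any
        self._latest[key] = value     # re-insert so the key moves to the end

    def drain(self):
        items = list(self._latest.items())
        self._latest.clear()
        return items


for queue in (GuaranteedQueue(), ConflatingQueue()):
    queue.publish("FooCorp", 20.25)
    queue.publish("BarCorp", 5.00)
    queue.publish("FooCorp", 20.30)   # supersedes the first FooCorp bid
    print(type(queue).__name__, queue.drain())
```

A slow or reconnecting consumer of the conflating queue sees one current value per instrument rather than a backlog, which suits market data; "machine 3 just caught fire" belongs in the guaranteed queue.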

Whether it makes sense to build message delivery on top of document
transfer protocols depends on what you're trying to move where and what
tools are available.  I certainly wouldn't rule it out, but I would also
look for more appropriate solutions, which is one reason I'm currently
poring over SOAP/XMPP and SOAP/BEEP.

Mark Baker wrote:

>David,
>
>On 1/5/06, David Hull <dmh@tibco.com> wrote:
>  
>
>> OK, so here are the constraints:
>>
>>
>>I want to send a message to some intermittently connected device.
>>I would like acknowledgment when the message finally arrives.
>>This acknowledgment needs to be asynchronous.  I cannot keep a connection
>>open until the destination device decides to come back on line.
>>    
>>
>
>Sounds good.
>
>  
>
>> Questions:
>>
>>
>>What transfer protocol would you suggest for this?  HTTP, SMTP, FTP etc.
>>clearly don't solve this problem and aren't intended to.
>>    
>>
>
>Not so!  Transfer protocols are, in general, designed to solve exactly
>those kinds of problems; how to exchange documents between
>applications distributed over a network.
>
>There's obviously no "uber" transfer protocol that is appropriate for
>all data transfer scenarios, but some are useful for a wider variety
>of tasks than others.  I consider HTTP to be the most generally useful
>transfer protocol that exists.
>
>  
>
>>What problems do you see in building a solution on top of *TP?  Specifically
>>
>>Use <your favorite transfer protocol> to put the message into a buffer
>>somewhere.  It would be polite for the transfer protocol to acknowledge that
>>the message was received and queued, but it's not strictly necessary.
>>    
>>
>
>HTTP and SMTP are well suited to that scenario, and both provide acks
>with queue-like semantics; HTTP 202 and SMTP 250 (the latter's
>queue-like-ness being by virtue of SMTP response semantics being
>hop-by-hop).
>
>FTP would require more coordination than HTTP or SMTP, as it doesn't
>include a "data submission" semantic, only a "data storage" semantic. 
>Therefore, the client and server have to coordinate the names of
>stored documents out-of-band so as to approximate the data submission
>semantic.  In fact, Infospace used to use (and maybe still does,
>dunno) exactly this interface for their third party mobile content
>interface back in 2001; they published naming conventions for file and
>directory names for content providers to use.  But with such a
>convention, an FTP server could be used as a queue.
>
>  
>
>>When it's on line and in the mood to retrieve messages, the destination
>>device uses <some transfer protocol it likes> to pull messages out of the
>>buffer.
>>    
>>
>
>If you have a specific "pull" requirement, then you could use HTTP,
>IMAP, FTP, POP, ...  SMTP is obviously "push"-only.
>
>  
>
>>Either the destination device or the buffer uses <a suitable transfer
>>protocol> to send me an acknowledgment when this happens. This seems quite a
>>bit like using return receipt with email.  Notably, doing so works
>>regardless of whether I'm using raw UUCP, SMTP, POP, IMAP or a carrier
>>pigeon with a USB drive to get email.  The same concept even works with
>>snail mail (no surprise).
>>    
>>
>
>POP wouldn't work; it's pull-only.  But sure, for some simple
>scenarios like that one, a lot of transfer protocols would fit the
>bill.  As the scenario gets more complex though, the differences
>between the capabilities of the protocols become more apparent, as
>hopefully described above.
>
>Mark
>--
>Mark Baker.  Ottawa, Ontario, CANADA.       http://www.markbaker.ca
>Coactus; Web-inspired integration strategies  http://www.coactus.com
>
Received on Friday, 6 January 2006 17:47:47 UTC