
Leaky abstractions, protocols and all that.

From: David Hull <dmh@tibco.com>
Date: Fri, 08 Apr 2005 17:20:52 -0400
To: public-ws-async-tf@w3.org
Message-id: <4256F5B4.3050706@tibco.com>

This is the promised commentary on Dave Orchard's blog entry 
(http://www.pacificspirit.com/blog/2005/04/05/underlying_protocol_is_a_completely_leaky_abstraction) 
in the context of the proposed rules I gave for WSDL and SOAP MEPs.

First, since leaky abstractions constitute one of the main themes, 
here's my take on the matter.

I think the basis of the "leaky abstractions" meme is that you can never 
pretend that the hardware isn't there.  You can't pretend that numbers, 
even bignums, have infinite precision.  You can't pretend that the 
network is reliable (see Deutsch's 8 fallacies of distributed 
computing).  You can't pretend that HTTP is the same as UDP, and so forth.

The wrong conclusion to draw from this is that, since the familiar 
"pure" abstractions of mathematics don't cover the full range of actual 
computing machine behavior, all abstractions are inherently "leaky" by 
virtue of being abstractions.  The right conclusion to draw from this is 
that the abstractions we use have to account for the facts of life in 
the computing world.

TCP is an excellent and successful example.  TCP sits on top of mostly 
one-way protocols that can be noticeably flaky in practice and provides 
a bidirectional channel that fails seldom enough that failures can be 
treated as exceptional conditions.  It does /not/ provide a 100% 
reliable protocol and does not purport to.  Trying to treat a TCP 
connection abstractly as a 100% reliable connection fails not because 
all abstractions leak, but because the abstraction is not accurate.  
Treating a TCP connection abstractly as a bidirectional channel that may 
fail under exceptional circumstances is reasonable and accurate.

In my view, a model that treats a TCP connection as a bidirectional 
channel that may fail but usually doesn't is an abstraction, but is not 
leaky.  It's an abstraction because of all the details I don't care 
about.  For example,  I don't care what kind of physical transport is in 
use, or even whether messages are going out over a physical wire at 
all.  If there's a failure and the connection closes, the main 
application doesn't care what caused the connection to close.

The model is not leaky because it accounts for all the TCP behavior 
we're interested in here.  Granted, if there's a transport error,  the 
error-handling code /will/ generally care what's going on under the 
covers, but I don't see this as a leak.  Different modules may see 
different abstractions.  Setup and error recovery are classic cases of 
this -- they generally see a lot more of the physical detail than the 
rest of the application.
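
A minimal Python sketch of this split may help. It assumes a hypothetical
FlakyChannel standing in for a TCP-like connection; none of these names
come from the original discussion. The main path knows only that the
channel may fail, while the recovery path is allowed to inspect the cause.

```python
class ChannelFailure(Exception):
    """The channel failed; `cause` holds low-level detail for recovery code."""

    def __init__(self, cause):
        super().__init__("channel failed")
        self.cause = cause


class FlakyChannel:
    """Hypothetical TCP-like channel that fails once `fail_after` sends succeed."""

    def __init__(self, fail_after):
        self.remaining = fail_after
        self.delivered = []

    def send(self, data):
        if self.remaining == 0:
            # The physical detail is wrapped, not shown to the main path.
            raise ChannelFailure(cause=ConnectionResetError("peer reset"))
        self.remaining -= 1
        self.delivered.append(data)


def recover(failure):
    """Recovery path: may look under the covers at the wrapped cause."""
    return "recovering from " + type(failure.cause).__name__


def main_application(channel, messages):
    """Main path: treats failure as exceptional and does not ask why."""
    try:
        for message in messages:
            channel.send(message)
        return "ok"
    except ChannelFailure as failure:
        return recover(failure)
```

Both functions see the same event through different abstractions, which
is the sense in which the model above is not leaky.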

If you consider including transport faults in a model or having setup 
and recovery code know more about the physical details as leaks, then we 
just have a different notion of "leak", and that's fine as long as we 
take care to be clear.

Now to the analysis of WSDL MEPs and underlying protocols.

I believe that the points about status codes in one-way over HTTP and 
similar concerns are well taken, but I also believe they can be 
adequately handled by modeling two basic facts:

    * All transports can produce transport faults.
    * Some transports have built-in back-channels and some don't.

Again, these are both abstractions and as far as I can tell neither of 
them leaks.  When the main application code tries an operation, it must 
be prepared for a transport fault.  When such a fault happens, all it 
knows is that something exceptional happened and a message that was 
expected to be delivered is known not to have been delivered.  It can 
then take action, perhaps handing off the fault to a recovery module 
that can put on its overalls and dive under the hood, toolbox in hand, 
and try to repair the condition or report that it can't be repaired.
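
The two facts can be sketched in Python roughly as follows. This is an
illustration under assumptions, not part of the proposed rules:
InMemoryTransport, try_operation and report_unrepairable are hypothetical
names invented for the example.

```python
class TransportFault(Exception):
    """A message expected to be delivered is known not to have been delivered."""


class InMemoryTransport:
    """Hypothetical transport used only to illustrate the two facts."""

    def __init__(self, has_back_channel, working=True):
        # Fact 2: some transports have a built-in back-channel, some don't.
        self.has_back_channel = has_back_channel
        self.working = working
        self.delivered = []

    def send(self, message):
        # Fact 1: any transport, back-channel or not, can produce a fault.
        if not self.working:
            raise TransportFault("message not delivered")
        self.delivered.append(message)


def report_unrepairable(fault):
    """Stand-in recovery module that gives up and reports."""
    return "cannot repair: " + str(fault)


def try_operation(transport, message, recovery):
    """Main code path: prepared for a fault, uninterested in its cause."""
    try:
        transport.send(message)
        return "delivered"
    except TransportFault as fault:
        # Hand off to a recovery module that may dig into the details.
        return recovery(fault)
```

For example, try_operation(InMemoryTransport(False, working=False), "m",
report_unrepairable) takes the recovery path without the caller ever
learning what kind of transport failed.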

Similarly, the infrastructure implementing WS* will use some form of the 
various rules we've discussed, together with its knowledge of the 
binding, to try to realize the desired WSDL MEP at the SOAP transport 
layer.  To do this, it needs to know if a back-channel is required and 
if so, whether the transport in question provides one.  It /doesn't/ 
need to know any physical details like which HTTP codes map to results, 
transport faults or successful MEP completion without either.
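
As a rough Python sketch of that check: the function below looks only at
whether the MEP needs a back-channel and whether the binding has one.
The table NEEDS_BACK_CHANNEL and the names plan_mep and BindingError are
assumptions made for illustration, and the sketch deliberately simplifies
by treating in-out as always needing a back-channel.

```python
class BindingError(Exception):
    """The operation needs a back-channel that the binding does not provide."""


# Assumed, simplified table: whether each WSDL MEP needs a back-channel.
NEEDS_BACK_CHANNEL = {
    "in-only": False,
    "in-out": True,
}


def plan_mep(wsdl_mep, binding_has_back_channel):
    """Decide how to realize a WSDL MEP without looking at transport details."""
    needs_back_channel = NEEDS_BACK_CHANNEL[wsdl_mep]
    if needs_back_channel and not binding_has_back_channel:
        # The only complaint this layer has to make.
        raise BindingError(wsdl_mep + " needs a back-channel")
    return "use back-channel" if needs_back_channel else "no back-channel needed"
```

Nothing in plan_mep mentions HTTP status codes or any other physical
detail of a particular transport.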

As to the myth of transport independence, I believe there is a spurious 
all-or-nothing dichotomy lurking in the background, namely the notion 
that either all transports are treated uniformly, or they all must be 
treated case-by-case. I don't think anyone would advocate such a 
position put that baldly, but I think such an assumption or one of its 
cousins may be sneaking in through the back door.

We can't pretend all transports are equal; some provide back-channels 
and some don't.  We also can't forget entirely about transport-level 
details; setup, error recovery and (of course) certain parts of the 
infrastructure code have to know quite a bit about them.  On the other 
hand, we can and should abstract large amounts of transport detail away 
from the 90% code path of applications by adopting a model that takes 
transport faults and back-channels into account without going into 
detail about them.

Reading this over, I see a possible disconnect concerning what I'm calling a 
"transport fault".  I believe Dave is modeling, say, a 5xx status return 
as a fault message being returned on the back-channel.  I'm modeling it 
as a transport fault and not worrying about where it came from.  As far 
as I'm concerned, it's coincidence that the 5xx comes back on the same 
TCP channel that a reply (or application fault) would have.  I /only/ 
care about the [* endpoint] when the transport is working -- the 
receiver has received the inbound message, thought about it, and 
successfully sent back an outbound message, whether reply, fault or 
something more exotic.
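
The difference between the two models can be made concrete with a small
Python sketch of how an HTTP binding might classify what comes back.
The mapping is an assumption for illustration only: every non-2xx status
is treated as a transport fault, a 2xx with a body as an outbound
message, and a 2xx without a body as completion with neither. The names
exchange_outcome, Outbound and TransportFault are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TransportFault:
    """Something exceptional happened in the transport; detail is opaque."""
    detail: str


@dataclass
class Outbound:
    """A message the receiver successfully sent back (reply, fault, ...)."""
    body: str


def exchange_outcome(status: int, body: Optional[str]) -> Union[TransportFault, Outbound, None]:
    """Classify an HTTP response under the assumed mapping described above."""
    if not 200 <= status < 300:
        # Modeled as a transport fault, regardless of which TCP connection
        # the status happened to arrive on.
        return TransportFault(detail="HTTP status " + str(status))
    if body:
        # Transport worked: the receiver sent back an outbound message.
        return Outbound(body=body)
    # Transport worked and there is nothing to deliver.
    return None
```

Only the binding needs this table; code above it sees one of three
outcomes and never a status code.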

I don't think I've covered everything in the blog entry here; there's 
quite a bit of meat to it, but I hope that this helps clarify how the 
proposals I've put forth may relate to it.  Just as a cross-check, I'd 
like to specifically comment on the list of leaks that Dave gives near 
the end:

   1. The specific binding leaks into the SOAP mep selected, because
      HTTP bindings match up well with a to-be-standardized (TBS) SOAP
      in-optional-out to support faults on the HTTP connection
   2. The specific binding leaks into the WSDL mep selected because the
      WSDL mep will logically match up with the SOAP mep
   3. The specific binding leaks into the WS-Addressing Fault, because
      an HTTP binding will suggest Faults can be returned on the in HTTP
      connection AND a separate connection for the FaultTo.

In-reply-to this (using a numbered list for correlation :-)

   1. I believe this is captured by noting that bindings may or may not
      support back channels, by distinguishing transport faults from
      faults sent by the receiver, and by asserting that all bindings
      are liable to produce transport faults.
   2. When mapping a WSDL MEP to SOAP, you have to take into account
      which SOAP MEPs the binding supports (i.e., whether there is a
      back-channel available), as detailed previously, and you may have
      to complain if the operation needs a back-channel that's not
      there.  That appears to be all you have to take into account.
   3. This is where the distinction between transport faults and
      application faults is important.  Yes, some codes that manifest as
      transport faults may come back on the same back-channel that may
      or may not also be intended for application faults.  But this
      doesn't really matter.  I don't see that HTTP needs to be treated
      differently from other transports in this respect.  More broadly,
      I don't see a visible distinction between transports that may send
      fault codes back on the back channel and those that never will. 
      By contrast, there /is/ a visible distinction between transports
      that have back-channels and those that don't (see above).

Whether to classify any of the above as "leakage" is, IMHO, more a 
matter of terminology than anything else.

In any case, my feeling is we're getting somewhere, and I'm happy about 
that.
Received on Friday, 8 April 2005 21:20:55 GMT
