Leaky abstractions, protocols and all that. from David Hull on 2005-04-08 (public-ws-async-tf@w3.org from April 2005)

From: David Hull <dmh@tibco.com>
Date: Fri, 08 Apr 2005 17:20:52 -0400
To: public-ws-async-tf@w3.org
Message-id: <4256F5B4.3050706@tibco.com>

This is the promised commentary on Dave Orchard's blog entry
(http://www.pacificspirit.com/blog/2005/04/05/underlying_protocol_is_a_completely_leaky_abstraction)
in the context of the proposed rules I gave for WSDL and SOAP MEPs.

First, since leaky abstractions constitute one of the main themes,
here's my take on the matter.

I think the basis of the "leaky abstractions" meme is that you can never
pretend that the hardware isn't there. You can't pretend that numbers,
even bignums, have infinite precision. You can't pretend that the
network is reliable (see Deutsch's 8 fallacies of distributed
computing). You can't pretend that HTTP is the same as UDP, and so forth.

The wrong conclusion to draw from this is that, since the familiar
"pure" abstractions of mathematics don't cover the full range of actual
computing machine behavior, all abstractions are inherently "leaky" by
virtue of being abstractions. The right conclusion to draw from this is
that the abstractions we use have to account for the facts of life in
the computing world.

TCP is an excellent and successful example. TCP sits on top of mostly
one-way protocols that can be noticeably flaky in practice and provides
a bidirectional channel that fails seldom enough that failures can be
treated as exceptional conditions. It does /not/ provide a 100%
reliable protocol and does not purport to. Trying to treat a TCP
connection abstractly as a 100% reliable connection fails not because
all abstractions leak, but because the abstraction is not accurate.
Treating a TCP connection abstractly as a bidirectional channel that may
fail under exceptional circumstances is reasonable and accurate.

In my view, a model that treats a TCP connection as a bidirectional
channel that may fail but usually doesn't is an abstraction, but is not
leaky. It's an abstraction because of all the details I don't care
about. For example, I don't care what kind of physical transport is in
use, or even whether messages are going out over a physical wire at
all. If there's a failure and the connection closes, the main
application doesn't care what caused the connection to close.

The model is not leaky because it accounts for all the TCP behavior
we're interested in here. Granted, if there's a transport error, the
error-handling code /will/ generally care what's going on under the
covers, but I don't see this as a leak. Different modules may see
different abstractions. Setup and error recovery are classic cases of
this -- they generally see a lot more of the physical detail than the
rest of the application.

If you consider including transport faults in a model or having setup
and recovery code know more about the physical details as leaks, then we
just have a different notion of "leak", and that's fine as long as we
take care to be clear.

Now to the analysis of WSDL MEPs and underlying protocols.

I believe that the points about status codes in one-way over HTTP and
similar concerns are well taken, but I also believe they can be
adequately handled by modeling two basic facts:

* All transports can produce transport faults.
* Some transports have built-in back-channels and some don't.

Again, these are both abstractions and as far as I can tell neither of
them leaks. When the main application code tries an operation, it must
be prepared for a transport fault. When such a fault happens, all it
knows is that something exceptional happened and a message that was
expected to be delivered is known not to have been delivered. It can
then take action, perhaps handing off the fault to a recovery module
that can put on its overalls and dive under the hood, toolbox in hand,
and try to repair the condition or report that it can't be repaired.
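
As a rough sketch of that division of labor (all names here, including
TransportFault, deliver and recover, are hypothetical and not taken from
any WS-* spec), the main path sees only "delivered or transport fault",
while the recovery code is the one allowed to look at the detail:

```python
class TransportFault(Exception):
    """A message expected to be delivered is known not to have been."""

    def __init__(self, detail):
        super().__init__("transport fault")
        # Physical detail, meant only for setup and recovery code.
        self.detail = detail


def deliver(send, message, recover):
    """Main code path: attempt the send; on a transport fault, hand off."""
    try:
        send(message)
        return "delivered"
    except TransportFault as fault:
        # The main path never inspects fault.detail itself.
        return "repaired" if recover(fault) else "unrepairable"


def ok_send(message):
    """Stand-in for a send that succeeds."""
    return None


def failing_send(message):
    """Stand-in for a send that hits a transport fault."""
    raise TransportFault(detail={"cause": "connection reset"})


def recover(fault):
    """Recovery module: free to dive under the hood."""
    return fault.detail.get("cause") == "connection reset"
```

Here deliver(failing_send, msg, recover) reports "repaired" without the
main path ever learning why the connection went away.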

Similarly, the infrastructure implementing WS* will use some form of the
various rules we've discussed, together with its knowledge of the
binding, to try to realize the desired WSDL MEP at the SOAP transport
layer. To do this, it needs to know if a back-channel is required and
if so, whether the transport in question provides one. It /doesn't/
need to know any physical details like which HTTP codes map to results,
transport faults or successful MEP completion without either.
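
A minimal sketch of that check, assuming the infrastructure is told
whether an operation needs a back-channel and whether the binding
provides one (Binding, Operation and UnsupportedMEP are illustrative
names, not part of WSDL or SOAP):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    name: str
    has_back_channel: bool


@dataclass(frozen=True)
class Operation:
    name: str
    needs_back_channel: bool


class UnsupportedMEP(Exception):
    """The operation needs a back-channel that the binding lacks."""


def check_realizable(op, binding):
    """Complain if the operation needs a back-channel that isn't there."""
    if op.needs_back_channel and not binding.has_back_channel:
        raise UnsupportedMEP(
            f"{op.name} needs a back-channel; {binding.name} has none"
        )
    # Nothing about status codes or wire formats is consulted here.
    return True
```

The only binding property the check reads is has_back_channel.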

As to the myth of transport independence, I believe there is a spurious
all-or-nothing dichotomy lurking in the background, namely the notion
that either all transports are treated uniformly, or they all must be
treated case-by-case. I don't think anyone would advocate such a
position put that baldly, but I think such an assumption or one of its
cousins may be sneaking in through the back door.

We can't pretend all transports are equal; some provide back-channels
and some don't. We also can't forget entirely about transport-level
details; setup, error recovery and (of course) certain parts of the
infrastructure code have to know quite a bit about them. On the other
hand, we can and should abstract large amounts of transport detail away
from the 90% code path of applications by adopting a model that takes
transport faults and back-channels into account without going into
detail about them.

Reading over, I see a possible disconnect concerning what I'm calling a
"transport fault". I believe Dave is modeling, say, a 5xx status return
as a fault message being returned on the back-channel. I'm modeling it
as a transport fault and not worrying about where it came from. As far
as I'm concerned, it's coincidence that the 5xx comes back on the same
TCP channel that a reply (or application fault) would have. I /only/
care about the [* endpoint] when the transport is working -- the
receiver has received the inbound message, thought about it, and
successfully sent back an outbound message, whether reply, fault or
something more exotic.
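
To make the difference concrete, here is a small sketch of how
binding-specific code for HTTP might classify a result under this
model. The function name and the tagged-tuple result are assumptions
for illustration, and it covers only the 5xx case used as the example
above. A real binding would also have to decide about other status
codes, and about a 500 response that carries a SOAP fault envelope.

```python
def classify_http_outcome(status, body):
    """Map an HTTP result to the abstract outcome the upper layers see."""
    if 500 <= status <= 599:
        # Modeled as a transport fault; where it came from is not exposed.
        return ("transport-fault", None)
    # The receiver got the message and sent something back: a reply,
    # an application fault, or something more exotic.
    return ("message", body)
```

The caller sees the same "transport-fault" tag it would get from any
other transport, with nothing to say it arrived over the connection a
reply would have used.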

I don't think I've covered everything in the blog entry here; there's
quite a bit of meat to it, but I hope that this helps clarify how the
proposals I've put forth may relate to it. Just as a cross-check, I'd
like to specifically comment on the list of leaks that Dave gives near
the end:

1. The specific binding leaks into the SOAP mep selected, because
HTTP bindings match up well with a to-be-standardized (TBS) SOAP
in-optional-out to support faults on the HTTP connection
2. The specific binding leaks into the WSDL mep selected because the
WSDL mep will logically match up with the SOAP mep
3. The specific binding leaks into the WS-Addressing Fault, because
an HTTP binding will suggest Faults can be returned on the in HTTP
connection AND a separate connection for the FaultTo.

In-reply-to this (using a numbered list for correlation :-)

1. I believe this is captured by noting that bindings may or may not
support back channels, by distinguishing transport faults from
faults sent by the receiver, and by asserting that all bindings
are liable to produce transport faults.
2. When mapping a WSDL MEP to SOAP, you have to take into account
which SOAP MEPs the binding supports (i.e., whether there is a
back-channel available), as detailed previously, and you may have
to complain if the operation needs a back-channel that's not
there. That appears to be all you have to take into account.
3. This is where the distinction between transport faults and
application faults is important. Yes, some codes that manifest as
transport faults may come back on the same back-channel that may
or may not also be intended for application faults. But this
doesn't really matter. I don't see that HTTP needs to be treated
differently from other transports in this respect. More broadly,
I don't see a visible distinction between transports that may send
fault codes back on the back channel and those that never will.
By contrast, there /is/ a visible distinction between transports
that have back-channels and those that don't (see above).

Whether to classify any of the above as "leakage" is, IMHO, more a
matter of terminology than anything else.

In any case, my feeling is we're getting somewhere, and I'm happy about
that.

Received on Friday, 8 April 2005 21:20:55 UTC