- From: David Hull <dmh@tibco.com>
- Date: Fri, 08 Apr 2005 17:20:52 -0400
- To: public-ws-async-tf@w3.org
- Message-id: <4256F5B4.3050706@tibco.com>
This is the promised commentary on Dave Orchard's blog entry (http://www.pacificspirit.com/blog/2005/04/05/underlying_protocol_is_a_completely_leaky_abstraction) in the context of the proposed rules I gave for WSDL and SOAP MEPs.

First, since leaky abstractions constitute one of the main themes, here's my take on the matter.

I think the basis of the "leaky abstractions" meme is that you can never pretend that the hardware isn't there. You can't pretend that numbers, even bignums, have infinite precision. You can't pretend that the network is reliable (see Deutsch's 8 fallacies of distributed computing). You can't pretend that HTTP is the same as UDP, and so forth.

The wrong conclusion to draw from this is that, since the familiar "pure" abstractions of mathematics don't cover the full range of actual computing machine behavior, all abstractions are inherently "leaky" by virtue of being abstractions.

The right conclusion to draw from this is that the abstractions we use have to account for the facts of life in the computing world. TCP is an excellent and successful example. TCP sits on top of mostly one-way protocols that can be noticeably flaky in practice and provides a bidirectional channel that fails seldom enough that failures can be treated as exceptional conditions. It does /not/ provide a 100% reliable protocol and does not purport to. Trying to treat a TCP connection abstractly as a 100% reliable connection fails not because all abstractions leak, but because the abstraction is not accurate. Treating a TCP connection abstractly as a bidirectional channel that may fail under exceptional circumstances is reasonable and accurate.

In my view, a model that treats a TCP connection as a bidirectional channel that may fail but usually doesn't is an abstraction, but is not leaky. It's an abstraction because of all the details I don't care about.
For example, I don't care what kind of physical transport is in use, or even whether messages are going out over a physical wire at all. If there's a failure and the connection closes, the main application doesn't care what caused the connection to close. The model is not leaky because it accounts for all the TCP behavior we're interested in here.

Granted, if there's a transport error, the error-handling code /will/ generally care what's going on under the covers, but I don't see this as a leak. Different modules may see different abstractions. Setup and error recovery are classic cases of this -- they generally see a lot more of the physical detail than the rest of the application. If you consider including transport faults in a model or having setup and recovery code know more about the physical details as leaks, then we just have a different notion of "leak", and that's fine as long as we take care to be clear.

Now to the analysis of WSDL MEPs and underlying protocols. I believe that the points about status codes in one-way over HTTP and similar concerns are well taken, but I also believe they can be adequately handled by modeling two basic facts:

* All transports can produce transport faults.
* Some transports have built-in back-channels and some don't.

Again, these are both abstractions and as far as I can tell neither of them leaks.

When the main application code tries an operation, it must be prepared for a transport fault. When such a fault happens, all it knows is that something exceptional happened and a message that was expected to be delivered is known not to have been delivered. It can then take action, perhaps handing off the fault to a recovery module that can put on its overalls and dive under the hood, toolbox in hand, and try to repair the condition or report that it can't be repaired.
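
To make the two facts concrete, here is a minimal sketch in Python. It is only an illustration of the model, not anything from the proposed rules: the names Transport, TransportFault, has_back_channel, recover and send_message are all hypothetical, and the "healthy" flag merely simulates a failure.

```python
from dataclasses import dataclass


class TransportFault(Exception):
    """Hypothetical: the transport knows a message was not delivered."""


@dataclass(frozen=True)
class Transport:
    # Hypothetical abstraction: the two facts every transport exposes.
    name: str
    has_back_channel: bool   # fact 2: some transports have one, some don't
    healthy: bool = True     # stand-in used only to simulate a failure

    def send(self, message: str) -> None:
        # Fact 1: any transport may raise a transport fault.
        if not self.healthy:
            raise TransportFault(f"{self.name}: message not delivered")


def recover(transport: Transport, fault: TransportFault) -> str:
    # The recovery module is allowed to look under the hood; a real one
    # would inspect transport-specific detail. This one only reports.
    return f"cannot repair {transport.name}: {fault}"


def send_message(transport: Transport, message: str) -> str:
    # Main application path: it knows only that a fault may happen.
    try:
        transport.send(message)
    except TransportFault as fault:
        return recover(transport, fault)
    return "sent"


print(send_message(Transport("http", has_back_channel=True), "hello"))
print(send_message(Transport("smtp", has_back_channel=False, healthy=False), "hello"))
```

Note that send_message never asks what kind of failure occurred; only recover would need that detail.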
Similarly, the infrastructure implementing WS* will use some form of the various rules we've discussed, together with its knowledge of the binding, to try to realize the desired WSDL MEP at the SOAP transport layer. To do this, it needs to know if a back-channel is required and if so, whether the transport in question provides one. It /doesn't/ need to know any physical details like which HTTP codes map to results, transport faults or successful MEP completion without either.

As to the myth of transport independence, I believe there is a spurious all-or-nothing dichotomy lurking in the background, namely the notion that either all transports are treated uniformly, or they all must be treated case-by-case. I don't think anyone would advocate such a position put that baldly, but I think such an assumption or one of its cousins may be sneaking in through the back door.

We can't pretend all transports are equal; some provide back-channels and some don't. We also can't forget entirely about transport-level details; setup, error recovery and (of course) certain parts of the infrastructure code have to know quite a bit about them. On the other hand, we can and should abstract large amounts of transport detail away from the 90% code path of applications by adopting a model that takes transport faults and back-channels into account without going into detail about them.

Reading over, I see a possible disconnect concerning what I'm calling a "transport fault". I believe Dave is modeling, say, a 5xx status return as a fault message being returned on the back-channel. I'm modeling it as a transport fault and not worrying about where it came from. As far as I'm concerned, it's coincidence that the 5xx comes back on the same TCP channel that a reply (or application fault) would have.
I /only/ care about the [* endpoint] when the transport is working -- the receiver has received the inbound message, thought about it, and successfully sent back an outbound message, whether reply, fault or something more exotic.

I don't think I've covered everything in the blog entry here; there's quite a bit of meat to it, but I hope that this helps clarify how the proposals I've put forth may relate to it.

Just as a cross-check, I'd like to specifically comment on the list of leaks that Dave gives near the end:

1. The specific binding leaks into the SOAP mep selected, because HTTP bindings match up well with a to-be-standardized (TBS) SOAP in-optional-out to support faults on the HTTP connection
2. The specific binding leaks into the WSDL mep selected because the WSDL mep will logically match up with the SOAP mep
3. The specific binding leaks into the WS-Addressing Fault, because an HTTP binding will suggest Faults can be returned on the in HTTP connection AND a separate connection for the FaultTo.

In-reply-to this (using a numbered list for correlation :-)

1. I believe this is captured by noting that bindings may or may not support back channels, by distinguishing transport faults from faults sent by the receiver, and by asserting that all bindings are liable to produce transport faults.
2. When mapping a WSDL MEP to SOAP, you have to take into account which SOAP MEPs the binding supports (i.e., whether there is a back-channel available), as detailed previously, and you may have to complain if the operation needs a back-channel that's not there. That appears to be all you have to take into account.
3. This is where the distinction between transport faults and application faults is important. Yes, some codes that manifest as transport faults may come back on the same back-channel that may or may not also be intended for application faults. But this doesn't really matter. I don't see that HTTP needs to be treated differently from other transports in this respect.
More broadly, I don't see a visible distinction between transports that may send fault codes back on the back channel and those that never will. By contrast, there /is/ a visible distinction between transports that have back-channels and those that don't (see above).

Whether to classify any of the above as "leakage" is, IMHO, more a matter of terminology than anything else.

In any case, my feeling is we're getting somewhere, and I'm happy about that.
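
P.S. The check described in reply 2 above (does the operation need a back-channel, and does the binding have one?) might be sketched as follows. This is a rough illustration under stated assumptions: Binding, Operation, UnsupportedMep and check_mapping are hypothetical names, not part of WSDL, SOAP or any toolkit, and whether a given operation needs a back-channel is simply taken as an input here.

```python
from dataclasses import dataclass


class UnsupportedMep(Exception):
    """Hypothetical: the operation needs a back-channel the binding lacks."""


@dataclass(frozen=True)
class Binding:
    name: str
    has_back_channel: bool


@dataclass(frozen=True)
class Operation:
    name: str
    needs_back_channel: bool  # assumed to be decided elsewhere


def check_mapping(operation: Operation, binding: Binding) -> str:
    # The only binding property consulted is the back-channel flag;
    # no status codes or other physical detail enter into it.
    if operation.needs_back_channel and not binding.has_back_channel:
        raise UnsupportedMep(
            f"{operation.name} needs a back-channel; {binding.name} has none"
        )
    return f"{operation.name} can be realized over {binding.name}"


print(check_mapping(Operation("getQuote", True), Binding("http", True)))
print(check_mapping(Operation("notify", False), Binding("smtp", False)))
```

An operation that needs a back-channel over a binding without one raises UnsupportedMep, which is the "complain" case.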
Received on Friday, 8 April 2005 21:20:55 UTC