- From: Geoff Arnold <Geoff.Arnold@Sun.COM>
- Date: Wed, 25 Sep 2002 18:10:14 -0400
- To: www-ws-arch@w3.org
An even stronger statement in the same vein from Ken Arnold [no relation, though we were sometimes taken for brothers!]. As you can imagine, I feel we have plenty of work to do in the area of reliable messaging!

----

Failure is the defining difference between distributed and local programming, so you have to design distributed systems with the expectation of failure. Imagine asking people, "If the probability of something happening is one in ten to the thirteenth, how often would it happen?" Your natural human sense would be to answer, "Never." That is an infinitely large number in human terms. But if you ask a physicist, she would say, "All the time. In a cubic foot of air, those things happen all the time."

When you design distributed systems, you have to say, "Failure happens all the time." So when you design, you design for failure. It is your number one concern.
Received on Wednesday, 25 September 2002 18:10:43 UTC