Re: Reliability is really two-phase (was RE: Reliable Web Services)

> Your software vendor gives you two types of support services.

Reading ahead, this could explain why I don't have a "software
vendor".  Never mind...

>
> If you have a question that was anticipated, or someone asked before, you
> can find it in the documentation, FAQ, or tech note. You can get an
> instantaneous response by using a search engine.

I think you're saying it won't take "too long".  Maybe.

>
> Instantaneous means immediate, without pause or delay. It doesn't mean 0
> response time

So zero delay doesn't mean zero response time?  I think you'd better
check your math because while searching a faq database may be fast,
it doesn't take zero time, just like network propagation doesn't.  And if
your file server's hard disk inhaled some glass dust like my PC's
disk did last summer, it could take longer than you've allowed for here.


> , you can't even get a message to the network card without some
> response time. If you need zero response time you will need to be able to
> predict the response. Quantum computing may let you operate faster than
> the speed of light, but the technology is not yet available for large scale
> deployment. So for now let's assume an HTTP request/response is as fast as
> you can get.

"HTTP request/response" actually tells me nothing about the network
or system latencies.  Can you clarify?

>
> If you have a question that is specific to your installation, you
> encountered a bug, or the response is not documented yet, someone has to
> cater for it. That means someone has to receive the request, do some
> thinking, and give you back a response. Even if the expert is sitting
> there in front of the computer not doing anything else, it's probably
> going to take them a few minutes to try and figure out what's going wrong.
> Maybe they have no clue, and then they need to go and talk to someone
> before they can come back to you.

I don't know why, but I like the "no clue" theory ... 8-)

>
> As a vendor, I want my support staff to give a response time that is
> faster than the speed of light, but even though I'm busy working on that
> solution, it's going to take a few years before we roll it to the market.
> Right now we have a support cycle that is faster than the competition, but
> even though they are fast, they still take time to figure out exactly what
> the problem is and how to respond to it.

Um.  Communication is breaking up here.  The speed of light is measured
in terms of distance/time.  What is the distance of support?  I honestly
don't get this.

>
> Let's say that looking at a problem and coming back with a response takes
> 4 hours (it's a very tricky problem and so the solution is not evident).
> You can call a support person and wait on the phone 4 hours for a response.
> I assume you're a busy person, you don't want to sit there and wait for 4
> hours until they figure out how to solve it. A better option is for you to
> call with the request, make sure it got logged, then go and do something
> else.

Gotcha.  Where do you go into the details of "make sure it got logged"?

>
> You can keep calling every 30 minutes (poll) or you can wait for the
> response person to call you back when they have figured out the response
> (interrupt). You will agree with me that being interrupted is a more
> efficient use of your time than polling. Similarly, when you build
> software you look for infrastructure solutions (like MOM) that let you do
> that.

Yes.  Above you said "wait for the response person to call you back...",
but I think you mean "go about your business until the person calls you
back". Right?
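
For illustration only, the poll versus call-back distinction can be
sketched in a few lines of Python. Everything named here (SupportDesk,
log_request, check, poll, wait_for_callback) is hypothetical and not part
of any MOM product; a background thread stands in for the support person.

```python
import threading
import time


class SupportDesk:
    """Hypothetical support desk: answers a logged request after a delay."""

    def __init__(self, work_seconds):
        self.work_seconds = work_seconds
        self._answer = None
        self._done = threading.Event()

    def log_request(self, question, on_answer=None):
        """Log the request and return at once; the work happens elsewhere."""
        def work():
            time.sleep(self.work_seconds)      # the "thinking" time
            self._answer = "answer to: " + question
            self._done.set()
            if on_answer is not None:          # the call-back (interrupt)
                on_answer(self._answer)
        threading.Thread(target=work, daemon=True).start()

    def check(self):
        """What a polling caller asks: is there an answer yet?"""
        return self._answer if self._done.is_set() else None


def poll(desk, interval):
    """Keep calling until an answer exists; count the calls made."""
    calls = 0
    while True:
        calls += 1
        answer = desk.check()
        if answer is not None:
            return answer, calls
        time.sleep(interval)


def wait_for_callback(desk, question):
    """Log the request, then simply wait to be called back."""
    got = threading.Event()
    box = []

    def on_answer(answer):
        box.append(answer)
        got.set()

    desk.log_request(question, on_answer)
    # The caller could go about other business here instead of blocking.
    got.wait()
    return box[0]
```

The polling caller makes repeated calls that mostly return nothing, while
the call-back caller makes exactly one and is then free until notified.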

>
> Let's say they call you back but you can't pick up the phone (you just
> entered the Lincoln Tunnel). They will try again five minutes later and
> again. So by retrying at frequent intervals they increase the likelihood
> that you will get the response in a timely manner.

"Timely manner" has no meaning in this context.  I've been waiting four
hours already.  I probably switched to an open source solution by now,
downloaded the code and fixed the problem myself.

The reality is that if I was desperate to hear from your support person, I
wouldn't have entered the Lincoln Tunnel.  Also, four hours later I'm
thinking there probably won't be a reply unless I call again, so I'm doing
that.  Bottom line, your retry tactic doesn't matter.  You might get lucky
and call me just before I call you, but so what?
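
For reference, the retry tactic under discussion amounts to something
like the following Python sketch. The function name call_back and its
parameters are hypothetical, and the sleep function is injectable only
so the loop can be exercised without real waiting.

```python
import time


def call_back(try_call, interval_s, max_attempts, sleep=time.sleep):
    """Retry try_call() at fixed intervals.

    Returns the attempt number that succeeded, or None if every
    attempt failed and the caller gave up.
    """
    for attempt in range(1, max_attempts + 1):
        if try_call():                 # True means the phone was picked up
            return attempt
        if attempt < max_attempts:     # no point waiting after the last try
            sleep(interval_s)
    return None
```

Note that nothing in the loop knows whether the answer is still wanted
by the time it gets through; that decision stays with the recipient.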

>
> Does that give you a good picture of where asynchronous/RM gets to be
> used?

Asynchronous, yes.  RM, no.

>
>
> > The same solution doesn't apply in either case.  A soft-realtime
> > application may decide, after waiting X milliseconds for an
> > acknowledgement, that the business value of that ack has reached zero,
> > based on new information received by the same application.  The
> > "separate layer" has no knowledge of that, and so cannot participate,
> > let alone "solve" the problem.
>
> The ack has no business value at all nor is it delivered to the
> application. It is part of a positive-ack protocol between the RM that is
> there to expedite the delivery of messages when message loss occurs.
> Redundant acks have no effect on the behavior of the application.

I was talking about an application level "ack", like the one you hinted
at way above when you said "make sure the request got logged".
I hope you're not saying that the presence of RM mandates that my
application level coordination protocol can't have acks in it.  That
would make RM not only unnecessary but downright bothersome.
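
As a rough sketch of what an application-level ack with a deadline could
look like, assuming the acks arrive on an in-process queue (the function
name await_app_ack and the queue are hypothetical, not part of any RM
specification):

```python
import queue


def await_app_ack(acks, timeout_ms):
    """Wait up to timeout_ms for an application-level ack.

    Returns the ack if one arrives in time, or None when the deadline
    passes and the application decides the ack is no longer worth
    waiting for.
    """
    try:
        return acks.get(timeout=timeout_ms / 1000.0)
    except queue.Empty:
        return None
```

The deadline is chosen by the application, which is the point: a layer
underneath has no way of knowing when the ack stops being worth anything.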

>
> The separate thread has the same subject as this one, it's just branching
> off into a discussion of coordination protocols.

Right.  That's the thread I'm posting to now, thanks to you. 8-)

Happy solstice!

Walden

Received on Wednesday, 25 December 2002 23:39:00 UTC