Re: Fwd: I-D Action:draft-nottingham-http-pipeline-00.txt

Hi Mark, and ietf-wg folk,

First, thanks for undertaking this. As RTTs go up it is more relevant
every day.

Re: draft-nottingham-http-pipeline-00.txt ... the four anecdotal
pipelining problems you enumerate nicely sum up the reasons I have heard
and encountered over the years myself. In some ways I think it is a
mistake just to mix them all together as they have such different
characteristics.

To give them handy reference names:
 1] unprocessed pipeline
 2] broken response order
 3] corrupted responses
 4] head-of-line (HOL) blockers
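For reference, pipelining just means writing several requests back-to-back on one persistent connection before reading any response. A minimal sketch (host and paths are made up for illustration):

```python
def build_pipeline(host, paths):
    """Concatenate one GET request per path into a single byte string,
    as a pipelining UA would write them onto one persistent connection."""
    reqs = []
    for path in paths:
        reqs.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"\r\n"
        )
    return "".join(reqs).encode("ascii")

# Three requests leave in a single write; the server is expected to
# answer all of them, in order, on the same connection.
batch = build_pipeline("example.com", ["/a.css", "/b.js", "/c.png"])
```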

I think it is helpful to break #1 down into two categories (and maybe
you meant this and my reading of the document would just benefit from
clarification):

  - 1a] server closes connection without processing (some portion of)
pipeline. This is a minor performance nit when the request could have
been served on a parallel connection instead of being pipelined. But the
loose constraints about when servers can close connections make it
difficult to determine whether this is a pipelining failure or just a
semi-random and expected event.

  - 1b] server silently discards pipelined requests but remains open for
requests on the persistent connection. That's really ugly because it
results in either a long timeout or, if another request is later sent
that the server perceives as non-pipelined, an instance of #2. (And of
course what the UA perceives as pipelined may not look that way to the
server.)
 
While not terribly important, I think if we're going to undertake a
document on pipelining deployment problems it is worth noting that some
deployments operate correctly but still don't pipeline end to end due to
intermediaries enforcing lock-step forwarding (perhaps unintentionally).

My first question is: do we have any information on the relative
importance of each? I must say that I have not personally witnessed an
instance of #1b, #2 or #3 in several years (and I say that as anecdote,
not as an assertion that they aren't running wild in someone else's view
of the Internet - so the question is sincere), while #1a and #4 have
been varying levels of disaster for me when I have tried to introduce
pipelining into an intermediary.

As previously noted, categories 1b, 2 and 3 are implementation bugs -
mechanisms to detect and work around the bugs are helpful, but any kind
of feature negotiation that requires an upgrade might just as well be
replaced with bug-fixed code, right? In that case I'm on board with the
folks who are skeptical that an unverifiable promise such as
"Connection: pipelining" really promises anything useful. A verifiable
Content-MD5 or echoed ID token such as your draft suggests is a much
more useful thing to provide - so I think that is the right approach.
But I'm a bigger fan of MD5.
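The checksum check is mechanically simple: Content-MD5 (RFC 1864) carries the base64 of the MD5 digest of the body, so a UA can verify that a pipelined response body arrived intact and wasn't swapped or truncated. A sketch:

```python
import base64
import hashlib

def content_md5(body: bytes) -> str:
    """Compute the Content-MD5 value for a body per RFC 1864:
    base64 of the raw MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

def body_matches(header_value: str, body: bytes) -> bool:
    """Verify a received body against the Content-MD5 header it came with."""
    return header_value == content_md5(body)

body = b"hello world"
assert body_matches(content_md5(body), body)
assert not body_matches(content_md5(body), b"tampered body")
```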

I do share the expressed concern about making the correlation ID the
same URL used by the UA. I appreciate the fact that it can just be
stored as part of the cached headers, but too much rewriting, routing,
et al. goes on in today's Web to expect that URL to be consistently
matched (or even easily translated to be the same). More importantly, I
fear that we would accidentally introduce false positives into the
system as each endpoint viewed the naming differently, flagging errors
even when pipelining was working. It's not clear to me that assoc-req is
any less fragile than the anonymous pipelining it is trying to help.
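The false-positive worry can be illustrated with a toy comparison (the URLs below are hypothetical): a naive assoc-req check compares the URL the UA sent against the URL the origin echoes back, so any intermediary that rewrites the path makes a correct response look like a pipelining failure.

```python
def assoc_req_matches(ua_url: str, echoed_url: str) -> bool:
    """Naive assoc-req check: does the origin's echoed URL equal the
    one the UA actually requested?"""
    return ua_url == echoed_url

# The UA asked for one URL; a proxy rewrote it before the origin saw it.
ua_sent = "http://example.com/img/logo.png"
origin_echoed = "http://backend.internal/cache/img/logo.png"

# The response is perfectly correct, yet the check reports a mismatch.
assert not assoc_req_matches(ua_sent, origin_echoed)
```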

In contrast, the checksum scheme is also cache friendly, isn't dependent
on naming, seems to accomplish all of what assoc-req does(*), and has
the advantage of already being well defined. There is a minor
computational cost, to be sure, but I think its lack of deployment is
more related to a lack of perceived benefit than anything else. If
developers buy into a benefit for assoc-req, then why wouldn't they be
convinced by the same logic supporting MD5? And Content-MD5 doesn't need
a new standards document.

Also along those lines, MD5 plays well with proxies. Assoc-req makes my
head hurt - the draft says proxies should never generate assoc-req (and
I get why), but pipelining is certainly not necessarily an end-to-end
property in practice. Some hops might want to initiate it by
multiplexing different TCP sessions together (crazy in today's
conditions, but if they had reason to believe it was safe to do so, it
makes a lot of sense for something like a server-side content
accelerator). Some hops commonly undo pipelines without breaking interop
simply by behaving in a store-and-forward fashion. And even if a hop
didn't want to initiate a pipeline, it might want to deploy a "safe
pipeline" spec in order to safely enable pipelining in legacy browsers
as a service to its users.

(*) assoc-req could of course also be used in an out-of-order scheme,
but we seem to be in agreement that that is out of scope for HTTP/1.x.

Conditions 1a and 4 suffer from an information gap that hinting could
help with. 

Especially #4 - a primary case is simply filtering out things like Comet
long polls as well as large and slow documents. To a reasonable extent,
content (and JavaScript) authors can figure some of that out and label
it as you suggest, but I am troubled by the gap between the protocol and
whatever markup the attribute lives in. I don't have a suggestion at my
fingertips, but it seems that if the hint lived within the protocol
somehow it would be much more versatile and helpful, right? The timeout
of 1b could also be shortened significantly in the presence of some kind
of hinting.
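To gesture at what a protocol-level hint might look like: an entirely hypothetical response header (the name "Pipeline-Hint" is invented here, not from any draft) by which a server could mark a resource as unsuitable for pipelining - a Comet long poll, say - letting the UA schedule it on its own connection.

```python
def pipeline_unsafe(headers: dict) -> bool:
    """Check a (hypothetical, invented) Pipeline-Hint response header
    that would mark a resource as a bad pipelining candidate."""
    return headers.get("Pipeline-Hint", "").lower() == "unsafe"

# A long-poll endpoint flags itself; ordinary responses carry no hint.
assert pipeline_unsafe({"Pipeline-Hint": "unsafe"})
assert not pipeline_unsafe({"Content-Type": "text/html"})
```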

I'm pessimistic that attribute-level metadata can help much with
scenario #1a. That one seems driven more by the implementation and
configuration of the server itself than by the application data it
serves, and the latter often doesn't know much about the former. The
server could hint at the protocol layer about its implementation, but
that doesn't promise to be very satisfying. I wish I were making a more
constructive statement here. In some ways this is the least significant
but most intractable problem.

HTH at least a little.

-Patrick

Received on Monday, 23 August 2010 20:32:16 UTC