Long-polling to monitor resources

Some time back we established that long-polling might be the only currently available method for low-latency server-to-client communication without major changes to the infrastructure.

Intermediary-friendly long-polling is something I've been thinking on.  It would be really nice if intermediaries could help, rather than being rudely shoved aside with a ``Cache-Control: no-cache'' directive on either request or response.

How does a client indicate to an intermediary that a request might be a long-poll?  What is the most effective model to apply when using long-polling?  What behaviour should a cache exhibit?

A resource-based model for long-polling is fairly simple.  For instance, a resource might be identified by "https://example.com/events?user=2616"; retrieving this resource provides the latest event (or events).

Polling the resource provides the client with state updates; the client is notified of events.  Long-polling limits the amount of traffic generated by polling while keeping latency low.

What I'm looking for is a set of mechanisms/suggestions that a) improve long-polling, and b) break in a graceful manner if intermediaries don't understand them.  These might be advice on using existing mechanisms, or they might be extensions to HTTP.

It was mentioned in the HyBi discussions that this might be something that httpbis would be open to adopting.  Is there any interest in this?

--Martin


~~~

==Thoughts

There are some thoughts in 
<http://tools.ietf.org/html/draft-loreto-http-bidirectional-01#section-6>

===Request Construction

What does the client include in their request?  If a request reaches the origin server it can hold onto the request, since it likely knows that this is the purpose of the resource.

An intermediary doesn't have this knowledge and it can't distinguish a normal request from a long-poll.  This is the reason that caching is usually intentionally suppressed.

Using conditional headers (If-Modified-Since, If-None-Match) might suffice.  Are there problems that people are aware of with using these?
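To make the conditional-header idea concrete, here is a minimal sketch of how a client might build its long-poll request from the validators it saw on the previous response.  The function name and its parameters are made up for illustration; only the two header names come from HTTP itself.

```python
from typing import Dict, Optional


def build_poll_headers(etag: Optional[str] = None,
                       last_modified: Optional[str] = None) -> Dict[str, str]:
    """Return request headers for a conditional (long-)poll.

    Hypothetical helper: the validators are whatever the previous
    response carried in its ETag / Last-Modified headers.
    """
    headers: Dict[str, str] = {}
    if etag is not None:
        # "Only answer with a body if the resource no longer has this tag."
        headers["If-None-Match"] = etag
    if last_modified is not None:
        # "Only answer with a body if it changed after this date."
        headers["If-Modified-Since"] = last_modified
    return headers
```

A first poll, with no validators yet, produces no conditional headers at all, so it looks like an ordinary GET to any intermediary on the path.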

Some form of explicit indication might help.  An indication would also help an intermediary identify requests that are likely to tie up resources for significant periods.  Maybe a proxy could allocate connections for long-polling requests from a separate pool.

===Timeouts

This leads to timeouts.  Many intermediaries use them so that shared resources (outgoing connections) are not monopolized by a few.  If a server decides to time out, it can send 304; if an intermediary times out, it can send 504.
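From the client's side the two timeout cases look much the same, which a small sketch can show.  The names PollState and handle_response are assumptions for illustration, and treating every other status as an error is a simplification.

```python
from typing import NamedTuple, Optional


class PollState(NamedTuple):
    """What the client currently knows about the polled resource."""
    etag: Optional[str] = None
    body: Optional[bytes] = None


def handle_response(state: PollState, status: int,
                    etag: Optional[str] = None,
                    body: Optional[bytes] = None) -> PollState:
    """Fold one long-poll response into the client's state."""
    if status == 200:
        # New events arrived: remember them and the new validator.
        return PollState(etag=etag, body=body)
    if status in (304, 504):
        # 304: the server timed out with nothing new.
        # 504: an intermediary gave up waiting on the server.
        # Either way the state is unchanged and the client polls again.
        return state
    raise ValueError(f"unexpected status for a long-poll: {status}")
```

After a 304 or 504 the client simply reissues the request with the same validator it used before.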

A client cares little for this, but it's not especially nice on the intermediary, which has to drop its connection to the upstream server (the server seems unresponsive).  If it knew that this was a long-poll, maybe it wouldn't time out so quickly.  Alternatively, the client could let the server know when it should time out; that way the intermediary doesn't need to be upgraded and it doesn't bear the negative consequences.

Perhaps the client could detect a gateway timeout and advise the server of when this might occur.  This way, the server could send the 304 earlier and potentially avoid the problem.  A gateway timeout would then be unlikely to occur more than once for a particular path.

My thought is that this would work most effectively on a per-resource basis, with a fair margin of error.  Problems include multiple paths, changing intermediaries, packet loss and volatile network topologies.  Is there still any value in this despite these problems?
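A rough sketch of the per-resource bookkeeping this would need on the client follows.  Everything here is an assumption for illustration: the class name, the 0.75 safety margin, and the choice to keep the shortest timeout seen.  How the advised value is conveyed to the server (a header, a query parameter) is deliberately left open.

```python
from typing import Dict, Optional


class TimeoutAdvisor:
    """Tracks observed gateway timeouts, one entry per resource URI."""

    def __init__(self, margin: float = 0.75) -> None:
        # Fraction of the observed timeout to advise; the rest is the
        # "fair margin of error" for path and timing variation.
        self.margin = margin
        self._shortest: Dict[str, float] = {}

    def record_gateway_timeout(self, resource: str, elapsed: float) -> None:
        """Note that a 504 arrived `elapsed` seconds after the request."""
        prev = self._shortest.get(resource)
        # Keep the shortest timeout seen, since paths may change.
        if prev is None or elapsed < prev:
            self._shortest[resource] = elapsed

    def advised_wait(self, resource: str) -> Optional[int]:
        """Seconds the server should wait before sending 304, if known."""
        shortest = self._shortest.get(resource)
        if shortest is None:
            return None  # no 504 seen yet; let the server decide
        return max(1, int(shortest * self.margin))
```

Keeping the minimum is a conservative choice: if the path changes to one with a longer timeout, the client never finds out, but it also never triggers a second 504 just to probe.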

===Response Construction

On the return journey, what should a server include?  I see a lot of ``Cache-Control: no-cache'' directives being handed down, but is that actually the best option?

A client is perfectly able to request that a cache not interfere.  A client might be perfectly happy with the risk that information is old--within certain bounds.  

Setting max-age to a small value might allow for significantly improved responsiveness, particularly for public resources that can be stored in shared caches.
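As an illustration of the alternative to no-cache, a server might build its response headers along these lines.  The helper name and its parameters are hypothetical; the Cache-Control directives (public, private, max-age) and the ETag header are standard HTTP.

```python
from typing import Dict


def build_event_response_headers(etag: str, max_age: int,
                                 shared: bool) -> Dict[str, str]:
    """Headers for a long-poll response that caches may briefly reuse.

    `max_age` is the number of seconds of staleness the server is
    willing to tolerate; `shared` says whether the resource is public
    enough to be stored by shared caches.
    """
    if max_age < 0:
        raise ValueError("max_age must be non-negative")
    scope = "public" if shared else "private"
    return {
        # The validator lets the next poll be conditional.
        "ETag": etag,
        "Cache-Control": f"{scope}, max-age={max_age}",
    }
```

With a value of a few seconds, a shared cache can answer a burst of polls for the same public resource without each one reaching the origin server.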

Received on Thursday, 14 January 2010 06:38:05 UTC