- From: Jamie Lokier <jamie@shareable.org>
- Date: Wed, 11 May 2011 13:01:03 +0100
- To: Willy Tarreau <w@1wt.eu>
- Cc: Ben Niven-Jenkins <ben@niven-jenkins.co.uk>, HTTP Working Group <ietf-http-wg@w3.org>
Willy Tarreau wrote:
> Example 1:
> video streaming sent by chunks might start playing only once download
> is complete. However, once transferred, it works, so even if this causes
> discomfort to the end user, it does not break the original goal of the
> application.

I know of several *infinitely* long video streaming applications, and
plenty of internet radio audio streaming applications, which expect to be
able to stream continuously. They will break if an intermediary tries to
buffer it all.

> Example 2:
> an application which sends exactly one image in each chunk, hoping
> for the recipient to use chunks as delimiters might break if an
> intermediary rechunks them.

I see rechunking as similar to TCP packet boundaries. Nobody should
depend on chunk boundaries to mean anything.

> Example 3:
> an application which is based on short request-response transfers
> inside a single message body has a lot of chances to break once any
> intermediary buffers.
>
> I don't know if it's clearer now. Probably we'd find a better rewording.

I think there are 3 different issues being mixed up here:

1. Streaming or buffering a message (in one direction only).
2. Streaming request while streaming response.
3. Depending on reliable chunk boundaries.

I think it's important to distinguish them, instead of declaring them all
invalid while lumped together.

1. Streaming or buffering a message (in one direction only)
-----------------------------------------------------------

1a: The ability to stream a response continuously is a fairly well
established use for some applications, such as internet radio. Those
applications don't require every byte to be forwarded immediately, only
that the bytes are delivered in a reasonable time.

1b: A fair number of applications expect HTML or Javascript response
streaming to work.
HTML streaming goes back to the early days of Netscape Navigator, when it
was possible to transmit multiple HTML documents, each replacing the
previous one, and it was *expected* that they'd be delivered in a
reasonable time. Javascript streaming is what people do with iframes
these days (still) in lieu of XmlHttpRequest, for a variety of reasons.
However, there are still question marks around how reliable streaming is
in practice.

1c: It's clear that if an intermediary buffers whole messages, it will
limit the message size. For example, if you PUT or GET an 80GB file
(haven't we all), not many full-buffering proxies are going to be happy
with that. Yet it is clearly a valid use of RFC2616 HTTP. That is an
example of something which works until you insert a proxy (of a certain
kind), where you can't really complain about the application, and perhaps
should complain about the proxy.

1d: For unidirectional streaming, it should be noted that CONNECT's
buffering/forwarding behaviour is not fully specified either (e.g.
forward every byte eagerly, or delay for a *limited* time to coalesce TCP
segments). If a proxy has a sane CONNECT implementation, it is not
unreasonable for it to use the same strategy for streaming
*unidirectional* requests or responses, if it has no reason to buffer
them more.

1e: In practice, if unidirectional streaming is not reliable enough, the
common strategy is to transmit a stream of entire, bulky HTTP messages
instead to accomplish the same task. Or to try using CONNECT.

2. Streaming request while streaming response
---------------------------------------------

2a: This is certainly debatable and/or dubious. I know from experiments
that it does not work with all browser clients. Sometimes the client
waits until it has sent the whole request before reading the response (so
deadlock is possible if the server doesn't handle this).
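As an aside, the deadlock in 2a is avoidable if the client reads and writes concurrently rather than sending the whole request first. A rough sketch over a local socket pair (plain Python, no real HTTP; names and sizes are my own, purely illustrative):

```python
import socket
import threading

TOTAL = 1_000_000  # bytes of "request" body to stream

def echoing_server(conn, nbytes):
    # A "server" that streams its response while the request is still
    # arriving, instead of buffering the whole request first.
    received = 0
    while received < nbytes:
        data = conn.recv(65536)
        if not data:
            break
        received += len(data)
        conn.sendall(data)
    conn.close()

def streaming_writer(sock, nbytes):
    # Stream the "request" body without waiting for any response bytes.
    piece = b"x" * 65536
    sent = 0
    while sent < nbytes:
        n = min(len(piece), nbytes - sent)
        sock.sendall(piece[:n])
        sent += n
    sock.shutdown(socket.SHUT_WR)

client, server = socket.socketpair()
threading.Thread(target=echoing_server, args=(server, TOTAL)).start()
threading.Thread(target=streaming_writer, args=(client, TOTAL)).start()

# The client reads the response *concurrently* with sending the request.
# If it sent everything before reading anything, both ends could fill
# their socket buffers and deadlock -- exactly the failure mode in 2a.
got = 0
while True:
    data = client.recv(65536)
    if not data:
        break
    got += len(data)
```

The same structure (a writer thread plus a reader loop) is what a client needs over a real HTTP connection, and it is precisely what some browsers don't do.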
2b: I wouldn't expect all proxies to support this, even those which
handle CONNECT fine, which is a shame, as it would be quite useful if it
worked.

2c: In practice, applications can simply open two HTTP connections and
stream one direction over each connection (at some loss of efficiency).
This removes the question of bidirectional streaming, while still
depending on unidirectional streaming behaviours.

3. Depending on reliable chunk boundaries
-----------------------------------------

3a: Chunk boundaries are analogous to TCP segment boundaries: absolutely
nothing should *ever* depend on their position. If a protocol wants to do
something fancy with incremental messages, that should be encoded in the
data bytes of the messages *only*. Even transparent proxies may rechunk
arbitrarily according to instantaneous low-level TCP states.

3b: It isn't specified, but any behaviour that depends on a proxy
forwarding bytes in a reasonable time also requires it to forward *parts*
of chunks in a reasonable time. That follows from the chunk boundaries
having no semantic significance, and, practically, from the fact that an
earlier sender may merge chunks arbitrarily.

3c: The sole purpose of chunks is to allow the sender to terminate the
data stream when it is ready, without advance knowledge of the message
size. This is why chunk boundaries don't (or shouldn't) mean anything.
It's just a mechanism for encoding an out-of-band "end of message".

3d: The example which started this thread, of an application expecting
individual chunks to be forwarded through haproxy immediately, is broken
by design if it requires chunk boundaries to be preserved. This is
*separate* from whether it requires simultaneous request and response
streaming, and from whether it requires its data bytes to be forwarded in
a reasonable time (and in that case it should use a sliding window, so
that small delays don't stall the protocol's progress disproportionately).
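To make 3a and 3c concrete, here is a rough Python sketch (illustrative only; `frame`, `to_chunks` and `Deframer` are my own names, and a real decoder would need error handling, chunk extensions and trailers): message boundaries are carried as length prefixes in the data bytes, chunked coding merely terminates the stream, and the receiver is indifferent to where the chunks were cut.

```python
import random
import struct

# --- Sender side ---

def frame(messages):
    # Protocol boundaries live in the data bytes: each message gets a
    # 4-byte big-endian length prefix. Chunk boundaries carry nothing.
    return b"".join(struct.pack(">I", len(m)) + m for m in messages)

def to_chunks(data, rng):
    # Chunked coding's only job: end the stream without knowing the
    # size in advance. Chunk sizes here are deliberately arbitrary,
    # as any intermediary is free to merge or resplit them.
    out, i = b"", 0
    while i < len(data):
        piece = data[i:i + rng.randint(1, 7)]
        out += b"%x\r\n" % len(piece) + piece + b"\r\n"
        i += len(piece)
    return out + b"0\r\n\r\n"  # zero-length chunk = end of message

# --- Receiver side ---

def from_chunks(stream):
    # Decode chunked coding back to the raw byte stream.
    data, i = b"", 0
    while True:
        j = stream.index(b"\r\n", i)
        n = int(stream[i:j], 16)
        if n == 0:
            return data
        data += stream[j + 2:j + 2 + n]
        i = j + 2 + n + 2  # skip the chunk data and its trailing CRLF

class Deframer:
    # Incremental: indifferent to how the byte stream was cut up.
    def __init__(self):
        self.buf, self.messages = b"", []
    def feed(self, data):
        self.buf += data
        while len(self.buf) >= 4:
            n = struct.unpack(">I", self.buf[:4])[0]
            if len(self.buf) < 4 + n:
                break
            self.messages.append(self.buf[4:4 + n])
            self.buf = self.buf[4 + n:]

msgs = [b"one", b"a longer message", b"last"]
wire = to_chunks(frame(msgs), random.Random(0))

# Feed the decoded bytes in arbitrary slices, as a proxy might deliver
# them: the application still recovers exactly the original messages.
raw = from_chunks(wire)
d = Deframer()
for piece in (raw[:5], raw[5:20], raw[20:]):
    d.feed(piece)
```

Re-running `to_chunks` with a different random seed changes every chunk boundary yet decodes to the same messages, which is the whole point of 3a.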
Finally: the buffering / buffer-timing behaviour of CONNECT is not
specified either. It is commonly understood that all bytes must be
forwarded in a reasonable time (otherwise HTTPS wouldn't work), but not
whether they are forwarded *eagerly*, or delayed for a short time for TCP
efficiency, as haproxy does to streamed messages in the example which
started this thread.

An application like the one which started this thread, but running over
CONNECT, may well be bitten by the same delay/RTT issue with a proxy in
the middle, and this is likely to affect WebSocket users who design
protocols sensitive to RTT as well.

-- Jamie
Received on Wednesday, 11 May 2011 12:01:35 UTC