- From: Alex Rousskov <rousskov@measurement-factory.com>
- Date: Fri, 8 Jan 2016 11:33:37 -0700
- To: ietf-http-wg@w3.org
- Cc: Kazuho Oku <kazuhooku@gmail.com>
On 01/08/2016 12:17 AM, Kazuho Oku wrote:
> Yesterday, Mark and I have submitted a new draft named "Cache Digests
> for HTTP/2."
> https://datatracker.ietf.org/doc/draft-kazuho-h2-cache-digest/
>
> The draft proposes a new HTTP/2 frame named CACHE_DIGEST that conveys
> client's cache state so that a server can determine what should be
> pushed to the client.
>
> Please let us know how you think about the proposal.

If possible, I recommend removing Draft language that makes (or appears
to make) your feature specific to optimizing push traffic to user
agents. Cache digests are useful for many things; optimizing push
traffic to user agents is just one use case. For example, Squid proxies
already use Cache Digests (based on Bloom filters) to optimize
cache-to-cache communication in caching hierarchies [1,2].

[1] http://www.squid-cache.org/CacheDigest/cache-digest-v5.txt
[2] http://wiki.squid-cache.org/SquidFaq/CacheDigests

I suspect it is possible to define the new CACHE_DIGEST frame without
adding artificial restrictions on its use. Let the agents sending and
receiving that frame decide what use is appropriate between them, while
following some general guidelines.

Since there are already two cache digest formats (one based on Bloom
filters and one based on Golomb-coded sets), we should expect a third.
Have you considered allocating the first few payload octets to specify
the digest format?

> A CACHE_DIGEST frame can be sent from a client to a server on any
> stream in the "open" state, and conveys a digest of the contents of
> the cache associated with that stream

Perhaps I am missing some important HTTP/2-derived limits here, but the
"cache associated with a stream" sounds too vague because HTTP caches
are often not associated with specific streams. Did you mean something
like "the cache portion containing shared-origin URIs"?
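To illustrate, a leading format-identifier octet could look like the
sketch below. The names and numeric codes are purely illustrative (not
from the draft); the point is that a recipient can skip digests in
formats it does not understand without a new frame type:

```python
# Hypothetical sketch: prefix the CACHE_DIGEST payload with one octet
# identifying the digest format, so new formats can be added later.
# Format codes below are made up for illustration.
import struct

DIGEST_FORMAT_GCS = 0x01    # Golomb-coded set (as in the draft)
DIGEST_FORMAT_BLOOM = 0x02  # classic Bloom filter (as in Squid)

def wrap_digest(fmt: int, digest: bytes) -> bytes:
    """Prepend a one-octet format identifier to the digest payload."""
    return struct.pack("!B", fmt) + digest

def unwrap_digest(payload: bytes):
    """Split a payload into (format, digest); a recipient can ignore
    digests whose format code it does not recognize."""
    fmt, = struct.unpack("!B", payload[:1])
    return fmt, payload[1:]
```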
> servers ought not
> expect frequent updates; instead, if they wish to continue to utilise
> the digest, they will need update it with responses sent to that
> client on the connection.

Perhaps I am missing some important HTTP/2 caveats here, but how would
an origin server identify "that client" when the "connection" is coming
from a proxy and multiplexes responses to many user agents?

> 1. Convert "URL" to an ASCII string by percent-encoding as
>    appropriate [RFC3986].

There are many ways to percent-encode the same URI. This step must
define a single way of doing so. Besides case-insensitive parts and the
decision of which characters to [un]escape, please do not forget about
trailing slashes, URI fragments, and other optional parts. This is
critical for interoperation!

> MUST choose a parameter, "P", that indicates the probability of a
> false positive it is willing to tolerate

For clarity, please detail what you mean by a "false positive" in this
context. It may also be useful to mention whether the digesting
algorithm may create false negatives.

> 7. Write log base 2 of "N" and "P" to "digest" as octets.

The wording is ambiguous: Store log2(N) and then store log2(P)? Store
log2(N&P)? Store log2(N) and then store P? I suspect it is the latter
and recommend splitting step #7 into two steps, one step per number.
BTW, why not store the actual value of N?

> 7. Write log base 2 of "N" and "P" to "digest" as octets.
...
> 8. Write "R" to "digest" as binary, using log2(P) bits.

It is not clear how a number should be written/encoded. Different
programming languages and different systems store/represent numbers
differently, so I would expect the Draft to specify the encoding
precisely. Sorry if I missed that detail.

The draft appears to be missing a section documenting how the digest
recipient can test whether the digest contains a given URI.
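To make the missing pieces concrete, here is a minimal, non-normative
sketch of a Golomb-coded-set digest in the spirit of the draft's
algorithm, together with the membership test the draft does not yet
describe. The hash truncation, bit order, rounding of N, and handling
of the first delta are my assumptions, not details from the draft; P
must be a power of two:

```python
# Illustrative Golomb-coded set (GCS) digest; NOT the draft's exact wire
# format. Assumes URLs are already percent-encoded ASCII and P is a
# power of two.
import hashlib
import math

def _hash(url: str, n: int, p: int) -> int:
    # Truncate SHA-256 of the URL into the range [0, N*P).
    digest = hashlib.sha256(url.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % (n * p)

def encode_digest(urls, p):
    """Return (bits, n, p): a GCS of the URL hashes as a list of 0/1 bits."""
    # Round N up to a power of two (an assumption; the draft stores log2(N)).
    n = 1 << max(1, math.ceil(math.log2(max(len(urls), 1))))
    values = sorted(set(_hash(u, n, p) for u in urls))
    rbits = p.bit_length() - 1          # log2(P) bits for each remainder
    bits, prev = [], 0
    for v in values:
        q, r = divmod(v - prev, p)
        prev = v
        bits += [1] * q + [0]           # quotient in unary, 0-terminated
        bits += [(r >> i) & 1 for i in reversed(range(rbits))]  # remainder, MSB first
    return bits, n, p

def digest_contains(bits, n, p, url):
    """Decode the GCS and test whether the URL's hash is present.
    False positives occur with probability roughly 1/P; there are no
    false negatives for URLs hashed with the same N and P."""
    target, value, i = _hash(url, n, p), 0, 0
    rbits = p.bit_length() - 1
    while i < len(bits):
        q = 0
        while bits[i] == 1:             # unary quotient
            q, i = q + 1, i + 1
        i += 1                          # skip the terminating 0
        r = 0
        for _ in range(rbits):          # fixed-width remainder, MSB first
            r = (r << 1) | bits[i]
            i += 1
        value += q * p + r
        if value == target:
            return True
    return False
```

Even a rough section along these lines would let independent
implementations interoperate and would make the false-positive
semantics explicit.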
Please consider an additional Security Consideration: Origin servers
are expected to store digests so that the stored digests can be
consulted when pushing traffic. Most origin servers will store digests
in RAM. A malicious client may send a huge digest as a form of a DoS
attack on a naive server that does not validate digest sizes. Malicious
clients may send many small digests as a form of a (D)DoS attack on
naive servers that do not control the total size of stored digests.

Thank you,

Alex.
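Both attacks have simple server-side mitigations. The sketch below is
illustrative policy only; the limits and the per-connection accounting
are made up here, not anything the draft specifies:

```python
# Illustrative guards against oversized digests and digest accumulation.
# Limits are arbitrary example values, not recommendations.
MAX_DIGEST_OCTETS = 4096            # reject any single oversized digest
MAX_TOTAL_OCTETS_PER_CONN = 32768   # cap the sum of all stored digests

class DigestStore:
    def __init__(self):
        self.digests = {}   # stream id -> digest bytes
        self.total = 0      # octets currently stored for this connection

    def store(self, stream_id: int, digest: bytes) -> bool:
        if len(digest) > MAX_DIGEST_OCTETS:
            return False    # huge digest: drop it (or reset the stream)
        old = len(self.digests.get(stream_id, b""))
        if self.total - old + len(digest) > MAX_TOTAL_OCTETS_PER_CONN:
            return False    # many small digests: refuse unbounded growth
        self.total += len(digest) - old
        self.digests[stream_id] = digest
        return True
```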
Received on Friday, 8 January 2016 18:34:20 UTC