- From: Daniel Stenberg <daniel@haxx.se>
- Date: Fri, 28 Aug 2009 15:45:59 +0200 (CEST)
- To: "Ford, Alan" <alan.ford@roke.co.uk>
- cc: Henrik Nordstrom <henrik@henriknordstrom.net>, Robert Siemer <Robert.Siemer-http@backsla.sh>, Mark Nottingham <mnot@mnot.net>, ietf-http-wg@w3.org, Mark Handley <m.handley@cs.ucl.ac.uk>
On Fri, 28 Aug 2009, Ford, Alan wrote:

> the client could infer capability by getting a Mirrors: header back from a
> HEAD request first, and then deciding what to do (assuming the connection
> can be kept alive).

That would work even if the connection isn't kept alive, wouldn't it?

> Which brings me onto another thing about Mirrors: header. One of our
> longer-term goals with this would be to somehow provide wildcarded lists of
> mirrors, so that a client could immediately run off and fetch bits of a
> website from many mirrors, potentially speeding up loading time
> considerably, and providing an alternative method of load balancing.
>
> However, I'm struggling to see a neat way of doing this reliably, since we
> couldn't get checksums for every file on the first handshake (or if all
> content was static we might be able to, but it's a big overhead). Does
> anybody have any ideas as to a neat way of doing this? Best I can think of
> so far is some sort of version number/(pseudo)hash of the entire directory
> structure!

This idea is attractive methinks, but coming up with a fine protocol for it is really tricky.

A hash of the entire directory would be problematic, I think, since it would imply that both directory structures need to remain identical: they would not only have to hold the right files, they could hold no extra files either. I'm thinking like: you have two sites A and B, and they show one picture each, A.jpg and B.jpg. Both sites refer to a mirror that holds BOTH those images in the same directory. It could work fine, but the mirror's dir doesn't look the same as the dir of A nor B. That concept would break too easily I think.

We want to avoid doing requests to non-existing resources on the mirror that'd respond with a 404 back (which then would have to be retried against the master site or another mirror) - we need a decent way for a client to know which URIs it can try to get from a mirror instead of the master...
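To make the A.jpg/B.jpg case concrete, here is a minimal sketch in Python. It is purely illustrative and not part of any proposal in this thread: the `directory_hash` scheme (SHA-256 over sorted names and content digests), the `mirror_has_all` helper and the file contents are all assumptions made up for the example. It shows that a whole-directory hash of the mirror matches neither site, even though the mirror holds every object both sites need.

```python
import hashlib

def directory_hash(listing):
    """Hypothetical whole-directory hash: SHA-256 over sorted names and content digests."""
    h = hashlib.sha256()
    for name in sorted(listing):
        h.update(name.encode("utf-8"))
        h.update(b"\0")  # separator so names and digests cannot run together
        h.update(hashlib.sha256(listing[name]).digest())
    return h.hexdigest()

def mirror_has_all(site, mirror):
    """Per-object check: every object of the site exists unchanged on the mirror."""
    return all(mirror.get(name) == data for name, data in site.items())

# Made-up contents for the two sites and the shared mirror.
site_a = {"A.jpg": b"picture A"}
site_b = {"B.jpg": b"picture B"}
mirror = {"A.jpg": b"picture A", "B.jpg": b"picture B"}

# The directory hashes differ because the mirror holds an extra file
# from each site's point of view.
hash_matches_a = directory_hash(mirror) == directory_hash(site_a)
hash_matches_b = directory_hash(mirror) == directory_hash(site_b)

# Checked object by object, the mirror is perfectly usable for both sites.
usable_for_a = mirror_has_all(site_a, mirror)
usable_for_b = mirror_has_all(site_b, mirror)
```

Under these assumptions the directory-level comparison rejects a mirror that a per-object comparison would accept, which is the breakage described above.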
I think all this makes me favour not a wildcard concept, but more a list-concept where a site can list not only that "this object also exists HERE and HERE" but then also "THESE OTHER OBJECTS also exist HERE and HERE" and "THESE OTHER" would then be a list of (relative?) URIs somehow. But this becomes awkward if the list of items is long.

Then we come to the concept of changing items. How long can a client assume that the mirrors have the corresponding object? Would they need some kind of cache control headers to specify that? In the mirror-for-a-single-object case I think we can assume that the mirror will have the object for at least a very short while after the response said so, but beyond that it too gets this problem.

--
 / daniel.haxx.se
Received on Friday, 28 August 2009 13:46:54 UTC