
RE: Multi-server HTTP

From: Daniel Stenberg <daniel@haxx.se>
Date: Fri, 28 Aug 2009 15:45:59 +0200 (CEST)
To: "Ford, Alan" <alan.ford@roke.co.uk>
cc: Henrik Nordstrom <henrik@henriknordstrom.net>, Robert Siemer <Robert.Siemer-http@backsla.sh>, Mark Nottingham <mnot@mnot.net>, ietf-http-wg@w3.org, Mark Handley <m.handley@cs.ucl.ac.uk>
Message-ID: <alpine.DEB.2.00.0908281401570.27641@yvahk2.pbagnpgbe.fr>
On Fri, 28 Aug 2009, Ford, Alan wrote:

> the client could infer capability by getting a Mirrors: header back from a 
> HEAD request first, and then deciding what to do (assuming the connection 
> can be kept alive).

That would work even if the connection isn't kept alive, wouldn't it?

> Which brings me onto another thing about Mirrors: header. One of our 
> longer-term goals with this would be to somehow provide wildcarded lists of 
> mirrors, so that a client could immediately run off and fetch bits of a 
> website from many mirrors, potentially speeding up loading time 
> considerably, and providing an alternative method of load balancing.
> However, I'm struggling to see a neat way of doing this reliably, since we 
> couldn't get checksums for every file on the first handshake (or if all 
> content was static we might be able to, but it's a big overhead). Does 
> anybody have any ideas as to a neat way of doing this? Best I can think of 
> so far is some sort of version number/(pseudo)hash of the entire directory 
> structure!

This idea is attractive methinks, but coming up with a fine protocol for it is 
really tricky.

A hash of the entire directory would be problematic, I think, since it would 
imply that both directory structures need to remain identical - not only hold 
the right files and no extra files.

I'm thinking like: you have two sites A and B, they show one picture each, 
A.jpg and B.jpg. Both sites refer to a mirror that holds BOTH those images in 
the same directory. It could work fine, but the mirror's dir doesn't look the 
same as the dir of either A or B. That concept would break too easily I think.

We want to avoid doing requests to non-existing resources on the mirror that'd 
respond with a 404 back (which then would have to be retried against the 
master site or another mirror) - we need a decent way for a client to know 
which URIs it can try to get from a mirror instead of the master...
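
To illustrate the cost of such a miss, here is a minimal Python sketch of a 
client that tries mirrors first and falls back to the master. Nothing here is 
part of any proposal: fetch_with_fallback and the injected fetch callable are 
made-up names, and the only status handled is 200 versus anything else.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical transport: fetch(url) -> (status_code, body). It is injected
# so the sketch does not depend on any particular HTTP library.
Fetch = Callable[[str], Tuple[int, bytes]]


def fetch_with_fallback(path: str, mirrors: Iterable[str], master: str,
                        fetch: Fetch) -> Tuple[str, bytes]:
    """Try each mirror in order, then the master; return (url, body)."""
    for base in list(mirrors) + [master]:
        url = base.rstrip("/") + path
        status, body = fetch(url)
        if status == 200:
            return url, body
        # Any other status (e.g. 404) costs one wasted round trip.
    raise LookupError("no server returned " + path)
```

Every mirror that lacks the object adds a full round trip before the client 
gets its data, which is the overhead a good "which URIs are mirrored" signal 
would remove.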

I think all this makes me favour not a wildcard concept, but more a 
list-concept where a site can list not only that "this object also exists HERE 
and HERE" but then also "THESE OTHER OBJECTS also exist HERE and HERE" and 
"THESE OTHER" would then be a list of (relative?) URIs somehow. But this 
becomes awkward if the list of items is long.
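
A rough Python sketch of how a client might hold such a list. The listing 
format is purely an assumption for illustration (a mapping from relative URI 
to full mirror URLs); no header or syntax for it has been defined, and the 
function names are hypothetical.

```python
from typing import Dict, List
from urllib.parse import urljoin


def build_mirror_map(page_url: str,
                     listing: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Resolve each listed relative URI against the page that listed it."""
    return {urljoin(page_url, rel): list(urls)
            for rel, urls in listing.items()}


def candidates(url: str, table: Dict[str, List[str]]) -> List[str]:
    """Mirror URLs advertised for url; empty means go to the master."""
    return table.get(url, [])
```

Because each entry carries full mirror URLs, the mirror's directory layout 
does not have to match the master's, and a client never asks a mirror for an 
object that was not listed. The price is one entry per object, which is 
exactly where a long list gets awkward.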

Then we come to the concept of changing items. How long can a client assume 
that the mirrors have the corresponding object? Would they need some kind of 
cache control headers to specify that? In the mirror-for-a-single-object case 
I think we can assume that the mirror will have the object for at least a very 
short while after the response said so but then it too gets this problem.
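
One way to picture the lifetime question is a hint that expires. This is 
only a sketch under assumptions: the max_age value is hypothetical (nothing 
in the discussion so far says where it would come from), and MirrorHint is a 
made-up name.

```python
from dataclasses import dataclass


@dataclass
class MirrorHint:
    mirror_url: str
    learned_at: float  # when the response naming the mirror arrived (seconds)
    max_age: float     # hypothetical lifetime of the hint (seconds)

    def usable(self, now: float) -> bool:
        """True while the hint is younger than its assumed lifetime."""
        return now - self.learned_at < self.max_age
```

Once usable() turns false the client would go back to the master, either for 
the object itself or for a fresh list of mirrors.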


  / daniel.haxx.se
Received on Friday, 28 August 2009 13:46:54 UTC
