- From: Paul Leach <paulle@microsoft.com>
- Date: Wed, 10 Jul 1996 16:34:59 -0700
- To: "'Roy T. Fielding'" <fielding@liege.ICS.UCI.EDU>
- Cc: "'http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com'" <http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com>
>----------
>From: Roy T. Fielding[SMTP:fielding@liege.ICS.UCI.EDU]
>Sent: Tuesday, July 09, 1996 9:06 PM
>To: Paul Leach
>Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
>Subject: Re: multi-host virtual sites for HTTP 1.2
>
>> What I'd like to see in HTTP/1.2 is the ability to easily and
>> efficiently make multiple hosts act like a single virtual Web site, so
>> that one can build huge web sites.
>
>Ummm, assuming it is decent server software, the only real limitation
>on site size right now is the bandwidth of the incoming network.

True enough today, but I think it has to be considered short-sighted given the exponential growth rate of the net and the even larger growth of "Intranets" (organization nets converting to use Internet protocols). We have a T3 connection to the net -- 45 megabits/sec. It isn't saturated, but it won't be long before it is, as the net grows exponentially. And then we'll get an OC3...

We can put multiple A records in for www.microsoft.com, and that will work as long as all the pages under www.microsoft.com fit conveniently on one server. At some point it becomes more than one server can handle; then we put in the gateway like you propose below.

But this is just our public info. What about all the internal pages? They're about 2 orders of magnitude larger in number. I can't put them on one server's disks, even if I replicate the server to handle the load, and I don't want to funnel all my internal accesses through gateways, any more than I would want to funnel all my network file system traffic through gateways. And as I distribute and re-distribute them across multiple servers' disks, I don't want to have to change their names -- some of them are well known, and the rest still require fixing all the links that point at them.

It is nearly axiomatic in distributed systems design that in order to really scale, you have to be able to replicate and partition and cache. The proposal was a small addition to HTTP to allow one to partition transparently to the names chosen (within a site) without the cost of 302s for each and every one.

>Splitting requests onto different server machines can be easily
>accomplished using a very fast gateway box which sits on the trunk
>and routes requests to the internal servers for processing. The only
>thing the gateway would need to do is look for the beginning of each
>message and multiplex from there (which is more difficult than
>IP routing, but not much more).

It needs to do more than just look at the front of each message -- it also needs to forward the request and relay the response. At full load, it would need to forward 90 megabits/sec to handle our _current_ site. Any single point will eventually become a bottleneck as the load increases. At some point, I have to replicate the gateway to handle the load. Software that implements a protocol to avoid all that is lots cheaper. Plus it lets me move pages within my site without changing all the links to the pages.

>I think that would be a lot easier than changing HTTP to accommodate
>complex URL rewriting rules.

They're not that complex -- just substitute one prefix for another. But OK, it would be better if no rewriting rules were needed. You've prompted me to think of a slightly different scheme where they aren't needed: the referral header just has a prefix and a new hostname to use for all Request-URIs that start with that prefix; the Request-URI and the Host sent to that new hostname are left unchanged from the original request. The servers are configured to know what subtree(s) of the name space they own, and look up in their local file system relative to that.
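[As a rough illustration of the scheme sketched above -- nothing here is specified in this thread; the header name "Referral", its "prefix; host=..." syntax, the use of a 302 to carry it, and the hostnames are all invented for the example -- a server that knows which subtrees it owns might answer like this:]

```python
# Minimal sketch of the prefix-referral idea described above.  The header
# name "Referral", its "prefix; host=..." syntax, the choice of a 302, and
# the hostnames are assumptions for illustration; the message only says the
# header carries a URI prefix and a new hostname.

# Subtrees of the site's name space this server owns; paths under them are
# looked up relative to the local file system.
OWNED_PREFIXES = ["/public/", "/press/"]

# Prefix -> hostname map for subtrees that live on other servers in the site.
REFERRALS = {
    "/internal/specs/": "inner1.example.com",
    "/internal/builds/": "inner2.example.com",
}

def handle(request_uri):
    """Return (status, headers) for one request: serve locally or refer."""
    # Serve from local disk if the Request-URI falls inside a subtree we own.
    for prefix in OWNED_PREFIXES:
        if request_uri.startswith(prefix):
            return 200, {}

    # Otherwise refer the client to whichever server owns the subtree.  One
    # response carries a prefix-wide referral: the client resends the
    # unchanged Request-URI (and Host) to the new hostname, and can remember
    # the prefix -> host mapping instead of taking a 302 for every page.
    for prefix, new_host in REFERRALS.items():
        if request_uri.startswith(prefix):
            return 302, {
                "Location": f"http://{new_host}{request_uri}",
                "Referral": f"{prefix}; host={new_host}",
            }

    return 404, {}

# Example: a request for /internal/specs/http12.htm is referred to
# inner1.example.com, and the client can reuse that mapping for everything
# under /internal/specs/ without further round trips.
```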
> There are good reasons to support a
>URI rewriting mechanism on the client side (e.g., to make the URI
>syntax truly extensible, support automated mirroring, support URNs,
>etc.), but I haven't considered single-site performance to be one
>of them.

All of the arguments in your message apply equally well to the DNS -- why does it have referrals? It could have just used CNAME -- that's the analogy to the way HTTP uses links to glue together the distributed servers. Ditto for AFS and X.500.

Paul
Received on Wednesday, 10 July 1996 16:41:27 UTC