- From: Ted Hardie <hardie@merlot.arc.nasa.gov>
- Date: Wed, 20 Dec 1995 17:07:51 -0800 (PST)
- To: brian@organic.com (Brian Behlendorf)
- Cc: hardie@internic.nasa.gov, hammond@csc.albany.edu, www-talk@w3.org
First off, let me agree whole-heartedly that this is academic until someone implements a test. There is some benefit, however, to discussing what we might want to test.

The primary reason Brian seems to want to send the "Top" of the pre-fetched document (or other object) rather than the whole thing is to generate something like reasonable statistics for who actually saw a page rather than who fetched it. There are other, minor advantages, like being able to store bits of more pages in the user agent's cache without incurring lots of transport overhead, but this would apply only to objects whose MIME type made it reasonable to start displaying with only partial information (it seems like this could be a problem for partial markup even with some pseudo-HTML, by which I mean Netscapisms like "Javascript").

I dig the motivation, but I suspect it's not the right reason to do it this way. First, the statistics on who saw a document are affected by caches in fairly unpredictable ways, so fixing this bit does not really solve the bigger problem. Second, many other kinds of advertiser-supported media have the same problem, with both under-reported and over-reported statistics. Eventually, the advertisers get a feel for it, without having to engineer concrete statistics (imagine a TV that stored the ads until a proximity-based radiant heat sensor indicated that a human was nearby....). The Web is young, so it doesn't have that, but it will get it without our writing support for it into the basic methods of http.

What would be needed to fix the basic problem is some confirmation mechanism, so that a web server can see in a concrete way how many times the page was rendered. An optional header to do this might work this way: the server sends "Confirm: First" with a document; it is ignored by any intermediate caches and noted by the user agent.
If the user agent is set to respond to confirms, it notes the desire for the confirm and sends the confirmation when it renders the document. If Confirm: is set to Each (or Every, or some such), the browser sends a confirmation every time it renders the page. Make Confirm a method that takes the same arguments as HEAD and expects either no reply or some specified "confirmation received" response code.

This solves the basic problem of mega-caches *and* prefetched documents. It also means that the change in behavior occurs only when a server admin wants the confirmation. You might need to protect privacy by having a user dialog come up when a confirmation is requested, or by having a global setting for those users who don't care about confirmation. Still, the thing is pretty doable.

Why aren't we doing it then? My guess is, first, because it requires network resources that don't add anything to the user experience, and users are the ones with the thinnest pipes; if turning this off gives any improvement in local performance, it will be turned off. Secondly, it looks big-brotherish even if the confirmation is for a site that is selling sweet-little-old-lady products like quilts and doilies; people start to wonder what else is going on in the way of tracking them (not that they shouldn't wonder). Thirdly, it means the client has to keep track of another type of attribute for pages in user caches, orthogonal to all the current ones.

Maybe it will come down the pike in the future anyway, depending on how much content providers want it. I have not noticed, however, that content providers have much of a voice in influencing the big dogs (maybe that's just my experience as someone working on publicly funded content, however).

I guess that's more than two cents on this topic, so I'll shut up now.

Ted Hardie

Note: I do not speak for NASA.
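The Confirm behavior described above can be sketched from the user agent's side. This is only an illustration under stated assumptions, not an implementation of any real browser or HTTP library: the `UserAgent` and `CachedPage` classes, the `respond_to_confirms` setting, and the `("CONFIRM", url)` tuples standing in for network requests are all hypothetical names invented here. Only the "Confirm: First" / "Confirm: Each" values and the render-time trigger come from the message.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class CachedPage:
    # Hypothetical per-page cache attribute: "first", "each", or None.
    confirm_mode: Optional[str]
    confirmed: bool = False


class UserAgent:
    """Toy user agent; 'sent' records confirmations instead of using a network."""

    def __init__(self, respond_to_confirms: bool = True) -> None:
        # Assumed global user setting for whether confirms are honored.
        self.respond_to_confirms = respond_to_confirms
        self.cache: Dict[str, CachedPage] = {}
        self.sent: List[Tuple[str, str]] = []

    def store(self, url: str, headers: Dict[str, str]) -> None:
        # Note the server's wish when the document arrives (fetch or prefetch).
        mode = headers.get("Confirm", "").strip().lower()
        self.cache[url] = CachedPage(mode if mode in ("first", "each") else None)

    def render(self, url: str) -> None:
        # Confirmation is tied to rendering, not to fetching.
        page = self.cache[url]
        if not self.respond_to_confirms or page.confirm_mode is None:
            return
        if page.confirm_mode == "each" or not page.confirmed:
            self.sent.append(("CONFIRM", url))
            page.confirmed = True
```

In this sketch a prefetched page that is never rendered produces no confirmation at all, and a page stored with "Confirm: First" produces exactly one no matter how often it is redisplayed from the local cache.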
Brian Behlendorf writes:
> The second disadvantage you list, I don't have an answer to - yeah, there
> will be no prefetching there simply because it's a (potentially) different
> administrative domain. However, the first problem isn't a problem when you
> compare it to *no* prefetching - sure, a second document request is needed (I
> won't say new tcp connection or round trip because we could be talking
> persistent connections here), but at least you have the first screenful of
> the document to read while the rest is loading, so the *perception* is that
> there was no delay between the "click" and the beginning of the document
> rendering. Furthermore, you don't have the bandwidth hit of having all pages
> prefetched, only the very beginnings of those pages. I say make that
> "beginning" mark arbitrary, so server/site authors can configure that on an
> object-by-object basis.
>
> If we want to push this "smarts" back to the client, we could have a new
> method, say "TOP", which means "give me the headers and however many bytes of
> content you think I should be able to see before the full request goes
> through". In a typical persistent HTTP request, it means a GET is placed on
> a document, the document is parsed for IMG and EMBED-ed objects, those are
> fetched using GET, finally the document is parsed for HREF-linked objects,
> and those objects are sent a TOP method. When an HREF is selected, another
> full request happens just like nowadays, but the browser can render the TOP
> info it got immediately. Just how many bytes a TOP request returns is left
> up to the server/site author. Some servers may configure it to be to the
> first <HR> in an HTML doc - others may say the first 1500 bytes. The server
> should also have some way of saying "look, the object you wanted was so
> small, I gave you the whole thing anyways".
>
> This is academic theory until it's implemented as a test somewhere, so
> I won't press too much more on it.
>
> Brian
>
> --=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
> brian@organic.com brian@hyperreal.com http://www.[hyperreal,organic].com/
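The server-side choice Brian describes for TOP (cut at the first <HR>, or after the first 1500 bytes, and say so when the whole object was returned) might look roughly like the sketch below. It assumes a hypothetical helper named `top_of`; the function name, its parameters, and the fallback from the marker policy to the byte limit are illustrative choices made here, not part of any proposal in the thread.

```python
from typing import Optional, Tuple


def top_of(
    body: bytes,
    max_bytes: int = 1500,
    marker: Optional[bytes] = b"<HR>",
) -> Tuple[bytes, bool]:
    """Return (top_portion, is_whole_object) for a hypothetical TOP request."""
    cut = max_bytes
    if marker is not None:
        # Case-insensitive search, since HTML tag names are not case sensitive.
        pos = body.lower().find(marker.lower())
        if pos != -1:
            cut = pos
    if cut >= len(body):
        # "The object you wanted was so small, I gave you the whole thing."
        return body, True
    return body[:cut], False
```

A client that receives the whole-object flag could skip the later full GET; otherwise it renders the top portion immediately and fetches the rest when the link is selected.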
Received on Wednesday, 20 December 1995 20:03:00 UTC