- From: Koen Holtman <koen@win.tue.nl>
- Date: Tue, 13 Aug 1996 23:55:51 +0200 (MET DST)
- To: Simon Spero <ses@tipper.oit.unc.edu>
- Cc: koen@win.tue.nl, mogul@pa.dec.com, jg@zorch.w3.org, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Simon Spero:
>
>
>With NG, there isn't quite as strong a division between the UA profile and
>the individual user's profile; profiles can be modified dynamically;
>the server has the option of either caching the whole modified profile,
>or just the 'base class' profile; the client just needs to know and note
>what profile has been cached.

So if I understand you correctly, there are two profiles: the UA
profile and the user profile.  Or is there a complete inheritance
layer in there, which allows 5 layered profiles if you want to?

How does the client know what profile has been cached?  If it knows
that the user profile is cached, will the client send the entire
profile in the request or will it omit any user profile info?  If I'm
negotiating a bilingual homepage, how do I tell the client to send me
the user profile with the language preferences?

>> where do you stop exactly?  How many `profile cache misses' do you
>> estimate under your proposal?
>
>The question of where you stop is up to the server; just caching the UA
>specific base profiles wins big; spending the extra effort to cache
>per-user profiles is an even bigger win - the tradeoff depends on how
>much persistent storage you want to dedicate to the problem.
>
>I would expect to see a big win with a cache size of around 20 (enough for
>the most popular Nevergethere, Exploder, and Slosaic versions to be safe
>from getting flushed by the small fry).  There'd be a bigger win around
>2000, as even the small-fry get to stay put.

I think you are right about the UA profile (as long as UAs remain
monolithic systems, that is).  I have doubts about this working for
user profiles; read on.

>For a big site with a regular audience,

But what if I'm a small site where random people drop by to read 2
pages?

>it might be worth spending a
>hundred dollars or so on this and dedicating up to a gig or to profile
>caching; this keeps things really fast for caching.

Taking 1K for a user profile, that would mean you have 1M users!

Suppose I am a small site where random people drop by to read 2
language-negotiated pages.  Now, will user profile caching outperform
sending large headers combined with a sticky header scheme in this
case?

That depends mainly on

  P_same: the probability that two users have the same user profile.

If P_same is 1 in 1 million, and if you have already had 10,000
different users (OK, the site is not so small after all), the next
user will get a user profile cache hit in about 1% of all cases.  Not
very optimal.

So what is a realistic estimate of P_same?  That depends on the amount
of variance in the user profile.  Making a table of things you want in
your user profile:

  VERY CONSERVATIVE
  estimate of number of
  different settings     description

        100              accepted languages with some q factors
         10              accepted charsets with some q factors
        2^8              presence of common file viewers
                         (word processors, movie players, audio players)
        3^4              q factors for the 4 file viewers you have
        2^5              presence of common plug-ins
         10              type of screen (colordepth, monochrome, ...)
          4              size of screen
          2              I hate frames
          2              I hate animated gifs
          2              Life is too short for large images
  --------- *
       2e11

So a very conservative P_same is 1 in 2e11.  This reduces the chances
of getting a user profile cache hit for a different user to zero.
(Except for users who never change their default profiles, but how
many web users would be that boring?)

Also, with this P_same, the user profile cache key essentially becomes
a global user identifier.  Definitely not good for privacy.

I don't see how you can ever get the numbers working for user profile
caching.  The introduction of a user profile caching scheme seems to
put penalties on sites seeking an irregular audience, even if such
sites are very big.

It seems to me that only transparent content negotiation scales for
P_same values of 1 in 1 million or more (while also still protecting
privacy).
And I think that for any moderately interesting collection of
things to negotiate on (like the table above), you will get a P_same
like this.

You can also look at this result in another way: if user profile
caching were effective, then P_same would be so high (that is, there
would be so few distinct profiles) that we could easily encode all
user profile information in a short request header.

It seems that profile caching is a good way to `translate' a
user-agent string into a large table of capabilities (assuming that
user agents stay monolithic), but that it has little general use
beyond that.

>Simon (more later)

Koen.
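
The two estimates in the message above (the 2e11 product from the table
and the "about 1%" hit rate) can be checked with a short calculation.
This is only a sketch under assumptions the message does not state:
the profile dimensions are treated as independent and uniformly
distributed, and every earlier visitor's profile is assumed to still be
in the cache.  The names SETTINGS, distinct_profiles and
hit_probability are made up for this illustration.

```python
from math import expm1, log1p, prod

# Counts of distinct settings per profile dimension, copied from the
# "very conservative" table in the message.
SETTINGS = {
    "accepted languages with some q factors": 100,
    "accepted charsets with some q factors": 10,
    "presence of common file viewers": 2**8,
    "q factors for the 4 file viewers": 3**4,
    "presence of common plug-ins": 2**5,
    "type of screen": 10,
    "size of screen": 4,
    "I hate frames": 2,
    "I hate animated gifs": 2,
    "life is too short for large images": 2,
}


def distinct_profiles(settings):
    """Number of distinct user profiles if all dimensions are independent."""
    return prod(settings.values())


def hit_probability(p_same, users_seen):
    """Chance that a new user's profile matches at least one cached profile.

    Assumes each of the users_seen cached profiles matches independently
    with probability p_same, so the result is 1 - (1 - p_same)**users_seen.
    log1p/expm1 keep the value accurate when p_same is tiny.
    """
    return -expm1(users_seen * log1p(-p_same))


if __name__ == "__main__":
    n = distinct_profiles(SETTINGS)
    print("distinct profiles:", n)
    print("hit rate, P_same = 1 in 1e6, 10,000 users:",
          hit_probability(1e-6, 10_000))
    print("hit rate, P_same = 1 in", n, ", 10,000 users:",
          hit_probability(1 / n, 10_000))
```

Under these assumptions the first hit rate comes out close to the 1%
quoted in the message, and the second is small enough to be treated as
zero for practical purposes.  Real profiles are not uniformly
distributed (many users keep defaults), so actual hit rates could be
higher than this sketch suggests.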
Received on Tuesday, 13 August 1996 15:00:43 UTC