- From: Francois Daoust <fd@w3.org>
- Date: Tue, 03 Jun 2008 11:41:59 +0200
- To: Jo Rabin <jrabin@mtld.mobi>
- CC: public-bpwg-ct@w3.org
Right, probably never, but it still should...

I mean, "dynamic content" is different from "dynamically generated
representations based on user-agents". In the former case, the Content
Provider should use some "no-cache" directive, whereas in the latter
case the Content Provider should use a "Vary: User-Agent" header, make
sure the response can be cached, and participate in the creation of a
better world by helping caches save bandwidth, which in practice means
using the "Content-Location" header to help intermediary caches
understand when two responses for two different User-Agent values are
actually the same.

If we don't put a note, CPs will use a "Vary: User-Agent" header without
even thinking that it impacts intermediary caches. If we put a note,
well, at least, readers will know there is something they should do,
even if they don't care.

That's the reason why I suggest the note. More like a contextual ad:
"Using the Vary header? Get 2 for the same price, adopt a lonely
Content-Location one!"

(suggested background music: "Heal the world, make it a better place...")

Francois.


Jo Rabin wrote:
> Thanks Francois, I guess I am wondering how often a server that offers a
> varying response can respond with a Not Modified status, given that the
> response would have been constructed by some dynamic process ...
>
> Jo
>
> On 02/06/2008 11:11, Francois Daoust wrote:
>> Well, I'm not a cache expert but I'd say that the following would happen:
>>
>> 1. the proxy receives a response with a Content-Location:
>> http://example.org/Representation1 header for a specific User-Agent
>> 2. it stores the response.
>> 3. another request is received for the same URI but with a different
>> User-Agent.
>> 4. the proxy cannot match on the stored response, but can send a
>> conditional request to the server.
>> 5. the server answers with the same representation response and a 304
>> Not Modified status
>> 6. the proxy serves the response that it has in cache
>>
>> In short, it only saves a bit of bandwidth between the server and the
>> proxy.
>>
>> If the conditional request is not possible for 4., then I totally
>> agree that this is just plain useless...
>>
>> Francois.
>>
>>
>> Jo Rabin wrote:
>>>
>>> I am a bit confused as to how/why this all works. It seems to me that
>>> for this to actually work cache efficiently, the cache would have to
>>> understand how _exactly_ the server processes the User Agent header.
>>>
>>> As pointed out earlier in this thread, there may be countless
>>> variations on the same basic header that as far as the server is
>>> concerned all represent near-enough the same thing. However, use of
>>> content location and vary headers does not give any clue as to how it
>>> makes that judgement.
>>>
>>> So it seems to me that a proxy, knowing that a server varies its
>>> representations based on the UA header can legitimately cache _only_
>>> if the UA is _exactly_ the same. So what puzzles me is why a content
>>> location header helps it. I think I must be missing the point here,
>>> and if so, apologies.
>>>
>>> Jo
>>>
>>> On 02/06/2008 08:39, Francois Daoust wrote:
>>>> I agree as well...
>>>>
>>>> ... and I also agree with Jo that, in all cases, having the
>>>> different representations available at specific locations means
>>>> extra-work for the CP with no real added-value (save the fact that
>>>> managing a clean list of the different representations available -
>>>> tweaks included - eases testing), and is probably not a common
>>>> practice.
>>>>
>>>> Francois.
>>>>
>>>> Umesh Sirsiwal wrote:
>>>>> I agree with Bryan. We may want to recommend (a) as the preferred
>>>>> solution as this saves the extra roundtrip.
>>>>>> -----Original Message-----
>>>>>> From: Sullivan, Bryan [mailto:BS3131@att.com]
>>>>>> Sent: Friday, May 30, 2008 12:25 PM
>>>>>> To: Francois Daoust; Umesh Sirsiwal
>>>>>> Cc: Jo Rabin; public-bpwg-ct@w3.org
>>>>>> Subject: RE: CT Proxies and Forward Caches
>>>>>>
>>>>>> Hi Francois,
>>>>>> With (b) as an option, do you think the proposal would result in a
>>>>>> greater number of redirects? I can see a case where a CP wants to
>>>>>> normally provide only a generic URI, e.g. since this is embedded as
>>>>>> links in other resources. If the CP did not make the specific
>>>>>> representation available at a unique URI also, your proposal would
>>>>>> require that a redirect result for each request related to (or based
>>>>>> upon) the generic URI.
>>>>>>
>>>>>> It might be better just to say: "When varying representations based
>>>>>> on received HTTP headers, cache-efficient techniques should be used.
>>>>>> For example, if the total number of representations is limited
>>>>>> whereas the number of values for an HTTP header used for varying
>>>>>> representation is high [typically the case when varying
>>>>>> representations based on the User-Agent string], the different
>>>>>> representations should be made available at specific URIs and the
>>>>>> request to the generic resource should return the specific
>>>>>> representation along with a Content-Location header that identifies
>>>>>> the representation being served."
>>>>>>
>>>>>> This would avoid the message to CPs that redirect to specific
>>>>>> representations (as compared to just returning them) is a recommended
>>>>>> practice, if they are somehow prevented from making the
>>>>>> representations available at specific URIs.
>>>>>>
>>>>>> Best regards,
>>>>>> Bryan Sullivan | AT&T
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Francois Daoust [mailto:fd@w3.org]
>>>>>> Sent: Friday, May 30, 2008 7:50 AM
>>>>>> To: Umesh Sirsiwal
>>>>>> Cc: Jo Rabin; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>
>>>>>> Thanks for the clarification, Umesh.
>>>>>>
>>>>>> Very good point.
>>>>>>
>>>>>> The Content-Location header would probably have deserved a mention in
>>>>>> the TAG Finding I mentioned at the beginning of the thread and in
>>>>>> particular in 2.1.1 section [1], third item, since the Vary header
>>>>>> makes things work, and the Content-Location header makes things
>>>>>> cache-friendly. It saves the redirection, and makes groups used by
>>>>>> the server available to caches without revealing how they were built.
>>>>>>
>>>>>> As far as content-transformation is concerned, there may not be much
>>>>>> to say though as it's a rather generic caching issue. The need to use
>>>>>> a "Vary" on the "User-Agent" header is yet typical of the Mobile
>>>>>> world, so we probably should emphasize this point somewhere. I'm not
>>>>>> sure the Content Transformation guidelines document is the right
>>>>>> place for it, but since Content-Location sounds like a "natural"
>>>>>> companion for the Vary header, we could add a note, next to the
>>>>>> guideline that says that the server MUST add a "Vary" HTTP header
>>>>>> when varying representations, along the lines of:
>>>>>>
>>>>>> "When varying representations based on received HTTP headers,
>>>>>> cache-efficient techniques should be used. For example, if the total
>>>>>> number of representations is limited whereas the number of values for
>>>>>> an HTTP header used for varying representation is high [typically the
>>>>>> case when varying representations based on the User-Agent string],
>>>>>> the different representations should be made available at specific
>>>>>> URIs and:
>>>>>> a) the request to the generic resource should return the specific
>>>>>> representation along with a Content-Location header that identifies
>>>>>> the representation being served.
>>>>>> or b) the request to the generic resource should return a redirection
>>>>>> to the specific representation."
>>>>>>
>>>>>> Any other view on that?
>>>>>>
>>>>>> Francois
>>>>>>
>>>>>>
>>>>>> [1] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261787
>>>>>>
>>>>>>
>>>>>> Umesh Sirsiwal wrote:
>>>>>>> Hi Francois,
>>>>>>> Sorry for the confusion. Based on my understanding of the Link
>>>>>>> element, I can further clarify the difference between the Link
>>>>>>> element and the Presentation-URI.
>>>>>>>
>>>>>>> My understanding is that the Link header provides a method of
>>>>>>> advertising available alternatives for the page being served. On the
>>>>>>> other hand the Presentation-URI provides a method to identify the
>>>>>>> alternative included in the response. In case of the deployment case
>>>>>>> you mentioned below, once the CT proxy has identified the page to be
>>>>>>> served it will include a Presentation-URI header identifying the
>>>>>>> selected URI.
>>>>>>> Using this the Vary header will be able to identify the criteria on
>>>>>>> which the server varied its response, while the Presentation-URI
>>>>>>> will be able to identify which of the several alternatives was
>>>>>>> served.
>>>>>>>
>>>>>>> Rereading the HTTP specification, the Presentation-URI is the same
>>>>>>> as the Content-Location header field.
>>>>>>> I am proposing that the CP or the CT proxy which can serve
>>>>>>> multiple presentations of the content for the same URI, should
>>>>>>> include a Content-Location header to identify the entity it is
>>>>>>> serving.
>>>>>>>
>>>>>>> -Umesh
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Francois Daoust [mailto:fd@w3.org]
>>>>>>>> Sent: Monday, May 26, 2008 11:31 AM
>>>>>>>> To: Umesh Sirsiwal
>>>>>>>> Cc: Jo Rabin; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>>
>>>>>>>> Hi Umesh,
>>>>>>>>
>>>>>>>> I'm not sure I completely follow your point here, feel free to
>>>>>>>> correct me.
>>>>>>>>
>>>>>>>> The Presentation-URI header you mention to identify alternative
>>>>>>>> representations being served looks like the "Link" element we're
>>>>>>>> currently discussing in another thread, see:
>>>>>>>>
>>>>>>>> http://lists.w3.org/Archives/Public/public-bpwg-ct/2008May/0021.html
>>>>>>>> and replies.
>>>>>>>>
>>>>>>>> In the case of the Link element, we're currently trying to see when
>>>>>>>> it makes sense to use it, and how it could be used in practice.
>>>>>>>> This would indeed avoid the extra round trip in the sense that the
>>>>>>>> CT-proxy would be able to do the redirection for the user and so
>>>>>>>> the "redirect" would not reach the high-latency network the
>>>>>>>> end-user is connected to.
>>>>>>>>
>>>>>>>> Now, obviously, the problem with the "Link" element is that it is
>>>>>>>> at the markup level, and not at the HTTP level. It would be cool to
>>>>>>>> have a "Link" HTTP header, typically for images and more generally
>>>>>>>> for all non-HTML content. We're not the only ones who want the
>>>>>>>> "Link" header back to life ("back" since it previously existed but
>>>>>>>> disappeared for lack of use, how ironic ;-)), and there are many
>>>>>>>> on-going discussions within W3C and IETF about that. If it ever
>>>>>>>> becomes a reality, it would indeed be useful to serve multiple
>>>>>>>> representations of a resource.
>>>>>>>>
>>>>>>>> Note that it's not directly related to content transformation in
>>>>>>>> itself. The presence of a content transformation proxy merely adds
>>>>>>>> to the case.
>>>>>>>>
>>>>>>>> Did I get you right?
>>>>>>>>
>>>>>>>> Francois.
>>>>>>>>
>>>>>>>>
>>>>>>>> Umesh Sirsiwal wrote:
>>>>>>>>> Jo, Francois, Bryan,
>>>>>>>>> Thanks for the responses. IMO absence of standardization in this
>>>>>>>>> space will cause caches built in CT or otherwise to implement
>>>>>>>>> heuristics based solutions to deduce intent of CP or CT. That is
>>>>>>>>> less than desirable.
>>>>>>>>>
>>>>>>>>> To avoid the extra round trip Francois pointed out, the CP can
>>>>>>>>> possibly serve an HTTP header (let us call it Presentation-URI)
>>>>>>>>> identifying alternative representation served. The CT proxy or
>>>>>>>>> other caches will need to pay attention to this new header. But,
>>>>>>>>> as long as Via header is always included, they will be able to
>>>>>>>>> correctly cache and serve the content.
>>>>>>>>>
>>>>>>>>> The Presentation-URI does not have to be limited to the three
>>>>>>>>> groups. In some cases the Presentation-URI can be very specific
>>>>>>>>> and say something like www.example.com/Device_a. Won't that work?
>>>>>>>>>
>>>>>>>>> -Umesh
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Jo Rabin [mailto:jrabin@mtld.mobi]
>>>>>>>>>> Sent: Thursday, May 22, 2008 6:16 AM
>>>>>>>>>> To: Francois Daoust
>>>>>>>>>> Cc: Umesh Sirsiwal; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>>>>
>>>>>>>>>> Aside from the redirect cost that Francois mentions, I am not
>>>>>>>>>> sure that having separate URIs to allow caching of the "high"
>>>>>>>>>> "medium" and "low" cases is the whole answer, since the response
>>>>>>>>>> may still vary within those groups depending on work-arounds to
>>>>>>>>>> the quirks of any particular device within the grouping.
>>>>>>>>>>
>>>>>>>>>> As Francois points out, this relates to the "long-running"
>>>>>>>>>> ISSUE-222, and it's down to me to try to make sure that it
>>>>>>>>>> doesn't run much longer :-(
>>>>>>>>>>
>>>>>>>>>> Jo
>>>>>>>>>>
>>>>>>>>>> On 21/05/2008 09:34, Francois Daoust wrote:
>>>>>>>>>>> Indeed, the use of a "Vary: User-Agent" header generates many
>>>>>>>>>>> more entries than a more typical use of Vary such as
>>>>>>>>>>> "Vary: Accept-Language", and is thus not a really
>>>>>>>>>>> cache-friendly directive.
>>>>>>>>>>>
>>>>>>>>>>> The solution Bryan suggested to create representation-specific
>>>>>>>>>>> URIs for each UA group, coupled with a redirect response from a
>>>>>>>>>>> canonical representation is much better from a cache
>>>>>>>>>>> perspective but it has a cost: that of a round-trip between the
>>>>>>>>>>> server and the client to serve the redirect response to the
>>>>>>>>>>> representation-specific URI. This solution is recommended by
>>>>>>>>>>> the W3C Technical Architecture Group in a finding "On Linking
>>>>>>>>>>> Alternative Representations To Enable Discovery And
>>>>>>>>>>> Publishing" [1].
>>>>>>>>>>>
>>>>>>>>>>> We only mention the use of the "Vary" header in current version
>>>>>>>>>>> of the Content Transformation Guidelines document, but we have
>>>>>>>>>>> a long-running discussion (internally named ISSUE-222) on the
>>>>>>>>>>> above mentioned TAG finding. We may include that possibility in
>>>>>>>>>>> the document as well.
>>>>>>>>>>>
>>>>>>>>>>> [1] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261672
>>>>>>>>>>>
>>>>>>>>>>> Sullivan, Bryan wrote:
>>>>>>>>>>>> Hi Umesh,
>>>>>>>>>>>> As you mention, meta-group assignment (e.g. good/better/best)
>>>>>>>>>>>> is a deployment-specific function, i.e. one Content Provider
>>>>>>>>>>>> (CP) may choose a different set of groups and UA assignment as
>>>>>>>>>>>> compared to another. Without the direct involvement of the CT
>>>>>>>>>>>> proxy in group selection, the only way I see to reduce the
>>>>>>>>>>>> cached representations is for the CP to provide a distinct URI
>>>>>>>>>>>> to UAs in a group (e.g. a URI parameter or unique path), so
>>>>>>>>>>>> the various UAs naturally get served one of a fewer variations
>>>>>>>>>>>> of the page from the cache.
>>>>>>>>>>>>
>>>>>>>>>>>> "direct involvement of the CT proxy in group selection" implies
>>>>>>>>>>>> some kind of metadata exchange between CP and CT proxy, through
>>>>>>>>>>>> which group-related pages can be indicated, and maybe a tighter
>>>>>>>>>>>> integration of the CT proxy and cache. Both appear (to me) to
>>>>>>>>>>>> be less desirable to standardize, and at least more complex to
>>>>>>>>>>>> consider.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Bryan Sullivan | AT&T
>>>>>>>>>>>>
>>>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>>> *From:* public-bpwg-ct-request@w3.org
>>>>>>>>>>>> [mailto:public-bpwg-ct-request@w3.org] *On Behalf Of *Umesh Sirsiwal
>>>>>>>>>>>> *Sent:* Monday, May 19, 2008 8:12 AM
>>>>>>>>>>>> *To:* public-bpwg-ct@w3.org
>>>>>>>>>>>> *Subject:* CT Proxies and Forward Caches
>>>>>>>>>>>>
>>>>>>>>>>>> Several content transformation proxies and the Internet in
>>>>>>>>>>>> general include forward caches. Current definition of HTTP
>>>>>>>>>>>> includes indication of transformation using Vary header. In
>>>>>>>>>>>> most cases the Content Transformation proxies and servers vary
>>>>>>>>>>>> their responses based on User-Agent header. The number of
>>>>>>>>>>>> User-Agent strings is very high and caches cannot possibly
>>>>>>>>>>>> store that many copies of the response.
>>>>>>>>>>>>
>>>>>>>>>>>> Most servers are likely to classify the devices in certain
>>>>>>>>>>>> meta-groups for the purpose of content transformation. However,
>>>>>>>>>>>> this meta-group is expected to be server specific. In absence
>>>>>>>>>>>> of formal method, the caches will be left to guess the
>>>>>>>>>>>> meta-group. What will be the method to solve this?
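
For readers who want to see the six-step proxy flow from this thread in
motion, here is a minimal Python sketch. It is an illustration under
stated assumptions, not anything the group specified: the group table,
the /page/... URIs, the origin() and Proxy names are all hypothetical,
and the thread never says how the conditional request is built, so the
sketch assumes entity tags sent with If-None-Match. No real network is
involved; origin() stands in for the Content Provider.

```python
from dataclasses import dataclass, field

# Hypothetical server-side grouping: User-Agent substring -> representation URI.
GROUPS = [("iPhone", "/page/high"), ("Nokia", "/page/low")]
DEFAULT_LOCATION = "/page/medium"


def classify(user_agent: str) -> str:
    """Pick the representation-specific URI for a User-Agent string."""
    for marker, location in GROUPS:
        if marker in user_agent:
            return location
    return DEFAULT_LOCATION


@dataclass
class Response:
    status: int
    headers: dict
    body: str = ""


def origin(user_agent: str, if_none_match: frozenset = frozenset()) -> Response:
    """Stand-in for the Content Provider: varies on User-Agent, names the variant."""
    location = classify(user_agent)
    etag = '"' + location + '"'  # assumption: one entity tag per representation
    headers = {
        "Vary": "User-Agent",
        "Content-Location": location,
        "ETag": etag,
    }
    if etag in if_none_match:
        # Step 5: the proxy already holds this representation.
        return Response(304, headers)
    return Response(200, headers, body="representation at " + location)


@dataclass
class Proxy:
    by_user_agent: dict = field(default_factory=dict)  # exact UA -> entity tag
    by_etag: dict = field(default_factory=dict)        # entity tag -> stored response
    full_transfers: int = 0                            # bodies fetched from origin

    def fetch(self, user_agent: str) -> Response:
        etag = self.by_user_agent.get(user_agent)
        if etag is not None:
            # Exact User-Agent match: serve from cache without contacting origin.
            return self.by_etag[etag]
        # Step 4: no match on User-Agent, so send a conditional request
        # listing every entity tag already stored for this URI.
        reply = origin(user_agent, frozenset(self.by_etag))
        etag = reply.headers["ETag"]
        if reply.status != 304:
            # Steps 1-2: new representation, store it.
            self.by_etag[etag] = reply
            self.full_transfers += 1
        # Step 6 (on 304) or first delivery (on 200): remember which stored
        # representation this User-Agent maps to, then serve it.
        self.by_user_agent[user_agent] = etag
        return self.by_etag[etag]
```

With this sketch, two different User-Agent strings that the server places
in the same group cost one full transfer plus one 304 exchange, which is
the "bit of bandwidth between the server and the proxy" mentioned above.
The proxy never learns how the groups are built; it only learns, one
User-Agent at a time, which stored representation the server picked.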
Received on Tuesday, 3 June 2008 09:42:35 UTC