
Re: CT Proxies and Forward Caches

From: Jo Rabin <jrabin@mtld.mobi>
Date: Tue, 03 Jun 2008 12:09:36 +0100
Message-ID: <48452670.9070001@mtld.mobi>
To: Francois Daoust <fd@w3.org>
CC: public-bpwg-ct@w3.org

 > "Using the Vary header? Get 2 for the same price, adopt a lonely
 > Content-Location one!" (suggested background music: "Heal the world,
 > make it a better place...")

I find myself unable to add to this :-)

On 03/06/2008 10:41, Francois Daoust wrote:
> Right, probably never, but it still should...
> 
> I mean, "dynamic content" is different from "dynamically generated 
> representations based on user-agents". In the former case, the Content 
> Provider should use some "no-cache" directive, whereas in the latter 
> case the Content Provider should use a "Vary: User-Agent", make sure the 
> response can be cached, and should participate in the creation of a 
> better world by helping caches save bandwidth, which in practice 
> means using the "Content-Location" header to help intermediary caches 
> understand when two responses for two different User-Agent values are 
> actually the same.
> 
> If we don't put a note, CPs will use a "Vary: User-Agent" header without 
> even thinking that it impacts intermediary caches.
> If we put a note, well, at least, readers will know there is something 
> they should do, even if they don't care.
> 
> That's the reason why I suggest the note. More like a contextual ad: 
> "Using the Vary header? Get 2 for the same price, adopt a lonely 
> Content-Location one!" (suggested background music: "Heal the world, 
> make it a better place...")
> 
> Francois.
> 
> 
> Jo Rabin wrote:
>> Thanks Francois, I guess I am wondering how often a server that offers 
>> a varying response can respond with a Not Modified status, given that 
>> the response would have been constructed by some dynamic process ...
>>
>> Jo
>>
>> On 02/06/2008 11:11, Francois Daoust wrote:
>>> Well, I'm not a cache expert but I'd say that the following would 
>>> happen:
>>>
>>> 1. the proxy receives a response with a Content-Location: 
>>> http://example.org/Representation1 header for a specific User-Agent
>>> 2. it stores the response.
>>> 3. another request is received for the same URI but with a different 
>>> User-Agent.
>>> 4. the proxy cannot match on the stored response, but can send a 
>>> conditional request to the server.
>>> 5. the server selects the same representation and answers with a 304 
>>> Not Modified status
>>> 6. the proxy serves the response that it has in cache
>>>
>>> In short, it only saves a bit of bandwidth between the server and the 
>>> proxy.
>>>
>>> If the conditional request is not possible for 4., then I totally 
>>> agree that this is just plain useless...
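
A minimal sketch of the six steps above, for illustration only. The thread does not say how the conditional request in step 4 is formed; this sketch assumes the proxy sends the ETags of its stored variants in If-None-Match, which HTTP/1.1 permits. The class names, the "Mobile" substring rule and the ETag values are all hypothetical, and a single request URI is assumed.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    headers: dict
    body: bytes = b""


class OriginServer:
    """Toy origin: one URI, two representations chosen from the User-Agent."""

    REPRESENTATIONS = {
        "http://example.org/Representation1": (b"<p>mobile page</p>", '"rep1"'),
        "http://example.org/Representation2": (b"<p>desktop page</p>", '"rep2"'),
    }

    def __init__(self):
        self.full_responses = 0  # number of 200 responses that carried a body

    def _select(self, user_agent):
        # Hypothetical grouping rule; caches cannot see it.
        if "Mobile" in user_agent:
            return "http://example.org/Representation1"
        return "http://example.org/Representation2"

    def handle(self, user_agent, if_none_match=()):
        location = self._select(user_agent)
        body, etag = self.REPRESENTATIONS[location]
        headers = {"Vary": "User-Agent", "Content-Location": location, "ETag": etag}
        if etag in if_none_match:
            return Response(304, headers)  # step 5: same representation
        self.full_responses += 1
        return Response(200, headers, body)


class Proxy:
    def __init__(self, origin):
        self.origin = origin
        self.by_user_agent = {}  # exact User-Agent string -> stored response
        self.by_etag = {}        # ETag -> stored response

    def get(self, user_agent):
        if user_agent in self.by_user_agent:
            return self.by_user_agent[user_agent].body  # exact Vary match
        # Step 4: no match on User-Agent, so revalidate with the known ETags.
        reply = self.origin.handle(user_agent, tuple(self.by_etag))
        if reply.status == 304:
            stored = self.by_etag[reply.headers["ETag"]]  # step 6
        else:
            stored = reply                                # steps 1 and 2
            self.by_etag[reply.headers["ETag"]] = stored
        self.by_user_agent[user_agent] = stored
        return stored.body
```

With this toy setup, two different mobile User-Agent strings cost one full response plus one bodyless 304, which is the "bit of bandwidth" saved between server and proxy.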
>>>
>>> Francois.
>>>
>>>
>>>
>>> Jo Rabin wrote:
>>>>
>>>> I am a bit confused as to how/why this all works. It seems to me 
>>>> that for this to actually work cache-efficiently, the cache would 
>>>> have to understand how _exactly_ the server processes the User-Agent 
>>>> header.
>>>>
>>>> As pointed out earlier in this thread, there may be countless 
>>>> variations on the same basic header that as far as the server is 
>>>> concerned all represent near-enough the same thing. However, use of 
>>>> content location and vary headers does not give any clue as to how 
>>>> it makes that judgement.
>>>>
>>>> So it seems to me that a proxy, knowing that a server varies its 
>>>> representations based on the UA header can legitimately cache _only_ 
>>>> if the UA is _exactly_ the same. So what puzzles me is why a content 
>>>> location header helps it. I think I must be missing the point here, 
>>>> and if so, apologies.
>>>>
>>>> Jo
>>>>
>>>> On 02/06/2008 08:39, Francois Daoust wrote:
>>>>> I agree as well...
>>>>>
>>>>> ... and I also agree with Jo that, in all cases, having the 
>>>>> different representations available at specific locations means 
>>>>> extra work for the CP with no real added value (save the fact that 
>>>>> managing a clean list of the different representations available - 
>>>>> tweaks included - eases testing), and is probably not a common 
>>>>> practice.
>>>>>
>>>>> Francois.
>>>>>
>>>>> Umesh Sirsiwal wrote:
>>>>>> I agree with Bryan. We may want to recommend (a) as the preferred
>>>>>> solution as this saves the extra roundtrip.
>>>>>>> -----Original Message-----
>>>>>>> From: Sullivan, Bryan [mailto:BS3131@att.com]
>>>>>>> Sent: Friday, May 30, 2008 12:25 PM
>>>>>>> To: Francois Daoust; Umesh Sirsiwal
>>>>>>> Cc: Jo Rabin; public-bpwg-ct@w3.org
>>>>>>> Subject: RE: CT Proxies and Forward Caches
>>>>>>>
>>>>>>> Hi Francois,
>>>>>>> With (b) as an option, do you think the proposal would result in a
>>>>>>> greater number of redirects? I can see a case where a CP wants to
>>>>>>> normally provide only a generic URI, e.g. since this is embedded as
>>>>>>> links in other resources. If the CP did not make the specific
>>>>>>> representation available at a unique URI also, your proposal would
>>>>>>> require that a redirect result for each request related to (or based
>>>>>>> upon) the generic URI.
>>>>>>>
>>>>>>> It might be better just to say: "When varying representations based on
>>>>>>> received HTTP headers, cache-efficient techniques should be used. For
>>>>>>> example, if the total number of representations is limited whereas the
>>>>>>> number of values for a HTTP header used for varying representation is
>>>>>>> high [typically the case when varying representations based on the
>>>>>>> User-Agent string], the different representations should be made
>>>>>>> available at specific URIs and the request to the generic resource
>>>>>>> should return the specific representation along with a Content-Location
>>>>>>> header that identifies the representation being served."
>>>>>>>
>>>>>>> This would avoid sending CPs the message that redirecting to specific
>>>>>>> representations (as compared to just returning them) is a recommended
>>>>>>> practice, if they are somehow prevented from making the representations
>>>>>>> available at specific URIs.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Bryan Sullivan | AT&T
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Francois Daoust [mailto:fd@w3.org]
>>>>>>> Sent: Friday, May 30, 2008 7:50 AM
>>>>>>> To: Umesh Sirsiwal
>>>>>>> Cc: Jo Rabin; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>
>>>>>>> Thanks for the clarification, Umesh.
>>>>>>>
>>>>>>> Very good point.
>>>>>>>
>>>>>>> The Content-Location header would probably have deserved a mention in
>>>>>>> the TAG Finding I mentioned at the beginning of the thread and in
>>>>>>> particular in 2.1.1 section [1], third item, since the Vary header makes
>>>>>>> things work, and the Content-Location header makes things
>>>>>>> cache-friendly. It saves the redirection, and makes groups used by the
>>>>>>> server available to caches without revealing how they were built.
>>>>>>>
>>>>>>> As far as content-transformation is concerned, there may not be much to
>>>>>>> say though as it's a rather generic caching issue. The need to use a
>>>>>>> "Vary" on the "User-Agent" header is yet typical of the Mobile world, so
>>>>>>> we probably should emphasize this point somewhere. I'm not sure the
>>>>>>> Content Transformation guidelines document is the right place for it,
>>>>>>> but since Content-Location sounds like a "natural" companion for the
>>>>>>> Vary header, we could add a note, next to the guideline that says that
>>>>>>> the server MUST add a "Vary" HTTP header when varying representations,
>>>>>>> along the lines of:
>>>>>>>
>>>>>>> "When varying representations based on received HTTP headers,
>>>>>>> cache-efficient techniques should be used. For example, if the total
>>>>>>> number of representations is limited whereas the number of values for a
>>>>>>> HTTP header used for varying representation is high [typically the case
>>>>>>> when varying representations based on the User-Agent string], the
>>>>>>> different representations should be made available at specific URIs
>>>>>>> and:
>>>>>>> a) the request to the generic resource should return the specific
>>>>>>> representation along with a Content-Location header that identifies the
>>>>>>> representation being served.
>>>>>>> or b) the request to the generic resource should return a redirection to
>>>>>>> the specific representation."
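
To make the difference between options (a) and (b) in the proposed note concrete, here is a rough sketch of the response headers a server might send in each case. The group names, URIs and the `classify` rule are hypothetical, and the 302 status for option (b) is an assumption, since the note only says "a redirection".

```python
# Hypothetical representation-specific URIs, one per device group.
GROUP_URIS = {
    "low": "http://example.org/page/low",
    "medium": "http://example.org/page/medium",
    "high": "http://example.org/page/high",
}


def classify(user_agent: str) -> str:
    """Hypothetical grouping rule; a real CP would apply its own logic."""
    if "Smartphone" in user_agent:
        return "high"
    if "Mobile" in user_agent:
        return "medium"
    return "low"


def respond(user_agent: str, option: str = "a"):
    """Return (status, headers) for a request to the generic URI."""
    uri = GROUP_URIS[classify(user_agent)]
    if option == "a":
        # Serve the representation directly and name it: no extra round trip.
        return 200, {"Vary": "User-Agent", "Content-Location": uri}
    if option == "b":
        # Send the client to the specific URI: costs one more round trip.
        return 302, {"Vary": "User-Agent", "Location": uri}
    raise ValueError(f"unknown option: {option!r}")
```

In both cases the response still carries Vary: User-Agent, because the choice of representation (or of redirect target) depends on that header.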
>>>>>>>
>>>>>>> Any other view on that?
>>>>>>>
>>>>>>> Francois
>>>>>>>
>>>>>>>
>>>>>>> [1] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261787
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Umesh Sirsiwal wrote:
>>>>>>>> Hi Francois,
>>>>>>>> Sorry for the confusion. Based on my understanding of the Link
>>>>>>>> element, I can further clarify the difference between the Link element
>>>>>>>> and the Presentation-URI.
>>>>>>>>
>>>>>>>> My understanding is that the Link header provides a method of
>>>>>>>> advertising available alternatives for the page being served. On the
>>>>>>>> other hand the Presentation-URI provides a method to identify the
>>>>>>>> alternative included in the response. In the deployment case
>>>>>>>> you mentioned below, once the CT proxy has identified the page to be
>>>>>>>> served it will include a Presentation-URI header identifying the
>>>>>>>> selected URI.
>>>>>>>> Using this the Vary header will be able to identify the criteria on
>>>>>>>> which the server varied its response, while the Presentation-URI will
>>>>>>>> be able to identify which of the several alternatives was served.
>>>>>>>>
>>>>>>>> Rereading HTTP specification, the Presentation-URI is the same as
>>>>>>>> Content-Location header field. I am proposing that the CP or the CT
>>>>>>>> proxy which can serve multiple presentations of the content for the
>>>>>>>> same URI should include a Content-Location header to identify the
>>>>>>>> entity it is serving.
>>>>>>>>
>>>>>>>> -Umesh
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Francois Daoust [mailto:fd@w3.org]
>>>>>>>>> Sent: Monday, May 26, 2008 11:31 AM
>>>>>>>>> To: Umesh Sirsiwal
>>>>>>>>> Cc: Jo Rabin; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>>>
>>>>>>>>> Hi Umesh,
>>>>>>>>>
>>>>>>>>> I'm not sure I completely follow your point here, feel free to
>>>>>>>>> correct me.
>>>>>>>>>
>>>>>>>>> The Presentation-URI header you mention to identify alternative
>>>>>>>>> representations being served looks like the "Link" element we're
>>>>>>>>> currently discussing in another thread, see:
>>>>>>>>>
>>>>>>>>> http://lists.w3.org/Archives/Public/public-bpwg-ct/2008May/0021.html
>>>>>>>>> and replies.
>>>>>>>>>
>>>>>>>>> In the case of the Link element, we're currently trying to see when
>>>>>>>>> it makes sense to use it, and how it could be used in practice. This
>>>>>>>>> would indeed avoid the extra round trip in the sense that the CT-proxy
>>>>>>>>> would be able to do the redirection for the user and so the
>>>>>>>>> "redirect" would not reach the high-latency network the end-user is
>>>>>>>>> connected to.
>>>>>>>>> Now, obviously, the problem with the "Link" element is that it is at
>>>>>>>>> the markup level, and not at the HTTP level. It would be cool to have
>>>>>>>>> a "Link" HTTP header, typically for images and more generally for all
>>>>>>>>> non-HTML content. We're not the only ones who want the "Link" header
>>>>>>>>> back to life ("back" since it previously existed but disappeared for
>>>>>>>>> lack of use, how ironic ;-)), and there are many on-going discussions
>>>>>>>>> within W3C and IETF about that. If it ever becomes a reality, it
>>>>>>>>> would indeed be useful to serve multiple representations of a
>>>>>>>>> resource.
>>>>>>>>> Note that it's not directly related to content transformation in
>>>>>>>>> itself.
>>>>>>>>> The presence of a content transformation proxy merely adds to the
>>>>>>>>> case.
>>>>>>>>> Did I get you right?
>>>>>>>>>
>>>>>>>>> Francois.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Umesh Sirsiwal wrote:
>>>>>>>>>> Jo, Francois, Bryan,
>>>>>>>>>> Thanks for the responses. IMO absence of standardization in this space
>>>>>>>>>> will cause caches built in CT or otherwise to implement heuristics-based
>>>>>>>>>> solutions to deduce intent of CP or CT. That is less than desirable.
>>>>>>>>>> To avoid the extra round trip Francois pointed out, the CP can possibly
>>>>>>>>>> serve an HTTP header (let us call it Presentation-URI) identifying
>>>>>>>>>> the alternative representation served. The CT proxy or other caches will
>>>>>>>>>> need to pay attention to this new header. But, as long as Via header is
>>>>>>>>>> always included, they will be able to correctly cache and serve the
>>>>>>>>>> content.
>>>>>>>>>>
>>>>>>>>>> The Presentation-URI does not have to be limited to the three groups. In
>>>>>>>>>> some cases the Presentation-URI can be very specific and say something
>>>>>>>>>> like www.example.com/Device_a. Won't that work?
>>>>>>>>>>
>>>>>>>>>> -Umesh
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Jo Rabin [mailto:jrabin@mtld.mobi]
>>>>>>>>>>> Sent: Thursday, May 22, 2008 6:16 AM
>>>>>>>>>>> To: Francois Daoust
>>>>>>>>>>> Cc: Umesh Sirsiwal; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>>>>>
>>>>>>>>>>> Aside from the redirect cost that Francois mentions, I am not sure that
>>>>>>>>>>> having separate URIs to allow caching of the "high" "medium" and "low"
>>>>>>>>>>> cases is the whole answer, since the response may still vary within
>>>>>>>>>>> those groups depending on work-arounds to the quirks of any particular
>>>>>>>>>>> device within the grouping.
>>>>>>>>>>>
>>>>>>>>>>> As Francois points out, this relates to the "long-running" ISSUE-222,
>>>>>>>>>>> and it's down to me to try to make sure that it doesn't run much longer
>>>>>>>>>>> :-(
>>>>>>>>>>>
>>>>>>>>>>> Jo
>>>>>>>>>>>
>>>>>>>>>>> On 21/05/2008 09:34, Francois Daoust wrote:
>>>>>>>>>>>> Indeed, the use of a "Vary: User-Agent" header generates many more
>>>>>>>>>>>> entries than a more typical use of Vary such as
>>>>>>>>>>>> "Vary: Accept-Language", and is thus not a really cache-friendly
>>>>>>>>>>>> directive.
>>>>>>>>>>>>
>>>>>>>>>>>> The solution Bryan suggested to create representation-specific URIs for
>>>>>>>>>>>> each UA group, coupled with a redirect response from a canonical
>>>>>>>>>>>> representation is much better from a cache perspective but it has a
>>>>>>>>>>>> cost: that of a round-trip between the server and the client to serve
>>>>>>>>>>>> the redirect response to the representation-specific URI. This solution
>>>>>>>>>>>> is recommended by the W3C Technical Architecture Group in a finding "On
>>>>>>>>>>>> Linking Alternative Representations To Enable Discovery And Publishing"
>>>>>>>>>>>> [1].
>>>>>>>>>>>>
>>>>>>>>>>>> We only mention the use of the "Vary" header in the current version of
>>>>>>>>>>>> the Content Transformation Guidelines document, but we have a
>>>>>>>>>>>> long-running discussion (internally named ISSUE-222) on the above
>>>>>>>>>>>> mentioned TAG finding. We may include that possibility in the document
>>>>>>>>>>>> as well.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261672
>>>>>>>>>>>> Sullivan, Bryan wrote:
>>>>>>>>>>>>> Hi Umesh,
>>>>>>>>>>>>> As you mention, meta-group assignment (e.g. good/better/best) is a
>>>>>>>>>>>>> deployment-specific function, i.e. one Content Provider (CP) may
>>>>>>>>>>>>> choose a different set of groups and UA assignment as compared to
>>>>>>>>>>>>> another. Without the direct involvement of the CT proxy in group
>>>>>>>>>>>>> selection, the only way I see to reduce the cached representations is
>>>>>>>>>>>>> for the CP to provide a distinct URI to UA's in a group (e.g. a URI
>>>>>>>>>>>>> parameter or unique path), so the various UA's naturally get served
>>>>>>>>>>>>> one of a fewer variations of the page from the cache.
>>>>>>>>>>>>>
>>>>>>>>>>>>> "direct involvement of the CT proxy in group selection" implies some
>>>>>>>>>>>>> kind of metadata exchange between CP and CT proxy, through which
>>>>>>>>>>>>> group-related pages can be indicated, and maybe a tighter integration
>>>>>>>>>>>>> of the CT proxy and cache. Both appear (to me) to be less desirable to
>>>>>>>>>>>>> standardize, and at least more complex to consider.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Bryan Sullivan | AT&T
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>>>>>> *From:* public-bpwg-ct-request@w3.org
>>>>>>>>>>>>> [mailto:public-bpwg-ct-request@w3.org] *On Behalf Of* Umesh Sirsiwal
>>>>>>>>>>>>> *Sent:* Monday, May 19, 2008 8:12 AM
>>>>>>>>>>>>> *To:* public-bpwg-ct@w3.org
>>>>>>>>>>>>> *Subject:* CT Proxies and Forward Caches
>>>>>>>>>>>>>
>>>>>>>>>>>>> Several content transformation proxies and the Internet in general
>>>>>>>>>>>>> include forward caches. The current definition of HTTP includes
>>>>>>>>>>>>> indication of transformation using the Vary header. In most cases the
>>>>>>>>>>>>> Content Transformation proxies and servers vary their responses based
>>>>>>>>>>>>> on the User-Agent header. The number of User-Agent strings is very high
>>>>>>>>>>>>> and caches cannot possibly store that many copies of the response.
>>>>>>>>>>>>> Most servers are likely to classify the devices in certain meta-groups
>>>>>>>>>>>>> for the purpose of content transformation. However, this meta-group is
>>>>>>>>>>>>> expected to be server specific. In the absence of a formal method, the
>>>>>>>>>>>>> caches will be left to guess the meta-group. What will be the method
>>>>>>>>>>>>> to solve this?
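
The scale of the problem raised in this opening message can be illustrated with a toy count of cache entries. The User-Agent strings, the page URI and the three-way `device_group` rule below are invented for the example; a real server would use its own, unpublished classification.

```python
PAGE = "http://example.org/page"  # hypothetical generic URI

# 5 vendors x 10 models x 4 firmware revisions = 200 distinct User-Agent strings.
user_agents = [
    f"Vendor{v}-Model{m}/1.{r}"
    for v in range(5)
    for m in range(10)
    for r in range(4)
]


def device_group(user_agent: str) -> str:
    """Hypothetical server-side meta-group; arbitrary but deterministic."""
    return ("low", "medium", "high")[sum(map(ord, user_agent)) % 3]


# A cache that honours "Vary: User-Agent" literally keys on the exact string.
entries_keyed_by_user_agent = {(PAGE, ua) for ua in user_agents}

# If the cache could learn which meta-group each response belongs to,
# it would only need one stored copy per group.
entries_keyed_by_group = {(PAGE, device_group(ua)) for ua in user_agents}

print(len(entries_keyed_by_user_agent), len(entries_keyed_by_group))
```

The gap between the two counts is what the cache cannot close on its own, since the grouping rule lives only on the server.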
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>
>>>>
>>>>
>>
Received on Tuesday, 3 June 2008 11:10:42 UTC
