Re: CT Proxies and Forward Caches

Thanks Francois, I guess I am wondering how often a server that offers a 
varying response can respond with a Not Modified status, given that the 
response would have been constructed by some dynamic process ...

Jo

On 02/06/2008 11:11, Francois Daoust wrote:
> Well, I'm not a cache expert but I'd say that the following would happen:
> 
> 1. the proxy receives a response with a Content-Location: 
> http://example.org/Representation1 header for a specific User-Agent
> 2. it stores the response.
> 3. another request is received for the same URI but with a different 
> User-Agent.
> 4. the proxy cannot match on the stored response, but can send a 
> conditional request to the server.
> 5. the server answers with the same representation response and a 304 
> Not Modified status
> 6. the proxy serves the response that it has in cache
> 
> In short, it only saves a bit of bandwidth between the server and the 
> proxy.
> 
> If the conditional request is not possible for 4., then I totally agree 
> that this is just plain useless...
> 
> Francois.
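
The six steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the OriginServer and Proxy classes, the "Mobile" substring rule and the ETag values are all hypothetical, freshness and expiry are ignored, and the sketch assumes the server assigns entity tags so that the proxy can revalidate with If-None-Match, as HTTP/1.1 allows for responses that carry a Vary header.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    headers: dict
    body: str = ""


class OriginServer:
    """Toy origin that maps many User-Agent strings onto two representations."""

    def __init__(self):
        self.full_bodies_sent = 0

    def handle(self, user_agent, if_none_match=()):
        # Hypothetical grouping rule, known only to the server.
        group = "mobile" if "Mobile" in user_agent else "desktop"
        etag = f'"{group}-v1"'
        headers = {
            "Vary": "User-Agent",
            "ETag": etag,
            "Content-Location": f"http://example.org/{group}",
        }
        if etag in if_none_match:
            # Step 5: same representation, so no body is sent.
            return Response(304, headers)
        self.full_bodies_sent += 1
        return Response(200, headers, body=f"<html>{group} page</html>")


class Proxy:
    """Toy cache: matches on the exact User-Agent, else revalidates."""

    def __init__(self, origin):
        self.origin = origin
        self.etag_for_user_agent = {}  # exact User-Agent string -> ETag
        self.stored = {}  # ETag -> stored 200 response

    def get(self, user_agent):
        etag = self.etag_for_user_agent.get(user_agent)
        if etag is not None:
            return self.stored[etag]
        # Step 4: no exact match, so send a conditional request that
        # lists every entity tag already held for this URI.
        reply = self.origin.handle(user_agent, if_none_match=tuple(self.stored))
        etag = reply.headers["ETag"]
        if reply.status != 304:
            self.stored[etag] = reply  # Step 2: store the new response.
        self.etag_for_user_agent[user_agent] = etag
        return self.stored[etag]  # Step 6: serve from cache.
```

With two different handset User-Agent strings that the server treats alike, the second request costs a round trip to the server but no second body transfer, which is the "bit of bandwidth" saved in the summary above.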
> 
> 
> 
> Jo Rabin wrote:
>>
>> I am a bit confused as to how/why this all works. It seems to me that 
>> for this to actually work cache efficiently, the cache would have to 
>> understand how _exactly_ the server processes the User Agent header.
>>
>> As pointed out earlier in this thread, there may be countless 
>> variations on the same basic header that as far as the server is 
>> concerned all represent near-enough the same thing. However, use of 
>> the Content-Location and Vary headers does not give any clue as to how 
>> the server makes that judgement.
>>
>> So it seems to me that a proxy, knowing that a server varies its 
>> representations based on the UA header can legitimately cache _only_ 
>> if the UA is _exactly_ the same. So what puzzles me is why a content 
>> location header helps it. I think I must be missing the point here, 
>> and if so, apologies.
>>
>> Jo
>>
>> On 02/06/2008 08:39, Francois Daoust wrote:
>>> I agree as well...
>>>
>>> ... and I also agree with Jo that, in all cases, having the different 
>>> representations available at specific locations means extra-work for 
>>> the CP with no real added-value (save the fact that managing a clean 
>>> list of the different representations available - tweaks included - 
>>> eases testing), and is probably not a common practice.
>>>
>>> Francois.
>>>
>>> Umesh Sirsiwal wrote:
>>>> I agree with Bryan. We may want to recommend (a) as the preferred
>>>> solution as this saves the extra roundtrip.
>>>>> -----Original Message-----
>>>>> From: Sullivan, Bryan [mailto:BS3131@att.com]
>>>>> Sent: Friday, May 30, 2008 12:25 PM
>>>>> To: Francois Daoust; Umesh Sirsiwal
>>>>> Cc: Jo Rabin; public-bpwg-ct@w3.org
>>>>> Subject: RE: CT Proxies and Forward Caches
>>>>>
>>>>> Hi Francois,
>>>>> With (b) as an option, do you think the proposal would result in a
>>>>> greater number of redirects? I can see a case where a CP wants to
>>>>> normally provide only a generic URI, e.g. since this is embedded as
>>>>> links in other resources. If the CP did not make the specific
>>>>> representation available at a unique URI also, your proposal would
>>>>> require that a redirect result for each request related to (or based
>>>>> upon) the generic URI.
>>>>>
>>>>> It might be better just to say: "When varying representations based on
>>>>> received HTTP headers, cache-efficient techniques should be used. For
>>>>> example, if the total number of representations is limited whereas the
>>>>> number of values for an HTTP header used for varying representation is
>>>>> high [typically the case when varying representations based on the
>>>>> User-Agent string], the different representations should be made
>>>>> available at specific URIs and the request to the generic resource
>>>>> should return the specific representation along with a
>>>>> Content-Location header that identifies the representation being
>>>>> served."
>>>>>
>>>>> This would avoid sending CPs the message that redirecting to specific
>>>>> representations (as compared to just returning them) is a recommended
>>>>> practice, if they are somehow prevented from making the
>>>>> representations available at specific URIs.
>>>>>
>>>>> Best regards,
>>>>> Bryan Sullivan | AT&T
>>>>>
>>>>> -----Original Message-----
>>>>> From: Francois Daoust [mailto:fd@w3.org]
>>>>> Sent: Friday, May 30, 2008 7:50 AM
>>>>> To: Umesh Sirsiwal
>>>>> Cc: Jo Rabin; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>
>>>>> Thanks for the clarification, Umesh.
>>>>>
>>>>> Very good point.
>>>>>
>>>>> The Content-Location header would probably have deserved a mention in
>>>>> the TAG Finding I mentioned at the beginning of the thread and in
>>>>> particular in section 2.1.1 [1], third item, since the Vary header
>>>>> makes things work, and the Content-Location header makes things
>>>>> cache-friendly. It saves the redirection, and makes groups used by the
>>>>> server available to caches without revealing how they were built.
>>>>>
>>>>> As far as content-transformation is concerned, there may not be much
>>>>> to say though as it's a rather generic caching issue. The need to use
>>>>> a "Vary" on the "User-Agent" header is nonetheless typical of the
>>>>> Mobile world, so we probably should emphasize this point somewhere.
>>>>> I'm not sure the Content Transformation guidelines document is the
>>>>> right place for it, but since Content-Location sounds like a
>>>>> "natural" companion for the Vary header, we could add a note, next to
>>>>> the guideline that says that the server MUST add a "Vary" HTTP header
>>>>> when varying representations, along the lines of:
>>>>>
>>>>> "When varying representations based on received HTTP headers,
>>>>> cache-efficient techniques should be used. For example, if the total
>>>>> number of representations is limited whereas the number of values for
>>>>> an HTTP header used for varying representation is high [typically the
>>>>> case when varying representations based on the User-Agent string], the
>>>>> different representations should be made available at specific URIs
>>>>> and:
>>>>> a) the request to the generic resource should return the specific
>>>>> representation along with a Content-Location header that identifies
>>>>> the representation being served.
>>>>> or b) the request to the generic resource should return a redirection
>>>>> to the specific representation."
>>>>>
>>>>> Any other view on that?
>>>>>
>>>>> Francois
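
The two options in the proposed note can be compared with a small sketch of what the server might send for a request to the generic URI. The classify rule, the example.org URIs and the choice of 302 for the redirect are assumptions made for illustration only; none of them comes from the guidelines text.

```python
def classify(user_agent):
    """Hypothetical grouping rule; a real CP would use its own device data."""
    return "mobile" if "Mobile" in user_agent else "desktop"


def respond(user_agent, redirect=False):
    """Return (status, headers, body) for a request to the generic URI."""
    group = classify(user_agent)
    specific_uri = f"http://example.org/page/{group}"
    if redirect:
        # Option (b): send the client to the representation-specific URI.
        # This costs one extra round trip before any content arrives.
        return 302, {"Location": specific_uri, "Vary": "User-Agent"}, ""
    # Option (a): serve the representation directly and name it with
    # Content-Location, so no extra round trip is needed.
    headers = {"Content-Location": specific_uri, "Vary": "User-Agent"}
    return 200, headers, f"{group} representation"
```

In both cases the response still carries Vary: User-Agent, because the choice made at the generic URI depends on that request header.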
>>>>>
>>>>>
>>>>> [1] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261787
>>>>>
>>>>>
>>>>>
>>>>> Umesh Sirsiwal wrote:
>>>>>> Hi Francois,
>>>>>> Sorry for the confusion. Based on my understanding of the Link
>>>>>> element, I can further clarify the difference between the Link
>>>>>> element and the Presentation-URI.
>>>>>>
>>>>>> My understanding is that the Link header provides a method of
>>>>>> advertising available alternatives for the page being served. On the
>>>>>> other hand the Presentation-URI provides a method to identify the
>>>>>> alternative included in the response. In the deployment case
>>>>>> you mentioned below, once the CT proxy has identified the page to be
>>>>>> served it will include a Presentation-URI header identifying the
>>>>>> selected URI.
>>>>>> Using this the Vary header will be able to identify the criteria on
>>>>>> which the server varied its response, while the Presentation-URI
>>>>>> will be able to identify which of the several alternatives was
>>>>>> served.
>>>>>>
>>>>>> Rereading the HTTP specification, the Presentation-URI is the same as
>>>>>> the Content-Location header field. I am proposing that the CP or the
>>>>>> CT proxy which can serve multiple presentations of the content for
>>>>>> the same URI should include a Content-Location header to identify
>>>>>> the entity it is serving.
>>>>>>
>>>>>> -Umesh
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Francois Daoust [mailto:fd@w3.org]
>>>>>>> Sent: Monday, May 26, 2008 11:31 AM
>>>>>>> To: Umesh Sirsiwal
>>>>>>> Cc: Jo Rabin; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>
>>>>>>> Hi Umesh,
>>>>>>>
>>>>>>> I'm not sure I completely follow your point here, feel free to
>>>>>>> correct me.
>>>>>>>
>>>>>>> The Presentation-URI header you mention to identify alternative
>>>>>>> representations being served looks like the "Link" element we're
>>>>>>> currently discussing in another thread, see:
>>>>>>>
>>>>>>>
>>>>>>> http://lists.w3.org/Archives/Public/public-bpwg-ct/2008May/0021.html
>>>>>>> and replies.
>>>>>>>
>>>>>>> In the case of the Link element, we're currently trying to see when
>>>>>>> it makes sense to use it, and how it could be used in practice.
>>>>>>> This would indeed avoid the extra round trip in the sense that the
>>>>>>> CT-proxy would be able to do the redirection for the user and so the
>>>>>>> "redirect" would not reach the high-latency network the end-user is
>>>>>>> connected to.
>>>>>>> Now, obviously, the problem with the "Link" element is that it is
>>>>>>> at the markup level, and not at the HTTP level. It would be cool to
>>>>>>> have a "Link" HTTP header, typically for images and more generally
>>>>>>> for all non-HTML content. We're not the only ones who want the
>>>>>>> "Link" header back to life ("back" since it previously existed but
>>>>>>> disappeared for lack of use, how ironic ;-)), and there are many
>>>>>>> on-going discussions within W3C and IETF about that. If it ever
>>>>>>> becomes a reality, it would indeed be useful to serve multiple
>>>>>>> representations of a resource.
>>>>>>> Note that it's not directly related to content transformation in
>>>>>>> itself.
>>>>>>> The presence of a content transformation proxy merely adds to the
>>>>>>> case.
>>>>>>> Did I get you right?
>>>>>>>
>>>>>>> Francois.
>>>>>>>
>>>>>>>
>>>>>>> Umesh Sirsiwal wrote:
>>>>>>>> Jo, Francois, Bryan,
>>>>>>>> Thanks for the responses. IMO absence of standardization in this
>>>>>>>> space will cause caches built in CT or otherwise to implement
>>>>>>>> heuristics-based solutions to deduce intent of CP or CT. That is
>>>>>>>> less than desirable.
>>>>>>>> To avoid the extra round trip Francois pointed out, the CP can
>>>>>>>> possibly serve an HTTP header (let us call it Presentation-URI)
>>>>>>>> identifying the alternative representation served. The CT proxy or
>>>>>>>> other caches will need to pay attention to this new header. But, as
>>>>>>>> long as the Via header is always included, they will be able to
>>>>>>>> correctly cache and serve the content.
>>>>>>>>
>>>>>>>> The Presentation-URI does not have to be limited to the three
>>>>>>>> groups. In some cases the Presentation-URI can be very specific and
>>>>>>>> say something like www.example.com/Device_a. Won't that work?
>>>>>>>>
>>>>>>>> -Umesh
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Jo Rabin [mailto:jrabin@mtld.mobi]
>>>>>>>>> Sent: Thursday, May 22, 2008 6:16 AM
>>>>>>>>> To: Francois Daoust
>>>>>>>>> Cc: Umesh Sirsiwal; Sullivan, Bryan; public-bpwg-ct@w3.org
>>>>>>>>> Subject: Re: CT Proxies and Forward Caches
>>>>>>>>>
>>>>>>>>> Aside from the redirect cost that Francois mentions, I am not
>>>>>>>>> sure that having separate URIs to allow caching of the "high"
>>>>>>>>> "medium" and "low" cases is the whole answer, since the response
>>>>>>>>> may still vary within those groups depending on work-arounds to
>>>>>>>>> the quirks of any particular device within the grouping.
>>>>>>>>>
>>>>>>>>> As Francois points out, this relates to the "long-running"
>>>>>>>>> ISSUE-222, and it's down to me to try to make sure that it doesn't
>>>>>>>>> run much longer :-(
>>>>>>>>>
>>>>>>>>> Jo
>>>>>>>>>
>>>>>>>>> On 21/05/2008 09:34, Francois Daoust wrote:
>>>>>>>>>> Indeed, the use of a "Vary: User-Agent" header generates many
>>>>>>>>>> more entries than a more typical use of Vary such as
>>>>>>>>>> "Vary: Accept-Language", and is thus not a really cache-friendly
>>>>>>>>>> directive.
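
The difference in entry counts can be made concrete with a rough sketch of how a cache might key stored responses on the headers named in Vary. The cache_key and count_entries helpers, the synthetic User-Agent strings and the three languages are all made up for illustration; real caches differ in how they normalise header values.

```python
def cache_key(uri, request_headers, vary):
    """Build a key from the URI plus the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (uri, tuple((n, lowered.get(n, "")) for n in names))


def count_entries(requests, vary, uri="http://example.org/"):
    """Count the distinct cache entries a set of requests would create."""
    return len({cache_key(uri, headers, vary) for headers in requests})


# Synthetic traffic: 1000 distinct User-Agent strings, 3 languages.
requests = [
    {
        "User-Agent": f"Phone{i} Mobile/{i}.0",
        "Accept-Language": ("en", "fr", "de")[i % 3],
    }
    for i in range(1000)
]

entries_by_user_agent = count_entries(requests, "User-Agent")
entries_by_language = count_entries(requests, "Accept-Language")
```

Under these assumptions the same traffic produces one entry per distinct User-Agent string when the response varies on User-Agent, but only one entry per language when it varies on Accept-Language.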
>>>>>>>>>>
>>>>>>>>>> The solution Bryan suggested to create representation-specific
>>>>>>>>>> URIs for each UA group, coupled with a redirect response from a
>>>>>>>>>> canonical representation is much better from a cache perspective
>>>>>>>>>> but it has a cost: that of a round-trip between the server and
>>>>>>>>>> the client to serve the redirect response to the
>>>>>>>>>> representation-specific URI. This solution is recommended by the
>>>>>>>>>> W3C Technical Architecture Group in a finding "On Linking
>>>>>>>>>> Alternative Representations To Enable Discovery And Publishing"
>>>>>>>>>> [1].
>>>>>>>>>>
>>>>>>>>>> We only mention the use of the "Vary" header in the current
>>>>>>>>>> version of the Content Transformation Guidelines document, but we
>>>>>>>>>> have a long-running discussion (internally named ISSUE-222) on
>>>>>>>>>> the above mentioned TAG finding. We may include that possibility
>>>>>>>>>> in the document as well.
>>>>>>>>>>
>>>>>>>>>> [1] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261672
>>>>>>>>>>
>>>>>>>>>> Sullivan, Bryan wrote:
>>>>>>>>>>> Hi Umesh,
>>>>>>>>>>> As you mention, meta-group assignment (e.g. good/better/best)
>>>>>>>>>>> is a deployment-specific function, i.e. one Content Provider
>>>>>>>>>>> (CP) may choose a different set of groups and UA assignment as
>>>>>>>>>>> compared to another. Without the direct involvement of the CT
>>>>>>>>>>> proxy in group selection, the only way I see to reduce the
>>>>>>>>>>> cached representations is for the CP to provide a distinct URI
>>>>>>>>>>> to UAs in a group (e.g. a URI parameter or unique path), so the
>>>>>>>>>>> various UAs naturally get served one of a smaller number of
>>>>>>>>>>> variations of the page from the cache.
>>>>>>>>>>>
>>>>>>>>>>> "direct involvement of the CT proxy in group selection" implies
>>>>>>>>>>> some kind of metadata exchange between CP and CT proxy, through
>>>>>>>>>>> which group-related pages can be indicated, and maybe a tighter
>>>>>>>>>>> integration of the CT proxy and cache. Both appear (to me) to be
>>>>>>>>>>> less desirable to standardize, and at least more complex to
>>>>>>>>>>> consider.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Bryan Sullivan | AT&T
>>>>>>>>>>>
>>>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>>>> *From:* public-bpwg-ct-request@w3.org
>>>>>>>>>>> [mailto:public-bpwg-ct-request@w3.org] *On Behalf Of* Umesh Sirsiwal
>>>>>>>>>>> *Sent:* Monday, May 19, 2008 8:12 AM
>>>>>>>>>>> *To:* public-bpwg-ct@w3.org
>>>>>>>>>>> *Subject:* CT Proxies and Forward Caches
>>>>>>>>>>>
>>>>>>>>>>> Several content transformation proxies and the Internet in
>>>>>>>>>>> general include forward caches. The current definition of HTTP
>>>>>>>>>>> includes indication of transformation using the Vary header. In
>>>>>>>>>>> most cases the Content Transformation proxies and servers vary
>>>>>>>>>>> their responses based on the User-Agent header. The number of
>>>>>>>>>>> User-Agent strings is very high and caches cannot possibly store
>>>>>>>>>>> that many copies of the response. Most servers are likely to
>>>>>>>>>>> classify the devices in certain meta-groups for the purpose of
>>>>>>>>>>> content transformation. However, this meta-group is expected to
>>>>>>>>>>> be server specific. In the absence of a formal method, the caches
>>>>>>>>>>> will be left to guess the meta-group. What will be the method to
>>>>>>>>>>> solve this?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>
>>
>>

Received on Monday, 2 June 2008 10:44:42 UTC