Re: ACTION-679: Propose text for para 2 of 3.1.1 from Francois Daoust on 2008-03-18 (public-bpwg-ct@w3.org from March 2008)

From: Francois Daoust <fd@w3.org>
Date: Tue, 18 Mar 2008 15:30:26 +0100
To: Martin Jones <martinj@volantis.com>
CC: public-bpwg-ct@w3.org
Message-ID: <47DFD202.5080102@w3.org>

Martin Jones wrote:
> 
> Hi Francois
> 
> My proposed text is really aimed at preventing requests from 
> non-browsers being modified by the proxy - e.g. ones from media players, 
> Java applications etc, all of which might end up being routed through 
> the same proxy.
> 
> Here are my thoughts on the AJAX/XHR use case:
> 
> Firstly, the user-agent is still a browser whether or not it is using XHR 
> and I don't think it would be appropriate to prevent proxies modifying 
> these requests at all.  In the F2F, Rob mentioned tokenizing URLs so 
> that's at least one case where it could be necessary to modify the 
> request if it comes from a page that was transformed.
> 
> I think there could be a class of AJAX-aware CT proxies that perform 
> some limited transformations on AJAX pages, such as URL rewriting or 
> fixing up compatibility issues.  We must not preclude these kinds of 
> proxies so it may be appropriate for an XHR and even its response to be 
> modified if it is done correctly.

Yes, good points, I agree.


> 
> In almost every case, XHR requests will come from web pages that have 
> already been through the proxy. If the proxy has transformed the page 
> without being aware that it uses AJAX, the chance of the XHR doing 
> anything useful is quite small whether it is modified or not.  I think 
> the document already provides for sufficient control over the 
> transformation of responses (web pages) so nothing extra should be 
> needed here.

Indeed. I wonder whether we should emphasize that point in the list of 
heuristics in §3.4 (Proxy Response to User Agent) by adding an item such 
as: "examination of the content reveals that the page contains 
client-side scripts that may break if the page gets adapted".

I don't know whether we want to go into detail in that list, but flagging 
this as an important point to check sounds useful and harmless.
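
As a rough illustration of what such a check could look like, here is a 
minimal TypeScript sketch. The function name looksScriptDependent and the 
two patterns are my own assumptions, not text from the draft. A real proxy 
would need a proper HTML parser, and even then it could only guess whether 
a given script would break once the page is adapted.

```typescript
// Hypothetical heuristic: flags pages that carry client-side script.
// It cannot tell whether the script would actually break after adaptation.
function looksScriptDependent(html: string): boolean {
  // Any <script> element, inline or external.
  const hasScriptTag = /<script\b/i.test(html);
  // Inline event handlers such as onclick="..." inside a tag.
  const hasInlineHandler = /<[^>]+\son[a-z]+\s*=/i.test(html);
  return hasScriptTag || hasInlineHandler;
}
```

A proxy could treat a positive result as one more reason to be cautious, 
alongside the other heuristics already in the list.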


> 
> If the proxy hasn't transformed the page then it is important to ensure 
> that it does not modify the XHR request.  Perhaps the guidelines should 
> say that *requests* should only be modified when the proxy can determine 
> positively that they originate from a page which was transformed by it.  
> There are ways to do that, some more invasive than others.  We could 
> leave that issue for vendors to resolve.

Yes, I like the idea.

It may be a bit complicated to implement, though. How can a CT-proxy 
distinguish between an HTTP request for a URI the user entered (it does 
not originate from any other page and may be subject to content 
adaptation) and an HTTP request that originates from a page that was not 
transformed (which should not be subject to content adaptation)?

Could the CT-proxy base its decision on the HTTP "Referer" header, 
together with its knowledge of the user's history and of its own 
previous decisions?
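
To make the question concrete, here is a minimal TypeScript sketch of a 
Referer-based decision. Everything in it is an assumption for the sake of 
discussion: the class name TransformHistory, the idea of keeping one 
instance per user session, and the choice to leave a request untouched 
when the referring page is unknown. It also treats a missing Referer as 
"the user entered the URI", which is not always true since the header can 
be absent for other reasons.

```typescript
// Hypothetical sketch; none of these names come from the guidelines.
type Decision = "may-transform" | "leave-untouched";

class TransformHistory {
  // Page URI -> whether this proxy transformed the page when serving it.
  private readonly pages = new Map<string, boolean>();

  // Remember what the proxy did with a page it served to this user.
  record(uri: string, transformed: boolean): void {
    this.pages.set(uri, transformed);
  }

  // Decide how to treat a new request, given its Referer header (if any).
  decide(referer: string | undefined): Decision {
    // No Referer: assume the user entered the URI directly.
    if (referer === undefined) return "may-transform";
    const transformed = this.pages.get(referer);
    // Unknown referring page: be conservative (an assumption of this sketch).
    if (transformed === undefined) return "leave-untouched";
    // Only touch requests coming from pages the proxy itself transformed.
    return transformed ? "may-transform" : "leave-untouched";
  }
}
```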

Another possibility would be to recommend that content providers who 
want to switch off a CT-proxy on a page that contains Ajax-like code add 
a "Cache-Control: no-transform" header to XHR requests (possible through 
the setRequestHeader method of the XHR interface). I don't really fancy 
that idea though.
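
For reference, setting that header from script could look roughly like 
the TypeScript sketch below. The helper name requestNoTransform and the 
small HeaderSettable interface are hypothetical; the interface is there 
only so that the sketch does not depend on DOM typings. A browser's 
XMLHttpRequest object satisfies it.

```typescript
// Minimal structural type; a browser's XMLHttpRequest matches it.
interface HeaderSettable {
  setRequestHeader(name: string, value: string): void;
}

// Hypothetical helper: asks intermediaries not to transform this request.
// On a real XMLHttpRequest it must run after open() and before send().
function requestNoTransform(xhr: HeaderSettable): void {
  xhr.setRequestHeader("Cache-Control", "no-transform");
}
```

In a page, the call would sit between xhr.open(...) and xhr.send(), and 
every XHR issued by the page would need it, which is part of why the 
approach puts a burden on content providers.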
Received on Tuesday, 18 March 2008 14:30:57 UTC