ACTION-718: Summarize and continue discussion re Ajax/XHR requests and CT

Context
-------
This is a follow-up to Martin's ACTION-679, discussed on the mailing list:
http://lists.w3.org/Archives/Public/public-bpwg-ct/2008Mar/0008.html
... and in the last teleconference:
http://www.w3.org/2008/03/18-bpwg-minutes.html#item04


Potential problem
-----------------
Guidelines say (in §3.1.1): "the proxy must behave transparently (q.v.) 
unless it is able to determine positively that the user agent is a browser".

When a web page uses client-side scripting to send HTTP requests, the 
request headers sent by the XMLHttpRequest object are essentially the 
same as if the browser itself had made the request. (Strictly speaking, 
the XHR request carries an additional Referer HTTP header that 
references the page's URI, but the use of this header is not confined 
to XHR calls.)

In short, there's no real way for a CT-proxy to tell whether a request 
is a regular browser request for a web page or comes from an XHR call.

It's not a problem if the CT-proxy decided to adapt the original page, 
as it means it must have detected the XHR calls and somehow knows how 
to handle/adapt them as needed.

It's a problem when the CT-proxy left the page untouched. In that case, 
it must leave the XHR requests/responses that originate from the page 
untouched as well, and thus somehow must be able to connect the XHR 
calls with the original page.


Possible guidelines
-------------------
1. As suggested by Martin, we could say something like:
"*requests* should only be modified when the proxy can determine 
positively that they originate from a page which was transformed by it" 
which I would complete with "or that they are requests for the original 
pages" (a bit clumsy though). It could go in §3.1.1.

2. As suggested by Bryan, we could recommend that content providers who 
use Ajax as a way to develop true Web Applications on top of the 
browser's sandbox change the HTTP User-Agent header used in the XHR 
calls to identify themselves as an application and not as the browser. 
Actually the best practice to change the User-Agent extends to Web 
Applications in general (not only in browsers) and may be a good 
candidate for BP2.

3. We could recommend that content providers add a "Cache-Control: 
no-transform" header to the HTTP requests sent using XHR calls.

4. We could say the whole thing is not a real problem and not address 
the case.

5. Any other idea?


I would put 1. and 2. in my shopping cart.

Re 1. It may be tricky to implement in practice, since the CT-proxy 
will have to be smart enough to tell the difference between a request 
for a URI the user entered and one for a URI that an XHR object of an 
untouched page is sending. That is up to the CT-vendors; we should just 
make sure we don't recommend something impossible to achieve...
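
To make the difficulty in 1. more concrete, here is a minimal Python 
sketch of the first half of the proposed rule (modify a request only 
when it positively originates from a page the proxy transformed). It 
assumes the proxy keeps a record of the page URIs it has adapted and 
uses the Referer header as its only signal. The names TransformationLog 
and may_modify_request are hypothetical, and nothing here is taken from 
the guidelines or from an actual CT-proxy.

```python
from dataclasses import dataclass, field


@dataclass
class TransformationLog:
    """Hypothetical record of the page URIs this proxy has adapted."""
    transformed: set[str] = field(default_factory=set)

    def record(self, page_uri: str) -> None:
        self.transformed.add(page_uri)


def may_modify_request(headers: dict[str, str], log: TransformationLog) -> bool:
    """Return True only if the request positively originates from a page
    that this proxy transformed.

    Assumption: header names are already normalized ("Referer") and the
    Referer value is the only available hint about the originating page.
    """
    referer = headers.get("Referer")
    if referer is None:
        # No origin information: nothing can be determined positively,
        # so the proxy should leave the request untouched.
        return False
    return referer in log.transformed
```

Note that the sketch does not cover the "requests for the original 
pages" part of the rule: a request without a Referer is always left 
untouched here, and a link followed from an untouched page looks just 
like an XHR call from that page.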

Re 3. Same as Martin, I don't think there's any need to add a 
"Cache-Control: no-transform" in the XHR request. Adding the directive 
in the response instead (as for a regular page that the content provider 
doesn't want to see adapted) should be enough in most cases. That's 
already covered in §3.2 "[The server] should inhibit transformation of 
the response by including a no-transform directive", so no need to add 
anything here.
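
For illustration, the check a CT-proxy would perform on the response 
could look like the Python sketch below. The function name 
has_no_transform is hypothetical, header names are assumed to be 
normalized already, and the parsing is deliberately simplified (it 
splits on commas and ignores quoted directive values).

```python
def has_no_transform(headers: dict[str, str]) -> bool:
    """Return True if the Cache-Control header includes no-transform.

    Simplified parsing: directives are split on commas and compared
    case-insensitively; quoted values containing commas are not handled.
    """
    value = headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in value.split(",")}
    return "no-transform" in directives
```

The same check would apply to a request if option 3. were retained, 
since the directive has the same form in both directions.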


Side guidelines
---------------
Indirectly related to this:

a) What about saying in §3.4 that client-side scripting is one example 
of the heuristics that the CT-proxy should use to decide for or against 
adaptation?
I proposed: "examination of the content reveals that the page contains 
client-side scripts that may break if the page gets adapted"
... but any improvement on that would be most welcome

b) A CT-proxy may discover that a web site is "incoherent" in that the 
original page says "you may transform", whereas a linked image or script 
or CSS resource says "no-transform". We may want to say that the 
original page may be considered by the CT-proxy as the point of 
reference. I'm not sure we need to spell out that case though. What may be 
worth saying is something along the lines of "content providers should 
be coherent in their CT directives between linked resources"
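
As a rough illustration of what "incoherent" means in b), the Python 
sketch below lists the linked resources whose directive disagrees with 
the one on the original page. The function name and the way directives 
are represented (a boolean "transformation allowed" per URI) are 
assumptions made for the example only.

```python
def incoherent_resources(page_allows_transform: bool,
                         linked: dict[str, bool]) -> list[str]:
    """Return the URIs of linked resources whose CT directive differs
    from the original page's directive.

    `linked` maps each resource URI to True when transformation is
    allowed and False when the resource says no-transform.
    """
    return sorted(uri for uri, allowed in linked.items()
                  if allowed != page_allows_transform)
```

A CT-proxy that takes the original page as the point of reference would 
simply apply the page's directive to every URI returned here.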


François.


P.S.
----
Using the XMLHttpRequest interface, it is possible to set the HTTP 
headers as needed. In particular, it is possible to change the 
User-Agent HTTP header and/or to add a Cache-Control: no-transform header.
See: http://www.w3.org/2008/03/xhr-ua-test/test.html

Received on Wednesday, 19 March 2008 12:47:25 UTC