Re: Feedback on content transformation guidelines ( LC-2066 LC-2067 LC-2068 LC-2069 LC-2070 LC-2071 LC-2073 LC-2072 LC-2074 LC-2075 LC-2076 LC-2077 LC-2078 LC-2079 LC-2080 LC-2081 LC-2082 LC-2083 LC-2084)

Hi Mark,

A couple of comments on 406 below.
The group will discuss your comments during its F2F this week.


Mark Nottingham wrote:
> Many of the underlying requirements in this document are just reinterpretations of HTTP's requirements. Harm will come because it's very imprecise in doing so, thereby giving vendors permission to violate RFC2616, because they're conformant with a Recommendation. 
> 
> E.g., it allows retrying a POST request upon a 406, even though this isn't allowed in HTTP. It effectively allows proxies to ignore no-transform, if they really want to. It blurs the semantics of a 200 response based upon its content. 

On the particular point of retrying a POST request upon a 406, I guess 
our reading of RFC2616 was that a 406 response implied that the request 
had not been processed by the server. I could not find an explicit 
statement one way or the other about that in the RFC, which also means 
that we cannot treat it as a valid assumption. The guidelines should be 
fixed, IMO.
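
To illustrate the conservative reading, here is a minimal sketch of a retry
policy for a proxy that re-issues a request after a 406. The helper name
may_retry_after_406 and the choice to retry only the methods that RFC2616
section 9.1.1 lists as "safe" are assumptions made for this example, not
wording taken from the guidelines.

```python
# Hypothetical helper, not taken from the guidelines or from RFC 2616.
# Assumption: a 406 does not prove the server left the request unprocessed,
# so only methods that RFC 2616 section 9.1.1 calls "safe" are re-issued.
SAFE_METHODS = frozenset({"GET", "HEAD"})


def may_retry_after_406(method: str, status: int) -> bool:
    """Return True only when re-issuing the request cannot repeat a side effect."""
    if status != 406:
        return False
    return method.upper() in SAFE_METHODS
```

Under this policy a POST that comes back with a 406 is passed to the client
as is, rather than being sent a second time with different headers.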


As a side comment on 406, it just struck me that we recommend using 
406 in a more generic way than its definition suggests. The 
definition specifies that it is for resources that "have content 
characteristics not acceptable according to the *accept* headers sent in 
the request" [1]. We recommend using 406 "if a request cannot be 
satisfied with content that meets the criteria specified by values of 
the *HTTP request header fields*" [2]. In short, we do not restrict its 
use to the Accept HTTP headers (and we very much have the User-Agent 
HTTP header in mind here).
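
As a rough sketch of the narrower reading in [1], the status code below
depends on the Accept header only, and the User-Agent header is never
consulted. The function names accept_allows and status_for are hypothetical,
and the matching is deliberately simplified: it ignores the precedence rules
between specific and wildcard media ranges.

```python
def accept_allows(accept_header: str, content_type: str) -> bool:
    """Rough check: does any media range in Accept cover content_type?"""
    ctype, _, csub = content_type.lower().partition("/")
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        # Collect parameters such as q=0.5; ranges with q=0 are not acceptable.
        params = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
        try:
            if float(params.get("q", "1")) == 0:
                continue
        except ValueError:
            continue
        rtype, _, rsub = media_range.partition("/")
        if rtype in ("*", ctype) and rsub in ("*", csub):
            return True
    return False


def status_for(headers: dict, available_type: str) -> int:
    """Return 406 only when the Accept header rules out the available type."""
    # A missing Accept header is treated as accepting any media type.
    accept = headers.get("Accept", "*/*")
    return 200 if accept_allows(accept, available_type) else 406
```

A request that carries only an unrecognised User-Agent still gets a 200 here,
which is the gap between the definition in [1] and the wording in [2].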

We could tighten the wording so that it no longer says that a 200 
response can be treated as a 406 response where we actually mean "as a 
response to a request from a browser that is not supported". The 
browser's (non-)identification is likely to be based on the User-Agent 
header and not on the Accept HTTP headers.

Note that I do not think this would address your comment on blurring the 
semantics of a 200 response.

Francois.

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.7
[2] http://www.w3.org/TR/ct-guidelines/#sec-server-use-of-406


> 
> I realise that people will intervene in traffic in any case; as you're aware, HTTP allows this. The problem I have is that Recommending how to do this in such a loose way will just encourage vendors and network operators to disregard HTTP's requirements further, rather than limiting the damage they cause.
> 
> Cheers,
> 
> 
> On 25/11/2009, at 9:08 PM, Jo Rabin wrote:
> 
>> Hello Mark
>>
>> Many thanks for taking the trouble to make your further comments.
>>
>> We will address the specific comments in the group on a one by one basis. For now, would you mind clarifying:
>>
>> "My concern is that Recommending this document will cause more harm than good to the Web overall (even if it does represent a consensus of sort among a more limited community)."
>>
>> Would you be kind enough to tell us in more detail in which ways you fear the document will cause harm. Clearly it's not intended to do that, it's intended to improve a poor state of affairs where proxies on mobile access networks intervene seemingly haphazardly in traffic passing through them.
>>
>> thanks
>> Jo
>>
>> On 23/11/2009 23:57, Mark Nottingham wrote:
>>> Francois, My apologies, things have been quite busy.
>>> I've taken a quick look through the document.
>>> A few things that caught my eye;
>>> * 4.1.1 "Proxies should not intervene in methods other than GET, POST, HEAD."  "Intervene" is vague here; does it mean that they're not allowed to change the requests, but are allowed to change the responses to them? Or that they're not allowed to transform either the request or the response?
>>> * 4.1.4 "so should notify the user that this is the case" -- how? Using a Warning header? Are they required to populate the Age header, so that the user can calculate whether it's stale themselves?
>>> * 4.1.4 "and must provide a simple means of retrieving a fresh copy." Again, how? Wouldn't it be simpler to say that they MUST honour Cache-control: no-cache?
>>> * 4.1.5 It needs to be explicitly pointed out here that the modifications listed are not allowed when CC: no-transform is present in the request. Otherwise, the relative precedence of the requirements in the document is too imprecise.
>>> * 4.1.5 "Aside from the usual procedures defined in [RFC 2616 HTTP] proxies should not modify the values" -- I have a hard time parsing this. Do you mean "In addition to the requirements of [RFC2616]..." ?
>>> * 4.1.5 "the request is part of a sequence of requests to the same Web site and either it is technically infeasible not to adjust the request because of earlier interaction, or because doing so preserves consistency of user experience."  This seems like a hole that a proxy vendor can drive a truck through... are you serious?
>>> * 4.1.5.1 "The theoretical idempotency of GET requests is not always respected by servers. In order, as far as possible, to avoid misoperation of such content, proxies should avoid issuing duplicate requests and specifically should not issue duplicate requests for comparison purposes." Existing proxies can and do already retry GETs; I'm not sure who you're trying to protect here.
>>> * 4.2.7 Link to "handheld" representation -- you're requiring proxies to "process" (whatever that means) handheld links, even if the client isn't handheld?
>>> Overall, I'd say that the document quality is still marginal at best; it uses a lot of imprecise terminology and muddies the waters more than it clarifies things. Many (if not most) of its requirements aren't testable. My concern is that Recommending this document will cause more harm than good to the Web overall (even if it does represent a consensus of sort among a more limited community).
>>> Cheers,
>>> On 16/11/2009, at 11:48 PM, Francois Daoust wrote:
>>>> Dear Mark,
>>>>
>>>> The Last Call review period for the Guidelines for Web Content Transformation Proxies is over and we have not yet heard from you. We were wondering whether you had time to review the response to your comments below and the updated document, and whether you could let us know if you agree with it or not via email.
>>>>
>>>> The header of the previous email was generated from a template that did not give us the opportunity to apologize for the time it took us to get back to you. Comments received during the first Last Call review period generated a lot of discussions within the group. Resolutions of the issues took more time than expected. The group thinks the document has quite improved as a consequence, apologizes for the delay and would like to thank you again for your contribution!
>>>>
>>>> Thanks,
>>>>
>>>> For the Mobile Web Best Practices Working Group,
>>>> Francois Daoust,
>>>> W3C Staff Contact.
>>>>
>>>>
>>>> fd@w3.org wrote:
>>>>> Dear Mark Nottingham,
>>>>> The Mobile Web Best Practices Working Group has reviewed the comments you
>>>>> sent [1] on the Last Call Working Draft [2] of the Content Transformation
>>>>> Guidelines 1.0 published on 1 Aug 2008. Thank you for having taken the time
>>>>> to review the document and to send us comments!
>>>>> The Working Group's response to your comment is included below, and has
>>>>> been implemented in the new version of the document available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/.
>>>>> Please review it carefully and let us know by email at
>>>>> public-bpwg-comments@w3.org if you agree with it or not before 6 November
>>>>> 2009. In case of disagreement, you are requested to provide a specific
>>>>> solution for or a path to a consensus with the Working Group. If such a
>>>>> consensus cannot be achieved, you will be given the opportunity to raise a
>>>>> formal objection which will then be reviewed by the Director during the
>>>>> transition of this document to the next stage in the W3C Recommendation
>>>>> Track.
>>>>> Thanks,
>>>>> For the Mobile Web Best Practices Working Group,
>>>>> Dominique Hazaël-Massieux
>>>>> François Daoust
>>>>> W3C Staff Contacts
>>>>> 1. http://www.w3.org/mid/427CE896-3572-4F32-8C9D-589B59AEE7D5@mnot.net
>>>>> 2. http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/
>>>>> =====
>>>>> Your comment on 2.1 Types of Proxy:
>>>>>> * Section 2.1 - "Alteration of HTTP requests and responses is not  prohibited by HTTP other than in the circumstances referred to in  [RFC2616 HTTP] Section 13.5.2."  This isn't true; section 14.9.5 needs to be referenced here as well.
>>>>> Working Group Resolution (LC-2066):
>>>>> We agree that section 14.9.5 refers to this and have changed the reference
>>>>> accordingly.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-types-of-proxy
>>>>> ----
>>>>> Your comment on 3.4 Content Deployment Conformance:
>>>>>> * Section 3.4 / 3.5 "A [Content|Transformation] Deployment conforms to these guidelines if it follows the statements..."  What does "follows" mean here -- if they conform to all MUST level requirements? SHOULD  and MUST?
>>>>> Working Group Resolution (LC-2067):
>>>>> We agree that this is unclear. The guidelines now state that conformance
>>>>> applies to SHOULD statements as well and that a justification is required
>>>>> for each circumstance in which a SHOULD statement is not followed.
>>>>> Transformation Deployments willing to claim conformance to the spec must
>>>>> make a conformance statement available as a separate document and
>>>>> referenced from the guidelines.
>>>>> Check the updated version of the text:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-transformation-deployment-conformance
>>>>> ----
>>>>> Your comment on 4.1.2 no-transform directive in Request:
>>>>>> * Section 4.1.2 "If the request contains a Cache-Control: no-transform directive proxies must forward the request unaltered to the server,  other than to comply with transparent HTTP behaviour and as noted  below."  I'm not sure what this sentence means.
>>>>> Working Group Resolution (LC-2068):
>>>>> We agree and have added references to the relevant sections of RFC2616, as
>>>>> well as to the section of the guidelines that points out the HTTP header
>>>>> fields proxies should add.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-request-no-transform
>>>>> ----
>>>>> Your comment on 4.1.3 Treatment of Requesters that are not Web browsers:
>>>>>> * Section 4.1.3 "Proxies must act as though a no-transform directive  is present (see 4.1.2 no-transform directive in Request) unless they  are able positively to determine that the user agent is a Web  browser."  How do they "positively" determine this? Using heuristics is far from a guaranteed mechanism. Moreover, what is the reasoning  behind this? If the intent is to only allow transformation of content intended for presentation to humans, it would be better to say that.  In any case, putting a MUST-level requirement on this seems strange.
>>>>> Working Group Resolution (LC-2069):
>>>>> We agree that there is no applicable mechanism to determine that the user
>>>>> agent is a Web browser, and have removed any normative statement from the
>>>>> section.
>>>>> The section was substantially reworded and now refers to the notion of
>>>>> "Traditional Browsing".
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-non-web-browsers
>>>>> ----
>>>>> Your comment on 4.1.4 Serving Cached Responses:
>>>>>> * Section 4.1.4 "Proxies should follow standard HTTP procedures in  respect to caching..."  This seems a strange way to phrase it, and I  don't think it's useful to use RFC2616 language here.
>>>>> Working Group Resolution (LC-2070):
>>>>> We agreed and have reworded the section to remove the weird use of
>>>>> normative terms to refer to RFC2616.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-serving-cached-responses
>>>>> ----
>>>>> Your comment on 4.1.5 Alteration of HTTP Header Values:
>>>>>> * Section 4.1.5 Bullet points one and 3 are get-out-of-jail-free cards for non-transparent proxies to ignore no-transform and do other anti- social things. They should either be tightened up considerably, or  removed.
>>>>> Working Group Resolution (LC-2071):
>>>>> What is now section 4.2.3 makes it clear that no-transform must be
>>>>> respected:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-receipt-of-cache-control-no-transform
>>>>> Bullets 1 and 3 only refer to alteration of the User-Agent and Accept-*
>>>>> headers and not to transformation of the response.
>>>>> ----
>>>>> Your comment on 4.1.5 Alteration of HTTP Header Values:
>>>>>> * Section 4.1.5 "proxies should use heuristics including comparisons  of domain name to assess whether resources form part of the same "Web site."  I don't think the W3C should be encouraging vendors to  implement yet more undefined heuristics for this task; there are  several approaches already in use (e.g., in cookies, HTTP, security  context, etc.); please pick one and refer to it specifically.
>>>>> Working Group Resolution (LC-2073):
>>>>> We are not aware of any satisfactory heuristics. We acknowledge the fact
>>>>> that Transformation Deployments will need to adopt heuristics of some kind,
>>>>> and that this must be left open.
>>>>> ----
>>>>> Your comment on 4.1.5 Alteration of HTTP Header Values:
>>>>>> * Section 4.1.5 What is a "restructured desktop experience"?
>>>>> Working Group Resolution (LC-2072):
>>>>> We agree and have added a term reference to the definition of
>>>>> "restructuring" and a cross reference to the section that describes the
>>>>> user selection of restructured experience.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-altering-header-values
>>>>> ----
>>>>> Your comment on 4.1.5.1 Content Tasting:
>>>>>> * Section 4.1.5.1 Proxies (and other clients) are allowed to and do  reissue requests; by disallowing it, you're profiling HTTP, not  providing guidelines.
>>>>> Working Group Resolution (LC-2074):
>>>>> Based on our experience and feedback from servers whose operators take
>>>>> strong exception to this practice, we think it's reasonable to advise
>>>>> operators of CT-proxies of this situation. We're not prohibiting reissuing
>>>>> requests, we're just observing that content providers don't like it.
>>>>> ----
>>>>> Your comment on 4.1.5.2 Avoiding "Request Unacceptable" Responses:
>>>>>> * Section 4.1.5.2 Again, not specifying the heuristics is going to  lead to differences in behaviour, which will cause content authors to have to account for this as well.
>>>>>>
>>>>>> * Section 4.1.5.2 "A proxy must not re-issue a POST/PUT request..." Is this specific to POST and PUT, or all requests with bodies, or...?
>>>>> Working Group Resolution (LC-2075):
>>>>> We now limit the scope to HEAD GET and POST. We observe that duplicate
>>>>> POSTS are seen "in the wild" and think it important to point out to
>>>>> operators of content transformation proxies that this is problematical.
>>>>> We acknowledge that not specifying heuristics will lead to differences in
>>>>> behaviour. However, this is something that content transformation providers
>>>>> will need to do to provide the service they set out to provide.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-avoiding-request-unacceptable
>>>>> ----
>>>>> Your comment on 4.1.5.4 Sequence of Requests:
>>>>>> * Section 4.1.5.4 Use of the term 'representation' is confusing here;  please pick another one.
>>>>>>
>>>>>> * Section 4.1.5.4 Using the same headers is often not a good idea.  More specific, per-header advice would be more helpful.
>>>>> Working Group Resolution (LC-2076):
>>>>> Ref 'representation', we agree and have used the term "included
>>>>> resources", as defined in the W3C mobileOK Basic Tests standard:
>>>>> http://www.w3.org/TR/mobileOK-basic10-tests/#included_resources
>>>>> Ref per-header advice, we agree and have clarified that we are only
>>>>> talking about keeping the User-Agent HTTP header field consistent.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-sequence-of-requests
>>>>> ----
>>>>> Your comment on 4.1.5.5 Original Headers:
>>>>>> * Section 4.1.5.5 This is specifying new protocol elements; this is  becoming a protocol, not guidelines.
>>>>> Working Group Resolution (LC-2077):
>>>>> We are reflecting current practice as implemented by content
>>>>> transformation proxies. It is not our intention to create a new protocol,
>>>>> just to try to reduce the chaos that is currently perceived to be out
>>>>> there.
>>>>> The newly introduced HTTP Header Fields have been provisionally registered
>>>>> with IANA:
>>>>> http://www.iana.org/assignments/message-headers/prov-headers.html
>>>>> ----
>>>>> Your comment on 4.1.6.1 Proxy Treatment of Via Header:
>>>>>> * Section 4.1.6.1 When a proxy inserts the URI to make a claim of  conformance, exactly what are they claiming -- all must-level  requirements are met? Should-level? What is the use case for this  information?
>>>>> Working Group Resolution (LC-2078):
>>>>> We agree and have clarified that inclusion of a Via comment of the form
>>>>> indicated is not a conformance claim, but is an indication that the proxy
>>>>> may restructure or otherwise modify content.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-via-headers
>>>>> ----
>>>>> Your comment on 4.2.1 Use of HTTP 406 Status:
>>>>>> * Section 4.2.1 Requiring servers to respond with 406 is profiling  HTTP; HTTP currently allows the server to send a 'default'  representation even when the headers say that the client doesn't  prefer it.
>>>>> Working Group Resolution (LC-2079):
>>>>> We agree and have moved server behavior into an "Informative Guidance for
>>>>> Origin Servers" non-normative appendix where we point out that servers
>>>>> should consider using an HTTP 406 Status if a request cannot be satisfied.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-server-use-of-406
>>>>> ----
>>>>> Your comment on 4.2.2 Server Origination of Cache-Control: no-transform:
>>>>>> * Section 4.2.2 "Servers must include a Cache-Control: no-transform  directive if one is received in the HTTP request." Why?
>>>>> Working Group Resolution (LC-2080):
>>>>> We agree and have moved server behavior into an "Informative Guidance for
>>>>> Origin Servers" non-normative appendix where we point out that servers
>>>>> should consider including a Cache-Control: no-transform directive if one is
>>>>> received as it may be an indication that the client does not wish to
>>>>> receive a transformed response.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-cache-control-no-transform
>>>>> ----
>>>>> Your comment on 4.2.3.1 Use of Vary HTTP Header:
>>>>>> * Section 4.2.3.1 "Servers may base their actions on knowledge... but  should not choose an Internet content type for a response based on an assumption or heuristics about behaviour of any intermediaries." Why
>>>>>> not?
>>>>> Working Group Resolution (LC-2081):
>>>>> Guidelines for origin servers were switched to an informative appendix.
>>>>> The text was clarified to point out that the Internet content type for a
>>>>> response should be correct for the actual content, and not chosen on the
>>>>> basis that the server suspects the proxy will not transform the content if
>>>>> it receives such an Internet media type.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-use-of-vary-header
>>>>> ----
>>>>> Your comment on 4.3.2 Receipt of Warning: 214 Transformation Applied:
>>>>>> * Section 4.3.2 Why can't proxies transform something that has already been transformed?
>>>>> Working Group Resolution (LC-2082):
>>>>> We agree and have replaced the section with a section that notes that
>>>>> intermediate proxies may add a Cache-Control: no-transform directive if
>>>>> they want to inhibit further transformation.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-proxy-use-of-no-transform
>>>>> ----
>>>>> Your comment on 4.3.3 Server Rejection of HTTP Request:
>>>>>> * Section 4.3.3 Sniffing content for error messages is dangerous, and  also unlikely to work. E.g., will you sniff for all languages and all possible phrases? How will you avoid false positives? Remove this  section and require content providers to get it right. People may  still do this in their products, but there's no reason to codify it.
>>>>> Working Group Resolution (LC-2083):
>>>>> Sniffing content is an important part of the mechanism described in 4.1.5
>>>>> so has to be mentioned here in some form. But we don't mean to propose this
>>>>> as a fail safe mechanism, we merely mean to indicate that Content
>>>>> Transformation proxies may need to employ heuristics to provide an improved
>>>>> service for their users. Therefore we have removed any reference to
>>>>> conforming servers.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-unacceptable-response
>>>>> ----
>>>>> Your comment on 4.3.4 Receipt of Vary HTTP Header:
>>>>>> * Section 4.3.4 What's the purpose behind this behaviour?
>>>>> Working Group Resolution (LC-2084):
>>>>> This section is part of the fail safe mechanism described in 4.1.5.2
>>>>> Avoiding "Request Unacceptable" Responses. The reference to 4.1.5.2 was
>>>>> moved to the beginning of this section and the wording simplified.
>>>>> The updated text is available at:
>>>>> http://www.w3.org/TR/2009/WD-ct-guidelines-20091006/Overview.html#sec-receipt-of-vary-header
>>>>> ----
>>> --
>>> Mark Nottingham     http://www.mnot.net/
> 
> 
> --
> Mark Nottingham     http://www.mnot.net/
> 
> 

Received on Monday, 7 December 2009 11:34:18 UTC