New editor's draft of Guidelines for Web Content Transformation Proxies (1p)

From: Jo Rabin <jrabin@mtld.mobi>
Date: Fri, 07 Nov 2008 18:40:07 +0000
Message-ID: <49148B87.3050703@mtld.mobi>
To: public-bpwg-ct <public-bpwg-ct@w3.org>

Hello everyone

I have updated the CT Guidelines according to all the resolutions I 
could find in minutes of meetings.

Francois: I have yet to update the LC tracker with "Resolution 
implemented" where appropriate

You will find the shiny new draft at

http://www.w3.org/2005/MWI/BPWG/Group/TaskForces/CT/editors-drafts/Guidelines/081107

There's not much point in a diff, but you'll find a link to one under 
"Previous Revisions".

Enjoy.

Jo

===================
Detailed Change Log
===================

LC-2007

     RESOLUTION: remove "Content Deployment" class of product and move
     section 4.2 Server Response to Proxy to an informative section. No
     more normative guidelines on Content Providers.

Summary of requirements removed as not consistent with the document 
describing only the proxy's behavior.

Audience section reworded to reflect change.

Changed Conformance Section.

Previous Guidelines included as an Appendix and heavily reworded.
*** Note that the meaning of many of the sections is altered ***

Removed normative *recommended* from non-normative appendix 
"Applicability to Transforming Solutions which are Out of Scope"

---

LC-2067

     RESOLUTION: re. LC-2067, state that conformance applies to SHOULD
     statements as well. A justification is required for each
     circumstance in which a SHOULD statement is not followed. Prepare an
     Implementation Conformance Statement to be filled out by
     Transformation Deployments willing to claim conformance to the spec.

Reworded Conformance to make it clear that SHOULD needs to be observed 
too. Reference the following too.

It was Francois's ACTION-846 to prepare a draft conformance statement. 
I've made a nice space as an Appendix and I'm sure we are all looking 
forward to seeing it :-) .


---

LC-2050

     RESOLUTION: Re LC-2050 move definitions to scope to clarify that we
     are talking only about restructuring

*** on reflection, I don't think this makes sense. So I left the 
definitions where they are. In reality, we are talking about both 
restructuring and recoding and optimizing. I think perhaps we should 
take this back to Eduardo when he joins the group.

     RESOLUTION: re LC-2050 we don't intend to define these concepts any
     more formally than we do now

(no edits required)


---

LC-2018

     RESOLUTION: The title of the spec will be "Guidelines for Web
     Content Transformation Proxies"

And so it is.


---

LC-2012

     RESOLUTION: re. LC-2012, use Tom's wording above, (and note we may
     have to further clarify the introduction for other reasons anyway)

overtaken by events ...

     RESOLUTION: re. LC-2012, replace the sentence deemed obscure by
     "Within this document content transformation refers to the
     manipulation of requests to, and responses from, an origin server.
     This manipulation is carried out by proxies in order to provide a
     better user experience of content that would otherwise result in an
     unsatisfactory experience on the device making the request."



---

LC-2064

     RESOLUTION: LC-2064 is a mistake. There are no duplicated IDs in the
     document.

(no edit required)


---

LC-2068

     RESOLUTION: re. LC-2068, we think the text is as clear as possible.
     Stick to the text in the spec.

(no edit required)


---

LC-2008

     RESOLUTION: re. LC-2008, update the text according to Jo's proposal,
     as pasted above

Text on vary headers ... slightly differently put in the Appendix as it 
is now informative.


---

LC-2090 and LC-2091

     RESOLUTION: The manner in which transformation is carried out, when
     it is permitted, including any additional navigational or other
     material that is included, aside from where explicitly stated
     (insecure links etc.) will be noted in an "out of scope section" in
     the document. And resolve no to LC-2090 and LC-2091

Amended "Scope" to refer to this question and to mention internal 
operation being out of scope.


---

LC-2068

     RESOLUTION: re. LC-2068, amend the text in section 4.1.2 with
     references to RFC HTTP sections. Final text: "If the request
     contains a Cache-Control: no-transform directive, proxies must not
     alter the request other than to comply with transparent HTTP
     behavior defined in HTTP RFC 2616 sections 14.9.5 and 13.5.2. and to
     add headers as described in 4.1.6 Additional HTTP Headers below."

done
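
For illustration only, a minimal sketch of how a proxy might detect the directive before deciding whether it may alter a request. The function names and the list-of-pairs header representation are assumptions made for this example, not anything defined in the draft. Splitting on commas is a simplification that ignores quoted directive values.

```python
def has_no_transform(headers):
    """Return True if any Cache-Control value carries a no-transform directive.

    `headers` is assumed to be a list of (name, value) pairs.
    """
    for name, value in headers:
        if name.lower() != "cache-control":
            continue
        for directive in value.split(","):
            # Directive names are case-insensitive; drop any "=value" part.
            token = directive.split("=", 1)[0].strip().lower()
            if token == "no-transform":
                return True
    return False


def may_alter_request(headers):
    """Hypothetical gate a proxy could consult before modifying a request."""
    return not has_no_transform(headers)
```

Even when `may_alter_request` returns False, the final text still allows the transparent behaviour of RFC 2616 and the additional headers of 4.1.6, neither of which this sketch models.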


---

LC-2023

     RESOLUTION: On character encoding mention this under 4.3.6.1 and
     respond "Yes partial" to LC-2023

Added as a note under what was 4.3.6.1
<p>Other than as noted in this section the nature of restructuring that 
is carried out, any character encoding alterations and what is omitted 
and what is inserted is, as discussed in <specref ref="sec-scope"/>, out 
of scope of this document.</p>

whereas the text discussed was:

     Other than as noted in this
     section the nature of restructuring that is carried out, what is
     omitted and what is inserted may be a copyright issue and is in any
     case out of scope of this document


     RESOLUTION: Mention the out of scope nature of the details of
     restructuring under 4.3.6 somewhere (cf insertion of headers,
     footers etc.)

See above under LC-2090 and previous inserted text.


---

LC-2065

     RESOLUTION: Move content from Appendix E to 4.3.6 somewhere and
     reword appropriately (and yes, partial to LC-2065)

As follows as a new 4.x.6.1

<head>User Preferences</head>
<p>Proxies <rfc2119>must</rfc2119> provide a means for users to express 
preferences for inhibiting content transformation. Those preferences 
<rfc2119>must</rfc2119> be maintained on a user by user and Web site by 
Web site basis. Proxies <rfc2119>must</rfc2119> solicit re-expression of 
preferences in respect of a server if the server starts to indicate that 
it offers varying responses as discussed under <specref 
ref="sec-receipt-of-vary-header"/>.</p>
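
As a rough sketch of what "user by user and Web site by Web site" could look like in practice, the class below keeps one stored choice per (user, site) pair and forgets the choices for a site when that site starts sending a Vary header, so the user has to be asked again. The class, its method names and the in-memory storage are all assumptions for illustration. The draft does not say how preferences are stored or solicited.

```python
class TransformationPreferences:
    """Hypothetical per-user, per-site store of 'inhibit transformation' choices."""

    def __init__(self):
        self._inhibit = {}      # (user_id, site) -> bool
        self._site_varies = {}  # site -> bool, last observed Vary state

    def set_preference(self, user_id, site, inhibit):
        self._inhibit[(user_id, site)] = inhibit

    def get_preference(self, user_id, site):
        """Return True or False, or None if the user needs to be asked."""
        return self._inhibit.get((user_id, site))

    def observe_response(self, site, has_vary_header):
        """Record the Vary state; clear stored choices if the site starts to vary."""
        previously_varied = self._site_varies.get(site, False)
        self._site_varies[site] = has_vary_header
        if has_vary_header and not previously_varied:
            for key in [k for k in self._inhibit if k[1] == site]:
                del self._inhibit[key]
```

A `None` result from `get_preference` is the cue to solicit the preference again.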


---

LC-2026, LC-2027, LC-2085, LC-2028, LC-2029, LC-2030, LC-2015, LC-2031, 
LC-2016, LC-2032, LC-2001, LC-2033, LC-2004, LC-2024

(phew!)

     RESOLUTION: Accept the thrust of Tom's submission on HTTPS, and
     editor to make sure that the wording is beefed up (e.g. by saying
     that if a proxy rewrites HTTPS ... rather than saying a proxy MAY)
     to make it clear that if you _must_ do it the user MUST know and
     MUST have a choice

     ACTION-860 - Add clarification to HTTPS rewriting
     to make it clear that the via header MUST be added

     ACTION-864 - Redraft HTTPS section for discussion
     on list [on Jo Rabin - due 2008-10-21]

Tom's Submissions:

1. http://lists.w3.org/Archives/Public/public-bpwg-ct/2008Sep/0013.html
2. http://lists.w3.org/Archives/Public/public-bpwg-ct/2008Oct/0012.html

Pending
     Francois's ACTION-859 - Contact IETF TLS group and advise
     them of what we are thinking and ask for guidance on what to
     recommend to Content Provider about detecting the presence of a
     man-in-the-middle proxy

Pending
     Discussion with Thomas Roessler about his concerns ref applications 
and possible security risks relating to the client thinking that all 
hosts are the same (i.e. that they are the proxy).

Discussion: http://www.w3.org/2008/10/07-bpwg-minutes.html#item02

The amended text so far:

4.2.7.2 HTTPS Link Re-writing
Note:

The BPWG does not condone link rewriting, but notes that in some 
circumstances HTTPS is used in situations where the user is prepared to 
trade usability provided by a transforming proxy for the loss of 
end-to-end security. Servers can prevent users from exercising this 
choice by applying a Cache-Control: no-transform directive.

If a proxy rewrites HTTPS links, it must advise the user of the security 
implications of doing so and must provide the option to bypass it and to 
communicate with the server directly.

Notwithstanding anything else in this document, proxies must not rewrite 
HTTPS links in the presence of a Cache-Control: no-transform directive.

If a proxy re-writes HTTPS links, replacement links must have the scheme 
https.

When forwarding requests originating from HTTPS links proxies must 
include a Via header as discussed under 4.1.6.1 Proxy Treatment of Via 
Header.

When forwarding responses from servers proxies must notify the user of 
invalid server certificates.

Add some stuff below under guidance for servers

Note:

For clarity it is emphasized that it is not possible for a transforming 
proxy to transform content accessed via an HTTPS link without breaking 
end-to-end security.
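
To make the rules in the amended text concrete, here is a small sketch of the decision a proxy might take for a single link. The `/fetch?url=` form of the rewritten link, the `proxy_host` parameter and the `user_accepted_risk` flag are invented for this example. The draft only requires that rewriting never happens under no-transform, that the user is advised and can bypass the proxy, and that replacement links use the https scheme.

```python
from urllib.parse import quote, urlsplit


def rewrite_https_link(href, proxy_host, no_transform, user_accepted_risk):
    """Return the link to present to the user for `href`.

    The link is left untouched unless it is an HTTPS link, the response
    carried no Cache-Control: no-transform, and the user has been advised
    and chose to go through the proxy.
    """
    if urlsplit(href).scheme.lower() != "https":
        return href
    if no_transform or not user_accepted_risk:
        return href
    # Replacement links keep the https scheme (hypothetical URL layout).
    return "https://" + proxy_host + "/fetch?url=" + quote(href, safe="")
```

The Via header and the notification of invalid server certificates apply when requests and responses are forwarded and are outside this sketch.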


---

LC-2078

     RESOLUTION: Rewrite section 4.1.6.1 to clarify that inclusion of a
     via comment of the form indicated is not a conformance claim, but is
     an indication that the proxy may restructure or otherwise modify
     content

Replace "indicate their conformance ..." with "indicate their ability to 
transform content"

---

LC-2019

     RESOLUTION: re. LC-2019, amend text on conversion between HEAD and
     GET to say that other conversions are not allowed, and resolve
     partial to LC-2019

Other than to convert between HEAD and GET proxies <rfc2119>must 
not</rfc2119> alter request methods.

---

LC-2034: Applicable HTTP methods (§4.1.1)

     RESOLUTION: ref LC-2034, we clarify that the scope of
     the document is limited to GET, POST, HEAD requests and their
     responses and resolve "no"

Modified to remove PUT: <p>Proxies <rfc2119>should not</rfc2119> 
intervene in requests with methods other than GET, POST, HEAD.</p>

Added:

<div3 id="sec-ApplicableResponses">
                     <head>Applicable Responses</head>
                     <p>Proxies <rfc2119>should not</rfc2119> intervene 
in response if the request method was not HEAD, GET or POST.</p>
                 </div3>
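
Taken together with the HEAD/GET rule under LC-2019, the method restrictions can be sketched as two small checks. The function names are made up for this example, and method names are compared case-sensitively, as HTTP defines them.

```python
IN_SCOPE_METHODS = frozenset({"GET", "POST", "HEAD"})


def may_intervene(request_method):
    """Proxies should not intervene in requests or responses for other methods."""
    return request_method in IN_SCOPE_METHODS


def is_permitted_method_change(original, forwarded):
    """Only conversion between HEAD and GET (or no change at all) is allowed."""
    if original == forwarded:
        return True
    return {original, forwarded} == {"HEAD", "GET"}
```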

---

LC-1997, LC-2006, LC-2014, LC-2046

*** Discussed on 14th but no resolution that day ***

---

LC-2066

RESOLUTION: Accept LC-2066 and add the reference

Added reference to HTTP section 14.9.5

---

LC-2044

RESOLUTION: Ref LC-2044 Resolve yes, and change the text to say 
"*values* of User Agent and Accept headers", and clarify that we do not 
propose guidance for new user agents' use of these headers, it is out of 
scope

*** I didn't add the clarification, it seems out of place ***

And anyway

RESOLUTION: re LC-2044, resolution on LC-2069 removes the part that 
required clarification, resolve partial, we won't talk about "use of 
evidence"

---

LC-2070

RESOLUTION: Ref LC-2070, resolve yes, and change para 1 to say "Aside 
from the usual caching procedures defined in RFC 2616, in some 
circumstances ..."

done

---

LC-2069

RESOLUTION: ref LC-2069. Resolved yes, with the replacement text: Before 
altering aspects of an HTTP request proxies ought to take account of the 
fact that HTTP is used as a transport mechanism for many other 
applications than "Traditional Browsing" and that alteration of HTTP 
requests for those applications can cause serious misoperation.

Used the words "need to"

---

LC-2003

RESOLUTION: Make a note about the reasons for not referring to lists, of 
whatever hue, because the presumption about the internal operation of 
proxies is not in scope, as far as we are concerned these are "black boxes"

Text included in Scope per the above LC-2090

---

LC-1996 et al (section 4.1.5)

RESOLUTION: WRT 4.1.5 Text remains substantially as is but is reinforced 
by saying that the CT proxy SHOULD NOT change headers and values other 
than User Agent and Accept(-*), MUST NOT delete headers and it MUST be 
possible for the server to reconstruct the original UA originated 
headers by using X-Device etc.

done
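
One way to picture the "reconstruct the original headers" requirement is the sketch below. It assumes that the original value of an altered header is copied into a header named "X-Device-" plus the original header name, which is a guess at the X-Device-* pattern mentioned in this message and not a quotation from the draft. Headers are modelled as a plain dict, and only replacement of a header that is already present is handled.

```python
PREFIX = "X-Device-"


def is_alterable(name):
    """Only User-Agent and the Accept family may be changed in this sketch."""
    lower = name.lower()
    return lower in ("user-agent", "accept") or lower.startswith("accept-")


def alter_header(headers, name, new_value):
    """Return a copy of `headers` with `name` replaced and its original kept."""
    if not is_alterable(name):
        raise ValueError("header should not be altered: " + name)
    if name not in headers:
        raise KeyError(name)
    result = dict(headers)
    # Keep the first original value even if the header is altered twice.
    result.setdefault(PREFIX + name, result[name])
    result[name] = new_value
    return result


def reconstruct_original(headers):
    """What a server could do to recover the headers the user agent sent."""
    original = {k: v for k, v in headers.items() if not k.startswith(PREFIX)}
    for key, value in headers.items():
        if key.startswith(PREFIX):
            original[key[len(PREFIX):]] = value
    return original
```

No header is ever deleted here, which matches the MUST NOT in the resolution.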

---

LC-2074

RESOLUTION: re. LC-2074, resolve no. Based on our experience and 
feedback from servers whose operators take strong exception to this 
practice, we think it's reasonable to advise CT-proxies operators of 
this situation


No change needed

---

LC-2037

RESOLUTION: ref LC-2037 yes, we have removed PUT partly in response to 
your comment

Done

PROPOSED RESOLUTION: ref LC-2037 ref retrying POSTs, no, we agree that 
it should not be necessary to point this out, but sadly it is

No actual resolution but no change needed.

See http://lists.w3.org/Archives/Public/public-bpwg-ct/2008Oct/0052.html

---

LC-2075

RESOLUTION: LC-2075 differences in behaviour: the internal operation of 
the proxy is not open to our specification, we need to point out to CT 
proxies that in practice 406 responses are not the only way in which 
content providers signal that they can't or won't handle a request, 
though we do say that this is the preferred way of them doing so

*** actually the text in what is now the appendix doesn't say it's the 
preferred way any more ***

but no change needed at this section

RESOLUTION: ref LC-2075, we have changed the text to refer only to POST 
and we acknowledge that this should not need restatement from RFC 2616 
but we are aware of this kind of misoperation "in the wild"

Removed PUT

---

LC-2076, LC-2039

RESOLUTION: Ref LC-2076 - yes, we will change the use of the word 
representation and use something like "included resources"

done with reference to mobileOK basic test 1.0 (sic)

RESOLUTION: ref LC-2039 and LC-2076: Yes, we will clarify that we are 
talking about keeping the User Agent Header consistent

done

---

LC-2079, LC-2041, LC-2080 - 4.2.1 Use of HTTP 406 Status - 4.2.2 Server 
Origination of Cache-Control: no-transform

RESOLUTION: ref LC-2041, LC-2080 and LC-2079, yes, we intend to move 
server behaviour into a non-normative section and point out that servers 
may wish to respond with no-transform if they think that this respects 
the intention of the requester and that for the sake of clarity use of 
406 is clearer than using a default representation using 200 and the 
text "your browser is not supported"

*** done-ish ***

---

LC-2045 - Respect of RFC2616 - 4.2.2 Server Origination of 
Cache-Control: no-transform

RESOLUTION: re. LC-2045, resolve partial, comment actually applies to 
4.3.1 where it is emphasized that proxies MUST behave "transparently" 
with a link to the definition that contains links to sections 13.5.2 and 
14.9.5 of RFC2616

done

---

LC-2081

RESOLUTION: Change the second para of 4.2.3.1 to say "don't 
systematically misrepresent your content, even if you think that will 
avoid it being transformed"

Something like the above, anyway

---

LC-2009, LC-2010, LC-2011 - Use of the link element - 4.2.3.2 Indication 
of intended presentation media type of presentation

RESOLUTION: LC-2010 is a reasonable comment but is now overtaken by 
events - namely that we don't propose to use fragment identifiers as a 
method to achieve this anymore.

RESOLUTION: Ref LC-2011 in 4.2.3.2 (and elsewhere as suits clarity and 
editorial convenience) at para 3 and the following note. Make it clear 
that where more than one representation is available from the same URI 
this ought to be represented by using a Vary header and can't be 
represented using <link rel="alternate">. In other cases the link header 
should be used to reference alternative representations (i.e. where the 
Base URI, ref RFC 3986 secs 5.5 and 5.1 does not indicate a same 
document reference)

Hope I have done this according to expectations

RESOLUTION: re. LC-2009, resolve yes, acknowledge RFC3986 section 4.4 
and remove the part on fragment identifiers

Yup

---

LC-2020 - Copyright - 4.3 Proxy forwarding of response to user agent

RESOLUTION: re. LC-2020, resolve no, the presence or absence of a 
Copyright is not a clear indication of the rights associated with the page

---

LC-2082, LC-2042 - Cascading proxies - 4.3.2 Receipt of Warning: 214 
Transformation Applied

RESOLUTION: WRT LC-2082, LC-2042: resolve yes and remove 4.3.2, replace 
with a section noting that intermediate proxies should send no-transform 
if they want to inhibit further transformation

---

LC-2083

RESOLUTION: ref LC-2083, no, it is an important part of the mechanism 
described in 4.1.5 so has to be here in some form. We don't mean to 
propose this as a fail safe mechanism, we merely mean to indicate that 
CT proxies may need to employ heuristics to provide an improved service 
for their users. Remove reference to conforming servers.

Done

---

LC-2084 - purpose of behavior - 4.3.4 Receipt of Vary HTTP Header

RESOLUTION: re. LC-2084, resolve partial since this is part of the fail 
safe mechanism defined in 4.1.5.2 that explains the use case. Move 
reference to 4.1.5.2 earlier in the sentence and simplify wording, add 
reference to example kindly to be re-provided by Francois

*** noting that the example is needed from Francois! ***

Hope that the revised text is clearer

---

LC-1998 - No transformation for application/xhtml+xml - 4.3.6 Proxy 
Decision to Transform


RESOLUTION: Remove examples of heuristics from the main run of text and 
include Appendices to list in a *non-endorsed* way lists of stuff that 
other people have used but are not endorsed by us, and did I mention that 
they are not endorsed


Created Examples Appendices, moved example doctypes there

RESOLUTION: re. LC-1998, resolve no and point out to commenter that this 
assumption is unsafe without other supporting evidence.

No change needed

---

LC-1999 - No transformation for small pages - 4.3.6 Proxy Decision to 
Transform

RESOLUTION: Ref LC-1999 Resolve no and point out to commenter 
that size on its own is unsafe as an indicator of mobile friendliness, e.g. 
content with embedded flash

---

LC-2048, LC-2002, LC-2052, LC-2021 - Heuristics - 4.3.6 Proxy Decision 
to Transform

RESOLUTION: Ref LC-2048 and LC-2002, LC-2052 and LC-2021, resolve 
partial, and say that we include these examples as non-endorsed 
heuristics in the non endorsed heuristics appendix

now there are some long lists

---

LC-2022 - i-mode content - 4.3.6 Proxy Decision to Transform

RESOLUTION: Ref LC-2022 resolve partial, we agree that this was not 
included and have added it as a non-endorsed heuristic in the relevant 
appendix

already dealt with above

---

LC-2090, LC-2000 - No extra content without the consent of the content 
owner - 4.3.6 Proxy Decision to Transform

RESOLUTION: Ref LC-2090 and LC-2000, resolve no, other than to note that 
adding extra content is forbidden where no-transform is present and 
content providers should use this if they want to be sure their content 
is not added to

nothing to do

---

LC-2013 - meta http-equiv - 4.3.6 Proxy Decision to Transform

RESOLUTION: Ref LC-2013, resolve yes, clarify in 4.3.1 and 4.3.6 and in 
other relevant sections that meta http-equiv should be consulted if the 
relevant actual HTTP header is not present

Included as a preface to what is now 4.2
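
A minimal sketch of the fallback described in the resolution: use the real HTTP header when it is present, and only otherwise look for a meta http-equiv element in the markup. The function and class names are assumptions for this example, and the standard library `html.parser` stands in for whatever parser a proxy actually uses.

```python
from html.parser import HTMLParser


class _MetaEquivCollector(HTMLParser):
    """Collect the first content value for each meta http-equiv name."""

    def __init__(self):
        super().__init__()
        self.equivalents = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if name and content is not None:
            self.equivalents.setdefault(name.lower(), content)


def effective_header(name, http_headers, html_body):
    """Return the HTTP header value, else the meta http-equiv value, else None."""
    lowered = {key.lower(): value for key, value in http_headers.items()}
    if name.lower() in lowered:
        return lowered[name.lower()]
    collector = _MetaEquivCollector()
    collector.feed(html_body)
    return collector.equivalents.get(name.lower())
```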

---

LC-2051 - Open Mobile Alliance Standard Transcoding Interface work - 
Appendix A and D

ACTION-868 - Review OMA STI to see if there's something relevant for CT 
for LC-2051

Francois concluded that it was not relevant

---

LC-1995 - About "recent" HTTP "drafts" - Appendix D.2

PROPOSED RESOLUTION: re. LC-1995, resolve yes, and replace "recent draft 
of HTTP" by "HTTP/1.1"

This resolution deemed taken (like the other one Francois mentioned)

done

---

LC-2047 - Cascading proxies - Appendix D.4 Inter Proxy Communication


RESOLUTION: ref LC-2047 part a. No. We do not view the CT proxy as being 
a user agent in its own right, it is a proxy like any other. Knowing 
that it is upstream of other proxies doesn't alter its prescribed 
behaviour according to this document

RESOLUTION: ref LC-2047 part b. we think that this is defined in HTTP 
and don't need to elaborate on it unless there are specific examples of 
misoperation that we can refer to

RESOLUTION: ref LC-2047 part c. we disagree and think that this is very 
complex and requires a substantial use case analysis to achieve a 
complete understanding. We think that a more complex HTTP vocabulary for 
inter proxy operation is likely to be required to achieve useful 
results, and we are not chartered to create technology of that kind

no action required

RESOLUTION: Add a section with a diagram explaining which proxies are in 
scope

*** not done *** needs to be done *** any offers? ***

---

LC-2038 - is it a list of Best Practices? Be explicit if that's the case

RESOLUTION: ref LC-2038, resolve partial. Answer "no, these are not best 
practices, but guidelines". Don't change the text.

Fair enough, I didn't

---

LC-2049 - forbid the alteration of the request when the URI follows some 
mobile pattern (*.mobi, wap.*, ...)

RESOLUTION: ref LC-2049 resolve no, URI patterns can never be more than 
a heuristic, but we will move the list of examples to a non normative 
appendix


Already done up there ^^ somewhere

---

LC-2053 - Classes of devices

*** Not done, awaiting clarification

---

LC-2072 - what is a restructured desktop experience?

RESOLUTION: ref LC-2072, resolve yes, and insert a termref to 
restructured and an Xref to 4.1.5.3

as noted

---

LC-2073 - heuristics and web sites

RESOLUTION: Ref LC-2073, resolve no, we are not aware of any 
satisfactory heuristics, but understand that CT Proxy vendors will need 
to adopt heuristics of some kind so we have no choice but to leave it open

No change needed

---

LC-2040 - X-Device-* should be in an Internet Draft

Pending ACTION-879 - Ask [someone] about adding IETF headers [on 
François Daoust - due 2008-11-11].

---

ends
Received on Friday, 7 November 2008 18:41:03 GMT
