- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Mon, 19 Nov 2007 13:28:58 -0000
- To: <public-bpwg-ct@w3.org>
Hello everyone. A little later than anticipated and not in a pretty form yet [though you should see this as the editor's draft mentioned in the ACTION], this elaborates Magnus's original text for 2.1 and, taking note of the threads mentioned below, goes on to propose text for subsequent sections. These make reference to, but have not been merged with, the original contributions on sections 2.2 and 2.3 from Sean Patterson and Aaron, which are included verbatim below. Both Sean and Aaron make significant points about the advantages of content transformation and why this can have a positive impact on the user experience. These points need capturing, but I think probably as a consolidated preamble. Bear in mind also that the Landscape document is actually supposed to be the place where such points are discussed: if it doesn't at present make these points clearly enough then we need a further revision of that document to make sure that the points made in these contributions are noted there.

In this draft, I've taken a slightly orthogonal approach to what we originally thought, which is to follow the course of a request and response and identify what each of the participants in its path is meant to do. Consequently the chapter outline as originally envisaged has not been followed in detail. Once this all has become a little more fleshed out, we might decide to rethink the sections in the document, but no need to worry about that for now.
I have tried not to confine discussion to HTTP based signaling, as I think the following require mention at least as heuristics, if not recommended practice, as they do play a role:

a) a priori knowledge of device characteristics, as gleaned from a DDR;
b) administrative arrangements, white lists etc.;
c) heuristics, such as knowing which content types and DTDs are specifically mobile, looking for the presence of "handheld" in style sheets and @media attributes, looking for mobileOK labels;
d) user interaction.

In reference to one of Bryan's contributions, user interaction needs more thought and discussion - on the one hand we don't want to interrupt the user experience with excise tasks, yet on the other, in the end, the user must act to signal their intentions and this needs noting. E.g. there could be a note that the host should provide interactions that allow the user to have a choice of presentations and so should the proxy and the client, for that matter.

Another as yet unopened Pandora's box is that the discussion and proposed text below looks at the issues primarily from the point of view of "varying presentation from Thematically consistent URIs". What hasn't, as yet, been explored is how it all works if there is a common entry point to a site (Thematically consistent URI for a home page) which then dispatches via redirect to media specific versions. This is possibly rather more common than the previous case (e.g. redirect to example.com/mobile - or rather better, imo, example.mobi). Naturally, there will also be varying presentation even within a redirected solution. This whole area needs further thought.

Whatever we come up with does of course have to deal with conforming and non conforming and transforming and non-transforming proxies. There isn't, as yet, a use case analysis, it is a bit too soon for that, I think. The philosophy here should be in line with existing HTTP practice, which is to fall back to safe behavior.
Thus, when trying to distinguish reformatting behavior from recoding behavior, the objective is to fall back to "safe" known HTTP/1.1 practice for non conforming (unaware) and say things like: Cache-Control: no-transform, allow-reencode as this will result in a stricter interpretation by unaware participants. This behavior is discussed in detail in HTTP section 14.9.6 (reproduced below in this note for your convenience and see Sean's detailed list of references to points in the HTTP spec that need to be included also). This, of course, immediately introduces the question as to whether we are over stepping the mark in introducing such extensions, and I think we need to be clear about that before going further. On the one hand HTTP makes it clear, in explaining how to introduce extensions that it expects such extensions to be introduced. On the other hand, we do typically take a conservative approach and say if it is not in the IANA registry then it's not an existing protocol and therefore beyond our scope. Introducing extensions to existing header values, to my mind falls short of introducing new headers. Though it's not clear that we can do what we need to if we don't do that, go through IANA registration and so on. I think that we are going to need to do that and suggest we speak to this point tomorrow on our call, if necessary by joining forces with a group that is actually chartered to "invent new protocols". The alternative being a much more insipid document that only gets to a small subset of the problem. I'd also like to bring the group's attention to the following RFCs: RFC 2506 Media Feature Tag Registration Procedure RFC 2295 Transparent Content Negotiation RFC 2296 Remote Variant Selection Algorithm RFC 2295 is experimental, but actually gets to some of the points we want to make, though doesn't exactly address what we are doing. It's rather a lengthy and detailed read, and has a lot of features that we don't need. 
It does, however, introduce a couple of headers and field values which have been IANA registered. Also, the main points of the negotiation are implemented in Apache in mod_negotiation (see [APACHE]).

[APACHE] http://httpd.apache.org/docs/2.2/content-negotiation.html

IANA registration is probably a bit of a nuisance, and may be something we don't need to do - e.g. it would seem that the q parameter for content type and much else is not registered. For those of you who fancy a bit of train spotting, I think you'll find registered things at [IANA], though I confess I find this all a bit impenetrable and difficult to navigate.

[IANA] http://www.iana.org/numbers.html

I have tried to take into account the contributions and discussions on the list, especially those threads starting at the following points. Some are quite lengthy threads and can be followed with the "Next in Thread" link:

Magnus's original proposal for 2.1 [1] elaborated in the text below
[1] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Sep/att-0014/00-part

Sean Patterson's original proposal for 2.3 [2] points included in the text and included verbatim
[2] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Sep/0029.html

Aaron's contribution for section 2.3 [3] points included in the text and included verbatim
[3] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Sep/0025.html

Pointer to ISSUE-222 TAG Finding on Alternative Representations
[4] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Oct/0011.html

Pointer to ISSUE-223 (Jo's CT Shopping List): Various Items to Consider for the CT Guidelines
[5] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Oct/0012.html

Pointer to ACTION-575 Techniques for Guidelines Document
[6] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Oct/0023.html

Scope of CT Guidelines
[7] http://lists.w3.org/Archives/Public/public-bpwg-ct/2007Oct/0041.html

And with that extremely lengthy preamble and disclaimer, here goes:

___________

.Overview
The purpose of this section is to explore the need for actors (clients, proxy servers, gateways, origin servers, etc.) to communicate with each other, and also to suggest guidelines for doing so. The relevant scenario involving a content transformation proxy is as follows:

client browser <---HTTP---> content transformation proxy <---HTTP---> origin server

There may be other scenarios as well but they will initially be ignored for the sake of simplicity. The needs of these three actors are as follows:

1. The client browser needs to be able to tell the content transformation proxy:
   a. what media-type (presentation format e.g. desktop, handheld) is desired.
   b. that all content transformation should be avoided, or that reformatting is allowed/desired
   c. what type of mobile device and what user agent is being used
   d. that the device has (zoom, linearize, keyhole) presentation [@@??]

2. The content transformation proxy needs to be able to tell the origin server:
   1. that some degree of content transformation (re-coding and reformatting) can be performed
   2. Content transformation will be carried out unless instructed not to.
   3. that content is being requested on behalf of something else.
   4. about the delivery context (for example mobile device type and user agent).
   5. That the request headers have been altered (e.g. additional content types inserted) [??]

3. The origin server needs to be able to tell the content transformation proxy:
   1. that content is already optimized and no additional transformation is required (or that it should not be restructured but may be recoded)
   2. that it's OK to perform additional content transformation.[??]
   3. That it varies its presentation
   4. That it has media-specific presentations
   5. I can't/don't wish to handle this request in its present form
   6. That request headers should/should not be modified

4. The content transformation proxy needs to be able to tell the client browser:
   1. the status of the content: it is reformatted/recoded/untouched;
   2. where to find the original content if it has been transformed. [@@ should this read "how", or do we suppose that there are "magic" mechanisms/URIs for by-passing proxies?]

..Objectives

In satisfying these requirements existing HTTP headers and directives and behaviors must be respected. However, not all of the features required can be achieved without extensions to the behaviors defined in [RFC 2616]. Knowing that many actors will be unaware of any HTTP extensions, special consideration needs to go into making sure that the fall-back behavior - i.e. strict adherence to HTTP/1.1 - is "safe". For example, if there is no standard way for a client browser to specify that all content transformation should be avoided in a request, then we must define a default behavior for a well-behaved content transformation proxy that receives a request from such a client.

[@@ other principles behind what we are trying to do - e.g. noting Sean's point that there is a wide diversity of different devices that all fall under the simple appellation of "handheld".]

..Types of Proxy

HTTP defines two types of proxy: transparent proxies and non-transparent proxies. As discussed in Section 1.3 [HTTP], Terminology:

A "transparent proxy" is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification. A "non-transparent proxy" is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering. Except where either transparent or non-transparent behavior is explicitly stated, the HTTP proxy requirements apply to both types of proxies.
This document elaborates the behaviour of non-transparent proxies, when used for content transformation in the context discussed in [Content Transformation Landscape] and henceforward referred to as transforming proxies. ..Types of Transformation Transforming proxies can carry out a wide variety of operations. To carry out an exhaustive survey of those operations and to discuss means of server or client side control of them is beyond the scope of this document. In this document we categorize this rich vocabulary of possible operation into two types: 1) Alteration of Request Headers 2) Alteration of Responses Alteration of responses is further sub-categorized into a) restructuring content; b) recoding content; c) optimizing content. Restructuring content is a process whereby the original layout is altered so that content is added or removed or where the spatial or navigational relationship of parts of content is altered, e.g. by linearization or pagination. Recoding content is a process whereby the layout of the content remains the same, but details of its encoding may be altered. Examples include re-encoding HTML as XHTML, correcting invalid markup in HTML, conversion of images between formats (but not, for example, reducing animations to static images). Optimizing content means removing redundant white space, recompressing images (without loss of fidelity), zipping for transfer ... ..Alteration of HTTP Requests and Responses Alteration of HTTP requests and responses is not prohibited by HTTP other than in the circumstances referred to in [HTTP] section 13.5.2. This document describes how the Client and the Destination Server may require conforming transforming proxies not to alter HTTP requests and responses. 
..Control by Client/User A transforming proxy gains knowledge of whether a user requests alteration of requests and responses by: a) Administrative arrangements between the provider of the proxy and the end user; b) As a result of the request containing an indication that changing the request headers must not be carried out; c) Direct interaction with the User; d) Other means. ..Control by Server A transforming proxy gains knowledge of whether a server permits alteration of requests and responses by: e) Administrative arrangements between the provider of the server and the provider of the proxy; f) For requests, by having previously received an indication from the origin server as a response to a request [for a resource on the path that this request is in scope of] that transformation of headers is not permissible; g) For responses as a result of the response containing indications as to the servers intentions - including mobileOK labels; h) Other means. Aside from b) f) and g) above, these techniques are generally out of scope of this document, however use of knowledge gleaned for sources other than HTTP is referred to below. Transforming proxies SHOULD allow the overriding of standing administrative arrangements on a request by request and response by response basis. .Behavior of Components ..Client Request to Proxy The client may request that the Content-Type and Content-Encoding MUST NOT be altered in the response by setting the Cache-Control: no-transform directive. The client may add a [@@preserve-headers directive] to indicate that transforming proxies MUST NOT alter other aspects of the request headers, except as permitted by HTTP/1.1 to allow correct operation of caching functions [want to say that do not affect transparency, but that is probably not technically exact]. The [@@preserve-headers directive] may only be present in addition to the no-transform Cache-Control directive. 
The client may add an [@@allow-recode directive] to the Cache-Control: no-transform directive, indicating that the proxy MAY change the format of the response but not restructure the content. The client may add an [@@allow-compress] to the Cache-Control: no-transform directive, meaning that a proxy MAY remove redundant white space, recompress images or change the Content-Encoding (to use gzip, from identity, for example). The client may also add [@@preferred-medium directive] indicating a preference for a presentation style. The [@@preferred-medium directive] has the form media=presentation-format (as described in RFC ..., current values of the presentation format-directive are taken from IANA ... and include "screen" and "handheld"). [It would be nice if the client were able to indicate what type of presentational capabilities it has, for example, zoom, linearize, keyhole ... @@@ client-feature indication] ..Proxy Request to Server If the request contains a Cache-Control: no-transform directive [@@or any of the other directives specified in previous section] the proxy MUST forward the request unaltered to the server. If there are no [@@ such directives] present in the request from the client, and there is no indication from a downstream proxy that it intends to transform [@@ see I will transform below] the proxy SHOULD analyze whether it intends to offer transformation services by referring to any administrative arrangements that are in place with the user of the client, or the server, and any a priori knowledge it has of client capabilities [@@ from a DDR and so on]. Knowing that the client has available a linearization or zoom capability the proxy SHOULD NOT attempt to offer that service. Knowing that a client is capable of a broad range of formats the proxy SHOULD NOT offer to recode content.
If as a result of this deliberation it intends to restructure the proxy MUST indicate this by including a [@@@ I will transform (restructure / reformat / compress)] - [@@ and even if it doesn't it MAY indicate its potential for restructuring or recoding or compressing content [@@by means of ...]. The proxy MUST include a Via HTTP header indicating its presence. Proxies MUST NOT intervene in https and SHOULD NOT intervene in methods other than GET and HEAD. ...Alternative 1 When altering the Accept HTTP header, the proxy SHOULD indicate any formats that it intends to recode for delivery by assigning a lower q factor (indicated by the q parameter) than those natively supported and should, in addition,[@@extension] add a further transform parameter indicating that the format is not natively supported by the client. e.g. Accept: image/jpeg, image/gif, image/png;q=0.7;[@@transform] When altering the User-Agent HTTP Header the proxy MUST indicate this change by adding a [@@ User Agent Modified indication with the Original User-Agent indicated] If other HTTP header fields are altered then the proxy MUST be prepared to re-issue the request as received from the client on receipt of a Vary header in the response indicating that the server offers variants of its presentation according to any of the HTTP header fields that have been modified. ...Alternative 2 When altering the Accept HTTP header, the proxy SHOULD indicate any formats that it intends to recode for delivery by assigning a lower q factor (indicated by the q parameter) than those natively supported. e.g. Accept: image/jpeg, image/gif, image/png;q=0.7 If other HTTP header fields are altered then the proxy MUST be prepared to re-issue the request as received from the client on receipt of a Vary header in the response indicating that the server offers variants of its presentation according to any of the HTTP header fields that have been modified. 
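To make the Accept and Via handling a little more concrete, here is a rough Python sketch of what Alternative 2 might look like inside a proxy. It is only an illustration under assumptions: the function names, the plain dict used for headers and the default q value of 0.7 (taken from the example above) are all invented for the purpose, and the [@@transform] parameter of Alternative 1 is deliberately left out because it is not yet defined.

```python
def augment_accept(accept_value, recodable_types, q=0.7):
    """Append media types the proxy can recode, at a lower q than native types.

    Types the client already lists are left alone, so natively supported
    formats keep their (implicit) q=1 preference.
    """
    native = [part.strip() for part in accept_value.split(",") if part.strip()]
    # Media types already named by the client, parameters stripped.
    already = {part.split(";")[0].strip().lower() for part in native}
    extra = [f"{t};q={q:g}" for t in recodable_types if t.lower() not in already]
    return ", ".join(native + extra)


def append_via(headers, pseudonym, protocol_version="1.1"):
    """Return a copy of the headers with this proxy appended to Via."""
    updated = dict(headers)
    entry = f"{protocol_version} {pseudonym}"
    existing = updated.get("Via")
    updated["Via"] = f"{existing}, {entry}" if existing else entry
    return updated


request = {"Accept": "image/jpeg, image/gif"}
request["Accept"] = augment_accept(request["Accept"], ["image/png"])
request = append_via(request, "ct-proxy.example")  # hypothetical pseudonym
print(request)
```

A real proxy would also need to keep the headers as received, so that it can re-issue the original request when a Vary response comes back.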
..Server Response to Proxy If the server varies its presentation according to examination of received HTTP Headers then it MUST include a Vary HTTP header indicating this to be the case. If, in addition to, or instead of HTTP headers, the server varies its presentation on other factors (source IP Address ...) then it MUST include a * as one of the fields in the Vary response. The server MUST include a no-transform directive if one is received from the client. If it is capable of varying its presentation it SHOULD take account of client capabilities [@@as derived from a DDR etc.] and formulate an appropriate experience according to those criteria. If the server has distinct presentations according to its perception of the presentation media, then the medium for which the presentation is intended SHOULD be indicated [@@using the ...] If the client has requested a specific presentation using the [@@ directive] the server should provide a presentation of that kind. e.g. if the server would ordinarily provide a handheld experience but the client requests a screen experience the screen experience should be provided. And vice versa, of course. If the server creates a specific user experience for certain presentation media types it SHOULD inhibit transformation of the response by including a no-transform directive. The server SHOULD NOT prohibit recoding or compression of its content unless it has specific reasons not to allow it [including that this has been requested by the client] and hence should in general add a [@@allow-recoding or allow-compression] directive when adding a no-transform directive. Note that including a no-transform directive may [@@SHOULD actually] disrupt the behaviour of WAP/WML proxies, because this inhibits such proxies from converting WML to WMLC (because this is a content-encoding behavior). Adding [@@allow-recoding] or [@@allow-compression] is unlikely to be recognized in the short-term by such proxies which predate these guidelines. 
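As a rough illustration of the Vary and no-transform rules in this section, the following Python sketch shows how an origin server might assemble those two response headers. The function name, its parameters and the use of a plain dict with canonical header names are assumptions made for the example. The [@@allow-recoding] and [@@allow-compression] directives are omitted since their syntax is still open.

```python
def response_headers(request_headers, varied_on=(), other_factors=False,
                     media_specific=False):
    """Build Vary and Cache-Control for a response.

    varied_on      -- request header names the server used to pick a presentation
    other_factors  -- True if something outside the headers (e.g. source IP) was used
    media_specific -- True if the presentation was made for a specific medium
    """
    headers = {}
    vary = list(varied_on)
    if other_factors:
        vary.append("*")
    if vary:
        headers["Vary"] = ", ".join(vary)

    # Echo no-transform if the client asked for it, or add it when the
    # server has produced a medium-specific experience.
    request_cc = request_headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in request_cc.split(",")}
    if "no-transform" in directives or media_specific:
        headers["Cache-Control"] = "no-transform"
    return headers


print(response_headers({"Cache-Control": "no-transform"}, varied_on=["User-Agent"]))
```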
Servers MAY base their actions on a priori knowledge of behaviour of transforming proxies, when they are identified in a Via header. The server SHOULD NOT choose a Content-Type for its response based on its assumptions about the heuristic behavior of any intermediaries. (e.g. it should not choose content-type: application/vnd.wap.xhtml+xml solely on the basis that it suspects that transforming proxies will apply heuristics that make them not restructure it). If servers provide only limited variants of presentation they SHOULD consider providing a rich presentation and allowing a transforming proxy to reduce this - which may result in a richer experience for the user than providing a basic handheld experience only, say. 406 Response - Note that some clients (MSIE for instance) don't display the body of a 406 response; this is in contravention of HTTP/1.1 as far as I can see. Vary headers in 406 response - restrict to the one(s) that have caused the 406. In general, successful responses are done with 200 OK Vary: User-Agent, Accept, Accept-Language etc. e.g. MS doesn't want you to do updates except with IE, so they should say 406 Vary: User-Agent (but note that IE doesn't display the body of 406 responses). Servers should respond with a 406 not a 200 if they can't handle the request and should indicate that they permit header alteration in that 406. Servers should provide information about alternative representations by using the Vary header (if the alternatives are available from the same URI) or using link information if alternative representations are handled by different URIs. [This restricts to HTML for now. If link headers are reinstated in HTTP then this becomes a more universal mechanism. Open question as to whether SVG or WICD etc. support any such notion] [@@300 Response - could this be used as a signal from the server to say that it understands the protocol? A la RFC 2295] ..
Proxy Receipt of Response from Server If the proxy has altered any of the HTTP request headers, and it receives a Vary response from the server it should re-make the request with the original headers and forward the subsequent response without restructuring it, irrespective of the contents of the subsequent response. The proxy SHOULD take note of this and SHOULD NOT vary headers for subsequent requests, unless requests are subsequently received with the Vary header [@@ + note on backoff below] [@@note that loop detection and elimination is needed here] .. Proxy Response to Client If the response includes a Warning: 214 Transformation Applied the proxy MUST NOT apply further transformation. If the response includes a Cache-Control: no-transform directive that is not modified by [@@ other directives on recoding] then the response MUST be forwarded to the client unaltered. In the absence of a Vary or no-transform directive the proxy SHOULD apply heuristics to the content to determine whether it is appropriate to restructure or recode it (in the presence of such directives, heuristics SHOULD NOT be used.) e.g. a. The server has previously shown that it is contextually aware, even if the present response does not indicate this - modified by a need for the proxy to be aware that the server has changed its behavior and is no longer aware in that way b. the content-type is known to be specific to the device or class of device e.g. application/vnd.wap.xhtml+xml c. examination of the content reveals that it is of a specific type appropriate to the device or class of device e.g. DOCTYPE XHTML-MP or WBMP or [@@mobile video] [@@ note Sean's extensive list of heuristics that should be included as an informative example?] d. The response is an HTML response and it includes <link> elements specifying alternat(iv)es according to media type [or that such links are included as HTTP headers] or that the content has a mobileOK label. 
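The decision order described in this section could be sketched roughly as follows in Python. This is a sketch under assumptions, not proposed text. The function names and return labels are invented, and only heuristic b (a known mobile content type) is modelled. Treating that heuristic as a reason to leave the content alone is my reading, and the Warning parsing is deliberately naive.

```python
# Heuristic b: content types assumed to be specific to mobile devices.
MOBILE_CONTENT_TYPES = {"application/vnd.wap.xhtml+xml"}


def has_directive(headers, directive):
    """True if the Cache-Control header carries the given directive."""
    value = headers.get("Cache-Control", "")
    return directive in {d.strip().lower() for d in value.split(",")}


def already_transformed(headers):
    """True if any Warning value starts with code 214 (naive comma split)."""
    warnings = headers.get("Warning", "").split(",")
    return any(w.strip().startswith("214") for w in warnings)


def proxy_action(headers):
    """Classify a server response before it is sent on to the client."""
    if already_transformed(headers) or has_directive(headers, "no-transform"):
        return "forward-unaltered"
    if "Vary" in headers:
        # Explicit signal from the server: heuristics should not be used.
        return "no-heuristics"
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type in MOBILE_CONTENT_TYPES:
        return "forward-unaltered"
    return "may-transform"


print(proxy_action({"Content-Type": "text/html"}))
```

Heuristics a, c and d need state or a look at the body, so they are left out here.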
If the proxy alters the content then it MUST add a Warning: 214 Transformation Applied HTTP Header .. Client Action on Receipt of Response [@@ discussion of what to do on receipt of Warnings etc.] . Encoding of [@@new] Features preferred-medium = screen; and so on [@@TBD] .Use Case Analysis Client Proxy Server Unaware Unaware Unaware etc. [@@TBD] .Testing All ... must be tested for deleterious effects ... [@@TBD] Providers of transforming proxies SHOULD make available interfaces that facilitate testing of Web sites accessed through them. [@@ though how they should make known how to do this and what administrative arrangements would be needed are both probably out of scope] ______________________________________________ Sean Patterson's contribution under ACTION-550 ______________________________________________ 2 Guidance for Delivery Chain Component Developers 2.3 Guidance for Content Transformation Server Developers Content transformation servers have the ability to transform content into a form that is suitable for a requesting entity's delivery context. However, a content transformation server that is invisible from browsers and other servers on the network can cause problems. These problems include transforming content that should not be transformed, multiple transformations, and sub-optimal transformation. This section contains guidelines for developers of content transformation servers to help avoid these problems. 2.3.1 The Need for Content Transformation Servers 2.3.1.1 Variation of device capabilities While there are many mobile devices in existence today that give their users the ability to browse the web, the majority of devices are not capable of accessing web content. Even for those devices that can access the internet, there are large variations in their web browsing capabilities. Content transformation servers can transform web content into a form that works well on any particular device. 
2.3.1.2 Most content is not designed for mobile devices The majority of web sites are designed for users of desktop (or laptop) computers. These computers have large screens, a mouse, full-size keyboards, fast CPUs, large amounts of memory, and are fully connected to the Internet, typically at broadband speeds. Mobile devices (especially mobile phones) normally have none of these characteristics. Regular web content frequently assumes that it will be displayed using the hardware of a desktop computer. Content transformation servers can reduce the hardware requirements of the content so that it works better on a mobile device. 2.3.1.3 Most content is not designed for mobile browsers Most web content is designed to be displayed on web browsers that run on desktop computers. These are full-featured browsers that can display web sites that use complex HTML, CSS, and JavaScript as well as multimedia content such as Flash and video. In addition, most desktop web sites assume that the user has a mouse or other pointing device. Mobile devices frequently have much more limited web browsers. Regular web content may not display properly or at all on the web browser in a mobile device. Even if a desktop web site displays reasonably well, it may be difficult to use on a mobile phone. Content transformation can transform the content into a simpler form that can be displayed and used on a mobile browser. 2.3.1.4 Variation of mobile content There is a wide variation of what is considered "mobile content." Mobile content that is designed for a high-end mobile device may not display well or be useable on lower-end mobile devices. In this case it makes sense for a content transformation server to transform the content developed for a higher-end mobile device into content that is suitable for a lower-end device. 
2.3.1.5 Eliminates the need for a least common denominator solution One approach to the problem of the variation of mobile devices is to create a "least common denominator" page that works on all (or almost all) mobile devices. This approach is simpler than having multiple versions of the page (see the next section), but limits the end user experience. An example of a least common denominator approach is writing content that will work with the "Default Delivery Context" (DDC) defined in the "Mobile Web Best Practices 1.0" W3C Proposed Recommendation [1]. The "Default Delivery Context" outlines the baseline characteristics that a device must implement in order to be suitable for browsing the web. If a content transformation server exists on the network, the least common denominator approach is not necessary. Instead, a rich version of the site can be created with the knowledge that it will be "reduced down" for any requesting entity that is less capable. 2.3.1.6 Reduces the need for multiple versions of a site Another way to handle the variation of mobile devices is to create multiple versions of a web site to deal with the multiple types of mobile devices that can access the site. This approach is costly to establish and maintain across the increasingly diverse range of handsets available. When a content transformation server exists in the network, the need to create multiple versions for different mobile devices is reduced. Again, a single, rich version of the site can be created and easily maintained. 2.3.1.7 A content transformation server can do a better job of following mobile best practices The "Mobile Web Best Practices 1.0" W3C Proposed Recommendation [1] contains many recommendations for authoring content that is intended for viewing on a mobile device.
A well-designed content transformation server can do a better job of following the mobile best practices than a human author, especially when taking into account the capabilities of the many different mobile devices. The result will be a more consistent, uniform experience. 2.3.2 Guidelines of how content transformation servers should communicate with the rest of the delivery chain 2.3.2.1 Identifying the content transformation server HTTP 1.1 requires that all proxy servers append a string to the Via header [2] for any request or response they forward. This string consists of the name of the protocol of the received message, the version number of the protocol, the hostname (or a pseudonym if the hostname is sensitive information), and an optional comment. (The name of the protocol is assumed to be HTTP if not specified.) Content transformation servers should identify themselves in the comment of the string they put in the Via header. Here is an example where a content transformation server at zzz.net adds itself to the Via header: Via: 1.1 nowhere.com (Apache/1.1), 1.1 zzz.net (CT-Server-2000/1.0) Unfortunately, the HTTP 1.1 protocol specification [3] allows subsequent servers that receive the message to remove comments in the Via header. So, while it is recommended that content transformation servers identify themselves in the Via header, it is not always reliable. A more reliable method for identifying a content transformation server is to use the X-Mobile-Gateway header. The syntax of the X-Mobile-Gateway header is as follows (expressed in Augmented BNF form as described in [4]): X-Mobile-Gateway = "X-Mobile-Gateway" ":" 1*( product | comment ) An example would be: X-Mobile-Gateway: CT-Server-2000/1.0 (Server-Only; Linux i686; en-US), Super-CT-Server/2.0 (Headers, Footers; MS Windows XP i686; en-US) The syntax for each content transformation server in the X-Mobile-Gateway header is the same as for the User-Agent and Server headers. 
It is recommended that the value of this header contain the product name and version of the content transformation server as well as a comment in parentheses that contains useful characteristics of the content transformation server separated by semicolons. See [5] for the syntax of "product".

Each subsequent content transformation server in the request/response chain appends its information to the end of the X-Mobile-Gateway header. In contrast to the Via header, content transformation servers are only allowed to append to the end of the X-Mobile-Gateway header; no other modifications are allowed.

2.3.2.2 The User-Agent header

It is frequently necessary for content transformation servers to replace the User-Agent header in requests with a value that is the same as used by a desktop browser. For example, the content transformation server might use the following User-Agent header:

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6

Although web servers are technically supposed to base the content they send to browsers on the Accept header [6], it is very common for web servers to use the User-Agent header to make decisions about the content to return to a particular browser. For example, a web site that has both a desktop and mobile version may examine the User-Agent header and send the desktop version of the site if the User-Agent is recognized as a desktop browser and return the mobile version of the site if the User-Agent is recognized as a mobile browser on a mobile device.

Content transformation servers typically want the origin server to send the desktop version of the site since the desktop version is usually more functional. This is the reason that content transformation servers frequently send a User-Agent header from a desktop browser.
If the origin server needs to know what the actual User-Agent header is from the original device that made the request, it can examine the X-Device-User-Agent header (see section 2.3.2.3).

2.3.2.3 Identifying the mobile browser

Since content transformation servers typically replace the User-Agent header in the original request from the mobile browser with a desktop User-Agent string, there needs to be a way for the origin server to identify the mobile browser that made the original request. This is done with the X-Device-User-Agent header. The syntax for the X-Device-User-Agent header is as follows:

X-Device-User-Agent = "X-Device-User-Agent" ":" 1*( product | comment )

(The syntax is the same as for the User-Agent header.)

When a content transformation server replaces the User-Agent header with a desktop User-Agent string, an X-Device-User-Agent header should be added to the request and the original User-Agent value from the mobile browser should be copied without modification to the X-Device-User-Agent header. This will allow the origin server to detect the type of mobile browser and mobile device that made the request if it needs this information. Content transformation servers should not modify the X-Device-User-Agent header if it already exists.

2.3.2.4 Determining whether or not a web page should be transformed

There are times when the origin server wants a web page to be sent to the mobile web browser unchanged. The origin server can signal that it does not want a web page to be transformed by a content transformation server (or any other proxy) by using the Cache-Control [7] header. The no-transform directive [8] is used to specify that the entity body of a response from the origin server should not be modified.

Cache-Control: no-transform

The Cache-Control header must be honored for both requests and responses. A content transformation server must not modify the entity body of any request or response that uses the Cache-Control: no-transform header.
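To make the request-side rules in 2.3.2.1 to 2.3.2.3 concrete, here is a minimal Python sketch of how a content transformation server might rewrite the headers of a request before forwarding it. It is illustrative only: the function names, the plain dict of canonically cased header names, and the comma used to join X-Mobile-Gateway entries (taken from the example above rather than from the BNF) are all assumptions, not part of any proposal.

```python
# Desktop User-Agent value taken from the example in 2.3.2.2.
DESKTOP_UA = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) "
    "Gecko/20070725 Firefox/2.0.0.6"
)


def append_value(headers, name, value):
    """Append value to the end of a comma-separated header (hypothetical helper)."""
    existing = headers.get(name)
    headers[name] = f"{existing}, {value}" if existing else value


def rewrite_request_headers(headers, host, product, comment=""):
    """Return a new dict of request headers as the server would forward them.

    Assumes `headers` uses canonical header names as keys; a real proxy
    would need case-insensitive lookup.
    """
    out = dict(headers)

    # 2.3.2.1: identify the server in the comment of its Via entry ...
    append_value(out, "Via", f"1.1 {host} ({product})")
    # ... and append (never modify) its entry in X-Mobile-Gateway.
    gateway = f"{product} ({comment})" if comment else product
    append_value(out, "X-Mobile-Gateway", gateway)

    # 2.3.2.3: copy the device's User-Agent unchanged, unless an earlier
    # content transformation server has already recorded it.
    original_ua = out.get("User-Agent")
    if original_ua is not None and "X-Device-User-Agent" not in out:
        out["X-Device-User-Agent"] = original_ua

    # 2.3.2.2: present a desktop User-Agent to the origin server.
    out["User-Agent"] = DESKTOP_UA
    return out
```

Calling rewrite_request_headers a second time on the result, as a second server in the chain would, leaves X-Device-User-Agent untouched and only appends to Via and X-Mobile-Gateway.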
In addition there are a handful of headers that should not be modified as well. See [9] for a list of those headers. The Cache-Control: no-transform header can be added by content transformation servers but it should not be modified by content transformation servers.

2.3.2.5 Notification that transformation has been applied

If a content transformation server makes changes (i.e., transformations) to the entity body in a response, the content transformation server must set the Warning header [10] to "214":

Warning: 214 zzz.net "Transformation applied"

This lets the browser and any other content transformation servers in the request/response

2.3.2.6 Identification of mobile content

Content can be identified as intended for mobile browsers by one of the following methods:

* The Content-Type header of the response is one of the following values:
  o application/vnd.wap.xhtml+xml
  o text/vnd.wap.wml
* The document type of the response document is
  o <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
  o <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.1//EN" "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile11.dtd">
  o <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.2//EN" "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile12.dtd">
  o <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
  o <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">
* There is a link element in the response document with a media attribute that has a value of "handheld" that points to a mobile document. Here is an example:
  <link rel="alternate" media="handheld" href="http://www.mobileversion.com/" />

Origin servers that want to present a choice to the user of whether to view the desktop version of a web page or the mobile version may use this technique.
(The mobile browser would need to have the capability of presenting the choice to the user for this to work.)

Identifying mobile content is important when the content transformation server is deciding which transformations to apply to the response content received from the origin server.

* If the response content is identified as mobile, the content transformation server should be conservative and try to perform only non-layout and non-format changing transformations. For example, it would be OK to accelerate the content (by removing non-layout whitespace, non-lossy compression, etc.), add a header and/or footer to the page, apply content corrections, etc. It would be less desirable to remove HTML tables, change the size and/or format of an image, etc. However, if the content returned from the origin server uses features that the content transformation server "knows" that the client device does not support (e.g., by examining the User-Agent header sent by the mobile web browser), it is permissible to make more extensive changes to make the content more suitable for the client device. For example, if an origin server returns an image in GIF format to a device that does not support GIF images, it would be OK for the content transformation server to transform the image into a different format that the client device did support.
* If the response content is not identified as mobile, and there is no Cache-Control: no-transform header, the content transformation server should perform all reasonable transformations on the response.
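As a rough illustration of the response-side decision described in 2.3.2.4 to 2.3.2.6, the following Python sketch classifies a response and picks how far a content transformation server should go. The function names, the three policy labels, the plain dict of canonically cased header names and the regular expression for the link element are assumptions made for this example; the sketch also leaves out the exception for features the device is known not to support.

```python
import re

# Content types and DOCTYPE public identifiers listed in 2.3.2.6.
MOBILE_CONTENT_TYPES = {"application/vnd.wap.xhtml+xml", "text/vnd.wap.wml"}
MOBILE_DOCTYPE_IDS = (
    "-//WAPFORUM//DTD XHTML Mobile 1.0//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.1//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.2//EN",
    "-//W3C//DTD XHTML Basic 1.0//EN",
    "-//W3C//DTD XHTML Basic 1.1//EN",
)
# Crude match for a link element whose media attribute is "handheld".
HANDHELD_LINK = re.compile(
    r"""<link\b[^>]*\bmedia\s*=\s*["']handheld["']""", re.IGNORECASE
)


def has_no_transform(headers):
    """True if Cache-Control carries the no-transform directive."""
    directives = headers.get("Cache-Control", "").split(",")
    return any(d.strip().lower() == "no-transform" for d in directives)


def is_mobile_content(headers, body):
    """Apply the three identification methods of 2.3.2.6 in order."""
    media_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if media_type in MOBILE_CONTENT_TYPES:
        return True
    if any(public_id in body for public_id in MOBILE_DOCTYPE_IDS):
        return True
    # Third method, taken literally from the list in the text.
    return HANDHELD_LINK.search(body) is not None


def transformation_policy(headers, body):
    """Return 'none', 'conservative' or 'full' (hypothetical labels)."""
    if has_no_transform(headers):
        return "none"
    if is_mobile_content(headers, body):
        return "conservative"
    return "full"


def mark_transformed(headers, host):
    """Return a copy of the response headers with Warning 214 added (2.3.2.5)."""
    out = dict(headers)
    warning = f'214 {host} "Transformation applied"'
    existing = out.get("Warning")
    out["Warning"] = f"{existing}, {warning}" if existing else warning
    return out
```

A server would call transformation_policy on each response and, whenever it actually changes the entity body, pass the response headers through mark_transformed before sending them on.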
References

[1] http://www.w3.org/TR/mobile-bp/
[2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.45
[3] http://www.w3.org/Protocols/rfc2616/rfc2616.html
[4] http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.1
[5] http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.8
[6] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1
[7] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
[8] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.5
[9] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.2
[10] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46

________________________

_____________________________________
Aaron's Contribution under ACTION-551
_____________________________________

2.3 Guidance for Content Transformation Server Developers

Most mobile devices have a limited capacity for receiving and displaying content that was originally designed for a desktop browsing environment. A content transformation server may be used to adapt desktop content in such a way that it may be successfully retrieved and rendered by a mobile device. A few of the well known limitations include:

* Poor or non-existent support for markup other than well-formed XHTML
* Limited image format support (e.g., JPEG only)
* Limited memory capacity for document retrieval and processing
* Poor or non-existent HTTPS support
* Poor or non-existent CSS support

In many cases, sending a mobile content that it was not prepared to process will cause serious failures, often forcing the user to reset the device. A content transformation server can ensure that the content will be suitable for display on the device, allowing the user to access the information they desire.

Even in cases where no actual content transformation is strictly necessary, a content transformation server can improve the experience greatly by reducing the amount of data that must be transferred to the mobile.
Decreasing the number of connections required by in-lining style sheets and other resources can also dramatically reduce the amount of time spent retrieving and rendering page content.

Some websites provide a mobile alternative that is suitable for display on some mobile devices. Unfortunately, the vast majority of websites do not, and those that do often cater only to a small subset of the mobile devices that are in active use. Furthermore, many sites actively detect and divert non-desktop browsers to "incompatible browser" pages and the like, preventing the user from seeing any content at all. In these situations, a content transformation server that "pretends" to be a desktop browser on behalf of the mobile can provide a better experience by retrieving and processing the original desktop-oriented site.

In the event that a website author does provide a viable mobile alternative, any content transformation servers in the delivery chain should recognize this content as acceptable for mobile display and not attempt to modify it.

In order to increase the chances that a website will provide a viable mobile alternative, content transformation servers should preserve and pass on any information about the delivery context that is available. This includes but is not limited to preserving the HTTP User-Agent and Accept headers.

[This is an issue I am not actually sure what we want to do about. On the one hand, we need to present valid device information to the origin server so that it may provide a mobile experience, but we also want to masquerade as a desktop browser to cover the (much more common) case where the site will refuse to send content for unknown user agents. There are several possible strategies, but we will need to come up with one that we can all agree on to present here.]
[TODO: Details of how content transformation servers communicate with the rest of the delivery chain]

________________________
Juicy Excerpts from HTTP
________________________

14.9.6 Cache Control Extensions

The Cache-Control header field can be extended through the use of one or more cache-extension tokens, each with an optional assigned value. Informational extensions (those which do not require a change in cache behavior) MAY be added without changing the semantics of other directives. Behavioral extensions are designed to work by acting as modifiers to the existing base of cache directives. Both the new directive and the standard directive are supplied, such that applications which do not understand the new directive will default to the behavior specified by the standard directive, and those that understand the new directive will recognize it as modifying the requirements associated with the standard directive. In this way, extensions to the cache-control directives can be made without requiring changes to the base protocol.

This extension mechanism depends on an HTTP cache obeying all of the cache-control directives defined for its native HTTP-version, obeying certain extensions, and ignoring all directives that it does not understand.

For example, consider a hypothetical new response directive called community which acts as a modifier to the private directive. We define this new directive to mean that, in addition to any non-shared cache, any cache which is shared only by members of the community named within its value may cache the response. An origin server wishing to allow the UCI community to use an otherwise private response in their shared cache(s) could do so by including

Cache-Control: private, community="UCI"

A cache seeing this header field will act correctly even if the cache does not understand the community cache-extension, since it will also see and understand the private directive and thus default to the safe behavior.

Fielding, et al.
Standards Track [Page 116] RFC 2616 HTTP/1.1 June 1999

Unrecognized cache-directives MUST be ignored; it is assumed that any cache-directive likely to be unrecognized by an HTTP/1.1 cache will be combined with standard directives (or the response's default cacheability) such that the cache behavior will remain minimally correct even if the cache does not understand the extension(s).

______

13.5.2 Non-modifiable Headers

Some features of the HTTP/1.1 protocol, such as Digest Authentication, depend on the value of certain end-to-end headers.

A transparent proxy SHOULD NOT modify an end-to-end header unless the definition of that header requires or specifically allows that.

A transparent proxy MUST NOT modify any of the following fields in a request or response, and it MUST NOT add any of these fields if not already present:

- Content-Location
- Content-MD5
- ETag
- Last-Modified

A transparent proxy MUST NOT modify any of the following fields in a response:

- Expires

but it MAY add any of these fields if not already present. If an Expires header is added, it MUST be given a field-value identical to that of the Date header in that response.

A proxy MUST NOT modify or add any of the following fields in a message that contains the no-transform cache-control directive, or in any request:

- Content-Encoding
- Content-Range
- Content-Type

A non-transparent proxy MAY modify or add these fields to a message that does not include no-transform, but if it does so, it MUST add a Warning 214 (Transformation applied) if one does not already appear in the message (see section 14.46).

Warning: unnecessary modification of end-to-end headers might cause authentication failures if stronger authentication mechanisms are introduced in later versions of HTTP. Such authentication mechanisms MAY rely on the values of header fields not listed here.
The Content-Length field of a request or response is added or deleted according to the rules in section 4.4. A transparent proxy MUST preserve the entity-length (section 7.2.2) of the entity-body, although it MAY change the transfer-length (section 4.4).

_____

no-transform

Implementors of intermediate caches (proxies) have found it useful to convert the media type of certain entity bodies. A non-transparent proxy might, for example, convert between image formats in order to save cache space or to reduce the amount of traffic on a slow link.

Serious operational problems occur, however, when these transformations are applied to entity bodies intended for certain kinds of applications. For example, applications for medical imaging, scientific data analysis and those using end-to-end authentication, all depend on receiving an entity body that is bit for bit identical to the original entity-body.

Therefore, if a message includes the no-transform directive, an intermediate cache or proxy MUST NOT change those headers that are listed in section 13.5.2 as being subject to the no-transform directive. This implies that the cache or proxy MUST NOT change any aspect of the entity-body that is specified by these headers, including the value of the entity-body itself.
Received on Monday, 19 November 2007 13:29:18 UTC