- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Mon, 24 Sep 2007 22:49:09 +0100
- To: "Sean Patterson" <SPatterson@Novarra.com>, <public-bpwg-ct@w3.org>
- Message-ID: <C8FFD98530207F40BD8D2CAD608B50B47326D9@mtldsvr01.DotMobi.local>
Hi Sean

Thanks for this, certainly food for thought.

I have a number of detailed comments about the wording. E.g. I am not sure that it is completely accurate to say "Although web servers are technically supposed to base the content they send to browsers on the Accept header [6]," as my reading of the HTTP spec is that the server can vary its response on any aspect of the request, and that is what the Vary header indicates. I agree that if you read the section on the Accept header in isolation it does strongly suggest that this is the sole basis of decision making. However, it is pretty clear from other places in the spec that this is not the case.

Also I have various other comments on the assumptions behind your proposed 2.3.1 and 2.3.2 and the specifics of the text, especially "Content transformation servers typically want the origin server to send the desktop version of the site since the desktop version is usually more functional." which seems to ignore the intentions of the content owner and the preferences of the content consumer.

It also appears to make the assumption that a mobile version is a dumbed down version of a desktop version, whereas what we advocate is really the opposite, i.e. that a mobile version is a carefully crafted rendition that is especially tailored for the mobile user. This is really not primarily to do with the technical limitations of format/content-type but is about understanding the use of your content in the mobile context. If the content provider has crafted their content to be especially useful in that context, then the transformation provider saying that 'there is a more functional desktop presentation' is actually completely wrong. It would be more functional if the user had a desktop, but the point precisely is they don't.
The content owner knows that, and is in a _much_ better position than some automatic process to determine what is a "functional user experience" (q.v) of _their_ content in the mobile context.

That said, my main observations relate to your proposed mechanisms in 2.3.2. This is of course just a sketch, not proposed text:

As a matter of priority I would prefer that we find a mechanism that primarily, or exclusively, calls on HTTP headers to figure out what is going on. I'd also, personally, prefer that we don't call on HTTP header extensions if that is possible. Extending the values or interpretation of the values of existing headers seems more in line with the spirit of it all.

A Via header consists of a sequence of hosts or pseudonyms (pseudonyms are syntactically HTTP tokens) and optional comments. I'm unsure of why you think that it is unreliable. My understanding is that although the comment field MAY be stripped, the fact that a proxy has been involved MUST be preserved (all MUSTards RFC2119 wise, of course).

Hence a Via header consists of a sequence of hosts or pseudonyms. We could suggest a practice in which transcoding proxies identify themselves by adding a #transcoding suffix to either the pseudonym or the host. We could suggest that a standard URI is added to the list to identify that the previous proxy is a transforming proxy. Something along those lines.

(Incidentally, I think that we need to be clear, also, that where it is the user agent itself that exhibits transcoding or adaptation behavior it should also add a Via header and should make it clear that it is the user agent itself that is doing that - by being the first in the chain and by some other mechanism ...).

I believe that the transcoding proxy MAY alter the User-Agent header in addition to being obliged to add a Via field. This is only to avoid blocking hosts. If it does modify the User-Agent header it MUST include a standard notification that it has done so. (That would imo be a URI.)
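A minimal sketch of how a server might spot a self-identified transcoding proxy in a Via header under the suffix idea floated above. The `#transcoding` marker is only the placeholder used in this message ("or whatever"), not an agreed convention, and the function names `via_hops` and `transcoding_hops` are hypothetical. Comments are stripped first, since intermediaries may remove them anyway.

```python
import re

# Placeholder marker taken from the suggestion above; not a standard.
TRANSCODING_SUFFIX = "#transcoding"


def via_hops(via_value):
    """Return the host or pseudonym of each hop in a Via header value."""
    # Drop (non-nested) comments, which intermediaries are allowed to strip.
    without_comments = re.sub(r"\([^()]*\)", "", via_value)
    hops = []
    for entry in without_comments.split(","):
        parts = entry.split()
        # Each entry is "<protocol-version> <host-or-pseudonym>".
        if len(parts) >= 2:
            hops.append(parts[1])
    return hops


def transcoding_hops(via_value):
    """Return the hops that carry the assumed transcoding suffix."""
    return [
        hop[: -len(TRANSCODING_SUFFIX)]
        for hop in via_hops(via_value)
        if hop.endswith(TRANSCODING_SUFFIX)
    ]
```

Because the marker sits on the host or pseudonym rather than in the comment, it would survive a downstream proxy that strips comments.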
I believe that a server that has alternative representations SHOULD always add a Vary: User-Agent (or Vary: *) to indicate that it has alternative representations.

If a server receives a request with a Via header that contains #transcoding (or whatever) it MUST respond with Vary to indicate that it requires the correct User-Agent header.

On receipt of a Vary header, having presented a modified User-Agent header, the transcoding proxy MUST re-present the request with the User-Agent header that it originally received.

The transcoding proxy SHOULD cache the information that the server varies its response according to the User-Agent header and not modify it in the future. (Subject to cache expiry etc.)

The server may in addition add Cache-Control: no-transform to any response. And this MUST be respected.

I realise that what I am proposing puts the onus on servers that have more than one representation to do something that they don't today, and I realise also that I am suggesting that transcoding proxies can modify the user agent string. In light of recent intemperate discussion, on this and other lists, let me repeat that what I mean is that they can do this only in order to avoid a blocking response. If, in response to a request, they get a Vary: User-Agent or * they MUST re-request the URI with the User-Agent they received and remember that setting for that URI (using a mechanism similar to authentication realms in which user agents are expected to remember the URI path and re-present authentication credentials without prompting the user for URIs in that path). Proxies must pass on the Vary.

My justification for putting the onus on servers in this way is that if the server is dumb you can't do anything other than spoof the UA string. If on the other hand the server is smart and varies the response, then the RFC says "An HTTP/1.1 server SHOULD include a Vary header field with any cacheable response that is subject to server-driven negotiation."
Delete "cacheable" for the purposes of this discussion.

I think that in addition transforming proxies MAY take account of clues, like the content-type header or DOCTYPE suggesting mobile content. However, I think that this is likely to be a short term solution that locks in a very unsatisfactory status quo on XHTML-MP and Basic mish-mash. Since XHTML Basic content is also XHTML content and since the DOCTYPE and content-type markers work extremely imperfectly and not really at all together I think it is something we'd like not to rely on. We should be able to send HTML content to UAs that support it and be able to inform transcoding proxies that we really do mean this for those user agents.

I agree with your idea of adding Warning: headers. I suggest adding some standard URIs to mean various things.

I think we should consider what, if anything, to say about 300 HTTP responses. An important principle, imo, is that if there is a choice and the user is interested in it, they should be given it. It seems that the 300 response is a vehicle for that. How should we make use of it?

I think we could consider what role HEAD has in tasting whether a server can Vary.

I like your idea of using links rel="handheld" but a) don't think this extends generally enough (what is a handheld) and b) would prefer to see the Link header reinstated in HTTP for this purpose.

I wonder what we should say about increasing the overhead on requests and responses. We should at least introduce a statement of principle?

We haven't touched on how a transcoding proxy can advertise its capabilities nor how a consenting server can take advantage of them.

Hope this makes some kind of sense.
Jo

________________________________
From: public-bpwg-ct-request@w3.org [mailto:public-bpwg-ct-request@w3.org] On Behalf Of Sean Patterson
Sent: 24 September 2007 20:14
To: public-bpwg-ct@w3.org
Subject: ACTION-550: Draft some initial material for Section 2.3 of the Guidelines

I apologize for not getting this sent out to the group sooner. Here is my draft for section 2.3 of the "Guidelines for Using Content Transformation". (Hopefully the formatting will be OK. Is there a better/preferred way to submit material to the group?)

Sean Patterson
+1 630 773 0000 ext. 289
novarra
Powering the Mobile Generation(tm)

2 Guidance for Delivery Chain Component Developers

2.3 Guidance for Content Transformation Server Developers

Content transformation servers have the ability to transform content into a form that is suitable for a requesting entity's delivery context. However, a content transformation server that is invisible from browsers and other servers on the network can cause problems. These problems include transforming content that should not be transformed, multiple transformations, and sub-optimal transformation. This section contains guidelines for developers of content transformation servers to help avoid these problems.

2.3.1 The Need for Content Transformation Servers

2.3.1.1 Variation of device capabilities

While there are many mobile devices in existence today that give their users the ability to browse the web, the majority of devices are not capable of accessing web content. Even for those devices that can access the internet, there are large variations in their web browsing capabilities. Content transformation servers can transform web content into a form that works well on any particular device.

2.3.1.2 Most content is not designed for mobile devices

The majority of web sites are designed for users of desktop (or laptop) computers.
These computers have large screens, a mouse, full-size keyboards, fast CPUs, large amounts of memory, and are fully connected to the Internet, typically at broadband speeds. Mobile devices (especially mobile phones) normally have none of these characteristics. Regular web content frequently assumes that it will be displayed using the hardware of a desktop computer. Content transformation servers can reduce the hardware requirements of the content so that it works better on a mobile device.

2.3.1.3 Most content is not designed for mobile browsers

Most web content is designed to be displayed on web browsers that run on desktop computers. These are full-featured browsers that can display web sites that use complex HTML, CSS, and JavaScript as well as multimedia content such as Flash and video. In addition, most desktop web sites assume that the user has a mouse or other pointing device. Mobile devices frequently have much more limited web browsers. Regular web content may not display properly or at all on the web browser in a mobile device. Even if a desktop web site displays reasonably well, it may be difficult to use on a mobile phone. Content transformation can transform the content into a simpler form that can be displayed and used on a mobile browser.

2.3.1.4 Variation of mobile content

There is a wide variation of what is considered "mobile content." Mobile content that is designed for a high-end mobile device may not display well or be useable on lower-end mobile devices. In this case it makes sense for a content transformation server to transform the content developed for a higher-end mobile device into content that is suitable for a lower-end device.

2.3.1.5 Eliminates the need for a least common denominator solution

One approach to the problem of the variation of mobile devices is to create a "least common denominator" page that works on all (or almost all) mobile devices.
This approach is simpler than having multiple versions of the page (see the next section), but limits the end user experience. An example of a least common denominator approach is writing content that will work with the "Default Delivery Context" (DDC) defined in the "Mobile Web Best Practices 1.0" W3C Proposed Recommendation [1]. The "Default Delivery Context" outlines the baseline characteristics that a device must implement in order to be suitable for browsing the web. If a content transformation server exists on the network, the least common denominator approach is not necessary. Instead, a rich version of the site can be created with the knowledge that it will be "reduced down" for any requesting entity that is less capable.

2.3.1.6 Reduces the need for multiple versions of a site

Another way to handle the variation of mobile devices is to create multiple versions of a web site to deal with the multiple types of mobile devices that can access the site. This approach is costly to establish and maintain across the increasingly diverse range of handsets available. When a content transformation server exists in the network, the need to create multiple versions for different mobile devices is reduced. Again, a single, rich version of the site can be created and easily maintained.

2.3.1.7 A content transformation server can do a better job of following mobile best practices

The "Mobile Web Best Practices 1.0" W3C Proposed Recommendation [1] contains many recommendations for authoring content that is intended for viewing on a mobile device. A well-designed content transformation server can do a better job of following the mobile best practices than a human author, especially when taking into account the capabilities of the many different mobile devices. The result will be a more consistent, uniform experience.
2.3.2 Guidelines for how content transformation servers should communicate with the rest of the delivery chain

2.3.2.1 Identifying the content transformation server

HTTP 1.1 requires that all proxy servers append a string to the Via header [2] for any request or response they forward. This string consists of the name of the protocol of the received message, the version number of the protocol, the hostname (or a pseudonym if the hostname is sensitive information), and an optional comment. (The name of the protocol is assumed to be HTTP if not specified.) Content transformation servers should identify themselves in the comment of the string they put in the Via header. Here is an example where a content transformation server at zzz.net adds itself to the Via header:

    Via: 1.1 nowhere.com (Apache/1.1), 1.1 zzz.net (CT-Server-2000/1.0)

Unfortunately, the HTTP 1.1 protocol specification [3] allows subsequent servers that receive the message to remove comments in the Via header. So, while it is recommended that content transformation servers identify themselves in the Via header, it is not always reliable.

A more reliable method for identifying a content transformation server is to use the X-Mobile-Gateway header. The syntax of the X-Mobile-Gateway header is as follows (expressed in Augmented BNF form as described in [4]):

    X-Mobile-Gateway = "X-Mobile-Gateway" ":" 1*( product | comment )

An example would be:

    X-Mobile-Gateway: CT-Server-2000/1.0 (Server-Only; Linux i686; en-US), Super-CT-Server/2.0 (Headers, Footers; MS Windows XP i686; en-US)

The syntax for each content transformation server in the X-Mobile-Gateway header is the same as for the User-Agent and Server headers. It is recommended that the value of this header contain the product name and version of the content transformation server as well as a comment in parentheses that contains useful characteristics of the content transformation server separated by semicolons. See [5] for the syntax of "product".
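As a rough illustration of the two identification strings described in 2.3.2.1, the sketch below builds a Via entry and an X-Mobile-Gateway entry and joins an entry onto an existing header value. The helper names (`via_entry`, `gateway_entry`, `append_entry`) are hypothetical, and joining entries with ", " is an assumption read off the examples in the draft rather than something the ABNF states.

```python
def via_entry(host, comment=None, version="1.1"):
    """Build one Via entry: protocol version, host or pseudonym, optional comment."""
    entry = f"{version} {host}"
    if comment:
        entry += f" ({comment})"
    return entry


def gateway_entry(product, version, characteristics=()):
    """Build one X-Mobile-Gateway entry: product/version plus a comment
    listing characteristics separated by semicolons."""
    entry = f"{product}/{version}"
    if characteristics:
        entry += " (" + "; ".join(characteristics) + ")"
    return entry


def append_entry(existing, entry):
    """Append an entry to an existing header value without altering it."""
    # Separator assumed from the draft's examples.
    return f"{existing}, {entry}" if existing else entry
```

For instance, `append_entry("1.1 nowhere.com (Apache/1.1)", via_entry("zzz.net", "CT-Server-2000/1.0"))` reproduces the Via example given in the draft.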
Each subsequent content transformation server in the request/response chain appends its information to the end of the X-Mobile-Gateway header. In contrast to the Via header, content transformation servers are only allowed to append to the end of the X-Mobile-Gateway header; no other modifications are allowed.

2.3.2.2 The User-Agent header

It is frequently necessary for content transformation servers to replace the User-Agent header in requests with a value that is the same as used by a desktop browser. For example, the content transformation server might use the following User-Agent header:

    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6

Although web servers are technically supposed to base the content they send to browsers on the Accept header [6], it is very common for web servers to use the User-Agent header to make decisions about the content to return to a particular browser. For example, a web site that has both a desktop and mobile version may examine the User-Agent header and send the desktop version of the site if the User-Agent is recognized as a desktop browser and return the mobile version of the site if the User-Agent is recognized as a mobile browser on a mobile device.

Content transformation servers typically want the origin server to send the desktop version of the site since the desktop version is usually more functional. This is the reason that content transformation servers frequently send a User-Agent header from a desktop browser. If the origin server needs to know what the actual User-Agent header is from the original device that made the request, it can examine the X-Device-User-Agent header (see section 2.3.2.3).
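The header handling in 2.3.2.2 could be sketched roughly as follows. This is an illustration only: the function name `rewrite_request_headers` is hypothetical, headers are modelled as a plain dict with canonical capitalisation (real HTTP header names are case-insensitive), and the rule of leaving an existing X-Device-User-Agent untouched is taken from section 2.3.2.3 of the draft.

```python
# Desktop User-Agent string quoted from the example in the draft.
DESKTOP_USER_AGENT = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) "
    "Gecko/20070725 Firefox/2.0.0.6"
)


def rewrite_request_headers(headers):
    """Return a copy of the request headers with a desktop User-Agent,
    keeping the device's original value in X-Device-User-Agent."""
    rewritten = dict(headers)  # leave the caller's dict untouched
    original = rewritten.get("User-Agent")
    # Copy the original value unmodified, unless an earlier content
    # transformation server has already recorded one.
    if original is not None and "X-Device-User-Agent" not in rewritten:
        rewritten["X-Device-User-Agent"] = original
    rewritten["User-Agent"] = DESKTOP_USER_AGENT
    return rewritten
```

With an input of `{"User-Agent": "MobileBrowser/1.0"}` (a made-up device string), the result carries the desktop string in User-Agent and `MobileBrowser/1.0` in X-Device-User-Agent.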
2.3.2.3 Identifying the mobile browser

Since content transformation servers typically replace the User-Agent header in the original request from the mobile browser with a desktop User-Agent string, there needs to be a way for the origin server to identify the mobile browser that made the original request. This is done with the X-Device-User-Agent header. The syntax for the X-Device-User-Agent header is as follows:

    X-Device-User-Agent = "X-Device-User-Agent" ":" 1*( product | comment )

(The syntax is the same as for the User-Agent header.)

When a content transformation server replaces the User-Agent header with a desktop User-Agent string, an X-Device-User-Agent header should be added to the request and the original User-Agent value from the mobile browser should be copied without modification to the X-Device-User-Agent header. This will allow the origin server to detect the type of mobile browser and mobile device that made the request if it needs this information. Content transformation servers should not modify the X-Device-User-Agent header if it already exists.

2.3.2.4 Determining whether or not a web page should be transformed

There are times when the origin server wants a web page to be sent to the mobile web browser unchanged. The origin server can signal that it does not want a web page to be transformed by a content transformation server (or any other proxy) by using the Cache-Control [7] header. The no-transform directive [8] is used to specify that the entity body of a response from the origin server should not be modified.

    Cache-Control: no-transform

The Cache-Control header must be honored for both requests and responses. A content transformation server must not modify the entity body of any request or response that uses the Cache-Control: no-transform header. In addition there are a handful of headers that should not be modified as well. See [9] for a list of those headers.
The Cache-Control: no-transform header can be added by content transformation servers but it should not be modified by content transformation servers.

2.3.2.5 Notification that transformation has been applied

If a content transformation server makes changes (i.e., transformations) to the entity body in a response, the content transformation server must set the Warning header [10] to "214":

    Warning: 214 zzz.net "Transformation applied"

This lets the browser and any other content transformation servers in the request/response

2.3.2.6 Identification of mobile content

Content can be identified as intended for mobile browsers by one of the following methods:

* The Content-Type header of the response is one of the following values:
  o application/vnd.wap.xhtml+xml
  o text/vnd.wap.wml
* The document type of the response document is
  o <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
  o <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.1//EN" "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile11.dtd">
  o <!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.2//EN" "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile12.dtd">
  o <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
  o <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">
* There is a link element in the response document with a media attribute that has a value of "handheld" that points to a mobile document. Here is an example:

    <link rel="alternate" media="handheld" href="www.mobileversion.com/" />

  Origin servers that want to present a choice to the user of whether to view the desktop version of a web page or the mobile version may use this technique. (The mobile browser would need to have the capability of presenting the choice to the user for this to work.)
Identifying mobile content is important when the content transformation server is deciding which transformations to apply to the response content received from the origin server.

* If the response content is identified as mobile, the content transformation server should be conservative and try to perform only non-layout and non-format changing transformations. For example, it would be OK to accelerate the content (by removing non-layout whitespace, non-lossy compression, etc.), add a header and/or footer to the page, apply content corrections, etc. It would be less desirable to remove HTML tables, change the size and/or format of an image, etc. However, if the content returned from the origin server uses features that the content transformation server "knows" that the client device does not support (e.g., by examining the User-Agent header sent by the mobile web browser), it is permissible to make more extensive changes to make the content more suitable for the client device. For example, if an origin server returns an image in GIF format to a device that does not support GIF images, it would be OK for the content transformation server to transform the image into a different format that the client device did support.
* If the response content is not identified as mobile, and there is no Cache-Control: no-transform header, the content transformation server should perform all reasonable transformations on the response.
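The decision described in 2.3.2.4 and 2.3.2.6 might be sketched like this. It is a simplification under stated assumptions: the function names and the three policy labels ("none", "conservative", "full") are hypothetical, only the Content-Type and DOCTYPE checks are implemented (the link element method is left out because it points at a separate mobile document), and the DOCTYPE is matched by searching for the public identifier near the start of the body rather than by parsing the document.

```python
MOBILE_CONTENT_TYPES = {"application/vnd.wap.xhtml+xml", "text/vnd.wap.wml"}

# Public identifiers from the DOCTYPE declarations listed in 2.3.2.6.
MOBILE_DOCTYPE_IDS = (
    "-//WAPFORUM//DTD XHTML Mobile 1.0//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.1//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.2//EN",
    "-//W3C//DTD XHTML Basic 1.0//EN",
    "-//W3C//DTD XHTML Basic 1.1//EN",
)


def has_no_transform(cache_control):
    """True if the Cache-Control value contains the no-transform directive."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    return "no-transform" in directives


def is_mobile_content(content_type, body):
    """Apply the Content-Type and DOCTYPE checks for mobile content."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in MOBILE_CONTENT_TYPES:
        return True
    head = body[:1024]  # a DOCTYPE, if present, sits near the top
    return any(public_id in head for public_id in MOBILE_DOCTYPE_IDS)


def transformation_policy(content_type, cache_control, body):
    """Pick how aggressively to transform a response."""
    if has_no_transform(cache_control):
        return "none"          # entity body must not be modified
    if is_mobile_content(content_type, body):
        return "conservative"  # non-layout, non-format changes only
    return "full"              # all reasonable transformations
```

The exception for features the client device is known not to support (the GIF example) would need device capability data and is deliberately not modelled here.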
References

[1] http://www.w3.org/TR/mobile-bp/
[2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.45
[3] http://www.w3.org/Protocols/rfc2616/rfc2616.html
[4] http://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.1
[5] http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.8
[6] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.1
[7] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
[8] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.5
[9] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.2
[10] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46
Received on Monday, 24 September 2007 21:49:29 UTC