- From: Nicholas Shanks <nickshanks@nickshanks.com>
- Date: Tue, 12 Mar 2013 10:52:13 +0000
- To: IETF HTTP Working Group <ietf-http-wg@w3.org>
I just came across this page: http://www.utexas.edu/cola/centers/lrc/ielex/

The first bulleted UL on the page demonstrates that real-world needs for Accept-Charset are not met by existing specifications for this header, or by other related TCN headers/UA behaviour.

I am aware that Google are presently applying a patch to remove explicit support for Accept-Charset from Chromium. They are the last of the major browser vendors to do so. This has led me to ponder what could be done in a post-Accept-Charset world to automate variant selection for the above use case (negotiate between representations based upon installed fonts, rather than UA support for charsets).

I think the following would make sense (n.b. I just made up the unicode-range values for demonstration purposes):

-> GET //www.utexas.edu/cola/centers/lrc/ielex/PokornyMaster-X.html
<- 200 OK
Content-Type: text/html; charset=utf-8; unicode-range="U+40-7F, U+2000-207F, U+10000-103FF"
Alternates: {"PokornyMaster.html" 0.8 {unicode-range "U+40-7F, U+2000-21FF"}}, {"PokornyMaster-R.html" 0.4 {charset iso-8859-1}}

(UA determines [via Unicode-Range header or while parsing response body] that it would use a last resort font or .notdef/.null glyphs to display some characters, so issues a second request for the variant with the highest qs value that the UA knows it can support)

-> GET //www.utexas.edu/cola/centers/lrc/ielex/PokornyMaster.html
<- 200 OK
Content-Type: text/html; charset=utf-8; unicode-range="U+40-7F, U+2000-21FF"
Alternates: {"PokornyMaster-X.html" 1.25 {unicode-range "U+40-7F, U+2000-207F, U+10000-103FF"}}, {"PokornyMaster-R.html" 0.5 {charset iso-8859-1}}

This introduces a new, optional Content-Type parameter, "unicode-range", valid for text/* types.
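As a rough illustration of the UA-side step in the exchange above, here is a minimal Python sketch of how a client might parse a unicode-range value and pick the highest-quality variant it can render. Everything in it is an assumption made for illustration: the function names, the idea that the UA knows its font coverage as a list of code point ranges, the choice to treat a variant without a unicode-range attribute as displayable, and the use of already-parsed Alternates entries (it does not parse RFC 2295 syntax).

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]                      # inclusive code point range
Variant = Tuple[str, float, Optional[str]]   # (URI, quality, unicode-range or None)


def parse_unicode_range(value: str) -> List[Range]:
    """Parse 'U+40-7F, U+2000-21FF' into [(0x40, 0x7F), (0x2000, 0x21FF)]."""
    ranges: List[Range] = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        if not part.upper().startswith("U+"):
            raise ValueError(f"bad unicode-range item: {part!r}")
        start_text, _, end_text = part[2:].partition("-")
        start = int(start_text, 16)
        end = int(end_text, 16) if end_text else start
        if start > end:
            raise ValueError(f"reversed unicode-range item: {part!r}")
        ranges.append((start, end))
    return ranges


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Sort ranges and join any that overlap or touch."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def is_covered(needed: List[Range], available: List[Range]) -> bool:
    """True if every needed range lies inside the merged available ranges."""
    merged = merge_ranges(available)
    return all(
        any(lo <= start and end <= hi for lo, hi in merged)
        for start, end in needed
    )


def select_variant(variants: List[Variant], fonts: List[Range]) -> Optional[str]:
    """Return the URI of the highest-quality displayable variant, if any.

    Assumption: a variant with no unicode-range (None) is displayable.
    """
    usable = [
        (quality, uri)
        for uri, quality, unicode_range in variants
        if unicode_range is None
        or is_covered(parse_unicode_range(unicode_range), fonts)
    ]
    if not usable:
        return None
    return max(usable, key=lambda item: item[0])[1]


# Alternates entries from the first response above, already parsed by hand.
alternates: List[Variant] = [
    ("PokornyMaster.html", 0.8, "U+40-7F, U+2000-21FF"),
    ("PokornyMaster-R.html", 0.4, None),
]
# Hypothetical font coverage: Latin-1 plus U+2000-2BFF, nothing above the BMP.
fonts: List[Range] = [(0x0000, 0x00FF), (0x2000, 0x2BFF)]
chosen = select_variant(alternates, fonts)
```

With this made-up font coverage the sketch selects PokornyMaster.html, matching the second request in the example; a UA with only Latin-1 coverage would fall through to PokornyMaster-R.html.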
Also, it adds one TCN extension, per RFC 2295 section 5.1 syntax:

extension-name = "unicode-range"
extension-value = quoted-string

This way, we get all the usual benefits from Alternates-based negotiation:

• Only negotiable resources being viewed on sub-par devices are subject to a second round-trip
• There is no Accept-* overhead on the initial request
• Each representation has its own URI and does not use Vary, so caching is optimal. The Alternates header is always sent, to support stateless proxies, or in case requests go via different routes.
• Variant selection is done by the UA, so there is no leaking of configuration/user-identifying info

Downsides:

• First response may be downloaded unnecessarily. Authors should link to/serve the highest source-quality representation available by default. UAs on insufficient devices may need to use heuristics based on data flow rate and Content-Length to choose whether to close the connection and open a new one, or wait for the entire body of the response to download and re-use the same connection for the subsequent request.
• The onus falls on UAs to be smarter about automatic variant selection. They could even display a dialog if automatic selection is not desired, e.g. "This document cannot be displayed correctly due to lack of fonts supporting characters used. A lower-quality, but supported alternative is available. Do you wish to continue using this document or request the alternative?"

--
Nicholas.
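The close-or-wait heuristic mentioned under the first downside could look something like the Python sketch below. It is only one possible reading of that sentence: the function name, the parameters, and the rule of comparing estimated remaining download time with an assumed cost of opening a new connection are all hypothetical and are not taken from any specification or browser.

```python
from typing import Optional


def should_abort_download(
    content_length: Optional[int],
    bytes_received: int,
    bytes_per_second: float,
    new_connection_seconds: float,
) -> bool:
    """Guess whether closing the connection beats waiting for the body.

    Returns True when the estimated time to finish the current response
    exceeds the assumed cost of setting up a fresh connection.
    """
    if content_length is None:
        # No Content-Length: nothing to estimate from, so keep the connection
        # (an arbitrary choice for this sketch).
        return False
    remaining = max(content_length - bytes_received, 0)
    if remaining == 0:
        return False  # body already complete, re-use the connection
    if bytes_per_second <= 0:
        return True   # stalled transfer with data outstanding
    return remaining / bytes_per_second > new_connection_seconds


# 900 kB still to come at 50 kB/s is about 18 s, far more than a 0.3 s reconnect.
abort_large = should_abort_download(1_000_000, 100_000, 50_000.0, 0.3)
# 5 kB still to come at 50 kB/s is about 0.1 s, so waiting is cheaper.
abort_small = should_abort_download(20_000, 15_000, 50_000.0, 0.3)
```

A real UA would presumably also weigh connection limits, TLS setup and whether the partial body is worth caching; none of that is modelled here.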
Received on Tuesday, 12 March 2013 10:53:26 UTC