- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Tue, 22 Feb 2011 18:42:38 +0100
- To: HTTP Working Group <ietf-http-wg@w3.org>
On 15.02.2011 22:33, Julian Reschke wrote:
> Hi,
>
> last week IE9RC came out, and sure enough, it has limited support for
> RFC 2231/5987 (limited in that it only supports UTF-8 (*)). See
> <http://greenbytes.de/tech/tc2231/>.
>
> Which makes me wonder - should the spec include more advice for producers?
>
> The current situation is:
>
> 1) Once IE9 is out, senders can rely on RFC5987/UTF-8 support for all
> "current" UAs, except for IE < 9 and Safari.
>
> 2) There is no fallback that would work both with Safari and legacy IE
> versions.
>
> 3) filename and filename* can safely be sent together (with filename
> acting as fallback), except that Firefox will still pick the wrong one (see
> <https://bugzilla.mozilla.org/show_bug.cgi?id=588781> -- it would be
> great to see this finally fixed in FF5).
>
> Given these constraints senders that really need to provide non-ASCII
> characters to "all" recipients have to do User Agent sniffing. That's
> bad, but I believe explaining the problem would still be better than
> being silent on it.
>
> (*) Given that IE9RC only supports UTF-8, and the inclusion of
> ISO-8859-1 in RFC 5987 isn't essential anyway, should we note in the
> spec not ever to use ISO-8859-1?
>
> Best regards, Julian

I got one "go ahead" off-list, so I tried to write down some advice.

I was determined not to mention User-Agent sniffing; but it's really the
only choice today if you *need* I18N for "all" browsers.

To provide up-to-date information, I created a PURL URI for my test
cases; maybe that's an acceptable compromise for linking to up-to-date
information.

Feedback appreciated!

-- snip --

Appendix D. Advice on Generating Content-Disposition Header Fields

   Unfortunately, the varying quality of user agent implementations
   makes it non-trivial to generate the filename parameter.

   The "token" and "quoted-string" forms are widely implemented, but
   senders need to be aware of both the constraints of these formats and
   several implementation quirks in user agents:

   o  The "token" form excludes characters that are likely to occur in
      filenames, such as the space character.

   o  Some implementations do not properly handle the escaping mechanism
      used in the "quoted-string" form; however, the escape character is
      "\", which isn't supposed to show up in a portable filename anyway.

   o  Some implementations misinterpret "percent" escapes, and therefore
      filenames containing "%hh" sequences (where "h" is a "hex digit")
      can yield unexpected results (see Appendix C.2).

   o  Some implementations apply heuristics when encountering non-ASCII
      characters, and might unexpectedly switch to UTF-8 encoding (see
      Appendix C.3).

   The encoding used in "filename*" (see [RFC5987]) eliminates all these
   issues, but not all user agents support it. Among those that do
   support it, the following issues are known:

   o  One implementation only supports UTF-8, although RFC 5987 requires
      support for ISO-8859-1 as well. In practice, this is not a problem
      because UTF-8 is preferable anyway.

   o  One implementation picks the wrong parameter when both "filename"
      and "filename*" are specified, thus it's not always possible to
      specify "filename" as a fallback for old implementations.

   Consequently, the "best" approach depends on the use case. In many
   cases, it might be acceptable to simply strip or replace certain
   characters that are known to be problematic. In other cases, it
   might be ok to just ignore legacy user agents that do not support the
   "filename*" format. Of course, using different formats based on the
   User-Agent request header field is also an option, but the drawbacks
   of this choice need to be kept in mind (implementations improving,
   picking the right default, caching considerations).

   <http://purl.org/NET/http/content-disposition-tests> provides an
   overview of what implementations do support, and can assist in
   selecting an implementation approach.

-- snip --

Best regards, Julian
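
As a rough illustration of the "send both parameters" option discussed in the draft text above, the following Python sketch builds a header value with an ASCII-only "filename" fallback plus a UTF-8 "filename*" parameter. The function names (ascii_fallback, content_disposition) and the choice of "_" as the replacement character are assumptions made for this example and are not part of the draft. As the draft notes, one implementation picks the wrong parameter when both are present, so this is a sketch of one option, not a universal fix.

```python
from urllib.parse import quote

# RFC 5987 attr-char symbols that quote() does not already leave unescaped
# (quote() never escapes letters, digits, "_", ".", "-" and "~").
_ATTR_CHAR_EXTRA = "!#$&+^`|"


def ascii_fallback(filename: str) -> str:
    """Hypothetical helper: replace characters the draft flags as risky.

    Non-ASCII and control characters, the quoted-string specials '"' and
    '\\', and '%' (misread as a percent escape by some user agents) all
    become '_'.
    """
    out = []
    for ch in filename:
        if ch in '"\\%' or not (0x20 <= ord(ch) < 0x7F):
            out.append("_")
        else:
            out.append(ch)
    return "".join(out)


def content_disposition(filename: str, disposition: str = "attachment") -> str:
    """Build a header value; add filename* only when the fallback is lossy."""
    fallback = ascii_fallback(filename)
    value = f'{disposition}; filename="{fallback}"'
    if fallback != filename:
        # Percent-encode the UTF-8 octets of everything outside attr-char.
        encoded = quote(filename, safe=_ATTR_CHAR_EXTRA, encoding="utf-8")
        value += f"; filename*=UTF-8''{encoded}"
    return value


if __name__ == "__main__":
    print(content_disposition("report.pdf"))
    print(content_disposition("naïve file.txt"))
```

Only UTF-8 is emitted in "filename*", in line with the observation that one implementation does not support ISO-8859-1 there. A sender that prefers to ignore legacy user agents could drop the "filename" fallback instead.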
Received on Tuesday, 22 February 2011 17:43:15 UTC