- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Tue, 22 Feb 2011 18:42:38 +0100
- To: HTTP Working Group <ietf-http-wg@w3.org>
On 15.02.2011 22:33, Julian Reschke wrote:
> Hi,
>
> last week IE9RC came out, and sure enough, it has limited support for
> RFC 2231/5987 (limited in that it only supports UTF-8 (*)). See
> <http://greenbytes.de/tech/tc2231/>.
>
> Which makes me wonder - should the spec include more advice for producers?
>
> The current situation is:
>
> 1) Once IE9 is out, senders can rely on RFC5987/UTF-8 support for all
> "current" UAs, except for IE < 9 and Safari.
>
> 2) There is no fallback that would work both with Safari and legacy IE
> versions.
>
> 3) filename and filename* can safely be sent together (with filename
> acting as fallback), except that Firefox will still pick the wrong one (see
> <https://bugzilla.mozilla.org/show_bug.cgi?id=588781> -- it would be
> great to see this finally fixed in FF5).
>
> Given these constraints, senders that really need to provide non-ASCII
> characters to "all" recipients have to do User Agent sniffing. That's
> bad, but I believe explaining the problem would still be better than
> being silent on it.
>
> (*) Given that IE9RC only supports UTF-8, and the inclusion of
> ISO-8859-1 in RFC 5987 isn't essential anyway, should we note in the
> spec not ever to use ISO-8859-1?
>
> Best regards, Julian
I got one "go ahead" off-list, so I tried to write down some advice.
I was determined not to mention User-Agent sniffing, but it's really the
only choice today if you *need* I18N for "all" browsers.
To keep the advice current, I created a PURL URI for my test cases;
maybe that's an acceptable compromise for linking to up-to-date
information.
Feedback appreciated!
-- snip --
Appendix D. Advice on Generating Content-Disposition Header Fields
Unfortunately, the varying quality of user agent implementations
makes it non-trivial to generate the filename parameter.
The "token" and "quoted-string" forms are widely implemented, but
senders need to be aware of both the constraints of these formats and
several implementation quirks in user agents:
o The "token" form excludes characters that are likely to occur in
filenames, such as the space character.
o Some implementations do not properly handle the escaping mechanism
used in the "quoted-string" form; however, the escape character is
"\", which isn't supposed to show up in a portable filename
anyway.
o Some implementations misinterpret "percent" escapes, and therefore
filenames containing "%hh" sequences (where "h" is a "hex digit")
can yield unexpected results (see Appendix C.2).
o Some implementations apply heuristics when encountering non-ASCII
characters, and might unexpectedly switch to UTF-8 encoding (see
Appendix C.3).
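The constraints in the list above can be checked mechanically before a
sender commits to the plain "filename" parameter. The following is only a
sketch under stated assumptions: the function name plain_filename_param
is hypothetical, the token character set is the one from RFC 2616, and
returning None for any name that touches one of the listed quirks is one
conservative policy, not something the draft mandates.

```python
import re

# RFC 2616 "token": any CHAR except CTLs and separators.
_TOKEN_RE = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")
# "%hh" sequences that some user agents wrongly unescape.
_PERCENT_ESCAPE_RE = re.compile(r"%[0-9A-Fa-f]{2}")


def plain_filename_param(name):
    """Return 'filename=...' in token or quoted-string form.

    Returns None (hypothetical policy) when the name would hit one of
    the known quirks and the sender should consider another approach.
    """
    if not name:
        return None
    # Non-ASCII characters trigger encoding heuristics in some user agents.
    if any(ord(ch) > 0x7E or ord(ch) < 0x20 for ch in name):
        return None
    # Backslash and double quote would require quoted-string escaping,
    # which some implementations handle incorrectly.
    if "\\" in name or '"' in name:
        return None
    # "%hh" sequences may be misinterpreted as percent escapes.
    if _PERCENT_ESCAPE_RE.search(name):
        return None
    if _TOKEN_RE.match(name):
        return "filename=" + name
    # Anything else (for example a space) needs the quoted-string form.
    return 'filename="' + name + '"'


print(plain_filename_param("report.pdf"))         # filename=report.pdf
print(plain_filename_param("annual report.pdf"))  # filename="annual report.pdf"
print(plain_filename_param("50%20off.txt"))       # None
```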
The encoding used in "filename*" (see [RFC5987]) eliminates all these
issues, but not all user agents support it. Among those which do
support it, the following issues are known:
o One implementation only supports UTF-8, although RFC 5987 requires
support for ISO-8859-1 as well. In practice, this is not a
problem because UTF-8 is preferable anyway.
o One implementation picks the wrong parameter when both "filename"
and "filename*" are specified, so it is not always possible to
specify "filename" as a fallback for old implementations.
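To make the "filename*" form concrete, here is a minimal sketch of
producing the RFC 5987 ext-value with UTF-8, which sidesteps the
ISO-8859-1 issue noted above. The function name ext_filename_param is
hypothetical; the set of characters left unescaped is the attr-char
production from RFC 5987, and everything else is percent-encoded.

```python
from urllib.parse import quote

# RFC 5987 attr-char beyond what quote() already leaves alone
# (letters, digits and "_.-~" are never escaped by quote()).
_ATTR_CHAR_EXTRA = "!#$&+^`|"


def ext_filename_param(name):
    """Return a 'filename*' parameter using the UTF-8 charset and no language tag."""
    encoded = quote(name, safe=_ATTR_CHAR_EXTRA, encoding="utf-8")
    return "filename*=UTF-8''" + encoded


print(ext_filename_param("\u20ac rates.txt"))
# filename*=UTF-8''%E2%82%AC%20rates.txt
```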
Consequently, the "best" approach depends on the use case. In many
cases, it might be acceptable to simply strip or replace certain
characters that are known to be problematic. In other cases, it
might be ok to just ignore legacy user agents that do not support the
"filename*" format. Of course, using different formats based on the
User-Agent request header field is also an option, but the drawbacks
of this choice need to be kept in mind (implementations improving,
picking the right default, caching considerations).
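For the first option, stripping or replacing problematic characters, a
sketch might look like the following. Everything here is an assumption
made for illustration: the function name ascii_fallback, the choice of
"_" as the replacement, and the conservative set of characters that are
kept. Accented letters are reduced to their base letter via Unicode
NFKD decomposition; characters with no decomposition are replaced.

```python
import string
import unicodedata

# Conservative set of characters assumed safe in a quoted-string filename.
_SAFE = set(string.ascii_letters + string.digits + " ._-")


def ascii_fallback(name, replacement="_"):
    """Return an ASCII-only approximation of name."""
    out = []
    for ch in unicodedata.normalize("NFKD", name):
        if unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        out.append(ch if ch in _SAFE else replacement)
    return "".join(out)


print(ascii_fallback("na\u00efve file.txt"))  # naive file.txt
print(ascii_fallback("50%20off.txt"))         # 50_20off.txt
```

Because "%", "\" and the double quote are all replaced, the result avoids
the quirks listed earlier for the "token" and "quoted-string" forms, at
the cost of losing information from the original name.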
<http://purl.org/NET/http/content-disposition-tests> provides an
overview of what implementations do support, and can assist in
selecting an implementation approach.
-- snip --
Best regards, Julian
Received on Tuesday, 22 February 2011 17:43:15 UTC