Re: Content-Disposition: *sender* advice

On 15.02.2011 22:33, Julian Reschke wrote:
> Hi,
>
> last week IE9RC came out, and sure enough, it has limited support for
> RFC 2231/5987 (limited in that it only supports UTF-8 (*)). See
> <http://greenbytes.de/tech/tc2231/>.
>
> Which makes me wonder - should the spec include more advice for producers?
>
> The current situation is:
>
> 1) Once IE9 is out, senders can rely on RFC5987/UTF-8 support for all
> "current" UAs, except for IE < 9 and Safari.
>
> 2) There is no fallback that would work both with Safari and legacy IE
> versions.
>
> 3) filename and filename* can safely be sent together (with filename
> acting as fallback), except that Firefox will still pick the wrong one
> (see <https://bugzilla.mozilla.org/show_bug.cgi?id=588781> -- it would
> be great to see this finally fixed in FF5).
>
> Given these constraints senders that really need to provide non-ASCII
> characters to "all" recipients have to do User Agent sniffing. That's
> bad, but I believe explaining the problem would still be better than
> being silent on it.
>
> (*) Given that IE9RC only supports UTF-8, and the inclusion of
> ISO-8859-1 in RFC 5987 isn't essential anyway, should we note in the
> spec not ever to use ISO-8859-1?
>
> Best regards, Julian

I got one "go ahead" off-list, so I tried to write down some advice.

I was determined not to mention User-Agent sniffing; but it's really the 
only choice today if you *need* I18N for "all" browsers.

To keep the advice current, I created a PURL URI for my test cases;
maybe linking to it is an acceptable compromise for pointing readers at
up-to-date information.

Feedback appreciated!

-- snip --
Appendix D.  Advice on Generating Content-Disposition Header Fields

    Unfortunately, the varying quality of user agent implementations
    makes it non-trivial to generate the filename parameter.

    The "token" and "quoted-string" forms are widely implemented, but
    senders need to be aware of both the constraints of these formats and
    several implementation quirks in user agents:

    o  The "token" form excludes characters that are likely to occur in
       filenames, such as the space character.

    o  Some implementations do not properly handle the escaping mechanism
       used in the "quoted-string" form; however, the escape character is
       "\", which isn't supposed to show up in a portable filename
       anyway.

    o  Some implementations misinterpret "percent" escapes, and therefore
       filenames containing "%hh" sequences (where "h" is a "hex digit")
       can yield unexpected results (see Appendix C.2).

    o  Some implementations apply heuristics when encountering non-ASCII
       characters, and might unexpectedly switch to UTF-8 encoding (see
       Appendix C.3).

    The encoding used in "filename*" (see [RFC5987]) eliminates all these
    issues, but not all user agents support it.  Among those which do
    support it, the following issues are known:

    o  One implementation only supports UTF-8, although RFC 5987 requires
       support for ISO-8859-1 as well.  In practice, this is not a
       problem because UTF-8 is preferable anyway.

    o  One implementation picks the wrong parameter when both "filename"
       and "filename*" are specified, so it is not always possible to
       specify "filename" as a fallback for old implementations.

    Consequently, the "best" approach depends on the use case.  In many
    cases, it might be acceptable to simply strip or replace certain
    characters that are known to be problematic.  In other cases, it
    might be acceptable to just ignore legacy user agents that do not
    support the "filename*" format.  Of course, using different formats
    based on the User-Agent request header field is also an option, but
    the drawbacks of this choice need to be kept in mind
    (implementations improving, picking the right default, caching
    considerations).

    <http://purl.org/NET/http/content-disposition-tests> provides an
    overview of what implementations do support, and can assist in
    selecting an implementation approach.
-- snip --
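
To make the advice above concrete, here is a minimal Python sketch of a
sender that emits an ASCII-only "filename" as fallback plus an RFC 5987
"filename*" in UTF-8. It is an illustration only, not proposed spec text.
The function names, the "_" replacement character, and the "download"
default are hypothetical choices. Note that, as described above, one user
agent currently picks the wrong parameter when both are present, so this
combination is not safe everywhere.

```python
import re
import unicodedata
from urllib.parse import quote

# attr-char (RFC 5987, Section 3.2.1) is ALPHA / DIGIT plus a few symbols.
# urllib.parse.quote() never escapes letters, digits, "_", ".", "-", "~",
# so only the remaining attr-char symbols are listed as safe here.
ATTR_CHAR_SAFE = "!#$&+^`|"


def encode_ext_value(filename: str) -> str:
    """Return an RFC 5987 ext-value using UTF-8 and no language tag."""
    return "UTF-8''" + quote(filename, safe=ATTR_CHAR_SAFE, encoding="utf-8")


def ascii_fallback(filename: str, replacement: str = "_") -> str:
    """Build a conservative value for the quoted-string "filename" form.

    Assumption: accents are dropped where Unicode offers a decomposition;
    everything else outside printable ASCII, plus the double quote, the
    backslash and "%", is replaced (these are the characters the draft
    flags as problematic in some user agents).
    """
    decomposed = unicodedata.normalize("NFKD", filename)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r'[^\x20-\x7e]|["\\%]', replacement, stripped)
    return cleaned or "download"  # hypothetical default for empty names


def content_disposition(filename: str) -> str:
    """Return a Content-Disposition field value for an attachment."""
    fallback = ascii_fallback(filename)
    value = f'attachment; filename="{fallback}"'
    if fallback != filename:
        # Only send filename* when the fallback had to lose information.
        value += f"; filename*={encode_ext_value(filename)}"
    return value


if __name__ == "__main__":
    print(content_disposition("report.pdf"))
    print(content_disposition("\u20ac rates.txt"))
```

A sender that prefers to ignore legacy user agents could send only the
"filename*" parameter; one that cannot rely on "filename*" at all could
send only the fallback.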

Best regards, Julian

Received on Tuesday, 22 February 2011 17:43:15 UTC