Re: Content-Disposition status

On 12/07/2011, at 11:26 AM, Adrien de Croy wrote:

> 
> I bumped into this problem recently, with a customer complaining that filenames were munged when being downloaded through our proxy when they were specified in Shift-JIS.

Were they seeing interoperable behaviour across browsers when sending Shift-JIS in C-D, or a single browser?


> We were basically not binary safe with non-ASCII headers (which shouldn't exist), so that's an issue.

Yes; as per <http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-15#section-3.2>,

> Recipients SHOULD treat other (obs-text) octets in field content as opaque data.
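
To make "opaque" concrete, here is a minimal Python sketch of the difference between a binary-safe pass-through and a lossy one. It is an illustration only, not how any particular proxy works: the function names and the sample header value are hypothetical, and the lossy variant is just one plausible way an intermediary could damage such a value.

```python
# Hypothetical field value as raw octets: a Shift-JIS filename sent unencoded.
# "\u65e5\u672c\u8a9e" is a three-character Japanese sample string.
raw_value = (
    b'attachment; filename="'
    + "\u65e5\u672c\u8a9e.txt".encode("shift_jis")
    + b'"'
)


def forward_opaque(value: bytes) -> bytes:
    """Pass a field value through a str-based layer without changing octets."""
    # ISO-8859-1 maps every octet 0x00-0xFF to exactly one code point, so
    # decoding and re-encoding with it is lossless for arbitrary bytes.
    as_text = value.decode("latin-1")
    return as_text.encode("latin-1")


def forward_lossy(value: bytes) -> bytes:
    """Assumed failure mode: treat the octets as UTF-8 and replace errors."""
    return value.decode("utf-8", errors="replace").encode("utf-8")


print(forward_opaque(raw_value) == raw_value)  # octets preserved
print(forward_lossy(raw_value) == raw_value)   # octets altered
```

The first function never interprets the octets, which is all the SHOULD above asks of a recipient; the second shows how a well-meaning transcoding step can munge the filename.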



> But I think we are still going to have an issue, especially with Content-Disposition (one of the more popular candidates for localisation).  The reason being that many if not most Content-Disposition header values are coming from some script, written by someone who never read MIME or HTTP specs.

Yep, it's very important to keep that distinction in mind, for all sorts of things.


> So I don't think this problem is going to go away.
> 
> I don't see too many opportunities to clean this up either.  In fact about the only option I can think of is that servers hosting scripts must clean up the header output of the scripts to make it conformant.  This then reduces the set of people who need to do something from the large set of those writing scripts, down to the much smaller set of those writing web server software who hopefully are more likely to take notice of the IETF.
> 
> In order for Content-Disposition to be able to be converted into the form proposed in RFC 6266, more information needs to be communicated from the script to the host web server.  I don't see this happening, since 6266 uses 5987 encoding, which requires at least a character set.  That information simply isn't present since it's not emitted by the script.
> 
> Getting all these scripts changed to emit character sets as well I think is just never going to happen, so there will always be a requirement on something downstream to clean it up.  I think the best place to clean it up is in the host web server, and if this requires sniffing, or breaking the response in some way (e.g. assuming some default character set) then this will provide the necessary incentive to get it cleaned up in the script perhaps.


I'm not so glum. Having the server do it is a non-starter to me (as you indicate), but scripts usually use frameworks and libraries for these things. If 6266 points a way for them to interoperate with all modern and many old browsers, the frameworks will fall in line, given time (and perhaps some evangelisation).
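
As a rough idea of what such a framework helper could look like, here is a minimal Python sketch that emits both a plain filename parameter and the RFC 5987-encoded filename* parameter described in RFC 6266. The function name content_disposition and the fallback policy (non-ASCII characters become "?", quotes and backslashes become "_") are assumptions for illustration; a real library would also want to deal with control characters and path separators in the fallback.

```python
from urllib.parse import quote

# attr-char in RFC 5987 is ALPHA / DIGIT / "!#$&+-.^_`|~". quote() already
# leaves letters, digits and "_.-~" unescaped, so only the rest is listed.
_ATTR_CHAR_SAFE = "!#$&+^`|"


def content_disposition(filename: str, disposition: str = "attachment") -> str:
    """Build a header value with an ASCII fallback and a UTF-8 filename*."""
    # Fallback for clients that ignore filename*: non-ASCII becomes "?".
    fallback = filename.encode("ascii", errors="replace").decode("ascii")
    # Keep the quoted-string simple by dropping characters that need escaping.
    fallback = fallback.replace("\\", "_").replace('"', "_")
    # Percent-encode the UTF-8 octets of everything outside attr-char.
    encoded = quote(filename, safe=_ATTR_CHAR_SAFE, encoding="utf-8")
    return f"{disposition}; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


print(content_disposition("\u20ac rates.txt"))
```

The point is that the script author only supplies a Unicode string; the library picks UTF-8 as the character set, so the missing-charset problem never reaches the server.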

Julian, have you done any work to notify the various frameworks that they might want to have a look at 6266?

Cheers,

--
Mark Nottingham   http://www.mnot.net/

Received on Tuesday, 12 July 2011 06:57:57 UTC