Re: Content-Disposition status

From: Adrien de Croy <adrien@qbik.com>
Date: Tue, 12 Jul 2011 13:26:21 +1200
Message-ID: <4E1BA2BD.7030109@qbik.com>
To: Julian Reschke <julian.reschke@gmx.de>
CC: HTTP Working Group <ietf-http-wg@w3.org>

I bumped into this problem recently, with a customer complaining that 
filenames were munged when being downloaded through our proxy when they 
were specified in Shift-JIS.

Our proxy was basically not binary-safe for headers containing
non-ASCII bytes (which shouldn't exist in the first place), so that's a
bug on our side.

But I think we are still going to have an issue, especially with
Content-Disposition (one of the more popular candidates for
localisation). The reason is that many, if not most,
Content-Disposition header values come from some script written by
someone who never read the MIME or HTTP specs.

So I don't think this problem is going to go away.

I don't see too many opportunities to clean this up either. In fact,
about the only option I can think of is that servers hosting scripts
must clean up the header output of the scripts to make it conformant.
This reduces the set of people who need to do something from the
large set of those writing scripts down to the much smaller set of
those writing web server software, who are hopefully more likely to
take notice of the IETF.

In order for Content-Disposition to be converted into the form
proposed in RFC 6266, more information needs to be communicated from
the script to the host web server. I don't see this happening: RFC 6266
uses the RFC 5987 encoding, which requires at least a character set,
and that information simply isn't present because the script doesn't
emit it.
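
To illustrate why the character set matters, here is a minimal Python
sketch of producing the "filename*" form. The function names
(encode_ext_value, content_disposition) are hypothetical, and the
sketch assumes the server somehow knows the charset of the bytes the
script emitted, which is exactly the information that is usually
missing.

```python
from urllib.parse import quote

# attr-char punctuation from RFC 5987; letters and digits are always
# left unescaped by quote().
ATTR_CHARS = "!#$&+-.^_`|~"


def encode_ext_value(raw: bytes, charset: str) -> str:
    """Turn raw filename bytes into an RFC 5987 ext-value in UTF-8.

    The charset argument is the piece of information a script normally
    does not supply; without it the bytes cannot be decoded reliably.
    """
    text = raw.decode(charset)
    return "UTF-8''" + quote(text, safe=ATTR_CHARS, encoding="utf-8")


def content_disposition(raw_filename: bytes, charset: str) -> str:
    """Build an attachment header value using only the filename* form."""
    return "attachment; filename*=" + encode_ext_value(raw_filename, charset)


if __name__ == "__main__":
    raw = "日本語.txt".encode("shift_jis")
    # Correct charset: the original name survives the round trip.
    print(content_disposition(raw, "shift_jis"))
    # Wrong guess: decoding succeeds but yields a different (mangled) name.
    print(content_disposition(raw, "latin-1"))
```

The same bytes give two different header values depending on the
charset that is assumed, so a server cannot do this conversion
correctly by itself.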

Getting all these scripts changed to emit character sets as well is,
I think, never going to happen, so there will always be a requirement
on something downstream to clean it up. I think the best place to do
that is the host web server. If this requires sniffing, or breaking
the response in some way (e.g. assuming some default character set),
then that may provide the necessary incentive to get it fixed in the
script.
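
As a rough sketch of what such a clean-up step in the host web server
might look like (Python, purely illustrative): the function name
normalize_content_disposition, the default_charset parameter and the
simplistic regex are all assumptions of mine, not anything taken from
a real server. It assumes a default charset for non-ASCII bytes and
re-emits the value with an ASCII "filename" fallback followed by
"filename*", as RFC 6266 recommends.

```python
import re
from urllib.parse import quote

ATTR_CHARS = "!#$&+-.^_`|~"  # attr-char punctuation from RFC 5987

# Simplistic: matches filename="..." or filename=token, and does not
# handle backslash-escaped quotes inside the quoted string.
FILENAME_RE = re.compile(rb'filename\s*=\s*(?:"([^"]*)"|([^;\s]+))',
                         re.IGNORECASE)


def _ascii(data: bytes) -> str:
    """Decode as ASCII, replacing any non-ASCII byte with '?'."""
    return data.decode("ascii", errors="replace").replace("\ufffd", "?")


def normalize_content_disposition(raw_value: bytes,
                                  default_charset: str = "utf-8") -> str:
    """Rewrite a script-supplied header value so that it is pure ASCII."""
    if all(b < 0x80 for b in raw_value):
        return raw_value.decode("ascii")  # already ASCII, pass through
    m = FILENAME_RE.search(raw_value)
    if m is None:
        return _ascii(raw_value)  # non-ASCII bytes outside the filename
    raw_name = m.group(1) if m.group(1) is not None else m.group(2)
    # The guess: treat the bytes as default_charset (may be wrong).
    name = raw_name.decode(default_charset, errors="replace")
    fallback = "".join(
        c if c.isascii() and c.isprintable() and c not in '"\\' else "_"
        for c in name
    )
    ext = "UTF-8''" + quote(name, safe=ATTR_CHARS, encoding="utf-8")
    prefix = _ascii(raw_value[:m.start()])
    suffix = _ascii(raw_value[m.end():])
    return f'{prefix}filename="{fallback}"; filename*={ext}{suffix}'
```

If default_charset is a wrong guess the user sees a mangled name,
which is the "breaking the response" incentive mentioned above.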

I don't recall seeing any language in RFC 2616 about requirements for
servers that host scripts. In some ways the only way to comply with
RFC 2616 etc. (e.g. to generate compliant output when the output
originates from a script) is to parse the script output and normalize
it in some way. I don't think this is done much in practice, and the
spec could be a lot more explicit that this is required in order to
comply when the source data may not be compliant (since it's under the
control of the script author).

Adrien


On 22/06/2011 6:19 a.m., Julian Reschke wrote:
> On 2011-05-27 13:41, Julian Reschke wrote:
>> Hi there,
>>
>> 1) draft-ietf-httpbis-content-disposition is now close to publication,
>> with editorial fixes being applied; I'm tracking the changes at
>> <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-content-disp-latest-from-previous.diff.html>. 
>>
>
> Published as RFC 6266 -- "pretty" version at 
> <http://greenbytes.de/tech/webdav/rfc6266.html>.
>
>> 2) The spec recommends a fallback strategy for UAs that do not support
>> the RFC 5987 encoding; this didn't work in Firefox 4 (as it picked
>> "filename" rather than "filename*", see
>> <https://bugzilla.mozilla.org/show_bug.cgi?id=588781>) but will work in
>> Firefox 5 (which just went to beta).
>
> Firefox 5 was released today.
>
> This means that I18Nized filenames now can be sent without UA 
> sniffing, if it's ok for you to ignore Safari (which still doesn't 
> have RFC2231/5987 support).
>
> See <http://greenbytes.de/tech/webdav/rfc6266.html#rfc.section.D> for 
> the full story.
>
>> 3) The test cases at <http://greenbytes.de/tech/tc2231/> have been
>> augmented with parsing results based on a regexp-based parser embedded
>> into the XSLT that generates the page; this is work-in-progress and
>> currently only covers the generic syntax and not yet the RFC2231/5987
>> encoding used in "filename*".
>
> In the meantime I also added code that decodes the 2231/5987-encoded 
> parameters.
>
> Best regards, Julian
>

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
WinGate 7 beta out now - http://www.wingate.com/getlatest/
Received on Tuesday, 12 July 2011 01:27:03 GMT