Re: Content-Disposition next steps from Maciej Stachowiak on 2010-12-13 (ietf-http-wg@w3.org from October to December 2010)

From: Maciej Stachowiak <mjs@apple.com>
Date: Mon, 13 Dec 2010 02:54:32 -0800
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Adam Barth <ietf@adambarth.com>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-id: <0B467F86-EAE5-4FE8-8C01-5C62268E3CE5@apple.com>

I wanted to add, for clarity, that we don't see any major problems in Adam's proposal as-is, though we'd suggest adding ISO-8859-1 as a final fallback.
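
For illustration only, a final ISO-8859-1 fallback could look roughly like the sketch below. The function name and the UTF-8-first ordering are assumptions made for the example; this is not Adam's algorithm or Safari's actual rule, both of which involve more steps.

```python
def decode_filename_bytes(raw: bytes) -> str:
    """Decode raw filename bytes from a Content-Disposition header.

    Hypothetical helper: try UTF-8 first (an assumed ordering), then
    fall back to ISO-8859-1 as the last resort.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 assigns a character to every byte value, so this
        # final step cannot fail; that is what makes it a usable fallback.
        return raw.decode("iso-8859-1")


if __name__ == "__main__":
    # The same name sent as UTF-8 bytes and as ISO-8859-1 bytes.
    print(decode_filename_bytes(b"r\xc3\xa9sum\xc3\xa9.pdf"))
    print(decode_filename_bytes(b"r\xe9sum\xe9.pdf"))
```

Both calls yield the same name, because the ISO-8859-1 bytes are not valid UTF-8 and so reach the fallback.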

If there is a test suite that matches the expectations of Adam's proposal and is easy to run, I'll try to get someone to run it. Or if this testing has already been done, I can comment on the ways it diverges from Safari behavior and whether we are likely to care.

On Dec 13, 2010, at 1:28 AM, Julian Reschke wrote:

> Hi Maciej,
> 
> thanks for forwarding.
> 
> On 13.12.2010 10:06, Maciej Stachowiak wrote:
>> Here are some comments from my colleague Alexey Proskuryakov on your
>> proposal. I know these may have been outpaced by the considerable
>> discussion since that point, but they still seem like they could be useful.
>> 
>>> I only know about file name decoding - all parsing is of course in
>>> CFNetwork, and most logic is in Launch Services, I think.
>>> 
>>> Adam's proposal is a step forward in that it acknowledges the need to
>>> process raw non-ASCII bytes in filename, which is the only encoding
> 
> That's incorrect in that the base spec already says it's ISO-8859-1 (although this may be hard to find in the published specs as opposed to the Internet Draft we're discussing).
> 
> (maybe this is a case where Alexey looked at an old proposal?)

I suspect he looked at the existing published RFC. In any case, treating everything as Latin1 is likely not acceptable to us. We came up with our (somewhat complicated) rule through a long process of trial and error based on bug reports and user requirements.

> 
>>> style that matters. He also describes the proper algorithm,
>>> acknowledging that Chrome doesn't fully implement it. Unsurprisingly,
>>> that part was met with resistance from the "we always told you it was
>>> ISO-8859-1" crowd.
>>> 
>>> I agree that RFC2047 style encoding shouldn't be supported, and I'm
>>> ambivalent about RFC5987. RFC2231/5987 is a step in the wrong
>>> direction (opaque encoding for something that doesn't need it), but
>>> given that IETF won't cease pushing it, we might as well implement it
>>> and be more compatible with Firefox, if not the Web.
>>> 
>>> - WBR, Alexey Proskuryakov
> 
> In a perfect world we could declare that the HTTP header encoding is UTF-8. But it isn't.
> 
> If we *tried* to change the default encoding just for C-D/filename, we would still break existing code.
> 
> So I'm not sure what the "doesn't need it" refers to.

Treating filenames in Content-Disposition as Latin1 will in any case break existing code. I think the WG's only choices here are which code to break, and by how much.
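
A small sketch of the kind of breakage meant here, assuming a server that already sends raw UTF-8 bytes in filename (the helper name is hypothetical):

```python
def misread_as_latin1(filename: str) -> str:
    """Simulate a client that decodes UTF-8 filename bytes as ISO-8859-1."""
    raw = filename.encode("utf-8")    # what the assumed server puts on the wire
    return raw.decode("iso-8859-1")   # what a strict Latin1 reading produces


if __name__ == "__main__":
    # Each non-ASCII character turns into two unrelated Latin1 characters.
    print(misread_as_latin1("r\u00e9sum\u00e9.pdf"))  # prints "rÃ©sumÃ©.pdf"
```

Pure-ASCII names pass through unchanged, so the damage only shows up for the non-ASCII filenames under discussion.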

Regards,
Maciej

Received on Monday, 13 December 2010 10:55:21 UTC