Re: Content-Disposition next steps from Adam Barth on 2010-12-13 (ietf-http-wg@w3.org from October to December 2010)

From: Adam Barth <ietf@adambarth.com>
Date: Mon, 13 Dec 2010 03:00:50 -0800
To: Maciej Stachowiak <mjs@apple.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <AANLkTinPm701ca1O2pUw+5z5y6udcO-5VByLoBvFDkvs@mail.gmail.com>

On Mon, Dec 13, 2010 at 2:54 AM, Maciej Stachowiak <mjs@apple.com> wrote:
> I wanted to add, for clarity, that we don't see any major problems in Adam's proposal as-is, though we'd suggest adding ISO-8859-1 as a final fallback.

Done.

> If there is a test suite that matches the expectations of Adam's proposal and is easy to run, I'll try to get someone to run it. Or if this testing has already been done, I can comment on the ways it diverges from Safari behavior and whether we are likely to care.

Julian has written a nice test suite.  We'll just need to set the expectations.

> On Dec 13, 2010, at 1:28 AM, Julian Reschke wrote:
>
>> Hi Maciej,
>>
>> thanks for forwarding.
>>
>> On 13.12.2010 10:06, Maciej Stachowiak wrote:
>>> Here are some comments from my colleague Alexey Proskuryakov on your
>>> proposal. I know these may have been outpaced by the considerable
>>> discussion since that point, but they still seem like they could be useful.
>>>
>>>> I only know about file name decoding - all parsing is of course in
>>>> CFNetwork, and most logic is in Launch Services, I think.
>>>>
>>>> Adam's proposal is a step forward in that it acknowledges the need to
>>>> process raw non-ASCII bytes in filename, which is the only encoding
>>
>> That's incorrect in that the base spec already says it's ISO-8859-1 (although this may be hard to find in the published specs as opposed to the Internet Draft we're discussing).
>>
>> (maybe this is a case where Alexey looked at an old proposal?)
>
> I suspect he looked at the existing published RFC. In any case, treating everything as Latin1 is likely not acceptable to us. We came up with our (somewhat complicated) rule through a long process of trial and error based on bug reports and user requirements.

The rule in my proposal isn't quite as complicated as what Safari
implements.  In particular, the proposal doesn't take the current
frame encoding into account.

>>>> style that matters. He also describes the proper algorithm,
>>>> acknowledging that Chrome doesn't fully implement it. Unsurprisingly,
>>>> that part was met with resistance from the "we always told you it was
>>>> ISO-8859-1" crowd.
>>>>
>>>> I agree that RFC2047 style encoding shouldn't be supported, and I'm
>>>> ambivalent about RFC5987. RFC2231/5987 is a step in the wrong
>>>> direction (opaque encoding for something that doesn't need it), but
>>>> given that IETF won't cease pushing it, we might as well implement it
>>>> and be more compatible with Firefox, if not the Web.
>>>>
>>>> - WBR, Alexey Proskuryakov
>>
>> In a perfect world we could declare that the HTTP header encoding is UTF-8. But it isn't.
>>
>> If we *tried* to change the default encoding just for C-D/filename, we would still break existing code.
>>
>> So I'm not sure what the "doesn't need it" refers to.
>
> Treating filenames in Content-Disposition as Latin1 will in any case break existing code. I think the WG's only choices here are which code to break, and by how much.

I've added back UTF-8 as the first encoding to try based on your and
Alexey's input.
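
As a rough illustration of the two ends of that fallback order (UTF-8
tried first, ISO-8859-1 as the final fallback), here is a minimal
Python sketch.  It is not the proposal's full algorithm: any steps that
sit between these two are not described in this message and are left
out, and the function name decode_filename is made up for the example.

```python
def decode_filename(raw: bytes) -> str:
    """Decode raw filename bytes from a Content-Disposition header.

    Sketch only: tries UTF-8 first and uses ISO-8859-1 as the last
    resort. Any intermediate encodings are deliberately omitted.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on invalid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("iso-8859-1")


if __name__ == "__main__":
    print(decode_filename("€ rates.txt".encode("utf-8")))
    print(decode_filename(b"caf\xe9.txt"))
```

Because ISO-8859-1 accepts any byte sequence, it only makes sense as the
last step: anything placed after it would never be reached.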

Adam

Received on Monday, 13 December 2010 11:02:31 UTC