Re: Comments on draft-ietf-httpbis-content-disp from Martin J. Dürst on 2010-11-02 (ietf-http-wg@w3.org from October to December 2010)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Tue, 02 Nov 2010 13:52:02 +0900
To: Adam Barth <ietf@adambarth.com>
CC: httpbis <ietf-http-wg@w3.org>
Message-ID: <4CCF98F2.4010309@it.aoyama.ac.jp>

On 2010/11/01 17:30, Adam Barth wrote:

> Jungshik Shin writes:
>
> [[
> As for RFC 5987, I'm aware that it's a profile of RFC 2231 (it's good
> that it's simpler than the full RFC 2231), but I wrote that it's
> unnecessarily 'complex' and not many web servers would adopt that
> anytime soon. That's why I advocated a much simpler approach of using
> (percent-encoded) UTF-8. I'm aware that it has its own share of
> issues, but I suspect that it's got a better chance of being adopted
> by web servers.
> ]]
>
> I agree with his assessment.  We should simply use percent-encoded
> UTF-8 instead of letting the server specify whatever crazy encoding it
> dreams up.

For the record, I also agree with this. The simpler, more general, and 
more straightforward the encoding is, the better. This can be specified 
as an additional layer on top of header processing, if necessary.
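
As a rough illustration of the "percent-encoded UTF-8" approach discussed
above, here is a minimal Python sketch. The helper names
encode_filename and decode_filename are hypothetical, invented for this
example; they are not taken from the draft or from any implementation.
The sketch only shows that the round trip is unambiguous when both
sides agree on UTF-8.

```python
from urllib.parse import quote, unquote


def encode_filename(name: str) -> str:
    """Percent-encode a filename as UTF-8 (hypothetical sender-side helper)."""
    # safe="" means every reserved character is escaped; unreserved
    # characters (letters, digits, "-", ".", "_", "~") are left as-is.
    return quote(name, safe="", encoding="utf-8")


def decode_filename(value: str) -> str:
    """Decode a percent-encoded value, always assuming UTF-8 (receiver side)."""
    # errors="strict" raises UnicodeDecodeError on invalid UTF-8 rather
    # than silently substituting replacement characters.
    return unquote(value, encoding="utf-8", errors="strict")


encoded = encode_filename("Dürst.txt")
print(encoded)                   # D%C3%BCrst.txt
print(decode_filename(encoded))  # Dürst.txt
```

Because the encoding is fixed, the receiver needs no charset label and
no knowledge of the sender's locale.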

> Also, we should remove the language tagging facility
> because it is gratuitous.

I also agree with this. No current OS or related facility 'knows' in 
any way what language its filenames are in. Nor, in the interfaces I 
know, is there any way for a user to enter that information. There is 
also no practice of language negotiation for filenames (there is 
language negotiation for different language versions of content, where 
as a result the filenames may also be different, but that's a different 
thing).

> http://tools.ietf.org/html/draft-ietf-httpbis-content-disp-03#appendix-C.2
>
> As far as I can tell, this is actually the biggest interoperability
> problem with the Content-Disposition header field.  Unfortunately,
> this document does nothing to resolve this issue.  I recommend that
> this document take a position with respect to how to handle
> percent-encoded values in the filename parameter.  Specifically, I
> recommend that the document instruct user agents to decode percent
> encoded values using the user's preferred encoding.  Yes, that's ugly,
> but it's the way Content-Disposition works in the real world and the
> most likely requirement to actually be implemented by user agents.

Well, you say 'ugly', but in this case, it's equivalent to "does *NOT* 
work". [Users don't prefer encodings; it's the user's system that has a 
preferred encoding.] The preferred encoding on the receiving side and 
the encoding used on the server side often differ, and the result is 
garbage. I'm not sure that recommending to produce garbage is a good 
idea at all.
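
To make the "garbage" concrete, here is a small Python sketch under
stated assumptions: the server percent-encodes the filename as UTF-8,
and the receiving system's preferred encoding is windows-1252. Both
encodings are picked only for illustration, and the function name
decode_with_local_encoding is hypothetical.

```python
from urllib.parse import quote, unquote


def decode_with_local_encoding(value: str, local_encoding: str) -> str:
    """Decode percent-escapes using the receiver's local encoding (assumed)."""
    # errors="replace" mirrors a lenient receiver that never fails outright.
    return unquote(value, encoding=local_encoding, errors="replace")


# Server side: bytes produced from UTF-8 (assumption for this example).
sent = quote("Dürst.txt", safe="", encoding="utf-8")

# Receiver side: same bytes, interpreted as windows-1252 instead.
received = decode_with_local_encoding(sent, "cp1252")

print(sent)      # D%C3%BCrst.txt
print(received)  # DÃ¼rst.txt
```

The two bytes C3 BC that represent "ü" in UTF-8 are read as two separate
windows-1252 characters, so the filename the user sees no longer matches
what the server intended.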

As for the wording of Appendix C.2, even if it stays more or less as it 
is, it is confusing. It first says that some user agents accept UTF-8, 
then it says that the first user agent to implement this (i.e. UTF-8) 
used the local encoding (i.e. NOT UTF-8).

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp

Received on Tuesday, 2 November 2010 04:52:47 UTC