
Re: Problem in downloading a pdf file having Japanese characters in the name of the file

From: A. Vine <andrea.vine@Sun.COM>
Date: Fri, 31 Oct 2003 09:42:14 -0800
To: Jungshik Shin <jshin@i18nl10n.com>
Cc: Steve Billings <billings@global360.com>, "souravm (by way of Martin Duerst <duerst@w3.org>)" <souravm@infosys.com>, www-international@w3.org
Message-id: <3FA29EF6.2070605@sun.com>

I'm with Steve.  RFC 2231 is so awkward and has so little support (it has
been around for many years, yet only now have a few products begun to
support it) that I never recommend using it.

Andrea

Jungshik Shin wrote:

> 
> On Thu, 30 Oct 2003, Steve Billings wrote:
> 
> 
>>I wrestled with this problem earlier this year, and unfortunately found no
>>good solutions. As far as I can tell (and I hope someone can prove me
>>wrong), it's a yet-to-be-solved problem in the Internet infrastructure. I
>>was using recent versions of IE and Netscape browsers, and a not-so-new
>>version of Tomcat (3.something, I think).
>>
>>The approach that came closest to working was to encode the filename using
>>URLEncoder
>>(http://java.sun.com/j2se/1.4.1/docs/api/java/net/URLEncoder.html) with
>>UTF-8, and set the Content-Disposition according to RFC 2047 as follows:
> 
> 
>   You're not supposed to use RFC 2047 encoding for _parameters_
> (such as 'filename' in the Content-Disposition header) of a header
> field. It's RFC 2231 that has to be used.  It's regrettable that
> this fact is buried deep inside RFC 822/STD 11, RFC 2047, RFC 2184
> and RFC 2231.
> 
> 
>>With this approach, if the Japanese filename is short, when you save the
>>file from the browser, everything looks fine. If you open it without saving
>>it, Notepad gets the encoded name (bad). Another problem is that this
>>approach can only handle filenames up to about 17 Japanese characters.
>>
>>I tried using other standards (RFC 2184, RFC 2231) with no success.
> 
> 
>   Mozilla 1.5 or later does support RFC 2231 (see
> <http://bugzilla.mozilla.org/show_bug.cgi?id=162765>
> and <http://i18nl10n.com/moztest/download.html>)
> It's unfortunate that MS IE does not understand RFC 2231 when it is used
> in the Content-Disposition header of HTTP. As a fallback, Mozilla also
> accepts RFC 2047 encoding, 'raw' UTF-8, and a 'raw' non-ASCII string in
> the same character encoding as that of the 'containing' document.
> 
>  Jungshik
> 
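
For anyone following the thread, here is a minimal sketch of what the
RFC 2231 extended form of the 'filename' parameter looks like. It is an
illustration only, not what any of the products mentioned above actually
emit. The helper names (rfc2231_filename_param, content_disposition) and
the sample filename are made up for this example, and per the discussion
above a client such as MS IE may simply not understand the result.

```python
from urllib.parse import quote


def rfc2231_filename_param(filename: str, charset: str = "UTF-8",
                           language: str = "") -> str:
    """Build a filename* parameter in the RFC 2231 extended form.

    Hypothetical helper for illustration. The extended value has the shape
        charset'language'percent-encoded-octets
    where the language tag may be left empty.
    """
    # Encode the name to bytes in the declared charset, then percent-encode
    # every byte that is not an ASCII letter, digit, or one of "_.-~".
    encoded = quote(filename.encode(charset), safe="")
    return f"filename*={charset}'{language}'{encoded}"


def content_disposition(filename: str) -> str:
    """Assemble a Content-Disposition header value (hypothetical helper)."""
    return "attachment; " + rfc2231_filename_param(filename)


if __name__ == "__main__":
    # Sample Japanese filename, chosen only for the example.
    print(content_disposition("日本語.pdf"))
```

The trailing '*' on the parameter name is what tells a receiver that the
value carries a charset and percent-encoded octets rather than a plain
token or quoted string. Python's standard library also offers
email.utils.encode_rfc2231 for the value part.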

-- 
I have always wished that my computer would be as easy to use as my 
telephone. My wish has come true. I no longer know how to use my telephone.
-Bjarne Stroustrup, designer of C++ programming language (1950- )
Received on Friday, 31 October 2003 13:13:57 GMT
