- From: Hugo Haas <hugo@w3.org>
- Date: Wed, 26 Apr 2000 19:50:53 -0400
- To: www-talk@w3.org
There are a lot of problems with the gzip'ed files downloaded from the site.

If I request:

    http://www.w3.org/TR/html4/html40.pdf

the server replies:

    Content-Encoding: gzip
    Content-Length: 962329
    Content-Location: html40.pdf.gz
    Content-Type: application/pdf; qs=0.001

Now, when I ask for:

    http://www.w3.org/TR/html4/html40.pdf.gz

the server replies:

    Content-Encoding: gzip
    Content-Length: 962329
    Content-Type: application/pdf; qs=0.001

This is the configuration on www.w3.org, and I believe that it is Apache's default behavior.

When I request html40.pdf, it is obvious that my browser needs to decode (i.e. uncompress) the file on the fly and save it as html40.pdf.

What about html40.pdf.gz? The HTTP headers are the same apart from Content-Location, so I guess that is why browsers usually uncompress the file. And since they asked for html40.pdf.gz, they save it under that name (nothing tells them to drop the .gz extension). Under Windows, you end up with an uncompressed PDF file with a .gz extension, which PDF readers don't like.

I think that the reply to http://www.w3.org/TR/html4/html40.pdf.gz should not specify any content encoding, that its content type should be set to application/x-gzip or something similar, and that application/pdf should only be used when the resource is negotiated.

RFC2616 says (section 7.2.1):

    Content-Type specifies the media type of the underlying data.
    Content-Encoding may be used to indicate any additional content
    codings applied to the data, usually for the purpose of data
    compression, that are a property of the requested resource. There
    is no default encoding.

My understanding is that this describes an encoding of the transfer, not of the data.

What is the correct behavior (of both clients and servers)?

-- 
Hugo Haas, Webmaster, Systems Team - W3C/MIT
mailto:hugo@w3.org - tel:+1-617-452-2092
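To make the proposal in this message concrete, here is a minimal Python sketch of the header selection it suggests. It is not Apache's actual logic: the function name `proposed_headers`, its parameters, and the hard-coded `application/pdf` type are all assumptions made for illustration.

```python
import posixpath


def proposed_headers(requested_path, served_file, size):
    """Pick response headers as proposed in the message (hypothetical helper).

    requested_path: URL path the client asked for.
    served_file:    path of the file the server actually sends.
    size:           length in bytes of the body that is sent.
    """
    requested_gz = requested_path.endswith(".gz")
    served_gz = served_file.endswith(".gz")
    headers = {"Content-Length": str(size)}

    if served_gz and not requested_gz:
        # Negotiated case: the client asked for the PDF and receives a
        # gzip-coded representation, so the coding is announced and the
        # client is expected to decode it and keep the .pdf name.
        headers["Content-Type"] = "application/pdf"
        headers["Content-Encoding"] = "gzip"
        headers["Content-Location"] = posixpath.basename(served_file)
    elif requested_gz:
        # Direct request for the .gz file: treat the gzip archive itself as
        # the resource, with no content coding to undo.
        headers["Content-Type"] = "application/x-gzip"
    else:
        # Plain, uncompressed PDF (type assumed for this sketch).
        headers["Content-Type"] = "application/pdf"
    return headers


negotiated = proposed_headers(
    "/TR/html4/html40.pdf", "/TR/html4/html40.pdf.gz", 962329)
direct = proposed_headers(
    "/TR/html4/html40.pdf.gz", "/TR/html4/html40.pdf.gz", 962329)
```

With these rules, a client that saves the direct download as html40.pdf.gz would hold a real gzip file, while the negotiated download would still be decoded and saved as html40.pdf.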
Received on Tuesday, 2 May 2000 15:56:59 UTC