Re: HTML 4.01.pdf?

On Thu, Sep 07, 2000, Steven wrote:
> Ever heard of pdf? It's a format capable of delivering high
> quality text for easy online reading. That is if the whole thing
> isn't ruined by visible link boxes! (that's hint number 1).
> Pdf doesn't have to be stuffed to be accessible on the web.
> Especially not in a mac-unfriendly gzip format. (that's hint
> number 2). I have tried both on a Mac and on a PC to download
> the HTML401 spec in pdf format. (http://www.w3.org/TR/html4/html40.pdf.gz)
> In both cases I get error messages that say the file is corrupted.
> Please, don't be geeky and give us the plain pdf and loose the
> visible link boxes in the file. And if you must compress do it in
> formats that can be read.

I cannot answer your first comment, but I can answer your second one.

GZIP is an encoding format describe in RFC1952 which has been developped
to be platform independent.

It is one of the 4 content coding listed in the HTTP/1.1 specification
(RFC2616, section 3.5) and is served as such by our Web server:

	Content-Encoding: gzip
	Content-Length: 962329
	Content-Type: application/pdf; qs=0.001
	
As far as I know, most of the programs able to uncompress ZIP files can
uncompress GZIP files (WinZip for example - I am not a Mac person so I
can't speak about Macs).

The file sometimes _appears_ to be corrupted because the Web browser
uncompresses the file on the fly and names it html40.pdf.gz whereas once
uncompressed it is PDF and should most likely be named html40.pdf, or
doesn't modify the file and names it html40.pdf instead of html40.pdf.gz
since in this case it is still GZIP'ed (I have seen such behaviors on
PCs).

Basically, we are compressing the PDF file to save bandwidth and reduce
the downloading time (the file would be 3 times bigger uncompressed) in
a standard way.

Regards,

Hugo

-- 
Hugo Haas, Webmaster, Systems Team - W3C/MIT
mailto:hugo@w3.org - tel:+1-617-452-2092

Received on Tuesday, 19 September 2000 10:40:13 UTC