- From: Henrik Frystyk Nielsen <frystyk@w3.org>
- Date: Wed, 26 Feb 1997 13:48:10 -0500
- To: w3c-dist-auth@w3.org, w3c-http@w3.org
I have some new performance results that are of interest to implementors of HTML authoring tools.

As part of our paper "Network Performance Effects of HTTP/1.1, CSS1, and PNG" we looked into compression of HTML to see how many bytes, and hence how much transfer time, we can save. The paper is available from

http://www.w3.org/pub/WWW/Protocols/HTTP/Performance/Pipeline.html

We have made some simple tests on how zlib compression is affected by case-canonicalizing HTML tags. The figures are available at

http://www.w3.org/pub/WWW/Protocols/HTTP/Performance/Compression/HTMLCanon.html

From this very small test, _lowercase_ canonicalization of HTML tags gives the best compression results. This is not surprising: most of the actual text in the document is lowercase, so lowercase HTML tag names are more likely to match strings already in the compression dictionary than uppercase tag names are. Uppercase is, however, what most editors produce today.

Optimizing the compression algorithm for size does not have a significant impact compared to the default compression.

These data should be taken with a grain of salt as the data set is very small (exactly one file, which is the top page of our "microscape" test site).

Other things that would be interesting to investigate are different dictionaries and other types of canonicalization.

Thanks,

Henrik

--
Henrik Frystyk Nielsen, <frystyk@w3.org>
World Wide Web Consortium, MIT/LCS NE43-346
545 Technology Square, Cambridge MA 02139, USA
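
For anyone who wants to try the tag-case comparison described in this message on their own pages, here is a minimal sketch in Python using the standard zlib module. It is not the script used for the published figures. The helper names (canonicalize_tags, deflated_size), the regular expression, and the sample markup are all made up for illustration, and zlib level 9 is only assumed to be a reasonable stand-in for "optimizing for size". The regex is deliberately naive: it rewrites anything that looks like a tag name, including inside comments or scripts.

```python
import re
import zlib

# Naive matcher for the name of an opening or closing tag, e.g. "<BODY" or "</P".
# Attribute names, attribute values, and document text are left untouched.
TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9]*)")


def canonicalize_tags(html: str, lower: bool = True) -> str:
    """Return html with every tag name forced to lowercase (or uppercase)."""
    def fix(match: re.Match) -> str:
        name = match.group(2)
        return match.group(1) + (name.lower() if lower else name.upper())
    return TAG_NAME.sub(fix, html)


def deflated_size(html: str, level: int = zlib.Z_DEFAULT_COMPRESSION) -> int:
    """Number of bytes after zlib compression at the given level."""
    return len(zlib.compress(html.encode("utf-8"), level))


# Invented sample markup; substitute the contents of a real page.
SAMPLE = (
    "<HTML><Head><TITLE>test page</TITLE></Head><BODY>"
    "<H1>a small heading</H1>"
    "<P>some body text with a <A HREF=\"index.html\">link</A> in it.</P>"
    "<UL><LI>first item</LI><li>second item</li></UL>"
    "</BODY></HTML>"
)

variants = {
    "mixed (as written)": SAMPLE,
    "uppercase tags": canonicalize_tags(SAMPLE, lower=False),
    "lowercase tags": canonicalize_tags(SAMPLE, lower=True),
}

for label, text in variants.items():
    default_size = deflated_size(text)
    best_size = deflated_size(text, zlib.Z_BEST_COMPRESSION)
    print(f"{label:20s} raw={len(text):4d} default={default_size:4d} best={best_size:4d}")
```

On a sample this short the differences will be tiny and may not match the figures referenced above; the comparison only becomes meaningful on full-size pages.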
Received on Wednesday, 26 February 1997 13:48:03 UTC