Re: Validator errors

Lloyd Wood <l.wood@eim.surrey.ac.uk> wrote:

>>Given that even with current round-peg-in-square-hole attempts at HTTP
>>compression case difference has negligible impact on compression
>>efficiency -- due to the ratio of markup to data and the tendency of
>>authors and authoring tools to be internally consistent
>
>okay, so far. Unfortunately, database-generated output mixes and
>matches content from a variety of sources; compare ad banner insertion
>code, templates and content on any online magazine site you like.

You mean somewhere like, say, C|NET <URL:http://www.cnet.com/>? Where the sole
exception (in the 35KB+ document) to the rule that elements and attributes
are lowercase is the single META element?

The argument is valid (even if C|NET is a bad example of it), but it's still
nitpicking.
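
As a quick illustration (a rough Python sketch only; the markup snippet and
the crude replace() calls are made up to stand in for an authoring tool's
case convention), compressing the same template with lowercase tags and
with uppercase tags yields nearly identical sizes, because deflate builds
its dictionary from whichever convention the document consistently uses:

  import zlib

  # The same markup-heavy template, once with lowercase tags, once with
  # the tags folded to uppercase. Both versions are internally consistent.
  lower = (b'<table><tr><td class="nav"><a href="/news/">News</a></td>'
           b'<td class="nav"><a href="/reviews/">Reviews</a></td></tr></table>') * 200
  upper = (lower.replace(b'<table>', b'<TABLE>').replace(b'</table>', b'</TABLE>')
                .replace(b'<tr>', b'<TR>').replace(b'</tr>', b'</TR>')
                .replace(b'<td ', b'<TD ').replace(b'</td>', b'</TD>')
                .replace(b'<a ', b'<A ').replace(b'</a>', b'</A>'))

  # The two compressed sizes differ by only a handful of bytes.
  print(len(zlib.compress(lower, 9)), len(zlib.compress(upper, 9)))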


>>any specialized HTTP compression scheme should contain optimizations
>>for the kind of data that is likely to occur,
>
>the optimisations you appear to be considering would damage the
>integrity of the original bytestream.

No. Any generalized HTTP compression scheme would have to employ different
compression algorithms depending on the content served, including applying
no compression at all to content that is already compressed.
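
To sketch the idea (illustrative Python only; the media-type lists and the
encode_body() helper are invented for this message, not any proposed API),
a server-side hook could pick the encoding per response roughly like this:

  import zlib

  # Formats that are already compressed; recompressing them is wasted work.
  ALREADY_COMPRESSED = {
      'image/jpeg', 'image/gif', 'image/png',
      'application/zip', 'application/x-gzip',
  }

  def encode_body(content_type, body):
      """Return (content_encoding, body) for a single HTTP response."""
      major = content_type.split('/', 1)[0]
      if content_type in ALREADY_COMPRESSED or major in ('image', 'audio', 'video'):
          return 'identity', body          # pass compressed data through untouched
      if major == 'text':
          return 'deflate', zlib.compress(body, 9)
      return 'identity', body              # unknown types: leave alone

  print(encode_body('text/html', b'<html><body>hello</body></html>')[0])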

SGML and XML are ridiculously verbose because humans (currently) need to
read and write them. But the number of unique permutations of elements and
simple attributes can probably be expressed in two or three bytes. If you
also take into account prefix and suffix optimizations of complex
attributes, you can probably reduce _any_ element to a maximum of four
bytes and still provide ample room for growth (e.g. the X in XML).
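
A back-of-the-envelope sketch of that claim (Python; the regex is a crude
stand-in for a real SGML/XML tokenizer, and the two-byte-index format is
invented purely for illustration): collect the distinct start and end tags
once, ship that table out of band, and re-encode every occurrence as a
two-byte index.

  import re
  import struct

  def tokenize_tags(document):
      tags = re.findall(r'<[^>]+>', document)     # every start/end tag, verbatim
      table = sorted(set(tags))                   # distinct tags -> dictionary
      index = {tag: i for i, tag in enumerate(table)}
      assert len(table) < 0x10000                 # each tag id fits in two bytes
      encoded = b''.join(struct.pack('>H', index[t]) for t in tags)
      return table, encoded

  doc = '<HTML><BODY><P CLASS="a">Hi</P><P CLASS="a">Again</P></BODY></HTML>'
  table, encoded = tokenize_tags(doc)
  print(len(table), 'distinct tags;', len(encoded) // 2, 'occurrences at 2 bytes each')

The prefix/suffix handling for complex attribute values would presumably
need a second table along the same lines, which is where the four-byte
ceiling above comes from.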

Received on Monday, 31 January 2000 11:25:44 UTC