- From: Phillip M. Hallam-Baker <hallam@w3.org>
- Date: Wed, 26 Jul 95 14:24:54 -0400
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
someone has brought to my attention that there is a debate on a packet encoding for content. Here are my comments:- 1) Boundary delimiters (like mime) These are absolutely, fundamentaly and irrevokably unacceptable. They require the originator to be precient is choice of boundary marker, they require the recipient to scan for them. Ad-hoc arguments resorting to MD5 and probablistic considerations carry no weight with me. In the first place there is a very strong probability that a persistent TCP/IP stream will be self referential, that is make a response based upon the low level encoding of the stream. Consider two people in a chat conversation debugging HTTP connections, At some point somone will incorporate the message boundary into a message and the system will come to a grinding halt. In the second place there is the processing overhead of scanning a live MPEG feed for a boundary delimeter. This is simply not the right thing. The only argument for the boundary strings I have heard is to simplify MAIL gateways. I don't accept the argument that one compromises a system for a minor simplification of backwards compatibility. If a gateway likes boundary delimeters it can easily convert. The email system is horribly compromised already by the numerous concessions it makes to broken MTAs. One more should not hurt. Do not ask the clean protocol to start making the same concessions by proxy. 2) On binary vs human readable. I prefer that a protocol be faithful unto itself. Either be human readable or let us lose with ASN.1 BER or something like it. There is no point in half measures one way or the other. If you must consider performance at this level use base 16. Base 10 is probably as efficient since there is not the upper/lowercase ambiguity. If you don't think this can be done well in an ascii based scheme please make that statement. We can then decide how to produce a generic scheme for employing binary encodings as MIME/RFC822 achieves for ascii, human readable encodings. Please do not attempt to create a hybrid that is neither one thing nor the other. If you are going to have an ascii based scheme there will be an overhead. I don't think its a very large one but it will exist. There are a number of binarry encodings with worse overhead, ASN.1 DER for example. 3) Encryption Please remeber that people will want to encode streams using block ciphers. This requires provision for specifying the number of padding octets. Note that PEM style padding cannot be used since is creates a security problem when used for repeated numbers of small packets. It is essential that this padding be random bytes. 4) On a fixed upper limit on the packet size. This is not an acceptable proposal. Whatever size you chose I can provide examples where there is a very good need for a bigger size. Since history is littered with standards whose creators did not anticipate future needs I don't think we should make an arbitrary restriction which has no rational justification. I have just helped order a pair of machines with 512Mb of RAM, I have used file systems with a capacity of 6 Tb. I have built systems with a bandwidth of 6Tb/sec. 32 bits is simply not enough for these purposes. Experience demonstrates that to build a system for the future we must at least be able to satisfy the most exotic needs of the day. Self describing binary length encodings are simple enough. ASN.1 has one, one of the sensible suggestions in the original design that survived the committee cycle? ASN.1 has two legth encodings, short and long. The first octet defines the length of the length. If bit 7 is clear the length is encoded in a single octet a bits 0-6 of the first octet. Otherwise bits 0-6 give the number of additional octets the length. A perhaps cleaner way to do it combines the recycling unused bits in the lead octet idea with the fact that most lengths will be most conveniently expressed using 1, 2, 4 or 8 bytes. The decision table is as follows: IF Bit 7 clear, length is 1 octet, bits 0-6 give value Bit 6 clear, length is 2 octets, bits 0-5 give msb of value, octet 2 gives LSB Bit 5 clear, length is 4 octets, bits 0-4 give msb of value, octets 2-4 gives LSB Bit 4 clear, length is 8 octets, bits 0-3 give msb of value, octets 2-9 gives LSB TRUE these are reserved for future expansion. Of these points the last is the most critical. If the first is not accepted it will not be a significant disaster because I don't think the protocol would be used. If a scheme with a fixed length gains popularity it will be very hard to remedy it later on. Thus I would like names of any proponents of a fixed length scheme with their rationale. I promise faithfully to retain said list and produce it in a suitable forum (eg IETF conference dinner) for an "I told you so" speach. -- Phillip M. Hallam-Baker Not speaking for anoyone else hallam@w3.org http://www.w3.org/hypertext/WWW/People/hallam.html Information Superhighway -----> Hi-ho! Yow! I'm surfing Arpanet!
Received on Wednesday, 26 July 1995 11:29:16 UTC