Re: Content-Transfer-Encoding

Phillip M. Hallam-Baker writes in <9507261823.AA18496@www18.w3.org>:
>1) Boundary delimiters (like mime)
>        These are absolutely, fundamentaly and irrevokably unacceptable. They
>require the originator to be precient is choice of boundary marker,
>they require the recipient to scan for them.
Although I don't agree, as
>Ad-hoc arguments resorting to MD5 and probablistic considerations
do carry weight with me, there are times when it is preferable to not have 
to scan for a boundary marker.

In this spirit, I propose a simple (and perhaps simple-minded :<) method for 
a self-describing binary length encoding:
1. The initial octet shall be in the Base64 alphabet (RFC1341).  It 
describes the length of the length string; and
2. The following octets of the length string shall be in the Base64 
alphabet, such that "B"(64) = 1(10), "BA"(64) = 64(10), "BAAA"(64) = 
262144(10), and so on.  These examples would then be:
     octet# on wire octet value
     -------------- -----------
     0              B
     1              B
for "B"(64) = 1(10),
     octet# on wire octet value
     -------------- -----------
     0              C
     1              B
     2              A
for "BA"(64) = 64(10), and
     octet# on wire octet value
     -------------- -----------
     0              E
     1              B
     2              A
     3              A
     4              A
for "BAAA"(64) = 262144(10).
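
The two rules and the three examples above can be sketched in a few lines of 
Python.  This is only an illustration under stated assumptions, not a 
reference implementation: the names B64, encode_length and decode_length are 
made up here, text strings stand in for octets on the wire, and a length of 
zero is assumed to be sent as the single digit "A", a case the proposal does 
not cover.

```python
# Base64 alphabet in value order: "A" = 0, "B" = 1, ..., "/" = 63.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_length(n):
    """Return the length-of-length octet followed by the base-64 digits of n."""
    if n < 0:
        raise ValueError("length must be non-negative")
    digits = ""
    while n > 0:
        n, r = divmod(n, 64)
        digits = B64[r] + digits
    # Assumption: zero is written as the single digit "A".
    digits = digits or "A"
    if len(digits) > 63:
        raise ValueError("length needs more than 63 base-64 digits")
    # The first octet says how many length digits follow.
    return B64[len(digits)] + digits

def decode_length(stream):
    """Read one encoded length from the front of stream; return (length, rest)."""
    count = B64.index(stream[0])       # how many length digits follow
    digits = stream[1:1 + count]
    if len(digits) != count:
        raise ValueError("truncated length string")
    n = 0
    for ch in digits:
        n = n * 64 + B64.index(ch)
    return n, stream[1 + count:]

print(encode_length(1))        # BB
print(encode_length(64))       # CBA
print(encode_length(262144))   # EBAAA
```

The recipient reads exactly one octet, then exactly that many more octets, and 
then knows the body length without scanning for anything.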

This complies with the spirit of HTTP, as the length encoding will be in 
ASCII as Phill noted:
>I prefer that a protocol be faithful unto itself.

A self-describing binary length encoding string of length 64 (1 octet for 
the length of the length + 63 length octets) represented in the Base64 
alphabet can encode a transmission length of up to 64^63 - 1 octets, or 
61565634681866373769186000156474396570437092610102260418669208444133\
9402679643915803347910232576806887603562348543 octets if you prefer.  This 
is around 6.16x10^113, whereas there are only around 10^80 particles in the 
known universe.  By the time we run into this limit, it is likely we will 
want to switch away from HTTP :)...
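
For anyone who wants to check the arithmetic, the limit can be recomputed with 
Python's arbitrary-precision integers.  The variable names below are just for 
illustration.

```python
# Largest value expressible in 63 base-64 digits.
max_len = 64**63 - 1

# 64 = 2^6, so 63 digits carry 378 bits.
assert max_len == 2**378 - 1

digits = str(max_len)
print(digits)        # the full decimal expansion
print(len(digits))   # 114 decimal digits
```

A 114-digit number beginning 6156... is about 6.16x10^113.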

The Base64 alphabet was chosen because it is a compact printable 
representation of base 64 numbers that is portable between ASCII, EBCDIC, 
ISO 646, and ISO 10646, and multiple correct alphabet translation tables 
should already be available.
======================================================================
Mark Fisher                            Thomson Consumer Electronics
fisherm@indy.tce.com                   Indianapolis, IN

Received on Thursday, 27 July 1995 11:22:36 UTC