May to August 1995

Re: Content-Transfer-Encoding

From: Fisher Mark <FisherM@is3.indy.tce.com>
Date: Thu, 27 Jul 95 13:14:00 PDT
To: HTTP Working Group <http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com>
Cc: "Baker Phillip M." <hallam@w3.org>
Message-Id: <3017F436@MSMAIL.INDY.TCE.COM>

Phillip M. Hallam-Baker writes in <9507261823.AA18496@www18.w3.org>:
>1) Boundary delimiters (like mime)
>        These are absolutely, fundamentally and irrevocably unacceptable. They
>require the originator to be prescient in choice of boundary marker,
>they require the recipient to scan for them.
Although I don't agree, as
>Ad-hoc arguments resorting to MD5 and probabilistic considerations
do carry weight with me, there are times when it is preferable to not have 
to scan for a boundary marker.

In this spirit, I propose a simple (and perhaps simple-minded :<) method for 
a self-describing binary length encoding:
1. The initial octet shall be in the Base64 alphabet (RFC1341).  It 
describes the length of the length string; and
2. The following octets of the length string shall be in the Base64 
alphabet, such that "B"(64) = 1(10), "BA"(64) = 64(10), "BAAA"(64) = 
262144(10), and so on.  These examples would then be:
     octet# on wire octet value
     -------------- -----------
     0         B
     1         B
for "B"(64) = 1(10),
     octet# on wire octet value
     -------------- -----------
     0         C
     1         B
     2         A
for "BA"(64) = 64(10), and
     octet# on wire octet value
     -------------- -----------
     0         E
     1         B
     2         A
     3         A
     4         A
for "BAAA"(64) = 262144(10).
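
As a rough sketch only, the two rules above could be coded along these lines 
in Python.  The names B64, encode_length and decode_length are made up for 
illustration and are not part of the proposal.  Encoding a length of zero as 
the single digit "A" is my assumption, since the rules above do not cover it.

```python
# Sketch of the self-describing length encoding described above.
# All names here are illustrative; they are not part of the proposal.

# Base64 alphabet: "A" = 0, "B" = 1, ..., "/" = 63.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_length(n):
    """Return the length-of-length octet followed by n in base 64."""
    if n < 0:
        raise ValueError("length must be non-negative")
    digits = ""
    while True:
        n, r = divmod(n, 64)
        digits = B64[r] + digits
        if n == 0:
            break
    # The first octet is a single Base64 character, so it can describe
    # at most 63 length octets.
    if len(digits) > 63:
        raise ValueError("length needs more than 63 base-64 digits")
    return B64[len(digits)] + digits


def decode_length(data):
    """Parse a length prefix; return (length, octets consumed)."""
    count = B64.index(data[0])      # ValueError if outside the alphabet
    digits = data[1:1 + count]
    if len(digits) != count:
        raise ValueError("truncated length string")
    value = 0
    for ch in digits:
        value = value * 64 + B64.index(ch)
    return value, 1 + count


if __name__ == "__main__":
    # The three examples from the message.
    for n in (1, 64, 262144):
        print(n, encode_length(n))
```

The receiver reads one octet, then exactly that many more, and never has to 
scan the body for a marker.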

This complies with the spirit of HTTP, as the length encoding will be in 
ASCII as Phill noted:
>I prefer that a protocol be faithful unto itself.

A self-describing binary length encoding string of length 64 (1 octet for 
the length of the length + 63 length octets) represented in the Base64 
alphabet can encode a transmission length of up to 64^63 - 1 octets, a 
114-digit number ending in ...9402679643915803347910232576806887603562348543 
if you prefer.  This is around 6.16x10^113, whereas there are only around 
10^80 particles in the known universe.  By the time we run into this limit, 
it is likely we will want to switch away from HTTP :)...
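
For anyone who wants to check the size of that limit, Python's integers are 
arbitrary precision, so a few lines will do.  This is only a quick sketch, 
and the variable names are mine:

```python
# Largest value that fits in 63 base-64 digits.
max_length = 64**63 - 1

decimal = str(max_length)
print(decimal)                          # the exact value
print(len(decimal), "decimal digits")   # how many digits it has
print(f"about {float(max_length):.2e} octets")
```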

The Base64 alphabet was chosen because it is a compact printable 
representation of base 64 numbers that is portable between ASCII, EBCDIC, 
ISO 646, and ISO 10646, and correct translation tables for it should 
already be widely available.

Mark Fisher                            Thomson Consumer Electronics
fisherm@indy.tce.com                   Indianapolis, IN
