Re: Digesting the digest...

>Once you start digesting entity (bodies, headers) it is very important
>to be specific about the canonical form of the data that is being 


>Other RFCs (PEM and/or MOSS) have their own specs. It should be made
>clear how transfer codings etc. do/don't impact. No canonical EOL for text/*
>types etc.

This is because in SMTP the conversion of entity bodies by gateways and
agents is tolerated. HTTP does *NOT* tolerate modification. HTTP *is* 
8 bit clean. 

Thus a gateway or proxy which changes CRLF to CR or vice versa is *BROKEN*.

Although the email community has on many occasions attempted to
challenge these design principles, the Web community has never budged
from the 8-bit clean point.

From a personal point of view, I am much happier if the odd MIME gateway
fails occasionally than if we break HTTP for everyone. I wish that SMTP
did not allow agents to mess with the data. Then I could email program code to
people with a reasonable probability of it working.

HTTP defines a very clean separation between the protocol layer and the 
content layer. HTTP agents are simply not permitted to meddle at the
content layer. If they do, they are broken.

So the message digest is simply the digest of the entity transferred, i.e.
for a non-chunked transfer, the actual bytes transferred, starting with the
first byte following the CRLF-CRLF sequence marking the end of the headers, up to
and including the last byte when the connection closes.
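
A minimal sketch of that rule, for illustration only. It assumes the whole non-chunked message has already been read into memory up to connection close, and it assumes MD5 as the digest algorithm; the helper name entity_digest and the sample message are hypothetical.

```python
import hashlib


def entity_digest(raw_message: bytes, algorithm: str = "md5") -> str:
    """Hex digest of every byte after the first CRLF-CRLF, taken as-is."""
    header_end = raw_message.find(b"\r\n\r\n")
    if header_end == -1:
        raise ValueError("no CRLF-CRLF header terminator found")
    # Skip the 4-byte terminator; everything after it is the entity.
    body = raw_message[header_end + 4:]
    # No decoding and no line-ending normalisation: hash the bytes as received.
    return hashlib.new(algorithm, body).hexdigest()


# Hypothetical captured response; note the bare LF inside the body.
raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"line one\nline two\r\n"
)

print(entity_digest(raw))
```

The function never inspects Content-Type, so a text/* body is treated exactly like any other byte stream.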

There should be no canonicalisation process specified. The canonical form is the 
entity as transferred on the wire. If a text/plain body includes CR or LF 
outside a CRLF sequence that is what goes into the digest. If lines are 20,000 
characters long that is what goes in.
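
To make the consequence concrete, here is a small sketch, again assuming MD5 and using made-up sample bytes, showing that any line-ending rewrite by an intermediary yields a different digest than the one computed over the entity as sent. The names digest, as_sent and normalised are illustrative only.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex MD5 of the bytes exactly as given (MD5 is an assumption here)."""
    return hashlib.md5(data).hexdigest()


# A body with a CRLF, a bare LF, a bare CR, and one very long "line".
as_sent = b"first line\r\nbare LF\nbare CR\r" + b"x" * 20000

# What a meddling gateway might do: rewrite CRLF to LF.
normalised = as_sent.replace(b"\r\n", b"\n")

print(digest(as_sent) == digest(as_sent))     # same bytes, same digest
print(digest(as_sent) == digest(normalised))  # rewritten bytes, digest differs
```

A receiver that checks the digest against the bytes it actually got will therefore detect the rewrite rather than silently accept it.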


Received on Wednesday, 28 February 1996 15:06:07 UTC