- From: <hallam@w3.org>
- Date: Wed, 28 Feb 96 18:02:48 -0500
- To: Peter J Churchyard <pjc@trusted.com>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
- Cc: hallam@w3.org
>Once you start digesting entity (bodies, headers) it is very important
>to be specific about the canonical form of the data that is being
>digested.

Agreed.

>Other RFCs (PEM and/or MOSS) have their own specs. It should be made
>clear how transfer codings etc. do/don't impact. No canonical EOL for text/*
>types etc.

This is because in SMTP the conversion of entity bodies by gateways and agents is tolerated. HTTP does *NOT* tolerate modification. HTTP *is* 8-bit clean. Thus a gateway or proxy which changes CRLF to CR or vice versa is *BROKEN*.

Although the email community has on many occasions attempted to challenge these design principles, the Web community has never budged from the 8-bit clean point.

From a personal point of view, I am much happier if the odd MIME gateway fails occasionally than if we break HTTP for everyone. I wish that SMTP did not allow agents to mess with the data. Then I could email program code to people with a reasonable probability of it working.

HTTP defines a very clean separation between the protocol layer and the content layer. HTTP agents are simply not permitted to meddle at the content layer. If they do, they are broken.

So the message digest is simply the digest of the entity transferred. I.e., for a non-chunked transfer, the actual bytes transferred, starting with the first byte following the CRLF-CRLF sequence marking the end of the headers, up to and including the last byte when the connection closes.

There should be no canonicalisation process specified. The canonical form is the entity as transferred on the wire. If a text/plain body includes CR or LF outside a CRLF sequence, that is what goes into the digest. If lines are 20,000 characters long, that is what goes in.

Phill
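
For illustration only, the rule above for a non-chunked transfer might be sketched in Python roughly as follows. The function name `entity_digest`, the sample message, and the choice of MD5 as the default algorithm are assumptions made for this sketch; the message itself does not name a digest algorithm.

```python
import hashlib


def entity_digest(raw_message: bytes, algorithm: str = "md5") -> str:
    """Digest the entity exactly as transferred (non-chunked case).

    The algorithm default is an assumption for this sketch.
    """
    # The entity starts at the first byte after the CRLF-CRLF
    # sequence that marks the end of the headers.
    header_end = raw_message.find(b"\r\n\r\n")
    if header_end == -1:
        raise ValueError("no CRLF-CRLF header terminator found")
    body = raw_message[header_end + 4:]
    # No canonicalisation: bare CR, bare LF and very long lines
    # go into the digest byte for byte.
    return hashlib.new(algorithm, body).hexdigest()


# Hypothetical response whose body contains a bare CR and a bare LF.
message = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"line one\rline two\n"
)
print(entity_digest(message))
```

Because nothing is normalised, an intermediary that rewrote the bare CR or LF in this body to CRLF would cause the recipient to compute a different digest.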
Received on Wednesday, 28 February 1996 15:06:07 UTC