- From: Jacob Palme <jpalme@dsv.su.se>
- Date: Thu, 29 Jan 1998 13:39:55 +0100
- To: Larry Masinter <masinter@parc.xerox.com>
- Cc: Jim Gettys <jg@pa.dec.com>, Nick Shelness <shelness@lotus.com>, IETF working group on HTML in e-mail <mhtml@segate.sunet.se>, http-wg@cuckoo.hpl.hp.com, Nick_Shelness/SSW/Lotus@lotus.com
At 07.40 -0800 98-01-28, Larry Masinter wrote: > I know of no Unix text editor that produces text files that > use the "canonical" format for text/plain using CRLF instead > of just plain LF, and I've never seen a Macintosh text editor > that produces text files using the "canonical" format with > CRLF instead of just plain CR. An editor produces files that > are appropriate for the native environment, and people with > Internet mail who mail documents using SMTP expect the mailer > to convert the file from the local convention to the canonical > one on transmission, and to do the inverse conversion when > saving the document. I believe BBEDIT, a popular Macintosh text editor with special features for (not WYSIWYG) HTML, has some support for this. Many HTML editors have built-in features for updating the master document on a HTTP server. I believe they usually follow the FTP TEXT convention of converting line breaks to the convention used on the server platform. I am no expert on security features. Is it true that these always convert line breaks to CRLF format before computing the checksums? If that is the case, the problem becomes less important. Is it also true that security features decode Content-Transfer-Encoding before computing their checksums. That would further reduce the problem of HTML documents in formats unsuitable for mail transmission without Content-Transfer-Encoding. > Security checksums _should_ be applied to the canonical form, > but in truth, it is feasible to apply it to the transmitted form, > and it is the choice of the originator whether to do this. Since > an HTTP server should never actually modify the end-of-line convention, > the data as represented by the choice of the original sender can > remain intact. That is, this is not really an issue. Choice of the originator or choice of the security algorithm? Do you mean that the security algorithms allows a sealed document to contain information on which conversions and uncodings should be done on a document before computing checksums? Or do you mean that different security schemes handle this in different ways? > We've been through this topic endlessly in the past, and wrangled > out some specific wording in the HTTP draft that was believed generally > to be adequate to explain the situation, at least as far as > the 'end of line convention' discussion in the HTTP/1.1 specification > was concerned. I am sorry if I am taking time to discuss an issue which you have already resolved. I agree that HTTP servers should not change the documents before sending them. This means that the documents stored in the HTTP server should be in a form which allows gatewaying to e-mail without loss of security checksums. > If there was some demand, a separate document on "HTTP/mail gateway > implementation considerations" could elaborate on the topic, but the > demand for that seems to be low. (Perhaps IMAP/WebDAV interoperability > might be an interesting application (isn't a mail folder a 'collection'?)). In the MHTML group, we have carefully tried to make a standard which would break security checksums as little as possible. We believed this was a user requirement; when you say "demand for that seems to be low", do you mean that this is not, and will not, be an important user requirement? We already have som text about this in the MHTML document. Below is a quote of what we have written: - The MIME standard [MIME2] requires that e-mailed documents of "Content-Type: Text/ MUST be in canonical form before a Content-Transfer-Encoding is applied, i.e. that line breaks are encoded as CRLFs, not as bare CRs or bare LFs or something else. This is in contrast to [HTTP] where section 3.6.1 allows other representations of line breaks. Note that this might cause problems with integrity checks based on checksums, which might not be preserved when moving a document from the HTTP to the MIME environment. If a document has to be converted in such a way that a checksum based message integrity check becomes invalid, then this integrity check header SHOULD be removed from the document. Other sources of problems are Content-Encoding used in HTTP but not allowed in MIME, and character sets that are not able to represent line breaks as CRLF. A good overview of the differences between HTTP and MIME with regards to Content-Type: "text" can be found in [HTTP], appendix C. If the original document has line breaks in the canonical form (CRLF), then the document SHOULD remain unconverted so that integrity check sums are not invalidated. A provider of HTML documents SHOULD always provide original documents in the canonical form with CRLF for line breaks, so that they will be transferable via both HTTP and SMTP without invalidating checksum integrity checks. So rather than writing a new document on this, MHTML could be used or extended to cover this issue. The HTTP spec could just mention that MHTML contains some discussion on how to create documents which can be sent through both e-mail and HTTP without breaking security checksums. MIME recommends using Content-Transfer-Encoding to break lines longer than 76 characters when transporting documents through e-mail. This might also break security checksums, unless the security checksum algorithms are defined in ways which will not be broken by Content-Transfer-Encoding. Is it permitted to use e-mail-type Content-Transfer-Encoding on documents sent through HTTP? From what you say, I believe it must be permitted if the Content-Transfer-Encoding header is inside the body, since HTTP only handles the outermost (HTTP) header. But is a Content-Transfer-Encoding header field allowed in the outermost (HTTP) heading? ------------------------------------------------------------------------ Jacob Palme <jpalme@dsv.su.se> (Stockholm University and KTH) for more info see URL: http://www.dsv.su.se/~jpalme
Received on Thursday, 29 January 1998 04:45:12 UTC