- From: <Nick_Shelness@motorcity2.lotus.com>
- Date: Thu, 15 Jan 1998 21:03:38 GMT
- To: Jacob Palme <jpalme@dsv.su.se>
- Cc: IETF working group on HTML in e-mail <mhtml@segate.sunet.se>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Jacob, > I am not very happy with changing an existing and already implemented > IETF proposed standard in such a radical way. But maybe it is necessary. > Let us examine the differences between how MHTML and HTTP uses Content- > Location to see if they really need to be split into two different > header fields. The reason for changing from my previous align MHTML with HTTP 1.1 position (which you also enunciate), to my employ a new header field position, was because I was concerned that the MIME folding algorithm we apply to header fields containing invalid URIs would be incompatible with HTTP 1.1. HTTP 1.1 can outlaw invalid URIs, MHTML has to both make them RFC822/MIME safe and cope with them. The problematic text is: 4.4 Encoding and decoding of URIs in MIME header fields 4.4.1 Encoding of URIs containing inappropriate characters Some documents may contain URIs with characters that are inappropriate for an RFC 822 header, either because the URI itself has an incorrect syntax according to [URL] or the URI syntax standard has been changed to allow characters not previously allowed in MIME headers. These URIs cannot be sent directly in a message header. If such a URI occurs, all spaces and other illegal characters in it must be encoded using one of the methods described in [MIME3] section 4. This encoding MUST only be done in the header, not in the HTML text. Receiving clients must decode the [MIME3] encoding in the heading before comparing URIs in body text to URIs in Content-Location headers. The charset parameter value "US-ASCII" SHOULD be used if the URI contains no octets outside of the 7-bit range. If such octets are present, the correct charset parameter value (derived e.g. from information about the HTML document the URI was found in) SHOULD be used. If this cannot be safely established, the value "UKNOWN-8BIT" [RFC 1428] MUST be used. Note, that for the matching of URIs in text/html body parts to URIs in Content-Location headers, the value of the charset parameter is irrelevant, but that it may be relevant for other purposes, and that incorrect labeling MUST, therefore, be avoided. Warning: Irrelevance of the charset parameter may not be true in the future, if different character encodings of the same non-English filename are used in HTML. 4.4.2 Folding of long URIs Since MIME header fields have a limited length and long URIs can result in Content-Location and Content-Base headers that exceed this length , Content-Location and Content-Base headers may have to be folded. Encoding as discussed in clause 4.4.1 MUST be done before such folding. After that, the folding can be done, using the algorithm defined in [URLBODY] section 3.1. 4.4.3 Unfolding and decoding of received URLs in MIME header fields Upon receipt, folded MIME header fields should be unfolded, and then any MIME encoding should be removed, to retrieve the original URI. Nick
Received on Friday, 16 January 1998 04:12:34 UTC