- From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
- Date: Wed, 07 Dec 1994 13:23:18 -0500
- To: hallam@alws.cern.ch
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Phillip said:

> To sum up Marcs argument:
>
> 1) The performance hit is not too great

I did not say this. I am not convinced of it one way or the other, and even if the hit is real, the issue of what to do about it is still not crystal-clear. Anyway, your comments don't seem to be primarily performance-motivated, so never mind this one for now. "We should do it, but it's expensive" probably leads us to a different place than "We shouldn't do it, it's stupid."

> 2) If there is no reason to do it and no reason not to then follow the spec.

More or less, yes. I am skeptical of the extent to which:

- HTTP is somehow radically different from everything else
- The members of this group (including me. Especially me.) are somehow radically wiser than everyone else

Call me a conservative in this area. (Sorry if that's one of your dirty words, Phillip. :-)

> I do not want cannonicalisation under any circumstances. I have had my fill
> of systems that "canonicalise" trying to be "clever". Such systems break
> much much more than they mend. Like the FTP ASCII transfer mode which is
> enabled by default in most FTP clients (but not some of the more modern ones).

I have no idea what poorly-designed FTP clients have to do with this issue. Ideally, FTP would work such that the choice between ASCII and binary mode was based on the specific object and made by the server, which should know which is appropriate.

Canonicalization is not "clever" at all. Trying to guess which of several representations for line breaks is being employed is "trying to be clever." Personally, I think cleverness is good; but mandating it is something else. (A sketch of the distinction is appended at the end of this note.)

This is the first time I've heard someone suggest that canonicalization would actually break something, as opposed to merely being a performance loss or a pedantic irrelevancy. Can you be more specific?

File-sharing mechanisms that don't concern themselves with this (say, NFS) end up pushing these problems off onto their applications and seriously restricting their portability (if you assume NFS is worth anything even in a homogeneous environment. :-)

> In most cases canonicalisation is simply impractical, if the message body is
> compressed then canonicalisation is a loser.

Yes; obviously an object stored in a compressed, non-canonical form would be a big lose to convert in this fashion. There need to be clear guidelines for dealing with such cases. (The second sketch appended below shows where the cost comes in.)

> What the MIME specs state in this area is irrelevant. MIME is designed to
> pass through mail gateways. HTTP is not. It is the 8 bit clean restriction
> that is HTTPs main win over other protocols.

No way. Is FTP not 8-bit clean? Is finger not 8-bit clean?

It is the uniform and portable representation of metadata (i.e. HTTP headers) that is HTTP's main win over other protocols. FTP could be nearly as good if there were uniform ways to find out, rather than heuristically guess at, things like the last-modification time and content-type of files.

HTTP mostly combines the headers and content-labeling of email/MIME, the file transfer of FTP, and the lightweight request-reply nature of finger. Quiz: which of these three protocols does not employ canonicalization?

> This is a character set issue, not a content type issue. If people want to
> propose that the default characterset interprets CRLF in this manner then
> fair enough.

HTTP supports different character sets? :-)

Assuming MIME keeps heading in the direction it appears to be, CRLF canonicalization applies in every charset, not just US-ASCII.
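Here is a minimal sketch of that distinction (in Python; the function names are mine, purely for illustration, and it assumes the local line-break sequence appears in the body only as a line break):

    CRLF = b"\r\n"

    def canonicalize_text(body, local_newline):
        # Sender-side canonicalization: the server knows its own local
        # line-break convention and mechanically rewrites it to the one
        # canonical CRLF form that MIME prescribes for text/* bodies.
        return body.replace(local_newline, CRLF)

    def guess_newline(body):
        # Receiver-side "cleverness": guessing which of several
        # conventions the peer used, with nothing but the bytes to go
        # on.  This is the fragile guessing game canonicalization avoids.
        if CRLF in body:
            return CRLF   # DOS/network convention
        if b"\r" in body:
            return b"\r"  # old Macintosh convention
        return b"\n"      # Unix convention (and the fallback)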
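And a sketch of why the compressed case is a loser (again illustrative; assume a gzip-style whole-object compressor): converting forces a full decompress/rewrite/recompress cycle on every transfer, where a canonically-stored object could be shipped straight off the disk.

    import gzip

    def canonicalize_compressed(stored, local_newline):
        # The stored object cannot be sent as-is; all three steps run
        # over the entire body on every request.
        text = gzip.decompress(stored)               # inflate everything
        text = text.replace(local_newline, b"\r\n")  # the actual conversion
        return gzip.compress(text)                   # deflate it all again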
(The charset point does have the implication that Unicode can't be a text/foo type but must be an application/foo type.)

- Marc