Re: Comments on the HTTP/1.0 draft. from Marc VanHeyningen on 1994-12-07 (ietf-http-wg@w3.org from October to December 1994)

From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
Date: Wed, 07 Dec 1994 13:23:18 -0500
To: hallam@alws.cern.ch
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <18621.786824598@gummy.cs.indiana.edu>
Phillip said:
> To sum up Marc's argument:
> 
> 1) The performance hit is not too great

I did not say this.  I am not convinced of this one way or the other.
If it is, the issue of what to do is still not crystal-clear.  Anyway,
your comments don't seem to be primarily performance-motivated, so never 
mind this one for now.  "We should do it, but it's expensive" probably 
leads us to a different place than "We shouldn't do it, it's stupid."

> 2) If there is no reason to do it and no reason not to then follow the spec.

More or less, yes.  I am skeptical of the extent to which:
- HTTP is somehow radically different from everything else
- The members of this group (including me.  Especially me.) are somehow
  radically wiser than everyone else

Call me a conservative in this area.  (Sorry if that's one of your dirty
words, Phillip. :-)

> I do not want canonicalisation under any circumstances. I have had my fill
> of systems that "canonicalise" trying to be "clever". Such systems break
> much much more than they mend. Like the FTP ASCII transfer mode which is
> enabled by default in most FTP clients (but not some of the more modern ones).

I have no idea what poorly-designed FTP clients have to do with this issue.
Ideally, FTP would work such that the choice between ASCII and binary
mode was made per object by the server, which should know which is
appropriate.

Canonicalization is not "clever" at all.  Trying to guess which of various
different representations for line breaks is being employed is "trying to be
clever."  Personally, I think cleverness is good; but mandating it is
something else.
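
To make the distinction concrete, here is a minimal sketch in Python.  The
function names are hypothetical, and it assumes the sender knows its own
local line-break convention.  Converting to and from a canonical CRLF form
is then a mechanical substitution; guessing the convention from the data is
the part that needs heuristics.

```python
CRLF = b"\r\n"

def to_canonical(body: bytes, local_newline: bytes) -> bytes:
    """Sender side: convert text from a KNOWN local convention to CRLF.

    Assumes `body` really is in the local convention (no stray CRLFs).
    """
    if local_newline == CRLF:
        return body
    return body.replace(local_newline, CRLF)

def from_canonical(body: bytes, local_newline: bytes) -> bytes:
    """Receiver side: convert canonical CRLF text to the local convention."""
    return body.replace(CRLF, local_newline)

def guess_newline(body: bytes) -> bytes:
    """The "clever" alternative: infer the convention from the bytes alone.

    This is a heuristic; a body with no line breaks gives it nothing to go
    on, so it falls back to an arbitrary default.
    """
    if CRLF in body:
        return CRLF
    if b"\r" in body:
        return b"\r"
    return b"\n"
```

With canonicalization, neither side has to call anything like
guess_newline(); each end only needs to know its own convention.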

This is the first time I've heard someone suggest that canonicalization would
actually break something, as opposed to merely being a performance loss or
a pedantic irrelevancy.  Can you be more specific?

File-sharing mechanisms that don't concern themselves with this (say, NFS)
end up pushing these problems off onto their applications and seriously 
restrict their portability (if you assume NFS is worth anything even in
a homogeneous environment. :-)

> In most cases canonicalisation is simply impractical, if the message body is
> compressed then canonicalisation is a loser.

Yes; obviously an object stored in a compressed non-canonical form would be
a big lose to convert in this fashion.  There need to be clear guidelines for
dealing with such cases.
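
As a rough illustration of the cost, the sketch below (Python, using the
standard gzip module; the function name and the assumption that the stored
object uses bare LF are mine, purely for illustration) shows what converting
a compressed, non-canonical object would involve: a full decompress,
rewrite, and recompress on every conversion.

```python
import gzip

def recanonicalize_compressed(stored: bytes, local_newline: bytes = b"\n") -> bytes:
    """Convert a gzip-compressed text object to canonical CRLF form.

    Hypothetical helper: assumes the uncompressed text uses `local_newline`
    consistently.  The whole object must be decompressed and recompressed.
    """
    text = gzip.decompress(stored)                # undo the compression
    canonical = text.replace(local_newline, b"\r\n")
    return gzip.compress(canonical)               # pay for compression again
```

Canonicalizing once, before the object is compressed and stored, avoids
repeating this work per request.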

> What the MIME specs state in this area is irrelevant. MIME is designed to
> pass through mail gateways. HTTP is not. It is the 8 bit clean restriction
> that is HTTP's main win over other protocols.

No way.  FTP is not 8 bit clean?  Finger is not 8 bit clean?

It is the uniform and portable representation of metadata (i.e. HTTP headers)
that is HTTP's main win over other protocols.  FTP could be nearly as good if
there were uniform ways to find out, rather than heuristically guess at,
things like the last modification time and content-type of files.

HTTP mostly combines the headers and content-labeling of email/MIME, the
file-transfer of FTP, and the lightweight request-reply nature of finger.
Quiz:  Which of these three protocols does not employ canonicalization?

> This is a character set issue, not a content type issue. If people want to
> propose that the default character set interprets CRLF in this manner then
> fair enough. 

HTTP supports different character sets? :-)

Assuming MIME takes the direction it appears to be taking, it's the case in every
charset, not just US-ASCII.  (This does have the implication that Unicode
can't be a text/foo type but must be an application/foo type.)

- Marc
Received on Wednesday, 7 December 1994 10:26:54 UTC