Re: Comments on the HTTP/1.0 draft. from Marc VanHeyningen on 1994-12-07 (ietf-http-wg@w3.org from October to December 1994)

From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
Date: Wed, 07 Dec 1994 16:48:50 -0500
To: hallam@alws.cern.ch
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <22941.786836930@silky.cs.indiana.edu>
Phillip said:
>The problem is that whatever we do there will be a lot of non-canonicalising
>servers around, so the clients have to cope anyway. So all the canonicalisation
>requirement will mean is that documents will get incorrectly canonicalised
>when they should not.

Whoa.  Are you saying a server that returns canonicalized text (as most
MS-DOS-based servers presumably will) is broken, and clients will not be able
to handle the results?

Any HTTP implementation should be able to deal with objects in canonical form,
as well as possibly other forms.  One that can't do that is broken.  There is
no case where objects must not get canonicalized; only possibly areas where
they *may* not.
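
(To pin down what "canonical form" means for text here, a minimal sketch in
Python.  It assumes the canonical line break is CRLF, as in MIME text; the
name `to_canonical` is purely illustrative and comes from no spec or server.)

```python
import re

# Matches any of the three common local line-break conventions.
# CRLF is listed first so a CR LF pair is treated as one break, not two.
_EOL = re.compile(rb"\r\n|\r|\n")

def to_canonical(body: bytes) -> bytes:
    """Rewrite every line break in a textual body as CRLF."""
    return _EOL.sub(b"\r\n", body)
```

Running it over text that is already canonical leaves it unchanged, so a
server can apply it without knowing which local form a file is in.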

>>HTTP supports different character sets? :-)
> 
>Yep, you can define the character set as part of the content type text/plain;
>charset=EBCDIC or wotever.

Yes, I know; hence the smiley.  But in practice, client support is within
epsilon of being nonexistent.  As is server support.  They can't even parse
the header right, let alone actually attempt to display the specified
character set.
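
(Parsing the header is not the hard part.  A rough sketch using Python's
standard `email.message` module; the helper name `charset_of` and the choice
of "us-ascii" as the fallback, which follows the MIME default for text, are my
assumptions and not anything the HTTP draft mandates.)

```python
from email.message import Message

def charset_of(content_type: str, default: str = "us-ascii") -> str:
    """Return the charset parameter of a Content-Type value, lowercased.

    Falls back to `default` when no charset parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when the parameter is absent.
    return (msg.get_content_charset() or default).lower()
```

Actually displaying the named character set is a separate problem; this only
covers reading the parameter correctly.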

>It would be easy enough to specify a Mac version of the ASCII charset.

One (more than one, if memory serves) already exists and is registered.

Chuck said:
>  It's not clear that you're discussing the same thing as everyone else. No
>one disputes the need for the request and response headers to use standard
>EOL encoding. That's a given and everyone understands that this is the
>case. This is a no-brainer because all the headers are generated by clients
>and servers and can always be generated correctly.

(Actually, I think your statement that all headers always are generated by
clients and servers is a bit shortsighted, and I can envision instances in
which HTTP headers are stored in files and shipped wholesale, but never mind
that now.)

>This discussion is about object-bodies ONLY.

No, this discussion is about *textual* object-bodies ONLY.  I think everybody
agrees that, in addition to headers, GIFs and audio files and MPEGs and
everything else should be shipped around the network in their canonical form,
rather than in some local form.  Luckily, most of the systems that people
use happen, by a convenient coincidence, to locally use the canonical form
or something easily converted into canonical form, so that there isn't a
requirement for expensive conversion.

There are minor conversions; in a sense the Macintosh local form could be
said to include the resource fork as well as the data fork, but I don't
think anybody thinks all clients should understand macbinary even though
this could be said to be the local form of Mac files.  Mac servers should
convert, say, GIF files stored on a Mac to canonical form by discarding
the resource fork and sending only the data fork.  Do we at least agree
on this point?

What you are suggesting is that everything go in canonical form *except* text,
which should be considered a special case because it's common and has varying
local forms, and those local forms are not inordinately difficult to understand
in a flexible fashion.  This may be a reasonable exception.  But it is an
exception, an argument that textual object-bodies should be a special case.
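
(For what "understand in a flexible fashion" might look like on the client
side, a minimal Python sketch.  The name `lines_of` is hypothetical, and it
assumes the only local forms in play are CRLF, bare CR, and bare LF.)

```python
import re
from typing import List

def lines_of(body: bytes) -> List[bytes]:
    """Split a textual body into lines, accepting CRLF, bare CR, or bare LF."""
    lines = re.split(rb"\r\n|\r|\n", body)
    # A trailing line break produces one empty final element; drop it so
    # "a\r\n" and "a" both yield a single line.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

A client written this way copes with canonical and non-canonical servers
alike, which is the tolerance being argued for.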

>I have yet to hear a factual, supported reason why this must be done or
>else HTTP will fail.

That's OK, I have yet to hear a clear statement from you of which of the two
positions I mentioned in the first paragraph of my previous message is yours.
Do you think that canonicalizing line breaks is technically fine but just too
expensive to implement, or that it's just plain dumb?  It does make a
difference.  If you think the former, then we agree about the important
stuff.

>You are asking that current practice be
>discarded in favor of an idea that has not been proven to be of any use to
>the HTTP community.

No, I am stating that I think existing standards and practices outside of
HTTP are being dismissed without due consideration.  I fully expect us to
eventually get a reasonable approach which either tolerates or standardizes
existing practice with regard to the special treatment of textual objects.
The question is whether it should tolerate it, as the current spec appears
to do, or standardize it, and exactly how.

Albert Lunde said:
>Now, it seems like what we are saying is that current practice
>(not just "bad" servers) is to treat EOL differently in
>the object body for performance reasons.
>
>In this, and in other ways, we are not just quoting the MIME
>spec, we are sort of rewriting it.

Yes.  Absolutely.

- Marc
Received on Wednesday, 7 December 1994 13:52:39 UTC