- From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
- Date: Wed, 07 Dec 1994 16:48:50 -0500
- To: hallam@alws.cern.ch
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Phillip said:

>The problem is that whatever we do there will be a lot of non canonicalising
>servers arround so the clients have to cope anyway. So all the canonicalisation
>requirement will mean is that documents will get incorrectly canonicalised
>when they should not.

Whoa. Are you saying a server that returns canonicalized text (as most MS-DOS based servers presumably will) is broken, and clients will not be able to handle the results?

Any HTTP implementation should be able to deal with objects in canonical form, as well as possibly other forms. One that can't do that is broken. There is no case where objects must not get canonicalized; only possibly areas where they *may* not.

>>HTTP supports different character sets? :-)
>
>Yep, you can define the character set as part of the content type texp/plain;
>charset=EBSIDIC or wotever.

Yes, I know; hence the smiley. But in practice, client support is within epsilon of being nonexistent. As is server support. They can't even parse the header right, let alone actually attempt to display the specified character set.

>It woiuld be easy enough to specify a MACversion of the ASCII charset.

One (more than one, if memory serves) already exists and is registered.

Chuck said:

> It's not clear that you're discussing the same thing as everyone else. No
>one disputes the need for the request and response headers to use standard
>EOL encoding. That's a given and everyone understands that this is the
>case. This is a no-brainer because all the headers are generated by clients
>and servers and can always be generated correctly.

(Actually, I think your statement that all headers are always generated by clients and servers is a bit shortsighted, and I can envision instances in which HTTP headers are stored in files and shipped wholesale, but never mind that now.)

>This discussion is about object-bodies ONLY.

No, this discussion is about *textual* object-bodies ONLY.
I think everybody agrees that, in addition to headers, GIFs and audio files and MPEGs and everything else should be shipped around the network in their canonical form, rather than in some local form. Luckily, most of the systems that people use happen, by a convenient coincidence, to locally use the canonical form or something easily converted into canonical form, so that there isn't a requirement for expensive conversion. There are minor conversions; in a sense the Macintosh local form could be said to include the resource fork as well as the data fork, but I don't think anybody thinks all clients should understand macbinary even though this could be said to be the local form of Mac files. Mac servers should convert, say, GIF files stored on a Mac to canonical form by discarding the resource fork and sending only the data fork. Do we at least agree on this point?

What you are suggesting is that everything go in canonical form *except* text, which should be considered a special case because it's common and has varying local forms, but those local forms are not inordinately difficult to understand in a flexible fashion. This may be a reasonable exception. But it is an exception, an argument that textual object-bodies should be a special case.

>I have yet to hear a factual, supported reason why this must be done or
>else HTTP will fail.

That's OK, I have yet to hear a clear statement from you of which of the two positions I mentioned in the first paragraph of my previous message is yours. Do you think that canonicalizing line breaks is technically fine but just too expensive to implement, or that it's just plain dumb? It does make a difference. If you think the former, then we agree about the important stuff.

>You are asking that current practice be
>discarded in favor of an idea that has not been proven to be of any use to
>the HTTP community.

No, I am stating that I think existing standards and practices outside of HTTP are being dismissed without due consideration.
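
The two options being weighed for textual object-bodies can be sketched in a few lines of Python: either the sender rewrites every line break to the canonical CRLF, or the receiver accepts whatever local form arrives. This is a minimal illustration only. The function names are hypothetical, and it assumes that a bare CR, a bare LF, and CRLF are the only local line-break forms in play.

```python
import re
from typing import List

# Assumption: local line breaks are one of CRLF, bare CR, or bare LF.
# CRLF is listed first so it is matched as one break, not as CR then LF.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_canonical(text: bytes) -> bytes:
    """Sender-side option: rewrite every line break as canonical CRLF."""
    return _LINE_BREAK.sub(b"\r\n", text)


def split_lines_tolerantly(text: bytes) -> List[bytes]:
    """Receiver-side option: accept any of the three line-break forms."""
    lines = _LINE_BREAK.split(text)
    # A trailing line break yields one empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

A receiver written this way handles canonical text as well as the local forms, which is the sense in which canonicalized text never has to be rejected.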
I fully expect us to eventually get a reasonable approach which either tolerates or standardizes existing practice with regard to the special treatment of textual objects. The question is whether it should tolerate it, as the current spec appears to do, or standardize it, and exactly how.

Albert Lunde said:

>Now, it seems like we are saying is that current practice
>(not just "bad" servers) is to treat EOL differently in
>the object body for performance reasons.
>
>In this, and in other ways, we are not just quoting the MIME
>spec, we are sort of rewriting it.

Yes. Absolutely.

- Marc
Received on Wednesday, 7 December 1994 13:52:39 UTC