- From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
- Date: Sun, 04 Dec 1994 12:34:36 -0500
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com

Chuck Shotton said:
>At 11:35 PM 12/1/94, Albert Lunde wrote:
>>However, I think this is too much of a burden to add to client
>>writers.

It might or might not be. However, we do need to decide whether this
behavior should be described in an appendix on "how to be tolerant of
bad servers" or should be a required part of the spec. If it stays
where it is now, a "non-bad server" needs to be defined more clearly.

(Note that this is also a burden on server authors, unless we require
that the "request" portion be in canonical form but allow the
"response" portion not to be; this would be sort of strange.)

>Not at all. Show me a single client that doesn't already do this. As Roy
>says, it is also an issue of scale. A client can more effectively do this
>translation once for a single user than a server that must do it thousands
>of times an hour for all users.

This assumes servers do the conversion on the fly. While that is one
way they might do it, it certainly is not the only way. A clever
server might store the document in canonical form after converting it
once, or cache frequently-requested documents after the conversion, or
whatever. If it's important to do, it can be implemented efficiently;
the question is whether it's important.

In any case, if you are sending text with something other than CRLFs
as line breaks, you are not sending text/plain; it's something else.
The definition of text/plain is very clear on this point. I merely
believe that, whatever it is, it should be clearly labeled. If we
want to invent a way to label it, that's OK; Content-Encoding is a good
way, and since it's not part of MIME we can do whatever we like with
it.

That reminds me, the current draft officially codifies "x-compress"
and "x-gzip" as registered encodings. Since the x-token convention is
normally used to indicate unregistered stuff, I think this should
change to just "compress" and "gzip" [with compliance for the old
labels mentioned, since existing systems will still use them.]
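
As a rough illustration of accepting the old labels alongside the
proposed ones, a receiver could fold the legacy x-tokens into the new
names before deciding how to decode. This is only a sketch: the table
and the function name normalize_encoding are hypothetical, not anything
taken from the draft.

```python
# Hypothetical alias table: legacy x-tokens mapped to the proposed names.
ENCODING_ALIASES = {
    "x-gzip": "gzip",
    "x-compress": "compress",
}


def normalize_encoding(token: str) -> str:
    """Return the proposed token for a Content-Encoding value.

    Matching is case-insensitive and ignores surrounding whitespace.
    Tokens that are not in the alias table are returned lowercased.
    """
    cleaned = token.strip().lower()
    return ENCODING_ALIASES.get(cleaned, cleaned)
```

With this, a sender that still emits "x-gzip" and one that emits "gzip"
are treated identically by the receiver.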

(Actually, I think the encodings should be "LZW" and "LZ77", referring
to the actual algorithms rather than the names commonly given to the
UNIX implementations of those algorithms, but that would cause more
confusion than it's worth.)

>Wouldn't be a very efficient standard when implemented. Every compiler I'm
>aware of supports multiple representations for EOL. Why shouldn't the
>parsers associated with HTTP and HTML be equally tolerant? HTML files are
>"source code" for the HTTP "compiler."

HTML and HTTP are orthogonal. And the issue of transmitting objects
in canonical form is not just about text, though that's the most
prominent example.

>>It would be nice if HTTP and HTML standards agreed on the treatment
>>of line breaks in text/html....

Indeed it would be. However, MIME-Version 1.0 requires that all
textual subtypes have line breaks represented as CRLFs, so the decision
is pretty easy unless we want to register it as application/html.

>I agree... as long as it accommodates all the representations for EOL in
>current practice. The current attitudes towards this seem to be very
>Unix-centric and this is very wrong.

And requiring all clients in existence, regardless of what platform
they run on, to understand the UNIX conventions for line breaks in
text is not UNIX-centric? Huh?

>It won't be long before we see HTTP
>servers that have NOTHING to do with a local file system and reside on top
>of a DBMS or some other non-traditional object store. I'm not aware of ANY
>commercial DBMS implementations that use LF as EOL. This diverges from the
>topic a bit, but I'm trying to make a point that it is NOT sufficient to
>accommodate only a portion of the platforms in use (e.g., Unix) in the
>standard as they will represent a decreasing proportion of Web platforms as
>the Web grows.

I absolutely agree; over time, more and more different platforms will
be used in widely varying ways. But I don't think this supports your
position; quite the opposite.

This is why I shy away from codifying into the standard (is this
intended to be an Internet standards-track protocol?) either
UNIX-centrisms or requirements that all implementations understand the
conventions used by every different system there is.

-- 
Marc VanHeyningen  <URL:http://www.cs.indiana.edu/hyplan/mvanheyn.html>
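
To make the convert-once idea from the message concrete, here is a
minimal sketch of a sender that rewrites whatever line-break convention
its platform uses into CRLF and keeps the result in memory, so the work
is not repeated per request. The names to_canonical and
canonical_document are hypothetical, and the cache is assumed to be
acceptable for documents that do not change while the server runs.

```python
import re
from functools import lru_cache

# Match CRLF first so an existing CRLF is not split into two breaks,
# then a bare CR (classic Mac OS) or a bare LF (UNIX).
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_canonical(body: bytes) -> bytes:
    """Rewrite every line break in body as CRLF."""
    return _LINE_BREAK.sub(b"\r\n", body)


@lru_cache(maxsize=128)
def canonical_document(path: str) -> bytes:
    """Read a text document and return it in canonical form.

    The result is cached per path, so each document is converted once.
    Assumption: the file is not modified after it is first served; this
    sketch makes no attempt to invalidate stale entries.
    """
    with open(path, "rb") as handle:
        return to_canonical(handle.read())
```

Text that is already in canonical form passes through unchanged, so the
conversion is safe to apply to stored documents of unknown origin.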
Received on Sunday, 4 December 1994 09:35:38 UTC