- From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
- Date: Thu, 01 Dec 1994 22:50:50 -0500
- To: "Roy T. Fielding" <fielding@avron.ICS.UCI.EDU>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Highlighting a few issues, which I hope will not create the impression that I am just trying to disagree with Roy on everything :-)...

- If-Modified-Since. Part of the whole point of how this mechanism was defined is that servers that don't support it will just ignore it and return the whole object, which may sometimes be inefficient but won't break anything. I think servers "should" implement this feature. "Must" is too strong for a feature that increases efficiency but won't break anything by its absence.

- Non-ASCII characters in headers. I don't think this is a big deal at all, though I'll be surprised if there isn't already somebody somewhere using non-ASCII in the comment section of the From: line or something, and I hope it's being done according to RFC 1522 instead of somebody assuming the character set used in his particular nation is the universal character set for the whole world.

- HTTP-Dates. My objection isn't that including the day of the week is unfathomably difficult; it's to changing things in general. It's confusing to say "An rfc1123-date in HTTP actually only allows a restrictive subset of what RFC 1123 specifies," and for little if any gain. I am uncomfortable with deviating from existing specifications without more compelling reasons for doing so. I mean, heck, if we just want a date that's easy to parse, how about an integer of the number of seconds since the beginning of 1970? Easy to implement, at least under UNIX. :-)

- Canonicalization of content. I'll drop this if everyone else thinks I'm just being a pedantic dork, but I really believe the purpose of a specification is to establish precise, correct behavior in which neither clients nor servers need to do heuristic guessing about what means what.

Chuck Sutton suggests:

> IMHO, it should state CR, LF, and CRLF should all be interpreted
> equally as EOL when used as line ends. This avoids any problems with
> machine dependent EOL symbols, and fairly represents the current practice.
> (It also avoids forcing clients and especially servers to do line-by-line
> translations of EOL for all outgoing response information, which is a BIG
> performance hit.)

(Aside: Does somebody have benchmarks to establish the magnitude of this "big performance hit"?)

This is probably sensible behavior, and something along these lines (possibly modulo the suggested changes from Ari) should go in an appendix on tolerant, robust implementations. This is in keeping with the oft-cited philosophy of "be liberal in what you accept." However, the other half of that is "be conservative in what you send." Being conservative means sending objects in canonical form only, and not assuming the program on the other end will be clever enough to guess what you really meant. The spec should say this.

And what about new developments? If UNICODE support is desired, how should line breaks be represented and detected in a robust fashion? Do we really want to have to include low-level stuff like this in the spec, instead of just saying "do it in canonical form"?

Aside: The issue of canonicalization is, in principle, not wedded to any particular content-type family, but in practice seems almost exclusively associated with line endings. That association isn't really accurate; for instance, discarding the resource fork from a Mac file and sending on the data could be considered converting it to canonical form, and obviously that's needed. Or should we expect all clients to be clever enough to recognize that and discard it? :-)

OK, end of tirade (maybe.)

If people simply must ship around objects with different ways of representing the same thing, there should be an out-of-band way to indicate that. A Content-Encoding of "unix-text", for instance, could indicate that line breaks are represented with LF. Obviously a provision for multiple C-Es would be needed to describe things like "gzipped UNIX text". This should be a C-E, though, not a C-T-E. A proposed C-T-E for UNIX text would probably trigger an uproar of laughter on the MIME mailing list (and rightly so.)

- Passing thought: If a request contains a Message-ID header, should the server include that message-ID in the response, maybe in an In-Reply-To: header?

- Marc
--
Marc VanHeyningen <URL:http://www.cs.indiana.edu/hyplan/mvanheyn.html>
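
To make the canonicalization point above concrete, here is a minimal sketch of both halves of "be liberal in what you accept, be conservative in what you send" as applied to line endings in text objects. It is an illustration only: the helper names `split_lines_tolerant` and `to_canonical` are hypothetical and come from no spec, and it assumes the text has already been decoded to a string and that CRLF is the canonical line break.

```python
import re

# Try CRLF first so it is treated as one line break, not two.
_EOL = re.compile(r"\r\n|\r|\n")


def split_lines_tolerant(text: str) -> list[str]:
    """Liberal receiver: accept CR, LF, or CRLF as a line end."""
    lines = _EOL.split(text)
    # A trailing terminator leaves one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def to_canonical(text: str) -> str:
    """Conservative sender: emit every line break as CRLF (assumed canonical)."""
    return _EOL.sub("\r\n", text)
```

A receiver written this way copes with mixed line endings, while a sender that always calls something like `to_canonical` never relies on the other end guessing. Whether the per-line translation cost matters in practice is exactly the benchmark question raised above.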
Received on Thursday, 1 December 1994 19:51:49 UTC