Re: Comments on the HTTP/1.0 draft. from Roy T. Fielding on 1994-12-02 (ietf-http-wg@w3.org from October to December 1994)

From: Roy T. Fielding <fielding@avron.ICS.UCI.EDU>
Date: Thu, 01 Dec 1994 20:54:34 -0800
To: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9412012054.aa09964@paris.ics.uci.edu>

> Highlighting a few issues, which I hope will not create the image that
> I am just trying to disagree with Roy on everything :-)...

Well, it's better than just agreeing with me -- I hate that.  ;-)

> - If-Modified-Since.  Part of the whole point of how this mechanism
>   was defined is that servers that don't support it will just ignore
>   it and return the whole object, which may sometimes be inefficient
>   but won't break anything.  I think servers "should" implement this
>   feature.  "Must" is too strong for a feature that increases
>   efficiency but won't break anything by its absence.

No, the object was to remain compatible with *existing* servers -- from the
very beginning it was assumed that all new servers would have to implement
it so that caching mechanisms could effectively switch from using HEAD
to using the conditional GET.  In fact, I regularly tell people that
they MUST upgrade their existing server because the old versions do
not support it.  This is not something open for debate -- it could make
the difference between the web making or breaking trans-atlantic
(and other miserably overloaded) network links.
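
[Illustration: a minimal sketch of the conditional GET behaviour discussed above, not code from any actual server. The function name respond(), the dict-based headers, and the single-resource model are assumptions made for the example. A server that ignores If-Modified-Since simply takes the final "return 200" path every time, which is the backward-compatible case Marc describes.]

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(headers, last_modified, body):
    """Return (status, response_headers, body) for a GET on one resource.

    `headers` is a dict of request headers; `last_modified` is a
    timezone-aware UTC datetime. Both are assumptions of this sketch.
    """
    out = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    ims = headers.get("If-Modified-Since")
    if ims is not None:
        try:
            since = parsedate_to_datetime(ims)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall back to the full response
        # Only honour the condition when the date carried a usable time zone.
        if since is not None and since.tzinfo is not None:
            if last_modified <= since:
                return 304, out, b""  # Not Modified: send no body
    return 200, out, body


if __name__ == "__main__":
    lm = datetime(1994, 11, 15, 12, 45, 26, tzinfo=timezone.utc)
    req = {"If-Modified-Since": "Tue, 15 Nov 1994 12:45:26 GMT"}
    print(respond(req, lm, b"<html>...</html>")[0])
```

[The point of the sketch is the cost difference: a cache revalidating with the conditional GET gets a short 304 with no body when nothing changed, instead of a HEAD followed by a second full GET.]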

> - Non-ASCII characters in headers.  I don't think this is a big deal
>   at all, though I'll be surprised if there isn't already somebody
>   somewhere using non-ASCII in the comment section of the From: line
>   or something, and I hope it's being done according to 1522 instead
>   of somebody assuming the character set used in his particular nation
>   is the universal character set for the whole world.

I suppose so.

> - HTTP-Dates.  It's not that including the day of the week is
>   unfathomably difficult, but changing things in general.  It's
>   confusing to say "An rfc1123-date in HTTP actually only allows a
>   restrictive subset of what RFC 1123 specifies," and for little if
>   any gain.
> 
>   I am uncomfortable with deviating from existing specifications
>   without more compelling reasons for doing so.  I mean, heck, if we
>   just want a date that's easy to parse, how about an integer of the
>   number of seconds since the beginning of 1970?  Easy to implement,
>   at least under UNIX. :-)

There is no deviation -- it's just a subset.  That way, HTTP messages
can go outside HTTP with no problem, and inside HTTP we have a completely
unambiguous date format which will last beyond the year 1999.
We could, of course, require that clients and servers be capable of
parsing all date formats (a la USENET's get_date() function), but I
don't think I'd live through the pummeling I'd get from all the programmers
at every Web conference for the next 50 years or so.  ;-)
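
[Illustration: a sketch of what accepting only a fixed subset of RFC 1123 dates might look like. The exact grammar is assumed here to be the familiar "Sun, 06 Nov 1994 08:49:37 GMT" form (three-letter day, two-digit day of month, four-digit year, GMT only); the name parse_http_date() is hypothetical. Anything outside that one shape is rejected rather than guessed at.]

```python
import re
from datetime import datetime, timezone

_DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# One fixed-width shape only; no optional parts, no alternative zones.
_DATE_RE = re.compile(
    r"(" + "|".join(_DAYS) + r"), (\d{2}) (" + "|".join(_MONTHS) + r") "
    r"(\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT"
)


def parse_http_date(value):
    """Parse the assumed fixed-format date or raise ValueError."""
    m = _DATE_RE.fullmatch(value)
    if m is None:
        raise ValueError("not in the fixed date format: %r" % value)
    wkday, day, mon, year, hh, mm, ss = m.groups()
    dt = datetime(int(year), _MONTHS.index(mon) + 1, int(day),
                  int(hh), int(mm), int(ss), tzinfo=timezone.utc)
    # The day name is redundant, so check that it agrees with the date.
    if _DAYS[dt.weekday()] != wkday:
        raise ValueError("day of week does not match date: %r" % value)
    return dt


if __name__ == "__main__":
    print(parse_http_date("Thu, 01 Dec 1994 20:54:34 GMT"))
```

[Because the year is four digits and the zone is always GMT, the same string is also a valid RFC 1123 date outside HTTP, which is the "subset, not deviation" property claimed above.]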

> - Canonicalization of content.
> 
>   I'll drop this if everyone else thinks I'm just being a pedantic
>   dork, but I really believe the purpose of a specification is to
>   establish precise, correct behavior in which neither clients nor
>   servers need to do heuristic guessing about what means what.

That is what ftp does.  It may look good in theory, but it simply isn't
necessary in practice.  That doesn't mean we shouldn't encourage people
to do so -- it just means we can't require it in HTTP/1.0.

>   Chuck Sutton suggests:
  >> IMHO, it should state CR, LF, and CRLF should all be interpreted
  >> equally as EOL when used as line ends. This avoids any problems with
  >> machine dependent EOL symbols, and fairly represents the current practice.
  >> (It also avoids forcing clients and especially servers to do line-by-line
  >> translations of EOL for all outgoing response information, which is a BIG
  >> performance hit.)
> 
>   (Aside: Does somebody have benchmarks to establish the magnitude of
>   this "big performance hit"?)

Yes, just turn on server parsing of HTML files -- the performance hit
was two to three orders of magnitude on my server (worse if you consider
logfile collisions important).

In general, it is much better for clients to use content heuristics
than it is for servers to deal with content canonicalization, because
of the issues of scale.
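
[Illustration: a sketch of the client-side tolerance being argued for, in which a bare CR, a bare LF, and CRLF are all treated as one line end when reading text content. The helper name split_lines() is hypothetical. The server sends the bytes as stored and each client pays the small cost once.]

```python
import re

# CRLF must come first so it is consumed as one line end, not two.
_EOL = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, accepting CR, LF, or CRLF as the line end."""
    lines = _EOL.split(text)
    # A trailing line end produces one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


if __name__ == "__main__":
    mixed = "unix line\nmac line\rdos line\r\nlast"
    for line in split_lines(mixed):
        print(repr(line))
```

[An explicit pattern is used rather than str.splitlines() because splitlines() also breaks on several other characters (form feed, vertical tab, and others), which is wider than the three conventions under discussion.]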

> - Passing thought:  If a request contains a Message-ID header, should
>   the server include that message-ID in the response, maybe in an
>   In-Reply-To: header?

Nope.  I misrepresented this in the spec -- Message-ID should only be
"strongly-recommended" for PUT and POST requests, since that is the only
time when it is useful.  Is this acceptable?

Similarly, Date: should only be "strongly recommended" for response
messages and PUT/POST requests.


......Roy Fielding   ICS Grad Student, University of California, Irvine  USA
                                     <fielding@ics.uci.edu>
                     <URL:http://www.ics.uci.edu/dir/grad/Software/fielding>
Received on Thursday, 1 December 1994 21:07:18 UTC