Some comments on the HTTP/1.1 draft

Here are some comments from my first reading of the May 2 draft of HTTP/1.1.


1.  Overall, there could be a little more clarity on the relationship between
this document and other versions (lower or higher) of HTTP.  I can think of
several possible relationships:

(A) This document specifies protocol version 1.1: how two peers that each
understand 1.1 and nothing higher talk to each other; all other interactions
are beyond the scope of this document.

(B) This document specifies protocol version 1.1: how two peers that each
understand 1.1 and nothing higher talk to each other; this document also gives
some hints, but not a comprehensive specification, on how other interactions
should go.

(C) This document specifies protocol version 1.1, and a general pattern for how
two 1.*-cognizant agents agree on which protocol version to use.  A response
message is in the highest protocol understood by both requestor and responder;
HTTP 1.* are designed such that any request can be in the highest protocol
understood by the requestor and yet safely interpreted by a responder that
knows only lower numbered protocol versions; an agent that understands 1.X is
required to understand 1.Y for all 0 <= Y <= X.

(D) This document specifies how a 1.1 agent interacts with a 1.0 or 1.1 agent.

(E) This document specifies how a 1.1 or 1.0 agent interacts with a 1.1, 1.0,
or 0.9 agent.

There is a discussion of 1.1 vs. 1.0 early on, and it started me thinking one
way, but some of the later material made me suspect a different relationship
was in mind.  In particular, there are some uses of "MUST" and "SHOULD" that
look appropriate for 1.1-to-1.1 communication but not other cases that are also
addressed in this same document.
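
As a rough illustration of reading (C), the version-selection rule could be
sketched as below.  The function name and the (major, minor) tuple
representation are illustrative assumptions, not anything taken from the draft.

```python
def response_version(requestor, responder):
    """Pick the protocol version for a response under reading (C).

    Versions are (major, minor) tuples.  Both agents are assumed to be
    HTTP 1.* agents that understand every 1.Y with 0 <= Y <= their own minor.
    """
    if requestor[0] != 1 or responder[0] != 1:
        raise ValueError("reading (C) only covers 1.* agents")
    # The highest version understood by both is the lower of the two.
    return min(requestor, responder)


print(response_version((1, 1), (1, 0)))  # (1, 0)
print(response_version((1, 3), (1, 1)))  # (1, 1)
```

Under readings (A), (B), (D), and (E) no such general rule is implied, which
is why the choice among them matters for how "MUST" and "SHOULD" are read.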


2.  I've always found the term "content negotiation" to suggest more than is
intended.  I think what's intended is more like "format negotiation" than the
full generality suggested by "content negotiation".  Given the other
terminology set forth, perhaps "variant negotiation" is a term that hits the
nail on the head?


3.  Let me see if I've got this "resource" and "entity" stuff right.  One
aspect is variability over time.  A "resource" is like a variable, an "entity"
like a value.  A "resource" can be bound to different entities over time (a
generic resource to multiple entities at once, a plain resource always to
exactly one entity).


4.  Another aspect is genericity.  A resource might be "generic", in that it
corresponds to multiple variables, rather than a single one.  Each of these
variables holds exactly one value (i.e., entity) at a time.  The set of
variables associated with a generic resource can change over time.  I expected
to see a term for the role played by "variable" in my explanation; instead I
see a term, "variant", for the value held by such a variable.  This seems
doubly odd to me, not only because it does not provide a term I think I need,
but also because it provides a second term for a concept we already have ---
the value of a variable (it seems odd to use different words for values of the
same kind depending on the role of the variable that holds them).
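
The variable/value reading in comments 3 and 4 can be sketched as a small
model.  This is only one interpretation; the class names and the word "slot"
(a stand-in for the missing term for "variable") are hypothetical and do not
come from the draft.

```python
class PlainResource:
    """A single variable: bound to exactly one entity (value) at a time."""

    def __init__(self, entity):
        self.entity = entity

    def rebind(self, entity):
        # The binding changes over time; the entities themselves do not.
        self.entity = entity


class GenericResource:
    """A set of variables ("slots"), each holding one entity at a time."""

    def __init__(self):
        self.slots = {}  # slot key -> entity; the set of slots may change

    def bind(self, key, entity):
        self.slots[key] = entity

    def unbind(self, key):
        del self.slots[key]

    def variants(self):
        # In the draft's terms, a "variant" is the value held by a slot.
        return list(self.slots.values())


page = GenericResource()
page.bind("text/html", b"<p>hello</p>")
page.bind("text/plain", b"hello")
print(page.variants())
```

In this model `variants()` returns ordinary entities, which is the oddity
noted above: the draft names the value but not the slot that holds it.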


5.  What kinds of "entities" are there *besides* "resource entities"?


6.  Section 6.1 doesn't mention the rule for concatenation.  The rule for
alternatives uses an "I" instead of a "|" in one place.


7.  Section 7.1, paragraph "Applications sending...MUST include an HTTP-Version
of ``HTTP/1.1''".  Is this intended to apply to the response to an HTTP/1.0
request?


8.  Section 7.2.1 switches from preferring "URI" to preferring "URL" where the
grammar is displayed.


9.  Section 7.2.1 says "All client and proxy implementations MUST be able to
handle a URI of any finite length".  Do you mean to say my implementation isn't
compliant if it can't handle a URI that's 2^100 byes long?


10.  The end of section 7.4 talks about "the lowest common denominator of the
character codes" used in a body.  But there's no definition of LCD over
character sets wrt character codes.  Presumably it's something like "the
simplest character set (i.e., encoding) that includes those character codes and
assigns them the intended characters".  But how exactly do I compare character
sets (encodings)?  Does this define a total order among character sets?  And is
the "SHOULD" here suggesting I should scan the document content to compute the
set of codes actually used, and then do some kind of search for the "simplest"
encoding that uses those codes?  That could be a much smaller set of codes than
available in the native encoding of the document, and the search could
conceivably find a "simpler" encoding.  But this seems like an implausible
amount of work to be recommending.
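
To show how much work that reading of the "SHOULD" implies, here is a minimal
sketch of such a search.  The ordering of candidates from "simplest" to most
general is purely an assumption made for illustration; the draft defines no
such order, which is the gap noted above.  The sketch also works on decoded
characters rather than raw character codes.

```python
# Hypothetical ordering, simplest first.  The draft does not define one.
CANDIDATES = ["ascii", "iso-8859-1", "utf-8"]


def simplest_charset(text, candidates=CANDIDATES):
    """Return the first candidate encoding that can represent all of text."""
    for name in candidates:
        try:
            # Each trial encodes the entire body, so the cost grows with
            # both the body size and the number of candidates.
            text.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    raise ValueError("no candidate charset can represent the text")


print(simplest_charset("plain text"))    # ascii
print(simplest_charset("na\u00efve"))    # iso-8859-1
print(simplest_charset("\u65e5\u672c"))  # utf-8
```

Even this simplified version has to scan the whole body at least once per
candidate, and it only works because a total order was assumed up front.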


11.  Last paragraph of page 21, 2nd sentence includes "a an".


12.  Section 7.6, last sentence.  I think you mean "size zero chunk header",
not "zero-sized chunk": a "zero-sized chunk" would be ``"0" CRLF CRLF'', but
the display shows only one CRLF after the "0".
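
The distinction can be seen by encoding an empty chunk.  This sketch assumes
the chunk layout is hex size, CRLF, data, CRLF with no extensions, which is my
reading of 7.6; the helper name is made up for the example.

```python
CRLF = b"\r\n"


def encode_chunk(data):
    """Encode one chunk as: hex size, CRLF, data, CRLF (assumed layout)."""
    size = format(len(data), "x").encode("ascii")
    return size + CRLF + data + CRLF


zero_sized_chunk = encode_chunk(b"")   # a full chunk whose data is empty
size_zero_chunk_header = b"0" + CRLF   # only the size line

print(encode_chunk(b"hello"))          # b'5\r\nhello\r\n'
print(zero_sized_chunk)                # b'0\r\n\r\n'
print(size_zero_chunk_header)          # b'0\r\n'
```

The second CRLF in the zero-sized chunk is the one that follows the (empty)
chunk data, which the display in the draft does not show.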


13.  Section 7.7, last paragraph.  How about giving a URL for the registry?
Starting from <http://www.isi.edu/div7/iana/> for IANA, I found
<ftp://venera.isi.edu/in-notes/iana/assignments/media-types/>.


14.  Section 7.11, last sentence: ``... a resource entity that does not
exist''.  This linguistically is a little odd.  Since entities are eternal
values, it's odd to talk about them "existing".  As an analogy, numbers are
also eternal values.  4.7 always "exists".  You probably mean something more
like ``... a resource entity that is not currently bound to any variable at
XXX'', for some value of XXX.


15.  Section 7.12, first sentence starts talking about "instances" of
"entities".  Now, if an entity is a value, not a variable, what is the word
"instance" trying to tell me?  It looks like this sentence is trying to say "a
cache stores values, not variables" --- but can't because there's no term for
"variables" defined.


16.  Section 7.13 mentions "opaque-validator", but we haven't yet seen what it
is, and there's no reference to where it's defined (it's not defined, or even
clearly mentioned, in 16.5.3).


17.  Section 9, first paragraph promises two formats for a request, but
displays only one.


18.  Section 9.1.2, 3rd block of text talks about a proxy needing to recognize
all its names to avoid loops.  I think this is necessary only in the case where
the proxy is trying to send the request on "to the server specified by the
absoluteURI" and uses HTTP/1.X for some X>1 where an absoluteURI, not an
abs_path, is used in requests to origin servers.  Therefore I think the need to
recognize all a proxy's names is best brought up after the requirement to
accept absoluteURIs in requests to origin servers.


19.  Section 9.1.2, sentence "The Request-URI is transmitted as an encoded
string, where ...".  It's not obvious to me whether this sentence is defining
the encoding (by the "where" clause) or merely mentioning a property of an
encoding defined elsewhere.  Is it saying that the general syntax defined in
7.2.1 is not what applies here, but some encoding of it?  Or alluding to the
fact that the general syntax of 7.2.1 allows for something else to be encoded
in the 7.2.1 syntax?  Is this sentence saying something other than "The
Request-URI is in the syntax defined in 7.2.1"?


20.  Section 11.2, last sentence before 11.2.1 says "an entity body or a
Content-Length header field defined with a value of 0".  Why the second half of
the "or"?  Wouldn't "an entity body of length 0" capture what's intended, given
the discussion elsewhere about what length indicators have to be present when
there's an entity body?


21.  Section 11.2.2, first block of text ends with "server closing the
connection", but I wonder if "sender" is meant instead.


22.  Section 11.2.2, third block of text says closing the connection can't be
used to indicate end of a request.  But TCP (I mean the protocol, not
necessarily the subset easily available through the Berkeley sockets interface)
has a way (FIN) to indicate one peer is done sending without foreclosing the
ability of that peer to receive more from the other, right?
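
A minimal sketch of that half-close over a loopback TCP connection follows.
It is only an illustration of the TCP mechanism, not of anything the draft
specifies; the message contents and helper name are made up.

```python
import socket


def read_until_eof(sock):
    """Read from sock until the peer signals end of data (recv returns b'')."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            return b"".join(chunks)
        chunks.append(data)


# Loopback TCP connection; port 0 lets the OS pick a free port.
listener = socket.create_server(("127.0.0.1", 0))
client = socket.create_connection(listener.getsockname())
server, _ = listener.accept()

client.sendall(b"request body with no length indicator")
client.shutdown(socket.SHUT_WR)  # sends FIN; the client can still receive

request = read_until_eof(server)  # server sees end of request via the FIN
server.sendall(b"response after seeing end of request")
server.close()

response = read_until_eof(client)
client.close()
listener.close()

print(request)
print(response)
```

Here the client marks the end of its request with `shutdown(SHUT_WR)` and
still reads the full response afterwards, so the connection is only half
closed at the point the request ends.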


23.  Section 12.2.1.2, first paragraph, last 2 sentences.  The "otherwise"
clause of the last sentence seems to apply to the situation where "the action
*can* be carried out immediately", but the previous sentence says 201, not 202,
should be used in this case.


24.  Section 12.2.1.6, last sentence: Do you really mean to not say "... MUST
lack an entity body or include one of length 0"?


25.  Section 12.3.1.2, second paragraph applies "If the new URI is a location".
OK, what if it's not?  Same for 12.3.1.3.


26.  Section 12.5, introductory paragraph, second sentence says "If the client
has not completed the request when a 5xx code is received, it SHOULD
immediately cease sending data to the server".  Do you mean the client should
immediately terminate the request, or the connection?  A similar question
arises in 12.4.1.14, where I think the answer is more clearly indicated (by
lack of proscription): it's the request, not the connection, that's terminated.


27.  Section 13, 2nd introductory paragraph.  I'm surprised by the requirement
to include a Host header in requests using absoluteURIs.  Note that the top of
p. 30 says the Host header will be ignored in this case.  Are you planning to
drop the requirement for Host headers in later HTTP/1.* protocols?  If so,
don't you want to allow a 1.1 server receiving a 1.3 request to not err if Host
is missing but an absoluteURI is given?  Section 9.2 seems to be saying this.
For consistency, section 13 shouldn't give a flat "MUST" when the situation is
actually more complex.  "MUST" can be read as both of two things: a requirement
on senders, and a guarantee to receivers (i.e., restriction on what they're
required to accept without complaint); I think the latter is not intended here.


28.  Section 13.3 says you can NOT put the "Range" restriction in a HEAD
request.  What would be wrong with allowing the Range restriction?


29.  Section 16.1, last sentence before 16.1.1: do you also want to stipulate
that a transparency violation should be accompanied by a warning?  Or that such
violations should only be done under user control?


30.  Section 16.1.1, item 2: I'm surprised to see "least restrictive" instead
of "most" here.


31.  Section 16.1.1 ends with an incomplete sentence.


32.  Section 16.2.1, the displayed note: is this subordinate to the "if" of the
preceding paragraph?


33.  Section 16.2.2, second paragraph.  I'm surprised to see history lists
addressed in a protocol document; aren't history lists purely a browser design
issue?


34.  Section 16.3.4, first bullet.  Replace "SHOULD send an entity tag
validator unless ..." with "SHOULD send a strong entity tag validator unless
...", because the "unless" talks about weak ones?


[I have more comments, but no time to send them today.]

Mike

Received on Friday, 17 May 1996 08:05:01 UTC