- From: Jean-Philippe Martin-Flatin <syj@ecmwf.int>
- Date: Wed, 20 Sep 1995 12:03:43 +0100
- To: Roy Fielding <fielding@beach.w3.org>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Hi,

Here are some comments on HTTP 1.0 Draft 3, referencing the PostScript version. I've tried to sort them into 5 categories: <Problem>, <Why>, <Edit>, <Comment> and <Pedantic>, to help you browse through them. Feedback is welcome!

## 1 ## <Comment>
As a general rule, we should have fewer "should" and more "must" in this spec. Otherwise, backward compatibility will be a nightmare with the advent of HTTP/1.1 and HTTP/1.2. Also, we should stick to the usual hierarchy {must, should, may} found in RFCs, and avoid expressions like "are strongly encouraged to" wherever possible.

## 2 ## <Why>
Page 4, section 1.1, first paragraph: "This specification is not intended to become an Internet standard". Why? FTP is a standard, Telnet is a standard, why shouldn't HTTP become one?

## 3 ## <Pedantic>
Page 4, section 1.2, first paragraph: "metainformation" should read "meta-information". The same correction should be made on page 5, definition of entity; page 19, section 5.2.2, lines 2 and 4; page 22, last line before section 6.2.1; page 23, last line but one; page 27, section 7.1, line 1.

## 4 ## <Edit>
Page 5, definition of server: "A program that accepts connections in order to service requests by sending back responses". We should add after this sentence: "A server can be an origin server, a proxy or both".

## 5 ## <Edit>
Page 5, definition of proxy: We should remove the last sentence: "Some proxy servers also act as origin servers". It's confusing in this context, and #4 makes this point clear.

## 6 ## <Pedantic>
Page 6, definition of #rule, line 3: "and optional linear whitespace (LWS)". To be consistent with the rest of section 2.1, this should read "and optional linear whitespace characters (LWS)". Same correction on page 7, section implied*LWS, line 2.

## 7 ## <Edit>
Page 7, section 2.2, line 2: We should add at the end of the first paragraph something like: "Throughout this document, BNF definitions are not given in the usual top to bottom manner, i.e. definitions are not only based on reserved words previously defined, but also use reserved words defined further on in the text. This was deemed to improve the clarity of the whole document".

## 8 ## <Why>
Page 7, section 2.2, definition of LWS: Why do we have [CRLF]? Do more than 80% of WWW applications use CRLF as a linear whitespace character? I doubt it. On page 16, section 4.2, it says: "Header fields can be extended over multiple lines by preceding extra line with at least one LWS, though this is not recommended". Better than not recommended, this should be forbidden by the protocol. Instead, a mention of this issue on page 43, section B ("Tolerant applications") would be a better idea. What is the rationale for allowing it?

## 9 ## <Why>
Page 7, section 2.2, definition of tspecials: What does the "t" stand for in the reserved word "tspecials"?

## 10 ## <Pedantic>
Page 7, section 2.2, definition of tspecials: Rather than 4 lines, this could be compressed into 2 lines. The same comment applies to many other BNF definitions. This would make the whole document more concise.

## 11 ## <Pedantic>
Page 8, line 5: "any text" is confusing: "text" is a reserved word here, and refers to the BNF definition further on. It's not the common word "text". This could be made more explicit by using a different typography for reserved words, e.g. italics.

## 12 ## <Edit>
Page 8, last but one line: "Recipients... may assume that they represent ISO-8859-1 characters". It's not "may", it's either "should" or "must". I'd vote for "must".

## 13 ## <Edit>
Page 10, section 3.2.1, definition of scheme: Do we really need to allow "+", "-" and "." in a scheme name? I don't think so. ALPHA and DIGIT are enough.

## 14 ## <Edit>
Page 10, last but one line: Typo: in "and HTTP proxies may receive requests for URLs", "proxies" should clearly read "servers".

## 15 ## <Edit>
Page 11, section 3.2.2: The trailing slash to indicate the default page of a server (e.g. http://www.w3.org/ as opposed to http://www.w3.org) should be mandatory for clients, and servers should apply the robustness principle, i.e. understand URIs without a trailing slash and add one. The current wording is a bit vague on this issue.

## 16 ## <Edit>
Page 11, section 3.3, second paragraph: It should be stated more clearly that the 3rd format (asctime) is obsolete, that neither clients nor servers should generate a date in asctime, but both should be able to understand it (again, principle of robustness). We should also add that the second format (RFC 850/1036) uses 2 digits for the year, and with the year 2000 getting close, this format is likely to be obsoleted by the next release of HTTP.

## 17 ## <Edit>
Page 11, definitions of rfc1123-date and rfc850-date: To be consistent, we should name them either rfc1123-date and rfc1036-date or rfc822-date and rfc850-date. The current names are inconsistent.

## 18 ## <Edit>
Page 12, last line before the definition of charset: The line "and other names specifically recommended for use within MIME charset parameters" should be deleted. This is not an exhaustive list of all enabled charsets, but just "the preferred names for those character sets most likely to be used".

## 19 ## <Edit>
Page 12, definition of charset: "token" should be replaced with "<IANA character set>". It is a bad idea not to use the IANA character sets, I can't see the rationale for it. The 2 sentences following the definition of charset should be replaced with: "Applications are encouraged to use the preferred character set names listed above, and required to use a character set defined by the IANA registry".

## 20 ## <Edit>
Page 13, section 3.5, first line: Typo: "that has been or can be applied to a resource" should be replaced with "that has been applied to a resource". A content coding that "can be" applied to a resource is meaningless.
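
To make #16 concrete, here is a minimal sketch in Python (my choice of language; nothing in the draft prescribes it). It accepts all three date formats but only ever generates the first one. The names `HTTP_DATE_FORMATS`, `parse_http_date` and `format_http_date` are hypothetical, and the sketch assumes an English (C) locale for day and month names.

```python
from datetime import datetime, timezone

# The three formats of section 3.3 written as strptime patterns.
# This table and the function names are illustrative assumptions.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",   # RFC 850 / RFC 1036 (2-digit year)
    "%a %b %d %H:%M:%S %Y",        # ANSI C asctime()
)


def parse_http_date(value):
    """Be liberal: accept any of the three formats, return a UTC datetime."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("unrecognised HTTP date: %r" % value)


def format_http_date(moment):
    """Be conservative: always generate the RFC 1123 form."""
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
```

Note that the 2-digit year of the second format has to be mapped to a century by convention, which is exactly the year-2000 concern raised in #16.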

## 21 ## <Why>
Page 13, section 3.6, lines 6-7: "because it does not restrict itself to the official IANA and x-token types". Why? What's the rationale?

## 22 ## <Edit>
Page 13, section 3.6, after definition of subtype: We should add the following before "Parameters may follow...": "HTTP/1.0 uses media-type values in the Content-Type (Section 8.5) header field". This makes it consistent with section 3.5 and its reference to section 8.3.

## 23 ## <Why>
Page 15, section 3.6.2, line 2: "The multipart types registered by IANA [15] do not have any special meaning for HTTP/1.0". Why? This section says that HTTP multipart messages are possible, and says further on that "multipart body-parts may contain HTTP header fields which are significant to the meaning of this part". But the definition of Full-Request on page 18 allows a single {Entity-Header, CRLF, Entity-Body}, not multiple ones, so there seems to be a contradiction. The multipart issue probably needs further clarification. Is this an item for HTTP/1.1?

## 24 ## <Pedantic>
Page 16, section 4.1, line 7: Typo: "a.k.a." should be expanded to "also known as": this is a formal spec!

## 25 ## <Edit>
Page 16, section 4.2: Several points are not covered here. We should add the following to this section: "No header field has a default value, except Date: (Section 8.6). If a field-value is specified without a field-content, it should be ignored. The field-name is case-insensitive. If a field-name appears in more than one header field, then the whole message should be discarded and a 4xx or 5xx error returned".

## 26 ## <Edit>
Page 17, first line: Typo: "The order in which header fields are received is not significant": "received" should read "sent", cf next sentence.

## 27 ## <Edit>
Page 17, lines 5-6: If we trust section 3.6.2 that "multipart body-parts may contain HTTP header fields", then this sentence is wrong: in each part of a multipart message, we could have the same HTTP header field.
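
The rules proposed in #25 could be sketched as follows. This is only an illustration of the proposed wording, not of the current draft; `BadMessage` and `parse_header_fields` are hypothetical names, and the input is assumed to be header lines already split and stripped of their CRLF.

```python
class BadMessage(ValueError):
    """Hypothetical error that a server would map to a 4xx response."""


def parse_header_fields(lines):
    """Parse 'field-name: field-value' lines using the rules proposed in #25."""
    fields = {}
    for line in lines:
        name, separator, value = line.partition(":")
        if not separator or not name.strip():
            raise BadMessage("malformed header line: %r" % line)
        key = name.strip().lower()   # the field-name is case-insensitive
        value = value.strip()
        if not value:                # no field-content: ignore the field
            continue
        if key in fields:            # repeated field-name: discard the message
            raise BadMessage("duplicate header field: %s" % key)
        fields[key] = value
    return fields
```

Under these rules, `Date: ...` followed by `date: ...` in the same message is rejected outright rather than silently merged.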

## 28 ## <Problem>
Page 17, section 4.3: "There are a few headers... which do not apply to the communicating parties or the content being transferred". MIME-Version surely concerns the content being transferred, so there's a problem here!

## 29 ## <Problem>
Page 17: Should a maximum length be defined for the HTTP header = General-Header + Response-Header + Entity-Header + {Request-Line|Status-Line}, say 4KB? That would ensure that a server cannot get stuck reading an infinite HTTP header from a bogus client.

## 30 ## <Why>
Page 18, section 5.2, line 2: "The method is case-sensitive". Why? Why couldn't we accept Get, Head and Post for instance? Almost everything else is case-insensitive, why be more restrictive here?

## 31 ## <Edit>
Page 21, section 6.1, definition of Status-Line: The Reason-Phrase should really be optional, i.e. the BNF should read:
Status-Line = HTTP-Version SP Status-Code [SP Reason-Phrase] CRLF
This is even implied on the next page: "since that entity is *likely* to include human-readable information".

## 32 ## <Pedantic>
Pages 22-26: Sections 6.2.1 through 6.2.5 should be moved to chapter 8: chapters 4-5-6-7 are not in-depth, chapter 8 is.

## 33 ## <Edit>
Page 23, definition of 201, line 4: "or within a clearly defined timeframe": how can the client learn from the server what this "clearly defined timeframe" is? This is wishful thinking: even if it looks like a good idea initially, it's not feasible in practice with HTTP/1.0. This should be removed from the spec. It may be put back in a later version of HTTP if we add a header field for the server to tell the client what this "clearly defined timeframe" is at the same time as it returns 201.

## 34 ## <Edit>
Page 24, definition of 300, last line: "user agents may use the Location value for automatic redirection". Read this sentence twice: it's ambiguous and has 2 meanings. What you mean is that the user agent can get a Location field from the server. What one can also understand is that the client may use the Location field in its request, which is wrong. I'll let you native English speakers devise a new unambiguous sentence to replace this one!

## 35 ## <Problem>
Page 24, definition of 301, second paragraph: The new URL is given twice, once in the Location header field, once in the Entity-Body. This is redundant and a waste of bandwidth. It shouldn't appear in the Entity-Body, IMHO. Ditto for 302.

## 36 ## <Why>
Page 24, definition of 301, third paragraph: "If the 301 status code is received in response to a request using the POST method, the user agent must not automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued." What is the rationale? What practical case do you have in mind? Ditto for 302.

## 37 ## <Edit>
Page 25, section 6.2.4: Line 2, in "should immediately cease", replace "should" with "must". Line 3, in "the server is encouraged to include", replace "is encouraged to" with "should". In the definition of 400, in "The client is discouraged from repeating", replace "is discouraged from" with "should not".

## 38 ## <Edit>
Page 25, definition of 401, line 3: Just after "suitable Authorization", we should add "(Section 8.2)".

## 39 ## <Comment>
Page 25, definition of 403: We lack a status code whereby the server refuses to honor a request, but is willing to say why, e.g. "this page is available to internal users only". Should we create a new status code for this in HTTP/1.1?

## 40 ## <Edit>
Page 26, lines 4-5: In "it should immediately cease", replace "should" with "must". In "the server is encouraged to include an entity", replace "is encouraged to" with "should".

## 41 ## <Comment>
Page 27, definition of Entity-Header: Allow is considered as an Entity-Header. Isn't it server-specific rather than URI-specific? In other words, shouldn't it be a Response-Header instead?
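
Coming back to the BNF proposed in #31, here is a small sketch of what a client-side parser would look like with an optional Reason-Phrase. The regular expression is my own transcription of the proposed rule, and `parse_status_line` is a hypothetical name.

```python
import re

# Status-Line = HTTP-Version SP Status-Code [SP Reason-Phrase] CRLF
# (the rule proposed in #31, with the Reason-Phrase made optional)
STATUS_LINE = re.compile(r"HTTP/(\d+)\.(\d+) (\d{3})(?: ([^\r\n]*))?\r\n")


def parse_status_line(line):
    """Return ((major, minor), status_code, reason_phrase)."""
    match = STATUS_LINE.fullmatch(line)
    if match is None:
        raise ValueError("malformed Status-Line: %r" % line)
    major, minor, code, reason = match.groups()
    # A missing Reason-Phrase is reported as an empty string.
    return (int(major), int(minor)), int(code), reason or ""
```

With this grammar, both "HTTP/1.0 200 OK" and a bare "HTTP/1.0 304" (each followed by CRLF) are accepted.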

## 42 ## <Edit>
Page 27, section 7.1, last line: Add "unmodified" as follows: "Unknown header fields should be ignored by the recipient and forwarded unmodified by proxies".

## 43 ## <Pedantic>
Page 27, section 7.2, second paragraph: Line 2, "in general" should be deleted, because it's always the case: cf line 4 "must include". Line 4, there's a typo: "request message header", singular. Line 4 again, "HTTP/1.0 requests containing content" doesn't sound great: "containing an entity-body" sounds better to me.

## 44 ## <Edit>
Page 27, section 7.2, last line: "The responses 204 (no content) and 304 (not modified)". We should also add "and 403 (forbidden)".

## 45 ## <Pedantic>
Page 28, line 4: Delete "(i.e. the identity function)", it's dead wood and adds nothing.

## 46 ## <Edit>
Page 28, lines 5-6: Cf #25 for the default value. Replace lines 5-6 with: "All HTTP/1.0 messages with an entity-body must have a Content-Type header field. If and only if this header field is not specified, as is always the case for HTTP/0.9 messages, the recipient may attempt to guess the media type...". Line 8, replace "the receiver should" with "the recipient must".

## 47 ## <Pedantic>
Page 28, section 7.2.2, paragraph 2, lines 2-3: "containing content" should become "containing an entity-body". End of line 3, "entity body" should become "entity-body" to be consistent with the rest of the document.

## 48 ## <Edit>
Page 28, last 2 lines: "Unless the client knows that it is receiving a response from a compliant server, it should not depend on the Content-Length value being correct". Argh! These servers aren't HTTP/1.0 compliant, let's not ask the clients to break the protocol to accommodate them! This sentence should be deleted. Maybe it could be replaced with a reminder that clients should be robust, but I doubt it, as the rest of the note makes the point clear.

## 49 ## <Edit>
Page 28, last line: Add a second note: "Note: The Content-Length header field must not be specified if there is no Entity-Body in the message; in other words, 'Content-Length: 0' is invalid."

## 50 ## <Comment>
Pages 29-35: It would be a good idea to add a line at the beginning of all 8.x sections specifying whether this header field may be found in a request or a response, e.g.:
Request: YES
Response: NO
For the moment, we need to edit the first sentence of most sections 8.x to give this information (a few already have it).

## 51 ## <Edit>
Page 29, section 8.1: Line 1, add "response" after "Allow". Line 4, in "thus should be ignored", replace "should" with "must". Second paragraph, line 2, delete "This field has no default value" (cf #25). Third paragraph, replace "the allow header" with "the Allow header field", to remain consistent in terms of terminology.

## 52 ## <Edit>
Page 29, section 8.2: Beginning of line 1, add: "This is a request header field". Last line, replace "Proxies must not cache the response to a request" with "Proxies must not cache any HTTP/1.0 message".

## 53 ## <Edit>
Page 29, section 8.3: Line 1, add "(request or response)" after "header field".

## 54 ## <Edit>
Page 30, lines 2-3: Delete "or analogous usage". The sentence starts with "Typically", so you don't need it.

## 55 ## <Edit>
Page 30, section 8.4: Line 1, add "(request or response)" after "header field". The paragraph after the example is wrong, cf page 20 line 1 (mandatory for POST) and page 28 line 17. This paragraph should read instead: "Applications must use this field to indicate the size of the Entity-Body to be transferred, regardless of the media type of the entity". Next line, "greater than or equal to zero" should become "greater than zero", cf #49. In the Note, line 3, replace "should" with "must". The rationale is that it's mandatory, but applications should be robust and not crash if it's not there.
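
Taken together, #49 and #55 amount to the following check. This is a sketch of the wording I propose, not of the current draft; `check_content_length` is a hypothetical name, and `headers` is assumed to be a dict keyed by lower-cased field names.

```python
def check_content_length(headers, body):
    """Validate a message against the Content-Length rule of #49 and #55.

    headers: dict with lower-cased field names (assumption of this sketch)
    body:    the Entity-Body as bytes, empty if there is none
    """
    declared = headers.get("content-length")
    if not body:
        # No Entity-Body: the field must be absent, so 'Content-Length: 0' is invalid.
        if declared is not None:
            raise ValueError("Content-Length given without an Entity-Body")
        return 0
    if declared is None:
        raise ValueError("Entity-Body present but Content-Length missing")
    if not (declared.isascii() and declared.isdigit()) or int(declared) == 0:
        raise ValueError("Content-Length must be an integer greater than zero")
    if int(declared) != len(body):
        raise ValueError("Content-Length does not match the Entity-Body size")
    return int(declared)
```

A robust recipient would of course catch these errors rather than crash, which is the point made at the end of #55.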

## 56 ## <Edit>
Page 30, section 8.5: Line 1, add "(request or response)" after "header field". After the example, delete "The Content-Type header field has no default value", cf #25.

## 57 ## <Edit>
Page 30, section 8.6: Line 1, replace "The Date header" with "The Date header field (request or response)". After the example, it states: "If a message is received via direct connection with the user agent (in the case of requests) or the origin server (in the case of responses), then the default date can be assumed to be the current date at the receiving end". The presence of proxies is irrelevant here. This sentence should be replaced with: "If a message has no Date header field, then the recipient may assume that the default date is the current date at the time the message is received". Last line of the page, in "origin servers should always include a Date header", replace "should" with "must".

## 58 ## <Edit>
Page 31, section 8.6: Line 5, delete first sentence: "Only one Date header field is allowed per message" (cf #25).

## 59 ## <Edit>
Page 31, section 8.7: Line 1, add "response header" after "Expires". Lines 2-3, replace "Caching clients, including proxies" with "Caching clients and proxies". Second paragraph, i.e. after the example, delete sentence "The Expires field has no default value" (cf #25). End of second paragraph, after "dynamism", we should add: "The Expires date should not be earlier than the Date date, but this is not mandatory." This is to cope with bogus implementations as explained in the note of section 8.7. Third paragraph, "The Expires field" should become "The Expires header field", for consistency.

## 60 ## <Edit>
Page 32, section 8.8: Line 1, add "request" after "From". Line 3, "as updated" becomes "and updated". End of note, add "(Section 8.14)".

## 61 ## <Edit>
Page 32, section 8.9: Line 1, add "request" after "If-Modified-Since". In a) line 1, replace "200" with "2xx". End of section, add: "Note: Server implementors are encouraged to return responses with a status code of 304 more quickly (i.e. with higher priority) than responses to a normal GET or an If-Modified-Since with another status code." I'm not sure many servers already prioritize their responses, but it sounds like a good idea, as it encourages caching.

## 62 ## <Edit>
Page 33, section 8.10: Line 1, add "response" after "Last-Modified". Line 1, delete "sender believes the", that's dead wood. Nothing is "guaranteed" per se, it's always as the client or the server believes it. Line 3, replace "receiver" with "recipient" twice (cf terminology at the beginning of section 8). Replace last line of section 8.10 with: "In such cases, where the resource's last modification time would indicate some time in the future (e.g. due to time skew between the origin server and a database accessed via a gateway), the server must replace that date with the message origination date".

## 63 ## <Pedantic>
Page 33, section 8.12: Line 1, replace "MIME-conformant" with "MIME-compliant", to use the same expression throughout the spec. Line 1, add "(both requests and responses)" after "HTTP/1.0 messages". Line 9, replace "intended to be MIME-conformant" with "fully MIME-compliant".

## 64 ## <Edit>
Page 34, section 8.13: Line 1, replace "Pragma message" with "Pragma response". Line 3, delete "intermediate" (cf terminology section 1.3). First line after the definition of extension-pragma, replace "intermediary" with "proxy" (cf terminology section 1.3).

## 65 ## <Why>
Page 34, section 8.13, lines 4-5: "All pragma directives specify optional behavior from the viewpoint of the protocol": why optional rather than mandatory?

## 66 ## <Edit>
Page 34, section 8.14: Last line, add "(cf Section 8.8)".

## 67 ## <Comment>
Page 35, section 8.15: After the example: "If the response is being forwarded through a proxy, the proxy application should not add its data to the product list". In fact, the proxy mustn't overwrite any header field, except the HTTP-Version in the Status-Line (cf page 9, last paragraph). So this sentence should probably be removed. Last line of the note: this is security through obscurity. It gives you the illusion of being more secure, that's all. If you have a server open to the world, it's open to hackers. If they know a loophole to break into, say, NCSA httpd 1.3, they'll try it on your server, whatever you return with Server. I don't think there's any point in encouraging server implementors to make Server a configurable option.

## 68 ## <Edit>
Page 35, section 8.16: Line 1, add "request header" after "User-Agent". Line 4, replace "should always" with "should": again, let's stick to the RFC definitions of must, should and may.

## 69 ## <Edit>
Page 35, section 8.17: Line 1, add "response" after "WWW-Authenticate".

## 70 ## <Edit>
Page 36, last but one paragraph: The second sentence is ambiguous: you want to allow authentication and encryption mechanisms which are not at the transport level as well, cf IPv6. I suggest rephrasing it like this: "Additional authentication and encryption mechanisms may be used, e.g. at transport level via message encapsulation, and/or with additional headers...".

## 71 ## <Edit>
Page 36, last but one line: Replace "and must not cache a request containing Authorization" with "and must not cache a message containing an Authorization or WWW-Authenticate header field".

## 72 ## <Comment>
Page 39, lines 3-4: "the Referer field may indicate a private document's URI whose publication would be inappropriate". If it's private, it's either not accessible to anyone except the owner, or not accessible to external users. So there's little risk here, IMHO. The real security problem is the weakness of the authentication and authorization schemes available in HTTP/1.0, as stated earlier in section 10. Referer, From and Server are negligible in comparison, I see little point in insisting so heavily on them.
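
The wording proposed in #71 boils down to a very small test in a caching proxy. The sketch below is only an illustration of that wording; `UNCACHEABLE_FIELDS` and `proxy_may_cache` are hypothetical names, and a real proxy would apply other caching rules (Expires, Pragma) on top of this one.

```python
# Header fields whose presence forbids caching, per the wording proposed in #71.
UNCACHEABLE_FIELDS = frozenset({"authorization", "www-authenticate"})


def proxy_may_cache(request_fields, response_fields):
    """Return False if either message carries an authentication header field.

    Both arguments are iterables of field-names; the comparison is
    case-insensitive, as proposed in #25.
    """
    seen = {name.lower() for name in request_fields}
    seen |= {name.lower() for name in response_fields}
    return seen.isdisjoint(UNCACHEABLE_FIELDS)
```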

## 73 ## <Pedantic>
Page 40, line 10: Replace "Jean Francois-Groff" with "Jean-Francois Groff".

## 74 ## <Pedantic>
Page 43, section B, lines 5-6: To be consistent with the rest of the document: Replace "StatusLine" with "Status-Line". Replace "RequestLine" with "Request-Line".

## 75 ## <Pedantic>
Page 44, section C, line 4: To be consistent with the rest of the document: Replace "MIME-conforming" with "MIME-compliant". Ditto page 45, line 2.

## 76 ## <Comment>
Page 45: The difference between Content-Encoding and Content-Transfer-Encoding is not crystal clear (at least to me!). A few words explaining it would be welcome in section C.4.

## 77 ## <Comment>
Page 45: A section C.5 about multipart could be added, cf #23 and #27.

Et voila! Thanks for reading up to here, it was rather longish...

Jean-Philippe
Received on Wednesday, 20 September 1995 04:07:18 UTC