- From: Mike Cowlishaw <mfc@vnet.ibm.com>
- Date: Tue, 29 Nov 94 16:14:04 GMT
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
I was very pleased to see the new HTTP draft; it's a major improvement on previous versions! Here are some comments on the new draft which I hope will be useful. I am writing these from the perspective of an implementer of software for a reasonably general-purpose HTTP server, so I am especially looking for a definition of the HTTP which a) allows a server to be implemented using the definition (and referenced documents) and specifically without reference to specific clients or other implementations b) makes it absolutely clear what is required for a server to be stated as conforming to the definition. Most of these comments therefore seek clarification in these two areas. I'm sorry I won't be able to attend the IETF meeting next week--a long-standing commitment has me on the wrong coast of the USA. Mike Cowlishaw IBM Fellow, IBM UK Laboratories, Winchester, UK - - - - - - - - - 2.1 The '#' rule implies that no whitespace is allowed after (or before) the commas in a list. Is this correct? For example, in the example in 7.1 there is a space after each comma (which certainly aids readability). 2.2 linear-white-space rule: I didn't understand the comment "CRLF => Folding". I think this rule allows whitespace (but non-null) lines in headers etc.? 3.1 Header fields: (a) "However ... use of comments is discouraged". This seems rather outside the scope of a definition such as this; at most it should be a informational note, and explain why the note is there (historical or client incompatibility, performance, reduced net traffic?). (b) [nit] the second open quote in the comment rule should be a close quote. (c) The ctext rule seems to be missing some characters (there's an open quote, followed by an open single quote, but neither is closed). Also, shouldn't LF be excluded too? 3.2 Object body: (a) This is my 'biggest' question -- I don't understand from the second paragraph how to determine when to stop reading the data on a request. If the headers are only 'similar' to those defined by MIME, then the MIME definition may or may not be relevant. Moreover, a "heuristic function of the Content-Type and Content-Encoding" would appear to be unimplementable, as new Types (especially application/xxx) seem to spring up daily. It would seem to be appropriate that the HTTP protocol specify that Content-Length, in bytes, be Required--at least for Requests. (b) Does the server have to read the headers (and data, if any) on a request? For example, if the customizing filter/script doesn't need the information in order to determine the response, is the server permitted to leave the data unread, or could this embarrass some client(s) or TCP/IP stack(s)? 4.1 Date/Time stamps: (a) I'm a little disturbed that time is only permitted to be specified to the second, given that most server hardware will be able to handle many more than one request per second, and network transit time is often sub-second. If this is a limitation of RFC 1123, perhaps an additional HTTP header for sub-second time information should be specified. (b) Since this is a new standard/document, surely it should specify a single Date/Time format, and only mention the others for compatibility/historical information? (c) [aside] I wish, oh wish, that Longitude/Latitude information were a recommended header. 4.2 Multipart types: From the text, I infer that a server/script is not Required to respond with a multipart type when a client has indicated that it can accept them. It might be worth an explicit statement to that effect. 4.3.1 Date Header Field: (a) [nit] should refer to RFC 1123 rather than 822, or both? (b) It's not clear what time the header should refer to. For a response, is it the time when the request was accepted, or when the response line was generated, or when the first line of the header was transmitted, or when the 'Date:' header line was generated? (c) [nit] 'of' is missing between 'creation date' and 'the enclosed'. 4.3.3 Message-ID: [suggestion] Although the example shows a unique ID, it might be nice to encourage via the example a form of ID that includes the port number (if not 80), and even follows URL format. Perhaps: Message-ID: <http://info.cern.ch:8080/9411251630.4256> 4.3.4 MIME version: It is not at all clear why this is useful and strongly recommended if it is not an indication of full compliance with MIME. One might argue that it should *not* be included unless MIME-compliance of the remainder of the header and data is guaranteed? 5. Request: The 0.9 requirement here (Simple Request must have Simple Response) is somewhat onerous on a server. Is it possible to relax or remove this requirement yet, or are there still 0.9-only clients in use? I've noticed that at least some Simple Responses will not go through some proxies transparently. 5.2 Method and 5.2.2 Head: I'm surprised that the HEAD method *must* be supported, as it is ill-defined. 5.2.2 simply says that there must be no Object-Body; it seems that the header may or may not be related to the header that would be sent if the method were GET, and in particular, HEAD may as well just return an empty (null) header. Further, in many cases the cost of determining, building and sending the header is going to be the major part of many transactions, so should clients or proxies be encouraged to use this Method? 5.2.1 Get: [nit] The first paragraph should have the suffix: "(unless that is the produced data)." 5.2.3 Post: (a) Some clarification seems to be needed here; there's an assumption that Form data is used by some gateway program rather than the server/script directly, but in the latter case the specification (the paragraph starting "If the URI does not refer to a gateway...") implies that the Form data must be retrievable at some later date. (b) Can the URI returned via a URI-header be a partial URI as described in 5.4? Or does it have to be a full URI? [I infer the latter.] 5.2.3.1 [nit] Change 'references' to 'refers to'? 5.3 HTTP Version: Given that this definition is more rigorous than earlier documents, and hence must be more constraining, it would seem to be necessary to change the version number (perhaps to 1.1) to reflect the stricter conditions for compliance. If the version number is not changed, then the date of the relevant HTTP 1.0 document would have to be specified at every reference. 5.4 URI Note 1: What's 'default escaping'? What characters may be 'considered unsafe'? The "should" (probably meant to be "shall"?) implies that a server must comply with these conditions, but they do not seem to be well defined. 5.5.2 If-Modified-Since: This section implies that servers *must* implement this feature. However, the last-modified-date might be unavailable, unreliable, or not applicable for some URIs. In these cases (or indeed in any case), is the server permitted to return the object, despite the presence of the I-M-S header? 5.5.4 Authorization: UU-encoding: if defined by RFC 1421, then this should appear in Section 13 (References), and Appendix 15 should go away (as it does not appear to apply, in any case)? 6.3 Status Codes and Reason Phrases: The rule for Reason-Phrase does not allow spaces, but several of the phrases specified later do include a space. 6.3.1 201 Created: Also possible following PUT, presumably. 6.3.1 202 Accepted: "delay header line" is what? 6.3.1 204 No Response: Allowed for POST, too? (For a Form.) 6.3.2 301 & 302 Moved: Allowed for POST, and others, too? 6.3.3 401 Unauthorized: [nit] change first 'a' to 'an'. 6.3.3 404 Method Not Allowed: is this only for the defined methods, or should this also be used for a misspelled or unrecognized method name? 6.3.4 500 Server Error: 502 (twice) and 504 call this "500 Internal Error". All should say "Server Error"? 6.3.4 503 Service Unavailable: Does this imply that a server is not permitted to refuse to accept a connection? [Presumably not, though it could be read that way.] 6.4 paragraph 1: [nit] change 'a Object-Body' to 'an ...' 6.4.2 Version: since this refers to an object and not the server, shouldn't it be in section 7? 7.2 Content-Length Note 2: [nit] 'wherever' has only three 'e's. 7.4 Content Encoding: (a) [nit] this heading (and 7.5 too) needs a hyphen. (b) [nit] change 'method' to 'mechanism' in the first paragraph to avoid confusion with use of the term elsewhere? 7.5 C-T-E: The rule omits the token and colon before the type. 7.7 Expires: Are there any constraints on the Date and Time specified? Specifically, may they refer to a time earlier than or the same as that in the Date: header? 7.9 URI First example: [nit] Change close quote to open quote. 7.9 URI Second example: [nit] Semicolon missing. 7.12 Title: "isomorphic" here implies that the Title follows SGML syntax, and hence depends on the HTML DTD (including the Declaration), and is allowed any valid entities and shortrefs within, etc. This probably isn't intended (I hope!). 7.13 Link: [nit] both examples have unmatched quotes. '//' missing after 'mailto:'? 8 Neg. algorithm para 4: [nit] change 'between 0 and 1' to 'in the range 0 through 1'? (0 and 1 are allowed values.) 8 Neg. algorithm 'bs' definition: [nit] change 'send' to 'sent' 9 Authentication, paragraph 1: Here is perhaps the strongest statement in the document about conformance. Yet, surely, if the server would never return "401 Unauthorized" (because all its data are public) there is no need for it to implement the Basic Access Authentication Scheme? 9 Authentication, fourth bullet: [nit] change second 'a' to 'an'. 11.3 Abuse, para 1: While I strongly support the intent behind the last sentence here, this document is a definition of the HTTP protocol, not people using it. It cannot impose requirements on, or define, people. (Does my server become non-conforming because someone using my server abused his or her collected data?) Also, am I (as the writer, and hence provider, of a server) responsible for the actions of other people *using* my server to provide data? Thin ice... Perhaps it should read something like: "People using the HTTP protocol to provide data are responsible for...". 11.3 Abuse, final para: This reads as though the user must be prompted with the From field to be sent before sending every request. Probably not the intent. 16 Server Tolerance, para 2: Time to make this a Requirement? 17 Bad servers: The first paragraph here sounds like a Compliance statement. As such, it should be in the body of the document, not an appendix? The document certainly needs a Compliance section. 17.1 Back compatibility, para 2: This doesn't seem to reflect current practice (inline <img href="xxx"> requests for .GIF files do seem to appear as images, not as HTML documents). mfc/29 Nov 1994
Received on Tuesday, 29 November 1994 08:16:57 UTC