Re: WGLC review of p2-semantics (editorial stuff) from Roy T. Fielding on 2013-01-14 (ietf-http-wg@w3.org from January to March 2013)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sun, 13 Jan 2013 19:28:21 -0800
To: Dan Winship <dan.winship@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <FD63AA95-B06B-4445-930E-E2E55292D89A@gbiv.com>

Hi Dan,

Thanks for your detailed review.  These comments have all been addressed
in the issue tracker at

   <http://tools.ietf.org/wg/httpbis/trac/ticket/426>

Cheers,

....Roy

On Oct 30, 2012, at 4:19 AM, Dan Winship wrote:

>> 3.1. Representation Metadata
> 
>>   | Expires           | Section 7.3 of [Part6] |
> 
> If "Expires" is considered "representation metadata", then it seems
> like "ETag" and "Last-Modified" should be as well. But I think it
> would make more sense to just remove "Expires" from the list; it's
> clearly the odd man out here.
> 
> 
> 
>> 3.1.1.2. Character Encodings (charset)
> 
>>   Implementers need to be aware of IETF character set requirements
>>   [RFC3629] [RFC2277].
> 
> It's not clear what requirements this is referring to; RFC 2277 places
> requirements on protocol authors, not on implementers, and RFC 3629 is
> just the definition of UTF-8. If the requirement is "implementations
> MUST support UTF-8" then we should say that.
> 
> 
> 
>> 3.1.1.4. Multipart Types
> 
>>   In general, HTTP treats a multipart message body no differently than
>>   any other media type: strictly as payload.  HTTP does not use the
>>   multipart boundary as an indicator of message body length.  In all
>>   other respects, an HTTP user agent SHOULD follow the same or similar
>>   behavior as a MIME user agent would upon receipt of a multipart type.
> 
> That last part seems completely wrong; a web browser is not expected
> to handle multipart/alternative or multipart/related in the way a mail
> reader would. (This requirement came from RFC 2616, but... it was
> wrong then too.)
> 
>>   The MIME header fields within each body-part of a multipart message
>>   body do not have any significance to HTTP beyond that defined by
>>   their MIME semantics.
> 
> This is not true of multipart/byteranges; in RFC 2616 that was
> explained separately, but that explanation got lost in httpbis
> rewrites at some point.
> 
> Suggested rewrite for the second and third paragraphs:
> 
>   In general, HTTP treats a multipart message body no differently
>   than any other media type: strictly as payload.  The one exception
>   is the "multipart/byteranges" type (Appendix A of [Part5]) when it
>   appears in a 206 (Partial Content) response.  In all other cases,
>   the MIME header fields within each body-part of a multipart message
>   body do not have any significance at the HTTP level; they are
>   just part of the representation data.
> 
> (This drops the newly-added "HTTP does not use the multipart boundary
> as an indicator of message body length", but that is already implied
> by the removal of 2616's prohibition on epilogue data; if the
> multipart is allowed to have an epilogue, then the final boundary
> doesn't indicate the end of the body anyway. It also drops the
> "unrecognized multipart subtype" text, which was already irrelevant
> given the "strictly as payload" rule anyway.)
> 
> 
> 
>> 3.1.3.1. Language Tags
> 
>>   In summary, a language tag is composed of one or more parts: A
>>   primary language subtag followed by a possibly empty series of
>>   subtags:
>> 
>>     language-tag = <Language-Tag, defined in [RFC5646], Section 2.1>
> 
> Kinda weird... the text sets you up to expect an actual grammar for
> language-tag, but then you just get a cross-reference. I'd rearrange
> stuff to:
> 
>   ... HTTP uses language tags within the Accept-Language and
>   Content-Language fields.
> 
>     language-tag = <Language-Tag, defined in [RFC5646], Section 2.1>
> 
>   A language tag is composed of one or more parts: A primary language
>   subtag followed by a possibly empty series of subtags.  White space
>   is not allowed within the tag and all tags are case-insensitive.
>   Example tags include:
> 
>     en, en-US, es-419, az-Arab, x-pig-latin, man-Nkoo-GN
> 
>   See [RFC5646] for further information.
> 
> (also dropping the language-subtag-registry ref, since that's covered
> by the "See [RFC5646]")
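As a rough illustration of the two rules stated in the suggested text (no
white space within a tag, and case-insensitive comparison), here is a minimal
Python sketch. The function names are hypothetical, and the code does not
validate tags against the RFC 5646 grammar; it only splits on hyphens.

```python
def split_language_tag(tag: str) -> list[str]:
    """Split a language tag into lowercase subtags, primary subtag first."""
    if not tag or any(ch.isspace() for ch in tag):
        raise ValueError("white space is not allowed within a language tag")
    # Tags are case-insensitive, so normalize before splitting on "-".
    return tag.lower().split("-")


def same_language_tag(a: str, b: str) -> bool:
    """Compare two language tags case-insensitively."""
    return split_language_tag(a) == split_language_tag(b)
```

With this sketch, "az-Arab" and "AZ-arab" compare equal, and "en US" is
rejected because it contains a space.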
> 
> 
> 
>> 3.4. Content Negotiation
> 
>>   (such as when many different formats are supported by a user-agent),
> 
> no hyphen
> 
> 
> 
>> 3.4.1. Proactive Negotiation
>> 
>> 
>>   If the selection of the best representation for a response is made by
>>   an algorithm located at the server, it is called proactive
>>   negotiation.
> 
> That text doesn't motivate the new name. How about:
> 
>   If the selection of the best representation for a response is made
>   by the server based on preferences indicated by the user agent in its
>   initial request for the resource, it is called proactive negotiation.
> 
>>   4.  It might limit a public cache's ability to use the same response
>>       for multiple user's requests.
> 
> users' not user's
> 
>>   For example, the origin server might not implement proactive
>>   negotiation, or it might decide that sending a response that doesn't
>>   conform to them is better than sending a 406 (Not Acceptable)
>>   response.
> 
> Not clear what "them" is. "...that doesn't conform to the user agent's
> preferences..."
> 
> 
> 
>> 3.4.2. Reactive Negotiation
> 
>>   This specification defines the 300 (Multiple Choices) and 406 (Not
>>   Acceptable) status codes for enabling reactive negotiation when the
>>   server is unwilling or unable to provide a varying response using
>>   proactive negotiation.
> 
> 406 doesn't really "enable reactive negotiation". It just fails to do
> proactive negotiation.
> 
> Also, should we mention how reactive negotiation is *actually* done?
> 
>   This specification defines the 300 (Multiple Choices) status code
>   for enabling reactive negotiation. However, in practice, Web sites
>   wanting to do reactive negotiation will just return a successful
>   response containing a "default" (or proactively negotiated)
>   representation of the resource, which includes within it links that
>   the user can follow to reach other representations.
> 
> 
> 
>> 4. Product Tokens
> 
>>   By convention, the products are listed in order of their
>>   significance for identifying the application.
> 
> "...in *decreasing* order of...", or something like that. (likewise in
> the description of User-Agent in 6.5.3 and Server in 8.4.2)
> 
> 
> 
>> 5.2.2. Idempotent Methods
> 
> Section 6.2.2.1 of Part1 implies that the concept of "idempotent
> sequences of request methods" (as opposed to merely "idempotent
> methods") will be discussed here, but it's not. I'm not sure if it
> should be added here or there.
> 
> 
> 
>> 5.3.1. GET
> 
>>   The semantics of the GET method change to a "partial GET" if the
>>   request message includes a Range header field ([Part5]).
> 
> "a Range or If-Range header field"
> 
> 
> 
>> 5.3.6. CONNECT
> 
> Though obvious, it seems like for consistency's sake, this should end
> with:
> 
>   Responses to the CONNECT method are not cacheable.
> 
> 
> 
>> 5.3.7. OPTIONS
> 
>>   If no payload body is included, the response MUST include a
>>   Content-Length field with a field-value of "0".
> 
> Does this actually mean to prohibit servers from using chunked
> encoding (or "Connection: close" with no Content-Length) in that case?
> Or is it just supposed to be a reminder that "empty message body" is
> different from "no message body"?
> 
> (Section 9.1.2 has basically the same text.)
> 
>>   If no Max-Forwards field is present in the request, then the
>>   forwarded request MUST NOT include a Max-Forwards field.
> 
> "If no Max-Forwards field is present in the upstream request, then the
> downstream request MUST NOT include a Max-Forwards field."
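The upstream/downstream wording can be sketched as follows in Python. This is
only an illustration under stated assumptions: the function name and the
dict-based header model are hypothetical, and the decrement and the zero
check come from the general definition of Max-Forwards rather than from the
text quoted above.

```python
def forward_headers(upstream: dict[str, str]) -> dict[str, str]:
    """Build downstream request headers from the upstream request headers."""
    downstream = dict(upstream)
    # Header field names are case-insensitive, so search without regard to case.
    key = next((k for k in downstream if k.lower() == "max-forwards"), None)
    if key is None:
        # No Max-Forwards upstream: do not add one downstream.
        return downstream
    remaining = int(downstream[key])
    if remaining == 0:
        # Assumption: a value of zero means the recipient answers itself.
        raise ValueError("Max-Forwards is 0; respond instead of forwarding")
    downstream[key] = str(remaining - 1)
    return downstream
```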
> 
> 
> 
>> 6.2. Conditionals
> 
>>   The HTTP/1.1 conditional request mechanisms are defined in
>>   [Part4].
> 
> "and [Part5]" (If-Range)
> 
> 
> 
>> 6.3. Content Negotiation
> 
> 6.1 and 6.2 had some introductory text before the table, and it seems
> weird to not have that here.
> 
> (6.4 and 6.5 have the same problem)
> 
> 
> 
>> 6.3.1. Quality Values
> 
> Should this section be called "Weight" now?
> 
> 
> 
>> 6.3.5. Accept-Language
> 
>>   would mean: "I prefer Danish, but will accept British English and
>>   other types of English". (see also Section 2.3 of [RFC4647])
> 
> Capitalize "See"
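For readers who want to see how such a preference list is weighted, here is a
minimal Python sketch. It assumes the header value being described is
"da, en-gb;q=0.8, en;q=0.7" (the value is not shown in the quoted text), and
the function name is hypothetical. It does no validation of the q value
range.

```python
def parse_accept_language(value: str) -> list[tuple[str, float]]:
    """Return (language-range, weight) pairs, highest weight first."""
    ranges = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        weight = 1.0  # a range without a q parameter has weight 1
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                weight = float(raw)
        ranges.append((parts[0].lower(), weight))
    # sorted() is stable, so equal weights keep their original order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)
```

Under that assumption, the result lists "da" first with weight 1.0, then
"en-gb" at 0.8, then "en" at 0.7, which matches the reading "I prefer
Danish, but will accept British English and other types of English".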
> 
> 
> 
>> 7. Response Status Codes
> 
>>   The status-code element is a 3-digit integer result code of the
>>   attempt to understand and satisfy the request.
> 
> "...a 3-digit integer code giving the result of the attempt..."
> 
>>   o  2xx (Successful): The action was successfully received,
>>      understood, and accepted
> 
> "The *request* was successfully..."
> 
> 
> 
>> 7.1. Overview of Status Codes
> 
>>   The reason phrases listed here are only recommendations -- they can
>>   be replaced by local equivalents without affecting the protocol.
> 
> That suggests you can/should translate them into other languages,
> which isn't really what they're for and kind of contradicts p1 3.1.2's
> "A client SHOULD ignore the reason-phrase content."
> 
>>   | 415         | Unsupported Media Type       | Section 7.5.13       |
>>   | 416         | Requested range not          | Section 3.2 of       |
>>   |             | satisfiable                  | [Part5]              |
>>   | 417         | Expectation Failed           | Section 7.5.14       |
> 
> The capitalization of "Requested range not satisfiable" is
> inconsistent with the rest of the table.
> 
> 
> 
>> 7.2. Informational 1xx
> 
>>   A client MUST be prepared to accept one or more 1xx status responses
>>   prior to a regular response, even if the client does not expect a 100
>>   (Continue) status message.
> 
> No reason to call out 100 Continue specifically here... "A client MUST
> be prepared to accept one or more 1xx status responses prior to a
> regular response, even if the client does not expect one."
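A client that follows this rule might look roughly like the Python sketch
below. The function name and the (status, body) tuple model are hypothetical;
a real client would read these from the connection.

```python
def final_response(responses):
    """Skip any 1xx responses and return the first regular response.

    `responses` is an iterable of (status, body) tuples in arrival order.
    Returns (status, body, interim), where interim lists the 1xx codes seen.
    """
    interim = []
    for status, body in responses:
        if 100 <= status <= 199:
            # Accept any 1xx response, whether or not it was expected.
            interim.append(status)
            continue
        return status, body, interim
    raise ValueError("no final response received")
```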
> 
> 
> 
>> 7.3.2. 201 Created
> 
>>   If the newly created resource's URI is the same as the Effective
>>   Request URI, this information can be omitted
> 
> "effective request URI" is not capitalized like that anywhere else.
> (Well, except for once more later on in this section which should also
> be fixed.)
> 
>>   If the action cannot be carried out immediately, the server
>>   SHOULD respond with 202 (Accepted) response instead.
> 
> "with *a* 202 (Accepted) response"
> 
> 
> 
>> 8.1.1.2. Date
> 
>>   1.  If the response status code is 100 (Continue) or 101 (Switching
>>       Protocols), the response MAY include a Date header field, at the
>>       server's option.
> 
> Is that really supposed to be limited to 100 and 101, and not other
> 1xx codes?
> 
> 
> 
>> 8.1.3. Retry-After
> 
>>   This field MAY also be used with any 3xx (Redirection) response
>>   to indicate the minimum time the user-agent is asked to wait
> 
> No hyphen in "user agent"
> 
> 
> 
>> 8.4.1. Allow
> 
>>     Allow = #method
> 
> Should that be 1#method? If not, it should explain what an empty
> "Allow" header means.
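The difference between the two grammars only shows up for an empty field
value, as this minimal Python sketch suggests. The function name is
hypothetical, and the sketch does not check that each element is a valid
method token.

```python
def parse_allow(field_value: str) -> list[str]:
    """Split an Allow field value into its comma-separated methods."""
    # Empty list elements are skipped, so "" yields an empty list.
    return [m.strip() for m in field_value.split(",") if m.strip()]
```

With "#method", an empty result is syntactically valid and needs a defined
meaning; with "1#method", the same empty result would be a syntax error.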
> 
> 
> 
>> 9.1.1. Procedure
> 
>>   HTTP method registrations MUST include the following fields:
> 
> Should "cacheability" be an explicit field (rather than just a
> required part of the specification text)?
> 
> 
> 
>> 9.3. Header Field Registry
> 
> It seems weird to have this in p2 since p1 defines headers too...
> 
> 
> 
>> 9.3.1. Considerations for New Header Fields
> 
>>   o  Whether it is appropriate to list the field-name in the Connection
>>      header field (i.e., if the header field is to be hop-by-hop, see
>>      Section 6.1 of [Part1]).
> 
> Should have a semicolon rather than a comma after "hop-by-hop". (So that
> it doesn't read like it's telling you to only follow the xref if the
> header field is hop-by-hop.)
> 
> 
> 
>> 10.1. Transfer of Sensitive Information
> 
>>   Four header fields are worth special mention in this context:
>>   Server, Via, Referer and From.
> 
> "Via" is in p1 though, so the Via bits should be moved to p1's
> Security Considerations? (Or maybe if we end up with a p0, all of the
> security considerations should be consolidated there.)
> 
>>   The information sent in the From field might conflict with the user's
>>   privacy interests or their site's security policy, and hence it
>>   SHOULD NOT be transmitted without the user being able to disable,
>>   enable, and modify the contents of the field.  The user MUST be able
>>   to set the contents of this field within a user preference or
>>   application defaults configuration.
> 
> Do any browsers actually ever send the "From" header? If not, should
> we just say "From is for robots, not browsers"?
> 
> 
> 
>> Appendix C. Changes from RFC 2616
> 
>>   Remove base URI setting semantics for "Content-Location" due to poor
>>   implementation support, which was caused by too many broken servers
>>   emitting bogus Content-Location header fields, and also the
>>   potentially undesirable effect of potentially breaking relative links
>>   in content-negotiated resources.  (Section 3.1.4.2)
> 
> That would parse better if the "which was..." clause were parenthesized
> rather than just set off by commas.
> 
>>   Failed to consider that there are many other request methods that are
>>   safe to automatically redirect, and further that the user agent is
>>   able to make that determination based on the request method
>>   semantics.
> 
> This is written in the opposite style from the rest of the list (it
> describes the problem with 2616 rather than the solution in httpbis).
> Should be something like:
> 
>   Allow automatic redirection of all "safe" methods, not just GET and
>   HEAD, and give the user agent more latitude in redirecting unsafe
>   methods. (Section 7.4)
>
Received on Monday, 14 January 2013 03:28:43 UTC