WGLC review of p1-messaging (editorial stuff) from Dan Winship on 2012-10-30 (ietf-http-wg@w3.org from October to December 2012)

From: Dan Winship <dan.winship@gmail.com>
Date: Tue, 30 Oct 2012 07:15:10 -0400
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <508FB6BE.1070401@gmail.com>
> 1. Introduction

>    The Hypertext Transfer Protocol (HTTP) is an application-level
>    request/response protocol that uses extensible semantics and
>    MIME-like message payloads for flexible interaction with
>    network-based hypertext information systems.

Removing the word "hypertext" from the last line would make this a
more accurate description of present-day HTTP. Likewise a bit later
on:

>    HTTP proxies and gateways can provide access to alternative
>    information services by translating their diverse protocols into
>    a hypertext format that can be viewed and manipulated by clients
>    in the same way as HTTP services.



> 2.2. Implementation Diversity

>    Likewise, requirements that an automated action be confirmed by
>    the user before proceeding can me met via advance configuration

typo: "can *be* met"



> 2.3. Intermediaries

>    Many implementations depend on HTTP's stateless design in order
>    to reuse proxied connections or dynamically load balance requests
>    across multiple servers.

"load-balance" should have a hyphen (otherwise it's easy to misparse
as being about loading "balance requests")



> 2.4. Caches

>    There are a wide variety of architectures and configurations of
>    caches and proxies deployed across the World Wide Web and inside
>    large organizations. These systems include national hierarchies
>    of proxy caches to save transoceanic bandwidth, systems that
>    broadcast or multicast cache entries, organizations that
>    distribute subsets of cached data via optical media, and so on.

Since this section is entitled "Caches" and all of the examples are
about caches, it might make sense to remove the words "and proxies" in
the first sentence.



> 2.6. Protocol Versioning

>    However, the minor version was not incremented for the changes
>    introduced between [RFC2068] and [RFC2616], and this revision is
>    specifically avoiding any such changes to the protocol.

Change tense? ("has specifically avoided")



> 2.7.1. http URI scheme

>    If the host identifier is provided as an IP literal or IPv4
>    address,

We don't import the "IP-literal" and "IPv4address" rules from 3986, so
this would make more sense as just "provided as an IP address".

>    received with an empty host, then it MUST be rejected as invalid.
>    If the port subcomponent is empty or not given, then TCP port 80 is
>    assumed (the default reserved port for WWW services).

should "WWW" be "HTTP"?

>    Senders MUST NOT include a userinfo subcomponent (and its "@"
>    delimiter) when transmitting an "http" URI in a message.

would read better if the "and" was "or" IMHO

Also, taken literally, this rule prohibits even transmitting an HTML
page containing a link with a userinfo subcomponent. Maybe:

   Senders MUST NOT include a userinfo subcomponent (or its "@"
   delimiter) when transmitting an "http" URI in the request-line or
   header section of a message.

[Mark just commented on this in his review too but suggested "in a
request-target". Not sure if the restriction was meant to include
headers as well...]

>    the deprecated subcomponent is being used to obscure the authority
>    for the sake of phishing attacks.

Should "phishing" be defined/cited?



> 2.7.2. https URI scheme

>    distinct origin servers. However, an extension to HTTP that is
>    defined to apply to entire host domains, such as the Cookie
>    protocol [RFC6265], can allow information set by one service to
>    impact communication with other services within a matching group
>    of host domains.

"can allow" sounds like permission. I'd say "might allow" (and maybe
even explicitly state that it really SHOULD NOT?).



> 2.7.3. http and https URI Normalization and Comparison

>    Likewise, an empty path component is equivalent to an absolute
>    path of "/", so the normal form is to provide a path of "/"
>    instead.

Except that an empty path is not equivalent to a path of "/" in an
OPTIONS request sent to a proxy...



> 3.1. Start Line

>    In theory, a client could receive requests and a server could
>    receive responses, distinguishing them by their different
>    start-line formats, but in practice servers are implemented to
>    only expect a request (a response is interpreted as an unknown or
>    invalid request method) and clients are implemented to only
>    expect a response.

This is new text, and given that there will not be an HTTP/1.2, is
there really any reason to add it?



> 3.2. Header Fields

>       Note: The "Set-Cookie" header field as implemented in practice
>       can occur multiple times, but does not use the list syntax,
>       and thus cannot be combined into a single line ([RFC6265]).
>       (See Appendix A.2.3 of [Kri2001] for details.) Also note that
>       the Set-Cookie2 header field specified in [RFC2965] does not
>       share this problem.

The Kri2001 citation doesn't add anything now that we have a canonical
Set-Cookie spec in RFC 6265. Likewise, there's no point in citing RFC
2965 since it's now obsolete.



> 3.2.1. Whitespace

>      OWS            = *( SP / HTAB )
>                     ; "optional" whitespace
>      RWS            = 1*( SP / HTAB )
>                     ; "required" whitespace
>      BWS            = OWS
>                     ; "bad" whitespace

The quotes around "optional" and "required" don't serve any purpose.
(The ones around "bad" can either go or stay, depending on whether you
think BWS is bad or just "bad".)



> 3.2.2. Field Parsing

>    A field value MAY be preceded by optional whitespace (OWS); a
>    single SP is preferred.

"MAY" and "optional" are redundant. "A field value is preceded by
optional whitespace..."



> 3.2.4. Field value components

>      qdtext         = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
>      ctext          = OWS / %x21-27 / %x2A-5B / %x5D-7E / obs-text

It seems weird to use "OWS" here rather than "HTAB / SP".



> 3.3.3. Message Body Length

>        If this is a response message received by a user-agent, it
>        MUST be treated as an error by discarding the message and
>        closing the connection.

"user agent" does not have a hyphen.

>    5.  If a valid Content-Length header field is present without
>        Transfer-Encoding, its decimal value defines the message body
>        length in octets. If the actual number of octets sent in the
>        message is less than the indicated Content-Length, the
>        recipient MUST consider the message to be incomplete and
>        treat the connection as no longer usable.

The only way the sender could have sent fewer than the indicated
number of octets would be if it closed the connection early, in which
case telling the receiver to "treat the connection as no longer
usuable" is hardly necessary. Maybe:

     If the sender closes the connection or the recipient times out
     before the indicated number of octets are received, then the
     recipient MUST consider the message to be incomplete and treat
     the connection as no longer usable.

>        If the actual number of octets sent in the message is more
>        than the indicated Content-Length

This is impossible by definition; any octets after the indicated
Content-Length are not part of the message.

>        the recipient MUST only process the message body up to the
>        field value's number of octets; the remainder of the message
>        MUST either be discarded or treated as the next message in a
>        pipeline. For the sake of robustness, a user-agent MAY
>        attempt to detect and correct such an error in message
>        framing if it is parsing the response to the last request on
>        a connection and the connection has been closed by the
>        server.

As above, "user agent" shouldn't have a hyphen, but it should be
"client" here anyway, shouldn't it?

Maybe replace the whole paragraph with:

    For the sake of robustness, a client MAY assume that any
    additional octets after the indicated Content-Length in the
    response to the last request on a connection were intended to be
    part of the message body.



> 5.3. Request Target

We explicitly say not to include userinfo when using origin-form:

>    A Host header field is also sent, as defined in
>    Section 5.4, containing the target URI's authority component
>    (excluding any userinfo).

or authority-form:

>    When making a CONNECT request to establish a tunnel through one
>    or more proxies, a client MUST send only the target URI's
>    authority component (excluding any userinfo) as the
>    request-target.

but we don't say anything about userinfo when using absolute-form.
I guess 2.7.1 already forbids sending a userinfo there but it still
seems inconsistent to not say it here. (Alternatively, if 2.7.1's
restriction is only supposed to apply to request-target, then we could
remove it there and specify it in each case here.)



> 5.4. Host

>    If the authority component is missing or undefined for the target
>    URI, then the Host header field MUST be sent with an empty
>    field-value.

5.1 defines "target URI" as being an absolute URI, and http URIs
always have an authority component, so this requirement seems
meaningless.



> 5.7. Via

>      received-protocol = [ protocol-name "/" ] protocol-version

protocol-name and protocol-version aren't defined yet. Forward ref?



> 6.1. Connection

>    When a header field is used to supply control information for or
>    about the current connection, the sender SHOULD list the
>    corresponding field-name within the "Connection" header field.

Isn't that a MUST?

>      Connection: close
>
>    in either the request or the response header fields indicates that
>    the connection SHOULD be closed after the current request/response
>    is complete (Section 6.2.5).

Likewise. (6.2.5 seems to agree)



> 6.2. Persistent Connections

I'd mentioned this before on the list, but all the intro text here
except for the final "HTTP implementations SHOULD implement persistent
connections" is really only of historical interest and could just go
away. The "SHOULD implement" requirement could be moved into "6.2.2
Reuse", and then each of the subsections of 6.2 could be promoted up
to become a direct child of section 6. So you'd get:

  6.  Connection Management
  6.1.  Connection
  6.2.  Establishment
  6.3.  Reuse
  6.3.1.  Pipelining
  6.3.2.  Retrying Requests
  6.4.  Concurrency
  6.5.  Failures and Time-outs
  6.6.  Tear-down
  6.7.  Upgrade



> 6.2.2. Reuse

>    A server MAY assume that an HTTP/1.1 client intends to maintain a
>    persistent connection until a close connection option is received
>    in a request.

SHOULD?



> 6.2.2.2. Retrying Requests

>    Confirmation by user-agent software with semantic understanding of
>    the application MAY substitute for user confirmation.  The
>    automatic

"user agent" shouldn't have a hyphen. (This one is inherited from
2616; 2068 was consistent, but the hyphenated form started creeping in
in 2616.)



> 6.2.5. Tear-down

>    A server that receives a close connection option MUST initiate a
>    lingering close of the connection after it sends the final
>    response

This could use a "(see below)" since "lingering close" hasn't been
defined yet.



> 7.6. Upgrade Token Registry

>    The HTTP Upgrade Token Registry defines the name space for
>    protocol-name tokens used to identify protocols in the Upgrade
>    header field.

They're used in the Via header field too... (Should the registry be
renamed to "Protocol-name Token Registry"?)



> 8.4. DNS-related Attacks

>    Clients need to be cautious in assuming the validity of an IP
>    number/DNS name association unless the response is protected by
>    DNSSec

"DNSSEC", not "DNSSec"



> A.1.3. Introduction of Transfer-Encoding

>    Proxies/gateways MUST remove any transfer-coding
>    prior to forwarding a message via a MIME-compliant protocol.

This seems like it would fit better in p2's "Appendix A. Differences
between HTTP and MIME".



> A.2. Changes from RFC 2616

We probably want to double-check these sections against the issue
tracker before final publication, but one particular thing that stuck
out to me as missing is the addition of the CONNECT rule and removal
of the multipart/byteranges rule from Section 3.3.3.
Received on Tuesday, 30 October 2012 11:17:01 UTC