- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Thu, 26 Jan 2012 19:15:05 +0100
- To: Willy Tarreau <w@1wt.eu>
- CC: ietf-http-wg@w3.org
On 2012-01-26 16:56, Willy Tarreau wrote:
> Hi,
>
> I haven't finished reading p1 but I already have some comments, so
> I'm sending them here and will proceed with what remains.
>
>
> 2.1. Client/Server Messaging, page 11
>
>> Note that 1xx responses (Section 7.1 of [Part2]) are not final;
>> therefore, a server can send zero or more 1xx responses, followed by
>> exactly one final response (with any other status code).
>
> This part falls here quite out of context in my opinion. Neither
> responses nor status code nor messaging has been defined yet and all
> of a sudden we get this. I suggest we move this to P2 7.1 and replace
> it with a small note such as :
>
>   Note that sometimes a server may send multiple responses, see Section
>   7.1 of [Part2] for more details about interim responses.

We did that totally on purpose, see
<http://trac.tools.ietf.org/wg/httpbis/trac/ticket/300>.

> 2.4. Intermediaries, page 13
>
> Context :
>>         >             >             >             >
>> UA =========== A =========== B =========== C =========== O
>>         <             <             <             <
> ...
>
>> For example, B might be receiving
>> requests from many clients other than A, and/or forwarding requests
>> to servers other than C, at the same time that it is handling A's
>> request.
>
> I'd underline that there is no single path between a UA and an intermediary,
> and that sometimes direct and indirect communications are possible. It helps
> remind people that rewriting URLs along the path is not always a good idea.
> I'd suggest this then :
>
>   For example, B might be receiving requests from many clients other than A
>   including UA/C/O, and/or forwarding requests to servers other than C, at
>   the same time that it is handling A's request.

UA I see, but C and O?

> ...
> 2.7.1. http URI scheme
>
>> If the host identifier is provided as an IP literal or IPv4 address,
>
> I did not find a clear definition of the term "IP literal". Also, does it
> cover the bracketed format of IPv6 ?
I think we need to ref
<http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.3.2.2> here.

> ...
> 3.5. Message Parsing Robustness
>
>> Likewise, although the line terminator for the start-line and header
>> fields is the sequence CRLF, we recommend that recipients recognize a
>> single LF as a line terminator and ignore any CR.
>
> Does this mean that CR CR CR CR CR CR LF should be interpreted as a single
> LF ? It kind of scares me on the risk of smuggling attacks. I'd rather
> suggest :
>
>   ... we recommend that recipients recognize a single LF as a line
>   terminator and ignore the optional preceding CR. Messages containing
>   a CR not followed by an LF MUST be rejected.

Sounds good to me.

>> When a server listening only for HTTP request messages, or processing
>> what appears from the start-line to be an HTTP request message,
>> receives a sequence of octets that does not match the HTTP-message
>
> Wouldn't "does not *exactly* match" be better ? I'm used to finding
> crappy requests in my logs which are blocked but which some not-so-lazy
> implementations would let pass (eg: multiple SP).

"match" means "match"; I don't think there's any ambiguity here...

>> grammar aside from the robustness exceptions listed above, the server
>> MUST respond with an HTTP/1.1 400 (Bad Request) response.
>
> I would also suggest that clients and proxies protect themselves against
> malformed response messages, which are problematic in shared hosting
> environments. This could be summarized like this :
>
>   In general, any agent which receives a malformed message MUST NOT try
>   to fix it if there is any possibility that any other implementation
>   along the chain understands it differently. In such conditions, the
>   message MUST be rejected.

-0.5.

- it's a requirement hard to test for, and
- it's not going to be implemented by browsers.

> 4.1. Types of Request Target
>
>> Note: The "no rewrite" rule prevents the proxy from changing the
>
> I did not find reference to this "no rewrite" rule.

It's the rule above the note.

-> <http://trac.tools.ietf.org/wg/httpbis/trac/changeset/1517>

> 4.2. The Resource Identified by a Request
>
>> 1. If request-target is an absolute-URI, the host is part of the
>>    request-target. Any Host header field value in the request MUST
>>    be ignored.
>>
>> 2. If the request-target is not an absolute-URI, and the request
>>    includes a Host header field, the host is determined by the Host
>>    header field value.
>>
>> 3. If the host as determined by rule 1 or 2 is not a valid host on
>>    the server, the response MUST be a 400 (Bad Request) error
>>    message.
>
> Rule 3 might be difficult to apply in massively hosted environments, as
> I easily imagine that there could be a large "vhosts" directory with
> all the hosts roots presented by their names there. The server would
> then simply try to "cd $host" to check for the host's validity, which
> might seem appropriate at first. But using a host of ".." or a host
> containing a slash would have dramatic effects.
>
> I don't know what recommendation we could add here because we can't
> add boring long sentences, but avoiding such simple traps would be
> nice. Maybe we should just add :
>
>   For instance, a host should never be ".." nor contain a slash.

Are those allowed in a host name anyway?

> ...
> 8.4. TE
>
>> The presence of the keyword "trailers" indicates that the client is
>> willing to accept trailer fields in a chunked transfer-coding, as
>
> Is it only limited to the client ? Nowhere is it said that a server cannot
> advertise "TE: trailers" in responses so that a client knows it can emit
> chunked-encoded messages with trailers in further requests (eg: backups
> with SHA1 at the end). Replace "client" with "sender" maybe ?
We seem to be confused about who can set TE anyway:

   "The "TE" header field indicates what extension transfer-codings it is
   willing to accept in the response, and whether or not it is willing to
   accept trailer fields in a chunked transfer-coding."

We need to state who "it" is...

> ...
> A.1.2 Keep-Alive Connections
>
>> Clients are also encouraged to consider the use of Connection: keep-
>> alive in requests carefully; while they can enable persistent
>> connections with HTTP/1.0 servers, clients using them need will need
>> to monitor the connection for "hung" requests (which indicate that
>> the client ought stop sending the header),
>
> I know a number of people who use the term "the header" to designate all
> the headers section. I must say that when I read this sentence, it was
> unclear to me upon first reading that the intent was in fact to stop
> sending "Connection: keep-alive" in subsequent requests, as it can also
> be understood as "stop sending the headers as long as the connection
> hangs" (which does not make sense).
>
> I'd suggest the following change :
>
> - the client ought stop sending the header),
> + the client ought stop using this header in further communications with
> + the server),

"...ought to stop using this header field in further ..."?

> ...
> That's all for me now, I'll probably have other comments later.
> ...

Thanks a lot for that; I tried to comment where I had some confidence on
the resolution.

We probably need to figure out a way to manage the feedback better;
maybe recommend sending smaller chunks with meaningful subject lines, so
threading works properly?

Best regards, Julian
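
For illustration only, the line-terminator handling suggested under 3.5 above (accept CRLF or a single LF, ignore at most one CR directly before the LF, reject any CR that is not followed by an LF) might look roughly like the sketch below. The names `split_lines` and `MalformedMessage` are hypothetical and do not come from the draft; the sketch also assumes the whole header block is already buffered, which a real parser would not necessarily do.

```python
from typing import List


class MalformedMessage(ValueError):
    """Hypothetical error type: a CR was found that is not followed by LF."""


def split_lines(data: bytes) -> List[bytes]:
    """Split a buffered header block into lines.

    Accepts CRLF or a bare LF as the terminator. At most one CR directly
    before the LF is dropped; any other CR causes rejection.
    """
    pieces = data.split(b"\n")
    tail = pieces.pop()  # bytes after the final LF (empty if data ends in LF)
    lines = []
    for piece in pieces:
        # Ignore the single optional CR that precedes the LF.
        if piece.endswith(b"\r"):
            piece = piece[:-1]
        # Any CR still present was not followed by LF: reject.
        if b"\r" in piece:
            raise MalformedMessage("CR not followed by LF")
        lines.append(piece)
    # Assumption: the input is complete, so a CR in the tail is also bare.
    if b"\r" in tail:
        raise MalformedMessage("CR not followed by LF")
    if tail:
        lines.append(tail)
    return lines
```

Under this sketch a sequence such as CR CR LF is rejected rather than collapsed into a single line ending, which is the smuggling concern raised above; whether rejection should be a MUST is for the draft to decide.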
Received on Thursday, 26 January 2012 18:15:47 UTC