- From: Lucas Pardue <lucas@lucaspardue.com>
- Date: Wed, 29 May 2024 23:03:37 +0100
- To: "Mohammed Al Sahaf" <mohammed@caffeinatedwonders.com>, "David Benjamin" <davidben@chromium.org>
- Cc: "HTTP Working Group" <ietf-http-wg@w3.org>
- Message-Id: <a21e2452-a122-4ad3-8278-e70980a6a112@app.fastmail.com>
Hi Mohammed,

Thanks for bringing this topic up. It's something I care about too. Others have already expressed some of the complications or caveats I might have mentioned. But in spite of those things, I think there is room for improvement around testing, especially through the lens of making HTTP's dark corners more accessible to the community. There's definitely more we can do in terms of tooling and testing. I don't think there is a one-size-fits-all approach. vtest looked good when it was presented at the HTTP Workshop in 2019 [1]; good to see it still working well.

Something that's always nagged me is writing tests oriented around the HTTP/2 and HTTP/3 framing layer, especially since the HTTP/2 Rapid Reset attacks last year. Since November, I've been working on a tool to address this (in Rust, since that suits my preference and our internal dev env). The tool aims to make it easier to do interactive exploration of HTTP/3 framing and stream lifecycle management, and to capture the results of that as a reproducible test sequence. It can also be controlled programmatically to write test cases, similar to vtest. We already had plans to open source this tool soon, so it's a nice coincidence to see the topic come up. An open question is whether we should provide a corpus of test case examples that could emulate some of the capabilities of h2spec*. I'll post again when the open sourcing happens.

Mohammed, there's another HTTP Workshop planned this year [2], so perhaps consider coming along to talk testing in person with some of the wider community.

Cheers
Lucas

* Kazu Yamamoto maintains h3spec and it's great. However, it is a bit more focused on QUIC than HTTP/3. I've no doubt more test cases could be added to h3spec, but for my own needs I want a Rust-focused set of tooling.

[1] - https://daniel.haxx.se/blog/2019/04/02/the-http-workshop-2019-begins/
[2] - https://github.com/HTTPWorkshop/workshop2024/blob/main/README.md

On Wed, May 29, 2024, at 21:56, Mohammed Al Sahaf wrote:
> Good catch! I did a retest with HTTP/1 and HTTP/2 split. I excluded h2c because setting it up is a hassle with my usual tools; plus it's unsupported by browsers, so perhaps only curl would have results for it. The result appears more consistent now.
>
> | Client  | HTTP/1 (HTTP) | HTTP/1 (HTTPS) | HTTP/2 (h2c) | HTTP/2 (HTTPS) |
> |---------|---------------|----------------|--------------|----------------|
> | curl    | `curl: (18) transfer closed with 2 bytes remaining to read`; response payload displayed | `curl: (18) transfer closed with 2 bytes remaining to read`; response payload displayed | N/A | `(92) HTTP/2 stream 1 was not closed cleanly: PROTOCOL_ERROR (err 1)`; response payload displayed |
> | Firefox | `NS_ERROR_PARTIAL_TRANSFER`; response payload displayed | `NS_ERROR_PARTIAL_TRANSFER`; response payload displayed | N/A | displays the full payload without reporting any errors |
> | Chrome  | `(failed)net::ERR_CONTENT_LENGTH_MISMATCH`; nothing displayed | `(failed)net::ERR_CONTENT_LENGTH_MISMATCH`; nothing displayed | N/A | displays the full payload without reporting any errors |
>
> Given the feedback on the other branches of this thread, I think it's best to scope my proposals to servers. As mentioned elsewhere, clients are harder to test. To quote Willy Tarreau:
>
> > testing clients is very difficult because contrary to
> > servers which just have to respond to solicitations, someone has to act
> > on the client to run the desired tests, so the approach is different
> > (and different between various clients), and I'm not convinced that a
> > same test collection would work for all implementations due to this.
>
> All the best,
> Mohammed
>
> On Tuesday, May 28th, 2024 at 4:47 PM, David Benjamin <davidben@chromium.org> wrote:
>> The results in scenario 2 sound off. Chrome shouldn't treat ERR_CONTENT_LENGTH_MISMATCH differently between HTTP and HTTPS. I'm not familiar with their implementation, but the Firefox results similarly don't make sense. Indeed, it's quite important to enforce this over HTTPS, for HTTP/1.1, because that is what defends against truncation attacks. (In principle, TLS has the close_notify alert, but close_notify is, in practice, a fiction for HTTPS. Instead we must rely on in-protocol termination signals. For HTTP/1.1, one of those signals is Content-Length.) Also, it's generally over HTTPS that one can be more strict, not less, because there are fewer intermediaries to worry about.
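To make the truncation point concrete, here is a minimal sketch in Go (entirely hypothetical code, not taken from any implementation discussed here): a raw listener hand-writes an HTTP/1.1 response that declares `Content-Length: 5` but sends only 3 body bytes before closing, and a stock `net/http` client surfaces the short read as a body read error rather than treating the close as a clean end of message.

```go
// Sketch: an HTTP/1.1 response truncated relative to its declared
// Content-Length, and a client that notices the short read.
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		buf := make([]byte, 4096)
		conn.Read(buf) // drain the request; its contents don't matter for this sketch
		// Declare 5 body bytes but send only 3, then close the connection.
		io.WriteString(conn, "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nabc")
		conn.Close()
	}()

	resp, err := http.Get("http://" + ln.Addr().String())
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	// With a truncated body, the read error (typically an "unexpected EOF")
	// is the in-protocol signal that the message was cut short.
	fmt.Printf("read %q, err: %v\n", body, err)
}
```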
>>
>> Given some of your errors mention HTTP/2, I suspect you are comparing apples to oranges, and your HTTPS tests are testing HTTP/2. You mention the Go standard library, but keep in mind that Go automatically enables HTTP/2. The Content-Length header means very different things between HTTP/1.1 and HTTP/2. In HTTP/1.1, it is a critical part of framing and needs to be checked at that layer. (HTTP/1.1's framing is incredibly fragile. "Text" protocols are wonderful.) In HTTP/2, it has no impact on framing and was historically [0] considered advisory. The spec now considers it invalid, as otherwise an h2-to-h1 intermediary will have problems; [1] discusses this. But that's where this mess with receivers enforcing it dates to. Provided it doesn't cause you to turn around and send mis-framed HTTP/1.1, it is more-or-less safe, if sloppy, to accept it in HTTP/2.
>>
>> David
>>
>> [0] https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-http2-00#section-3.2.2
>> [1] https://www.rfc-editor.org/rfc/rfc9110#section-8.6-11
>>
>> On Mon, May 27, 2024 at 7:30 AM Mohammed Al Sahaf <mohammed@caffeinatedwonders.com> wrote:
>>> Hello,
>>>
>>> This is a proposal triggered by some of my involvement with the Caddy web server <https://caddyserver.com/> project. We (the Caddy team) have been working towards a declarative test suite for the Caddy server. The discussions (particularly a comment <https://github.com/caddyserver/caddy/pull/6255#issuecomment-2088632219> by a user) led me to believe it's best to bring the HTTP spec compliance parts to the HTTP WG for better insight and to serve as a common health check for all members of the HTTP community.
>>>
>>> There are numerous RFCs governing HTTP and the behavior of its citizens. Compliance with the RFCs is currently validated only through interoperability or by manually checking the implementation against the RFCs. The RFCs, for good reasons, are walls of text and are akin to legalese when it comes to interpretation. Consequently, nuanced sections can be missed without visible failures because they only surface in edge cases. Having the specification defined as a test suite in a declarative language removes much of the ambiguity and enables HTTP citizens to validate their conformance. I hope to turn this into an official proposal, but I'd like to put the draft forward for discussion to solidify the approach and the scope first.
>>>
>>> *_Motivation_*:
>>>
>>> Conformance is an assurance of compatibility across the various components of the web and gives early warning of breakage if any of them were to change behavior or if the HTTP semantics were to change. Conformance can also assist optimization efforts: if the behavior is known for sure in advance, certain optimizations can be applied.
>>>
>>> Secondly, it unifies the expectations of the community. Let's take for example the HTTP semantics of the `Content-Length` header as defined in RFC 9110 <https://www.rfc-editor.org/rfc/rfc9110.html#name-content-length>. The RFC states when servers and user agents SHOULD, SHOULD NOT, MAY, and MUST NOT send the `Content-Length` header, but it does not specify how either of them (server and user agent) should handle a mismatch between the `Content-Length` header value and the actual content length of the payload. I ran an unscientific poll on Twitter about the assumed ideal behavior of a client when the `Content-Length` value does not match the actual content length of the body.
>>>
>>> First poll <https://twitter.com/MohammedSahaf/status/1792267681032253683>: What's the ideal HTTP client (e.g. curl, browser) behavior when the server includes more bytes in the response body than stated in the `Content-Length` header? e.g. "Content-Length: 2", actual body length: 3.
>>>
>>> *Responses* (19 responses):
>>> • Ignore header; read fully (4 votes; 21.1%)
>>> • Read till content-length value (6 votes; 31.6%)
>>> • *Abort/reject (9 votes; 47.4%)*
>>>
>>> *Reality*: When testing this scenario, I found the following:
>>> • `curl` aborts the connection, reporting `"(18) transfer closed with 1 bytes remaining to read"` for the *plaintext HTTP* connection, and `"(92) HTTP/2 stream 1 was not closed cleanly: PROTOCOL_ERROR (err 1)"` for *HTTPS* connections.
>>> • Firefox fails the transfer on *plaintext HTTP* with `"NS_ERROR_NET_PARTIAL_TRANSFER"`; but with an *HTTPS* connection, it only reads and displays the payload up to the number of bytes stated in the header value.
>>> • Chrome fails the transfer on *plaintext HTTP* with `"(failed)net::ERR_CONTENT_LENGTH_MISMATCH"`; but with an *HTTPS* connection, it ignores the header value and displays the full payload.
>>>
>>> Second poll <https://twitter.com/MohammedSahaf/status/1792267687411831063>: What's the ideal HTTP client (e.g. curl, browser) behavior when the server includes fewer bytes in the response body than stated in the `Content-Length` header? e.g. "Content-Length: 5", actual body length: 3.
>>>
>>> *Responses* (18 responses):
>>> • Ignore header; read 3 (7 votes; 38.9%)
>>> • Pad; with what? (0 votes; 0%)
>>> • *Reject/abort (9 votes; 50%)*
>>> • Other; comment (2 votes; 11.1%, none of them elaborated)
>>>
>>> *Reality*: When testing this scenario, I found the following:
>>> • `curl` aborts the connection, reporting `"(18) transfer closed with 1 bytes remaining to read"` for the *plaintext HTTP* connection; *for HTTPS*, it prints the payload in full, preceded by the message `"(92) HTTP/2 stream 1 was not closed cleanly: PROTOCOL_ERROR (err 1)"`.
>>> • Firefox displays the full payload *for HTTPS* connections without reporting any errors. For *plaintext HTTP*, it displays the full payload but reports the error `"NS_ERROR_NET_PARTIAL_TRANSFER"`.
>>> • Chrome displays the full payload *for HTTPS* connections without reporting any errors. For *plaintext HTTP*, it fails to load the content and reports the error `"(failed)net::ERR_CONTENT_LENGTH_MISMATCH"`.
>>>
>>> Third poll <https://twitter.com/MohammedSahaf/status/1792267692700827955>: Assuming it's possible... What's the ideal HTTP client (e.g. curl, browser) behavior when the server includes a negative value in the `Content-Length` header? e.g. "Content-Length: -2"
>>>
>>> *Responses* (21 responses):
>>> • Ignore value (7 votes; 33.3%)
>>> • *Reject/abort (14 votes; 66.7%)*
>>>
>>> *Reality*: I couldn't actually test this scenario. I'm using Caddy for all the scenarios, and the Go standard library doesn't set the `Content-Length` header if it's set to a value of less than 0 (source <https://github.com/golang/go/blob/377646589d5fb0224014683e0d1f1db35e60c3ac/src/net/http/server.go#L1201>).
>>>
>>> Conclusion: The variation observed in the user agents and the, albeit unscientific, poll responses shows a lack of consensus on the expected behavior. Each agent (human or machine) applies its own interpretations and assumptions to the protocol. The disagreement makes the evolution of the protocol difficult because of the varying expectations.
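For what it's worth, the negative Content-Length scenario can be produced without patching a real server: a plain TCP listener that hand-writes the response bytes sidesteps any HTTP library's sanity checks. A minimal sketch in Go (hypothetical port and payload, purely illustrative) that emits `Content-Length: -2` with a 3-byte body, which curl or a browser can then be pointed at:

```go
// Sketch (not Caddy, not any particular server): a raw TCP listener that
// hand-writes an HTTP/1.1 response with an arbitrary Content-Length,
// including values a real HTTP library would refuse to emit (e.g. -2).
// Point curl or a browser at http://127.0.0.1:8080/ to observe the behavior.
package main

import (
	"bufio"
	"io"
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:8080") // hypothetical port
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func(c net.Conn) {
			defer c.Close()
			// Read request lines until the blank line that ends the header block.
			r := bufio.NewReader(c)
			for {
				line, err := r.ReadString('\n')
				if err != nil || line == "\r\n" || line == "\n" {
					break
				}
			}
			// Declared length and actual body length deliberately disagree.
			io.WriteString(c, "HTTP/1.1 200 OK\r\n"+
				"Content-Type: text/plain\r\n"+
				"Content-Length: -2\r\n"+
				"\r\n"+
				"abc")
		}(conn)
	}
}
```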
>>>
>>> *_Method_*:
>>>
>>> The test suite should be defined in a declarative format that can be easily composed by humans and read by machines. The declarative, programming-language-agnostic format allows developers (RFC authors and software developers) of all backgrounds to contribute without a programming-language-based gate. In my research, I have found the open-source tool `hurl <https://hurl.dev/>` suitable for defining an HTTP server specification and testing against it. It defines its own DSL for the request/response patterns and may be run with the `--test` flag along with `--report-{json,html,tap}` to produce test results.
>>>
>>> The testing effort may be implemented in phases. The first phase is to author the test suite in a common, public repository, making it accessible for web server developers to clone and run against their own software. The second phase is to provide an interface for automated testing and a UI to display the conformance summary of each web server submitted to the list.
>>>
>>> *_Challenges_*:
>>>
>>> _Agnostic Tooling_: The first challenge is to find an HTTP client that implements the HTTP Semantics RFCs perfectly; otherwise its own idiosyncrasies will get in the way of the validation. One would be tempted to default to `curl`, especially since `hurl` uses `curl` under the hood. However, there is a chance `curl` has its own set of HTTP idiosyncrasies that may affect the results of the test execution. Changes to `curl` are probably not desirable unless the behavior in question is confirmed to be a defect. Involvement of `curl` is, of course, voluntary, and the `curl` team may be looped into this initiative for comments if desired.
>>>
>>> _Suitable DSL (Domain Specific Language)_: The `hurl` DSL is decent. In my proof-of-concept experiment, I found it lacking a few functions or operations to be perfectly suitable, e.g. a way to indicate optionality. `hurl` has the advantage that its DSL grammar is defined as a spec with deterministic parsing. Extensions and/or changes to `hurl` to accommodate this effort are up to the `hurl` developers.
>>>
>>> *_Prior Art_*:
>>>
>>> There's a wiki page titled HTTP Testing Resources <https://github.com/httpwg/wiki/wiki/HTTP-Testing-Resources> under the github.com/httpwg/wiki repository. The page contains the following note:
>>> • *Note that there is no official conformance test suite. These tests have not been vetted for correctness by the HTTP Working Group, and the authority for conformance is always the relevant specification.*
>>>
>>> Indeed, some of the listed projects have not been updated for a while. Worthy of note on the page are the cache-tests.fyi project, REDbot, and httplint (which REDbot uses).
>>>
>>> The REDbot project (by Mark Nottingham) was used by one of Caddy's users to report a gap <https://github.com/caddyserver/caddy/issues/5849> in conformance, which was subsequently fixed. Using REDbot requires pointing it at a particular resource.
>>>
>>> The cache-tests.fyi project is of keen interest as design inspiration. Its test suite is close in essence to this proposal: a declarative suite that can be run by cache system developers to validate their conformance to the cache-related RFCs. The Souin <https://github.com/darkweak/souin> caching system runs the cache-tests.fyi test suite on every pull request to note its conformance level and to watch for variations.
>>> The display of each system, the use case, and its conformance status allows users and developers to take appropriate action. The website and the UI can be a phase-2 (long-term) goal of this proposal, but the details of how to set up the system and run the tests can be postponed until more is known about the test suite itself.
>>>
>>> *_Humble Attempt_*:
>>>
>>> To test the idea and develop a proof-of-concept, I have managed to convert 4 tests from REDbot (procedural, Python-based) to hurl (declarative format). The test suite contains a collection of test sets in a nested directory structure. The suite declares the URL it will call for each test case, so web servers can be configured accordingly for the subject URL. The GitHub repository for the PoC is here: github.com/mohammed90/http-semantics-test-suite. The repository currently does not have a license applied, as it is only a display of the PoC, though I am inclined to apply Apache-2.0 or another open-source-compatible license once the approach is agreed and finalized.
>>>
>>> All the best,
>>> Mohammed
>>> Blog <https://www.caffeinatedwonders.com/> | LinkedIn <https://www.linkedin.com/in/mohammedalsahaf/>
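On the phase-2 automation idea, the runner itself can stay very small. A minimal sketch in Go, assuming `hurl` is on the PATH, that the suite has been cloned to a hypothetical `./tests` directory, and that a non-zero exit status from `hurl --test` indicates failed assertions:

```go
// Sketch of a conformance runner: walk a directory of .hurl files,
// run each with `hurl --test`, and summarize pass/fail per file.
// Assumes hurl is installed and the target server is already configured
// for the URLs the suite declares.
package main

import (
	"fmt"
	"io/fs"
	"os/exec"
	"path/filepath"
)

func main() {
	passed, failed := 0, 0
	root := "./tests" // hypothetical location of the cloned test suite
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() || filepath.Ext(path) != ".hurl" {
			return err
		}
		cmd := exec.Command("hurl", "--test", path)
		out, runErr := cmd.CombinedOutput()
		if runErr != nil {
			// Non-zero exit: one or more assertions in this file failed.
			failed++
			fmt.Printf("FAIL %s\n%s\n", path, out)
		} else {
			passed++
			fmt.Printf("PASS %s\n", path)
		}
		return nil
	})
	if err != nil {
		fmt.Println("walk error:", err)
	}
	fmt.Printf("summary: %d passed, %d failed\n", passed, failed)
}
```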
Received on Wednesday, 29 May 2024 22:04:04 UTC