Re: Declarative HTTP Spec Test Suite from Willy Tarreau on 2024-05-27 (ietf-http-wg@w3.org from April to June 2024)

From: Willy Tarreau <w@1wt.eu>
Date: Mon, 27 May 2024 15:23:13 +0200
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Mohammed Al Sahaf <mohammed@caffeinatedwonders.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <ZlSJQawTqEzLYVvg@1wt.eu>

On Mon, May 27, 2024 at 01:04:44PM +0000, Poul-Henning Kamp wrote:
> > This is a proposal that is triggered by some of my involvement with the Caddy
> > web server project. We (Caddy team) have been working towards developing a
> > declarative test suite for the Caddy server.
> > [...]
> > Prior Art:
> > [...]
> > Indeed some of the listed projects have not been updated for a while.
> 
> Mohammed,
> 
> I may be misunderstanding you, but using repository activity as a
> screen will make you miss stuff which Just Works(TM).
> 
> I will argue that VTest (https://github.com/vtest/VTest) is one
> such:  It chugs away, day in and day out, in both the Varnish Cache
> and HAProxy projects.
> 
> A specific reason for repository inactivity, which you might not
> have noticed, is that VTest is just the test program code and the
> test-cases do not live in the VTest repository, but in the "parent"
> projects which use VTest:
> 
> 	https://github.com/haproxy/haproxy/tree/master/reg-tests
> 
> and 
> 
> 	https://github.com/varnishcache/varnish-cache/tree/master/bin/varnishtest/tests
> 
> Full VTest documentation is under "Varnishtest" in the varnish project:
> 
> 	http://varnish-cache.org/docs/trunk/reference/index.html
> 
> I started writing VTest a dozen years ago for many of the same
> reasons you touch on in your analysis, but one of my other reasons
> was that I wanted a good language to write test-cases in.
> 
> Here is a basic test from Varnish Cache, checking that chunked
> encoding works at all:
> 
> 	varnishtest "Check chunked encoding from backend works"
> 	
> 	server s1 {
>         	rxreq
>         	expect req.url == "/bar"
>         	send "HTTP/1.1 200 OK\r\n"
>         	send "Transfer-encoding: chunked\r\n"
>         	send "\r\n"
>         	send "00000004\r\n1234\r\n"
>         	send "00000000\r\n"
>         	send "\r\n"
> 	
>         	rxreq
>         	expect req.url == "/foo"
>         	send "HTTP/1.1 200 OK\r\n"
>         	send "Transfer-encoding: chunked\r\n"
>         	send "\r\n"
>         	send "00000004\r\n1234\r\n"
>         	chunked "1234"
>         	chunked ""
> 	} -start
> 	
> 	varnish v1 -vcl+backend {} -start
> 	
> 	client c1 {
>         	txreq -url "/bar"
>         	rxresp
>         	expect resp.status == 200
>         	expect resp.bodylen == "4"
>         	txreq -url "/foo"
>         	rxresp
>         	expect resp.status == 200
>         	expect resp.bodylen == "8"
> 	} -run
> 
> First, notice the lack of "boilerplate" code: getting a bog standard
> varnish instance is one line in the test-description and the VTest
> tool takes it from there.  There is another module in VTest which
> does the same for HAProxy with the "haproxy" command.
> 
> Second, notice how VTest allows you to work at different levels of
> the HTTP protocol from one "instruction" to the next: The "s1" server
> receives an entire request with "rxreq" but in this particular test-case
> it formulates the first response at the byte level, while the second
> response uses the "chunked" command which automatically generates the
> length.
> 
> As I read your long analysis, VTest is not a perfect fit for you,
> but I think it is more than half way there, and you are more than
> welcome to help us make it even better.

I generally agree with the points made here. Is it possible to do better?
Certainly, just like with any tool. But it's already fairly complete and
it does something that few tools achieve right now: synchronizing both
sides (client and server), which is extremely convenient when testing
reverse proxies. You can for example have an HTTP/1 server and verify
that the h2 encoding coming out of your gateway matches what you expect.
We (haproxy) use it routinely for purposes ranging from regression
testing to spec compliance testing (mostly to detect breakage before our
users do), and since it's fast (it validates ~3600 expect rules from 200
tests in 8 seconds on my laptop), we run it locally on every push and in
the CI on ~20 platforms for every push.
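
For readers less used to chunked framing, here is a rough Python sketch
of what the two responses in the quoted test look like on the wire. It is
not VTest code: encode_chunk and decode_chunked are names made up for
this illustration, and the assumption that the "chunked" command emits a
minimal hex length followed by CRLF, the data and CRLF is mine, not
something stated above. The decoder is deliberately lax (it skips chunk
extensions and ignores trailers).

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex size, CRLF, data, CRLF. An empty chunk ends the body."""
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


def decode_chunked(raw: bytes) -> bytes:
    """Decode a chunked body; chunk extensions are skipped, trailers are ignored."""
    body = bytearray()
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)  # end of the chunk-size line
        size = int(raw[pos:eol].split(b";", 1)[0].decode("ascii"), 16)
        pos = eol + 2
        if size == 0:  # last-chunk: nothing more to read
            return bytes(body)
        if raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        body += raw[pos:pos + size]
        pos += size + 2


# First response of the quoted test: every byte written by hand.
first_body = b"00000004\r\n1234\r\n" b"00000000\r\n" b"\r\n"

# Second response: one hand-written chunk, then two generated ones
# (assumed equivalent to: chunked "1234" and chunked "").
second_body = b"00000004\r\n1234\r\n" + encode_chunk(b"1234") + encode_chunk(b"")

assert len(decode_chunked(first_body)) == 4   # matches resp.bodylen for /bar
assert len(decode_chunked(second_body)) == 8  # matches resp.bodylen for /foo
```

The padded "00000004" and the short "4" both decode to the same size,
which is why mixing the two styles within one response is harmless.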

Another tool you might be interested in having a look at is h2spec. It's
h2-centric and tries to trigger edge cases in the implementation, then
verifies that the behavior matches the spec. It does require a server to
test against, but that's no big deal. It stays very close to the spec, so
it can be seen as a catalogue of the checks to run on the traffic.

Regards,
willy
Received on Monday, 27 May 2024 13:23:28 UTC