Re: Declarative HTTP Spec Test Suite from Poul-Henning Kamp on 2024-05-28 (ietf-http-wg@w3.org from April to June 2024)

From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Tue, 28 May 2024 08:26:17 +0000
To: Mohammed Al Sahaf <mohammed@caffeinatedwonders.com>
cc: Willy Tarreau <w@1wt.eu>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-Id: <202405280826.44S8QHwK053849@critter.freebsd.dk>
--------
Mohammed Al Sahaf writes:

> > > As I read your long analysis, VTest is not a perfect fit for you,
> > > but I think it is more than half way there, and you are more than
> > > welcome to help us make it even better.
>
> Indeed, after reading the documentation, studying a few of the tests, and
> trying to draft a test by hand, vtest is close to the desired tool. I was
> also able to draft a simplistic test case pretty quickly:
>
>  vtest "Test basic Caddy response"
>
>  process p1 "caddy respond --listen 'localhost:8080' --body 'Hello'" -start
>  delay 5
>
>  client c1 -connect "localhost:8080" {
>   txreq -req GET -proto HTTP/1.1 -url /
>   rxresp
>   expect resp.proto == HTTP/1.1
>   expect resp.status == 200
>   expect resp.body == "Hello"
>  } -run
>
>  process p1 -stop
>  process p1 -wait

Not bad :-)

Your example highlights one crucial detail I forgot to mention:

If you want to run multiple tests in parallel or have tests run in
parallel by different users on the same machine, something needs
to manage/assign unique TCP/UDP port numbers and temporary directories,
so the different test-runs do not interfere with each other.

For Varnish we added a debug CLI command so VTest can start varnishd
with "-a localhost:0" and then ask varnishd what port number it
actually got.  VTest support in HAProxy does something similar.

Likewise, in the example I showed, the "-vcl+backend {}" creates a
VCL program for varnishd, where all the "server s%d {}" instances
are already defined as backends, with whatever randomly assigned
IP# and port numbers they got.
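
A minimal Python sketch of the "port 0" idea, for illustration only.  It
is not how VTest or varnishd implement it; the helper names
allocate_listener and allocate_workdir are made up here.  The kernel
picks a free port, and the test harness then asks which one it got:

```python
import os
import socket
import tempfile

def allocate_listener(host="127.0.0.1"):
    """Bind to port 0 so the kernel picks a free port, then report it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))           # port 0: let the OS choose
    sock.listen(1)
    port = sock.getsockname()[1]   # ask which port was actually assigned
    return sock, port

def allocate_workdir(prefix="test-run-"):
    """Create a private temporary directory for one test run."""
    return tempfile.mkdtemp(prefix=prefix)

# Two concurrent "test runs" get distinct ports and directories.
sock_a, port_a = allocate_listener()
sock_b, port_b = allocate_listener()
dir_a, dir_b = allocate_workdir(), allocate_workdir()
print(port_a != port_b, dir_a != dir_b)

# Clean up.
sock_a.close()
sock_b.close()
os.rmdir(dir_a)
os.rmdir(dir_b)
```

Because both sockets are held open at the same time, the two runs cannot
be handed the same port.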

> I noticed HTTP/2 requests aren't as approachable as HTTP/1.

You can generalize that statement by removing the word "requests" :-)

> The documentation seems to imply the user has to manage the streams
> directly. Is that the case?

Again:  We wanted VTest to be able to specify the test as generally
or specifically as required.

To be honest, I cannot remember if there is any automation that helps
with that; if not, it would be easy to add.

My own experience is that you want to manually assign the stream
numbers, so that you can correlate things further down the road.

> It also isn't clear how to make HTTPS requests.

Right now VTest does not support SSL/TLS natively, but that is clearly
something we should add.

There are two major issues, the first being: where do the certs come from?

A) VTest creates self-signed certs as needed on the fly.  That would
   probably be intolerably slow.

B) VTest creates a cache of self-signed certs, either with Y2K38
   like expiry or with logic to periodically refresh them.

C) The user of VTest has to provide the certs to be used, possibly
   from a helper script which does B).

The second major issue is:  Which SSL/TLS implementation, and here
there are only two options:

A) Pick one, and forget about using VTest to probe the edges of
   the SSL/TLS implementations.

B) Write one, so that VTest can test the edges of SSL and TLS traffic.

I don't know what the situation is with respect to tools for testing
SSL/TLS protocol implementations.  If such tools already exist, B)
would be surplus to requirements; if not, somebody should write one.

> Another apparent gap is the inability to carry one value from one response
> of the server into the next. For instance, to save the Etag value sent by
> the server from the initial request to be used with If-None-Match in subsequent
> requests. What am I missing?

Some code ? :-)

Right now macros are expanded when the {...} argument is parsed, so it is not
possible to change or define the value of a macro while the test runs.

It would not be hard to provide some kind of "dictionary" where
things can be stashed and picked up again later.

I think we already talked about something like that, but I guess
nobody needed it enough to implement it.
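
To make the "dictionary" idea concrete, here is a small Python sketch.
It is purely hypothetical: the Stash class, its method names and the
${name} placeholder form are assumptions for illustration, not an
existing or planned VTest feature.  The point is that the value is
looked up when the request is sent, not when the script is parsed:

```python
import re

class Stash:
    """Hypothetical run-time dictionary for values captured during a test."""

    def __init__(self):
        self._values = {}

    def save(self, name, value):
        """Remember a value seen in an earlier exchange."""
        self._values[name] = value

    def expand(self, text):
        """Replace ${name} placeholders at run time (KeyError if unknown)."""
        return re.sub(r"\$\{(\w+)\}",
                      lambda m: self._values[m.group(1)], text)

stash = Stash()

# First exchange: the server's response carries an ETag.
response_headers = {"ETag": '"abc123"'}
stash.save("etag", response_headers["ETag"])

# Second exchange: the saved value is substituted only now.
request_line = stash.expand("If-None-Match: ${etag}")
print(request_line)  # If-None-Match: "abc123"
```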

> However, I feel like my intent is misunderstood. The intent isn't to just
> build a test suite for Caddy and call it a day. The work on Caddy was the
> spark for the proposal, but not the end goal. I'm proposing building a test
> suite blessed by the HTTP WG for all reverse-proxies, web servers, and
> HTTP-speaking entities to check their conformance to HTTP. 

I got that the first time around, but did not comment on it, because I
did not want to discourage you.

Let me do so now.

Compliance Tests are a neat idea:  If you pass this pile of test-cases
you get to call your product "FOOBAR Compliant", and put the golden
"FOOBAR says: Will Just Work™" sticker on it.

Where the subject matter is one-way, for instance Nicolas Seriot's
JSON beastiarium (https://github.com/nst/JSONTestSuite), the success
of compliance testing is entirely up to the malicious creativity
of the person(s) writing the tests, and Nicolas did really well,
breaking (almost?) all JSON parsers in existence.

But where the subject matter is two way, like communication protocols,
I have never seen compliance testing deliver anywhere near what it promises.

The first problem is that even when everybody agrees 100% on what
the protocol definition says, most protocols explode combinatorially
within very few exchanges.

An HTTP server can return 3xx to almost any request; that is not
a compliance failure.  It can send "Connection: Close" or Keepalive;
that is not a compliance failure either.

Imagine this:

 compliance-client:
  HEAD /
 server-under-test:
  301 to /justkidding
 compliance-client:
  HEAD /justkidding
 server-under-test:
  301 to /

How many times is a compliant server allowed to yank a client
around like that, before it actually answers ?

(The above is a surprisingly common pattern of identification.)
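
A sketch of what a test client ends up doing in practice, in Python.
The function name and the limit of 5 hops are arbitrary choices made
for this illustration; nothing in the protocol dictates them, which is
exactly the problem:

```python
def follow_redirects(redirects, start, max_hops=5):
    """Follow a table of path -> redirect target.

    Returns (final_path, hops) when a path is not redirected, and
    raises RuntimeError on a loop or when max_hops is exhausted.
    """
    seen = [start]
    path = start
    for hops in range(max_hops):
        target = redirects.get(path)
        if target is None:
            return path, hops          # the server finally answered
        if target in seen:
            chain = " -> ".join(seen + [target])
            raise RuntimeError("redirect loop: " + chain)
        seen.append(target)
        path = target
    raise RuntimeError("gave up after %d redirects" % max_hops)

# The exchange shown above, as a redirect table.
server = {"/": "/justkidding", "/justkidding": "/"}
try:
    follow_redirects(server, "/")
except RuntimeError as err:
    print(err)  # redirect loop: / -> /justkidding -> /
```

Whether the server "failed" here is a policy decision of the test
writer, not something the test can read off the specification.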

The second problem is timing.  HTTP doesn't have rigidly specified
timeouts, but there is still stuff like "If-Modified-Since", which
the server is allowed to just ignore whenever it feels like it.
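
For a test-case this means the check has to accept a set of outcomes
rather than one expected answer.  A deliberately simplified Python
sketch (the function name is made up, and it ignores that a 3xx or
other responses could also be legitimate):

```python
# Statuses a conditional GET with If-Modified-Since may legitimately get
# in this simplified model: the server honours the condition (304) or
# ignores it and sends the full representation (200).
ACCEPTABLE = {200, 304}

def check_conditional_get(status, body):
    """Return True if the response is acceptable for a conditional GET."""
    if status not in ACCEPTABLE:
        return False
    if status == 304:
        return body == b""     # a 304 response carries no content
    return True

print(check_conditional_get(200, b"<html>...</html>"))  # True
print(check_conditional_get(304, b""))                  # True
print(check_conditional_get(304, b"unexpected"))        # False
```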

But the biggest problem of all is that people simply do not agree
on what standards say or mean.

I served time in meetings with angry IBM people, where we could not
agree what "have received" meant.  (Their mainframes crashed when
my code answered too fast because they (tacitly) assumed modems and
a telco would always be involved, where I used a 5 meter null-modem
cable.)

That is not to say that such testing is never valuable; it is, and
the IETF has often arranged "bake-offs", where different implementations
would meet, and after first getting things to work, they would try
to break each other's implementations with hostile traffic.

So as I said: I do not want to discourage you, and as long as you don't
expect to hand out "HTTP-compliant" gold star stickers, I don't see
why it cannot or should not be done.

Poul-Henning

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
Received on Tuesday, 28 May 2024 08:26:25 UTC