Re: New Version Notification for draft-nottingham-structured-headers-00.txt from Willy Tarreau on 2017-11-05 (ietf-http-wg@w3.org from October to December 2017)

From: Willy Tarreau <w@1wt.eu>
Date: Sun, 5 Nov 2017 08:03:49 +0100
To: Andy Green <andy@warmcat.com>
Cc: Matthew Kerwin <matthew@kerwin.net.au>, Mark Nottingham <mnot@mnot.net>, Kazuho Oku <kazuhooku@gmail.com>, Poul-Henning Kamp <phk@phk.freebsd.dk>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20171105070349.GA26461@1wt.eu>

On Sun, Nov 05, 2017 at 10:25:57AM +0800, Andy Green wrote:
> > Honestly, checking for integer values using strings is complex and not
> > natural to anyone. Try to tell this to the guy who used tonumber(header)
> > in my previous example, where "tonumber()" is provided by default on his
> > system.
> 
> Yeah.  It needs some code to check a number string and see if it's bigger
> than what you can handle.  But it's not rocket science type code, is it?

It's much worse than explaining in the spec how an integer must safely be
parsed so that these popular frameworks adopt and provide the method.
Typically they would provide the equivalent of this:

    /* returns <0 on parse error */
    int http_to_int(const char *str, int *value);

Then it's easier to deal with overflows the "normal" way in the parser
(i.e. check for sum > 214748364 before multiplying, then check for sum < prev
after adding the digit), which emphasizes the importance of having a single
parser for integers.
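
A minimal sketch of what such a function could look like, under assumptions
that are not stated in this thread: only unsigned decimal digits are accepted
(no sign, no whitespace), and the distinct -1/-2 error codes are hypothetical.
One deliberate difference from the description above: signed overflow is
undefined behaviour in C, so instead of testing "sum < prev" after the
addition, this sketch tests against INT_MAX before both the multiplication
and the addition.

```c
#include <limits.h>

/* Hypothetical sketch, not taken from any existing framework.
 * Parses a non-negative decimal integer from a NUL-terminated string.
 * Returns 0 on success, -1 on a malformed string, -2 on overflow. */
int http_to_int(const char *str, int *value)
{
    int sum = 0;

    if (!str || !*str)
        return -1;              /* empty input is a parse error */

    for (; *str; str++) {
        int digit;

        if (*str < '0' || *str > '9')
            return -1;          /* anything but a digit is rejected */
        digit = *str - '0';

        /* Check before multiplying and before adding, so the signed
         * arithmetic itself can never overflow. */
        if (sum > INT_MAX / 10)
            return -2;
        sum *= 10;
        if (sum > INT_MAX - digit)
            return -2;
        sum += digit;
    }

    *value = sum;
    return 0;
}
```

With a 32-bit int, "2147483647" parses successfully while "2147483648" is
reported as an overflow rather than wrapping silently.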

> What's the alternative, the server says "not my job to check if I can handle
> these numbers, mate" and just breaks or serves the first part of the file
> endlessly if some numbers come beyond what it can handle?  That's not really
> an OK situation for anyone.

I never said this at all. If you re-read what I wrote, I said that it's OK
to send this but that implementers must be aware that some implementations
will not support it, so when they have an alternative and are seeking
interoperability, it's better to use the alternative. A good example is the
chunk size. Emitting chunks larger than 2GB doesn't provide much benefit.
Similarly, identifiers should not be specified as integers if they're going
to be larger than 2^31-1. And with just a few such careful measures we'll
quickly find that some implementations are totally interoperable on the
send side because they never manipulate large quantities at all, just like
other implementations will be totally safe on the receive side because they
will take specific measures to deal with the rare important values they care
for and will be OK. Just like my ESP doesn't need to parse large content
lengths advertised in POST since it will not be able to process them anyway.
However if it can detect the value is too large and report a 413 entity too
large, it suddenly becomes safe and interoperable.

> I think it's OK if they want to say, "this server only deals with files
> smaller than 2G, that's what it is" or whatever, but it's the server's job
> to say no cleanly if it meets a situation eg, like Range: is beyond its
> limits to understand.  If the server doesn't take care about it, it's
> broken.

I think we're saying the same thing.

> Putting it another way, more on the original topic, clearly there are two
> separate things here, practical limits on the numbers the internal server
> can express (which may be 64-bit), and judging if an expression of a number
> (which may exceed 64-bits, depending on what is decided) is inside the
> internal limits, without interpreting it.  Eg if it was length:MSB_data like
> bignum, if he sees a length coming > 4 he can judge it's too big without
> having to interpret anything else.  If it was ==4 he can look at the top bit
> of the MSB.

Yep.

> If it was decimal ASCII numbers coming, eg, his limit is 2147483647 he can
> check how many digits and ban it if > 10, accept it if <=9, and
> progressively check the digits against "2147483647" if == 10 to judge it.

That's exactly my point and what I'd like clearly stated in the spec. We
know that implementers do implement exactly what is specified when it's
easy, otherwise they seek something "surely good enough" which already
exists.
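
The digit-counting check described in the quoted text can be sketched as
follows. This is only an illustration: the function name and return codes
are hypothetical, the limit is passed as a decimal string without leading
zeros, and skipping leading zeros in the input is an assumption that the
thread does not discuss.

```c
#include <string.h>

/* Hypothetical helper: judges a decimal string against a limit without
 * converting it to an integer.
 * Returns 1 if num <= limit, 0 if num > limit, and -1 if num is empty
 * or contains anything other than ASCII digits. */
int dec_within_limit(const char *num, const char *limit)
{
    size_t nlen, llen, i;

    if (!*num)
        return -1;
    for (i = 0; num[i]; i++)
        if (num[i] < '0' || num[i] > '9')
            return -1;

    /* Assumption: leading zeros carry no value, so skip them while
     * always keeping at least one digit. */
    while (num[0] == '0' && num[1])
        num++;

    nlen = strlen(num);
    llen = strlen(limit);

    /* Fewer digits always fits, more digits never does. */
    if (nlen != llen)
        return nlen < llen;

    /* Same number of digits: comparing the ASCII digits from the left
     * gives the same ordering as comparing the numbers. */
    return strcmp(num, limit) <= 0;
}
```

For the limit "2147483647", this accepts "999999999" on length alone, rejects
"12345678901" on length alone, and only compares digit by digit for ten-digit
inputs.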

> I wrote this and the
> related ESP32 stuff in lws from scratch, and it now supports H2 on ESP32
> 
> https://github.com/warmcat/lws-esp32-factory

Oh nice, it's cool to know that H2 is ported to small devices since we've
been careful to keep this possible during the H2 design, even though there's
"a lot" of RAM on ESP32 (520 kB) :-)

> I think this business about being able to judge a number vs internalize it
> is a more fundamental and useful way to look at it.

I think we agree on this (and you probably formulate it better than I do).

> Maybe there's some way
> to express the protection you hope for in the way that could eventually be
> defined.

I think that describing a safe and portable integer parser, and a number of
very common limits that senders should avoid crossing whenever possible, or
cross only with a good reason, will be enough for the spec.

Willy

Received on Sunday, 5 November 2017 07:05:10 UTC