Re: New Version Notification for draft-nottingham-structured-headers-00.txt from Andy Green on 2017-11-05 (ietf-http-wg@w3.org from October to December 2017)

From: Andy Green <andy@warmcat.com>
Date: Sun, 5 Nov 2017 10:25:57 +0800
To: Willy Tarreau <w@1wt.eu>
Cc: Matthew Kerwin <matthew@kerwin.net.au>, Mark Nottingham <mnot@mnot.net>, Kazuho Oku <kazuhooku@gmail.com>, Poul-Henning Kamp <phk@phk.freebsd.dk>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <6c6d0d93-4f79-6d2c-3e7a-0a4cf5dbbe3a@warmcat.com>
On 11/04/2017 08:38 PM, Willy Tarreau wrote:
> On Sat, Nov 04, 2017 at 08:13:25PM +0800, Andy Green wrote:
>> The question is because some people on a 64-bit capable platform decided to
>> use 32-bits internally,
> 
> No, nobody decided to use 32-bit internally, they just used the integer of
> the default size proposed by their language. "int" is signed 32-bit on the

So... those guys decided to use 32-bit internally.

> vast majority of platforms and has even been used for decades to store IPv4
> addresses. There's no reason for accusing anyone of purposely doing bad
> stuff, the reality is that unsafe code exists all over the planet by lack
> of knowledge and awareness. While we can hardly improve people's knowledge
> using standards, we can at least improve their awareness of the issues.

...

>> Nothing stops those servers processing the integers in question as strings
>> and seeing if they exceed some implementation limit,
> 
> Honestly, checking for integer values using strings is complex and not
> natural to anyone. Try to tell this to the guy who used tonumber(header)
> in my previous example, where "tonumber()" is provided by default on his
> system.

Yeah.  It needs some code to check a number string and see if it's 
bigger than what you can handle.  But it's not rocket science type code, 
is it?  What's the alternative, the server says "not my job to check if 
I can handle these numbers, mate" and just breaks or serves the first 
part of the file endlessly if some numbers come beyond what it can 
handle?  That's not really an OK situation for anyone.

I think it's OK if they want to say, "this server only deals with files 
smaller than 2G, that's what it is" or whatever, but it's the server's 
job to say no cleanly if it meets a situation, eg, a Range: that is 
beyond the limits it can handle.  If the server doesn't take care of 
that, it's broken.

Putting it another way, more on the original topic, clearly there are 
two separate things here: the practical limits on the numbers the server 
can express internally (which may be 64-bit), and judging whether an 
expression of a number (which may exceed 64 bits, depending on what is 
decided) is inside those internal limits, without interpreting it.  Eg 
if it was length:MSB_data like bignum, and his limit is signed 32-bit, 
then if he sees a length coming > 4 he can judge it's too big without 
having to interpret anything else.  If it was == 4 he can look at the 
top bit of the MSB.
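
A minimal sketch of that judgement in C, for a signed 32-bit internal 
limit.  The wire format here is purely hypothetical (a byte count n 
followed by n magnitude bytes, most significant first, with no leading 
zero padding), and fits_int32() is an illustrative name, not something 
from the draft:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical encoding: n bytes of unsigned magnitude, MSB first,
 * assumed to be minimally encoded (no leading zero bytes).
 * Returns 1 if the value fits in a signed 32-bit int, else 0.
 */
int fits_int32(const uint8_t *msb_data, size_t n)
{
    if (n > 4)
        return 0;   /* too long: reject without reading the data */
    if (n < 4)
        return 1;   /* at most 24 bits: always fits */

    /* exactly 4 bytes: fits only if the top bit of the MSB is clear */
    return (msb_data[0] & 0x80) == 0;
}
```

The point is that the n > 4 case never touches the payload at all.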

If it was decimal ASCII numbers coming and, eg, his limit is 2147483647, 
he can count the digits and ban it if > 10, accept it if <= 9, and if 
== 10 progressively check the digits against "2147483647" to judge it.
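
As a rough sketch of that digit-count-then-compare check in C: 
judge_decimal() and the enum names are made up for illustration, the 
input is assumed to be an unsigned run of ASCII digits, and stripping 
leading zeros first is an extra assumption beyond what is described 
above:

```c
#include <stddef.h>
#include <string.h>

enum judge { JUDGE_FITS, JUDGE_TOO_BIG, JUDGE_INVALID };

/*
 * Judge whether the unsigned decimal string s (len bytes) is <= the
 * decimal string limit (eg "2147483647"), without converting it.
 */
enum judge judge_decimal(const char *s, size_t len, const char *limit)
{
    size_t i, limit_len = strlen(limit);

    if (len == 0)
        return JUDGE_INVALID;
    for (i = 0; i < len; i++)
        if (s[i] < '0' || s[i] > '9')
            return JUDGE_INVALID;

    /* assumption: leading zeros are allowed and do not count as digits */
    while (len > 1 && *s == '0') {
        s++;
        len--;
    }

    if (len > limit_len)
        return JUDGE_TOO_BIG;   /* more digits than the limit */
    if (len < limit_len)
        return JUDGE_FITS;      /* fewer digits than the limit */

    /* same digit count: the first differing digit decides */
    return memcmp(s, limit, len) > 0 ? JUDGE_TOO_BIG : JUDGE_FITS;
}
```

Only after JUDGE_FITS would the value be handed to whatever conversion 
the platform provides.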

So maybe that should be recognized: implementations wanting to use this 
new standard may choose not to be able to internalize every value that 
comes, but they must all be able to judge whether a value can be 
internalized or not.

>> If they don't do that, they don't even work properly serving large files
>> today; that's their problem.
> 
> On small platforms nobody seeks to serve large files, they're dealing with
> authentication pages, 10 settings on a page, and POST requests to change
> such settings. Despite this these software do cause interoperability issues
> right now by improperly parsing some valid responses (eg: when retrieving
> the weather forecast from public services to adjust the heater). And when
> you're unfortunate enough to be responsible for the proxy in the middle that
> blocks the so-called "valid" response that the device normally properly
> deals with, it's a bit of a pain to have to argue that you're the only
> one applying safe processing there.

Well, you are far too brave saying small systems with, eg, a 32GB uSD 
connected, as ESP32 has the IO for, have no use for files > 2GB.  Even 
in the cases where they decide their own limit, the implementation is 
responsible for cleanly enforcing it.

>>> We aren't here to tell people what (not) to do; rather we're here to
>>> describe how to get something done in a way that is useful and reliable,
>>> hard to get wrong, and easy to sort out if/when it does eventually go
>>> wrong.  That includes predicting and addressing ways we can foresee that
>>> it could go wrong, or that similar things have gone wrong in the past.
>>
>> If you step back a bit, a simple, clear standard is the best way to get
>> something "useful, reliable, hard to get wrong, easy to sort out". Piling on
>> weird stuff - what was it, four "profiles" for integer sizes... is not
>> making a better standard in those regards than just saying compliant
>> implementations must handle 64-bit ints.  If it becomes so complex just to
>> eat an int, you will create more "interesting" bugs you say you want to
>> avoid.
> 
> I disagree on this one because we all know that we all code based on the
> target use case. Do you add provisions against bit flips due to solar
> eruptions in your code ? Most likely not. I don't either. But probably
> if you were coding for a Mars probe you'd have to use "volatile" in front
> of all your variables and perform such checks. It's just out of your scope
> and you skip such checks, that's fine. People developing for platforms
> adjusting their home temperature based on public forecast do the same,
> no need to deal with complex code compatible with 64-bit quantities to
> retrieve a JSON block.

My own code doesn't seem to lack bugs and people are using and shipping 
versions of it years old.  Imperfections and misunderstandings are 
everywhere.  But for guys looking at writing up a new way to express 
HTTP header values, which IIUC is the scope here, so what?  It has 
always been so and always will be so.  A spec that will be relevant 
perhaps for decades should be guiding everyone to a good clean place.

>> As shown it is not an onerous requirement to say it should handle
>> 64-bit even on weak platforms like ESP8266 / ESP32.
> 
> As I showed it's the opposite. *existing* frameworks are not even able to
> parse anything outside -2^31 and 2^31-1 nor have the types needed for this
> and these ones are currently being used in various products.
> 
>>> And nobody (for a given definition of "nobody") cares about
>>> interoperability test suites, apart from the really big one (i.e.
>>> reality.)  Some of the errors that come up in that one can be rather...
>>> Interesting(TM).
>>
>> Ehhh are you sure :-) I think you find many implementors really like having
>> test suites.
> 
> I think you're only working in enterprise environments where some people do
> care about this. While I too am fond of standards compliance, I'm disgusted

I agree there is a lot of that about around here, but it really isn't 
that I just don't understand where you are coming from... I likely 
understand where you are coming from at least as well as you do... eg, I 
wrote this and the related ESP32 stuff in lws from scratch, and it now 
supports H2 on ESP32

https://github.com/warmcat/lws-esp32-factory

> by what I see every other day, emitted by software supposed to be working
> fine, or the type of issues certain software face. Just google for
> "strstr cookie" or "atoi content-length" to see real-world horrors that *we*
> can avoid by better specs. But it's not by asking HTTP implementers to have
> to manually handle large integer arithmetics to comply with a spec that's out
> of their usage scope that we'll see any progress made in this area!

I think this business about being able to judge a number vs internalize 
it is a more fundamental and useful way to look at it.  Maybe there's 
some way to express the protection you hope for in the way that could 
eventually be defined.

-Andy

> Willy
>
Received on Sunday, 5 November 2017 02:26:55 UTC