Re: New Version Notification for draft-nottingham-structured-headers-00.txt from Loïc Hoguin on 2017-11-02 (ietf-http-wg@w3.org from October to December 2017)

From: Loïc Hoguin <essen@ninenines.eu>
Date: Thu, 2 Nov 2017 01:32:32 +0000
To: Kazuho Oku <kazuhooku@gmail.com>, Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <fbf4519c-d3e8-6c4b-2f9d-9e59628e9259@ninenines.eu>

On 11/01/2017 01:52 AM, Kazuho Oku wrote:
> Hi, PHK!
> 
> Thank you for your response.
> 
> 2017-10-31 17:26 GMT+09:00 Poul-Henning Kamp <phk@phk.freebsd.dk>:
>> --------
>> In message <CANatvzzOXL6vFjm_4KBxLvwosZ6vJYW_ic34_KCwFXtFXFTLsQ@mail.gmail.com>
>> , Kazuho Oku writes:
>>
>>> So why not mandate support for 64-bit integers?
>>>
>>> [...]
>>>
>>> Let's not repeat the failure made by JSON.
>>
>> If we were designing a general-purpose data-carrier format, I would
>> be 100% with you there, but we are not.
>>
>> The goal here is to design a maximally robust data-carrier format,
>> and that means conservative choices and putting the inconvenience
>> on the end which packages the data.
> 
> In my view, current limit (15 digits at max.) is overly conservative.
> 
> Let me explain in response to your text below.
> 
>> The number format is intended for sending quantities on which
>> arithmetic makes sense, and the point of the restriction is
>> to reserve to the implementor the ability to use the most
>> efficient hardware native data type, without loss of precision.
>>
>> 15 digits is 49¾ bits, and while I'm not prepared to state
>> that "is enough for everybody" I think we can safely say that
>> it covers all uses of arithmetic seen in HTTP until now.
> 
> IMO, we should consider the future instead of optimizing against what
> we see now.
> 
> When Content-Length was defined in HTTP/1.0 back in May 1996, the
> largest file we used to transfer were CD-ROM images (650MB ~ 1GB ~
> 2^30 bytes). We are now after 20 years since that, and we are seeing
> SD cards of 512GB (~ 1TB ~ 2^40 bytes). Assuming that the increase
> will continue, we would be seeing a storage that can store 2^50 bytes
> (~1 PB) of data within 20 years.
> 
> How long is the expected lifetime of Structured Headers? Assuming that
> it would be used for 20 years (HTTP has been used for 20+ years, TCP
> is used for 40+ years), there is fair chance that the 49¾ bits limit
> is too small. Note that even if we switch to transferring headers in
> binary-encoded forms, we might continue using Structured Headers for
> textual representation.
> 
> Do we want to risk making _all_ our future implementations complex in
> exchange of being friendly to _some_ programming languages without
> 64-bit integers?

I will second that question for the opposite reason. The implementation 
I maintain currently supports integers of up to around 255^255 because 
the programming language's integers are implemented as bignums once they 
are above a certain value.

This means that theoretically I could accept or serve content of a size 
well exceeding 15 digits today. Having a new hard limit of 15 
digits would mean that, at least for content-length, I would have to 
start rejecting what worked perfectly before. (I don't think I've heard 
of anyone using values this large yet, but as pointed out this could 
well change in 20 years.)
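
As a rough illustration of the difference, here is a minimal sketch in 
Python, chosen only because its integers are also arbitrary precision. 
The function names and the MAX_DIGITS constant are hypothetical and are 
not taken from any real implementation or from the draft text.

```python
import math

MAX_DIGITS = 15  # the digit limit discussed in this thread


def parse_unbounded(value: str) -> int:
    """Parse a decimal content-length with no limit on the number of digits."""
    # isascii() rules out non-ASCII digit characters that isdigit() accepts.
    if not (value.isascii() and value.isdigit()):
        raise ValueError("not a decimal integer")
    return int(value)  # backed by a bignum, so no overflow


def parse_limited(value: str) -> int:
    """Same parser, but rejecting anything longer than MAX_DIGITS."""
    if len(value) > MAX_DIGITS:
        raise ValueError("more than %d digits" % MAX_DIGITS)
    return parse_unbounded(value)


# Number of bits needed to hold any 15-digit value, roughly 49.8.
bits_for_15_digits = math.log2(10 ** MAX_DIGITS)
```

With these definitions, a 20-digit value is accepted by parse_unbounded 
and rejected by parse_limited, which is the behaviour change described 
above.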

Is that a real issue as far as content-length is concerned though? I 
think not. If the client tries to upload content that is too large, we 
have an error for that. If the server tries to send content that is 
larger than the client expects, the client can abort and figure out an 
alternative. This can happen not only because the number of digits is 
too large but also because the value is too large (not enough space on 
disk, file system not allowing files larger than X bytes, etc.) so I am 
not sure that a restriction on the number of digits helps in that case 
either.
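
To make that concrete, a small sketch in Python of a receiver that 
decides based on the value rather than on the number of digits. The 
function name, parameters and figures below are made up for 
illustration.

```python
def can_store(content_length: int, free_bytes: int, max_file_size: int) -> bool:
    """Return True if a body of content_length bytes fits the local limits."""
    return content_length <= free_bytes and content_length <= max_file_size


# Hypothetical receiver: 50 GB free, files capped at 4 GiB minus one byte.
FREE_BYTES = 50 * 10 ** 9
MAX_FILE_SIZE = 4 * 2 ** 30 - 1

# A 12-digit content-length is well within 15 digits and still too large.
announced = 10 ** 11
fits = can_store(announced, FREE_BYTES, MAX_FILE_SIZE)
```

Here the value passes any 15-digit check and the receiver still has to 
abort, so the digit limit did not remove the need for the check on the 
value itself.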

It also won't help the smallest implementations that can sometimes be 
seen in embedded deployments; for many of the ones I've seen, even if 
they can parse a 15-digit content-length, they're not going to do 
anything with it because they don't have the memory. For them, it 
doesn't matter if the limit is 15 or 150 digits.

I understand wanting to nicely map textual representations to 
programming language types, but that rarely plays out as expected. JSON 
is a great example of that: it has JavaScript's numbers, floating-point 
numbers that are made to look like integers. As a result it has proven 
unusable as-is for a variety of applications that need better precision 
(financial, for example). If JSON had used decimal numbers from the 
start, there would have been no problem.
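
A short sketch of the precision problem, again in Python. The first 
parse mimics a parser that maps every JSON number to an IEEE 754 
double, as JavaScript does; the second keeps the numbers as decimals. 
The document and its field names are made up for illustration.

```python
import json
from decimal import Decimal

text = '{"id": 9007199254740993, "price": 0.1}'

# Every number becomes a double, as in a JavaScript-style parser.
as_float = json.loads(text, parse_int=float, parse_float=float)

# Every number is kept as an exact decimal value instead.
as_decimal = json.loads(text, parse_int=Decimal, parse_float=Decimal)

# 2^53 + 1 cannot be represented as a double and is silently rounded.
id_survives_float = int(as_float["id"]) == 9007199254740993

# Summing the price three times is exact only in the decimal case.
float_sum_exact = as_float["price"] * 3 == 0.3
decimal_sum_exact = as_decimal["price"] * 3 == Decimal("0.3")
```

With doubles, the identifier changes and the sum is off by a small 
amount. With decimals, both come out as written in the text.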

Cheers,

-- 
Loïc Hoguin
https://ninenines.eu
Received on Thursday, 2 November 2017 01:33:12 UTC