Very large values (Re: Call For Adoption Live Byte Ranges)

On 3 January 2017 at 10:00, Craig Pratt <craig@ecaspia.com> wrote:
> 2^63 is 9223372036854775808 (decimal). I've defined a smaller value to avoid
> potential conflicts and to make the value more easily identifiable:
> 9222999999999999999.
>
> I think having a clearly-defined Very Large Value such as this to represent
> the indeterminate end of content will be more deterministic/easily
> implemented than having a Server try to establish a VLV in each HTTP
> exchange. But I'd appreciate any thoughts prior to revising the draft.

I think that any value you choose will be OK-ish.  The question is
whether a response could ever exceed that size.  If one could, then
no single value you choose will be enough; in that case you don't
want a single fixed value at all, just a recommendation to pick a big
number that far exceeds the size you want/expect.

I guess the other concern is that 9222999999999999999 (which I had to
copy because I go cross-eyed counting those nines) is too big for
some numeric formats.  JavaScript has trouble with that number, which
it reads as 9223000000000000000 instead, a problem that starts at
9007199254740993 (just paste that into your browser console and see
what comes back).  That suggests a smaller value might be safer, but
a smaller value is also more easily exceeded by real content.
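
For anyone who would rather not open a browser console: the same
rounding shows up with any IEEE-754 double, for example in Python
(illustrative only):

>>> int(float(9222999999999999999))   # rounds to the nearest double
9223000000000000000
>>> 2**53 + 1
9007199254740993
>>> int(float(2**53 + 1))             # the first integer a double can't hold
9007199254740992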

Note that whatever value you pick has to be safe for a great many
implementations, even if those implementations never need that space.
They still have to parse the value properly, preferably without
resorting to use of bignums.
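
As a sketch of what that means (not from the draft): an implementation
whose integers top out at 64 bits still has to accept a 19-digit
last-byte-pos without overflowing or switching to arbitrary-precision
arithmetic.  In Python, which has bignums natively, the explicit bound
check below stands in for what a C or Java parser would hit naturally:

INT64_MAX = 2**63 - 1

def parse_last_byte_pos(text):
    if not text.isdigit():
        raise ValueError('not a valid last-byte-pos')
    value = int(text)
    if value > INT64_MAX:
        # a 64-bit implementation simply cannot represent this
        raise ValueError('too large for this implementation')
    return value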

If you believe it to be possible to pick a safe value that will never
be exceeded, then ignore the rest of my mail :)


The risk in specifying a single value is that implementations will
hard-code checks around that value, like (end == VLV) or, if things
are done poorly, (end >= VLV).  Implementations with that check will
assume an indefinite range even when there isn't one, and might get
caught by bugs like infinite loops:

10: I have up to <VLV>, I need more bytes
20: ask for a range from current end to <VLV> (i.e., VLV-VLV)
30: get a zero-length range back
40: if need more bytes, goto 20
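
The same loop in rough Python, for concreteness; need_more_bytes()
and fetch_range() are hypothetical stand-ins for the client's own
completion check and HTTP machinery:

VLV = 9222999999999999999   # the hard-coded very large value
end = VLV                   # client believes it already has up to VLV
while need_more_bytes():
    body = fetch_range(first=end, last=VLV)  # asks for VLV-VLV: nothing
    end += len(body)                         # zero bytes back, no progress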

That leads to problems: implementations won't be able to send
responses of exactly the size you choose (however unlikely that is),
or in the bad case, you won't ever be able to exceed that value.

You can get the same effect if major implementations pick the same value.

On the other hand, a client can just pick an arbitrary stupidly large
value (ASLV).  This can be an increment on what the client already
has, and should probably include some randomness.  If there is still
that much remaining, the client just has to make a new request.

Thus, clients can pick a minimum increment that won't cause too much
pain for them.  2^32 might be enough for clients that don't mind
making a request every 4 GB or so, and it might make sense to start
with "smaller" increments like that to avoid triggering
incompatibility problems.

Adding some amount of randomness will give greater assurance that the
server has read and understood the request.  For example:

aslv = lastByte + 2**32 + random.randrange(2**32)
request.setHeader('Content-Range', 'bytes %d-%d/*' % (lastByte, aslv))
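
A rough sketch of a client driving that strategy; fetch_range() and
have_all_content() are hypothetical stand-ins for whatever request
helper and completion check the client actually has:

import random

MIN_INCREMENT = 2**32   # roughly 4 GB per request

def next_aslv(last_byte):
    # arbitrary stupidly large value: a big increment plus randomness
    return last_byte + MIN_INCREMENT + random.randrange(MIN_INCREMENT)

last_byte = 0
while not have_all_content():
    aslv = next_aslv(last_byte)
    body = fetch_range(first=last_byte, last=aslv)
    last_byte += len(body)   # continue from what actually arrived

If the content really does extend past the ASLV, the client just
starts again from wherever it left off; nothing has to be fixed in
the protocol for that to work.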

Received on Tuesday, 3 January 2017 00:37:58 UTC