review of draft-ietf-httpbis-rand-access-live-01 from Martin Thomson on 2017-11-09 (ietf-http-wg@w3.org from October to December 2017)

From: Martin Thomson <martin.thomson@gmail.com>
Date: Thu, 9 Nov 2017 12:49:15 +1100
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABkgnnUxsm3F4M8UF2VqF9CcDJx7W0FoDTBgg9Axq7ti1nPqEQ@mail.gmail.com>

This document is almost ready for publication.

The mechanism is solid.  I just have a few quibbles about document
structure.  I hope that I've provided concrete-enough suggestions here
to unblock progress.

I found it hard to understand a) what the shortcomings of the existing
mechanisms were, and b) how this document intended to approach
addressing those shortcomings.  Section 2 almost gets there by noting
that there are two components to the design.

I think that it has to do two more things:
 - establish why one request isn't enough
 - put more meat on how the two pieces fit together

The draft currently buries the explanation for the first part, which
makes it hard to understand the motivation for the mechanism being
proposed.

It's also probably true that you don't need two requests; you could
leap right into a request with a very large maximum.  But I think that
in the interest of rigor, the formulation in the draft is fine.


I would phrase this as follows:

~~~
This document recommends a two-step process for accessing resources
that have indeterminate length representations.

Two steps are necessary because of limitations with the Range and
Content-Range header fields.  A server cannot know from a range
request that a client wishes to receive a response that does not have
a definite end.  More critically, the header fields do not allow the
server to signal that a resource has indeterminate length without also
providing a fixed portion of the resource.

A client first learns that the resource has a representation of
indeterminate length by requesting an indeterminate range.  The server
responds with the range that is available to it, but indicates that
it does not know the length of the representation.  See Section 2.1
for details and examples.

Once the resource is known to have indeterminate length, the client
can request a very large range from the resource.  This range has
an explicit end, but the client chooses an end value larger than it is
likely to reach in the near term.  The server signals an understanding
of the client request for an indeterminate range by indicating that
the range of the representation it is providing exactly matches the
client request rather than the range that it currently has available
to it.  See Sections 2.2 and 2.3 for details.
~~~
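
As a rough illustration of the two steps in the proposed text, here is
a minimal Python sketch of the client-side decision.  It only parses a
Content-Range value from the first response and, if the complete
length is "*", builds the Range value for the second request.  The
names parse_content_range, second_request_range and VERY_LARGE_VALUE
are hypothetical, and 2**53 - 1 is just a placeholder for "very large";
neither the draft text quoted here nor this review fixes a number.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical placeholder for a "very large" end value (an assumption,
# not a number taken from the draft or this review).
VERY_LARGE_VALUE = 2**53 - 1


class ContentRange(NamedTuple):
    first: int
    last: int
    length: Optional[int]  # None when the server sent "*"


_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def parse_content_range(value: str) -> ContentRange:
    """Parse 'bytes first-last/length' where length may be '*'."""
    m = _CONTENT_RANGE.fullmatch(value.strip())
    if m is None:
        # Other forms (e.g. unsatisfied-range) are out of scope here.
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last, length = m.groups()
    return ContentRange(int(first), int(last),
                        None if length == "*" else int(length))


def second_request_range(first_content_range: str,
                         start: int = 0) -> Optional[str]:
    """Return the Range value for step two, or None if step one
    showed that the representation has a known length."""
    cr = parse_content_range(first_content_range)
    if cr.length is not None:
        return None
    return f"bytes={start}-{VERY_LARGE_VALUE}"
```

For example, a first response carrying "bytes 0-1234567/*" leads to a
second request for the range from 0 to the placeholder value, while
"bytes 0-99/100" leads to no second request at all.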

In terms of structure, I would prefer having this explanation (in the
above form, or tweaked as you see fit) ahead of any description of the
solution.

Note that I added the point that the server echoing the last-byte-pos
indicates either that the server understands that the request is for
an indefinite range, or that it has that much data to send (which is
OK, because the number is generally more than the client wants).
That's an important point that the draft isn't explicit enough about.

I would also merge Sections 2.2 and 2.3 as part of the same thing.

In the introduction, I found that the second paragraph didn't really
help motivate the design particularly well.  I get that requesting an
entire resource will - for this sort of resource - necessarily result
in an indefinite response, but that isn't immediately obvious without
doing some heavy thinking.  The third paragraph is much better
justification and sufficient in my opinion.  I would delete the second
paragraph.

Also in the introduction, I would rephrase the final paragraph as follows:

~~~
This document describes a usage pattern for range requests that can be
used to efficiently retrieve representations that are appended to over
time.  This technique uses range requests for ranges that have "very
large" values for the end of the range.  This allows representations
to be progressively delivered by servers as new content is added.  It
also ensures compatibility with servers and intermediaries that don't
support this technique.
~~~

I found Section 3.1 hard to follow.

   If a Client would like to start the content transfer at the
   Aggregation ("live") point without including any randomly-accessible
   portion of the representation, then it should supply the last-byte-
   pos from the most-recently received byte-range-spec and a Very Large
   Value for the last-byte-pos in the byte-range request.

Because the focus of the example is on how a client learns where the
last octet is, I would drop the second GET request example from this
to concentrate on the HEAD request as follows:

~~~
A client that wishes to receive only the newly added portion of a
resource (i.e., start at the "live" point) can use a HEAD request to
learn what range the server currently has available.  For example:

   HEAD /resource HTTP/1.1
   Host: example.com
   Range: bytes=0-

The Content-Range header field in the response will indicate the range
(or ranges) of octets available to the server.

  200 OK
  Content-Range: bytes 0-1234567/*

The client can then issue a request for a range starting at the end
value (using a very large value for the end of a range) and receive
only new content.

~~~

Including the method in the example helps a lot in this case, because
it is the relevant point that distinguishes it from earlier examples.
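
The final step of the proposed Section 3.1 text, going from the HEAD
response to the live-point request, might be sketched in Python as
below.  The names live_point_range and VERY_LARGE_VALUE are
hypothetical, and 2**53 - 1 is only a placeholder for a very large
value.  The sketch follows the wording above literally and starts the
range at the end value reported by the server.

```python
import re

# Hypothetical placeholder for a "very large" end value (an assumption).
VERY_LARGE_VALUE = 2**53 - 1


def live_point_range(head_content_range: str) -> str:
    """Build a Range value that starts at the end value taken from the
    Content-Range of the HEAD response and ends at a very large value."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)",
                     head_content_range.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {head_content_range!r}")
    end_value = int(m.group(2))
    # Starting at end_value mirrors the text above; the client receives
    # the octet at end_value again, followed by any new content.
    return f"bytes={end_value}-{VERY_LARGE_VALUE}"
```

With the response from the example, "bytes 0-1234567/*", this yields
a Range value of "bytes=1234567-9007199254740991".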

Nits:

   A Server that doesn't support or supply a continuously aggregating
   ("live") response should

I think that since this is also talking about servers that don't
implement this mechanism, this "should" needs to be replaced with
"will".  Or maybe we could be more careful and say that we *assume*
that servers (or resources) that don't support indeterminate ranges
will set a last-byte-pos that corresponds to the last octet they have
available at the time.

You don't need to capitalize Client and Server.  RFC 7230 doesn't.

This needs to refer to "Section 4.1 of [RFC7230]".  There is nothing
in 7233 relevant to this claim.

   A 0-length chunk indicates that
   aggregation of the transferring resource is permanently discontinued,
   per section 4.1 of [RFC7233].

However, I would only say "A zero-length chunk indicates the end of
the transfer, see Section 4.1 of [RFC7230]."  I don't think that we
can claim permanent discontinuation.  That is, if the server decides
to cease service to a client, that doesn't mean that it won't be able
to resume later.  This text implies a permanence that isn't
appropriate.
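
To illustrate why the narrower wording is all a receiver can rely on,
here is a minimal Python sketch of reading a chunked body.  The name
iter_chunks is hypothetical, and the sketch ignores chunk extensions
and trailers.  On seeing the zero-length chunk, the reader learns only
that this transfer has ended; nothing in the framing says whether the
resource will grow again later.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the data of each chunk until the zero-length chunk."""
    while True:
        size_line = stream.readline()
        if not size_line:
            # The stream ended before the zero-length chunk arrived.
            raise EOFError("stream ended before the last chunk")
        # The chunk size is hexadecimal; drop any chunk extension.
        size_text = size_line.split(b";", 1)[0].strip().decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            # End of this transfer; trailers (if any) are not read here.
            return
        data = stream.read(size)
        stream.read(2)  # consume the CRLF that follows the chunk data
        yield data
```

For instance, feeding it io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
produces the two chunks b"Wiki" and b"pedia" and then stops.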

I'm happy to generate PRs for any or all of the above.
Received on Thursday, 9 November 2017 01:49:39 UTC