Re: review of draft-ietf-httpbis-rand-access-live-01 from Martin Thomson on 2017-11-09 (ietf-http-wg@w3.org from October to December 2017)

From: Martin Thomson <martin.thomson@gmail.com>
Date: Fri, 10 Nov 2017 08:12:08 +1100
To: Craig Pratt <craig@ecaspia.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CABkgnnXj0cPoUg0zVuibgiQuLAg4OxZqjDfKu258hGuSjRCJGQ@mail.gmail.com>
Nice work Craig, thanks for the quick turnaround.

On Thu, Nov 9, 2017 at 8:54 PM, Craig Pratt <craig@ecaspia.com> wrote:

> Thanks much for the great feedback Martin.
>
> I've been concentrating on prototyping a server for demo and testing with
> proxies. So the feedback is much appreciated. It's a good opportunity to
> shift focus for a bit.
>
> I've incorporated your edits - and made tweaks of my own. Here's the PR:
>
>  https://github.com/httpwg/http-extensions/pull/414
>
> Feel free to comment on the PR for anything I missed and/or my edits.
>
> And I can split this into multiple PRs if anyone thinks this is too
> unwieldy for one PR.
>
> Thx again,
>
> cp
>
>
> On 11/8/17 5:49 PM, Martin Thomson wrote:
>
> This document is almost ready for publication.
>
> The mechanism is solid.  I just have a few quibbles about document
> structure.  I hope that I've provided concrete-enough suggestions here
> to unblock progress.
>
> I found it hard to understand a) what the shortcomings of the existing
> mechanisms were, and b) how this document intended to approach
> addressing those shortcomings.  Section 2 almost gets there by noting
> that there are two components to the design.
>
> I think that it has to do two more things:
>  - establish why one request isn't enough
>  - put more meat on how the two pieces fit together
>
> The draft currently buries the explanation for the first part, which
> makes it hard to understand the motivation for the mechanism being
> proposed.
>
> It's also probably true that you don't need two requests; you could
> leap right into a request with a very large maximum.  But I think that
> in the interest of rigor, the formulation in the draft is fine.
>
>
> I would phrase this as follows:
>
> ~~~
> This document recommends a two-step process for accessing resources
> that have indeterminate length representations.
>
> Two steps are necessary because of limitations with the Range and
> Content-Range header fields.  A server cannot know from a range
> request that a client wishes to receive a response that does not have
> a definite end.  More critically, the header fields do not allow the
> server to signal that a resource has indeterminate length without also
> providing a fixed portion of the resource.
>
> A client first learns that the resource has a representation of
> indeterminate length by requesting an indeterminate range.  The server
> responds with the range that is available to it, but indicates that
> it does not know the length of the representation.  See Section 2.1
> for details and examples.
>
> Once the resource is known to have indeterminate length, the client
> can request a very large range from the resource.  This range has
> an explicit end, but the client chooses an end value larger than it is
> likely to reach in the near term.  The server signals an understanding
> of the client request for an indeterminate range by indicating that
> the range of the representation it is providing exactly matches the
> client request rather than the range that it currently has available
> to it.  See Sections 2.2 and 2.3 for details.
> ~~~
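
A minimal sketch of the two-step exchange described above, limited to
building and checking the header values (no network I/O). The function
names and the choice of 2**53 - 1 as the "very large" end value are
assumptions made for illustration; nothing here is taken from the draft.

```python
import re

# Assumption: any end value far beyond what the client expects to reach
# in the near term would do; 2**53 - 1 is only an illustrative pick.
VERY_LARGE_VALUE = 2**53 - 1


def probe_range_header(first=0):
    """Step 1: request an open-ended range to learn what is available."""
    return f"bytes={first}-"


def parse_content_range(value):
    """Parse 'bytes first-last/length' into (first, last, length).

    length is None when the server sends '*', i.e. it does not know
    the length of the representation.
    """
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    length = None if m.group(3) == "*" else int(m.group(3))
    return int(m.group(1)), int(m.group(2)), length


def live_range_header(first, last=VERY_LARGE_VALUE):
    """Step 2: request a range with an explicit, very large end."""
    return f"bytes={first}-{last}"


def server_echoed_request(content_range, requested_first, requested_last):
    """True if the Content-Range exactly matches the requested range.

    A match suggests the server understood the request as one for an
    indeterminate range (or really has that much data to send).
    """
    first, last, _ = parse_content_range(content_range)
    return (first, last) == (requested_first, requested_last)
```

For instance, a step 1 response of "bytes 0-1234567/*" parses to a length
of None, which tells the client to move on to step 2; a step 2 response
whose Content-Range still ends at 1234567 would not count as an echo of
the request.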
>
> In terms of structure, I would prefer having this explanation (in the
> above form, or tweaked as you see fit) ahead of any description of the
> solution.
>
> Note that I added the point that the server echoing the last-byte-pos
> indicates either that the server understands that the request is for
> an indefinite range or that it has that much data to send (which is
> OK, because the number is generally more than the client wants).
> That's an important point that the draft isn't explicit enough about.
>
> I would also merge Sections 2.2 and 2.3 as part of the same thing.
>
> In the introduction, I found that the second paragraph didn't really
> help motivate the design particularly well.  I get that requesting an
> entire resource will - for this sort of resource - necessarily result
> in an indefinite response, but that isn't immediately obvious without
> doing some heavy thinking.  The third paragraph is a much better
> justification and is sufficient in my opinion.  I would delete the second
> paragraph.
>
> Also in the introduction, I would rephrase the final paragraph as follows:
>
> ~~~
> This document describes a usage pattern for range requests that can be
> used to efficiently retrieve representations that are appended to over
> time.  This technique uses range requests for ranges that have "very
> large" values for the end of the range.  This allows representations
> to be progressively delivered by servers as new content is added.  It
> also ensures compatibility with servers and intermediaries that don't
> support this technique.
> ~~~
>
> I found Section 3.1 hard to follow.
>
>    If a Client would like to start the content transfer at the
>    Aggregation ("live") point without including any randomly-accessible
>    portion of the representation, then it should supply the last-byte-
>    pos from the most-recently received byte-range-spec and a Very Large
>    Value for the last-byte-pos in the byte-range request.
>
> The focus of the example is on how a client learns where the last
> octet is, so I would drop the second GET request example from this
> to concentrate on the HEAD request, as follows:
>
> ~~~
> A client that wishes to receive only the newly added portion of a
> resource (i.e., start at the "live" point) can use a HEAD request to
> learn what range the server currently has available.  For example:
>
>    HEAD /resource HTTP/1.1
>    Host: example.com
>    Range: bytes=0-
>
> The Content-Range header field in the response will indicate the range
> (or ranges) of octets available to the server.
>
>   200 OK
>   Content-Range: bytes 0-1234567/*
>
> The client can then issue a request for a range starting at the end
> value (using a very large value for the end of a range) and receive
> only new content.
>
> ~~~
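
As a rough illustration of that last step, the sketch below derives the
follow-up Range header from the Content-Range value returned for the
HEAD request. The function name, the skip_last_octet flag, and the
2**53 - 1 end value are hypothetical. By default the range starts at
the end value, as worded above; the flag starts one octet later so the
final octet already reported by the server is not fetched again.

```python
import re

# Assumption: an illustrative "very large" end value, not one the draft mandates.
VERY_LARGE_VALUE = 2**53 - 1


def live_range_from_head(content_range, very_large=VERY_LARGE_VALUE,
                         skip_last_octet=False):
    """Build a Range header that starts at the current "live" point.

    content_range is the Content-Range value from the HEAD response,
    for example 'bytes 0-1234567/*'.
    """
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(?:\d+|\*)", content_range.strip())
    if m is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    last = int(m.group(2))
    # Start at the end value, or one past it to receive strictly new octets.
    first = last + 1 if skip_last_octet else last
    return f"bytes={first}-{very_large}"


print(live_range_from_head("bytes 0-1234567/*"))
# prints: bytes=1234567-9007199254740991
```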
>
> Including the method in the example helps a lot in this case, because
> it is the relevant point that distinguishes it from earlier examples.
>
> Nits:
>
>    A Server that doesn't support or supply a continuously aggregating
>    ("live") response should
>
> I think that since this is also talking about servers that don't
> implement this mechanism, this "should" needs to be replaced with
> "will".  Or maybe we could be more careful and say that we *assume*
> that servers (or resources) that don't support indeterminate ranges
> will set a last-byte-pos that corresponds to the last octet they have
> available at the time.
>
> You don't need to capitalize Client and Server.  RFC 7230 doesn't.
>
> This needs to refer to "Section 4.1 of [RFC7230]".  There is nothing
> in 7233 relevant to this claim.
>
>    A 0-length chunk indicates that
>    aggregation of the transferring resource is permanently discontinued,
>    per section 4.1 of [RFC7233].
>
> However, I would only say "A zero-length chunk indicates the end of
> the transfer, see Section 4.1 of [RFC7230]."  I don't think that we
> can claim permanent discontinuation.  That is, if the server decides
> to cease service to a client, that doesn't mean that it won't be able
> to resume later.  This text implies a permanence that isn't
> appropriate.
>
> I'm happy to generate PRs for any or all of the above.
>
>
>
>
> --
>
> craig pratt
>
> Caspia Consulting
>
> craig@ecaspia.com
>
> 503.746.8008
>
>
>
>
>
Received on Thursday, 9 November 2017 21:12:33 UTC