- From: Thorsten Lohmar <thorsten.lohmar@ericsson.com>
- Date: Wed, 20 Apr 2016 19:31:14 +0000
- To: Craig Pratt <craig@ecaspia.com>, "K.Morgan@iaea.org" <K.Morgan@iaea.org>, "fielding@gbiv.com" <fielding@gbiv.com>
- CC: Göran Eriksson AP <goran.ap.eriksson@ericsson.com>, "bs7652@att.com" <bs7652@att.com>, "remy@lebeausoftware.org" <remy@lebeausoftware.org>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "rodger@plexapp.com" <rodger@plexapp.com>, "julian.reschke@gmx.de" <julian.reschke@gmx.de>, "C.Brunhuber@iaea.org" <C.Brunhuber@iaea.org>, Darshak Thakore <d.thakore@cablelabs.com>
Hi Craig,

> What *is* missing is the ability to get a continuous byte range on live content
> that starts at an arbitrary offset and the ability to directly jump to the live
> point.

Yes, and there are two solutions to the issue:

A. Enable it on the HTTP layer through the definition of a new range request mechanism.
B. Enable it with existing HTTP procedures, i.e. the client can work out the precise byte offsets.

BR,
/Thorsten

> -----Original Message-----
> From: Craig Pratt [mailto:craig@ecaspia.com]
> Sent: Wednesday, April 20, 2016 8:21 PM
> To: Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> Cc: Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore
> Subject: Re: Issue with "bytes" Range Unit and live streaming
>
> Hey Thorsten,
>
> I'm not clear about what you think is missing today from MP4/ISO-BMFF.
>
> An HTTP/1.1 MP4-aware client has everything it needs today to resolve time offsets to byte offsets and find the nearest random access point(s) - whether unfragmented, static-fragmented, or live-fragmented - and I have code to demonstrate this. I'm sure this must be a misunderstanding, but I don't see what here requires an ISO-BMFF extension.
>
> What *is* missing is the ability to get a continuous byte range on live content that starts at an arbitrary offset and the ability to directly jump to the live point.
>
> More thoughts in-line.
>
> cp
>
> On 4/20/16 8:14 AM, Thorsten Lohmar wrote:
> > Hi Craig,
> >
> > See inline.
> >
> > It might be worthwhile to bring up the issue in MPEG, e.g. as an ISO-BMFF extension or a DASH extension.
> >
> > BR,
> > /Thorsten
> >
> >> -----Original Message-----
> >> From: Craig Pratt [mailto:craig@ecaspia.com]
> >> Sent: Tuesday, April 19, 2016 11:54 PM
> >> To: Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> >> Cc: Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore
> >> Subject: Re: Issue with "bytes" Range Unit and live streaming
> >>
> >> Reply in-line.
> >>
> >> cp
> >>
> >> On 4/19/16 7:02 AM, Thorsten Lohmar wrote:
> >>> Hi Craig,
> >>>
> >>> Thanks for sharing the github link. That certainly clarifies the use-case even further.
> >>>
> >>> Maybe we should focus the discussion on the fMP4 format for some time, since tune-in to fMP4 requires random access to fragment boundaries. Compared to TS or MP3, fMP4 does not support synchronization to the stream from any received byte. The client must start processing an fMP4 stream from fragment boundaries (or box boundaries). Tuning to selected samples inside of the fMP4 file happens at a later stage.
> >>>
> >>> So, if I understand your proposal around the bytes-live request correctly, the client is NOT asking for the precise byte range, but for the next possible random access point in close proximity to the requested range. So, the client sends a request containing "Range: bytes-live=0-*" and gets a response with "Content-Range: bytes-live 123456-*": the server is providing the client with an HTTP resource starting, in the case of fMP4, at the fragment boundary (which is at byte offset 123456 of the resource).
> >>>
> >>> Do I understand it correctly then: when the client wants to tune in e.g. 5 min into the TSB, the client measures the bitrate of the media stream and calculates a rough byte offset (i.e. 5 min x estimated bitrate, let's say byte pos 654321) [all out of scope of the ID] and creates a bytes-live range request of the form "Range: bytes-live=654321-*". The server looks up a good fragment boundary (let's say 654420) and responds with "Content-Range: bytes-live 654420-*". Do I understand the proposal correctly?
> >>>
> >> TSBs and MP4s are a nasty combination. ISO BMFF containers accommodate amended content (via fragments), but they don't have facilities for front-end truncation. It's possible, just not easy. So this will go down the rabbit hole quickly.
> > [TL] Yes, it is possible to append to the end. But you cannot simply delete from the front when you implement a TSB. The question is whether this issue should be solved on the HTTP layer, since you likely also want to give an indication of the TSB to the user (a GUI representation of the time-shift buffer depth).
> [cp] Yeah - there always needs to be a moov box at the front. And while I believe Content-Range is sufficient to communicate front-end truncation, anything time-related is another matter. In MP4, this can be communicated in the timeline via an elst box (edit list). Anything else requires some kind of time-based Range Unit. And that's orthogonal to our draft - and live content in general.
>
> >> One would need to define the operation of the server for providing a time-shifted MP4 representation. Basically the idea is that the server would maintain a list of valid fragments and would have to maintain a valid movie box at the front of the representation as fragments were added and removed (if a client is expected to consume a spec-compliant ISO BMFF container).
> > [TL] Yes.
> >
> >> So I'd say you're mostly right, but let me paraphrase: When an MP4 client wants to come in at a precise time on an MP4/ISO BMFF (e.g. 5 minutes), it typically would:
> >>
> >> 1) get the movie box from the front of the representation (a couple of byte-Range requests);
> >> 2) access the movie fragment header boxes until the requested time offset is determined (multiple byte-Range requests);
> > [TL] Yes, but the Movie Box does not contain such information for a live offering. So, something is missing here.
> [cp] Whether the MP4 is being actively appended to or not, walking the fragments (moofs) will provide a client the time offsets. I have code demonstrating this, or you can do a protocol analysis of VLC to see it performing the bytes-Range requests. Are we getting tripped up on the definition of "live"? "Live" != "HLS Live Streaming", right?
>
> >> 3) If the time offset is found in the currently-available fragments, perform a Range request to get the fragment containing the target time and start rendering it (one or two "bytes" Range requests), and start fetching the next fragment if/when desired. e.g. If the nearest random access point for 5 minutes is at the start of a fragment with offset 4444444 with length 30001:
> >>
> >> Range: bytes=4444444-4474444
> > [TL] Well, almost. You must always start at the fragment boundary. When the fragment contains e.g. 4 sec of media data (e.g. with 1-sec GOPs), the client must start fetching from the beginning of the fragment and then skip the media data before the desired playtime start. The client can also do an open range request.
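Read literally, the tune-in exchange sketched above maps onto a few lines of client code. Below is a minimal, hypothetical sketch: it assumes the draft's bytes-live syntax and a server that snaps the start offset to a fragment boundary; `tune_in()`, the host, the path, and the bitrate estimate are all illustrative, not from the draft.

```python
# A sketch of the tune-in flow discussed above, assuming the hypothetical
# "bytes-live" Range Unit from the draft. All names here are illustrative.
import http.client
import re

def tune_in(host, path, target_seconds, est_bitrate_bps):
    # Rough byte offset: playtime x estimated bitrate (out of scope of the ID).
    rough_offset = (target_seconds * est_bitrate_bps) // 8

    conn = http.client.HTTPConnection(host)
    conn.request("GET", path, headers={"Range": f"bytes-live={rough_offset}-*"})
    resp = conn.getresponse()

    # The server snaps the start to a usable access point (an fMP4 fragment
    # boundary, say) and reports it back in Content-Range.
    m = re.match(r"bytes-live (\d+)-\*", resp.getheader("Content-Range", ""))
    start = int(m.group(1)) if m else rough_offset
    return resp, start  # resp streams from `start` onward, through the live point

# e.g. tune in 5 minutes into the TSB of a ~1.75 Mbit/s stream:
# resp, start = tune_in("example.com", "/live.mp4", 300, 1_750_000)
```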
> [cp] Agreed - which is why I said "the nearest random access point for 5 minutes is at the start of a fragment with offset 4444444". The actual samples/frames corresponding to the 5-minute mark would be at some point after 4444444.
>
> >> 4) If the time offset is not found (i.e. it falls after the currently-available fragments), jump to the live point by grabbing the last fragment and any fragments that are added on (one "bytes-live" range request). e.g. if the last fragment's offset is 5555555 with length 40001:
> >>
> >> Range: bytes-live=5555555-*
> >>
> >> The client would quickly render the last fragment (to prime the frame buffer), put the framebuffer on-screen, and then block on the socket (or wait for async notification) and render the new fragment(s) as they come in.
> > [TL] Ok, understood.
> [cp] It's important to remember that this is the feature we care about in the context of this draft. If you're good with this, then we're on the same page.
>
> >> * Note that one might use MP4 timelines here to communicate the fact that sample time 0 is not media time 0. But I'd rather not get too far into that.
> >>
> >> ** As with live content, TSBs are non-cacheable (but in-progress recordings *are* cacheable).
> > [TL] Since we are talking about a single HTTP transaction with progressive media data, the session is not cacheable anyhow.
> [cp] Well, in-progress recordings should still be byte-wise cacheable, correct? Byte 0 will always be byte 0, byte X will always be byte X. So a cache can be populated with the currently-accessible bytes and can satisfy Range requests. There might just be a cache miss if a client accesses bytes not yet cached. But a proxy should know that this is a possibility, since the Content-Range responses have a "*" in place of the content length.
>
> >>> If yes, should the solution be limited to live cases? If I understand it correctly, you are looking for a solution where the client indicates a rough byte range in the request and the server responds with another range which fulfills some condition. In the case of a live session with fMP4, the server looks for a random access point into the stream. The random access point must be a fragment boundary in the case of fMP4 and can be a PAT, PMT, or PES in the case of TS.
> >>>
> >> Technically, a regular bytes-Range request could have this "fencing" behavior (a client should be driven off the Content-Range response header). But I think this is somewhat disingenuous and might be inconsistent with RFC 7233 (I'd have to look closely). And I think this is another example of where a different Range Unit would make sense.
> >>
> >> e.g. The DLNA specifications defined a TimeSeekRange.dlna.org header that carries this contract (of providing the most-immediately-preceding "decoder-friendly position") in the returned media. Now, this was defined to work with HTTP/1.0, so defining a new Range Unit wasn't an option. But if it were, TimeSeekRange.dlna.org could easily be replaced with an "npt" Range Unit (normal play time). And what you're talking about would be similar, but with byte offsets. Both carry a similar assumption: the server has some knowledge of the content structure.
> > [TL] Yes, I also started thinking about time range requests, but I couldn't remember whether that was defined in DLNA, OIPF, or DVB. Are the time range headers widely used today? Can such a solution make sense for HTTP/1.1 or HTTP/2?
> [cp] TimeSeekRange is widely used in DLNA. But it's more useful for less structured content such as MPEG-2 program/transport streams. (For MP4 representations, one never sees DLNA clients using TimeSeekRange - but servers are required to support TSR for all media formats in CVP-2.)
>
> >> Aside: Both use cases could be covered with something like an "mso" (media-sensitive offset) Range Unit which could take a time offset or a byte offset (and incorporate live semantics).
> >>
> >> The bytes-live draft (or whatever we end up calling it) is intended to assume all knowledge is client-side - just as is the case with the bytes Range Unit. I think you're assuming some of this server-side structure knowledge that I was not intending. But "mso" would make another great Range Unit.
> > [TL] I still think that the client should have precise information about the TSB (i.e. its precise range and depth) in order to properly render the TSB representation in the GUI (media players often render the TSB in a progress bar together with the progress of the live session).
> [cp] I do think it could be useful to define something like TimeSeekRange.dlna.org as a Range Unit. This would be very useful for random access on MPEG2-contained content. If you want to start a draft, I'm happy to co-author. ;^J
>
> >> There are many other potential applications of something like bytes-live, I believe - which is why we're bringing this to the IETF. But really it's all about getting to the "live point" in a couple of different ways. And it assumes the *client* knows where the random access points are located (and tailors the byte offsets accordingly).
> > [TL] Maybe it is better to discuss this issue in MPEG, either as ISO-BMFF extensions or as DASH extensions.
> [cp] Perhaps. But the mechanism of how the transfer occurs - which is what this draft is related to - is definitely in the transport space. For some complex formats, it may be in the "necessary but not sufficient" category. But as I mentioned at the top, for (linear/progressive) MP4/ISO-BMFF, everything should be there.
>
> >>> Nothing inline.
> >>>
> >> Good - that was getting crazy... ;^J
> >>
> >>> BR,
> >>>
> >>> Thorsten
> >>>
> >>> *From:* Craig Pratt [mailto:craig@ecaspia.com]
> >>> *Sent:* Tuesday, April 19, 2016 11:55 AM
> >>> *To:* Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> >>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore
> >>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming
> >>>
> >>> Hey Thorsten,
> >>>
> >>> I'll try to reply in-line.
> >>>
> >>> cp
> >>>
> >>> On 4/18/16 3:50 PM, Thorsten Lohmar wrote:
> >>>
> >>> Hi Craig, all,
> >>>
> >>> Thanks for the clarification.
> >>> Some further questions inline.
> >>>
> >>> BR,
> >>>
> >>> Thorsten
> >>>
> >>> *From:* Craig Pratt [mailto:craig@ecaspia.com]
> >>> *Sent:* Monday, April 18, 2016 10:29 PM
> >>> *To:* Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> >>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore; STARK, BARBARA H
> >>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming
> >>>
> >>> [cc-ing the co-authors]
> >>>
> >>> Hi Thorsten,
> >>>
> >>> I'm happy to help provide whatever answers I can.
> >>>
> >>> Reply in-line.
> >>>
> >>> cp
> >>>
> >>> On 4/18/16 8:10 AM, Thorsten Lohmar wrote:
> >>>
> >>> Hi Craig, all,
> >>>
> >>> My colleague Göran asked me some questions around the problem and I would like to raise these questions directly to you. Of course, there are some alternative solutions available, where the client can work out the different things from a manifest. But you seem to be looking for a simple solution which works with non-segmented media on a single HTTP session.
> >>>
> >>> If I understood it correctly, an HTTP server is making a live stream available using HTTP. A normal live stream can be opened with a single HTTP request and the server can serve data "from the live point" either with or without HTTP chunked delivery. The server cannot give a Content-Length, since this is an ongoing live stream of unknown size.
> >>>
> >>> [cp] all correct.
> >>>
> >>> Your use-case seems to be about recording of content. The client should access content from the recorded part, but should be able to jump to the live point. I assume that you are not looking into sliding-window recordings (i.e. timeshift). I assume that a single program is continuously recorded and the HTTP object is growing until the end of the live session, correct?
> >>>
> >>> [cp] I didn't spell it out in the draft, but I would like to consider adding clarifications for the time-shift cases. This should just be a matter of a client requesting one thing and getting another. e.g. "Range: bytes-live=0-*" results in "Content-Range: bytes-live 123456-*". In either case, you're correct: the end of the representation is moving forward in time until the end of the live session.
> >>>
> >>> [TL] The "Range: bytes-live=0-*" case is not clear to me. Your ID says "All bytes currently in the representation and those appended to the end of the representation after the request is processed". I get the impression that the server is deleting all data before a certain timepoint (aka the behavior of a sliding-window timeshift). So, the client seems to request all data from the beginning of the timeshift buffer. Why does the server need to change the byte offset from 0 to 123456?
> >>>
> >>> I can understand that the server must signal "growing resource" in the response.
> >>>
> >>> [cp] I was trying to illustrate a case where the server had trimmed off bytes 0-123456 (the TSB model). So in this case, it's signalling to the client "you're getting bytes starting at 123456 (not 0)". e.g. If a client requests "Range: bytes-live=0-*" on an in-progress recording, one might expect:
> >>>
> >>> Content-Range: bytes-live 0-*
> >>>
> >>> [cp] Basically saying (as described in the ID) that all bytes currently in the representation and those appended to the end of the representation after the request is processed will be returned. But on a TSB, one might expect:
> >>>
> >>> Content-Range: bytes-live 123456-*
> >>>
> >>> [cp] Basically saying that bytes starting at byte 123456 in the representation and those appended to the end of the representation after the request is processed will be returned.
> >>>
> >>> [cp] While I'm thinking about TSB use cases in the back of my mind, this is really not the primary use case I was considering for the ID (but I would hope it can be covered).
> >>>
> >>> In any case, how does the client know "good" byte range offsets (i.e. service access points) to tune into the recording? Or is the assumption that the client can synchronize to the media stream from any byte range?
> >>>
> >>> [cp] For byte-level access, random access implementation is up to the client. For some containers this is easier than for others. e.g. For MP4, the random access points can be established by reading the movie and fragment header(s). For something like MP2, it's trickier of course.
> >>>
> >>> [TL] Well, in the case of fMP4, the client needs to get the Movie Header for initialization. Then, the proper access points are fragment boundaries. There are various ways to signal time-to-byte offsets.
> >>>
> >>> [cp] Fragments can actually have multiple access points - implicit (per sample) and explicit (random access points). But yeah, it seems common for fragments to have one random access point (often correlating to a GOP) - and there's a huge variety of ways to lay out the samples.
> >>>
> >>> [TL] In the case of TS, the client needs a PAT, PMT, and PES starts for tune-in. It is a bit more tricky, but there are solutions here too.
> >>>
> >>> [TL] But the email talks about "non-segmented" media. The draft talks about "mimicking segmented media". fMP4 is actually the way to create ISO-BMFF segments. So, it is for segmented media, but without a separate manifest?
> >>>
> >>> [cp] It's important to differentiate between *fragmented* and *segmented* MP4/ISO BMFF representations. bytes-live is most applicable to fragmented files - where you have one representation being used for the entire streaming session - with this representation being appended to periodically (usually one fragment at a time).
> >>>
> >>> [cp] I really need to revise my description in the draft to help avoid confusion. What I was trying to describe was how a solution using just byte-Range requests would always be slightly behind the live point - as is the case with rendering "live" segmented streams. While bytes-live could be used for fragmented (MP4/ISO BMFF) content or segmented content, the primary use case is for non-segmented representations.
> >>>
> >>> [cp] One major feature this draft allows is the retrieval of bytes just preceding the live point. So for example, a client can do a HEAD request with "Range: bytes=0-", get a "Content-Range: bytes 0-1234567/*", then perform something like a "Range: bytes-live=1200000-*", and prime its framebuffer with the 34567 bytes of data that precede the live point - allowing the client to find an access point (e.g. MPEG-2 start codes) and allowing the live presentation to display much sooner than it would from the live point (without random access).
> >>>
> >>> [TL] So, how does the client know that the proper fragment boundary is at byte position 1200000? Do you assume that the client first fetches a time-to-byte-offset file, which tells the client that an access point (e.g. a fragment boundary) is at byte pos 1200000? If yes, why does the client need the HEAD request, when it already has the byte position?
> >>>
> >>> [cp] How a client knows the amount to pre-fetch before the live point would depend upon the media format. For an MP4/ISO BMFF file, 1200000 could represent the random access point most immediately preceding the live point. It would be similar for an indexed MP2. And for unindexed MP2 representations, it's not uncommon for a client to prebuffer a fixed amount of content in the hopes of capturing a keyframe (really a heuristic).
> >>>
> >>> [cp] The HEAD request is necessary in this case to know where the live point is at the time the request is made, so the HTTP client would know whether it can jump into already-stored content or whether it should just acquire the live point.
> >>>
> >>> [cp] The important point is that all common video formats need a discontinuity-free number of bytes before the live point to provide a quality user experience.
> >>>
> >>> How should the client know which byte ranges are already available on the server? When the client is playing back from the recorded part and would like to skip 5 min forward, how does the client know whether a normal range request is needed or whether the client should ask for the live point? What type of HTTP status code should be provided when the requested range is not yet available on the server?
> >>>
> >>> [cp] We're not trying to come up with a universal solution for performing time-based seek on all media formats with this draft. So some of this is out of scope. But let me see if I can fill in some of the blanks.
> >>>
> >>> [TL] Ok, not everything needs to be in scope. But an essential assumption should be whether the client has a time-to-byte-offset table or whether the client can determine the fragment boundary positions precisely.
> >>>
> >>> [cp] Optimally, time-to-byte indexes would be used. But even without them, clients can often manage with heuristics. e.g. VLC can perform a reasonable job of providing time-seek on unindexed MP2 files.
> >>>
> >>> [cp] Some applications of media streaming have time-based indexing facilities built in. e.g. MP4 (ISO BMFF) containers allow time and data to be associated using the various internal, mandatory metadata "boxes". In other cases, applications may provide a separate resource that contains time-to-byte mappings (e.g. content index files). In either case, there's a facility for mapping time offsets to byte offsets - or sometimes the client incorporates heuristics to perform time skips (e.g. VLC will do this on some file formats).
> >>>
> >>> [TL] Yes. fMP4 supports this and MPEG DASH is leveraging this. But the live point is not described in the fragments. The client determines the live point from the manifest.
> >>>
> >>> [cp] Correct. In segmented content, the time-to-segment map tells you which representation to fetch (via GET). While I'd say that bytes-live can also improve segmented rendering (by reducing the latency of rendering), the primary focus of our draft is non-segmented representations.
> >>>
> >>> [cp] In all these cases, there's some mechanism that maps time offsets to byte offsets.
> >>>
> >>> [TL] Yes.
> >>>
> >>> [cp] When it comes to the available byte range, a client can know what data range is available by utilizing a HEAD request with a "Range: bytes=0-". The "Content-Range" response can contain something like "Content-Range: bytes 0-1234567/*", which tells the client both the current randomly accessible content range (via the "0-1234567") and that the content is of indeterminate length (via the "*").
> >>>
> >>> [TL] So, that is the existing Content-Range response, but with a '*' to indicate the unknown content-length, correct?
> >>>
> >>> [cp] Yeah, the "*" in place of the complete-length indicates an indeterminate-length response body.
> >>>
> >>> [cp] Putting this all together, a client would implement a 5-minute skip by:
> >>> (1) adding 5 minutes to the current play time,
> >>> (2) determining the byte offset for that given time using the appropriate index/heuristic (e.g. "3456789"),
> >>> (3) if the time is outside the index, jumping to the live point and updating the play time to the last-indexed time (or determining it by other means), e.g. using "Range: bytes-live=340000-*" to pre-buffer/pre-prime the frame/sample buffer,
> >>> (4) if the time is inside the index, performing a standard bytes Range request to retrieve an implementation-specific quantum of time or data (e.g. "Range: bytes=3456789-3556789") and rendering.
> >>>
> >>> [TL] In (2), how does the client determine the byte offset? fMP4 requires a precise byte offset. In the case of TS, the client can sync to the stream by first searching for 0x47 sync bytes. In (3), how does the client determine "outside of the index"? It seems that some sort of manifest is implicitly needed, which allows the client to understand the latest byte pos.
> >>>
> >>> [cp] (2) is media-format-specific. For MP4/ISO BMFF, it would use the built-in metadata; for MP2, it would either use an index file or a heuristic.
> >>>
> >>> [cp] For (3), if the current live point (in byte terms) is greater than the last byte offset in the index, then the live point is "outside the index". That is, the time the client is trying to access isn't randomly accessible, and the client should just jump to the live point.
> >>>
> >>> [cp] Again, some of this is out of scope, but I hope that clarifies a common use case.
> >>>
> >>> [TL] It would be good to clarify what information the client needs to get in order to do the operations. How the client gets the info can be left out of scope.
> >>>
> >>> [cp] ok - I hope I'm filling in more of the blanks...
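Putting cp's four steps into code: a sketch of the skip logic, assuming the client already holds a time-to-byte index of random access points (e.g. built by walking moof boxes) and a server that implements the draft's bytes-live unit. `skip_forward()`, its parameters, and the 100 KB quantum are hypothetical, chosen only for illustration.

```python
# A sketch of the 5-minute skip described above. All names are illustrative.
import bisect
import http.client

def skip_forward(host, path, index, play_time_s, skip_s=300):
    """index: sorted list of (time_seconds, byte_offset) access points."""
    conn = http.client.HTTPConnection(host)
    target = play_time_s + skip_s  # (1) add 5 minutes to the current play time

    if not index or target > index[-1][0]:
        # (3) Target is beyond the indexed range: jump to the live point,
        # pre-priming the buffer from the last known access point and
        # updating the play time to the last-indexed time.
        start = index[-1][1] if index else 0
        conn.request("GET", path, headers={"Range": f"bytes-live={start}-*"})
        return conn.getresponse(), index[-1][0] if index else target

    # (2) Map the target time to the nearest preceding indexed byte offset.
    i = max(bisect.bisect_right([t for t, _ in index], target) - 1, 0)
    start = index[i][1]

    # (4) Standard bytes Range request for an implementation-specific
    # quantum of data (100 KB here, arbitrarily), then render.
    conn.request("GET", path, headers={"Range": f"bytes={start}-{start + 100000}"})
    return conn.getresponse(), index[i][0]
```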
> >>>
> >>> [cp] Regarding the status code, RFC 7233 (section 4.4) indicates that code 416 (Range Not Satisfiable) must be returned when "none of the ranges in the request's Range header field overlap the current extent of the selected resource or the set of ranges requested has been rejected due to invalid ranges or an excessive request of small or overlapping ranges". This part of 4.4 applies to *all* Range requests - regardless of the Range Unit.
> >>>
> >>> [TL] ok.
> >>>
> >>> [cp] The bytes-live draft then goes on to say that "A bytes-live-range-specifier is considered unsatisfiable if the first-byte-pos is larger than the current length of the representation". This could probably be elaborated on a bit. But this is supposed to be the "hook" into the 4.4 language.
> >>>
> >>> Can you please clarify the questions?
> >>>
> >>> [cp] I hope I succeeded (at least partially). Apologies for the long response. I wanted to make sure I was answering your questions.
> >>>
> >>> [TL] It's getting a bit clearer, but I still don't understand the "mimic HLS or DASH" part. DASH/HLS focus on CDN optimization by creating a sequence of individual files. The client can work out the live-point URL from the manifest. Each segment is a "good" access point (in DASH always box boundaries, and in HLS always TS boundaries, even with PAT/PMT). So, the key issue here is to clarify how the client gets the byte offsets of the fragment boundaries for range requests.
> >>>
> >>> [cp] If it's still a bit unclear how this is performed, I can go into more detail. But like I say, I should really reword that section of the draft, since I think I've created some confusion. The point I was trying to make was that *polling* a non-segmented representation would - besides being inefficient - have the kind of multi-second latency that segmented live streaming has.
> >>>
> >>> [cp] But the difficulty of expressing this (secondary) benefit in bytes-live is probably not worth the trouble. I'll see if I can reword the draft to make it less confusing. I don't think this point is necessary to "sell" the concept of bytes-live (or a bytes-live-like feature).
> >>>
> >>> [cp] BTW, if you're really interested in the details of mapping time to offsets in an ISO BMFF container, have a look at odid_mp4_parser.vala:get_random_access_points() and get_random_access_point_for_time() at https://github.com/cablelabs/rygel/tree/cablelabs/master/src/media-engines/odid/. I can probably even get you instructions for printing RAPs for MP4 files using the test program.
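For readers who don't want to dig through the Vala source referenced above, here is a rough, hypothetical Python equivalent of the first step that kind of parser performs: scanning the flat sequence of top-level (size, type) boxes to locate moof (fragment) offsets. It is a sketch of the general technique, not the rygel/odid parser itself; refining offsets into times would additionally require reading each fragment's tfdt/trun boxes.

```python
# Minimal ISO BMFF top-level box scanner (sketch, not the rygel/odid code).
import struct

def top_level_boxes(f):
    """Yield (offset, type, size) for each top-level box of an ISO BMFF file."""
    offset = 0
    while True:
        header = f.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:  # 64-bit "largesize" follows the type field
            size = struct.unpack(">Q", f.read(8))[0]
        if size == 0:  # box extends to the end of the file
            return
        yield offset, box_type.decode("latin-1"), size
        offset += size
        f.seek(offset)  # skip the box body to the next box header

# Candidate tune-in points are the 'moof' offsets:
# with open("recording.mp4", "rb") as f:
#     frag_offsets = [off for off, typ, _ in top_level_boxes(f) if typ == "moof"]
```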
> >>> hth - cp
> >>>
> >>> BR,
> >>>
> >>> Thorsten
> >>>
> >>> *From:* Craig Pratt [mailto:craig@ecaspia.com]
> >>> *Sent:* Monday, April 18, 2016 11:04 AM
> >>> *To:* K.Morgan@iaea.org; fielding@gbiv.com
> >>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org
> >>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming
> >>>
> >>> On 4/18/16 12:34 AM, K.Morgan@iaea.org wrote:
> >>> On Friday, 15 April 2016 22:43, fielding@gbiv.com wrote:
> >>>
> >>> Oh, never mind, now I see that you are referring to the second number being fixed.
> >>>
> >>> I think I would prefer that be solved by allowing last-byte-pos to be empty, just like it is for the Range request. I think such a fix is just as likely to be interoperable as introducing a special range type (same failure cases).
> >>>
> >>> ....Roy
> >>>
> >>> +1000
> >>>
> >>> A very similar idea was proposed before [1] as an I-D [2] by Rodger Combs. We've also brought this up informally with other members of the WG.
> >>>
> >>> Alas, in our experience range requests don't seem to be a high priority :( For example, the problem of combining gzip with range requests is still unsolved [3].
> >>>
> >>> [1] https://lists.w3.org/Archives/Public/ietf-http-wg/2015AprJun/0122.html
> >>> [2] https://tools.ietf.org/html/draft-combs-http-indeterminate-range-01
> >>> [3] https://lists.w3.org/Archives/Public/ietf-http-wg/2014AprJun/1327.html
> >>>
> >>> [cp] Yeah, it's unfortunate that no solutions have moved forward for this widely-desired feature. I can only assume that people just started defining proprietary solutions - which is unfortunate. I'll try to be "persistent"... ;^J
> >>>
> >>> [cp] As was mentioned, the issue with just arbitrarily allowing an open-ended Content-Range response (omitting the last-byte-pos) is that there's no good way for a client to indicate that it can support reception of a Content-Range without a last-byte-pos. So I would fully expect many clients to fail in "unpredictable ways" (disconnecting, crashing, etc.).
> >>>
> >>> [cp] I see that the indeterminate-length proposal you referenced in your first citation introduces an "Accept-Indefinite-Ranges" header to prevent this issue. But I think this brings with it some other questions. e.g. Would this apply to any/all Range Units which may be introduced in the future? How can a client issue a request that starts at the "live point"? It feels like it has one hand tied behind its back.
> >>>
> >>> [cp] If I could, I would prefer to go back in time and advocate for an alternate ABNF for the bytes Range Unit. Seeing as that's not an option, I think using this well- and long-defined Range Unit extension mechanism seems like a good path forward, as it should not create interoperability issues between clients and servers.
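The interoperability argument above rests on RFC 7233's rule that an origin server must ignore a Range header field whose range unit it does not understand, responding 200 with the full representation rather than failing. A toy sketch of that server-side decision follows; the function and unit names are illustrative, not from any real framework.

```python
# Why a new Range Unit degrades gracefully: unknown units are ignored,
# so old servers and new clients still interoperate (sketch only).
def range_response(range_header, known_units=("bytes", "bytes-live")):
    """Decide how a server reacts to a Range header (hypothetical helper)."""
    unit = range_header.partition("=")[0].strip()
    if unit in known_units:
        return 206  # Partial Content: the unit is understood
    return 200      # Unknown range unit: ignore Range per RFC 7233

assert range_response("bytes=0-499") == 206
assert range_response("bytes-live=5555555-*") == 206
assert range_response("npt=0:05:00-") == 200  # a unit this server lacks
```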
> >>> [cp] And I would hope adding a Range Unit would have a low/lower bar for acceptance. e.g. If a Range Unit fills a useful role, is well-defined, and isn't redundant, it seems reasonable that it should be accepted, as it shouldn't impact existing HTTP/1.1 semantics. In fact, the gzip case (referenced in your third citation) seems like a perfect application of the Range Unit mechanism (better than bytes-live). If there's interest, I'll write up an RFC to demonstrate...
>
> --
>
> craig pratt
>
> Caspia Consulting
>
> craig@ecaspia.com
>
> 503.746.8008
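As a coda, cp's gzip idea can be sketched, purely hypothetically: no such range unit exists, and the unit name "gzipped" and its syntax are invented here solely to illustrate the shape such an RFC might take. The point is that ranges would address the compressed byte stream, which stays stable across requests.

```python
# Hypothetical "gzip as a Range Unit" sketch; nothing here is standardized.
import gzip

body = gzip.compress(b"x" * 100_000)  # the gzip-coded representation

def serve_gzipped_range(first, last):
    chunk = body[first:last + 1]
    headers = {
        "Content-Range": f"gzipped {first}-{last}/{len(body)}",
        "Content-Encoding": "gzip",  # the coding applies to the whole stream
    }
    return 206, headers, chunk

status, headers, chunk = serve_gzipped_range(0, 499)
assert status == 206 and len(chunk) == 500
```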
Received on Wednesday, 20 April 2016 19:31:49 UTC