- From: Thorsten Lohmar <thorsten.lohmar@ericsson.com>
- Date: Wed, 20 Apr 2016 19:31:14 +0000
- To: Craig Pratt <craig@ecaspia.com>, "K.Morgan@iaea.org" <K.Morgan@iaea.org>, "fielding@gbiv.com" <fielding@gbiv.com>
- CC: Göran Eriksson AP <goran.ap.eriksson@ericsson.com>, "bs7652@att.com" <bs7652@att.com>, "remy@lebeausoftware.org" <remy@lebeausoftware.org>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "rodger@plexapp.com" <rodger@plexapp.com>, "julian.reschke@gmx.de" <julian.reschke@gmx.de>, "C.Brunhuber@iaea.org" <C.Brunhuber@iaea.org>, Darshak Thakore <d.thakore@cablelabs.com>
Hi Craig,

> What *is* missing is the ability to get a continuous byte range on live content
> that starts at an arbitrary offset and the ability to directly jump to the live
> point.

Yes, and there are two solutions to the issue:

A. Enable it on the HTTP layer through the definition of a new range request mechanism.
B. Enable it with existing HTTP procedures, i.e. the client can work out the precise byte offsets.

BR,
/Thorsten

> -----Original Message-----
> From: Craig Pratt [mailto:craig@ecaspia.com]
> Sent: Wednesday, April 20, 2016 8:21 PM
> To: Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> Cc: Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore
> Subject: Re: Issue with "bytes" Range Unit and live streaming
>
> Hey Thorsten,
>
> I'm not clear about what you think is missing today from MP4/ISO-BMFF.
>
> An HTTP/1.1 MP4-aware client has everything it needs today to resolve time offsets to byte offsets and find the nearest random access point(s) - whether unfragmented, static-fragmented, or live-fragmented - and I have code to demonstrate this. I'm sure this must be a misunderstanding, but I don't see what here requires an ISO-BMFF extension.
>
> What *is* missing is the ability to get a continuous byte range on live content that starts at an arbitrary offset and the ability to directly jump to the live point.
>
> More thoughts in-line.
>
> cp
>
> On 4/20/16 8:14 AM, Thorsten Lohmar wrote:
> > Hi Craig,
> >
> > See inline.
> >
> > It might be worthwhile to bring up the issue in MPEG, e.g. as an ISO-BMFF extension or a DASH extension.
> >
> > BR,
> > /Thorsten
> >
> >> -----Original Message-----
> >> From: Craig Pratt [mailto:craig@ecaspia.com]
> >> Sent: Tuesday, April 19, 2016 11:54 PM
> >> To: Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> >> Cc: Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore
> >> Subject: Re: Issue with "bytes" Range Unit and live streaming
> >>
> >> Reply in-line.
> >>
> >> cp
> >>
> >> On 4/19/16 7:02 AM, Thorsten Lohmar wrote:
> >>> Hi Craig,
> >>>
> >>> Thanks for sharing the github link. That certainly clarifies the use-case even further.
> >>>
> >>> Maybe we should focus the discussion on the fMP4 format for some time, since tune-in to fMP4 requires random access to fragment boundaries. Compared to TS or MP3, fMP4 does not support synchronization to the stream from any received byte. The client must start processing an fMP4 stream from fragment boundaries (or box boundaries). Tuning to selected samples inside of the fMP4 file happens at a later stage.
> >>>
> >>> So, if I understand your proposal around the bytes-live request correctly, the client is NOT asking for the precise byte range, but for the next possible random access point in close proximity to the requested range. So, the client sends a request containing "Range: bytes-live=0-*" and gets a response with "Content-Range: bytes-live 123456-*": the server is providing the client with an HTTP resource starting, in the case of fMP4, at the fragment boundary (which is at byte offset 123456 of the resource).
> >>>
> >>> Do I understand it correctly then: when the client wants to tune in e.g. 5 min into the TSB, the client measures the bitrate of the media stream and calculates a rough byte offset (i.e. 5 min x estimated bitrate, let's say byte pos 654321) [all out of scope of the ID] and creates a bytes-live range request of the form "Range: bytes-live=654321-*". The server looks up a good fragment boundary (let's say 654420) and responds with "Content-Range: bytes-live 654420-*". Do I understand the proposal correctly?
> >>>
> >> TSBs and MP4s are a nasty combination. ISO BMFF containers accommodate amended content (via fragments), but they don't have facilities for front-end truncation. It's possible, just not easy. So this will go down the rabbit hole quickly.
> > [TL] Yes, it is possible to append to the end. But you cannot simply delete from the front when you implement a TSB. The question is whether this issue should be solved on the HTTP layer, since you likely also want to give an indication of the TSB to the user (a GUI representation of the time-shift buffer depth).
> [cp] Yeah - there always needs to be a moov box at the front. And while I believe Content-Range is sufficient to communicate front-end truncation, anything time-related is another matter. In MP4, this can be communicated in the timeline via an elst box (edit list). Anything else requires some kind of time-based Range Unit. And that's orthogonal to our draft - and live content in general.
>
> >> One would need to define the operation of the server for providing a time-shifted MP4 representation. Basically the idea is that the server would maintain a list of valid fragments and would have to maintain a valid movie box at the front of the representation as fragments were added and removed (if a client is expected to consume a spec-compliant ISO BMFF container).
> > [TL] Yes.
> >
> >> So I'd say you're mostly right, but let me paraphrase: When an MP4 client wants to come in at a precise time on an MP4/ISO BMFF (e.g. 5 minutes), it typically would:
> >>
> >> 1) get the movie box from the front of the representation (a couple of byte-Range requests);
> >> 2) access the movie fragment header boxes until the requested time offset is determined (multiple byte-Range requests);
> > [TL] Yes, but the Movie Box does not contain such information for a live offering. So, something is missing here.
> [cp] Whether the MP4 is being actively appended to or not, walking the fragments (moofs) will provide a client the time offsets. I have code demonstrating this, or you can do a protocol analysis of VLC to see it performing the bytes-Range requests. Are we getting tripped up on the definition of "live"? "Live" != "HLS Live Streaming", right?
>
> >> 3) If the time offset is found in the currently-available fragments, perform a Range request to get the fragment containing the target time and start rendering it (one or two "bytes" Range requests), and start fetching the next fragment if/when desired. e.g. If the nearest random access point for 5 minutes is at the start of a fragment with offset 4444444 with length 30001:
> >>
> >> Range: bytes=4444444-4474444
> > [TL] Well, almost. You must always start at the fragment boundary. When the fragment contains e.g. 4 sec of media data (e.g. with 1-sec GOPs), the client must start fetching from the beginning of the fragment and then skip the media data before the desired playtime start. The client can also do an open range request.
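Read literally, the tune-in exchange sketched above maps onto a few lines of client code. Below is a minimal, hypothetical sketch: it assumes the draft's bytes-live syntax and a server that snaps the start offset to a fragment boundary; `tune_in()`, the host, the path, and the bitrate estimate are all illustrative, not from the draft.

```python
# A sketch of the tune-in flow discussed above, assuming the hypothetical
# "bytes-live" Range Unit from the draft. All names here are illustrative.
import http.client
import re

def tune_in(host, path, target_seconds, est_bitrate_bps):
    # Rough byte offset: playtime x estimated bitrate (out of scope of the ID).
    rough_offset = (target_seconds * est_bitrate_bps) // 8

    conn = http.client.HTTPConnection(host)
    conn.request("GET", path, headers={"Range": f"bytes-live={rough_offset}-*"})
    resp = conn.getresponse()

    # The server snaps the start to a usable access point (an fMP4 fragment
    # boundary, say) and reports it back in Content-Range.
    m = re.match(r"bytes-live (\d+)-\*", resp.getheader("Content-Range", ""))
    start = int(m.group(1)) if m else rough_offset
    return resp, start  # resp streams from `start` onward, through the live point

# e.g. tune in 5 minutes into the TSB of a ~1.75 Mbit/s stream:
# resp, start = tune_in("example.com", "/live.mp4", 300, 1_750_000)
```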
> [cp] Agreed - which is why I said "the nearest random access point for 5 minutes is at the start of a fragment with offset 4444444". The actual samples/frames corresponding to the 5-minute mark would be at some point after 4444444.
>
> >> 4) If the time offset is not found (i.e. it falls after the currently-available fragments), jump to the live point by grabbing the last fragment and any fragments that are added on (one "bytes-live" range request). e.g. if the last fragment's offset is 5555555 with length 40001:
> >>
> >> Range: bytes-live=5555555-*
> >>
> >> The client would quickly render the last fragment (to prime the frame buffer), put the framebuffer on-screen, and then block on the socket (or wait for async notification) and render the new fragment(s) as they come in.
> > [TL] Ok, understood.
> [cp] It's important to remember that this is the feature we care about in the context of this draft. If you're good with this, then we're on the same page.
>
> >> * Note that one might use MP4 timelines here to communicate the fact that sample time 0 is not media time 0. But I'd rather not get too far into that.
> >>
> >> ** As with live content, TSBs are non-cacheable (but in-progress recordings *are* cacheable).
> > [TL] Since we are talking about a single HTTP transaction with progressive media data, the session is not cacheable anyhow.
> [cp] Well, in-progress recordings should still be byte-wise cacheable, correct? Byte 0 will always be byte 0, byte X will always be byte X. So a cache can be populated with the currently-accessible bytes and can satisfy Range requests. There might just be a cache miss if a client accesses bytes not yet cached. But a proxy should know that this is a possibility, since the Content-Range responses have a "*" in place of the content length.
>
> >>> If yes, should the solution be limited to live cases? If I understand it correctly, you are looking for a solution where the client indicates a rough byte range in the request and the server responds with another range which fulfills some condition. In the case of a live session with fMP4, the server looks for a random access point into the stream. The random access point must be a fragment boundary in the case of fMP4 and can be a PAT, PMT, or PES in the case of TS.
> >>>
> >> Technically, a regular bytes-Range request could have this "fencing" behavior (a client should be driven off the Content-Range response header). But I think this is somewhat disingenuous and might be inconsistent with RFC 7233 (I'd have to look closely). And I think this is another example of where a different Range Unit would make sense.
> >>
> >> e.g. The DLNA specifications defined a TimeSeekRange.dlna.org header that carries this contract (of providing the most-immediately-preceding "decoder-friendly position") in the returned media. Now, this was defined to work with HTTP/1.0, so defining a new Range Unit wasn't an option. But if it were, TimeSeekRange.dlna.org could easily be replaced with an "npt" Range Unit (normal play time). And what you're talking about would be similar, but with byte offsets. Both carry a similar assumption: the server has some knowledge of the content structure.
> > [TL] Yes, I also started thinking about time range requests, but I couldn't remember whether that was defined in DLNA, OIPF, or DVB. Are the time range headers widely used today? Can such a solution make sense for HTTP/1.1 or HTTP/2?
> [cp] TimeSeekRange is widely used in DLNA. But it's more useful for less structured content such as MPEG-2 program/transport streams. (For MP4 representations, one never sees DLNA clients using TimeSeekRange - but servers are required to support TSR for all media formats in CVP-2.)
>
> >> Aside: Both use cases could be covered with something like an "mso" (media-sensitive offset) Range Unit which could take a time offset or a byte offset (and incorporate live semantics).
> >>
> >> The bytes-live draft (or whatever we end up calling it) is intended to assume all knowledge is client-side - just as is the case with the bytes Range Unit. I think you're assuming some of this server-side structure knowledge that I was not intending. But "mso" would make another great Range Unit.
> > [TL] I still think that the client should have precise information about the TSB (i.e. its precise range and depth) in order to properly render the TSB representation in the GUI (media players often render the TSB in a progress bar together with the progress of the live session).
> [cp] I do think it could be useful to define something like TimeSeekRange.dlna.org as a Range Unit. This would be very useful for random access on MPEG2-contained content. If you want to start a draft, I'm happy to co-author. ;^J
>
> >> There are many other potential applications of something like bytes-live, I believe - which is why we're bringing this to the IETF. But really it's all about getting to the "live point" in a couple of different ways. And it assumes the *client* knows where the random access points are located (and tailors the byte offsets accordingly).
> > [TL] Maybe it is better to discuss this issue in MPEG, either as ISO-BMFF extensions or as DASH extensions.
> [cp] Perhaps. But the mechanism of how the transfer occurs - which is what this draft is related to - is definitely in the transport space. For some complex formats, it may be in the "necessary but not sufficient" category. But as I mentioned at the top, for (linear/progressive) MP4/ISO-BMFF, everything should be there.
>
> >>> Nothing inline.
> >>>
> >> Good - that was getting crazy... ;^J
> >>
> >>> BR,
> >>>
> >>> Thorsten
> >>>
> >>> *From:* Craig Pratt [mailto:craig@ecaspia.com]
> >>> *Sent:* Tuesday, April 19, 2016 11:55 AM
> >>> *To:* Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> >>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore
> >>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming
> >>>
> >>> Hey Thorsten,
> >>>
> >>> I'll try to reply in-line.
> >>>
> >>> cp
> >>>
> >>> On 4/18/16 3:50 PM, Thorsten Lohmar wrote:
> >>>
> >>> Hi Craig, all,
> >>>
> >>> Thanks for the clarification.
> >>> Some further questions inline.
> >>>
> >>> BR,
> >>>
> >>> Thorsten
> >>>
> >>> *From:* Craig Pratt [mailto:craig@ecaspia.com]
> >>> *Sent:* Monday, April 18, 2016 10:29 PM
> >>> *To:* Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com
> >>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org; Darshak Thakore; STARK, BARBARA H
> >>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming
> >>>
> >>> [cc-ing the co-authors]
> >>>
> >>> Hi Thorsten,
> >>>
> >>> I'm happy to help provide whatever answers I can.
> >>>
> >>> Reply in-line.
> >>>
> >>> cp
> >>>
> >>> On 4/18/16 8:10 AM, Thorsten Lohmar wrote:
> >>>
> >>> Hi Craig, all,
> >>>
> >>> My colleague Göran asked me some questions around the problem and I would like to raise these questions directly to you. Of course, there are some alternative solutions available, where the client can work out the different things from a manifest. But you seem to be looking for a simple solution which works with non-segmented media on a single HTTP session.
> >>>
> >>> If I understood it correctly, an HTTP server is making a live stream available using HTTP. A normal live stream can be opened with a single HTTP request and the server can serve data "from the live point" either with or without HTTP chunked delivery. The server cannot give a Content-Length, since this is an ongoing live stream of unknown size.
> >>>
> >>> [cp] all correct.
> >>>
> >>> Your use-case seems to be about recording of content. The client should access content from the recorded part, but should be able to jump to the live point. I assume that you are not looking into sliding-window recordings (i.e. timeshift). I assume that a single program is continuously recorded and the HTTP object is growing until the end of the live session, correct?
> >>>
> >>> [cp] I didn't spell it out in the draft, but I would like to consider adding clarifications for the time-shift cases. This should just be a matter of a client requesting one thing and getting another. e.g. "Range: bytes-live=0-*" results in "Content-Range: bytes-live 123456-*". In either case, you're correct: the end of the representation is moving forward in time until the end of the live session.
> >>>
> >>> [TL] The "Range: bytes-live=0-*" case is not clear to me. Your ID says "All bytes currently in the representation and those appended to the end of the representation after the request is processed". I get the impression that the server is deleting all data before a certain timepoint (aka the behavior of a sliding-window timeshift). So, the client seems to request all data from the beginning of the timeshift buffer. Why does the server need to change the byte offset from 0 to 123456?
> >>>
> >>> I can understand that the server must signal "growing resource" in the response.
> >>>
> >>> [cp] I was trying to illustrate a case where the server had trimmed off bytes 0-123456 (the TSB model). So in this case, it's signalling to the client "you're getting bytes starting at 123456 (not 0)". e.g. If a client requests "Range: bytes-live=0-*" on an in-progress recording, one might expect:
> >>>
> >>> Content-Range: bytes-live 0-*
> >>>
> >>> [cp] Basically saying (as described in the ID) that all bytes currently in the representation and those appended to the end of the representation after the request is processed will be returned. But on a TSB, one might expect:
> >>>
> >>> Content-Range: bytes-live 123456-*
> >>>
> >>> [cp] Basically saying that bytes starting at byte 123456 in the representation and those appended to the end of the representation after the request is processed will be returned.
> >>>
> >>> [cp] While I'm thinking about TSB use cases in the back of my mind, this is really not the primary use case I was considering for the ID (but I would hope it can be covered).
> >>>
> >>> In any case, how does the client know "good" byte range offsets (i.e. service access points) to tune into the recording? Or is the assumption that the client can synchronize to the media stream from any byte range?
> >>>
> >>> [cp] For byte-level access, random access implementation is up to the client. For some containers this is easier than for others. e.g. For MP4, the random access points can be established by reading the movie and fragment header(s). For something like MP2, it's trickier of course.
> >>>
> >>> [TL] Well, in the case of fMP4, the client needs to get the Movie Header for initialization. Then, the proper access points are fragment boundaries. There are various ways to signal time-to-byte offsets.
> >>>
> >>> [cp] Fragments can actually have multiple access points - implicit (per sample) and explicit (random access points). But yeah, it seems common for fragments to have one random access point (often correlating to a GOP) - and there's a huge variety of ways to lay out the samples.
> >>>
> >>> [TL] In the case of TS, the client needs a PAT, PMT, and PES starts for tune-in. It is a bit more tricky, but there are solutions here too.
> >>>
> >>> [TL] But the email talks about "non-segmented" media. The draft talks about "mimicking segmented media". fMP4 is actually the way to create ISO-BMFF segments. So, it is for segmented media, but without a separate manifest?
> >>>
> >>> [cp] It's important to differentiate between *fragmented* and *segmented* MP4/ISO BMFF representations. bytes-live is most applicable to fragmented files - where you have one representation being used for the entire streaming session - with this representation being appended to periodically (usually one fragment at a time).
> >>>
> >>> [cp] I really need to revise my description in the draft to help avoid confusion. What I was trying to describe was how a solution using just byte-Range requests would always be slightly behind the live point - as is the case with rendering "live" segmented streams. While bytes-live could be used for fragmented (MP4/ISO BMFF) content or segmented content, the primary use case is for non-segmented representations.
> >>>
> >>> [cp] One major feature this draft allows is the retrieval of bytes just preceding the live point. So for example, a client can do a HEAD request with "Range: bytes=0-", get a "Content-Range: bytes 0-1234567/*", then perform something like a "Range: bytes-live=1200000-*", and prime its framebuffer with the 34567 bytes of data that precede the live point - allowing the client to find an access point (e.g. MPEG-2 start codes) and allowing the live presentation to display much sooner than it would from the live point (without random access).
> >>>
> >>> [TL] So, how does the client know that the proper fragment boundary is at byte position 1200000? Do you assume that the client first fetches a time-to-byte-offset file, which tells the client that an access point (e.g. a fragment boundary) is at byte pos 1200000? If yes, why does the client need the HEAD request, when it already has the byte position?
> >>>
> >>> [cp] How a client knows the amount to pre-fetch before the live point would depend upon the media format. For an MP4/ISO BMFF file, 1200000 could represent the random access point most immediately preceding the live point. It would be similar for an indexed MP2. And for unindexed MP2 representations, it's not uncommon for a client to prebuffer a fixed amount of content in the hopes of capturing a keyframe (really a heuristic).
> >>>
> >>> [cp] The HEAD request is necessary in this case to know where the live point is at the time the request is made, so the HTTP client would know whether it can jump into already-stored content or whether it should just acquire the live point.
> >>>
> >>> [cp] The important point is that all common video formats need a discontinuity-free number of bytes before the live point to provide a quality user experience.
> >>>
> >>> How should the client know which byte ranges are already available on the server? When the client is playing back from the recorded part and would like to skip 5 min forward, how does the client know whether a normal range request is needed or whether the client should ask for the live point? What type of HTTP status code should be provided when the requested range is not yet available on the server?
> >>>
> >>> [cp] We're not trying to come up with a universal solution for performing time-based seek on all media formats with this draft. So some of this is out of scope. But let me see if I can fill in some of the blanks.
> >>>
> >>> [TL] Ok, not everything needs to be in scope. But an essential assumption should be whether the client has a time-to-byte-offset table or whether the client can determine the fragment boundary positions precisely.
> >>>
> >>> [cp] Optimally, time-to-byte indexes would be used. But even without them, clients can often manage with heuristics. e.g. VLC can perform a reasonable job of providing time-seek on unindexed MP2 files.
> >>>
> >>> [cp] Some applications of media streaming have time-based indexing facilities built in. e.g. MP4 (ISO BMFF) containers allow time and data to be associated using the various internal, mandatory metadata "boxes". In other cases, applications may provide a separate resource that contains time-to-byte mappings (e.g. content index files). In either case, there's a facility for mapping time offsets to byte offsets - or sometimes the client incorporates heuristics to perform time skips (e.g. VLC will do this on some file formats).
> >>>
> >>> [TL] Yes. fMP4 supports this and MPEG DASH is leveraging this. But the live point is not described in the fragments. The client determines the live point from the manifest.
> >>>
> >>> [cp] Correct. In segmented content, the time-to-segment map tells you which representation to fetch (via GET). While I'd say that bytes-live can also improve segmented rendering (by reducing the latency of rendering), the primary focus of our draft is non-segmented representations.
> >>>
> >>> [cp] In all these cases, there's some mechanism that maps time offsets to byte offsets.
> >>>
> >>> [TL] Yes.
> >>>
> >>> [cp] When it comes to the available byte range, a client can know what data range is available by utilizing a HEAD request with a "Range: bytes=0-". The "Content-Range" response can contain something like "Content-Range: bytes 0-1234567/*", which tells the client both the current randomly accessible content range (via the "0-1234567") and that the content is of indeterminate length (via the "*").
> >>>
> >>> [TL] So, that is the existing Content-Range response, but with a '*' to indicate the unknown content-length, correct?
> >>>
> >>> [cp] Yeah, the "*" in place of the complete-length indicates an indeterminate-length response body.
> >>>
> >>> [cp] Putting this all together, a client would implement a 5-minute skip by:
> >>> (1) adding 5 minutes to the current play time,
> >>> (2) determining the byte offset for that given time using the appropriate index/heuristic (e.g. "3456789"),
> >>> (3) if the time is outside the index, jumping to the live point and updating the play time to the last-indexed time (or determining it by other means), e.g. using "Range: bytes-live=340000-*" to pre-buffer/pre-prime the frame/sample buffer,
> >>> (4) if the time is inside the index, performing a standard bytes Range request to retrieve an implementation-specific quantum of time or data (e.g. "Range: bytes=3456789-3556789") and rendering.
> >>>
> >>> [TL] In (2), how does the client determine the byte offset? fMP4 requires a precise byte offset. In the case of TS, the client can sync to the stream by first searching for 0x47 sync bytes. In (3), how does the client determine "outside of the index"? It seems that some sort of manifest is implicitly needed, which allows the client to understand the latest byte pos.
> >>>
> >>> [cp] (2) is media-format-specific. For MP4/ISO BMFF, it would use the built-in metadata; for MP2, it would either use an index file or a heuristic.
> >>>
> >>> [cp] For (3), if the current live point (in byte terms) is greater than the last byte offset in the index, then the live point is "outside the index". That is, the time the client is trying to access isn't randomly accessible, and the client should just jump to the live point.
> >>>
> >>> [cp] Again, some of this is out of scope, but I hope that clarifies a common use case.
> >>>
> >>> [TL] It would be good to clarify what information the client needs to get in order to do the operations. How the client gets the info can be left out of scope.
> >>>
> >>> [cp] ok - I hope I'm filling in more of the blanks...
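Putting cp's four steps into code: a sketch of the skip logic, assuming the client already holds a time-to-byte index of random access points (e.g. built by walking moof boxes) and a server that implements the draft's bytes-live unit. `skip_forward()`, its parameters, and the 100 KB quantum are hypothetical, chosen only for illustration.

```python
# A sketch of the 5-minute skip described above. All names are illustrative.
import bisect
import http.client

def skip_forward(host, path, index, play_time_s, skip_s=300):
    """index: sorted list of (time_seconds, byte_offset) access points."""
    conn = http.client.HTTPConnection(host)
    target = play_time_s + skip_s  # (1) add 5 minutes to the current play time

    if not index or target > index[-1][0]:
        # (3) Target is beyond the indexed range: jump to the live point,
        # pre-priming the buffer from the last known access point and
        # updating the play time to the last-indexed time.
        start = index[-1][1] if index else 0
        conn.request("GET", path, headers={"Range": f"bytes-live={start}-*"})
        return conn.getresponse(), index[-1][0] if index else target

    # (2) Map the target time to the nearest preceding indexed byte offset.
    i = max(bisect.bisect_right([t for t, _ in index], target) - 1, 0)
    start = index[i][1]

    # (4) Standard bytes Range request for an implementation-specific
    # quantum of data (100 KB here, arbitrarily), then render.
    conn.request("GET", path, headers={"Range": f"bytes={start}-{start + 100000}"})
    return conn.getresponse(), index[i][0]
```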
> >>>
> >>> [cp] Regarding the status code, RFC 7233 (section 4.4) indicates that code 416 (Range Not Satisfiable) must be returned when "none of the ranges in the request's Range header field overlap the current extent of the selected resource or the set of ranges requested has been rejected due to invalid ranges or an excessive request of small or overlapping ranges". This part of 4.4 applies to *all* Range requests - regardless of the Range Unit.
> >>>
> >>> [TL] ok.
> >>>
> >>> [cp] The bytes-live draft then goes on to say that "A bytes-live-range-specifier is considered unsatisfiable if the first-byte-pos is larger than the current length of the representation". This could probably be elaborated on a bit. But this is supposed to be the "hook" into the 4.4 language.
> >>>
> >>> Can you please clarify the questions?
> >>>
> >>> [cp] I hope I succeeded (at least partially). Apologies for the long response. I wanted to make sure I was answering your questions.
> >>>
> >>> [TL] It's getting a bit clearer, but I still don't understand the "mimic HLS or DASH" part. DASH/HLS focus on CDN optimization by creating a sequence of individual files. The client can work out the live-point URL from the manifest. Each segment is a "good" access point (in DASH always box boundaries, and in HLS always TS boundaries, even with PAT/PMT). So, the key issue here is to clarify how the client gets the byte offsets of the fragment boundaries for range requests.
> >>>
> >>> [cp] If it's still a bit unclear how this is performed, I can go into more detail. But like I say, I should really reword that section of the draft, since I think I've created some confusion. The point I was trying to make was that *polling* a non-segmented representation would - besides being inefficient - have the kind of multi-second latency that segmented live streaming has.
> >>>
> >>> [cp] But the difficulty of expressing this (secondary) benefit in bytes-live is probably not worth the trouble. I'll see if I can reword the draft to make it less confusing. I don't think this point is necessary to "sell" the concept of bytes-live (or a bytes-live-like feature).
> >>>
> >>> [cp] BTW, if you're really interested in the details of mapping time to offsets in an ISO BMFF container, have a look at odid_mp4_parser.vala:get_random_access_points() and get_random_access_point_for_time() at https://github.com/cablelabs/rygel/tree/cablelabs/master/src/media-engines/odid/. I can probably even get you instructions for printing RAPs for MP4 files using the test program.
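For readers who don't want to dig through the Vala source referenced above, here is a rough, hypothetical Python equivalent of the first step that kind of parser performs: scanning the flat sequence of top-level (size, type) boxes to locate moof (fragment) offsets. It is a sketch of the general technique, not the rygel/odid parser itself; refining offsets into times would additionally require reading each fragment's tfdt/trun boxes.

```python
# Minimal ISO BMFF top-level box scanner (sketch, not the rygel/odid code).
import struct

def top_level_boxes(f):
    """Yield (offset, type, size) for each top-level box of an ISO BMFF file."""
    offset = 0
    while True:
        header = f.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:  # 64-bit "largesize" follows the type field
            size = struct.unpack(">Q", f.read(8))[0]
        if size == 0:  # box extends to the end of the file
            return
        yield offset, box_type.decode("latin-1"), size
        offset += size
        f.seek(offset)  # skip the box body to the next box header

# Candidate tune-in points are the 'moof' offsets:
# with open("recording.mp4", "rb") as f:
#     frag_offsets = [off for off, typ, _ in top_level_boxes(f) if typ == "moof"]
```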
> >>> hth - cp
> >>>
> >>> BR,
> >>>
> >>> Thorsten
> >>>
> >>> *From:* Craig Pratt [mailto:craig@ecaspia.com]
> >>> *Sent:* Monday, April 18, 2016 11:04 AM
> >>> *To:* K.Morgan@iaea.org; fielding@gbiv.com
> >>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; C.Brunhuber@iaea.org
> >>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming
> >>>
> >>> On 4/18/16 12:34 AM, K.Morgan@iaea.org wrote:
> >>> On Friday, 15 April 2016 22:43, fielding@gbiv.com wrote:
> >>>
> >>> Oh, never mind, now I see that you are referring to the second number being fixed.
> >>>
> >>> I think I would prefer that be solved by allowing last-byte-pos to be empty, just like it is for the Range request. I think such a fix is just as likely to be interoperable as introducing a special range type (same failure cases).
> >>>
> >>> ....Roy
> >>>
> >>> +1000
> >>>
> >>> A very similar idea was proposed before [1] as an I-D [2] by Rodger Combs. We've also brought this up informally with other members of the WG.
> >>>
> >>> Alas, in our experience range requests don't seem to be a high priority :( For example, the problem of combining gzip with range requests is still unsolved [3].
> >>>
> >>> [1] https://lists.w3.org/Archives/Public/ietf-http-wg/2015AprJun/0122.html
> >>> [2] https://tools.ietf.org/html/draft-combs-http-indeterminate-range-01
> >>> [3] https://lists.w3.org/Archives/Public/ietf-http-wg/2014AprJun/1327.html
> >>>
> >>> [cp] Yeah, it's unfortunate that no solutions have moved forward for this widely-desired feature. I can only assume that people just started defining proprietary solutions - which is unfortunate. I'll try to be "persistent"... ;^J
> >>>
> >>> [cp] As was mentioned, the issue with just arbitrarily allowing an open-ended Content-Range response (omitting the last-byte-pos) is that there's no good way for a client to indicate that it can support reception of a Content-Range without a last-byte-pos. So I would fully expect many clients to fail in "unpredictable ways" (disconnecting, crashing, etc.).
> >>>
> >>> [cp] I see that the indeterminate-length proposal you referenced in your first citation introduces an "Accept-Indefinite-Ranges" header to prevent this issue. But I think this brings with it some other questions. e.g. Would this apply to any/all Range Units which may be introduced in the future? How can a client issue a request that starts at the "live point"? It feels like it has one hand tied behind its back.
> >>>
> >>> [cp] If I could, I would prefer to go back in time and advocate for an alternate ABNF for the bytes Range Unit. Seeing as that's not an option, I think using this well- and long-defined Range Unit extension mechanism seems like a good path forward, as it should not create interoperability issues between clients and servers.
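The interoperability argument above rests on RFC 7233's rule that an origin server must ignore a Range header field whose range unit it does not understand, responding 200 with the full representation rather than failing. A toy sketch of that server-side decision follows; the function and unit names are illustrative, not from any real framework.

```python
# Why a new Range Unit degrades gracefully: unknown units are ignored,
# so old servers and new clients still interoperate (sketch only).
def range_response(range_header, known_units=("bytes", "bytes-live")):
    """Decide how a server reacts to a Range header (hypothetical helper)."""
    unit = range_header.partition("=")[0].strip()
    if unit in known_units:
        return 206  # Partial Content: the unit is understood
    return 200      # Unknown range unit: ignore Range per RFC 7233

assert range_response("bytes=0-499") == 206
assert range_response("bytes-live=5555555-*") == 206
assert range_response("npt=0:05:00-") == 200  # a unit this server lacks
```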
> >>> [cp] And I would hope adding a Range Unit would have a low/lower bar for acceptance. e.g. If a Range Unit fills a useful role, is well-defined, and isn't redundant, it seems reasonable that it should be accepted, as it shouldn't impact existing HTTP/1.1 semantics. In fact, the gzip case (referenced in your third citation) seems like a perfect application of the Range Unit mechanism (better than bytes-live). If there's interest, I'll write up an RFC to demonstrate...
>
> --
>
> craig pratt
>
> Caspia Consulting
>
> craig@ecaspia.com
>
> 503.746.8008
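As a coda, cp's gzip idea can be sketched, purely hypothetically: no such range unit exists, and the unit name "gzipped" and its syntax are invented here solely to illustrate the shape such an RFC might take. The point is that ranges would address the compressed byte stream, which stays stable across requests.

```python
# Hypothetical "gzip as a Range Unit" sketch; nothing here is standardized.
import gzip

body = gzip.compress(b"x" * 100_000)  # the gzip-coded representation

def serve_gzipped_range(first, last):
    chunk = body[first:last + 1]
    headers = {
        "Content-Range": f"gzipped {first}-{last}/{len(body)}",
        "Content-Encoding": "gzip",  # the coding applies to the whole stream
    }
    return 206, headers, chunk

status, headers, chunk = serve_gzipped_range(0, 499)
assert status == 206 and len(chunk) == 500
```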
Received on Wednesday, 20 April 2016 19:31:49 UTC