- From: Craig Pratt <craig@ecaspia.com>
- Date: Wed, 20 Apr 2016 12:57:28 -0700
- To: Thorsten Lohmar <thorsten.lohmar@ericsson.com>, "K.Morgan@iaea.org" <K.Morgan@iaea.org>, "fielding@gbiv.com" <fielding@gbiv.com>
- Cc: Göran Eriksson AP <goran.ap.eriksson@ericsson.com>, "bs7652@att.com" <bs7652@att.com>, "remy@lebeausoftware.org" <remy@lebeausoftware.org>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, "rodger@plexapp.com" <rodger@plexapp.com>, "julian.reschke@gmx.de" <julian.reschke@gmx.de>, "C.Brunhuber@iaea.org" <C.Brunhuber@iaea.org>, Darshak Thakore <d.thakore@cablelabs.com>
Hey Thorsten,

On 4/20/16 12:31 PM, Thorsten Lohmar wrote:
> Hi Craig,
>
>> What *is* missing is the ability to get a continuous byte range on live content
>> that starts at an arbitrary offset and the ability to directly jump to the live
>> point.
> Yes, and there are two solutions to the issue:
> A. Enable it on the HTTP layer through the definition of a new range request method
> B. Enable working with existing HTTP procedures, i.e. the client can work out the precise byte offsets.
>
> BR,
> /Thorsten

[cp] I guess I don't see these as mutually exclusive.

[cp] In DLNA we have both time-based and byte-based range methods. What we find is that some clients want the server to "help out" for some formats (e.g. MPEG2 content), and these clients utilize time-based range units. Other clients just want to use byte-wise access (especially clients accessing ISO-BMFF/MP4).

[cp] Both methods need to accommodate aggregated content. TimeSeekRange.dlna.org accommodates range/seek requests that include aggregated content, and Range doesn't (which is why we're submitting this draft).

[cp] It would be great to bring something like TSR in as a Range Unit, but it wouldn't replace bytes-live (at least if it's in the same form). And it really makes sense (IMHO) for time-based and byte-based seek to be completely discrete, as time-seek requires a media-content-aware HTTP server. Bytes (and bytes-live) can be implemented by a much "dumber" ("content-dumb"?) HTTP server.

[cp] I hope I'm making sense. (And we haven't lost too many people...)
> >> -----Original Message----- >> From: Craig Pratt [mailto:craig@ecaspia.com] >> Sent: Wednesday, April 20, 2016 8:21 PM >> To: Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com >> Cc: Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf- >> http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; >> C.Brunhuber@iaea.org; Darshak Thakore >> Subject: Re: Issue with "bytes" Range Unit and live streaming >> >> Hey Thorsten, >> >> I'm not clear about what you think is missing today from MP4/ISO-BMFF. >> >> HTTP/1.1 MP4-aware client has everything it needs today to resolve time >> offsets to byte offsets and find the nearest random access point(s) - >> whether unfragmented, static-fragmented, or live-fragmented - and I have >> code to demonstrate this. I'm sure this must be a misunderstanding, but I >> don't see what here requires a ISO-BMFF extension. >> >> What *is* missing is the ability to get a continuous byte range on live content >> that starts at an arbitrary offset and the ability to directly jump to the live >> point. >> >> More thoughts in-line. >> >> cp >> >> On 4/20/16 8:14 AM, Thorsten Lohmar wrote: >>> Hi Craig, >>> >>> See inline. >>> >>> It might be worthwhile to bring-up the issue in MPEG. E.g. as ISO-BMFF >> extension or DASH extension. >>> BR, >>> /Thorsten >>> >>>> -----Original Message----- >>>> From: Craig Pratt [mailto:craig@ecaspia.com] >>>> Sent: Tuesday, April 19, 2016 11:54 PM >>>> To: Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com >>>> Cc: Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; ietf- >>>> http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; >>>> C.Brunhuber@iaea.org; Darshak Thakore >>>> Subject: Re: Issue with "bytes" Range Unit and live streaming >>>> >>>> Reply in-line. >>>> >>>> cp >>>> >>>> On 4/19/16 7:02 AM, Thorsten Lohmar wrote: >>>>> Hi Craig, >>>>> >>>>> Thanks for sharing the github link. That certainly clarifies the >>>>> use-case even further. 
>>>>> >>>>> Maybe we should focus the discussion on the fMP4 format for some >>>>> time, since tune-in into fMP4 requires random access to fragment >> boundaries. >>>>> Compared to ts or mp3, fMP4 does not support synchronization to the >>>>> stream from any received byte. The client must start processing an >>>>> fMP4 stream from fragment boundaries (or box boundaries). Tuning to >>>>> selected samples inside of the fMP4 file happens at a later stage. >>>>> >>>>> So, if I understand your proposal around the bytes-live request >>>>> correctly, then the client is NOT asking for the precise byte range, >>>>> but the next possible random access points in close proximity of the >>>>> requested range. So, the client sends a request containing "Range: >>>>> bytes-live=0-*" and gets a response with "Content-Range: bytes-live >>>>> 123456-*": So, the server is providing the client with an HTTP >>>>> resource, starting in case of fMP4 with the fragment boundary (which >>>>> is at byte-offset 123456 of the resource). >>>>> >>>>> Do I understand it correctly then: When the client wants to tune in >>>>> e.g. 5min into the TSB, then the client measures the bitrate of the >>>>> media stream and calculates a rough byte offset (i.e. 5min x >>>>> estimated bitrate, let's say byte pos 654321) [all out of scope of >>>>> the ID] and creates a bytes-live range request of the form "Range: >>>>> bytes-live=654321-*". The server looks up a good fragment boundary >>>>> (let's say 654420) and responds with "Content-Range: bytes-live: >>>>> 654420-*". Do I understand the proposal correctly? >>>>> >>>> TSBs and MP4s are a nasty combination. ISO BMFF containers >>>> accommodate amended content (via fragments). But they don't have >>>> facilities for front-end truncation. It's possible, just not easy. So >>>> this will go down the rabbit hole quickly. >>> [TL] Yes, it is possible to append to the end. But you cannot simply delete >> from the front when you realize a TSB.
The question is, whether this issue >> should be solved on HTTP layer, since likely you also want to give an >> indication of the TSB to the Users (GUI representation of the timeshiftbuffer >> depth). >> [cp] Yeah - there always needs to be a mov box at the front. And while I >> believe Content-Range is sufficient to communicate front-end truncation, >> anything time-related is another matter. In MP4, this can be communicated >> in the timeline via an elst box (edit list). Anything else requires some kind of >> time-based Range Unit. And that's orthogonal to our draft - and live content >> in general. >> >>>> One would need to define the operation of the server for providing a >>>> time- shifted MP4 representation. Basically the idea is that the >>>> server would maintain a list of valid fragments and would have to >>>> maintain a valid movie box at the front of the representation as >>>> fragments were added and removed (if a client's expected to consume a >>>> spec-compliant ISO BMFF container). >>> [TL] Yes. >>> >>>> So I'd say you're mostly right, but let me paraphrase: When an MP4 >>>> client wants to come in at a precise time on a MP4/ISO BMFF (e.g. 5 >>>> minutes), it typically would: >>>> >>>> 1) get the movie box from the front of the representation (a couple >>>> byte- Range requests); >>>> 2) access the movie fragment header boxes until the requested time >>>> offset is determined (multiple byte-Range requests); >>> [TL] Yes, but the Movie Box does not contain such information for a Live >> offering. So, something is missing here. >> [cp] Whether the MP4 is being actively appended to or not, walking the >> fragments (moofs) will provide a client the time offsets. I have code >> demonstrating this or you can do a protocol analysis of VLC to see it >> performing the bytes-Range requests. Are we getting tripped up on the >> definition of "live"? "Live" != "HLS Live Streaming", right? 
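The bytes-live exchange sketched above (a client asks for "bytes-live=654321-*" and the server answers with the range it will actually serve, or rejects the request when the start is past the current extent) can be illustrated in a few lines. This is a sketch only: the helper names are hypothetical; the header forms come from the thread, and the 206/416 choice follows RFC 7233's usual range-request semantics.

```python
import re

def build_bytes_live_range(first_byte_pos):
    # Range header value: everything from first_byte_pos through the
    # live point, plus whatever the server appends afterward.
    return "bytes-live=%d-*" % first_byte_pos

def serve_bytes_live(range_value, current_length):
    # Server-side sketch: parse the bytes-live range-specifier and
    # decide the response. The range is unsatisfiable when
    # first-byte-pos exceeds the current length of the representation
    # (-> 416 per RFC 7233 section 4.4); otherwise answer 206 with an
    # open-ended Content-Range value ("*" in place of an end position).
    m = re.match(r"bytes-live=(\d+)-\*$", range_value)
    if m is None:
        raise ValueError("not a bytes-live range: %r" % range_value)
    first = int(m.group(1))
    if first > current_length:
        return 416, None
    return 206, "bytes-live %d-*" % first
```

A time-shift-buffer server could additionally move `first` forward to its trimmed front before building the Content-Range value; as discussed above, that is a TSB concern rather than the general bytes-live contract.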
>>>> 3) If the time offset is found in the currently-available fragments, >>>> perform a Range request to get the fragment containing the target >>>> time and start rendering it (one or two "bytes" Range requests), and >>>> start fetching the next Fragment if/when desired. e.g. If the nearest >>>> random access point for 5 minutes is at the start of a fragment with >>>> offset >>>> 4444444 with length 30001: >>>> >>>> Range: bytes=4444444-4474444 >>> [TL] Well, almost. You must always start at the fragment boundary. When >> the fragment contains e.g. 4sec of media data (e.g. with 1Sec Gops), the >> client must start fetching from the beginning of the fragment and then skip >> the media data before the desired playtime start. The client can also do an >> open range request. >> [cp] Agreed - which is why I said "the nearest random access point for 5 >> minutes is at the start of a fragment with offset 4444444". The actual >> sample/frames corresponding with the 5-minute mark would be at some >> point after 4444444. >>>> 4) If the time offset is found to be after the >>>> currently-available fragments, jump to the live point by grabbing the >>>> last fragment and any fragments that are added on (one "bytes-live" >>>> range request). e.g. if the last fragment's offset is 5555555 with length >> 40001: >>>> Range: bytes-live=5555555-* >>>> >>>> The client would quickly render the last fragment (to prime the frame >>>> buffer), put the framebuffer on-screen, and then block on the socket >>>> (or wait for async notification) and render the new fragment(s) as they >> come in. >>> [TL] Ok, understood. >> [cp] It's important to remember that this is the feature we care about in >> context of this draft. If you're good with this, then we're on the same page. >>>> * Note that one might use MP4 timelines in here to communicate the >>>> fact that sample time 0 is not media time 0. But I'd rather not get too far >> into that. >>>> ** As with live content, TSBs are non-cacheable. (But in-progress >>>> recordings >>>> *are* cacheable.) >>> [TL] Since we talk about a single HTTP transaction with progressive media >> data, the session is anyhow not cacheable. >> [cp] Well, in-progress recordings should still be byte-wise cacheable, correct? >> Byte 0 will always be byte 0, byte X will always be byte X. So a cache can be >> populated with the currently-accessible bytes and can satisfy Range >> requests. There might just be a cache miss if a client accesses bytes not-yet-cached. But a proxy should know that this is a possibility since the Content-Range responses have a "*" in place of the content length. >>>>> If yes, should the solution be limited to live cases? If I >>>>> understand it correctly, then you are looking for a solution where >>>>> the client indicates a rough byte range in the request and the >>>>> server responds with another range, which is fulfilling some >>>>> condition. In case of a live session with fMP4, the server looks for >>>>> a random access point into the stream. The random access point must >>>>> be a fragment boundary in case of fMP4 and can be PAT, PMT, PES in >> case of TS. >>>> Technically, regular bytes-Range requests could have this "fencing" >>>> behavior (a Client should be driven off the Content-Range response >> header). >>>> But I think this is somewhat disingenuous and might be inconsistent >>>> with >>>> RFC7233 (I'd have to look closely). And I think this is another >>>> example of where a different Range Unit would make sense. >>>> >>>> e.g. DLNA specifications defined a TimeSeekRange.dlna.org header that >>>> carries this contract (of providing the most-immediately-preceding >>>> "decoder-friendly position") in the returned media. Now, this was >>>> defined to work with HTTP/1.0, so defining a new Range Unit wasn't an >>>> option. But if so, TimeSeekRange.dlna.org could be easily replaced >>>> with an "npt" Range Unit (normal play time).
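Step 4 above (jump to the live point with one open-ended bytes-live request, prime the frame buffer from the last fragment, then keep rendering fragments as they arrive) can be sketched as follows. The names are hypothetical, and `read_chunks` stands in for successive reads from the open response socket:

```python
def jump_to_live(last_fragment_offset, read_chunks, render):
    # Conceptually issues "Range: bytes-live=<last_fragment_offset>-*".
    # The first chunk(s) deliver the last complete fragment, which the
    # client renders quickly to prime its frame buffer; after that, a
    # real client blocks on the socket (or waits for an async
    # notification) and renders new fragments as the server appends
    # them, until the live session ends.
    range_header = "bytes-live=%d-*" % last_fragment_offset
    delivered = 0
    for chunk in read_chunks:
        render(chunk)
        delivered += len(chunk)
    return range_header, delivered
```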
And what you're talking >>>> about would be similar but with byte offsets. Both carry a similar >> assumption: >>>> The server has some knowledge of the content structure. >>> [TL] Yes, started also thinking about time range requests, but I couldn't >> remember whether that was defined in DLNA, OIPF or DVB. Are the time >> range headers widely used today? Can such a solution make sense for H1.1 >> or H2? >> [cp] TimeSeekRange is widely used in DLNA. But it's more useful for less >> structured content such as MP2 Program/Transport streams. (for MP4 >> representations, one never sees DLNA clients using TimeSeekRange - but >> servers are required to support TSR for all media formats in CVP-2). >> >>>> Aside: Both use cases could be covered with something like a "mso" >>>> (media-sensitive offset) Range Unit which could take a time offset or >>>> a byte offset (and incorporate live semantics). >>>> >>>> The bytes-live draft (or whatever we end up calling it) is intended >>>> to assume all knowledge is client-side - just as is the case with the >>>> bytes Range Unit. I think you're assuming some of this server-side >>>> structure knowledge that I was not intending. But "mso" would make >> another great Range Unit. >>> [TL] I still think, that the client should have precise information of the TSB >> (i.e. precise range and depth) in order to properly render the TSB >> representation on the GUI (Media Players often render the TSB in a progress >> bar together with the progress of the Live Session). >> [cp] I do think it could be useful to define something like >> TimeSeekRange.dlna.org as a Range Unit. This would be very useful for >> random access on MPEG2-contained content. If you want to start a draft, I'm >> happy to co-author. ;^J >>>> There are many other potential applications of something like >>>> bytes-live I believe - which is why we're bringing this to the IETF. >>>> But really it's all about getting to the "live point" in a couple different >> ways.
>>>> But it assumes the *client* knows where the random access points are >>>> located (and tailors the byte offsets accordingly). >>> [TL] Maybe, it is better to discuss this issue in MPEG, either as ISO-BMFF >> extensions or DASH extensions. >> [cp] Perhaps. But the mechanism of how the transfer occurs - which is what >> this draft is related to - is definitely in the transport space. >> For some complex formats, it may be in the "necessary but not sufficient" >> category. But as I mentioned at the top, for >> (linear/progressive) MP4/ISO-BMFF, everything should be there. >>>>> Nothing inline. >>>>> >>>> Good - that was getting crazy... ;^J >>>> >>>>> BR, >>>>> >>>>> Thorsten >>>>> >>>>> *From:*Craig Pratt [mailto:craig@ecaspia.com] >>>>> *Sent:* Tuesday, April 19, 2016 11:55 AM >>>>> *To:* Thorsten Lohmar; K.Morgan@iaea.org; fielding@gbiv.com >>>>> *Cc:* Göran Eriksson AP; bs7652@att.com; remy@lebeausoftware.org; >>>>> ietf-http-wg@w3.org; rodger@plexapp.com; julian.reschke@gmx.de; >>>>> C.Brunhuber@iaea.org; Darshak Thakore >>>>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming >>>>> >>>>> Hey Thorsten, >>>>> >>>>> I'll try to reply in-line. >>>>> >>>>> cp >>>>> >>>>> On 4/18/16 3:50 PM, Thorsten Lohmar wrote: >>>>> >>>>> Hi Craig, all, >>>>> >>>>> Thanks for the clarification. 
Some further question inline >>>>> >>>>> BR, >>>>> >>>>> Thorsten >>>>> >>>>> *From:*Craig Pratt [mailto:craig@ecaspia.com] >>>>> *Sent:* Monday, April 18, 2016 10:29 PM >>>>> *To:* Thorsten Lohmar; K.Morgan@iaea.org >>>>> <mailto:K.Morgan@iaea.org>; fielding@gbiv.com >>>>> <mailto:fielding@gbiv.com> >>>>> *Cc:* Göran Eriksson AP; bs7652@att.com <mailto:bs7652@att.com>; >>>>> remy@lebeausoftware.org <mailto:remy@lebeausoftware.org>; >>>>> ietf-http-wg@w3.org <mailto:ietf-http-wg@w3.org>; >>>>> rodger@plexapp.com <mailto:rodger@plexapp.com>; >>>>> julian.reschke@gmx.de <mailto:julian.reschke@gmx.de>; >>>>> C.Brunhuber@iaea.org <mailto:C.Brunhuber@iaea.org>; Darshak >>>>> Thakore; STARK, BARBARA H >>>>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming >>>>> >>>>> [cc-ing the co-authors] >>>>> >>>>> Hi Thorsten, >>>>> >>>>> I'm happy to help provide whatever answers I can. >>>>> >>>>> Reply in-line. >>>>> >>>>> cp >>>>> >>>>> On 4/18/16 8:10 AM, Thorsten Lohmar wrote: >>>>> >>>>> Hi Craig, all, >>>>> >>>>> My colleague Göran asked me some question around the problem >>>>> and I would like to raise these questions directly to you. Of >>>>> course, there are some alternative solutions available, where >>>>> the client can work out the different things from a manifest. >>>>> But you seem to look for a simple solution, which works with >>>>> non-segmented media on a single HTTP session. >>>>> >>>>> When I understood it correctly, an HTTP server is making a >>>>> live stream available using HTTP. A normal live stream can be >>>>> opened with a single HTTP request and the server can serve >>>>> data "from the live point" either with or without HTTP chunked >>>>> delivery. The server cannot give a Content-Length, since this >>>>> is an ongoing live stream of unknown size. >>>>> >>>>> [cp] all correct. >>>>> >>>>> >>>>> Your use-case seem to be about recording of content. 
Client should >>>>> access content from the recorded part, but should be able to jump >>>>> to the live-point. I assume that you are not looking into sliding >>>>> window recordings (i.e. timeshift). I assume that a single >>>>> program is a continuous recording and the HTTP object is growing >>>>> until the end of the live session, correct? >>>>> >>>>> [cp] I didn't spell it out in the draft, but I would like to >>>>> consider adding clarifications for the time-shift cases. This >>>>> should just be a matter of a Client requesting one thing and >>>>> getting another. e.g. "Range: bytes-live=0-*" results in >>>>> "Content-Range: bytes-live 123456-*". In either case, you're >>>>> correct: the end of the representation is moving forward in time >>>>> until the end of the live session. >>>>> >>>>> >>>>> */[TL] The "Range: bytes-live=0-*" case is not clear to me. Your >>>>> ID says "All bytes currently in the representation and those >>>>> appended to the end of the representation after the request is >>>>> processed". I get the impression, that the server is deleting all >>>>> data before a certain timepoint (aka, the behavior of a sliding >>>>> window timeshift). So, the client seems to request all data from >>>>> the beginning of the timeshift buffer. Why does the server need to >>>>> change the byte offset from 0 to 123456? /* >>>>> >>>>> *I can understand, that the server must signal "growing resource" >>>>> in the response.* >>>>> >>>>> [cp] I was trying to illustrate a case where the server had trimmed >>>>> off bytes 0-123456 (the TSB model). So in this case, it's signalling >>>>> to the client "you're getting bytes starting at 123456 (not 0)". e.g.
>>>>> If a client requests "Range: bytes-live=0-*" on an in-progress >>>>> recording, one might expect: >>>>> >>>>> Content-Range: bytes-live 0-*/* >>>>> >>>>> [cp] Basically saying (as described in the ID) that all bytes >>>>> currently in the representation and those appended to the end of the >>>>> representation after the request is processed will be returned. But >>>>> on a TSB, one might expect: >>>>> >>>>> Content-Range: bytes-live 123456-*/* >>>>> >>>>> [cp] Basically saying that bytes starting at byte 123456 in the >>>>> representation and those appended to the end of the representation >>>>> after the request is processed will be returned. >>>>> >>>>> [cp] While I'm thinking about TSB use cases in the back of my mind, >>>>> this is really not the primary use case I was considering for the ID >>>>> (but I would hope it can be covered). >>>>> >>>>> In any case, how does the client know "good" byte range offsets (i.e. >>>>> service access points) to tune into the recording? Or is the >>>>> assumption, that the client can synchronize to the media stream from >>>>> any byte range? >>>>> >>>>> [cp] For byte-level access, random access implementation is up to >>>>> the client. For some containers this is easier than others. e.g. For >>>>> MP4, the random access points can be established by reading the >>>>> movie and fragment header(s). For something like MP2, it's trickier of >> course. >>>>> */[TL] Well, in case of fMP4, the client needs to get the Movie >>>>> Header for initialization. Then, proper access point are fragment >> boundaries. >>>>> There are various ways to signal time to byte-offsets. /* >>>>> >>>>> [cp] Fragments can actually have multiple access points - implicit >>>>> (per sample) and explicit (random access points). But yeah, it seems >>>>> common for fragments to have one random access point (and often >>>>> correlate to a GOP) - and that there's a huge variety of ways to lay >>>>> out the samples. 
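The TSB behavior described above (request "bytes-live=0-*", receive a Content-Range starting at 123456) implies the client should trust the granted start, not the requested one. A minimal sketch with hypothetical helper names; the pattern tolerates both the "123456-*" and "123456-*/*" response forms shown in the thread:

```python
import re

def granted_start(content_range_value):
    # First byte the server will actually send, e.g. 123456 for
    # "bytes-live 123456-*" (or "bytes-live 123456-*/*").
    m = re.match(r"bytes-live\s+(\d+)-\*", content_range_value)
    if m is None:
        raise ValueError("unexpected Content-Range: %r" % content_range_value)
    return int(m.group(1))

def front_trimmed(requested_first, content_range_value):
    # Bytes the server has discarded from the front of its time-shift
    # buffer relative to what was requested; 0 for an in-progress
    # recording that still starts at byte 0.
    return max(0, granted_start(content_range_value) - requested_first)
```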
>>>>> >>>>> */In case of TS, the client needs a PAT, PMT and PES starts for >>>>> tune-in. It is a bit more tricky, but also here are solutions. /* >>>>> >>>>> */But the email talks about "none-segmented" media. The draft talks >>>>> about "mimicking segmented media". fMP4 is actually the way to >>>>> create ISO-BMFF segments. So, it is for segmented media, but without >>>>> a separate manifest?/* >>>>> >>>>> [cp] It's important to differentiate between *fragmented* and >>>>> *segmented* MP4/ISO BMFF representations. bytes-live is most >>>>> applicable to fragmented files - where you have one representation >>>>> being used for the entire streaming session - with this >>>>> representation being appended to periodically (usually one fragment at >> a time). >>>>> [cp] I really need to revise my description in the draft to help >>>>> avoid confusion. What I was trying to describe was how a solution >>>>> using just byte-Range requests would always be slightly behind the >>>>> live point - as is the case with rendering "live" segmented streams. >>>>> While bytes-live could be used for fragmented (MP4/ISO BMFF) content >>>>> or segmented content, the primary use case is for non-segmented >>>>> representations. >>>>> >>>>> >>>>> >>>>> [cp] One major feature this draft allows is for retrieval of bytes >>>>> just preceding the live point. So for example, a client can do a >>>>> Range head request like "Range: bytes=0-", get a "Content-Range: >>>>> bytes 0-1234567/*", then perform something like a "Range: >>>>> bytes-live=1200000-*", and prime its framebuffer with 34567 bytes of >>>>> data that precede the live point - allowing for the client to find >>>>> an access point (e.g. mpeg2 start codes) and to allow live >>>>> presentation to display much sooner than it would from the live >>>>> point (without random access). >>>>> >>>>> */[TL] So, how does the client know, that the proper fragment >>>>> boundary is at byte position 120000? 
Do you assume that the client >>>>> first fetches a time-to-byte offset file, which tells the client >>>>> that an access point (e.g. a fragment boundary) is at byte pos >>>>> 120000? If yes, why does the client need the HEAD request, when it >>>>> already has the byte position?/* >>>>> >>>>> [cp] How a client knows the amount to pre-fetch before the live point >>>>> would depend upon the media format. For an MP4/ISO BMFF file, >> 120000 >>>>> could represent the random access point most immediately preceding >>>>> the live point. It would be similar for an indexed MP2. And for >>>>> unindexed >>>>> MP2 representations, it's not uncommon for a client to prebuffer a >>>>> fixed amount of content in the hopes of capturing a keyframe (really >>>>> a heuristic). >>>>> >>>>> [cp] The HEAD request is necessary in this case to know where the >>>>> live point is at the time the request is made so the HTTP client >>>>> would know if it can jump into already-stored content or if it >>>>> should just acquire the live point. >>>>> >>>>> [cp] The important point is that all common video formats need a >>>>> discontinuity-free number of bytes before the live point to provide >>>>> a quality user experience. >>>>> >>>>> >>>>> How should the client know, which byte ranges are already available >>>>> on the server? When the client is playing back from the recorded >>>>> part and would like to skip 5min forward, how does the client know, >>>>> whether a normal range request is needed or whether the client >>>>> should ask for the live point? What type of HTTP Status code should >>>>> be provided, when the range request is not yet available on the server? >>>>> >>>>> [cp] We're not trying to come up with a universal solution for >>>>> performing time-based seek on all media formats with this draft. So >>>>> some of this is out of scope. But let me see if I can fill in some >>>>> of the blanks. >>>>> >>>>> */[TL] Ok, not everything needs to be in-scope.
But an essential >>>>> assumption should be, whether the client has a time-to-byteoffset >>>>> table or whether the client can determine precisely the fragment >>>>> boundary positions. /* >>>>> >>>>> [cp] Optimally, time-to-byte indexes would be used. But even without >>>>> this, clients can often manage with heuristics. e.g. VLC can perform >>>>> a reasonable job of providing time-seek on unindexed MP2 files. >>>>> >>>>> >>>>> >>>>> [cp] Some applications of media streaming have time-based indexing >>>>> facilities built-in. e.g. MP4 (ISO BMFF) containers allow time and >>>>> data to be associated using the various internal, mandatory metadata >>>>> "boxes". In other cases, applications may provide a separate >>>>> resource that contains time-to-byte mappings (e.g. content index >>>>> files). In either case, there's a facility for mapping time offsets >>>>> to byte offsets - or sometimes the client incorporates heuristics to >>>>> perform time skips (e.g. VLC will do this on some file formats). >>>>> >>>>> >>>>> */[TL] Yes. fMP4 supports this and MPEG DASH is leveraging this. But >>>>> the live-point is not described in the fragments. The client >>>>> determines the livepoint from the manifest. /* >>>>> >>>>> [cp] Correct. In fragmented content, the time-to-segment map tells >>>>> you which representation to fetch (via GET). While I'd say that >>>>> bytes-live can also improve segmented rendering (by reducing the >>>>> latency of rendering), the primary focus of our draft is for >>>>> non-segmented representations. >>>>> >>>>> >>>>> [cp] In all these cases, there's some mechanism that maps time >>>>> offsets to byte offsets. >>>>> >>>>> */[TL] Yes/* >>>>> >>>>> >>>>> >>>>> [cp] When it comes to the available byte range, a client can know >>>>> what data range is available by utilizing a HEAD request with a "Range: >>>>> bytes=0-". 
The "Content-Range" response can contain something like >>>>> "Content-Range: bytes 0-1234567/*" which tells the client both the >>>>> current randomly accessible content range (via the "0-1234567") and >>>>> that the content is of indeterminate length (via the "*"). >>>>> >>>>> */[TL] So, that is the existing Content-Range response, but with an >>>>> '*' to indicate the unknown content-length, correct? /* >>>>> >>>>> [cp] Yeah, the "*" in place of the last-byte-pos indicates an >>>>> indeterminate-length response body. >>>>> >>>>> >>>>> >>>>> [cp] Putting this all together, a client would implement a 5-minute >>>>> skip by: >>>>> (1) Adding 5 minutes to your current play time, >>>>> (2) determining the byte offset for that given time using the >>>>> appropriate index/heuristic (e.g. "3456789"), >>>>> (3) if the time is outside the index, jump to the live point >>>>> and update the time to the last-index time (or by other means) (e.g. >>>>> using >>>>> "Range: bytes-live=340000-*" to pre-buffer/pre-prime the >>>>> frame/sample buffer), >>>>> (4) if the time is inside the index, perform a standard >>>>> bytes Range request to retrieve an implementation-specific quantum >>>>> of time or data (e.g. "Range: bytes=3456789-3556789") and render. >>>>> >>>>> */[TL] In (2), how does the client determine the byte offset? fMP4 >>>>> requires a precise byte offset. In case of TS, the client can sync to >>>>> the stream by first searching for 0x47 sync bytes. In (3), how does >>>>> the client determine "outside of the index"? Seems that some sort of >>>>> manifest is implicitly needed, which allows the client to understand >>>>> the latest byte pos. /* >>>>> >>>>> [cp] (2) is media-format-specific. For MP4/ISO BMFF, it would use >>>>> the built-in metadata, for MP2, it would either use an index file or >>>>> a heuristic.
>>>>> >>>>> [cp] For (3), if the current live point (in byte terms) is greater >>>>> than the last byte offset in the index, then the live point is >>>>> "outside the index". That is, the time the client is trying to >>>>> access isn't randomly accessible, and the client should just jump to >>>>> the live point. >>>>> >>>>> >>>>> >>>>> [cp] Again, some of this is out of scope, but I hope that clarifies >>>>> a common use case. >>>>> >>>>> */[TL] Would be good to clarify, what information the client needs >>>>> to get in order to do the operations. How the client gets the info >>>>> can be left out-of-scope./* >>>>> >>>>> [cp] ok - I hope I'm filling in more of the blanks... >>>>> >>>>> >>>>> >>>>> [cp] Regarding the status code, RFC7233 (section 4.4) indicates that >>>>> code 416 (Range Not Satisfiable) must be returned when "the current >>>>> extent of the selected resource or that the set of ranges requested >>>>> has been rejected due to invalid ranges or an excessive request of >>>>> small or overlapping ranges." This part of 4.4 applies to *all* >>>>> Range requests - regardless of the Range Unit. >>>>> >>>>> */[TL] ok. /* >>>>> >>>>> >>>>> >>>>> [cp] The bytes-live draft then goes on to say that "A >>>>> bytes-live-range-specifier is considered unsatisfiable if the >>>>> first-byte-pos is larger than the current length of the >>>>> representation". This could probably be elaborated on a bit. But >>>>> this is supposed to be the "hook" into the 4.4 language. >>>>> >>>>> >>>>> Can you please clarify the questions? >>>>> >>>>> [cp] I hope I succeeded (at least partially). Apologies for the long >>>>> response. I wanted to make sure I was answering your questions. >>>>> >>>>> */[TL] Gets a bit clearer, but I still don't understand the "mimic >>>>> HLS or DASH". DASH / HLS focuses on CDN optimization by creating a >>>>> sequence of individual files. The client can work out the live-point >>>>> URL from the manifest.
Each segment is a "good" access point (in >>>>> DASH always box boundaries and in HLS always TS boundaries even with >>>>> PAT / PMT). So, the key issue here is to clarify, how the client >>>>> gets the byte offsets of the fragment boundaries for range >>>>> requests./* >>>>> >>>>> [cp] If it's still a bit unclear how this is performed, I can go >>>>> into more detail. But like I say, I should really reword that >>>>> section of the draft since I think I've created some confusion. The >>>>> point I was trying to make was that *polling* a non-segmented >>>>> representation would >>>>> - other than being inefficient - have the kind of multi-second >>>>> latency that segmented live streaming would have. >>>>> >>>>> [cp] But the difficulty of expressing this (secondary) benefit in >>>>> the bytes-live draft is probably not worth the trouble. I'll see if I can >>>>> reword the draft to make it less confusing. I don't think this point >>>>> is necessary to "sell" the concept of bytes-live (or a >>>>> bytes-live-like feature). >>>>> >>>>> [cp] BTW, if you're really interested in the details of mapping time >>>>> to offsets in an ISO BMFF container, have a look at >>>>> odid_mp4_parser.vala:get_random_access_points() and >>>>> get_random_access_point_for_time() at >>>>> https://github.com/cablelabs/rygel/tree/cablelabs/master/src/media-engines/odid/. >>>>> I can probably even get you instructions for printing RAPs for MP4 >>>>> files using the test program.
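The 5-minute-skip procedure outlined earlier in the thread (steps (1)-(4)) comes down to one decision: is the target time inside the index, or past it? A sketch under stated assumptions: `index` is a hypothetical sorted list of (seconds, byte offset) random-access points, however the client obtained it (MP4 metadata, an index file, or a heuristic), and `live_byte_pos` is the current extent learned from the probing "Range: bytes=0-" request:

```python
import bisect

def plan_five_minute_skip(now_s, skip_s, index, live_byte_pos, quantum=100000):
    # index: sorted [(time_s, byte_offset), ...] random-access points.
    # Returns (Range header value to send, whether we jumped to live).
    target_s = now_s + skip_s
    if target_s > index[-1][0]:
        # (3) Outside the index: jump to the live point, open-ended.
        return "bytes-live=%d-*" % live_byte_pos, True
    # (2)/(4) Inside the index: latest access point at or before the
    # target, then fetch an implementation-specific quantum of data.
    i = bisect.bisect_right([t for t, _ in index], target_s) - 1
    start = index[max(i, 0)][1]
    return "bytes=%d-%d" % (start, start + quantum), False
```

The `quantum` of 100000 bytes is an arbitrary illustration of the "implementation-specific quantum of time or data" mentioned in step (4).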
>>>>> >>>>> hth - cp >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> BR, >>>>> >>>>> Thorsten >>>>> >>>>> *From:*Craig Pratt [mailto:craig@ecaspia.com] >>>>> *Sent:* Monday, April 18, 2016 11:04 AM >>>>> *To:* K.Morgan@iaea.org <mailto:K.Morgan@iaea.org>; >>>> fielding@gbiv.com >>>>> <mailto:fielding@gbiv.com> >>>>> *Cc:* Göran Eriksson AP; bs7652@att.com <mailto:bs7652@att.com>; >>>>> remy@lebeausoftware.org <mailto:remy@lebeausoftware.org>; >>>>> ietf-http-wg@w3.org <mailto:ietf-http-wg@w3.org>; >>>> rodger@plexapp.com >>>>> <mailto:rodger@plexapp.com>; julian.reschke@gmx.de >>>>> <mailto:julian.reschke@gmx.de>; C.Brunhuber@iaea.org >>>>> <mailto:C.Brunhuber@iaea.org> >>>>> *Subject:* Re: Issue with "bytes" Range Unit and live streaming >>>>> >>>>> On 4/18/16 12:34 AM, K.Morgan@iaea.org <mailto:K.Morgan@iaea.org> >>>> wrote: >>>>> On Friday,15 April 2016 22:43,fielding@gbiv.com >>>> <mailto:fielding@gbiv.com> wrote: >>>>> Oh, never mind, now I see that you are referring to the >>>>> second number being >>>>> >>>>> fixed. >>>>> >>>>> >>>>> >>>>> I think I would prefer that be solved by allowing >>>>> last-byte-pos to be empty, just >>>>> >>>>> like it is for the Range request. I think such a fix is >>>>> just as likely to be >>>>> >>>>> interoperable as introducing a special range type (same failure >> cases). >>>>> >>>>> >>>>> ....Roy >>>>> >>>>> >>>>> >>>>> +1000 >>>>> >>>>> >>>>> >>>>> A very similar idea was proposed before [1] as an I-D [2] by >>>>> Rodger >>>> Coombs. We've also brought this up informally with other members of >>>> the WG. >>>>> >>>>> Alas, in our experience range requests don't seem to be a high >>>>> priority :( >>>> For example, the problem of combining gzip with range requests is >>>> still unsolved [3]. 
>>>>>
>>>>> [1] https://lists.w3.org/Archives/Public/ietf-http-wg/2015AprJun/0122.html
>>>>> [2] https://tools.ietf.org/html/draft-combs-http-indeterminate-range-01
>>>>> [3] https://lists.w3.org/Archives/Public/ietf-http-wg/2014AprJun/1327.html
>>>>>
>>>>> [cp] Yeah, it's unfortunate that no solutions have moved forward
>>>>> for this widely-desired feature. I can only assume that people
>>>>> just started defining proprietary solutions - which is
>>>>> unfortunate. I'll try to be "persistent"... ;^J
>>>>>
>>>>> [cp] As was mentioned, the issue with just arbitrarily allowing an
>>>>> open-ended Content-Range response (omitting last-byte-pos) is that
>>>>> there's no good way for a client to indicate it can support
>>>>> reception of a Content-Range without a last-byte-pos. So I would
>>>>> fully expect many clients to fail in "unpredictable ways"
>>>>> (disconnecting, crashing, etc.).
>>>>>
>>>>> [cp] I see that the indeterminate-length proposal you referenced
>>>>> in your first citation introduces an "Accept-Indefinite-Ranges"
>>>>> header to prevent this issue. But I think this brings with it some
>>>>> other questions: e.g. would this apply to any/all Range Units
>>>>> which may be introduced in the future? How can a client issue a
>>>>> request that starts at the "live point"? It feels like it has one
>>>>> hand tied behind its back.
>>>>>
>>>>> [cp] If I could, I would prefer to go back in time and advocate
>>>>> for an alternate ABNF for the bytes Range Unit. Seeing as that's
>>>>> not an option, I think using this well- and long-defined Range
>>>>> Unit extension mechanism seems like a good path forward, as it
>>>>> should not create interoperability issues between clients and
>>>>> servers.
>>>>>
>>>>> [cp] And I would hope adding a Range Unit would have a low(er)
>>>>> bar for acceptance. e.g.
If a Range Unit fills a useful role, is
>>>>> well-defined, and isn't redundant, it seems reasonable that it
>>>>> should be accepted, as it shouldn't impact existing HTTP/1.1
>>>>> semantics. In fact, the gzip case (referenced in your third
>>>>> citation) seems like a perfect application of a Range Unit
>>>>> (better than bytes-live). If there's interest, I'll write up an
>>>>> RFC to demonstrate...
>>>>>
>>>>> --
>>>>> craig pratt
>>>>> Caspia Consulting
>>>>> craig@ecaspia.com
>>>>> 503.746.8008

--

craig pratt

Caspia Consulting

craig@ecaspia.com

503.746.8008
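[cp] P.S. To make the "open-ended Content-Range" discussion above concrete: RFC 7233 requires a last-byte-pos in `Content-Range: bytes first-last/complete`, so a parser that tolerates an empty last-byte-pos is a sketch of the *proposed extension*, not of standard HTTP (the function name and return shape here are my own illustration):

```python
import re
from typing import Optional, Tuple

# 'bytes first-[last]/(complete|*)'. An empty last-byte-pos is the
# proposed extension for live content; RFC 7233 requires it to be present.
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d*)/(\d+|\*)$")

def parse_content_range(
    value: str,
) -> Optional[Tuple[int, Optional[int], Optional[int]]]:
    """Return (first_byte_pos, last_byte_pos or None, complete_length or None).

    None for last_byte_pos means "open-ended" (e.g. a live stream);
    None for complete_length corresponds to the '*' form.
    """
    m = _CONTENT_RANGE.match(value)
    if m is None:
        return None  # unsatisfied-range ('bytes */N') and junk rejected here
    first = int(m.group(1))
    last = int(m.group(2)) if m.group(2) else None  # None: open-ended
    complete = None if m.group(3) == "*" else int(m.group(3))
    return first, last, complete
```

A client that had advertised support (e.g. via something like the Accept-Indefinite-Ranges header from the Coombs I-D) could treat last == None as "keep reading until the transfer ends" - which is exactly the capability-negotiation gap the [cp] comments above are pointing at.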
Received on Wednesday, 20 April 2016 19:58:00 UTC