RE: Track fragments from Davy Van Deursen on 2010-02-17 (public-media-fragment@w3.org from February 2010)

From: Davy Van Deursen <davy.vandeursen@ugent.be>
Date: Wed, 17 Feb 2010 13:08:19 +0100
To: "'Silvia Pfeiffer'" <silviapfeiffer1@gmail.com>
Cc: "'DENOUAL Franck'" <Franck.Denoual@crf.canon.fr>, <public-media-fragment@w3.org>
Message-ID: <003a01caafc9$e4d8ef30$ae8acd90$@vandeursen@ugent.be>
On feb 16, 2010 at 20:33, Silvia Pfeiffer wrote:
> Cc: DENOUAL Franck; public-media-fragment@w3.org
> Subject: Re: Track fragments
> 
> On Wed, Feb 17, 2010 at 2:30 AM, Davy Van Deursen 
> <davy.vandeursen@ugent.be> wrote:
> > Hi Silvia,
> >
> > On feb 16, 2010 at 13:00, Silvia Pfeiffer wrote:
> >> Cc: DENOUAL Franck; public-media-fragment@w3.org
> >> Subject: Re: Track fragments
> >>
> >> Hi Davy,
> >>
> >> On Tue, Feb 16, 2010 at 7:04 PM, Davy Van Deursen 
> >> <davy.vandeursen@ugent.be> wrote:
> >> > On feb 15, 2010 at 22:23, Silvia Pfeiffer wrote:
> >> >
> >> >> It is possible to devise a track addressing method that includes
> >> >> some of the other attributes. For example, a combination of type,
> >> >> role and lang could make sense, something like:
> >> >>
> >> >> #track=audio(audesc, en)&video(main,en)&text(cc,en)&text(sub,fr)
> >> >>
> >> >> I just made this up, so feel free to suggest any other markup
> >> >> means.
> >> >
> >> > So 'audio(audesc, en)&video(main,en)&text(cc,en)&text(sub,fr)'
> >> > corresponds to the trackname?
> >>
> >> Nah, I did indeed mess this one up. "&" should not have been used 
> >> here, since it separates name-value pairs. You've used semicolon to 
> >> separate your track names, so this would be more appropriate:
> >>
> >> #track=audio(audesc,en);video(main,en);text(cc,en);text(sub,fr)
> >>
> >> and this would imply that the media resource has basically named
> >> the tracks:
> >> * audio(audesc,en)
> >> * video(main,en)
> >> * text(cc,en)
> >> * text(sub,fr)
> >>
> >> They could just as well have been named
> >> * a1
> >> * v1
> >> * t1
> >> * t2
> >> or anything else, but the idea is to explore if some scheme can be
> >> found that helps addressing rather by features of the track than by
> >> a generic name. I think this was what we meant by "default names".
> >>
> >>
> >> >> One thing I need to add to this discussion is that track
> >> >> addressing with *URI fragments* may be less about addressing and
> >> >> more about activating. So, it interrelates very closely with the
> >> >> JavaScript API, which is why I am waiting for that to stabilise
> >> >> and have some initial implementations. This is certainly
> >> >> different if we use URI queries (?) for addressing, since then
> >> >> we compose a new resource with just the requested tracks.
> >> >
> >> > Do you mean by 'activating' that the server typically sends all
> >> > the tracks to the UA? If so, then this might cause (bandwidth)
> >> > problems when multiple video tracks are present within the media
> >> > resource ... I don't see any difference between track and time
> >> > fragments within the discussion of ? vs. # (do you?), so I think
> >> > that track addressing with URI fragments is really about
> >> > addressing :-).
> >>
> >> The biggest issue with addressing tracks is that they tend not to 
> >> easily map to byte ranges - at least not to a small number. In 
> >> fact, they tend to map to a large number and it is almost 
> >> impossible for the client to know which byte ranges belong to which 
> >> track (at least within Ogg).
> >
> > Hmmm, I do agree that track selection results in a large number of
> > byte ranges (on condition that the tracks are interleaved). But I
> > don't see why it would be hard for the client to manage the mapping
> > between track and byte ranges. Temporal fragments can also result in
> > multiple byte ranges; Ogg doesn't have a problem with that though?
> 
> With the introduction of the new OggIndex, there will be information 
> in the Ogg header to directly map time to byte offsets. Such a mapping 
> is not available for tracks.
> 
> You can also seek in Ogg (even over the network) for time offsets 
> using a bisection search and inspecting the granulepos in page 
> headers. That will give you the beginning of a time range. It will 
> also give you a particular subpart of a track. But that doesn't tell 
> you where the next subpart of a track can be found. Once you've read a 
> subpart of a track, you need to seek forward to find the next part.
> So, I actually have my doubts this can be implemented over the network 
> as neatly as we have it now with time fragments. I would like to be 
> proven wrong though and maybe there is a way that I have overlooked.
> 
> 
> >> How did you implement it in NinSuna? Did you use "?" to compose a
> >> new resource with the restricted number of tracks or did you use
> >> byte range requests?
> >
> > Currently, the front-end in NinSuna uses the '?' to address track
> > fragments. However, the underlying implementation is based on byte
> > range selection. More specifically, a request such as
> > http://example.org/media.ogg?track='ex'
> > will trigger the following actions:
> > - get the list of byte ranges corresponding to the fragment request
> > - create the headers of the container format
> > - copy and add all the obtained byte ranges to/after (depending on
> >   the format) the headers
> 
> For MPEG and Ogg, how do you determine the byte ranges corresponding 
> to the track fragment request? Are you asking the server for help?
> (i.e. doing one of the other client-side approaches we have specified 
> in the spec) ?

Actually, I was talking about the server (and more specifically about its
internal implementation); I wasn't considering clients here. Thus, byte
range mappings are determined by the server. I guess that is the reason for
the confusion: either a smart client (one able to obtain byte range mappings
without external help) resolves the media fragment, or a smart server
resolves it for you (i.e., the NinSuna case). In the former case, I do agree
that there are problems with Ogg regarding track selection (note that a
solution for MP4 was discussed here by Dave Singer [1]). In the latter case,
I did not experience any problems with either Ogg or MP4 regarding time and
track fragments.
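
To make the "smart server" case concrete, here is a minimal sketch of the
three steps quoted above (look up byte ranges, create headers, copy the
ranges after them). It is not the NinSuna code: the names
resolve_track_fragment, track_index and new_headers are hypothetical, and
the sketch assumes the server already holds a mapping from track name to
byte ranges and has already built headers for the selected track.

```python
from typing import Dict, List, Tuple

# (start, end) byte offsets into the original resource, end exclusive.
ByteRange = Tuple[int, int]


def resolve_track_fragment(
    data: bytes,
    track_index: Dict[str, List[ByteRange]],  # hypothetical server-side index
    track: str,
    new_headers: bytes,  # assumed to be regenerated for the selected track
) -> bytes:
    """Return new headers followed by the byte ranges of one track."""
    # Step 1: get the list of byte ranges for the requested track.
    ranges = sorted(track_index[track])
    # Steps 2 and 3: the headers come first, then the copied ranges in
    # their original order (the "after the headers" layout).
    body = b"".join(data[start:end] for start, end in ranges)
    return new_headers + body


# Toy resource: two header bytes, then interleaved audio (a) and video (v).
resource = b"HHaaVVaaVV"
index = {"a1": [(2, 4), (6, 8)], "v1": [(4, 6), (8, 10)]}
audio_only = resolve_track_fragment(resource, index, "a1", b"hh")
```

With interleaved tracks the index holds many small ranges per track, which
is exactly why this lookup is easier on the server than on a client.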

> 
> Since you are recreating the headers of the container format and 
> copying byte ranges after that, are you actually creating a new 
> resource, so doing the "query" case rather than the "fragment" case?

From an implementation point of view, you can send the whole resource to the
UA, or you can create what is, technically speaking, a new one, because the
headers change. However, to the UA, it will look like only a fragment of the
original resource.
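
As an illustration of the '?' versus '#' distinction, the sketch below
reports whether a URI carries its track selection in the query or in the
fragment. It assumes the syntax used in this thread (a "track" name-value
pair, track names separated by semicolons); the function name
track_selection is made up for the example.

```python
from typing import List, Optional, Tuple
from urllib.parse import unquote, urlsplit


def track_selection(uri: str) -> Tuple[Optional[str], List[str]]:
    """Return ("query" | "fragment" | None, list of track names)."""
    parts = urlsplit(uri)
    for kind, component in (("query", parts.query), ("fragment", parts.fragment)):
        # "&" separates name-value pairs; ";" separates track names.
        for pair in component.split("&"):
            name, _, value = pair.partition("=")
            if name == "track" and value:
                return kind, [unquote(t) for t in value.split(";")]
    return None, []


as_fragment = track_selection("http://example.com/video.ogv#track=track1")
as_query = track_selection("http://example.org/media.ogg?track=a1;v1")
```

Only the query form reaches the server as part of the request URI; the
fragment form has to be resolved by the UA or translated into something
else, such as byte range requests.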

> 
> For me, a URI fragment approach means that if the customer changes in 
> his address bar from http://example.com/video.ogv#track=track1 to
> http://example.com/video.ogv#track=track2 , the browser does not throw 
> away what it has previously buffered and does not reload the whole 
> thing - it only fills byte range gaps in its buffer that it has left 
> beforehand since they were not requested to be displayed. Changes of 
> URI fragments are not supposed to initiate new page loads.

But this depends on the underlying implementation of the browser, no? Also,
it does not seem trivial to me to create such an implementation, because you
not only need to fill byte range gaps: the headers also change when new
data is added to a media file.
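
The gap-filling half of that job can at least be sketched. The code below
computes which byte ranges a UA would still have to request when the wanted
ranges change (for example after switching tracks), given what it has
already buffered. The function names are hypothetical, and the sketch
deliberately ignores the harder half, namely updating the headers.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]  # (start, end), end exclusive


def merge_ranges(ranges: List[ByteRange]) -> List[ByteRange]:
    """Sort ranges and merge those that overlap or touch."""
    merged: List[ByteRange] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def missing_ranges(buffered: List[ByteRange], wanted: List[ByteRange]) -> List[ByteRange]:
    """Return the parts of `wanted` that are not covered by `buffered`."""
    have = merge_ranges(buffered)
    gaps: List[ByteRange] = []
    for start, end in merge_ranges(wanted):
        cursor = start
        for h_start, h_end in have:
            if h_end <= cursor:
                continue  # buffered range lies entirely before the cursor
            if h_start >= end:
                break  # buffered range lies entirely after the wanted range
            if h_start > cursor:
                gaps.append((cursor, h_start))  # uncovered stretch
            cursor = max(cursor, h_end)
            if cursor >= end:
                break
        if cursor < end:
            gaps.append((cursor, end))
    return gaps


to_fetch = missing_ranges([(0, 100), (200, 300)], [(50, 250), (400, 500)])
```

Each resulting gap would become one byte range request, so with
interleaved tracks the number of requests can grow quickly.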

Best regards,

Davy

[1] http://lists.w3.org/Archives/Public/public-media-fragment/2009Apr/0010.html

-- 
Davy Van Deursen

Ghent University - IBBT
Department of Electronics and Information Systems - Multimedia Lab
URL: http://multimedialab.elis.ugent.be/dvdeurse
Received on Wednesday, 17 February 2010 12:08:19 UTC