Re: Squid experts from Silvia Pfeiffer on 2008-11-05 (public-media-fragment@w3.org from November 2008)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Thu, 6 Nov 2008 09:21:24 +1100
To: "Yves Lafon" <ylafon@w3.org>
Cc: "Media Fragment" <public-media-fragment@w3.org>
Message-ID: <2c0e02830811051421n4dc53d63ye1ac46753c472576@mail.gmail.com>

Hi Yves, all,

On Mon, Nov 3, 2008 at 11:55 PM, Yves Lafon <ylafon@w3.org> wrote:
> On Mon, 3 Nov 2008, Silvia Pfeiffer wrote:
>> On Mon, Nov 3, 2008 at 9:31 PM, Yves Lafon <ylafon@w3.org> wrote:
>>> On Sat, 1 Nov 2008, Silvia Pfeiffer wrote:
>>>>
>>>> One technical point that was made is that doing time ranges in proxies
>>>> may be really difficult since time is inherently continuous and so the
>>>> continuation from e.g. second 5 to second 6 may not easily be storable
>>>> in a 2-way handshake in the proxy.
>>>
>>> Not for a general proxy, but it makes the case for proxies with a
>>> specific
>>> module to handle such beast.
>>
>> The issue is that for any codec it will be a problem to identify where
>> "second 5" ends and where "second 6" starts. Even a proxy that
>> understands time and codecs cannot be certain that when it receives a
>> packet that goes till second 5 and one that starts at second 6 it can
>> just concatenate them to make a continuous stream. Such concatenation
>> only works on the byte level. It therefore needs to be told not just
>> the time range, but also the byte range that the data maps to.
>
> I don't understand the issue, if you have something that ends at t=5s, and
> something that starts at t=6s, so two fragments, there is 1s missing in the
> middle, so clearly the cache shouldn't try to concatenate those fragments,
> but should keep record of having those two fragments, and concatenate when
> t=5s-6s is available.
> Now if by end=5s you mean 5.999999999999999999...s and starts at 6s, then a
> client might be able to display the whole thing but a server couldn't do the
> concatenation ?


So where 5s ends is a question about making intervals inclusive or
exclusive. Let's go with your understanding. The problem still exists.
Let me try and explain. Also note that we are talking about Web
proxies that do not have the full media resource at hand, while a Web
server that has the full media resource at hand has no such issues.

So let's assume two temporal fragment URIs. One asks for seconds 1-6
and the next one asks for 6-12. The Web proxy is only given a
collection of bytes and told which time segment these map to. It does
not know which byte offsets in the original file they map to. Now, we
cannot be sure that the end of 1-6 is exactly before the beginning of
6-12. There may well be an overlap of bytes between the end of 1-6 and
the beginning of 6-12, because we deal with the compressed domain and
continuous time. In fact, I would be surprised if there is not an
overlap in bytes by at least one codec packet. This tells us that only
the bytes are uniquely identified, while time is not uniquely
identifiable. This means we have to store not just time in the Web
proxy, but also the mapping to bytes. With the 2-way handshake, I do
not think this is possible (but please correct me if I'm wrong).
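
To make the overlap problem concrete, here is a minimal sketch of how a
proxy could join two cached fragments if it were told their byte offsets
as well as their time ranges. The class and function names, and the
example offsets, are assumptions made up for illustration; nothing here
is taken from an existing proxy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedFragment:
    t_start: float  # seconds, as reported to the proxy
    t_end: float
    b_start: int    # first byte offset in the original resource (inclusive)
    b_end: int      # last byte offset in the original resource (inclusive)
    data: bytes


def merge(a: CachedFragment, b: CachedFragment) -> Optional[CachedFragment]:
    """Join two fragments on byte offsets; None if bytes are missing between them."""
    if a.b_start > b.b_start:
        a, b = b, a
    if b.b_start > a.b_end + 1:
        return None  # gap in bytes: keep both fragments separately
    if b.b_end <= a.b_end:
        return a     # b is already fully covered by a
    overlap = a.b_end + 1 - b.b_start  # bytes of b that a already holds
    return CachedFragment(
        t_start=min(a.t_start, b.t_start),
        t_end=max(a.t_end, b.t_end),
        b_start=a.b_start,
        b_end=b.b_end,
        data=a.data + b.data[overlap:],
    )


# Hypothetical resource: the fragments for 1-6s and 6-12s share 10 bytes.
original = bytes(range(256)) * 2
frag_1_6 = CachedFragment(1.0, 6.0, 100, 199, original[100:200])
frag_6_12 = CachedFragment(6.0, 12.0, 190, 299, original[190:300])
joined = merge(frag_1_6, frag_6_12)
```

With only the time ranges, the proxy would have to concatenate the two
payloads blindly and would duplicate the shared bytes. With the byte
offsets it can drop the overlap and detect gaps.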


>>> That said, we have different axes of selection,
>>> and it doesn't fit well the range model.
>>> I was wondering if language selection could be done using
>>> Accept-Language,
>>> in the case you have different language tracks, but in that case you need
>>> to
>>> identify first class URIs for the different language-based variants.
>>
>> When you mean language selection, you talk more generally about
>> selecting different tracks, right? This could be both for audio and
>> for annotations. As for video, we could also have recordings from
>> different angles that could be selected or other tracks. Solving the
>> language selection only solves one part of the track selection
>> problem.
>
> Yes, and language selection can be applied to audio track but also for
> subtitles, making things even worse. But the real issue is are they
> fragments or not?

And also we could have karaoke, image tracks, multiple audio tracks,
multiple video angles. Getting only a certain subset of tracks from
the original media resource can be solved through a fragment
addressing scheme. Whether it is the right way to solve it, I am not
sure either.


>>> We need to discuss that a bit deeper, do we really need to identify the
>>> video+fr soundtrack as a fragment?
>>
>> I don't understand "video+fr soundtrack"... what do you mean?
>
> a video consisting of "moving pictures + french soundtrack", does it need to
> be presented as a fragment ? ie: what are the axes where it is useful to
> define fragments.

Ah ok. :-)


>>>> Instead there was a suggestion to create a codec-independent media
>>>> resource description format that would be a companion format for media
>>>> resources and could be downloaded by a Web client before asking for
>>>> any media content. With that, the Web client would easily be able to
>>>> construct byte range requests from time range requests and could thus
>>>> fully control the download. This would also mean that Web proxies
>>>> would not require any changes. It's an interesting idea and I would
>>>> want to discuss this in particular with Davy. Can such a format
>>>> represent all of the following structural elements of a media
>>>> resource:
>>>> * time fragments
>>>> * spatial fragments
>>>> * tracks
>>>> * named fragments.
>>>
>>> Well, you have byte ranges, but no headers, no metadata. And carrying
>>> part
>>> of the missing payload in headers is a big no.
>>
>> Can you explain this further? I don't quite understand what is the big
>> no and which missing payload you're seeing to be put in which headers
>> (HTTP headers?).
>
> If you are outputting only a byte range of the video, does it contain all
> the needed information to be played? (like format, framerate etc...)
> If not, how do you carry the missing information (ie: the missing part of
> the payload).

So you are saying that even if we only have small changes to make to
media headers (i.e. payload), it is a bad design to deliver these
changes in HTTP headers? In the 4-way handshake proposal the changes
are carried in the payload of the first handshake reply, which
provides the updated media headers and the mapping of the fragment to
byte ranges. I am not sure how to do that in a 2-way handshake other
than through HTTP headers.
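
As a rough sketch of the 4-way idea, assume the first reply is parsed
into a structure holding the updated media headers and the byte range,
and that the second exchange is an ordinary HTTP byte-range request. The
field names `media_headers` and `byte_range` are hypothetical and not
part of any proposal text.

```python
from typing import Dict, Tuple, TypedDict


class FirstReply(TypedDict):
    media_headers: bytes          # updated codec/container headers (payload)
    byte_range: Tuple[int, int]   # inclusive byte offsets of the fragment


def second_request_headers(reply: FirstReply) -> Dict[str, str]:
    """Build the HTTP Range header for the second request."""
    first, last = reply["byte_range"]
    return {"Range": f"bytes={first}-{last}"}


def assemble(reply: FirstReply, body: bytes) -> bytes:
    """Prepend the updated media headers to the byte-range body."""
    return reply["media_headers"] + body


# Hypothetical first reply for a fragment mapped to bytes 190-299.
reply: FirstReply = {"media_headers": b"HDR", "byte_range": (190, 299)}
headers = second_request_headers(reply)
```

Here the media header changes travel as payload and only the byte range
ends up in an HTTP header, so a proxy sees a plain byte-range request.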


Cheers,
Silvia.
Received on Wednesday, 5 November 2008 23:28:02 UTC