RE: Squid experts from Davy Van Deursen on 2008-11-06 (public-media-fragment@w3.org from November 2008)

From: Davy Van Deursen <davy.vandeursen@ugent.be>
Date: Thu, 6 Nov 2008 15:17:59 +0100
To: "'Silvia Pfeiffer'" <silviapfeiffer1@gmail.com>
Cc: "'Media Fragment'" <public-media-fragment@w3.org>
Message-ID: <009001c9401a$7916bb30$6b443190$@vandeursen@ugent.be>
Hi Silvia,

>-----Original Message-----
>From: public-media-fragment-request@w3.org
>[mailto:public-media-fragment-request@w3.org] On Behalf Of Silvia Pfeiffer
>Sent: Thursday, November 06, 2008 12:08 PM
>To: Davy Van Deursen
>Cc: Jack Jansen; Media Fragment
>Subject: Re: Squid experts
>
>
>Hi Davy,
>
>On Thu, Nov 6, 2008 at 8:08 PM, Davy Van Deursen
><davy.vandeursen@ugent.be> wrote:
>> Let me clarify my view on this topic. Suppose we use byte ranges to cache
>> our media fragments (using the four-way handshake approach), then we can
>> distinguish the following scenarios:
>>
>> 1. The media resource meets the two requirements (i.e., fragments can be
>> extracted in the compressed domain and no syntax element modifications are
>> necessary).
>> -> we can cache media fragments of such media resources, because their media
>> fragments are addressable in terms of byte ranges.
>
>Agreed.
>
>
>> 2. Media fragments cannot be extracted in the compressed domain
>> -> transcoding operations are necessary to extract media fragments from such
>> media resources; these media fragments are not expressible in terms of byte
>> ranges. Hence, it is not possible to cache these media fragments.
>
>Agreed.
>
>
>> 3. Media fragments can be extracted in the compressed domain, but syntax
>> element modifications are required
>> -> these media fragments seem to be cacheable :-). For instance, headers
>> containing modified syntax elements could be sent in the first response of
>> the server (as already discussed on this list). However, the latter solution
>> is still not entirely clear to me. What if, for example, multiple headers are
>> changed and these headers are spread across the media resource? What if
>> syntax element modifications are needed in parts that do not apply to the
>> whole media resource? I still don't have the feeling that this solution is
>> generically applicable to all media resources in this scenario.
>
>You are describing a hypothetical codec and throwing all possible
>complexities at it. I think what we need to do instead is to actually
>analyse real encapsulation and compression formats to really
>understand the different situations that we are dealing with. There
>may well be codecs for which this doesn't work. Then we have to fall back
>to scenario 2. I don't think we can deal with one situation only.

Agreed.
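
To make the classification concrete once we look at real formats, here is a
minimal Python sketch of the three scenarios. The class name, the two flags
and the "possibly" outcome for scenario 3 are my own illustrative assumptions,
not an agreed vocabulary or a statement about any particular codec.

```python
from dataclasses import dataclass


@dataclass
class FormatTraits:
    # Both fields are illustrative assumptions, not an agreed vocabulary.
    compressed_domain_extraction: bool  # fragments can be cut without transcoding
    needs_syntax_modifications: bool    # headers or other syntax elements must be rewritten


def scenario(traits: FormatTraits) -> int:
    """Return the scenario number (1, 2 or 3) used in this thread."""
    if not traits.compressed_domain_extraction:
        return 2  # transcoding needed, so no byte-range addressing
    return 3 if traits.needs_syntax_modifications else 1


def byte_range_cacheable(traits: FormatTraits) -> str:
    """Summarise the byte-range caching outcome discussed for each scenario."""
    return {1: "yes", 2: "no", 3: "possibly"}[scenario(traits)]
```

A wiki table of real formats could then simply record these two flags per
format and derive the scenario from them.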

>
>
>
>> Suppose we use time ranges to cache our media fragments, then I see the
>> following pros and cons:
>>
>> Pro:
>> -> caching will work for the three scenarios described above (i.e., for
>> fragments extracted in the compressed domain and for transcoded fragments).
>> Hence, the way of caching is independent of the underlying formats and
>> adaptation operations performed by the server.
>
>I disagree. Scenario 2 cannot cache resources in the way that we
>describe it - with all possibilities of concatenation and
>recomposition. Scenario 2 can only cache individual fragments, not the
>full resource, since each time fragment will consist of different byte
>values than the original resource. Therefore, caching in Web proxies
>doesn't really work any longer. Caching will only work for scenario 1
>and 3.

Unless you use transcoders at a (specialized video) proxy. But I agree that
the transcoder story is probably a bridge too far in this discussion :-).

>> -> four-way handshake can be avoided.
>
>That's a fair enough aim and I'd like to believe we can achieve it.
>But it may be too hard.
>
>
>> Con:
>> -> no support for spatial fragments.
>
>Why? Spatial fragments would just get spatial ranges for caching.

You're right. I focused too much on the 'time' ranges ;-).

>
>
>> -> concatenation of fragments becomes a very difficult and in some cases
>> maybe an impossible task. To be able to join media fragments, the cache
>> needs a perfect mapping of the bytes and timestamps for each fragment.
>> Furthermore, if we want to keep the cache format-independent, such a mapping
>> is not enough. We also need information regarding the byte positions of
>> random access points and their corresponding timestamps. This way, a cache
>> can determine which parts are overlapping when joining two fragments. Note
>> that this kind of information could be modeled by a codec-independent
>> resource description format.
>
>Yes, I think with such a representation of the resource and with the
>server sharing this representation with all the proxies, we should be
>able to do time ranges with a 2-way-handshake only. Is it realistic
>though to create such an overhead in the protocol?

Good question. Using time ranges instead of byte ranges will always add
complexity to the caches. I don't think this is feasible as a
general solution for all caches, but it could be a good solution for
specialized video caches (as Yves pointed out in [1]).
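
To illustrate the kind of mapping I have in mind, here is a minimal Python
sketch. It assumes a hypothetical index of (timestamp, byte offset) pairs for
the random access points of a resource, and it assumes that a fragment can be
cut at those points without any syntax element modifications (scenario 1).
The function names and the index layout are mine, not part of any proposal.

```python
from bisect import bisect_left, bisect_right


def time_range_to_byte_range(index, total_bytes, start, end):
    """Map a time range (seconds) to an inclusive byte range.

    index: sorted list of (timestamp, byte_offset) random access points
           (a hypothetical, codec-independent description of the resource).
    """
    times = [t for t, _ in index]
    # Last random access point at or before the requested start time.
    i = max(bisect_right(times, start) - 1, 0)
    # First random access point at or after the requested end time.
    j = bisect_left(times, end)
    first_byte = index[i][1]
    last_byte = index[j][1] - 1 if j < len(index) else total_bytes - 1
    return first_byte, last_byte


def join_byte_ranges(a, b):
    """Merge two inclusive byte ranges if they overlap or touch, else None."""
    lo, hi = sorted([a, b])
    if hi[0] > lo[1] + 1:
        return None  # a gap remains; the cache cannot concatenate them
    return lo[0], max(lo[1], hi[1])
```

For example, with random access points every two seconds, a cache holding the
fragments for 2.5-4.0 s and 3.0-5.0 s could detect that they share bytes and
store the union once:

```python
index = [(0.0, 0), (2.0, 1000), (4.0, 2500), (6.0, 4000)]
a = time_range_to_byte_range(index, 5000, 2.5, 4.0)
b = time_range_to_byte_range(index, 5000, 3.0, 5.0)
merged = join_byte_ranges(a, b)
```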

>
>
>> Of course, this works only when it is possible
>> to extract the media fragments in the compressed domain. For joining
>> fragments which are the result of transcoding operations, transcoders are
>> needed in the cache. As you said, the latter operation could introduce a
>> loss of quality, but note that this is always possible with transcoding
>> operations (thus, also if they are transcoded at server-side).
>
>I am not even sure we should include any kind of transcoding
>activities into our model. Transcoding creates fundamentally different
>representations for the same media resource, and mostly with a loss of
>quality. I personally don't think we should go down that path.

I agree. 

>
>> Both approaches have pros and cons and for the moment, I don't prefer one
>> over the other. What do you think?
>
>Thanks for making them explicit - that makes the discussion easier. :)
>
>I think we are theorizing a lot and are not actually looking at
>concrete codecs. We should start getting our hands dirty. ;-) By which
>I mean: start classifying the different codecs according to the
>criteria that you have listed above and find out for which we are
>actually able to do fragments and what types of fragments.

I agree. We should create a new page on the wiki, starting from this mail and
from [2]. I will try to work on that in the next few days.

[1]
http://lists.w3.org/Archives/Public/public-media-fragment/2008Nov/0032.html 
[2]
http://lists.w3.org/Archives/Public/public-media-fragment/2008Nov/0003.html 

Best regards,

Davy

-- 
Davy Van Deursen

Ghent University - IBBT
Department of Electronics and Information Systems Multimedia Lab
URL: http://multimedialab.elis.ugent.be
Received on Thursday, 6 November 2008 14:19:08 UTC