Re: [squid-core] W3C media fragments working group

Hi Henrik, all,

I'm cc-ing the media fragments working group since I'm sure they are
interested in your views.

Thanks very much for your thoughts below, which were more than I
expected. At this stage I was only asking whether there is somebody in
squid who would be interested to help us sort through some issues with
Web proxies. Indeed, Yves is also involved with the media fragments
working group, so we have some very good http expertise there. It's
however not enough, since I think Yves is actually disagreeing with
some of what you said below about time range units. His aim is clearly
to get media fragment caching support into Web proxies within one
roundtrip (while the one I suggested, as you might have seen from the
email that I referred Rob to, requires 2 roundtrips). Clearly, one
roundtrip is preferable, but we need to make sure that our designs
are actually implementable.

Anyway, let me start from the beginning. You might believe me when I
say that we have indeed discussed many of the topics that you are
touching upon below in the group while we met 2 weeks ago in France.

First we have to separate between doing fragments client- or
server-side. The focus of the media fragments working group is on
server-side - in particular on large video and audio files, which
would take too long to fully download just to see a fragment (think
e.g. of mobile phones and the cost involved). Client-side fragment
offsets are simple and any current player could easily support them.
Therefore they are regarded as a special case (e.g. when the resource
is already fully at the client or the application requires the full
resource anyway) or a fall-back solution for when server-side
fragments are not possible.

Then, we most certainly have to deal with the complexities of codecs
and media resources. Not all media types are currently capable of
doing what server-side media fragments would require (and in
particular image formats may be completely incapable of it right now).
Those that are capable are of interest to us. For those that aren't,
the fall-back case applies (i.e. full download and then offsetting).

Then we have the syntax challenge. We have already decided that as
syntax goes, the URI fragment specification (as used for HTML) is the
right way to go (there are some issues with clashes with this, but
it's not of interest to our discussion). Because we want to have the
fragmentation happening on the server and URI fragments are a
client-only thing, the fragment has to be transported in protocol
parameters. This is where the time range unit comes in.

Now, we are able to tell the server to create a subresource of the
original resource and deliver it to the client. All is well.
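
As a rough illustration of the step above (the fragment stays on the
client, the request carries it in a range header), here is a minimal
sketch. The "#t=start,end" fragment form and the "seconds" range unit
name are assumptions for illustration only, not agreed syntax, and
fragment_to_request is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def fragment_to_request(uri):
    """Return (uri_without_fragment, extra_headers) for a media URI.

    Assumes a hypothetical temporal fragment of the form "#t=start,end"
    and a hypothetical "seconds" range unit; neither is settled syntax.
    """
    parts = urlsplit(uri)
    headers = {}
    fragment = parts.fragment
    if fragment.startswith("t="):
        # Split "10,20" into start and end; the end may be omitted.
        start, _, end = fragment[2:].partition(",")
        headers["Range"] = f"seconds={start}-{end}"
    # The fragment is never sent to the server, so strip it from the URI.
    bare = urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))
    return bare, headers


if __name__ == "__main__":
    print(fragment_to_request("http://example.com/video.ogv#t=10,20"))
```

A client that does not understand the fragment would simply request the
bare URI with no extra header and fall back to the full download.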

Until the moment that we want Web proxies to also be able to cache the
media fragments. And this is where your squid expertise comes in. We
have several proposals on the table - Raphael will be writing them up.
It would be most interesting to get your input on those proposals.

I would suggest that when we have the proposals written up, we could
post them to you and whoever is interested in the squid team so you
can provide us with feedback. Then you can criticise them and propose
alternatives, which would help us move towards a more practical
solution.

Thanks,
Silvia.


On Sun, Nov 2, 2008 at 10:56 AM, Henrik Nordstrom
<henrik@henriknordstrom.net> wrote:
> Hi Silvia,
>
> I am somewhat involved in the http working group, trying to get RFC2616
> into shape, and this touches the same subject. This has also been
> briefly discussed within the httpwg some time ago. Btw, Yves is also
> active in that same IETF working group.
>
>
> What follows is my initial comments on the problem of media fragments.
> It's a somewhat unordered random bag of thoughts after some minutes
> thinking, not a coherent or complete view as such.
>
>
> Semantically the URI fragment identifier is imho the correct place for
> this kind of information if placed within the URI, and also provides the
> better compatibility with agents who don't know about the syntax or
> protocol extension (they will get and play the whole referenced media,
> not knowing the meaning of an anchor).
>
> HTML has a defined fragment syntax, referencing named anchors within the
> HTML page. Some HTML/DOM/Javascript applications also abuse the fragment
> identifier as local state storage but that is not the intended use. There is
> no defined fragment syntax for media files such as mpeg, mp3, jpeg, png
> etc that I am aware of.
>
> The likelihood that a defined fragment syntax for media files will
> really collide with existing unspecified uses of the fragment identifier
> for such objects is relatively small as fragments do not have a defined
> meaning for the kind of media discussed, nor any active local processing
> such as javascript etc which might abuse the fragment identifier as
> local storage.
>
> Using a fragment identifier requires the user-agent to be able to
> resolve this into suitable ranges etc one way or another, which in most
> cases is an entirely realistic requirement. This could be by parsing the
> media file and seeking until the correct location is found (i.e. using
> range requests if using an HTTP transport), or optimized using a
> sideband method/object, or integral in the transport method used
> whatever that is for the resource in question.
>
> You could extend HTTP with a new "time" range unit, but this will not
> work very well via proxy caches or caches in general as they only know
> about bytes, and also requires a considerable amount of changes in clients
> and servers to deploy, and also makes caching in general harder as time
> and byte offsets are two different dimensions with their own properties.
> It's not trivial to map between the two, and the likelihood of
> different implementations giving slightly different mappings between
> bytes and time of a given media file is very large making merges of
> different time based parts of the same object a somewhat risky business.
> Additionally this won't work for spatial references which I gather is
> also one of your intended targets (HTTP only provides for a single range
> dimension, and it's not supposed to recode the object in doing so).
>
> I certainly don't think jumping via an intermediary URL by default is a
> good thing. If the user agent already has the linked media it SHOULD be
> able to resolve the fragment directly and locally. Consider for example
> the case of linking to a movie file by file://, or a mail attachment or
> usenet news attachment, the same problem applies there, so any fragment
> identifiers used in the URI should be protocol independent. It's fine to
> design optimizations for use within the http(s):// protocol to avoid
> redundant transfer of non-interesting parts of the referenced media file
> (transfer, not display), but it should not require this to work or be
> part of the URL syntax specification. I would advise not to look into
> optimization of transfer before the actual semantics of linking to
> fragments of media files is clearly resolved.
>
> For media embedded in HTML (or similar rich content) the time fragment
> may be indicated in many other ways than the URL. If for example using
> the HTML <object> tag then the X fragment (time/spatial/whatever) may be
> expressed as optional HTML parameters to the object, outside of the URL
> as such. But this does not exclude the need to be able to address
> media fragments in a uniform manner.
>
> Spatial references are tricky to resolve efficiently however. As some of
> your members have already identified the support for spatial references
> in compressed domains of jpeg etc is technically almost impossible, and
> to optimize transfer in such cases the object needs to be recoded which
> creates another object with its own representation. To HTTP caches this
> is probably best expressed as variants of the same resource. It's not
> really a partial range as such, and multiple such "ranges" can't be
> easily merged into a larger combined representation without detailed
> knowledge of the media format in question. This approach involves adding
> a new request header telling the server the desired dimensions.
>
> It's also possible to encode this using query parameters, but you then
> run into some issues in invalidation requirements and also mapping
> issues if the HTTP application already uses query parameters, and also
> creates issues in local linking directly to files without relying on a
> remote server to perform the recoding. But on the other hand using query
> parameters does provide an implementation shortcut giving clients without
> native support for such fragment references direct access to just that
> content while you normally can't arbitrarily add headers to sent
> requests in existing clients.
>
> You are welcome to forward this message to the media fragments working
> group if you desire.
>
> Regards
> Henrik
>
> On sön, 2008-11-02 at 06:45 +1100, Robert Collins wrote:
>> On Sat, 2008-11-01 at 19:56 +1100, Silvia Pfeiffer wrote:
>> > Hi Rob,
>> >
>> > We discussed the need of the W3C media fragments group to actually
>> > have a Web proxy expert partake.
>> >
>> > If you or somebody else in the squid community is interested, I can
>> > put them in contact with the chair of the working group to sort out
>> > how to get you/them involved.
>> >
>> > The charter of the group is at
>> > http://www.w3.org/2008/01/media-fragments-wg.html and the wiki at
>> > http://www.w3.org/2008/WebVideo/Fragments/wiki/Main_Page . One
>> > particular discussion about how to make fragments proxy-able is at
>> > http://lists.w3.org/Archives/Public/public-media-fragment/2008Oct/0060.html
>> > .
>> >
>> > It would be great to get that expertise into the group!
>> >
>> > Cheers,
>> > Silvia.
>>
>> Hi Silvia; I suspect Henrik is the person that your peer knows; I don't
>> know that anyone of the squid core has a day-a-week to commit to this,
>> but I'm sure one/some of us would be up for joining the mailing list and
>> perhaps coming along to the next actual get-together.
>>
>> I've cc'd henrik and the squid-core list (which is private) to get a
>> reasonably wide distribution on this.
>>
>> -Rob
>>
>

Received on Sunday, 2 November 2008 05:02:05 UTC