Re: Processing requirements from Silvia Pfeiffer on 2010-01-05 (public-media-fragment@w3.org from January 2010)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Tue, 5 Jan 2010 23:35:58 +1100
To: Philip Jägenstedt <philipj@opera.com>
Cc: Media Fragment <public-media-fragment@w3.org>
Message-ID: <2c0e02831001050435v7b6814d6tbbbee124d2c0774a@mail.gmail.com>
On Tue, Jan 5, 2010 at 10:57 PM, Philip Jägenstedt <philipj@opera.com> wrote:
> On Tue, 05 Jan 2010 03:16:21 +0100, Silvia Pfeiffer
> <silviapfeiffer1@gmail.com> wrote:
>
>> On Mon, Jan 4, 2010 at 11:05 PM, Philip Jägenstedt <philipj@opera.com>
>> wrote:
>>>
>>> On Wed, 30 Dec 2009 04:33:36 +0100, Silvia Pfeiffer
>>> <silviapfeiffer1@gmail.com> wrote:
>>>
>>>> On Wed, Dec 30, 2009 at 3:20 AM, Philip Jägenstedt <philipj@opera.com>
>>>> wrote:
>>>>>
>>>>> On Tue, 29 Dec 2009 15:03:50 +0100, Silvia Pfeiffer
>>>>> <silviapfeiffer1@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Now, I'd say that we're probably safe using "&" as a separator for URI
>>>>>> queries, since that has been specified in the CGI "standard" and has
>>>>>> continuously been applied, even if never formally specified. It is a
>>>>>> de-facto standard.
>>>>>
>>>>> I agree that it's safe, but we must formally specify it, either by
>>>>> referencing an existing spec (which I have failed to find) or by
>>>>> specifying it ourselves.
>>>>
>>>> A proper spec doesn't exist. All we have is the CGI spec. It's been my
>>>> greatest problem with the temporal URI spec for years from a
>>>> "completeness" point of view, but actually has never been a practical
>>>> problem, since ppl have just assumed the de-facto standard.
>>>>
>>>>
>>>>>> As for URI fragments, the idea is to keep it in sync with URI queries
>>>>>> and thus we also used the "&".
>>>>>
>>>>> I certainly agree with keeping them in sync, but the fragment component
>>>>> syntax is the one we can specify ourselves and it will work on many
>>>>> existing server configurations as a bonus.
>>>>
>>>> Actually: no, we cannot define the fragment component syntax for any
>>>> video or audio mime type. In fact, the URI specification says that the
>>>> fragment syntax is specified by the owner of the mime type - i.e. the
>>>> owner of video/ogg or video/mpeg4 (and audio) in the HTML5 case. All
>>>> that we can realistically do is provide a recommendation for mime type
>>>> owners to adopt our specification. We cannot really make an
>>>> enforceable standard. OTOH, ppl have been waiting for such a spec, so
>>>> they will gladly adopt it rather than create their own.
>>>
>>> Thanks, I didn't know this. It seems then that we can't reasonably state
>>> any conformance requirements at all in terms of the syntax of the query or
>>> fragment and rather must do it in terms of abstract name/value. This is
>>> actually good news to me and I will write a concrete suggestion on how to
>>> handle it in my next mail.
>>
>> I think that is the safest approach for now. I vaguely remember that
>> Yves had a chat with other W3C members - including TBL - who suggested
>> just doing a normative specification for media resources and ignoring
>> the fact that fragment or query syntax is not normally standardised.
>> Maybe Yves can clarify this. I don't think it has much of an effect on
>> our spec though.
>>
>>
>>>>>> Now, both approaches (URI fragment and query) may conflict with some
>>>>>> already created specifications (as analysed and listed in
>>>>>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/#ExistingSchemes).
>>>>>> This is unavoidable when standardising the use of something that has
>>>>>> been in the wild so far.
>>>>>>
>>>>>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#processing-overview-standardisation
>>>>>> talks about this problem and makes clear that harmonisation is
>>>>>> necessary and that it is not possible to "prescribe" this format.
>>>>>> Which probably means that media fragments will always be a
>>>>>> recommendation rather than a standard.
>>>>>
>>>>> Yes, we will conflict with e.g. your Temporal URI spec and MPEG-21,
>>>>> which is to be expected as MF is supposed to supersede both.
>>>>
>>>> Well, I'm not actually sure MPEG-21 will adopt it. But the thing is:
>>>> even if the mime type owners don't accept it, what actually counts is
>>>> what the browser vendors implement. :-)
>>>>
>>>>
>>>>> However, existing query component schemes aren't really specs as such,
>>>>> they are actually defined by their (usually single) implementation.
>>>>> However, if we agree that MF should only normatively define the syntax
>>>>> and processing rules for URI *fragments*, then we don't need to discuss
>>>>> the query component issue any further.
>>>>
>>>> Some past discussions have found that we need to do both. The URI
>>>> queries approach has its use cases where you want to create a shorter
>>>> form from a longer resource - e.g. a playlist mashed up from segments
>>>> from multiple videos. We have embraced such use cases in the
>>>> requirements specification and they would require the use of URI
>>>> queries.
>>>>
>>>> To be complete, it is also possible to not use URI queries, but to use
>>>> some kind of REST interface, as you have mentioned before, e.g.
>>>> http://www.example.com/video/track=video1/track=audio2/t=20,80 . But
>>>> this resource has nothing at all to do with the original resource,
>>>> which may be http://www.example.com/video/, so caching is impossible.
>>>> Using URI queries at least provides a means to enable caching and to
>>>> continue having the link back to the original resource.
>>>
>>> Using the path or the query look equivalent to me, both are specific to a
>>> specific server configuration. Do caches really treat
>>> http://www.example.com/video/track=video1/track=audio2/t=20,80 and
>>> http://www.example.com/video?track=video1&track=audio2&t=20,80 in any way
>>> differently with respect to the "original" resource
>>> http://www.example.com/video/?
>>
>> I've checked back with caching of queries and it's actually worse for
>> URI queries than for the REST-style resources: URI queries are often
>> not cached at all, since it is assumed they come from forms, which are
>> highly volatile. So, you're probably right and they are fairly
>> similar.
>
> OK, since it's not possible to differentiate media fragment URIs from other
> URIs in a fool-proof manner, I guess proxies can't change that behavior
> without breaking sites that accidentally use MF-like syntax.

The distinction doesn't come through naming conventions, but only
through the MIME type. Proxies can identify the MIME type and do
something different when they realise they're dealing with media
resources. It is always possible for a proxy to do more than what it
currently does wrt caching, byte range requests, etc.
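
A minimal sketch of the MIME-type check described above, in Python. The
function name is hypothetical and the rule (treat any audio/* or video/*
Content-Type as a media resource) is an assumption made for illustration;
a real proxy might also need to handle types such as application/ogg.

```python
def is_media_resource(content_type):
    """Return True if a Content-Type header value names an audio/* or video/* type."""
    # Drop parameters such as "; codecs=..." and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    major, _, minor = mime.partition("/")
    # Assumption: only the top-level type decides; the subtype just has to exist.
    return major in ("audio", "video") and minor != ""


if __name__ == "__main__":
    for header in ("video/ogg", "audio/mpeg; charset=binary", "text/html"):
        print(header, "->", is_media_resource(header))
```

The point is only that the decision is driven by the response's type, not
by what the URI looks like.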


>>> Is the idea that caches should assume that URLs which happen to look
>>> like they use media fragments syntax in the query component are related
>>> to the URL with the query component stripped?
>>
>> Not by default. We had an intelligent protocol that included mapping
>> queries to HTTP byte range requests, thus making them cacheable (see
>> also the old temporal URI fragments spec at
>> http://annodex.net/TR/draft-pfeiffer-temporal-fragments-03.txt), going
>> back to existing mechanisms.
>>
>>> This sounds very fragile to me, shouldn't this be done with new HTTP
>>> headers so that the caching proxy doesn't need to be concerned with
>>> parsing MF? Something like Original-Location? I haven't followed the
>>> server-part of MF very well, so perhaps I'm missing something.
>>
>> Yup, there are some new HTTP headers involved, see
>>
>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#processing-protocol-proxy
>> as applied to URI queries. Something we haven't agreed on yet.
>
> This looks a lot more robust. Caching of resources that use the query
> component seems rather messy, but it looks like the current spec steers
> clear of that.

All it does right now is refer back to that section, which should
probably also work for queries.


>>>> OTOH there are a lot of issues to deal with when using queries. We can
>>>> only address a small part of the URI query possibilities in the MF
>>>> spec, namely the one that overlaps with the spec we're creating for
>>>> URI fragments. That has been the basis of our decisions so far.
>>>>
>>>> Why do you think URI queries are so much more of a problem? I wasn't
>>>> able to read that out of the irc discussion either. Standardisation of
>>>> how to create URI queries is useful, since then there are compatible
>>>> naming conventions across servers and clients and applications can
>>>> rely on things working the way they'd expect to. From a HTML5 POV, URI
>>>> queries don't matter, since they don't concern the browser. But when
>>>> specifying URIs, one has to think far beyond just the browser, IMHO.
>>>
>>> I don't think that there's a problem with using the query component to
>>> "apply" the fragment server-side at all, that's very useful. I think this
>>> is a spec layering problem mostly. Certainly browsers don't need to care,
>>> but I still want the whole of the spec to be consistent and robust, not
>>> just "my" parts.
>>
>> Ok, it seems we agree. And yet I seemed to read out of your previous
>> emails that you want to remove the URI query related sections
>> completely. Did I misunderstand?
>
> I was under the impression that "media fragment URIs" was supposed to be a
> subset of URIs with fragment component, and that we thus cannot make any
> normative requirements for the query component. I may have gotten a bit too
> excited, but just wanted to make some parts non-normative. In any event, it
> seems this issue is resolved simply by being clear about some definitions
> and not defining validity in terms of fragment/query component syntax, but
> rather in terms of name-value lists.

OK, that's good. Any suggestions that you have of improving that
reading are most welcome.
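
To illustrate the name-value-list idea, here is a rough Python sketch. It
assumes "&" as the separator (the de-facto convention discussed earlier in
this thread) and uses the standard library's urllib.parse; the function
name is hypothetical, and parse_qsl's form-style decoding (e.g. "+" to
space) is a simplification, not something the MF draft prescribes.

```python
from urllib.parse import urlsplit, parse_qsl


def name_value_list(uri, component="fragment"):
    """Return the ordered (name, value) pairs from a URI's fragment or query."""
    parts = urlsplit(uri)
    raw = parts.fragment if component == "fragment" else parts.query
    # keep_blank_values=True so that "t=" is not silently dropped.
    return parse_qsl(raw, keep_blank_values=True)


if __name__ == "__main__":
    q = "http://www.example.com/video?track=video1&track=audio2&t=20,80"
    f = "http://www.example.com/video#track=video1&track=audio2&t=20,80"
    print(name_value_list(q, "query"))
    # [('track', 'video1'), ('track', 'audio2'), ('t', '20,80')]
    print(name_value_list(q, "query") == name_value_list(f))  # True
```

Both components reduce to the same list, so validity can be stated on the
list without referring to where in the URI it came from.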


> There is, however, one part I'm quite puzzled by: should it be valid to
> include server-specific parts in the query string mixed in with MF syntax?

Sure! There are plenty more services media servers can offer beyond
mere media fragment delivery.

> Say someone already has a resource http://example.com/getvideo?id=42 (where
> 42 might be a database row id or something). id here doesn't refer to the id
> in MF. Is it possible to add MF syntax like 't=5' on top of this? It would
> be valid syntax but id would be ambiguous. Should it be valid to mix other
> pre-existing names that don't collide, like
> http://example.com/getvideo?foo=42&t=5 ? It seems we are making it very
> difficult to migrate any URLs that already use the query component to use
> MF. This isn't really an issue in the fragment case as there's not really a
> lot (any?) existing use of the URI fragment that we could trample on.

Yeah, I think that's the advantage. The main query strings ppl are
after are the ones we are defining here. Then there are less important
ones, such as format="jpg" asking for a time offset to be returned as
a thumbnail etc. YouTube has a gazillion of them and anyone wanting to
implement a good media server is highly advised to check out YouTube's
query (or rather: fragment) parameters to see what is possible.
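
For the getvideo examples quoted above, a server could separate the names
it treats as media fragment dimensions from its own parameters roughly as
follows. This is only a sketch in Python: the set of MF names used here
(t, track, xywh, id) and the function name are assumptions for
illustration, and "&" is assumed as the separator.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumption: the names treated as media fragment dimensions.
MF_NAMES = {"t", "track", "xywh", "id"}


def split_query(uri):
    """Split a URI's query into (media fragment pairs, server-specific pairs)."""
    pairs = parse_qsl(urlsplit(uri).query, keep_blank_values=True)
    media = [(n, v) for n, v in pairs if n in MF_NAMES]
    other = [(n, v) for n, v in pairs if n not in MF_NAMES]
    return media, other


if __name__ == "__main__":
    print(split_query("http://example.com/getvideo?foo=42&t=5"))
    # ([('t', '5')], [('foo', '42')])
    print(split_query("http://example.com/getvideo?id=42&t=5"))
    # ([('id', '42'), ('t', '5')], [])
```

The second call shows the ambiguity raised above: a pre-existing "id"
parameter is indistinguishable from the MF one once the names collide.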

Cheers,
Silvia.
Received on Tuesday, 5 January 2010 12:36:51 UTC