
Re: ACTION-187: extensibility and parsing

From: Philip Jägenstedt <philipj@opera.com>
Date: Fri, 24 Sep 2010 11:09:04 +0200
To: public-media-fragment@w3.org
Message-ID: <op.vji41eo9sr6mfa@kirk>
On Fri, 24 Sep 2010 10:43:32 +0200, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> On Fri, Sep 24, 2010 at 6:09 PM, Philip Jägenstedt
> <philipj@opera.com> wrote:
>> On Fri, 24 Sep 2010 06:56:33 +0200, Davy Van Deursen
>> <davy.vandeursen@ugent.be> wrote:
>>> Quoting Silvia Pfeiffer <silviapfeiffer1@gmail.com>:
>>>> On Wed, Sep 22, 2010 at 9:12 PM, Philip Jägenstedt
>>>> <philipj@opera.com> wrote:
>>>>> As requested, a short summary of the long-standing issue of syntax,
>>>>> parsing and how that relates to extensibility.
>>>>>
>>>>> By extensibility I am not primarily talking about 3rd parties
>>>>> extending MF, but about our own possibilities of updating the spec
>>>>> after MF 1.0. For the purpose of discussion, assume that we want to
>>>>> add a dimension for filtering the audio, e.g., freq=300,3000 to keep
>>>>> only the part of the audio that corresponds (approximately) to human
>>>>> voice (300Hz-3000Hz).
>>>>>
>>>>> How will implementations of MF 1.0 handle t=10,500&freq=300,3000 ?
>>>>> This is the core point of disagreement, and the question is really
>>>>> about how MF 1.0 parsers should work. Leaving it undefined is not a
>>>>> good option, as the history clearly shows. Two other options have
>>>>> been on the table:
>>>>>
>>>>> 1. Require that parsing follow a strict ABNF syntax like the one we
>>>>> have. Since freq is not part of the MF 1.0 syntax, parsing
>>>>> t=10,500&freq=300,3000 will fail and the whole fragment will be
>>>>> ignored, including t=10,500.
>>>>>
>>>>> 2. Require that parsing follow an algorithm or a more forgiving ABNF
>>>>> syntax. The concrete suggestion I've made is that the algorithm or
>>>>> syntax should match how query strings work. That is, a list of
>>>>> key-value pairs is formed by splitting the string on & and =. As a
>>>>> second step, that list is traversed to match the keys against the
>>>>> dimensions, and each value is parsed according to the ABNF syntax of
>>>>> its dimension. Crucially, unrecognized/invalid keys or values are
>>>>> ignored. That means that in the above example, the time dimension
>>>>> will keep working even if an unrecognized (to an MF 1.0
>>>>> implementation) freq dimension is used.
>>>>>
>>>>> Note: Neither 1 nor 2 is a requirement to use any specific
>>>>> implementation technique, only to behave *as if* you are, which
>>>>> still leaves plenty of room for different approaches.
>>>>>
>>>>> I strongly favor option number 2, and see these benefits:
>>>>>
>>>>> * It works like query strings, just like one would expect from
>>>>> looking at the syntax. The algorithm I've suggested is actually from
>>>>> testing query string parsing in PHP, ASP, ASP.NET, CGI.pl and JSP,
>>>>> as reported earlier on this list.
>>>>>
>>>>> * It's simpler for implementors, as we won't have to implement
>>>>> everything at once. This is likely what's going to happen, as the
>>>>> time dimension is ready to implement, while it is still not clear
>>>>> how to apply the named dimension to e.g. a WebM or Ogg resource.
>>>>>
>>>>> * It's better for extensibility, as adding new dimensions doesn't
>>>>> break all existing implementations. Imagine if adding a new element
>>>>> to HTML would cause pages to render completely blank in all existing
>>>>> browsers. Not even XHTML is that strict.
>>>>>
>>>>> Please comment, we need to reach some kind of consensus on this soon
>>>>> and move on. If we can agree on what we want, we can then discuss
>>>>> how to change the spec accordingly (algorithm or ABNF, etc...)
>>>> I also strongly favor option number 2. I don't think anything else
>>>> makes sense, actually, because we would fail to interoperate with
>>>> other schemes that use fragments and queries on media resources.
>>>> Only name-value pairs that do not parse according to our ABNF will
>>>> be ignored from the viewpoint of media fragments. They can be used
>>>> by the browser or server for other purposes.
>>> Same opinion here, option 1 doesn't seem to make sense. However,
>>> should we allow any unknown constructions in the URI fragment or
>>> just key-value pairs with an unknown key? For example:
>>> - t=10,500&freq=300,3000: should be a valid fragment IMO, as
>>> indicated by Philip's arguments;
>>> - t=10,500&foo: is this a valid media fragment? According to Philip's
>>> parsing algorithm, I think it is not. From an extension point of
>>> view, disallowing such a construction should be fine since we can
>>> rewrite this as t=10,500&foo=true if we want to obtain key-value
>>> pairs. Note that I'm not in favor of allowing other things than
>>> key-value pairs, I just wanted to point out this case.
>> The ABNF I suggested in
>> <http://lists.w3.org/Archives/Public/public-media-fragment/2010Aug/0005.html>
>> isn't complete, it's just the first level defining name-value pairs.
>> I think that we should define validity in a way that makes validators
>> warn about things that aren't part of MF 1.0, to help authors find
>> typos, etc. There are many ways we could achieve that spec-wise, if
>> we agree on what we want. Validity and parsing can and should be
>> separate, so we don't need to agree on exact details for the purposes
>> of this discussion.
> Assuming everyone is on board with that (which, of course, isn't clear
> yet) - would you be able to come up with spec text for this? You seem
> to have an idea in your head already what it should look like, so it
> would be good to build on that.

Sure, I could write some spec text. As an FYI, these are the options I see:

1. Just use the ABNF we have now and let parsing be completely separate
from it.

2. Define a name-value syntax and say that parsers should use that to get  
name-value pairs (simple because it's equivalent to splitting on & and =).  
Then say that a valid Media Fragment is one where all the names and values  
match the dimensions and their corresponding syntax.
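
To illustrate what option 2 could mean for a validator, here is a rough
Python sketch. It is only an illustration under assumptions, not spec
text: the VALUE_SYNTAX table, the function names and the simplified
seconds-only pattern for t are all made up for this example, and the
real per-dimension ABNF is richer than this.

```python
import re

# Hypothetical table of per-dimension value syntax. Only a simplified,
# seconds-only form of the time dimension is modelled here (assumption).
_NUM = r"\d+(?:\.\d+)?"
VALUE_SYNTAX = {
    "t": re.compile(rf"^(?:{_NUM}(?:,{_NUM})?|,{_NUM})$"),
}


def name_value_pairs(fragment):
    """Split on & and =; items without '=' get a value of None."""
    pairs = []
    for item in fragment.split("&"):
        name, sep, value = item.partition("=")
        pairs.append((name, value if sep else None))
    return pairs


def validation_warnings(fragment):
    """Return one warning per item that is not part of the known syntax."""
    warnings = []
    for name, value in name_value_pairs(fragment):
        if value is None:
            warnings.append(f"'{name}' is not a name=value pair")
        elif name not in VALUE_SYNTAX:
            warnings.append(f"unknown dimension '{name}'")
        elif not VALUE_SYNTAX[name].match(value):
            warnings.append(f"bad value for '{name}': '{value}'")
    return warnings


def is_valid(fragment):
    """A fragment is valid only if every name and value matches."""
    return not validation_warnings(fragment)
```

With this sketch, t=10,500&freq=300,3000 is reported as not valid
(unknown dimension freq), which is independent of whether a parser
would still apply t=10,500.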

I'll not go further into discussion about these spec-writing details, as  
the purpose of this thread is to reach consensus on how parsing should  
work, and thus what kind of extensibility we get.
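
To make the two parsing behaviours under discussion concrete, here is a
minimal Python sketch. Everything in it is an assumption made for
illustration: the function names, the KNOWN_DIMENSIONS table, the
seconds-only time syntax, skipping items without "=", and letting a
later duplicate key overwrite an earlier one are not taken from the
spec. Percent-decoding is left out entirely.

```python
import re

# Simplified, seconds-only stand-in for the time dimension (assumption).
_TIME = re.compile(r"^(\d+(?:\.\d+)?)?(?:,(\d+(?:\.\d+)?))?$")


def parse_time(value):
    """Return (start, end) in seconds, or None if the value is invalid."""
    m = _TIME.match(value)
    if not m or (m.group(1) is None and m.group(2) is None):
        return None
    start = float(m.group(1)) if m.group(1) else 0.0
    end = float(m.group(2)) if m.group(2) else None
    return (start, end)


# Hypothetical table of the dimensions this implementation understands.
KNOWN_DIMENSIONS = {"t": parse_time}


def parse_lenient(fragment):
    """Query-string style: split on & and =, ignore what is not understood."""
    result = {}
    for item in fragment.split("&"):
        name, sep, value = item.partition("=")
        parser = KNOWN_DIMENSIONS.get(name) if sep else None
        if parser is None:
            continue  # unknown key, or no '=' at all: ignore this item
        parsed = parser(value)
        if parsed is not None:
            result[name] = parsed  # a later duplicate overwrites (assumption)
    return result


def parse_strict(fragment):
    """Strict style: any unknown or invalid item discards the whole fragment."""
    result = {}
    for item in fragment.split("&"):
        name, sep, value = item.partition("=")
        parser = KNOWN_DIMENSIONS.get(name) if sep else None
        parsed = parser(value) if parser else None
        if parsed is None:
            return {}
        result[name] = parsed
    return result
```

For t=10,500&freq=300,3000 the lenient version still yields the time
range, while the strict version yields nothing at all.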

Philip Jägenstedt
Core Developer
Opera Software
Received on Friday, 24 September 2010 09:09:47 UTC
