- From: Philip Jägenstedt <philipj@opera.com>
- Date: Fri, 24 Sep 2010 10:09:28 +0200
- To: public-media-fragment@w3.org
On Fri, 24 Sep 2010 06:56:33 +0200, Davy Van Deursen <davy.vandeursen@ugent.be> wrote:

> Quoting Silvia Pfeiffer <silviapfeiffer1@gmail.com>:
>> On Wed, Sep 22, 2010 at 9:12 PM, Philip Jägenstedt <philipj@opera.com> wrote:
>>
>>> As requested, a short summary of the long-standing issue of syntax,
>>> parsing and how that relates to extensibility.
>>>
>>> By extensibility I am not primarily talking about 3rd parties
>>> extending MF, but about our own possibilities of updating the spec
>>> after MF 1.0. For the purpose of discussion, assume that we want to
>>> add a dimension for filtering the audio, e.g., freq=300,3000 to keep
>>> only the part of the audio that corresponds (approximately) to human
>>> voice (300Hz-3000Hz).
>>>
>>> How will implementations of MF 1.0 handle t=10,500&freq=300,3000 ?
>>> This is the core point of disagreement, and the question is really
>>> about how MF 1.0 parsers should work. Leaving it undefined is not a
>>> good option, as the history clearly shows. Two other options have
>>> been on the table:
>>>
>>> 1. Require that parsing follow a strict ABNF syntax like the one we
>>> have. Since freq is not part of the MF 1.0 syntax, parsing
>>> t=10,500&freq=300,3000 will fail and the whole fragment will be
>>> ignored, including t=10,500.
>>>
>>> 2. Require that parsing follow an algorithm or a more forgiving ABNF
>>> syntax. The concrete suggestion I've made is that the algorithm or
>>> syntax should match how query strings work. That is, a list of
>>> key-value pairs is formed by splitting the string on & and =. As a
>>> second step, that list is traversed to match the keys against the
>>> dimensions, and each value is parsed according to the ABNF syntax of
>>> its dimension. Crucially, unrecognized/invalid keys or values are
>>> ignored. That means that in the above example, the time dimension
>>> will keep working even if an unrecognized (to a MF 1.0
>>> implementation) freq dimension is used.
>>>
>>> Note: Neither 1 nor 2 is a requirement to use any specific
>>> implementation technique, only to behave *as if* you are, which
>>> still leaves plenty of room for different approaches.
>>>
>>> I strongly favor option number 2, and see these benefits:
>>>
>>> * It works like query strings, just like one would expect from
>>> looking at the syntax. The algorithm I've suggested is actually from
>>> testing query string parsing in PHP, ASP, ASP.NET, CGI.pl and JSP,
>>> as reported earlier on this list.
>>>
>>> * It's simpler for implementors, as we won't have to implement
>>> everything at once. This is likely what's going to happen, as the
>>> time dimension is ready to implement, while it is still not clear
>>> how to apply the named dimension to e.g. a WebM or Ogg resource.
>>>
>>> * It's better for extensibility, as adding new dimensions doesn't
>>> break all existing implementations. Imagine if adding a new element
>>> to HTML would cause pages to render completely blank in all existing
>>> browsers. Not even XHTML is that strict.
>>>
>>> Please comment, we need to reach some kind of consensus on this soon
>>> and move on. If we can agree on what we want, we can then discuss
>>> how to change the spec accordingly (algorithm or ABNF, etc...)
>>
>> I also strongly favor option number 2. I don't think anything else
>> makes sense, actually, because we would fail to interoperate with
>> other schemes that use fragments and queries on media resources. Only
>> name-value pairs that do not parse according to our ABNF will be
>> ignored from the viewpoint of media fragments. They can be used by
>> the browser or server for other purposes.
>
> Same opinion here, option 1 doesn't seem to make sense. However,
> should we allow any unknown constructions in the URI fragment or just
> key-value pairs with an unknown key? For example:
>
> - t=10,500&freq=300,3000: should be a valid fragment IMO, as indicated
>   by Philip's arguments;
> - t=10,500&foo: is this a valid media fragment? According to Philip's
>   parsing algorithm, I think it is not. From an extension point of
>   view, disallowing such a construction should be fine since we can
>   rewrite this as t=10,500&foo=true if we want to obtain key-value
>   pairs.
>
> Note that I'm not in favor of allowing other things than key-value
> pairs, I just wanted to point out this case.

The ABNF I suggested in
<http://lists.w3.org/Archives/Public/public-media-fragment/2010Aug/0005.html>
isn't complete; it only covers the first level, the one that defines
name-value pairs.

I think that we should define validity in a way that makes validators
warn about things that aren't part of MF 1.0, to help authors find
typos, etc. There are many ways we could achieve that spec-wise, if we
agree on what we want. Validity and parsing can and should be separate,
so we don't need to agree on exact details for the purposes of this
discussion.

-- 
Philip Jägenstedt
Core Developer
Opera Software
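To make the difference between the two options concrete, here is a rough sketch of both behaviours for the example fragments discussed in this thread. It is an illustration under assumptions, not spec text: the function names, the KNOWN_DIMENSIONS table and the simplified time syntax (plain seconds only, optional "npt:" prefix) are all hypothetical, percent-decoding is left out, and a later duplicate key simply overwrites an earlier one. Silently skipping a bare token such as "foo" is one possible parsing choice; whether a validator should warn about it is a separate question.

```python
import re

# Simplified stand-in for the time dimension's ABNF: "start", "start,end"
# or ",end" in plain seconds, with an optional "npt:" prefix. (Assumption:
# the real MF 1.0 syntax also covers clock formats, omitted here.)
TIME_RE = re.compile(r"^(?:npt:)?(\d+(?:\.\d*)?)?(?:,(\d+(?:\.\d*)?))?$")


def parse_time(value):
    """Return (start, end) in seconds, or None if the value is invalid."""
    m = TIME_RE.match(value)
    if not m or (m.group(1) is None and m.group(2) is None):
        return None
    start = float(m.group(1)) if m.group(1) is not None else 0.0
    end = float(m.group(2)) if m.group(2) is not None else None
    if end is not None and end <= start:
        return None
    return (start, end)


# Hypothetical table of dimensions known to this "MF 1.0" implementation.
KNOWN_DIMENSIONS = {"t": parse_time}


def parse_fragment(fragment):
    """Option 2: split on & and =, keep what we understand, ignore the rest."""
    result = {}
    for pair in fragment.split("&"):
        name, sep, value = pair.partition("=")
        if not sep:
            continue  # bare token such as "foo": not a name-value pair
        parser = KNOWN_DIMENSIONS.get(name)
        if parser is None:
            continue  # unrecognized dimension such as "freq"
        parsed = parser(value)
        if parsed is None:
            continue  # recognized dimension, invalid value
        result[name] = parsed
    return result


def parse_fragment_strict(fragment):
    """Option 1, for contrast: one bad pair voids the whole fragment."""
    result = {}
    for pair in fragment.split("&"):
        name, sep, value = pair.partition("=")
        parser = KNOWN_DIMENSIONS.get(name) if sep else None
        parsed = parser(value) if parser else None
        if parsed is None:
            return {}
        result[name] = parsed
    return result


if __name__ == "__main__":
    print(parse_fragment("t=10,500&freq=300,3000"))
    print(parse_fragment_strict("t=10,500&freq=300,3000"))
```

With the forgiving parser, t=10,500 survives next to the unknown freq dimension; with the strict one, the same input yields nothing at all.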
Received on Friday, 24 September 2010 08:10:11 UTC