Re: Processing requirements

On Wed, 30 Dec 2009 04:33:36 +0100, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> On Wed, Dec 30, 2009 at 3:20 AM, Philip Jägenstedt <philipj@opera.com>  
> wrote:
>> On Tue, 29 Dec 2009 15:03:50 +0100, Silvia Pfeiffer
>> <silviapfeiffer1@gmail.com> wrote:
>>>
>>>
>>> Now, I'd say that we're probably safe using "&" as a separator for URI
>>> queries, since that has been specified in the CGI "standard" and has
>>> continuously been applied, even if never formally specified. It is a
>>> de-facto standard.
>>
>> I agree that it's safe, but we must formally specify it, either by
>> referencing an existing spec (which I have failed to find) or by
>> specifying it ourselves.
>
> A proper spec doesn't exist. All we have is the CGI spec. It's been my
> greatest problem with the temporal URI spec for years from a
> "completeness" point of view, but actually has never been a practical
> problem, since ppl have just assumed the de-facto standard.
>
>
>>> As for URI fragments, the idea is to keep it in sync with URI queries
>>> and thus we also used the "&".
>>
>> I certainly agree with keeping them in sync, but the fragment component
>> syntax is the one we can specify ourselves and it will work on many
>> existing server configurations as a bonus.
>
> Actually: no, we cannot define the fragment component syntax for any
> video or audio mime type. In fact, the URI specification says that the
> fragment syntax is specified by the owner of the mime type - i.e. the
> owner of video/ogg or video/mpeg4 (and audio) in the HTML5 case. All
> that we can realistically do is provide a recommendation for mime type
> owners to adopt our specification. We cannot really make an
> enforceable standard. OTOH, ppl have been waiting for such a spec, so
> they will gladly adopt it rather than create their own.

Thanks, I didn't know this. It seems then that we can't reasonably state  
any conformance requirements at all in terms of the syntax of the query or  
fragment and rather must do it in terms of abstract name/value. This is  
actually good news to me and I will write a concrete suggestion on how to  
handle it in my next mail.

>>> Now, both approaches (URI fragment and query) may conflict with some
>>> already created specifications (as analysed and listed in
>>>
>>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/#ExistingSchemes).
>>> This is unavoidable when standardising the use of something that has
>>> been in the wild so far.
>>>
>>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#processing-overview-standardisation
>>> talks about this problem and makes clear that harmonisation is
>>> necessary and that it is not possible to "prescribe" this format.
>>> Which probably means that media fragments will always be a
>>> recommendation rather than a standard.
>>
>> Yes, we will conflict with e.g. your Temporal URI spec and MPEG-21,
>> which is to be expected as MF is supposed to supersede both.
>
> Well, I'm not actually sure MPEG-21 will adopt it. But the thing is:
> even if the mime type owners don't accept it, what actually counts is
> what the browser vendors implement. :-)
>
>
>> However, existing query component schemes aren't really specs as such;
>> they are actually defined by their (usually single) implementation.
>> If we agree that MF should only normatively define the syntax and
>> processing rules for URI *fragments*, then we don't need to discuss the
>> query component issue any further.
>
> Some past discussions have found that we need to do both. The URI
> queries approach has its use cases where you want to create a shorter
> form from a longer resource - e.g. a playlist mashed up from segments
> from multiple videos. We have embraced such use cases in the
> requirements specification and they would require the use of URI
> queries.
>
> To be complete, it is also possible to not use URI queries, but to use
> some kind of REST interface, as you have mentioned before, e.g.
> http://www.example.com/video/track=video1/track=audio2/t=20,80 . But
> this resource has nothing at all to do with the original resource,
> which may be http://www.example.com/video/, so caching is impossible.
> Using URI queries at least provides a means to enable caching and to
> continue having the link back to the original resource.

Using the path or the query looks equivalent to me: both are specific to a  
particular server configuration. Do caches really treat  
http://www.example.com/video/track=video1/track=audio2/t=20,80 and  
http://www.example.com/video?track=video1&track=audio2&t=20,80 in any way  
differently with respect to the "original" resource  
http://www.example.com/video/? Is the idea that caches should assume that  
URLs which happen to look like they use media fragments syntax in the  
query component are related to the URL with the query component stripped?  
This sounds very fragile to me. Shouldn't this be done with new HTTP  
headers so that the caching proxy doesn't need to be concerned with  
parsing MF? Something like Original-Location? I haven't followed the  
server-part of MF very well, so perhaps I'm missing something.
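
To make the question concrete, here is a minimal Python sketch of the only
purely syntactic relation I can see between the two forms. The helper name
strip_query is made up for illustration, and nothing here describes how any
real cache behaves; it only shows what a cache would have to compute if it
were expected to guess the relationship from the URL alone.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query(url):
    """Return url with its query component removed (hypothetical helper)."""
    parts = urlsplit(url)
    # Rebuild the URL from its five components, leaving the query empty.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))

query_form = "http://www.example.com/video?track=video1&track=audio2&t=20,80"
path_form = "http://www.example.com/video/track=video1/track=audio2/t=20,80"

# The query form can be reduced mechanically to a shorter URL.
print(strip_query(query_form))
# The path form has no query component, so it comes back unchanged.
print(strip_query(path_form))
```

Even in the query case the cache is only guessing that the stripped URL names
the "original" resource, which is why an explicit header seems more robust to
me.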

> OTOH there are a lot of issues to deal with when using queries. We can
> only address a small part of the URI query possibilities in the MF
> spec, namely the one that overlaps with the spec we're creating for
> URI fragments. That has been the basis of our decisions so far.
>
> Why do you think URI queries are so much more of a problem? I wasn't
> able to read that out of the irc discussion either. Standardisation of
> how to create URI queries is useful, since then there are compatible
> naming conventions across servers and clients and applications can
> rely on things working the way they'd expect to. From a HTML5 POV, URI
> queries don't matter, since they don't concern the browser. But when
> specifying URIs, one has to think far beyond just the browser, IMHO.

I don't think that there's a problem with using the query component to  
"apply" the fragment server-side at all, that's very useful. I think this  
is a spec layering problem mostly. Certainly browsers don't need to care,  
but I still want the whole of the spec to be consistent and robust, not  
just "my" parts.

>>> We could do one thing though: maybe we should add the link to the CGI
>>> specification to the spec to explain where the formatting comes from.
>>
>> The CGI documentation only provides a rough description and isn't
>> suitable for a normative reference. For example, it says "you should URL
>> decode the name" but not how to do that. It is quite important to know
>> how to interpret #t=npt%3a10s (%3A is ':', but is %3a also tolerated?)
>> and #id=100% ('%' should be encoded as %25, but what to do with a stray
>> %?).
>>
>> Specifying this is very simple:
>>
>> 1. split the string on &
>> 2. split each resulting string on the first occurrence of '=' and let
>>    name be the first part and value be the second part. if there is no =
>>    in the string let value be ''
>> 3. decode name and value according to [some very fine spec we can reuse
>>    I hope]
>>
>> Simple but necessary as the spec can't make any normative requirements
>> at all about fragment dimensions if it doesn't define how to get from a
>> fragment component to a list of fragment dimensions.
>
> Agreed, that is somewhat implicit in the specification right now.
>
>
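
For illustration only, the three steps quoted above could be sketched in
Python as follows. The function name parse_name_values is hypothetical,
skipping empty items is my own assumption, and urllib.parse.unquote merely
stands in for the decoding spec we have yet to pick. It happens to accept
%3a as well as %3A and to pass a stray % through unchanged, which is one
possible answer to the questions above, not a decision.

```python
from urllib.parse import unquote

def parse_name_values(component):
    """Turn a query or fragment component into a list of (name, value) pairs."""
    pairs = []
    # Step 1: split the string on '&'.
    for item in component.split("&"):
        if not item:
            continue  # assumption: empty items (e.g. from "a&&b") are ignored
        # Step 2: split on the first '='; value is '' when there is no '='.
        name, _, value = item.partition("=")
        # Step 3: percent-decode both parts (stand-in for the real spec).
        pairs.append((unquote(name), unquote(value)))
    return pairs

print(parse_name_values("t=npt%3a10s&id=100%"))
```

Whatever decoding rules we end up with, the point is that the output is an
ordered list of name/value pairs, and all requirements on fragment dimensions
can then be stated against that list.
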
>>> Philip, note that the specification only defines a syntax for the URI
>>> fragment case, but leaves out the URI query case and just alludes to
>>> the fact that it is done in the same way. I think that is already what
>>> you are suggesting, no?
>>
>> The spec treats the query and fragment component equally as far as I can
>> see, so any normative requirements on URI fragments are also being made
>> on URI queries. For example:
>>
>> "The syntax is based on the specification of particular field-value
>> pairs that can be used in URI fragment and URI query requests to
>> restrict a media resource to a certain fragment."
>>
>> "There are therefore two possibilities for representing the media
>> fragment addressing in URIs: the URI query part or the URI fragment
>> part."
>>
>> "The composition of a URI fragment or query string for a media resource
>> relies on a series of field-value pairs to be added behind the URI
>> fragment ('#') or query ('?') identifier."
>>
>> "In this section we present the ABNF syntax for the field-value pairs
>> that relate to a media fragment URI. The names for the non-terminals
>> more-or-less follow the names used in the previous subsections, with one
>> clear difference: the start symbol is called mediasegment, because we
>> want to allow application of it to both URI fragment and URI query
>> strings."
>
> Yes, I think you're right. It does apply to both URI fragment and URI
> query. But that was intentional, as discussed above.
>
>
>> If the intention is that the ABNF syntax be normative only for URI
>> fragments, this should be clarified by removing the 'segment' ABNF and
>> instead requiring that mediasegment be a valid production of the
>> ifragment syntax from the IRI spec. This might have implications for the
>> use of '+' in datetime; I haven't checked.
>
> I do wonder about this last detail. Might be worth checking.

If we agree on specifying processing for fragment syntax then I will  
certainly research this.

>> There are several places in the spec that talk about Media Fragments,
>> URI fragments and URI queries as if URI fragments and URI queries are a
>> subset of Media Fragments rather than Media Fragments being a subset of
>> URI fragments. I'm quite confused by this terminology, could someone
>> clarify? I would like to see Media Fragment added to the terminology
>> section.
>
> So far, what we have specified is the following (see
> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#terminology):
> In this document, when the term 'media fragment URIs' is used, it
> actually means 'media fragment URI references'.
>
> This means that a media fragment URI is just generally a URI that
> deals with a section of a media resource. It does not say how.

As long as a "URI" that doesn't have a fragment component can also be  
called a "URI reference" I guess this part is fine. The spec also talks a  
lot about "media fragments", what does that term mean? Especially since it  
uses the word "fragment" it's very easy to assume that it in fact has  
something to do with URI fragment components. Adding the definition (if we  
know it) to the terminology section would be very helpful.

> URI fragment and URI query quite plainly specify how to deal with the
> media fragment URI: namely either through use of a URI fragment or a
> URI query.
>
> I thought we used these quite consistently and made sure they didn't
> get mixed up. So, what, in your opinion, is missing?

Just the definition of "media fragment" sans "URI".

>> [pause]
>>
>> My primary concern is that the processing of the fragment component is
>> still undefined, as it is my intention to support MF in Opera at some
>> point. In the bad old days when a spec left something undefined one
>> browser would just make something up and the others would
>> reverse-engineer it, but I am still young and naive enough to think that
>> things are different now. I am willing to edit the spec myself to show
>> clearly what it is I'm suggesting.
>
> I'm more than happy for you to make such changes - in particular to
> separate out the structure of parameters in a URI fragment and URI
> query from the actual specification of the name-value pairs in use. As
> mentioned in the email to Jack, I do think it makes sense to separate
> that into a section that specifies the foundations that we build upon.
> If you want to go ahead and do that, I wouldn't have a problem. But I
> don't speak for the others, so maybe wait until we get their input.
> :-)
>
>
> Cheers,
> Silvia.
>


-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Monday, 4 January 2010 12:05:58 UTC