RE: Expressing complex regions with media fragments - use cases + possible solution

Hi Silvia, Jack,

>>> On Wed, Sep 8, 2010 at 7:00 PM, Jack Jansen <Jack.Jansen@cwi.nl> wrote:
>>>
>>> On 8 sep 2010, at 02:51, Silvia Pfeiffer wrote:
>>> > I have, however, a pretty big caveat with standardising this approach: right now we are discussing with the browser vendors on
how to present spatial media fragment URIs. There is a preference to use them for splicing pictures, i.e. for rendering only the
referenced image or video region.
>>> >
>>> > I do not believe that matches your intentions here. IIUC your intentions here are to only have a means to provide annotations
to regions. I think this is more of a "image map" type approach than an "image splicing" approach - correct me if I'm wrong.
>>>
>>> We've got to be careful here: browser vendors tend to think that the whole world revolves around the browser:-) A media fragment
is basically nothing more than a specification of a portion of a video (audio, image) resource, and even though we give guidelines
on how to present such a fragment in the browser that doesn't mean it's the only application of media fragments. I would assume that
the Media Annotation folks couldn't care less about presentation: they just want to be able to point at something inside the video.
>>>
>>>
>>> Except: everyone wants to see their results in the browser. So, if the browser vendors agree to present a spatial media fragment
URI as an image slice, then you can do as many annotations as you want with such a URI, it will never be presented as an image with
highlights by the Web browser. So, as long as we are talking about a presentation in a Web browser, we are actually talking about
the same application doing the same thing with the same URI.
I'm 100% in agreement with Bernhard here: our main task is in defining how to address subparts of media items, very similar to how
#xmlid addresses subparts of an XML document. For practical reasons, we also define guidelines on how we think some classes of
applications should render some of our MF-based addresses, but that is much less important. Moreover, we can provide no more than
guidelines.
>>
>>There's also a problem with your (Silvia) reasoning that reasoning about "the browser" is similar to reasoning about "the
operating system": the browser is simply a platform on which an application runs. Clearly, if someone types an MF-url into the
location bar, something consistent should happen. But more often than not the MF-url will be used in the context of an application
(either client-side or server-side), and this application will know best what to do. An annotation-viewing app will most probably
want to show the whole original resource with some form of highlighting on the portion selected by the MF.
>>
>>As an aside: this means I also disagree with Ian's WHATWG mail
<http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-August/028163.html>: we define addressing, but the semantics of using
those addresses should be formally explained in the specification referencing our work. In case of HTML it would be best if it
provided a default rendering semantic (such as "crop to the specified area") with a CSS-based or attribute-based method to override
this ("draw the whole resource, and make the area selected by the MF available to scripts through the DOM").
>
>How about we define or at least propose the semantics for use in the HTML <video> and <audio> elements and the HTML address bar.
Then we can leave it to the other use cases to do what they want with the URIs?

I fully agree with Jack: 'we can provide no more than guidelines'. For the browser case, something consistent should indeed
happen. A proposal might be as follows:
- Browser address bar: either crop to the fragment or highlight it (I would prefer cropping, along all axes).
- <video> and <audio> elements: I think both rendering modes should be available here (i.e., both cropping and highlighting).
However, I don't think the MF spec should define this switch (as proposed in our last phone conf [1]). Instead, the switch
could be provided by the HTML spec, for example by adding an attribute to the <video> or <audio> element indicating
how fragments should be rendered.
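
To make the two rendering modes concrete, here is a minimal Python sketch of how an application might separate addressing from presentation. It is only an illustration under stated assumptions: the function names (parse_media_fragment, render_plan) and the "crop"/"highlight" mode values are hypothetical and appear in neither the MF draft nor the HTML spec, and only plain-seconds t= and pixel/percent xywh= values are handled.

```python
from urllib.parse import urlparse, parse_qsl


def parse_media_fragment(uri):
    """Extract the temporal (t) and spatial (xywh) dimensions from a URI fragment.

    Simplification: t= is read as plain seconds only (no npt:/smpte: prefixes).
    """
    dims = dict(parse_qsl(urlparse(uri).fragment))
    result = {}
    if "t" in dims:
        start, _, end = dims["t"].partition(",")
        result["t"] = (float(start) if start else 0.0,
                       float(end) if end else None)
    if "xywh" in dims:
        unit, sep, values = dims["xywh"].partition(":")
        if not sep:
            # No explicit unit prefix: treat the values as pixels.
            unit, values = "pixel", dims["xywh"]
        x, y, w, h = (int(v) for v in values.split(","))
        result["xywh"] = (unit, x, y, w, h)
    return result


def render_plan(uri, mode="crop"):
    """Decide what to draw for a spatial fragment.

    `mode` stands in for a hypothetical HTML attribute or CSS switch;
    the fragment itself only addresses the region.
    """
    region = parse_media_fragment(uri).get("xywh")
    if region is None:
        return {"draw": "full", "overlay": None}
    if mode == "crop":
        # Render only the addressed region.
        return {"draw": region, "overlay": None}
    if mode == "highlight":
        # Render the whole resource and mark the addressed region.
        return {"draw": "full", "overlay": region}
    raise ValueError("unknown rendering mode: " + repr(mode))


uri = "http://example.com/video.ogv#t=10,20&xywh=160,120,320,240"
print(render_plan(uri, "crop"))
print(render_plan(uri, "highlight"))
```

The same URI yields both presentations; only the mode chosen by the embedding application differs, which is why the switch seems to belong outside the MF spec.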

Best regards,

Davy

[1] http://www.w3.org/2010/09/08-mediafrag-minutes.html#item03 

-- 
Davy Van Deursen

Ghent University - IBBT
Department of Electronics and Information Systems - Multimedia Lab
URL: http://multimedialab.elis.ugent.be/dvdeurse

Received on Thursday, 9 September 2010 06:36:27 UTC