- From: Ruben Tous (UPC) <rtous@ac.upc.edu>
- Date: Mon, 24 Nov 2008 13:28:47 +0100
- To: "Pierre-Antoine Champin" <pchampin@liris.cnrs.fr>, "Felix Sasaki" <fsasaki@w3.org>
- Cc: <public-media-annotation@w3.org>
Hi Pierre-Antoine, Felix, all,

after the feedback from Felix in

http://lists.w3.org/Archives/Public/public-media-annotation/2008Nov/0115.html

I see that my understanding of the approach has been slightly biased
until now (it is my fault for having arrived late). I also took for
granted that we would formalise an explicit reference format, but now I
realize that this is not strictly necessary.

I've added a new entry at the top of the features table
(http://www.w3.org/2008/WebVideo/Annotations/wiki/FeaturesTable) because
I think that this aspect has an important impact on the other features.

Best regards,

Ruben

----- Original Message -----
From: "Pierre-Antoine Champin" <pchampin@liris.cnrs.fr>
To: "Felix Sasaki" <fsasaki@w3.org>
Cc: <public-media-annotation@w3.org>
Sent: Monday, November 24, 2008 12:47 PM
Subject: Re: my token about the "3 or more layer" structure for the ontology

Hi Felix,

thank you for your feedback.

First, the term "data structure" was a bad choice. I should have written
"conceptual model", which better describes what I am interested in. I
think once we agree on a conceptual model, we can choose the best syntax
to represent it -- if we want to... As a matter of fact, I took for
granted that we would have to define our own format. But once again, the
most important thing is the conceptual model.

To be clear about my idea of a "conceptual model" or "ontology"... It
does not necessarily imply that we describe it formally. You advocate a
"prose" description, and it is indeed a possibility, as long as it is
precise enough.

However, that does not rule out, IMHO, the discussion about structure --
and I still think structure is important for interoperability problems.

I take an example: ID3 [1] has apparently a flat structure: it is a list
of properties with text values.
However, take the descriptions of the following properties:

> TALB
> The 'Album/Movie/Show title' frame is intended for the title of the
> recording (or source of sound) from which the audio in the file is
> taken.
>
> TOAL
> The 'Original album/movie/show title' frame is intended for the title
> of the original recording (or source of sound), if for example the
> music in the file should be a cover of a previously released song.

There is an awful lot of structure hidden in those flat properties! A
sound file is *taken from* a *recording*, which has a *title* and can be
an *album*, a *movie*, a *show* (and the list is probably not intended
to be exhaustive). But it can have been *previously released* in another
*recording*.

So mapping those properties to, say, Dublin Core properties requires
finding an equivalence not only for the notion of title, but also for
*taken from* (dc:source ?), *previously released* (??), etc...

pa

[1] http://www.id3.org/id3v2.4.0-frames

Felix Sasaki wrote:
> Hello Pierre-Antoine,
>
> Pierre-Antoine Champin wrote:
>> Felix,
>>
>> although I participated in putting the debate in terms of "XML vs. RDF",
>> my concern was not about a precise syntax or format, and I agree with
>> you that it should not be.
>>
>
> Just for clarification: "agree that it should not be" means "we do not
> need to define a syntax" or "we should not discuss XML vs. RDF, but need
> to decide on a syntax"? If the latter, which syntax do you propose?
>
>> However in my view the question is more fundamental. Let me reword it.
>>
>> Designing an ontology involves, IMHO, a trade-off between faithfully
>> representing the domain of interest, and projecting it in a practical
>> data structure.
>>
>
> Maybe here we already have different opinions: I think we can design an
> ontology without a practical data structure. The current API / ontology
> proposal does just that: defining a list of terms *as prose*.
> The data structure related parts are only in the API, and in the prose
> mapping descriptions.
>
>> Faithful in our context means:
>> - able to cover a large part of legacy metadata
>> - able to satisfy most of the requirements of our use cases
>>
>> Practical in our context means that the ontology should be:
>> - easy to use by media publishers
>> - easy to implement in browsers
>>
>> A very easy to use and implement data structure is a list of
>> (attribute,value) pairs -- the so-called "flat" structure.
>>
>> By the way, even easier is a list of simple tags -- which can be tweaked
>> into (attribute,value) pairs anyway, as pointed out by your previous
>> mail about flickr.
>>
>> However, I think that this is too much of a simplification:
>> - it does not satisfy some requirements (like the multi-level or
>>   collection) -- though we might decide that those ones are too complex
>> - my intuition is that more structure would make "impedance mismatch"
>>   between legacy vocabularies easier to point out and solve
>>
>
> I agree that this simplification does not cover many use cases like
> "multi-level or collection". But I also think for this version (1.0 of
> the ontology / API) we should concentrate on the simple approach which
> is important for all use cases and application scenarios. If that finds
> adoption, we can shoot for 2.0 and a more complex approach.
>
> Btw., of course we have not described all application scenarios, use
> cases and requirements yet. Nevertheless I think that the requirement to
> get information across heterogeneous formats is central to our WG.
>
> I don't think that more or less structure is related to the quality of
> mapping between different vocabularies. For this mapping, detailed
> knowledge brought in by the WG participants about these vocabularies is
> what matters most. Whether the mapping is then represented in prose, or
> as more or less structured XML or RDF, is not important IMO.
> However, I do think that a detailed prose description is important for
> the API, and it can also help in understanding a structured
> representation, if we decide to do that.
>
> Felix
>
>> pa
>>
>> Felix Sasaki wrote:
>>
>>> Ruben Tous (UPC) wrote:
>>>
>>>> Hi Pierre-Antoine, Silvia, all,
>>>>
>>>> I think that normalisation/denormalisation is related to the more
>>>> general discussion about structured*/flat annotations (handling
>>>> events, agents, etc. as separated structures). The multi-level
>>>> description discussion is probably a sub-topic within that general
>>>> one, and refers only (as I've understood till now) to splitting
>>>> (normalising) the main structure (the one describing the digital
>>>> object) into several entities but only regarding different abstraction
>>>> levels (e.g. document and instance).
>>>>
>>>> So, probably we should decide first about the structured*/flat
>>>> question. If we choose "flat", then we could maybe discard also the
>>>> multi-level description.
>>>>
>>>> Probably, there's a latent high-level question behind this discussion:
>>>> will the ontology model the way annotations are interchanged, or will
>>>> it model their underlying semantic grounding?
>>>>
>>>> Best regards,
>>>>
>>>> Ruben
>>>>
>>>> *When talking about structured annotations I'm not just referring to
>>>> hierarchical ones (XML), I refer to annotations with ObjectProperties
>>>> (inlined or linked within the same annotation) (e.g. RDF).
>>>>
>>> Reading this discussion and the "features" wiki page, the "data model
>>> rows", I have the impression that there is some tension between using
>>> XML and RDF. I can understand that tension, but I think we should not
>>> spend time on discussing it in this group. Nevertheless, it makes me
>>> think more and more that we should not be format specific in our
>>> ontology, but use just a prose description as the normative outcome,
>>> that is in the "Ontology 1.0" Recommendation.
>>> If people want to write non-normative RDF- and XML-formats, they are
>>> free to do so. I think we should focus on formulating the terminology
>>> in the prose in a way that makes a formalization in whatever format
>>> straightforward.
>>>
>>> Felix
>>>
>>>> ----- Original Message ----- From: "Silvia Pfeiffer"
>>>> <silviapfeiffer1@gmail.com>
>>>> To: "Ruben Tous (UPC)" <rtous@ac.upc.edu>
>>>> Cc: <public-media-annotation@w3.org>
>>>> Sent: Wednesday, November 19, 2008 10:10 PM
>>>> Subject: Re: my token about the "3 or more layer" structure for the
>>>> ontology
>>>>
>>>>
>>>> Hi Ruben,
>>>>
>>>> It is always a matter of use cases.
>>>>
>>>> When we talk about management of collections, there will be overlap
>>>> between the annotations of different files, which can be handled more
>>>> efficiently (in a database sense: normalise your schema).
>>>>
>>>> However, if you receive an individual media resource, you want all of
>>>> its annotations to be available with the media resource, i.e. you want
>>>> an "intelligent" media object that can tell you things about itself.
>>>>
>>>> I don't see these things as separate. Let's take a real-world example.
>>>> Let's assume I have a Web server with thousands of videos. They fall
>>>> into categories and within categories into events, where each video
>>>> within an event has the same metadata about the event. On the server,
>>>> I would store the metadata in a database. I would do normalisation of
>>>> the data and just store the data for each event once, but have a
>>>> relationship table for video-event-relationships. Now, a Web Browser
>>>> requests one of the videos for playback (or a search engine comes
>>>> along and asks about the metadata for a video). Of course, I go ahead
>>>> and extract all related metadata about that video from the database
>>>> and send it with the video (or in the case of the search engine:
>>>> without the video).
>>>> I further have two ways of sending the metadata: I can send it in a
>>>> text file (which is probably all the search engine needs), or I can
>>>> send it multiplexed into the video file, e.g. as a metadata header
>>>> (e.g. MP3 has ID3 for this, Ogg has vorbiscomment, other file formats
>>>> have different metadata headers).
>>>>
>>>> I don't think we need to overly concern ourselves with whether we
>>>> normalise our data structure. This is an "implementation" issue. We
>>>> should understand the general way in which metadata is being handled
>>>> as in the example above and not create schemas that won't work in this
>>>> and other scenarios. But we should focus on identifying which
>>>> information is important to keep about a video or audio file.
>>>>
>>>> Cheers,
>>>> Silvia.
>>>>
>>>> On Thu, Nov 20, 2008 at 12:01 AM, Ruben Tous (UPC) <rtous@ac.upc.edu>
>>>> wrote:
>>>>
>>>>> Dear Véronique, Silvia, all,
>>>>>
>>>>> I agree with both of you in that the need for multiple description
>>>>> levels is only related to a small subset of use cases, basically to
>>>>> those related to the management of groups of resources (e.g. digital
>>>>> asset management systems, user media collections, etc.). Instead, we
>>>>> are (I guess) focused on embedded annotations in individual resources.
>>>>>
>>>>> However, I think that there are solutions which cover both cases, the
>>>>> simple and the complex one.
>>>>> For instance, we could embed the following annotation within an
>>>>> MPEG video:
>>>>>
>>>>> <mawg:Video rdf:ID="http://example.org/video/01">
>>>>>   <mawg:title>astronaut loses tool bag during spacewalk</mawg:title>
>>>>>   <mawg:creator>John Smith</mawg:creator>
>>>>> </mawg:Video>
>>>>>
>>>>> <mawg:Resource rdf:ID="http://example.org/resource/01">
>>>>>   <mawg:format>FLV</mawg:format>
>>>>>   <mawg:filesize>21342342</mawg:filesize>
>>>>>   <mawg:duration>PT1004199059S</mawg:duration>
>>>>>   <mawg:videoID rdf:resource="http://example.org/video/01"/>
>>>>> </mawg:Resource>
>>>>>
>>>>> It is structured and it offers 2 abstraction levels, but it can be
>>>>> serialized like a plain record. When appearing in isolated resources,
>>>>> the high-level annotation ("Video" in this case) would be repeated.
>>>>> When appearing within a collection's annotation the "Video" annotation
>>>>> would appear just once.
>>>>>
>>>>> It is not so different than in XMP. Take the following XMP
>>>>> example...
>>>>>
>>>>> http://www.w3.org/2008/WebVideo/Annotations/wiki/images/8/8a/Xmp_example.xml
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Ruben
>>>>>
>>>>>
>>>>> ----- Original Message ----- From: <vmalaise@few.vu.nl>
>>>>> To: <public-media-annotation@w3.org>
>>>>> Sent: Wednesday, November 19, 2008 11:27 AM
>>>>> Subject: my token about the "3 or more layer" structure for the
>>>>> ontology
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I was at first very much in favor of an ontology that would
>>>>>> distinguish different levels of media documents, like
>>>>>> "work-manifestation-instance-item", but after reading this email
>>>>>> from the list:
>>>>>>
>>>>>> http://lists.w3.org/Archives/Public/public-media-annotation/2008Nov/0076.html
>>>>>>
>>>>>> I agreed with the fact that we would probably only need a simple
>>>>>> structure in our case, that multi-level structures were meant for
>>>>>> linking different entities that have different status together: if
>>>>>> we aim for linking the descriptions of a single item between
>>>>>> different vocabularies, we need to specify if the single item is a
>>>>>> work_in_XX_vocabulary, more likely a manifestation_in_XX_vocabulary
>>>>>> (see note 1 below), to give its "type", and if people/use cases want
>>>>>> to link this single item to other related works, manifestations,
>>>>>> instances or items, they can use the framework defined in the
>>>>>> schemas reviewed in
>>>>>>
>>>>>> http://www.w3.org/2008/WebVideo/Annotations/wiki/MultilevelDescriptionReview
>>>>>>
>>>>>> and use these properties for completing their description.
>>>>>>
>>>>>> So we would need a property like "has_type" to link a single
>>>>>> description's identifier to the correct level of multilevel
>>>>>> description schemes.
>>>>>>
>>>>>> I changed my mind and now think that only one "family" of use cases
>>>>>> would need more levels, that they are somehow context dependent (and
>>>>>> could thus be considered as requirements for a family of use cases),
>>>>>> but of course if it turns out that more than one family of use cases
>>>>>> needs this distinction, then we should consider going for a
>>>>>> multilevel structure. Anyway, we would need to map informally the
>>>>>> way these levels are expressed, in order to provide possible
>>>>>> relevant "types" for the description of each single element.
>>>>>>
>>>>>> note 1: by specifying the different names of the relevant
>>>>>> Concepts/terms in schemes like VRA, XMP etc., we would informally
>>>>>> define a semantic equivalence between the ways these schemas express
>>>>>> these levels of description. It would look like:
>>>>>> <metadataFile>
>>>>>> <id="identifier">
>>>>>> <hasType xmpMM:InstanceID, vra:image, frbr:item>
>>>>>> </metadataFile>
>>>>>>
>>>>>> I think that the table
>>>>>> http://www.w3.org/2008/WebVideo/Annotations/wiki/FeaturesTable
>>>>>> is a very valuable tool for people to express their ideas about it,
>>>>>> thank you very much Ruben for designing it!
>>>>>>
>>>>>> Best regards,
>>>>>> Véronique
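
The thread contrasts a normalised, two-level description (a "Video" shared by several "Resource" records) with a flat list of (attribute, value) pairs delivered alongside a single file. The following is only an illustrative sketch of that relationship, not a format proposed on the list: the dictionaries `videos` and `resources` and the function `flatten` are hypothetical names, and the values are borrowed from the example annotation quoted in the thread.

```python
# Hypothetical normalised store: the high-level "Video" description is kept
# once, and each concrete "Resource" (file) points to it by identifier.
videos = {
    "http://example.org/video/01": {
        "title": "astronaut loses tool bag during spacewalk",
        "creator": "John Smith",
    },
}

resources = {
    "http://example.org/resource/01": {
        "format": "FLV",
        "filesize": 21342342,
        "video": "http://example.org/video/01",  # link to the shared Video
    },
}


def flatten(resource_id):
    """Denormalise one resource into a flat list of (attribute, value) pairs."""
    resource = resources[resource_id]
    video = videos[resource["video"]]
    # Resource-level properties, without the internal link to the Video.
    pairs = [(key, value) for key, value in resource.items() if key != "video"]
    # Video-level properties are repeated for every resource that is sent alone.
    pairs.extend(video.items())
    return pairs


flat = flatten("http://example.org/resource/01")
for attribute, value in flat:
    print(attribute, "=", value)
```

Under these assumptions, the structured form is what a server or collection manager would store, while the flat form is what would travel with an individual file; the flattening step loses the information about which properties belong to which level.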
Received on Monday, 24 November 2008 12:32:45 UTC