Re: my token about the "3 or more layer" structure for the ontology

From: Pierre-Antoine Champin <pchampin@liris.cnrs.fr>
Date: Fri, 21 Nov 2008 11:01:57 +0000
Message-ID: <49269525.4020206@liris.cnrs.fr>
To: Felix Sasaki <fsasaki@w3.org>
CC: public-media-annotation@w3.org


Although I participated in putting the debate in terms of "XML vs. RDF",
my concern was not about a precise syntax or format, and I agree with
you that it should not be.

However, in my view the question is more fundamental. Let me reword it.

Designing an ontology involves, IMHO, a trade-off between faithfully
representing the domain of interest, and projecting it in a practical
data structure.

Faithful in our context means:
- able to cover a large part of legacy metadata
- able to satisfy most of the requirements of our use cases

Practical in our context means that the ontology should be:
- easy to use by media publishers
- easy to implement in browsers

A very easy to use and implement data structure is a list of
(attribute,value) pairs -- the so-called "flat" structure.

By the way, even easier is a list of simple tags -- which can be tweaked
into (attribute,value) pairs anyway, as pointed out by your previous
mail about flickr.
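
To make the "flat" structure concrete, here is a minimal Python sketch. It
is only an illustration: the attribute names, the "attribute=value" tag
convention and the tags_to_pairs helper are assumptions of mine, not
something taken from flickr or from any of the vocabularies under review.

```python
from typing import List, Tuple

Pair = Tuple[str, str]

# A "flat" description: nothing more than a list of (attribute, value) pairs.
flat_description: List[Pair] = [
    ("title", "astronaut loses tool bag during spacewalk"),
    ("creator", "John Smith"),
    ("format", "FLV"),
]


def tags_to_pairs(tags: List[str], default_attribute: str = "tag") -> List[Pair]:
    """Turn simple tags into (attribute, value) pairs.

    Assumed convention (hypothetical): a tag written "attribute=value" is
    split on the first "="; any other tag is filed under default_attribute.
    """
    pairs: List[Pair] = []
    for tag in tags:
        attribute, separator, value = tag.partition("=")
        if separator:
            pairs.append((attribute, value))
        else:
            pairs.append((default_attribute, tag))
    return pairs
```

For example, tags_to_pairs(["spacewalk", "creator=John Smith"]) yields one
pair filed under "tag" and one under "creator".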

However, I think that this is too much of a simplification:
- it does not satisfy some requirements (like the multi-level or
collection) -- though we might decide that those ones are too complex
- my intuition is that more structure would make "impedance mismatch"
between legacy vocabularies easier to point out and solve
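
As a rough illustration of the first point, the sketch below reuses the
values from Ruben's Video/Resource example quoted further down. The class
names, the flatten helper and the second ("OGV") resource are hypothetical,
chosen only to show what a flat projection loses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Video:
    """Abstract level: what the content is."""
    title: str
    creator: str


@dataclass
class Resource:
    """Concrete level: one file, linked to the video it instantiates."""
    format: str
    filesize: int
    video: Video


def flatten(resource: Resource) -> List[Tuple[str, str]]:
    """Project a resource and its video onto a flat list of pairs.

    The link between the two levels disappears, and the video-level pairs
    are repeated for every resource that is flattened.
    """
    return [
        ("title", resource.video.title),
        ("creator", resource.video.creator),
        ("format", resource.format),
        ("filesize", str(resource.filesize)),
    ]


video = Video("astronaut loses tool bag during spacewalk", "John Smith")
flv = Resource("FLV", 21342342, video)
ogv = Resource("OGV", 1000, video)  # hypothetical second instance of the same video
```

In the structured form both resources point to the same Video object; once
flattened, each record carries its own copy of the title and creator, and
nothing says that the two records describe the same video.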


Felix Sasaki wrote:
> Ruben Tous (UPC) wrote:
>> Hi Pierre-Antoine, Silvia, all,
>> I think that normalisation/denormalisation is related to the more
>> general discussion about structured*/flat annotations (handling
>> events, agents, etc. as separated structures) . The multi-level
>> description discussion is probably a sub-topic within that general
>> one, and refers only (as I've understood till now) to splitting
>> (normalising) the main structure (the one describing the digital
>> object) into several entities but only regarding different abstraction
>> levels (e.g. document and instance).
>> So, probably we should decide first about the structured*/flat
>> question. If we choose "flat", then we could maybe discard also the
>> multi-level description.
>> Probably, there's a latent high-level question behind this discussion:
>> will the ontology model the way annotations are interchanged, or will
>> it model their underlying semantic grounding?
>> Best regards,
>> Ruben
>> *When talking about structured annotations I'm not just referring to
>> hierarchical ones (XML), I refer to annotations with ObjectProperties
>> (inlined or linked within the same annotation) (e.g. RDF).
> Reading this discussion and the "features" wiki page, the "data model
> rows", I have the impression that there is some tension between using
> XML and RDF. I can understand that tension, but I think we should not
> spend time on discussing it in this group. Nevertheless, it makes me
> think more and more that we should not be format specific in our ontology,
> but use just a prose description as the normative outcome, that is in
> the "Ontology 1.0" Recommendation. If people want to write non-normative
> RDF- and XML-formats, they are free to do so. I think we should focus on
> formulating the terminology in the prose in a way that makes a
> formalization in whatever format straightforward.
> Felix
>> ----- Original Message ----- From: "Silvia Pfeiffer"
>> <silviapfeiffer1@gmail.com>
>> To: "Ruben Tous (UPC)" <rtous@ac.upc.edu>
>> Cc: <public-media-annotation@w3.org>
>> Sent: Wednesday, November 19, 2008 10:10 PM
>> Subject: Re: my token about the "3 or more layer" structure for the
>> ontology
>> Hi Ruben,
>> It is always a matter of use cases.
>> When we talk about management of collections, there will be overlap
>> between the annotations of different files, which can be handled more
>> efficiently (in a database sense: normalise your schema).
>> However, if you receive an individual media resource, you want all of
>> its annotations to be available with the media resource, i.e. you want
>> an "intelligent" media object that can tell you things about itself.
>> I don't see these things as separate. Let's take a real-world example.
>> Let's assume I have a Web server with thousands of videos. They fall
>> into categories and within categories into events, where each video
>> within an event has the same metadata about the event. On the server,
>> I would store the metadata in a database. I would do normalisation of
>> the data and just store the data for each event once, but have a
>> relationship table for video-event-relationships. Now, a Web Browser
>> requests one of the videos for playback (or a search engine comes
>> along and asks about the metadata for a video). Of course, I go ahead
>> and extract all related metadata about that video from the database
>> and send it with the video (or in the case of the search engine:
>> without the video). I further have two ways of sending the metadata: I
>> can send it in a text file (which is probably all the search engine
>> needs), or I can send it multiplexed into the video file, e.g. as a
>> metadata header (e.g. MP3 has ID3 for this, Ogg has vorbiscomment,
>> other file formats have different metadata headers).
>> I don't think we need to overly concern ourselves with whether we
>> normalise our data structure. This is an "implementation" issue. We
>> should understand the general way in which metadata is being handled
>> as in the example above and not create schemas that won't work in this
>> and other scenarios. But we should focus on identifying which
>> information is important to keep about a video or audio file.
>> Cheers,
>> Silvia.
>> On Thu, Nov 20, 2008 at 12:01 AM, Ruben Tous (UPC) <rtous@ac.upc.edu>
>> wrote:
>>> Dear Véronique, Silvia, all,
>>> I agree with both of you in that the need of multiple description
>>> levels is only related to a small subset of use cases, basically to
>>> those related to the management of groups of resources (e.g. digital
>>> asset management systems, user media collections, etc.). Instead, we
>>> are (I guess) focused in embedded annotations in individual resources.
>>> However, I think that there are solutions which cover both cases, the
>>> simple and the complex one. For instance, we could embed the following
>>> annotation within an MPEG video:
>>> <mawg:Video rdf:ID="http://example.org/video/01">
>>> <mawg:title>astronaut loses tool bag during spacewalk </mawg:title>
>>> <mawg:creator>John Smith</mawg:creator>
>>> </mawg:Video>
>>> <mawg:Resource rdf:ID="http://example.org/resource/01">
>>> <mawg:format>FLV</mawg:format>
>>> <mawg:filesize>21342342</mawg:filesize>
>>> <mawg:duration>PT1004199059S</mawg:duration>
>>> <mawg:videoID rdf:resource="http://example.org/video/01"/>
>>> </mawg:Resource>
>>> It is structured and it offers 2 abstraction levels, but it can be
>>> serialized like a plain record. When appearing in isolated resources,
>>> the high-level annotation ("Video" in this case) would be repeated. When
>>> appearing within a collection's annotation the "Video" annotation would
>>> appear just once.
>>> It is not so different from XMP. Take a look at the following XMP example...
>>> http://www.w3.org/2008/WebVideo/Annotations/wiki/images/8/8a/Xmp_example.xml
>>> Best regards,
>>> Ruben
>>> ----- Original Message ----- From: <vmalaise@few.vu.nl>
>>> To: <public-media-annotation@w3.org>
>>> Sent: Wednesday, November 19, 2008 11:27 AM
>>> Subject: my token about the "3 or more layer" structure for the ontology
>>>> Hi everyone,
>>>> I was at first very much in favor of an ontology that would distinguish
>>>> different levels of media documents, like
>>>> "work-manifestation-instance-item", but after reading this email from
>>>> the list:
>>>> http://lists.w3.org/Archives/Public/public-media-annotation/2008Nov/0076.html
>>>> I agreed with the fact that we would probably only need a simple
>>>> structure in our case, that multi-level structures were meant for
>>>> linking different entities that have different status together: if we
>>>> aim for linking the descriptions of a single item between different
>>>> vocabularies, we need to specify if the single item is a
>>>> work_in_XX_vocabulary, more likely a manifestation_in_XX_vocabulary
>>>> (see note 1 below), to give its "type", and if people/use cases want to
>>>> link this single item to other related works, manifestations, instances
>>>> or items, they can use the framework defined in the schemas reviewed in
>>>> http://www.w3.org/2008/WebVideo/Annotations/wiki/MultilevelDescriptionReview
>>>> and use these properties for completing their description.
>>>> So we would need a property like "has_type" to link a single
>>>> description's identifier to the correct level of multilevel description
>>>> schemes.
>>>> I changed my mind and now think that only one "family" of use cases
>>>> would need more levels, that they are somehow context dependent (and
>>>> could thus be considered as requirements for a family of use cases),
>>>> but of course if it turns out that more than one family of use cases
>>>> needs this distinction, then we should consider going for a multilevel
>>>> structure. Anyway, we would need to map informally the way these levels
>>>> are expressed, in order to provide possible relevant "types" for the
>>>> description of each single element.
>>>> note 1: by specifying the different names of the relevant
>>>> Concepts/terms in schemes like VRA, XMP etc., we would informally
>>>> define a semantic equivalence between the ways these schema express
>>>> these levels of description. It would look like:
>>>> <metadataFile>
>>>> <id="identifier">
>>>> <hasType xmpMM:InstanceID, vra:image, frbr:item>
>>>> </metadataFile>
>>>> I think that the table
>>>> http://www.w3.org/2008/WebVideo/Annotations/wiki/FeaturesTable
>>>> is a very valuable tool for people to express their ideas about it,
>>>> thank you very much Ruben for designing it!
>>>> Best regards,
>>>> Véronique
Received on Friday, 21 November 2008 11:02:42 UTC
