Re: action 78 - Discussion about interoperability from Tobias Bürger on 2009-01-27 (public-media-annotation@w3.org from January 2009)

From: Tobias Bürger <tobias.buerger@sti2.at>
Date: Tue, 27 Jan 2009 15:49:32 +0100
To: Felix Sasaki <fsasaki@w3.org>
CC: Pierre-Antoine Champin <pchampin@liris.cnrs.fr>, public-media-annotation@w3.org
Message-ID: <497F1EFC.8020401@sti2.at>
Dear all,

as promised in the telecon, here is my reply to Felix's mail. See my
comments below.

Felix Sasaki wrote:
>
> Pierre-Antoine Champin wrote:
>> Hi,
>>
>> for action 78, I had to write a wiki page about some concerns I raised
>> during the last telecon about interoperability between mapped
>> properties. Since this is supposed to be matter for discussion rather
>> than a formal document, I think it is best to send it as a mail.
>>
>>
>> What triggered my concern was the mapping for Media RSS, between
>> ''dc:creator'' and ''dcterms:creator''. Just as a reminder, the Dublin
>> Core vocabulary has two versions: the legacy "elements" (usually
>> prefixed with ''dc'') and the "terms" (usually prefixed with
>> ''dcterms''). Each term is more specific than its corresponding element,
>> as its values are more constrained. For example, ''dc:creator'' can have
>> any type of value (including a plain string), while ''dcterms:creator''
>> must have a URI, which must denote an instance of ''dcterms:Agent''.
>> If we decide to specify the ontology only as prose
>> Let us consider the example of ''dc:creator'' with a sample of mappings:
>>
>> * for XMP, its value is a sequence of strings, each string being the
>> name of an author.
>>
>> * for Media RSS, its value is either
>>   - a plain string,
>>   - an instance of ''foaf:Agent'' with at least a ''foaf:name'', or
>>   - an instance of ''vcard'' with at least a ''fn''.
>> Since they are using ''dcterms'', it must also be inferred to be a
>> ''dcterms:Agent'' (which contradicts the use of a plain string...). It
>> may represent only one ("the primary") creator.
>>
>> * for ID3, the value of TOPE is a string, where names are separated 
>> by "/".
>>
>>
>> My point here is that, beyond the "high level" semantic links identified
>> by the mapping table, there are some "low level" discrepancies that are
>> both semantic (e.g. representing one or several creators) and syntactic
>> (slash-separated string or structured sequence).
>>
>> Leaving these issues to the implementation will inevitably lead to major
>> differences and a lack of interoperability. We could specify down to the
>> syntactical level the mapping for each property in each format, but what
>> about other formats ?
>>
>> I think a better way to limit the variability in implementations is to
>> specify precisely, for each property of our ontology, the expected
>> "low level" features of its value (and not only its "high level"
>> meaning) so that implementors know what they can keep from the original
>> metadata, and what they need to adapt (e.g. split ID3's TOPE field into
>> multiple values).
>>
>> This has to be done at least at the API level. But I guess this could
>> also be done to some extent at the ontology level (I do believe that
>> those "low level" features are *not only* syntactic), but that raises
>> again the problem of formally specifying the ontology or not.
>>
>> But the less specific we are in describing the ontology, the more
>> precise we will have to be in describing the API, in order to avoid "low
>> level" semantic discrepancies.
>>   
>
> I agree very much with your analysis, Pierre-Antoine. +1 to have a 
> very lightweight ontology and to be more precise in the API 
> description. Also I am hoping very much that people will volunteer to 
> actually test the mappings in toy implementations, no matter if 
> relying on a complex ontology or a detailed API. No matter which way 
> we go, let's test them now.
I guess the intention of the toy implementations should be to get a
deeper understanding of which types of mismatches might occur between
the properties defined in the different formats. This might include data
type mismatches, but also structural mismatches as outlined by
Pierre-Antoine above. Whether we find them by hard thinking or by
prototype implementations does not matter, as long as we are aware of
them, because we will have to handle them at some point in time.
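
To make the structural mismatches concrete, here is a minimal sketch in
Python of what such a toy implementation might do for the creator
property. The function names and the dictionary keys used for the Media
RSS case are hypothetical and only for illustration; they are not part
of any agreed API.

```python
from typing import List, Sequence, Union


def creators_from_id3_tope(tope: str) -> List[str]:
    """ID3: a single string in which names are separated by "/"."""
    return [name.strip() for name in tope.split("/") if name.strip()]


def creators_from_xmp(dc_creator: Sequence[str]) -> List[str]:
    """XMP: a sequence of strings, each one the name of an author."""
    return [name.strip() for name in dc_creator if name.strip()]


def creators_from_media_rss(value: Union[str, dict]) -> List[str]:
    """Media RSS: a plain string or a structured agent, one creator at most.

    The dict stands in for a foaf:Agent or vcard description; the keys
    "foaf:name" and "fn" are assumptions made for this sketch.
    """
    if isinstance(value, str):
        return [value.strip()] if value.strip() else []
    name = value.get("foaf:name") or value.get("fn")
    return [name] if name else []
```

All three functions return a list of names, so the structural difference
(one slash-separated string, a sequence, a single value) disappears for
the caller. Note that the naive split would wrongly break a name that
itself contains a slash, which is exactly the kind of detail a prototype
would bring to light.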

Regarding the mapping, and more specifically where we should map to:
the mapping target should be our core ontology, to whose semantics we
have committed or will commit ourselves. So we will define what we allow
as the domain and range of each property.
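
As a rough illustration of what "defining domain and range" could mean
in practice, here is a small Python sketch. The class PropertyDef, its
fields, and the choice of a plain string as the only allowed value type
for the creator property are assumptions for the sake of the example,
not decisions of the group.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Type


@dataclass(frozen=True)
class PropertyDef:
    """A core property with a declared domain and range (hypothetical)."""
    name: str
    domain: str                    # kind of resource the property describes
    value_types: Tuple[Type, ...]  # Python types allowed as values (the range)

    def accepts(self, value: Any) -> bool:
        """Return True if the value falls within the declared range."""
        return isinstance(value, self.value_types)


# Assumed definition: a creator is given as a plain string.
CREATOR = PropertyDef("creator", "MediaResource", (str,))
```

With such a definition, CREATOR.accepts("Jane Doe") holds, while a list
of names is rejected and would first have to be adapted by the mapping.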

And I disagree with Pierre-Antoine's last statement above: if we
describe the ontology less specifically, then we do not need to be more
precise in the API either. It has been my understanding that this group
defines an ontology consisting of a set of core properties for the
description of media objects on the Web, to which all the formats in our
scope will be mapped. That said, if you describe the ontology in a more
lightweight way, meaning perhaps with less detail or a lower level of
specificity, then you also map to something not very specific.
For me the API is a means to transparently access a description of a
media object, in a format I do not want to care about when accessing
the API. So should we define return types? Or should the burden of
identifying the return type be shifted to the user? (I guess we had
this discussion before but did not come to a conclusion...)
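
The two options can be contrasted with a short Python sketch. Both
functions and the "creator" key are hypothetical; the metadata
dictionary simply stands in for a description that has already been
read from some format.

```python
from typing import Any, Dict, List


def get_property_raw(metadata: Dict[str, Any], name: str) -> Any:
    """Option A: return whatever the source format stored.

    The caller has to find out whether it got a string, a list, or
    nothing at all.
    """
    return metadata.get(name)


def get_creators(metadata: Dict[str, Any]) -> List[str]:
    """Option B: a defined return type, always a list of names."""
    raw = metadata.get("creator")
    if raw is None:
        return []
    if isinstance(raw, str):
        return [raw]
    return list(raw)
```

With option A every user repeats the type checks that option B performs
once inside the API, which is the burden I mean above.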

Just my two cents...

Best,

Tobias
>
> Felix
>
>>  regards
>>
>>   Pierre-Antoine
>>
>>
>>
>>   
>
>
>

-- 
_________________________________________________
Dipl.-Inf. Univ. Tobias Bürger

STI Innsbruck
University of Innsbruck, Austria
http://www.sti-innsbruck.at/ 

tobias.buerger@sti2.at
__________________________________________________
Received on Tuesday, 27 January 2009 14:50:11 UTC