RE: action 78 - Discussion about interoperability from Joakim Söderberg on 2009-02-02 (public-media-annotation@w3.org from February 2009)

From: Joakim Söderberg <joakim.soderberg@ericsson.com>
Date: Mon, 2 Feb 2009 11:20:16 +0100
To: "Felix Sasaki" <fsasaki@w3.org>
Cc: "Pierre-Antoine Champin" <pchampin@liris.cnrs.fr>, <public-media-annotation@w3.org>
Message-ID: <4055256AED9D224D9442B19BF1C4C4900333C1AD@esealmw118.eemea.ericsson.se>
Hello,
From my point of view, we agree on the following:

1) We have different types of mismatches (between properties, data types and structure) that can be solved by different parts of our standard.
We use the following terms to refer to them:

* High-level semantics = (ontology) Semantic links identified by the mapping table.

* Low-level semantics = (content of the return types) Structure, e.g. representing one or several creators.

* Syntax = (API) Return types, e.g. slash-separated string or structured sequence.
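
To make the lower two levels concrete, here is a minimal sketch in Python using the creator example from Pierre-Antoine's mail quoted below. The function names and the format keys "id3" and "xmp" are assumptions made for illustration only; they are not part of any agreed API.

```python
from typing import List, Sequence, Union


def creators_from_id3_tope(value: str) -> List[str]:
    """Syntax mismatch: ID3 TOPE is one string with names separated by "/"."""
    return [name.strip() for name in value.split("/") if name.strip()]


def creators_from_xmp(value: Sequence[str]) -> List[str]:
    """XMP dc:creator is already a sequence of strings, one per author."""
    return [name.strip() for name in value if name.strip()]


def normalize_creators(source_format: str,
                       value: Union[str, Sequence[str]]) -> List[str]:
    """Low-level semantics: always return a list, for one or several creators."""
    if source_format == "id3":
        if not isinstance(value, str):
            raise TypeError("ID3 TOPE is expected to be a single string")
        return creators_from_id3_tope(value)
    if source_format == "xmp":
        if isinstance(value, str):
            raise TypeError("XMP dc:creator is expected to be a sequence")
        return creators_from_xmp(value)
    raise ValueError(f"no creator mapping sketched for format: {source_format}")
```

With this, normalize_creators("id3", "Alice / Bob") and normalize_creators("xmp", ["Alice", "Bob"]) both yield the same list of two names, which is the kind of agreement on the "content" of a return value I have in mind.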


2) The goal of this group is to define an ontology consisting of a set of core properties for the description of media objects on the Web, to which all the formats in our scope will be mapped.

The API is a means to transparently access a description of a media object in a format that the user does not need to know about when accessing the API.
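
As a rough illustration of what "transparent" could mean, the sketch below looks up a core property through a small mapping table, so the caller never names the native field. MAPPING_TABLE and get_property are hypothetical names, and the table holds only the two entries mentioned in this thread (XMP dc:creator and ID3 TOPE).

```python
from typing import Any, Dict, Mapping

# Hypothetical excerpt of a mapping table: core property -> native field.
MAPPING_TABLE: Dict[str, Dict[str, str]] = {
    "creator": {
        "xmp": "dc:creator",
        "id3": "TOPE",
    },
}


def get_property(source_format: str,
                 metadata: Mapping[str, Any],
                 core_property: str) -> Any:
    """Return the native value for a core property, or None if it is absent."""
    try:
        native_field = MAPPING_TABLE[core_property][source_format]
    except KeyError:
        raise KeyError(
            f"no mapping for {core_property!r} in format {source_format!r}"
        ) from None
    return metadata.get(native_field)
```

Note that this lookup only hides the field name: for ID3 the caller still gets a single string and for XMP a sequence, so the shape of the returned value remains format-dependent.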


3) Some proposals and important questions are:
- Should we define return types? 
- Can we use "domain" and "range" to structure the mismatches?
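
On the second question, "domain" and "range" can be written down without any RDF machinery. The following is only a sketch under my own assumptions: PropertySpec, CREATOR and the domain label "MediaObject" are invented for this example, and the range is expressed simply as the Python types allowed for each value.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Type


@dataclass(frozen=True)
class PropertySpec:
    name: str
    domain: str                    # kind of thing the property describes
    range_types: Tuple[Type, ...]  # types allowed for each individual value


# Hypothetical declaration: a creator of a media object is a plain string.
CREATOR = PropertySpec(name="creator", domain="MediaObject", range_types=(str,))


def values_in_range(spec: PropertySpec, values: Any) -> bool:
    """True if values is a list whose items all fall within the declared range."""
    return isinstance(values, list) and all(
        isinstance(v, spec.range_types) for v in values
    )
```

Under this declaration a list of names is accepted, while a single slash-separated string is rejected because it is not a list, which is one way to make a structural mismatch visible.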


If we at least agree on how to specify the "content" of the return value, then perhaps we can start to work bottom-up!


/Joakim


-----Original Message-----
From: Felix Sasaki [mailto:fsasaki@w3.org] 
Sent: den 2 februari 2009 00:00
To: Joakim Söderberg
Cc: Pierre-Antoine Champin; public-media-annotation@w3.org
Subject: Re: action 78 - Discussion about interoperability

Joakim Söderberg wrote:
> Felix,
> I also reacted to the statement that "domain" and "range" would imply RDF, which is not true.
>   

Fair enough.

> Further I believe that people are keen on thinking about the problems in terms of tools they master.
I think my main point is: currently nobody masters mapping of media 
annotations in a working implementation. So we have no existing approaches 
and tools, but two approaches to move forward:
1) Think of a mapping architecture top-down
2) Think of properties we want to map bottom-up, and see what 
architecture they require, and how they can be implemented in the tools 
we are used to
I am very much in favor of 2) and would propose to make the mapping 
table much smaller by asking: who is planning to implement a mapping for 
an existing format? If nobody volunteers for some properties, we drop 
that part of the format - or even the complete format.
If we continue to discuss in the style of 1), I am very worried that we 
end up with an elaborate, but very hard to implement architecture.

This is my personal opinion, but I think the co-chairs need to reach 
consensus on the general approach ASAP, so that people go in the same direction.

Felix


>  I myself have trouble seeing a solution without them. But maybe you can give an example?
>
> Best regards
> Joakim
>
> -----Original Message-----
> From: public-media-annotation-request@w3.org [mailto:public-media-annotation-request@w3.org] On Behalf Of Pierre-Antoine Champin
> Sent: den 28 januari 2009 12:51
> To: Felix Sasaki
> Cc: public-media-annotation@w3.org
> Subject: Re: action 78 - Discussion about interoperability
>
>
> Felix,
>
> I understand perfectly your concerns about the need to implement soon,
> in order to validate the spec, rather than implement too late and
> invalidate the parts of the spec that turn out to be too hard to implement.
>
> However, I find it quite difficult to start to implement before we have
> agreed a little more on some points. The mapping table captures an
> agreement on "high-level" semantics. We have started to discuss the
> syntax problems, especially in relation with req-r13, but have not
> reached a consensus yet. What I think is still very unclear are the
> "low-level" semantics features.
>
> And by the way, "domain" and "range" are not specific to RDF! It is
> true, though, that they imply a formalization. However, I guess Tobias
> advocates the point that having a formal ontology would make it easier
> to implement, not more difficult. Of course, that would require from
> implementers that they understand the formal ontology, which could
> hinder acceptance. But on the other hand, I believe that it would reduce
> the risks of having heterogeneous (hence not interoperable) implementations.
>
>   Pierre-Antoine
>
> Felix Sasaki wrote:
>   
>> Tobias Bürger wrote:
>>     
>>> Dear all,
>>>
>>> as promised in the telecon, here my reply to Felix' mail. See comment
>>> below
>>>
>>> Felix Sasaki wrote:
>>>       
>>>> Pierre-Antoine Champin wrote:
>>>>         
>>>>> Hi,
>>>>>
>>>>> for action 78, I had to write a wiki page about some concerns I raised
>>>>> during the last telecon about interoperability between mapped
>>>>> properties. Since this is supposed to be matter for discussion rather
>>>>> than a formal document, I think it is best to send it as a mail.
>>>>>
>>>>>
>>>>> What triggered my concern was the mapping for Media RSS, between
>>>>> ''dc:creator'' and ''dcterms:creator''. Just as a reminder, the Dublin
>>>>> Core vocabulary has two versions: the legacy "elements" (usually
>>>>> prefixed with ''dc'') and the "terms" (usually prefixed with
>>>>> ''dcterms''). Each term is more specific than its corresponding
>>>>> element,
>>>>> as its values are more constrained. For example, ''dc:creator'' can
>>>>> have
>>>>> any type of value (including a plain string), while ''dcterms:creator''
>>>>> must have a URI, which must denote an instance of ''dcterms:Agent''.
>>>>> If we decide to specify the ontology only as prose
>>>>> Let us consider the example of ''dc:creator'' with a sample of
>>>>> mappings:
>>>>>
>>>>> * for XMP, its value is a sequence of strings, each string being the
>>>>> name of an author.
>>>>>
>>>>> * for Media RSS, its value is either
>>>>>   - a plain string,
>>>>>   - an instance of ''foaf:Agent'' with at least a ''foaf:name'', or
>>>>>   - an instance of ''vcard'' with at least a ''fn''.
>>>>> Since they are using ''dcterms'', it must also be inferred to be a
>>>>> ''dcterms:Agent'' (which contradicts the use of a plain string...). It
>>>>> may represent only one ("the primary") creator.
>>>>>
>>>>> * for ID3, the value of TOPE is a string, where names are separated
>>>>> by "/".
>>>>>
>>>>>
>>>>> My point here is that, beyond the "high level" semantic links
>>>>> identified
>>>>> by the mapping table, there are some "low level" discrepancies that are
>>>>> both semantic (e.g. representing one or several creators) and syntactic
>>>>> (slash-separated string or structured sequence).
>>>>>
>>>>> Leaving these issues to the implementation will inevitably lead to
>>>>> major
>>>>> differences and a lack of interoperability. We could specify down to
>>>>> the
>>>>> syntactical level the mapping for each property in each format, but
>>>>> what
>>>>> about other formats ?
>>>>>
>>>>> I think a better way to limit the variability in implementations is to
>>>>> specify precisely, for each property of our ontology, the expected
>>>>> "low level" features of its value (and not only its "high level"
>>>>> meaning), so that implementors know what they can keep from the original
>>>>> metadata, and what they need to adapt (e.g. split ID3's TOPE field into
>>>>> multiple values).
>>>>>
>>>>> This has to be done at least at the API level. But I guess this could
>>>>> also be done to some extent at the ontology level (I do believe that
>>>>> those "low level" features are *not only* syntactic), but that raises
>>>>> again the problem of formally specifying the ontology or not.
>>>>>
>>>>> But the less specific we are in describing the ontology, the more
>>>>> precise we will have to be in describing the API, in order to avoid
>>>>> "low
>>>>> level" semantic discrepancies.
>>>>>   
>>>>>           
>>>> I agree very much with your analysis, Pierre-Antoine. +1 to have a
>>>> very lightweight ontology and to be more precise in the API
>>>> description. Also I am hoping very much that people will volunteer to
>>>> actually test the mappings in toy implementations, no matter if
>>>> relying on a complex ontology or a detailed API. No matter which way
>>>> we go, let's test them now.
>>>>         
>>> I guess the intention of the toy implementations should be to get a
>>> deeper understanding of which types of mismatches between the
>>> properties defined in the different formats might occur.
>>>       
>> No, not at all! The toy implementation or real implementation (whatever
>> you can create) is necessary for us to move forward in the W3C process.
>> My opinion is that we should work (toy or real) implementation driven:
>> use only the properties which are actually implemented in the API. The
>> others are dropped. E.g. everything is dropped from the mapping table
>> which is not implemented, *before* we go to last call.
>>
>>
>>     
>>> This might include data type mismatches, but also structural
>>> mismatches as outlined by Pierre-Antoine above. Whether you derive them
>>> by hard thinking or by prototype implementations does not matter, as
>>> long as we are aware of them, because we finally have to implement
>>> them at some point in time.
>>>       
>> The crucial part is "some point in time". I have seen several working
>> groups which developed very elaborate specs - but when it came to
>> implementing them they ran into difficulties and had to revise the
>> specs. In terms of W3C that means: going back to a normal working draft
>> and probably losing half a year. That is what I want to avoid by asking you
>> to work on implementations now.
>>
>>     
>>> Regarding the mapping, and more specifically from where we should map:
>>> The mapping should be to our core ontology to whose semantics we
>>> committed ourselves or will commit. So we will define what we allow as
>>> the domain and range of a property.
>>>       
>> The mapping does not need to be defined in terms of range and
>> properties. Please don't use RDF-specific terminology - we have no
>> agreement to restrict ourselves to this.
>>
>>
>>     
>>> And I disagree with the last statement from Pierre-Antoine above: if we
>>> describe the ontology less specifically, then we also do not need to be
>>> more precise in the API. It has been my understanding that this group
>>> defines an ontology consisting of a set of core properties for the
>>> description of media objects on the Web, to which all the formats in
>>> our scope will be mapped. That said, if you describe the ontology
>>> in a more lightweight way, meaning perhaps with less detail or a lower
>>> level of specificity, then you also map to something not very specific.
>>>       
>> I think we could avoid these kinds of discussions by just starting
>> implementing the mappings.
>>
>>
>>     
>>> For me the API is a means to transparently access a description of a
>>> media object in a format which I do not want to care about when
>>> accessing the API. So should we define return types? Or should the
>>> burden of identifying the return type be shifted to the user? (I guess
>>> we had this discussion before but did not come to a conclusion....)
>>>       
>> We have a requirement for this:
>> http://www.w3.org/TR/2009/WD-media-annot-reqs-20090119/#req-r13

>>
>> I think the current problem of the group is that there is an imbalance
>> between the goal of defining a read-only API, and the participants who
>> are mostly interested in an ontology, and also mostly in an RDF-based
>> ontology. One solution to this imbalance is to get other people on board
>> who are more interested in the API. I hope that this will happen soon.
>>
>> Felix
>>     
>
>
>
Received on Monday, 2 February 2009 10:21:17 UTC