- From: Pierre-Antoine Champin <pchampin@liris.cnrs.fr>
- Date: Fri, 23 Jan 2009 17:19:28 +0000
- To: public-media-annotation@w3.org
- Message-ID: <4979FC20.40406@liris.cnrs.fr>
Hi,

for action 78, I had to write a wiki page about some concerns I raised during the last telecon about interoperability between mapped properties. Since this is supposed to be a matter for discussion rather than a formal document, I think it is best to send it as a mail.

What triggered my concern was the mapping for Media RSS, between ''dc:creator'' and ''dcterms:creator''. Just as a reminder, the Dublin Core vocabulary has two versions: the legacy "elements" (usually prefixed with ''dc'') and the "terms" (usually prefixed with ''dcterms''). Each term is more specific than its corresponding element, as its values are more constrained. For example, ''dc:creator'' can have any type of value (including a plain string), while ''dcterms:creator'' must have a URI, which must denote an instance of ''dcterms:Agent''.

If we decide to specify the ontology only as prose

Let us consider the example of ''dc:creator'' with a sample of mappings:

* for XMP, its value is a sequence of strings, each string being the name of an author.
* for Media RDF, its value is either
  - a plain string,
  - an instance of ''foaf:Agent'' with at least a ''foaf:name'', or
  - an instance of ''vcard'' with at least a ''fn''.
  Since they are using ''dcterms'', it must also be inferred to be a ''dcterms:Agent'' (which contradicts the use of a plain string...). It may represent only one ("the primary") creator.
* for ID3, the value of TOPE is a string, where names are separated by "/".

My point here is that, beyond the "high level" semantic links identified by the mapping table, there are some "low level" discrepancies that are both semantic (e.g. representing one or several creators) and syntactic (slash-separated string or structured sequence). Leaving these issues to the implementation will inevitably lead to major differences and a lack of interoperability. We could specify the mapping for each property in each format down to the syntactic level, but what about other formats?

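
To make the syntactic side of this concrete, here is a minimal Python sketch of what two independent implementations would each have to do to return "the creators" in a common shape. The function names and the choice of a plain list of name strings as the common shape are assumptions made for illustration only; nothing here comes from the ontology or API drafts.

```python
from typing import Iterable, List


def creators_from_xmp(names: Iterable[str]) -> List[str]:
    """XMP: the value is already a sequence of strings, one per author.

    Hypothetical helper: only trims whitespace and drops empty entries.
    """
    return [name.strip() for name in names if name.strip()]


def creators_from_id3_tope(tope: str) -> List[str]:
    """ID3: TOPE is a single string with names separated by "/".

    Hypothetical helper: splits on "/" so that the result has the same
    shape as the XMP case (a list with one name per entry).
    """
    return [name.strip() for name in tope.split("/") if name.strip()]


if __name__ == "__main__":
    # The same two creators, as each format would carry them.
    print(creators_from_xmp(["Alice Example", "Bob Example"]))
    print(creators_from_id3_tope("Alice Example/Bob Example"))
```

An implementation that skipped the split and returned the TOPE string unchanged would still honour the "high level" mapping, yet a client would see one creator instead of two.
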
I think a better way to limit the variability in implementations is to specify precisely, for each property of our ontology, the expected "low level" features of its value (and not only its "high level" meaning), so that implementors know what they can keep from the original metadata and what they need to adapt (e.g. splitting ID3's TOPE field into multiple values).

This has to be done at least at the API level. But I guess this could also be done to some extent at the ontology level (I do believe that those "low level" features are *not only* syntactic), though that again raises the problem of whether or not to formally specify the ontology. The less specific we are in describing the ontology, the more precise we will have to be in describing the API, in order to avoid "low level" semantic discrepancies.

regards

Pierre-Antoine
Received on Friday, 23 January 2009 17:20:15 UTC