Re: Bodies translations: use cases and thoughts

Hi Paolo,

On 12/feb/2013, at 15:09, Paolo Ciccarese wrote:

Dear all,
now that the new draft of the specs has been published, I would like to discuss further some aspects that have been dropped along the way. One of them is languages and translations.

This is my scenario: I have textual content written in one language. As a curator, I pick an important sentence within that text and I provide, through annotation, translations of that particular passage into different languages. It could be even a little more complicated: we might need to keep track of multiple translations for each language, made at different moments in time or by different agents.

Does any other member have use cases about translations?

A maybe very small use case that fits here, from the text mining / NLP world, is (a method for) language identification, where you have a vocabulary of "important" words for each language and you need to identify those words in the text.
So if several of the vocabularies happen to have one or more words in common (say the word 'do', which means 'of' in Portuguese, 'give' in Italian and ... 'do' in English), then one would need a different translation for each of the matching dictionaries / languages.
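
A minimal sketch of that idea in Python, just to make the ambiguity concrete. The vocabularies, the extra words ("não", "non", "not") and the function names are illustrative assumptions, not part of any existing tool:

```python
from collections import Counter

# Hypothetical per-language vocabularies: word -> English gloss.
# Only the glosses for 'do' come from the example above; the rest is filler.
VOCABULARIES = {
    "pt": {"do": "of", "não": "not"},
    "it": {"do": "give", "non": "not"},
    "en": {"do": "do", "not": "not"},
}

def glosses_for(word):
    """Return {language: gloss} for every vocabulary that contains the word."""
    word = word.lower()
    return {lang: vocab[word] for lang, vocab in VOCABULARIES.items() if word in vocab}

def score_languages(text):
    """Count, per language, how many whitespace-separated tokens hit its vocabulary."""
    scores = Counter()
    for token in text.lower().split():
        for lang in glosses_for(token):
            scores[lang] += 1
    return scores

ambiguous = glosses_for("do")        # one gloss per matching language
scores = score_languages("do not")   # 'do' matches all three, 'not' only English
```

Each entry returned by glosses_for would then correspond to one translation body attached to the same word in the text.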


A couple of solutions have been discussed in previous email exchanges [1][2][3]:

1) Translations "by oa:Choice". This seems to represent well those cases in which we are modeling an actual choice.

 _:x a oa:Annotation ;
    oa:hasBody <choice1> ;
    oa:hasTarget <ny-times-article> .

 <choice1> a oa:Choice ;
    oa:default <comment-in-french> ;
    oa:item <comment-in-english> ;
    oa:item <comment-in-spanish> .

In my opinion this is not very clean, as it introduces the semantics of a choice where there is not always an actual choice.


However, it does not seem to fit the above use case, where all the translations are meant to be provided at the same time.
So I wonder what you think about:

 _:x a oa:Annotation ;
    oa:motivatedBy blah:translating ;
    oa:hasBody <comment-in-english> ;
    oa:hasBody <comment-in-spanish> ;
    oa:hasTarget <ny-times-article> .

this seems good to me.


2) Translations "by multilingual body":

_:x a oa:Annotation ;
   oa:hasBody <multilingualcomment> ;
   oa:hasTarget <ny-times-article> .

<multilingualcomment> rdfs:label "comment-in-french"@fr ;
   rdfs:label "comment-in-english"@en ;
   rdfs:label "comment-in-spanish"@es .

This could look more explicit; however, it introduces a new kind of Body.

Additional use cases? Thoughts?

Another option would be to just use separate annotations, but then you would have to find a different way to express that they all refer to the same text chunk.
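
One possible reading of this, sketched in Python with plain tuples standing in for RDF triples. The annotation identifiers (anno-en, anno-es) are hypothetical, and the sketch assumes every translation annotation can reuse exactly the same target resource, which is then what links them together:

```python
from collections import defaultdict

# Hypothetical triples: one annotation per translation, same target for all.
triples = [
    ("anno-en", "oa:hasBody", "comment-in-english"),
    ("anno-en", "oa:hasTarget", "ny-times-article"),
    ("anno-es", "oa:hasBody", "comment-in-spanish"),
    ("anno-es", "oa:hasTarget", "ny-times-article"),
]

def bodies_by_target(triples):
    """Group annotation bodies by the target their annotation points at."""
    target_of = {s: o for s, p, o in triples if p == "oa:hasTarget"}
    grouped = defaultdict(list)
    for s, p, o in triples:
        # Annotations without a target are skipped in this sketch.
        if p == "oa:hasBody" and s in target_of:
            grouped[target_of[s]].append(o)
    return dict(grouped)

translations = bodies_by_target(triples)
```

If the target were a specific passage rather than the whole article, each annotation would have to repeat (or point to) the same passage description for this grouping to work, which is the extra bookkeeping mentioned above.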

My 2 cents,
Tommaso


Best,
Paolo

[1] http://lists.w3.org/Archives/Public/public-openannotation/2012Oct/0004.html
[2] http://lists.w3.org/Archives/Public/public-openannotation/2012Nov/0001.html
[3] http://lists.w3.org/Archives/Public/public-openannotation/2012Nov/0006.html


--
Dr. Paolo Ciccarese
http://www.paolociccarese.info/
Biomedical Informatics Research & Development
Instructor of Neurology at Harvard Medical School
Assistant in Neuroscience at Mass General Hospital
Member of the MGH Biomedical Informatics Core
+1-857-366-1524 (mobile)   +1-617-768-8744 (office)


Received on Wednesday, 13 February 2013 09:17:30 UTC