Re: [bpmlod] Guidelines for converting BabelNet as Linguistic Linked Data from Roberto Navigli on 2014-05-22 (public-bpmlod@w3.org from May 2014)

From: Roberto Navigli <navigli@di.uniroma1.it>
Date: Thu, 22 May 2014 18:41:33 +0200
To: Tiziano Flati <tiziano.flati@gmail.com>
Cc: lider <lider@delicias.dia.fi.upm.es>, "public-bpmlod@w3.org" <public-bpmlod@w3.org>
Message-ID: <CAESezin_mJ4CU5EUQQDG=ZAMfWr6Co2SqzC89gfRk=4HMAAtbQ@mail.gmail.com>
Thanks Felix! To answer Dave's comment: translations come from the
automatic translations of semantically annotated corpora, as Tiziano said,
and we have a confidence for each of these translations together with the
source of the original text.

Best,
Roberto


2014-05-22 18:35 GMT+02:00 Tiziano Flati <tiziano.flati@gmail.com>:

> @Felix:
>
>> I am wondering if ITS 2.0 properties could help here, see
>> https://www.w3.org/International/its/wiki/ITS-RDF_mapping
>> There is mtConfidence, which provides the confidence value for machine
>> translation, and mtConfidenceAnnotatorsRef to identify the tool used.
>> Also, there are provenance-related properties, from :org
>> to :revToolRef, that could identify the provenance information you need.
>> The underlying definitions for the two ITS data categories are at
>> http://www.w3.org/TR/its20/#provenance
>> http://www.w3.org/TR/its20/#mtconfidence
>
> Yes, I think ITS 2.0 is definitely worth exploring. At the moment I
> don't think we need modelling properties more complex than those
> (such as mtConfidenceRule, etc.), so I think this fits our needs
> well.
>
> @Lewis:
>
>> Do you currently know the provenance of the translations between senses in
>> BabelNet? Have you produced any of the translations yourself, or do you
>> just take the links where they are present in the source resources, e.g.
>> DBpedia?
>> What is the policy in BabelNet: is some translation better than none, or
>> is there a translation confidence threshold, e.g. based on human checking,
>> MT confidence, or logical inference, that you employ?
>>
> BabelNet translations can come from explicit resource information (e.g.,
> Wikipedia interlanguage links) or from automatic translations supported by
> millions of sense-tagged sentences coming from Wikipedia and SemCor.
> In conclusion, AFAIK, BabelNet *does have* translation quality estimation,
> so I think an indication of confidence could also be provided.
> (Roberto, correct me if I am wrong)
>
> Thank you all for your comments and suggestions :)
> Tiziano
>
> 2014-05-22 16:07 GMT+02:00 Dave Lewis <dave.lewis@cs.tcd.ie>:
>
>  Hi Tiziano, Roberto,
>> Do you currently know the provenance of the translations between senses in
>> BabelNet? Have you produced any of the translations yourself, or do you
>> just take the links where they are present in the source resources, e.g.
>> DBpedia?
>>
>> In the localization and MT applications we look at in CNGL and FALCON, where
>> we may use translations to guide translators or help train MT engines,
>> provenance is important so that policies can be applied to reduce the
>> propagation of inaccurate translations, or of translations that are not
>> appropriate to the context at hand - so those ITS attributes are really
>> important there. To this extent, when representing this as linked data, we
>> define 'wasTranslatedFrom' as a property of 'prov:wasDerivedFrom' to reify
>> other provenance meta-data - agents, tools, context etc.
>>
>> What is the policy in BabelNet: is some translation better than none, or
>> is there a translation confidence threshold, e.g. based on human checking,
>> MT confidence, or logical inference, that you employ?
>>
>> many thanks,
>> Dave
>>
>>
>> On 22/05/2014 10:42, Felix Sasaki wrote:
>>
>> Hi Tiziano,
>>
>>  sorry that I could not make the call due to personal reasons.
>>
>>  In the draft I saw under „translation“ this issue:
>>
>>  „Issues: Information about translation confidence (was it humanly or
>> automatically produced? if automatic, with what confidence score?) and
>> translation provenance (what text(s) does the translation come from? who
>> translated and with what tool?).
>> Another issue concerns whether the relation lexinfo:translation is
>> essential or not: every sense in a language within a BabelSynset is, in
>> fact, a translation of any other sense in another language, so that this
>> information could actually be derived (problem of redundancy). However,
>> having the data explicitly linked to each other could also be a benefit,
>> since the information is then explicit in the resource.“
>>
>>  I am wondering if ITS 2.0 properties could help here, see
>>
>>  https://www.w3.org/International/its/wiki/ITS-RDF_mapping
>>
>>  There is mtConfidence, which provides the confidence value for machine
>> translation, and mtConfidenceAnnotatorsRef to identify the tool used.
>>
>>  Also, there are provenance-related properties, from :org
>> to :revToolRef, that could identify the provenance information you need.
>> The underlying definitions for the two ITS data categories are at
>> http://www.w3.org/TR/its20/#provenance
>> http://www.w3.org/TR/its20/#mtconfidence
>>
>>  Best,
>>
>>  Felix
>>
>>  Am 22.05.2014 um 10:12 schrieb Tiziano Flati <tiziano.flati@gmail.com>:
>>
>>  Dear all,
>>
>>  we have compiled a first draft of guidelines for the conversion of
>> BabelNet as Linguistic Linked Data. The initial draft is here<https://docs.google.com/document/d/184C_AjY7_PYBSc8SnAFghGLyTo1v312N34dsP9QZokI/edit#>
>> .
>>
>>  We can probably integrate this into the BPMLOD community report both as
>> a separate document and in the form of all our resource-dependent and
>> independent details/comments.
>> Any feedback and comments are very much appreciated and will help us
>> improve the draft.
>>
>>  Best regards,
>> Tiziano Flati and Roberto Navigli
>>
>>
>>
>>
>


-- 
=====================================
Roberto Navigli
Dipartimento di Informatica
Sapienza University of Rome
Viale Regina Elena 295 (second floor)
00161 Roma Italy
Phone: +39 0649255161 - Fax: +39 06 8541842
Home Page: http://wwwusers.di.uniroma1.it/~navigli
=====================================
Received on Thursday, 22 May 2014 16:46:34 UTC