Re: [bpmlod] Guidelines for converting BabelNet as Linguistic Linked Data from Tiziano Flati on 2014-05-22 (public-bpmlod@w3.org from May 2014)

From: Tiziano Flati <tiziano.flati@gmail.com>
Date: Thu, 22 May 2014 18:35:24 +0200
To: lider <lider@delicias.dia.fi.upm.es>, public-bpmlod@w3.org
Message-ID: <CAGNQ8qaPSb7XY1ewavadHGX6sCECeUvjVM6vm_CWQvL9s141Jg@mail.gmail.com>

@Felix:

> I am wondering if ITS 2.0 properties could help here, see
> https://www.w3.org/International/its/wiki/ITS-RDF_mapping
> There is mtConfidence, which provides the confidence value for machine
> translation, and mtConfidenceAnnotatorsRef to identify the tool used.
> Also, there are provenance-related properties, from :org through
> :revToolRef, that could identify the provenance information you need.
> The underlying definitions for the two ITS data categories are at
> http://www.w3.org/TR/its20/#provenance
> http://www.w3.org/TR/its20/#mtconfidence

Yes, I think that ITS 2.0 is definitely a very good option to explore.
At the moment I don't think we need modelling properties more complex
than those (such as mtConfidenceRule, etc.), so I think this fits our
needs well.
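
For illustration only, here is a minimal sketch of how a confidence score
and the tool reference could be attached to a sense as N-Triples. The
function name, the example URIs, the namespace URI and the xsd:double
datatype are all assumptions on my side, not something taken from the
draft; only the property names mtConfidence and mtConfidenceAnnotatorsRef
come from the mapping page, and the [0, 1] range from the ITS 2.0
definition of MT confidence.

```python
# Assumed namespace for the ITS-RDF mapping; check the wiki page before reuse.
ITSRDF = "http://www.w3.org/2005/11/its/rdf#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"  # assumed datatype


def translation_triples(sense_uri, annotator_ref, confidence):
    """Return two N-Triples lines: the MT confidence of a sense and the tool behind it.

    sense_uri and annotator_ref are hypothetical URIs chosen by the caller.
    """
    # ITS 2.0 defines MT confidence as a number between 0 and 1 inclusive.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("mtConfidence must lie in the interval [0, 1]")
    return [
        f'<{sense_uri}> <{ITSRDF}mtConfidence> "{confidence}"^^<{XSD_DOUBLE}> .',
        f"<{sense_uri}> <{ITSRDF}mtConfidenceAnnotatorsRef> <{annotator_ref}> .",
    ]


if __name__ == "__main__":
    for line in translation_triples(
        "http://example.org/sense/casa_IT",   # hypothetical sense URI
        "http://example.org/tools/some-mt",   # hypothetical tool URI
        0.8,
    ):
        print(line)
```

Whether the score should hang off the sense itself or off a separate
translation node is exactly the kind of modelling choice still open in
the draft.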

@Lewis:

> Do you currently know the provenance of the translations between senses
> in BabelNet? Have you produced any of the translations yourself, or do
> you just take the links where they are present in the source resources,
> e.g. DBpedia?
> What is the policy in BabelNet: is some translation better than none, or
> is there a translation confidence threshold, e.g. based on human checking,
> MT confidence or logical inference etc., that you employ?

BabelNet translations can come either from explicit resource information
(e.g., Wikipedia interlanguage links) or from automatic translation supported
by millions of sense-tagged sentences drawn from Wikipedia and SemCor.
So, AFAIK, BabelNet *does have* translation quality estimation, and I think
an indication of confidence could also be provided.
(Roberto, correct me if I am wrong)
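
To make the distinction concrete, a rough sketch of how a consumer could
apply a threshold policy of the kind Dave asks about. Everything here is
hypothetical: the Translation record, the "explicit"/"automatic" labels,
the keep() helper and the 0.5 default are illustrative assumptions, not
BabelNet's actual data model or policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Translation:
    """Hypothetical record for one translation link between two senses."""
    source_sense: str
    target_sense: str
    origin: str                         # "explicit" (e.g. interlanguage link) or "automatic"
    confidence: Optional[float] = None  # only meaningful for automatic translations


def keep(translation: Translation, threshold: float = 0.5) -> bool:
    """Accept explicit links always; accept automatic ones at or above the threshold."""
    if translation.origin == "explicit":
        return True
    return translation.confidence is not None and translation.confidence >= threshold


if __name__ == "__main__":
    candidates = [
        Translation("house_EN", "casa_IT", "explicit"),
        Translation("house_EN", "maison_FR", "automatic", 0.9),
        Translation("house_EN", "haus_XX", "automatic", 0.2),
    ]
    print([t.target_sense for t in candidates if keep(t)])
```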

Thank you all for your comments and suggestions :)
Tiziano

2014-05-22 16:07 GMT+02:00 Dave Lewis <dave.lewis@cs.tcd.ie>:

>  Hi Tiziano, Roberto,
> Do you currently know the provenance of the translations between senses
> in BabelNet? Have you produced any of the translations yourself, or do
> you just take the links where they are present in the source resources,
> e.g. DBpedia?
>
> In the localization and MT applications we look at in CNGL and FALCON,
> where we may use translations to guide translators or help train MT
> engines, provenance is important so that policies can be applied to reduce
> the propagation of inaccurate translations, or of translations that are
> not appropriate to the context at hand - so those ITS attributes are
> really important there. To this end, when representing this as linked
> data, we define 'wasTranslatedFrom' as a property of 'prov:wasDerivedFrom'
> to reify other provenance meta-data - agents, tools, context etc.
>
> What is the policy in BabelNet: is some translation better than none, or
> is there a translation confidence threshold, e.g. based on human checking,
> MT confidence or logical inference etc., that you employ?
>
> many thanks,
> Dave
>
>
> On 22/05/2014 10:42, Felix Sasaki wrote:
>
> Hi Tiziano,
>
>  sorry that I could not make the call, for personal reasons.
>
>  In the draft I saw under „translation“ this issue:
>
>  „Issues: Information about translation confidence (was it humanly or
> automatically produced? if automatic, with what confidence score?) and
> translation provenance (what text(s) does the translation come from? who
> translated and with what tool?).
> Another issue concerns whether the relation lexinfo:translation is
> essential or not: every sense in a language within a BabelSynset is, in
> fact, a translation of any other sense in another language, so that this
> information could actually be derived (problem of redundancy). However,
> having the data explicitly linked to each other could also be a benefit,
> since the information is then explicit in the resource.“
>
>  I am wondering if ITS 2.0 properties could help here, see
>
>  https://www.w3.org/International/its/wiki/ITS-RDF_mapping
>
>  There is mtConfidence, which provides the confidence value for machine
> translation, and mtConfidenceAnnotatorsRef to identify the tool used.
>
>  Also, there are provenance-related properties, from :org through
> :revToolRef, that could identify the provenance information you need.
> The underlying definitions for the two ITS data categories are at
> http://www.w3.org/TR/its20/#provenance
> http://www.w3.org/TR/its20/#mtconfidence
>
>  Best,
>
>  Felix
>
>  On 22.05.2014 at 10:12, Tiziano Flati <tiziano.flati@gmail.com> wrote:
>
>  Dear all,
>
>  we have compiled a first draft of guidelines for the conversion of
> BabelNet as Linguistic Linked Data. The initial draft is here<https://docs.google.com/document/d/184C_AjY7_PYBSc8SnAFghGLyTo1v312N34dsP9QZokI/edit#>
> .
>
>  We can probably integrate this into the BPMLOD community report both as
> a separate document and in the form of all our resource-dependent and
> independent details/comments.
> Any feedback and comments are very much appreciated and will help us
> improve the draft.
>
>  Best regards,
> Tiziano Flati and Roberto Navigli
>
>
>
>
Received on Friday, 23 May 2014 06:51:59 UTC