- From: Jorge Gracia <jgracia@fi.upm.es>
- Date: Fri, 23 May 2014 19:42:13 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- Cc: Dave Lewis <dave.lewis@cs.tcd.ie>, Roberto Navigli <navigli@di.uniroma1.it>, Tiziano Flati <tiziano.flati@gmail.com>, lider <lider@delicias.dia.fi.upm.es>, "public-bpmlod@w3.org" <public-bpmlod@w3.org>
- Message-ID: <CANzuSaMUD9aRnFqZEMoSVchfahBjR_i_VV280LBjE7-Jqx0Khg@mail.gmail.com>
Hi Felix,

Yes, I think that exploring the commonalities of both models makes a lot of sense. Not sure if they have to be merged, but I have the feeling that our lemon module could largely reuse ITS for some things.

At the ontolex group we will treat the variation/translation module again at some point, I think. That would be a good opportunity to explore the role of ITS. I will keep you updated!

Regards,
Jorge

2014-05-23 18:49 GMT+02:00 Felix Sasaki <fsasaki@w3.org>:

> Hi Jorge and all,
>
> would it make sense to ask the ontolex group and the ITS IG to merge their
> models? Otherwise there would be a confusing situation: two models for the
> same purpose.
>
> The issues are probably details. I saw e.g. in the paper that there is a
> translationConfidence OWL property. It looks similar to mtConfidence in ITS,
> but there are details ideally to merge like what data type to use, whether
> to require relating confidence value to information about translation tools
> (because auto generated values cannot be interpreted without) etc.
>
> Best,
>
> Felix
>
> On 23.05.2014 at 15:48, Jorge Gracia <jgracia@fi.upm.es> wrote:
>
> Dear Tiziano, Roberto
>
> You could also consider using the lemon translation module to represent
> explicit translations as linked data. This is currently under development
> in the ONTOLEX group but there is a lemon-based version already available,
> that I will present at LREC next week [1]. The idea is to reify the
> translation relation so you can attach additional information to it
> (source, target, confidence, provenance, etc.) [2]
>
> Regards,
>
> Jorge
>
> [1]
> http://ra.cps.unizar.es:8080/PUBLICATIONS/attachedFiles/document/LREC2014_translations_V11.pdf
> [2] http://purl.org/net/translation#
>
>
> 2014-05-23 11:58 GMT+02:00 Dave Lewis <dave.lewis@cs.tcd.ie>:
>
>> Roberto, Tiziano,
>> Thanks for that.
>>
>> Have you considered already how you might allow third parties to QA and
>> perhaps correct those translations?
>> That is, some sort of process by which
>> proposed MT translations between senses can be promoted to more
>> authoritative, human checked translations, and marked as such?
>>
>> The ITS text analytics and/or terminology data categories, which also
>> have confidence scores, could be useful for annotating such a process:
>> http://www.w3.org/TR/its20/#textanalysis
>> http://www.w3.org/TR/its20/#terminology
>>
>> To enable such checking and progression in the authoritativeness of
>> senses in different languages, it is important that you record what senses
>> are a translation of what other senses.
>>
>> In relation to the senses that are extracted from Wikipedia interlanguage
>> links: do you consider those 'translations', and in particular can you tell
>> from those which is the source and which is the target?
>>
>> Interested to hear what you think.
>>
>> cheers,
>> Dave
>>
>>
>> On 22/05/2014 17:41, Roberto Navigli wrote:
>>
>> Thanks Felix! To answer Dave's comment: translations come from the
>> automatic translations of semantically annotated corpora, as Tiziano said,
>> and we have a confidence for each of these translations together with the
>> source of the original text.
>>
>> Best,
>> Roberto
>>
>>
>> 2014-05-22 18:35 GMT+02:00 Tiziano Flati <tiziano.flati@gmail.com>:
>>
>>> @Felix:
>>>
>>>> I am wondering if ITS 2.0 properties could help here, see
>>>> https://www.w3.org/International/its/wiki/ITS-RDF_mapping
>>>> There is mtConfidence which provides the confidence value for machine
>>>> translation and mtConfidenceAnnotatorsRef to identify the tool used.
>>>> Also, there are provenance-related properties, starting at :org,
>>>> until :revToolRef, that could identify the provenance information you need.
>>>> The underlying definitions for the two ITS data categories are at
>>>> http://www.w3.org/TR/its20/#provenance
>>>> http://www.w3.org/TR/its20/#mtconfidence
>>>
>>> Yes, I think that the ITS 2.0 can definitely be a very good point to
>>> explore.
>>> At the moment I don't think we need modelling properties more
>>> complex than those ones (such as mtConfidenceRule, etc.), so I think this
>>> fits our needs well.
>>>
>>> @Lewis:
>>>
>>>> Do you know currently the provenance of the translation between senses
>>>> in BabelNet? Have you produced any of the translations yourself, or do you
>>>> just take the links where they are present in the source resources, e.g.
>>>> DBpedia?
>>>> What is the policy in BabelNet, is some translation better than none,
>>>> or is there a translation confidence threshold, e.g. based on human
>>>> checking, MT confidence or logical inference etc. that you employ?
>>>>
>>> BabelNet translations can come from explicit resource information (e.g.,
>>> Wikipedia interlanguage links) or as automatic translations supported by
>>> millions of sense-tagged sentences coming from Wikipedia and Semcor.
>>> In conclusion, AFAIK, BabelNet *does have* translation quality
>>> estimation, so I think that an indication about confidence could also be
>>> provided. (Roberto, correct me if I am wrong)
>>>
>>> Thank you all for your comments and suggestions :)
>>> Tiziano
>>>
>>> 2014-05-22 16:07 GMT+02:00 Dave Lewis <dave.lewis@cs.tcd.ie>:
>>>
>>> Hi Tiziano, Roberto,
>>>> Do you know currently the provenance of the translation between senses
>>>> in BabelNet? Have you produced any of the translations yourself, or do you
>>>> just take the links where they are present in the source resources, e.g.
>>>> DBpedia?
>>>>
>>>> In a localization or MT application we look at in CNGL and FALCON,
>>>> where we may use translation to guide translators or help train MT
>>>> engines, the provenance is important so some policies can be applied to
>>>> reduce the propagation of inaccurate translations, or translations that are
>>>> not appropriate to the context at hand - so those ITS attributes are really
>>>> important there.
>>>> To this end, when representing this as linked data, we
>>>> define 'wasTranslatedFrom' as a property of 'prov:wasDerivedFrom' to reify
>>>> other provenance meta-data - agents, tools, context etc.
>>>>
>>>> What is the policy in BabelNet, is some translation better than none,
>>>> or is there a translation confidence threshold, e.g. based on human
>>>> checking, MT confidence or logical inference etc. that you employ?
>>>>
>>>> many thanks,
>>>> Dave
>>>>
>>>>
>>>> On 22/05/2014 10:42, Felix Sasaki wrote:
>>>>
>>>> Hi Tiziano,
>>>>
>>>> sorry that I could not make the call due to personal reasons.
>>>>
>>>> In the draft I saw under „translation“ this issue:
>>>>
>>>> „Issues: Information about translation confidence (was it humanly or
>>>> automatically produced? if automatic, with what confidence score?) and
>>>> translation provenance (what text(s) does the translation come from? who
>>>> translated and with what tool?).
>>>> Another issue concerns whether the relation lexinfo:translation is
>>>> essential or not: every sense in a language within a BabelSynset is, in
>>>> fact, a translation of any other sense in another language, so that this
>>>> information could actually be derived (problem of redundancy). However,
>>>> having data linked one to each other could also be a benefit, since
>>>> the information is explicit in the resource.“
>>>>
>>>> I am wondering if ITS 2.0 properties could help here, see
>>>>
>>>> https://www.w3.org/International/its/wiki/ITS-RDF_mapping
>>>>
>>>> There is mtConfidence which provides the confidence value for machine
>>>> translation and mtConfidenceAnnotatorsRef to identify the tool used.
>>>>
>>>> Also, there are provenance-related properties, starting at :org,
>>>> until :revToolRef, that could identify the provenance information you need.
>>>> The underlying definitions for the two ITS data categories are at
>>>> http://www.w3.org/TR/its20/#provenance
>>>> http://www.w3.org/TR/its20/#mtconfidence
>>>>
>>>> Best,
>>>>
>>>> Felix
>>>>
>>>> On 22.05.2014 at 10:12, Tiziano Flati <tiziano.flati@gmail.com> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> we have compiled a first draft of guidelines for the conversion of
>>>> BabelNet as Linguistic Linked Data. The initial draft is here
>>>> <https://docs.google.com/document/d/184C_AjY7_PYBSc8SnAFghGLyTo1v312N34dsP9QZokI/edit#>.
>>>>
>>>> We can probably integrate this into the BPMLOD community report both
>>>> as a separate document and in the form of all our resource-dependent and
>>>> independent details/comments.
>>>> Any feedback and comments are also much appreciated and will help us
>>>> improve the draft.
>>>>
>>>> Best regards,
>>>> Tiziano Flati and Roberto Navigli
>>>>
>>
>> --
>> =====================================
>> Roberto Navigli
>> Dipartimento di Informatica
>> Sapienza University of Rome
>> Viale Regina Elena 295 (second floor)
>> 00161 Roma Italy
>> Phone: +39 0649255161 - Fax: +39 06 8541842
>> Home Page: http://wwwusers.di.uniroma1.it/~navigli
>> =====================================
>
> --
> Jorge Gracia, PhD
> Ontology Engineering Group
> Artificial Intelligence Department
> Universidad Politécnica de Madrid
> http://delicias.dia.fi.upm.es/~jgracia/

--
Jorge Gracia, PhD
Ontology Engineering Group
Artificial Intelligence Department
Universidad Politécnica de Madrid
http://delicias.dia.fi.upm.es/~jgracia/
Received on Friday, 23 May 2014 17:43:01 UTC
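
Editorial sketch: the thread describes reifying the translation relation so that source, target, confidence and provenance can be attached to it, and asks whether a confidence threshold is applied. The Python below is only a rough illustration of that idea, not the lemon translation module or the ITS-RDF mapping. Every name in it (the `Translation` class, the `ex:` prefixed terms, `above_threshold`) is hypothetical and chosen for this example; the actual vocabularies define their own terms.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Translation:
    """A reified translation: the relation is itself a node that carries metadata.

    All prefixed names emitted below use a placeholder 'ex:' vocabulary
    (an assumption for this sketch, not terms from lemon or ITS).
    """
    node_id: str                        # identifier of the translation node
    source_sense: str                   # sense being translated
    target_sense: str                   # sense it is translated into
    confidence: Optional[float] = None  # None when no score is available
    tool: Optional[str] = None          # producing tool, if automatic

    def triples(self) -> Iterator[Triple]:
        """Yield (subject, predicate, object) triples describing this node."""
        yield (self.node_id, "rdf:type", "ex:Translation")
        yield (self.node_id, "ex:source", self.source_sense)
        yield (self.node_id, "ex:target", self.target_sense)
        if self.confidence is not None:
            yield (self.node_id, "ex:confidence", f'"{self.confidence}"^^xsd:double')
        if self.tool is not None:
            yield (self.node_id, "ex:producedBy", self.tool)


def above_threshold(translations: List[Translation], threshold: float) -> List[Translation]:
    """Keep only translations that have a score at or above the threshold."""
    return [
        t for t in translations
        if t.confidence is not None and t.confidence >= threshold
    ]


if __name__ == "__main__":
    t = Translation("ex:tr1", "ex:sense_en", "ex:sense_es", 0.87, "ex:someMtTool")
    for s, p, o in t.triples():
        print(f"{s} {p} {o} .")
```

Because the relation is a node of its own, a later human check could add further statements to it without touching the two senses it links, which is the progression from MT output to checked translation raised in the thread.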