- From: Ilan Kernerman <ilan@kdictionaries.com>
- Date: Tue, 17 Dec 2019 18:15:41 +0000
- To: "'afrilex@freelists.org'" <afrilex@freelists.org>, "'asialex@freelists.org'" <asialex@freelists.org>, "'dsna@yahoogroups.com'" <dsna@yahoogroups.com>, "'euralex@freelists.org'" <euralex@freelists.org>, "elexis-all@googlegroups.com" <elexis-all@googlegroups.com>, "'enel-all@googlegroups.com'" <enel-all@googlegroups.com>, "'public-ontolex@w3.org'" <public-ontolex@w3.org>, "public-ld4lt@w3.org" <public-ld4lt@w3.org>, "nexus-mc-all@delicias.dia.fi.upm.es" <nexus-mc-all@delicias.dia.fi.upm.es>, "nlp2rdf@lists.informatik.uni-leipzig.de" <nlp2rdf@lists.informatik.uni-leipzig.de>
- CC: Simon Krek <simon.krek@guest.arnes.si>, "John P. McCrae" <john.mccrae@insight-centre.org>, Jorge Gracia <jogracia@unizar.es>, lrec pisa <lrec@ilc.cnr.it>, "lrec@lrec-conf.org" <lrec@lrec-conf.org>
- Message-ID: <AM5PR03MB3140E5589C284C14ED0A3126CA500@AM5PR03MB3140.eurprd03.prod.outlook.com>
CALL FOR PAPERS – GLOBALEX 2020 – Linked Lexicography

Full-day workshop at LREC2020 | Marseille, France | May 12, 2020
Submission deadline: February 14, 2020 (see also Important dates below)
Workshop website: https://globalex.link/events/workshops/globalex-workshop-2020/

WORKSHOP DESCRIPTION

The GLOBALEX 2020 Workshop @ LREC will follow up on the successful GLOBALEX workshops at LREC 2016 (https://globalex2016.globalex.link/) and LREC 2018 (https://globalex2018.globalex.link/). It is organized by Globalex – Global Alliance for Lexicography (https://globalex.link/), with support from:
● ELEXIS (EU's H2020-funded project European Lexicographic Infrastructure, https://elex.is/)
● TIAD (Translation Inference Across Dictionaries shared tasks and workshops, https://tiad2020.unizar.es/, https://tiad2019.unizar.es/, https://tiad2017.wordpress.com/)

This third iteration of GLOBALEX workshops at LREC will focus on linking data from lexicographic resources and will highlight aspects related to the automated linking of content among different dictionaries and other lexical sources, with the aim of enhancing linguistic data generation, enrichment and reinforcement.

Linking lexicographic data sets to each other and to other lexical resources, and in particular the interoperability of lexicography with Linked Data (LD) methodologies, have been gaining substantial attention in recent years, becoming a subject of various projects for research by and collaboration between academia and industry, including support of the public sector. Most notably, the W3C community group on Ontology-Lexica [1] was established following the release of the lemon model, which constituted the first de-facto standard for representing ontology-lexica, with the mission to “develop models for the representation of lexica (and machine readable dictionaries) relative to ontologies” [2].
The ensuing OntoLex-lemon model [3], [4] has served since 2016 as the leading option for conversion of lexicographic data into LD, and has recently been updated with the lexicog module [5] released on 17 September 2019 [6]. This trend has been complemented since 2015 by relevant literature (e.g. [7], [8], [9]), conference papers (e.g. [10], [11], [12], [13]) and EU-funded projects ([14] and [15], [16]).

Besides a section including general research papers, the workshop will include two shared task tracks – one on linking monolingual data and the other on linking bilingual and multilingual data, as follows:

(1) Monolingual Word Sense Alignment – in conjunction with a shared task conducted by ELEXIS. Task 1 will be evaluated on novel dictionary linking data developed by the ELEXIS project [15], which will cover linking for the following languages: Danish, Dutch, English, Estonian, German, Hungarian, Irish, Italian, Serbian, Slovene and Russian.

(2) Linking Bilingual and Multilingual Lexicographic Resources – in conjunction with the 3rd TIAD shared task. Task 2 will host the 3rd edition of the Translation Inference Across Dictionaries (TIAD) shared task, of which previous editions were co-located at Language, Data and Knowledge conferences [17], [18]. The aim is to explore methods and techniques for automatically generating new bilingual (and multilingual) dictionaries from existing ones in the context of a coherent experiment framework that enables reliable validation of results and solid comparison of the processes used. In particular, the participating systems will be asked to generate new translations automatically among three languages – English, French, Portuguese – based on known translations contained in the Apertium RDF graph [19]. The inclusion of other language pairs will also be possible for this edition.
MAIN TOPICS

We welcome any topic related to the main theme of linking lexicographic resources, including but not limited to:
● Linking monolingual dictionaries and lexicographic resources
● Linking bilingual dictionaries and lexicographic resources
● Linking multilingual dictionaries and lexicographic resources
● Linking lexicographic data with other lexical data resources
● Applications and developments of the OntoLex-lemon model and its lexicography module
● RDF serializations of lexicographic data
● Non-RDF data formats for linked lexicographic resources
● RDF and XML standards for linked lexicography
● Converting lexicographic data for linking purposes
● Linked Data-native lexicographic resources
● Automated generation of lexicographic resources based on Linked Data technologies
● Lexicography, terminology and Linguistic Linked (Open) Data
● Linked lexicography and the Semantic Web
● Linked lexicography and the Multilingual Digital Single Market
● Linked lexicography and Knowledge Systems
● Linked lexicography and Artificial/Augmented Intelligence
● Linked lexicography, deep learning and neural networks

AUDIENCE

● Lexicographers and dictionary makers
● Computational and corpus linguists
● NLP researchers and engineers
● Terminologists
● Big data analysts
● Reference scientists and knowledge system managers

SUBMISSION INFORMATION

There are two types of submissions:
● Abstract (500-1,000 words) OR
● Full paper (6-10 pages)

For formatting guidelines for full papers, please use the LREC submission format (http://lrec2020.lrec-conf.org/en/submission/authors-kit/). Both abstracts and full papers will address any of the topics included in this CfP, but full papers have the advantage of presenting the authors’ work and ideas at a greater level of detail. All submissions must be received by the deadline below and will be reviewed by experts in the field.
Authors of accepted proposals will be invited (but not required) to submit a full paper for publication in the workshop proceedings. Further details on the submission procedure will be provided on the workshop website later on.

IMPORTANT DATES

Submission deadline: February 14, 2020
Notification of acceptance: March 13, 2020
Camera-ready papers: April 15, 2020
GLOBALEX Workshop: May 12, 2020

ORGANIZERS

● Ilan Kernerman, K Dictionaries
● Simon Krek, Globalex, Jožef Stefan Institute

TRACK 1 ORGANIZERS

● John McCrae, National University of Ireland – Galway
● Sina Ahmadi, National University of Ireland – Galway

TRACK 2 ORGANIZERS

● Jorge Gracia, University of Zaragoza
● Besim Kabashi, Friedrich-Alexander University of Erlangen-Nuremberg and Ludwig-Maximilian University of Munich

SCIENTIFIC COMMITTEE (to be announced)

CONTACT (to be announced)

REFERENCES

[1] https://www.w3.org/community/ontolex/.
[2] McCrae, J., G. Aguado-de Cea, P. Buitelaar, P. Cimiano, T. Declerck, A. Gomez-Perez, J. Gracia, L. Hollink, E. Montiel-Ponsoda, D. Spohr, and T. Wunner. 2012. Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 46, pp. 701–719.
[3] https://www.w3.org/2016/05/ontolex/.
[4] McCrae, J., J. Bosque-Gil, J. Gracia, P. Buitelaar, and P. Cimiano. 2017. The OntoLex-Lemon Model: Development and Applications. In Kosem et al. (eds.) Electronic lexicography in the 21st century. Proceedings of eLex 2017 conference, in Leiden, Netherlands. Lexical Computing CZ s.r.o., pp. 587–597. https://elex.link/elex2017/wp-content/uploads/2017/09/paper36.pdf/.
[5] https://www.w3.org/ns/lemon/lexicog#/.
[6] https://www.w3.org/2019/09/lexicog/.
[7] Gracia, J. 2015. Multilingual dictionaries and the Web of Data. Kernerman Dictionary News, 23, pp. 1–4. https://www.kdictionaries.com/kdn/kdn23_2015.pdf/.
[8] Klimek, B., and M. Brummer. 2015. Enhancing lexicography with semantic language databases. Kernerman Dictionary News, 23, pp. 5–10. https://www.kdictionaries.com/kdn/kdn23_2015.pdf/.
[9] Bosque-Gil, J., J. Gracia, and A. Gomez-Perez. 2016. Linked data in lexicography. Kernerman Dictionary News, 24, pp. 19–24. https://www.kdictionaries.com/kdn/kdn24_2016.pdf/.
[10] Declerck, T., E. Wandl-Vogt, and K. Morth. 2015. Towards a Pan European Lexicography by Means of Linked (Open) Data. In Kosem et al. (eds.) Proceedings of eLex 2015. Biennial Conference on Electronic Lexicography (eLex2015), electronic lexicography in the 21st century: Linking lexical data in the digital age. Ljubljana/Brighton: Trojina, Institute for Applied Slovene Studies, Ljubljana, pp. 342–355. https://elex.link/elex2015/proceedings/eLex_2015_22_Declerck+etal.pdf/.
[11] Abromeit, F., C. Chiarcos, C. Fath, and M. Ionov. 2016. Linking the Tower of Babel: Modelling a Massive Set of Etymological Dictionaries as RDF. In Proceedings of the 5th Workshop on Linked Data in Linguistics: Managing, Building and Using Linked Language Resources (LDL-2016). pp. 11–19.
[12] Bosque-Gil, J., J. Gracia, and E. Montiel-Ponsoda. 2017. Towards a Module for Lexicography in OntoLex. In Proceedings of the LDK workshops: OntoLex, TIAD and Challenges for Wordnets at 1st Language Data and Knowledge conference (LDK 2017), Galway, Ireland, volume 1899. Galway (Ireland): CEUR-WS, pp. 74–84. http://ceur-ws.org/Vol-1899/OntoLex_2017_paper_5.pdf/.
[13] Gracia, J., I. Kernerman, and J. Bosque-Gil. 2017. Toward linked data-native dictionaries. In I. Kosem et al. (eds.) Electronic lexicography in the 21st century. Proceedings of eLex 2017 conference, in Leiden, Netherlands. Lexical Computing CZ s.r.o., pp. 550–559. https://elex.link/elex2017/wp-content/uploads/2017/09/paper33.pdf/.
[14] LDL4HELTA – Linked Data Lexicography for High-End Language Technology Application. EUREKA Austria-Israel Bilateral R&D Programme No. 9898. https://www.eurekanetwork.org/project/id/9898/.
[15] ELEXIS – European Lexicographic Infrastructure. European Union’s Horizon 2020 Research and Innovation Programme No. 731015. https://elex.is/.
[16] Prêt-à-LLOD. European Union’s Horizon 2020 Research and Innovation Programme No. 825182. https://www.pret-a-llod.eu/.
[17] TIAD 2017 – Translation Inference across Dictionaries workshop and shared task. https://www.ldk2017.org/index-php/tiad-2017-shared-task-translation-inference-across-dictionaries/.
[18] TIAD 2019 – Translation Inference across Dictionaries workshop and shared task. http://2019.ldk-conf.org/tiad-2019/.
[19] Apertium RDF - http://linguistic.linkeddata.es/apertium/.

TRACK 1 – 1st “Monolingual Word Sense Alignment” Shared Task

Call for Participation

The ELEXIS project is organizing a shared task on monolingual word sense alignment across dictionaries as part of the GLOBALEX 2020 – Linked Lexicography workshop at the 12th Language Resources and Evaluation Conference (LREC 2020), taking place on Tuesday, May 12, 2020 in Marseille (France). Monolingual word sense alignment is the challenging task of finding matching senses between two dictionary entries, and it will play a crucial role in the development of new lexical resources. In addition, this task presents a challenging combination of NLP, semantic textual similarity and reasoning in order to find the best alignment across a group of senses.

Description of Task

The task of monolingual word sense alignment is framed as predicting the relationship between two senses, using one of five categories: “exact”, “broader”, “narrower”, “related” or “none”.
For each sense pair, the following information will be provided:
- The lemma shared between the two entries
- The part of speech of the entries*
- The sense text (including definition) of the sense of the first entry
- The sense text (including definition) of the sense of the second entry
- (Training Data) The label of the relation (“exact”, “broader”, “narrower”, “related” or “none”)

For each pair of entries, all mappings between senses will be provided; as such, we expect the best systems to consider the mapping of an entry as a block.

Training data will be available for monolingual dictionaries in the following languages:
- Danish
- Dutch
- English
- Estonian
- German
- Hungarian*
- Irish
- Italian
- Serbian
- Slovenian
- Russian

*For Hungarian, part-of-speech information is not provided.

Participants may participate in any or all of the above languages. The test data will consist of a group of entries with the label of the relation missing; participants should submit their results in the same format as the training data, that is, the test data with the predicted labels.

Publication of Results

Participants will submit a system paper that should include a description of the system, the way the data has been processed, the applied algorithms, the obtained results, as well as the conclusions and ideas for future improvements. The papers will be peer reviewed prior to publication to confirm that all aspects are well covered.

The workshop will also accept regular papers from authors who are not participating in the shared task but have worked on the topic of translation inference and want to publish novel results or ideas, possibly with datasets and an experimental basis different from those proposed in this shared task. Such papers will be peer reviewed on the basis of their scientific quality.

All the accepted papers will be published as part of the Globalex workshop proceedings and presented during the workshop.
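For readers new to the Track 1 task, the sense-pair data listed under Description of Task can be pictured roughly as follows. This is only a minimal sketch: the field names, the token-overlap baseline and the two thresholds are illustrative assumptions, not the organisers' data format or evaluation method, and the baseline never predicts “broader” or “narrower”.

```python
from dataclasses import dataclass
from typing import Optional

# The five relation labels named in the task description.
LABELS = ("exact", "broader", "narrower", "related", "none")


@dataclass
class SensePair:
    """One sense pair; field names are hypothetical, not the official format."""
    lemma: str
    pos: Optional[str]            # None where part of speech is not provided
    sense_1: str                  # sense text from the first entry
    sense_2: str                  # sense text from the second entry
    label: Optional[str] = None   # present in training data, missing in test data


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two sense texts."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def predict(pair: SensePair, exact_at: float = 0.8, related_at: float = 0.3) -> str:
    """Naive baseline with assumed thresholds; ignores 'broader'/'narrower'."""
    score = jaccard(pair.sense_1, pair.sense_2)
    if score >= exact_at:
        return "exact"
    if score >= related_at:
        return "related"
    return "none"


same = SensePair("bank", "noun", "a financial institution", "a financial institution")
close = SensePair("bank", "noun", "a financial institution that accepts deposits",
                  "a financial institution")
far = SensePair("bank", "noun", "a financial institution", "sloping land beside a river")
predictions = [predict(p) for p in (same, close, far)]
```

A competitive system would presumably replace the overlap score with a stronger semantic similarity model and, as the call suggests, score all sense mappings of an entry jointly rather than one pair at a time.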
Important Dates

17/12/2019 – Technical description of the evaluation process and data provided by organisers
01/02/2020 – Release of extended Training Data
13/03/2020 – Submission of results by participants / submission of regular papers
03/04/2020 – Evaluation results communicated by organisers / notification of regular papers
14/04/2020 – Submission of system description papers
12/05/2020 – Workshop day

Organizers

John P. McCrae – Data Science Institute, National University of Ireland Galway
Sina Ahmadi – Data Science Institute, National University of Ireland Galway

TRACK 2 – 3rd "Translation Inference Across Dictionaries" (TIAD 2020) shared task

CALL FOR PARTICIPATION

We are pleased to invite you to participate in the third shared task for Translation Inference Across Dictionaries (TIAD 2020), which will be held in conjunction with the GLOBALEX 2020 – Linked Lexicography workshop at the 12th Language Resources and Evaluation Conference (LREC 2020) on Tuesday, May 12, 2020 in Marseille (France).

This initiative is aimed at exploring the best methods and techniques for automatically generating new bilingual (and multilingual) dictionaries from existing ones, in the context of a coherent experiment that enables reliable validation of results and solid comparison of methods and techniques used for the automatic generation of translations across languages. This initiative also aims to stimulate and enhance further research on the topic.

TASK DEFINITION

The objective of the task is to explore and compare methods and techniques to infer indirect translations between language pairs, based on existing bilingual resources. Such techniques would help in auto-generating new bilingual and multilingual dictionaries based on existing ones.
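One simple way to picture indirect translation inference is pivoting through an intermediate language. The sketch below is an illustration under stated assumptions, not a TIAD baseline or the organisers' method: the function name, the plain-dictionary representation and the toy English–Spanish–French entries are all hypothetical and are not taken from the Apertium RDF graph.

```python
from collections import Counter
from typing import Dict, Set


def infer_translations(
    src_to_pivot: Dict[str, Set[str]],
    pivot_to_tgt: Dict[str, Set[str]],
) -> Dict[str, Counter]:
    """For each source word, count how many pivot words lead to each target word."""
    inferred: Dict[str, Counter] = {}
    for src, pivots in src_to_pivot.items():
        counts: Counter = Counter()
        for pivot in pivots:
            for tgt in pivot_to_tgt.get(pivot, ()):
                counts[tgt] += 1
        if counts:
            inferred[src] = counts
    return inferred


# Toy data (hypothetical): English -> Spanish and Spanish -> French.
en_es = {"bank": {"banco", "orilla"}, "house": {"casa"}}
es_fr = {"banco": {"banque", "banc"}, "orilla": {"rive"}, "casa": {"maison"}}

result = infer_translations(en_es, es_fr)
```

With this toy data, English "bank" reaches French "banque" and "rive" but also "banc" (bench), because the Spanish pivot "banco" is polysemous. Naive pivoting over-generates in this way, which is one reason the comparison of more careful methods is of interest.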
In particular, the participating systems will be asked to indirectly generate translations among three languages, namely Portuguese, French and English, based on already known translations contained in the Apertium RDF graph (http://linguistic.linkeddata.es/apertium/). The three chosen languages (EN, FR, PT) are not directly connected in the Apertium RDF graph (https://tinyurl.com/apertiumRDF-lang), therefore no direct translations can be obtained among them in Apertium RDF. Based on the available RDF data, the participants will have to apply their methods and techniques to discover indirect translations (mediated by any other language in the graph) between the pairs: (EN, FR), (FR, PT), and (PT, EN).

In addition, participants are welcome to make use of other freely available sources of background knowledge (e.g., lexical linked open data and parallel corpora) to improve performance, as long as direct translations among the considered language pairs from such extra resources are NOT used. The inclusion of other language pairs in the evaluation could be considered, in which case this will be announced to participants in due course.

Evaluation of the results will be carried out by the organisers against manually compiled pairs of K Dictionaries from the Global Series (https://lexicala.com/resources#dictionaries) and other resources.

PUBLICATION OF RESULTS

Participants will submit a system paper that should include a description of the system, the way the data have been processed, the applied algorithms, the obtained results, as well as the conclusions and ideas for future improvements. The papers will be peer reviewed prior to publication to confirm that all aspects are well covered.

The workshop will also accept regular papers from authors who are not participating in the shared task but have worked on the topic of translation inference and want to publish novel results or ideas, possibly with datasets and an experimental basis different from those proposed in this shared task. Such papers will be peer reviewed on the basis of their scientific quality.

All the accepted papers will be published as part of the Globalex workshop proceedings and presented during the workshop.

IMPORTANT DATES

17/12/2019 – Technical description of the evaluation process and data provided by organisers
14/02/2020 – Submission of regular papers (not participating systems)
13/03/2020 – Submission of results by participating systems / notification of regular papers
02/04/2020 – Evaluation results communicated by organisers / camera-ready of regular papers
14/04/2020 – Submission of system description papers
12/05/2020 – Workshop day

ORGANISERS

● Jorge Gracia, University of Zaragoza, Spain
● Besim Kabashi, Friedrich-Alexander-University of Erlangen-Nuremberg, and Ludwig-Maximilian-University of Munich, Germany

REVIEW COMMITTEE

To be announced

WEBSITE

A full description of TIAD-2020 will be available at https://tiad2020.unizar.es/

Identify, Describe and Share your LRs!

Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.
As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2016 endorses the need to uniquely Identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.
Received on Tuesday, 17 December 2019 18:15:58 UTC