Call for Participation: 5th Summer Datathon on Linguistic Linked Open Data (SD-LLOD’23), June 11-16, Croatia

[apologies for cross-posting]

5th Summer Datathon on Linguistic Linked Open Data (SD-LLOD’23)


The 5th Summer Datathon on Linguistic Linked Open Data (SD-LLOD’23) will be
held in person from June 11th to June 16th 2023 at Castle Luznica,
Zaprešić, Croatia. See

The SD-LLOD datathon has the main goal of providing practical knowledge to
people from industry and academia in the application of Linked Open Data
technology to Linguistics and Language Technology. The ultimate goal is to
enable participants to migrate their own (or others’) linguistic data and
publish them as Linked Data on the Web and/or develop applications on top
of Linguistic Linked Data (LLD). One of the main focus points this year
will be the use of deep learning and neural approaches to/from LLD.

This datathon series is unique in its topic worldwide and builds on
the success of the previous editions in 2015 and 2017 in Cercedilla
(Spain), 2019 in Dagstuhl (Germany), and 2022 in Cercedilla again. This
edition is supported by COST (European Cooperation in Science and
Technology) through NexusLinguarum, the “European network
for Web-centred linguistic data science” COST Action (CA18209,

During the datathon, participants will be able to:

* Generate their own Linguistic Linked Data from existing data sources,
using visual tools like VocBench and community standards like OntoLex-Lemon.

* Apply semantic technologies (linked data, knowledge graphs, RDF, SPARQL)
to the field of language resources and learn about their benefits and
applications for specific use cases, particularly those involving
multilingual and/or multimodal aspects.

* Explore the potential use of embeddings, machine learning, and deep
learning techniques in combination with Linguistic Linked Data. For example:

* Neural machine translation from Natural Language to SPARQL

* Generating natural language from knowledge graphs

* Acquiring relations with neural language models

The program of the summer datathon will contain three types of sessions:

1. Seminars to explain theoretical aspects and discuss selected topics.

2. Hands-on sessions to introduce the basic foundations of each topic,
method, and technique, which participants will apply directly through
different practical assignments.

3. Datathon sessions, where participants will work in groups of 3-5 on
miniprojects, applying what they have learned to the generation and/or use
of Linguistic Linked Data.

Participants are invited to propose a “miniproject” related to the topics
of the datathon, which might include some datasets for their conversion
into linked data. In this edition, we particularly encourage miniprojects
that involve interaction with machine learning, deep learning, or
embeddings techniques. A selection of proposals will form the basis for the
miniprojects which the participants will work on during the datathon
sessions. Participants who do not propose a miniproject, or whose
miniproject is not selected, will be able to join another miniproject.
There will be an award for the best miniproject.



The datathon is a sponsored event, and it has no registration fee, but
participants are expected to cover the cost of their meals and
accommodation at the castle residence. Details about the registration can
be found at the datathon website:

Registration will close on 25/04/2023. At least ten travel grants for
students will be provided by NexusLinguarum (covering accommodation, meals
and travel expenses). More details will appear on the datathon website.

COVID statement


The datathon is planned as a physical event. The local organisation is
committed to guaranteeing a safe event. Note that there might be some COVID
rules to comply with at the time of the event. These will be
announced in due course.

Important dates (tentative)


Registration opens: 13/02/2023

Registration closes: 25/04/2023

Notification: 04/05/2023

Datathon: 11/06/2023 to 16/06/2023

Organisers



Jorge Gracia (University of Zaragoza, Spain)

Christian Chiarcos (University of Augsburg, Germany)

Dagmar Gromann (University of Vienna, Austria)

Thierry Declerck (DFKI, Germany)

Milan Dojchinovski (CTU in Prague, Czech Republic / DBpedia Association,

Local organisers


Ana Ostroški (Institute of Croatian Language and Linguistics, Croatia)

Kristina Despot (Institute of Croatian Language and Linguistics, Croatia)

Confirmed tutors and lecturers [to be completed]


Mehwish Alam (Institut Polytechnique de Paris, France)

Christian Chiarcos (University of Augsburg, Germany)

Michael Cochez (Vrije Universiteit Amsterdam, The Netherlands)

Dagmar Gromann (University of Vienna, Austria)

Thierry Declerck (DFKI, Germany)

Milan Dojchinovski (CTU in Prague, Czech Republic / DBpedia Association,

Jorge Gracia (University of Zaragoza, Spain)

Max Ionov (University of Cologne, Germany)

Armando Stellato (University of Rome Tor Vergata, Italy)

Andon Tchechmedjiev (IMT École des Mines d’Alès, France)

Received on Tuesday, 7 February 2023 14:14:16 UTC