- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 19 Feb 2014 09:01:25 +0100
- To: "public-i18n-its-ig@w3.org" <public-i18n-its-ig@w3.org>, www-international <www-international@w3.org>
- Message-ID: <530464D5.2080009@w3.org>
Apologies for cross-posting.
- Felix
The Linked Data for Language Technology community
<http://www.w3.org/community/ld4lt/> is organising a *roadmapping workshop
<https://www.w3.org/community/ld4lt/wiki/LD4LT_Group_Kick-Off_and_Roadmap_Meeting>* on
21st March in Athens, to build a better understanding of the potential
synergies and co-evolution paths for /language technologies/, such as
machine translation, information extraction and sentiment analysis, and
/linked data/. Language technologies are key to extracting information
from unstructured content in different languages to form linked data,
while linked data can aid the discovery and sharing of the language
resources that underpin language technologies.
*Who should attend?* Any organisation interested in automated extraction
of data from unstructured digital content, especially content in more
than one language and including multimedia as well as textual content.
Organisations engaged in the market for language technologies applied
beyond English-language content and data. All these can benefit from
more open access to linked language resources.
*How can you participate?* You can register for the event here
<https://www.w3.org/2002/09/wbs/1/lider-rmws-20140321/>. If you wish to
present a position statement you can indicate this on your registration
form. The event will then proceed in a structured open format to
identify and capture from participants their use case priorities and
the interoperability, best-practice and technology gaps they face. An online
survey <https://www.w3.org/2002/09/wbs/1/ld4lt-wbs1/> is currently open
for gathering industry views on use case prioritisation. You can also
contribute directly by joining the Linked Data for Language Technology
community <http://www.w3.org/community/ld4lt/> at the W3C.
*Programme and Topics:* The workshop will open with keynotes from Hans
Uszkoreit, Scientific Director at DFKI; Nicoletta Calzolari, Director
of Research at CNR; Phil Archer, who is leading the W3C Data Activity; and
Asun Gomez-Perez of UPM, who is leading the LIDER coordination action on
linguistic linked data. This will be followed by short briefings from
four existing international communities working in this area, by
position statements from companies about existing use cases and by an
open workshop session to establish use case priorities.
The language resource community has already made a concerted attempt to
catalogue different data sets through the META-SHARE initiative
<http://www.meta-share.eu/>. It has tackled the need for common
meta-data for linguistic corpora of various types and has paid
particular attention to encoding the different usage rights that exist
across governmental, academic and commercial data sources. This
initiative is therefore well primed to exploit linked data technologies
being standardised by the W3C Data Activity
<http://www.w3.org/2013/data/> to further open the cataloguing and
discovery of language resources.
This is particularly timely as the European Commission has launched its
new H2020 funding programme, with strong support available for
innovation and research in the open data and language technology space
<http://ec.europa.eu/digital-agenda/en/news/information-and-networking-days-h2020-work-programme-2014-2015-connecting-europe-facility>.
In April 2014 it will also launch its Connecting Europe Facility
programme, with €1 billion for funding new pan-European digital services,
including open data exchange and automated translation services. In
both these initiatives, strong, open solutions for the interoperability
of language resources as open web data will be key.
The workshop will take a use-case-driven approach to key questions around
the synergies possible between the W3C's open web data standards and
existing approaches to sharing language resources and applying them for
training language technologies:
* How can language resource sharing infrastructure, such as
  META-SHARE, migrate to a linked data approach so as to benefit from
  more robust, decentralised and scalable publication and search features?
* How well can existing linked data vocabularies such as Creative
  Commons Rights Expression Language
  <http://creativecommons.org/ns> and Linked Data Rights
  <http://oeg-dev.dia.fi.upm.es/licensius/static/ldr/> support the
  usage rights models established for language resources?
* How far can language resource meta-data be supported by the Data
  Catalogue Vocabulary
  <http://www.w3.org/TR/2014/REC-vocab-dcat-20140116/> or the Vocabulary
  of Interlinked Datasets <http://vocab.deri.ie/void>?
* How can emerging onto-lexical resources such as BabelNet
  <http://babelnet.org/> be usefully interlinked with individual terms
  in existing language resources?
* How can the process of locating and managing language resources to
  train language technologies be eased and optimised by vocabularies
  such as the Provenance Ontology <http://www.w3.org/TR/prov-o/> or
  the Provenance and Plans Ontology
  <http://vocab.linkeddata.es/p-plan/> for repeatable data workflows?
However, these are just a sample of the many issues and viewpoints that
will have a bearing on the future of Linked Data for Language
Technologies, and we hope you will be able to join us in Athens to share
yours.
Received on Wednesday, 19 February 2014 08:01:56 UTC