
First call for participation: Named Entity rEcognition and Linking (NEEL) Challenge @ The 6th Making Sense of Microposts Workshop (#Microposts2016)

From: Erp, M.G.J. van <marieke.van.erp@vu.nl>
Date: Fri, 11 Dec 2015 19:52:50 +0000
To: "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <3D3825E9-8022-447F-840C-398D3B72DC57@vu.nl>

*apologies for cross-posting*
Named Entity rEcognition and Linking (NEEL) Challenge
at the 6th Making Sense of Microposts Workshop (#Microposts2016) @ WWW 2016

11/12 April 2016, Montréal, Canada


Microposts are a highly popular medium to share facts, opinions or emotions. They are an invaluable wealth of data, ready to be mined for training predictive models. Following the success of the previous three years, we are pleased to announce the NEEL challenge which will be part of the #Microposts2016 Workshop at the World Wide Web 2016 conference.

The task of the challenge is to automatically recognise entities and their types from English microposts, and link them to the corresponding English DBpedia 2014 resources (if the resources exist) or NIL identifiers. Participants will have to automatically extract expressions that are formed by discrete (and typically short) sequences of words (e.g., Obama, London, Rakuten) and recognise their types (e.g., Person, Location, Organisation) from a collection of microposts. In the linking stage, the aim is to disambiguate the spotted entity to the corresponding DBpedia resource, or to a NIL reference if the spotted named entity does not match any resource in DBpedia.

We welcome participants from previous NEEL Challenges and from the TREC, TAC KBP and ERD shared tasks to take part in this year’s challenge.

The dataset consists of tweets extracted from a collection of over 18 million tweets. It includes event-annotated tweets provided by the Redites project (http://demeter.inf.ed.ac.uk/redites/) covering multiple noteworthy events from 2011 and 2013 (including the death of Amy Winehouse, the London Riots, the Oslo bombing and the Westgate Shopping Mall shootout), and tweets extracted from the Twitter firehose from 2014 and 2015 via a selection of hashtags. Since the task of this challenge is to automatically recognise and link entities, we have built our dataset considering both event and non-event tweets. While event tweets are likely to contain entities, non-event tweets enable us to evaluate the performance of the system in avoiding false positives in the entity extraction phase. The training set is built on top of the entire corpus of the NEEL 2014 and 2015 Challenges.

The training set will be released as tsv following the TAC KBP format, where each line contains the following columns:

1st: tweet identifier [alphanumeric]
2nd,3rd: start/end offsets expressed as the number of UTF-8 characters starting from 0 (the beginning of the tweet), spaces are counted too [integer]
4th: link to DBpedia resource or NIL (different NIL identifiers may exist in the corpus. Each NIL may be reused if there are multiple mentions in the text which represent the same entity) [alphanumeric]
5th: salience (confidence score). This field can be assigned randomly, since it *will not* be used to rank the submissions [double]
6th: type [alphanumeric]

Columns are separated by TABs. We will advertise the release of the data sets on the workshop mailing list. To be informed, please subscribe to https://groups.google.com/forum/neelchallenge.
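As an illustration of the six-column layout described above, here is a minimal Python sketch of a line parser. It is not an official tool of the challenge. The names Annotation, parse_annotation, is_nil and mention_text are hypothetical, the tweet identifier in the usage note is made up, and three points are assumptions not stated in this call: that NIL identifiers start with the string "NIL", that the end offset is exclusive, and that "characters" means Unicode code points (which is how Python indexes strings).

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    tweet_id: str      # 1st column
    start: int         # 2nd column, 0-based offset into the tweet text
    end: int           # 3rd column, assumed exclusive (not stated in the call)
    link: str          # 4th column, DBpedia resource or NIL identifier
    salience: float    # 5th column, not used for ranking
    entity_type: str   # 6th column


def parse_annotation(line: str) -> Annotation:
    """Parse one TAB-separated annotation line into an Annotation."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-separated columns, got {len(fields)}")
    tweet_id, start, end, link, salience, entity_type = fields
    return Annotation(tweet_id, int(start), int(end), link,
                      float(salience), entity_type)


def is_nil(annotation: Annotation) -> bool:
    """Assumption: NIL identifiers start with 'NIL' (e.g. NIL1, NIL2)."""
    return annotation.link.startswith("NIL")


def mention_text(tweet_text: str, annotation: Annotation) -> str:
    """Slice the mention out of the tweet, counting spaces as characters."""
    return tweet_text[annotation.start:annotation.end]
```

For example, parsing the made-up line "123456<TAB>0<TAB>5<TAB>http://dbpedia.org/resource/Barack_Obama<TAB>1.0<TAB>Person" and slicing the text "Obama visits London" with it yields the mention "Obama". Please check the offset convention against the released data before relying on it.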

Participants are allowed to submit up to 3 runs of their system as TSV files. An example of the submission format will be released with the development set. We encourage participants to make their systems available to the community to facilitate reuse, and we will acknowledge the systems whose source code was shared or that were otherwise made accessible for reuse.

We will use the TAC KBP scorer (https://github.com/wikilinks/neleval/wiki/Evaluation) to evaluate the results and in particular we will focus on:

[tagging]     strong_typed_mention_match (check entity name boundary and type)
[linking]     strong_link_match
[clustering]  mention_ceaf (NIL detection)
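To give an intuition for the tagging measure listed above, the following Python sketch computes precision, recall and F1 over exact matches of (tweet identifier, start, end, type). It is a simplified illustration only and is not the TAC KBP scorer: official results come from the neleval tool, whose aggregation details may differ. The function names typed_mention_keys and precision_recall_f1 are hypothetical.

```python
from typing import Iterable, Set, Tuple

# One annotation row: (tweet_id, start, end, link, salience, type)
Row = Tuple[str, int, int, str, float, str]
Key = Tuple[str, int, int, str]


def typed_mention_keys(rows: Iterable[Row]) -> Set[Key]:
    """Keep only what a strict typed mention match compares:
    tweet id, exact boundaries and entity type."""
    return {(tweet_id, start, end, etype)
            for tweet_id, start, end, _link, _salience, etype in rows}


def precision_recall_f1(gold: Set[Key], system: Set[Key]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over exact key matches."""
    true_positives = len(gold & system)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In this sketch a mention with correct boundaries but the wrong type counts as both a false positive and a false negative, which is why boundary and type errors are penalised equally.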

Participants must submit a paper of 3 pages describing their approach, how they tuned/tested it using the training data, and their results on the dev set. All submissions must be in English. Submissions should be prepared according to the ACM SIG Proceedings Template (see http://www.acm.org/sigs/publications/proceedings-templates), and should include author names and affiliations, and 3-5 author-selected keywords. Along with the paper, authors will submit up to 3 runs of their systems computed over the test set. The submission should be made as a single, unencrypted zip file that includes a plain text file listing its contents. Submission is via EasyChair, at: https://easychair.org/conferences/?conf=microposts2016. Each submission will receive at least 2 peer reviews.
We aim to publish the #Microposts2016 proceedings via CEUR as a single volume containing all three tracks.
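The submission archive described above (a single, unencrypted zip file that includes a plain text file listing its contents) could be assembled along the following lines. This is only a sketch: the function name build_submission_zip, the manifest name README.txt and the file names in the usage note are assumptions, since the call does not prescribe any of them.

```python
import io
import zipfile
from typing import Dict


def build_submission_zip(files: Dict[str, bytes],
                         manifest_name: str = "README.txt") -> bytes:
    """Return the bytes of an unencrypted zip holding `files` plus a
    plain text manifest that lists every file name, one per line."""
    names = sorted(files)
    manifest = "\n".join(names) + "\n"
    buffer = io.BytesIO()
    # zipfile never encrypts archives it writes, so the result is unencrypted.
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(manifest_name, manifest)
        for name in names:
            archive.writestr(name, files[name])
    return buffer.getvalue()
```

A call such as build_submission_zip({"paper.pdf": pdf_bytes, "run1.tsv": run_bytes}) returns bytes that can be written to disk and uploaded; the hypothetical names paper.pdf and run1.tsv stand in for the paper and one of the up to 3 runs.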

1- register your team at http://goo.gl/forms/2R7zagtUJZ and subscribe to https://goo.gl/vsyq0O
2- download the agreement https://goo.gl/idFdyP, sign it, and send the pdf to giuseppe.rizzo@ismb.it and marieke.van.erp@vu.nl
3- download the challenge guidelines https://goo.gl/XGmpuY
4- shortly after, you will receive the instructions on how to obtain the database
5- check out the challenge timeline and follow up

*Release of training*: from 7 December 2015
*Release of dev set*: 30 December 2015
*Release of test set*: 31 January 2016
*Submission of results*: 7 February 2016
*Submission of reports*: 7 February 2016
*Challenge Notification*: 18 February 2016

*Challenge camera-ready deadline*: 28 February 2016
*Workshop*: 11/12 April 2016 (Registration open to all)
(All deadlines 23:59 Hawaii Time)

Mailing list : https://groups.google.com/forum/neelchallenge

Twitter hashtags: #neel #microposts2016
Twitter account: @Microposts2016
W3C Microposts Community Group: http://www.w3.org/community/microposts

Giuseppe Rizzo, Istituto Superiore Mario Boella, Italy
Marieke van Erp, Vrije Universiteit Amsterdam, Netherlands

Ebrahim Bagheri, Ryerson University, Canada
Pierpaolo Basile, University of Bari, Italy
David Corney, Signal Media, UK
Grégoire Burel, KMi, Open University, UK
Milan Dojchinovski, Leipzig University, Germany/Czech Technical University, Czech Republic
Guillaume Erétéo, Vigiglobe, France
Anna Lisa Gentile, The University of Sheffield, UK
José M. Morales del Castillo, El Colegio de México, Mexico
Bernardo Pereira Nunes, PUC-Rio, Brazil
Giles Reger, The University of Manchester, UK
Irina Temnikova, Qatar Computing Research Institute, Qatar
Victoria Uren, Aston University, UK

Computational Lexicology & Terminology Lab (CLTL)
The Network Institute, Vrije Universiteit Amsterdam

De Boelelaan 1105
1081 HV  Amsterdam, The Netherlands


Received on Friday, 11 December 2015 19:53:25 UTC
