CfP: 2016 Named Entity rEcognition and Linking (NEEL) Challenge (#Microposts2016) @ WWW2016

Named Entity rEcognition and Linking (NEEL) Challenge
at the 6th Making Sense of Microposts Workshop (#Microposts2016) @ WWW 
11/12 April 2016, Montréal, Canada


Microposts are a highly popular medium to share facts, opinions or 
emotions. They are an invaluable wealth of data, ready to be mined for 
training predictive models. Following the success of the previous three 
editions, we are pleased to announce the NEEL Challenge, which will be 
part of the #Microposts2016 Workshop at the World Wide Web 2016 conference.

The task of the challenge is to automatically recognise entities and 
their types from English microposts, and link them to the corresponding 
English DBpedia 2014 resources (if the resources exist) or NIL 
identifiers. Participants will have to automatically extract expressions 
that are formed by discrete (and typically short) sequences of words 
(e.g., Obama, London, Rakuten) and recognise their types (e.g., Person, 
Location, Organisation) from a collection of microposts. In the linking 
stage, the aim is to disambiguate the spotted entity to the 
corresponding DBpedia resource, or to a NIL reference if the spotted 
named entity does not match any resource in DBpedia.
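As an illustration only (not an official baseline), the two stages can be sketched with a hypothetical gazetteer: recognition spots typed mentions, and linking maps each mention to a DBpedia resource, falling back to a NIL identifier when no resource exists. All names and links below are for illustration.

```python
# Toy sketch of the recognition and linking stages. The gazetteer is
# hypothetical; a real system would use a trained recogniser and a DBpedia
# index. An entry whose link is None stands for an entity absent from DBpedia.
GAZETTEER = {
    "Obama": ("Person", "http://dbpedia.org/resource/Barack_Obama"),
    "London": ("Location", "http://dbpedia.org/resource/London"),
    "Jo Bloggs": ("Person", None),  # made-up person, not in DBpedia
}

def annotate(tweet):
    """Return (start, end, type, link) tuples for gazetteer mentions."""
    annotations, nil_count = [], 0
    for mention, (etype, link) in GAZETTEER.items():
        start = tweet.find(mention)
        if start == -1:
            continue
        if link is None:  # no DBpedia resource: assign a NIL identifier
            nil_count += 1
            link = "NIL%d" % nil_count
        annotations.append((start, start + len(mention), etype, link))
    return sorted(annotations)

print(annotate("Obama met Jo Bloggs in London"))
```

Note that, as in the challenge, the same NIL identifier would be reused for repeated mentions of the same unlinkable entity; the sketch above does not implement that coreference step.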

We invite participants from previous NEEL Challenges and from the TREC, 
TAC KBP and ERD shared tasks to take part in this year's challenge.

The dataset consists of tweets extracted from a collection of over 18 
million tweets. It includes event-annotated tweets provided by the 
Redites project, covering multiple noteworthy events from 2011 and 2013 
(including the death of Amy Winehouse, the London Riots, the Oslo 
bombing and the Westgate Shopping Mall shootout), as well as tweets 
extracted from the Twitter firehose in 2014 and 2015 via a selection of 
hashtags. Since the task of this challenge is to
automatically recognise and link entities, we have built our dataset 
considering both event and non-event tweets. While event tweets are 
likely to contain entities, non-event tweets enable us to evaluate the 
performance of the system in avoiding false positives in the entity 
extraction phase. The training set is built on top of the entire corpus 
of the NEEL 2014 and 2015 Challenges.

The training set will be released as a TSV file following the TAC KBP 
format, where each line contains the following columns:

1st: tweet identifier [alphanumeric]
2nd, 3rd: start/end offsets, expressed as the number of UTF-8 characters 
from 0 (the beginning of the tweet); spaces are counted too [integer]
4th: link to a DBpedia resource, or a NIL identifier (multiple NIL 
identifiers may occur in the corpus; the same NIL identifier is reused 
when several mentions in the text refer to the same entity) [alphanumeric]
5th: salience (confidence score); this field can be assigned randomly, 
since it *will not* be used to rank the submissions [double]
6th: type [alphanumeric]

Fields are separated by TABs. We will advertise the release of the data 
sets on the workshop mailing list. To be informed, please subscribe to 
the neelchallenge Google Group.
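Given that layout (tab-separated fields, character offsets counted from 0), one annotation line can be read with the standard csv module; the sample line and tweet text below are made up for illustration:

```python
import csv
import io

# A made-up annotation line in the six-column format described above.
sample = ("tweet42\t0\t5\thttp://dbpedia.org/resource/Barack_Obama"
          "\t1.0\tPerson\n")
tweets = {"tweet42": "Obama visited London"}  # tweet texts, keyed by id

reader = csv.reader(io.StringIO(sample), delimiter="\t")
for tweet_id, start, end, link, salience, etype in reader:
    start, end = int(start), int(end)
    # Offsets count characters from 0, so plain string slicing recovers
    # the mention's surface form from the tweet text.
    mention = tweets[tweet_id][start:end]
    print(tweet_id, mention, etype, link)
```

In practice one would iterate over the whole training file rather than a single in-memory line; the slicing step is the part worth checking, since off-by-one offset handling is a common source of scoring errors.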

Participants are allowed to submit up to 3 runs of their system as TSV 
files. An example of the submission format will be released with the 
development set. We encourage participants to make their systems 
available to the community to facilitate reuse, and we will acknowledge 
the systems whose source code is shared or that are otherwise made 
accessible for reuse.

We will use the TAC KBP scorer to evaluate the results, focusing in 
particular on:

[tagging]     strong_typed_mention_match (checks entity name boundaries 
and types)
[linking]     strong_link_match
[clustering]  mention_ceaf (NIL detection)
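For development purposes, a simplified reading of the first metric can be mocked up as below: exact-match micro precision/recall where a prediction counts only if its tweet identifier, boundaries and type all match a gold annotation. This is a sketch, not the official scorer, which should be used for any real evaluation.

```python
def strong_typed_scores(gold, pred):
    """Micro precision/recall/F1 under exact (tweet_id, start, end, type)
    matching -- a simplified stand-in for strong_typed_mention_match."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                      # exact matches
    p = tp / len(pred) if pred else 0.0        # precision
    r = tp / len(gold) if gold else 0.0        # recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# One correct mention, one with the wrong type: P = R = F1 = 0.5.
gold = {("t1", 0, 5, "Person"), ("t1", 23, 29, "Location")}
pred = {("t1", 0, 5, "Person"), ("t1", 23, 29, "Organisation")}
print(strong_typed_scores(gold, pred))  # (0.5, 0.5, 0.5)
```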

Participants are asked to submit a paper of 3 pages describing their 
approach, how it was tuned/tested using the training data, and their 
results on the dev set. All submissions must be in English. Submissions 
should be prepared according to the ACM SIG Proceedings Template, and 
should include author names and affiliations, and 3-5 author-selected 
keywords. Along with the paper, authors will submit up to 3 runs of 
their systems computed over the test set. The submission should be made 
as a single, unencrypted zip file that includes a plain text file 
listing its contents. Submission is via EasyChair. Each submission will 
receive at least 2 peer reviews.
We aim to publish the #Microposts2016 proceedings via CEUR as a single 
volume containing all three tracks.

1- register your team
2- download the agreement, sign it, and send the pdf to the challenge 
organisers
3- download the challenge guidelines. Shortly after, you will receive 
instructions on how to obtain the dataset
4- check out the challenge timeline and follow up

*Release of training*: from 7 December 2015
*Release of dev set*: 30 December 2015
*Release of test set*: 31 January 2016
*Submission of results*: 7 February 2016
*Submission of reports*: 7 February 2016
*Challenge Notification*: 18 February 2016

*Challenge camera-ready deadline*: 28 February 2016
*Workshop*: 11/12 April 2016 (Registration open to all)
(All deadlines 23:59 Hawaii Time)

Mailing list:
Twitter hashtags: #neel #microposts2016
Twitter account: @Microposts2016
W3C Microposts Community Group:

*Challenge chairs*
Giuseppe Rizzo, Istituto Superiore Mario Boella, Italy
Marieke van Erp, Vrije Universiteit Amsterdam, Netherlands

*Programme committee*
Ebrahim Bagheri, Ryerson University, Canada
Pierpaolo Basile, University of Bari, Italy
David Corney, Signal Media, UK
Grégoire Burel, KMi, Open University, UK
Milan Dojchinovski, Leipzig University, Germany/Czech Technical 
University, Czech Republic
Guillaume Erétéo, Vigiglobe, France
Anna Lisa Gentile, The University of Sheffield, UK
José M. Morales del Castillo, El Colegio de México, Mexico
Bernardo Pereira Nunes, PUC-Rio, Brazil
Giles Reger, The University of Manchester, UK
Irina Temnikova, Qatar Computing Research Institute, Qatar
Victoria Uren, Aston University, UK

Received on Sunday, 3 January 2016 14:57:49 UTC