Re: DBpedia-based entity recognition service / tool?

From: Tim Finin <finin@cs.umbc.edu>
Date: Tue, 02 Feb 2010 15:50:51 -0500
Message-ID: <4B68902B.8000306@cs.umbc.edu>
To: public-lod@w3.org

On 2/2/10 7:26 AM, Matthias Samwald wrote:
> I would be glad to hear your advice on how to best accomplish a simple
> task: extracting DBpedia entities (identified with DBpedia URIs) from a
> string of text. With good accuracy and recall, possibly with some
> options to constraint the recognized entities to some subset of DBpedia,
> ...

This is closely related to the task of the Knowledge Base Population
(KBP) track [1] that was run as part of the NIST 2009 Text Analysis
Conference [2].  The KBP track required systems to do two tasks:
entity linking and slot filling.

For entity linking, participants had to take an entity
mention (e.g., "CDC") and a document in which it appeared, and
decide which of the ~800K entities in a reference KB derived
from Wikipedia it referred to, or return NIL if it was thought to
refer to none of them.
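
To make the input and output of the linking task concrete, here is a
minimal sketch in Python.  It is not how any KBP system worked: the toy
KB, its ids (E1, E2), the word-overlap scoring and the function name
link_entity are all assumptions made up for illustration.

```python
import re

# Hypothetical toy KB: the ids, aliases and description words are
# invented for this sketch and are not taken from the KBP reference KB.
TOY_KB = {
    "E1": {"name": "Centers for Disease Control and Prevention",
           "aliases": {"cdc"},
           "text": "united states federal agency public health disease atlanta"},
    "E2": {"name": "Control Data Corporation",
           "aliases": {"cdc"},
           "text": "supercomputer mainframe computer company minneapolis"},
}


def tokens(text):
    """Lower-case a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def link_entity(mention, document, kb=TOY_KB, min_overlap=1):
    """Return the id of the best-matching KB entry, or "NIL".

    Candidates are entries whose name or alias equals the mention.
    Each candidate is scored by how many of its description words
    also occur in the document (an assumed, very crude heuristic).
    """
    key = mention.lower()
    candidates = [eid for eid, entry in kb.items()
                  if key == entry["name"].lower() or key in entry["aliases"]]
    doc_words = tokens(document)
    best_id, best_score = "NIL", 0
    for eid in candidates:
        score = len(doc_words & tokens(kb[eid]["text"]))
        if score >= min_overlap and score > best_score:
            best_id, best_score = eid, score
    return best_id
```

A call such as link_entity("CDC", some_document_text) returns one of the
toy ids when the document shares words with a candidate's description,
and "NIL" when no candidate matches or none has any supporting context.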

For the slot filling part, participants started with a KB entity
and had to fill in as many of its unknown slots (i.e.,
properties) as possible by finding answers in a collection of 1.3
million English newswire articles.  Each value had to be linked to
evidence -- a pointer to the part of a document from which it was
derived.  If a slot value was itself an entity in the KB, then a
link to that entity was supposed to be found as well.
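
The shape of a slot-filling answer (value, evidence pointer, optional
KB link) can be sketched as below.  Again this is only an illustration
under stated assumptions: the SlotFill record, the fill_slot function,
the single regular-expression pattern and the name-to-id lookup table
are hypothetical and do not reflect the track's actual formats.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlotFill:
    """One filled slot plus its evidence (hypothetical record layout)."""
    slot: str                    # name of the property being filled
    value: str                   # the extracted value
    doc_id: str                  # document the value came from
    start: int                   # character offset where the value starts
    end: int                     # character offset just past the value
    kb_id: Optional[str] = None  # KB entry for the value, if there is one


def fill_slot(entity_name, slot, pattern, docs, name_to_kb_id):
    """Search docs for a pattern and return the first SlotFill found.

    pattern is a regular expression containing the placeholder {entity}
    and a named group "value".  Returns None if nothing matches.
    """
    regex = re.compile(pattern.format(entity=re.escape(entity_name)))
    for doc_id, text in docs.items():
        match = regex.search(text)
        if match:
            value = match.group("value")
            return SlotFill(slot, value, doc_id,
                            match.start("value"), match.end("value"),
                            name_to_kb_id.get(value))
    return None


# Invented example data, for illustration only.
docs = {
    "d1": "Shares rose on Monday.",
    "d2": "Acme Corp is headquartered in Springfield, officials said.",
}
fill = fill_slot("Acme Corp", "headquarters",
                 r"{entity} is headquartered in (?P<value>[A-Z][a-z]+)",
                 docs, {"Springfield": "E7"})
```

The start and end offsets play the role of the evidence pointer, and
kb_id stays None when the value is not itself an entity in the KB.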

The competition and workshop had 13 groups participating.  Papers
describing the systems and the results should be available online
later this month.  Several of them did very well on the entity linking
task even when a large proportion of the queries were not in the KB
and should resolve to NIL.  The systems generally did much worse
on the more difficult slot filling task.

There will be another, revised version of the KBP track [3] held
in 2010.

[1] http://apl.jhu.edu/~paulmac/kbp.html
[2] http://www.nist.gov/tac/
[3] http://nlp.cs.qc.cuny.edu/kbp/2010/

Tim Finin, Computer Science and Electrical Engineering,  University of Maryland,
Baltimore County, 1000 Hilltop Circle, Baltimore MD 21250 http://umbc.edu/~finin
finin@umbc.edu 410-455-3522 fax:-3969 http://ebiquity.umbc.edu/ tfinin@gmail.com

Received on Tuesday, 2 February 2010 20:51:23 UTC
