First Call for Participation: SemEval - Task 17: Taxonomy Extraction Evaluation (TExEval)

Website: http://alt.qcri.org/semeval2015/task17/
Google Group: https://groups.google.com/d/forum/semeval-task17
Evaluation period: November 15 - 30, 2014
Paper submission: January 30, 2015

*Introduction*

Taxonomies are useful tools for content organisation, navigation, and retrieval, and they provide valuable input for semantically intensive tasks such as question answering and textual entailment. This task is concerned with automatically extracting hierarchical relations from text and then constructing a taxonomy from them. A hierarchical relation is any asymmetrical relation that indicates subordination between two terms. In this task, however, the focus is on hyponym-hypernym relations.

Taxonomy learning from text is a challenging task that can be divided into several subtasks, including term extraction, relation discovery, taxonomy construction and taxonomy cleaning. Although term extraction is an important step when constructing a domain taxonomy, this shared task assumes that a list of terms is readily available. Terms will be extracted from a domain-specific corpus using an existing term extraction tool and then manually filtered, and the resulting list will be provided to the participants. Participants are nevertheless allowed to add further nodes, i.e. terms, to the hierarchy as they consider appropriate. In this way, taxonomy learning is limited to finding relations between pairs of terms and organising them in a hierarchical structure. This simplifies the evaluation by providing common ground for all the systems. Participants are encouraged to consider polyhierarchies when organising terms, as multiple perspectives can be equally valid when organising concepts. Because nodes can have more than one parent, the final structure of the taxonomy is not necessarily a tree.
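
As an informal illustration of what a polyhierarchy means in practice, the sketch below stores a taxonomy as a list of (term, hypernym) pairs and checks whether any term has more than one parent. The function names and the example pairs are hypothetical and are not part of the task data or tools.

```python
from collections import defaultdict


def parents_of(pairs):
    """Map each term to the set of its direct hypernyms."""
    parents = defaultdict(set)
    for term, hypernym in pairs:
        parents[term].add(hypernym)
    return dict(parents)


def has_single_parents(pairs):
    """Return True if no term has more than one direct hypernym."""
    return all(len(hypernyms) <= 1 for hypernyms in parents_of(pairs).values())


# Hypothetical example pairs, invented for illustration only.
example_pairs = [
    ("tomato", "fruit"),
    ("tomato", "vegetable"),
    ("apple", "fruit"),
]

print(parents_of(example_pairs)["tomato"])
print(has_single_parents(example_pairs))
```

Here "tomato" has two parents, so the structure is a polyhierarchy and not a tree.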

*Task Description*

In this shared task, taxonomies are evaluated through comparison with gold standard relations collected from BabelNet, a multilingual semantic network built by merging WordNet with Wikipedia. Additionally, gold standard relations will be gathered from manually constructed taxonomies, classification schemes and/or ontologies, where these are available for the domain. Expert evaluation will be performed by pooling a subset of the relations submitted by the participants. Recall will be estimated based on the combined set of relations identified by all the systems. We will evaluate the performance of systems across three domains that were not previously considered for this task, including commonsense knowledge as well as technical domains.
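
One possible reading of pooled recall estimation is sketched below. It assumes that experts judge the pooled relations and that a system's estimated recall is the share of all expert-approved pooled relations that the system itself returned. This is an assumption made for illustration; the function name, inputs and example data are hypothetical, and the organisers' actual procedure may differ.

```python
def pooled_recall(system_pairs, judged_correct_pool):
    """Estimate recall against the pool of relations judged correct.

    system_pairs: (term, hypernym) pairs returned by one system.
    judged_correct_pool: pairs from all systems that experts accepted
    (hypothetical input, assumed for this sketch).
    """
    pool = set(judged_correct_pool)
    if not pool:
        return 0.0
    return len(set(system_pairs) & pool) / len(pool)


# Hypothetical example data, invented for illustration only.
pool = {("oak", "tree"), ("pine", "tree"), ("tree", "plant"), ("rose", "flower")}
system_a = [("oak", "tree"), ("tree", "plant"), ("oak", "flower")]

print(pooled_recall(system_a, pool))
```

Because the pool contains only relations that some system found, an estimate of this kind is an upper bound on true recall.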

Depending on the selected approach, a system may or may not require large amounts of text to extract relations between terms; the organisers therefore do not provide a corpus. Trial/training data will consist of terms and hierarchical relations selected from one of the WordNet domains previously considered for this task, such as plants or vehicles, as well as from a technical domain, such as AI. The test domains will be revealed only when the test data is available, so that systems are not overfitted to a domain. Possible domains include politics, sociology, rock music, etc. For each domain, the test data will consist of a list of domain terms that the systems will have to structure into a taxonomy, with the possibility of adding further intermediate terms. Each system will return a list of pairs (term, hypernym).

*Evaluation Methodology*

Evaluating taxonomies is as difficult as constructing them, so we consider two different evaluation methodologies. First, we will evaluate the relations between terms using standard precision, recall and F1 measures. Second, we will use the evaluation scheme presented in [4], an approach used for comparing hierarchical clusters, to compare the overall structure of the taxonomy against a gold standard. As a baseline, we will use hypernym relations from WordNet. We expect this baseline to have low recall, as WordNet only partially covers most technical domains.
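
For readers unfamiliar with the relation-level measures, the following minimal sketch computes precision, recall and F1 over sets of (term, hypernym) pairs. It assumes exact matching of pairs, which may differ from the official scorer; the function name and the example relations are hypothetical.

```python
def edge_prf(system_pairs, gold_pairs):
    """Return (precision, recall, F1) for exact-match relation pairs."""
    system, gold = set(system_pairs), set(gold_pairs)
    correct = len(system & gold)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example relations, invented for illustration only.
gold = [("car", "vehicle"), ("bus", "vehicle"), ("oak", "tree"), ("pine", "tree")]
system = [("car", "vehicle"), ("bus", "vehicle"), ("oak", "vehicle")]

precision, recall, f1 = edge_prf(system, gold)
print(precision, recall, f1)
```

In this example two of the three returned relations are correct and two of the four gold relations are found, so precision is 2/3, recall is 1/2 and F1 is 4/7.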

*More information*

See the following site for further information about the task, data formats, examples, data downloads, tools and registration information:

http://alt.qcri.org/semeval2015/task17/

*Important dates*
 • Evaluation period starts: November 15, 2014
 • Evaluation period ends: November 30, 2014
 • Paper submission due: January 30, 2015
 • Paper notification: February 28, 2015
 • Camera-ready due: March 30, 2015
 • SemEval workshop: Summer 2015

*Organisers*
 • Georgeta Bordea & Paul Buitelaar, National University of Ireland, Galway
 • Stefano Faralli & Roberto Navigli, Sapienza University of Rome, Italy

Received on Monday, 11 August 2014 15:36:57 UTC