
KBGen Challenge: Last Call for Pre-Registration

From: Claire Gardent <claire.gardent@loria.fr>
Date: 09 Mar 2013 12:18:15 +0100
Message-Id: <f2c7c1$5tpmv@mail2-relais-roc.national.inria.fr>
To: semantic-web@w3.org
LAST CALL FOR PARTICIPATION
The KBGen Challenge: Generating from Knowledge Bases
http://www.kbgen.org/

Organised by:
Eva Banik, Computational Linguistics Ltd, UK
Claire Gardent, CNRS/LORIA, Nancy, France
Eric Kow, Computational Linguistics Ltd, UK
Nikhil Dinesh, SRI International, Menlo Park, California, USA

Endorsed by SIGGEN, the ACL Special Interest Group on Generation.

Important Dates
-----------------------

Please note: It is still possible to download the pre-release dataset and join the campaign. 

08 December 2012: Pre-Release of partial KBGen 2013 Task data
Around 15 March 2013: Release of full KBGen 2013 Task data
Early June 2013: Release of Test Data and Deadline for System Outputs
August 2013: Reporting and discussing results at ENLG

Call for Registration 
----------------------------------------------------------------------

We invite teams of researchers to register for the KBGen 2013 Task by filling in the registration form at http://www.kbgen.org/register/

Once registered, teams will be given access to sample data so that they can familiarise themselves with the input representation formats we have developed. The complete KBGen data will be distributed around March 15th, and the deadline for submitting system outputs on unseen test data will be in early June (exact date to be announced later). Results and participating systems will be presented at ENLG in August 2013.

Below we provide a brief overview of the KBGen Task. For more information please visit the other pages on the KBGen site (http://www.kbgen.org/).

KBGen Task
------------------

The task for participating teams is to develop systems that map the input representations provided by the KBGen organisers to sentences, and to submit system outputs for the inputs in the test data set.

Data
------

The KBGen Task data is derived from the AURA Knowledge Base, which was developed in the context of the HALO Project at SRI International. This knowledge base encodes knowledge contained in a college-level biology textbook. We have processed and adapted this data so that each input provided by the KBGen task can be verbalised in a single, possibly complex, sentence. To minimise the amount of engineering required to participate, we also make available a lexicon mapping the concepts and relations present in the KBGen data to words.

Evaluation
--------------

Submitted system outputs will be evaluated by a variety of automatic metrics and human-assessed quality criteria.

Input Representations
-------------------------------

The input representations are bundles of triples expressing relations between entities. For the development phase, the data set will consist of input and output pairs, where each input is associated with one or more manually produced sentences verbalising it.

Here is an example of the input-output pairs that we propose for the challenge:

"The rate of detoxification in the liver cell is directly proportional to the quantity of smooth endoplasmic reticulum in the liver cell."

(KBGEN-INPUT 
    :TRIPLES (
            (|Detoxification19144| |base| |Liver-Cell19145|)
            (|Detoxification19144| |rate| |Rate-Value19132|)
            (|Rate-Value19132| |directly-proportional| |Quantity-Value19135|)
            (|Liver-Cell19145| |has-part| |Smooth-Endoplasmic-Reticulum19149|)
            (|Smooth-Endoplasmic-Reticulum19149| |quantity| |Quantity-Value19135|))
    :INSTANCE-TYPES (
            (|Detoxification19144| |instance-of| |Detoxification|)
            (|Rate-Value19132| |instance-of| |Rate-Value|)
            (|Liver-Cell19145| |instance-of| |Liver-Cell|)
            (|Smooth-Endoplasmic-Reticulum19149| |instance-of| |Smooth-Endoplasmic-Reticulum|)
            (|Quantity-Value19135| |instance-of| |Quantity-Value|))
    :ROOT-TYPES (
            (|Detoxification19144| |instance-of| |Event|)
            (|Liver-Cell19145| |instance-of| |Entity|)
            (|Rate-Value19132| |instance-of| |Property-Value|)
            (|Quantity-Value19135| |instance-of| |Property-Value|)
            (|Smooth-Endoplasmic-Reticulum19149| |instance-of| |Entity|)))
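
For teams who want to load such an input programmatically, the following is a minimal Python sketch, not part of the official KBGen tooling. It assumes the distributed files follow exactly the layout of the example above: section keywords such as :TRIPLES, and triples written as (|subject| |relation| |object|). The function name parse_kbgen_input and the regular expressions are illustrative choices, not names from the KBGen data release.

```python
import re

# Matches one triple of the form (|subject| |relation| |object|).
TRIPLE_RE = re.compile(r"\(\|([^|]+)\|\s+\|([^|]+)\|\s+\|([^|]+)\|\)")
# Matches a section keyword such as :TRIPLES or :ROOT-TYPES.
SECTION_RE = re.compile(r":([A-Z-]+)")


def parse_kbgen_input(text):
    """Return a dict mapping each section name to its list of triples."""
    sections = {}
    matches = list(SECTION_RE.finditer(text))
    for i, match in enumerate(matches):
        # A section runs from its keyword to the next keyword (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end]
        sections[match.group(1)] = TRIPLE_RE.findall(body)
    return sections


# A shortened copy of the example input above.
EXAMPLE = """(KBGEN-INPUT
    :TRIPLES (
            (|Detoxification19144| |base| |Liver-Cell19145|)
            (|Detoxification19144| |rate| |Rate-Value19132|))
    :INSTANCE-TYPES (
            (|Detoxification19144| |instance-of| |Detoxification|))
    :ROOT-TYPES (
            (|Detoxification19144| |instance-of| |Event|)))"""

parsed = parse_kbgen_input(EXAMPLE)
for subject, relation, obj in parsed["TRIPLES"]:
    print(subject, relation, obj)
```

Each section comes back as a list of (subject, relation, object) tuples, so the :TRIPLES section can be treated as the edges of a small graph and the two type sections as annotations on its nodes.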


As mentioned above, we make available a lexicon which maps each concept and relation present in the input to words.
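
The format of the lexicon is not shown in this message, so the following Python sketch only illustrates the general idea of looking up words for a concept or relation. Everything in it is an assumption: the dictionary HYPOTHETICAL_LEXICON and its two entries are invented for illustration, and the helper names concept_of and words_for are not part of the KBGen release. Stripping the trailing number from an instance name is a shortcut; in the real data the concept would more reliably be read from the :INSTANCE-TYPES section.

```python
import re

# Hypothetical entries, invented for illustration only; the real KBGen
# lexicon and its file format are not described in this message.
HYPOTHETICAL_LEXICON = {
    "has-part": "has",
    "Smooth-Endoplasmic-Reticulum": "smooth endoplasmic reticulum",
}


def concept_of(identifier):
    """Strip a trailing instance number: 'Liver-Cell19145' -> 'Liver-Cell'."""
    return re.sub(r"\d+$", "", identifier)


def words_for(identifier, lexicon=HYPOTHETICAL_LEXICON):
    """Look up a concept or relation; fall back to its de-hyphenated name."""
    concept = concept_of(identifier)
    return lexicon.get(concept, concept.replace("-", " ").lower())


print(words_for("Smooth-Endoplasmic-Reticulum19149"))
print(words_for("Liver-Cell19145"))
```

The fallback branch is only a convenience for names missing from the toy dictionary; a participating system would presumably rely on the distributed lexicon instead.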

Contact email: info@kbgen.org
-------------------
Received on Monday, 11 March 2013 13:25:31 UTC
