Call for result submission: Entity Search @ SEMSEARCH10

(Apologies if you receive multiple copies of this message)

======================================================

Fellow Researcher,

For this year's SemSearch workshop, to be held at WWW 2010, we are glad
to announce a special track on entity search. Its purpose is to see
where we are and to promote further research on entity retrieval over
semantic data. Please refer to the call below for more details.


We would like to announce that the FINAL QUERIES for evaluation are
now available and that the deadline for submitting final results is
EXTENDED to April 17th! If needed, results of up to three runs can be
submitted. The design decisions and rationales behind these
configurations can be explained in the system description.


We are looking forward to seeing you at SemSearch10 in Raleigh, NC!


Cheers,

Marko Grobelnik, Jožef Stefan Institute, Ljubljana, Slovenia
Peter Mika, Yahoo! Research, Barcelona, Spain
Thanh Tran Duc, Institute AIFB, University of Karlsruhe (TH), Germany
Haofen Wang, Apex Lab, Shanghai Jiao Tong University, China.



===================================

Entity Search @ SEMSEARCH10


Third International Semantic Search Workshop SemSearch10

April 26, 2010, Raleigh, NC, USA

Homepage: http://km.aifb.uni-karlsruhe.de/ws/semsearch10#eva


Submission deadline for descriptions of Entity Search systems &
results: April 17th, 2010 (12.00 AM, GMT)


===================================

Our ultimate goal is to develop a benchmark on which semantic search
systems can be compared and analyzed in a systematic fashion. Clearly,
semantics can be used for different tasks (document vs. data retrieval)
and can be exploited throughout the search process (for more usable
query construction, for better matching and ranking, for richer result
presentation, etc.). Hence, such a benchmark should enable the study of
different aspects of semantic search systems.

For this workshop, we will initially focus on matching and ranking in
the semantic data search scenario. In particular, we aim to analyze the
effectiveness, efficiency and robustness of those features of semantic
search systems that are ready to be applied to the Web today: the
capability to answer queries related to real-world entities.

The research questions we aim to tackle are:

- How well do semantic data search engines perform on the task of
Entity Search on the Web?
- What are the underlying concepts and techniques that make up the differences?

To answer these questions, we provide the following guidelines and
support for evaluating entity search systems:


-----------------------------------
Queries
-----------------------------------

We provide a set of queries that are focused on the task of entity
search. Every query is a plain list of keywords that refer to one
particular entity. In other words, the queries ask for one particular
entity (as opposed to a set of entities).

These queries represent a sample extracted from the Yahoo Web search
query log. One example of this type is "Semantic Search workshop 2010
WWW", which retrieves resources that are representations of, or related
to, the current Semantic Search workshop.

More sample queries can be downloaded from this link:

http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/samplequeries

Access to the evaluation set of queries and thus participation in the
evaluation requires the signing of a license agreement.

http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/agreement


The FINAL QUERIES for evaluation are now available at

http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/finalqueries



-----------------------------------
Data
-----------------------------------

We provide a corpus of datasets that contain entity descriptions in
the form of RDF. They represent a sample of Web data crawled from
publicly available sources. For this evaluation, we use the Billion
Triple Challenge 2009 dataset. Further information and detailed
statistics can be found here:

http://vmlion25.deri.ie/

The original Billion Triple Challenge 2009 dataset contains blank
nodes. We will not deal with blank nodes in this evaluation and thus
require participants to encode blank nodes according to the following
rule: BNID maps to http://example.org/URLEncode(BNID), where BNID is
the blank node id. Since the blank node ids in that dataset are unique,
this convention is sufficient to map blank nodes to distinct URIs.
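
As an illustration only, the encoding rule above could be sketched in
Python as follows. The function names are hypothetical. The call does
not say whether BNID includes the "_:" prefix of the serialized label
or which URL-encoding variant is meant; this sketch assumes the prefix
is dropped and that every reserved character is percent-encoded. The
pre-converted dataset remains the reference if the two disagree.

```python
from urllib.parse import quote

# Base URI given in the call for papers.
BASE = "http://example.org/"


def blank_node_to_uri(bnid: str) -> str:
    """Map a blank node id to http://example.org/URLEncode(BNID)."""
    # safe="" percent-encodes all reserved characters, including "/" and ":".
    # Assumption: this matches the URLEncode intended by the organizers.
    return BASE + quote(bnid, safe="")


def rewrite_term(term: str) -> str:
    """Rewrite one serialized RDF term if it is a blank node label (_:id)."""
    if term.startswith("_:"):
        # Assumption: the "_:" prefix is not part of BNID.
        return "<" + blank_node_to_uri(term[2:]) + ">"
    return term


if __name__ == "__main__":
    print(rewrite_term("_:node1"))
    print(rewrite_term("<http://dbpedia.org/resource/Raleigh>"))
```

Terms that are not blank node labels pass through unchanged, so the
helper can be applied to every subject and object position of a triple.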

Instead of encoding the blank nodes using this convention,
participants can also download the following version of the Billion
Triple Challenge 2009 dataset, in which blank nodes have already been
converted to URIs:

http://km.aifb.uni-karlsruhe.de/ws/dataset_semsearch2010/000-CONTENTS


-----------------------------------
Relevance Judgment
-----------------------------------


The search systems produce lists of at most 10 resources ordered by
relevance. These results have to be drawn from data in the corpus.
Results will be evaluated on the three-point scale (0) Not Relevant,
(1) Relevant and (3) Perfect Match. A perfect match is a description of
a resource that matches the entity to be retrieved by the query. A
relevant result is a resource description that is related to the entity
to be retrieved by the query, i.e. the entity is contained in the
description of that result. Otherwise, a resource description is not
relevant.

In the current evaluation we only assess individual results, as they
are found in the original data set. We do not assess the potential of
semantic search systems for disambiguating and merging resources. In
other words, only resources appearing in the original data set may be
returned as results.


-----------------------------------
Evaluation Process
-----------------------------------

To participate, each system has to run the provided queries on the
corpus. The results have to be submitted in one file following the
TREC format:

http://www.ir.iit.edu/~dagr/cs529/files/project_files/trec_eval_desc.htm
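
As a rough sketch, a run in the usual TREC result layout (query id, the
literal "Q0", document id, rank, score, run tag, separated by
whitespace) could be produced as shown here. The function name, the
query id "q1", the run tag "myrun1" and the example URIs are made up
for illustration; the page linked above is the authoritative
description of the format.

```python
from typing import Dict, List, Tuple

# The call allows at most 10 resources per query.
MAX_RESULTS = 10


def format_run(results: Dict[str, List[Tuple[str, float]]], run_tag: str) -> str:
    """Serialize scored results as lines in the TREC result layout.

    results maps a query id to (resource URI, score) pairs in any order.
    """
    lines = []
    for query_id in sorted(results):
        # Highest score first; keep only the top MAX_RESULTS entries.
        ranked = sorted(results[query_id], key=lambda pair: pair[1], reverse=True)
        for rank, (uri, score) in enumerate(ranked[:MAX_RESULTS], start=1):
            lines.append(f"{query_id} Q0 {uri} {rank} {score:.4f} {run_tag}\n")
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical query id, URIs and scores.
    example = {
        "q1": [
            ("http://example.org/a", 0.5),
            ("http://example.org/b", 0.9),
        ]
    }
    with open("myrun1.txt", "w", encoding="utf-8") as handle:
        handle.write(format_run(example, "myrun1"))
```

Writing one such file per run keeps each configuration separately
readable by the TREC evaluation tool.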

If needed, participants can submit results of up to three runs! The
design decisions and rationales behind these configurations can be
explained in the system description.

Please verify that your result file can be read with the TREC
evaluation tool available at:

http://trec.nist.gov/trec_eval/index.html

The assessment of the results will be performed manually using Amazon
Mechanical Turk.

Based on the relevance judgments, recall, precision, F-measure and
mean average precision will be computed and used as the basis for
comparing the performance of the search systems.
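
For orientation, the textbook definitions of these measures can be
sketched as follows. This is not the organizers' evaluation code: the
function names are hypothetical, relevance is treated as binary (how
the graded judgments are mapped to "relevant" is not stated in this
call), and each ranking is assumed to contain no duplicate resources.

```python
from typing import Dict, List, Set, Tuple


def precision_recall_f(ranked: List[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and balanced F-measure for one query."""
    hits = sum(1 for resource in ranked if resource in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@k over the ranks k at which a relevant result occurs."""
    hits = 0
    total = 0.0
    for k, resource in enumerate(ranked, start=1):
        if resource in relevant:
            hits += 1
            total += hits / k
    # Dividing by all relevant resources penalizes those never retrieved.
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]) -> float:
    """Average of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    scores = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(scores) / len(scores)
```

Queries with judgments but no submitted results contribute an average
precision of zero in this sketch.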

Given the permission of the participants, results of the assessment
and evaluation feedback will be made publicly available on the
workshop's website.


-----------------------------------
Submission and Proceedings
-----------------------------------

For the Entity Search Track at SemSearch, participants can submit:

- (1) A short system description paper (April 17th): up to 5 pages in
ACM format
This submission is optional and will be considered for the proceedings.
Participants can register at the workshop and ask for a presentation
slot without having submitted such a system description paper.
Submissions must be formatted using the WWW2010 templates available at
http://www2010.org/www/authors/submissions/formatting-guidelines/.

- (2) Evaluation results (April 17th): results in TREC format (UP TO 3 RUNS!)

Please use the following link to the submission system to submit your paper:
Easychair Submission System for SemSearch10 at
http://www.easychair.org/conferences/?conf=semsearch10

For standard papers and system descriptions, the system accepts PDF.
The evaluation results should be uploaded as TXT.


-----------------------------------
Important Dates
-----------------------------------

Deadline for optional Entity Search system description submissions:
April 17th, 2010 (12.00 AM, GMT)

Deadline for Entity Search Evaluation results: April 17th, 2010 (12.00 AM, GMT)

Notification of acceptance for Entity Search system papers: April 22nd, 2010

WWW'10 Conference: April 26th-30th, 2010


Workshop Day: April 26th, 2010

-----------------------------------
Contact
-----------------------------------
For news and discussions related to SemSearch and Evaluation at
SemSearch, please register at

http://tech.groups.yahoo.com/group/semsearcheval/.
The organizing committee can be reached using the contact data
available on their web pages (or at semsearch10@easychair.org).
See website http://km.aifb.uni-karlsruhe.de/ws/semsearch10.

Received on Thursday, 8 April 2010 09:54:48 UTC