- From: Duc Thanh Tran <tran.du.th@googlemail.com>
- Date: Thu, 8 Apr 2010 11:54:15 +0200
- To: semantic-web@w3.org, public-lod@w3.org
(Apologies if you receive multiple copies of this message)

Call for result submission: Entity Search @ SEMSEARCH10
======================================================

Fellow Researcher,

for this year's SemSearch workshop, to be held at WWW 2010, we are glad to announce a special track on entity search. Its purpose is to see where we are and to promote further research on entity retrieval over semantic data. Please refer to the call below for more details.

We would like to announce that the FINAL QUERIES for evaluation are now available and that the deadline for submitting final results is EXTENDED to April 17th! If needed, results of up to three runs can be submitted. The design decisions and rationales behind these three configurations can be explained in the system description.

We are looking forward to seeing you at SemSearch10 in Raleigh, NC!

Cheers,

Marko Grobelnik, Jožef Stefan Institute, Ljubljana, Slovenia
Peter Mika, Yahoo! Research, Barcelona, Spain
Thanh Tran Duc, Institute AIFB, University of Karlsruhe (TH), Germany
Haofen Wang, Apex Lab, Shanghai Jiao Tong University, China

===================================
Entity Search @ SEMSEARCH10
Third International Semantic Search Workshop SemSearch10
April 26, 2010, Raleigh, NC, USA
Homepage: http://km.aifb.uni-karlsruhe.de/ws/semsearch10#eva
Submission deadline for descriptions of Entity Search systems & results: April 17th, 2010 (12.00 AM, GMT)
===================================

Our ultimate goal is to develop a benchmark against which semantic search systems can be compared and analyzed in a systematic fashion. Semantics can be used for different tasks (document vs. data retrieval) and can be exploited throughout the search process (for more usable query construction, for better matching and ranking, for richer result presentation, etc.). Hence, such a benchmark should enable the study of different aspects of semantic search systems.
For this workshop, we will initially focus on matching and ranking in the semantic data search scenario. In particular, we aim to analyze the effectiveness, efficiency and robustness of those features of semantic search systems that are ready to be applied to the Web today: the capability to answer queries related to real-world entities. The research questions we aim to tackle are:

- How well do semantic data search engines perform on the task of Entity Search on the Web?
- What are the underlying concepts and techniques that make up the differences?

To answer these questions, we provide the following guidelines and support for evaluating entity search systems:

-----------------------------------
Queries
-----------------------------------

We provide a set of queries that are focused on the task of entity search. Every query is a plain list of keywords that refer to one particular entity. In other words, the queries ask for one particular entity (as opposed to a set of entities). These queries represent a sample extracted from the Yahoo Web search query log. One example of this type is "Semantic Search workshop 2010 WWW", which retrieves resources that are representations of or related to the current Semantic Search workshop. More sample queries can be downloaded from this link:
http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/samplequeries

Access to the evaluation set of queries, and thus participation in the evaluation, requires the signing of a license agreement:
http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/agreement

The FINAL QUERIES for evaluation are now available at
http://km.aifb.uni-karlsruhe.de/ws/semsearch10/Files/finalqueries

-----------------------------------
Data
-----------------------------------

We provide a corpus of datasets, which contain entity descriptions in the form of RDF. They represent a sample of Web data crawled from publicly available sources.
For this evaluation, we use the Billion Triple Challenge 2009 dataset. Further information and detailed statistics can be found here:
http://vmlion25.deri.ie/

The original Billion Triple Challenge 2009 dataset contains blank nodes. We will not deal with blank nodes in this evaluation and thus require participants to encode blank nodes according to the following rule: BNID maps to http://example.org/URLEncode(BNID), where BNID is the blank node id. Since the blank node ids in that dataset are unique, this convention is sufficient to map blank nodes to distinct URIs. Instead of encoding the blank nodes using this convention, participants can also download the following version of the Billion Triple Challenge 2009 dataset, in which blank nodes have already been converted to URIs:
http://km.aifb.uni-karlsruhe.de/ws/dataset_semsearch2010/000-CONTENTS

-----------------------------------
Relevance Judgment
-----------------------------------

The search systems produce lists of at most 10 resources ordered by relevance. These results have to be drawn from data in the corpus. Results will be evaluated on the three-point scale (0) Not Relevant, (1) Relevant and (3) Perfect Match. A perfect match is a description of a resource that matches the entity to be retrieved by the query. A relevant result is a resource description that is related to the entity to be retrieved by the query, i.e. the entity is contained in the description of that result. Otherwise, a resource description is not relevant.

In the current evaluation we only assess individual results, as they are found in the original data set. We do not assess the potential of semantic search systems for disambiguating and merging resources. In other words, only resources appearing in the original data set may be returned as results.

-----------------------------------
Evaluation Process
-----------------------------------

To participate, each system will have to run the provided queries on the corpus.
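
As a side note on the blank-node rule given in the Data section above, the mapping could be sketched roughly as follows. This is only an illustration: the call does not say which URL-encoding variant is meant or whether a "_:" prefix counts as part of BNID, so the sketch simply percent-encodes whatever id it is given. The function name and the sample id are made up.

```python
from urllib.parse import quote

# Base URI taken from the rule in the call: http://example.org/URLEncode(BNID)
BASE_URI = "http://example.org/"


def bnode_to_uri(bnid: str) -> str:
    """Map a blank node id to a URI by percent-encoding the id.

    Assumption: plain percent-encoding with no characters left unescaped.
    If the organizers' converted dataset uses form-style encoding
    (space as '+'), urllib.parse.quote_plus would be the closer match.
    """
    return BASE_URI + quote(bnid, safe="")


if __name__ == "__main__":
    # Hypothetical blank node id, for illustration only
    print(bnode_to_uri("node 1/x"))
```

Because the ids are unique in the dataset and percent-encoding is reversible, distinct ids end up as distinct URIs. Comparing a few outputs against the pre-converted dataset linked above would show whether the encoding variant matches.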
The results have to be submitted in one file following the TREC format:
http://www.ir.iit.edu/~dagr/cs529/files/project_files/trec_eval_desc.htm

If needed, participants can submit results of up to three runs! The design decisions and rationales behind these three configurations can be explained in the system description. Please verify that your result file can be read with the TREC evaluation tool available at:
http://trec.nist.gov/trec_eval/index.html

The assessment of the results will be performed manually using Amazon Mechanical Turk. Based on the relevance judgments, recall, precision, F-measure and mean average precision will be computed and used as the basis for comparing the search systems' performance. Given permission of the participants, results of the assessment and the evaluation feedback will be made publicly available on the workshop's website.

-----------------------------------
Submission and Proceedings
-----------------------------------

For the Entity Search Track at SemSearch, participants can submit:

- (1) A short system description paper (April 17th): up to 5 pages in ACM format. This submission is optional and will be considered for the proceedings. Participants can register for the workshop and ask for a presentation slot without having submitted such a system description paper. Submissions must be formatted using the WWW2010 templates available at http://www2010.org/www/authors/submissions/formatting-guidelines/.
- (2) Evaluation results (April 17th): results in TREC format (UP TO 3 RUNS!)

Please use the following link to the submission system to submit your paper: Easychair Submission System for SemSearch10 at
http://www.easychair.org/conferences/?conf=semsearch10

For standard papers and system descriptions, the system accepts PDF. The evaluation results should be uploaded as TXT.
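
For participants preparing the TXT upload, here is a rough sketch of how a results file in the usual trec_eval layout could be produced: one line per result with query id, the literal "Q0", resource URI, rank, score and run tag. The function name, query id, URIs, scores and run tag are all made up for illustration; the cutoff of 10 results per query follows the Relevance Judgment section. Please still check the output with the TREC evaluation tool.

```python
def format_run(results, run_tag, max_results=10):
    """Render ranked results as trec_eval-style lines.

    results: dict mapping a query id to a list of (resource_uri, score) pairs.
    run_tag: short label identifying the run (hypothetical naming).
    Returns the file content as a single string.
    """
    lines = []
    for query_id, scored in results.items():
        # Order by descending score and keep at most max_results entries
        top = sorted(scored, key=lambda pair: pair[1], reverse=True)[:max_results]
        for rank, (uri, score) in enumerate(top, start=1):
            lines.append(f"{query_id} Q0 {uri} {rank} {score:.4f} {run_tag}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical query id and resource URIs, for illustration only
    demo = {"q1": [("http://example.org/a", 0.5), ("http://example.org/b", 0.9)]}
    with open("run1.txt", "w", encoding="utf-8") as handle:
        handle.write(format_run(demo, "run1"))
```

With up to three runs allowed, one option is to call the function once per configuration with a different run tag, so that the runs stay distinguishable after evaluation.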
-----------------------------------
Important Dates
-----------------------------------

Deadline for optional Entity Search system description submissions: April 17th, 2010 (12.00 AM, GMT)
Deadline for Entity Search Evaluation results: April 17th, 2010 (12.00 AM, GMT)
Notification of acceptance for Entity Search system papers: April 22nd, 2010
WWW'10 Conference: April 26th-30th, 2010
Workshop Day: April 26th, 2010

-----------------------------------
Contact
-----------------------------------

For news and discussions related to SemSearch and Evaluation at SemSearch, please register at http://tech.groups.yahoo.com/group/semsearcheval/. The organizing committee can be reached using the contact data available on their web pages (or at semsearch10@easychair.org). See the website http://km.aifb.uni-karlsruhe.de/ws/semsearch10.
Received on Thursday, 8 April 2010 09:54:48 UTC