Re: Evaluating Exploration/Semantic Search UIs from Andreas Harth on 2008-08-18 (semantic-web@w3.org from August 2008)

From: Andreas Harth <andreas.harth@deri.org>
Date: Mon, 18 Aug 2008 10:41:05 +0100
To: Michiel Hildebrand <Michiel.Hildebrand@cwi.nl>
CC: semantic-web@w3c.org
Message-ID: <48A943B1.2080708@deri.org>

Hi,

Michiel Hildebrand wrote:
> A good initiative. I will load the data into our tools and see what it 
> looks like.
>
> From the information I get from the website it looks like you have 
> been using the data (and other materials) for evaluation. I can't find 
> any results though. The link to the technical report is not working. 
> In other words what are your experiences with the use of this dataset 
> and your evaluation methodology?
The dataset and evaluation method are of draft quality right now.
Running the evaluation on a larger number of users is the next step.
We've conducted some informal evaluations, but not a formal
evaluation with enough participants.

I've removed the link to the non-existent techreport.

What the initial runs did help us to do, though, was to identify issues
with our interaction model and improve the user interface.  We use the
dataset to run performance tests, too.

> I do not completely get the queries. You list three kinds of queries, 
> directed search, simple browsing and complex browsing.
> - Why this distinction?
> - The description of the queries seem a bit limited.
> - Do you have a golden standard for the answers to these queries?
We've just taken a few keyword queries from the AOL dataset related
to our dataset, but haven't derived any task description or correct
answers to the queries yet.

The distinction is mainly according to the number of restrictions
and the anticipated result sets.

Directed search means that the user wants to locate one specific book.
E.g. "joe cipolla mafia cookbook".

Simple browsing involves either specifying one facet, or following
a link/performing a focus change to arrive at a set of results.
E.g. "books about the mediterranean diet".

Complex browsing requires specifying two or more facets to arrive
at the result set.
E.g. "racial segregation in the 1930".

The distinction is a bit blurry; we have yet to determine the "gold
standard" answer sets, and doing so might slightly change the current
classification.
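
To make the distinction above a bit more concrete, here is a rough
sketch in Python.  It is only an illustration, not code from our
system: the names (Query, classify) and the per-query annotations
(number of facets, whether a single book is expected) are assumptions
made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class QueryKind(Enum):
    DIRECTED_SEARCH = "directed search"
    SIMPLE_BROWSING = "simple browsing"
    COMPLEX_BROWSING = "complex browsing"


@dataclass(frozen=True)
class Query:
    """A keyword query with hand-made annotations (hypothetical)."""
    text: str
    # Number of facet restrictions (or link/focus-change steps)
    # assumed necessary to reach the result set.
    facets: int
    # True if the user is assumed to be after one specific book.
    expects_single_item: bool = False


def classify(query: Query) -> QueryKind:
    """Classify by anticipated result set and number of restrictions."""
    if query.expects_single_item:
        return QueryKind.DIRECTED_SEARCH
    if query.facets >= 2:
        return QueryKind.COMPLEX_BROWSING
    return QueryKind.SIMPLE_BROWSING


# The three example queries; the annotations are guesses for illustration.
examples = [
    Query("joe cipolla mafia cookbook", facets=1, expects_single_item=True),
    Query("books about the mediterranean diet", facets=1),
    Query("racial segregation in the 1930", facets=2),
]

for q in examples:
    print(f"{q.text!r}: {classify(q).value}")
```

Since the annotations are assigned by hand, the class of a query can
change once the answer sets are fixed, which is the blurriness
mentioned above.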

>> We've choosen books since there is a public domain data available and 
>> there
>> has been work in the digital library area, so there are existing 
>> systems to
>> compare to.
>
> ...including evaluations?
I haven't found any good evaluations of the digital library systems.  But at
least the digital library systems and tools are relatively mature, some are
open source and are able to handle the amount of data in the corpus.

Thanks for your comments so far!  I hope to be able to incorporate them
into the corpus soon.

Regards,
Andreas.

-- 
http://swse.deri.org/

Received on Monday, 18 August 2008 09:43:17 UTC