
Re: [SKOS] The return of ISSUE-44 (was Re: TR : SKOS Reference Editor's Draft 23 December 2007)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Thu, 10 Jan 2008 20:07:23 +0100
Message-ID: <47866CEB.2040702@mondeca.com>
To: Daniel Rubin <rubin@med.stanford.edu>
Cc: Antoine Isaac <aisaac@few.vu.nl>, SKOS <public-esw-thes@w3.org>, SWD WG <public-swd-wg@w3.org>

OK Daniel, let me have another try, if you don't mind :-)
> From my point of view, it does NOT make sense that skos:narrower and 
> broader are not transitive.
> And if applications can go ahead and make them transitive by expanding 
> how they wish, that violates the asserted SKOS semantics. Unless I'm 
> misunderstanding something here, this sounds like a formula for chaos.
Expanding the query does not *make* the relation transitive; it's just 
an application feature. I don't see any violation of the semantics. The 
results proposed are not results of the original query, but of *query 
expansion*. The query expansion is not the original query, right? There 
is one single way to strictly answer the query, and many ways to expand it.

I have this real-life example from a customer in legal publishing.
The figures are around 2 million documents, and 50,000 concepts in the 
vocabulary (and growing), with a very deep tree.
Suppose I start a search at level 3, on a concept with 5 direct narrower 
concepts, and about 500 more downwards if transitivity is applied.
If I don't expand the query, say I get 40 answers indexed on the direct 
narrower concepts. If I expand it with unbounded transitivity, say I get 
4,000 answers. Way too many. Think about performance.

From an end-user perspective, what is best? Retrieving very quickly 
the 40 resources classified directly by the 5 direct children, and 
allowing the user to expand from one of those one or two steps down, 
does not seem a recipe for chaos, but for a sound adaptation to the 
context, and for tackling some scalability issues. If transitivity is 
built into the semantics, I have to go down the whole tree and retrieve 
the 4,000 answers. If I want to trim the tree to limit the results, then 
I will break the semantics ...
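
To make the distinction concrete, here is a minimal sketch of depth-bounded 
query expansion. It is only an illustration under stated assumptions: the 
toy vocabulary NARROWER and the function expand() are made-up names, not 
part of SKOS or of the customer's application. The asserted narrower links 
stay exactly as they are; only the application decides how far down to walk.

```python
from collections import deque

# Hypothetical toy vocabulary: concept -> its *direct* narrower concepts.
# Only direct links are asserted; nothing here declares transitivity.
NARROWER = {
    "law": ["civil", "criminal"],
    "civil": ["contract", "tort"],
    "contract": ["sale"],
}


def expand(concept, narrower, max_depth=None):
    """Collect concepts reachable from `concept` through narrower links.

    max_depth=1 gives the direct narrower concepts only (the strict answer),
    max_depth=None walks the whole subtree (unbounded expansion).
    The starting concept itself is not included in the result.
    """
    seen = {concept}
    result = []
    queue = deque([(concept, 0)])
    while queue:
        current, depth = queue.popleft()
        # Stop descending once the application-chosen bound is reached.
        if max_depth is not None and depth >= max_depth:
            continue
        for child in narrower.get(current, []):
            if child not in seen:
                seen.add(child)
                result.append(child)
                queue.append((child, depth + 1))
    return result


direct = expand("law", NARROWER, max_depth=1)    # strict, fast answer
two_steps = expand("law", NARROWER, max_depth=2)  # user asked for one more level
everything = expand("law", NARROWER)              # unbounded expansion
```

The same data serves all three calls; choosing max_depth is an application 
decision at query time, which is the point I am trying to make above.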

Does that make sense?



Received on Thursday, 10 January 2008 19:07:38 UTC
