Re: Interpretation of SKOS-Mapping properties ... from Steve Cayzer on 2003-12-01 (public-esw-thes@w3.org from December 2003)

From: Steve Cayzer <steve.cayzer@hp.com>
Date: Mon, 1 Dec 2003 14:12:58 -0000
Cc: <public-esw-thes@w3.org>
Message-ID: <032201c3b815$3ab832c0$ef5cc050@cayzers2>
Hmmm, I see what you're saying, but I don't see why ">50%" (with its
illusion of precision) is better than "good" or some such (with its
guarantee of vagueness).
What we are talking about here is the formal encoding of imprecision :)

Cheers

Steve
----- Original Message ----- 
From: "Miles, AJ (Alistair) " <A.J.Miles@rl.ac.uk>
To: "'Steve Cayzer'" <steve.cayzer@hp.com>
Cc: <public-esw-thes@w3.org>
Sent: Monday, December 01, 2003 12:16 PM
Subject: Interpretation of SKOS-Mapping properties ...


> Hi Steve,
>
> > 3).
> > The major thing I wanted to post to the list is this (but you
> > may be able to
> > answer it directly?)
> > I notice that on
> > http://www.w3c.rl.ac.uk/2003/11/21-skos-mapping
> > has the following properties:
> >
> > <rdf:Property rdf:ID="majorMatch">
> > <rdfs:comment>If 'concept A has-major-match concept B' then the set of
> > resources properly indexed against concept A shares more than
> > 50% of its
> > members with the set of resources properly indexed against concept
> > B.</rdfs:comment>
> > </rdf:Property>
> >
> > <rdf:Property rdf:ID="minorMatch">
> >   <rdfs:comment>If 'concept A has-minor-match concept B' then
> > the set of
> > resources properly indexed against concept A shares less than 50% but
> > greater than 0 of its members with the set of resources
> > properly indexed
> > against concept B.</rdfs:comment>
> >     </rdf:Property>
> >
> > The use of some number (50%) rings warning bells in my mind.
> > What about
> > 49.7% vs 50.1% ? How do we know anyway?
> > A more comfortable definition (in my mind) would be something vaguer
> > major match -> This means that a resource properly indexed
> > against A has a
> > good chance of being properly indexed against B
> > minor match -> This means that a resource properly indexed
> > against A has
> > some chance of being (or 'may be') properly indexed against B
>
> Good point.  This brings up a duality of perspective that I've been trying
> to understand for a while.  Let's have a crack at explaining it...
>
> I have defined these properties with formal entailments, i.e. majorMatch
> entails >50% overlap of the document sets corresponding to the concepts.
> However, a person creating the mapping must make a best guess as to
> whether this will be true, based on their interpretation of the different
> meanings of the concepts.
>
> To make this point another way, consider the following two sets of
> instructions on how to use the <soks:majorMatch> property, one to a person
> creating a mapping, and one to a programmer developing applications that
> use the <soks:majorMatch> property ...
>
> Instructions to mapper:
> Use <soks:majorMatch> to link concepts A and B if they overlap in meaning,
> and if you believe that more than 50% of the documents that are about
> concept A will also be about concept B.
>
> Instructions to programmer:
> The ( <ConceptA> <soks:majorMatch> <ConceptB> ) statement entails that >50%
> of the documents properly indexed against concept A are also properly
> indexed against concept B.  Thus in a query the two concepts may be
> interchanged, and a success rate of >50% may be expected.
>
> I.e. the mapper makes a best guess based on the meaning of the concepts,
> with imperfect knowledge of the actual document sets, and the programmer
> writes programs that process these statements as if they are true
> statements about the world, made by someone with perfect knowledge of the
> document sets.
>
> I think it's worth bearing in mind what actual impact these different
> mapping statements will have to the user.  A good mapping will mean that a
> query app processing transformed queries can guarantee complete recall, and
> order the result set to put better matches first.  A poor mapping means lots
> of bogus results, incomplete recall and no good ordering.  In order to
> generate a good mapping, the mapper needs the right tools (i.e. a well
> designed vocab) and must know how to use them (i.e. needs a clear set of
> instructions).  So this is what we're working towards.
>
> How does that go down?
>
> Al.
>
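
As an illustration of the "programmer" reading discussed above, here is a minimal sketch that applies the quoted majorMatch and minorMatch definitions literally to two known document sets. The function names and the use of plain Python sets are assumptions made for this example; nothing here comes from the SKOS-Mapping schema itself. It also makes the threshold concern concrete: an overlap of exactly 50% falls under neither definition as worded.

```python
def overlap_fraction(docs_a, docs_b):
    """Fraction of the documents indexed against A that are also indexed against B."""
    docs_a, docs_b = set(docs_a), set(docs_b)
    if not docs_a:
        raise ValueError("concept A has no indexed documents")
    return len(docs_a & docs_b) / len(docs_a)


def classify_match(docs_a, docs_b):
    """Apply the quoted definitions literally (hypothetical helper).

    Returns "majorMatch" for more than 50% overlap, "minorMatch" for an
    overlap greater than 0 but less than 50%, and None otherwise
    (no overlap, or exactly 50%, which the wording leaves uncovered).
    """
    fraction = overlap_fraction(docs_a, docs_b)
    if fraction > 0.5:
        return "majorMatch"
    if 0 < fraction < 0.5:
        return "minorMatch"
    return None


if __name__ == "__main__":
    print(classify_match({1, 2, 3}, {2, 3}))        # two of three documents shared
    print(classify_match({1, 2, 3, 4}, {4}))        # one of four documents shared
    print(classify_match({1, 2, 3, 4}, {3, 4, 5}))  # exactly half shared
```

The measure is asymmetric: it is computed from concept A's side only, following the "shares ... of its members" wording, so swapping A and B can give a different result.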
Received on Monday, 1 December 2003 07:56:32 UTC