- From: Steve Cayzer <steve.cayzer@hp.com>
- Date: Mon, 1 Dec 2003 14:12:58 -0000
- Cc: <public-esw-thes@w3.org>
Hmmm, I see what you're saying, but I don't see why ">50%" (with its illusion of precision) is better than "good" or some such (with its guarantee of vagueness). What we are talking about here the formal encoding of imprecision :) Cheers Steve ----- Original Message ----- From: "Miles, AJ (Alistair) " <A.J.Miles@rl.ac.uk> To: "'Steve Cayzer'" <steve.cayzer@hp.com> Cc: <public-esw-thes@w3.org> Sent: Monday, December 01, 2003 12:16 PM Subject: Interpretation of SKOS-Mapping properties ... > Hi Steve, > > > 3). > > The major thing I wanted to post to the list is this (but you > > may be able to > > answer it directly?) > > I notice that on > > http://www.w3c.rl.ac.uk/2003/11/21-skos-mapping > > has the following properties: > > > > <rdf:Property rdf:ID="majorMatch"> > > <rdfs:comment>If 'concept A has-major-match concept B' then the set of > > resources properly indexed against concept A shares more than > > 50% of its > > members with the set of resources properly indexed against concept > > B.</rdfs:comment> > > </rdf:Property> > > > > <rdf:Property rdf:ID="minorMatch"> > > <rdfs:comment>If 'concept A has-minor-match concept B' then > > the set of > > resources properly indexed against concept A shares less than 50% but > > greater than 0 of its members with the set of resources > > properly indexed > > against concept B.</rdfs:comment> > > </rdf:Property> > > > > The use of some number (50%) rings warning bells in my mind. > > What about > > 49.7% vs 50.1% ? How do we know anyway? > > A more comfortable definition (in my mind) would be something vaguer > > major match -> This means that a resource properly indexed > > against A has a > > good chance of being properly indexed against B > > minor match -> This means that a resource properly indexed > > against A has > > some chance of being (or 'may be') properly indexed against B > > Good point. This brings up a duality of perspective that I've been trying > to understand for a while. Let's have a crack at explaining it... > > I have defined these properties with formal entailments, i.e. majorMatch > entails >50% overlap of the document sets corresponding to the concepts. > However, a person creating the mapping must make a best guess as to whether > this will be true, based on their interpretation of the different meanings > of the concepts. > > To make this point another way, consider the following two sets of > instructions on how to use the <soks:majorMatch> property, one to a person > creating a mapping, and one to a programmer developing applications that use > the <soks:majorMatch> property ... > > Instructions to mapper: > Use <soks:majorMatch> to link concepts A and B if they overlap in meaning, > and if you believe that more than 50% of the documents that are about > concept A will also be about concept B. > > Instructions to programmer: > The ( <ConceptA> <soks:majorMatch> <ConceptB> ) statement entails that >50% > of the documents properly indexed against concept A are also properly > indexed against concept B. Thus in a query the two concepts may be > interchanged, and a success rate of >50% may be expected. > > I.e. the mapper makes a best guess based on the meaning of the concepts, > with imperfect knowledge of the actual document sets, and the programmer > writes programs that process these statements as if they are true statements > about the world, made by someone with perfect knowledge of the document > sets. > > I think it's worth bearing in mind what actual impact these different > mapping statements will have to the user. A good mapping will mean that a > query app processing transformed queries can guarantee complete recall, and > order the result set to put better matches first. A poor mapping means lots > of bogus results, incomplete recall and no good ordering. In order to > generate a good mapping, the mapper needs the right tools (i.e. a well > designed vocab) and must know how to use them (i.e. needs a clear set of > instructions). So this is what we're working towards. > > How does that go down? > > Al. >
Received on Monday, 1 December 2003 07:56:32 UTC