- From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
- Date: Mon, 1 Dec 2003 13:36:47 -0000
- To: 'Steve Cayzer' <steve.cayzer@hp.com>, public-esw-thes@w3.org
Yeah, good point. Actually the idea for major and minor match (with the <> 50% definitions) came from the Hacet project report which I got a sneaky preview of. I though it was better than just having 'inexactMatch'. Do you suggest a name change (e.g. 'goodMatch' and 'notSoGoodMatch' :) or just a different definition of major and minor? Al. > -----Original Message----- > From: Steve Cayzer [mailto:steve.cayzer@hp.com] > Sent: 01 December 2003 12:57 > To: public-esw-thes@w3.org > Cc: public-esw-thes@w3.org > Subject: Re: Interpretation of SKOS-Mapping properties ... > > > > Hmmm, I see what you're saying, but I don't see why ">50%" (with its > illusion of precision) is better than "good" or some such (with its > guarantee of vagueness). > What we are talking about here the formal encoding of imprecision :) > > Cheers > > Steve > ----- Original Message ----- > From: "Miles, AJ (Alistair) " <A.J.Miles@rl.ac.uk> > To: "'Steve Cayzer'" <steve.cayzer@hp.com> > Cc: <public-esw-thes@w3.org> > Sent: Monday, December 01, 2003 12:16 PM > Subject: Interpretation of SKOS-Mapping properties ... > > > > Hi Steve, > > > > > 3). > > > The major thing I wanted to post to the list is this (but you > > > may be able to > > > answer it directly?) > > > I notice that on > > > http://www.w3c.rl.ac.uk/2003/11/21-skos-mapping > > > has the following properties: > > > > > > <rdf:Property rdf:ID="majorMatch"> > > > <rdfs:comment>If 'concept A has-major-match concept B' > then the set of > > > resources properly indexed against concept A shares more than > > > 50% of its > > > members with the set of resources properly indexed against concept > > > B.</rdfs:comment> > > > </rdf:Property> > > > > > > <rdf:Property rdf:ID="minorMatch"> > > > <rdfs:comment>If 'concept A has-minor-match concept B' then > > > the set of > > > resources properly indexed against concept A shares less > than 50% but > > > greater than 0 of its members with the set of resources > > > properly indexed > > > against concept B.</rdfs:comment> > > > </rdf:Property> > > > > > > The use of some number (50%) rings warning bells in my mind. > > > What about > > > 49.7% vs 50.1% ? How do we know anyway? > > > A more comfortable definition (in my mind) would be > something vaguer > > > major match -> This means that a resource properly indexed > > > against A has a > > > good chance of being properly indexed against B > > > minor match -> This means that a resource properly indexed > > > against A has > > > some chance of being (or 'may be') properly indexed against B > > > > Good point. This brings up a duality of perspective that > I've been trying > > to understand for a while. Let's have a crack at explaining it... > > > > I have defined these properties with formal entailments, > i.e. majorMatch > > entails >50% overlap of the document sets corresponding to > the concepts. > > However, a person creating the mapping must make a best guess as to > whether > > this will be true, based on their interpretation of the > different meanings > > of the concepts. > > > > To make this point another way, consider the following two sets of > > instructions on how to use the <soks:majorMatch> property, > one to a person > > creating a mapping, and one to a programmer developing > applications that > use > > the <soks:majorMatch> property ... > > > > Instructions to mapper: > > Use <soks:majorMatch> to link concepts A and B if they > overlap in meaning, > > and if you believe that more than 50% of the documents that > are about > > concept A will also be about concept B. > > > > Instructions to programmer: > > The ( <ConceptA> <soks:majorMatch> <ConceptB> ) statement > entails that > >50% > > of the documents properly indexed against concept A are > also properly > > indexed against concept B. Thus in a query the two concepts may be > > interchanged, and a success rate of >50% may be expected. > > > > I.e. the mapper makes a best guess based on the meaning of > the concepts, > > with imperfect knowledge of the actual document sets, and > the programmer > > writes programs that process these statements as if they are true > statements > > about the world, made by someone with perfect knowledge of > the document > > sets. > > > > I think it's worth bearing in mind what actual impact these > different > > mapping statements will have to the user. A good mapping > will mean that a > > query app processing transformed queries can guarantee > complete recall, > and > > order the result set to put better matches first. A poor > mapping means > lots > > of bogus results, incomplete recall and no good ordering. > In order to > > generate a good mapping, the mapper needs the right tools > (i.e. a well > > designed vocab) and must know how to use them (i.e. needs a > clear set of > > instructions). So this is what we're working towards. > > > > How does that go down? > > > > Al. > > >
Received on Monday, 1 December 2003 08:36:51 UTC