Re: SKOS ISSUE-39: clarification?

On Mon, 3 Dec 2007 at 22:45:02, Antoine Isaac <aisaac@few.vu.nl> wrote
>That being said, the difference between relatedMatch and overlappingMatch
>is not 100% obvious even to me.

>The main motivation is that the previous SKOS mapping specification was
>assuming a quite 'mechanical', extensional approach to partial mappings.
>minor/majorMatch were defined on the basis that resources were described
>by both mapped concepts.

There is a danger in basing mappings on the indexing of a specific set
of resources. At present the SKOS Mapping Vocabulary Specification at
<http://www.w3.org/2004/02/skos/mapping/spec/> defines mappings by rules
such as this statement for "exactMatch":

    "If two concepts are an 'exact-match' then the set of resources
    properly indexed against the first concept is identical to the set
    of resources properly indexed against the second. Therefore the two
    concepts may be interchanged in queries and subject-based indexes."

This type of definition depends too much on the universe from which the
sets of resources are drawn. Presumably this is a hypothetical global
set, but in any realistic situation the resources that can be used to
test such a relationship must be limited. If in any such limited set,
the only resources that contain the concept "France" happen also to
contain the concept "war", the conclusion would be drawn that "France"
and "war" are an exact match, i.e. that they are synonymous.
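
To make the dependence on the chosen collection concrete, here is a
minimal Python sketch of the extensional test. The function names and
the two tiny collections are invented purely for illustration; they are
not part of the SKOS specification or of any real catalogue.

```python
def indexed_under(index, concept):
    """Return the set of resources indexed against the given concept."""
    return {res for res, concepts in index.items() if concept in concepts}


def extensional_exact_match(index, a, b):
    """Apply the extensional rule: identical resource sets mean exact match."""
    return indexed_under(index, a) == indexed_under(index, b)


# Hypothetical limited collection: every "France" item is also about "war".
small_library = {
    "doc1": {"France", "war"},
    "doc2": {"France", "war"},
    "doc3": {"cookery"},
}

# The same collection after acquiring one more (hypothetical) item.
larger_library = dict(small_library, doc4={"France", "wine"})

print(extensional_exact_match(small_library, "France", "war"))
print(extensional_exact_match(larger_library, "France", "war"))
```

The answer changes when a single resource is added, although nothing
about the meaning of either concept has changed.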

>If I wanted to remove minor/majorMatch (because I find the 50% criterion
>too much arbitrary), I had to find something with the same kind of criterion
>to replace them (because I thought there was some point in representing
>this "overlapping extensions" situations). So overlappingMatch is defined
>as a relation that holds when there is a set of documents potentially
>described by the two concepts at the same time.

As a weaker example of the previous situation, if _some_ of the
resources contain both concepts, the conclusion would be drawn that
there is an "overlapping match" between these concepts. Again this
conclusion would be false.

This type of criterion would work only if each resource dealt with a
single concept, by which it could be indexed. This is the exception
rather than the rule. The "statistical mapping" work done at OCLC
approximates to this simplification by choosing a single Dewey
classification number and the first subject heading assigned to a work,
thus dealing only with the "predominant topic":
<http://staff.oclc.org/~vizine/sig_cr/sigcr_done_dvg.htm>, para. 2.1.2.

>The problem is that this does not render the associative "related" link
>between terms from a thesaurus. Imagine two concepts, "France" and
>"War", coming from two thesauri. In a library, there will be an overlap
>between the sets of books indexed by the two concepts. Yet, I dare not
>imagine that there would be a "related" link between the two concepts, if
>they stood within one single thesaurus. If a searcher is interested in
>resources about "France", you will not generally try to point him to
>resources about War. In my opinion, this is a case where you would have
>an overlappingMatch but no relatedMatch.

The fact that there exists a set of resources which deal with the two
concepts together does not lead to the conclusion that there is any
relationship between the concepts. There is no inherent relationship
between "France" and "war", whether or not there exist some resources
which deal with the compound concept that may be labelled by a
coordination of these two simple concepts. It would be meaningless, and
misleading, to include any such relationship in a mapping between two
thesauri.

A valid "overlappingMatch" between concept A and concept B needs to be
based on the identification of some narrower concepts, some of which
fall within the scope of both A and B, some of which fall only within A
and some of which fall only within B. This criterion is satisfied in
your platypus example, where there is an overlapping match between
"mammals" and "egg-laying animals".
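
As a rough sketch of this criterion in Python, assuming the scope of
each concept can be written down as a set of narrower concepts (the
function name and the short lists below are illustrative only, and far
from complete):

```python
def overlapping_match(scope_a, scope_b):
    """True only if some narrower concepts fall within both scopes,
    some only within A, and some only within B."""
    shared = scope_a & scope_b
    only_a = scope_a - scope_b
    only_b = scope_b - scope_a
    return bool(shared) and bool(only_a) and bool(only_b)


# Hypothetical, deliberately incomplete sets of narrower concepts.
mammals = {"platypus", "echidna", "dog", "whale"}
egg_laying_animals = {"platypus", "echidna", "hen", "crocodile"}

print(overlapping_match(mammals, egg_laying_animals))

# One scope wholly contained in the other is a broader/narrower
# situation, not an overlap, so the test is not satisfied.
monotremes = {"platypus", "echidna"}
print(overlapping_match(monotremes, mammals))
```

The test looks only at the scope of the concepts themselves, not at
which resources happen to have been indexed with them.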

A "relatedMatch" is more subjective, but I suggest that an
inter-thesaurus relatedMatch should use the same criteria as for an
associative relationship within a single thesaurus. General guidelines
and many examples are given in BS8723-2:2005.  The relationship between
"egg-laying animals" and "eggs" falls within one of these guidelines, as
the definition of one of the concepts, "eggs", is necessary in defining
the other concept.

Leonard

-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------

Received on Tuesday, 4 December 2007 23:31:12 UTC