
Re: Differences between SKOS and ISO standards : transitivity

From: Simon Spero <ses@unc.edu>
Date: Fri, 13 Feb 2009 15:40:49 -0500
Message-ID: <1af06bde0902131240p222494daoa462e8ac610e34cf@mail.gmail.com>
To: Leonard Will <L.Will@willpowerinfo.co.uk>, Barbara Tillett <btil@loc.gov>, "Martha M. Yee" <marthamyee@sbcglobal.net>, "Allyson Carlyle (work)" <acarlyle@u.washington.edu>, Jane Greenberg <janeg@email.unc.edu>
Cc: "public-esw-thes@w3.org" <public-esw-thes@w3.org>, public-swd-wg@w3.org
On Fri, Feb 13, 2009 at 11:47 AM, Leonard Will
<L.Will@willpowerinfo.co.uk> wrote:

> I realise that I was not quite rigorous in what I said about transitivity in
> my message earlier today. To clarify, as I understand them to be used in
> the thesaurus community and in ISO standards:
>
> The generic relationship BTG is transitive
>
> The partitive relationship BTP is transitive
>
> The generalised relationship BT cannot be assumed to be transitive,
> because different occurrences of it may represent a mixture of the above
> types.
>

I don't believe that the last statement is correct. As Svenonius (2000,
p. 130) explains:

Subject language terms differ *referentially* from words used in ordinary
language. The former do not refer to objects in the real world or concepts
in a mentalistic world but to subjects. As a name of a subject, the term
*Butterflies* refers not to actual butterflies but rather to the set of all
indexed documents about butterflies. In a natural language the extension, or
extensional meaning, of a word is the class of entities denoted by that
word, such as the class consisting of all butterflies. In a subject language
the extension of a term is the class of all documents about what the term
denotes, such as all documents about butterflies.

The BT relationship is thus transitive, because it operates over the domain
of documents, not the domain of the items described by those documents.

This distinction is absolutely fundamental to understanding any formal
model of controlled vocabularies and subject analysis.

As Barbara Tillett states in her endorsement of both the 2000 printing and
the new 2009 paperback edition, "This book provides sound guidance to
future developers of search engines and retrieval systems. The work is
original, building on the foundations of information science and
librarianship of the past 150 years."

If the distinction is rejected, and the extensions of SKOS concepts are
butterflies, not documents, then SKOS is entirely redundant in the face of
OWL and its progeny. Otherwise, the logical consequences of making this
distinction are simple and direct.

a, b, c : Thing
A, B, C : Document, where A is a document about a, B is a document about b,
and C is a document about c

As Leonard says, isa and is_part_of are transitive:

a isa b, b isa c   |= a isa c
a is_part_of b, b is_part_of c  |= a is_part_of c

(The latter rule has been the subject of some disagreement, mostly between
Alan Cruse and himself. It is now generally accepted within lexical
semantics. See, e.g., Croft and Cruse (2004).)

These underlying relationships entail certain relationships in the domain of
Indexing systems.

a is_part_of b   |=  A BTP B  -- if every a is part of a b, then every
document about a is also about b, because an a is part of a b
A BTP B           |= a is_part_of b

Note that this implies that A BTP B, B BTP C |= A BTP C  -- the specific
underlying partitive relationship is preserved

a isa b      |= A BTG B
A BTG B  |= a isa b

Note that this implies that A BTG B, B BTG C |= A BTG C

A BTG B            |= A BT B
A BTP B             |= A BT B

(since BTG and BTP are subrelations of BT)

The inference rule that is in dispute is this one:

A BT B, B BT C  |= A BT C

This can be read as saying "If all documents about A are also about B, and all
documents about B are also about C, all documents about A are also about
C".
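The document-set reading of this rule can be illustrated with a minimal sketch. It assumes, following the Svenonius passage, that the extension of each term is a set of documents, and it treats "A BT B" as set inclusion. The document identifiers and the `bt` helper are hypothetical and exist only for illustration.

```python
# Toy extensions: each subject term maps to the set of documents about it.
# The document ids are made up for illustration.
extension = {
    "A": {"d1", "d2"},
    "B": {"d1", "d2", "d3"},
    "C": {"d1", "d2", "d3", "d4"},
}


def bt(narrower: str, broader: str) -> bool:
    """Read 'narrower BT broader' as: every document about the narrower
    term is also about the broader term (subset of extensions)."""
    return extension[narrower] <= extension[broader]


# Under this reading, transitivity of BT is transitivity of set inclusion.
print(bt("A", "B"), bt("B", "C"), bt("A", "C"))  # True True True
```

Under this assumption no choice of document sets can make the first two checks succeed and the third fail, since the subset relation is itself transitive.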

Note that this *does not* allow one to infer A BTG B (and thus a isa b)
from A BT B.

An example may make this clearer.

An S2000 Steering Wheel may be part of a Honda S2000, a Honda S2000 may be a
type of car and a Car may be a type of Vehicle.

These ontological relationships can be expressed as

S2000_Steering_Wheel is_part_of Honda_S2000
Honda_S2000 isa Car
Car isa Vehicle

From this we can infer the following relationships between sets of
documents.

(1) S2000 Steering Wheel BTP Honda S2000
(2) Honda S2000 BTG Car
(3) Car BTG Vehicle
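As a rough sketch of how (1)-(3) combine, the snippet below stores BTP and BTG as sets of (narrower, broader) pairs, forms BT as their union, and computes a transitive closure. The `transitive_closure` helper and the tuple representation are assumptions made for illustration; they are not part of SKOS or the ISO standards.

```python
def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    while True:
        # Compose the relation with itself: (a, b) and (b, d) give (a, d).
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


btp = {("S2000 Steering Wheel", "Honda S2000")}          # (1)
btg = {("Honda S2000", "Car"), ("Car", "Vehicle")}       # (2), (3)
bt = btp | btg  # BTG and BTP are subrelations of BT

bt_closed = transitive_closure(bt)
btg_closed = transitive_closure(btg)

# Mixed chains are derivable for BT ...
print(("S2000 Steering Wheel", "Vehicle") in bt_closed)  # True
# ... but nothing lets us conclude the steering wheel is a kind of car.
print(("S2000 Steering Wheel", "Car") in btg_closed)     # False
```

Keeping BTG and BTP as separate sets means the closure of BT never leaks back into the more specific relations.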

Using the standard framing, we can express these assertions in English as:

(A) Every document about a Honda S2000 Steering Wheel (S2KSW) is
necessarily also about Honda S2000s, by virtue of the S2KSW being part of
an S2000.

(B) Every document about a Honda S2000 is necessarily also about cars,
because an S2000 is a kind of car.

(C) Every document that is about cars is necessarily also about vehicles,
because cars are a kind of vehicle.

Now, let us assume that the transitivity property of BT does not hold.

(D) This requires that there may be a document *d* that is about S2KSWs but
is not about cars.

(E) By (A), we can infer that *d* is about Honda S2000s

(F) By (B,E), we can infer that *d* is about Cars

But (D,F) is a contradiction: by (D), *d* is not about cars, and by (F) it
is. RAA.
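The reductio can also be checked mechanically. The sketch below is a hypothetical illustration: it enumerates every way a single document *d* could be about, or not about, the three subjects, keeps the assignments consistent with (A) and (B), and looks for one that also satisfies (D).

```python
from itertools import product

# Each candidate assigns True/False to: d is about S2KSWs, S2000s, cars.
counterexamples = [
    (about_s2ksw, about_s2000, about_cars)
    for about_s2ksw, about_s2000, about_cars in product([False, True], repeat=3)
    if (not about_s2ksw or about_s2000)      # (A): about S2KSWs -> about S2000s
    and (not about_s2000 or about_cars)      # (B): about S2000s -> about cars
    and (about_s2ksw and not about_cars)     # (D): about S2KSWs, not about cars
]

print(counterexamples)  # [] : nothing satisfies (A), (B) and (D) together
```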

Simon

[Croft and Cruse(2004)]    William Croft and D. A. Cruse. Cognitive
Linguistics. Cambridge University Press, 2004. ISBN 0521667704,
9780521667708.

[Cruse(1986)]    D. A. Cruse. Lexical Semantics. Cambridge Textbooks in
Linguistics. Cambridge University Press, Cambridge; New York, 1986.
ISBN 052125678X; 0521276438.

[Svenonius(2000)]    Elaine Svenonius. The Intellectual Foundation of
Information Organization. MIT Press, Cambridge, Mass., 2000. ISBN 0262194333.
URL http://www.netlibrary.com/AccessProduct.aspx?ProductId=39954.




Received on Friday, 13 February 2009 20:41:33 GMT
