pre-coordination and hierarchical relationships

In message <GOEIKOOAMJONEFCANOKCGENBGPAA.bernard.vatant@mondeca.com> on
Tue, 11 Oct 2005, Bernard Vatant <bernard.vatant@mondeca.com> wrote

>Agreed, dmoz concepts such as
>Top: Society: Religion and Spirituality: Christianity: Music
>look like pre-coordinated in the URL of the category

>[1] http://dmoz.org/Society/Religion_and_Spirituality/Christianity/Music/

>But this pre-coordination also includes a broader-narrower hierarchy, since
>the above is a subcategory of

>[2] http://dmoz.org/Society/Religion_and_Spirituality/Christianity/

This is not a broader-narrower relationship between concepts in the
sense required by thesaurus standards. There is no inherent, _a priori_,
relationship between Christianity and music. Music is not a kind, part
or instance of Christianity.

BT/NT relationships are only valid between concepts belonging to the
same facet or "fundamental category", so

religions
NT Christianity

is valid, as these both belong to a category such as "systems of
belief", but

Christianity
NT music

is not.
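
As a rough illustration of the same-facet test, here is a minimal
sketch in Python. The facet name "arts" for music, the table FACET_OF
and the function name may_be_bt_nt are assumptions made for the
example, not part of any standard. Sharing a facet is only a necessary
condition: the narrower concept must still be a kind, part or instance
of the broader one.

```python
# Hypothetical facet assignments; only "systems of belief" comes from
# the message itself.
FACET_OF = {
    "religions": "systems of belief",
    "Christianity": "systems of belief",
    "music": "arts",  # assumed facet, for illustration only
}


def may_be_bt_nt(broader: str, narrower: str) -> bool:
    """Return True if the pair passes the same-facet test.

    This is a necessary condition only: the narrower concept must also
    be a kind, part or instance of the broader one.
    """
    return FACET_OF[broader] == FACET_OF[narrower]


print(may_be_bt_nt("religions", "Christianity"))  # True
print(may_be_bt_nt("Christianity", "music"))      # False
```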

In a classification scheme, or a scheme of pre-coordinated subject
headings such as DMOZ, concepts from different facets may be brought
together to provide a heading or location for compound subjects, but
this is not creating BT/NT relationships between them. Depending on the
citation order of facets which the scheme specifies, either string

Christianity : music

or

music : Christianity

may be valid.
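
To make the role of citation order concrete, the following sketch
builds a pre-coordinated string by sorting the component concepts
according to the facet order that a scheme specifies. The facet names
and the function build_string are assumptions for illustration only.

```python
# Facet of each concept; the facet names are assumptions for this example.
FACET_OF = {"Christianity": "systems of belief", "music": "arts"}


def build_string(concepts, citation_order):
    """Join concepts into a pre-coordinated string in the scheme's facet order."""
    ordered = sorted(concepts, key=lambda c: citation_order.index(FACET_OF[c]))
    return " : ".join(ordered)


subject = ["music", "Christianity"]
print(build_string(subject, ["systems of belief", "arts"]))  # Christianity : music
print(build_string(subject, ["arts", "systems of belief"]))  # music : Christianity
```

Neither result creates a BT/NT link between the two concepts; the
string only records that they have been combined.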

>But if you look at the display at [2], the label of [1] is "Music"
>
>> In each of the strings you quote, the word "music" labels the _same_
>> concept, which may be defined as some sort of rhythmic or melodic sound
>> (at least in my opinion!). The fact that that concept may be combined
>> with other concepts in an indexing string does not make it a different
>> concept.
>
>Hmm, not sure I agree with that. Look at
>[3] http://dmoz.org/Shopping/Music/

The concept labelled "music" here may differ from the concept given
that label in [1] and [2]. If you define music as a type of sound,
then that is a different concept from the physical objects that you
buy in a shop, as listed in [3]. The appropriate thesaurus
relationships would then be something like:

music
RT      sheet music
        musical recordings

printed documents
NT      sheet music

sound recordings
NT      musical recordings

Alternatively you could create the pre-coordinated strings

printed documents : music

sound recordings : music
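
The RT and NT entries above can be held as simple triples, from which
the reciprocal entries follow (BT is the reciprocal of NT, and RT is
symmetric). This is only a sketch; the names STATED, RECIPROCAL and
with_reciprocals are made up for the example.

```python
from collections import defaultdict

# (source term, relationship, target term), as listed in the message.
STATED = [
    ("music", "RT", "sheet music"),
    ("music", "RT", "musical recordings"),
    ("printed documents", "NT", "sheet music"),
    ("sound recordings", "NT", "musical recordings"),
]

RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT"}


def with_reciprocals(stated):
    """Build term -> relationship -> set of targets, adding reciprocal links."""
    entries = defaultdict(lambda: defaultdict(set))
    for source, rel, target in stated:
        entries[source][rel].add(target)
        entries[target][RECIPROCAL[rel]].add(source)
    return entries


entries = with_reciprocals(STATED)
print(sorted(entries["sheet music"]["BT"]))  # ['printed documents']
print(sorted(entries["sheet music"]["RT"]))  # ['music']
```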

>[1] and [3] don't seem to be related anyway. [3] declares a "See Also" link to:
>[4] http://dmoz.org/Arts/Music/
>... where you find a "link@" (which is to be interpreted as "narrower") to [3]
>but not to [2].
>
>Agreed, this is just an example of how sloppy dmoz organisation is, but in a
>SKOS expression of DMOZ "as is", would you declare Music in [1], [3] and
>[4] as the *same* concept?

There is a fundamental difficulty in trying to fit a sloppy structure
into a framework such as SKOS which has to be fairly rigid and
formalised if computers are to be able to handle it. But yes, I think
that the only useful approach is to define "music" independently of
context and then provide a mechanism for combining it with other
concepts. As far as I know, SKOS does not yet have a mechanism for
handling pre-coordinated strings or classification structures where
terms from more than one facet are brought together.
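
Outside SKOS, the idea can be sketched in a few lines of plain Python:
each concept is defined once, with its own scope note, and a heading
is just an ordered combination of concepts that already exist. The
class Concept and the scope notes for "Christianity" and "printed
documents" are hypothetical; this is not a proposal for SKOS syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept defined once, independently of any heading that uses it."""
    label: str
    scope_note: str


# Hypothetical scope notes; the one for "music" follows the definition
# quoted earlier in this message.
MUSIC = Concept("music", "Rhythmic or melodic sound.")
CHRISTIANITY = Concept("Christianity", "A system of belief.")
PRINTED_DOCUMENTS = Concept("printed documents", "Documents produced by printing.")

# A pre-coordinated heading is an ordered combination of existing concepts.
headings = [
    (CHRISTIANITY, MUSIC),
    (PRINTED_DOCUMENTS, MUSIC),
]

for heading in headings:
    print(" : ".join(concept.label for concept in heading))

# Both headings reuse the one MUSIC concept rather than defining a new one.
print(headings[0][1] is headings[1][1])  # True
```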

>And in this case, would you need this concept to be declared independently
>of any pre-coordination context?

Yes. You have to specify the shapes of building blocks before you can
start to build structures with them.

>Music as neither an expression of spirituality, nor a product, nor a form of
>art ... is declared nowhere in the vocabulary.

That's the problem. To build a useful and consistent structure of
concepts you have to define the meaning and scope of each concept
clearly. Sometimes the label used is unambiguous and widely understood
without further definition, but in many cases a scope note is needed.

>> In a thesaurus, a concept may have more than one broader term, and in a
>> classification scheme such as that of DMOZ a concept may appear in more
>> than one context, as you have shown, but it would create havoc if a
>> single undifferentiated label was used to stand for more than one
>> distinct concept.
>
>The problem here is, as said above, the "Music" concept is nowhere
>explicitly declared in dmoz. Not sure about it, when you use pre-
>coordination, is the usual practice to use elements declared as standalone
>concepts in the vocabulary?

Yes.

>More generally, would you consider all of 590 000+ (!) categories of dmoz
>(current count declared on the home page) as pre-coordinated terms, except
>the 15 Top ones such as "Arts" defined by http://dmoz.org/Arts/? Then you
>have the same kind of issues with most of them as for "Music". The pre-
>coordination elements are nowhere declared standalone (except the Top
>ones).

Most of them are pre-coordinated strings, though some of the steps are
valid BT/NT relationships. This is fine, as long as it is done
consistently. For example, in the DMOZ string

Shopping: Publications: Books: Arts: Music

the relationships

publications > books

and

arts > music

could be valid hierarchical steps, depending on the scope notes (e.g.
the latter one would require music to be defined as an art rather than
as a sound).
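
A minimal sketch of that analysis: split the string into adjacent
pairs and mark each step as either a BT/NT relationship or mere
pre-coordination. The set HIERARCHICAL is an assumption; in practice
each pair would have to be justified by the scope notes of the
concepts involved.

```python
# Pairs assumed to be valid BT/NT steps, subject to the scope notes
# discussed above. Anything else is treated as pre-coordination only.
HIERARCHICAL = {("publications", "books"), ("arts", "music")}


def classify_steps(heading: str):
    """Split a DMOZ-style string and label each adjacent pair of terms."""
    terms = [t.strip().lower() for t in heading.split(":")]
    steps = []
    for upper, lower in zip(terms, terms[1:]):
        kind = "BT/NT" if (upper, lower) in HIERARCHICAL else "pre-coordination"
        steps.append((upper, lower, kind))
    return steps


for upper, lower, kind in classify_steps("Shopping: Publications: Books: Arts: Music"):
    print(f"{upper} > {lower}: {kind}")
```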

It would be good if SKOS could be developed to handle pre-coordinated
strings or faceted classification schemes (the problems are the same),
but that would be quite a substantial extra development.

Leonard

-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------

Received on Tuesday, 11 October 2005 10:23:11 UTC