RE: Compound concepts in a thesaurus structure

Andrew, 
It occurs to me also  that compound/complex concepts may not
be relevant for SKOS (assuming here that SKOS will serve as a
container/carrier while
every system will have its own  native database format created with a
purpose in mind 
(vocabulary management or IR). 
Subject vocabularies are exchanged as both original schedules and
subject authority data
(which Stella calls 'cataloguing data'). Both forms, however,  contain
compound/complex terms if
we talk about subject heading system and classifications.
One can take out from SKOS only what is put in ( i.e. rubbish in/rubbish
out).

What I mean is that synthesized concepts may be stored in the native
database in two ways

1) complex KOS concepts  as a simple text strings
e.g. 
classification

<TAG> 94(73)
[meaning: History - USA]

subject heading system
<TAG>cats -- USA 

If this is the case it is not up to SKOS to break the string down and
provide encoding for each element.
so as Andrew says the whole sting has to be treated as skos:prefLabel
within skos:Concept

2)  complex KOS concepts  as  structured/encoded headings
(each part of compound concept is separated by meaningful prefix: e.g.
let's say that  
TAG1 is the prefix for the  main heading and   TAG2 is a subheading
meaning Place)

<TAG1>94<TAG2>(73)
or 
<TAG1>cats<TAG2>USA

If native database holds this kind of data any stripping of TAG prefixes
will also mean the stripping
a functionality attached to it.  So  SKOS may as well  accommodate the
whole lot: prefix+concept. 
within skos:Concept
in which case the preferred term will not be '94(73)' but
'<tag1>94<tag2>(73)'
Textual notes explaining from which table each element comes (as this is
specified in classMARC (MARC21)
are simply an accompanying text and is irrelevant for the problem of
accessing/searching the complex headings 
themselves and will not affect the functionality.  When heading is
encoded this text becomes redundant anyway.

I don't know whether there is any problem with my understanding of 2) 

aida

-----Original Message-----
From: public-esw-thes-request@w3.org
[mailto:public-esw-thes-request@w3.org] On Behalf Of Houghton,Andrew
Sent: 12 May 2004 14:06
To: public-esw-thes@w3.org
Subject: RE: Compound concepts in a thesaurus structure



> From: Leonard Will [mailto:L.Will@willpowerinfo.co.uk]
> Sent: Wednesday, May 12, 2004 4:49 AM
> Subject: Re: Compound concepts in a thesaurus structure
> 
> In message <000501c437fb$fea39970$0402a8c0@DELL> on Wed, 12
> May 2004, Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk> wrote
> >
> >Maybe I am missing the point here, but we seem to have jumped from
> >talking about exchanging vocabulary data to the exchange of 
> catalogue
> >data. I thought SKOS was addressing the former, but not the
> latter. The
> >relationships in a thesaurus are supposed to be paradigmatic, not
> >syntagmatic. But a catalogue or index typically sets up syntagmatic 
> >relationships ( i.e. the sort to be found in the context of one 
> >particular document), which leads us into the difficulty outlined by 
> >Leonard.
> 
> I think that Andy is thinking of SKOS as maintaining a
> subject authority file, including all the simple and compound 
> concepts that are either enumerated in schedules or that have 
> been used in creating catalogue entries. This is very 
> different from a thesaurus, as you say, and adds a whole new 
> dimension of complexity. If this is to be done I think it 
> would be better to keep the thesaurus and the subject 
> authority file in separate databases.

Actually, this thread started with Leonard saying:

> From: Leonard Will [mailto:L.Will@willpowerinfo.co.uk]
> Sent: Tuesday, May 11, 2004 5:21 AM
>
> OK, I agree that it would be useful to have a mechanism for  encoding
>pre-coordinate classification schemes and subject  indexing strings, 
>and I do like the idea of treating them in  an integrated way that 
>works smoothly with the encoding of  thesaurus structures.  It will 
>mean a significant expansion  of the project, though. Is it currently
within its scope?

I disagreed with the assessment that it would *significantly* expand the
SKOS project.  To which Leonard replied:

> From: Leonard Will [mailto:L.Will@willpowerinfo.co.uk]
> Sent: Tuesday, May 11, 2004 10:43 AM
> Subject: Re: Supporting arrays of concepts
> 
> It seems to me to be a much more complex job for SKOS to try to create

> a system that would incorporate rules for creating these compound 
> strings.

I then pointed out that to encode LCSH in SKOS doesn't mean you have to
incorporate the rules for how LCSH, or any other vocabulary, builds
compound
concepts.  The "whole" compound string *is* the concept and there isn't
necessarily a BT/NT/RT relationship between the predefined part and what
was
composed.  The whole term should be taken as the concept and its
BT/NT/RT
relationship is to be taken in the context of all the other predefined,
e.g.
enumerated, or compound strings in the vocabulary.

The point here is that whether the vocabulary *enumerates* or allows
synthesis of new concepts the "whole" string *is* the concept and should
be
used as the skos:prefLabel within skos:Concept.  A thesaurus could have
constructed concepts just like DDC, LCSH, UDC, etc., if it has the
notion of
tables.  For example, the thesaurus allows you to add a geographic
component
from a table to the enumerated list of concepts.  While the thesaurus
*could* enumerate all the possibilities its probably more efficient to
specify a table and have appropriate notes say when the table should be
used.  This use of tables is no different from how DDC, LCSH, UDC, etc.
build new concepts.  Regardless, the constructed string is a new
concept,
e.g. Cats [enumerated] vs. Cats--United States [constructed] vs.
Cats--France [constructed].

This means there is *no* impact on SKOS.  Hence, my disagreement with
Leonard's original assessment that it would *expand* the scope of the
SKOS
project.  Hope this clears things up...


Andy.

Andrew Houghton, OCLC Online Computer Library Center, Inc.
http://www.oclc.org/about/
http://www.oclc.org/research/staff/houghton.htm

Received on Wednesday, 12 May 2004 11:42:48 UTC