Re: Example of coordination with DDC

From: Jakob Voss <jakob.voss@gbv.de>
Date: Wed, 02 Aug 2006 19:27:28 +0200
Message-ID: <44D0E080.7040406@gbv.de>
To: public-esw-thes@w3.org

Aida Slavic wrote:

> If I remember correctly there was a similar discussion in 2004.

We should better use the Wiki, shouldn't we? ;-)

> My understanding was that the problem of structured classification 
> notation was to be ignored by SKOS at the time, and that complex
> notation ought to be treated as a simple text string.

Well, you have to start with the basics.

> Jakob Voss wrote:
> 
>> SKOS should be able to express DDC, UDC and CC - but it must stay 
>> simple! So what do you suggest to express CC's "U:(W)" in the next
>> SKOS?
> 
> I don't understand what does 'simple' mean in this case.

SKOS should be able to express the complexity of UDC and CC notations
in a simple way (without many new classes and properties, and without
needing an RDF inference engine to handle SKOS data).

> DDC is simple enumerative classification with largely non-expressive
>  notation which is used as text string to 'mark and park' books. UDC 
> and CC are analytico-synthetic classifications with fully expressive 
> structured notation the parts of which should be searchable using
> booleans.
> To code an expressive notation one needs: - way to encode facet
> indicators or separate parts of notation independent of notation
> itself

This can be solved together with expressing coordination.

> - way to encode relationship between parts of the complex notation

This is more complex.

> - the way to encode correct notation hierarchy independently from
> notation (this can be sorted out as BT/NT relationship or as
> hierarchy code)
> 
> If the first two are not possible in SKOS then you can not say that
> "SKOS expresses classification"  but rather "SKOS expresses
> enumerative classification"

Or "SKOS expresses simple hierarchies but not classification"

> For instance the examples Jakob gave for DDC (which is the only type
> of combination DDC has)
> 
> <551.22> <T2--551.22> <T1--59827>
> 
> Does not solve the problem of 32:37 (relationship between education
> and politics) from UDC where two main subjects are combined
> 
> I think that any generalisations based on DDC or LCC, which are
> enumerative systems for linear shelf ordering - may be wrong. This
> certainly made MARC classification format completely useless for
> classifications that are used in IR (UDC in particular).

And simple user interfaces for analytico-synthetic classifications are
also missing. I hope that SKOS will improve this situation a lot because
we can finally separate the data level from the visualization level.

> Also, there were some misinterpretations in Nabonita's mail that I
> would like to put straight
> 
>> library classification schemes. No doubt that DDC and UDC are most
>> popular schemes but they have some serious limitations. Just for
>> e.g. each subject requires a definite place in the array of
>> subjects. But if we study carefully the notational system of DDC
>> (UDC is based on DDC pattern) , we will find that 000-900 notations
>> have been assigned to the subjects randomly. But due to the fixed
>> notational systems, interpolation of newly emerging subjects
>> between existing subjects becomes a serious issue.
>
> - DDC and UDC have very different notational principles
> - CC (Colon Classification), DDC and UDC are very different when it
>   comes to the fundamental principles they're built on and how this
>   is expressed in notation: DDC lists compound concepts (and assigns
>   them a simple notational symbol) and it does not allow combination
>   of two subjects from the main schedule while the combination with
>   auxiliary schedules is limited.

It's limited but it's still coordination. You can also view the
auxiliary tables as special facets that can only be used under special
circumstances.

> Most importantly - DDC does not contain a consistent set
> of facet indicators in the notation i.e. its notation is not fully
> expressive. E.g. 551.220959827 does not show where one number ends
> and the other begins. More importantly "59827" from (T1--59827) does
> not have constant meaning i.e. its meaning changes depending on the
> number it is attached to.

That's a problem. My colleague Ulrike Reiner is working on a way to
automatically split DDC numbers. After two years she has reached a
pretty good level and I think that this will be solved in about 1-2
years - but it's very complex indeed.

> UDC and CC are fully analytico-synthetic
> classifications and have fully expressive notation. CC (being purely
> faceted) does not - and UDC (being partially faceted) avoids - the
> use of simple notation to express compound concepts. UDC & CC have
> consistent rules to combine notations from the main schedule with
> auxiliary schedules or any two or more subjects or their facets from
> the main schedules. In UDC it is literally possible to combine any
> two or more concepts (from main and common auxiliary schedules) no
> matter where in the schedules they appear because the notation has
> persistent meaning. E.g. (73) means always USA no matter to which
> number it is attached - (1/9) is facet indicator for concepts of 
> place. Two or more simple numbers from main schedules have to be
> connected with relationship symbols. 

That's right but it's not relevant to SKOS. The specific rules of a
single KOS for when and how concepts can be coordinated into complex
notations cannot be part of SKOS. But SKOS should be able to express how
complex notations were built.
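
To illustrate the "persistent meaning" point quoted above, here is a
minimal Python sketch that pulls place auxiliaries out of a UDC-style
string without knowing the main number. It assumes, as a simplification,
that a place auxiliary is a parenthesised group starting with a digit
1-9 and containing only digits and dots; the function name and the
sample strings are made up for illustration and are not part of UDC or
SKOS.

```python
import re
from typing import List

# Assumption: "(1/9)" marks place, so a group such as "(73)" starting
# with a digit 1-9 is treated as a place auxiliary. Groups starting
# with "0" or "=" are deliberately not matched.
PLACE = re.compile(r"\(([1-9][0-9.]*)\)")

def place_auxiliaries(notation: str) -> List[str]:
    """Return the place auxiliaries found in a UDC-style notation."""
    return PLACE.findall(notation)

# The same auxiliary is found whatever main number it is attached to.
print(place_auxiliaries("32(73)"))
print(place_auxiliaries("37(73)"))
```

Nothing comparable works for the "59827" in 551.220959827, because
neither its boundaries nor its meaning can be read from the string
alone.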

Nabonita writes:

>> So, the number of CC says that the subject deals with the influence
>> of political factors in a geographical area. Whereas in UDC the
>> nature of relationship between two subject components is not so
>> explicit.
> 
> In this example the principle of order is applied i.e. the treated
> subject is normally listed first and the subject of treatment second.
> In UDC 32:91 means the influence of geography on politics. This
> principle in indexing is known as "wall-picture principle". But there
> are other ways of saying this more precisely. It is possible to
> express the type of relationship between two UDC numbers in three
> ways:
> 
> a) in a limited way using four symbols and a consistent principle of
>    order: : (relation), :: (relation, fixed order), [] (subsumes),
>    / (extension)
> b) in a complex and detailed way using common auxiliaries of phase
>    relationships (-042) - it contains a dozen different relationships
>    and their subgroupings
> c) in a very sophisticated way by applying Perrault's symbols for
>    relationships (from Perrault's "Towards the theory of UDC")

Wow! So how are we going to express this in SKOS?
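
As a rough starting point for case (a) only, the following Python sketch
splits a coordinated notation into its components and the relation
symbols between them. It handles just ":" and "::"; square brackets,
"/", the (-042) phase auxiliaries and Perrault's symbols are left out.
The function name is hypothetical, and the sketch says nothing about
which SKOS properties would carry the result.

```python
import re
from typing import List, Tuple

# "::" is listed first so that it is not read as two single colons.
RELATION = re.compile(r"(::|:)")

def split_coordination(notation: str) -> Tuple[List[str], List[str]]:
    """Split e.g. "32:37" into components and relation symbols."""
    # With a capturing group, re.split returns the components and the
    # matched symbols alternately: component, symbol, component, ...
    parts = RELATION.split(notation)
    return parts[0::2], parts[1::2]

components, relations = split_coordination("32:37")
print(components, relations)
```

The order of the components is kept, which matters because the
principle of order carries meaning (32:91 is not the same as 91:32).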

Greetings,
Jakob

P.S: Your mail was difficult to read because of strange quoting (looks
like some Right-to-left environment). Can you please use the traditional
way to quote in email discussions?:

http://www.xs4all.nl/~hanb/documents/quotingguide.html
http://www.netmeister.org/news/learn2quote3.html#ss3.1
Received on Wednesday, 2 August 2006 17:27:08 GMT
