- From: Aida Slavic <aida@acorweb.net>
- Date: Wed, 2 Aug 2006 17:03:53 +0100
- To: <public-esw-thes@w3.org>
Hi,

If I remember correctly, there was a similar discussion in 2004. My understanding was that SKOS was to ignore the problem of structured classification notation at the time, and that complex notation ought to be treated as a simple text string.

Jakob Voss wrote:
>SKOS should be able to express DDC, UDC and CC - but it must stay
>simple! So what do you suggest to express CC's "U:(W)" in the next SKOS?

I don't understand what 'simple' means in this case. DDC is a simple enumerative classification with largely non-expressive notation, which is used as a text string to 'mark and park' books. UDC and CC are analytico-synthetic classifications with fully expressive structured notation, the parts of which should be searchable using Boolean operators.

To code an expressive notation one needs:
- a way to encode facet indicators or separate parts of the notation independently of the notation itself
- a way to encode the relationship between parts of the complex notation
- a way to encode the correct notation hierarchy independently of the notation (this can be sorted out as a BT/NT relationship or as a hierarchy code)

If the first two are not possible in SKOS, then you cannot say that "SKOS expresses classification" but rather "SKOS expresses enumerative classification".

For instance, the examples Jakob gave for DDC (which is the only type of combination DDC has)

<551.22> <T2--551.22> <T1--59827>

do not solve the problem of 32:37 (relationship between education and politics) from UDC, where two main subjects are combined.

I think that any generalisations based on DDC or LCC, which are enumerative systems for linear shelf ordering, may be wrong. This certainly made the MARC classification format completely useless for classifications that are used in IR (UDC in particular).

Also, there were some misinterpretations in Nabonita's mail that I would like to put straight:

>library classification schemes. No doubt that DDC and UDC are most popular schemes but
>they have some serious limitations. Just for e.g. each subject requires a definite place in the
>array of subjects. But if we study carefully the notational system of DDC (UDC
>is based on DDC pattern) , we will find that 000-900 notations have been assigned to
>the subjects randomly. But due to the fixed notational systems, interpolation of
>newly emerging subjects between existing subjects becomes a serious issue.

- DDC and UDC have very different notational principles.
- CC (Colon Classification), DDC and UDC are very different when it comes to the fundamental principles they're built on and how this is expressed in notation.

DDC lists compound concepts and assigns them a simple notational symbol. It does not allow the combination of two subjects from the main schedule, while the combination with auxiliary schedules is limited. Most importantly, DDC does not contain a consistent set of facet indicators in the notation, i.e. its notation is not fully expressive. E.g. 551.220959827 does not show where one number ends and the next begins. More importantly, "59827" (from T1--59827) does not have a constant meaning, i.e. its meaning changes depending on the number it is attached to.

UDC and CC are fully analytico-synthetic classifications and have fully expressive notation. CC (being purely faceted) does not use, and UDC (being partially faceted) avoids using, simple notation to express compound concepts. UDC and CC have consistent rules for combining notations from the main schedule with auxiliary schedules, or any two or more subjects or their facets from the main schedules.

In UDC it is literally possible to combine any two or more concepts (from main and common auxiliary schedules) no matter where in the schedules they appear, because the notation has a persistent meaning. E.g. (73) always means USA no matter which number it is attached to; (1/9) is the facet indicator for concepts of place. Two or more simple numbers from main schedules have to be connected with relationship symbols.
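
To illustrate what "parts of the notation should be searchable" could mean in practice, here is a minimal Python sketch. It assumes only the notation features mentioned in this message: main numbers, place auxiliaries in parentheses such as (73), and the ":" and "::" relation symbols. The names Token, tokenize and parts are hypothetical, and real UDC syntax is far richer than this.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    """Hypothetical token type for this sketch; not part of SKOS or any UDC tool."""
    kind: str  # "number", "place" or "relation"
    text: str


# Assumption: a parenthesised number starting with 1-9 is a place auxiliary,
# following the (1/9) facet indicator mentioned in the message.
# "::" is listed before ":" so the longer symbol is matched first.
_TOKEN_RE = re.compile(
    r"(?P<place>\([1-9][0-9.]*\))"
    r"|(?P<relation>::|:)"
    r"|(?P<number>[0-9][0-9.]*)"
)


def tokenize(notation: str) -> List[Token]:
    """Split a notation string into numbers, place auxiliaries and relation symbols."""
    tokens = []
    pos = 0
    while pos < len(notation):
        match = _TOKEN_RE.match(notation, pos)
        if match is None:
            # Anything outside the three assumed token kinds is rejected.
            raise ValueError(f"unsupported notation at position {pos}: {notation[pos:]!r}")
        tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parts(notation: str, kind: str) -> List[str]:
    """Return the text of every token of the given kind, e.g. for Boolean search."""
    return [token.text for token in tokenize(notation) if token.kind == kind]
```

With this, 32:37 yields the main numbers 32 and 37 joined by ":", so either subject can be searched on its own. A DDC number such as 551.220959827 comes out as one undivided token, because nothing in the string marks where one component ends and the next begins.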

Nabonita writes:
>So, the number of CC says that the subject deals with the influence of political factors in a
>geographical area. Where as in UDC the nature of relationship between two subject components
>is not so explicit.

In this example the principle of order is applied, i.e. the treated subject is normally listed first and the subject of treatment second. In UDC, 32:91 means the influence of geography on politics. This principle in indexing is known as the "wall-picture principle". But there are other ways of saying this more precisely.

It is possible to express the type of relationship between two UDC numbers in three ways:
a) in a limited way, using four symbols and a consistent principle of order: : (relation), :: (relation, fixed order), [] (subsumes), / (extension)
b) in a complex and detailed way, using the common auxiliaries of phase relationships (-042), which contain a dozen different relationships and their subgroupings
c) in a very sophisticated way, by applying Perrault's symbols for relationships (from Perrault's "Towards the theory of UDC")

Aida
Received on Wednesday, 2 August 2006 16:03:15 UTC