
Example of coordination with DDC

From: Aida Slavic <aida@acorweb.net>
Date: Wed, 2 Aug 2006 17:03:53 +0100
To: <public-esw-thes@w3.org>
Message-ID: <GDELJIGAINGFMJPDGLFMMELCCPAA.aida@acorweb.net>

Hi,

If I remember correctly there was a similar discussion in 2004. 
My understanding was that the problem of structured classification
notation was to be ignored by SKOS at the time, and that complex notation 
ought to be treated as a simple text string. 

Jakob Voss wrote:

>SKOS should be able to express DDC, UDC and CC - but it must stay
>simple! So what do you suggest to express CC's "U:(W)" in the next SKOS?

I don't understand what 'simple' means in this case. DDC is a simple
enumerative classification with largely non-expressive notation, which is
used as a text string to 'mark and park' books. UDC and CC are
analytico-synthetic classifications with fully expressive structured
notation, the parts of which should be searchable using Booleans.
To code an expressive notation one needs:
- a way to encode facet indicators, or to separate the parts of a notation, independently of
the notation itself
- a way to encode the relationship between the parts of a complex notation
- a way to encode the correct notation hierarchy independently of the notation (this
can be sorted out as a BT/NT relationship or as a hierarchy code)
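
As a rough illustration of these three requirements, a complex notation could be held as structured parts rather than as one text string. This is only a sketch: the class and field names are hypothetical and are not part of SKOS or of any classification standard, and the UDC number 32:37 (politics in relation to education) is used purely as sample data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NotationPart:
    """One component of a complex notation (hypothetical structure)."""
    notation: str                    # e.g. "32" or "37"
    facet: str = "main"              # facet label kept apart from the notation string
    broader: Optional[str] = None    # hierarchy stored explicitly (BT), not derived from digits


@dataclass
class ComplexNotation:
    """A synthesized class number kept as parts plus the relationship between them."""
    parts: List[NotationPart] = field(default_factory=list)
    relation: Optional[str] = None   # symbol linking the parts, e.g. ":"

    def as_string(self) -> str:
        # Rebuild the display string from the parts and the relation symbol.
        sep = self.relation or ""
        return sep.join(p.notation for p in self.parts)

    def contains(self, notation: str) -> bool:
        # Search by component, which a plain text string cannot do reliably.
        return any(p.notation == notation for p in self.parts)


politics_and_education = ComplexNotation(
    parts=[NotationPart("32", broader="3"), NotationPart("37", broader="3")],
    relation=":",
)
print(politics_and_education.as_string())
print(politics_and_education.contains("37"))
```

Keeping the parts separate is what would let a search for 37 find this record without relying on substring matching against the whole string.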

If the first two are not possible in SKOS, then you cannot say that "SKOS expresses
classification" but rather "SKOS expresses enumerative classification".
For instance, the example Jakob gave for DDC (which is the only type of combination
DDC has):

<551.22>
    <T2--551.22>
    <T1--59827>

does not solve the problem of 32:37 (the relationship between education and politics) from UDC,
where two main subjects are combined.

I think that any generalisations based on DDC or LCC, which are enumerative systems for
linear shelf ordering, may be wrong. This certainly made the MARC
classification format completely useless for classifications that are used
in IR (UDC in particular).

Also, there were some misinterpretations in Nabonita's mail that I would like to put
straight:

<library classification schemes. No doubt that DDC and UDC are most popular schemes but 
<they have some serious limitations. Just for e.g. each subject requires a definite place 
<in the array of subjects. But if we study carefully the notational system of DDC (UDC 
<is based on DDC pattern), we will find that 000-900 notations have been assigned to 
<the subjects randomly. But due to the fixed notational systems, interpolation of 
<newly emerging subjects between existing subjects becomes a serious issue.

	-DDC and UDC have very different notational principles
	-CC (Colon Classification), DDC and UDC are very different when it comes to the 
	fundamental principles they're built on and how these are expressed in notation: 
		DDC lists compound concepts and assigns them a simple notational symbol, and 
		it does not allow the combination of two subjects from the main schedule, while
		combination with auxiliary schedules is limited.
		Most importantly - DDC does not contain a consistent set of facet indicators 
		in the notation, i.e. its notation is not fully expressive. E.g. 551.220959827
		does not show where one number ends and the next begins. Moreover,
		"59827" from (T1--59827) does not have a constant meaning, i.e. its meaning changes 
		depending on the number it is attached to.
		
		UDC and CC are fully analytico-synthetic classifications and have fully 
		expressive notation. CC (being purely faceted) does not use - and UDC (being 
		partially faceted) avoids using - simple notation to express compound concepts. 
		UDC & CC have consistent rules for combining notations from the main schedule 
		with auxiliary schedules, or any two or more subjects or their facets from 
		the main schedules. In UDC it is literally possible to combine any two or more 
		concepts (from main and common auxiliary schedules) no matter where in the schedules 
		they appear, because the notation has persistent meaning. E.g. (73) always means USA 
		no matter which number it is attached to - (1/9) is the facet indicator for concepts of
		place. Two or more simple numbers from the main schedules have to be connected with 
		relationship symbols.
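
To make the contrast concrete, here is a minimal sketch of how a facet indicator allows a part of the notation to be picked out mechanically. It assumes, as a simplification, that a place auxiliary is a parenthesised group starting with a digit 1-9 and containing only digits and dots; the function name and the sample numbers 37(73) and 32:37(73) are made up for illustration.

```python
import re
from typing import List

# Assumption (simplified): a place auxiliary is "(" + a number starting with 1-9 + ")".
# Real UDC place auxiliaries can be more elaborate; this only covers the plain case.
PLACE = re.compile(r"\(([1-9][0-9.]*)\)")


def place_parts(notation: str) -> List[str]:
    """Return the place auxiliaries found in a notation string."""
    return PLACE.findall(notation)


print(place_parts("37(73)"))          # UDC-style: the brackets mark the place facet
print(place_parts("32:37(73)"))       # still found when attached to a combined number
print(place_parts("551.220959827"))   # DDC-style: no indicator, nothing to find
```

With the DDC string there is no marker to search for, so the place component cannot be isolated without outside knowledge of how the number was built.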
		
Nabonita writes:
>So, the number of CC says that the subject deals with the influence of political factors in a 
>geographical area. Where as in UDC the nature of relationship between two subject components 
>is not so explicit.

	In this example the principle of order is applied, i.e. the treated subject is 
	normally listed first and the subject of treatment second. In UDC 32:91 means the 
	influence of geography on politics. This principle in indexing is known as the 
	"wall-picture principle". But there are other ways of saying this more precisely.
	
	It is possible to express the type of relationship between two UDC numbers 
	in three ways:

	a) in a limited way, using four symbols and a consistent principle of order: : (relation), 
	:: (relation, fixed order), [] (subsumes), / (extension)
	b) in a complex and detailed way, using the common auxiliaries of phase 
	relationships (-042) - these contain a dozen different relationships and their 
	subgroupings
	c) in a very sophisticated way, by applying Perrault's symbols for relationships 
	(from Perrault's "Towards the theory of UDC")
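
As a sketch of option (a) only, a combined number can be split into its simple numbers and the connecting symbols, so that each part and the relationship between them can be stored separately. The function name and the labels are hypothetical, the number 32::91 is a made-up input, and the sketch ignores the fact that "/" can also occur inside auxiliaries.

```python
import re
from typing import List, Tuple

# Labels follow the wording of option (a); they are illustrative, not official.
SYMBOLS = {
    "::": "relation, fixed order",
    ":": "relation",
    "/": "extension",
    "[": "subsumes (open)",
    "]": "subsumes (close)",
}

# "::" is listed before ":" so the longer symbol is matched first.
TOKEN = re.compile(r"::|:|/|\[|\]|[^:/\[\]]+")


def tokenize(notation: str) -> List[Tuple[str, str]]:
    """Split a combined number into (token, label) pairs."""
    return [(tok, SYMBOLS.get(tok, "number")) for tok in TOKEN.findall(notation)]


print(tokenize("32:91"))
print(tokenize("32::91"))
```

Options (b) and (c) carry the relationship type inside the notation itself, so they would need more than this kind of symbol splitting.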


Aida
Received on Wednesday, 2 August 2006 16:03:15 GMT
