Re: Example of coordination with DDC from Jakob Voss on 2006-08-03 (public-esw-thes@w3.org from August 2006)

From: Jakob Voss <jakob.voss@gbv.de>
Date: Thu, 03 Aug 2006 09:33:30 +0200
To: public-esw-thes@w3.org
Message-ID: <44D1A6CA.8090902@gbv.de>
Aida Slavic wrote:

> Jakob and Nabonita
> I forgot to mention that I was pleasantly surprised that you guys
> have more understanding about the need for supporting complex
> notation than most of the 'classification experts' I have met.

Thanks :-)

> To be more practical here I think the minimum would be the
> possibility to separate and search for parts of pre-composed numbers.
> I will use UDC examples to illustrate two typical situations as minimal
> requirements
> 
> case 1:
> notation 75"19"(410)(0.034.2)
> Painting--20th century--U.K.--digital document
> 
> main number--time(aux)--place(aux)--form(aux)
> 
> case 2:
> notation 37:005.962-057.117
> Education--Staff (management HR)--persons in casual employments
> 
> main number [relation] main number -- persons (aux)
> 
> Each part of these notations has the same meaning irrespective of
> its position in the expression and the type of combination.
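
A minimal sketch of how the two cases above could be split into typed
parts. The pattern below is only an assumption that covers the
auxiliaries appearing in these two examples (quotes for time,
parentheses for place and form, -05 for persons, colon for relation);
real UDC syntax has many more signs, and the names TOKEN and split_udc
are made up for illustration.

```python
import re

# Simplified tokenizer, assumed for illustration only: it knows just the
# notational devices that occur in the two example numbers.
TOKEN = re.compile(r'''
    (?P<time>"[^"]+")            # time auxiliary in quotes
  | (?P<form>\(0[^)]*\))         # form auxiliary, parentheses starting with 0
  | (?P<place>\([1-9][^)]*\))    # place auxiliary, parentheses starting with 1-9
  | (?P<persons>-05[\d.]*)       # persons auxiliary, introduced by -05
  | (?P<relation>:)              # relation sign between two main numbers
  | (?P<main>\d[\d.]*)           # main number
''', re.VERBOSE)


def split_udc(notation):
    """Split a pre-composed notation into (kind, text) pairs, left to right."""
    parts = []
    pos = 0
    while pos < len(notation):
        match = TOKEN.match(notation, pos)
        if match is None:
            raise ValueError(f"cannot parse {notation!r} at position {pos}")
        parts.append((match.lastgroup, match.group()))
        pos = match.end()
    return parts


print(split_udc('75"19"(410)(0.034.2)'))
# [('main', '75'), ('time', '"19"'), ('place', '(410)'), ('form', '(0.034.2)')]
print(split_udc('37:005.962-057.117'))
# [('main', '37'), ('relation', ':'), ('main', '005.962'), ('persons', '-057.117')]
```

Because each part keeps its kind, (410) can be found as a place no
matter where it stands in the expression.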

By the way, qualifiers (aka subheadings) are also a kind of coordination
that cannot be modeled in SKOS yet. I don't see fundamental structural
differences between

32:91
in UDC (Politics related to Geography)

and

health care reform/econ
in MeSH (see http://www.uab.edu/lister/meshsubs.htm)

There was a discussion about qualifiers in July 2005:
http://lists.w3.org/Archives/Public/public-esw-thes/2005Jul/0049.html

>> That's a problem. My colleague Ulrike Reiner is working on a way to
>> automatically split DDC numbers. After two years she has reached a
>> pretty good level and I think that this will be solved in about 1-2
>> years - but it's very complex indeed.
> 
> Well this sounds better than what Liu achieved in 1996. But I don't
> believe in this approach. This was successfully done for UDC in 1998 as a
> PhD project with very little wider application.

By whom? I only know Riesthuis's paper from 1996. We have
successfully reproduced Liu's results, but it was a lot of work
to get this far.

> I'd rather agree with Goedert (classification in general) and Steve
> Pollitt (with respect to DDC), Gopinath & Prasad (on CC) who suggested that in order
> to support IR (faceted interface in particular) classification should
> be properly coded for machine processing. This gives open hands in creating
> good faceted interfaces [see references at the end]

Thanks for the references.

> Editor in chief of Dewey J. Mitchell mentioned in one of her papers
> that Dewey considered this to be done in their database. I think they
> actually coded facets when re-designing the db in 2004.

But librarians create DDC numbers according to the rules and put them
into catalogues as a whole, so it is very complicated to split them afterwards.

>> Wow! So how are we going to express this in SKOS?
> 
> Yep. The problem is that one has to code the relational symbol while the
> sequence from left to right also matters.
> Anyway, this is the problem with coding the syntax of any pre-coordinated
> indexing language. My opinion is that this level of sophistication is very
> rarely needed in IR - I mentioned it only as a response to Nabonita's comment.

But Semantic Web prophets seem to think that we will syntactically index
documents with RDF statements ;-)

> A reasonable simplification I think SKOS should be concerned with is to split
> the precomposed number into its segments to allow for post-coordinate search.
> This is still better than nothing.

I fully agree.
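
To illustrate what post-coordinate search over segments would mean in
practice, here is a rough sketch. It assumes the numbers have already
been split somehow; the record ids and the function names build_index
and search are hypothetical, and this says nothing about how SKOS
itself would express the segments.

```python
from collections import defaultdict

# Hypothetical records: document id -> segments of its pre-composed number,
# assumed to have been split beforehand.
RECORDS = {
    "doc-a": ["75", '"19"', "(410)", "(0.034.2)"],
    "doc-b": ["37", "005.962", "-057.117"],
    "doc-c": ["75", "(410)"],
}


def build_index(records):
    """Map every segment to the set of documents whose number contains it."""
    index = defaultdict(set)
    for doc_id, segments in records.items():
        for segment in segments:
            index[segment].add(doc_id)
    return index


def search(index, *segments):
    """Return documents that carry all requested segments, in any order."""
    if not segments:
        return set()
    hits = [index.get(segment, set()) for segment in segments]
    return set.intersection(*hits)


index = build_index(RECORDS)
print(sorted(search(index, "75", "(410)")))   # ['doc-a', 'doc-c']
print(sorted(search(index, "(410)", '"19"'))) # ['doc-a']
print(sorted(search(index, "37", "(410)")))   # []
```

The order and the relational symbols are lost here, which is exactly
the simplification being proposed.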

Greetings,
Jakob
Received on Thursday, 3 August 2006 07:33:22 UTC