- From: Jakob Voss <jakob.voss@gbv.de>
- Date: Thu, 03 Aug 2006 09:33:30 +0200
- To: public-esw-thes@w3.org
Aida Slavic wrote:

> Jakob and Nabonita
> I forgot to mention that I was pleasantly surprised that you guys
> have more understanding about the need for supporting complex
> notation than most of the 'classification experts' I have met.

Thanks :-)

> To be more practical here I think the minimum would be the
> possibility to separate and search for parts of pre-composed numbers.
> I will use UDC examples to illustrate two typical situations as minimal
> requirements
>
> case 1:
> notation 75"19"(410)(0.034.2)
> Painting--20th century--U.K.--digital document
>
> main number--time(aux)--place(aux)--form(aux)
>
> case 2:
> notation 37:005.962-057.117
> Education--Staff (management HR)--persons in casual employments
>
> main number [relation] main number -- persons (aux)
>
> Each part of these notations has the same meaning irrespective of
> its position in the expression and the type of combination.

By the way, qualifiers (aka subheadings) are also a kind of coordination that cannot be modeled yet. I don't see fundamental structural differences between 32:91 in UDC (Politics related to Geography) and health care reform/econ in MeSH (see http://www.uab.edu/lister/meshsubs.htm).

There was a discussion about qualifiers in July 2005:
http://lists.w3.org/Archives/Public/public-esw-thes/2005Jul/0049.html

>> That's a problem. My colleague Ulrike Reiner is working on a way to
>> automatically split DDC numbers. After two years she has reached a
>> pretty good level and I think that this will be solved in about 1-2
>> years - but it's very complex indeed.
>
> Well this sounds better than what Liu achieved in 1996. But I don't
> believe in this approach. This was successfully done for UDC in 1998 as a
> PhD project with very little wider application.

By whom? I only know the paper by Riesthuis from 1996. We have successfully reproduced Liu's results, but it was a lot of work to get this far.
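To make the minimal requirement concrete, splitting the two UDC cases quoted above could look roughly like the following Python sketch. It is only an illustration under simplifying assumptions: the function name `split_notation` and the segment labels are made up here, and only the symbol types that occur in the two examples are recognised (quoted time auxiliaries, parenthesised place and form auxiliaries, `-05` persons auxiliaries, and the `:` relation sign). Real UDC syntax has more connecting symbols than this, so it should not be read as a complete parser.

```python
import re

# Hypothetical tokenizer covering only the symbols used in the two examples.
TOKEN = re.compile(r'''
    "(?P<time>[^"]+)"              # time auxiliary, e.g. "19"
  | \((?P<form>0[^)]*)\)           # form auxiliary, e.g. (0.034.2)
  | \((?P<place>[1-9][^)]*)\)      # place auxiliary, e.g. (410)
  | (?P<persons>-05[\d.]*)         # persons auxiliary, e.g. -057.117
  | (?P<relation>:)                # relation sign between main numbers
  | (?P<main>\d[\d.]*)             # main number, e.g. 75 or 005.962
''', re.VERBOSE)


def split_notation(notation):
    """Split a pre-composed notation into (kind, value) segments."""
    parts = []
    pos = 0
    while pos < len(notation):
        match = TOKEN.match(notation, pos)
        if match is None:
            # Anything outside the assumed subset is reported, not guessed.
            raise ValueError(f"unrecognised symbol at position {pos}: {notation[pos:]!r}")
        kind = match.lastgroup
        parts.append((kind, match.group(kind)))
        pos = match.end()
    return parts


if __name__ == "__main__":
    for notation in ['75"19"(410)(0.034.2)', '37:005.962-057.117']:
        print(notation, "->", split_notation(notation))
```

Once a notation is split like this, each segment can be indexed on its own, so that a search for place (410) or for main number 37 finds the record regardless of where the segment sits in the full expression.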
> I'd rather agree with Goedert (classification in general) and Steve
> Pollitt (with respect to DDC), Gopinath & Prasad (on CC) who suggested that in order
> to support IR (faceted interface in particular) classification should
> be properly coded for machine processing. This gives a free hand in creating
> good faceted interfaces [see references at the end]

Thanks for the references.

> Editor in chief of Dewey J. Mitchell mentioned in one of her papers
> that Dewey considered this to be done in their database. I think they
> actually coded facets when re-designing the db in 2004.

But librarians create DDC numbers according to the rules and put them into catalogues as a whole, so it is very complicated to split them afterwards.

>> Wow! So how are we going to express this in SKOS?
>
> Yep. The problem is that one has to code the relational symbol while the
> sequence from left to right also matters.
> Anyway, this is the problem with coding the syntax of any pre-coordinated
> indexing language. My opinion is that this level of sophistication is very
> rarely needed in IR - I mentioned it only as a response to Nabonita's comment.

But Semantic Web prophets seem to think that we will syntactically index documents with RDF statements ;-)

> A reasonable simplification I think SKOS should be concerned with is to split
> the precomposed number into its segments to allow for post-coordinate search.
> This is still better than nothing.

I fully agree.

Greetings,
Jakob
Received on Thursday, 3 August 2006 07:33:22 UTC