RE: Requirements for notes from Houghton,Andrew on 2004-05-12 (public-esw-thes@w3.org from May 2004)

From: Houghton,Andrew <houghtoa@oclc.org>
Date: Wed, 12 May 2004 10:02:01 -0400
To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-ID: <B56ABE145BEB0C40A265238FCAA420DF026F5354@oa2-server.oa.oclc.org>
> From: Ron Davies [mailto:ron@rondavies.be] 
> Sent: Wednesday, May 12, 2004 6:32 AM
> Subject: Re: Requirements for notes
>  
> I'm not sure that it's really necessary to try to specify a wide range of
> different possible kinds of notes. This seems to me to be a case where your idea
> of subclassing and inheritance is going to be really useful (the other is the
> different types of hierarchical relationship such as partitive, generic, etc.). I
> think it's important, as Andy suggests, to distinguish between public and private
> notes, since these may be handled in different ways, but for the rest, I'm not
> sure it matters a lot. A scope note, a history note, and some kind of internal
> note are the ones that I have seen most often, but others have a lot more
> experience with this.
>
> I'm still a little concerned about delivering all of these notes every time we
> access a Concept. I would say normally in a _thesaurus_ application at least
> there's a continuum of how often information gets used, something like this:

One reason for defining a little more than scope, history, and internal notes
is that it allows you to extract the notes that are relevant to your
audience.  For example, someone browsing the vocabulary is probably not
interested in application, citation, reference, or editorial notes.  They may
only need scope and public notes.  However, a librarian who is trying to apply
the vocabulary is probably interested in application, scope, and history
notes.

As I pointed out, Dewey may be far to one side with all its note types.  BTW,
after I sent my prior message I discovered that the Excel spreadsheet
containing the tags and descriptions had been filtered incorrectly, and there
were another twenty note types!  Ouch.  However, they didn't add anything new
to the discussion beyond what I presented.

In my prior message I showed that it was possible to boil down all those
note types to a very small set of distinct buckets that one could use to
build more detailed note types.  The buckets I suggested *probably* have
broader use across various thesauri, subject headings, classification
systems, etc.  Many vocabularies may only use a few note types, as you
suggested, but it would be nice to have the additional ones I suggested so
you can do finer-grained things for presentation or programmatic
manipulation.
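
To make the bucket idea concrete, here is a minimal sketch in Python of
detailed note types that inherit from a few broad buckets.  The type names
and the parent assignments are assumptions made up for illustration; they
are not the buckets from my prior message, a Dewey tag list, or anything
defined by SKOS.

```python
# Hypothetical note-type hierarchy: each type maps to its parent bucket.
# All names and parent assignments here are illustrative assumptions.
PARENT = {
    "note": None,
    "publicNote": "note",
    "privateNote": "note",
    "scopeNote": "publicNote",
    "historyNote": "publicNote",
    "applicationNote": "privateNote",
    "editorialNote": "privateNote",
}


def ancestors(note_type):
    """Return the chain from note_type up to the root bucket, inclusive."""
    chain = []
    current = note_type
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def is_a(note_type, bucket):
    """True if note_type is the bucket itself or one of its descendants."""
    return bucket in ancestors(note_type)
```

With something like this, a vocabulary that only distinguishes public from
private notes and one that uses many detailed types can both be queried at
the bucket level, e.g. is_a("scopeNote", "publicNote").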

> The Concept object combines the most often used information with the least often
> used information. Sometimes these notes can be very long (Andy is perhaps in a
> good position to confirm or deny my fears here). So the API may be delivering a
> lot of textual information with Concepts that will only be used a very small
> percentage of the time. Maybe in practice this overhead won't matter, but it's an
> issue perhaps to evaluate.

The length of notes varies from vocabulary to vocabulary.  I deal with a
number of vocabularies: DDC, LCC, LCSH, MeSH, FAST, GSAFD, ERIC, GEM-S,
Gates, Sears, etc.  Most notes are short in these vocabularies.  The
editorial guidelines for the vocabulary can play into the length of notes.
In Dewey, notes are generally concise, probably because there are so many of
them, and each note type has a specifically defined purpose.  However, when
an editor needs to discuss a topic in more detail, Dewey has a section called
the "Manual".

The "Manual" is where in-depth discussions about a class number, or a class
number in relation to other class numbers, go.  It's a separate section,
distinct from the actual classification notes.  The discussions in the
"Manual" are along the lines of why you would use this concept over these
other concepts, and the fine lines that the editors draw between concepts.
The "Manual" notes wouldn't be converted to SKOS because many times they
deal with relationships between concepts.  So if there are multiple concepts
discussed, then which one do you associate the note with?  I guess, if a
"Manual" note discussed three classes, you could put the same note in all
three classes, but I'm not sure that would be useful.  It borders on the
same principle that leads vocabularies to put things into tables, like
geographics: they can be used in multiple places, and it's far easier to
maintain them in one place.  The "Manual" also seems outside the scope of
what we are trying to do with SKOS and vocabularies.  However, if we needed
to incorporate it with SKOS, we would define our own RDF Resource and use
dc:relation in the skos:Concept record to relate the concept to the place
where it was discussed.
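
As a rough sketch of that last idea, the Python below builds plain
(subject, predicate, object) triples in which one "Manual" entry is a
resource of its own and each concept it discusses points to it with
dc:relation.  The example.org URIs and the manual_links helper are
hypothetical; only the SKOS, Dublin Core, and RDF namespace URIs are real.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI, used only for this illustration.
EX = "http://example.org/ddc/"


def manual_links(manual_entry, concepts):
    """Relate every discussed concept to one shared Manual entry."""
    triples = []
    for concept in concepts:
        triples.append((concept, RDF_TYPE, SKOS + "Concept"))
        triples.append((concept, DC + "relation", manual_entry))
    return triples


# One Manual discussion covering three classes is stored once and linked
# three times, instead of being copied into each class as a note.
example_triples = manual_links(
    EX + "manual/entry-1",
    [EX + "class/a", EX + "class/b", EX + "class/c"],
)
```

The discussion text lives in one place, which keeps the maintenance
advantage described above.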

Part of the point of having finer-grained notes is that it allows the API to
deliver only what you need and no more.  So if your audience only needs scope
and public notes, you should be able to specify that need to the API.  If you
had just scope, history, and editorial note types, then you could get back a
lot of stuff that isn't very useful to your audience.  That's because the
vocabulary maintainer was forced into putting everything into scope notes or
a subclass of scope notes.
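
A minimal sketch of what "specify your need to the API" could look like, in
Python.  The Note class, the select_notes function, and the AUDIENCE table
are assumptions invented for this example, not part of any proposed SKOS
API; the two audiences and their note types follow the browsing and
librarian examples earlier in this message.

```python
from dataclasses import dataclass


@dataclass
class Note:
    note_type: str
    text: str


# Note types per audience, following the examples in this message.
AUDIENCE = {
    "browser": ["scope", "public"],
    "librarian": ["application", "scope", "history"],
}


def select_notes(notes, wanted_types):
    """Return only the notes whose type the caller asked for, in order."""
    wanted = set(wanted_types)
    return [n for n in notes if n.note_type in wanted]


# Placeholder notes attached to one hypothetical concept.
concept_notes = [
    Note("scope", "placeholder scope note"),
    Note("editorial", "placeholder editorial note"),
    Note("public", "placeholder public note"),
    Note("application", "placeholder application note"),
    Note("history", "placeholder history note"),
]

for_browsing = select_notes(concept_notes, AUDIENCE["browser"])
for_cataloguing = select_notes(concept_notes, AUDIENCE["librarian"])
```

If everything had been lumped into scope notes, the filter would have
nothing to distinguish on, and the caller would get all of the text back.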


Andy.

Andrew Houghton, OCLC Online Computer Library Center, Inc.
http://www.oclc.org/about/
http://www.oclc.org/research/staff/houghton.htm
Received on Wednesday, 12 May 2004 10:04:01 UTC