Re: vocabs, metadata set, datasets from Karen Coyle on 2011-01-17 (public-xg-lld@w3.org from January 2011)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Mon, 17 Jan 2011 10:06:34 -0800
To: Thomas Baker <tbaker@tbaker.de>
Cc: Mark van Assem <mark@cs.vu.nl>, public-xg-lld@w3.org
Message-ID: <20110117100634.11822zcl138p55pm@kcoyle.net>
Quoting Thomas Baker <tbaker@tbaker.de>:


> Instead of:
>
>     Note that many standards, such as LCSH, might be seen as
>     belonging to several of the categories below. However, we
>     will refer in our report to each standard as belonging to
>     just one of the categories, based on their typical usage.
>
> ...which does not say how it is that they can belong to several
> categories, perhaps the text could put more emphasis on the
> "context in which it is used":
>
>     Note that many standards, such as the Library of Congress
>     Subject Headings [link], could be seen as falling under
>     several of the categories below depending on the context
>     in which they are used. In this report, we assign standards
>     to categories based on their "typical" usage.

This works for me.


> It works well enough for drawing an analogy, but I wouldn't
> want to paper over the problem, especially the bit about
> records being about "one entity (e.g. a book)" -- which is
> in my opinion simply wrong because a typical catalog record,
> for example, contains descriptive elements not just about a
> book, but its author, publisher, etc.

I think we've gone into the "focus" area again here. A library record  
is designed to provide a complete enough description of a book (or  
piece of music, or map, etc.) to fulfill two functions:

1) identification of a resource owned or licensed by a library
2) user access to that resource

Related things like authors, places of publication, etc. are all
included with the focus on the thing being cataloged, not as a way
to describe those related things. Authors are described in their own
records (name authority records), but only the identifier for the  
author record is included in the bibliographic record. (It just so  
happens that the identifier used today is a text string that looks a  
whole lot like a name.) Thus FRBR and FRAD as separate views of  
bibliographic data.
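
To make the split between the two record types concrete, here is a rough sketch, not a model of any real cataloging format. The class names, field names, and sample values are all made up for illustration; the only point is that the bibliographic record carries the author's heading string as an identifier, while the description of the author lives in a separate authority record.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class NameAuthorityRecord:
    # Hypothetical structure: the heading is a text string that doubles as the identifier.
    heading: str
    variant_names: Tuple[str, ...] = ()


@dataclass(frozen=True)
class BibliographicRecord:
    # Hypothetical structure: the focus is the thing being cataloged.
    title: str
    author_heading: str  # only the identifier, not a description of the author


# Invented sample data, keyed by the heading string.
authority_file: Dict[str, NameAuthorityRecord] = {
    "Doe, Jane, 1950-": NameAuthorityRecord("Doe, Jane, 1950-", ("Doe, J.",)),
}

bib = BibliographicRecord(title="An Example Book", author_heading="Doe, Jane, 1950-")


def describe_author(record: BibliographicRecord,
                    authorities: Dict[str, NameAuthorityRecord]) -> NameAuthorityRecord:
    """Follow the heading string from the bibliographic record to the authority record."""
    return authorities[record.author_heading]


print(describe_author(bib, authority_file).variant_names)
```

The bibliographic record on its own says nothing about the author beyond the heading; anything more requires the lookup.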

The *wholeness* of the record is important because it represents a
complete description of the thing. While individual statements may be
usable in other contexts, the library
function of bibliographic description will always require a particular  
set of statements (at a minimum). I believe we will continue to call  
this set of statements a "record" for a fairly long time.

I guess what this comes down to is how we are using the term  
"description." In library cataloging, a statement such as:

<Manifestation123> <was published in> <place789>

would not be considered a description, but *part* of a description of  
Manifestation123 because it does not in itself distinguish that  
Manifestation from another Manifestation published in that place. So  
in library "lingo" most individual statements would not be called  
"descriptions," they are just bits of data that must be combined with  
other bits to create a description that meets some level of  
functionality. Exactly which statements you need depends on the  
function you are performing, and there are many different functions  
possible (e.g. subject searching v. circulating a book v. ordering a  
new copy).
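
A small sketch of that point, under stated assumptions: the identifiers, predicate names, and values below are invented, and the subset check is just one simple way to model "enough statements to distinguish." A single statement matches more than one Manifestation, while a combination of statements narrows it to one.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Invented sample statements in subject / predicate / object form.
statements: List[Triple] = [
    ("Manifestation123", "was published in", "place789"),
    ("Manifestation123", "has title", "Title A"),
    ("Manifestation456", "was published in", "place789"),
    ("Manifestation456", "has title", "Title B"),
]


def matching_subjects(triples: List[Triple],
                      criteria: Set[Tuple[str, str]]) -> List[str]:
    """Return the subjects for which every (predicate, object) pair in criteria is asserted."""
    by_subject: Dict[str, Set[Tuple[str, str]]] = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, set()).add((predicate, obj))
    return sorted(s for s, pairs in by_subject.items() if criteria <= pairs)


# One statement alone does not distinguish the two Manifestations.
print(matching_subjects(statements, {("was published in", "place789")}))

# Combined with a second statement, the set identifies exactly one.
print(matching_subjects(statements, {("was published in", "place789"),
                                     ("has title", "Title A")}))
```

Which pairs go into `criteria` would depend on the function being performed; the sketch only shows the identification case.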

That said, I don't think this changes the wording you give:

> I'd suggest something like:
>
>     A dataset is a collection of structured metadata --
>     descriptions of things, such as books in a library.
>     Library records consist of statements about things,
>     where each statement consists of an element ("attribute"
>     or "relationship") of the entity, and a "value" for that
>     element.  Note that in the Linked Data context, Datasets do
>     not necessarily consist of clearly identifiable "records"
>     (see entry on Records).

kc

>
>> >-- I'm thinking that the Library Terminology page might
>> >    therefore include an entry on records, citing some of the
>>
>> <snip>
>>
>> that sounds like a useful idea.
>
> Gordon has some great presentations about calling the record
> paradigm into question, even "exploding the record" (or words
> to that effect).  One of those could perhaps provide a good
> starting point for an entry on records.
>
> Tom
>
>> >    key definitions of "record" used in library science.  That
>> >    entry could be the place where the notion that a record is
>> >    "basically a collection of statements about ... one entity"
>> >    is called into question (by pointing out that in practice, records
>> >    typically include some description about several entities).
>> >    It could also provide a place to discuss the notion that
>> >    descriptive metadata, in a Linked Data context, is primarily
>> >    about description at the statement level, which is indeed
>> >    what lends it so well to linking and recombination.  That
>> >    entry could acknowledge the role of records in traditional
>> >    library science of providing a context for the provenance of
>> >    metadata and perhaps flag this as a crucial issue for Linked
>> >    Data (and RDF generally).
>> >
>> >Tom
>> >
>> >[1] http://www.w3.org/2001/sw/wiki/Library_terminology_informally_explained#Vocabularies.2C_Element_sets.2C_Datasets
>> >[2] http://lists.w3.org/Archives/Public/public-lld/2010Dec/0023.html
>> >
>
> --
> Tom Baker <tbaker@tbaker.de>
>
>



-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Monday, 17 January 2011 18:07:13 UTC