Re: Non- and Partial-FRBR Metadata from Ross Singer on 2010-09-28 (public-lld@w3.org from September 2010)

From: Ross Singer <ross.singer@talis.com>
Date: Mon, 27 Sep 2010 23:35:37 -0400
To: Karen Coyle <kcoyle@kcoyle.net>, public-lld <public-lld@w3.org>
Message-ID: <AANLkTinskOFNvnmgGh6+C-uJmyHBJ2Z7jT7xMG31OxWF@mail.gmail.com>

On Tue, Sep 21, 2010 at 11:36 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:

> Let me go back and re-describe my situation (which is a common one,
> AFAIK), which may have gotten lost in the lengthy discussion.
>
> As with most bibliographic databases today, the Open Library does not
> reflect the FRBR separation of bibliographic data into WEMI. Although
> some information has been pulled out of the data into a Work "record,"
> what remains as the primary bibliographic entry is *not* a FRBR
> Manifestation; it is similar to the bibliographic record created by
> current library cataloging, or that is found as a purchasable entry in
> Amazon. It contains some elements from each of the FRBR Group 1
> entities, combined into a single unit with a single identifier. I
> cannot code this bibliographic mixture as a frbr:Manifestation because
> it does not meet the definition of that entity, and I think that
> miscoding of data will cause great confusion when we try to combine
> data from different sources. I would rather have a defined entity that
> accurately reflects my data.

Karen,

Unfortunately, I'm *still* not 100% certain I understand the problem
(I mean, I think I discern two possible problems, and I'm not sure
either is what you're concerned with :)), but I'll take a stab at it,
and, hopefully, your correction of my misinterpretation can help set
me straight.

I *think* you're either saying:
1) You cannot create a WEMI-like graph structure because the source
data you would be modeling from contains elements from various parts
of the W-E-M chain and therefore your resource isn't any specific one
of those (that is, you have multiple, discrete resources in your
'record')
 - or -
2) You cannot create a WEMI-like graph because the source data you
would be modeling from contains elements from various parts of the
W-E-M chain and it's unpredictable and unparsable from the source data
to ascertain what goes where.

They're similar (hence my uncertainty about the problem).

If it's #1, it seems the problem would most easily be solved with hash URIs:
Manifestation: http://openlibrary.org/books/OL3420800M#M (title
statement, ISBN, publisher name/location, date published, by
statement, series? # don't know, I'm a little over my head here)
Expression: http://openlibrary.org/books/OL3420800M#E (language)
Work: http://openlibrary.org/works/OL3922125W (uniform title, creator, subjects)
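
A minimal sketch of the hash-URI layout above, written as plain Python
that emits N-Triples. The two Open Library URIs come from the example;
everything else is an assumption made for illustration: the choice of
the FRBR Core vocabulary (http://purl.org/vocab/frbr/core#) and its
embodimentOf/realizationOf properties to link the pieces, and the
helper names wem_triples and to_ntriples, which are hypothetical.

```python
# Sketch only: split one bibliographic record into M, E and W resources
# using hash URIs, and link them with FRBR Core properties (an assumed
# vocabulary choice, not something the source data prescribes).

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FRBR = "http://purl.org/vocab/frbr/core#"

# URIs taken from the example above.
RECORD = "http://openlibrary.org/books/OL3420800M"
WORK = "http://openlibrary.org/works/OL3922125W"


def wem_triples(record_uri, work_uri):
    """Return (subject, predicate, object) URI triples for one record."""
    manifestation = record_uri + "#M"  # title statement, ISBN, publisher...
    expression = record_uri + "#E"     # language
    return [
        (manifestation, RDF_TYPE, FRBR + "Manifestation"),
        (expression, RDF_TYPE, FRBR + "Expression"),
        (work_uri, RDF_TYPE, FRBR + "Work"),
        (manifestation, FRBR + "embodimentOf", expression),
        (expression, FRBR + "realizationOf", work_uri),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in triples)


if __name__ == "__main__":
    print(to_ntriples(wem_triples(RECORD, WORK)))
```

The point of the hash URIs is that the Manifestation and Expression
get their own identifiers without Open Library having to mint new
top-level records; literal properties (title, ISBN, language) would
hang off whichever of the three resources they describe.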

If it's the latter scenario, I would mostly punt on a formal FRBR
model (except Work, which is already predefined) and model it in Bibo
with a dcterms:isVersionOf/hasVersion to help find equivalent
manifressions (or expresstations?) in the collection.
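
A rough sketch of that second approach, again in plain Python. It
types the mixed record as a bibo:Book (an assumption; Bibo has other
classes that might fit better), ties it to the Work with
dcterms:isVersionOf/hasVersion, and then finds sibling records that
share a Work. The helper names and the example.org URI for a second
record are hypothetical placeholders.

```python
# Sketch only: leave the record un-FRBRized, type it with Bibo, and use
# dcterms:isVersionOf / hasVersion to connect records sharing a Work.
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"

RECORD = "http://openlibrary.org/books/OL3420800M"
WORK = "http://openlibrary.org/works/OL3922125W"
OTHER = "http://example.org/books/other-edition"  # hypothetical second record


def record_triples(record_uri, work_uri):
    """Describe one mixed W-E-M record without splitting it up."""
    return [
        (record_uri, RDF_TYPE, BIBO + "Book"),  # assumed class choice
        (record_uri, DCTERMS + "isVersionOf", work_uri),
        (work_uri, DCTERMS + "hasVersion", record_uri),
    ]


def siblings(triples, record_uri):
    """Return the other records that are versions of the same Work(s)."""
    versions = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == DCTERMS + "isVersionOf":
            versions[obj].add(subject)
    found = set()
    for members in versions.values():
        if record_uri in members:
            found |= members - {record_uri}
    return sorted(found)


if __name__ == "__main__":
    graph = record_triples(RECORD, WORK) + record_triples(OTHER, WORK)
    print(siblings(graph, RECORD))
```

This only says "these records hang off the same Work"; it makes no
claim about whether two records are the same Expression or
Manifestation, which is exactly the part the source data can't answer.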

This sort of model will probably have to be pretty prevalent for
legacy data.  It certainly calls into question, though, what an LCCN
or OCLC number is identifying.  I had, until this thread, basically
considered it an identifier for a manifestation, but now I'm starting
to think what it identifies might actually be the record (this
ambiguous WEM jumble that you've pointed out) and just the record.  I
mean, that is what it identifies in MARC; I just feel this sort of
thing is still sort of up for grabs in RDF since the intention (I
think!) is not to carry the literal notion of the record over into
linked data (that is, it's not MARC-RDF), but, instead, its meaning.

Definitely something we'd need to figure out given the scarcity of
common identifiers and how important they'll be in linking data
together.

-Ross.
Received on Tuesday, 28 September 2010 04:02:23 UTC