Fwd: [open-bibliography] Library of Congress subject headings & RDF

From: Ross Singer <rossfsinger@gmail.com>
Date: Fri, 7 Jan 2011 12:16:48 -0500
Message-ID: <AANLkTimTsitcs=1uzoLZqceTEcNKp2YmcfHg4Mq4APcP@mail.gmail.com>
To: SKOS <public-esw-thes@w3.org>
Hi all, forwarding a thread from the open-bibliography
(http://lists.okfn.org/mailman/listinfo/open-bibliography) list here.
It started with a question from Owen Stephens about a topic that's
come up here before (subdivisions, coordination, etc.).

I'm bringing it here because Owen's question prompted me to explain
some of the ideas I've been playing around with in this regard in
http://lcsubjects.org/ which might be of interest here, as well.

First, Owen's original post:

"Can anyone point me at (or advise me on) examples of representing
subject heading fields from a library catalogue record as RDF.
Specifically I'm interested in how chained sets of subject headings
are represented.

E.g. a library catalogue record might have a heading:

650$$aPopular Music$$xHistory$$y20th Century

Each one of these headings:

Popular Music
History
20th Century

will have a SKOS representation on id.loc.gov, but to represent each
heading separately as a dc:subject (or similar) would lose the context
of chaining them together.

There are some entries on id.loc.gov that represent some 'chains'
(those that have been 'authorised') - e.g.
http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
Music--History and Criticism' - but for me this doesn't feel quite
right - doesn't this lose some of the flexibility of the faceted
scheme?

I'm wondering about something similar to the way BIBO handles author
lists (you can both represent each author, and the list of authors,
including order)"

and then my reply:

---------- Forwarded message ----------
From: Ross Singer <ross.singer@talis.com>
Date: Fri, Jan 7, 2011 at 11:12 AM
Subject: Re: [open-bibliography] Library of Congress subject headings & RDF
To: List for Working Group on Open Bibliographic Data
<open-bibliography@lists.okfn.org>


Hi Owen,

I agree that the status quo at id.loc.gov is pretty unsatisfying (on
several levels, including this one), and this is one of the things that
I changed for lcsubjects.org in the last redesign (although it's
certainly not "fixed" or even remotely standard - it was intended
to get the conversation started in this direction).

Thankfully, though, your specific example works :)

http://lcsubjects.org/subjects/sh2008109787#concept

For subdivided subject headings like this, I've added a few
properties: lcsh:coordinates, lcsh:generalSubdivision,
lcsh:chronologicalSubdivision, lcsh:primaryConcept, etc.

The RDF out of lcsubjects.org is pretty brutally verbose, but directly
out of the Platform it looks like:
http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23concept&output=xml

and the coordinates resource is an rdf:Seq (to preserve order):

http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23coordinates&output=xml

This is still totally a work in progress (and incredibly incomplete),
but is intended to begin to provide the sort of semantics that you're
looking for (I think).  It also (I hope) begins to lay out a
foundation for how LCSH is actually intended to be used (which is a
set of building blocks).  So to take your original example,
"650$$aPopular Music$$xHistory$$y20th Century"

This could be created like:

<http://example.org/book/1>
   dcterms:subject <http://example.org/subjects/popular-music--history--20th-century#concept> .

<http://example.org/subjects/popular-music--history--20th-century#concept>
   lcsh:generalSubdivision <http://lcsubjects.org/subjects/sh99005024#concept> ;
   lcsh:chronologicalSubdivision <http://lcsubjects.org/subjects/sh2002012476#concept> ;
   lcsh:primaryConcept <http://lcsubjects.org/subjects/sh85088865#concept> ;
   a skos:Concept ;
   skos:prefLabel "Popular Music--History--20th Century" ;
   lcsh:coordinates <http://example.org/subjects/popular-music--history--20th-century#coordinates> .

<http://example.org/subjects/popular-music--history--20th-century#coordinates>
   a rdf:Seq ;
   rdf:_1 <http://lcsubjects.org/subjects/sh85088865#concept> ;
   rdf:_2 <http://lcsubjects.org/subjects/sh99005024#concept> ;
   rdf:_3 <http://lcsubjects.org/subjects/sh2002012476#concept> .

(the lcsubjects.org URIs could just as easily be id.loc.gov URIs -- it
was just easier to cut and paste from existing data).

With this, it's much easier to make our own uncontrolled subject
headings that are composites of several controlled headings.
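
As a rough illustration of that composition step, here is a minimal
Python sketch that turns a catalogue heading string into Turtle of the
shape shown above. It is only a sketch under assumptions: the CONCEPTS
lookup table, the function names and the example.org base URI are
hypothetical, the subfield-to-property mapping covers only the three
properties named in this message, and no @prefix lines are emitted
because the lcsh: namespace URI isn't given here.

```python
import re

# Hypothetical label -> URI index; the URIs are the ones from the example above.
CONCEPTS = {
    "Popular Music": "http://lcsubjects.org/subjects/sh85088865#concept",
    "History": "http://lcsubjects.org/subjects/sh99005024#concept",
    "20th Century": "http://lcsubjects.org/subjects/sh2002012476#concept",
}

# Subfield code -> property, limited to the properties named in this message.
PROPERTIES = {
    "a": "lcsh:primaryConcept",
    "x": "lcsh:generalSubdivision",
    "y": "lcsh:chronologicalSubdivision",
}


def parse_heading(field):
    """Split '650$$aFoo$$xBar' into [('a', 'Foo'), ('x', 'Bar')]."""
    parts = field.split("$$")[1:]  # drop the tag before the first '$$'
    return [(part[0], part[1:].strip()) for part in parts if part]


def slug(labels):
    """Build the local name, e.g. 'popular-music--history--20th-century'."""
    return "--".join(re.sub(r"\s+", "-", label.lower()) for label in labels)


def heading_to_turtle(field, base="http://example.org/subjects/"):
    """Emit a composite concept plus an rdf:Seq that preserves subfield order."""
    subfields = parse_heading(field)
    labels = [label for _, label in subfields]
    uri = base + slug(labels)
    pref_label = "--".join(labels)

    lines = [
        f"<{uri}#concept>",
        "   a skos:Concept ;",
        f'   skos:prefLabel "{pref_label}" ;',
    ]
    # A KeyError here means an unmapped subfield code or an unknown label.
    for code, label in subfields:
        lines.append(f"   {PROPERTIES[code]} <{CONCEPTS[label]}> ;")
    lines.append(f"   lcsh:coordinates <{uri}#coordinates> .")
    lines.append("")
    lines.append(f"<{uri}#coordinates>")
    lines.append("   a rdf:Seq ;")
    members = [
        f"   rdf:_{position} <{CONCEPTS[label]}>"
        for position, (_, label) in enumerate(subfields, start=1)
    ]
    lines.append(" ;\n".join(members) + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(heading_to_turtle("650$$aPopular Music$$xHistory$$y20th Century"))
```

Failing loudly on an unknown label is deliberate in this sketch: it is
the same situation as the missing or unauthorized terms described
below, and silently dropping them would hide the gap.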

As I said, lcsubjects.org is still pretty incomplete, mainly because a
lot of the underlying data is missing (namely the corporate names and
random chronological subdivisions; there are also subdivision terms
that don't appear to be derived from an authorized heading).
See: http://lcsubjects.org/subjects/sh2010007497 or
http://lcsubjects.org/subjects/sh85045754 as somewhat different
examples.

The first one has a URI for Austria, but that URI returns a 404 (I
built this from the Fred 2.0 data, so I have the NAF, I just haven't
figured out how to incorporate it into lcsubjects.org, yet).  The
second one shows an unauthorized chronological subdivision -- so,
currently, it just drops it.

Here's another example:  http://lcsubjects.org/subjects/sh85134593#concept

This should use http://lcsubjects.org/subjects/sh99005746#concept as
the general subdivision -- but that's an altLabel, so it's currently
failing (as you can see, this is fraught with frustrations!).

Another mind bender: http://lcsubjects.org/subjects/sh2010106574#concept

This one chokes, because "Polyglot" isn't an authorized term (instead
it should be using http://lcsubjects.org/subjects/sh85037700#concept
-- "Dictionaries, Polyglot") and was created after Fred 2.0 (3.5 years
after!), so I don't have access to the MARC authority record to
properly look things up (not that it would help me in this case,
anyway [1]).

So, to bring this on home: I think there are solutions (and linked
data solutions) to this, but LC is doing very little to enable
them.  If they'd provide the original MARC as a format for the concepts,
that would be a start -- but, honestly, without all of the data
available (including the NAF), this is going to be half-baked.

So, anyway, thanks for prompting me to write a bit about this :)
Probably worth forwarding to the SKOS list, as well.

-Ross.

[1] Here's the MARC record for Plastics--Dictionaries--Polyglot:
000     00476cz a2200169n 450
001     8244985
005     20100420002715.0
008     100413|| anannbabn |n ana
035     __ |a (DLC)464428
035     __ |a (DLC)sh2010106574
906     __ |t 8888 |u tc00 |v 0
010     __ |a sh2010106574
040     __ |a DLC |b eng |c DLC
150     __ |a Plastics |v Dictionaries |x Polyglot
667     __ |a Record generated for validation purposes.
670     __ |a Work cat.: Fachwörterbuch Kunststofftechnik, c1992
953     __ |a tc00

so there's still not an obvious way to know that one should be looking
for Dictionaries, Polyglot.
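
To make the problem concrete, here is a small Python sketch that
splits the 150 field above into subfields and checks each label
against a set of authorized labels. The AUTHORIZED set and the
function name are hypothetical stand-ins for a real label index;
treating "Plastics" and "Dictionaries" as authorized is an assumption
made only for this illustration.

```python
# Hypothetical set of authorized labels, for illustration only.
AUTHORIZED = {"Plastics", "Dictionaries", "Dictionaries, Polyglot"}


def parse_marc_field(line):
    """Parse a display line such as '150     __ |a Plastics |v Dictionaries'."""
    tag, _, rest = line.partition(" ")
    subfields = []
    # The text before the first '|' holds the indicators, so skip it.
    for chunk in rest.split("|")[1:]:
        code, _, value = chunk.partition(" ")
        subfields.append((code, value.strip()))
    return tag, subfields


if __name__ == "__main__":
    tag, subfields = parse_marc_field(
        "150     __ |a Plastics |v Dictionaries |x Polyglot"
    )
    unresolved = [label for _, label in subfields if label not in AUTHORIZED]
    print(tag, subfields)
    print("unresolved:", unresolved)
```

An exact-label lookup flags "Polyglot" as unresolved, and nothing in
the record itself points from that subfield to the inverted heading.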



On Fri, Jan 7, 2011 at 7:18 AM, Owen Stephens <owen@ostephens.com> wrote:
> Can anyone point me at (or advise me on) examples of representing subject
> heading fields from a library catalogue record as RDF. Specifically I'm
> interested in how chained sets of subject headings are represented.
> E.g. a library catalogue record might have a heading:
> 650$$aPopular Music$$xHistory$$y20th Century
> Each one of these headings:
> Popular Music
> History
> 20th Century
> will have a SKOS representation on id.loc.gov, but to represent each heading
> separately as a dc:subject (or similar) would lose the context of chaining
> them together.
> There are some entries on id.loc.gov that represent some 'chains' (those
> that have been 'authorised') -
> e.g. http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
> Music--History and Criticism' - but for me this doesn't feel quite right -
> doesn't this lose some of the flexibility of the faceted scheme?
> I'm wondering about something similar to the way BIBO handles author lists
> (you can both represent each author, and the list of authors, including
> order)
> Thanks,
> Owen
> --
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: owen@ostephens.com
>


-Ross.
Received on Friday, 7 January 2011 17:17:22 UTC
