
Re: [open-bibliography] Library of Congress subject headings & RDF

From: Ross Singer <rossfsinger@gmail.com>
Date: Fri, 7 Jan 2011 14:48:27 -0500
Message-ID: <AANLkTinzCe=Pb29J0Wbs6T=QoTaMCMK74rgNPq059ge_@mail.gmail.com>
To: Thad Guidry <thadguidry@gmail.com>
Cc: SKOS <public-esw-thes@w3.org>
Good eye, Thad - you found the heading that I was looking for (and
unable to unearth for my example)!

There are several of these -- I'm counting over 2200 (although a lot
of these are somewhat inexplicable duplications of terms).

Some of them have 4 resources!

Here is "Television adaptations":

http://api.talis.com/stores/lcsh-info/services/sparql?query=PREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0Aselect+%3Fs+%3Fl+%3Fis+%0D%0Awhere+{%0D%0A%3Fs+skos%3AprefLabel+%22Television+adaptations%22%3B%0D%0A+skos%3AinScheme+%3Fis+.%0D%0A%0D%0AFILTER+%28%3Fis+!%3D+%3Chttp%3A%2F%2Flcsubjects.org%2Fschemes%2Fauthorities%3E%29%0D%0A%0D%0A}%0D%0ALIMIT+20

Topical term, Genre/Form term, general subdivision and form subdivision.
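
The query URL above is hard to read in its encoded form. As a rough
sketch (standard library only, and it only decodes the URL; it does not
contact the Talis endpoint, which may no longer be available), the
SPARQL can be recovered like this:

```python
from urllib.parse import urlsplit, parse_qs

# The query URL from the message, split across lines for readability only.
URL = (
    "http://api.talis.com/stores/lcsh-info/services/sparql?query="
    "PREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0A"
    "select+%3Fs+%3Fl+%3Fis+%0D%0A"
    "where+{%0D%0A"
    "%3Fs+skos%3AprefLabel+%22Television+adaptations%22%3B%0D%0A"
    "+skos%3AinScheme+%3Fis+.%0D%0A%0D%0A"
    "FILTER+%28%3Fis+!%3D+%3Chttp%3A%2F%2Flcsubjects.org%2Fschemes%2Fauthorities%3E%29%0D%0A%0D%0A"
    "}%0D%0ALIMIT+20"
)

# parse_qs percent-decodes the value and turns '+' into spaces.
raw_query = parse_qs(urlsplit(URL).query)["query"][0]

# Normalise the CRLF line endings that were embedded in the URL.
sparql = "\n".join(raw_query.splitlines())
print(sparql)
```

Read this way, the query asks for every resource whose skos:prefLabel
is "Television adaptations" together with its skos:inScheme value,
excluding the general "authorities" scheme. Note that ?l is selected
but never bound in the pattern, so that column would presumably come
back empty.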

Others with 4:

"Film adaptations", "Maps", "Aerial photographs".

-Ross.

On Fri, Jan 7, 2011 at 1:58 PM, Thad Guidry <thadguidry@gmail.com> wrote:
> A VERY good direction indeed, Ross. Kudos.
>
> My concern is eventual handling of overlap with *form subdivisions*
> AND *topical terms* such as
>
> [1] http://lcsubjects.org/subjects/sh99001606
> [2] http://lcsubjects.org/subjects/sh85037700
>
>
> On Fri, Jan 7, 2011 at 11:16 AM, Ross Singer <rossfsinger@gmail.com> wrote:
>> Hi all, forwarding a thread from the open-bibliography
>> (http://lists.okfn.org/mailman/listinfo/open-bibliography) list here.
>> It started with a question from Owen Stephens about a topic that's
>> come up here before (subdivisions, coordination, etc.).
>>
>> I'm bringing it here because Owen's question prompted me to explain
>> some of the ideas I've been playing around with in this regard in
>> http://lcsubjects.org/ which might be of interest here, as well.
>>
>> First Owen's original post:
>>
>> "Can anyone point me at (or advise me on) examples of representing
>> subject heading fields from a library catalogue record as RDF.
>> Specifically I'm interested in how chained sets of subject headings
>> are represented.
>>
>> E.g. a library catalogue record might have a heading:
>>
>> 650$$aPopular Music$$xHistory$$y20th Century
>>
>> Each one of these headings:
>>
>> Popular Music
>> History
>> 20th Century
>>
>> will have a SKOS representation on id.loc.gov, but to represent each
>> heading separately as a dc:subject (or similar) would lose the context
>> of chaining them together.
>>
>> There are some entries on id.loc.gov that represent some 'chains'
>> (those that have been 'authorised') - e.g.
>> http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
>> Music--History and Criticism' - but for me this doesn't feel quite
>> right - doesn't this lose some of the flexibility of the faceted
>> scheme?
>>
>> I'm wondering about something similar to the way BIBO handles author
>> lists (you can both represent each author, and the list of authors,
>> including order)"
>>
>> and then my reply:
>>
>> ---------- Forwarded message ----------
>> From: Ross Singer <ross.singer@talis.com>
>> Date: Fri, Jan 7, 2011 at 11:12 AM
>> Subject: Re: [open-bibliography] Library of Congress subject headings & RDF
>> To: List for Working Group on Open Bibliographic Data
>> <open-bibliography@lists.okfn.org>
>>
>>
>> Hi Owen,
>>
>> I agree that the status quo at id.loc.gov is pretty unsatisfying (on
>> several levels, including this one) and this is one of the things that I
>> changed for lcsubjects.org in the last redesign (although it's
>> certainly not "fixed" or even remotely standard - but it was intended
>> to get the conversation started in this direction).
>>
>> Thankfully, though, your specific example works :)
>>
>> http://lcsubjects.org/subjects/sh2008109787#concept
>>
>> For subdivided subject headings like this, I've added a few
>> properties: lcsh:coordinates, lcsh:generalSubdivision,
>> lcsh:chronologicalSubdivision, lcsh:primaryConcept, etc.
>>
>> The RDF out of lcsubjects.org is pretty brutally verbose, but directly
>> out of the Platform it looks like:
>> http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23concept&output=xml
>>
>> and the coordinates resource is an rdf:Seq (to preserve order):
>>
>> http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23coordinates&output=xml
>>
>> This is still totally a work in progress (and incredibly incomplete),
>> but is intended to begin to provide the sort of semantics that you're
>> looking for (I think).  It also (I hope) begins to lay out a
>> foundation for how LCSH is actually intended to be used (as a set
>> of building blocks).  So to take your original example,
>> "650$$aPopular Music$$xHistory$$y20th Century"
>>
>> This could be created like:
>>
>> <http://example.org/book/1>
>>    dcterms:subject
>> <http://example.org/subjects/popular-music--history--20th-century#concept>.
>>
>> <http://example.org/subjects/popular-music--history--20th-century#concept>
>>    lcsh:generalSubdivision
>> <http://lcsubjects.org/subjects/sh99005024#concept> ;
>>    lcsh:chronologicalSubdivision
>> <http://lcsubjects.org/subjects/sh2002012476#concept>;
>>    lcsh:primaryConcept <http://lcsubjects.org/subjects/sh85088865#concept> ;
>>    a skos:Concept ;
>>    skos:prefLabel "Popular Music--History--20th Century" ;
>>    lcsh:coordinates
>> <http://example.org/subjects/popular-music--history--20th-century#coordinates>
>> .
>>
>> <http://example.org/subjects/popular-music--history--20th-century#coordinates>
>>   a rdf:Seq ;
>>   rdf:_1 <http://lcsubjects.org/subjects/sh85088865#concept> ;
>>   rdf:_2 <http://lcsubjects.org/subjects/sh99005024#concept> ;
>>   rdf:_3 <http://lcsubjects.org/subjects/sh2002012476#concept> .
>>
>> (the lcsubjects.org URIs could just as easily be id.loc.gov URIs -- it
>> was just easier to cut and paste from existing data).
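
To make the building-block idea concrete, here is a minimal sketch of
how the Turtle above could be generated from the 650 string. It is not
how lcsubjects.org does it. The names parse_650, composite_turtle,
LABEL_TO_URI and SUBFIELD_TO_PROPERTY are hypothetical, the label table
holds only the three URIs used in the example, and the subfield-to-
property mapping is an assumption read off that example ($a as the
primary concept, $x as a general subdivision, $y as a chronological
subdivision).

```python
# Hypothetical lookup table: only the three URIs used in the example above.
LABEL_TO_URI = {
    "Popular Music": "http://lcsubjects.org/subjects/sh85088865#concept",
    "History": "http://lcsubjects.org/subjects/sh99005024#concept",
    "20th Century": "http://lcsubjects.org/subjects/sh2002012476#concept",
}

# Assumed mapping from MARC 650 subfield codes to the lcsh: properties.
SUBFIELD_TO_PROPERTY = {
    "a": "lcsh:primaryConcept",
    "x": "lcsh:generalSubdivision",
    "y": "lcsh:chronologicalSubdivision",
}


def parse_650(field):
    """Split '650$$aFoo$$xBar' into [(code, value), ...] in source order."""
    _tag, *parts = field.split("$$")
    return [(part[0], part[1:].strip()) for part in parts if part]


def composite_turtle(field, base="http://example.org/subjects/"):
    """Build a composite skos:Concept plus an rdf:Seq that keeps the order."""
    subfields = parse_650(field)
    label = "--".join(value for _, value in subfields)
    slug = label.lower().replace(" ", "-")
    concept = f"<{base}{slug}#concept>"
    coords = f"<{base}{slug}#coordinates>"

    lines = [concept, "    a skos:Concept ;", f'    skos:prefLabel "{label}" ;']
    for code, value in subfields:
        lines.append(f"    {SUBFIELD_TO_PROPERTY[code]} <{LABEL_TO_URI[value]}> ;")
    lines.append(f"    lcsh:coordinates {coords} .")
    lines.append("")

    # rdf:_1, rdf:_2, ... record the position of each building block.
    members = [
        f"    rdf:_{i} <{LABEL_TO_URI[value]}>"
        for i, (_, value) in enumerate(subfields, start=1)
    ]
    lines += [coords, "    a rdf:Seq ;", " ;\n".join(members) + " ."]
    return "\n".join(lines)


turtle = composite_turtle("650$$aPopular Music$$xHistory$$y20th Century")
print(turtle)
```

A label missing from the table raises a KeyError here, which is roughly
the situation described next: the approach only works when every
subfield resolves to an authorized heading.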
>>
>> With this, it's much easier to make our uncontrolled subject headings
>> that are composites of a bunch of controlled headings.
>>
>> Like I said, this is pretty incomplete on lcsubjects.org, currently,
>> mainly because there's a lot missing (namely the corporate names and
>> random chronological subdivisions, but there are also subdivision
>> terms that don't appear to be derived from an authorized heading).
>> See: http://lcsubjects.org/subjects/sh2010007497 or
>> http://lcsubjects.org/subjects/sh85045754 as somewhat different
>> examples.
>>
>> The first one has a URI for Austria, but that URI returns a 404 (I
>> built this from the Fred 2.0 data, so I have the NAF, I just haven't
>> figured out how to incorporate it into lcsubjects.org, yet).  The
>> second one shows an unauthorized chronological subdivision -- so,
>> currently, it just drops it.
>>
>> Here's another example:  http://lcsubjects.org/subjects/sh85134593#concept
>>
>> this should use: http://lcsubjects.org/subjects/sh99005746#concept as
>> the general subdivision -- but that's an altLabel, so it's currently
>> failing (as you can see, this is fraught with frustrations!).
>>
>> Another mind bender: http://lcsubjects.org/subjects/sh2010106574#concept
>>
>> This one chokes, because "Polyglot" isn't an authorized term (instead
>> it should be using http://lcsubjects.org/subjects/sh85037700#concept
>> -- "Dictionaries, Polyglot"). The heading was also created after
>> Fred 2.0 (3.5 years after!), so I don't have access to the MARC
>> authority record to properly look things up (not that it would help
>> me in this case, anyway [1]).
>>
>> So, to try to bring this on home..., I think there are solutions (and
>> linked data solutions) to this, but LC is doing very little to enable
>> it.  If they'd provide the original MARC as a format for the concepts,
>> that would be a start -- but, honestly, without all of the data
>> available (including the NAF), this is going to be half-baked.
>>
>> So, anyway, thanks for prompting me to write a bit about this :)
>> Probably worth forwarding to the SKOS list, as well.
>>
>> -Ross.
>>
>> [1] Here's the MARC record for Plastics--Dictionaries--Polyglot:
>> 000     00476cz a2200169n 450
>> 001     8244985
>> 005     20100420002715.0
>> 008     100413|| anannbabn |n ana
>> 035     __ |a (DLC)464428
>> 035     __ |a (DLC)sh2010106574
>> 906     __ |t 8888 |u tc00 |v 0
>> 010     __ |a sh2010106574
>> 040     __ |a DLC |b eng |c DLC
>> 150     __ |a Plastics |v Dictionaries |x Polyglot
>> 667     __ |a Record generated for validation purposes.
>> 670     __ |a Work cat.: Fachwörterbuch Kunststofftechnik, c1992
>> 953     __ |a tc00
>>
>> so there's still not an obvious way to know that one should be looking
>> for Dictionaries, Polyglot.
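
The lookup problem can be illustrated with a small sketch. The function
names and the AUTHORIZED table are hypothetical, and the table holds
only the one heading named above. The substring search at the end is a
heuristic added purely for illustration, not something lcsubjects.org
is described as doing.

```python
def parse_marc_line(line):
    """Parse 'TAG  IND |a value |v value' into (tag, [(code, value), ...])."""
    tag, _, rest = line.strip().partition(" ")
    subfields = []
    # Everything before the first '|' is the indicator column; skip it.
    for chunk in rest.split("|")[1:]:
        chunk = chunk.strip()
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return tag, subfields


# Hypothetical index of authorized labels; only the heading named in the thread.
AUTHORIZED = {
    "Dictionaries, Polyglot": "http://lcsubjects.org/subjects/sh85037700#concept",
}


def resolve(label):
    """Exact-match lookup, which is all the subfield value gives us to go on."""
    return AUTHORIZED.get(label)


def candidates(label):
    """Illustrative heuristic: authorized labels that contain the subfield text."""
    return [known for known in AUTHORIZED if label.lower() in known.lower()]


tag, subfields = parse_marc_line("150     __ |a Plastics |v Dictionaries |x Polyglot")
code, value = subfields[-1]

print(tag, subfields)
print(resolve(value))      # exact match on "Polyglot" finds nothing
print(candidates(value))   # a fuzzy search can only suggest possibilities
```

Nothing in the 150 field itself says that $x Polyglot corresponds to
"Dictionaries, Polyglot", so any match of this kind is a guess that
would need checking against the full authority data.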
>>
>>
>>
>> On Fri, Jan 7, 2011 at 7:18 AM, Owen Stephens <owen@ostephens.com> wrote:
>>> Can anyone point me at (or advise me on) examples of representing subject
>>> heading fields from a library catalogue record as RDF. Specifically I'm
>>> interested in how chained sets of subject headings are represented.
>>> E.g. a library catalogue record might have a heading:
>>> 650$$aPopular Music$$xHistory$$y20th Century
>>> Each one of these headings:
>>> Popular Music
>>> History
>>> 20th Century
>>> will have a SKOS representation on id.loc.gov, but to represent each heading
>>> separately as a dc:subject (or similar) would lose the context of chaining
>>> them together.
>>> There are some entries on id.loc.gov that represent some 'chains' (those
>>> that have been 'authorised') -
>>> e.g. http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
>>> Music--History and Criticism' - but for me this doesn't feel quite right -
>>> doesn't this lose some of the flexibility of the faceted scheme?
>>> I'm wondering about something similar to the way BIBO handles author lists
>>> (you can both represent each author, and the list of authors, including
>>> order)
>>> Thanks,
>>> Owen
>>> --
>>> Owen Stephens
>>> Owen Stephens Consulting
>>> Web: http://www.ostephens.com
>>> Email: owen@ostephens.com
>>>
>>
>>
>> -Ross.
>>
>>
>
>
>
> --
> -Thad
> http://www.freebase.com/view/en/thad_guidry
>
>
Received on Friday, 7 January 2011 19:49:00 UTC
