
Re: [open-bibliography] Library of Congress subject headings & RDF

From: Thad Guidry <thadguidry@gmail.com>
Date: Fri, 7 Jan 2011 12:58:22 -0600
Message-ID: <AANLkTikgisFKwFSa5tYNt75-xAR8tiu5Cq=8cpM-E3UX@mail.gmail.com>
To: SKOS <public-esw-thes@w3.org>
A VERY good direction indeed, Ross. Kudos.

My concern is the eventual handling of overlap between *form subdivisions*
and *topical terms*, such as

[1] http://lcsubjects.org/subjects/sh99001606
[2] http://lcsubjects.org/subjects/sh85037700


On Fri, Jan 7, 2011 at 11:16 AM, Ross Singer <rossfsinger@gmail.com> wrote:
> Hi all, forwarding a thread from the open-bibliography
> (http://lists.okfn.org/mailman/listinfo/open-bibliography) list here.
> It started with a question from Owen Stephens about a topic that's
> come up here before (subdivisions, coordination, etc.).
>
> I'm bringing it here because Owen's question prompted me to explain
> some of the ideas I've been playing around with in this regard in
> http://lcsubjects.org/ which might be of interest here, as well.
>
> First Owen's original post:
>
> "Can anyone point me at (or advise me on) examples of representing
> subject heading fields from a library catalogue record as RDF.
> Specifically I'm interested in how chained sets of subject headings
> are represented.
>
> E.g. a library catalogue record might have a heading:
>
> 650$$aPopular Music$$xHistory$$y20th Century
>
> Each one of these headings:
>
> Popular Music
> History
> 20th Century
>
> will have a SKOS representation on id.loc.gov, but to represent each
> heading separately as a dc:subject (or similar) would lose the context
> of chaining them together.
>
> There are some entries on id.loc.gov that represent some 'chains'
> (those that have been 'authorised') - e.g.
> http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
> Music--History and Criticism' - but for me this doesn't feel quite
> right - doesn't this lose some of the flexibility of the faceted
> scheme?
>
> I'm wondering about something similar to the way BIBO handles author
> lists (you can both represent each author, and the list of authors,
> including order)"
>
> and then my reply:
>
> ---------- Forwarded message ----------
> From: Ross Singer <ross.singer@talis.com>
> Date: Fri, Jan 7, 2011 at 11:12 AM
> Subject: Re: [open-bibliography] Library of Congress subject headings & RDF
> To: List for Working Group on Open Bibliographic Data
> <open-bibliography@lists.okfn.org>
>
>
> Hi Owen,
>
> I agree that the status quo at id.loc.gov is pretty unsatisfying (on
> several levels, including this one) and this is one of the things that I
> changed for lcsubjects.org in the last redesign (although it's
> certainly not "fixed" or even remotely standard - but it was intended
> to get the conversation started in this direction).
>
> Thankfully, though, your specific example works :)
>
> http://lcsubjects.org/subjects/sh2008109787#concept
>
> For subdivided subject headings like this, I've added a few
> properties: lcsh:coordinates, lcsh:generalSubdivision,
> lcsh:chronologicalSubdivision, lcsh:primaryConcept, etc.
>
> The RDF out of lcsubjects.org is pretty brutally verbose, but directly
> out of the Platform it looks like:
> http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23concept&output=xml
>
> and the coordinates resource is an rdf:Seq (to preserve order):
>
> http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23coordinates&output=xml
>
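As a rough illustration of why an rdf:Seq preserves order: the members hang off numbered properties (rdf:_1, rdf:_2, ...), so a consumer can sort on the number. The sketch below assumes triples are plain (subject, predicate, object) tuples of full URI strings; the helper name `seq_members` is hypothetical, and the URIs are the ones used in the example in this message.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Container membership properties look like rdf:_1, rdf:_2, ...
MEMBER = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)$")

def seq_members(triples, seq_uri):
    """Return the members of an rdf:Seq in rdf:_1, rdf:_2, ... order."""
    numbered = []
    for s, p, o in triples:
        m = MEMBER.match(p)
        if s == seq_uri and m:
            # Sort on the integer so that rdf:_10 comes after rdf:_2.
            numbered.append((int(m.group(1)), o))
    return [o for _, o in sorted(numbered)]

SEQ = "http://example.org/subjects/popular-music--history--20th-century#coordinates"
LCS = "http://lcsubjects.org/subjects/"

# Deliberately stored out of order, as a triple store might return them.
TRIPLES = [
    (SEQ, RDF_NS + "_3", LCS + "sh2002012476#concept"),
    (SEQ, RDF_NS + "type", RDF_NS + "Seq"),
    (SEQ, RDF_NS + "_1", LCS + "sh85088865#concept"),
    (SEQ, RDF_NS + "_2", LCS + "sh99005024#concept"),
]

ORDERED = seq_members(TRIPLES, SEQ)
```
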
> This is still totally a work in progress (and incredibly incomplete),
> but is intended to begin to provide the sort of semantics that you're
> looking for (I think).  It also (I hope) begins to lay out a
> foundation for how LCSH is actually intended to be used (which is a
> set of building blocks).  So to take your original example,
> "650$$aPopular Music$$xHistory$$y20th Century"
>
> This could be created like:
>
> <http://example.org/book/1>
>    dcterms:subject <http://example.org/subjects/popular-music--history--20th-century#concept> .
>
> <http://example.org/subjects/popular-music--history--20th-century#concept>
>    lcsh:generalSubdivision <http://lcsubjects.org/subjects/sh99005024#concept> ;
>    lcsh:chronologicalSubdivision <http://lcsubjects.org/subjects/sh2002012476#concept> ;
>    lcsh:primaryConcept <http://lcsubjects.org/subjects/sh85088865#concept> ;
>    a skos:Concept ;
>    skos:prefLabel "Popular Music--History--20th Century" ;
>    lcsh:coordinates <http://example.org/subjects/popular-music--history--20th-century#coordinates> .
>
> <http://example.org/subjects/popular-music--history--20th-century#coordinates>
>    a rdf:Seq ;
>    rdf:_1 <http://lcsubjects.org/subjects/sh85088865#concept> ;
>    rdf:_2 <http://lcsubjects.org/subjects/sh99005024#concept> ;
>    rdf:_3 <http://lcsubjects.org/subjects/sh2002012476#concept> .
>
> (the lcsubjects.org URIs could just as easily be id.loc.gov URIs -- it
> was just easier to cut and paste from existing data).
>
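The mapping from a 650 field to the triples above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`parse_650`, `composite_triples`) are hypothetical, only the $a, $x and $y subfields from the example are handled, the label-to-URI table is copied from the example rather than looked up, and prefixed names are kept as plain strings because the message does not give the lcsh: namespace URI.

```python
LCS = "http://lcsubjects.org/subjects/"

# Subfield code -> property, using the property names from the message.
PROPERTY_FOR = {
    "a": "lcsh:primaryConcept",
    "x": "lcsh:generalSubdivision",
    "y": "lcsh:chronologicalSubdivision",
}

# Label -> concept URI, copied from the example (not a real lookup service).
CONCEPT_FOR = {
    "Popular Music": LCS + "sh85088865#concept",
    "History": LCS + "sh99005024#concept",
    "20th Century": LCS + "sh2002012476#concept",
}

def parse_650(field):
    """Split '650$$aPopular Music$$xHistory' into [(code, value), ...]."""
    _tag, *parts = field.split("$$")
    return [(p[0], p[1:].strip()) for p in parts if p]

def composite_triples(base, subfields, concept_for):
    """Build (s, p, o) tuples for a composite heading and its rdf:Seq."""
    concept = base + "#concept"
    coords = base + "#coordinates"
    label = "--".join(value for _, value in subfields)
    triples = [
        (concept, "rdf:type", "skos:Concept"),
        (concept, "skos:prefLabel", label),
        (concept, "lcsh:coordinates", coords),
        (coords, "rdf:type", "rdf:Seq"),
    ]
    for position, (code, value) in enumerate(subfields, start=1):
        uri = concept_for[value]  # KeyError if the part is not authorized
        triples.append((concept, PROPERTY_FOR[code], uri))
        triples.append((coords, f"rdf:_{position}", uri))
    return triples

BASE = "http://example.org/subjects/popular-music--history--20th-century"
SUBFIELDS = parse_650("650$$aPopular Music$$xHistory$$y20th Century")
RESULT = composite_triples(BASE, SUBFIELDS, CONCEPT_FOR)
```

The `KeyError` on an unknown label is deliberate in this sketch: it surfaces exactly the cases where a subdivision has no authorized heading to point at.
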
> With this, it's much easier to make our uncontrolled subject headings
> that are composites of a bunch of controlled headings.
>
> Like I said, this is pretty incomplete on lcsubjects.org, currently,
> mainly because there's a lot missing (namely the corporate names and
> random chronological subdivisions, but there are also subdivision
> terms that don't appear to be derived from an authorized heading).
> See: http://lcsubjects.org/subjects/sh2010007497 or
> http://lcsubjects.org/subjects/sh85045754 as somewhat different
> examples.
>
> The first one has a URI for Austria, but that URI returns a 404 (I
> built this from the Fred 2.0 data, so I have the NAF, I just haven't
> figured out how to incorporate it into lcsubjects.org, yet).  The
> second one shows an unauthorized chronological subdivision -- so,
> currently, it just drops it.
>
> Here's another example:  http://lcsubjects.org/subjects/sh85134593#concept
>
> this should use: http://lcsubjects.org/subjects/sh99005746#concept as
> the general subdivision -- but that's an altLabel, so it's currently
> failing (as you can see, this is fraught with frustrations!).
>
> Another mind bender: http://lcsubjects.org/subjects/sh2010106574#concept
>
> This one chokes, because "Polyglot" isn't an authorized term (instead
> it should be using http://lcsubjects.org/subjects/sh85037700#concept
> -- "Dictionaries, Polyglot") and was created after Fred 2.0 (3.5 years
> after!), so I don't have access to the MARC authority record to
> properly look things up (not that it would help me in this case,
> anyway [1]).
>
> So, to try to bring this on home..., I think there are solutions (and
> linked data solutions) to this, but LC is doing very little to enable
> it.  If they'd provide the original MARC as a format for the concepts,
> that would be a start -- but, honestly, without all of the data
> available (including the NAF), this is going to be half-baked.
>
> So, anyway, thanks for prompting me to write a bit about this :)
> Probably worth forwarding to the SKOS list, as well.
>
> -Ross.
>
> [1] Here's the MARC record for Plastics--Dictionaries--Polyglot:
> 000     00476cz a2200169n 450
> 001     8244985
> 005     20100420002715.0
> 008     100413|| anannbabn |n ana
> 035     __ |a (DLC)464428
> 035     __ |a (DLC)sh2010106574
> 906     __ |t 8888 |u tc00 |v 0
> 010     __ |a sh2010106574
> 040     __ |a DLC |b eng |c DLC
> 150     __ |a Plastics |v Dictionaries |x Polyglot
> 667     __ |a Record generated for validation purposes.
> 670     __ |a Work cat.: Fachwörterbuch Kunststofftechnik, c1992
> 953     __ |a tc00
>
> so there's still not an obvious way to know that one should be looking
> for Dictionaries, Polyglot.
>
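The lookup problem with this record can be illustrated with a small sketch. It assumes a toy label index holding only the one authorized heading named in this message, and the helper names (`parse_subfields`, `candidates`) are hypothetical. The substring heuristic in `candidates` is an illustrative guess, not something the MARC record or lcsubjects.org provides.

```python
import re

# Toy index of authorized labels; only the heading named in the message.
AUTHORIZED = {
    "Dictionaries, Polyglot": "http://lcsubjects.org/subjects/sh85037700#concept",
}

def parse_subfields(line):
    """Parse '150  __ |a Plastics |v Dictionaries' into [(code, value), ...]."""
    return [(code, value.strip())
            for code, value in re.findall(r"\|(\w) ([^|]*)", line)]

def candidates(term, index):
    """Authorized labels that contain the term as a comma-separated part."""
    return [label for label in index
            if term in [part.strip() for part in label.split(",")]]

FIELD_150 = "150     __ |a Plastics |v Dictionaries |x Polyglot"
PARTS = parse_subfields(FIELD_150)

# An exact-label lookup finds nothing for the $x subfield ...
EXACT = AUTHORIZED.get("Polyglot")
# ... while a heuristic match turns up a plausible, but unconfirmed, heading.
GUESSES = candidates("Polyglot", AUTHORIZED)
```

Note that `candidates("Dictionaries", AUTHORIZED)` returns the same label, so the heuristic cannot tell which subfield the heading belongs to; nothing in the record itself settles that.
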
>
>
> On Fri, Jan 7, 2011 at 7:18 AM, Owen Stephens <owen@ostephens.com> wrote:
>> Can anyone point me at (or advise me on) examples of representing subject
>> heading fields from a library catalogue record as RDF. Specifically I'm
>> interested in how chained sets of subject headings are represented.
>> E.g. a library catalogue record might have a heading:
>> 650$$aPopular Music$$xHistory$$y20th Century
>> Each one of these headings:
>> Popular Music
>> History
>> 20th Century
>> will have a SKOS representation on id.loc.gov, but to represent each heading
>> separately as a dc:subject (or similar) would lose the context of chaining
>> them together.
>> There are some entries on id.loc.gov that represent some 'chains' (those
>> that have been 'authorised') -
>> e.g. http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
>> Music--History and Criticism' - but for me this doesn't feel quite right -
>> doesn't this lose some of the flexibility of the faceted scheme?
>> I'm wondering about something similar to the way BIBO handles author lists
>> (you can both represent each author, and the list of authors, including
>> order)
>> Thanks,
>> Owen
>> --
>> Owen Stephens
>> Owen Stephens Consulting
>> Web: http://www.ostephens.com
>> Email: owen@ostephens.com
>>
>
>
> -Ross.
>
>



-- 
-Thad
http://www.freebase.com/view/en/thad_guidry
Received on Friday, 7 January 2011 18:58:54 UTC
