Re: Fwd: [open-bibliography] Library of Congress subject headings & RDF

From: Alistair Miles <alimanfoo@googlemail.com>
Date: Fri, 7 Jan 2011 17:47:57 +0000
To: Ross Singer <rossfsinger@gmail.com>
Cc: SKOS <public-esw-thes@w3.org>
Message-ID: <20110107174757.GB2411@skiathos>

Hi Ross,

I'm a bit behind the times here, and you've probably seen all these already,
but for reference, the page at [1] has good links to previous discussions
of coordination. In particular, Antoine's treatment of coordination in the
SKOS primer [2] is a good summary of where we got to in the SWDWG.

Cheers,

Alistair

[1] http://www.w3.org/2001/sw/wiki/SKOS/Issues/Coordination
[2] http://www.w3.org/TR/skos-primer/#secconceptcoordination

On Fri, Jan 07, 2011 at 12:16:48PM -0500, Ross Singer wrote:
> Hi all, forwarding a thread from the open-bibliography
> (http://lists.okfn.org/mailman/listinfo/open-bibliography) list here.
> It started with a question from Owen Stephens about a topic that's
> come up here before (subdivisions, coordination, etc.).
> 
> I'm bringing it here because Owen's question prompted me to explain
> some of the ideas I've been playing around with in this regard in
> http://lcsubjects.org/ which might be of interest here, as well.
> 
> First Owen's original post:
> 
> "Can anyone point me at (or advise me on) examples of representing
> subject heading fields from a library catalogue record as RDF.
> Specifically I'm interested in how chained sets of subject headings
> are represented.
> 
> E.g. a library catalogue record might have a heading:
> 
> 650$$aPopular Music$$xHistory$$y20th Century
> 
> Each one of these headings:
> 
> Popular Music
> History
> 20th Century
> 
> will have a SKOS representation on id.loc.gov, but to represent each
> heading separately as a dc:subject (or similar) would lose the context
> of chaining them together.
> 
> There are some entries on id.loc.gov that represent some 'chains'
> (those that have been 'authorised') - e.g.
> http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
> Music--History and Criticism' - but for me this doesn't feel quite
> right - doesn't this lose some of the flexibility of the faceted
> scheme?
> 
> I'm wondering about something similar to the way BIBO handles author
> lists (you can both represent each author, and the list of authors,
> including order)"
> 
> and then my reply:
> 
> ---------- Forwarded message ----------
> From: Ross Singer <ross.singer@talis.com>
> Date: Fri, Jan 7, 2011 at 11:12 AM
> Subject: Re: [open-bibliography] Library of Congress subject headings & RDF
> To: List for Working Group on Open Bibliographic Data
> <open-bibliography@lists.okfn.org>
> 
> 
> Hi Owen,
> 
> I agree that the status quo at id.loc.gov is pretty unsatisfying (on
> several levels, including this one) and this is one of the things that I
> changed for lcsubjects.org in the last redesign (although it's
> certainly not "fixed" or even remotely standard - but it was intended
> to get the conversation started in this direction).
> 
> Thankfully, though, your specific example works :)
> 
> http://lcsubjects.org/subjects/sh2008109787#concept
> 
> For subdivided subject headings like this, I've added a few
> properties: lcsh:coordinates, lcsh:generalSubdivision,
> lcsh:chronologicalSubdivision, lcsh:primaryConcept, etc.
> 
> The RDF out of lcsubjects.org is pretty brutally verbose, but directly
> out of the Platform it looks like:
> http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23concept&output=xml
> 
> and the coordinates resource is an rdf:Seq (to preserve order):
> 
> http://api.talis.com/stores/lcsh-info/meta?about=http%3A%2F%2Flcsubjects.org%2Fsubjects%2Fsh2008109787%23coordinates&output=xml
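
A note on reading the rdf:Seq back: the order lives in the numbered membership properties (rdf:_1, rdf:_2, ...), so a consumer has to sort on the numeric part rather than on the property name as a string (rdf:_10 would otherwise sort before rdf:_2). A minimal sketch in plain Python, assuming the triples are already available as (subject, predicate, object) tuples of URI strings; the function name seq_members is made up for illustration and is not part of lcsubjects.org or the Platform:

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Container membership properties look like rdf:_1, rdf:_2, ...
MEMBER = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)$")


def seq_members(triples, seq_uri):
    """Return the members of the rdf:Seq `seq_uri` in container order.

    `triples` is any iterable of (subject, predicate, object) string tuples.
    """
    indexed = []
    for subject, predicate, obj in triples:
        if subject != seq_uri:
            continue
        match = MEMBER.match(predicate)
        if match:
            # Sort on the integer index, not on the property name.
            indexed.append((int(match.group(1)), obj))
    return [obj for _, obj in sorted(indexed)]
```

Statements such as rdf:type do not match the membership pattern and are skipped, so the rdf:Seq typing triple can stay in the input.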
> 
> This is still totally a work in progress (and incredibly incomplete),
> but is intended to begin to provide the sort of semantics that you're
> looking for (I think).  It also (I hope) begins to lay out a
> foundation for how LCSH is actually intended to be used (which is a
> set of building blocks).  So to take your original example,
> "650$$aPopular Music$$xHistory$$y20th Century"
> 
> This could be created like:
> 
> <http://example.org/book/1>
>    dcterms:subject <http://example.org/subjects/popular-music--history--20th-century#concept> .
> 
> <http://example.org/subjects/popular-music--history--20th-century#concept>
>    lcsh:generalSubdivision <http://lcsubjects.org/subjects/sh99005024#concept> ;
>    lcsh:chronologicalSubdivision <http://lcsubjects.org/subjects/sh2002012476#concept> ;
>    lcsh:primaryConcept <http://lcsubjects.org/subjects/sh85088865#concept> ;
>    a skos:Concept ;
>    skos:prefLabel "Popular Music--History--20th Century" ;
>    lcsh:coordinates <http://example.org/subjects/popular-music--history--20th-century#coordinates> .
> 
> <http://example.org/subjects/popular-music--history--20th-century#coordinates>
>   a rdf:Seq ;
>   rdf:_1 <http://lcsubjects.org/subjects/sh85088865#concept> ;
>   rdf:_2 <http://lcsubjects.org/subjects/sh99005024#concept> ;
>   rdf:_3 <http://lcsubjects.org/subjects/sh2002012476#concept> .
> 
> (the lcsubjects.org URIs could just as easily be id.loc.gov URIs -- it
> was just easier to cut and paste from existing data).
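
To make the building-block idea concrete, here is a rough Python sketch that goes from the catalogue string "650$$aPopular Music$$xHistory$$y20th Century" to Turtle of the shape shown above. It is a sketch under assumptions, not how lcsubjects.org does it: the function names are invented, the label-to-URI table is a hypothetical stand-in for a real lookup (the three URIs are copied from the example), and only the three subfield codes from the example are mapped. The output uses the same prefixed names as the example, so prefix declarations would still have to be added; the message does not give the lcsh: namespace.

```python
import re

# Subfield code -> property, using the property names from the message.
# Only the subfields that occur in the example are mapped here.
PROPERTY_FOR_CODE = {
    "a": "lcsh:primaryConcept",
    "x": "lcsh:generalSubdivision",
    "y": "lcsh:chronologicalSubdivision",
}

# Hypothetical label -> URI lookup; the URIs are the ones in the example.
CONCEPT_URI = {
    "Popular Music": "http://lcsubjects.org/subjects/sh85088865#concept",
    "History": "http://lcsubjects.org/subjects/sh99005024#concept",
    "20th Century": "http://lcsubjects.org/subjects/sh2002012476#concept",
}


def parse_heading(field):
    """Split '650$$aPopular Music$$xHistory' into (tag, [(code, value), ...])."""
    tag, *subfields = field.split("$$")
    return tag, [(s[0], s[1:].strip()) for s in subfields if s]


def slug(label):
    """Lower-case a label and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def heading_to_turtle(field, base="http://example.org/subjects/"):
    """Build the composite concept and its ordered rdf:Seq as Turtle text."""
    _, subfields = parse_heading(field)
    labels = [value for _, value in subfields]
    stem = base + "--".join(slug(label) for label in labels)
    concept = f"<{stem}#concept>"
    coords = f"<{stem}#coordinates>"
    pref_label = "--".join(labels)

    lines = [concept, "   a skos:Concept ;", f'   skos:prefLabel "{pref_label}" ;']
    for code, value in subfields:
        lines.append(f"   {PROPERTY_FOR_CODE[code]} <{CONCEPT_URI[value]}> ;")
    lines.append(f"   lcsh:coordinates {coords} .")
    lines.append("")
    lines.append(coords)
    lines.append("   a rdf:Seq ;")
    # Number the members in subfield order so the chain can be rebuilt.
    members = [
        f"   rdf:_{i} <{CONCEPT_URI[value]}>"
        for i, (_, value) in enumerate(subfields, 1)
    ]
    lines.append(" ;\n".join(members) + " .")
    return "\n".join(lines)
```

Calling heading_to_turtle("650$$aPopular Music$$xHistory$$y20th Century") yields a concept and a coordinates resource under http://example.org/subjects/. A label missing from the lookup table raises KeyError, which is roughly where the unauthorized-subdivision cases described below would surface.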
> 
> With this, it's much easier to make our own uncontrolled subject headings
> that are composites of a bunch of controlled headings.
> 
> Like I said, this is pretty incomplete on lcsubjects.org, currently,
> mainly because there's a lot missing (namely the corporate names and
> random chronological subdivisions, but there are also subdivision
> terms that don't appear to be derived from an authorized heading).
> See: http://lcsubjects.org/subjects/sh2010007497 or
> http://lcsubjects.org/subjects/sh85045754 as somewhat different
> examples.
> 
> The first one has a URI for Austria, but that URI returns a 404 (I
> built this from the Fred 2.0 data, so I have the NAF, I just haven't
> figured out how to incorporate it into lcsubjects.org, yet).  The
> second one shows an unauthorized chronological subdivision -- so,
> currently, it just drops it.
> 
> Here's another example:  http://lcsubjects.org/subjects/sh85134593#concept
> 
> this should use: http://lcsubjects.org/subjects/sh99005746#concept as
> the general subdivision -- but that's an altLabel, so it's currently
> failing (as you can see, this is fraught with frustrations!).
> 
> Another mind bender: http://lcsubjects.org/subjects/sh2010106574#concept
> 
> This one chokes, because "Polyglot" isn't an authorized term (instead
> it should be using http://lcsubjects.org/subjects/sh85037700#concept
> -- "Dictionaries, Polyglot") and was created after Fred 2.0 (3.5 years
> after!), so I don't have access to the MARC authority record to
> properly look things up (not that it would help me in this case,
> anyway [1]).
> 
> So, to try to bring this on home..., I think there are solutions (and
> linked data solutions) to this, but LC is doing very little to enable
> it.  If they'd provide the original MARC as a format for the concepts,
> that would be a start -- but, honestly, without all of the data
> available (including the NAF), this is going to be half-baked.
> 
> So, anyway, thanks for prompting me to write a bit about this :)
> Probably worth forwarding to the SKOS list, as well.
> 
> -Ross.
> 
> [1] Here's the MARC record for Plastics--Dictionaries--Polyglot:
> 000     00476cz a2200169n 450
> 001     8244985
> 005     20100420002715.0
> 008     100413|| anannbabn |n ana
> 035     __ |a (DLC)464428
> 035     __ |a (DLC)sh2010106574
> 906     __ |t 8888 |u tc00 |v 0
> 010     __ |a sh2010106574
> 040     __ |a DLC |b eng |c DLC
> 150     __ |a Plastics |v Dictionaries |x Polyglot
> 667     __ |a Record generated for validation purposes.
> 670     __ |a Work cat.: Fachwörterbuch Kunststofftechnik, c1992
> 953     __ |a tc00
> 
> so there's still not an obvious way to know that one should be looking
> for Dictionaries, Polyglot.
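
The lookup problem can be shown in a few lines of Python. This is only a sketch: it parses the display form of the 150 field as printed above (not binary MARC), and the AUTHORIZED_LABELS set is a hypothetical stand-in for a real index of authorized headings. Treating "Plastics" and "Dictionaries" as authorized is an assumption made for illustration.

```python
def parse_marc_line(line):
    """Parse a display line such as '150     __ |a Plastics |v Dictionaries'.

    Returns (tag, [(subfield_code, value), ...]) with subfields in order.
    """
    head, *parts = line.split("|")
    tag = head.split()[0]
    return tag, [(p[0], p[1:].strip()) for p in parts if p.strip()]


# Hypothetical index of authorized labels (assumed for this example only).
AUTHORIZED_LABELS = {"Plastics", "Dictionaries", "Dictionaries, Polyglot"}


def unresolved_subfields(line, authorized=AUTHORIZED_LABELS):
    """Return the subfield values that have no exact authorized label."""
    _, subfields = parse_marc_line(line)
    return [value for _, value in subfields if value not in authorized]
```

For the 150 line above, unresolved_subfields reports "Polyglot": an exact-match lookup has no way to tell that the intended heading is "Dictionaries, Polyglot".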
> 
> 
> 
> On Fri, Jan 7, 2011 at 7:18 AM, Owen Stephens <owen@ostephens.com> wrote:
> > Can anyone point me at (or advise me on) examples of representing subject
> > heading fields from a library catalogue record as RDF. Specifically I'm
> > interested in how chained sets of subject headings are represented.
> > E.g. a library catalogue record might have a heading:
> > 650$$aPopular Music$$xHistory$$y20th Century
> > Each one of these headings:
> > Popular Music
> > History
> > 20th Century
> > will have a SKOS representation on id.loc.gov, but to represent each heading
> > separately as a dc:subject (or similar) would lose the context of chaining
> > them together.
> > There are some entries on id.loc.gov that represent some 'chains' (those
> > that have been 'authorised') -
> > e.g. http://id.loc.gov/authorities/sh2008109787#concept is 'Popular
> > Music--History and Criticism' - but for me this doesn't feel quite right -
> > doesn't this lose some of the flexibility of the faceted scheme?
> > I'm wondering about something similar to the way BIBO handles author lists
> > (you can both represent each author, and the list of authors, including
> > order)
> > Thanks,
> > Owen
> > --
> > Owen Stephens
> > Owen Stephens Consulting
> > Web: http://www.ostephens.com
> > Email: owen@ostephens.com
> >
> 
> 
> -Ross.
> 

-- 
Alistair Miles
Head of Epidemiological Informatics
Centre for Genomics and Global Health <http://cggh.org>
The Wellcome Trust Centre for Human Genetics
Roosevelt Drive
Oxford
OX3 7BN
United Kingdom
Web: http://purl.org/net/aliman
Email: alimanfoo@gmail.com
Tel: +44 (0)1865 287669
Received on Friday, 7 January 2011 18:07:20 UTC
