
Re: SKOS and Freebase

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Mon, 21 Oct 2013 07:53:23 -0700
Message-ID: <52653FE3.9090305@kcoyle.net>
To: Dan Brickley <danbri@google.com>
CC: W3C Web Schemas Task Force <public-vocabs@w3.org>


On 10/21/13 7:30 AM, Dan Brickley wrote:

>
> To what extent does FAST make explicit the relationships between the pieces?


I don't believe it does, but perhaps someone from OCLC can answer that. 
(Or we could closely read the documentation.) Note that OCLC has just 
announced that FAST headings will be added to WorldCat data:

http://www.oclc.org/news/announcements/2013/enriching-worldcat-with-fast.en.html

>
> A Lonclass (pseudo-UDC) example that stuck in my head, tried to code
> "Margaret Thatcher's letter of apology to TV-AM". You can imagine
> using RDF and SKOS and well known entity IDs to modernize this to the
> extent that you know a) we're talking about Margaret Thatcher, British
> Conservative politician; b) TV-AM, UK media company, and that is
> quite useful even on its own; but the trickiest part is the
> relationship. Who did the apologizing? Mrs Thatcher or TV-AM? This
> issue seems to be the crossover point between SKOS in its current
> form, which can present a pre-cooked bundle of concepts, and full RDF
> which can at the cost of more work, explain their interconnection more
> explicitly.


LC Headings don't have verbs (AFAIK) so your particular case does not 
apply. However, the main criticism of FAST is false hits, coming from 
situations where one has more than one subject heading whose "parts" can 
combine "wrongly." So if you have a book that talks about 19th century 
poetry and the rap music of Dr. Dre, you could end up with:

Poetry
19th century
Rap music
Dr. Dre

and you could retrieve this on the unlikely query of "Rap music 19th 
century".
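
A minimal sketch of that false hit, in Python. The record layout and the 
function names (facet_match, heading_match) are hypothetical, made up for 
illustration; in particular, keeping the facets grouped under their 
original heading is an assumption, not something FAST is known to do.

```python
# Hypothetical record: two subject headings, each already split into facets.
headings = [
    ["Poetry", "19th century"],
    ["Rap music", "Dr. Dre"],
]


def facet_match(record_headings, query_terms):
    """Match if every query term appears anywhere in the record's facets."""
    flat = {facet for heading in record_headings for facet in heading}
    return set(query_terms) <= flat


def heading_match(record_headings, query_terms):
    """Match only if all query terms come from one and the same heading."""
    return any(set(query_terms) <= set(heading) for heading in record_headings)


query = ["Rap music", "19th century"]
false_hit = facet_match(headings, query)       # facets from different headings combine
same_heading = heading_match(headings, query)  # no single heading holds both terms
```

Here false_hit is True and same_heading is False: the flat list of facets 
matches the query even though no one heading ever paired the two terms.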

BTW, FAST is not just a rote chopping up of LCSH -- it makes some very 
interesting decisions and modifications. There is an entire book [1] 
describing this, but unfortunately only the table of contents is 
available for viewing. That alone, though, provides a great outline of 
the thought that went into FAST.

kc
[1] 
http://books.google.com/books?id=CAE1QQAACAAJ&dq=fast+faceted+application&hl=en&sa=X&ei=KT9lUsn_LoGWiAL3-oDYAg&ved=0CDoQ6AEwAA






>
> In full RDF, dealing with such situations case by case, we might e.g.
> declare a subtype of http://schema.org/Action with 'apologist' and
> 'apologee' relations and a definition making clear which participant
> is doing what. In W3C SKOS currently I believe the best we'd get is
> the bundle of ["TV-AM", "Mrs Thatcher", "Apology, letter of"]. And
> maybe that's fine for most purposes - I'm just curious how far the
> FAST effort tries to make explicit the compositional structure. From
> what I remember of UDC's notation it didn't really do as much as some
> people wanted here...
>
> Dan
>
> ps. http://en.wikipedia.org/wiki/Lonclass has the example,
> "656.881:301.162.721:32.007THATCHER: 654.192.731TV-AM" supposedly
> composed from these parts,
>
> 656.881:301.162.721 “LETTERS OF APOLOGY”
> 656.881 “LETTERS (POSTAL SERVICES)”
> 656.881:06.022.6 “RESIGNATION LETTERS”
> 654.192.731TV-AM “TV AM (TELEVISION AM)”
>
> ... though it doesn't formally afaik indicate who was the apologist
>
> see also http://www.udcds.com/seminar/2011/media/slides/UDCSeminar2011_AndyHeather.pdf
>
>
>> kc
>> [1] http://experimental.worldcat.org/fast/
>>
>>
>> On 10/20/13 6:40 PM, Thad Guidry wrote:
>>>
>>> Tom is correct.
>>>
>>> Let's be clear, the data still has to be linked for LCSH concepts. There
>>> is much work to be done on that front.
>>>
>>> I have been continually applying most high level LCSH concepts to
>>> Freebase manually, but a better interface for human curation and
>>> aligning and linking the LCSH concepts to Freebase is what is needed
>>> (but a lot of that could be done with OpenRefine and other automated
>>> tools).  It would be even more awesome for other folks to bear and share
>>> that burden and help build or refine the existing tools to help with
>>> automation.
>>>
>>>
>>>
>>> On Sun, Oct 20, 2013 at 2:05 PM, Tom Morris <tfmorris@gmail.com
>>> <mailto:tfmorris@gmail.com>> wrote:
>>>
>>>      On Sun, Oct 20, 2013 at 10:29 AM, Antoine Isaac <aisaac@few.vu.nl
>>>      <mailto:aisaac@few.vu.nl>> wrote:
>>>
>>>          I got messed up with my mail splitting: but I really want to
>>>          flag that Thad's
>>>
>>>
>>>          http://lists.w3.org/Archives/Public/public-vocabs/2013Oct/0142.html
>>>
>>>          is really awesome. And it seems a good case in favour of SKOS
>>>          data, for all those who want to do something similar but can't
>>>          handle the proliferation of namespaces.
>>>
>>>
>>>      One caution - that example isn't representative.  Of the 389,668
>>>      Library of Congress Subject Heading (LCSH) concepts in Freebase,
>>>      only 7,842 have been linked to an equivalent Freebase topic.  Also
>>>      the LCSH was loaded in 2010 and, as far as I'm aware, hasn't been
>>>      updated since.  I suspect the hierarchy is relatively stable, but
>>>      the lack of currency is something else to be aware of.
>>>
>>>      It demonstrates interesting possibilities, but it isn't useful for
>>>      much in its current form.
>>>
>>>      Tom
>>>
>>>
>>>
>>>
>>> --
>>> -Thad
>>> Thad on Freebase.com <http://www.freebase.com/view/en/thad_guidry>
>>> Thad on LinkedIn <http://www.linkedin.com/in/thadguidry/>
>>
>>
>> --
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> m: 1-510-435-8234
>> skype: kcoylenet
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet
Received on Monday, 21 October 2013 14:53:54 UTC