Re: Fwd: Knowledge Graph links to Freebase

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 09 Jun 2012 11:10:26 -0400
Message-ID: <4FD36762.7070101@openlinksw.com>
To: public-lod@w3.org

On 6/9/12 11:00 AM, Paul Houle wrote:
>      My guess is that the 300M entities could be hot air for now.
> Maybe they've got a "second true graph" with 300M entities in it,  but
> it's probably not powering the production system.
>      Right now recall is low for the Google Knowledge Graph because
> they don't want to take the chance of showing spurious results.  Most
> Freebase topics aren't showing up and they shouldn't.  Freebase is
> full of "twisty little objects that all look alike".  For instance,
> there are 20 or so objects in Freebase named "Sweet Home Alabama".
> Almost all of the probability weight for this is on the radio edit,
> but most of these are covers,  re-releases on greatest hits albums,
> etc.  That's all great data because it corresponds to real
> observations of music in the wild,  but in the commonsense domain
> these get squashed.
>       Oddly,  Google loses the classic rock song entirely and turns up
> a mediocre but commercially successful movie...
> https://www.google.com/#hl=en&gs_nf=1&tok=V0cZbCtNDVsjrfKATbImzw&cp=7&gs_id=7x&xhr=t&q=sweet+home+alabama&pf=p&output=search&sclient=psy-ab&oq=sweet+h&aq=0&aqi=g4&aql=&gs_l=&pbx=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=f9be4f0b957a8550&biw=1600&bih=775
>       The real value of the GKG may be in what gets deleted instead of
> what gets added.
>       Anyhow,  some things that ~could~ be in Freebase and aren't are
> (1) Consumer Products,
> (2) Local Businesses (think of what's in Foursquare or Factual),  and
> (3) Google data about books
>       #3 is the real sore spot.  We know Google has great metadata for
> books,  but Freebase has loaded only a percentage of books from
> Open Library.  When I found that a number of books I was thinking about
> weren't there,  they suggested that I finish the Open Library load
> myself...
>       Of course,  Google's book project is under a legal cloud and
> their lawyers might feel that they aren't free to release the
> metadata.

Great analysis!

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Saturday, 9 June 2012 15:17:27 UTC
