Re: How many instances of foaf:Person are there in the LOD Cloud?

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 13 Apr 2011 08:37:22 -0400
Message-ID: <4DA59902.1020800@openlinksw.com>
To: Bernard Vatant <bernard.vatant@mondeca.com>
CC: Linking Open Data <public-lod@w3.org>
On 4/13/11 4:15 AM, Bernard Vatant wrote:
> Hello all
>
> Just trying to figure what is the size of personal information 
> available as LOD vs billions of person profiles stored by Google, 
> Amazon, Facebook, LinkedIn, unameit ... in proprietary formats.
>
> Any hint of the proportion of "living" people vs historical characters 
> is also welcome.
>
> Any idea?
>
> Bernard
>
>
> -- 
> Bernard Vatant
> Senior Consultant
> Vocabulary & Data Integration
> Tel:       +33 (0) 971 488 459
> Mail: bernard.vatant@mondeca.com <mailto:bernard.vatant@mondeca.com>
> ----------------------------------------------------
> Mondeca
> 3, cité Nollez 75018 Paris France
> Web: http://www.mondeca.com
> Blog: http://mondeca.wordpress.com
> ----------------------------------------------------
Bernard,

The LOD Cloud cache has 3,321,094 foaf:Person entities [1]; the distinct 
count is 3,319,862 [2].
URIBurner has 4,564,981 foaf:Person entities [3]; the distinct count is 
4,555,697 [4].

Both sets of figures come from SPARQL aggregate queries against the 
respective endpoints. Note that no inference context was applied; there 
are a variety of rules across OpenCyc, UMBEL, Yago, and DBpedia that 
would alter these counts.

Tip regarding the URLs below: simply change the "authority" part of the 
URL when seeking similar counts from other Virtuoso instances. With some 
luck, the same approach could apply to other SPARQL endpoints in general, 
subject to what those endpoints support and permit.

SPARQL queries used across each endpoint:

select count(?s) where  {?s a foaf:Person}

select count(distinct ?s) where  {?s a foaf:Person}
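
For anyone who wants to script the "change the authority" tip, here is a minimal sketch in Python. It is not how the links in this message were produced. It assumes the endpoint lives at the path /sparql and accepts the standard "query" parameter of the SPARQL Protocol; both may differ per server. The function names are made up for illustration. The queries are written with an explicit foaf: PREFIX and the SPARQL 1.1 projection form, since other endpoints may not accept the shorthand shown above.

```python
from urllib.parse import urlencode, urlunsplit

# Explicit prefix, in case the endpoint does not predefine foaf:.
FOAF_PREFIX = "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"


def count_query(distinct=False):
    """Return a SPARQL 1.1 query counting foaf:Person subjects."""
    target = "DISTINCT ?s" if distinct else "?s"
    return FOAF_PREFIX + (
        "SELECT (COUNT(%s) AS ?n) WHERE { ?s a foaf:Person }" % target
    )


def sparql_url(authority, query, path="/sparql"):
    """Build a GET URL for the given authority.

    The "/sparql" path is an assumption; adjust it per endpoint.
    """
    return urlunsplit(("http", authority, path, urlencode({"query": query}), ""))


# Same query, different authority: only the host part changes.
for authority in ("lod.openlinksw.com", "uriburner.com"):
    print(sparql_url(authority, count_query(distinct=True)))
```

Fetching the printed URLs (in a browser or with any HTTP client) should return the count, subject to what each endpoint supports and permits.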

Links:

1. http://lod.openlinksw.com/c/CYIZZL4 -- LOD Cloud Cache
2. http://lod.openlinksw.com/c/COXER7C -- LOD Cloud Cache Distinct Count
3. http://uriburner.com/c/DYVU7N -- URIBurner
4. http://uriburner.com/c/DV6VPQ -- URIBurner Distinct Count


-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Wednesday, 13 April 2011 12:37:48 UTC
