
Re: How many instances of foaf:Person are there in the LOD Cloud?

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 13 Apr 2011 17:17:10 -0400
Message-ID: <4DA612D6.8080105@openlinksw.com>
To: Michael Brunnbauer <brunni@netestate.de>
CC: Linking Open Data <public-lod@w3.org>
On 4/13/11 9:59 AM, Michael Brunnbauer wrote:
> re
>
> Here is the current top 25 for foaf-search.net (Number of RDF documents per
> second level domain). dbpedia is not included because we used the dumps and
> livejournal.com was not crawled completely. Not all RDF documents are about
> persons. We index every document containing a foaf:name or foaf:nick
> predicate.
>
> opera.com	247281
> ecademy.com	224875
> livejournal.com	221321
> identi.ca	192732
> insanejournal.com	183046
> ac.uk	176452 (mostly eprints.soton.ac.uk and eprints.ecs.soton.ac.uk)
> deadjournal.com	161659
> spin.de	131959
> rambler.ru	111119
> mybloglog.com	53633
> i.ua	43727 (narod.i.ua)
> dreamwidth.org	39471
> smart.fm	38945
> dbtune.org	36498
> bibliographica.org	34795
> rdfabout.com	30237
> rpi.edu	29913
> co.uk	25073 (mostly ordnancesurvey.co.uk)
> qdos.com	18816
> wasab.dk	18763
> sapo.pt	17029
> photozou.jp	14560
> phitter.com	11823
> openei.org	10794
> gov.uk	10589 (mostly data.gov.uk)
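
For anyone wanting to reproduce a tally like the one above, here is a minimal sketch of grouping document URLs by second-level domain. It is only an assumption about how such a count could be made, not foaf-search.net's actual code; the function names and sample URLs are hypothetical. It simply keeps the last two labels of the host name, which is why ac.uk, co.uk and gov.uk show up as single entries.

```python
from collections import Counter
from urllib.parse import urlparse


def second_level_domain(url):
    """Return the last two labels of the URL's host name (naive, no public-suffix list)."""
    host = (urlparse(url).hostname or "").lower()
    labels = [label for label in host.split(".") if label]
    return ".".join(labels[-2:])


def count_by_domain(urls):
    """Count how many URLs fall under each second-level domain."""
    return Counter(second_level_domain(u) for u in urls)


# Hypothetical sample URLs, for illustration only.
sample = [
    "http://eprints.soton.ac.uk/id/person/1",
    "http://eprints.ecs.soton.ac.uk/id/person/2",
    "http://my.opera.com/someone/xml/foaf",
]
counts = count_by_domain(sample)
for domain, n in counts.most_common():
    print(f"{domain}\t{n}")
```

A public-suffix-aware split would separate soton.ac.uk from other ac.uk hosts, but that would not match the grouping used in the list above.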

How about a publicly shared Google spreadsheet doc? From there to Linked Data 
is a short journey :-)
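
To illustrate how short that journey can be, here is a rough sketch that turns tab-separated "domain, count" rows (as they might be exported from such a spreadsheet) into N-Triples. The example.org namespace, the rdfDocumentCount property and the function name are all made up for illustration; a real publication would pick an established vocabulary.

```python
import csv
import io

# Hypothetical namespaces; the thread does not name a vocabulary.
BASE = "http://example.org/foaf-stats/domain/"
COUNT_PROP = "http://example.org/foaf-stats/vocab#rdfDocumentCount"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"


def rows_to_ntriples(tsv_text):
    """Emit one N-Triples statement per 'domain<TAB>count' row."""
    lines = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2 or not row[1].strip():
            continue  # skip blank or incomplete rows
        domain = row[0].strip()
        # Keep only the leading number; drop notes such as "(mostly ...)".
        count = int(row[1].split()[0])
        lines.append(f'<{BASE}{domain}> <{COUNT_PROP}> "{count}"^^<{XSD_INT}> .')
    return "\n".join(lines)


tsv = (
    "opera.com\t247281\n"
    "ac.uk\t176452 (mostly eprints.soton.ac.uk and eprints.ecs.soton.ac.uk)\n"
)
print(rows_to_ntriples(tsv))
```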


Kingsley
> Regards,
>
> Michael Brunnbauer
>
> On Wed, Apr 13, 2011 at 01:37:48PM +0200, Mischa Tuffield wrote:
>> Hi All,
>>
>> I was looking at the number of FOAF files on the web over a year ago now; the output looks like this:
>>
>> http://mmt.me.uk/slides/lod24022010/#(16)
>>
>> Mischa
>> On 13 Apr 2011, at 12:11, Melvin Carvalho wrote:
>>
>>> On 13 April 2011 10:54, Michael Brunnbauer <brunni@netestate.de> wrote:
>>>> re
>>>>
>>>> On Wed, Apr 13, 2011 at 10:15:46AM +0200, Bernard Vatant wrote:
>>>>> Just trying to figure what is the size of personal information available as
>>>>> LOD vs billions of person profiles stored by Google, Amazon, Facebook,
>>>>> LinkedIn, unameit ... in proprietary formats.
>>>> At www.foaf-search.net, we have approx. 3.5 million instances of foaf:Person.
>>>>
>>>> The biggest chunk out there is probably livejournal.com, with more than 25 million
>>>> users, not all of whom we can index right now (we have 221090 of them).
>>>>
>>>> Another big one is hi5.com but the FOAF is quite broken so we don't crawl it.
>>> gmail at one point were publishing foaf profiles ... so that's quite a few more
>>>
>>> facebook graph is not quite foaf but certainly machine readable JSON,
>>> and could easily be transformed to FOAF, so that's another chunk
>>>
>>> there are a few bridges too, such as the ones for last.fm, Flickr and Semantic Tweet
>>>
>>> So including bridges I'd guess 250 million; 99% should be alive today,
>>> but that number will fall over time (obviously)
>>>
>>>> See also:
>>>>
>>>> http://www.w3.org/wiki/FoafSites
>>>> http://wiki.foaf-project.org/w/DataSources
>>>>
>>>> Regards,
>>>>
>>>> Michael Brunnbauer
>>>>
>>>> --
>>>> ++  Michael Brunnbauer
>>>> ++  netEstate GmbH
>>>> ++  Geisenhausener Straße 11a
>>>> ++  81379 München
>>>> ++  Tel +49 89 32 19 77 80
>>>> ++  Fax +49 89 32 19 77 89
>>>> ++  E-Mail brunni@netestate.de
>>>> ++  http://www.netestate.de/
>>>> ++
>>>> ++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
>>>> ++  USt-IdNr. DE221033342
>>>> ++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
>>>> ++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
>>>>
>>>>
>
>


-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Wednesday, 13 April 2011 21:17:35 UTC
