Re: How many instances of foaf:Person are there in the LOD Cloud?

From: Melvin Carvalho <melvincarvalho@gmail.com>
Date: Wed, 13 Apr 2011 12:11:23 +0200
Message-ID: <BANLkTi=EpWUAJG_ns=Gw30xm7-t-DDvdDg@mail.gmail.com>
To: Michael Brunnbauer <brunni@netestate.de>
Cc: Bernard Vatant <bernard.vatant@mondeca.com>, Linking Open Data <public-lod@w3.org>
On 13 April 2011 10:54, Michael Brunnbauer <brunni@netestate.de> wrote:
>
> re
>
> On Wed, Apr 13, 2011 at 10:15:46AM +0200, Bernard Vatant wrote:
>> Just trying to figure what is the size of personal information available as
>> LOD vs billions of person profiles stored by Google, Amazon, Facebook,
>> LinkedIn, unameit ... in proprietary formats.
>
> At www.foaf-search.net, we have ca. 3.5 mio instances of foaf:Person.
>
> The biggest chunk out there is probably livejournal.com with more than 25mio
> users which we cannot index all right now (we have 221090 of them).
>
> Another big one is hi5.com but the FOAF is quite broken so we don't crawl it.

Gmail was publishing FOAF profiles at one point ... so that's quite a few more

The Facebook Graph is not quite FOAF, but it is machine-readable JSON
and could easily be transformed to FOAF, so that's another chunk
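
As a rough illustration of that kind of transformation, here is a minimal
sketch that turns one Graph-style JSON profile into FOAF Turtle. The JSON keys
(id, name, first_name, last_name, link), the example.org subject URIs and the
function names are assumptions made for the example, not a description of any
existing bridge.

```python
import json

FOAF_PREFIX = "@prefix foaf: <http://xmlns.com/foaf/0.1/> ."

# Assumed mapping from Graph-style JSON keys to FOAF properties.
LITERAL_FIELDS = {
    "name": "foaf:name",
    "first_name": "foaf:givenName",
    "last_name": "foaf:familyName",
}


def turtle_literal(value):
    """Quote a value as a Turtle string literal.

    json.dumps escapes quotes, backslashes and control characters using
    escape sequences that Turtle also accepts.
    """
    return json.dumps(str(value), ensure_ascii=False)


def graph_json_to_turtle(profile, base="http://example.org/people/"):
    """Render one profile dict as a foaf:Person description in Turtle.

    `base` is a hypothetical namespace for minting person URIs.
    """
    subject = f"<{base}{profile['id']}#me>"
    statements = ["a foaf:Person"]
    for key, prop in LITERAL_FIELDS.items():
        if key in profile:
            statements.append(f"{prop} {turtle_literal(profile[key])}")
    if "link" in profile:
        # The profile URL is treated as a page about the person.
        statements.append(f"foaf:page <{profile['link']}>")
    body = " ;\n    ".join(statements)
    return f"{FOAF_PREFIX}\n\n{subject} {body} .\n"


if __name__ == "__main__":
    sample = {"id": "42", "name": "Ada Example", "link": "http://example.org/ada"}
    print(graph_json_to_turtle(sample))
```

Keys missing from the input are simply skipped, so partial public profiles
still yield a valid (if sparse) foaf:Person description.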

There are a few bridges too, such as the ones for last.fm, Flickr and Semantic Tweet

So, including bridges, I'd guess 250 million. 99% should be alive today,
but that number will fall over time (obviously)

>
> See also:
>
> http://www.w3.org/wiki/FoafSites
> http://wiki.foaf-project.org/w/DataSources
>
> Regards,
>
> Michael Brunnbauer
>
> --
> ++  Michael Brunnbauer
> ++  netEstate GmbH
> ++  Geisenhausener Straße 11a
> ++  81379 München
> ++  Tel +49 89 32 19 77 80
> ++  Fax +49 89 32 19 77 89
> ++  E-Mail brunni@netestate.de
> ++  http://www.netestate.de/
> ++
> ++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
> ++  USt-IdNr. DE221033342
> ++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
> ++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
>
>
Received on Wednesday, 13 April 2011 10:11:51 UTC