
Re: How many instances of foaf:Person are there in the LOD Cloud?

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Wed, 13 Apr 2011 18:12:05 +0200
Message-ID: <BANLkTimsaPTJ+-R=yQuTNxX5YcRsPVqqpA@mail.gmail.com>
To: Bernard Vatant <bernard.vatant@mondeca.com>
Cc: Linking Open Data <public-lod@w3.org>
To add to this, our internal sources report the following for xmlns.com/foaf/0.1/Person:

totalReferences (number of triples involving a foaf:Person): 964563435
(almost a billion, obviously not unique individuals)
graphReferences (number of pages / resolvable URLs / graphs): 34915501
domainReferences (number of distinct domains): 3439696
sldReferences (number of distinct second-level domains; aggregates
foo.example.com, foo2.example.com, etc.): 69004
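
To illustrate how the last three counters relate to each other, here is a minimal sketch in Python. It is not how the index computes them; the function names and the sample URLs are made up for illustration, and the second-level domain is approximated naively as the last two labels of the host name (which gets suffixes such as co.uk wrong; a real implementation would probably consult the Public Suffix List).

```python
from urllib.parse import urlparse


def naive_sld(url: str) -> str:
    """Return the last two labels of the host name (hypothetical helper).

    Assumption: this ignores multi-label public suffixes such as co.uk.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def count_references(graph_urls):
    """Count distinct graphs, domains, and second-level domains."""
    graphs = set(graph_urls)
    domains = {urlparse(u).hostname for u in graphs}
    slds = {naive_sld(u) for u in graphs}
    return {
        "graphReferences": len(graphs),
        "domainReferences": len(domains),
        "sldReferences": len(slds),
    }


# Made-up sample: four pages on three hosts under two second-level domains.
urls = [
    "http://foo.example.com/alice",
    "http://foo.example.com/bob",
    "http://foo2.example.com/carol",
    "http://example.org/dave",
]
counts = count_references(urls)
print(counts)
```

In this sample foo.example.com and foo2.example.com count as two domains but collapse into the single second-level domain example.com, which is why sldReferences is so much smaller than domainReferences in the figures above.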

I think fakefriends.me accounts for a lot of false occurrences (we
have banned it now, but some data is still there), but other than that
... enjoy :)

cheers

On Wed, Apr 13, 2011 at 4:48 PM, Giovanni Tummarello
<giovanni.tummarello@deri.org> wrote:
> The sindice.com main index has 37,312,159 documents with occurrences of foaf:person.
>
> http://sindice.com/search?q=foaf%3Aperson
> (a lot of these come from microformats via the any23 library but anyway)
>
> which means there are many more actual persons inside.
>
> Gio
>
>
> On Wed, Apr 13, 2011 at 10:15 AM, Bernard Vatant
> <bernard.vatant@mondeca.com> wrote:
>> Hello all
>>
>> Just trying to figure what is the size of personal information available as
>> LOD vs billions of person profiles stored by Google, Amazon, Facebook,
>> LinkedIn, unameit ... in proprietary formats.
>>
>> Any hint of the proportion of "living" people vs historical characters is
>> also welcome.
>>
>> Any idea?
>>
>> Bernard
>>
>>
>> --
>> Bernard Vatant
>> Senior Consultant
>> Vocabulary & Data Integration
>> Tel:       +33 (0) 971 488 459
>> Mail:     bernard.vatant@mondeca.com
>> ----------------------------------------------------
>> Mondeca
>> 3, cité Nollez 75018 Paris France
>> Web:    http://www.mondeca.com
>> Blog:    http://mondeca.wordpress.com
>> ----------------------------------------------------
>>
>
Received on Wednesday, 13 April 2011 16:12:34 UTC
