
RE: How many instances of foaf:Person are there in the LOD Cloud?

From: Hogan, Aidan <aidan.hogan@deri.org>
Date: Wed, 13 Apr 2011 23:21:43 +0100
Message-ID: <316ADBDBFE4F4D4AA4FEEF7496ECAEF905AB2D29@EVS1.ac.nuigalway.ie>
To: "Bernard Vatant" <bernard.vatant@mondeca.com>
Cc: "Linking Open Data" <public-lod@w3.org>
> So tonight I would turn my question otherwise : Among those millions
> FOAF profiles, how do I discover those of which primary source is
> primary topic, expressing herself natively in FOAF, vs the ocean of
> second-hand remashed / remixed information, captured with or without
> approbation of their subjects, and eventually released in FOAF syntax
> the Cloud ...

Can't think of a single definitive solution that would work for all
scenarios, but there are a few heuristics you could try:
 - filter all big FOAF exporters (there are not *so* many and you
already have a good list to start from);
 - look for FOAF files with the FOAF-a-Matic generatorAgent;
 - look for FOAF documents which have a dc:creator/foaf:maker relation
to a person in the document (or a foaf:made in the other direction);
 - look at the "entropy" of URIs which are the objects of foaf:knows
relations... e.g., are they commonly in different "namespaces"?
 - look for rare "geek" properties like foaf:tipjar, foaf:myersBriggs,
or foaf:dnaChecksum;
 - ...
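
To make the "entropy" heuristic a bit more concrete, here is a minimal
Python sketch. It assumes the document has already been parsed into
(subject, predicate, object) string tuples, and it treats "namespace" as
simply scheme plus host; both the function names and that definition of
namespace are my own assumptions, and other splits would work too.

```python
import math
from collections import Counter
from urllib.parse import urlsplit

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

def namespace_of(uri):
    """Crude 'namespace': scheme plus host (an assumption, not a standard)."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}"

def knows_entropy(triples):
    """Shannon entropy (in bits) of the namespaces of foaf:knows objects.

    Objects that are not HTTP(S) URIs (e.g. blank nodes) are skipped.
    """
    counts = Counter(
        namespace_of(o)
        for s, p, o in triples
        if p == FOAF_KNOWS and o.startswith(("http://", "https://"))
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical data: one profile linking to friends on four different hosts.
me = "http://example.org/me"
spread_out = [
    (me, FOAF_KNOWS, "http://a.example.com/alice#me"),
    (me, FOAF_KNOWS, "http://b.example.net/bob#i"),
    (me, FOAF_KNOWS, "http://c.example.org/carol"),
    (me, FOAF_KNOWS, "http://d.example.edu/dave"),
]
# Hypothetical data: an exporter where every friend lives on the same site.
same_site = [
    (me, FOAF_KNOWS, f"http://bigsite.example.com/user/{i}") for i in range(4)
]
```

A profile whose friends all sit on one host scores zero, while links
spread over several hosts score higher; where to put the cut-off is
something you would have to tune on your own data.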

...there are many other tell-tale signs you could look at. I guess it
depends on what kind of precision/recall you need, but these should get
you a good bit of the way.
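
As a rough illustration of the precision/recall trade-off, the sketch
below counts how many of three signals fire for a document and compares
the count against a threshold. This is only one possible way to combine
them, not a tested recipe. The triples are again plain string tuples;
the function names, the threshold, the dc: and admin: namespace URIs and
the "foaf-a-matic" substring match are all assumptions on my part.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"    # assumed dc: namespace
GENERATOR_AGENT = "http://webns.net/mvcb/generatorAgent"  # assumed admin: namespace
# Rare "geek" properties (spelled dnaChecksum in the FOAF vocabulary).
RARE_PROPERTIES = {FOAF + "tipjar", FOAF + "myersBriggs", FOAF + "dnaChecksum"}

def signals(doc_uri, triples):
    """Return which of the three tell-tale signs are present in a document."""
    triples = set(triples)
    people = {s for s, p, o in triples if p == RDF_TYPE and o == FOAF + "Person"}
    # Generator value mentions FOAF-a-Matic (substring match is a guess).
    generator = any(
        p == GENERATOR_AGENT and "foaf-a-matic" in o.lower()
        for s, p, o in triples
    )
    # Document is made/created by a person it describes, or that person
    # claims foaf:made on the document.
    self_made = any(
        (s == doc_uri and p in (DC_CREATOR, FOAF + "maker") and o in people)
        or (s in people and p == FOAF + "made" and o == doc_uri)
        for s, p, o in triples
    )
    rare = any(p in RARE_PROPERTIES for s, p, o in triples)
    return {"generator": generator, "self_made": self_made, "rare": rare}

def looks_native(doc_uri, triples, threshold=1):
    """Low threshold favours recall; high threshold favours precision."""
    return sum(signals(doc_uri, triples).values()) >= threshold

# Hypothetical hand-written profile.
doc = "http://example.org/foaf.rdf"
person = doc + "#me"
profile = [
    (person, RDF_TYPE, FOAF + "Person"),
    (doc, FOAF + "maker", person),
    (person, FOAF + "tipjar", "http://example.org/tipjar"),
]
```

With the hypothetical profile above, two of the three signals fire, so
it passes with a threshold of 1 or 2 but not 3.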

Good hunting!

Received on Wednesday, 13 April 2011 22:22:16 UTC
