- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Wed, 29 Apr 2009 09:57:01 -0400
- To: Steve Harris <steve.harris@garlik.com>
- CC: Semantic Web <semantic-web@w3.org>, foaf-dev Friend of a <foaf-dev@lists.foaf-project.org>
Steve Harris wrote:
> On 29 Apr 2009, at 14:06, Kingsley Idehen wrote:
>> Steve,
>>
>> If we isolate the "FOAF Profiles" bubble of the LOD-Cloud pictorial,
>> would you say these sources are representative:
>>
>> 1. http://esw.w3.org/topic/FoafSites
>> 2. http://pingthesemanticweb.com (PTSW)
>> 3. http://sindice.com
>
> It seems highly unlikely.
>
> The only way to get a representative sample is to select some of the
> data randomly.

Okay. So I end this thread by asking: isn't that basically what we have in our instance? Its data comes from the sources above plus others.

> ESW links a human-curated selection of sites, PTSW gets fed similarly
> and Sindice crawled, IIUC.
>
> I don't think anyone even has a good idea of how many FOAF files are
> out there, to know if they have a good selection or not. I think we
> have 12 million or so unique ones, but we know there's an awful lot
> more out there.
>
> On top of that, "FOAF" is especially vague, e.g. do qdos.com profiles
> (e.g. http://qdos.com/user/Steve-Harris/18b6f60b41e05aaa418565ebfe901d6b/turtle)
> count as FOAF profiles? They have foaf:People in them, and use one or two
> foaf properties, but foaf: is not the most common prefix.
>
> What about DOAP files with lots of FOAF in them? Some use foaf: more
> than doap:, and so on.

DOAP files are picked up from PTSW and a few other data sets that use FOAF.

Maybe we chat by phone or private IM (IRC, Twitter, Identi.ca, etc.) about this?

Kingsley

>
> - Steve
>

--
Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software     Web: http://www.openlinksw.com
Received on Wednesday, 29 April 2009 13:57:38 UTC