Re: [foaf-dev] [foaf-protocols] FOAF sites offline during cleanup

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Wed, 29 Apr 2009 09:40:50 +0100
To: Kingsley Idehen <kidehen@openlinksw.com>, Peter Williams <pwilliams@rapattoni.com>
CC: Semantic Web <semantic-web@w3.org>, foaf-dev Friend of a <foaf-dev@lists.foaf-project.org>
Message-ID: <EMEW3|8ad1eaa53c56ac8e8e631dd06f1a2c13l3S9f202hg|ecs.soton.ac.uk|AC58%hg@ecs.soton.ac.uk>
Hi again.
A problem I have is that you seem to be encouraging people to use your store
to do research (for example of the sort we have been talking about on
percentage bnodes) thinking it is on LD, LOD, SW, Web of Data or whatever,
having read claims such as:
"What we have right now is the LOD-Cloud Warehouse".
Such analyses might be deeply flawed, because the researchers may not
understand your term "Warehouse", and I have seen no validation of your claim.
I don't have time to keep going to find out what you have, but a very quick
perusal of the areas I have some knowledge of suggests there is a lot
missing.
Even looking at your voiD for the rkb stuff (perhaps I am looking to do
research into voiD), I find you only have about 10 rkb stores, whereas we
publish voiD descriptions of more than 30.
Looking at the triple count, you report 19703, whereas we report (
http://southampton.rkbexplorer.com/models/void.ttl) 322555.
Also, looking for some of the other bubbles as voiD descriptions, I can't
find them. So exploring a bit (using Nick Gibbins' ECS Southampton bubble,
rather than one of mine) I find that I (21) seem to be the only URI of type
http://id.ecs.soton.ac.uk/person/11234
you have.

Don't get me wrong - I think you have done a great job getting all this
stuff together, and providing a facility for people to work with and
publicise, and it is a really interesting exercise to investigate the
interaction of the Web of Data and the Cloud.
But I am seriously concerned that people may be misled into thinking there
is more there than there is, when this will always be the nature of the
activity.
I really don't want to be reviewing/seeing papers in a few months' time where
people are presenting analysis they claim to have done of the "LOD cloud" or
similar, and they have based their data gathering on the misconception that
all they have to do is look at your cloud.

Best
Hugh



On 29/04/2009 02:31, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:

> Peter Williams wrote:
>> almost incomprehensible - to the layman. But, I believed you - to about 51%.
>>  
> The LOD-Cloud pictorials:
> 
> 1. http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-27.html
> 2.
> http://www4.wiwiss.fu-berlin.de/bizer/pub/lod-datasets_2009-03-27_colored.png
> 
> Problems:
> 
> 1. The black and white clickable version doesn't really group the bubbles
> 2.  Neither pictorial provides clarity as to what's constructed from
> physical RDF dumps (as per LOD community best practices), "on the fly"
> RDFization, or Progressive Crawling.
> 
> 
> Thus, when I say: we have a Virtuoso instance hosting the LOD-Cloud [1],
> someone can always come along and question the accuracy of the claim (as
> Hugh has just done).
> 
> In anticipation of the problem I describe above, I sought to partition
> the LOD-Cloud along the following lines: Warehouse (stuff loaded from
> dumps) and Dynamic (RDFized Data). Then I could say with accuracy, bar
> inadvertent omission, that we have an instance hosting the Warehouse
> component of the LOD-Cloud.
> 
> Hugh: So to be precise, we are claiming to host the LOD-Cloud Warehouse
> :-) The Graph Group IRIs used in the VoiD graph [2] reflect most of the
> partitioning you see in the colored pictorial re. the stuff available as
> Linking Open Data community dumps [3][4].
> 
> 
> Links:
> 
> 1. http://lod.openlinksw.com
> 2. http://lod.openlinksw.com/void/Dataset
> 3. http://esw.w3.org/topic/DataSetRDFDumps
> 4. http://esw.w3.org/topic/HCLSIG/LODD/Data
> 
> 
> Kingsley
> 
>> ________________________________________
>> From: foaf-dev-bounces@lists.foaf-project.org
>> [foaf-dev-bounces@lists.foaf-project.org] On Behalf Of Kingsley Idehen
>> [kidehen@openlinksw.com]
>> Sent: Tuesday, April 28, 2009 5:16 PM
>> To: Hugh Glaser
>> Cc: Semantic Web; foaf-dev Friend of a
>> Subject: Re: [foaf-dev] [foaf-protocols]  FOAF sites offline during cleanup
>> 
>> Hugh Glaser wrote:
>>  
>>> Hi Kingsley.
>>> It is great for people to be able to find a lot of the LOD cloud at your
>>> site, but please be careful about your claims concerning the data you have
>>> crawled from LOD.
>>> To say "our actual VoiD graph for LOD cloud" is to mislead readers into
>>> thinking that it captures more than it does.
>>> 
>>>    
>> Yes, and No.
>> 
>> Remember, I did try to partition the LOD-Cloud by Warehouse,
>> Sponged/RDFized, and Crawled, but nobody would have it.
>> 
>> What we have right now is the LOD-Cloud Warehouse. Also note, when you
>> look at the VoiD graph you are seeing Graph Group IRIs (containers of
>> Graphs that contain Triples), so you need to drill down a level or two.
>> 
>> Also, if you feel a dataset dump is missing from the LOD-Cloud
>> pictorial, please don't hesitate to hola etc..
>> 
>> BTW - I don't regard the LOD-Cloud pictorial as equivalent to the
>> Linked Data Web :-)
>> 
>> Kingsley
>>  
>>> Best
>>> Hugh
>>> 
>>> 
>>> On 28/04/2009 13:10, "Kingsley Idehen" <kidehen@openlinksw.com> wrote:
>>> As for the % re. FOAF, I think that can be determined from our actual
>>> VoiD graph for LOD cloud [1]. I don't know off the top of my head if
>>> FOAF is up to 50%.
>>> 
>>>    
>>>> To me, the "Linked" part of the name implies that crawling is a valid
>>>> tactic for gathering the data.
>>>> 
>>>>      
>>> Not disputing that, just describing what we have in the instance :-)
>>> Remember, we've sponged (crawled and RDFized) data since inception of
>>> our participation in this space.
>>> 
>>> Links:
>>> 
>>> 1. http://lod.openlinksw.com/void/Dataset
>>> 
>>> Kingsley
>>> 
>>> 
>>>    
>> 
>> 
>> --
>> 
>> 
>> Regards,
>> 
>> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>> President & CEO
>> OpenLink Software     Web: http://www.openlinksw.com
>> 
>> 
>> 
>> 
> 
> 
Received on Wednesday, 29 April 2009 08:42:08 UTC
