Re: AW: ANN: LOD Cloud - Statistics and compliance with best practices from Kingsley Idehen on 2010-10-22 (semantic-web@w3.org from October 2010)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 21 Oct 2010 20:22:49 -0400
To: Enrico Motta <e.motta@open.ac.uk>
CC: Chris Bizer <chris@bizer.de>, Martin Hepp <martin.hepp@ebusiness-unibw.org>, Thomas Steiner <tsteiner@google.com>, Semantic Web <semantic-web@w3.org>, public-lod <public-lod@w3.org>, Anja Jentzsch <anja@anjeve.de>, semanticweb <semanticweb@yahoogroups.com>, Giovanni Tummarello <giovanni.tummarello@deri.org>, Mathieu d'Aquin <m.daquin@open.ac.uk>
Message-ID: <4CC0D959.6040808@openlinksw.com>
On 10/21/10 6:45 PM, Enrico Motta wrote:
> At 15:45 -0400 21/10/10, Kingsley Idehen wrote:
>> On 10/21/10 3:23 PM, Enrico Motta wrote:
>>> Chris
>>>
>>> I strongly agree with the points made by Martin and Giovanni.  Of 
>>> course the LOD initiative has had a lot of positive impact and you 
>>> cannot be blamed for being successful, but at the same time I am 
>>> worried that the success and visibility of the LOD cloud is having 
>>> some rather serious negative consequences. Specifically:
>>>
>>> 1) lots of people, even within the SW community, now routinely 
>>> describe the LOD as the 'semantic web'.  This is not only 
>>> dramatically incorrect (and bad for students and people who want to 
>>> know about the SW) but also an obstacle to progress: anything which 
>>> is not in the LOD diagram does not exist, and this is really not 
>>> good for the SW community as a whole (including the people at the 
>>> centre of the LOD initiative).  Even worse, in the past 12-18 
>>> months  I have noticed that this viewpoint has also been embraced by 
>>> funding bodies and linking to LOD is becoming a necessary condition 
>>> for a SW project. Again, I think this is undesirable - see also 
>>> Martin's email on this thread.
>>
>> I agree, but do note (as per my earlier response) the success of the 
>> LOD cloud pictorial as marketing collateral isn't something that has 
>> arisen by deliberate exclusion actions. Methinks many have simply 
>> slapped it into their presentations devoid of actual presentation 
>> goals. This single activity has helped and hurt the LOD cloud 
>> pictorial. Hurt meaning: creating the perception you describe above.
>
> Absolutely! I never said (and I would never say) that there was any 
> deliberate exclusion. I am just pointing out that this is a negative 
> side-effect of the success of the activity.

Yes, but the misuse of the pictorial as a result of misunderstanding of 
its use cannot be the fault of the creators, no matter how one cuts it.

Ironically, the theory used to be that geeks don't know how to do 
marketing. The LOD cloud pictorial is a classic example of 
mega-effective marketing.

What people really should do is emulate the pictorial in 
purpose-specific ways that ultimately contribute to broader comprehension of 
Linked Data.

>
>
>>>
>>> 2) Because the LOD is perceived as the 'official SW' and because 
>>> resources in the LOD have to comply with a number of guidelines, 
>>> people also assume that LOD resources exhibit higher quality.
>>
>> I hope not, and I don't think so. Even if it were to be true, would 
>> you blame the production of the pictorial for that? Really though, I 
>> don't recall anyone saying: LOD pictorial is the Linked Data gospel.
>
>
> Again, there is no blaming involved. I am just saying that because 
> there is a methodology associated with LOD and methodologies are 
> normally associated with quality, people assume quality when quality 
> is not (necessarily) there.

If people make incorrect assumptions, what can one do? The only solution 
is broader community contribution. There should be many pictorials 
rather than one.

>
>>
>>> Unfortunately in our experience this is not really the case, and 
>>> this also generates negative consequences. That is, if LOD is the 
>>> 'official high quality SW ' and there are so many issues with the 
>>> data, automatically people assume that the rest of the SW is a lot 
>>> worse, even though this is not necessarily the case.
>>>
>>> So, as other people have already said, maybe it is time to 
>>> re-examine the design criteria for LOD and the way this is presented?
>>
>> But this should simply be a case of people from the community 
>> producing additional collateral. The LOD cloud has some interesting 
>> history that goes something like this:
>>
>> 1. Banff 2007 (Linked Data coming out party)  -- Chris was giving a 
>> DBpedia demo showing its inter-connectedness; TimBL then suggested that 
>> Chris turn it into a cloud with periodic updates for demonstrating 
>> growth
>>
>> 2. Richard (working with Chris at the time) picked up the challenge 
>> and refined the initial graphic
>>
>> 3. People started using it to show growth of DBpedia which also 
>> implied LOD cloud since the connections in the pictorial were reciprocal
>>
>> 4. Cloud pictorial caught fire re. powerpoint presentations + 
>> exponential effect of slideshare.
>>
>> Thus, why can't others simply emulate this process, based on respective 
>> areas of interest?
>
>
> Of course, they can.

So they should, ASAP.

>
>>
>>> For instance, it would be beneficial to the community if LOD were to 
>>> focus more on quality issues, rather than linking for the sake of 
>>> linking.
>>
>> Who is this LOD entity? You make this entity sound very much like the 
>> one represented as a burning-bush when providing instructions to Moses :-)
>
> Uhm...I know you are saying this in a jokey way, but I don't think I 
> am trying to characterise it as a burning bush.....And, unless we are 
> all dreaming, I would argue that a LOD initiative does exist......

You said: LOD should focus more on quality issues. On a serious note 
now, who is LOD? As I know it: Linked Open Data (LOD) is a community 
effort to bootstrap the Web of Linked Data via data publication 
following guidelines laid out in TimBL's meme for injecting Linked Data 
into the Web. There isn't a sole LOD entity or adjudicator.

>
>
>>>
>>>> I agree with you that it would be much better, if somebody would 
>>>> set up a
>>>> crawler, properly crawl the Web of Data and then provide a catalog 
>>>> about all
>>>> datasets.
>>>
>>> Actually this is exactly what our Watson system does, see 
>>> http://watson.kmi.open.ac.uk
>>
>> And I would assume there are APIs or even a SPARQL endpoint that 
>> would enable interested parties to generate a dynamic cloud, right?
>
> Of course, there is SPARQL and a very fine-grained and efficient API. 
> In addition, we are working on automatically generating a variety of 
> links between semantic resources, e.g., agreement/disagreement, 
> versioning, inclusion, inconsistency, etc.... - see 
> http://watson.kmi.open.ac.uk/DownloadsAndPublications_files/keod09.pdf 
> for an overview of the overall framework and 
> http://watson.kmi.open.ac.uk/DownloadsAndPublications_files/ontoqual2010.pdf 
> for an example of the approach, which focuses on characterizing and 
> automatically detecting agreement and disagreement between ontologies.

All good and exciting bar those PDF URLs :-)


Kingsley
>
>
> Enrico
>
>>
>
>


-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Friday, 22 October 2010 00:23:34 UTC