Re: Think before you write Semantic Web crawlers

On 6/23/11 12:42 PM, Dieter Fensel wrote:
> At 01:32 PM 6/23/2011, Sebastian Schaffert wrote:
>> I am very well aware of the problem of adoption. At the same time, we 
>> have a similar problem not only in the publication of the data but 
>> also in the consumption: if we do not let users consume our data even 
>> in large scale, what use is the data at all? I agree that bombarding 
>> a server with crawlers just for harvesting as many triples as 
>> possible without thinking about their use is stupid. But it will 
>> always happen, no matter how many mails we have on the Linked Data 
>> mailinglist.
>
> Yes, however major conferences such as ESWC, ISWC, and WWW could
> define guidelines for their paper submissions and actually reject
> papers that are based on denial of service attacks.

No, no.

We should build better Linked Data platforms (consumer and server side).

A Linked Data solution modulo WebID is a poor Linked Data solution,
really. We can't have it both ways. AWWW (Architecture of the World Wide
Web) is there to be used; that's why I keep referring back to it as a
solid architecture with inherent flexibility, all delivered via
"deceptively simple" principles.

If anything, the whole AWWW is so "deceptively simple" that people think 
the modus operandi for Web solution development has to be "simply 
simple" :-(

Smart Linked Data solutions are vital to the long-term success of Linked
Data and the data space dimension of the WWW that it unveils. WebID is a
cheap route, just as HTTP URIs are cheap routes to global identifiers
that resolve to representations of their referents.

> It became a trend to very much focus on size and invite people to
> evaluate their results in this respect.

Yes, because Linked Data is a Big Data play :-) Size will always matter
since one needs to find the proverbial needle in the haystack (a massive
Linked Data mesh at InterWeb scale).

> In the same way we could define certain criteria that excludes dos
> attacks for achieving this.

Won't work, and IMHO it's a cop-out for those who are supposed to thrive
on solving problems in the academic realm.

The day we decided to make data access technology (circa 1992) is the
same day we put ACLs on the table, assuming that's what everyone else
would be doing. It inadvertently became a USP in the then world of
ODBC-based data access.

Linked Data provides much more granular access than ODBC, but in doing
so it also ups the ante on the ACL problem, exponentially so given
InterWeb scale.
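
To make the granularity point concrete, here is a rough sketch of an ACL
keyed on WebIDs at the level of individual resources. It is an
illustration only: the AccessPolicy class, the example URIs, and the
resource names are all hypothetical, and the step that actually verifies
a WebID (checking the client's TLS certificate against the profile
document) is assumed to have happened already and is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class AccessPolicy:
    """Hypothetical per-resource ACL keyed on WebID URIs (illustration only)."""

    # Resources anyone may read, including anonymous crawlers.
    public: Set[str] = field(default_factory=set)
    # Maps an already-verified WebID URI to the resources it may read.
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def allow(self, webid: str, resource: str) -> None:
        """Grant read access on one resource to one WebID."""
        self.grants.setdefault(webid, set()).add(resource)

    def can_read(self, webid: Optional[str], resource: str) -> bool:
        """Return True if the (possibly anonymous) agent may read the resource."""
        if resource in self.public:
            return True
        if webid is None:
            # Anonymous agents only ever see public resources.
            return False
        return resource in self.grants.get(webid, set())


# Assumed example data: none of these URIs come from the thread.
policy = AccessPolicy(public={"http://example.org/data/summary"})
policy.allow("http://example.org/people/alice#me",
             "http://example.org/data/full-dump")
```

An anonymous crawler calling can_read(None, ...) gets the public summary
and nothing else, while the identified agent also gets the full dump.
The ODBC-era equivalent would be a grant per table or view; here it is a
grant per resource URI, which is where the scale problem comes from.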

> Obviously it will not stop all people in the wild out there but it
> would at least prevent the core of the academic semantic web
> community to burden their own technological achievements.
>

WebID will kill it off pronto :-)



-- 

Regards,

Kingsley Idehen	
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

Received on Thursday, 23 June 2011 12:51:44 UTC