Re: Think before you write Semantic Web crawlers

On 6/22/11 8:08 PM, Martin Hepp wrote:
> Hi Andreas:
> Please make a survey among typical Web site owners on how many of them have
> 1. access to this level of server configuration and
> 2. the skills necessary to implement these recommendations.

Okay, I think that answers the question from my last post :-)


> The WWW was anti-pedantic by design.

I would say "deceptively simple". Basically, there are complexities to 
AWWW, but they never hit you at the front door during initial engagement. 
The great thing about "deceptively simple" is that this kind of systems 
architecture ultimately delivers pleasant surprises. This is also why 
turning RDF into a Linked Data distraction sets me off, big time! The 
AWWW at its core already had the mechanism for Linked Data built in, via 
the use of hyperlinks for whole data representation.
> This was the root of its success.


> The pedants were the traditional SGML/Hypertext communities. Why are we breeding new pedants?

I don't know :-)

> Martin
> On Jun 22, 2011, at 11:44 AM, Andreas Harth wrote:
>> Hi Christopher,
>> On 06/22/2011 10:14 AM, Christopher Gutteridge wrote:
>>> Right now queries to (eg.
>>> ) are made live,
>>> but this is not efficient. My colleague, Dave Challis, has prepared a SPARQL
>>> endpoint which caches results which we can turn on if the load gets too high,
>>> which should at least mitigate the problem. Very few datasets change in a
>>> 24-hour period.
>> setting the Expires header and enabling mod_cache in Apache httpd (or adding
>> a Squid proxy in front of the HTTP server) works quite well in these cases.
>> Best regards,
>> Andreas.
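
To make the caching suggestion quoted above a bit more concrete, here is a rough sketch in Python. It is not Apache or Squid configuration and not the endpoint Dave Challis prepared; it only illustrates the two ideas in the thread: emitting an Expires header and reusing a cached result for up to 24 hours. The names caching_headers, TTLCache and CACHE_LIFETIME are made up for illustration, and the 24-hour lifetime is an assumption taken from the remark that very few datasets change within a day.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Assumption: results may be reused for a day, per the "24-hour period" remark.
CACHE_LIFETIME = timedelta(hours=24)


def caching_headers(now=None, lifetime=CACHE_LIFETIME):
    """Build Expires and Cache-Control headers allowing reuse for `lifetime`."""
    now = now or datetime.now(timezone.utc)
    expires = now + lifetime
    return {
        # HTTP dates are expressed in GMT; `now` must be an aware UTC datetime.
        "Expires": format_datetime(expires, usegmt=True),
        "Cache-Control": f"public, max-age={int(lifetime.total_seconds())}",
    }


class TTLCache:
    """Hypothetical in-memory cache keyed by query string."""

    def __init__(self, lifetime=CACHE_LIFETIME):
        self.lifetime = lifetime
        self._store = {}

    def get(self, key, now):
        """Return the cached value, or None if it is missing or has expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if now - stored_at >= self.lifetime:
            del self._store[key]  # expired: drop it so the caller re-queries
            return None
        return value

    def put(self, key, value, now):
        """Store a value along with the time it was produced."""
        self._store[key] = (now, value)
```

A front-end proxy honouring these headers would serve repeat requests itself instead of passing them to the live endpoint, which is the effect mod_cache or Squid gives you without touching application code.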



Kingsley Idehen
President & CEO
OpenLink Software
Twitter/ kidehen

Received on Wednesday, 22 June 2011 19:20:11 UTC