
Re: Think before you write Semantic Web crawlers

From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Date: Wed, 22 Jun 2011 21:08:44 +0200
Cc: Christopher Gutteridge <cjg@ecs.soton.ac.uk>, Daniel Herzig <herzig@kit.edu>, semantic-web@w3.org, public-lod@w3.org
Message-Id: <EC158417-582E-4B0C-A5A0-ECEE56D9F600@ebusiness-unibw.org>
To: Andreas Harth <andreas@harth.org>
Hi Andreas:

Please survey typical Web site owners and find out how many of them have

1. access to this level of server configuration and
2. the skills necessary to implement these recommendations.

The WWW was anti-pedantic by design. This was the root of its success. The pedants were the traditional SGML/Hypertext communities. Why are we breeding new pedants?


On Jun 22, 2011, at 11:44 AM, Andreas Harth wrote:

> Hi Christopher,
> On 06/22/2011 10:14 AM, Christopher Gutteridge wrote:
>> Right now queries to data.southampton.ac.uk (eg.
>> http://data.southampton.ac.uk/products-and-services/CupCake.rdf ) are made live,
>> but this is not efficient. My colleague, Dave Challis, has prepared a SPARQL
>> endpoint which caches results which we can turn on if the load gets too high,
> which should at least mitigate the problem. Very few datasets change in a
> 24-hour period.
> Setting the Expires header and enabling mod_cache in Apache httpd (or adding
> a Squid proxy in front of the HTTP server) works quite well in these cases.
> Best regards,
> Andreas.
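
For readers who want to try Andreas's suggestion, a minimal Apache httpd sketch might look like the following. The module layout assumes Apache 2.4 (older 2.2 installs use mod_disk_cache instead of mod_cache_disk), and the cache path and 24-hour lifetime are illustrative choices, not values from the thread:

```apache
# Illustrative httpd.conf / vhost fragment: cache responses on disk and
# tell clients (and crawlers) they may reuse them for 24 hours.
LoadModule cache_module modules/mod_cache.so
LoadModule cache_disk_module modules/mod_cache_disk.so
LoadModule expires_module modules/mod_expires.so

<IfModule mod_cache.c>
    # Serve repeat requests from the disk cache instead of the live backend.
    CacheEnable disk /
    CacheRoot /var/cache/apache2/mod_cache_disk
</IfModule>

<IfModule mod_expires.c>
    ExpiresActive On
    # Datasets rarely change within a day, so mark RDF as fresh for 24 hours.
    ExpiresByType application/rdf+xml "access plus 24 hours"
</IfModule>
```

With an Expires header in place, a well-behaved crawler (or an intermediate Squid proxy) can revalidate instead of re-fetching, which is exactly the load reduction discussed above.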
Received on Wednesday, 22 June 2011 19:09:17 UTC
