Re: Think before you write Semantic Web crawlers

On Wed, Jun 22, 2011 at 8:29 PM, Andreas Harth <andreas@harth.org> wrote:
> Hi Martin,
>
> On 06/22/2011 09:08 PM, Martin Hepp wrote:
>>
>> Please make a survey among typical Web site owners on how many of them have
>>
>> 1. access to this level of server configuration and
>>
>> 2. the skills necessary to implement these recommendations.
>
> Agreed.
>
> But in the case we're discussing, there's also:
>
> 3. publishes millions of pages
>
> I am glad you brought up the issue, as there are several data providers
> out there (some with quite prominent names) with hundreds of millions of
> triples, but unable to sustain lookups every couple of seconds or so.

Very funny :-) At peak times, a single crawler was hitting us with 150
requests/s. Quite far from "every couple of seconds or so".

Best,
y

>
> I am very much in favour of amateur web enthusiasts (I would like to claim
> I've started as one).  Unfortunately, you get them on both ends, publishers
> and consumers.  Postel's law applies to both, I guess.
>
> Best regards,
> Andreas.
>
>

Received on Wednesday, 22 June 2011 19:39:14 UTC