
Re: Resolving DC PURLs

From: Peter Ansell <ansell.peter@gmail.com>
Date: Tue, 30 Mar 2010 10:16:20 +1000
Message-ID: <a1be7e0e1003291716t191c09d4g9868a3f87095c3b9@mail.gmail.com>
To: David Wood <david@zepheira.com>
Cc: Thomas Baker <tbaker@tbaker.de>, DCMI Architecture Forum <dc-architecture@jiscmail.ac.uk>, public-lod <public-lod@w3.org>

On 30 March 2010 09:22, David Wood <david@zepheira.com> wrote:
> Hi Tom,
>
> On Friday, 26 March, Ian Davis (CTO of Talis) reported that purl.org was rejecting some requests to Dublin Core PURLs [1].  I asked OCLC to increase the number of threads used by their PURL server and the maximum number of concurrent connections.  They complied this afternoon.
>
> At the time of the failures, OCLC reported that purl.org was rejecting between 20 and 60 requests per second for DC terms [2].
>
> It would seem that Linked Data clients are becoming more prevalent and that some of them are not particularly well behaved.  Perhaps we need to raise awareness of the importance of caching.
>
> This incident points out the criticality of DC terms to the Linked Data community and the fragility of a single point of failure such as purl.org.  The PURL Federation development recently announced by NCBO and Zepheira may eventually serve to remove the single point of failure, but the criticality of service is likely to get worse with time.
>
> This message is simply an advisory to the DC community of some practical issues arising from the use of DC terms by the Linked Data community and requires no immediate action.  I would ask, though, that awareness of these issues be kept in mind as Dublin Core considers the management of its identifiers.
>
> [1]  http://twitter.com/IanD
> [2]  Personal correspondence from Tom Dehn at OCLC
> [3]  http://zepheira.com/publications/news/#PURLFederationDevelopment
>

For popular ontologies, Linked Data browsers could be distributed with
copies of the vocabularies embedded, so they would almost never need to
retrieve them on the fly. Ontologies won't be the only source of this
problem, though. Linked Data is designed to distribute information at
the finest granularity possible, which necessarily increases the
bandwidth required to transport a number of pieces of information
related to different things.
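
As a rough illustration of the embedded-copy idea, here is a minimal
Python sketch. The names VocabularyStore, BUNDLED_VOCABULARIES and
namespace_of are hypothetical, the bundled documents are placeholder
strings rather than real RDF, and the fetch function is supplied by the
caller, so this shows the lookup order only and is not how any existing
browser works.

```python
from typing import Callable, Dict, Optional

# Hypothetical snapshot shipped with the client: namespace URI -> document.
# A real browser would embed full RDF documents; these are placeholders.
BUNDLED_VOCABULARIES: Dict[str, str] = {
    "http://purl.org/dc/terms/": "<bundled copy of DC terms>",
    "http://purl.org/dc/elements/1.1/": "<bundled copy of DC elements>",
}


def namespace_of(term_uri: str) -> str:
    """Return the namespace part of a term URI (up to the last '#' or '/')."""
    cut = max(term_uri.rfind("#"), term_uri.rfind("/"))
    return term_uri[: cut + 1]


class VocabularyStore:
    """Resolve vocabulary documents: bundled copy, then cache, then network."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        bundled: Optional[Dict[str, str]] = None,
    ) -> None:
        self._fetch = fetch  # assumed to dereference a namespace URI
        self._bundled = dict(BUNDLED_VOCABULARIES if bundled is None else bundled)
        self._cache: Dict[str, str] = {}

    def lookup(self, term_uri: str) -> str:
        ns = namespace_of(term_uri)
        if ns in self._bundled:
            # Popular vocabulary: no request ever reaches purl.org.
            return self._bundled[ns]
        if ns not in self._cache:
            # Unknown vocabulary: fetch once, then reuse for every term in it.
            self._cache[ns] = self._fetch(ns)
        return self._cache[ns]
```

With a store like this, looking up http://purl.org/dc/terms/title is
answered from the bundled copy, and an unbundled vocabulary is fetched
at most once per namespace rather than once per term. A real client
would also need some way to refresh stale bundled copies, which is left
out here.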

Redirection services are the cheapest services in the Linked Data
chain in terms of both bandwidth and processing. What happens when one
of the actual data services experiences this level of popularity?

Cheers,

Peter
Received on Tuesday, 30 March 2010 00:16:48 UTC
