Re: CPAN as linked data

On Wed, 2010-04-07 at 15:20 +0100, Toby Inkster wrote:
> This lunch time I threw together a quick Linked Data wrapper around
> CPAN::SQLite.

And today I've added a few updates. The most immediately apparent is
that all data is now available as RDFa.

> Example author:
> 
>   http://purl.org/NET/cpan-uri/person/gwilliams
> 
> Example distribution:
> 
>   http://purl.org/NET/cpan-uri/dist/RDF-Trine/project
> 
> Example version:

http://purl.org/NET/cpan-uri/dist/RDF-Trine/v_0-124

> If you visit the page of a version that cannot be found on CPAN, then the
> wrapper naively assumes that it does/will/did exist, so serves up a page
> anyway:
> 
>   http://purl.org/NET/cpan-uri/dist/RDF-Trine/v_9-999

This "feature" has been fixed. It now uses the BackPAN archive as an
authoritative list of released versions.
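
A minimal sketch of that kind of check, in Python, not the wrapper's
actual code. It assumes the BackPAN listing is available as a list of
archive paths ending in "<dist>-<version>.tar.gz"; the function names,
the regular expression, and the sample paths are all illustrative
assumptions.

```python
import re

# Assumed filename shape: "<dist>-<version>.tar.gz" (also .tgz or .zip).
# Real archives have more irregular names; this pattern is a simplification.
ARCHIVE_RE = re.compile(
    r"(?P<dist>[A-Za-z0-9_-]+?)-v?(?P<version>\d[\d._]*)\.(?:tar\.gz|tgz|zip)$"
)


def released_versions(paths):
    """Map each distribution name to the set of versions found in paths."""
    released = {}
    for path in paths:
        filename = path.rsplit("/", 1)[-1]  # drop the directory part
        match = ARCHIVE_RE.match(filename)
        if match:
            released.setdefault(match.group("dist"), set()).add(
                match.group("version")
            )
    return released


def version_is_released(released, dist, version):
    """True only if this exact version appears in the archive listing."""
    return version in released.get(dist, set())


# Illustrative listing (hypothetical sample data, not a real index dump).
sample_paths = [
    "authors/id/G/GW/GWILLIAMS/RDF-Trine-0.124.tar.gz",
    "authors/id/T/TO/TOBYINK/RDF-TrineShortcuts-0.100.tar.gz",
]
index = released_versions(sample_paths)
print(version_is_released(index, "RDF-Trine", "0.124"))
print(version_is_released(index, "RDF-Trine", "9.999"))
```

With a lookup like this, a request for an unknown version such as
v_9-999 can be answered with "not found" rather than a guessed page.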

> If module authors use Module::Install::DOAPChangeSets and include a
> Changes.ttl file in their distribution, then the project data will
> automatically pick up data from Changes.ttl:
> 
>   http://purl.org/NET/cpan-uri/dist/RDF-TrineShortcuts/project

One last change: there are now URIs defined for Perl modules too. (In
Perl/CPAN terms, a single CPAN distribution provides zero or more
modules.)

e.g.

http://purl.org/NET/cpan-uri/module/RDF::TrineShortcuts/v_0-100
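
The URI patterns in the examples above can be sketched as a few Python
helpers. This is only an illustration: the rule that a version such as
"0.124" becomes "v_0-124" (dots replaced by hyphens) is inferred from
the example URIs, and the function names are hypothetical.

```python
BASE = "http://purl.org/NET/cpan-uri"


def version_segment(version):
    """Turn "0.124" into "v_0-124" (mapping inferred from the examples)."""
    return "v_" + version.replace(".", "-")


def author_uri(author_id):
    # The author ID is used as given; the example above is lowercase.
    return f"{BASE}/person/{author_id}"


def dist_project_uri(dist):
    return f"{BASE}/dist/{dist}/project"


def dist_version_uri(dist, version):
    return f"{BASE}/dist/{dist}/{version_segment(version)}"


def module_version_uri(module, version):
    return f"{BASE}/module/{module}/{version_segment(version)}"


print(author_uri("gwilliams"))
print(dist_project_uri("RDF-Trine"))
print(dist_version_uri("RDF-Trine", "0.124"))
print(module_version_uri("RDF::TrineShortcuts", "0.100"))
```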

Distribution versions link to the modules they provide, and also to the
modules that are their dependencies. Modules link back to the
distribution versions that provide them.
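
As a rough illustration of the two-way linking, the module-to-
distribution links can be derived by inverting the "provides" links.
The dictionary layout and the sample entry below are assumptions made
for this sketch, not the wrapper's data model.

```python
from collections import defaultdict


def provided_by(provides):
    """Invert {dist_version: modules} into {module: dist_versions}."""
    back_links = defaultdict(set)
    for dist_version, modules in provides.items():
        for module in modules:
            back_links[module].add(dist_version)
    return dict(back_links)


# Hypothetical sample data: one distribution version and the module
# it provides.
provides = {
    "RDF-TrineShortcuts/v_0-100": {"RDF::TrineShortcuts"},
}
print(provided_by(provides))
```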

As all this is built on the fly, I don't provide a SPARQL endpoint, but
now that I've added BackPAN as a data source, I can see a route to
creating a full data dump, and thus eventually a SPARQL endpoint.

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

Received on Friday, 9 July 2010 16:31:36 UTC