Re: data.gov.uk coverage on BBC

Hi Ed,

2010/1/22 Ed Summers <ehs@pobox.com>:
> I'd love to hear more about the internals of what's going on if John
> has the time. I'm particularly interested in the roles that Talis and
> the OKFN are playing in providing the services.

Here's some background on Talis's involvement.

Our initial involvement with the open data from the UK government was
through the National Archives work to publish RDFa from the London
Gazette. We undertook an initial trial project to crawl the London
Gazette website to extract the RDFa and store the resulting data in a
Talis Platform store.

The Talis Platform provides a SaaS infrastructure for querying and
searching RDF data sources and for publishing Linked Data. We added
a crawling capability to harvest Linked Data and RDFa as a result of
the London Gazette project and our work with the BBC. The crawler is
not a standard part of the Platform API but a feature that we're happy
to offer to clients using the Platform.

As the data.gov.uk project took off we became involved in a number of
different ways. Firstly, we've been providing technical consultancy
around data conversion and publishing in general, and specifically
running training courses on semantic web technologies for people
across a number of different departments in government. This is a
hands-on 2-day "bootcamp" that covers everything from RDF and RDF
Schema to Linked Data and SPARQL, plus some guidance on best
practices. That's a lot of ground to cover, but it helps people get a
taste for how the different technologies and approaches fit together.
Feedback has been very positive to date.

On the technology side we've also been responsible for providing the
RDF data hosting and Linked Data publishing aspects of data.gov.uk.

We have provisioned a number of Talis Platform stores for storing data
from specific sectors of the government, e.g. education and transport.
This sector-based approach was intended to reflect the fact that the
data may ultimately be managed by different departments/organizations
so partitioning helps devolve that management. An aggregated store
including all of the data will shortly be available for SPARQLing
across the entire dataset.

The SPARQL endpoints for these stores, as well as access to the search
indexes built across the RDF literals in the data, are all exposed
from services.data.gov.uk. This is a domain that Talis is hosting and
covers any/all services that operate on the RDF.
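
As a rough illustration of what querying one of these endpoints looks
like from a client, here is a minimal Python sketch that builds a
SPARQL Protocol GET URL. The endpoint path below is an assumption made
for the example (only the services.data.gov.uk host is given above),
and the query is a generic placeholder rather than one tailored to the
actual data.

```python
from urllib.parse import urlencode

# Hypothetical endpoint path: only the host is named in this message,
# so the "/education/sparql" part is an assumption for illustration.
ENDPOINT = "http://services.data.gov.uk/education/sparql"

# Generic placeholder query: fetch any ten triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a SPARQL Protocol GET URL with the query URL-encoded
    into the standard 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_url(ENDPOINT, QUERY)
print(url)
```

The sketch only constructs the URL and does not send a request, so it
can be adapted to whichever store endpoint applies.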

In addition we're also hosting a number of other sub-domains of
data.gov.uk, e.g. education.data.gov.uk and transport.data.gov.uk.
URIs minted in the RDF are rooted in these domains and we're hosting
them in order to deliver the Linked Data for those URIs.

The generation of the Linked Data is fairly straightforward: it is
implemented using PHP scripts that front the Talis Platform APIs.
Dereferencing the URIs results in human-readable HTML pages and there's
content negotiation support for requesting a range of RDF
serializations including RDF/JSON.
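
From the client side, content negotiation of this kind amounts to
setting an Accept header when dereferencing a URI. The Python sketch
below shows the idea; the example URI is hypothetical, and the media
types (in particular application/json for RDF/JSON) are assumptions
about what the servers accept, not a statement of how the PHP scripts
are configured.

```python
from urllib.request import Request

# Assumed media types for each serialization; the exact set supported
# by the servers is not stated in this message.
ACCEPT_TYPES = {
    "html": "text/html",
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "json": "application/json",  # assumed media type for RDF/JSON
}


def build_request(uri: str, fmt: str) -> Request:
    """Build (but do not send) a GET request asking for one format."""
    return Request(uri, headers={"Accept": ACCEPT_TYPES[fmt]})


# Hypothetical Linked Data URI, used only for illustration.
EXAMPLE_URI = "http://education.data.gov.uk/id/school/12345"

req = build_request(EXAMPLE_URI, "rdfxml")
print(req.full_url, req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual
fetch; the sketch stops short of that so it runs without network
access.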

My hope is that the Linked Data URIs can eventually be surfaced more
readily in the main data.gov.uk website so that users can browse and
navigate around the data.

We're now working with the data.gov.uk team to investigate ways to
create simplified views/APIs over the Linked Data in order to provide
additional access methods for working with the data.

Hope that's useful background. Let me know if you have any more questions.

Cheers,

L.

-- 
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.dodds@talis.com
http://www.talis.com

Received on Tuesday, 26 January 2010 09:54:45 UTC