
Re: Fwd: [periodicals] Announce: Sample Linked Periodical Data

From: Leigh Dodds <leigh.dodds@talis.com>
Date: Mon, 27 Apr 2009 15:48:54 +0100
Message-ID: <f323a4470904270748k5f1a6abchcb3499a0c4eb0617@mail.gmail.com>
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: public-lod@w3.org
Hi Kingsley

> Very nice!


Thanks. This is the first of several interesting datasets that'll come from
the dataincubator project.

> btw - Are there going to be actual RDF data dump URLs to accompany the
> SPARQL endpoints re. dataincubator.org? If so, then updating the following
> pages would be useful to the community at large:
>
> 1. http://esw.w3.org/topic/DataSetRDFDumps -- for RDF dumps if they are
> going to be provided
> 2. http://esw.w3.org/topic/SparqlEndpoints -- for SPARQL endpoints


It's still early days for the dataincubator project. As it is potentially an
umbrella project for a number of independent activities, it's difficult to
make assertions one way or another: it'll depend on the community members
to ensure these tasks get done.

However, for those datasets hosted within the Talis Platform, it's a simple
matter (i.e. a single HTTP POST) to generate an RDF snapshot of a dataset
for backing up and re-publishing elsewhere. So the tools are in place and
hopefully general encouragement towards community norms will do the rest.
I'm currently investigating how to automatically push datasets hosted as
part of the Talis Connected Commons [1] into the Amazon Public Datasets
programme, so this will offer another route.

As the periodicals dataset is at a very early stage I don't plan to provide
a dump just yet -- there's plenty more data to accumulate and discussion to
be had around the modelling. However, for anyone that wants to create one,
the Ruby scripts are in the open source project. In the meantime, I'll
update the SPARQL Endpoints page to list the endpoint [2], which is already
listed in the VoID description [3] :)
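
For anyone who wants to try the endpoint [2] from a script, here is a minimal
Python sketch of how a request URL could be put together. It only builds the
URL and does not send it. The function name build_query_url, the sample query
and the LIMIT are illustrative assumptions, not part of the dataset
documentation; the "query" parameter is the one defined by the SPARQL Protocol.

```python
from urllib.parse import urlencode

# Endpoint URL as given in reference [2] of this message.
ENDPOINT = "http://api.talis.com/stores/periodicals/services/sparql"

# Illustrative query (an assumption): list a few owl:sameAs links.
QUERY = """\
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?a ?b
WHERE { ?a owl:sameAs ?b }
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Print the URL; fetching it is left to the reader's HTTP client.
    print(build_query_url(ENDPOINT, QUERY))
```

The resulting URL can be pasted into a browser or passed to any HTTP client.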

Cheers,

L.
[1]. http://www.talis.com/cc
[2]. http://api.talis.com/stores/periodicals/services/sparql
[3]. http://periodicals.dataincubator.org/


>
> Kingsley
>
>>
>> ---------- Forwarded message ----------
>> From: Leigh Dodds <leigh.dodds@talis.com>
>> Date: 2009/4/27
>> Subject: [periodicals] Announce: Sample Linked Periodical Data
>> To: bibliographic-ontology-specification-group@googlegroups.com
>> Cc: dataincubator@googlegroups.com
>>
>>
>> Hi,
>>
>> I spent some time over the weekend putting together a couple of quick data
>> conversions to publish some linked periodical data for people to start
>> playing with and discussing.
>>
>> The data is available as linked data from the DataIncubator.org website
>> here:
>>
>>  http://periodicals.dataincubator.org/
>>
>> As the VoID description for the dataset illustrates, there is a SPARQL
>> endpoint for the data at:
>>
>>  http://api.talis.com/stores/periodicals/services/sparql
>>
>> The dataset currently contains information that I've merged from two
>> sources:
>>
>> * The NLM journal lists available from [1], specifically the PubMed
>>   journal list
>> * The list of Highwire titles available from [2]
>>
>> The NLM journal list is just over 21000 titles. The Highwire list is
>> around 1100. The two overlap, but I've not measured by how much.
>>
>> In generating URIs for the journals I've either used the NLM journal id
>> (where I had it) or generated a simple URI based on the homepage. However,
>> following some discussion with Chris Clarke, for both datasets I've also
>> included owl:sameAs statements to URIs based on the ISSN and EISSN
>> identifiers. This should provide some points around which to merge
>> datasets.
>>
>> Some examples:
>>
>> Sample NLM journal:
>>
>>  http://periodicals.dataincubator.org/journal/0155161
>>
>> Note I've included links to a couple of NLM services.
>>
>> Sample Highwire journal:
>>
>>  http://periodicals.dataincubator.org/journal/theoncologist-alphamedpress
>>
>> To see the additional owl:sameAs relationships, look at Science Magazine
>> from the Highwire dataset:
>>
>>  http://periodicals.dataincubator.org/journal/sciencemag
>>
>> This is owl:sameAs:
>>
>>  http://periodicals.dataincubator.org/issn/0036-8075
>>
>> Which is owl:sameAs the same title from the NLM data:
>>
>>  http://periodicals.dataincubator.org/journal/0404511
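
A consumer could use owl:sameAs links like these to merge records by grouping
URIs that are transitively linked. The following is a minimal Python sketch of
that idea, not the project's Ruby code. The function name merge_same_as is
hypothetical, the three URIs are the ones from the examples above, and the
direction of each link is an assumption (owl:sameAs is symmetric, so the
direction does not change the grouping).

```python
BASE = "http://periodicals.dataincubator.org"

# owl:sameAs links from the Science Magazine example, as (subject, object).
SAME_AS = [
    (f"{BASE}/journal/sciencemag", f"{BASE}/issn/0036-8075"),
    (f"{BASE}/journal/0404511", f"{BASE}/issn/0036-8075"),
]


def merge_same_as(pairs):
    """Group URIs into sets of mutually equivalent identifiers (union-find)."""
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    # Each sameAs link joins the two groups its ends belong to.
    for a, b in pairs:
        parent[find(a)] = find(b)

    # Collect every URI under the root of its group.
    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


if __name__ == "__main__":
    for group in merge_same_as(SAME_AS):
        print(sorted(group))
```

With the two links above, the Highwire URI, the ISSN URI and the NLM URI all
end up in a single group, which is the merge point the ISSN-based URIs are
meant to provide.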
>>
>> I hope to have the Ruby code published for others to look at later today.
>>
>> I'm more than happy to debate the modelling. At this point nothing is
>> fixed; I just wanted to help get the ball rolling by providing some data
>> for people to play with.
>>
>> Cheers,
>>
>> L.
>>
>> [1]. http://www.ncbi.nlm.nih.gov/entrez/citmatch_help.html#JournalLists
>> [2]. http://highwire.stanford.edu/institutions/AtoZList.xls
>>
>> --
>> Leigh Dodds
>> Programme Manager, Talis Platform
>> Talis
>> leigh.dodds@talis.com
>> http://www.talis.com
>>
>>
>
>
> --
>
>
> Regards,
>
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO OpenLink Software     Web: http://www.openlinksw.com
>



-- 
Leigh Dodds
Programme Manager, Talis Platform
Talis
leigh.dodds@talis.com
http://www.talis.com
Received on Monday, 27 April 2009 14:49:36 UTC
