Re: Fwd: [periodicals] Announce: Sample Linked Periodical Data

Leigh Dodds wrote:
> Hi Kingsley
>
>     Very nice!
>
>
> Thanks. This is the first of several interesting datasets that'll come 
> from the dataincubator project.
>
>     btw - Are there going to be actual RDF data dump URLs to
>     accompany the SPARQL endpoints re. dataincubator.org? If so,
>     then updating the following pages would be useful to the
>     community at large:
>
>     1. http://esw.w3.org/topic/DataSetRDFDumps -- for RDF dumps if
>     they are going to be provided
>     2. http://esw.w3.org/topic/SparqlEndpoints -- for SPARQL endpoints
>
>
> It's still early days for the dataincubator project. As it is
> potentially an umbrella project for a number of independent activities,
> it's difficult to make assertions one way or another, as it'll depend on
> the community members to ensure these tasks get done.
>
> However, for those datasets hosted within the Talis Platform, it's a
> simple matter (i.e. a single HTTP POST) to generate an RDF snapshot of 
> a dataset for backing up and re-publishing elsewhere. So the tools are 
> in place and hopefully general encouragement towards community norms 
> will do the rest. I'm currently investigating how to automatically 
> push datasets hosted as part of the Talis Connected Commons [1] into 
> the Amazon Public Datasets programme, so this will offer another route.
Leigh,

Re. Amazon, I think we can coordinate effort. The process would be 
something like this:

1. I give you an IP address for an AMI
2. The AMI is then used to make a named snapshot
3. I then notify Amazon that we want the snapshot made public

That's all it takes.

Irrespective of the above, I am going to introduce you to the Amazon
AWS folks.


btw - we've held off on the LOD cloud release on Amazon (i.e. told Amazon
to await a newer snapshot than the one they currently possess) due to the
late arrival of data from the Bio2RDF project (the data is in place now)
and requests from UniProt that we use a later release.

Thus, you can ping me and I can get you access details for the staging
device. If you're fast, you can see how we make the LOD cloud data set
available (from the AMI device), as this is on our to-do list for some
time today.


>
> As the periodicals dataset is at a very early stage I don't plan to 
> provide a dump just yet -- there's plenty more data to accumulate and 
> discussion to be had around the modelling. However for anyone that 
> wants to create one, the Ruby scripts are in the open source project. 
> In the meantime, I'll update the SPARQL Endpoints page to list the 
> endpoint [2] which is already listed in the Void description [3] :)
Re. VoiD, we should firm up on how to expose these graphs in a uniform
manner. This is basically about acting on the discussion instigated by
Daniel Schwabe before WWW2009. I also think SPARQL endpoint discovery [1]
has a nice role to play here. Ideally, we should have SPARQL endpoints
registered with discovery servers and just maintain a list of these
servers.

Thus, we go: [sparql-endpoint-discovery] --> <sparql-endpoint> --> [VoiD
graph discovery] --> Smart Linked Data Driven apps and solutions.
Naturally, this will simply complement what is discernible from any
Linked Data URI that includes VoiD graph data or points to such via URIs
using "seeAlso".
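
To make the middle step of that flow concrete, here is a minimal sketch of
how a client might ask an already-discovered SPARQL endpoint for any VoiD
dataset descriptions it holds. The query shape and the function name
void_discovery_url are my own illustrative assumptions, not an agreed
convention; only the VoiD namespace and the endpoint URL come from existing
material. It only builds the request URL (SPARQL protocol, HTTP GET) and
does not contact the endpoint.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Standard VoiD namespace; the query shape itself is an illustrative assumption.
VOID_QUERY = """\
PREFIX void: <http://rdfs.org/ns/void#>
SELECT ?dataset ?endpoint WHERE {
  ?dataset a void:Dataset ;
           void:sparqlEndpoint ?endpoint .
}"""


def void_discovery_url(endpoint: str, query: str = VOID_QUERY) -> str:
    """Build a SPARQL-protocol GET URL asking `endpoint` for VoiD datasets."""
    # Append with "&" if the endpoint URL already carries a query string.
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query})


if __name__ == "__main__":
    # Endpoint taken from this thread; whether it answers is not assumed here.
    print(void_discovery_url(
        "http://api.talis.com/stores/periodicals/services/sparql"))
```

A discovery server would hand back a list of endpoints, and an application
could run each one through a function like this to find the VoiD graphs
worth exploring.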


Links:

1. http://www.floop.org.uk/eagle/discovering-sparql

Kingsley
>
> Cheers,
>
> L.
> [1]. http://www.talis.com/cc
> [2]. http://api.talis.com/stores/periodicals/services/sparql
> [3]. http://periodicals.dataincubator.org/
>
>
>
>     Kingsley
>
>
>         ---------- Forwarded message ----------
>         From: Leigh Dodds <leigh.dodds@talis.com>
>         Date: 2009/4/27
>         Subject: [periodicals] Announce: Sample Linked Periodical Data
>         To: bibliographic-ontology-specification-group@googlegroups.com
>         Cc: dataincubator@googlegroups.com
>
>
>         Hi,
>
>         I spent some time over the weekend putting together a couple
>         of quick data conversions to
>         publish some linked periodical data for people to start
>         playing with and discussing.
>
>         The data is available as linked data from the
>         DataIncubator.org website here:
>
>          http://periodicals.dataincubator.org/
>
>         As the Void description for the dataset illustrates, there is
>         a SPARQL endpoint for
>         the data at:
>
>          http://api.talis.com/stores/periodicals/services/sparql
>
>         The dataset currently contains information that I've merged
>         from two sources:
>
>         * The NLM journal lists available from [1]. Specifically the
>         Pubmed journal list
>         * The list of Highwire titles available from [2]
>
>         The NLM journal list is just over 21000 titles. The Highwire
>         list is around 1100. The two
>         overlap, but I've not measured by how much.
>
>         In generating URIs for the journal I've either used the NLM
>         journal id (where I had it)
>         or generated a simple uri based on the homepage. However,
>         following some discussion
>         with Chris Clarke, for both datasets I've also included
>         owl:sameAs statements to URIs
>         based on the ISSN and EISSN identifiers. This should provide
>         some points around which
>         to merge datasets.
>
>         Some examples:
>
>         Sample NLM journal:
>
>          http://periodicals.dataincubator.org/journal/0155161
>
>         Note I've included links to a couple of NLM services.
>
>         Sample Highwire journal:
>
>          http://periodicals.dataincubator.org/journal/theoncologist-alphamedpress
>
>         To see the additional owl:sameAs relationships, look at
>         Science Magazine from
>         the Highwire dataset:
>
>          http://periodicals.dataincubator.org/journal/sciencemag
>
>         This is owl:sameAs:
>
>          http://periodicals.dataincubator.org/issn/0036-8075
>
>         Which is owl:sameAs the same title from the NLM data:
>
>          http://periodicals.dataincubator.org/journal/0404511
>
>         I hope to have the Ruby code published for others to look at
>         later today.
>
>         I'm more than happy to debate the modelling. At this point
>         nothing is fixed, I
>         just wanted to help get the ball rolling by providing some
>         data for people to play
>         with.
>
>         Cheers,
>
>         L.
>
>         [1].
>         http://www.ncbi.nlm.nih.gov/entrez/citmatch_help.html#JournalLists
>         [2]. http://highwire.stanford.edu/institutions/AtoZList.xls
>
>         -- 
>         Leigh Dodds
>         Programme Manager, Talis Platform
>         Talis
>         leigh.dodds@talis.com
>
>         http://www.talis.com
>
>
>     -- 
>
>
>     Regards,
>
>     Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
>     President & CEO
>     OpenLink Software     Web: http://www.openlinksw.com
>
>
> -- 
> Leigh Dodds
> Programme Manager, Talis Platform
> Talis
> leigh.dodds@talis.com
> http://www.talis.com


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Monday, 27 April 2009 16:30:17 UTC