Re: ANN: Nature Publishing Group Linked Data Platform

Hi Hugh:

Many thanks for your comments. And to be sure, CC0 means exactly that -
public domain - have at it.

> But I am wondering whether the use of owl:sameAs to a non-http URI is best
> practice in something that says it is Linked Data?

We take a very pragmatic view on this. We completely support the Linked Data
principles in using HTTP URIs to identify resources and also to provide
descriptions. And we have done so in all our assignments. All our own named
objects are dereferenceable.

However, it is also true that there are a number of HTTP URIs which are
non-dereferenceable and also non-HTTP URIs (non-dereferenceable from an HTTP
application context) which are used within certain domains of discourse and
as such offer RDF link points. In our view there is much to be gained from
adopting an open inclusive approach. It may be that some HTTP URIs later
become dereferenceable (and here I'm specifically thinking of the
id.crossref.org domain URIs). Some other non-HTTP URIs ("info:doi/" and
"doi:") have also been used historically and are still very well represented
on the web (cf. Wikipedia articles, for example). In time these may be
superseded by HTTP forms but for now the use of owl:sameAs affords a
valuable bridging between different datasets. From an RDF perspective,
dereferenceability is a bonus, not a necessity.
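
To make the bridging point concrete, here is a minimal sketch in Python of how
an owl:sameAs link to a non-HTTP URI can still join two sets of triples without
anything being dereferenced. The first triple is the one quoted below from our
data. The second triple, the example.org names, and the helper functions
alias_map and statements_about are all hypothetical and stand in for a
third-party dataset that only knows the "info:doi/" form. This is an
illustration only, not how the platform itself is implemented.

```python
from collections import defaultdict

OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"


def alias_map(triples):
    """Map each URI to the URIs it is directly linked to by owl:sameAs.

    The link is treated as symmetric; no HTTP request is made at any point.
    """
    aliases = defaultdict(set)
    for s, p, o in triples:
        if p == OWL_SAMEAS:
            aliases[s].add(o)
            aliases[o].add(s)
    return aliases


def statements_about(uri, triples):
    """Collect (predicate, object) pairs for uri and its direct aliases."""
    names = {uri} | alias_map(triples).get(uri, set())
    return sorted(
        (p, o) for s, p, o in triples if s in names and p != OWL_SAMEAS
    )


# First triple: taken from the NPG data quoted in this thread.
# Second triple: a hypothetical statement from another dataset (assumption).
triples = [
    ("http://dx.doi.org/10.1038/nm.2129", OWL_SAMEAS,
     "info:doi/10.1038/nm.2129"),
    ("info:doi/10.1038/nm.2129", "http://example.org/vocab#mentionedIn",
     "http://example.org/page/1"),
]

for predicate, obj in statements_about(
        "http://dx.doi.org/10.1038/nm.2129", triples):
    print(predicate, obj)
```

Asking about the HTTP form picks up the statement made against the "info:doi/"
form purely through the owl:sameAs link. Only direct links are followed here; a
fuller treatment would compute the transitive closure.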

As for dumps of our datasets, this is something that we are actively
discussing and we are certainly very aware of the value of local hosting.

Cheers,

Tony



On 12/05/2012 19:08, "Hugh Glaser" <hg@ecs.soton.ac.uk> wrote:

> Hi Tony - exciting stuff.
> A few questions, if I may.
> As usual, I go looking for the owl:sameAs triples to add to http://sameas.org/
> etc.
> And congratulations on the CC0 1.0, which I think makes it explicit that I am
> allowed to.
> 
> This has led me to some queries:
> When I look at the RDF for http://dx.doi.org/10.1038/nm.2129
> I get 
>   <rdf:Description rdf:about="http://dx.doi.org/10.1038/nm.2129">
>     <ns0:sameAs xmlns:ns0="http://www.w3.org/2002/07/owl#"
>                 rdf:resource="info:doi/10.1038/nm.2129"/>
> (among other things, of course.)
> Now, I can of course use http://crossref.org/ to look it up, or append to
> http://dx.doi.org/ and look up
> http://dx.doi.org/info:doi/10.1038/nm.2129
> and get RDF back.
> But I am wondering whether the use of owl:sameAs to a non-http URI is best
> practice in something that says it is Linked Data?
> 
> When I start to SPARQL for owl:sameAs triples, I get
> <http://ns.nature.com/contributors/joe-cummins-2ente0kr9qc7z> owl:sameAs
>  <http://id.crossref.org/contributor/joe-cummins-2ente0kr9qc7z>
> as my first result.
> Unfortunately http://id.crossref.org/contributor/joe-cummins-2ente0kr9qc7z
> gives HTTP/1.1 400 Bad Request - "Malformed DOI" as response.
> Is this just an error, or is there something deeper here?
> 
> Like the others, I went looking for RDF dumps (so as not to hit your server),
> but found none (I didn't find a robots.txt or sitemap.xml on data.nature.com
> either).
> Can you perhaps advise? - I am after the owl:sameAs data.
> 
> I'm really excited about being able to use the data.
> 
> Best
> Hugh
> 
> On 5 Apr 2012, at 10:17, Hammond, Tony wrote:
> 
>> ** Apologies for cross-posting **
>> 
>> Hi:
>> 
>> We just wanted to share this news from yesterday's NPG press release [1]:
>> 
>>    "Nature Publishing Group (NPG) today is pleased to join the linked data
>> community by opening up access to its publication data via a linked data
>> platform. NPG's Linked Data Platform is available at http://data.nature.com.
>> 
>>    The platform includes more than 20 million Resource Description
>> Framework (RDF) statements, including primary metadata for more than 450,000
>> articles published by NPG since 1869. In this first release, the datasets
>> include basic citation information (title, author, publication date, etc) as
>> well as NPG specific ontologies. These datasets are being released under an
>> open metadata license, Creative Commons Zero (CC0), which permits maximal
>> use/re-use of this data.
>> 
>>    NPG's platform allows for easy querying, exploration and extraction of
>> data and relationships about articles, contributors, publications, and
>> subjects. Users can run web-standard SPARQL Protocol and RDF Query Language
>> (SPARQL) queries to obtain and manipulate data stored as RDF. The platform
>> uses standard vocabularies such as Dublin Core, FOAF, PRISM, BIBO and OWL,
>> and the data is integrated with existing public datasets including CrossRef
>> and PubMed.
>> 
>>    More information about NPG's Linked Data Platform is available at
>> http://developers.nature.com/docs. Sample queries can be found at
>> http://data.nature.com/query. "
>> 
>> Cheers,
>> 
>> Tony
>> 
>> [1] http://www.nature.com/press_releases/linkeddata.html
>> 

Received on Monday, 14 May 2012 07:54:08 UTC