Re: [ANN] nature.com/ontologies - July 2015 Release

Hi Hugh:

We just pushed live [1] another release of our ontologies portal:

    http://nature.com/ontologies

This includes new data snapshots, separate linksets, etc.

You should find now that the examples are all fixed and that we have tried
to be more careful in our language generally across the site as to what we
are actually providing so as not to raise expectations. We specifically have a
page 'What's Here?' calling out what we have and what we do not have (no
dereference, no SPARQL endpoint).

More to your other point on mappings, we have broken these out from the
models and have packaged these separately for easier access. These files
are all named *-linkset.*, e.g.

    npg-article-types-dbpedia-linkset.2015-08-24.nq.tar.gz
    etc.
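
As a rough illustration of that naming convention, here is a minimal
Python sketch that splits a linkset filename into its parts. The pattern
(<dataset>-linkset.<YYYY-MM-DD>.<extension>) is only inferred from the
example filename above, and the names LINKSET_RE, LinksetFile and
parse_linkset_name are made up for this example; none of this is a
published spec or part of the portal.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Assumed pattern: <dataset>-linkset.<YYYY-MM-DD>.<extension>
# (inferred from one example filename, not from a published spec).
LINKSET_RE = re.compile(
    r"^(?P<dataset>.+)-linkset\.(?P<snapshot>\d{4}-\d{2}-\d{2})\.(?P<ext>.+)$"
)


class LinksetFile(NamedTuple):
    dataset: str      # e.g. "npg-article-types-dbpedia"
    snapshot: date    # snapshot date embedded in the filename
    extension: str    # e.g. "nq.tar.gz"


def parse_linkset_name(filename: str) -> Optional[LinksetFile]:
    """Return the parts of a linkset filename, or None if it does not match."""
    match = LINKSET_RE.match(filename)
    if match is None:
        return None
    try:
        year, month, day = (int(p) for p in match.group("snapshot").split("-"))
        snapshot = date(year, month, day)
    except ValueError:
        # Digits were present but do not form a real calendar date.
        return None
    return LinksetFile(match.group("dataset"), snapshot, match.group("ext"))
```

Calling parse_linkset_name on the filename above should give back the
dataset name, the snapshot date and the extension; anything that does
not contain "-linkset." comes back as None.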

We've amended the nav tab to Linksets, and labelled the pages there 'X
Mapping' or 'X Linkset' as appropriate.

We hope this makes a clearer distinction between the two types of linkset.

Note that we are using the word 'linkset' as per the VoID usage [2].

(And btw, I don't think that 'VoID' would be such a good name for the
'DOCS' directories as we are including - or aiming to include - more than
just void: terms there.)

Lastly should confess that we have done the unforgivable and added in
some 'unlinked' data in the form of CSVs for some of our models. Be
interested in what people think of the format we're using here. We're not
intending to preserve any RDF structure in this serialization. Just
looking for a simple data export format.
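
To make the idea of a flat, structure-free export concrete, here is a
minimal Python sketch of one way such a CSV could be produced: one row
per subject, one column per predicate, multiple values joined with a
separator. This is purely an assumption for discussion and not
necessarily the layout of the CSVs on the site; the function name
triples_to_csv, the "id" column and the "|" separator are all
hypothetical.

```python
import csv
import io
from collections import defaultdict


def triples_to_csv(triples, separator="|"):
    """Pivot (subject, predicate, object) triples into one CSV row per subject.

    Hypothetical layout: an "id" column followed by one column per
    predicate, in order of first appearance. Repeated values for the same
    predicate are joined with `separator`, so RDF structure is not preserved.
    """
    rows = defaultdict(lambda: defaultdict(list))
    columns = []
    for subject, predicate, obj in triples:
        if predicate not in columns:
            columns.append(predicate)
        rows[subject][predicate].append(obj)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id"] + columns)
    for subject, props in rows.items():
        writer.writerow(
            [subject] + [separator.join(props.get(p, [])) for p in columns]
        )
    return out.getvalue()
```

The trade-off in a layout like this is that multi-valued properties get
squashed into a single cell, which is easy to load into a spreadsheet
but cannot be round-tripped back to RDF.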

Cheers,

Tony


[1] https://twitter.com/tonyhammond/status/636451486231302145

[2] http://www.w3.org/TR/void/#linkset





On 08/08/2015 10:19, "Hammond, Tony" <Tony.Hammond@Macmillan.com> wrote:

>Hi Hugh:
>
>Many thanks for the comments. This is exactly the kind of thing we need to
>hear.
>
>So, I think you may have raised four separate points which I'll try to
>answer in turn:
>
>==
>1. Examples
>
>You are right. We've been sloppy. These were intended more for reading
>than parsing. So we took liberties with omitting common namespaces,
>abbreviating strings, etc. Punctuation, we were just careless.
>
>But I agree there is real value in making these examples complete. We will
>address this in our next release.
>
>2. Dereference
>
>We really have no defence here. **We do not support dereference at this
>time.** The datasets are outputs from our production systems and HTTP URIs
>are used for namespacing only. We need to figure out a strategy for
>supporting dereference.
>
>So, if that means we are Bad Guys for violating Principle 2, then so be
>it. We are not trying to claim that this is true Linked Data - it's only
>common-or-garden linked data (of the RDF kind).
>
>That's not to say that we are not interested in adding in dereference.
>Only that these things take time to implement and we are proceeding with
>our data publishing in an incremental manner.
>
>So, for now, sorry!
>
>3. URNs, etc
>
>Note that the RDF you obtained from dereferencing the DOI is from CrossRef
>- not from ourselves. So we cannot properly answer for something retrieved
>from a third party. That said, CrossRef are also in the early stages of
>data publishing, and may not themselves have reached the Linked Data
>standard. Again, seems like it's only RDF at this time.
>
>4. Mappings
>
>Am a little perplexed as to the distinction between mappings and links,
>although maybe I can see where you're coming from. Note that we're anyway
>planning to decouple our ontology mappings and put those in separate files
>and list them under Mappings. Our core and domain ontologies generally
>have SKOS mappings, i.e. we use skos:closeMatch, skos:broadMatch,
>skos:exactMatch, skos:relatedMatch, etc. This feels appropriate for the
>ontology and the taxonomies.
>
>I guess we are cautiously feeling our way forward and want to be a little
>careful about using owl:sameAs.
>
>==
>
>So, I hope we've clarified some things here. There's a couple obvious
>things we can do/are doing (examples, mappings). Some other things are out
>of our hands (DOI dereference). And some will need more time for us to
>implement (dereference generally).
>
>Anyway, many thanks again for all your comments. It's really good to hear
>back from real users. Otherwise it can feel like we are whistling in the
>wind.
>
>Tony
>
>
>
>
>
>On 07/08/2015 12:48, "Hugh Glaser" <hugh@glasers.org> wrote:
>
>>Hi Tony,
>>Great stuff!
>>So I start exploring, looking for more fodder for sameAs.org ... :-)
>>
>>It may be that my questions are too specific for the list - feel free to
>>go off-list in response, and then we can summarise.
>>And there is rather a lot here, I'm afraid.
>>
>>Some possible problemettes I hit:
>>http://www.nature.com/ontologies/datasets/articles/#data_example
>>might be confusing for people (and awkward when I tried to rapper it).
>>Since quite a few prefixes are not declared, most notably one of yours:
>>npg, but also the usual suspects (xsd, dc, bibo, foaf and also prism).
>>There is also a missing foaf:homepage that causes a syntax error.
>>And some semi-colons missing off the last few lines.
>>
>>A slightly more challenging problem is that the URI for that example
>>doesn't resolve.
>>It unqualifies to http://ns.nature.com/articles/nrg3870 (I assume it is a
>>namespace problem.)
>>
>>But I managed to find a resolving URI: http://dx.doi.org/10.1038/246015a0
>>(from http://www.nature.com/ontologies/mappings/articles-dbpedia/)
>>And successfully got some RDF :-)
>>Looking at the owl:sameAs triples in there, I then start to worry - they
>>are urn:, doi: and info: URIs.
>>This is fine for Semantic Web publishing, but means (in my opinion)
>>that it is not Linked Data (violating principle 2) - all URIs for Things
>>have to be http: for that. So you could use another predicate, but
>>owl:sameAs seems wrong.
>>
>>Having found http://www.nature.com/ontologies/mappings/, I excitedly went
>>off to http://www.nature.com/ontologies/mappings/articles-dbpedia/ and
>>downloaded the files.
>>However, I found that the file contains only triples with foaf:topic and
>>cito:isCitedBy - no mappings between your URIs and DBpedia, Mesh, etc...,
>>which is what I was expecting. It seems to me that this is more of a
>>"links" file than a "mappings" file.
>>Even more frustrating, many URIs that I tried, such as
>>http://dx.doi.org/10.1038/ng1285, don't resolve (give an invalid doi:
>>message), so I wouldn't be able to use them in any case.
>>
>>So... :-)
>>Apart from any fixing you may want to do; and maybe having some example
>>Linked Data URIs for People and Publications sprinkled around (it really
>>was quite a challenge to find any RDF!).
>>I suspect that you do have some mappings between your URIs and dbpedia
>>URIs, for example.
>>Is there any chance you would like to send me (or link to a file) any
>>Linked Data owl:sameAs triples that I could add to sameAs.org, please?
>>And also, do you have an interesting owl:differentFrom dataset that I
>>could add to differentFrom.org? This might be your regression test, where
>>you have done mappings that you later found were wrong.
>>
>>Sorry if I have made some basic errors in failing to find things.
>>Very best
>>Hugh
>>> On 28 Jul 2015, at 12:41, Hammond, Tony <Tony.Hammond@Macmillan.com>
>>>wrote:
>>> 
>>> Hi:
>>> 
>>> As promised in a couple posts to the list earlier this year [1,2] we
>>>have now resumed dataset publishing (following on from the 2012 releases
>>>on data.nature.com [3,4]) and have added a new snapshot of bibliography
>>>metadata for nature.com articles to the Nature.com Ontologies portal:
>>> 
>>>     http://nature.com/ontologies/
>>>  
>>> Specifically, as announced yesterday [5], we have released 170 years of
>>>bib data for all nature.com articles and contributors over the period
>>>1845-2015. We expect to release new snapshots periodically.
>>> 
>>> The release notes are copied below.
>>> 
>>> Tony
>>> 
>>> 
>>> ==
>>> Release:
>>> 
>>> * We've resumed publishing of datasets. We're now making available
>>>complete instance datasets for articles (1.2 m) and contributors (2.7
>>>m). These datasets are linked to the DOI and ORCID datasets. (These
>>>datasets replace the historic datasets from 2012.)
>>> 
>>> * We've now added our core and domain models to GitHub projects and
>>>brought them under version control: public-npg-core-ontology (GitHub) and
>>>public-npg-domain-ontology (GitHub)
>>> 
>>> * We've improved our documentation. A whole new Technical Notes section
>>>has been added. Some material from the homepage (e.g. Background,
>>>Licenses, Namespaces) has been moved there, and new material has been
>>>added (e.g. Annotations, Mappings, Naming Policy, Versions).
>>> 
>>> * We've improved our data mappings. The Subjects Ontology is now 100%
>>>mapped to DBpedia. See Mappings.
>>> 
>>> * We've added a reference in our Links section to our new colleagues at
>>>Springer and their LOD for Conferences in Computer Science.
>>> ==
>>>   
>>> 
>>> [1] https://lists.w3.org/Archives/Public/public-lod/2015Apr/0005.html
>>> [2] https://lists.w3.org/Archives/Public/public-lod/2015May/0002.html
>>> [3] https://lists.w3.org/Archives/Public/public-lod/2012Apr/0061.html
>>> [4] https://lists.w3.org/Archives/Public/public-lod/2012Jul/0130.html
>>> [5] https://twitter.com/tonyhammond/status/625641560676409345
>>> 
>>> 
>>>************************************************************************
>>>*
>>>*******  
>>> DISCLAIMER: This e-mail is confidential and should not be used by
>>>anyone who is not the original intended recipient. If you have received
>>>this e-mail in error please inform the sender and delete it from your
>>>mailbox or any other storage mechanism. Neither Macmillan Publishers
>>>Limited nor Macmillan Publishers International Limited nor any of their
>>>agents accept liability for any statements made which are clearly the
>>>sender's own and not expressly made on behalf of Macmillan Publishers
>>>Limited or Macmillan Publishers International Limited or one of their
>>>agents. 
>>> Please note that neither Macmillan Publishers Limited nor Macmillan
>>>Publishers International Limited nor any of their agents accept any
>>>responsibility for viruses that may be contained in this e-mail or its
>>>attachments and it is your responsibility to scan the e-mail and
>>> attachments (if any). No contracts may be concluded on behalf of
>>>Macmillan Publishers Limited or Macmillan Publishers International
>>>Limited or their agents by means of e-mail communication.
>>> Macmillan Publishers Limited. Registered in England and Wales with
>>>registered number 785998. Macmillan Publishers International Limited.
>>>Registered in England and Wales with registered number 02063302.
>>> Registered Office Brunel Road, Houndmills, Basingstoke RG21 6XS
>>> Pan Macmillan, Priddy and MDL are divisions of Macmillan Publishers
>>>International Limited.
>>> Macmillan Science and Education, Macmillan Science and Scholarly,
>>>Macmillan Education, Language Learning, Schools, Palgrave, Nature
>>>Publishing Group, Palgrave Macmillan, Macmillan Science Communications
>>>and Macmillan Medical Communications are divisions of Macmillan
>>>Publishers Limited.
>>> 
>>>************************************************************************
>>>*
>>>*******
>>> 
>>
>>-- 
>>Hugh Glaser
>>   20 Portchester Rise
>>   Eastleigh
>>   SO50 4QS
>>Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
>>
>>
>


Received on Wednesday, 26 August 2015 15:50:39 UTC