Re: HPO and Gene Ontology Licenses from Alan Ruttenberg on 2012-08-09 (public-semweb-lifesci@w3.org from August 2012)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Wed, 8 Aug 2012 22:37:02 -0400
To: Michel Dumontier <michel.dumontier@gmail.com>
Cc: Peter Ansell <ansell.peter@gmail.com>, "Robinson, Peter" <peter.robinson@charite.de>, Chris Mungall <cjmungall@lbl.gov>, HCLS <public-semweb-lifesci@w3.org>, bio2rdf <bio2rdf@googlegroups.com>
Message-ID: <CAFKQJ8mbs9mRieAxu6p+c7pZF=aYt7EqJDvwjMWFT4BFHevM4A@mail.gmail.com>
On Wed, Aug 8, 2012 at 5:42 PM, Michel Dumontier
<michel.dumontier@gmail.com> wrote:
>>
>>
>> > The objective here would be to provide an RDF and SPARQL friendly
>> > version of OBO ontologies. It would reduce the ontological commitment to
>> > RDF, so in this sense, it would be a semantic loss, but makes it easier to
>> > retrieve relations between entities.  We could always provide links to OWL
>> > versions, if they are available.
>>
>> There have been discussions of, e.g. Skolemization of existing resources,
>> and those transformations are destructive.
>>
>> Reducing to RDF would not change the ontological commitment, but would
>> lose information. In any case such a transformation should not entail
>> minting new URIs for all resources. In addition, I question the value of
>> transforming the ontologies in this way, given the disadvantages of not
>> encouraging a uniform, author provided view on the resources. The OWL
>> resources are SPARQL friendly enough to build Ontobee displays. If you look
>> at the bottom of the page there are links to see the SPARQL queries used to
>> construct the pages. A more constructive effort, in my opinion, would be to
>> build SPARQL parser extensions along the lines of TERP that make it easier
>> to query the ontology as it is.
>
>
>   It's certainly a step forward to have software like OntoBee to fetch and
> render OWL ontologies from triplestores. But it's also important to
> acknowledge that there are many other uses than just humans looking at it.
> As you know, we and others have demonstrated that alternative
> representations and reformulations of knowledge are desirable for certain
> kinds of scientific inquiry.

Sorry, I'm unaware of any such demonstration. Could you cite some references?

What I am aware of are results such as those which have shown that the
*content* of the GO and other sources of prior knowledge can lead to
discoveries and greater understanding.

BTW, view source. Ontobee doesn't only fetch and render OWL for
humans. Ontobee is one leg of a variety of distribution channels for
OBO ontologies. Ontobee is responsible for the "linked data" serving
of ontologies, something I think it does rather well, and which is
still improving. Namely: It is the software that we are increasingly
using to provide resolvable URLs which respond appropriately for
computational as well as browser agents. We also distribute ontologies
as a whole from PURLs, have readable subversion repositories, have and
will have more SPARQL endpoints, and if you ask Chris I'm sure he'll
tell you of another half dozen ways in which the OBOs are available.

>   In any case, the issue at hand is that of control - and what constitutes
> open and free software.  It's worth noting that more than one OBO ontology
> has come about because all there was available was non-open, non-free or
> "proprietary" terminologies and ontologies. Well, if one isn't free to copy,
> modify, distribute, change and improve the ontology, then it isn't free and
> it isn't open.
>
> http://www.gnu.org/philosophy/free-sw.html

Boy, run-on lectures to OBO about openness. Amusing. Please see my
remarks to Peter.

> Making ontologies available in RDF is no different than providing data in
> SQL database dumps, XML files, flat files, REST APIs or whatever format
> makes it easier for somebody to reuse content.

It is very different. In doing so you are subverting the design of the
semantic web by proposing to fuzz up the space with copies with
unclear differences from the original, published with different
identifiers than the original. The semweb vision is different than the
vision for other kinds of information technologies. It comes with its
own social contract, and my view is that the kind of effort you are
proposing breaks that contract. Sorry to disagree. Let me be clear:
Just rearranging RDF and minting new IDs is not a social good. I will
keep reminding you and others that there is a lot of work that needs
to be done that you all could be collaborating with us on to make a
stronger resource for everyone instead of spending time in this way.
And by the way you might even find the work more fun!

>> >> Further, the OBO ID Policy has, for the most part, been put in
>> >> place and we do not use hash URIs and are moving to having all OBO
>> >> URIs resolving to page per view. See for example
>> >> http://purl.obolibrary.org/obo/IAO_0000032
>> >>
>> >
>> > does the OBO Foundry automatically check conformance?  Is there a report
>> > page for each ontology?
>>
>> Conformance to what? To the OWL Spec? Yes, I believe it does, through the
>> OORT and Jenkins build tools, but I'll leave it to Chris to detail that. If
>> there is something you are looking specifically for I expect it could be
>> provided. Or you could collaborate with us to build such services. I believe
>> that collaboration towards building a stronger single distribution is a much
>> better way to spend effort, in the long run.
>
>
> I personally like and support the NCBO's bioportal as a central repository
> for accessing and downloading ontologies.

Enjoy.

> They poll for the latest, bring it into their system, mint URIs, index, map - lots of added value.

They wouldn't be able to poll for anything were it not for the case of
people putting it there in the first place. They aren't doing a
service if they are minting URIs for resources that already have them.
They aren't doing a service when they don't make it clear they are
trying to add value to existing standards instead of replacing them.

They do certain very useful things, particularly indexing, enabling
search over the space of ontologies, displaying terms. But I want
to see them focus more on these efforts. I still can't reliably view
an individual that happens to be a member of an ontology document. They
still don't reason over ontologies and report issues.  There is still
not a visual browser that is particularly helpful.

> Lots of the stuff you mention below has already been done, and has the funding to
> continue supporting this.

None of what we have done was done before we did it. Many of the
things we do are still not done by anybody else. Bioportal will not
forever have funding to continue to support what they are doing. Given
efforts we have made to have Bioportal adopt practices we've explored,
prototyped, and then found useful, I have doubts whether they will
ever have anything as useful as Ontobee is for me.

> It might be worth investigating how OntoBee technology can add pretty rendering of OBO/OWL ontologies in BioPortal.

I have submitted numerous tickets to Bioportal over years pointing out
issues and making constructive suggestions about what would be useful
for ontology developers to have. I think I'm in a pretty good position
to know something about this. Very little has been followed up on.
Where it has been followed up on there is rarely acknowledgement.

I'm happy to compliment Bioportal on what they've done well, and there
are a number of things they have. But I'm not happy to let stand
misinformation about what they've accomplished, or leave stand any
doubt that I and others within the OBO have made anything less than
extraordinary efforts to supply ideas and prototypes to the Bioportal
team.

>> >> So the Foundry is already in the process of making all of the OBO
>> >> available as linked ontology data. I would suggest other groups join
>> >> this effort rather than setting out to duplicate and add confusion by
>> >> having a parallel set of identifiers for the same set of entities.
>>
>> >>
>> > I know about berkeley's download page -
>> > http://www.berkeleybop.org/ontologies/
>> > is this what you are referring to?
>>
>> We are moving towards completely using Web standards. Eventually, we will
>> have all OBO ontologies available at
>> http://purl.obolibrary.org/obo/<namespace>.owl . This, and other information
>> about deployment is in http://obofoundry.org/obo/id-policy.shtml, which
>> describes what we are trying to put in place. Again, we are making progress
>> in this effort, but help could certainly be used. Chris pays attention to
>> where ontologies are before they arrive at their documented location. I
>> attend to ensuring that once there they behave as expected according to web
>> standards. If you look at http://www.ontobee.org and select an ontology
>> there should be metadata about where the ontology was downloaded from to get
>> it into Ontobee.
>>
>> >> In fact, there have been a number of OBO participants who prefer
>> >> the current GO license precisely because it prevents this kind of
>> >> duplicative, confusing practice, a practice that is discouraged even
>> >> by the W3C standards these groups are chartered to work with.
>> >>
>> >> For more information about OBO efforts in this area, please see
>> >> http://code.google.com/p/oboformat/  and
>> >> http://code.google.com/p/owltools/
>> >>
>> >> -Alan
>> >>
>> >
>> > I don't see RDF or SPARQL endpoints being provided at either of those
>> > links.
>>
>> Indeed. They are not where you would expect to find them.
>>
>> There are two sparql endpoints at the moment, each with different
>> approaches. We are working toward deciding on and documenting expected
>> behavior and then ensuring we can provision them well enough to stand up to
>> regular use.
>>
>> Understand also that we are coming to the end of a multi-year effort to
>> regularize our URIs, define a proper OWL translation of OBO, and provide
>> a new BFO to be the basis for these ontologies. We are not quite finished. I
>> would be most comfortable publishing a stable endpoint once this transition
>> was over. Again, assistance in deploying mirrors and in helping with all the
>> various loose ends needed before the resource can be considered stable would
>> be very much welcomed.
>>
>> http://sparql.obo.neurocommons.org/ intended to serve ontologies using the
>> legacy URIs (needs to be reviewed - hasn't been in a while.)
>> http://sparql.obodev.neurocommons.org/ intended to serve ontologies using
>> the current URIs (same as above)
>> http://sparql.hegroup.org/sparql serves the ontobee server (not meant for
>> wide consumption, but useful for prototyping)
>>
>
> don't forget the NCBO's SPARQL endpoint - I've already started using it with
> much success.  Bio2RDF also has an endpoint, but it's subject to being
> revisited and added to our growing pipeline.

I'm glad you are getting value out of it. I'll be frank that I haven't
really had a look. I tried to collaborate with Bioportal for years to
encourage them to adopt what we'd developed with Neurocommons,
encourage them to adopt standards, and then proceed to the next
level. Since they didn't, we maintain resources that provide for our
community's needs. Where they don't, we're working with newer
technology to prospect for where we can bring the next level of value.
As far as Bio2RDF, I'll have to admit as well that I haven't tried it
recently. Over the couple of years when I periodically did try it I
found a) that endpoints I queried were down a significant fraction of
the time and b) that translation of resources it incorporated was of
highly varying quality, and in many cases the elements of interest
were missing.

It's possible that this has changed in the interim, and if so, I'm
glad to hear it.

However the problem is that none of the efforts we are making have
long term funding and so there is continued risk of more of the same.
So one has to pick a strategy.

My own effort in the last few years has been towards doing my piece of
helping develop what's available in the OBOs, working to improve
quality where possible, (slowly) helping move that to a place where
it uses good standards properly, and contributing to efforts to make
it easy to get the whole load of it, intact, usable as a whole, into
the hands of whoever wants it. I hope that this leads either to a
stable source of funding for this kind of work, or to a set of
artifacts and protocols that make it simple enough to maintain and use
that such funding isn't necessary.

>> Members of HCLS who wish to assist with maintenance of the Neurocommons
>> endpoints would be welcomed. Once they are reviewed and brought up to date
>> on which ontologies they load, the Neurocommons RDF Bundling system will
>> provide an additional distribution mechanism for creating mirrors.
>>
>>
>> So Michel, and other HCLS users, consider this an invitation: The OBO
>> Foundry is very close to providing a stable, well thought through process
>> for semantic web deployment of OBO ontologies. We could very much use
>> technical support in finishing a number of technical loose ends, in
>> providing tools that build on these efforts, and on making it easy to access
>> existing endpoints or provide mirrors of the content. If there is sufficient
>> interest in this within the group perhaps Chris and I can schedule a time
>> when we could meet with those interested and see what the possibilities are.
>
>
> I think that's a good idea. Let's aim for sometime in September.

OK, that should be interesting. Let's talk on the phone some time
before then to see about what we can do to make the meeting
productive.

Regards,
Alan

>
> m.
>
>>
>> Sincerely,
>> Alan Ruttenberg
>> http://alan.ruttenbergs.com
>>
>> >
>> > m.
>> >
>> >
>> > On Wed, Aug 8, 2012 at 1:36 AM, Peter Ansell <ansell.peter@gmail.com>
>> > wrote:
>> >> Hi Peter,
>> >>
>> >> I understand completely. The usage policy is very liberal in terms of
>> >> distribution and we are glad for that!
>> >>
>> >> Would it be possible for us (Michel and I) to make suggestions with
>> >> the goal of publishing a version that matches the no-blank-node policy
>> >> that Bio2RDF attempts to follow and uses URI structures that can
>> >> resolve using http://bio2rdf.org/. We don't want to make material
>> >> changes to any of the terms but we would like to make the resulting
>> >> RDF graph browsable as Linked Data, as far as possible. To enable that
>> >> we need to directly resolve URIs for items to their definitions, by
>> >> replacing fragment/hash identifiers with
>> >> http://bio2rdf.org/ns:identifier equivalents, for example.
>> >>
>> >> Thanks,
>> >>
>> >> Peter Ansell
>> >>
>> >> On 8 August 2012 15:20, Robinson, Peter <peter.robinson@charite.de>
>> >> wrote:
>> >>> Hi Peter,
>> >>>
>> >>> given that the HPO is being used by medical groups for real patient
>> >>> data, we think it is potentially dangerous to allow external groups to
>> >>> change the data and present it elsewhere, given some of the notorious
>> >>> difficulties in actually understanding what some medical terms mean (even
>> >>> for us MDs).  This was the reason for the license statement, which other
>> >>> than that is quite liberal. However, we would be happy to work with you to
>> >>> find a solution, which could foresee us providing RDF on our website which
>> >>> you could import.
>> >>>
>> >>> BW Peter
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> PD Dr. med. Peter N. Robinson, MSc.
>> >>> Institut für Medizinische Genetik und Humangenetik
>> >>> Charité - Universitätsmedizin Berlin
>> >>> Augustenburger Platz 1
>> >>> 13353 Berlin
>> >>> Germany
>> >>> +4930 450566006
>> >>> Mobile: 0160 93769872
>> >>> peter.robinson@charite.de
>> >>> http://compbio.charite.de
>> >>> http://www.human-phenotype-ontology.org
>> >>> Introduction to Bio-Ontologies:
>> >>> http://www.crcpress.com/product/isbn/9781439836651
>> >>> ________________________________________
>> >>> From: Peter Ansell [ansell.peter@gmail.com]
>> >>> Sent: Wednesday, 8 August 2012 03:03
>> >>> To: Chris Mungall
>> >>> Cc: Michel Dumontier; HCLS; bio2rdf; Robinson, Peter
>> >>> Subject: HPO and Gene Ontology Licenses
>> >>>
>> >>> On 8 August 2012 02:46, Chris Mungall <cjmungall@lbl.gov> wrote:
>> >>>> Hi Michael
>> >>>>
>> >>>> I can't seem to connect to the triplestore.
>> >>>>
>> >>>> Have you considered adding associations between OMIM and phenotype
>> >>>> ontology classes? These can be downloaded from
>> >>>> http://www.human-phenotype-ontology.org/ as tab delimited files that
>> >>>> can trivially be converted to an rdf model of choice (we will be
>> >>>> providing OWL for this ourselves in the future, it will likely differ
>> >>>> in modeling and URIs from bio2rdf).
>> >>>
>> >>> The HPO files cannot be modified though given the following license
>> >>> condition:
>> >>>
>> >>> "That neither the content of the HPO file(s) nor the logical
>> >>> relationships embedded within the HPO file(s) be altered in any way."
>> >>> [1]
>> >>>
>> >>> in the same way that Gene Ontology files cannot be legally modified
>> >>> using a very similar license condition:
>> >>>
>> >>> "That neither the content of the GO file(s) nor the logical
>> >>> relationships embedded within the GO file(s) be altered in any way."
>> >>> [2]
>> >>>
>> >>> Therefore Bio2RDF should not be converting the HPO classes to RD
>> >
>> >
>> > --
>> > Michel Dumontier
>> > Associate Professor of Bioinformatics, Carleton University
>> > Chair, W3C Semantic Web for Health Care and the Life Sciences Interest
>> > Group
>> > http://dumontierlab.com
>> >
>
>
>
>
> --
> Michel Dumontier
> Associate Professor of Bioinformatics, Carleton University
> Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
> http://dumontierlab.com
>
Received on Thursday, 9 August 2012 02:38:01 UTC