Re: How do OBO ontologies work on the LOD?

On Wed, Feb 22, 2012 at 7:14 PM, Peter DeVries <pete.devries@gmail.com>wrote:

> OK so I ran the isql command that Kingsley sent me and this looks better
> but it still has blank nodes.
>
> http://bit.ly/wOuYkf
>
> I am concerned that people who are thinking of creating ontologies for the
> LOD will think that they should do this via OBO.
>

OBO can mean about 1/2 dozen things and we should be clear what we mean.
OBO = Open Biomedical Ontologies = any ontology that is open and submitted
for inclusion in the set
OBO Format = A file format that in its current version an alternative
syntax for a portion of OWL2
OBO Foundry = An organization that aims to facilitate improving the quality
of ontologies and facilitate integration of information about biology and
medicine
OBO Foundry Ontology: An ontology from the OBO that has been reviewed
according to a set of principles that have been co-developed, and has
determined to meet or on the way to meeting the criteria described by those
principles.
OBOinOWL = The software that translates the OBO Format to OWL and back to
OBO

Then there is

Ontobee: A piece of software designed to serve ontology terms as LOD.
Ontobee is under development, and I hope to have it improved based on
discussions we are having here.


> Since the URI's in the vocabulary don't work following LOD best practices,
> I don't this this is a good way to do things. (For LOD use)
>

The URIs are intended to behave as expected by semweb principles. We intend
to try to make them useful for LOD uses as well, as we gather more
information about what that means. If they don't behave that way, then
something is broken or not implemented. My experience is that this is not
an unusual situation in anything in the current LOD - rather the opposite
in many cases.

We don't think good quality information should be authored with a specific
technology in mind, but should be disseminated in as many ways as possible.


> It seems to me that if you want to use a vocabulary of terms and
> properties to annotate LOD entities you should use an ontology that follows
> LOD best practices.
>

We are very very early in LOD and semantic web. It is presumptuous, IMO, to
call any practice in this area a "best practice".


> In those cases where there are large data sets marked up via OBO, then
> some of these other solutions are best.
>

Which other solutions?


> I don't know if there has been much linking from other data sets to the
> HAOL ontology. It is fairly new.
>
> My guess is that there may be as many as 1 million species (~140,000
> described) of Hymenoptera (Bees, Wasps and Ants).
> http://en.wikipedia.org/wiki/Hymenoptera
>
> The upper taxonomic classification of these will be unstable for a while.
> So ideas like, *all species in this family have these characters* are
> problematic.
>

If the statements in HAO are incorrect, then they should be corrected. That
is one of the principles. On http://obofoundry.org you will find links to
all of the ontologies in the set, and there is contact information via
which you can send comments about incorrect assertions. I am certain they
would be welcomed. If you find otherwise, do let me know.


> Multiply that by millions of occurrence records with identification
> annotated with HAOL properties (some properties might not be visible on
> some specimens)
>
> I have my doubts that OBO inferencing will be of much use, compared to the
> findability, queryability and mashability of the LOD approach.
>

You can have your doubts. Inference may or may not be necessary for various
tasks. I don't see any principled reason by OBO ontology derived RDF should
differ on matters of findability, queryability and mashability of other
work done in LOD.


> Especially in this ontology where most of these axioms seem to be about
> the relation of one anatomical part to another.
>

That's fine. If the facts that HAO aren't useful to you, don't use them.


> I can see the utility in asserting that species X has the following
> anatomical characters URI_1, URI_2, URI_3, URI_4
>
> Heavy inferencing is not needed to make that useful, while proper
> dereferencing of the URI's is important.
>

Both have their place and there are many different uses. So we try to
provide the ability to do both. If you advocate one over the other then I
expect you simply are not aware of these other uses.

>
> Do people really think that when the Beetle people start on their
> vocabulary they should do it in OBO?
>

Yes. OBO is the largest community of disparate developers of biomedical
ontologies. There is an emphasis on good quality work and a large biologist
community that can and does weigh in on issue that are important for
understanding and correctly representing the data. We (Well, I) don't claim
to have all the answers, but there are a lot of good people, and we are
sharing what we learn as we go on


> If I am wrong, feel free to set me straight. :-)
>

I have tried to point out where I believe you are wrong, but in the end it
is your choice of whether you are interested in working with anything or
anyone you want. It would be a shame, however, to make such a choice based
on flawed information, and from what I see you don't have a good
understanding of our effort yet.

Best,
Alan


>
> Respectfully,
>
> - Pete
>
> P.S. Kudo's to Sig.ma http://sig.ma/ which which handled a HAOL URI I
> sent to it. But it did stop inferencing after ~1000 with a long way to go.
>
>         *All that that is probably needed is for Sig.ma to recognizes that
> it is URI for an anatomical character in the Hymenoptera Ontology which has
>          a human readable description page at a related url. Also at the
> Hymenoptera ontology has as its subject the order Hymenoptera (an Insect
> Order)
>
>
> On Wed, Feb 22, 2012 at 3:32 PM, Kingsley Idehen <kidehen@openlinksw.com>wrote:
>
>>  On 2/22/12 3:14 PM, Peter DeVries wrote:
>>
>> Hi Kingsley,
>>
>>  I don't know if it is that simple, it might fix the blank nodes on my
>> endpoint.
>>
>>  However, the OWL version of the ontology has this
>>
>>  http://purl.org/obo/owl/HAO#HAO_0000526 URI
>>
>>  Which shows up as in the KB http://purl.obolibrary.org/obo/HAO_0000526(KB View median ocellus
>> http://bit.ly/wOuYkf )
>>
>>  Which resolves on the web to this
>> http://api.hymao.org/public/ontology_class/show_expanded/1779
>>
>>  So how will this work in general on the LOD where there seems to be a
>> different set of best practices?
>>
>>  What is the best way to add this as a schema to my endpoint?
>>
>>  - Pete
>>
>>
>> Pete,
>>
>> You can execute the following via your conductor UI or iSQL:
>>
>> sparql
>> define get:soft "add"
>> INSERT INTO <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>
>> {?s rdfs:isDefinedBy <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>.
>> <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>
>> <http://open.vocab.org/terms/defines><http://open.vocab.org/terms/defines>?s.
>> <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>a owl:Ontology .
>> ?s <http://www.w3.org/2007/05/powder-s#describedby><http://www.w3.org/2007/05/powder-s#describedby>
>> <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>
>> }
>> FROM <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>
>> WHERE { optional {?s rdfs:subClassOf ?o}. optional {?s rdfs:subPropertyOf
>> ?o}. optional {?s owl:equivalentClass ?o}. optional {?s
>> owl:equivalentProperty ?o}. optional {?s a ?o}}
>> ;
>>
>> sparql
>> select distinct * from <http://purl.obolibrary.org/obo/hao.owl><http://purl.obolibrary.org/obo/hao.owl>
>> where {?s ?p ?o} ;
>>
>> rdfs_rule_set ('hao-rule', 'http://purl.obolibrary.org/obo/hao.owl') .
>>
>>
>> Once that's done, you have a local version of the ontology that
>> associated with an inference rule. After that, simply refer to the
>> inference rule in your SPARQL queries via its pragma or use the &inf
>> parameter re. faceted browsing URL.
>>
>>
>> Hope this helps? If not, we can move it to the Virtuoso forum and iron
>> things out etc..
>>
>> Kingsley
>>
>>
>>
>>
>> On Wed, Feb 22, 2012 at 1:12 PM, Kingsley Idehen <kidehen@openlinksw.com>wrote:
>>
>>>  On 2/22/12 9:15 AM, Alan Ruttenberg wrote:
>>>
>>>
>>>
>>> On Wed, Feb 22, 2012 at 2:55 AM, Peter DeVries <pete.devries@gmail.com>wrote:
>>>
>>>> Hi Alan,
>>>>
>>>>  Here is an example from the Hymenoptera Anatomy Ontology
>>>>
>>>>  http://obofoundry.org/cgi-bin/detail.cgi?id=hymenoptera_anatomy
>>>>
>>>>  Example via my endpoint
>>>>
>>>> http://lsd.taxonconcept.org/describe/?url=http://purl.obolibrary.org/obo/HAO_0001000&sid=1151
>>>>
>>>
>>>  Ok, I see. The problem here is the one I alluded to. We use OWL and
>>> the Virtuoso endpoint you are using doesn't understand it. I am ccing
>>> Kingsley and officially "tsk"ing him. We've known each other long enough
>>> that I'd have hoped he would have got some OWL religion by now.
>>>
>>>
>>>  Yes, I am an OWL believer!
>>>
>>> Pete: use the ontology in question as the basis for a Virtuoso Inference
>>> rule, then invoke the describe URL with the parameter for inference context
>>> application.
>>>
>>> Example:
>>>
>>> 1.
>>> http://linkeddata.uriburner.com/describe/?url=http%3A%2F%2Fpurl.org%2FNET%2Fgooglevocab%23Product-- no inference context
>>> 2.
>>> http://linkeddata.uriburner.com/describe/?uri=http%3A%2F%2Fpurl.org%2FNET%2Fgooglevocab%23Product&inf=oplweb-- inference context applied
>>> 3.
>>> http://virtuoso.openlinksw.com/presentations/SPARQL_Tutorials/SPARQL_Tutorials_Part_5/SPARQL_Tutorials_Part_5.html#(53)
>>>
>>>  By that I don't mean doing full reasoning arbitrary combinations of
>>> RDF from different sources - but at least correctly parsing OWL is
>>> something I would have hoped be implemented by now.
>>>
>>>  What the HAO should look like in a simple linked data browser (where
>>> some of the 'data' is in the form of OWL class definitions)  is something
>>> like this:
>>>
>>>
>>> http://www.ontobee.org/browser/rdf.php?o=HAO&iri=http://purl.obolibrary.org/obo/HAO_0001000
>>>
>>>  Virtuoso knows how to do CBD (in fact ontobee is virtuoso as triple
>>> store too), but the page generator on your endpoint isn't doing it on the
>>> page it generates. Instead it does straight links out and links in. The
>>> links in are from an annotation on the axiom. It would be better there to
>>> not display anything, or to display a note saying there is an annotation
>>> that it can't display, or to properly parse the annotation (which would
>>> require another CBD query starting at the annotation) and display it.
>>> Kingsley, the source for ontobee is available - why not pick it up or use
>>> it as a spec for how to properly display OWL?
>>>
>>>
>>>  Will do.
>>>
>>>
>>>
>>>  The assertional content is:
>>>
>>>  *class: tentorio-antennal muscle *
>>>  subClassOf: antennal muscle
>>> subClassOf: attached_to some scape
>>> subClassOf: attached_to some anterior tentorial arm
>>>
>>> The SPARQL queries used to collect the content on the page are available
>>> by a link at the bottom of the page.
>>>
>>>  The RDF that is generated can be seen by view source. I can see
>>> desirable improvements, e.g. adding some isDefinedBy links, and including
>>> all the inferrred superclasses, but that's not directly to your question,
>>> and is the sort of thing I mean when I say we will be working further on
>>> the RDF for the terms.
>>>
>>>  Doing a GET for application/rdf+xml to the purl will pull in
>>> approximately the same RDF. The HAO folks decided to make their own browser
>>> for their content instead of using ontobee, which is fine. What we've tried
>>> to promote within the OBO community is the use of semweb technology as one
>>> form of dissemination, use of stable URIs as identifiers, and the ability
>>> to provide both human readable pages and machine readable pages. I'll get
>>> to Bernard's email later, but I hope you and he will realize that promoting
>>> and starting to successfully achieve implementation of these values for the
>>> OBO ontologies will yield very good value for the semantic web. There is an
>>> incredible amount of very well curated biological knowledge that is
>>> constantly being generated by that community.
>>>
>>>  I was thinking that the character states described in this ontology
>>>> could be attached to species like this.
>>>>
>>>>  <http://lod.taxonconcept.org/ses/z9oqP#Species> <somePredicate>
>>>> <someHAOCharacterState>
>>>>
>>>
>>>  Are there what you would call character states in the example you gave
>>> above? I understand it as a bit of anatomy knowledge - what part connects
>>> to what.  I guess what I need to know is what, if any, assertions would you
>>> make given that you now see what was intended to be seen? Do you need a
>>> flattening predicate (my preference would be to use an annotation property)
>>> that more directly links the species concept to scape and anterior
>>> tentorial arm? What should it be?
>>>
>>>
>>>
>>>> And be properly interpreted on Sig.ma example http://bit.ly/zfbimy
>>>>
>>>
>>>  I'll have to look at that later. But I would ask of it and of your
>>> endpoint: Is there some obligation to properly interpret what is stated
>>> according the web standard OWL? Surely the obligation for proper
>>> interpretation needs to be a mutual effort?
>>>
>>>  From my point of view I want to make the OBO LOD be useful and I
>>> understand that there are different communities that would use it. I think
>>> we need to be true to the representation we choose - it provides a lot of
>>> benefits for query, consistency checking, etc. But we're also trying to be
>>> polite to others and are open to augmenting it so that it can be of utility
>>> to others. The key is for us to first understand how we should do that, for
>>> you to understand what is currently being said, and when we're done for
>>> your client applications to either represent what we've said too, or learn
>>> how to ignore it.
>>>
>>> -Alan
>>>
>>>  ps. For some examples of how using OWL is yielding tangible benefits
>>> you could browse http://groups.google.com/group/fma-owl-2009
>>> In that effort I'm slowly working through translating a human anatomy
>>> ontology, the FMA, into OWL, and in the process discovering (and having
>>> fixed) thousands of errors.
>>>
>>>   Yep!
>>>
>>> SeeAlso:
>>>
>>> 1. https://plus.google.com/s/inference%20owl%20linked%20data%20idehen-- fuzzy search on G+ posts about virtues of OWL and Inference re. data
>>> quality improvements (note: LOD cloud cache is still undergoing maintenance
>>> re. LOD2 so some live demo links might not work).
>>>
>>>
>>>
>>>>  - Pete
>>>>
>>>>
>>>>  On Wed, Feb 22, 2012 at 12:27 AM, Alan Ruttenberg <
>>>> alanruttenberg@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>  On Tue, Feb 21, 2012 at 3:51 PM, Peter DeVries <
>>>>> pete.devries@gmail.com> wrote:
>>>>>
>>>>>> Hi Juan,
>>>>>>
>>>>>>  Thanks for this. I read the paper. They have an "OWL" version of
>>>>>> this OBO vocabulary but it seems to not be a fully mapped OWL version as
>>>>>> described in your paper.
>>>>>>
>>>>>
>>>>>  Which one?
>>>>>
>>>>>
>>>>>>
>>>>>>  In this particular use case I was thinking of applying the terms
>>>>>> and properties described by the ontology to my species concepts.
>>>>>>
>>>>>
>>>>>  This is a nice example and should be supported. An immediate
>>>>> suggestion is to send mail to  obo-discuss@lists.sourceforge.net as
>>>>> that is where you will find both the developers of the OBO LOD support as
>>>>> well as the biologist community.
>>>>>
>>>>>
>>>>>>
>>>>>>  For instance:
>>>>>>
>>>>>>  species X has this metabolic pathway.  (which would be useful for
>>>>>> finding species with potential drug interactions or other chemical
>>>>>> reactions)
>>>>>>
>>>>>
>>>>>  We're in the process of revising BFO and the relations ontology. A
>>>>> draft version is at http://purl.obolibrary.org/obo/bfo.owl
>>>>>
>>>>>  In terms of that, your statement might be represented as
>>>>>
>>>>>  <species> subclassOf 'has site of' some <metabolic process>
>>>>> if you want to represent that all members of the species have the
>>>>> process
>>>>>
>>>>>  or
>>>>>
>>>>>  <anonymous instance of species> 'has site of'  <anonymous instance
>>>>> of process>
>>>>>
>>>>>  e.g.
>>>>>
>>>>>  @prefix obo: <http://purl.obolibrary.org/obo/>
>>>>> @prefix hasSiteOf: <http://purl.obolibrary.org/obo/BFO_0000067>
>>>>>
>>>>>  _:a rdf:type <species>
>>>>> _:b rdf:type <metabolic process>
>>>>> _:a hasSiteOf: _:b
>>>>>
>>>>>  If you want to represent that the process happens in some
>>>>> individuals of this species.
>>>>>
>>>>> In the above I write <species> where you would write the uri of your
>>>>> species class (e.g. http://purl.obolibrary.org/obo/NCBITaxon_9903) ,
>>>>> and <metabolic process> where you would write the uri of your process class
>>>>> (e.g. http://purl.obolibrary.org/obo/GO_0030245).
>>>>>
>>>>>
>>>>>
>>>>>> I don't think this use case requires the full OBO  relationships,
>>>>>> just a mapping ontology that connects terms and characters to those in the
>>>>>> OBO ontology.
>>>>>>
>>>>>
>>>>>  Not sure what you mean by this.
>>>>>
>>>>>
>>>>>>
>>>>>>  Doing it this way you might get a species "tagged" with something
>>>>>> that is not appropriate but that could be detected by some service that
>>>>>> analyzes the statements made
>>>>>> in the species concept markup
>>>>>>
>>>>>
>>>>>  Example?
>>>>>
>>>>>> .
>>>>>> My guess is that some of the OBO ontologies (if fully entailed) will
>>>>>> not play well on the LOD cloud, but they would play a useful role when
>>>>>> mapped as I described.
>>>>>>
>>>>>
>>>>>  Examples would be helpful. But note that it is our intention that we
>>>>> *do* play well on the LOD cloud. However also note, we work in OWL and much
>>>>> of what we say is about types/classes, and many(most?) linked data browsers
>>>>> don't understand or present OWL in a meaninful way. One of the reasons we
>>>>> have developed ontobee is that it is designed to do justice to linked
>>>>> ontology terms that are defined in terms of OWL. So class expressions are
>>>>> not left as messes of bnodes, but instead parsed and displayed as OWL. I'd
>>>>> like to see more linked data browsers do the same.
>>>>>
>>>>>
>>>>>> Does my interpretation seem appropriate to you or am I missing
>>>>>> something?
>>>>>>
>>>>>
>>>>>  I hope you are missing something :) But please elaborate so we can
>>>>> see.
>>>>>
>>>>>
>>>>>>
>>>>>>  Thanks,
>>>>>>
>>>>>>  - Pete
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 21, 2012 at 9:39 AM, Juan Sequeda <juanfederico@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Peter
>>>>>>>
>>>>>>>  You may want to take a look at this:
>>>>>>> http://www.ncbi.nlm.nih.gov/pubmed/21388572
>>>>>>>
>>>>>>>  The implementation of the OBO to OWL mapping work is part of
>>>>>>> official Gene Ontology project.
>>>>>>>
>>>>>>> Juan Sequeda
>>>>>>> +1-575-SEQ-UEDA
>>>>>>> www.juansequeda.com
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 20, 2012 at 7:46 PM, Peter DeVries <
>>>>>>> pete.devries@gmail.com> wrote:
>>>>>>>
>>>>>>>> How do OBO type ontologies work in the Linked Open Data cloud.
>>>>>>>>
>>>>>>>>  One that I recently loaded has a large number of blank nodes.
>>>>>>>>
>>>>>>>>  It the idea that these will be mapped to LOD URI's?
>>>>>>>>
>>>>>>>>  Thanks,
>>>>>>>>
>>>>>>>>  - Pete
>>>>>>>>
>>>>>>>>  --
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------------
>>>>>>>> Pete DeVries
>>>>>>>> Department of Entomology
>>>>>>>> University of Wisconsin - Madison
>>>>>>>> 445 Russell Laboratories
>>>>>>>> 1630 Linden Drive
>>>>>>>> Madison, WI 53706
>>>>>>>> Email: pdevries@wisc.edu
>>>>>>>> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
>>>>>>>> Bases
>>>>>>>> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------------------
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>  --
>>>>>>
>>>>>> ------------------------------------------------------------------------------------
>>>>>> Pete DeVries
>>>>>> Department of Entomology
>>>>>> University of Wisconsin - Madison
>>>>>> 445 Russell Laboratories
>>>>>> 1630 Linden Drive
>>>>>> Madison, WI 53706
>>>>>> Email: pdevries@wisc.edu
>>>>>> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
>>>>>> Bases
>>>>>> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>>>>>>
>>>>>> --------------------------------------------------------------------------------------
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>  --
>>>>
>>>> ------------------------------------------------------------------------------------
>>>> Pete DeVries
>>>> Department of Entomology
>>>> University of Wisconsin - Madison
>>>> 445 Russell Laboratories
>>>> 1630 Linden Drive
>>>> Madison, WI 53706
>>>> Email: pdevries@wisc.edu
>>>> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
>>>> Bases
>>>> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>>>>
>>>> --------------------------------------------------------------------------------------
>>>>
>>>
>>>
>>>
>>>   --
>>>
>>> Regards,
>>>
>>> Kingsley Idehen 
>>> Founder & CEO
>>> OpenLink Software
>>> Company Web: http://www.openlinksw.com
>>> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>> Twitter/Identi.ca handle: @kidehen
>>> Google+ Profile: https://plus.google.com/112399767740508618350/about
>>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>>
>>>
>>>
>>>
>>
>>
>>  --
>>
>> ------------------------------------------------------------------------------------
>> Pete DeVries
>> Department of Entomology
>> University of Wisconsin - Madison
>> 445 Russell Laboratories
>> 1630 Linden Drive
>> Madison, WI 53706
>> Email: pdevries@wisc.edu
>> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
>> Bases
>> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>>
>> --------------------------------------------------------------------------------------
>>
>>
>>
>> --
>>
>> Regards,
>>
>> Kingsley Idehen 
>> Founder & CEO
>> OpenLink Software
>> Company Web: http://www.openlinksw.com
>> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>> Twitter/Identi.ca handle: @kidehen
>> Google+ Profile: https://plus.google.com/112399767740508618350/about
>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>
>>
>>
>>
>
>
> --
>
> ------------------------------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> Email: pdevries@wisc.edu
> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
> Bases
> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>
> --------------------------------------------------------------------------------------
>

Received on Thursday, 23 February 2012 19:37:16 UTC