Re: using OBO in owl format to describe data

Xiaoshu,

You raise the issue of open vs. closed world, which has been on my mind
whenever any translation to OWL comes up.  I have ahead of me some
databases in OBO that I will be looking to convert to OWL.  I intend to
build constructs that enable reasoners to tell me new things about the
data.

Having had many surprises in learning how the open world works and how
OWL works, I'm curious to know how this is handled.  I have not yet taken
on the task of learning OBO, as I hope not to have to.  I would like to
feel confident that I can convert a file into OWL and use it.  However, I
can't see how an automated translator, without some user interaction or
settings, can know, for example, that a set of subclasses must be
disjoint.  Is the translation primarily syntactic?

What I can't imagine is how an OBO converter would know whether that is
what I meant when we didn't even know ourselves.  That's my main question.
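
To make the disjointness concern concrete, here is a minimal sketch in
Turtle.  The ex: namespace and the class and individual names are made up
for illustration; they are not taken from any OBO file or from the BioPAX
ontology.  It only shows the kind of axiom that, as far as I can tell, a
purely syntactic translation would have no way to supply.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/sketch#> .   # hypothetical namespace

# What a syntactic translation of two "is_a" style lines might produce:
# subclass links and nothing more.
ex:Protein       a owl:Class ; rdfs:subClassOf ex:PhysicalEntity .
ex:SmallMolecule a owl:Class ; rdfs:subClassOf ex:PhysicalEntity .

# Under the open world assumption nothing above forbids this statement,
# so a reasoner would accept it.
ex:thing1 a ex:Protein , ex:SmallMolecule .

# Only an explicit axiom, added by someone who knows the intent,
# turns the statement about ex:thing1 into an inconsistency.
ex:Protein owl:disjointWith ex:SmallMolecule .
```

Without the last triple the two sibling classes are free to overlap; with
it, the individual above becomes a contradiction a reasoner can report.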

By the way, I also agree with your suggestion in response to Mary's
question and with the criticism of BioPAX.  Open world and OWL are
technologies that require a lot of careful thinking and getting used to.
It takes a very different kind of thinking to get it right.  We didn't
fully appreciate this in the early days of BioPAX, and we didn't get it
right.  In fact, the first release of Level 1 didn't make the physical
entity classes disjoint, which is the most common mistake beginners make
when going from closed world to open world modeling.

Now a quick plug for two papers I still think are the most enlightening
for anyone embarking on OWL: Matthew Horridge's Protégé OWL Tutorial
(http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf) and
The OWL Pizza - common errors
(http://www.co-ode.org/resources/papers/ekaw2004.pdf).
I just re-read that one yesterday morning.

On BioPAX we are still learning about OWL and how best to use it and
reasoning technology.  We are working to correct the errors and mistakes
we know about and are educating ourselves further, in the hope of getting
it right in future releases.

Sincerely,
Joanne


On Oct 11, 2006, at 2:47 PM, Xiaoshu Wang wrote:

>
> --Mary,
>
> I am not familiar with OBO, but what you are suggesting is actually what
> should be avoided in RDF.  RDF is based on the open world assumption.
> But having one superclass for the purpose of enforcing certain annotation
> properties is closed world thinking from the Object Oriented world.
>
> In your case, I wonder if there are any concrete criteria that make one
> resource an obo:Term but another not?  If not, why invent another term
> for it?  Doing so will at least make the statement very odd to
> understand.  For instance, assume there is an instance of gene class Y
> named x.  Then we can say,
>
> @prefix gene: <http://example.com/genes#> .
> gene:Y rdfs:subClassOf obo:Term .
>
> Then it is natural to say,
>
> _:x a gene:Y .
>
> but it would be very odd to say,
>
> _:x a obo:Term .
>
> In addition, it will incur unnecessary computational complexity for the
> RDF engine.
>
> This kind of pseudo-superclass pattern is also used elsewhere.
> Everything in MGED is an MGED:Ontology and everything in BIO-PAX is a
> bio-pax:entity.  But the main purpose of this superclass is to enforce
> certain AnnotationProperty usage or to group terms.  No offense to those
> groups, but I think the design is wrong and should be avoided.
>
> Designing a superclass is different from designing certain properties,
> like obo:name, obo:id etc., because it is still natural to say that
> something is a gene but has a certain name and id etc., like the
> following,
>
> _:x a gene:Y ;
>    obo:name "Some gene" ;
>    obo:id   "12345" .
>
> And best of all, you shouldn't invent those properties, because each
> resource should have a URI and rdfs:label can be used for the name.  And
> there are other ontologies like Dublin Core at your disposal as well.
>
> You are probably wondering, then, how interoperability can be ensured if
> there is no way to constrain it.  My take on this is to think long term.
> Over time, a few commonly used ontologies will be shared by people who
> have the same interests.  Take the economy as an analogy: a controlled
> market has short-term stability but is destined to collapse badly at
> some point, whereas a free market economy has occasional turmoil,
> especially at the beginning, but is more robust and stable in the long
> run.  So, don't worry about what others will do in the future.  Just
> think about whether there are any ontologies that can help you
> adequately describe your data.
>
> Xiaoshu
>
>
>> -----Original Message-----
>> From: semantic-web-request@w3.org
>> [mailto:semantic-web-request@w3.org] On Behalf Of Mary Montoya
>> Sent: Wednesday, October 11, 2006 1:09 AM
>> To: semantic-web@w3.org; public-semweb-lifesci@w3.org
>> Subject: using OBO in owl format to describe data
>>
>>
>> I have a question about using the OpenBiomedicalOntologies
>> such as the SequenceOntology in owl format to describe data
>> resident in my local biological database.
>>
>> It seems desirable to leverage subclass relationships of
>> terms in the hierarchy of the SequenceOntology and to have
>> all the terms there rooted in a common parent obo:Term class.
>>  OBO defines certain information to be provided for all OBO
>> terms such as name, id, definition, etc.  These provide
>> descriptive information of the class itself not properties of
>> members of the class.  So it seems all obo:Terms would have
>> "class values" for a name, id, def, etc.  I would then expect
>> to find classes in SequenceOntology  that are defined as
>> subClassOf obo:Term and reflect the hierarchical structure of
>> those SequenceOntology terms, for example, so:Gene as a
>> subClassOf so:Region which is a subClassOf
>> so:Located_sequence_feature which is a subClassOf obo:Term.
>> The problem  is that the owl class definitions I've seen for
>> OBO terms don't also include property definitions for
>> individuals of the class.  So an individual of so:Gene
>> doesn't have a property for name, id, def, etc that I can
>> provide values for from my database.  There are only these
>> class description properties often defined using rdfs:label,
>> rdfs:comment or as annotation type properties.
>> My question is:  How can I use these publicly available
>> ontologies to hold values for my data? They seem poised for
>> interoperability if these properties were available to
>> individuals of these classes.
>>
>> Here is one sequence ontology definition I found for gene in
>> owl format ( others I've seen look similar )
>>
>>  <owl:Class rdf:ID="SO_0000704">
>>        <rdfs:label xml:lang="en">gene</rdfs:label>
>>        <rdfs:comment
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A
>> locatable region of genomic sequence, corresponding to a unit
>> of inheritance, which is associated with regulatory regions,
>> transcribed regions and/or other functional sequence
>> regions</rdfs:comment>
>>        <rdfs:subClassOf rdf:resource="#SO_0000001"/> </owl:Class>
>>
>> I thought something like this would be more useful:
>>
>>    <owl:Class rdf:about="&so;SO_0000704">
>>         <obo:classId>SO:0000704</obo:classId>
>>         <obo:className>gene</obo:className>
>>         <obo:classDef>
>>             "A locatable region of genomic sequence,
>> corresponding to a unit of
>>             inheritance, which is associated with regulatory regions,
>>             transcribed regions and/or other functional
>> sequence regions"
>>             [SO:rd]
>>         </obo:classDef>
>>         <rdfs:subClassOf rdf:resource="&so;SO_0000001"/>
>>     </owl:Class>
>>
>> with the root parent Term defined within an obo namespace as
>>     <owl:Class rdf:about="&obo;Term">
>>         <obo:classId>OBO:Term</obo:classId>
>>         <obo:className>term</obo:className>
>>         <obo:classDef>
>>             Term is a blah, blah
>>         </obo:classDef>
>>         <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty rdf:resource="&obo;name"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>         <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty rdf:resource="&obo;id"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>         <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty rdf:resource="&obo;def"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     0
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>     </owl:Class>
>>
>> Then I could do something like this:
>>
>>     <owl:Class rdf:about="&mystuff;MyGene">
>>         <rdfs:subClassOf rdf:resource="&so;SO_0000704"/>
>>         <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty
>> rdf:resource="&mystuff;chromosomeNumber"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>          <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty
>> rdf:resource="&mystuff;startCoordinate"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>          <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty
>> rdf:resource="&mystuff;endCoordinate"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>         <rdfs:subClassOf>
>>             <owl:Restriction>
>>                 <owl:onProperty rdf:resource="&mystuff;sequence"/>
>>                 <owl:minCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:minCardinality>
>>                 <owl:maxCardinality
>> rdf:datatype="&xsd;nonNegativeInteger">
>>                     1
>>                 </owl:maxCardinality>
>>             </owl:Restriction>
>>         </rdfs:subClassOf>
>>     </owl:Class>
>>
>> so now I have defined a class MyGene that extends from obo to
>> sequence ontology and I can define individuals with property
>> values for the following from my database:
>> obo:name
>> obo:id
>> obo:def
>> mystuff:sequence
>> mystuff:endCoordinate
>> mystuff:startCoordinate
>> mystuff:chromosomeNumber
>>
>> It seems presumptuous to define properties for individuals (
>> name, id, etc ) as well as class properties ( className,
>> classID, etc ) for public ontologies such as obo ontologies
>> but possibly quite useful for interoperability sake.  Any
>> comments would be welcome.
>>
>> Mary Montoya
>>
>> VPIN project team
>> NCGR
>>

Received on Thursday, 12 October 2006 07:43:12 UTC