
Re: using OBO in owl format to describe data

From: Mary Montoya <mhm@ncgr.org>
Date: Wed, 11 Oct 2006 10:52:01 -0600
Message-ID: <452D2131.9000607@ncgr.org>
To: Xiaoshu Wang <wangxiao@musc.edu>
CC: semantic-web@w3.org, public-semweb-lifesci@w3.org

So I should define properties for my data so that I can then state (pardon
my awkwardness with the shorthand syntax):

@prefix gene: <http://example.com/genes#> .
gene:Y rdfs:subClassOf obo:Term .
_:x a gene:Y ;
    rdfs:label    "my frog gene" ;
    mystuff:sequence  "atgcgga...." ;
    mystuff:chromosomeNumber   3 ;
    mystuff:startCoordinate    12345 ;
    mystuff:endCoordinate  7890 .

So then, if my friend at the bullfrog genome project states

@prefix gene: <http://example.com/genes#> .
gene:Z rdfs:subClassOf obo:Term .
_:w a gene:Z ;
    otherstuff:sequenceID  "786451" ;
    otherstuff:chromoNum   "3" ;
    otherstuff:comment    "source unknown" ;
    otherstuff:type  "pseudogene" .

both will be of type obo:Term, but since obo:Term doesn't define any
properties for individuals, what good does it do me?  I can't coordinate
the data.
I guess if I create some properties that I think will be useful,
assign their domain to be obo:Term, and then tell my friend to also use
them to describe their instance data, then our data will interoperate.
And you are saying that if I do a good job of identifying useful
properties, over time many people will also start using them.
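
To make that concrete, here is a minimal sketch of what the two records
might look like once we both use the same property.  Everything in the
shared: namespace is hypothetical, and the obo: namespace URI is only a
placeholder, since I don't know of an agreed one:

```turtle
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix gene:   <http://example.com/genes#> .
@prefix obo:    <http://example.com/obo#> .      # placeholder URI (assumption)
@prefix shared: <http://example.com/shared#> .   # hypothetical shared vocabulary

# One property that both projects agree to use (hypothetical name).
shared:chromosomeNumber a rdf:Property ;
    rdfs:domain obo:Term ;
    rdfs:range  xsd:integer .

gene:Y rdfs:subClassOf obo:Term .
gene:Z rdfs:subClassOf obo:Term .

# My frog gene.
_:x a gene:Y ;
    rdfs:label "my frog gene" ;
    shared:chromosomeNumber 3 .

# My friend's bullfrog gene, now described with the same property.
_:w a gene:Z ;
    shared:chromosomeNumber 3 .
```

With both records using shared:chromosomeNumber, a single query on that
property would pick up both genes.  As I understand it, rdfs:domain
restricts nothing here; it only lets a reasoner infer that whatever
carries the property is an obo:Term.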

Mary


Xiaoshu Wang wrote:

>--Mary,
>
>I am not familiar with OBO, but what you are suggesting is actually what
>should be avoided in RDF.  RDF is based on the open world assumption, but
>having one superclass for the purpose of enforcing certain annotation
>properties is closed-world thinking from the object-oriented world.
>
>In your case, I wonder whether there are any concrete criteria that make
>one resource an obo:Term and another not.  If not, why invent another term
>for it?  Doing so will at the very least make the statements odd to read.
>For instance, assume there is an instance of gene class Y named x.  Then we
>can say,
>
>@prefix gene: <http://example.com/genes#> .
>gene:Y rdfs:subClassOf obo:Term .
>
>Then it is natural to say,
>
>_:x a gene:Y . 
>
>but it would be very odd to say, 
>
>_:x a obo:Term .
>
>In addition, it will incur unnecessary computational complexity for an RDF
>engine.
>
>This kind of pseudo-superclass pattern is also used elsewhere.  Everything
>in MGED is an MGED:Ontology and everything in BIO-PAX is a bio-pax:entity.
>But the main purpose of such a superclass is to enforce certain
>AnnotationProperty values or to group terms.  No offense to those groups,
>but I think the design is wrong and should be avoided.
>
>Designing a superclass is different from designing certain properties, like
>obo:name, obo:id etc., because it is still natural to say that something is
>a gene that has a certain name, id, etc., as in the following:
>
>_:x a gene:Y ;
>   obo:name "Some gene" ;
>   obo:id   "12345" .
>
>Better still, you shouldn't need to invent those properties, because each
>resource should have a URI, and rdfs:label can be used for the name.  There
>are other ontologies, like Dublin Core, at your disposal as well.
>
>You probably wonder, then, how interoperability can be ensured if there
>is no way to constrain it.  My take on this is to think long term.
>Over time, a few commonly used ontologies will be shared by people who have
>the same interests.  Take the economy as an analogy: a controlled market has
>short-term stability but is destined to collapse badly at some point, while
>a free market economy has occasional turmoil, especially at the beginning,
>but is more robust and stable in the long run.  So don't worry about what
>others will do in the future.  Just consider whether there are any
>ontologies that can help you adequately describe your data.
>
>Xiaoshu 
>
>
>  
>
>>-----Original Message-----
>>From: semantic-web-request@w3.org 
>>[mailto:semantic-web-request@w3.org] On Behalf Of Mary Montoya
>>Sent: Wednesday, October 11, 2006 1:09 AM
>>To: semantic-web@w3.org; public-semweb-lifesci@w3.org
>>Subject: using OBO in owl format to describe data
>>
>>
>>I have a question about using the OpenBiomedicalOntologies 
>>such as the SequenceOntology in owl format to describe data 
>>resident in my local biological database. 
>>
>>It seems desirable to leverage subclass relationships of 
>>terms in the hierarchy of the SequenceOntology and to have 
>>all the terms there rooted in a common parent obo:Term class. 
>> OBO defines certain information to be provided for all OBO 
>>terms such as name, id, definition, etc.  These provide 
>>descriptive information of the class itself not properties of 
>>members of the class.  So it seems all obo:Terms would have 
>>"class values" for a name, id, def, etc.  I would then expect 
>>to find classes in SequenceOntology that are defined as 
>>subClassOf obo:Term and reflect the hierarchical structure of 
>>those SequenceOntology terms, for example, so:Gene as a 
>>subClassOf so:Region which is a subClassOf 
>>so:Located_sequence_feature which is a subClassOf obo:Term.  
>>The problem  is that the owl class definitions I've seen for 
>>OBO terms don't also include property definitions for 
>>individuals of the class.  So an individual of so:Gene 
>>doesn't have a property for name, id, def, etc that I can 
>>provide values for from my database.  There are only these 
>>class description properties often defined using rdfs:label, 
>>rdfs:comment or as annotation type properties. 
>>My question is:  How can I use these publicly available 
>>ontologies to hold values for my data? They seem poised for 
>>interoperability if these properties were available to 
>>individuals of these classes.
>>
>>Here is one sequence ontology definition I found for gene in 
>>owl format ( others I've seen look similar )
>>
>> <owl:Class rdf:ID="SO_0000704">
>>       <rdfs:label xml:lang="en">gene</rdfs:label>
>>       <rdfs:comment
>>rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A 
>>locatable region of genomic sequence, corresponding to a unit 
>>of inheritance, which is associated with regulatory regions, 
>>transcribed regions and/or other functional sequence 
>>regions</rdfs:comment>
>>       <rdfs:subClassOf rdf:resource="#SO_0000001"/>
>> </owl:Class>
>>
>>I thought something like this would be more useful:
>>
>>   <owl:Class rdf:about="&so;SO_0000704">
>>        <obo:classId>SO:0000704</obo:classId>
>>        <obo:className>gene</obo:className>
>>        <obo:classDef>
>>            "A locatable region of genomic sequence, 
>>corresponding to a unit of 
>>            inheritance, which is associated with regulatory regions, 
>>            transcribed regions and/or other functional 
>>sequence regions" 
>>            [SO:rd] 
>>        </obo:classDef>
>>        <rdfs:subClassOf rdf:resource="&so;SO_0000001"/>
>>    </owl:Class>
>>
>>with the root parent Term defined within an obo namespace as 
>>    <owl:Class rdf:about="&obo;Term">
>>        <obo:classId>OBO:Term</obo:classId>
>>        <obo:className>term</obo:className>
>>        <obo:classDef>
>>            Term is a blah, blah
>>        </obo:classDef>
>>        <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty rdf:resource="&obo;name"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>        <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty rdf:resource="&obo;id"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>        <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty rdf:resource="&obo;def"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    0
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>    </owl:Class>
>>
>>Then I could do something like this:
>>
>>    <owl:Class rdf:about="&mystuff;MyGene">
>>        <rdfs:subClassOf rdf:resource="&so;SO_0000704"/>
>>        <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty 
>>rdf:resource="&mystuff;chromosomeNumber"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>         <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty 
>>rdf:resource="&mystuff;startCoordinate"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>         <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty 
>>rdf:resource="&mystuff;endCoordinate"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>        <rdfs:subClassOf>
>>            <owl:Restriction>
>>                <owl:onProperty rdf:resource="&mystuff;sequence"/>
>>                <owl:minCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:minCardinality>
>>                <owl:maxCardinality 
>>rdf:datatype="&xsd;nonNegativeInteger">
>>                    1
>>                </owl:maxCardinality>
>>            </owl:Restriction>
>>        </rdfs:subClassOf>
>>    </owl:Class>
>>
>>So now I have defined a class MyGene that extends obo:Term by way of the 
>>Sequence Ontology, and I can define individuals with property 
>>values for the following from my database:
>>obo:name
>>obo:id
>>obo:def
>>mystuff:sequence
>>mystuff:endCoordinate
>>mystuff:startCoordinate
>>mystuff:chromosomeNumber
>>
>>It seems presumptuous to define properties for individuals (name, 
>>id, etc.) as well as class properties (className, classId, etc.) 
>>for public ontologies such as the OBO ontologies, but possibly 
>>quite useful for interoperability's sake.  Any comments would be 
>>welcome.
>>
>>Mary Montoya
>>
>>VPIN project team
>>NCGR
>>
Received on Wednesday, 11 October 2006 16:52:31 GMT
