- From: Mary Montoya <mhm@ncgr.org>
- Date: Wed, 11 Oct 2006 10:52:01 -0600
- To: Xiaoshu Wang <wangxiao@musc.edu>
- CC: semantic-web@w3.org, public-semweb-lifesci@w3.org
So I should define properties for my data so that I can then state (pardon
my awkwardness with the shorthand syntax):
@prefix gene: <http://example.com/genes#> .
gene:Y rdfs:subClassOf obo:Term .
_:x a gene:Y ;
    rdfs:label "my frog gene" ;
    mystuff:sequence "atgcgga...." ;
    mystuff:chromosomeNumber 3 ;
    mystuff:startCoordinate 12345 ;
    mystuff:endCoordinate 7890 .
So then if my friend at the bullfrog genome project states:
@prefix gene: <http://example.com/genes#> .
gene:Z rdfs:subClassOf obo:Term .
_:w a gene:Z ;
    otherstuff:sequenceID "786451" ;
    otherstuff:chromoNum "3" ;
    otherstuff:comment "source unknown" ;
    otherstuff:type "pseudogene" .
Both will be of type obo:Term, but since obo:Term doesn't define any
properties for individuals, what good does it do me? I can't coordinate
the data.
I guess if I create some properties that I think will be useful and
assign their domain to be obo:Term and then tell my friend to also use
them to describe their instance data, then our data will interoperate.
And you are saying that if I do a good job of identifying useful
properties, then over time many people will also start using them.
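To make that concrete, here is a rough sketch of what I think the shared-property approach would look like. The shared: namespace and its property names are hypothetical, made up for illustration, and the obo: namespace URI is only a placeholder; nothing here is defined by OBO itself.

```turtle
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix obo:    <http://example.com/obo#> .     # placeholder, not the real OBO namespace
@prefix gene:   <http://example.com/genes#> .
@prefix shared: <http://example.com/shared#> .  # hypothetical shared vocabulary

# Properties both projects agree to use, with obo:Term as their domain.
shared:chromosomeNumber a owl:DatatypeProperty ;
    rdfs:domain obo:Term ;
    rdfs:range  xsd:integer .

shared:startCoordinate a owl:DatatypeProperty ;
    rdfs:domain obo:Term ;
    rdfs:range  xsd:integer .

shared:endCoordinate a owl:DatatypeProperty ;
    rdfs:domain obo:Term ;
    rdfs:range  xsd:integer .

gene:Y rdfs:subClassOf obo:Term .
gene:Z rdfs:subClassOf obo:Term .

# My frog gene, described with the shared properties.
_:x a gene:Y ;
    rdfs:label "my frog gene" ;
    shared:chromosomeNumber 3 ;
    shared:startCoordinate 12345 ;
    shared:endCoordinate 7890 .

# My friend's bullfrog gene, using the same property for the chromosome.
_:w a gene:Z ;
    shared:chromosomeNumber 3 .
```

With this, a single query on shared:chromosomeNumber would find both genes. As I understand it, rdfs:domain does not restrict who may use a property; it only lets a reasoner infer that whatever uses it is an obo:Term.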
Mary
Xiaoshu Wang wrote:
>--Mary,
>
>I am not familiar with OBO, but what you are suggesting is actually what
>should be avoided in RDF. RDF is based on the open world assumption, whereas
>having one superclass for the purpose of enforcing certain annotation
>properties is closed-world thinking from the object-oriented world.
>
>In your case, I wonder if there are any concrete criteria that make one
>resource an obo:Term but not another? If not, why invent another term for
>it? Doing so will at the very least make the statements very odd to read.
>For instance, assume there is an instance of gene class Y named x. Then, we
>can say,
>
>@prefix gene: <http://example.com/genes#> .
>gene:Y rdfs:subClassOf obo:Term .
>
>Then it is natural to say,
>
>_:x a gene:Y .
>
>but it would be very odd to say,
>
>_:x a obo:Term .
>
>In addition, it will incur unnecessary computational complexity for the RDF
>engine.
>
>This kind of pseudo-superclass pattern is also used elsewhere. Everything in
>MGED is an MGED:Ontology and everything in BIO-PAX is a bio-pax:entity. But
>the main purpose of such a superclass is to enforce certain annotation
>properties or to group terms. No offense to those groups, but I think the
>design is wrong and should be avoided.
>
>Designing a superclass is different from designing certain properties, like
>obo:name, obo:id etc., because it is still natural to say that something is
>a gene and has a certain name, id etc., as in the following,
>
>_:x a gene:Y ;
> obo:name "Some gene" ;
> obo:id "12345" .
>
>And best of all, you shouldn't invent those properties, because each
>resource should have a URI and rdfs:label can be used for the name. There
>are other ontologies, like Dublin Core, at your disposal as well.
>
>You are probably wondering, then, how interoperability can be ensured if
>there is no way to constrain it. My take on this is to think long term.
>Over time, a few commonly used ontologies will be shared by people who have
>the same interests. Take the economy as an analogy: controlled markets have
>short-term stability but are destined to collapse badly at some point, while
>free-market economies have occasional turmoil, especially at the beginning,
>but are more robust and stable in the long run. So don't worry about what
>others will do in the future. Just think about whether there are any
>ontologies that can help you adequately describe your data.
>
>Xiaoshu
>
>
>
>
>>-----Original Message-----
>>From: semantic-web-request@w3.org
>>[mailto:semantic-web-request@w3.org] On Behalf Of Mary Montoya
>>Sent: Wednesday, October 11, 2006 1:09 AM
>>To: semantic-web@w3.org; public-semweb-lifesci@w3.org
>>Subject: using OBO in owl format to describe data
>>
>>
>>I have a question about using the OpenBiomedicalOntologies
>>such as the SequenceOntology in owl format to describe data
>>resident in my local biological database.
>>
>>It seems desirable to leverage subclass relationships of
>>terms in the hierarchy of the SequenceOntology and to have
>>all the terms there rooted in a common parent obo:Term class.
>> OBO defines certain information to be provided for all OBO
>>terms such as name, id, definition, etc. These provide
>>descriptive information of the class itself not properties of
>>members of the class. So it seems all obo:Terms would have
>>"class values" for a name, id, def, etc. I would then expect
>>to find classes in SequenceOntology that are defined as
>>subClassOf obo:Term and reflect the hierarchical structure of
>>those SequenceOntology terms, for example, so:Gene as a
>>subClassOf so:Region which is a subClassOf
>>so:Located_sequence_feature which is a subClassOf obo:Term.
>>The problem is that the owl class definitions I've seen for
>>OBO terms don't also include property definitions for
>>individuals of the class. So an individual of so:Gene
>>doesn't have a property for name, id, def, etc that I can
>>provide values for from my database. There are only these
>>class description properties often defined using rdfs:label,
>>rdfs:comment or as annotation type properties.
>>My question is: How can I use these publicly available
>>ontologies to hold values for my data? They seem poised for
>>interoperability if these properties were available to
>>individuals of these classes.
>>
>>Here is one sequence ontology definition I found for gene in
>>owl format ( others I've seen look similar )
>>
>> <owl:Class rdf:ID="SO_0000704">
>> <rdfs:label xml:lang="en">gene</rdfs:label>
>> <rdfs:comment
>>rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A
>>locatable region of genomic sequence, corresponding to a unit
>>of inheritance, which is associated with regulatory regions,
>>transcribed regions and/or other functional sequence
>>regions</rdfs:comment>
>> <rdfs:subClassOf rdf:resource="#SO_0000001"/> </owl:Class>
>>
>>I thought something like this would be more useful:
>>
>> <owl:Class rdf:about="&so;SO_0000704">
>> <obo:classId>SO:0000704</obo:classId>
>> <obo:className>gene</obo:className>
>> <obo:classDef>
>> "A locatable region of genomic sequence,
>>corresponding to a unit of
>> inheritance, which is associated with regulatory regions,
>> transcribed regions and/or other functional
>>sequence regions"
>> [SO:rd]
>> </obo:classDef>
>> <rdfs:subClassOf rdf:resource="&so;SO_0000001"/>
>> </owl:Class>
>>
>>with the root parent Term defined within an obo namespace as
>> <owl:Class rdf:about="&obo;Term">
>> <obo:classId>OBO:Term</obo:classId>
>> <obo:className>term</obo:className>
>> <obo:classDef>
>> Term is a blah, blah
>> </obo:classDef>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty rdf:resource="&obo;name"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty rdf:resource="&obo;id"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty rdf:resource="&obo;def"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 0
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> </owl:Class>
>>
>>Then I could do something like this:
>>
>> <owl:Class rdf:about="&mystuff;MyGene">
>> <rdfs:subClassOf rdf:resource="&so;SO_0000704"/>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty
>>rdf:resource="&mystuff;chromosomeNumber"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty
>>rdf:resource="&mystuff;startCoordinate"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty
>>rdf:resource="&mystuff;endCoordinate"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> <rdfs:subClassOf>
>> <owl:Restriction>
>> <owl:onProperty rdf:resource="&mystuff;sequence"/>
>> <owl:minCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:minCardinality>
>> <owl:maxCardinality
>>rdf:datatype="&xsd;nonNegativeInteger">
>> 1
>> </owl:maxCardinality>
>> </owl:Restriction>
>> </rdfs:subClassOf>
>> </owl:Class>
>>
>>so now I have defined a class MyGene that extends from obo to
>>sequence ontology and I can define individuals with property
>>values for the following from my database:
>>obo:name
>>obo:id
>>obo:def
>>mystuff:sequence
>>mystuff:endCoordinate
>>mystuff:startCoordinate
>>mystuff:chromosomeNumber
>>
>>It seems presumptuous to define properties for individuals (
>>name, id, etc ) as well as class properties ( className,
>>classID, etc ) for public ontologies such as obo ontologies
>>but possibly quite useful for interoperability sake. Any
>>comments would be welcome.
>>
>>Mary Montoya
>>
>>VPIN project team
>>NCGR
Received on Wednesday, 11 October 2006 16:52:20 UTC