- From: Mary Montoya <mhm@ncgr.org>
- Date: Wed, 11 Oct 2006 10:52:01 -0600
- To: Xiaoshu Wang <wangxiao@musc.edu>
- CC: semantic-web@w3.org, public-semweb-lifesci@w3.org
So I should define properties for my data so that I can then state
(pardon my awkwardness with the shorthand syntax):

@prefix gene: <http://example.com/genes#> .
gene:Y rdfs:subClassOf obo:Term .
_:x a gene:Y ;
    rdfs:label "my frog gene" ;
    mystuff:sequence "atgcgga...." ;
    mystuff:chromosomeNumber 3 ;
    mystuff:startCoordinate 12345 ;
    mystuff:endCoordinate 7890 .

So then if my friend at the bullfrog genome project states

@prefix gene: <http://example.com/genes#> .
gene:Z rdfs:subClassOf obo:Term .
_:w a gene:Z ;
    otherstuff:sequenceID "786451" ;
    otherstuff:chromoNum "3" ;
    otherstuff:comment "source unknown" ;
    otherstuff:type "pseudogene" .

both will be of type obo:Term, but since obo:Term doesn't define any
properties for individuals, what good does it do me? I can't coordinate
the data. I guess if I create some properties that I think will be
useful, assign their domain to be obo:Term, and then tell my friend to
also use them to describe their instance data, then our data will
interoperate. And you are saying that if I do a good job of identifying
useful properties, over time many people will also start using them.

Mary

Xiaoshu Wang wrote:
>--Mary,
>
>I am not familiar with OBO, but what you are suggesting is actually what
>should be avoided in RDF. RDF is based on the open world assumption. But to
>have one superclass for the purpose of enforcing certain annotation
>properties is closed world thinking from the Object Oriented world.
>
>In your case, I wonder if there is any concrete criterion that makes one
>resource an obo:Term but the other not? If not, why invent another term for
>it? And doing so will at least make the statement very odd to understand.
>For instance, assume there is an instance of gene class Y named x. Then, we
>can say,
>
>@prefix gene: <http://example.com/genes#> .
>gene:Y rdfs:subClassOf obo:Term .
>
>Then it is natural to say,
>
>_:x a gene:Y .
>
>but it would be very odd to say,
>
>_:x a obo:Term .
>
>In addition, it will incur unnecessary computational complexity for the RDF
>engine.
>
>This kind of pseudo-superclass pattern is also used elsewhere. Everything
>in MGED is an MGED:Ontology and everything in BIO-PAX is a bio-pax:entity.
>But the main purpose of this superclass is to enforce certain
>AnnotationProperty or grouping terms. No offense to those groups, but I
>think the design is wrong and should be avoided.
>
>To design a superclass is different from designing certain properties, like
>obo:name, obo:id etc., because it is still natural to say that something is
>a gene but has a certain name and id etc., like the following,
>
>_:x a gene:Y ;
>    obo:name "Some gene" ;
>    obo:id "12345" .
>
>And best of all, you shouldn't invent those properties, because each
>resource should have a URI and rdfs:label can be used for the name. And
>there are other ontologies like Dublin Core at your disposal as well.
>
>You probably wonder, then, how interoperability can be ensured if there
>is no way to constrain it. My take on this is to think of it in the long
>term. Over time, a few commonly used ontologies will be shared by people
>who have the same interest. Take the economy as an analogy: controlled
>markets have short term stability but are destined to collapse big time at
>some point. Free market economies have occasional turmoil, especially at
>the beginning, but are more robust and stable in the long run. So, don't
>worry about what others will do in the future. Just think about whether
>there are any ontologies that can help you adequately describe your data.
>
>Xiaoshu
>
>
>>-----Original Message-----
>>From: semantic-web-request@w3.org
>>[mailto:semantic-web-request@w3.org] On Behalf Of Mary Montoya
>>Sent: Wednesday, October 11, 2006 1:09 AM
>>To: semantic-web@w3.org; public-semweb-lifesci@w3.org
>>Subject: using OBO in owl format to describe data
>>
>>
>>I have a question about using the OpenBiomedicalOntologies
>>such as the SequenceOntology in owl format to describe data
>>resident in my local biological database.
>>
>>It seems desirable to leverage subclass relationships of
>>terms in the hierarchy of the SequenceOntology and to have
>>all the terms there rooted in a common parent obo:Term class.
>>OBO defines certain information to be provided for all OBO
>>terms such as name, id, definition, etc. These provide
>>descriptive information of the class itself, not properties of
>>members of the class. So it seems all obo:Terms would have
>>"class values" for a name, id, def, etc. I would then expect
>>to find classes in SequenceOntology that are defined as
>>subClassOf obo:Term and reflect the hierarchical structure of
>>those SequenceOntology terms, for example, so:Gene as a
>>subClassOf so:Region which is a subClassOf
>>so:Located_sequence_feature which is a subClassOf obo:Term.
>>The problem is that the owl class definitions I've seen for
>>OBO terms don't also include property definitions for
>>individuals of the class. So an individual of so:Gene
>>doesn't have a property for name, id, def, etc that I can
>>provide values for from my database. There are only these
>>class description properties, often defined using rdfs:label,
>>rdfs:comment or as annotation type properties.
>>My question is: How can I use these publicly available
>>ontologies to hold values for my data? They seem poised for
>>interoperability if these properties were available to
>>individuals of these classes.
>>
>>Here is one sequence ontology definition I found for gene in
>>owl format ( others I've seen look similar )
>>
>>  <owl:Class rdf:ID="SO_0000704">
>>    <rdfs:label xml:lang="en">gene</rdfs:label>
>>    <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">A
>>      locatable region of genomic sequence, corresponding to a unit
>>      of inheritance, which is associated with regulatory regions,
>>      transcribed regions and/or other functional sequence
>>      regions</rdfs:comment>
>>    <rdfs:subClassOf rdf:resource="#SO_0000001"/>
>>  </owl:Class>
>>
>>I thought something like this would be more useful:
>>
>>  <owl:Class rdf:about="&so;SO_0000704">
>>    <obo:classId>SO:0000704</obo:classId>
>>    <obo:className>gene</obo:className>
>>    <obo:classDef>
>>      "A locatable region of genomic sequence, corresponding to a unit of
>>      inheritance, which is associated with regulatory regions,
>>      transcribed regions and/or other functional sequence regions"
>>      [SO:rd]
>>    </obo:classDef>
>>    <rdfs:subClassOf rdf:resource="&so;SO_0000001"/>
>>  </owl:Class>
>>
>>with the root parent Term defined within an obo namespace as
>>
>>  <owl:Class rdf:about="&obo;Term">
>>    <obo:classId>OBO:Term</obo:classId>
>>    <obo:className>term</obo:className>
>>    <obo:classDef>
>>      Term is a blah, blah
>>    </obo:classDef>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&obo;name"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&obo;id"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&obo;def"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">0</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>  </owl:Class>
>>
>>Then I could do something like this:
>>
>>  <owl:Class rdf:about="&mystuff;MyGene">
>>    <rdfs:subClassOf rdf:resource="&so;SO_0000704"/>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&mystuff;chromosomeNumber"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&mystuff;startCoordinate"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&mystuff;endCoordinate"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>    <rdfs:subClassOf>
>>      <owl:Restriction>
>>        <owl:onProperty rdf:resource="&mystuff;sequence"/>
>>        <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
>>        <owl:maxCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:maxCardinality>
>>      </owl:Restriction>
>>    </rdfs:subClassOf>
>>  </owl:Class>
>>
>>so now I have defined a class MyGene that extends from obo to
>>sequence ontology and I can define individuals with property
>>values for the following from my database:
>>obo:name
>>obo:id
>>obo:def
>>mystuff:sequence
>>mystuff:endCoordinate
>>mystuff:startCoordinate
>>mystuff:chromosomeNumber
>>
>>It seems presumptuous to define properties for individuals (
>>name, id, etc ) as well as class properties ( className,
>>classID, etc ) for public ontologies such as obo ontologies
>>but possibly quite useful for interoperability sake. Any
>>comments would be welcome.
>>
>>Mary Montoya
>>
>>VPIN project team
>>NCGR
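
Editor's sketch: the idea at the top of this message (agree on a few properties, give them the domain obo:Term, and have both projects use them) might look roughly like this in Turtle. This is only an illustration under stated assumptions. The shared: prefix and its property names are hypothetical, and the obo: namespace URI is a placeholder, since the thread does not give one. Only gene:, the class names, and the data values come from the messages.

```turtle
# Sketch only. rdfs: and xsd: are the standard W3C namespaces;
# every other namespace URI here is a placeholder or hypothetical.
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix obo:    <http://example.com/obo#> .      # placeholder URI
@prefix gene:   <http://example.com/genes#> .    # as used in the thread
@prefix shared: <http://example.com/shared#> .   # hypothetical agreed vocabulary

# Properties both projects agree to use, with their domain set to obo:Term.
shared:chromosomeNumber rdfs:domain obo:Term ; rdfs:range xsd:integer .
shared:startCoordinate  rdfs:domain obo:Term ; rdfs:range xsd:integer .
shared:endCoordinate    rdfs:domain obo:Term ; rdfs:range xsd:integer .

gene:Y rdfs:subClassOf obo:Term .
gene:Z rdfs:subClassOf obo:Term .

# First project's instance data.
_:x a gene:Y ;
    rdfs:label "my frog gene" ;
    shared:chromosomeNumber 3 ;
    shared:startCoordinate 12345 ;
    shared:endCoordinate 7890 .

# Bullfrog project's instance data, now using the same property.
_:w a gene:Z ;
    shared:chromosomeNumber 3 .
```

With both graphs loaded, a query on shared:chromosomeNumber would match _:x and _:w alike, which the original mystuff: and otherstuff: properties could not do. Note that under RDFS semantics rdfs:domain does not restrict usage; it only lets a reasoner infer that any subject of these properties is an obo:Term, which fits the open world point made in the quoted reply.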
Received on Wednesday, 11 October 2006 16:52:31 UTC