- From: John Barkley <jbarkley@nist.gov>
- Date: Thu, 19 Apr 2007 09:20:17 -0400
- To: "Alan Ruttenberg" <alanruttenberg@gmail.com>
- Cc: "Jonathan Rees" <jar@mumble.net>, "chris mungall" <cjm@fruitfly.org>, "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>, "Suzanna Lewis" <suzi@berkeleybop.org>, "Judith Blake" <jblake@informatics.jax.org>, "Barry Smith" <phismith@buffalo.edu>, <jbarkley@nist.gov>
hi alan,
Here is a mock up of what I think you had in mind in the case of BAMS
(sorry for the rdf/xml, I wanted to be precise):
Given:
<owl:Class rdf:ID="article"/>
<owl:Class rdf:ID="pubmedRecord"/>
<pubmedRecord rdf:about="http://purl.org/commons/pubmed/_3327422"/>
<pubmedRecord rdf:about="http://purl.org/commons/pubmed/_7451682"/>
<owl:ObjectProperty rdf:ID="definedByPMID">
<rdf:type
rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
<rdfs:domain rdf:resource="#article"/>
<rdfs:range rdf:resource="#pubmedRecord"/>
</owl:ObjectProperty>
<owl:AnnotationProperty rdf:ID="isMentionedBy"/>
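The declarations above never show definedByPMID in use, so here is a minimal sketch of how an article individual might point at its pubmed record. The enclosing rdf:RDF element, the xml:base value, and the individual name article_7451682 are placeholders made up for illustration; none of them comes from the BAMS file.

```xml
<?xml version="1.0"?>
<!-- xml:base and the default namespace are hypothetical placeholders,
     not the real BAMS namespace -->
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://example.org/bams-mockup#"
    xml:base="http://example.org/bams-mockup">

  <!-- hypothetical article individual; only the pubmed record URI
       is taken from the mock-up above -->
  <article rdf:ID="article_7451682">
    <definedByPMID rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
  </article>

</rdf:RDF>
```

Since definedByPMID is declared inverse functional, a reasoner would treat any two individuals that point at the same pubmed record as the same article.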
Then, for each cell and cell/molecule pubmed reference, you would have the
following (the first is a cell example for c101 and the second is a
cell/molecule example for c101):
<owl:Class rdf:ID="c101">
<rdfs:subClassOf
rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="#classId"/>
<owl:hasValue
rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
</owl:Restriction>
</rdfs:subClassOf>
<rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
<isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
</owl:Class>
<owl:Class rdf:ID="Cellc101HasMoleculem3Within">
<rdfs:subClassOf rdf:resource="#c101"/>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource="#cell_has_molecule_within"/>
<owl:someValuesFrom rdf:resource="#m3"/>
</owl:Restriction>
</rdfs:subClassOf>
<isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_3327422"/>
</owl:Class>
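For comparison, the HumanP53Protein example from the quoted message below might be written in the same style, roughly as follows. This is only a sketch: the xml:base value is a placeholder, the GO class URI is assumed by analogy with the CARO URI used above, and the class is tied to its parts with two subClassOf axioms to mirror the c101 pattern, which is only one reading of "HumanP53Protein and has_function some GO:0000739".

```xml
<?xml version="1.0"?>
<!-- xml:base and the default namespace are hypothetical placeholders -->
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://example.org/p53-mockup#"
    xml:base="http://example.org/p53-mockup">

  <owl:Class rdf:ID="HumanP53Protein"/>
  <owl:Class rdf:ID="InferredFromDirectAssay"/>
  <owl:ObjectProperty rdf:ID="has_function"/>
  <owl:ObjectProperty rdf:ID="describedInPaper"/>
  <owl:AnnotationProperty rdf:ID="ExistsAccordingTo"/>

  <!-- the named class that is the referent of the GO annotation -->
  <owl:Class rdf:ID="HumanP53ProteinWithFunctionDNAStrandAnnealing">
    <rdfs:subClassOf rdf:resource="#HumanP53Protein"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#has_function"/>
        <!-- GO URI assumed by analogy with the CARO URI pattern -->
        <owl:someValuesFrom
            rdf:resource="http://purl.org/obo/owl/GO#GO_0000739"/>
      </owl:Restriction>
    </rdfs:subClassOf>
    <!-- the evidence hangs off the class name via the annotation property -->
    <ExistsAccordingTo rdf:resource="#Inference001"/>
  </owl:Class>

  <!-- the thing the class exists according to -->
  <InferredFromDirectAssay rdf:ID="Inference001">
    <describedInPaper rdf:resource="#theArticlePMID1234Describes"/>
  </InferredFromDirectAssay>

  <!-- the deliberate overstatement about P53 in general -->
  <owl:Class rdf:about="#HumanP53Protein">
    <rdfs:subClassOf
        rdf:resource="#HumanP53ProteinWithFunctionDNAStrandAnnealing"/>
  </owl:Class>

</rdf:RDF>
```

Under this reading the overstatement makes the two classes equivalent, so the evidence can only be recovered by querying the told ExistsAccordingTo assertion.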
jb
----- Original Message -----
From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
To: "John Barkley" <jbarkley@nist.gov>
Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall" <cjm@fruitfly.org>;
"public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>; "Suzanna Lewis"
<suzi@berkeleybop.org>; "Judith Blake" <jblake@informatics.jax.org>; "Barry
Smith" <phismith@buffalo.edu>
Sent: Thursday, April 19, 2007 12:24 AM
Subject: Re: adding pubmed ids to BAMS
>
> Here is an idea I am exploring. Perhaps you might mock this up:
>
> The essential idea is that evidence and other annotation is about named
> classes. In those cases where one might think of annotating some axiom,
> or piece of axiom, we would instead look for the class that is the
> referent of the annotation and name that class.
> Then, we can connect that class, using an annotation property, to
> whatever kind of annotation or evidence we think appropriate.
>
> Suppose we have a class HumanP53Protein, which we will define as: Those
> proteins whose sequence of amino acids is described by the sequence in
> the sequence information field of the Uniprot P53_Human Record, or which
> are derived from such a protein. (I'm open to discussion on what this
> definition should be, BTW, but I think we should have one)
>
> One gene ontology annotation to P53 is:
> GO:0000739; Molecular function: DNA strand annealing activity (inferred
> from direct assay from UniProtKB).
>
> GO:0000739 is defined as in OBO, as a class, a subclass of function.
>
> We will say that the referent of this annotation is the class
>
> HumanP53ProteinWithFunctionDNAStrandAnnealing: HumanP53Protein and
> has_function some GO:0000739
>
> The annotation property itself might be called "ExistsAccordingTo", by
> which we mean that this class has instances.
>
> The thing it exists according to is
>
> Inference001
> type InferredFromDirectAssay
> describedInPaper theArticlePMID1234Describes
>
> So our annotation is
>
> HumanP53ProteinWithFunctionDNAStrandAnnealing ExistsAccordingTo
> Inference001
>
> Up to this point we have been conservative. We haven't made any statement
> about P53 in general. Here, we will overstate (our only choice, if we
> want to make a statement about biology from which some useful inference
> can be done, given the evidence we have)
>
> HumanP53Protein subclassOf HumanP53ProteinWithFunctionDNAStrandAnnealing
>
> This may be wrong. For instance, it may be the case that only that P53
> phosphorylated in some way actually has this function.
> I hope that by some other statement, a contradiction is inferred that
> will force us (or the curators) to be more specific.
>
> ----
>
> What's nice about this?
>
>
> 1) We are making statements about biology (better than making statements
> about "terms")
> 2) There is no RDF reification involved - the main contender for
> representing this sort of thing.
> 3) We have been (relatively) conservative about what we say there is
> evidence for
> 4) We are owning the fact that we are making an overstatement
> 5) We are enabling some inference to take place.
>
> What's the cost?
>
> 1) One extra triple, in which we name the class
> HumanP53ProteinWithFunctionDNAStrandAnnealing
> Where we previously would have used a restriction to introduce the
> participation, we now use the named class.
> 2) When querying about what the evidence is for, we need to query the
> asserted (or told) assertions only. That's because after inference has
> been done, new assertions may be known about
> HumanP53ProteinWithFunctionDNAStrandAnnealing and we won't be able to
> tell the difference between what was asserted and what is inferred, given
> that we have associated only the class name with the evidence.
>
> ---
>
> Taking this to BAMS, it means that we associate the paper with the cell
> class, for which we already have a name.
> For the "molecule is found in cell" cases, we create a named class for
> the "cell contains some molecule" class, use that
> class in place of the restriction, and associate the paper with that named
> class.
>
> You can define
>
> Class(article :partial)
> Class(pubmedRecord :partial)
> ObjectProperty(definedByPMID inversefunctional)
>
> Represent the pubmed record as an instance of pubmedRecord named
> http://purl.org/commons/pubmed/1234
>
> The last issue is the nature of the relationship between the paper and
> the class. If we can't easily distinguish whether
> these annotations are evidence or simply discussion, we could use the
> relation "isMentionedBy", by which we will mean that the class (or
> some instances of the class) is discussed in the paper.
>
> ---
>
> Call me if you want to discuss this. Admittedly this may seem involved
> and odd, since it is a new idea, though I will blame Chris and Jonathan,
> who I bounced it off of, for not telling me straight off it didn't make
> sense :)
>
> But how about we give it a go and see what it feels like. I'm planning to
> use this translation for the GO annotations and the rest of the similar
> sources, unless somebody comes forth with some arguments about what would
> be a better idea.
>
> Best,
> Alan
>
>
> On Apr 18, 2007, at 3:49 PM, jbarkley@nist.gov wrote:
>
>>
>>> From what Mihai sent me, the pubmed refs are about:
>>
>>> the cell and
>>> the fact the molecule is found in cell
>>
>> Pending your recommendation, I had tentatively suggested
>> representing this as:
>>
>> pubmedID has "<id>" or
>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>
>> where one or more of these is associated with a cell. I was under the
>> impression that you were thinking about a general representation that
>> everyone would use for pubmedID. So, I haven't yet added these to the
>> BAMS OWL version.
>>
>>> OK. Can you send me this for a quick look?
>>
>> I'm not sure what you are asking to see. Do you want to see the original
>> tables Mihai sent me?
>>
>> thanks,
>>
>> jb
>>
>>
>>
>> Date: Wed, 18 Apr 2007 12:30:17 -0400
>> From: Alan Ruttenberg <alanruttenberg@gmail.com>
>> To: John Barkley <jbarkley@nist.gov>
>> Cc: Jonathan A Rees <jar@mumble.net>
>> Subject: Re: adding pubmed ids to BAMS
>> Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:
>>
>>>
>>> On Apr 13, 2007, at 1:51 PM, John Barkley wrote:
>>>
>>>> I have confirmed from Mihai that all of the pubmed references in
>>>> BAMS are evidence for or elaboration about.
>>>
>>> OK. Can you send me this for a quick look?
>>> Is it clear what they are about
>>> i.e.
>>>
>>> the cell
>>> the part
>>> the fact that cell is located in part
>>> the fact the molecule is found in cell
>>> the fact the molecule is found in part
>>> the fact the molecule is found in cell in part
>>> etc.
>>>
>>> ?
>>>
>>>>
>>>>
>>>> ----- Original Message ----- From: "Alan Ruttenberg"
>>>> <alanruttenberg@gmail.com>
>>>>
>>>>> Don't have time at this moment, but I think that generally you
>>>>> want to state that the article is either evidence for, or
>>>>> elaboration about, the scientific statement involving the cells,
>>>>> molecules, etc. Then use the pubmed id in some standard URI
>>>>> form (maybe neurocommons record url style) or
>>>>> Jonathan's purl.org suggestion. In other words the pubmed id is
>>>>> the identifier for a thing (the article, or the abstract,
>>>>> depending on one's point of view).
>>>>>
>>>>> More details later.
>>>>>
>>>>> You could look and see how Gene ontology represents evidence.
>>>>>
>>>>> -Alan
>>>>>
>>>>> On Apr 11, 2007, at 3:46 PM, John Barkley wrote:
>>>>>
>>>>>> hi alan,
>>>>>>
>>>>>> I received spreadsheets from Mihai relating cells & pubmed ids,
>>>>>> and cells, molecules, & pubmed ids. I wanted to consult with you
>>>>>> about your preferences for how to integrate this into BAMS. I am
>>>>>> thinking something like defining a datatype property pubmedID
>>>>>> from owl:Thing to string. Then for cells, you would have:
>>>>>>
>>>>>> pubmedID has "<id>"
>>>>>>
>>>>>> and for cells with molecules within, you would have:
>>>>>>
>>>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>>>>>
>>>>>> Please let me know.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> jb
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
Received on Thursday, 19 April 2007 13:21:06 UTC