Re: adding pubmed ids to BAMS

From: John Barkley <jbarkley@nist.gov>
Date: Thu, 19 Apr 2007 09:20:17 -0400
Message-ID: <16c501c78285$7bd6bb80$8a3a0681@ncsl.nist.gov>
To: "Alan Ruttenberg" <alanruttenberg@gmail.com>
Cc: "Jonathan Rees" <jar@mumble.net>, "chris mungall" <cjm@fruitfly.org>, "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>, "Suzanna Lewis" <suzi@berkeleybop.org>, "Judith Blake" <jblake@informatics.jax.org>, "Barry Smith" <phismith@buffalo.edu>, <jbarkley@nist.gov>

hi alan,

Here is a mock-up of what I think you had in mind for BAMS
(sorry for the RDF/XML; I wanted to be precise):

Given:

<owl:Class rdf:ID="article"/>
<owl:Class rdf:ID="pubmedRecord"/>

<pubmedRecord rdf:about="http://purl.org/commons/pubmed/_3327422"/>
<pubmedRecord rdf:about="http://purl.org/commons/pubmed/_7451682"/>

<owl:ObjectProperty rdf:ID="definedByPMID">
    <rdf:type 
rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
    <rdfs:domain rdf:resource="#article"/>
    <rdfs:range rdf:resource="#pubmedRecord"/>
</owl:ObjectProperty>

<owl:AnnotationProperty rdf:ID="isMentionedBy"/>

Then, for each cell and cell/molecule pubmed reference, you would have the 
following (the first is a cell example for c101 and the second is a 
cell/molecule example for c101):

<owl:Class rdf:ID="c101">
    <rdfs:subClassOf 
rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="#classId"/>
            <owl:hasValue 
rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
        </owl:Restriction>
    </rdfs:subClassOf>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
            >motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
</owl:Class>

<owl:Class rdf:ID="Cellc101HasMoleculem3Within">
    <rdfs:subClassOf rdf:resource="#c101"/>
    <rdfs:subClassOf>
        <owl:Restriction>
            <owl:onProperty rdf:resource="#cell_has_molecule_within"/>
            <owl:someValuesFrom rdf:resource="#m3"/>
        </owl:Restriction>
    </rdfs:subClassOf>
    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_3327422"/>
</owl:Class>
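
As a rough illustration of how these annotations could be read back, here is a minimal Python sketch (standard library only) that lists each class together with the PubMed record that mentions it. It works on a trimmed copy of the two classes above. The rdf:RDF wrapper and the default namespace http://example.org/bams# are assumptions added for the example; the mock-up does not show either. Because it reads the file directly, it sees only the asserted (told) statements, not anything a reasoner might infer.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
BAMS = "http://example.org/bams#"  # hypothetical namespace, not given in the thread

# Trimmed copy of the two mock-up classes; the rdf:RDF wrapper is assumed.
DOC = f"""<rdf:RDF xmlns="{BAMS}" xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="c101">
    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
  </owl:Class>
  <owl:Class rdf:ID="Cellc101HasMoleculem3Within">
    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_3327422"/>
  </owl:Class>
</rdf:RDF>"""


def mentions(rdf_xml):
    """Return (class ID, PubMed record URI) pairs asserted in the document."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    # Only top-level owl:Class elements are inspected.
    for cls in root.findall(f"{{{OWL}}}Class"):
        class_id = cls.get(f"{{{RDF}}}ID")
        for ref in cls.findall(f"{{{BAMS}}}isMentionedBy"):
            pairs.append((class_id, ref.get(f"{{{RDF}}}resource")))
    return pairs


for class_id, record in mentions(DOC):
    print(class_id, record)
```

A real pipeline would more likely use an RDF library or a SPARQL query against the told triples; the plain XML walk is only meant to show the shape of the data.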

jb


----- Original Message ----- 
From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
To: "John Barkley" <jbarkley@nist.gov>
Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall" <cjm@fruitfly.org>; 
"public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>; "Suzanna Lewis" 
<suzi@berkeleybop.org>; "Judith Blake" <jblake@informatics.jax.org>; "Barry 
Smith" <phismith@buffalo.edu>
Sent: Thursday, April 19, 2007 12:24 AM
Subject: Re: adding pubmed ids to BAMS


>
> Here is an idea I am exploring. Perhaps you might mock this up:
>
> The essential idea is that evidence and other annotation is about  named 
> classes. In those cases where one might think of annotating  some axiom, 
> or piece of axiom, we would instead look for the class  that is the 
> referent of the annotation and name that class.
> Then, we can connect that class, using an annotation property,  to 
> whatever kind of annotation or evidence we think appropriate.
>
> Suppose we have a class HumanP53Protein, which we will define as: Those 
> proteins whose sequence of amino acids is described by the sequence in 
> the sequence information field of the Uniprot P53_Human Record, or which 
> are derived from such a protein. (I'm open to discussion on what this 
> definition should be, BTW, but I think we should have one.)
>
> One gene ontology annotation to P53 is:
> GO:0000739; Molecular function: DNA strand annealing activity  (inferred 
> from direct assay from UniProtKB).
>
> GO:0000739 is defined, as in OBO, as a class, a subclass of function.
>
> We will say that the referent of this annotation is the class
>
> HumanP53ProteinWithFunctionDNAStrandAnnealing:  HumanP53Protein and 
> has_function some GO:0000739
>
> The annotation property itself might be called "ExistsAccordingTo", by 
> which we mean that this class has instances.
>
> The thing it exists according to is
>
> Inference001
>    type InferredFromDirectAssay
>    describedInPaper theArticlePMID1234Describes
>
> So our annotation is
>
> HumanP53ProteinWithFunctionDNAStrandAnnealing ExistsAccordingTo 
> Inference001
>
> Up to this point we have been conservative. We haven't made any  statement 
> about P53 in general. Here, we will overstate (our only  choice, if we 
> want to make a statement about biology from which some  useful inference 
> can be done, given the evidence we have)
>
> HumanP53Protein subclassOf HumanP53ProteinWithFunctionDNAStrandAnnealing
>
> This may be wrong. For instance, it may be the case that only that  P53 
> phosphorylated in some way actually has this function.
> I hope that by some other statement, a contradiction is inferred that 
> will force us (or the curators) to be more specific.
>
> ----
>
> What's nice about this?
>
>
> 1) We are making statements about biology (better than making  statements 
> about "terms")
> 2) There is no RDF reification involved - the main contender for 
> representing this sort of thing.
> 3) We have been (relatively) conservative about what we say there is 
> evidence for
> 4) We are owning the fact that we are making an overstatement
> 5) We are enabling some inference to take place.
>
> What's the cost?
>
> 1) One extra triple, in which we name the class 
> HumanP53ProteinInvolvedInDNADamageResponse
> Where we previously would have used a restriction to introduce the 
> participation, we now use the named class.
> 2) When querying about what the evidence is for, we need to query the 
> asserted (or told) assertions only. That's because after inference has 
> been done, new assertions may be known about 
> HumanP53ProteinWithFunctionDNAStrandAnnealing and we won't be able to 
> tell the difference between what was asserted and what was inferred, given 
> that we have associated only the class name with the evidence.
>
> ---
>
> Taking this to BAMS, it means that we associate the paper with the cell 
> class for which we already have a name.
> For the "molecule is found in cell" cases, we create a named class for 
> the "cell contains some molecule" class, use that class in place of the 
> restriction, and associate the paper with that named class.
>
> You can define
>
> Class(article :partial)
> Class(pubmedRecord :partial)
> ObjectProperty(definedByPMID inversefunctional)
>
> Represent the pubmed record as an instance of pubmedRecord named 
> http://purl.org/commons/pubmed/1234
>
> The last issue is the nature of the relationship between the paper and 
> the class. If we can't easily tell whether these annotations are 
> evidence or simply discussion, we could use the relation 
> "isMentionedBy", by which we mean that the class (or some instances of 
> the class) is discussed in the paper.
>
> ---
>
> Call me if you want to discuss this. Admittedly this may seem  involved 
> and odd, since it is a new idea, though I will blame Chris  and Jonathan, 
> who I bounced it off of, for not telling me straight  off it didn't make 
> sense :)
>
> But how about we give it a go and see what it feels like. I'm  planning to 
> use this translation for the GO annotations and the rest  of the similar 
> sources, unless somebody comes forth with some  arguments about what would 
> be a better idea.
>
> Best,
> Alan
>
>
> On Apr 18, 2007, at 3:49 PM, jbarkley@nist.gov wrote:
>
>>
>>> From what Mihai sent me, the pubmed refs are about:
>>
>>> the cell and
>>> the fact the molecule is found in cell
>>
>> Pending your recommendation, I had tentatively suggested representing
>> this as:
>>
>> pubmedID has "<id>" or
>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>
>> where one or more of these is associated with a cell. I was under the
>> impression that you were thinking about a general representation that
>> everyone would use for pubmedID. So, I haven't yet added these to the
>> BAMS OWL version.
>>
>>> OK. Can you send me this for a quick look?
>>
>> I'm not sure what you are asking to see. Do you want to see the  original
>> tables Mihai sent me?
>>
>> thanks,
>>
>> jb
>>
>>
>>
>> Date:  Wed, 18 Apr 2007 12:30:17 -0400
>> From:  Alan Ruttenberg <alanruttenberg@gmail.com>
>> To:  John Barkley <jbarkley@nist.gov>
>> Cc:  Jonathan A Rees <jar@mumble.net>
>> Subject:  Re: adding pubmed ids to BAMS
>> Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:
>>
>>>
>>> On Apr 13, 2007, at 1:51 PM, John Barkley wrote:
>>>
>>>> I have confirmed from Mihai that all of the pubmed references in
>>>> BAMS are evidence for or elaboration about.
>>>
>>> OK. Can you send me this for a quick look?
>>> Is it clear what they are about
>>> i.e.
>>>
>>> the cell
>>> the part
>>> the fact that cell is located in part
>>> the fact the molecule is found in cell
>>> the fact the molecule is found in part
>>> the fact the molecule is found in cell in part
>>> etc.
>>>
>>> ?
>>>
>>>>
>>>>
>>>> ----- Original Message ----- From: "Alan Ruttenberg"
>>>> <alanruttenberg@gmail.com>
>>>>
>>>>> Don't have time at this moment, but I think that generally you
>>>>> want to state that the article is either evidence for, or
>>>>> elaboration about, the scientific statement involving the cells,
>>>>> molecules, etc. Then use the pubmed id in some standard URI
>>>>> form (maybe neurocommons record url style) or
>>>>> Jonathan's purl.org suggestion. In other words the pubmed id is
>>>>> the identifier for a thing (the article, or the abstract,
>>>>> depending on one's point of view).
>>>>>
>>>>> More details later.
>>>>>
>>>>> You could look and see how Gene ontology represents evidence.
>>>>>
>>>>> -Alan
>>>>>
>>>>> On Apr 11, 2007, at 3:46 PM, John Barkley wrote:
>>>>>
>>>>>> hi alan,
>>>>>>
>>>>>> I received spreadsheets from Mihai relating cells & pubmed ids,
>>>>>> and cells, molecules, & pubmed ids. I wanted to consult with you
>>>>>> about your preferences for how to integrate this into BAMS. I am
>>>>>> thinking of something like defining a datatype property pubmedID
>>>>>> from owl:Thing to string. Then for cells, you would have:
>>>>>>
>>>>>> pubmedID has "<id>"
>>>>>>
>>>>>> and for cells with molecules within, you would have:
>>>>>>
>>>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>>>>>
>>>>>> Please let me know.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> jb
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
> 
Received on Thursday, 19 April 2007 13:21:06 GMT
