Re: adding pubmed ids to BAMS

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Thu, 19 Apr 2007 11:46:01 -0400
Message-Id: <B1B67D70-BA18-497A-9550-C052CBDF57F6@gmail.com>
Cc: "Jonathan Rees" <jar@mumble.net>, "chris mungall" <cjm@fruitfly.org>, "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>, "Suzanna Lewis" <suzi@berkeleybop.org>, "Judith Blake" <jblake@informatics.jax.org>, "Barry Smith" <phismith@buffalo.edu>
To: John Barkley <jbarkley@nist.gov>


On Apr 19, 2007, at 9:20 AM, John Barkley wrote:

> hi alan,
>
> Here is a mock-up of what I think you had in mind in the case of
> BAMS (sorry for the rdf/xml, I wanted to be precise):
>
> Given:
>
> <owl:Class rdf:ID="article"/>
> <owl:Class rdf:ID="pubmedRecord"/>
>
> <pubmedRecord rdf:about="http://purl.org/commons/pubmed/_3327422"/>
> <pubmedRecord rdf:about="http://purl.org/commons/pubmed/_7451682"/>
>
> <owl:ObjectProperty rdf:ID="definedByPMID">
>    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
>    <rdfs:domain rdf:resource="#article"/>
>    <rdfs:range rdf:resource="#pubmedRecord"/>
> </owl:ObjectProperty>
>
> <owl:ObjectProperty rdf:ID="isMentionedBy">
>    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AnnotationProperty"/>
> </owl:ObjectProperty>
>
> Then, for each cell and cell/molecule pubmed reference, you would  
> have the following (the first is a cell example for c101 and the  
> second is a cell/molecule example for c101):
>
> <owl:Class rdf:ID="c101">
>    <rdfs:subClassOf rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
>    <rdfs:subClassOf>
>        <owl:Restriction>
>            <owl:onProperty rdf:resource="#classId"/>
>            <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
>        </owl:Restriction>
>    </rdfs:subClassOf>
>    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
>            >motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
>    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
> </owl:Class>

isMentionedBy should point to an instance of article
article identifiedByPMID http://purl.org/commons/pubmed/_7451682

I don't like the underscore but Jonathan thinks it is necessary. But
this is minor. I would say using PMID_7451682 or some similar variant
is more appealing. (no accounting for taste)
We might also want
http://purl.org/commons/pubmed/_7451682 hasId "7451682", but I'm not  
sure.
Anyway, neither of these is essential for progress.
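
To make the shape I have in mind concrete, here is a rough sketch in Python, with plain tuples standing in for triples. It is only an illustration, not a proposal for the real serialization: the `BAMS` base URI and the `article_<pmid>` naming are placeholders I made up, and it uses identifiedByPMID as written above (your mock-up calls the property definedByPMID; either name works as long as we pick one).

```python
# Plain tuples stand in for RDF triples; no RDF library is assumed.
BAMS = "http://example.org/bams#"            # placeholder base URI, not the real BAMS namespace
PUBMED = "http://purl.org/commons/pubmed/"


def mention_triples(class_id, pmid):
    """Link a class to a paper through an article instance,
    and link that article to its pubmed record."""
    article = f"{BAMS}article_{pmid}"        # hypothetical name for the article instance
    record = f"{PUBMED}_{pmid}"              # record URI, using the underscore form
    return [
        (article, "rdf:type", BAMS + "article"),
        (record, "rdf:type", BAMS + "pubmedRecord"),
        (article, BAMS + "identifiedByPMID", record),
        (BAMS + class_id, BAMS + "isMentionedBy", article),
    ]


triples = mention_triples("c101", "7451682")
for triple in triples:
    print(triple)
```

The point is just that isMentionedBy lands on the article, and the pubmed record hangs off the article rather than off the class.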


> <owl:Class rdf:ID="Cellc101HasMoleculem3Within">
It would be nice to have an English-readable rdfs:label here.
>    <rdfs:subClassOf rdf:resource="#c101"/>
>    <rdfs:subClassOf>
>        <owl:Restriction>
>            <owl:onProperty rdf:resource="#cell_has_molecule_within"/>
>            <owl:someValuesFrom rdf:resource="#m3"/>
>        </owl:Restriction>
>    </rdfs:subClassOf>
>    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_3327422"/>
> </owl:Class>

This class would be used in place of the restriction wherever a
definition would otherwise use the restriction directly.
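
A minimal sketch of that substitution, again in Python with tuples rather than real OWL. Axioms are (subclass, superclass) pairs, a restriction is a ("some", property, filler) tuple, and `someOtherClass` is a hypothetical class I invented to show a definition that used the restriction directly. Note that because the named class is also a subclass of c101, the swap says slightly more than the bare restriction did.

```python
RESTRICTION = ("some", "cell_has_molecule_within", "m3")
NAMED = "Cellc101HasMoleculem3Within"


def use_named_class(axioms, restriction, named):
    """Swap `restriction` for `named` everywhere except in
    the axiom that defines `named` itself."""
    return [
        (sub, named if sup == restriction and sub != named else sup)
        for sub, sup in axioms
    ]


axioms = [
    (NAMED, "c101"),
    (NAMED, RESTRICTION),              # the named class keeps the restriction
    ("someOtherClass", RESTRICTION),   # hypothetical direct use of the restriction
]

rewritten = use_named_class(axioms, RESTRICTION, NAMED)
for axiom in rewritten:
    print(axiom)
```

After the rewrite the restriction appears exactly once, in the definition of the named class, which is also the only thing the paper is attached to.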

Thanks for the quick response!

>
> jb
>
>
> ----- Original Message ----- From: "Alan Ruttenberg"  
> <alanruttenberg@gmail.com>
> To: "John Barkley" <jbarkley@nist.gov>
> Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall"
> <cjm@fruitfly.org>; "public-semweb-lifesci hcls"
> <public-semweb-lifesci@w3.org>; "Suzanna Lewis"
> <suzi@berkeleybop.org>; "Judith Blake"
> <jblake@informatics.jax.org>; "Barry Smith" <phismith@buffalo.edu>
> Sent: Thursday, April 19, 2007 12:24 AM
> Subject: Re: adding pubmed ids to BAMS
>
>
>>
>> Here is an idea I am exploring. Perhaps you might mock this up:
>>
>> The essential idea is that evidence and other annotation is about
>> named classes. In those cases where one might think of annotating
>> some axiom, or piece of axiom, we would instead look for the
>> class that is the referent of the annotation and name that class.
>> Then, we can connect that class, using an annotation property, to
>> whatever kind of annotation or evidence we think appropriate.
>>
>> Suppose we have a class HumanP53Protein, which we will define as:
>> Those proteins whose sequence of amino acids is described by the
>> sequence in the sequence information field of the Uniprot
>> P53_Human Record, or which are derived from such a protein. (I'm
>> open to discussion on what this definition should be, BTW, but I
>> think we should have one.)
>>
>> One gene ontology annotation to P53 is:
>> GO:0000739; Molecular function: DNA strand annealing activity   
>> (inferred from direct assay from UniProtKB).
>>
>> GO:0000739 is defined in OBO as a class, a subclass of function.
>>
>> We will say that the referent of this annotation is the class
>>
>> HumanP53ProteinWithFunctionDNAStrandAnnealing:  HumanP53Protein  
>> and has_function some GO:0000739
>>
>> The annotation property itself might be called
>> "ExistsAccordingTo", by which we mean that this class has instances.
>>
>> The thing it exists according to is
>>
>> Inference001
>>    type InferredFromDirectAssay
>>    describedInPaper theArticlePMID1234Describes
>>
>> So our annotation is
>>
>> HumanP53ProteinWithFunctionDNAStrandAnnealing ExistsAccordingTo  
>> Inference001
>>
>> Up to this point we have been conservative. We haven't made any
>> statement about P53 in general. Here, we will overstate (our only
>> choice, if we want to make a statement about biology from which
>> some useful inference can be drawn, given the evidence we have):
>>
>> HumanP53Protein subclassOf  
>> HumanP53ProteinWithFunctionDNAStrandAnnealing
>>
>> This may be wrong. For instance, it may be the case that only
>> P53 phosphorylated in some way actually has this function.
>> I hope that by some other statement, a contradiction is inferred
>> that will force us (or the curators) to be more specific.
>>
>> ----
>>
>> What's nice about this?
>>
>>
>> 1) We are making statements about biology (better than making   
>> statements about "terms")
>> 2) There is no RDF reification involved - the main contender for  
>> representing this sort of thing.
>> 3) We have been (relatively) conservative about what we say there  
>> is evidence for
>> 4) We are owning the fact that we are making an overstatement
>> 5) We are enabling some inference to take place.
>>
>> What's the cost?
>>
>> 1) One extra triple, in which we name the class  
>> HumanP53ProteinInvolvedInDNADamageResponse
>> Where we previously would have used a restriction to introduce the  
>> participation, we now use the named class.
>> 2) When querying about what the evidence is for, we need to query
>> the asserted (or told) assertions only. That's because after
>> inference has been done, new assertions may be known about
>> HumanP53ProteinWithFunctionDNAStrandAnnealing and we won't be able
>> to tell the difference between what was asserted and what is
>> inferred, given that we have associated only the class name
>> with the evidence.
>>
>> ---
>>
>> Taking this to BAMS, it means that we associate the paper with the
>> cell class for which we already have a name.
>> For the "molecule is found in cell" cases, we create the named
>> class for the "cell contains some molecule" class, use that
>> class in place of the restriction, and associate the paper with
>> that named class.
>>
>> You can define
>>
>> Class(article :partial)
>> Class(pubmedRecord :partial)
>> ObjectProperty(definedByPMID inversefunctional)
>>
>> Represent the pubmed record as an instance of pubmedRecord named  
>> http://purl.org/commons/pubmed/1234
>>
>> The last issue is the nature of the relationship between the
>> paper and the class. If we can't easily distinguish whether
>> these annotations are evidence or simply discussion, we could use
>> the relation "isMentionedBy", by which we will mean that the
>> class (or some instances of the class) is discussed in the paper.
>>
>> ---
>>
>> Call me if you want to discuss this. Admittedly this may seem   
>> involved and odd, since it is a new idea, though I will blame  
>> Chris  and Jonathan, who I bounced it off of, for not telling me  
>> straight  off it didn't make sense :)
>>
>> But how about we give it a go and see what it feels like. I'm   
>> planning to use this translation for the GO annotations and the  
>> rest  of the similar sources, unless somebody comes forth with  
>> some  arguments about what would be a better idea.
>>
>> Best,
>> Alan
>>
>>
>> On Apr 18, 2007, at 3:49 PM, jbarkley@nist.gov wrote:
>>
>>>
>>>> From what Mihai sent me, the pubmed refs are about:
>>>
>>>> the cell and
>>>> the fact the molecule is found in cell
>>>
>>> Pending your recommendation, I had tentatively suggested
>>> representing this as:
>>>
>>> pubmedID has "<id>" or
>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>>
>>> where one or more of these is associated with a cell. I was under
>>> the impression that you were thinking about a general
>>> representation that everyone would use for pubmedID. So, I haven't
>>> yet added these to the BAMS OWL version.
>>>
>>>> OK. Can you send me this for a quick look?
>>>
>>> I'm not sure what you are asking to see. Do you want to see the   
>>> original
>>> tables Mihai sent me?
>>>
>>> thanks,
>>>
>>> jb
>>>
>>>
>>>
>>> Date:  Wed, 18 Apr 2007 12:30:17 -0400
>>> From:  Alan Ruttenberg <alanruttenberg@gmail.com>
>>> To:  John Barkley <jbarkley@nist.gov>
>>> Cc:  Jonathan A Rees <jar@mumble.net>
>>> Subject:  Re: adding pubmed ids to BAMS
>>> Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:
>>>
>>>>
>>>> On Apr 13, 2007, at 1:51 PM, John Barkley wrote:
>>>>
>>>>> I have confirmed from Mihai that all of the pubmed references in
>>>>> BAMS are evidence for or elaboration about.
>>>>
>>>> OK. Can you send me this for a quick look?
>>>> Is it clear what they are about,
>>>> i.e.
>>>>
>>>> the cell
>>>> the part
>>>> the fact that cell is located in part
>>>> the fact the molecule is found in cell
>>>> the fact the molecule is found in part
>>>> the fact the molecule is found in cell in part
>>>> etc.
>>>>
>>>> ?
>>>>
>>>>>
>>>>>
>>>>> ----- Original Message ----- From: "Alan Ruttenberg"
>>>>> <alanruttenberg@gmail.com>
>>>>>
>>>>>> Don't have time at this moment, but I think that generally you
>>>>>> want to state that the article is either evidence for, or
>>>>>> elaboration about, the scientific statement involving the cells,
>>>>>> molecules, etc. Then use the pubmed id in some standard URI
>>>>>> form (maybe neurocommons record url style) or
>>>>>> Jonathan's purl.org suggestion. In other words the pubmed id is
>>>>>> the identifier for a thing (the article, or the abstract,
>>>>>> depending on one's point of view).
>>>>>>
>>>>>> More details later.
>>>>>>
>>>>>> You could look and see how Gene ontology represents evidence.
>>>>>>
>>>>>> -Alan
>>>>>>
>>>>>> On Apr 11, 2007, at 3:46 PM, John Barkley wrote:
>>>>>>
>>>>>>> hi alan,
>>>>>>>
>>>>>>> I received spreadsheets from Mihai relating cells & pubmed ids,
>>>>>>> and cells, molecules, & pubmed ids. I wanted to consult with you
>>>>>>> about your preferences for how to integrate this into BAMS.
>>>>>>> I am thinking something like defining a datatype property
>>>>>>> pubmedID from owl:Thing to string. Then for cells, you would have:
>>>>>>>
>>>>>>> pubmedID has "<id>"
>>>>>>>
>>>>>>> and for cells with molecules within, you would have:
>>>>>>>
>>>>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>>>>>>
>>>>>>> Please let me know.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> jb
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
Received on Thursday, 19 April 2007 15:46:12 GMT