Re: adding pubmed ids to BAMS

ok.

Often overlooked, but no less illustrated by the demo, is the ability of 
underscore people and non-underscore people to work together.

jb


Date:  Thu, 19 Apr 2007 15:44:41 -0400 
From:  Alan Ruttenberg <alanruttenberg@gmail.com> 
To:  John Barkley <jbarkley@nist.gov> 
Cc:  Jonathan Rees <jar@mumble.net>, chris mungall <cjm@fruitfly.org>,
public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>,
Suzanna Lewis <suzi@berkeleybop.org>, Judith Blake <jblake@informatics.jax.org>,
Barry Smith <phismith@buffalo.edu>
Subject:  Re: adding pubmed ids to BAMS 
Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:

> 
> Looks good.
> 
> Tweaks:
> Call the article "article 7451682"
> Name the pubmed record
> http://purl.org/commons/pmid/7451682
> 
> Best,
> Alan
> 
> 
> 
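A sketch of how the article and record from the mock-up quoted below might look with these two tweaks applied. The `rdfs:label` and the `article_7451682` ID are assumptions about what "call the article" means (an rdf:ID cannot contain a space), and the `http://example.org/bams` base URI is a placeholder, not something from this thread.

```xml
<?xml version="1.0"?>
<!-- Sketch only: xml:base is a placeholder, not the real BAMS namespace. -->
<rdf:RDF
    xmlns="http://example.org/bams#"
    xml:base="http://example.org/bams"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <owl:Class rdf:ID="article"/>
  <owl:Class rdf:ID="pubmedRecord"/>

  <owl:ObjectProperty rdf:ID="definedByPMID">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
    <rdfs:domain rdf:resource="#article"/>
    <rdfs:range rdf:resource="#pubmedRecord"/>
  </owl:ObjectProperty>

  <!-- Tweak 2: the record is named under /pmid/ with the bare id. -->
  <pubmedRecord rdf:about="http://purl.org/commons/pmid/7451682"/>

  <!-- Tweak 1: the article is called "article 7451682".
       Carrying that name in rdfs:label is an assumption. -->
  <article rdf:ID="article_7451682">
    <rdfs:label>article 7451682</rdfs:label>
    <definedByPMID rdf:resource="http://purl.org/commons/pmid/7451682"/>
  </article>
</rdf:RDF>
```

Classes that cite the article would then point isMentionedBy at `#article_7451682` rather than at the record URI.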
> On Apr 19, 2007, at 2:53 PM, John Barkley wrote:
> 
> > Hows about:
> >
> > <owl:Class rdf:ID="article"/>
> > <owl:Class rdf:ID="pubmedRecord"/>
> >
> > <owl:ObjectProperty rdf:ID="definedByPMID">
> >    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
> >    <rdfs:domain rdf:resource="#article"/>
> >    <rdfs:range rdf:resource="#pubmedRecord"/>
> > </owl:ObjectProperty>
> >
> > <owl:ObjectProperty rdf:ID="isMentionedBy">
> >    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AnnotationProperty"/>
> > </owl:ObjectProperty>
> >
> > <pubmedRecord rdf:about="http://purl.org/commons/pubmed/PMID_3327422"/>
> > <pubmedRecord rdf:about="http://purl.org/commons/pubmed/PMID_7451682"/>
> >
> >
> > <article rdf:ID="pubmed_3327422">
> >    <definedByPMID rdf:resource="http://purl.org/commons/pubmed/PMID_3327422"/>
> > </article>
> > <article rdf:ID="pubmed_7451682">
> >    <definedByPMID rdf:resource="http://purl.org/commons/pubmed/PMID_7451682"/>
> > </article>
> >
> > <owl:Class rdf:ID="c101">
> >    <rdfs:subClassOf>
> >        <owl:Restriction>
> >            <owl:onProperty rdf:resource="#classId"/>
> >            <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
> >        </owl:Restriction>
> >    </rdfs:subClassOf>
> >    <rdfs:subClassOf rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
> >    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
> >            >motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
> >    <isMentionedBy rdf:resource="#pubmed_7451682"/>
> > </owl:Class>
> >
> > <owl:Class rdf:ID="Cellc101HasMoleculem3Within">
> >    <rdfs:subClassOf rdf:resource="#c101"/>
> >    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
> >            >the cell motor neuroendocrine magnocellular oxytocin neuron has the molecule oxytocin within</rdfs:label>
> >    <isMentionedBy rdf:resource="#pubmed_3327422"/>
> > </owl:Class>
> >
> > jb
> >
> >
> > ----- Original Message -----
> > From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
> > To: "John Barkley" <jbarkley@nist.gov>
> > Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall" <cjm@fruitfly.org>;
> > "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>;
> > "Suzanna Lewis" <suzi@berkeleybop.org>; "Judith Blake" <jblake@informatics.jax.org>;
> > "Barry Smith" <phismith@buffalo.edu>
> > Sent: Thursday, April 19, 2007 11:46 AM
> > Subject: Re: adding pubmed ids to BAMS
> >
> >
> >>
> >>
> >> On Apr 19, 2007, at 9:20 AM, John Barkley wrote:
> >>
> >>> hi alan,
> >>>
> >>> Here is a mock-up of what I think you had in mind in the case
> >>> of BAMS (sorry for the rdf/xml, I wanted to be precise):
> >>>
> >>> Given:
> >>>
> >>> <owl:Class rdf:ID="article"/>
> >>> <owl:Class rdf:ID="pubmedRecord"/>
> >>>
> >>> <pubmedRecord rdf:about="http://purl.org/commons/pubmed/_3327422"/>
> >>> <pubmedRecord rdf:about="http://purl.org/commons/pubmed/_7451682"/>
> >>>
> >>> <owl:ObjectProperty rdf:ID="definedByPMID">
> >>>    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
> >>>    <rdfs:domain rdf:resource="#article"/>
> >>>    <rdfs:range rdf:resource="#pubmedRecord"/>
> >>> </owl:ObjectProperty>
> >>>
> >>> <owl:ObjectProperty rdf:ID="isMentionedBy">
> >>>    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AnnotationProperty"/>
> >>> </owl:ObjectProperty>
> >>>
> >>> Then, for each cell and cell/molecule pubmed reference, you
> >>> would have the following (the first is a cell example for c101
> >>> and the second is a cell/molecule example for c101):
> >>>
> >>> <owl:Class rdf:ID="c101">
> >>>    <rdfs:subClassOf rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
> >>>    <rdfs:subClassOf>
> >>>        <owl:Restriction>
> >>>            <owl:onProperty rdf:resource="#classId"/>
> >>>            <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
> >>>        </owl:Restriction>
> >>>    </rdfs:subClassOf>
> >>>    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
> >>>            >motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
> >>>    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
> >>> </owl:Class>
> >>
> >> isMentionedBy should point to an instance of article
> >> article identifiedByPMID http://purl.org/commons/pubmed/_7451682
> >>
> >> I don't like the underscore but Jonathan thinks it is necessary.
> >> But this is minor. I would say using PMID_7451682 or some similar
> >> variant is more appealing. (no accounting for taste)
> >> We might also want
> >> http://purl.org/commons/pubmed/_7451682 hasId "7451682", but I'm
> >> not sure.
> >> Anyways, neither of these is essential for progress.
> >>
> >>
> >>> <owl:Class rdf:ID="Cellc101HasMoleculem3Within">
> >> Would be nice to have an English-readable rdfs:label here.
> >>>    <rdfs:subClassOf rdf:resource="#c101"/>
> >>>    <rdfs:subClassOf>
> >>>        <owl:Restriction>
> >>>            <owl:onProperty rdf:resource="#cell_has_molecule_within"/>
> >>>            <owl:someValuesFrom rdf:resource="#m3"/>
> >>>        </owl:Restriction>
> >>>    </rdfs:subClassOf>
> >>>    <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_3327422"/>
> >>> </owl:Class>
> >>
> >> This class would be used in place of the restriction if there is a
> >> definition that would otherwise use the restriction.
> >>
> >> Thanks for the quick response!
> >>
> >>>
> >>> jb
> >>>
> >>>
> >>> ----- Original Message -----
> >>> From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
> >>> To: "John Barkley" <jbarkley@nist.gov>
> >>> Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall" <cjm@fruitfly.org>;
> >>> "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>;
> >>> "Suzanna Lewis" <suzi@berkeleybop.org>; "Judith Blake" <jblake@informatics.jax.org>;
> >>> "Barry Smith" <phismith@buffalo.edu>
> >>> Sent: Thursday, April 19, 2007 12:24 AM
> >>> Subject: Re: adding pubmed ids to BAMS
> >>>
> >>>
> >>>>
> >>>> Here is an idea I am exploring. Perhaps you might mock this up:
> >>>>
> >>>> The essential idea is that evidence and other annotation is
> >>>> about named classes. In those cases where one might think of
> >>>> annotating some axiom, or piece of an axiom, we would instead
> >>>> look for the class that is the referent of the annotation and
> >>>> name that class.
> >>>> Then, we can connect that class, using an annotation property,
> >>>> to whatever kind of annotation or evidence we think appropriate.
> >>>>
> >>>> Suppose we have a class HumanP53Protein, which we will define
> >>>> as: Those proteins whose sequence of amino acids is described
> >>>> by the sequence in the sequence information field of the
> >>>> Uniprot P53_Human Record, or which are derived from such a
> >>>> protein. (I'm open to discussion on what this definition
> >>>> should be, BTW, but I think we should have one)
> >>>>
> >>>> One gene ontology annotation to P53 is:
> >>>> GO:0000739; Molecular function: DNA strand annealing activity  
> >>>> (inferred from direct assay from UniProtKB).
> >>>>
> >>>> GO:0000739 is defined, as in OBO, as a class, a subclass of
> >>>> function.
> >>>>
> >>>> We will say that the referent of this annotation is the class
> >>>>
> >>>> HumanP53ProteinWithFunctionDNAStrandAnnealing:  HumanP53Protein   
> >>>> and has_function some GO:0000739
> >>>>
> >>>> The annotation property itself might be called
> >>>> "ExistsAccordingTo", by which we mean that this class has
> >>>> instances.
> >>>>
> >>>> The thing it exists according to is
> >>>>
> >>>> Inference001
> >>>>    type InferredFromDirectAssay
> >>>>    describedInPaper theArticlePMID1234Describes
> >>>>
> >>>> So our annotation is
> >>>>
> >>>> HumanP53ProteinWithFunctionDNAStrandAnnealing ExistsAccordingTo  
> >>>> Inference001
> >>>>
> >>>> Up to this point we have been conservative. We haven't made any
> >>>> statement about P53 in general. Here, we will overstate (our
> >>>> only choice, if we want to make a statement about biology from
> >>>> which some useful inference can be made, given the evidence we
> >>>> have):
> >>>>
> >>>> HumanP53Protein subclassOf  
> >>>> HumanP53ProteinWithFunctionDNAStrandAnnealing
> >>>>
> >>>> This may be wrong. For instance, it may be the case that only
> >>>> P53 phosphorylated in some way actually has this function.
> >>>> I hope that by some other statement, a contradiction is
> >>>> inferred that will force us (or the curators) to be more specific.
> >>>>
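The P53 example above might be written in RDF/XML roughly as follows. This is a sketch under several assumptions: the `http://example.org/p53` base URI is a placeholder; the GO class URI is guessed from the CARO URI pattern used elsewhere in this thread; reading "Name: A and B" as an equivalentClass definition is one interpretation; and has_function, describedInPaper, InferredFromDirectAssay and article are declared here only so the fragment stands alone.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the base URI and the GO URI are assumptions. -->
<rdf:RDF
    xmlns="http://example.org/p53#"
    xml:base="http://example.org/p53"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">

  <owl:AnnotationProperty rdf:ID="ExistsAccordingTo"/>
  <owl:ObjectProperty rdf:ID="has_function"/>
  <owl:ObjectProperty rdf:ID="describedInPaper"/>
  <owl:Class rdf:ID="HumanP53Protein"/>
  <owl:Class rdf:ID="InferredFromDirectAssay"/>
  <owl:Class rdf:ID="article"/>

  <!-- The named class that is the referent of the GO annotation. -->
  <owl:Class rdf:ID="HumanP53ProteinWithFunctionDNAStrandAnnealing">
    <owl:equivalentClass>
      <owl:Class>
        <owl:intersectionOf rdf:parseType="Collection">
          <owl:Class rdf:about="#HumanP53Protein"/>
          <owl:Restriction>
            <owl:onProperty rdf:resource="#has_function"/>
            <owl:someValuesFrom rdf:resource="http://purl.org/obo/owl/GO#GO_0000739"/>
          </owl:Restriction>
        </owl:intersectionOf>
      </owl:Class>
    </owl:equivalentClass>
    <!-- The annotation: the class has instances according to Inference001. -->
    <ExistsAccordingTo rdf:resource="#Inference001"/>
  </owl:Class>

  <!-- The evidence itself, typed by its evidence code. -->
  <InferredFromDirectAssay rdf:ID="Inference001">
    <describedInPaper rdf:resource="#theArticlePMID1234Describes"/>
  </InferredFromDirectAssay>

  <article rdf:ID="theArticlePMID1234Describes"/>

  <!-- The deliberate overstatement about P53 in general. -->
  <owl:Class rdf:about="#HumanP53Protein">
    <rdfs:subClassOf rdf:resource="#HumanP53ProteinWithFunctionDNAStrandAnnealing"/>
  </owl:Class>
</rdf:RDF>
```

Only the ExistsAccordingTo triple ties the evidence to the class; the final subClassOf axiom is the separate step that enables inference.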
> >>>> ----
> >>>>
> >>>> What's nice about this?
> >>>>
> >>>>
> >>>> 1) We are making statements about biology (better than making
> >>>> statements about "terms")
> >>>> 2) There is no RDF reification involved, even though reification
> >>>> is the main contender for representing this sort of thing.
> >>>> 3) We have been (relatively) conservative about what we say
> >>>> there is evidence for
> >>>> 4) We are owning the fact that we are making an overstatement
> >>>> 5) We are enabling some inference to take place.
> >>>>
> >>>> What's the cost?
> >>>>
> >>>> 1) One extra triple, in which we name the class  
> >>>> HumanP53ProteinInvolvedInDNADamageResponse
> >>>> Where we previously would have used a restriction to introduce  
> >>>> the participation, we now use the named class.
> >>>> 2) When querying about what the evidence is for, we need to
> >>>> query the asserted (or told) assertions only. That's because
> >>>> after inference has been done, new assertions may be known
> >>>> about HumanP53ProteinWithFunctionDNAStrandAnnealing and we won't
> >>>> be able to tell the difference between what was asserted and
> >>>> what is inferred, given that we have associated only the
> >>>> class name with the evidence.
> >>>>
> >>>> ---
> >>>>
> >>>> Taking this to BAMS, it means that we associate the paper with
> >>>> the cell class for which we already have a name.
> >>>> For the "molecule is found in cell" cases, we create a named
> >>>> class for the "cell contains some molecule" class, use that
> >>>> class in place of the restriction, and associate the paper with
> >>>> that named class.
> >>>>
> >>>> You can define
> >>>>
> >>>> Class(article :partial)
> >>>> Class(pubmedRecord :partial)
> >>>> ObjectProperty(definedByPMID inversefunctional)
> >>>>
> >>>> Represent the pubmed record as an instance of pubmedRecord named  
> >>>> http://purl.org/commons/pubmed/1234
> >>>>
> >>>> The last issue is the nature of the relationship between the
> >>>> paper and the class. If we can't easily distinguish whether
> >>>> these annotations are evidence or simply discussion, we could
> >>>> use the relation "isMentionedBy", by which we mean that the
> >>>> class (or some instances of the class) is discussed
> >>>> in the paper.
> >>>>
> >>>> ---
> >>>>
> >>>> Call me if you want to discuss this. Admittedly this may seem
> >>>> involved and odd, since it is a new idea, though I will blame
> >>>> Chris and Jonathan, who I bounced it off of, for not telling
> >>>> me straight off it didn't make sense :)
> >>>>
> >>>> But how about we give it a go and see what it feels like. I'm
> >>>> planning to use this translation for the GO annotations and the
> >>>> rest of the similar sources, unless somebody comes forth with
> >>>> some arguments about what would be a better idea.
> >>>>
> >>>> Best,
> >>>> Alan
> >>>>
> >>>>
> >>>> On Apr 18, 2007, at 3:49 PM, jbarkley@nist.gov wrote:
> >>>>
> >>>>>
> >>>>>> From what Mihai sent me, the pubmed refs are about:
> >>>>>
> >>>>>> the cell and
> >>>>>> the fact the molecule is found in cell
> >>>>>
> >>>>> Pending your recommendation, I had tentatively suggested
> >>>>> representing this as:
> >>>>>
> >>>>> pubmedID has "<id>" or
> >>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
> >>>>>
> >>>>> where one or more of these is associated with a cell. I was
> >>>>> under the impression that you were thinking about a general
> >>>>> representation that everyone would use for pubmedID. So, I
> >>>>> haven't yet added these to the BAMS OWL version.
> >>>>>
> >>>>>> OK. Can you send me this for a quick look?
> >>>>>
> >>>>> I'm not sure what you are asking to see. Do you want to see the  
> >>>>> original
> >>>>> tables Mihai sent me?
> >>>>>
> >>>>> thanks,
> >>>>>
> >>>>> jb
> >>>>>
> >>>>>
> >>>>>
> >>>>> Date:  Wed, 18 Apr 2007 12:30:17 -0400
> >>>>> From:  Alan Ruttenberg <alanruttenberg@gmail.com>
> >>>>> To:  John Barkley <jbarkley@nist.gov>
> >>>>> Cc:  Jonathan A Rees <jar@mumble.net>
> >>>>> Subject:  Re: adding pubmed ids to BAMS
> >>>>> Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:
> >>>>>
> >>>>>>
> >>>>>> On Apr 13, 2007, at 1:51 PM, John Barkley wrote:
> >>>>>>
> >>>>>>> I have confirmed from Mihai that all of the pubmed references in
> >>>>>>> BAMS are evidence for or elaboration about.
> >>>>>>
> >>>>>> OK. Can you send me this for a quick look?
> >>>>>> Is it clear what they are about,
> >>>>>> i.e.
> >>>>>>
> >>>>>> the cell
> >>>>>> the part
> >>>>>> the fact that cell is located in part
> >>>>>> the fact the molecule is found in cell
> >>>>>> the fact the molecule is found in part
> >>>>>> the fact the molecule is found in cell in part
> >>>>>> etc.
> >>>>>>
> >>>>>> ?
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message ----- From: "Alan Ruttenberg"
> >>>>>>> <alanruttenberg@gmail.com>
> >>>>>>>
> >>>>>>>> Don't have time at this moment, but I think that generally you
> >>>>>>>> want to state that the article is either evidence for, or
> >>>>>>>> elaboration about, the scientific statement involving the
> >>>>>>>> cells, molecules, etc. Then use the pubmed id in some
> >>>>>>>> standard URI form (maybe neurocommons record url style) or
> >>>>>>>> Jonathan's purl.org suggestion. In other words the pubmed id is
> >>>>>>>> the identifier for a thing (the article, or the abstract,
> >>>>>>>> depending on one's point of view).
> >>>>>>>>
> >>>>>>>> More details later.
> >>>>>>>>
> >>>>>>>> You could look and see how Gene ontology represents evidence.
> >>>>>>>>
> >>>>>>>> -Alan
> >>>>>>>>
> >>>>>>>> On Apr 11, 2007, at 3:46 PM, John Barkley wrote:
> >>>>>>>>
> >>>>>>>>> hi alan,
> >>>>>>>>>
> >>>>>>>>> I received spreadsheets from Mihai relating cells & pubmed
> >>>>>>>>> ids, and cells, molecules, & pubmed ids. I wanted to consult
> >>>>>>>>> with you about your preferences for how to integrate this
> >>>>>>>>> into BAMS. I am thinking something like defining a datatype
> >>>>>>>>> property pubmedID from owl:Thing to string. Then for cells,
> >>>>>>>>> you would have:
> >>>>>>>>>
> >>>>>>>>> pubmedID has "<id>"
> >>>>>>>>>
> >>>>>>>>> and for cells with molecules within, you would have:
> >>>>>>>>>
> >>>>>>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
> >>>>>>>>>
> >>>>>>>>> Please let me know.
> >>>>>>>>>
> >>>>>>>>> thanks,
> >>>>>>>>>
> >>>>>>>>> jb
> >>>>>>>>>

Received on Friday, 20 April 2007 10:28:36 UTC