- From: <jbarkley@nist.gov>
- Date: Fri, 20 Apr 2007 06:26:48 -0400
- To: Alan Ruttenberg <alanruttenberg@gmail.com>
- Cc: Jonathan Rees <jar@mumble.net>, chris mungall <cjm@fruitfly.org>, public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>, Suzanna Lewis <suzi@berkeleybop.org>, Judith Blake <jblake@informatics.jax.org>, Barry Smith <phismith@buffalo.edu>, jbarkley@nist.gov
ok. Often overlooked, but no less illustrated by the demo, is the ability of
underscore people and non-underscore people to work together.

jb

Date: Thu, 19 Apr 2007 15:44:41 -0400
From: Alan Ruttenberg <alanruttenberg@gmail.com>
To: John Barkley <jbarkley@nist.gov>
Cc: Jonathan Rees <jar@mumble.net>, chris mungall <cjm@fruitfly.org>,
    public-semweb-lifesci hcls <public-semweb-lifesci@w3.org>,
    Suzanna Lewis <suzi@berkeleybop.org>,
    Judith Blake <jblake@informatics.jax.org>,
    Barry Smith <phismith@buffalo.edu>
Subject: Re: adding pubmed ids to BAMS

Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:

>
> Looks good.
>
> Tweaks:
> Call the article "article 7451682"
> Name the pubmed record
> http://purl.org/commons/pmid/7451682
>
> Best,
> Alan
>
>
> On Apr 19, 2007, at 2:53 PM, John Barkley wrote:
>
> > Hows about:
> >
> > <owl:Class rdf:ID="article"/>
> > <owl:Class rdf:ID="pubmedRecord"/>
> >
> > <owl:ObjectProperty rdf:ID="definedByPMID">
> >   <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
> >   <rdfs:domain rdf:resource="#article"/>
> >   <rdfs:range rdf:resource="#pubmedRecord"/>
> > </owl:ObjectProperty>
> >
> > <owl:ObjectProperty rdf:ID="isMentionedBy">
> >   <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AnnotationProperty"/>
> > </owl:ObjectProperty>
> >
> > <pubmedRecord rdf:about="http://purl.org/commons/pubmed/PMID_3327422"/>
> > <pubmedRecord rdf:about="http://purl.org/commons/pubmed/PMID_7451682"/>
> >
> >
> > <article rdf:ID="pubmed_3327422">
> >   <definedByPMID rdf:resource="http://purl.org/commons/pubmed/PMID_3327422"/>
> > </article>
> > <article rdf:ID="pubmed_7451682">
> >   <definedByPMID rdf:resource="http://purl.org/commons/pubmed/PMID_7451682"/>
> > </article>
> >
> > <owl:Class rdf:ID="c101">
> >   <rdfs:subClassOf>
> >     <owl:Restriction>
> >       <owl:onProperty rdf:resource="#classId"/>
> >       <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
> >     </owl:Restriction>
> >   </rdfs:subClassOf>
> >   <rdfs:subClassOf rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
> >   <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
> >   <isMentionedBy rdf:resource="#pubmed_7451682"/>
> > </owl:Class>
> >
> > <owl:Class rdf:ID="Cellc101HasMoleculem3Within">
> >   <rdfs:subClassOf rdf:resource="#c101"/>
> >   <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">the cell motor neuroendocrine magnocellular oxytocin neuron has the molecule oxytocin within</rdfs:label>
> >   <isMentionedBy rdf:resource="#pubmed_3327422"/>
> > </owl:Class>
> >
> > jb
> >
> >
> > ----- Original Message -----
> > From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
> > To: "John Barkley" <jbarkley@nist.gov>
> > Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall" <cjm@fruitfly.org>;
> >     "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>;
> >     "Suzanna Lewis" <suzi@berkeleybop.org>;
> >     "Judith Blake" <jblake@informatics.jax.org>;
> >     "Barry Smith" <phismith@buffalo.edu>
> > Sent: Thursday, April 19, 2007 11:46 AM
> > Subject: Re: adding pubmed ids to BAMS
> >
> >>
> >> On Apr 19, 2007, at 9:20 AM, John Barkley wrote:
> >>
> >>> hi alan,
> >>>
> >>> Here is a mock up of what I think you had in mind in the case
> >>> of BAMS (sorry for the rdf/xml, I wanted to be precise):
> >>>
> >>> Given:
> >>>
> >>> <owl:Class rdf:ID="article"/>
> >>> <owl:Class rdf:ID="pubmedRecord"/>
> >>>
> >>> <pubmedRecord rdf:about="http://purl.org/commons/pubmed/_3327422"/>
> >>> <pubmedRecord rdf:about="http://purl.org/commons/pubmed/_7451682"/>
> >>>
> >>> <owl:ObjectProperty rdf:ID="definedByPMID">
> >>>   <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
> >>>   <rdfs:domain rdf:resource="#article"/>
> >>>   <rdfs:range rdf:resource="#pubmedRecord"/>
> >>> </owl:ObjectProperty>
> >>>
> >>> <owl:ObjectProperty rdf:ID="isMentionedBy">
> >>>   <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AnnotationProperty"/>
> >>> </owl:ObjectProperty>
> >>>
> >>> Then, for each cell and cell/molecule pubmed reference, you
> >>> would have the following (the first is a cell example for c101
> >>> and the second is a cell/molecule example for c101):
> >>>
> >>> <owl:Class rdf:ID="c101">
> >>>   <rdfs:subClassOf rdf:resource="http://purl.org/obo/owl/CARO#CARO_0000013"/>
> >>>   <rdfs:subClassOf>
> >>>     <owl:Restriction>
> >>>       <owl:onProperty rdf:resource="#classId"/>
> >>>       <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">101</owl:hasValue>
> >>>     </owl:Restriction>
> >>>   </rdfs:subClassOf>
> >>>   <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
> >>>   <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_7451682"/>
> >>> </owl:Class>
> >>
> >> isMentionedBy should point to an instance of article
> >> article identifiedByPMID http://purl.org/commons/pubmed/_7451682
> >>
> >> I don't like the underscore but Jonathan thinks it is necessary.
> >> But this is minor. I would say using PMID_7451652 or some similar
> >> variant is more appealing. (no accounting for taste)
> >> We might also want
> >> http://purl.org/commons/pubmed/_7451682 hasId "7451682", but I'm
> >> not sure.
> >> Anyways, neither of these is essential for progress.
> >>
> >>
> >>> <owl:Class rdf:ID="Cellc101HasMoleculem3Within">
> >> Would be nice to have an English-readable rdfs:label here.
> >>>   <rdfs:subClassOf rdf:resource="#c101"/>
> >>>   <rdfs:subClassOf>
> >>>     <owl:Restriction>
> >>>       <owl:onProperty rdf:resource="#cell_has_molecule_within"/>
> >>>       <owl:someValuesFrom rdf:resource="#m3"/>
> >>>     </owl:Restriction>
> >>>   </rdfs:subClassOf>
> >>>   <isMentionedBy rdf:resource="http://purl.org/commons/pubmed/_3327422"/>
> >>> </owl:Class>
> >>
> >> This class would be used in place of the restriction if there is a
> >> definition that would otherwise use the restriction.
> >>
> >> Thanks for the quick response!
> >>
> >>>
> >>> jb
> >>>
> >>>
> >>> ----- Original Message -----
> >>> From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
> >>> To: "John Barkley" <jbarkley@nist.gov>
> >>> Cc: "Jonathan Rees" <jar@mumble.net>; "chris mungall" <cjm@fruitfly.org>;
> >>>     "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>;
> >>>     "Suzanna Lewis" <suzi@berkeleybop.org>;
> >>>     "Judith Blake" <jblake@informatics.jax.org>;
> >>>     "Barry Smith" <phismith@buffalo.edu>
> >>> Sent: Thursday, April 19, 2007 12:24 AM
> >>> Subject: Re: adding pubmed ids to BAMS
> >>>
> >>>>
> >>>> Here is an idea I am exploring. Perhaps you might mock this up:
> >>>>
> >>>> The essential idea is that evidence and other annotation is
> >>>> about named classes. In those cases where one might think of
> >>>> annotating some axiom, or piece of axiom, we would instead
> >>>> look for the class that is the referent of the annotation and
> >>>> name that class.
> >>>> Then, we can connect that class, using an annotation property,
> >>>> to whatever kind of annotation or evidence we think appropriate.
> >>>>
> >>>> Suppose we have a class HumanP53Protein, which we will define
> >>>> as: Those proteins whose sequence of amino acids is described
> >>>> by the sequence in the sequence information field of the
> >>>> Uniprot P53_Human Record, or which are derived from such a
> >>>> protein.
> >>>> (I'm open to discussion on what this definition
> >>>> should be, BTW, but I think we should have one)
> >>>>
> >>>> One gene ontology annotation to P53 is:
> >>>> GO:0000739; Molecular function: DNA strand annealing activity
> >>>> (inferred from direct assay from UniProtKB).
> >>>>
> >>>> GO:0000739 is defined as in OBO, as a class, a subclass of
> >>>> function.
> >>>>
> >>>> We will say that the referent of this annotation is the class
> >>>>
> >>>> HumanP53ProteinWithFunctionDNAStrandAnnealing: HumanP53Protein
> >>>> and has_function some GO:0000739
> >>>>
> >>>> The annotation property itself might be called
> >>>> "ExistsAccordingTo", by which we mean that this class has
> >>>> instances
> >>>>
> >>>> The thing it exists according to is
> >>>>
> >>>> Inference001
> >>>> type InferredFromDirectAssay
> >>>> describedInPaper theArticlePMID1234Describes
> >>>>
> >>>> So our annotation is
> >>>>
> >>>> HumanP53ProteinWithFunctionDNAStrandAnnealing ExistsAccordingTo
> >>>> Inference001
> >>>>
> >>>> Up to this point we have been conservative. We haven't made any
> >>>> statement about P53 in general. Here, we will overstate (our
> >>>> only choice, if we want to make a statement about biology from
> >>>> which some useful inference can be done, given the evidence we
> >>>> have)
> >>>>
> >>>> HumanP53Protein subclassOf
> >>>> HumanP53ProteinWithFunctionDNAStrandAnnealing
> >>>>
> >>>> This may be wrong. For instance, it may be the case that only
> >>>> that P53 phosphorylated in some way actually has this function.
> >>>> I hope that by some other statement, a contradiction is
> >>>> inferred that will force us (or the curators) to be more specific.
> >>>>
> >>>> ----
> >>>>
> >>>> What's nice about this?
> >>>>
> >>>>
> >>>> 1) We are making statements about biology (better than making
> >>>> statements about "terms")
> >>>> 2) There is no RDF reification involved - the main contender for
> >>>> representing this sort of thing.
> >>>> 3) We have been (relatively) conservative about what we say
> >>>> there is evidence for
> >>>> 4) We are owning the fact that we are making an overstatement
> >>>> 5) We are enabling some inference to take place.
> >>>>
> >>>> What's the cost?
> >>>>
> >>>> 1) One extra triple, in which we name the class
> >>>> HumanP53ProteinInvolvedInDNADamageResponse
> >>>> Where we previously would have used a restriction to introduce
> >>>> the participation, we now use the named class.
> >>>> 2) When querying about what the evidence is for, we need to
> >>>> query the asserted (or told) assertions only. That's because
> >>>> after inference has been done, new assertions may be known
> >>>> about HumanP53ProteinWithFunctionDNAStrandAnnealing and we won't
> >>>> be able to tell the difference between what was asserted and
> >>>> what is inferred, given that we have associated only the
> >>>> class name with the evidence
> >>>>
> >>>> ---
> >>>>
> >>>> Taking this to BAMS it means that we associate the paper with
> >>>> the cell class for which we already have a name.
> >>>> For the "molecule is found in cell" cases, we create the named
> >>>> class for the "cell contains some molecule" class, use that
> >>>> class in place of the restriction, and associate the paper to
> >>>> that named class.
> >>>>
> >>>> You can define
> >>>>
> >>>> Class(article :partial)
> >>>> Class(pubmedRecord :partial)
> >>>> ObjectProperty(definedByPMID inversefunctional)
> >>>>
> >>>> Represent the pubmed record as an instance of pubmedRecord named
> >>>> http://purl.org/commons/pubmed/1234
> >>>>
> >>>> The last issue is the nature of the relationship between the
> >>>> paper and the class. If we can't easily distinguish whether
> >>>> these annotations are evidence or simply discussion we could
> >>>> use the relation "isMentionedBy", which we will mean to say
> >>>> that the class (or some instances of the class) are discussed
> >>>> in the paper.
> >>>>
> >>>> ---
> >>>>
> >>>> Call me if you want to discuss this. Admittedly this may seem
> >>>> involved and odd, since it is a new idea, though I will blame
> >>>> Chris and Jonathan, who I bounced it off of, for not telling
> >>>> me straight off it didn't make sense :)
> >>>>
> >>>> But how about we give it a go and see what it feels like. I'm
> >>>> planning to use this translation for the GO annotations and the
> >>>> rest of the similar sources, unless somebody comes forth with
> >>>> some arguments about what would be a better idea.
> >>>>
> >>>> Best,
> >>>> Alan
> >>>>
> >>>>
> >>>> On Apr 18, 2007, at 3:49 PM, jbarkley@nist.gov wrote:
> >>>>
> >>>>>
> >>>>>> From what Mihai sent me, the pubmed refs are about:
> >>>>>
> >>>>>> the cell and
> >>>>>> the fact the molecule is found in cell
> >>>>>
> >>>>> Pending your recommendation, I had tentatively suggested the
> >>>>> following for representing this as:
> >>>>>
> >>>>> pubmedID has "<id>" or
> >>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
> >>>>>
> >>>>> where one or more of these is associated with a cell. I was
> >>>>> under the impression that you were thinking about a general
> >>>>> representation that everyone would use for pubmedID. So, I
> >>>>> haven't yet added these to the BAMS OWL version.
> >>>>>
> >>>>>> OK. Can you send me this for a quick look?
> >>>>>
> >>>>> I'm not sure what you are asking to see. Do you want to see the
> >>>>> original tables Mihai sent me?
> >>>>>
> >>>>> thanks,
> >>>>>
> >>>>> jb
> >>>>>
> >>>>>
> >>>>> Date: Wed, 18 Apr 2007 12:30:17 -0400
> >>>>> From: Alan Ruttenberg <alanruttenberg@gmail.com>
> >>>>> To: John Barkley <jbarkley@nist.gov>
> >>>>> Cc: Jonathan A Rees <jar@mumble.net>
> >>>>> Subject: Re: adding pubmed ids to BAMS
> >>>>> Quoting Alan Ruttenberg <alanruttenberg@gmail.com>:
> >>>>>
> >>>>>>
> >>>>>> On Apr 13, 2007, at 1:51 PM, John Barkley wrote:
> >>>>>>
> >>>>>>> I have confirmed from Mihai that all of the pubmed references in
> >>>>>>> BAMS are evidence for or elaboration about.
> >>>>>>
> >>>>>> OK. Can you send me this for a quick look?
> >>>>>> Is it clear what they are about
> >>>>>> i.e.
> >>>>>>
> >>>>>> the cell
> >>>>>> the part
> >>>>>> the fact that cell is located in part
> >>>>>> the fact the molecule is found in cell
> >>>>>> the fact the molecule is found in part
> >>>>>> the fact the molecule is found in cell in part
> >>>>>> etc.
> >>>>>>
> >>>>>> ?
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>> From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
> >>>>>>>
> >>>>>>>> Don't have time at this moment, but I think that generally you
> >>>>>>>> want to state that the article is either evidence for, or
> >>>>>>>> elaboration about the scientific statement involving the
> >>>>>>>> cells, molecules, etc. Then use the pubmed id in some
> >>>>>>>> standard URI form (maybe neurocommons record url style) or
> >>>>>>>> Jonathan's purl.org suggestion. In other words the pubmed id is
> >>>>>>>> the identifier for a thing (the article, or the abstract,
> >>>>>>>> depending on one's point of view).
> >>>>>>>>
> >>>>>>>> More details later.
> >>>>>>>>
> >>>>>>>> You could look and see how Gene ontology represents evidence.
> >>>>>>>>
> >>>>>>>> -Alan
> >>>>>>>>
> >>>>>>>> On Apr 11, 2007, at 3:46 PM, John Barkley wrote:
> >>>>>>>>
> >>>>>>>>> hi alan,
> >>>>>>>>>
> >>>>>>>>> I received spreadsheets from Mihai relating cells & pubmed
> >>>>>>>>> ids, and cells, molecules, & pubmed ids. I wanted to consult
> >>>>>>>>> with you about your preferences for how to integrate this into
> >>>>>>>>> BAMS. I am thinking something like defining a datatype
> >>>>>>>>> property pubmedID from owl:thing to string. Then for cells,
> >>>>>>>>> you would have:
> >>>>>>>>>
> >>>>>>>>> pubmedID has "<id>"
> >>>>>>>>>
> >>>>>>>>> and for cells with molecules within, you would have:
> >>>>>>>>>
> >>>>>>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
> >>>>>>>>>
> >>>>>>>>> Please let me know.
> >>>>>>>>>
> >>>>>>>>> thanks,
> >>>>>>>>>
> >>>>>>>>> jb
Received on Friday, 20 April 2007 10:28:36 UTC
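
For readers following the thread, here is a rough sketch of what the 2:53 PM mock-up might look like with the two tweaks from the 19 Apr reply applied (the article called "article 7451682", the pubmed record named http://purl.org/commons/pmid/7451682). This is only an illustration, not what was actually added to BAMS. Several things are assumptions: reading "call the article" as an rdfs:label, the local name article_7451682, and the example.org namespace and xml:base, which are placeholders needed only to make the fragment self-contained.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the example.org namespace and xml:base are placeholders -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns="http://example.org/bams#"
         xml:base="http://example.org/bams">

  <owl:Class rdf:ID="article"/>
  <owl:Class rdf:ID="pubmedRecord"/>

  <!-- Declarations copied from the mock-up in the thread -->
  <owl:ObjectProperty rdf:ID="definedByPMID">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#InverseFunctionalProperty"/>
    <rdfs:domain rdf:resource="#article"/>
    <rdfs:range rdf:resource="#pubmedRecord"/>
  </owl:ObjectProperty>

  <owl:ObjectProperty rdf:ID="isMentionedBy">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#AnnotationProperty"/>
  </owl:ObjectProperty>

  <!-- Tweak 2: record named with the pmid/ path and the bare id -->
  <pubmedRecord rdf:about="http://purl.org/commons/pmid/7451682"/>

  <!-- Tweak 1: article labelled "article 7451682"; the rdf:ID is an assumed local name -->
  <article rdf:ID="article_7451682">
    <rdfs:label>article 7451682</rdfs:label>
    <definedByPMID rdf:resource="http://purl.org/commons/pmid/7451682"/>
  </article>

  <!-- The cell class points at the article instance, not at the pubmed record -->
  <owl:Class rdf:ID="c101">
    <rdfs:label>motor neuroendocrine magnocellular oxytocin neuron</rdfs:label>
    <isMentionedBy rdf:resource="#article_7451682"/>
  </owl:Class>
</rdf:RDF>
```

The c101 restrictions and the CARO superclass are left out to keep the fragment short; they would carry over unchanged from the mock-up.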