Re: Advancing translational research with the Semantic Web from Huajun Chen @ Zhejiang University on 2007-05-26 (public-semweb-lifesci@w3.org from May 2007)

From: Huajun Chen @ Zhejiang University <huajunsir@gmail.com>
Date: Sat, 26 May 2007 11:20:32 -0400
To: "Alan Ruttenberg" <alanruttenberg@gmail.com>
Cc: "Pat Hayes" <phayes@ihmc.us>, public-semweb-lifesci@w3.org
Message-ID: <cd02451e0705260820g741631f5w8c686b9ed04d63f2@mail.gmail.com>
I think your approach, to some degree, simulates the role of a named
graph.

The URI of the newly introduced named class can be viewed as the URI of a
named graph that includes a group of statements.  The difference is that
in your approach the URI refers to a real entity (the protein), and we
attach everything we want (provenance, belief, annotations) to the entity
identified by the URI, whereas with a named graph the URI refers to a
graph, and we attach everything we want to that graph.  But the purpose is
similar: in both cases, we want to attach some evidential information to a
group of statements.
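
As a rough illustration of the two attachment points, here is a minimal
Python sketch. Plain tuples and dicts stand in for RDF; every "ex:" name,
the graph name ex:graph_1, and the helper evidence_for are assumptions made
up for the illustration, not part of any existing ontology or library.

```python
# Plain Python stand-ins for RDF; every "ex:" name is hypothetical.

# The group of statements we want to say something about.
statements = [
    ("ex:process_1", "ex:has_participant", "ex:protein_a"),
    ("ex:process_1", "ex:located_in", "ex:tissue_b"),
]

# Named-class style: the URI denotes an entity (here a class), and the
# evidence is attached to that entity.
class_uri = "ex:protein_a_expression_process_located_in_tissue_b"
class_annotations = {class_uri: {"ex:has_evidence": "ex:source_c"}}

# Named-graph style: the URI denotes the graph that holds the statements,
# and the evidence is attached to the graph.
graph_uri = "ex:graph_1"
named_graphs = {graph_uri: statements}
graph_annotations = {graph_uri: {"ex:has_evidence": "ex:source_c"}}


def evidence_for(uri, annotations):
    """Return the evidence recorded against a URI, or None if there is none."""
    return annotations.get(uri, {}).get("ex:has_evidence")
```

In both cases the lookup goes through a single URI; what differs is only
what that URI is taken to denote.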

It would be clearer to use the neuron example to explain this. The reason
it was not straightforward to find a place to put the notes (the
provenance info) in the ontology is that the notes are actually evidence
for the presence of a receptor (or current or transmitter) in a specific
compartment of a neuron, which takes a group of statements (including two
owl:someValuesFrom restrictions) to represent.
In your approach, we create new named classes (such as
CA3_pyramidal_neuron_with_I_Na_t_current_in_Dap) for those previously
anonymous classes, which lack URIs, and then attach the notes as
annotations to the newly created named classes.

CA3_pyramidal_neuron_with_I_Na_t_current_in_Dap -:
               CA3_pyramidal_neuron AND
               ro:has_part SOME (Dap AND
                                           (has_Current SOME I_Na_t))

               has_supporting_evidence
                   Evidence_for_CA3_pyramidal_neuron_not_with_I_Na_t_current_in_Dap
               (this is an instance pointing to a set of article IDs)

I think the core issue here is the URI. RDF reification does not
provide a mechanism to identify an RDF statement, let alone a group of
statements. Named graphs let us do so.
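
To make this concrete with the protein example quoted below, here is a small
Python sketch in which each source's statements go into their own named
graph, so the question of which source goes with which tissue has an answer.
The graph names (ex:g_c, ex:g_d), the "ex:" prefix, and the helper
sources_for are assumptions for illustration only.

```python
# Quads: (graph, subject, predicate, object). All names are hypothetical.
quads = [
    ("ex:g_c", "ex:proc", "ex:has_participant", "ex:protein_a"),
    ("ex:g_c", "ex:proc", "ex:located_in", "ex:tissue_b"),
    ("ex:g_d", "ex:proc", "ex:has_participant", "ex:protein_a"),
    ("ex:g_d", "ex:proc", "ex:located_in", "ex:tissue_e"),
]

# Provenance is attached to the graph URI, not to the shared subject.
described_by = {"ex:g_c": "ex:source_c", "ex:g_d": "ex:source_d"}


def sources_for(subject, predicate, obj):
    """Return the sorted sources whose graph contains the given triple."""
    return sorted(
        {
            described_by[g]
            for g, s, p, o in quads
            if (s, p, o) == (subject, predicate, obj) and g in described_by
        }
    )
```

With this layout, sources_for("ex:proc", "ex:located_in", "ex:tissue_b")
finds only ex:source_c, while the has_participant statement, which both
graphs contain, is traced to both sources.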

Best,
Huajun


On 5/17/07, Alan Ruttenberg <alanruttenberg@gmail.com> wrote:
>
>
> The example isn't necessarily a disagreement. Both could be true.
>
> I think these really need to be class statements, in any case, to
> make any sense.
>
> In my representation this is (schematically)
>
> Class protein_a_expression_process_located_in_tissue_b
>    subclassOf expression and produced some protein_a and located_in
> some tissue b
> annotation: has_evidence: traceable author statement, cites evidence
> source: c
>
> Class protein_a_expression_process_located_in_tissue_e
>    subclassOf expression and produced some protein_a and located_in
> some tissue e
> annotation: has_evidence: traceable author statement, cites evidence
> source: c
>
> No disagreement. But also not so much power.
>
> I have proposed in other mail (on this list, I think) that one may
> strengthen this, either as hypothesis, or by conviction by making the
> "overstatement"
>
> expression and produced some protein_a
> equivalentClass protein_a_expression_process_located_in_tissue_b
> equivalentClass protein_a_expression_process_located_in_tissue_e
>
> Now if tissue e and tissue b are disjoint, there would be a
> contradiction.
>
> Or we could hypothesize that
>
> expression and produced some protein_a
> equivalentClass
>    unionOf(protein_a_expression_process_located_in_tissue_b,
> protein_a_expression_process_located_in_tissue_e)
>
> Given what's been said so far, we can't actually tell which of these
> two cases is supposed to be meant.
>
> And even my version doesn't handle parts very well. In the case of
> cellular components, for example, we want to be able to say that
> it's active in the cell and it's active in the E.R. and not have that
> be inconsistent, because every E.R. is part of some cell.
>
> -Alan
>
>
>
>
>
> On May 17, 2007, at 1:00 PM, Pat Hayes wrote:
>
> >
> >>  > How would you say e.g. "protein a is expressed in tissue b,
> >> according to
> >>>  source c"?
> >>
> >> through something like
> >>
> >> <protein_a_expression_process> <has_participant> <protein_a> .
> >> <protein_a_expression_process> <located_in> <tissue_b> .
> >> <protein_a_expression_process> <described_by> <source_c> .
> >
> > OK, but suppose source d disagrees, and says that a is expressed in
> > e. Now you have
> >
> > <protein_a_expression_process> <located_in> <tissue_e> .
> > <protein_a_expression_process> <described_by> <source_d> .
> >
> > and it's all about the same process. What now associates d with e,
> > and c with b? You just have five triples all with the same subject.
> >
> > Pat Hayes.
> >
> >> -- Matthias Samwald
> >>
> >>
> >> Yale Center for Medical Informatics, New Haven /
> >> Section on Medical Expert and Knowledge-Based Systems, Vienna /
> >> http://neuroscientific.net
> >>
> >>
> >
> >
> > --
> > ---------------------------------------------------------------------
> > IHMC          (850)434 8903 or (650)494 3973   home
> > 40 South Alcaniz St.  (850)202 4416   office
> > Pensacola                     (850)202 4440   fax
> > FL 32502                      (850)291 0667    cell
> > phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> >
> >
>
>
>
Received on Saturday, 26 May 2007 15:20:35 UTC