
Re: blog: semantic dissonance in uniprot

From: Pat Hayes <phayes@ihmc.us>
Date: Sat, 28 Mar 2009 15:32:16 -0500
Cc: W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
Message-Id: <6A520C0C-89CD-4C07-ABE1-CE226560DFC4@ihmc.us>
To: Oliver Ruebenacker <curoli@gmail.com>

On Mar 28, 2009, at 11:15 AM, Oliver Ruebenacker wrote:

>     Hello Pat, All,
> On Sat, Mar 28, 2009 at 10:54 AM, Pat Hayes <phayes@ihmc.us> wrote:
>>>  Actually, I doubt a protein is a set. It seems to me, in Systems
>>> Biology, a protein is an operator working on statistical ensembles,
>>> from which we can derive expectation values and variances.
>> Um. OK, you obviously know more about this than I do, but I very much doubt
>> if any ontology notation is capable of expressing what you here describe.
>  Why not? I don't see any fundamental problem.

Well, the very idea of a statistical ensemble is way more complicated  
than anything any ontology language semantics is able to deal with.  
You would need at least arithmetic to describe this, surely(?).

>  As I understand it, an owl:Class is simply something intended to be
> instantiated. I declare something a class if and only if I intend
> there to be instances.

Yes, quite.

> In Systems Biology, as I understand it, EGFR is
> an instance of class Protein which is subclass of Substance. I don't
> intend there to be instances of EGFR, so I don't declare it a class.
> If someone else wants to declare instances of EGFR, that's their
> responsibility and it is probably a mistake.

OK so far. But I don't see anything here about statistical ensembles.
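
The modelling choice described above can be sketched in plain Python. This is only an illustration under stated assumptions: the `ex:` names, the triple set and the two helper functions are hypothetical, and no RDF or OWL library is used. EGFR is an individual typed as Protein, Protein is a subclass of Substance, and nothing is declared to be an instance of EGFR.

```python
# Hypothetical triples; the "ex:" names are illustrative, not from BioPAX.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("ex:Substance", RDF_TYPE, "owl:Class"),
    ("ex:Protein", RDF_TYPE, "owl:Class"),
    ("ex:Protein", SUBCLASS_OF, "ex:Substance"),
    ("ex:EGFR", RDF_TYPE, "ex:Protein"),  # EGFR is an instance, not a class
}


def instances_of(cls, triples):
    """Return everything directly declared to be an instance of cls."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o == cls}


def types_of(individual, triples):
    """Return the direct types of an individual plus all their superclasses."""
    found = {o for (s, p, o) in triples if s == individual and p == RDF_TYPE}
    frontier = list(found)
    while frontier:
        current = frontier.pop()
        for (s, p, o) in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


egfr_types = types_of("ex:EGFR", triples)
egfr_instances = instances_of("ex:EGFR", triples)
```

Here `egfr_types` contains both `ex:Protein` and `ex:Substance`, while `egfr_instances` is empty, which matches the stated intent that EGFR has no instances of its own.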

>  People shift effortlessly between domains. In college, we learn in
> Quantum Mechanics to make a sharp distinction between an operator and
> its expectation value. Once we advance through grad school, we start
> to use the same symbol for both operator and expectation value and stop
> talking about the distinction. The context either makes it clear
> which one is meant (e.g. p = <p>), or it does not matter (e.g. p =
> mv).

We humans do this all the time, yes, and not just in technical areas  
but also in daily life. But machines are not very good at this kind of  
cross-domain elision. In fact, they can hardly do it at all. Notice  
that if this kind of reasoning were ubiquitous, sameAs would be close  
to meaningless.

>  Systems Biologists use knowledge from Molecular Biology, Chemistry
> or Biological Physics, where they talk about single molecules or sets
> of molecules. But the typical Systems Biology picture, the picture
> behind the Virtual Cell, SBML, BioPAX, is not one of single molecules
> or sets of molecules, but of quantifiable observables. An observable
> can be understood as the result of a series of measurements, which
> yields an expectation value and a variance. A simple understanding of
> an observable is enough to do Systems Biology, but at the same time,
> the concept easily integrates thermodynamic statistical ensembles,
> quantum uncertainty and averaging across different samples.
>  Even if the observable is the number of molecules, it may not be an
> integer, because it is an expectation value. That's why in SBML and
> BioPAX, stoichiometric coefficients are floats, not integers, and in
> SBML, they can even carry units. Often, we are interested not in the
> molecule number, but in the concentration. Or, in something
> non-countable, such as heat.
>  To understand what EGFR is, we build an imaginary EGFR detector, a
> device that we direct to some space, push a button, and it gives us a
> measurement of the amount of EGFR in that space, which is an
> approximate value. Pushing the button repeatedly gives us expectation
> value and variance. Can you use it to track a single molecule? It is
> physically impossible to make certain that there is exactly one
> EGFR molecule in a space. The best you can do is to have an expectation
> value close to one, and a variance close to zero.
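
The imaginary detector in the quoted passage can be sketched as a small simulation. This is a rough illustration only: the Gaussian noise model, the function names and the parameter values are all assumptions made for the example, not part of SBML, BioPAX or the Virtual Cell. Each button press returns one approximate reading, and repeated presses yield an estimated expectation value and variance.

```python
import random
import statistics


def make_detector(true_amount, noise_sd, seed=0):
    """Build a hypothetical detector; each call simulates one button press.

    Gaussian noise around true_amount is an assumption of this sketch.
    """
    rng = random.Random(seed)

    def push_button():
        return rng.gauss(true_amount, noise_sd)

    return push_button


def observe(push_button, n_presses):
    """Press the button n_presses times; return (sample mean, sample variance)."""
    readings = [push_button() for _ in range(n_presses)]
    return statistics.fmean(readings), statistics.variance(readings)


# Aim for "about one EGFR molecule": expectation near one, variance near zero.
detector = make_detector(true_amount=1.0, noise_sd=0.05)
expectation, variance = observe(detector, n_presses=1000)
```

The estimated expectation value is a float near one rather than an exact integer count, and the variance is small but not zero, which is the point made about molecule numbers being expectation values.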

I understand. However, speaking now as an ontology engineer, I would  
not advise anyone to attempt to formalize all this in anything  
remotely like OWL or even full first-order logic.


>> I was referring to OWL 2, not OWL Full. It is the new version of OWL, in
>> last call as we write. The DL version of it runs at DL efficiencies and
>> allows classes of classes, kinda (using punning, it works for most
>> applications). And BTW, instantiating classes is fast and easy in just about
>> any formalism. The speed cost comes from the fact that more expressive
>> languages allow stranger edge cases which have to be checked by complete
>> reasoners. But all these complexity results are worst-case, and normal-case
>> behavior is often very different.
>  Looking forward to Jena implementing an OWL 2 DL reasoner, or more
> importantly, an OWL 2 Mini reasoner.
>     Take care
>     Oliver
> -- 
> Oliver Ruebenacker, Computational Cell Biologist
> BioPAX Integration at Virtual Cell (http://vcell.org/biopax)
> Center for Cell Analysis and Modeling
> http://www.oliver.curiousworld.org

IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Saturday, 28 March 2009 20:33:31 UTC
