Re: blog: semantic dissonance in uniprot

     Hello Pat, All,

On Sat, Mar 28, 2009 at 10:54 AM, Pat Hayes <phayes@ihmc.us> wrote:
>>  Actually, I doubt a protein is a set. It seems to me, in Systems
>> Biology, a protein is an operator working on statistical ensembles,
>> from which we can derive expectation values and variances.
>
> Um. OK, you obviously know more about this than I do, but I very much doubt
> if any ontology notation is capable of expressing what you here describe.

  Why not? I don't see any fundamental problem.

  As I understand it, an owl:Class is simply something intended to be
instantiated. I declare something a class if and only if I intend
there to be instances. In Systems Biology, as I understand it, EGFR is
an instance of class Protein which is subclass of Substance. I don't
intend there to be instances of EGFR, so I don't declare it a class.
If someone else wants to declare instances of EGFR, that's their
responsibility and it is probably a mistake.

  People shift effortlessly between domains. In college, we learn in
Quantum Mechanics to make a sharp distinction between an operator and
its expectation value. Once we advance through grad school, we start
using the same symbol for both operator and expectation value and stop
talking about the distinction. Either the context makes clear which
one is meant (e.g. p = <p>), or it does not matter (e.g. p = mv).

  Systems Biologists use knowledge from Molecular Biology, Chemistry
or Biological Physics, where they talk about single molecules or sets
of molecules. But the typical Systems Biology picture, the picture
behind the Virtual Cell, SBML, BioPAX, is not one of single molecules
or sets of molecules, but of quantifiable observables. An observable
can be understood as the result of a series of measurements, which
yields an expectation value and a variance. A simple understanding of
an observable is enough to do Systems Biology, but at the same time,
the concept easily integrates thermodynamic statistical ensembles,
quantum uncertainty and averaging across different samples.

  Even if the observable is the number of molecules, it may not be an
integer, because it is an expectation value. That's why in SBML and
BioPAX, stoichiometric coefficients are floats, not integers, and in
SBML, they can even carry units. Often, we are interested not in the
molecule number, but in the concentration. Or, in something
non-countable, such as heat.
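
  To illustrate why an expected molecule number need not be an
integer, here is a minimal sketch in Python. The counts are made up
for illustration and do not come from any real measurement; only the
standard library's statistics module is used.

```python
from statistics import mean, pvariance

# Hypothetical molecule counts from five repeated measurements of the
# same space (invented numbers, for illustration only).
counts = [2, 3, 2, 4, 2]

# Every single count is an integer, but the expectation value is not.
expectation = mean(counts)

# Population variance of the same series of measurements.
spread = pvariance(counts)

print(expectation, spread)
```

  Each individual count is a whole number, yet the observable we carry
around in a model is the average, which is generally fractional.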

  To understand what EGFR is, we build an imaginary EGFR detector, a
device that we direct to some space, push a button, and it gives us a
measurement of the amount of EGFR in that space, which is an
approximate value. Pushing the button repeatedly gives us expectation
value and variance. Can you use it to track a single molecule? It is
physically impossible to make certain that there is exactly one EGFR
molecule in a space. The best you can do is to have an expectation
value close to one and a variance close to zero.
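
  The imaginary detector can be sketched in a few lines of Python.
Everything here is hypothetical: the names make_detector and
push_button, the Gaussian noise model and the numbers are assumptions
chosen only to show how repeated button presses yield an expectation
value and a variance.

```python
import random
from statistics import mean, variance


def make_detector(true_amount, noise, seed=0):
    """Return a hypothetical push-button function giving noisy readings."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    def push_button():
        # Each press returns an approximate value of the amount in the
        # space. Gaussian noise is an assumption, not a physical model.
        return rng.gauss(true_amount, noise)

    return push_button


# A detector pointed at a space that holds about one molecule.
push_button = make_detector(true_amount=1.0, noise=0.05)

# Pushing the button repeatedly gives a series of measurements.
readings = [push_button() for _ in range(1000)]

expectation = mean(readings)   # close to one
spread = variance(readings)    # close to zero, but never exactly zero

print(expectation, spread)
```

  Even in this idealized sketch, the variance only approaches zero; no
finite series of readings certifies exactly one molecule.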

> I was referring to OWL 2, not OWL Full. It is the new version of OWL, in
> last call as we write. The DL version of it runs at DL efficiencies and
> allows classes of classes, kinda (using punning, it works for most
> applications). And BTW, instantiating classes is fast and easy in just about
> any formalism. The speed cost comes from the fact that more expressive
> languages allow stranger edge cases which have to be checked by complete
> reasoners. But all these complexity results are worst-case, and normal-case
> behavior is often very different.

  Looking forward to Jena implementing an OWL 2 DL reasoner, or more
importantly, an OWL 2 Mini reasoner.

     Take care
     Oliver

-- 
Oliver Ruebenacker, Computational Cell Biologist
BioPAX Integration at Virtual Cell (http://vcell.org/biopax)
Center for Cell Analysis and Modeling
http://www.oliver.curiousworld.org

Received on Saturday, 28 March 2009 16:15:44 UTC