Re: blog: semantic dissonance in uniprot from Peter Ansell on 2009-03-21 (public-semweb-lifesci@w3.org from March 2009)

From: Peter Ansell <ansell.peter@gmail.com>
Date: Sun, 22 Mar 2009 07:48:20 +1000
To: Egon Willighagen <egon.willighagen@gmail.com>
Cc: eric neumann <ekneumann@gmail.com>, marshall@science.uva.nl, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
Message-ID: <a1be7e0e0903211448n65e75993v6a624bf7ac9f3215@mail.gmail.com>

2009/3/22 Egon Willighagen <egon.willighagen@gmail.com>:
> On Sat, Mar 21, 2009 at 5:01 AM, eric neumann <ekneumann@gmail.com> wrote:
>> There is no such thing as a referenceable instance of a specific instantiated
>> molecule ("that specific molecule"); all gene, protein, and chemical records
>> are about the category or group of exemplar molecules:
>> SAME molecular structure, NOT SAME atoms (so we already aren't really things
>> in the real world ;-) ); all molecular databases are based on this asserted
>> fact.
>
> Even worse. Since there are >10^20 molecules in most used materials,
> many 'molecular' properties are really material properties. A melting
> point is not a molecular property, but is often even reported as an
> elemental property.

Simple distinctions like this could be made by giving each property
its own defined meaning (molecular or material), even though both are
asserted in the same record for ease of access.
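
As a minimal sketch of that idea (Python; the names Scope, Property and
properties_with_scope, and the example values for water, are made up for
illustration and are not taken from UniProt or any existing vocabulary),
each property could carry its own scope while still sitting in one record:

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    MOLECULE = "molecule"   # holds for a single molecule of this structure
    MATERIAL = "material"   # holds only for a bulk sample of many molecules


@dataclass(frozen=True)
class Property:
    name: str
    value: float
    unit: str
    scope: Scope


# One record, asserted together for ease of access (illustrative values).
record = [
    Property("molecular_mass", 18.015, "Da", Scope.MOLECULE),
    Property("melting_point", 0.0, "degC", Scope.MATERIAL),
]


def properties_with_scope(props, scope):
    """Return the names of the properties that describe the given scope."""
    return [p.name for p in props if p.scope is scope]
```

A consumer that only cares about bulk behaviour could then ask for
properties_with_scope(record, Scope.MATERIAL) without the record having
to be split in two.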

>> Most users of molecular information aren't ignorant about the difference
>> between a protein and a record of a protein; it's just that they don't want
>> to deal with all the extra CS mechanics (that prevent getting their job
>> done). And so an instance of a protein record in a database (or a reference
>> to it from another database) is the closest thing to saying: "here's the
>> protein".
>
> Chemists are not interested in single molecules (well, most are not,
> but with increasing nanotechnology...). I was told recently that upper
> ontologies have proper mechanisms to point out the difference between
> (in Java terminology) objects and classes, or instances and concepts.

There is that possibility.

>> Different records exist for the same protein, which indeed has been a
>> historic point of complication; but this is really a social issue, not a
>> semantic one, and the key data authorities have already for years
>> coordinated on this point by supplying cross-references to each
>> other.
>
> There is another level to this: that of a measurement or observation,
> and the identity we assign to it. The sequence of a protein, or
> molecular structure of a drug, is the model that people assigned to
> some measurement. Measurements that point to the same measurable may
> actually be assigned different identities...

Having different identities might be the rational scientific way to do
things. They might be caused by different perspectives on the one
item, or by a genuine duality of theory, where something cannot be
described in a single theory. Making god-like decisions as ontologists
about which class particular records actually belong to might sound
fun, but in the real world it seems counterintuitive because it
doesn't promote progress in both areas concurrently.
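
One way to picture keeping the identities apart while still relating
them is a plain cross-reference table. This is only a sketch under my
own assumptions: the identifiers "dbA:P1" and "dbB:X9" are placeholders
rather than real database entries, and link/linked are hypothetical
helper names.

```python
from collections import defaultdict

# Maps each identity to the set of other identities it is cross-referenced with.
xrefs = defaultdict(set)


def link(a, b):
    """Record that two identities point at the same measurable, without merging them."""
    xrefs[a].add(b)
    xrefs[b].add(a)


def linked(a, b):
    """True if a cross-reference was asserted directly between a and b."""
    return b in xrefs.get(a, set())


# Two records from different perspectives keep their own identifiers.
link("dbA:P1", "dbB:X9")
```

Nothing here decides which record is the "real" one; each side can keep
evolving its own description, and the link only says that someone
asserted they are about the same thing.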

Cheers,

Peter Ansell

Received on Saturday, 21 March 2009 21:48:55 UTC