Re: Ambiguous names. was: Re: URL +1, LSID -1

>>>>> "MS" == Matthias Samwald <samwald@gmx.at> writes:

  >> It would be more satisfying for us to know intentionally what we mean
  >> by "protein". It would be good to have a clear set of definitions. But,
  >> ultimately, I think it would be mistaken. If we have the ability to
  >> express "the class of protein molecules defined by the swissprot record
  >> OPSD_HUMAN", then I think we have all we need.

  MS> OWL is very open towards incomplete information. If all we know about
  MS> the protein is the sequence of amino acids, then this is what we add to
  MS> the protein class through a 'some-values-from, necessary' property
  MS> restriction (and not 'necessary and sufficient', since we are still
  MS> unsure if this information alone is enough to DEFINE the protein
  MS> class). If we know that proteins of this class can have some
  MS> polymorphisms, we can enumerate the different possible sequences as best
  MS> as we can. If we are unable to enumerate all of them at the moment, or
  MS> are unsure about something, we just leave it out and maybe add it later.


This is my worry. Effectively, I think you are saying: why not take all the
knowledge in swissprot and duplicate it in our class definitions? I don't see
what this adds. All I see is that it will add confusion and the potential for
data to get out of date.
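
To make the worry concrete, here is a minimal sketch in Python rather than
OWL. Everything in it is an assumption made up for illustration: the two class
names, the dictionary standing in for swissprot, and the placeholder
sequences, which are not real data. It only shows that a copied definition can
drift from the record, while a reference to the record cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordDefinedClass:
    """Hypothetical: a class identified only by the record that defines it."""
    database: str
    record_id: str

    def label(self) -> str:
        return (f"the class of protein molecules defined by the "
                f"{self.database} record {self.record_id}")


@dataclass(frozen=True)
class CopiedDefinitionClass:
    """Hypothetical: a class that duplicates the record's content locally."""
    record_id: str
    copied_sequence: str

    def is_stale(self, current_sequence: str) -> bool:
        # The copy is stale as soon as the record no longer matches it.
        return self.copied_sequence != current_sequence


# Stand-in for the database; the sequences are placeholders, not real data.
database = {"OPSD_HUMAN": "PLACEHOLDERSEQA"}

by_reference = RecordDefinedClass("swissprot", "OPSD_HUMAN")
by_copy = CopiedDefinitionClass("OPSD_HUMAN", database["OPSD_HUMAN"])

stale_before = by_copy.is_stale(database["OPSD_HUMAN"])

# The curators revise the record; nobody updates our class definition.
database["OPSD_HUMAN"] = "PLACEHOLDERSEQB"

stale_after = by_copy.is_stale(database["OPSD_HUMAN"])

print(by_reference.label())
print(stale_before, stale_after)
```

The by-reference class needs no maintenance when the record changes, because
it never said anything about the sequence in the first place.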

This is an important issue and it will raise its head repeatedly. Should we
define Homo sapiens? Should we determine all the necessary and sufficient
conditions? Or should we just point to a pre-existing taxonomy and a
pre-existing process?

I think that there are many clear reasons for keeping statements about the
informatics entities -- the database entries, for example. To do otherwise
runs the risk of enormous mission creep (always a problem with data modelling
and ontologies).

Phil

Received on Thursday, 19 July 2007 13:27:32 UTC