Re: blog: semantic dissonance in uniprot from Peter Ansell on 2009-03-26 (public-semweb-lifesci@w3.org from March 2009)

From: Peter Ansell <ansell.peter@gmail.com>
Date: Thu, 26 Mar 2009 15:28:17 +1000
To: Pat Hayes <phayes@ihmc.us>
Cc: Mark Wilkinson <markw@illuminae.com>, Phillip Lord <phillip.lord@newcastle.ac.uk>, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
Message-ID: <a1be7e0e0903252228v3e2676b1wafbba2bdc84e993f@mail.gmail.com>

2009/3/26 Pat Hayes <phayes@ihmc.us>:
> Well, if you can tell us how to do some weaving, we maybe can make progress.
> The properties of sameAs are fairly easy to list. It is transitive,
> reflexive, symmetric and substitutive: if A sameAs B and something is true
> of A, then its also true of B.  So, which of these _aren't_ correct for the
> application you have in mind? Can you say why not (an example will do)?
> "Less rigorous" doesn't cut it.

It is easy in theory to state a property from logic theory in a truly
rigorous, once-correct-always-correct way for every potential context.
In reality, each context will assign a different value to each
combination of the properties you refer to. The range runs from
utterly pedantic, to the point of being completely uninterpretable to
anyone who is not a logic theory expert (and still both correct and
incorrect depending on the scenario), up to a laissez-faire "not
cutting it" method where, shock horror, database records are referred
to as *both* instances and classes of molecules at the same time! Some
of the applications of biological data would shock someone used to
complete rigour, but they turn out to be sufficient evidence for
people making statements about things in the sciences. Some people
find the flexibility to be liberating.

The whole idea of the Global Graph fails to take into account the fact
that the statements it includes may have been designed for limited
scenarios, where for instance the exact rigorous nature of something
is either not known, was previously thought to be rigorously correct,
or was previously thought to be practical in a limited scenario.
Trying to make things work in the Global Graph doesn't require weaving
in my opinion; it requires some clean chopping. Within domains things
are accepted, but outside of them rigour is logically limited, if only
because it is impossible to prove that a particular statement has a
useful meaning outside of the Global-Graph-knowledge-universe unless
the person doing the human verification is an all-round expert who
knows the whole truth. If there were only one statement in the Global
Graph that was actually incorrect, it might be practical to apply
rules and determine where it breaks before attempting to fix the break
and try again.

It might sound drastic, but it would eliminate all of this continuous
banter if people stopped trying to make things work in the global
context and, in the process, confusing everyone else who only really
wanted to use the knowledge in their local field of work. The whole
idea of logic theory applied to practical knowledge only seems to be
suited to limited areas where there is absolute and full understanding
of every property and interaction in the system being described, or
else you might say something which is later found to be *false!*

Admittedly, we probably need to find an alternative to sameAs, because
it is clear that within its theory it has a complete, universal set of
properties which will actually interfere with people trying
frantically to do some of the more pedantic global graph operations,
where knowledge is assumed not to be dirty (turns out that is false)
and to be complete (also turns out that is false). It wouldn't
surprise me if sameAs really is only useful for relating synonyms
within equivalent ontological environments. It really has some quite
drastic implications even if you choose to map between any two
arbitrary knowledge bases with different ontological trees being used
to define the classes and properties surrounding the target URIs.
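
To make those implications a little more concrete, here is a minimal
Python sketch of what the listed sameAs properties do when two
knowledge bases are linked. It is only an illustration under stated
assumptions: the identifiers (dbA:P12345, dbB:protein_X, ex:...) are
made up, triples are plain tuples rather than anything from a real RDF
library, and substitution is only applied in the subject position.

```python
from collections import defaultdict

def same_as_closure(pairs):
    """Group terms into equivalence classes, treating sameAs as
    reflexive, symmetric and transitive (a small union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    classes = defaultdict(set)
    for term in list(parent):
        classes[find(term)].add(term)
    return list(classes.values())

def substitute(triples, pairs):
    """Substitutivity: whatever is said about one member of an
    equivalence class is inferred for every other member
    (subject position only, to keep the sketch short)."""
    members = {}
    for cls in same_as_closure(pairs):
        for term in cls:
            members[term] = cls
    inferred = set()
    for s, p, o in triples:
        for s2 in members.get(s, {s}):
            inferred.add((s2, p, o))
    return inferred

# Hypothetical statements from two databases with different ontologies.
triples = [
    ("dbA:P12345", "rdf:type", "ex:DatabaseRecord"),
    ("dbB:protein_X", "rdf:type", "ex:MoleculeClass"),
]
links = [("dbA:P12345", "dbB:protein_X")]  # a single sameAs statement

inferred = substitute(triples, links)
```

With that single link, both identifiers end up typed as a database
record and as a class of molecules at once, which is the kind of
result that is tolerable locally but awkward globally.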

If we could even define an alternative intraGraphSameAs, workable
only within a limited idea of graphs in RDF databases and/or
"namespaces", then it would still be usable in its global sense but
still be recognisable as more than "seeAlso" for people working in a
limited knowledge sphere. An alternative interGraphSameAs to sameAs,
one that has implications for mapping between
graphs/databases/namespaces, would have to be drawn out in a different
way because it deals with different contexts. Ideally, reasoners could
determine an error statistic based on how much a reasoning operation
relies on a combination of intraGraphSameAs and interGraphSameAs,
although they could already do that with sameAs if graphs are loaded
with statements based on databases instead of a single huge Global
Graph.
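
One way such an error statistic might look, as a rough Python sketch
only: each sameAs link in a reasoning chain is classified as intra- or
inter-graph by comparing the prefixes of its two ends, and an error
estimate is built from per-kind confidence values. The confidence
numbers, the prefix-as-graph shortcut and the assumption that links
fail independently are all mine for the sake of illustration, not
anything defined by RDF or OWL.

```python
# Assumed, purely illustrative confidence per kind of link.
ASSUMED_CONFIDENCE = {"intra": 0.99, "inter": 0.90}

def namespace(term):
    """Treat the prefix before the first ':' as the term's graph."""
    return term.split(":", 1)[0]

def classify(link):
    """Label a sameAs link as intra-graph or inter-graph."""
    a, b = link
    return "intra" if namespace(a) == namespace(b) else "inter"

def chain_error(links, confidence=ASSUMED_CONFIDENCE):
    """Estimate the chance that a chain of sameAs links is wrong,
    assuming each link fails independently of the others."""
    reliability = 1.0
    for link in links:
        reliability *= confidence[classify(link)]
    return 1.0 - reliability

# A hypothetical inference that relies on one link of each kind.
chain = [("dbA:x", "dbA:y"), ("dbA:y", "dbB:z")]
error = chain_error(chain)  # roughly 0.109 with the values above
```

The same calculation works with plain sameAs as long as the reasoner
knows which database each statement was loaded from.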

Cheers,

Peter
Received on Thursday, 26 March 2009 05:28:56 UTC