Re: owl:sameAs - Is it used in a right way?

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Sun, 17 Mar 2013 19:12:43 -0400
Message-ID: <CAFKQJ8mx54_-PTJ_8H68WGxtUyHbSFV7+gQo97vX_xYJMRD1pg@mail.gmail.com>
To: Umutcan ŞİMŞEK <s.umutcan@gmail.com>
Cc: Oliver Ruebenacker <curoli@gmail.com>, public-semweb-lifesci@w3.org
Neither of them is clear enough to be sure what it is referring to.
They both, in their descriptions, refer to molecules in some places, and
packaged therapeutics in others. Their CAS numbers agree (though Wikipedia
mentions that it is that of the sodium salt), as do their InChIs,
though the InChIs of the PubChem entries they refer to differ. Their
molecular masses differ, their IUPAC systematic names differ. The recorded
packagers differ, while Wikipedia lists a number of different
manufacturers/packagers.

As descriptions they are clearly distinct, not only in their expression, but
in the actual assertions they make.

Both descriptions seem to refer to a broad range of entities. Drugbank has
the appearance of being more specific, but in one place classifies it as a
small molecule, while in the description talks about formulations, and by
reference to the CAS number at least admits to common modifications made
for pharmaceutical formulations.

To say that two things are owl:sameAs is a very strong statement.

I see no basis whatsoever to assert these are sameas in any way. If you
take the interpretation that the records represent something and that the
documentation is annotated description, then we don't have confidence that
they refer to the same set of things (class, but I'm always afraid of
scaring people with technical talk in this forum). Assertions that refer to
_some_ member of one *might* be consistent if you substitute the
other in the expression. Considering equivalent assertions that refer to
*all* of either of the classes would be a matter of faith alone.

If we think what we see in our browser is a presentation of
records/descriptions, it's obvious they are not the same.

A careful use of these would, as I suggested, consider them different
descriptions of something. In this case, considering that there are a
number of equally plausible primary topics (the class of molecules, the
class of molecules and their common pharmaceutical derivatives, a class of
therapeutics containing some of the previous as part), if you are
interested in differentiating between these then the best you could use is
foaf:topic rather than foaf:primaryTopic.

One could also consider the descriptions as the class of descriptions whose
primaryTopic is (the class of molecules OR the class of molecules and their
common pharmaceutical derivatives OR a class of therapeutics containing
some of the previous as part).

sameAs means the URIs refer to EXACTLY the same thing, that they are
substitutable for each other in ALL assertions of ANY kind. That's clearly
not the case. Even if the intention was the same, because of the
inconsistencies they would need to be corrected, and if you asserted they
were the same then you wouldn't be able to tell which correction applied
to which one.
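
To make the substitutability point concrete, here is a minimal sketch in
plain Python (no RDF library). It treats owl:sameAs the way a reasoner
would, by copying every statement about one URI onto the other. The
ex:molecularMass predicate and the two mass values are hypothetical
placeholders, not values taken from either dataset.

```python
from itertools import product

DBPEDIA = "dbpedia:Metamizole"
DRUGBANK = "drugbank:DB04817"
SAME_AS = "owl:sameAs"

# Hypothetical statements: each record carries its own (differing) mass.
triples = {
    (DBPEDIA, "ex:molecularMass", "mass-recorded-by-dbpedia"),
    (DRUGBANK, "ex:molecularMass", "mass-recorded-by-drugbank"),
    (DBPEDIA, SAME_AS, DRUGBANK),
}


def alias_map(triples):
    """Map each URI that appears in a sameAs link to its full set of aliases."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return {node: members for members in groups.values() for node in members}


def substitute(triples):
    """Apply sameAs: every statement holds for every alias of its terms."""
    aliases = alias_map(triples)
    result = set()
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2, o2 in product(aliases.get(s, {s}), aliases.get(o, {o})):
            result.add((s2, p, o2))
    return result


merged = substitute(triples)
for triple in sorted(merged):
    print(triple)
```

After the merge both URIs carry both mass values, so nothing in the data
says which record made which claim, and a correction can no longer be
directed at one of them.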

If you are interested in the class with the ORs, then defining the class
and asserting that it is the foaf:primaryTopic of both records seems
justified.

If you are interested in anything else, your best bet is to choose a
term/URI that is well defined and specific, and then relate these by
foaf:topic. It's reasonable to assume that if R foaf:topic T, then some
assertions that R makes are about T. Figuring out which ones is then a
matter of additional processing and assumptions, not standard RDF or OWL
semantics. If you chose to republish the results of that processing then I
would be careful about how those are asserted. My own guideline is that if
the resulting assertions are vetted by a group of experts in the field and
they agree as to the assertional content, then it's a good idea to make the
more specific assertions directly. Otherwise you might consider representing
them as claims or hypotheses with the hope that said group considers them
worth review at some point. See
http://hypothesis.alzforum.org/swan/do!getFAQ.action for the claim view,
and OBO Foundry for the review by domain experts view.

Note that from a LODD point of view there is, strictly speaking, a
requirement that the URIs be linked. This can be accomplished just as well
with one or a chain of two justifiable assertions as it can by having a
single (unjustifiable) owl:sameAs link. I feel like doing the latter means
that a conscientious researcher needs to pretty much ignore the bulk of
owl:sameAs assertions (I do).
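
As a rough illustration of the "chain of two justifiable assertions", the
sketch below links both records to a single term with foaf:topic and checks
that the two URIs are still connected. The term ex:MetamizoleMolecule is a
hypothetical stand-in for whatever well-defined URI you choose, and the
linked() helper is mine, not part of any RDF toolkit.

```python
from collections import deque

TERM = "ex:MetamizoleMolecule"  # hypothetical well-defined term

# Two weaker links in place of one owl:sameAs statement.
links = [
    ("dbpedia:Metamizole", "foaf:topic", TERM),
    ("drugbank:DB04817", "foaf:topic", TERM),
]


def linked(a, b, triples):
    """Return True if a and b are connected by links followed in either direction."""
    neighbours = {}
    for s, _, o in triples:
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)

    seen, queue = {a}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(linked("dbpedia:Metamizole", "drugbank:DB04817", links))
```

The two records stay distinct resources, each with its own assertions, yet
a crawler following links can still get from one to the other.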

-Alan


On Sun, Mar 17, 2013 at 6:21 PM, Umutcan ŞİMŞEK <s.umutcan@gmail.com> wrote:

> On 17.03.2013 at 17:11, Oliver Ruebenacker wrote:
>
>        Hello,
>>
>> On Fri, Mar 15, 2013 at 1:05 PM, Umutcan ŞİMŞEK <s.umutcan@gmail.com>
>> wrote:
>>
>>> My question is, does LODD use owl:sameAs properly? For instance, are
>>> those
>>> two resources, dbpedia:Metamizole and drugbank:DB04817 (code for
>>> Metamizole), really identical? Or am I getting the word "property" in the
>>> paper wrong?
>>>
>>    Do we mean metamizole in acid form, or metamizole in anion form, or
>> metamizole sodium? Or some combination of some of the above?
>>
>>       Take care
>>       Oliver
>>
>>  This is actually one of the other topics of this discussion. At dbpedia, it
> seems to be supplied as the sodium salt, but I couldn't find any form
> information in drugbank; maybe I overlooked it. And I'm not sure if it matters
> in the drugbank context. But I can ask the researchers at my workplace next week.
>
> --
> Umutcan ŞİMŞEK
>
> Senior student
> Ege University, Izmir, Computer Science Department
>
> Part-time IT Team Supervisor
> ARGEFAR Pharmaceutical Research and Pharmacokinetic Development Center
>
> Blog(Turkish): http://blog.umutcansimsek.com
> LinkedIn: http://www.linkedin.com/pub/umutcan-%C5%9Fim%C5%9Fek/53/199/26a
>
>
>
Received on Sunday, 17 March 2013 23:13:40 UTC
