
Re: ontology specs for self-publishing experiment

From: Trish Whetzel <whetzel@pcbi.upenn.edu>
Date: Mon, 10 Jul 2006 12:23:45 -0400 (EDT)
To: Alan Rector <rector@cs.man.ac.uk>
cc: Phillip Lord <phillip.lord@newcastle.ac.uk>, w3c semweb hcls <public-semweb-lifesci@w3.org>
Message-ID: <Pine.LNX.4.61.0607101154460.28972@hera.pcbi.upenn.edu>

>>   AR> Could I strongly support the following.  If there is one
>>   AR> repeatedly confirmed lesson from the medical community's
>>   AR> experience with large terminologies/ontologies, it is to
>>   AR> separate the "terms" from the "entities".  ...
>> 
>> Not that I wish to disagree with Alan, of course, but it is worth
>> mentioning the reason that so many identifiers are semantically
>> meaningful in biology; they look better in papers. Moreover, because
>> they have some meaning associated with them, they are likely to be
>> used correctly in papers as biologists will notice when they have the
>> wrong one.
>> 
>
> I am not against having human readable, standard names. To the contrary, I 
> think they are essential.  But I would avoid them as primary identifiers. 
> For example, try editing a set of modular ontologies held together by just 
> the string names.  Every time somebody wants to change a name, fix a spelling 
> error, etc. there is a global change that is intrinsically unreliable or, if 
> the ontologies are distributed, requires a major organisational effort.  If 
> the true identifiers are meaningless ID numbers and meaningful names are 
> labels, things are much easier.  You can change the label without disturbing 
> the linkages.  (Protege-OWL now supports this.  With the switches set 
> correctly, you should not notice most of the time that you are using labels 
> rather than identifiers.  New versions will smooth off the rough edges.)
(Just catching up as well).

I completely agree with Alan's statement. Using human-readable string 
names as the primary identifier is a bad idea for all the reasons above. 
As one of the developers of the MGED Ontology (MO), I can say this was a 
mistake that we made, for multiple reasons. It was a lesson learned from 
building MO and one we did not want to repeat in FuGO, so FuGO uses 
alphanumeric identifiers. As Alan mentions, with the proper settings in 
Protege/OWL the identifiers can be hidden and the labels shown instead. 
There is also a Protege plugin that automatically generates the next 
available number for use in the alphanumeric identifier. If you are 
interested in this plugin, I can send more information, as I don't think 
it is posted on the Protege site.
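
To make the point about identifiers and labels concrete, here is a minimal 
Python sketch. It is not how Protege or the plugin mentioned above works; 
the `FuGO_` prefix, the seven-digit width, and the `Ontology` class with 
its method names are all assumptions made up for illustration. It shows 
terms keyed by an opaque identifier, a label that can be corrected without 
touching any links, and one simple way to pick the next available number.

```python
import re

# Hypothetical identifier scheme: prefix and width are assumptions.
PREFIX = "FuGO_"
WIDTH = 7


class Ontology:
    """Toy store: opaque IDs are the keys, labels are just annotations."""

    def __init__(self):
        self.labels = {}   # term ID -> human-readable label
        self.parents = {}  # term ID -> set of parent term IDs

    def next_id(self):
        """Return the next unused ID of the form PREFIX + zero-padded number."""
        pattern = re.compile(rf"^{re.escape(PREFIX)}(\d+)$")
        highest = 0
        for term_id in self.labels:
            match = pattern.match(term_id)
            if match:
                highest = max(highest, int(match.group(1)))
        return f"{PREFIX}{highest + 1:0{WIDTH}d}"

    def add_term(self, label, parents=()):
        """Create a term with a fresh ID; links refer to IDs, never labels."""
        term_id = self.next_id()
        self.labels[term_id] = label
        self.parents[term_id] = set(parents)
        return term_id

    def relabel(self, term_id, new_label):
        """Change only the label; no link needs to be rewritten."""
        self.labels[term_id] = new_label


onto = Ontology()
assay = onto.add_term("asay")  # label entered with a spelling error
hyb = onto.add_term("hybridization", parents=[assay])

onto.relabel(assay, "assay")   # fix the spelling

# The child still points at the same parent ID after the label change.
print(hyb, "is_a", onto.parents[hyb], "labelled", onto.labels[assay])
```

If the labels themselves were the keys, the spelling fix would have to be 
propagated to every term (and every other ontology) that refers to the 
misspelled name.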

One further note: alphanumeric identifiers seem preferable to solely 
numeric ones, based on my viewing of preliminary work by Chris Mungall 
to translate OBO format ontologies to OWL.

Trish
Received on Monday, 10 July 2006 16:24:28 GMT
