Re: ontology specs for self-publishing experiment from Alan Rector on 2006-07-08 (public-semweb-lifesci@w3.org from July 2006)

From: Alan Rector <rector@cs.man.ac.uk>
Date: Sat, 8 Jul 2006 18:43:51 +0200
To: William Bug <William.Bug@DrexelMed.edu>
Cc: "Miller, Michael D (Rosetta)" <Michael_Miller@Rosettabio.com>, "Eric Neumann" <eneumann@teranode.com>, "AJ Chen" <canovaj@gmail.com>, "w3c semweb hcls" <public-semweb-lifesci@w3.org>
Message-Id: <E39CBE67-D4C1-4597-866C-3D8DF7022EF8@cs.man.ac.uk>

All

Just catching up.

Could I strongly support the following.  If there is one repeatedly
confirmed lesson from the medical community's experience with large
terminologies/ontologies, it is to separate the "terms" from the
"entities".  There are always linguistic artefacts, and language
changes more fluidly in both time and space than the underlying  
entities.   (In medical informatics this is sometimes quaintly  
phrased as using "nonsemantic identifiers").
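
As a rough illustration only (not taken from any particular
terminology system), the separation might be sketched as below. The
class name, the "E0001" identifier and the example terms are all
hypothetical; the point is simply that the identifier carries no
meaning, so terms can be added or retired without touching the entity.

```python
from collections import defaultdict


class Terminology:
    """Toy store that keeps entities and their terms in separate maps."""

    def __init__(self):
        self.parents = {}                # entity id -> parent id (or None)
        self.terms = defaultdict(set)    # entity id -> set of lower-cased terms

    def add_entity(self, entity_id, parent=None):
        # The id is opaque ("nonsemantic"): nothing is parsed out of it.
        self.parents[entity_id] = parent

    def add_term(self, entity_id, term):
        if entity_id not in self.parents:
            raise KeyError(entity_id)
        self.terms[entity_id].add(term.lower())

    def retire_term(self, entity_id, term):
        # Language changes; the entity and its id stay as they are.
        self.terms[entity_id].discard(term.lower())

    def lookup(self, term):
        wanted = term.lower()
        return sorted(e for e, ts in self.terms.items() if wanted in ts)


t = Terminology()
t.add_entity("E0001")                            # hypothetical identifier
t.add_term("E0001", "heart attack")              # hypothetical terms
t.add_term("E0001", "myocardial infarction")
t.retire_term("E0001", "heart attack")
```

After the term is retired, the entity "E0001" and anything that refers
to it by identifier are unaffected; only the lookup by the old term
stops matching.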

Regards

Alan

On 5 Jul 2006, at 22:43, William Bug wrote:

>
> By the way, the "mapping" I refer to above linking instance data
> wherever it may reside (primary data repositories,
> pooled/analyzed/interpreted data, the scientific literature) to
> entities in the ontologies requires reference to the lexicon - the TERMS
> used to describe the ontological fundamentals by the scientists  
> reporting them.  This is true whether an algorithm or a human is  
> trying to understand and interpret a collection of instance data in  
> the context of the relevant knowledge framework, even if that  
> framework resides in the head of the human researcher.
>
> I like to think of this distinction as being very coarsely  
> analogous to the distinction between the physical data model in an  
> RDBMS and the many tools used to make that more abstracted,  
> normalized collection of related entities directly useful for  
> specific applications - e.g., SQL SELECT statements, VIEWs, and/or  
> Materialized VIEWS.  Maintaining these as distinct elements goes a  
> long way toward ensuring the abstraction is re-usable for a large  
> set of applications, while simultaneously being able to support  
> each application's detailed requirements through custom
> de-normalization.
>
> This is why I like to keep the lexicon distinct from the ontology.   
> They are intimately linked.  No ontology is free of lexical  
> artifacts (I'm not certain it can or should be), any more than a
> lexical graph can be assembled without representing semantic  
> relations.  Analysis of the lexicon can inform how to adapt the  
> semantic graph in the ontology - make it more commensurate with the  
> current state of knowledge as expressed by domain experts, and  
> review of term use in the context of the ontology can be a great  
> help in creating effective, structured, controlled terminological  
> resources.  However, the two types of knowledge resource are  
> constructed via different processes, support different Use Cases, and
> rely on different fundamental relations at their core, however  
> intimately they may be linked.
>
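
The RDBMS analogy in the quoted text could be sketched roughly as
follows, using Python's built-in sqlite3 module. The table, column and
view names and the sample rows are assumptions made up for this
illustration: the normalized tables keep entities and terms apart,
and a VIEW supplies the denormalized shape one application might want.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized model: entities and their terms are kept apart.
    CREATE TABLE entity (id TEXT PRIMARY KEY);
    CREATE TABLE term (
        entity_id TEXT NOT NULL REFERENCES entity(id),
        label     TEXT NOT NULL,
        preferred INTEGER NOT NULL DEFAULT 0
    );
    -- Application-specific, denormalized view: one preferred label per entity.
    CREATE VIEW entity_label AS
        SELECT e.id AS entity_id, t.label
        FROM entity e JOIN term t ON t.entity_id = e.id
        WHERE t.preferred = 1;
""")

conn.execute("INSERT INTO entity VALUES ('E0001')")   # hypothetical id
conn.executemany(
    "INSERT INTO term VALUES (?, ?, ?)",
    [("E0001", "myocardial infarction", 1),           # hypothetical terms
     ("E0001", "heart attack", 0)],
)

rows = conn.execute("SELECT entity_id, label FROM entity_label").fetchall()
```

A different application could define its own view over the same two
tables (all synonyms, say) without the underlying model changing.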

-----------------------
Alan Rector
Professor of Medical Informatics
School of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL +44 (0) 161 275 6149/6188
FAX +44 (0) 161 275 6204
www.cs.man.ac.uk/mig
www.clinical-esciences.org
www.co-ode.org

Received on Sunday, 9 July 2006 11:10:13 UTC