Coming from a background in biomedical terminology, I am surprised by the choice of
phrases in the requirements paper: "Term", "Label" and the absence of the phrase "concept".
I don't know if this is deliberate, but it certainly conflicts with numerous widespread usages,
particularly in the biomedical arena.

Medical applications have spent the last two decades separating the "term" - a linguistic unit, more or less what seems to be meant by a "label" in the requirements document - from the "concept", and indeed the "concept name" from the "concept identifier".

The classic discussion is in Cimino's "Desiderata for controlled medical vocabularies in the twenty-first century", Methods of Information in Medicine 37(4-5): 394-403, where it goes under the perhaps unhelpful phrase of "non-semantic identifiers". Our own experience in a multilingual environment stresses the importance of separating the conceptual layer from the language layer - see Rector A L (1999). Clinical Terminology: Why is it so hard? Methods of Information in Medicine 38: 239-252.

The major indexing and metadata resources for the biomedical community are PubMed/Medline and the Unified Medical Language System (UMLS), both maintained by the US National Library of Medicine. The UMLS again makes a clear distinction between "Concept Unique Identifiers" (CUIs) and "Lexical Unique Identifiers" (LUIs).

Likewise, the Read Codes, SNOMED-RT/CT and OpenGALEN - the three terminology resources using formal description logic semantics related to medicine - separate "term" from "concept" in this way, although SNOMED-RT/CT and Read use "preferred term" for what I call "concept name" below.

Furthermore, other parts of the terminology and library community use "term" in a linguistic sense and explicitly use "broader than"/"narrower than" for the hierarchical principle rather than a notion of logical subsumption. It is important to keep these separate. The "broader than"/"narrower than" notion is very useful for navigation and other purposes but must not be confused with logical subsumption for inference.
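To make the separation concrete, here is a minimal sketch in Python that keeps a navigational "broader than" relation apart from logical subsumption. The concept names and both relation tables are invented purely for illustration and are not taken from any of the terminologies mentioned above.

```python
# Hypothetical example data: concept names are illustrative only.

# Logical subsumption: every instance of the key is an instance of each value.
SUBSUMED_BY = {
    "ViralPneumonia": {"Pneumonia"},
    "Pneumonia": {"LungDisease"},
}

# Navigational "broader than": handy for browsing, with no logical commitment.
BROADER_THAN = {
    "Pneumonia": {"Lung"},
    "ChestXRay": {"Lung"},
}


def ancestors(relation, start):
    """Return everything reachable from `start` by following `relation` upwards."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in relation.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(child, parent):
    """Inference consults logical subsumption only."""
    return child == parent or parent in ancestors(SUBSUMED_BY, child)


def browse_up(node):
    """Navigation consults the 'broader than' relation only."""
    return ancestors(BROADER_THAN, node)
```

In this sketch "Pneumonia" can be reached by browsing from "Lung", but `is_a("Pneumonia", "Lung")` is false: a pneumonia is not a kind of lung, and nothing should be inferred as if it were.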

Finally, I can't find where in the document it says under what circumstances a "label" must point to a unique "term" (in the sense those phrases are used in the requirements document), although it makes it clear that a given "term" can have many "labels". Ambiguity of lexical expressions (in this context, the same lexical expression designating more than one concept, i.e. more than one "term" in the document's sense) is of course common.

I would submit that a major requirement for the ontology language is to make a clear distinction between:

*    Lexical phrases which can be used to present a given concept in a given language and context or to search for it, but which may suffer from ambiguity which must be resolved.  I would call these "terms".  There is a many-to-many relation between "terms" and "concept identifiers".  "Terms" are mostly of interest to end users.

*    The names of concepts, of which there may be several for one concept but each of which must refer to a unique concept identifier.  I would call these "concept names".  There is a many-to-one relation between concept names and concept identifiers.  Concept names are mostly of interest to knowledge engineers. Allowing more than one is a programming convenience and helps in multilingual applications.

*    The identifiers of concepts, which should be globally unique and unambiguous.  I presume these will ultimately be qualified URIs within namespaces in the RDF concrete syntax.  I would call these "concept identifiers".
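
The three layers above can be sketched as a small data structure. This is only an illustration in Python under assumptions of my own: the class name `Terminology`, its methods, and the `http://example.org/` URIs are all hypothetical, and nothing here is drawn from the requirements document or from any existing terminology system.

```python
from dataclasses import dataclass, field

# Hypothetical concept identifiers: globally unique URIs (example.org is a placeholder).
COMMON_COLD = "http://example.org/med#CommonCold"
COLD_SENSATION = "http://example.org/med#ColdSensation"


@dataclass
class Terminology:
    # Concept name -> concept identifier (many-to-one).
    concept_names: dict = field(default_factory=dict)
    # (language, term) -> set of concept identifiers (many-to-many).
    terms: dict = field(default_factory=dict)

    def add_concept_name(self, name, concept_id):
        """Register a concept name; a name may never point at two identifiers."""
        existing = self.concept_names.get(name)
        if existing is not None and existing != concept_id:
            raise ValueError(f"concept name {name!r} already refers to {existing}")
        self.concept_names[name] = concept_id

    def add_term(self, language, term, concept_id):
        """Register a lexical phrase; the same phrase may designate several concepts."""
        self.terms.setdefault((language, term.lower()), set()).add(concept_id)

    def lookup_term(self, language, term):
        """Return every candidate concept; the caller must resolve any ambiguity."""
        return set(self.terms.get((language, term.lower()), set()))


terminology = Terminology()
# Two concept names for one identifier: convenient in multilingual work.
terminology.add_concept_name("CommonCold", COMMON_COLD)
terminology.add_concept_name("Rhume", COMMON_COLD)
# One English term, two concepts: ambiguity lives in the term layer only.
terminology.add_term("en", "cold", COMMON_COLD)
terminology.add_term("en", "cold", COLD_SENSATION)
```

Here looking up the term "cold" returns two candidate identifiers for the application or user to disambiguate, while an attempt to bind the concept name "CommonCold" to a second identifier is rejected outright.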


Alan L. Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK
Tel +44-161-275-6188/6239/7183
FAX +44-161-275-6204