Coming from a background in biomedical terminology, I am surprised by the
phrases in the requirements paper: "Term", "Label" and the absence of the phrase "concept".
I don't know if this is deliberate, but it certainly conflicts with numerous widespread usages,
particularly in the biomedical arena.
Medical applications have spent the last two decades separating the "term" (a linguistic unit, more or less what seems to be meant by a "label" in the requirements document) from the "concept", and indeed the "concept name" from the "concept identifier".
The classic discussion is in Cimino's "Desiderata for controlled medical vocabularies in the twenty-first century", Methods of Information in Medicine 37(4-5): 394-403, where it goes under the perhaps unhelpful phrase "non-semantic identifiers". Our own experience in a multilingual environment stresses the importance of separating the conceptual and language layers; see Rector A L (1999). Clinical Terminology: Why is it so hard? Methods of Information in Medicine 38: 239-252.
The major indexing and metadata resources for the biomedical community are PubMed/Medline and the Unified Medical Language System (UMLS), both maintained by the US National Library of Medicine. The UMLS again makes a clear distinction between "Concept Unique Identifiers" (CUIs) and "Lexical Unique Identifiers" (LUIs).
Likewise, both the Read Codes and SNOMED-RT/CT as well as OpenGALEN - the three terminology resources using formal description logic semantics related to medicine - separate "term" from "concept" in this way, although SNOMED-RT/CT and Read use "preferred term" for what I below call "concept name".
Furthermore, other parts of the terminology and library community use "term" in a linguistic sense and explicitly use "broader than"/"narrower than" for the hierarchical principle rather than a notion of logical subsumption. It is important to keep these separate. The 'broader than/narrower than' notion is very useful for navigation and other purposes but must not be confused with logical subsumption for inference.
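To make the difference concrete, here is a minimal Python sketch. The concept names and links are hypothetical examples chosen for illustration, not taken from any real vocabulary, and the helper names (ancestors, subsumed_by, navigation_path_exists) are my own. It assumes only that a subsumption link licenses inference while a broader-than link supports navigation.

```python
# Hypothetical links, for illustration only.
# IS_A: logical subsumption (every instance of the child is an instance of the parent).
IS_A = {
    "ViralHepatitis": {"Hepatitis"},
    "Hepatitis": {"LiverDisease"},
}
# BROADER: thesaurus-style navigation links; no logical claim is implied.
BROADER = {
    "Hepatitis": {"Liver"},
    "Liver": {"DigestiveSystem"},
}


def ancestors(node, links):
    """Return every node reachable from `node` by following `links` upwards."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in links.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumed_by(child, parent):
    """True if `parent` logically subsumes `child` (uses IS_A links only)."""
    return child == parent or parent in ancestors(child, IS_A)


def navigation_path_exists(narrower, broader):
    """True if browsing upwards from `narrower` can reach `broader`."""
    return broader in ancestors(narrower, BROADER)
```

In this sketch "Hepatitis" can be reached by browsing from "Liver", yet nothing licenses the inference that hepatitis is a kind of liver; keeping the two link types in separate structures is what prevents that confusion.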
Finally, I can't find where in the document it says under what circumstances a "label" must point to a unique "term" (in the sense those phrases are used in the requirements document), although it makes it clear that a given "term" can have many "labels". Ambiguity of lexical expressions, i.e. in this context the same lexical expression designating more than one concept ("term" in the document's sense), is of course common.
I would submit that a major requirement for the ontology language is to make a clear distinction between:
* Lexical phrases which can be used to present a given concept in a given language and context, or to search for it, but which may suffer from ambiguity which must be resolved. I would call these "terms". There is a many-many relation between "terms" and "concept identifiers". "Terms" are mostly of interest to end users.
* The names of concepts, which may be multiple but must each refer to a unique concept identifier. I would call these "concept names". There is a many-one relation between concept names and concept identifiers. Concept names are mostly of interest to knowledge engineers. Allowing more than one is a programming convenience and helps in multilingual applications.
* The identifiers of concepts, which should be globally unique and unambiguous. I presume they will ultimately be qualified URIs within namespaces in the RDF concrete syntax. I would call these
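
The three layers listed above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the class Terminology, its method names, the example.org identifiers and the sample terms are all hypothetical and are not drawn from the requirements document or from any existing system.

```python
from collections import defaultdict


class Terminology:
    """Toy store keeping terms, concept names and concept identifiers apart."""

    def __init__(self):
        # Concept name -> exactly one concept identifier (many-one).
        self.name_to_concept = {}
        # (language, term) -> set of concept identifiers (many-many).
        self.term_to_concepts = defaultdict(set)

    def add_concept_name(self, name, concept_id):
        """Register a concept name; reject it if already bound elsewhere."""
        bound = self.name_to_concept.get(name)
        if bound is not None and bound != concept_id:
            raise ValueError(f"concept name {name!r} already refers to {bound}")
        self.name_to_concept[name] = concept_id

    def add_term(self, language, term, concept_id):
        """Attach a lexical phrase to a concept; ambiguity is allowed."""
        self.term_to_concepts[(language, term.lower())].add(concept_id)

    def concepts_for_term(self, language, term):
        """Return all candidate concepts; the caller must disambiguate."""
        return set(self.term_to_concepts.get((language, term.lower()), ()))


# Hypothetical identifiers in a made-up namespace.
COMMON_COLD = "http://example.org/concepts#C0001"
LOW_TEMPERATURE = "http://example.org/concepts#C0002"

t = Terminology()
t.add_concept_name("CommonCold", COMMON_COLD)
t.add_concept_name("AcuteCoryza", COMMON_COLD)   # second name, same concept
t.add_concept_name("LowTemperature", LOW_TEMPERATURE)
t.add_term("en", "cold", COMMON_COLD)
t.add_term("en", "cold", LOW_TEMPERATURE)        # ambiguous English term
t.add_term("fr", "rhume", COMMON_COLD)
```

Looking up the English term "cold" returns both identifiers and so has to be disambiguated, whereas each concept name resolves to exactly one identifier, and trying to rebind "CommonCold" to a different identifier raises an error.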
Alan L. Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK