W3C home > Mailing lists > Public > public-webont-comments@w3.org > April 2002

Re: Vocabulary: Terms, Concepts, and Labels

From: Jeff Heflin <heflin@cse.lehigh.edu>
Date: Wed, 03 Apr 2002 11:17:15 -0500
Message-ID: <3CAB2B0B.1FB9BA30@cse.lehigh.edu>
To: rector@man.ac.uk
CC: public-webont-comments@w3.org

Part of the problem is that despite the medical community having reached
some consensus on terminology, there is much less consensus in other
communities interested in ontologies. Another problem, is that the Web
community has developed its own terminology. Finding a terminology that
makes everyone happy is unlikely.

When we use "term," we refer to any element in the ontology's
vocabulary. This includes class names and property names (elsewhere
called relations or slots). Thus, we are using it in the same sense that
Ontolingua uses non-logical symbol, but since this document is meant for
a wider audience than the logical community, we chose to avoid that
term. Since concepts traditionally only refer to classes, it would be
inappropriate to use that as well.

The use of "label" comes from RDF Schema, where it is a "human-readable
version of a resource name." In this sense, it is more like the
linguistic phrases you describe. RDF Schema even suggests that a "label"
can be used with a "xml:lang" attribute to identify a language for the
label, although there are some questions as to whether this really works
the way it was intended. There is no requirement that a label be
associated with a unique term. As you say, there is ambiguity in lexical
representation, and due to polysemy, the same label may be used with
different meanings. If someone wishes to use them in search, then they
would have to provide a method for resolving the ambiguity.

As for your suggestions about concept names and concept identifiers, I
don't believe Web Ontologies can make the same distinction. Due to the
distributed nature of the Web, I don't believe you could say that there
is a single unique identifer for every concept. How could you enforce
such a thing? Instead every term (our usage) has a URI. Some of these
URIs may refer to the same concept. In this case, an equivalence can be
established between these URIs (see the class and property equivalence
requirement), achieving the same effect as having multiple concept names
for the same identifier.

I do agree with you that there needs to be a distinction between the
conceptual and the language layer, and will encourage the working group
to examine the references you mention when it focuses on the problem of
internationalization. Your comments also suggest that the requirements
document could do a better job of defining the terminology it uses.
Although we may not use the terminology you recommend, this would at
least help you and others to translate it into your own terminology.

Thanks for your comments.


Alan Rector wrote:
> Coming from a background in biomedical terminology, I am surprised by
> the choice of
> phrases in the requirements paper: "Term", "Label" and the absence of
> the phrase "concept".
> I don't know if this is deliberate,  but it certainly conflicts with
> numerous widespread usages,
> particularly in the biomedical arena.
> Medical applications have spent the last two decades separating the
> "term" - a linguistic unit, more or less what seems to be meant by a
> "label" in the requirements document - from the "concept", and indeed
> the "concept name" from the "concept identifier"
> The classic discussion is in Cimino's Desiderata for controlled
> medical vocabularies in the twenty-first century. Methods of
> Information in Medicine37(4-5): pp. 394-403. PDF here where it goes
> under the perhaps unhelpful phrase of "non-semantic identifiers".  Our
> own experience in a multilingual environment stresses the importance
> of separating the conceptual and language layer - see Rector A L
> (1999). Clinical Terminology: Why is it so hard? Methods of
> Information in Medicine 38: 239-252. PDF here.
> The major indexing and meta data resource for the biomedical community
> is PubMed/Medline and the Unified Medical Language System (UMLS)
> maintained by the US National Library of Medicine, which again makes a
> clear distinction between "Concept Unique Identifiers" (CUIs) and
> "Lexical Unique Identifiers" (LUIs).
> Likewise, both the Read Codes and SNOMED-RT/CT as well as OpenGALEN -
> the three terminology resources using formal description logic
> semantics related to medicine - separate "term" from "concept" in this
> way, although SNOMED-RT/CT and Read use "preferred term" for what I
> below call "concept name".
> Furthermore, other parts of the terminology and library community  use
> "term" in a linguistic sense and explicitly use "broader
> than"/"narrower than" for the hierarchical principle rather than a
> notion of logical subsumption. It is important to keep these
> separate.  The 'broader than/narrower than' notion is very useful for
> navigation and other purposes but must not be confused with logical
> subsumption for inference.
> Finally, I can't find where in the document it says under what
> circumstances a "label" must point to a unique "term" (in the sense
> those phrases are used in the requirements document) although it makes
> it clear that a given "term" can have many "labels".  Ambiguity of
> lexical expressions -- i.e. in this context the same lexical
> expression designating more than one concept ("term" in the document's
> sense)--is of course common.
> I would submit that a major requirement for the ontology language is
> to make a clear distinction between:
> *    Lexical phrases which can be used to present a given concept in a
> given language and context or to search for it, but which may suffer
> from ambiguity  which must be resolved  .  I would call these
> "terms".  There is a many-many relation between "terms" and "concept
> identifiers".  "Terms" are mostly of interest to end users.
> *    The names of concepts which may be multiple but must refer to a
> unique concept identifier.  I would call these "concept names".  There
> is a many-one relation between concept names and concept identifiers.
> Concept names are mostly of interest to knowledge engineers. Allowing
> more than one is a programming convenience and helps in multilingual
> applications.
> *    The identifiers of concepts which should be globally unique and
> unambiguous.  I presume it will ultimately be qualified URIs within
> namespaces in the RDF concrete syntax.  I would call these "concept
> identifiers"
> Regards
> Alan
> --
> Alan L. Rector
> Professor of Medical Informatics
> Department of Computer Science
> University of Manchester
> Manchester M13 9PL, UK
> Tel +44-161-275-6188/6239/7183
> FAX +44-161-275-6204
> http://www.cs.man.ac.uk/mig
Received on Wednesday, 3 April 2002 11:17:17 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 20:09:28 UTC