Re: [ALL,VM] Vocabulary Management Task Force Description - discussion draft

From: Thomas Baker <thomas.baker@bi.fhg.de>
Date: Mon, 14 Jun 2004 12:46:07 +0200
To: Alan Rector <rector@cs.man.ac.uk>
Cc: Thomas Baker <thomas.baker@izb.fraunhofer.de>, SW Best Practices <public-swbp-wg@w3.org>
Message-ID: <20040614104607.GA1816@Octavius>

On Mon, Jun 14, 2004 at 09:16:02AM +0100, Alan Rector wrote:
> 2)    "What constitutes a change" - this seemingly simple
> question has been very difficult to answer, especially in any
> ontology expected to be used with a reasoner. It can only be
> answered with respect to 1) - when is something just a change
> in naming and when is it a change in substance.

DCMI's approach to this is embodied in a "Namespace Policy"
(http://dublincore.org/documents/dcmi-namespace/), Section III
of which ("Policy concerning classes of changes to DCMI terms")
distinguishes:

    A. Minor editorial errata
    B. Substantive editorial errata
    C. Semantic changes in DCMI terms
    D. Addition of DCMI term declarations to existing DCMI namespaces

...where changes of types A and B may be made to existing
terms, whereas changes of types C and D trigger the
creation of new URIs (i.e., new terms).
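
As a rough sketch of that rule (my own illustration, not anything
DCMI publishes; the names ChangeClass and requires_new_uri are
hypothetical), the decision could be written as:

```python
from enum import Enum


class ChangeClass(Enum):
    """The four classes of change listed in Section III of the policy."""
    MINOR_EDITORIAL = "A"        # minor editorial errata
    SUBSTANTIVE_EDITORIAL = "B"  # substantive editorial errata
    SEMANTIC = "C"               # semantic changes in DCMI terms
    ADDITION = "D"               # addition of term declarations


def requires_new_uri(change: ChangeClass) -> bool:
    """Hypothetical helper: types C and D trigger a new URI, A and B do not."""
    return change in (ChangeClass.SEMANTIC, ChangeClass.ADDITION)


if __name__ == "__main__":
    for change in ChangeClass:
        print(change.value, change.name, requires_new_uri(change))
```

The hard part, of course, is classifying a given change in the
first place; the sketch only records what follows once that
judgment has been made.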

In DCMI policy, then, the boundary is crossed with changes of a
"semantic" nature.  I would very much like it if a best-practice
document for the "Semantic Web" generally could likewise
embrace "changes in semantics" as a guiding principle for
determining what constitutes a change.

> 1)    "Term" vs "Concept".  The medical community has a
> long tradition of carefully distinguishing the name from
> the representation of the concept/thing. ...
>                            ....  Separating Term and Concept
> is probably the most important thing the medical community
> has learnt from its experience and something takes as a sine
> qua non of good practice.  I think some recognition of this
> issue in SW is needed.  The owl:labels annotation is one
> first step. However, I don't think it adequate to cover all
> the use cases (e.g. preferred terms vs synonyms/alternative
> terms for the same concept etc. within as well as between
> linguistic communities).

In DCMI practice, a term has a Name (a string used to form a
URI) and a Label (a human-readable string).  The full URI, as
supported by the Namespace Policy, is the official identifier
for the term.  The URI (and Name) may be associated with
multiple Labels (e.g., Title, Titel, Titre for English,
German, and French).  With respect to your example, then,
it sounds like:

    medical-community:Concept  =   dcmi:Name
    medical-community:Term     =   dcmi:Label
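
To make the Name/Label distinction concrete, here is a minimal
sketch of how a term might be modelled.  The class and field
names are hypothetical and the language tags are my assumption;
only the namespace URI, the Name "title", and the three Labels
come from DCMI practice as described above:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Term:
    """Illustrative model: one Name (forming the URI), many Labels."""
    namespace: str                 # namespace URI the Name is appended to
    name: str                      # string used to form the URI
    labels: Dict[str, str] = field(default_factory=dict)  # language -> Label

    @property
    def uri(self) -> str:
        # The full URI is the official identifier for the term.
        return self.namespace + self.name


title = Term(
    namespace="http://purl.org/dc/elements/1.1/",
    name="title",
    labels={"en": "Title", "de": "Titel", "fr": "Titre"},
)

if __name__ == "__main__":
    print(title.uri)
    for lang, label in sorted(title.labels.items()):
        print(lang, label)
```

Adding or correcting a Label leaves the URI untouched, which is
the point of keeping the two apart.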

In DCMI practice, however, the distinction is also made between
a "Term" as identified with a URI -- which may be subject to
editorial evolution of types A and B above -- and a specific
historical version of a term ("Term Version").  Both Terms
and Term Versions are identified with URIs (though the URIs
for the latter are not yet supported by an official policy,
pending the emergence of good practice for the Semantic Web
generally :-).

Whether one would want to cite the URI for the more
"conceptual" Term, e.g.:

    http://purl.org/dc/elements/1.1/title

as opposed to the URI for a specific Term Version, e.g.:

    http://dublincore.org/usage/terms/history/#title-004

would depend on the purposes for which one wanted to cite it.
Instance metadata should cite the Term, knowing that the
Namespace Policy ensures its semantic stability over time.
However, a translation of that term into Japanese may want
to assert that it "translates" a specific Term Version.
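
A small sketch of that choice follows.  The purpose strings and
the names TermVersion and uri_to_cite are hypothetical, invented
only for illustration; the two URIs are the examples given above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermVersion:
    """Pairs the stable Term URI with one historical Term Version URI."""
    term_uri: str
    version_uri: str


def uri_to_cite(tv: TermVersion, purpose: str) -> str:
    """Pick the URI to cite; the purpose values are assumptions."""
    if purpose == "instance-metadata":
        # Instance metadata relies on the Term's semantic stability.
        return tv.term_uri
    if purpose == "translation":
        # A translation may want to point at the exact wording translated.
        return tv.version_uri
    raise ValueError(f"unknown purpose: {purpose!r}")


title_004 = TermVersion(
    term_uri="http://purl.org/dc/elements/1.1/title",
    version_uri="http://dublincore.org/usage/terms/history/#title-004",
)

if __name__ == "__main__":
    print(uri_to_cite(title_004, "instance-metadata"))
    print(uri_to_cite(title_004, "translation"))
```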

> 3)    "What is the granularity of change" - the entire
> ontology? individuals terms/concepts? Some intermediate?

In the DCMI context, we had a big discussion of this circa
1999.  At that time, we had already issued two small element
sets that had been versioned as sets -- Dublin Core 1.0 and
1.1 -- and we were poised to issue a larger set of several
dozen qualifiers.  We debated whether the elements plus the
new qualifiers should constitute a Dublin Core 1.2, whether
the qualifiers alone should constitute Qualifiers 1.0, or
whether terms should henceforth be versioned individually.

We opted for the last of these, sparing ourselves both the need
to trade off stability against evolution in timing the release
of periodic batches and the need to formally assert that
terms in successive versions were equivalent.  Opting for
this approach, in my view, requires a Namespace Policy.

Anecdotes I have heard from other standardization contexts
have seemed to confirm the wisdom of this approach.

At the same time, DCMI publishes Web documents documenting
the complete set of terms at any given time.  These documents
are regenerated and issued every time there is a change
somewhere in the term set or in the document structure,
and those documents are identified in the style of W3C, e.g.,
with a generic "last version" identifier:

    http://dublincore.org/documents/dcmi-terms/        

and an identifier for "this specific version":

    http://dublincore.org/documents/2003/03/04/dcmi-terms/        
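
The relationship between the two identifiers can be sketched as
below.  The URI pattern is inferred only from the two examples
just given, and the function names are hypothetical, so treat
this as an illustration rather than a statement of DCMI rules:

```python
from datetime import date

# Base inferred from the example URIs; an assumption, not a published rule.
DOCS_BASE = "http://dublincore.org/documents/"


def latest_version_uri(slug: str) -> str:
    """Generic identifier that always names the current document."""
    return f"{DOCS_BASE}{slug}/"


def this_version_uri(slug: str, issued: date) -> str:
    """Dated identifier naming one specific issue of the document."""
    return f"{DOCS_BASE}{issued:%Y/%m/%d}/{slug}/"


if __name__ == "__main__":
    print(latest_version_uri("dcmi-terms"))
    print(this_version_uri("dcmi-terms", date(2003, 3, 4)))
```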
    
Note that in principle, one could cite

    http://dublincore.org/documents/dcmi-terms/#date
    or even
    http://dublincore.org/documents/2003/03/04/dcmi-terms/#date

both of which are defined as anchors using "id=" attributes in
the XHTML document.  One can imagine unusual circumstances in
which these URLs could prove useful, but DCMI does not approve of
using these URLs as term identifiers in instance metadata.

The more general point is that _all_ of these URIs identify
something about which one might want to make assertions in
certain circumstances.  However, as good practice for the
Semantic Web generally, I should think we would want to
emphasize and endorse principles and distinctions such as
the following:

-- Last Version vs This Version to version batches of terms 
   as documents
-- Term Concept vs Term Version to version individual terms
-- Name vs Label to distinguish a "term concept" from (in 
   principle) multiple instantiations or text glosses
-- Namespace policies based on the notion of "semantic" stability

Tom

-- 
Dr. Thomas Baker                        Thomas.Baker@izb.fraunhofer.de
Institutszentrum Schloss Birlinghoven         mobile +49-160-9664-2129
Fraunhofer-Gesellschaft                          work +49-30-8109-9027
53754 Sankt Augustin, Germany                    fax +49-2241-144-2352
Personal email: thbaker79@alumni.amherst.edu
Received on Monday, 14 June 2004 06:40:00 UTC
