subject and topic RE: What is a concept? from Bernard Vatant on 2004-03-17 (public-esw-thes@w3.org from March 2004)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Wed, 17 Mar 2004 12:05:44 +0100
To: "Miles, AJ (Alistair) " <A.J.Miles@rl.ac.uk>
Cc: <public-esw-thes@w3.org>
Message-ID: <GOEIKOOAMJONEFCANOKCIEBEDJAA.bernard.vatant@mondeca.com>
I've been lurking in this debate for a while and missed several occasions to jump in, but
I could not resist that one :))

I wonder if some insights from topic map land (hereafter TM) could help. We have
worked a lot in TM on these difficult issues. Here are two basic definitions from TM
before answering Alistair's remarks.

A *subject* is "anything whatsoever that can be spoken about" (be it abstract or concrete,
universal or individual, real or imaginary, existing or non-existing, big or small,
consistent or inconsistent ...). Although it is not explicitly in the standard, it is
generally accepted that TM subject means "subject of conversation". The TM approach is
agnostic about the ontological status of a subject. Something exists (as a subject) as
soon as someone has started representing it or speaking about it by any means whatsoever.
Note that there is no requirement on any specific way a subject should be represented or
spoken about to come into existence, but it has to be represented somehow.

A *topic* is the formal representation, in an information system, of a unique, hopefully
non-ambiguous, subject. The representation is generally in a language suitable for
electronic use in our context, but could as well be a pictogram painted on an old Chinese
bamboo slip, a carved Egyptian hieroglyph, a subject heading in a classification scheme, a
descriptor in a thesaurus ... all are 'kinds of topics' in different technological
environments (different data models, say).

Chinese has a very interesting vocabulary for this distinction (not that I have any deep
knowledge of Chinese, but that was explained to me by a Chinese TM community friend). The
equivalent for subject is *lord of this page*, and for topic is *eye of this page*. The
eye is the visible (addressable) proxy for an invisible (non-addressable) lord. Remember
that in ancient China, the Emperor was actually visible to almost nobody, but his symbolic
presence and rule were asserted everywhere by many signs.

The topic map standard data model provides a framework for dealing with topics in the
electronic world, and specifies which 'kinds of topics' to use to conform to
the standard, but I would stick here to the simple paradigm of subject vs topic, whatever
the kind of topic (data model) used. This important distinction is somewhat blurred in RDF.
It's difficult to know if the resource is the eye or the lord of the URI.

Now, let's look at the issue.

*Alistair (about concept)
> Defined in SKOS-Core 1.0 Guide as 'any unit of thought that may be defined
> or described.'  Might better be described as a 'unit of meaning' or
> something like that.

This looks to me quite equivalent to TM subject. It is independent of any 'topic'
representing it.

> In contrast to e.g. traditional thesauri, where the fundamental unit is
> usually a 'term', and hence where the intended meaning of the unit and the
> labels used to refer to it are confounded.

Yes, like in RDF. In a TM view of a thesaurus, a term is represented by a topic, the eye
representing its invisible lord, the subject/concept. Now a topic has names, and TM
provides a mechanism to distinguish the names used to identify the topic in a given
namespace (namespace#prefLabel) from the 'other names'.
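
To make the naming point concrete, here is a rough sketch in Python. It is not the TM
data model, only an illustration: the class Topic, its fields and the example.org
namespaces are all made up for this message. It keeps one identifying name per namespace
apart from the 'other names', and it never stores the subject itself, only a note
about it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Topic:
    """Hypothetical topic: the addressable proxy ('eye') for a subject ('lord')."""

    # Free-text note on what the topic stands for; the subject itself is never stored.
    subject_note: str
    # Namespace -> the single name that identifies the topic in that namespace.
    pref_labels: Dict[str, str] = field(default_factory=dict)
    # Any further names, not used for identification.
    other_names: List[str] = field(default_factory=list)

    def pref_label(self, namespace: str) -> Optional[str]:
        """Return the identifying name in the given namespace, or None."""
        return self.pref_labels.get(namespace)


# Assumed example data: two made-up thesaurus namespaces naming the same topic.
cat = Topic(
    subject_note="the domestic cat",
    pref_labels={
        "http://example.org/thesaurus-a": "Cats",
        "http://example.org/thesaurus-b": "Felis catus",
    },
    other_names=["house cat", "chat"],
)
print(cat.pref_label("http://example.org/thesaurus-a"))
```

The same topic answers with a different identifying name depending on the namespace
asked for, while the other names stay attached without identifying anything.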

> NB. I never use the word 'term' any more, because when somebody in this line
> of work refers to a 'term' I've realised they usually have some idea of
> meaning attached to it (which may be a specially redefined meaning known
> only within a limited scope).
> I.e. the meaning and the label have not been
> separated.

This is a common confusion when people get too involved in using one specific kind of
representation tool, IOW using specific kinds of topics to represent subjects. I suppose
that for a scholar in ancient Egypt, it was as difficult to make a concept distinct from
its hieroglyphic representation as it is for a modern quantum mechanics expert to think
about a particle other than as some weird algebraic object, field tensor or whatever.

> Hence I deliberately avoid using the word 'term' anywhere in the
> SKOS-Core 1.0 guide, but always use 'label' as a name for the character
> strings or symbols that are used by people to refer to concepts.
>
> In my mind, 'term' = 'concept' + 'label'.

To sum it up, using the TM distinction, I would try the following equivalences:

'concept' 	= 	'subject'
'term' 	= 	'topic'
'label' 	= 	'topic name'
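
As a rough illustration of Alistair's 'term' = 'concept' + 'label' under these
equivalences, here is a small Python sketch. The names Term and labels_by_concept and
the "concept:..." identifiers are invented for this message and belong to neither SKOS
nor TM. Once the concept is kept apart from the label, several labels can be gathered
under the one concept they refer to.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Term(NamedTuple):
    """Hypothetical 'term' in Alistair's sense: a concept plus one label."""

    concept: str  # stands for the subject (here just an opaque identifier)
    label: str    # stands for a topic name


def labels_by_concept(terms: List[Term]) -> Dict[str, List[str]]:
    """Group labels under the concept they refer to, keeping input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for term in terms:
        grouped[term.concept].append(term.label)
    return dict(grouped)


# Assumed example data: two labels for one concept, one label for another.
terms = [
    Term("concept:cat", "Cats"),
    Term("concept:cat", "Felis catus"),
    Term("concept:dog", "Dogs"),
]
print(labels_by_concept(terms))
```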

Maybe it could also help in the foaf:topic vs dc:subject debate. A tentative short answer
would be that dc:subject has no requirement on the kind of topic (data model) used to
represent the subject (could as well be a hieroglyph), whereas foaf:topic has some. I
don't know, actually, if the choice of FOAF to use 'topic' in that sense was made by folks
aware of the semantics of 'topic' in TM land. And I am not sure how consistent those two
uses are.

Gee ... heady stuff.

Bernard

Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com
bernard.vatant@mondeca.com
Received on Wednesday, 17 March 2004 06:05:52 UTC