
subject indicators and inverse functional properties

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Wed, 17 Nov 2004 12:16:35 -0000
Message-ID: <350DC7048372D31197F200902773DF4C05E50D4E@exchange11.rl.ac.uk>
To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>

Hi all,

To continue in the didactic mode ... ;) ...

[TM folks if I get any of this wrong, please feel free to correct me]

The topic maps folks have a different way of identifying things like
thesaurus concepts.  They don't give them URIs.  What they do for each
concept is publish a document on the web that comprehensively describes the
meaning of that concept.  This kind of document is called a 'subject
indicator' because it 'indicates' some 'subject of discourse'.  A subject
indicator document, by definition, indicates one subject only.

They then use the *URI of that document* to *indirectly identify* the
concept.  I.e. rather than talking about 'the concept with URI X' they talk
about 'the concept as described by the subject indicator document with URI
Y'.  I believe they contract this second expression to 'the concept with
*subject identifier* Y' (? have I got that right?) but this is nevertheless
an example of indirect identification.

Another example of indirect identification is people.  You can indirectly
identify a person via their email address, or the URI of their homepage,
because there is usually a one-to-one match between a person and an internet
mailbox, or between a person and a personal homepage.  There are exceptions,
but they are few enough that this remains a viable means of identification.

See also [1].

What I've done in a couple of the examples of thesauri in RDF is to use the
homepage of a thesaurus to indirectly identify the thesaurus.  For example
(with standard namespaces) (see also [2]):

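<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:skos="http://www.w3.org/2004/02/skos/core#"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:foaf="http://xmlns.com/foaf/0.1/">

   <!-- No rdf:about here: the scheme is identified only via its homepage -->
   <skos:ConceptScheme>
      <dc:title>The GCL (Government Category List) Version 2.1</dc:title>
      <foaf:homepage rdf:resource="http://www.govtalk.gov.uk/schemasstandards/gcl.asp"/>
   </skos:ConceptScheme>

</rdf:RDF>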

The RDF property 'foaf:homepage' is an 'inverse-functional property' or
'IFP'.  If a property is an IFP, any given value of that property can belong
to at most one resource, so the value can be used as an indirect means of
identifying that resource.

The RDF property 'skos:subjectIndicator' is also an IFP.  This means that
any two nodes with the same value for a skos:subjectIndicator property in
fact denote the same resource, and can be merged into one node.
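
To make the IFP idea a bit more concrete, here is a rough sketch in Python of
how a tool might merge ("smush") nodes that share a value for an IFP.  This is
only an illustration under my own assumptions, not how any particular RDF
toolkit does it: triples are plain tuples, the node labels _:a, _:b, _:c and
the example.org URI are made up, and the function name smush is hypothetical.

```python
# Illustrative sketch only: merge nodes that share a value for an
# inverse-functional property (IFP). Node labels and the example.org URI
# are invented for this example.

DC_TITLE = "http://purl.org/dc/elements/1.1/title"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"
SKOS_SUBJECT_INDICATOR = "http://www.w3.org/2004/02/skos/core#subjectIndicator"

# Properties treated as inverse-functional in this sketch.
IFPS = {FOAF_HOMEPAGE, SKOS_SUBJECT_INDICATOR}


def smush(triples, ifps=IFPS):
    """Map every subject to a canonical representative node.

    Two subjects end up with the same representative when they share a
    value for any property listed in `ifps`.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller label so the result is deterministic.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    first_seen = {}  # (property, value) -> first subject seen with that pair
    for subject, prop, value in triples:
        find(subject)  # register every subject, even without an IFP
        if prop in ifps:
            key = (prop, value)
            if key in first_seen:
                union(first_seen[key], subject)
            else:
                first_seen[key] = subject

    return {node: find(node) for node in parent}


triples = [
    ("_:a", DC_TITLE, "The GCL (Government Category List) Version 2.1"),
    ("_:a", FOAF_HOMEPAGE, "http://www.govtalk.gov.uk/schemasstandards/gcl.asp"),
    ("_:b", FOAF_HOMEPAGE, "http://www.govtalk.gov.uk/schemasstandards/gcl.asp"),
    ("_:c", FOAF_HOMEPAGE, "http://example.org/other-thesaurus"),
]

same = smush(triples)
print(same["_:a"] == same["_:b"])  # _:a and _:b share a homepage
print(same["_:a"] == same["_:c"])  # _:c has a different homepage
```

Here _:a and _:b are treated as the same thesaurus because they have the same
foaf:homepage value, while _:c stays separate.  The dc:title triple plays no
part in the merge, because dc:title is not an IFP.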

**N.B. This doesn't mean I think we should always use indirect
identification.  Personally, I think it is fine to assign URIs directly to
thesauri and to thesaurus concepts, and encourage this practice.**

However, I think that using subject indicators to indirectly identify
thesaurus concepts is fine too.  What we have now in SKOS-RDF is the means
to do either.

A skos:subjectIndicator property on a concept actually tells you two things.
First it tells you where to go to get a document telling you exactly what
the concept means.  Second it gives an indirect means of identifying the
concept, which you may want to use in the absence of any published URIs for
that concept.

As a word of caution, if you do decide to use subject indicator documents as
an indirect means of identification, you need to make sure that the URI of
the document stays stable for as long as possible.  A URI like
http://www.govtalk.gov.uk/schemasstandards/gcl.asp?term=499 , although
better than nothing, doesn't work well as a URI for a subject indicator
document, because at some point govtalk will want to upgrade from ASP to
whatever the latest technology is, and the document in question will move to
another URI.

This is an important consideration both when assigning URIs directly to
thesaurus concepts, and when setting up a set of subject indicator documents
to use as an indirect means of identification.  I.e. in both cases you have
to think carefully about how to maintain those URIs in the long term, to
give them the greatest possible stability and usability.

Anyway, I think that'll do for now :)

Al.


[1] http://www.w3.org/2001/tag/webarch/#indirect-identification
[2] http://isegserv.itd.rl.ac.uk/skos/gcl/gcl2.1.rdf


---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email:        a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
Received on Wednesday, 17 November 2004 12:17:09 GMT
