- From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
- Date: Thu, 4 Nov 2004 13:06:49 -0000
- To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>, "'public-swbp-wg@w3.org'" <public-swbp-wg@w3.org>
Hi all,

I have a key issue to resolve ...

Using thesauri as part of the semantic web depends on being able to uniquely reference a thesaurus concept within a global information space. The simplest way to uniquely reference a thesaurus concept is via a URI. However, very few (if any) thesauri have URIs assigned to their concepts.

It is obviously a point of good practice to encourage thesaurus developers to assign and publish URIs for the concepts in the thesauri they are developing. These concepts will then have 'official' URIs. However, such a practice will take time to be implemented. In the meantime, we would like to be able to publish RDF descriptions of existing thesauri for which there are no 'official' concept URIs.

One practice in this case has been to make up unofficial URIs. However, this can obviously lead to the proliferation of multiple URIs for the same concept. Although the mechanisms obviously exist to cope with this, from a pragmatic point of view it might make sense to discourage making up URIs wherever alternatives exist and it can be avoided.

So what alternatives are there to making up unofficial URIs for concepts?

One option is to encourage RDF descriptions of current thesauri in which all concept nodes are blank nodes. In an RDF/XML description of a thesaurus this can be done, for example, with the rdf:nodeID attribute.

An RDF description of a thesaurus with all concept nodes as blank nodes at least means that a machine-readable description of the thesaurus exists and can be exchanged between applications. And so a partial goal is satisfied ...

However, it does not solve the problem of how a person might, for example, refer to one of these concepts as part of the RDF description of a web document. In this case, there is a possibility to use 'reference by description'.
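
To make the idea concrete, here is a rough sketch in Python, with plain tuples standing in for RDF triples. It is only an illustration under stated assumptions: the thesaurus and document URIs are made up, the helper names description_key and resolve are hypothetical, and the identity rule (same preferred label plus same scheme means same concept) is hand-coded rather than something an existing OWL or rules implementation provides.

```python
# Triples are (subject, predicate, object) tuples. Subjects beginning
# with "_:" stand for blank nodes, i.e. concepts with no URI of their own.
SCHEME = "http://example.org/thesaurus"  # assumed thesaurus URI

thesaurus = [
    ("_:c1", "skos:prefLabel", "Mammals"),
    ("_:c1", "skos:inScheme", SCHEME),
    ("_:c2", "skos:prefLabel", "Cats"),
    ("_:c2", "skos:inScheme", SCHEME),
    ("_:c2", "skos:broader", "_:c1"),
]

# A separate description of a web document (assumed URI) that refers to
# a concept only by describing it: preferred label plus scheme.
document = [
    ("http://example.org/doc", "dc:subject", "_:x"),
    ("_:x", "skos:prefLabel", "Cats"),
    ("_:x", "skos:inScheme", SCHEME),
]


def description_key(graph, node):
    """Return (prefLabel, inScheme) for a node, or None if either is missing."""
    label = next(
        (o for s, p, o in graph if s == node and p == "skos:prefLabel"), None
    )
    scheme = next(
        (o for s, p, o in graph if s == node and p == "skos:inScheme"), None
    )
    if label is None or scheme is None:
        return None
    return (label, scheme)


def resolve(doc_graph, doc_node, thesaurus_graph):
    """Hand-coded identity rule: match nodes whose two-property keys are equal."""
    key = description_key(doc_graph, doc_node)
    if key is None:
        return []
    subjects = {s for s, _, _ in thesaurus_graph}
    return sorted(
        n for n in subjects if description_key(thesaurus_graph, n) == key
    )


matches = resolve(document, "_:x", thesaurus)
print(matches)
```

The point of the sketch is that the match needs both properties at once; neither the label nor the scheme alone picks out a single concept.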
The mechanism for unique identification of concepts within a print environment is traditionally the preferred term (or 'descriptor') for that concept, which is a unique term within a thesaurus. The combination of the preferred term for a concept and a URI identifying the thesaurus therefore provides a globally unique description of a concept.

The problem here is that, whereas reference by description for people in FOAF can be satisfied by a single property (e.g. foaf:mbox), for which the inverse-functional property machinery in OWL provides an implementation, reference by description for concepts as described above depends on at least two properties (e.g. the combination of skos:prefLabel and skos:inScheme), for which implementations would depend on the expression of identity rules.

So the choice I see boils down to this: when describing best practice for creating RDF descriptions of thesauri without official URIs, do we ...

(a) attempt to remain neutral about whether people make up unofficial URIs, and rely on the owl:sameAs machinery to cope with multiple published URIs for the same concept, or ...

(b) actively encourage the publication of these thesauri with concept nodes as blank nodes, and additionally publish guidelines on how reference by description may be used to refer to such concepts from other RDF descriptions (which may depend on rules technology without any current standard implementations).

What do you think?

Al. ~:)

---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
Received on Thursday, 4 November 2004 13:07:33 UTC